Watch, Follow, &
Connect with Us
Public Report
Report From: Delphi-BCB/RTL/Delphi/Pascal Strings    [ Add a report in this area ]  
Report #:  111103   Status: Open
Inefficient loop in Pos() for purepascal
Project:  Delphi Build #:  17.0.4625.53395
Version:    17.0 Submitted By:   Leif Uneus
Report Type:  Suggestion / Enhancement Request Date Reported:  12/5/2012 2:31:48 PM
Severity:    Commonly encountered problem Last Updated: 12/16/2012 4:26:02 PM
Platform:    All platforms Internal Tracking #:  
Resolution: None (Resolution Comments) Resolved in Build: : None
Duplicate of:  None
Voting and Rating
Overall Rating: No Ratings Yet
0.00 out of 5
Total Votes: None
Description
The purepascal implementation in XE3 for the different overloaded Pos() functions in System.pas got an overhaul since XE2.

Unfortunately the loop where everything happens is not optimal, in fact poorly designed. It was probably taken from the purepascal Posex() in System.StrlUtils implementation in XE2.

This is what it looks like:

    for I := 0 to LIterCnt do
    begin
      J := 0;
      while (J >= 0) and (J < L) do // <-- These checks can be skipped !
      begin
        if PS[I + J] = PSubStr[J] then
          Inc(J)
        else
          J := -1;  // <-- Here you just want to start with next i instead
      end;
      if J >= L then  // <-- You just want to do this when J is incremented
        Exit(I + Offset);
    end;

This is what it should look like:

    for I := 0 to LIterCnt do
    begin
      J := 0;
      repeat
        if PS[I + J] <> PSubStr[J] then
          break;
        Inc(J);
        if J >= L then
          Exit(I + Offset);
      until false;
    end;

The increase in performance with the corrected code is not negligible :-)
In the submitted test code OptPos() is a factor of 3 faster than Pos when tested for Win64 target platform.

As a bonus, I submit a new routine which is 30 to 300% faster than OptPos, called OptPos1.
The increase in performance is due to a loop where the compiler can make better optimization, plus matching the characters from the end of SubStr to the beginning.

function OptPos1(const SubStr, Str: UnicodeString; Offset: Integer): Integer;
var
  I, LIterCnt, L, J: NativeInt;
  PSubStr, PS: PWideChar;
begin
  L := Length(SubStr);
  { Calculate the number of possible iterations. Not valid if Offset < 1. }
  LIterCnt := Length(Str) - Offset - L + 1;

  { Only continue if the number of iterations is positive or zero (there is space to check) }
  if (Offset > 0) and (LIterCnt >= 0) and (L > 0) then
  begin
    PSubStr := PWideChar(SubStr);
    PS := PWideChar(Str);
    Inc(PS, Offset - 1);

    Dec(L);
    I := 0;
    J := L;
    repeat
      if PS[I + J] <> PSubStr[J] then
      begin
        Inc(I);
        J := L;
        Dec(LIterCnt);
        if (LIterCnt < 0)
          then Exit(0);
      end
      else
      if (J > 0) then
        Dec(J)
      else
        Exit(I + Offset);
    until false;
  end;

  Result := 0;
end;


Steps to Reproduce:
program ProjOptPos;

// Run this for Win64 target platform .

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.Diagnostics;


function OptPos(const SubStr, Str: UnicodeString; Offset: Integer): Integer;
var
  I, LIterCnt, L, J: NativeInt;
  PSubStr, PS: PWideChar;
begin
  L := Length(SubStr);
  { Calculate the number of possible iterations. Not valid if Offset < 1. }
  LIterCnt := Length(Str) - Offset - L + 1;

  { Only continue if the number of iterations is positive or zero (there is space to check) }
  if (Offset > 0) and (LIterCnt >= 0) and (L > 0) then
  begin
    PSubStr := PWideChar(SubStr);
    PS := PWideChar(Str);
    Inc(PS, Offset - 1);

    for I := 0 to LIterCnt do
    begin
      J := 0;
      repeat
        if PS[I + J] <> PSubStr[J] then
          Break;
        Inc(J);
        if J >= L then
          Exit(I + Offset);
      until false;
    end;
  end;

  Result := 0;
end;

Var
  sw : TStopWatch;
  i,j : Integer;
  MySub,MyStr : String;
begin
  sw := TStopWatch.Create;

  MySub := 'HELLO';
  MyStr := 'HELLHELLHELLHELLO';

  WriteLn(System.Pos(MySub,MyStr,1));
  WriteLn(OptPos(MySub,MyStr,1));

  sw.Reset;
  sw.Start;
  for i := 0 to 10000000 do
  begin
    j := System.Pos(MySub,MyStr,1);
  end;
  sw.Stop;
  WriteLn('System.Pos elapsed (ms):  ',sw.ElapsedMilliseconds);

  sw.Reset;
  sw.Start;
  for i := 0 to 10000000 do
  begin
    j := OptPos(MySub,MyStr,1);
  end;
  sw.Stop;
  WriteLn('MySystem.Pos elapsed (ms):  ',sw.ElapsedMilliseconds);

  ReadLn;

end.
Workarounds
None
Attachment
None
Comments

None

Server Response from: ETNACODE01