TRegExpr interface

Public methods and properties of TRegExpr class:

class function VersionMajor : integer;
class function VersionMinor : integer;

Return the major and minor version numbers. For example, for v. 0.944, VersionMajor = 0 and VersionMinor = 944.

property Expression : string

Regular expression.

For optimization, TRegExpr automatically compiles it into ‘P-code’ (you can view it with the Dump method) and stores it in internal structures. Actual [re]compilation occurs only when it is really needed: when calling Exec[Next], Substitute, Dump, etc., and only if Expression or another property that affects the P-code has changed since the last [re]compilation.

If any errors occur during [re]compilation, the Error method is called (by default, Error raises an exception; see below).

property ModifierStr : string

Set/get the default values of r.e. modifiers. The format of the string is similar to that of (?ismx-ismx). For example, ModifierStr := 'i-x' switches on modifier /i, switches off /x, and leaves the others unchanged.

If you try to set an unsupported modifier, Error is called (by default, Error raises the exception ERegExpr).
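As a rough illustration of how ModifierStr toggles a modifier, the sketch below runs the same expression with /i switched on and then off. The helper name MatchesWithModifiers is hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program ModifierStrDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  RegExpr; // assumed unit name (regexpr.pas)

// Hypothetical helper: applies AModifiers through ModifierStr, then runs Exec.
function MatchesWithModifiers (const AExpr, AModifiers, AText : string) : boolean;
var
  r : TRegExpr;
begin
  r := TRegExpr.Create;
  try
    r.Expression := AExpr;
    r.ModifierStr := AModifiers; // 'i' switches /i on, '-i' switches it off
    Result := r.Exec (AText);
  finally
    r.Free;
  end;
end;

begin
  WriteLn (MatchesWithModifiers ('abc', 'i', 'xABCx'));  // /i on: case is ignored
  WriteLn (MatchesWithModifiers ('abc', '-i', 'xABCx')); // /i off: case matters
end.
```

Setting the modifier string after Expression is safe here because recompilation is deferred until Exec is called.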

property ModifierI : boolean

Modifier /i (“case-insensitive”), initialized with the RegExprModifierI value.

property ModifierR : boolean

Modifier /r (“Russian syntax extensions”), initialized with the RegExprModifierR value.

property ModifierS : boolean

Modifier /s: ‘.’ matches any character (otherwise it does not match LineSeparators and LinePairedSeparator), initialized with the RegExprModifierS value.

property ModifierG : boolean;

Modifier /g. Switching off modifier /g switches all operators to non-greedy style, so if ModifierG = False, then every ‘*’ works as ‘*?’, every ‘+’ as ‘+?’, and so on. Initialized with the RegExprModifierG value.

property ModifierM : boolean;

Modifier /m: treat the string as multiple lines. That is, change ‘^’ and ‘$’ from matching only at the very start or end of the string to matching at the start or end of any line anywhere within the string. Initialized with the RegExprModifierM value.

property ModifierX : boolean;

Modifier /x (“eXtended syntax”), initialized with the RegExprModifierX value.

function Exec (const AInputString : string) : boolean;

Matches the compiled r.e. against the string AInputString.

Note: Exec stores AInputString in the InputString property.

For Delphi 5 and higher, overloaded versions are available:

function Exec : boolean;

Without a parameter (uses the value already assigned to the InputString property).

function Exec (AOffset: integer) : boolean;

Same as ExecPos.

function ExecNext : boolean;

Finds the next match. Works the same as

if MatchLen [0] = 0 then ExecPos (MatchPos [0] + 1)
  else ExecPos (MatchPos [0] + MatchLen [0]);

but it is simpler.

Raises an exception if used without a preceding successful call to Exec* (Exec, ExecPos, ExecNext).

So you must always use something like

if Exec (InputString) then repeat { process results } until not ExecNext;
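The Exec / ExecNext loop above can be sketched as a small complete program. This is only a minimal example: the helper name CollectMatches is hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program ExecNextDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, RegExpr; // RegExpr: assumed unit name (regexpr.pas)

// Hypothetical helper: appends every match of AExpr in AText to AList.
procedure CollectMatches (const AExpr, AText : string; AList : TStrings);
var
  r : TRegExpr;
begin
  r := TRegExpr.Create;
  try
    r.Expression := AExpr;
    // ExecNext is only called after a successful Exec, as required.
    if r.Exec (AText) then
      repeat
        AList.Add (r.Match [0]);
      until not r.ExecNext;
  finally
    r.Free;
  end;
end;

var
  Found : TStrings;
  i : integer;
begin
  Found := TStringList.Create;
  try
    CollectMatches ('\d+', 'a1 bb22 ccc333', Found);
    for i := 0 to Found.Count - 1 do
      WriteLn (Found [i]);
  finally
    Found.Free;
  end;
end.
```

Guarding the loop with Exec keeps ExecNext from raising an exception when the input contains no match at all.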


function ExecPos (AOffset: integer = 1) : boolean;

Finds a match in InputString starting from position AOffset (AOffset = 1 is the first character of InputString).

property InputString : string;

Returns the current input string (from the last Exec call or the last assignment to this property).

Any assignment to this property clears the Match* properties.

function Substitute (const ATemplate : string) : string;

Returns ATemplate with ‘$&’ or ‘$0’ replaced by the whole r.e. occurrence and ‘$n’ replaced by the occurrence of subexpression #n.

Since v.0.929, ‘$’ is used instead of ‘\’ (for future extensions and for better Perl compatibility), and more than one digit is accepted.

If you want to place a raw ‘$’ or ‘\’ into the template, prefix it with ‘\’.

Example: ‘1\$ is $2\\rub\\’ -> ‘1$ is <Match[2]>\rub\’

If you want to place a raw digit after ‘$n’, you must delimit n with curly braces ‘{}’.

Example: ‘a$12bc’ -> ‘a<Match[12]>bc’, ‘a${1}2bc’ -> ‘a<Match[1]>2bc’.
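A minimal sketch of Substitute after a successful Exec follows. The helper name SwapPair and the sample expression are illustrative assumptions, as is the unit name RegExpr (taken from the regexpr.pas file name).

```pascal
program SubstituteDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  RegExpr; // assumed unit name (regexpr.pas)

// Hypothetical helper: finds "name=value" in AText and returns it rewritten
// by a template; returns '' when nothing matches.
function SwapPair (const AText : string) : string;
var
  r : TRegExpr;
begin
  Result := '';
  r := TRegExpr.Create;
  try
    r.Expression := '(\w+)=(\w+)';
    if r.Exec (AText) then
      // $1 and $2 are the subexpressions, $& is the whole match
      Result := r.Substitute ('$2=$1 (was: $&)');
  finally
    r.Free;
  end;
end;

begin
  WriteLn (SwapPair ('key=value'));
end.
```

Substitute works on the results of the last match, so it is only called here once Exec has returned True.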

procedure Split (AInputStr : string; APieces : TStrings);

Splits AInputStr into APieces at r.e. occurrences.

Internally calls Exec[Next]
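As a small sketch of Split, the program below breaks a comma-separated line into fields, treating optional whitespace around each comma as part of the separator. The helper name SplitFields is hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program SplitDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, RegExpr; // RegExpr: assumed unit name (regexpr.pas)

// Hypothetical helper: splits ALine at commas (with surrounding whitespace)
// and stores the pieces in AFields.
procedure SplitFields (const ALine : string; AFields : TStrings);
var
  r : TRegExpr;
begin
  r := TRegExpr.Create;
  try
    r.Expression := '\s*,\s*';
    r.Split (ALine, AFields);
  finally
    r.Free;
  end;
end;

var
  Fields : TStrings;
  i : integer;
begin
  Fields := TStringList.Create;
  try
    SplitFields ('a , b,c', Fields);
    for i := 0 to Fields.Count - 1 do
      WriteLn ('[', Fields [i], ']');
  finally
    Fields.Free;
  end;
end.
```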

function Replace (AInputStr : RegExprString; const AReplaceStr : RegExprString;
  AUseSubstitution : boolean = False) : RegExprString;

function Replace (AInputStr : RegExprString;
  AReplaceFunc : TRegExprReplaceFunction) : RegExprString;

function ReplaceEx (AInputStr : RegExprString;
  AReplaceFunc : TRegExprReplaceFunction)  : RegExprString;

Returns AInputStr with r.e. occurrences replaced by AReplaceStr.

If AUseSubstitution is True, AReplaceStr is used as a template for the Substitute method.

For example:

Expression := '({-i}block|var)\s*\(\s*([^ ]*)\s*\)\s*';
Replace ('BLOCK( test1)', 'def "$1" value "$2"', True);

  will return:  def "BLOCK" value "test1"

Replace ('BLOCK( test1)', 'def "$1" value "$2"', False)

  will return:  def "$1" value "$2"

Internally calls Exec[Next]

The overloaded version and ReplaceEx work with a callback function, so you can implement really complex functionality.

property SubExprMatchCount : integer; // ReadOnly

Number of subexpressions found in the last Exec* call.

If there are no subexpressions but the whole expression was found (Exec* returned True), then SubExprMatchCount=0; if neither subexpressions nor the whole r.e. were found (Exec* returned False), then SubExprMatchCount=-1.

Note that some subexpressions may not be found; for such subexpressions MatchPos=MatchLen=-1 and Match=''.

For example:

Expression := '(1)?2(3)?';
Exec ('123'): SubExprMatchCount=2, Match[0]='123', [1]='1', [2]='3'

Exec ('12'): SubExprMatchCount=1, Match[0]='12', [1]='1'

Exec ('23'): SubExprMatchCount=2, Match[0]='23', [1]='', [2]='3'

Exec ('2'): SubExprMatchCount=0, Match[0]='2'

Exec ('7') - return False: SubExprMatchCount=-1

property MatchPos [Idx : integer] : integer; // ReadOnly

Position of subexpression #Idx in the string tested by the last Exec* call. The first subexpression has Idx=1, the last has Idx=SubExprMatchCount, and the whole r.e. has Idx=0.

Returns -1 if the r.e. has no such subexpression or if this subexpression was not found in the input string.

property MatchLen [Idx : integer] : integer; // ReadOnly

Length of subexpression #Idx in the string tested by the last Exec* call. The first subexpression has Idx=1, the last has Idx=SubExprMatchCount, and the whole r.e. has Idx=0.

Returns -1 if the r.e. has no such subexpression or if this subexpression was not found in the input string.

property Match [Idx : integer] : string; // ReadOnly

Equivalent to Copy (InputString, MatchPos [Idx], MatchLen [Idx]).

Returns ‘’ if the r.e. has no such subexpression or if this subexpression was not found in the input string.
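The relationship between Match, MatchPos and MatchLen can be checked with a short sketch like the one below. The helper name CopyAgreesWithMatch and the sample expression are hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program MatchInfoDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  RegExpr; // assumed unit name (regexpr.pas)

// Hypothetical helper: runs AExpr on AText and verifies, for every
// subexpression that was found, that Match [i] equals the Copy-based slice.
function CopyAgreesWithMatch (const AExpr, AText : string) : boolean;
var
  r : TRegExpr;
  i : integer;
begin
  r := TRegExpr.Create;
  try
    r.Expression := AExpr;
    Result := r.Exec (AText);
    if Result then
      for i := 0 to r.SubExprMatchCount do
        if r.MatchPos [i] > 0 then // skip subexpressions that were not found
          if r.Match [i] <> Copy (r.InputString, r.MatchPos [i], r.MatchLen [i]) then
            Result := False;
  finally
    r.Free;
  end;
end;

begin
  WriteLn (CopyAgreesWithMatch ('(\w+)@(\w+)', 'mail: user@host'));
end.
```

Subexpressions with MatchPos = -1 are skipped because Copy is not meaningful for a negative position.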

function LastError : integer;

Returns the ID of the last error, or 0 if there were no errors (unusable if the Error method raises an exception), and resets the internal status to 0 (no errors).

function ErrorMsg (AErrorID : integer) : string; virtual;

Returns the error message for the error with ID = AErrorID.

property CompilerErrorPos : integer; // ReadOnly

Returns the position in the r.e. where the compiler stopped.

Useful for error diagnostics.

property SpaceChars : RegExprString

Contains the characters treated as \s (initially set to the RegExprSpaceChars global constant).

property WordChars : RegExprString;

Contains the characters treated as \w (initially set to the RegExprWordChars global constant).


property LineSeparators : RegExprString

Line separators (like \n in Unix), initially set to the RegExprLineSeparators global constant.

See also the notes about line separators.

property LinePairedSeparator : RegExprString

Paired line separator (like \r\n in DOS and Windows).

Must contain exactly two characters or no characters at all. Initially set to the RegExprLinePairedSeparator global constant.

See also the notes about line separators.

For example, if you need Unix-style behaviour, assign LineSeparators := #$a (newline character) and LinePairedSeparator := '' (empty string). If you want to accept only \x0D\x0A as a line separator, but not \x0D or \x0A alone, assign LineSeparators := '' (empty string) and LinePairedSeparator := #$d#$a.

By default, ‘mixed’ mode is used (defined in the RegExprLine[Paired]Separator[s] global constants):

LineSeparators := #$d#$a;
LinePairedSeparator := #$d#$a;

The behaviour of this mode is described in detail in the syntax section.

class function InvertCaseFunction  (const Ch : REChar) : REChar;

Converts Ch to upper case if it is in lower case, or to lower case if it is in upper case (uses the current system locale settings).

property InvertCase : TRegExprInvertCaseFunction;

Set this property if you want to override the case-insensitive functionality.

Create sets it to RegExprInvertCaseFunction (InvertCaseFunction by default).

procedure Compile;

[Re]compiles the r.e. Useful, for example, for GUI r.e. editors (to check the validity of all properties).

function Dump : string;

Dumps a compiled regexp in a vaguely comprehensible form.

Global constants

EscChar = ‘\’;  // ‘Escape’ char (‘\’ in common r.e.) used for escaping metachars (\w, \d etc).

 // It may be useful to redefine it if you are using C++ Builder, to avoid ugly constructions

 // like ‘\\w+\\\\\\w+\\.\\w+’: just define EscChar=‘/’ and use ‘/w+\/w+/./w+’

  Modifiers default values:

RegExprModifierI : boolean = False;  // TRegExpr.ModifierI
RegExprModifierR : boolean = True;   // TRegExpr.ModifierR
RegExprModifierS : boolean = True;   // TRegExpr.ModifierS
RegExprModifierG : boolean = True;   // TRegExpr.ModifierG
RegExprModifierM : boolean = False;  // TRegExpr.ModifierM
RegExprModifierX : boolean = False;  // TRegExpr.ModifierX


RegExprSpaceChars : RegExprString = ' '#$9#$A#$D#$C;

 // default for SpaceChars property


RegExprWordChars : RegExprString =


 + ‘abcdefghijklmnopqrstuvwxyz’


 // default value for WordChars property


RegExprLineSeparators : RegExprString =

  #$d#$a{$IFDEF UniCode}#$b#$c#$2028#$2029#$85{$ENDIF};

 // default value for LineSeparators property

RegExprLinePairedSeparator : RegExprString =

  #$d#$a;

 // default value for LinePairedSeparator property


RegExprInvertCaseFunction : TRegExprInvertCaseFunction = TRegExpr.InvertCaseFunction;

// default for InvertCase property

Useful global functions

function ExecRegExpr (const ARegExpr, AInputStr : string) : boolean;

Returns True if the string AInputStr matches the regular expression ARegExpr.

Note: raises an exception if ARegExpr contains syntax errors.

procedure SplitRegExpr (const ARegExpr, AInputStr : string; APieces : TStrings);

Splits AInputStr into APieces at occurrences of the r.e. ARegExpr.

function ReplaceRegExpr (const ARegExpr, AInputStr, AReplaceStr : string;
  AUseSubstitution : boolean = False) : string;

Returns AInputStr with r.e. occurrences replaced by AReplaceStr.

If AUseSubstitution is True, AReplaceStr is used as a template for the Substitute method.

For example:

ReplaceRegExpr ('({-i}block|var)\s*\(\s*([^ ]*)\s*\)\s*',
  'BLOCK( test1)', 'def "$1" value "$2"', True)

will return:  def "BLOCK" value "test1"

ReplaceRegExpr ('({-i}block|var)\s*\(\s*([^ ]*)\s*\)\s*',
  'BLOCK( test1)', 'def "$1" value "$2"')

will return:  def "$1" value "$2"

function QuoteRegExprMetaChars (const AStr : string) : string;

Replaces all metachars with their safe representation; for example, ‘abc$cd.(’ is converted into ‘abc\$cd\.\(’.

This function is useful for generating an r.e. automatically from user input.
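One possible use, sketched under assumptions: quoting user input before passing it to ExecRegExpr, so that the input is searched for literally. The helper name ContainsLiteral is hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program QuoteDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  RegExpr; // assumed unit name (regexpr.pas)

// Hypothetical helper: True if AHaystack contains ANeedle as literal text.
// ANeedle is assumed to be non-empty.
function ContainsLiteral (const ANeedle, AHaystack : string) : boolean;
begin
  // Without quoting, the '.' in 'a.c' would match any character.
  Result := ExecRegExpr (QuoteRegExprMetaChars (ANeedle), AHaystack);
end;

begin
  WriteLn (ContainsLiteral ('a.c', 'xa.cx')); // literal "a.c" is present
  WriteLn (ContainsLiteral ('a.c', 'abc'));   // only matches if '.' is a wildcard
end.
```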

function RegExprSubExpressions (const ARegExpr : string;
  ASubExprs : TStrings; AExtendedSyntax : boolean = False) : integer;

Builds a list of the subexpressions found in the r.e. ARegExpr.

In ASubExprs, every item represents a subexpression, from first to last, in the format:

 String - subexpression text (without ‘()’)

 low word of Object - starting position in ARegExpr, including ‘(’ if it exists (the first position is 1)

 high word of Object - length, including the starting ‘(’ and ending ‘)’ if they exist

AExtendedSyntax - must be True if modifier /x will be on when the r.e. is used.

Useful for GUI editors of r.e. etc. (you can find an example of its use in the TestRExp.dpr project).

Result code   Meaning

0             Success. No unbalanced brackets were found.
-1            There are not enough closing brackets ‘)’.
-(n+1)        At position n, an opening ‘[’ was found without a corresponding closing ‘]’.
n             At position n, a closing bracket ‘)’ was found without a corresponding opening ‘(’.

If Result <> 0, then ASubExprs can contain empty or illegal items.
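A rough sketch of decoding the position and length packed into each item's Object is shown below. The helper name ShowSubExpressions and the TPointerInt alias are hypothetical; the integer type used for the pointer cast (PtrUInt in Free Pascal, NativeUInt in Delphi) is an assumption about the compiler, not part of TRegExpr, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program SubExprDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, RegExpr; // RegExpr: assumed unit name (regexpr.pas)

type
  // Assumption: a pointer-sized unsigned integer type exists under this name.
  {$IFDEF FPC}
  TPointerInt = PtrUInt;
  {$ELSE}
  TPointerInt = NativeUInt;
  {$ENDIF}

// Hypothetical helper: lists the items found in AExpr and returns the
// result code of RegExprSubExpressions.
function ShowSubExpressions (const AExpr : string) : integer;
var
  Items : TStrings;
  Info : TPointerInt;
  i : integer;
begin
  Items := TStringList.Create;
  try
    Result := RegExprSubExpressions (AExpr, Items);
    if Result = 0 then
      for i := 0 to Items.Count - 1 do begin
        Info := TPointerInt (Items.Objects [i]);
        WriteLn (Items [i],
          ' start=', Info and $FFFF,            // low word: starting position
          ' length=', (Info shr 16) and $FFFF); // high word: length
      end;
  finally
    Items.Free;
  end;
end;

begin
  WriteLn ('result code: ', ShowSubExpressions ('(a+)(b|c)'));
end.
```

The items are only printed when the result code is 0, since otherwise the list may contain empty or illegal entries.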

Exception type

The default error handler of TRegExpr raises an exception:


ERegExpr = class (Exception)
  public
   ErrorCode : integer; // error code. Compilation error codes are below 1000
   CompilerErrorPos : integer; // position in r.e. where the compilation error occurred
 end;
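A minimal sketch of catching this exception, for example to validate an expression typed by a user, might look as follows. The helper name TryCompile is hypothetical, and the unit name RegExpr is assumed from the regexpr.pas file name.

```pascal
program ErrorDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, RegExpr; // RegExpr: assumed unit name (regexpr.pas)

// Hypothetical helper: returns True if AExpr compiles; otherwise returns
// False and reports where the compiler stopped in AErrorPos.
function TryCompile (const AExpr : string; out AErrorPos : integer) : boolean;
var
  r : TRegExpr;
begin
  AErrorPos := 0;
  Result := False;
  r := TRegExpr.Create;
  try
    r.Expression := AExpr;
    try
      r.Compile; // forces compilation so that syntax errors surface here
      Result := True;
    except
      on E : ERegExpr do
        AErrorPos := E.CompilerErrorPos;
    end;
  finally
    r.Free;
  end;
end;

var
  Pos : integer;
begin
  WriteLn (TryCompile ('(abc)', Pos));
  WriteLn (TryCompile ('(abc', Pos), ' stopped at ', Pos);
end.
```

This relies on the default Error behaviour; if Error is overridden so that it does not raise, LastError would have to be checked instead.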

How to use Unicode

TRegExpr now supports Unicode, but it works very slowly :(

Who wants to optimize it? ;)

Use it only if you really need Unicode support!

Remove the ‘.’ in {.$DEFINE UniCode} in regexpr.pas. After that, all strings will be treated as WideString.