RegExReplace - missing parameter, no backreference, issue related to VarRef

Issue #76 closed
Winter Laite created an issue

ISSUE

RegExReplace does not function as documented here.

Test code, more or less verbatim from link:

; ┌─────────────────────────────────────────────────────────────────────────┐
;   #1  Reports "abc123xyz" because the $ allows a match only at the end.  
; └─────────────────────────────────────────────────────────────────────────┘

MsgBox(RegExReplace("abc123123", "123$", "xyz"))

; ┌──────────────────────────────────────────────────────────────────────────────────┐
;   #2 Reports "123" because a match was achieved via the case-insensitive option.  
; └──────────────────────────────────────────────────────────────────────────────────┘

MsgBox(RegExReplace("abc123", "i)^ABC"))

; ┌────────────────────────────────────────────────────────────┐
;   #3 Reports "aaaXYZzzz" by means of the $1 backreference.  
; └────────────────────────────────────────────────────────────┘

MsgBox(RegExReplace("abcXYZ123", "abc(.*)123", "aaa$1zzz"))

; ┌─────────────────────────────────────────────────────────────────┐
;   #4  Reports an empty string and stores 2 in ReplacementCount.  
; └─────────────────────────────────────────────────────────────────┘

; MsgBox(RegExReplace("abc123abc456", "abc\d+", "", &ReplacementCount))

The comments are taken directly from the examples at the linked page.

#1 and #2 work correctly.

#3 reports ‘aaa$1zzz’; apparently backreferences are not supported at the moment.

#4 is commented out because it will not compile, which was expected given the VarRef ('&'). With the '&' removed, compilation still fails, as there is no apparent provision for 'ReplacementCount'.

Comments (7)

  1. Matt Feemster repo owner

    I’ve made a commit which probably fixes some of this, but not all of it. So let’s take it a step at a time.

    Because we are not supporting reference parameters, RegExReplace() will return an object which will implicitly convert to a bool, indicating success.

    It also has a member named OutputVarCount, to make up for the reference parameter we’ve removed.

    See the class RegExResults in the file Strings.cs for details.

    Note: StrReplace returns a similar class, ReplaceResults, which is also in Strings.cs.

    See the unit test file string-regex.ahk to get an idea of which tests already exist and see how yours fit in there.

    As for backreferences, my apologies for not noting this earlier. AHK uses the Perl Compatible Regular Expressions (PCRE) library.

    For Keysharp, we are using the built-in C# regex library. Backreferences are handled, but use a slightly different syntax. Please see the “Backreference Constructs” section midway down the page here: https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

  2. Winter Laite reporter

    Thank you, I’m working on this but having trouble. Here’s something simple I thought would work after reviewing the link above.

    MsgBox(RegExReplace("abcXYZ123", "abc(.*)123", "aaa\1zzz"), "Should be aaaXYZzzz")
    

    But the result is just ‘aaa\1zzz’.

    Combining RegExReplace with RegExMatch works, but is REALLY tortured. And just seems wrong. 😀

    MsgBox(RegExReplace("abcXYZ123", "abc(.*)123", "aaa" RegExMatch("abcXYZ123", "abc(.*)123")[1] "zzz"), "Should be aaaXYZzzz")
    

  3. Matt Feemster repo owner

    I’ve committed a fix, please test.

    Backreferences now work using the standard AHK syntax with $1.

    Also of note, instead of the reference parameter on the last example, retrieve the member “OutputVarCount” from the returned object.

    I’ve added a unit test for RegExReplace() with all four of these examples in it.
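
    A minimal sketch of that usage, assuming the object returned by RegExReplace() can be stored in a variable and then passed to MsgBox() the same way the direct calls in this issue are (the variable names result and count are illustrative only):

    ```autohotkey
    ; Call RegExReplace() once and keep the returned object.
    result := RegExReplace("abc123abc456", "abc\d+", "")

    ; Read the replacement count from the member instead of a reference parameter.
    ; Example #4 in this issue expects this to be 2.
    count := result.OutputVarCount

    ; Show the replaced text (expected to be empty), with the count as the title.
    MsgBox(result, count)
    ```

    Keeping the returned object means the replacement only has to run once to get both the text and the count.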

  4. Winter Laite reporter

    Following latest commit, the following code works. Marking resolved and closing.

    ; ┌─────────────────────────────────────────────────────────────────────────┐
    ;   #1  Reports "abc123xyz" because the $ allows a match only at the end.  
    ; └─────────────────────────────────────────────────────────────────────────┘
    
    MsgBox(RegExReplace("abc123123", "123$", "xyz"))
    
    ; ┌──────────────────────────────────────────────────────────────────────────────────┐
    ;   #2 Reports "123" because a match was achieved via the case-insensitive option.  
    ; └──────────────────────────────────────────────────────────────────────────────────┘
    
    MsgBox(RegExReplace("abc123", "i)^ABC"))
    
    ; ┌────────────────────────────────────────────────────────────┐
    ;   #3 Reports "aaaXYZzzz" by means of the $1 backreference.  
    ; └────────────────────────────────────────────────────────────┘
    
    MsgBox(RegExReplace("abcXYZ123", "abc(.*)123", "aaa$1zzz"))
    
    
    ; ┌──────────────────────────────────────────────────────────┐
    ;   #4 Reports an empty string - OutputVarCount contains 2  
    ; └──────────────────────────────────────────────────────────┘
    
    count := RegExReplace("abc123abc456", "abc\d+", "").OutputVarCount
    MsgBox(RegExReplace("abc123abc456", "abc\d+", ""), count)
    

    Thank you!
