
BernardBOROWSKI-2444 asked

Unexpected regular expression groups capture

Hello,

While trying to get subcaptures with the following string and regular expression, I was surprised by the result. Can someone explain it to me? Is the NFA engine doing a bad job?

I'm working in PowerShell, so I assume this uses the .NET Framework on Windows.

My string:

 "p1  p2  p3,'   ''  ',   p4"

I put it between double quotes because it includes several pairs of apostrophes: one pair with 3 spaces in between, immediately followed by a second pair with 2 spaces inside.

This string wholly matches the following regular expression:

 ^(\w*)\s+(\w+)(([^' ]*)|('[^']*')|\s?)*$

Indeed:

 "p1  p2  p3,'   ''  ',   p4" -match "^(\w*)\s+(\w+)(([^' ]*)|('[^']*')|\s?)*$"

results in:

 True

But:

 $matches

gives:

 Name                           Value
 ----                           -----
 5                              '  '
 4
 3
 2                              p2
 1                              p1
 0                              p1  p2  p3,'   ''  ',   p4

I would have expected much more, namely the following (not listing the first-level group that contains the alternative second-level groups):

p1
p2
p3,
'   '
'  '
,
p4

Why are "p3,", "'   '", "'  '" and "," not captured?


 "p1  p2  p3,'   ''  ',   p4" -match "^(\w*)\s+(\w+)(?:([^' ]*)|('[^']*')|\s?)*$"

that is, the same expression with a non-capturing first-level group, does not give a better result.
It simply gives one fewer (empty) submatch, which is, as expected, the first-level group, I guess.

 Name                           Value
 ----                           -----
 4                              '  '
 3
 2                              p2
 1                              p1
 0                              p1  p2  p3,'   ''  ',   p4

Am I missing the equivalent of the "global" ECMAScript flag of regexp objects?
However, I get the same result using JavaScript.
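For reference, the closest .NET counterpart to a "global" search is probably [regex]::Matches, which returns every non-overlapping match instead of only the first one. The lines below are only a sketch under that assumption: the token pattern is a hypothetical simplification and not the regular expression from the question, and the variable names are made up for illustration.

```powershell
# Sketch only: $tokenPattern is an assumed, simplified pattern, not the original regex.
$text = "p1  p2  p3,'   ''  ',   p4"

# A token is either a quoted run ('...') or a run of characters
# that are neither an apostrophe nor a space.
$tokenPattern = "'[^']*'|[^' ]+"

# [regex]::Matches scans the whole string and returns all non-overlapping matches.
$tokens = [regex]::Matches($text, $tokenPattern) | ForEach-Object { $_.Value }

# Print each token between brackets so that the spaces stay visible.
$tokens | ForEach-Object { "[$_]" }
```

With this input, the tokens should be the seven items listed in the expected result above. Unlike the anchored expression, this approach does not check that the string as a whole has the expected shape.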

So what's wrong?

Thanks for the help.

dotnet-runtime dotnet-standard

Bruce-SqlWork answered

Use this website and enter your values. It gives a good explanation of your matches:

https://regex101.com/


BernardBOROWSKI-2444 answered

Thanks for pointing me to the website. However, it was not as useful as the information I was missing, which is the "Grouping constructs in regular expressions" article on the .NET reference site.

It states the rule that applies when a quantifier is applied to a capturing group: the corresponding match item holds only the last capture. It also shows how to get the "previous" captures for that group.
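To illustrate that rule, here is a minimal sketch using the string and expression from my question. It assumes the usual .NET behaviour where Groups[n].Value returns the last capture of a quantified group and Groups[n].Captures keeps all of them. The variable names are only for illustration.

```powershell
$text    = "p1  p2  p3,'   ''  ',   p4"
$pattern = "^(\w*)\s+(\w+)(([^' ]*)|('[^']*')|\s?)*$"

# Use the Regex class directly so that the full Match object is available
# (the -match operator only fills $matches with the last capture per group).
$m = [regex]::Match($text, $pattern)

# Value holds only the last capture of each group.
"Group 4 value: [$($m.Groups[4].Value)]"
"Group 5 value: [$($m.Groups[5].Value)]"

# Captures keeps every capture the group made while the quantifier repeated it.
foreach ($n in 4, 5) {
    foreach ($c in $m.Groups[$n].Captures) {
        "Group $n capture at index $($c.Index): [$($c.Value)]"
    }
}
```

Since [^' ]* can also match an empty string, the Captures collection of group 4 may contain empty entries. If that is the case, they can be filtered out, for example with Where-Object { $_.Length -gt 0 }.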
