Strange findRegexp behaviour

"169.254.0.0".findRegexp("/d+");

doesn’t work but

"169.254.0.0".findRegexp("[0-9]+");

does. Is it normal?

Your slash is the wrong way round, and it seems there need to be two of them, since backslash is the escape character (I did not know this, and it seems a little odd).

"169.254.0.0".findRegexp("\\d+");

// an alternative
"169.254.0.0".findRegexp("[[:digit:]]+");

this is super strange and not as per documentation (“using Perl standard”)

https://perldoc.perl.org/perlre

/d+ is the perl expression :slight_smile:

and 2 backslashes would escape a single backslash, no?

this is a strange behaviour - I think (if not a bug) we’d need to add a translation table for the standard?

Ah, but it is in the Boost guide that the docs link to, Perl Regular Expression Syntax - 1.69.0, where it is \d+.

Yeah, I guess the second backslash (or are they forward, meh) is needed to escape the first, which makes sense given that SuperCollider parses the string before passing it to Boost. It just creates an odd syntax.

ok I’ll try to propose a PR to the helpfile to make it clearer if I find a way (and the time)

thanks

You have misread, the page you linked also says \d is the digit character class.

If I remember my logic classes, also is inclusive right? :smiley:

/d doesn’t work. \d doesn’t work. \\d works. I use regex and SC daily and this was a surprise. I pity the beginner, so I’ll try to make the helpfile helpful.

“also” here means, the SC, Perl, and Boost documentation are all correct, and you are mistaken. /d is a modifier or just a literal and \d is the digit character class. You can test this on many websites or in a Perl interpreter yourself. For instance:

perl -e 'print "1" =~ /\d/;' # prints "1"
perl -e 'print "1" =~ /\/d/;' # prints nothing

"\d+"
-> d+

So if you write \d into the string literal, you haven’t written a string containing backslash-d.

"\\d+"
-> \d+

The double backslash isn’t regex syntax. Within a string or 'symbol' literal, it’s the way to write a single backslash (because it’s necessary for SC syntax to disambiguate between backslash as an escape character and backslash as a real character in the resulting string).
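
A minimal sclang sketch of the two parsing steps described above, meant to be evaluated as one block in the IDE. The variable names are only for illustration, and the `.size` comments follow from the post-window output quoted earlier in the thread.

```supercollider
(
// Step 1: the sclang parser turns each string literal into characters.
var dropped = "\d+";    // the lone backslash is consumed by the parser: d+
var kept = "\\d+";      // the escaped backslash survives: \d+

dropped.size.postln;    // 2
kept.size.postln;       // 3

// Step 2: the regex engine only ever sees the resulting characters.
"169.254.0.0".findRegexp(dropped).postln;  // pattern d+ : there is no letter d to match
"169.254.0.0".findRegexp(kept).postln;     // pattern \d+ : matches the runs of digits
)
```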

It’s not even that odd, since it follows precedent from C and basically all C derivatives (Java, etc.).

It gets quite intricate in regular expression literals though. I might have gotten it right the first time on perhaps fewer than half a dozen occasions.
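
As a rough sketch of how quickly the escapes stack up, given that the string literal is parsed before the regex engine sees it: a character that is special to the regex engine needs a regex-level backslash, and each regex-level backslash costs two in sclang source. The `subject` variable is just an illustrative name.

```supercollider
(
var subject = "a\\b";   // three characters: a, backslash, b

// Regex \. (one literal dot) is written "\\." in sclang source.
"169.254.0.0".findRegexp("\\.").postln;

// Regex \\ (one literal backslash) is written "\\\\" in sclang source.
subject.findRegexp("\\\\").postln;
)
```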

hjh
