
Re: [perl #126185] /(?-p)/ should be an error

From: demerphq <deme...@gmail.com>
Thu, 8 Oct 2015 20:20:06 +0200
On 7 October 2015 at 23:36, Karl Williamson <pub...@khwilliamson.com> wrote:
> On 10/07/2015 03:13 PM, Abigail wrote:
>> On Wed, Oct 07, 2015 at 10:02:22PM +0200, Victor ADAM wrote:
>>>> I think that the consequences are mild, so it was the right design
>>>> decision to be a warning.
>>>
>>> In that case we should fix all the equivalent but inconsistent errors:
>>> /(?-a)/ dies with “Regexp modifier "a" may not appear after the "-" in
>>> regex; marked by <-- HERE in m/(?-a <-- HERE )/ at -e line 1.”
>>>
>>> Unless I’m missing something, the two are essentially the same, and
>>> should thus have the same behavior. Having /(?-a)/ die but /(?-p)/
>>> warn doesn’t sound like a “right design decision” to me.
>>
>> Yeah, but we have to be practical. /a is fairly new, and /(?-a)/ has
>> died since /a was introduced. /(?-p)/ has been allowed for some time.
>> We don't know whether there's code out there which has it. Or how much.
>> [1]
>> Considering that it's pretty harmless, I don't think dying is appropriate.
>> Perhaps we should not even add a warning now. Now, if /p was new, I
>> would not object to it dying.
>>
>> What's the gain if /(?-p)/ started dying?
>>
>> [1]  If there's a lot of code out there which uses /(?-p)/, then making
>>       it die (or even warn) now hurts a lot of people. If there isn't a
>>       lot of code out there which uses it, then it hardly matters what
>>       is done (die, warn, nothing).
>>
>> Abigail
>
> To be sure, there is some subjectivity to this whole thing, and reasonable
> people can come to different conclusions.
>
> I agree with Abigail, but also, it's obvious what -p would mean, because p
> is a boolean flag.  But it's not obvious what -a would mean.

While I realize Victor already agreed that this ticket can be
closed[1] I wanted to add some comments.

(?p) is a bit of an odd case as far as regex flags go.  In some ways
it is the platypus of the regex modifiers. :-)

Unlike most of the other modifiers it shouldn't exist at all (arguably
like /o), and that it does is purely to work around performance issues
in our implementation. Thus its presence or absence shouldn't change
anything. If we reworked other bits of the implementation it could be
made a complete no-op, and ${^MATCH} and friends would just be an
always available long form for $& and friends, or perhaps deprecated
outright in some future version.

Furthermore regex modifier flags are divided into two groups, parse
modifiers and execution modifiers. Parse modifiers are used to change
the meaning of parts of the pattern, such as (?i) making the following
text case-insensitive, so they must be embeddable. Execution modifiers
are things like /g which control how the pattern will be executed, and
are not embeddable for the obvious reason that it makes no sense to
have the pattern control something like the /g flag.

On the face of things, the /p flag is an execution modifier like /g, which
would then argue that it should not be embeddable. *BUT*, unlike the
other execution modifiers it has a side effect that is visible inside
of (?{ ... }) and (??{ ... }), which means it must be embeddable, or
it would not be possible to use ${^MATCH} and friends inside of things
like (??{ }).  For instance, imagine injecting a qr// object with
appropriate behaviour into a library that accepts a pattern as a
debugging aid, like in the following toy example:

perl -le'sub do_match { my ($string,$pattern)= @_; my $count=0; while
($string=~/$pattern/gc) { $count++ } return $count} my $qr= qr/fo+(?{
print "${^PREMATCH}|${^MATCH}|${^POSTMATCH}" })[ab]/p;
do_match("foofooooomfooob",$qr)'
|foo|fooooomfooob
|fo|ofooooomfooob
foo|fooooo|mfooob
foo|foooo|omfooob
foo|fooo|oomfooob
foo|foo|ooomfooob
foo|fo|oooomfooob
foofooooom|fooo|b

Once an execution flag becomes embeddable you have to decide what to
do about the "minus case", and because it did no harm I chose to warn,
not die, if someone used it.  In hindsight I think I was wrong to do
so. Depending on what you think we should do in the future when/if the
preserve flag becomes unnecessary, it should either die, or be a
silent no-op. But that is the "deprecation debate" in disguise and is
better left, I think, to a different thread.

Whatever we choose to do about (?-p), it should be because it makes
sense in context with (?p), and not /just/ to be consistent with what
(?-a) does. (?a) is a parse modifier, and it is not obvious that (?-a)
is senseless[1]. Perhaps in the future we will give it meaning
(however unlikely), which would then change the semantics of the
match. So making it die now keeps that option open for the future.

Cheers,
Yves
[1] I could imagine /(?a)blah(?-a)foo/ to mean the same thing as
/(?a:blah)foo/, but it is not clear what /(?a)(?:foo|(?-a)blah)foo/
should do. None of this should be taken as me disagreeing with Karl
here, I am just trying to say that to me (?-a) /could/ have a sane
meaning given what (?a) means, but that to me (?-p) can't have a sane
meaning given what (?p) means.


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
