PerlMonks  

Re: matching problem

by AnomalousMonk (Monsignor)
on Feb 25, 2013 at 15:31 UTC (#1020534)


in reply to matching problem

Balaton: Athanasius has replied:

... I think it’s highly unlikely that this line:
    if ($line =~ ModuleMatching::MatchLAC($locus_acc_no)) {
can be correct. But without knowing what  sub ModuleMatching::MatchLAC is supposed to do, it’s hard to give advice.

I agree that the definition of  ModuleMatching::MatchLAC() given in the OP is most likely some sort of dummy placeholder, but in any event, the given function can be explained as follows (this is for Balaton; I believe Athanasius understands all this quite clearly):

  • The function returns the result of a match of the passed argument  $_[0] against a literal regex, i.e., a regex having no interpolations (Update: my use of the term "literal regex" may be a bit off here);
  • The function is called in scalar context imposed by the  =~ operator, so the result of the match within the function is either 1 (successful match) or '' (the empty string; match failed);
  • The '' or 1 returned by the function is then converted to a regex (with 1 stringized to '1') and a match is made against  $line. If the match is against  /1/, the result is obvious: it succeeds only if  $line contains the character 1. If the match is against  // (the empty regex, created from the empty string), the result comes from a match against the last successfully matched regex or, if no match has previously succeeded, against the null regex, which matches anything. Straight from the docs: "If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. [...] If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match)."
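To make the second and third points concrete, here is a minimal sketch of what the function hands back in scalar context and how the true value then behaves as a pattern. The one-line MatchLAC below is only a stand-in with the same shape as the function under discussion; the variable names are mine.

```perl
use strict;
use warnings;

# Stand-in for the function under discussion: match the argument
# against a fixed pattern and return the result of that match.
sub MatchLAC { return $_[0] =~ /^X$/; }

# Scalar assignment supplies the same scalar context that =~ imposes.
my $hit  = MatchLAC('X');    # successful match
my $miss = MatchLAC('Y');    # failed match

printf "hit:  '%s'\n", $hit;                             # hit:  '1'
printf "miss: '%s' (length %d)\n", $miss, length $miss;  # miss: '' (length 0)

# On the right-hand side of =~ the true value is used as the pattern /1/.
print "'A1' =~ \$hit matches\n"        if     'A1' =~ $hit;
print "'A'  =~ \$hit does not match\n" unless 'A'  =~ $hit;
```

So a "successful" return value only ever tests whether the line contains a 1, which is almost certainly not what was wanted.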

The problematic results of these matches can be illustrated as follows:

    >perl -wMstrict -le
    "for my $line ('', 'X', 'Y') {
       for my $locus_acc_no ('', 'X', 'Y') {
         if ($line =~ MatchLAC($locus_acc_no)) {
           print qq{   match: '$line' =~ MatchLAC('$locus_acc_no')};
           }
         else {
           print qq{NO match: '$line' =~ MatchLAC('$locus_acc_no')};
           }
         }
       }
     ;;
     sub MatchLAC { return $_[0] =~ /^X$/; }
    "
       match: '' =~ MatchLAC('')
    NO match: '' =~ MatchLAC('X')
       match: '' =~ MatchLAC('Y')
       match: 'X' =~ MatchLAC('')
    NO match: 'X' =~ MatchLAC('X')
       match: 'X' =~ MatchLAC('Y')
       match: 'Y' =~ MatchLAC('')
    NO match: 'Y' =~ MatchLAC('X')
       match: 'Y' =~ MatchLAC('Y')
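Since we don't know what the real function is supposed to do, the following is only a guess at one possible intent: if the idea was for the function to supply a pattern for  $line to be matched against, it could return a compiled regex via  qr// rather than the result of a match. The name MatchLACPattern and the reuse of  /^X$/ are assumptions made for illustration, not anything from the OP.

```perl
use strict;
use warnings;

# Hypothetical variant: hand back a compiled pattern, not a match result.
sub MatchLACPattern { return qr/^X$/; }

# The outcome now depends only on $line and the returned pattern.
for my $line ('', 'X', 'Y') {
    my $verdict = $line =~ MatchLACPattern() ? '   match' : 'NO match';
    print "$verdict: '$line' =~ MatchLACPattern()\n";
}

# Collect the lines that matched.
my @matched = grep { $_ =~ MatchLACPattern() } ('', 'X', 'Y');
print "matched: @matched\n";    # matched: X
```

Because the returned pattern is never the empty string, the "last successfully matched regex" rule does not come into play here.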


Re^2: matching problem
by Athanasius (Monsignor) on Feb 26, 2013 at 04:13 UTC

    Actually, I missed the definition of sub MatchLAC in the OP; my reply was directed solely at Re^2: matching problem. Mea culpa.

    ++AnomalousMonk for the excellent exposition! I didn’t know that the empty regex acts as a stand-in for the regex most recently matched (whether the match was successful or not). Is this documented anywhere? I’ve been looking through perlre, etc., but the only mention of the empty regex I’ve found so far relates to its use with split, where it means “split the string into individual characters.”

    I guess overloading it makes sense, as a match that always succeeds isn’t much use. Are there any typical use cases for employing the empty regex to mean “repeat the regex used in the previous match”?

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

      ... excellent exposition!

      Thank you very much!

      ... the empty regex acts as a stand-in for the regex most recently matched (whether the match was successful or not).

      In looking for empty pattern documentation (see below), I discovered this is not the case: "If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead." (Strange what you can find when you actually read the docs!) Fixed my reply: thanks!

      Is this documented anywhere?

      The only place I've seen it is in perlop in the Regexp Quote-Like Operators section: the discussion of the  m// operator has a sub-section titled "The empty pattern //" (there's also a brief back-reference to it in the discussion of the  s/// operator).
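The documented behavior can be seen in isolation with a short sketch like the one below. The variable names are mine, and the  (?:) line is just one way I know of to write a pattern that is not an empty string but still matches anything.

```perl
use strict;
use warnings;

my $empty = '';    # a pattern string that happens to be empty

# Establish a "last successfully matched" regex.
'foo bar' =~ /bar/ or die "setup match failed";

# The empty pattern now stands in for /bar/.
my $reused_miss = ('baz'   =~ $empty) ? 1 : 0;   # 0: acts like 'baz'   =~ /bar/
my $reused_hit  = ('rebar' =~ $empty) ? 1 : 0;   # 1: acts like 'rebar' =~ /bar/

# A pattern that is not the empty string, yet matches any string.
my $always = ('baz' =~ /(?:)/) ? 1 : 0;          # 1

print "reused_miss=$reused_miss reused_hit=$reused_hit always=$always\n";
```

The failed match against 'baz' does not change which regex counts as the last successful one, which is the point of the correction above.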

      Are there any typical use cases for employing the empty regex to mean “repeat the regex used in the previous match”?

      My vague impression is that this evolved early on as an emulation of shell usage, or maybe from a desire for some kind of command-line one-liner shortcut facility: saves typing, y'know. Offhand, I can't come up with a compelling example.
