PerlMonks  

How do I get what is to the left of my match?

(#33972: categorized question)
Contributed by NotProud on Sep 26, 2000 at 05:08 UTC


Description:

Hm. I cannot get the left to show up. Any ideas?

    "left center right" =~ /center/;
    print "<br>Left: <$`>\n";
    print "<br>Match: $&\n";
    print "<br>Right: <$'>\n";

Thanks

Answer: How do I get what is to the left of my match?
contributed by tye

Note that if you mention $`, $&, or $' anywhere in your code, even once, then every regular expression in that run of Perl becomes noticeably slower. Their use is therefore strongly discouraged in code that might be reused or where performance is important. The reason is that once any of them appears, each regex must copy the matched string on every match, even though most of those copies will never be used: Perl can't predict when the variables might be read, so it must always make the copies.

But the latest version of Perl adds an alternate way to get this type of information, @- and @+. Here is a sample of how to use them:

    ( my $str = "left center right" ) =~ /center/;

    # $#- and $#+ are both 0 here because the pattern has no capture groups,
    # so every index below refers to the whole match.
    print "\nLeft: <",   substr( $str, 0, $-[0] ),
          ">\nMatch: <", substr( $str, $-[$#-], $+[$#-] - $-[$#-] ),
          ">\nRight: <", substr( $str, $+[$#+] ),
          ">\n";
    __END__
    This prints:

    Left: <left >
    Match: <center>
    Right: < right>
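The sample above indexes with $#-, which is 0 only because the pattern has no capture groups. Here is a minimal sketch that always uses index 0 (the whole match), so it should also work when the pattern contains groups. The helper name match_context is made up for illustration; it is not part of Perl or of any module.

```perl
use strict;
use warnings;

# match_context() is a hypothetical helper, not a Perl built-in.
# It returns (prematch, match, postmatch), or an empty list if there is no match.
sub match_context {
    my ( $str, $re ) = @_;
    return unless $str =~ $re;

    # $-[0] and $+[0] are the start and end offsets of the whole match,
    # no matter how many capture groups the pattern contains.
    return (
        substr( $str, 0, $-[0] ),
        substr( $str, $-[0], $+[0] - $-[0] ),
        substr( $str, $+[0] ),
    );
}

# The pattern has a capture group on purpose.
my ( $left, $match, $right )
    = match_context( 'left center right', qr/c(en)ter/ );

print "Left: <$left>\nMatch: <$match>\nRight: <$right>\n";
```

Because the offsets are read immediately after the match inside the helper, a later regex elsewhere cannot overwrite them before they are used.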

At the time of this writing, perlvar isn't recent enough to mention @- and @+. But if your version of Perl is recent enough to have @- and @+ (Perl 5.6.0 or later), then your perlvar.pod will also include documentation on them.

If you can't find perlvar.pod then enter the command perldoc perlvar or cd to your perl lib directory and there should be a pod directory that contains that file. These pod files contain some simple "mark-up" codes but are designed to be easy for humans to read. (You can read perlpod.pod for more information on the mark-up language.)

        - tye
Answer: How do I get what is to the left of my match?
contributed by fundflow

It works for me. The following should give you the same result and seems more readable:

    $_ = "left center right";
    m/(.*)(center)(.*)/;
    print "$1:$2:$3\n";

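One thing to watch with the capture-group approach: the leading (.*) is greedy, so if the matched text occurs more than once, the "left" part runs up to the last occurrence, whereas $` stops at the first. A small sketch of the difference, using a made-up test string:

```perl
use strict;
use warnings;

# Hypothetical input in which "center" occurs twice.
my $str = 'left center middle center right';

# Greedy: (.*) grabs as much as it can, so it ends at the LAST "center".
my ($greedy_left) = $str =~ /(.*)(center)(.*)/s;

# Non-greedy: (.*?) stops at the FIRST "center", which is what $` would give.
my ($lazy_left) = $str =~ /(.*?)(center)(.*)/s;

print "greedy: <$greedy_left>\n";
print "lazy:   <$lazy_left>\n";
```

The /s modifier is added here so that the dots also match newlines; with a single-line string like the one in the question it makes no difference.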
Answer: How do I get what is to the left of my match?
contributed by lima1

The CPAN module Regexp::MatchContext was written for this task. Adapted from the module's SYNOPSIS:

    use Regexp::MatchContext -vars;

    $str =~ m/(?p) \d+ /x;

    print "Before: $PREMATCH\n";
    print "Matched: $MATCH\n";
    print "After: $POSTMATCH\n";

    $MATCH = 2 * $MATCH;    # substitute into original $str

Note that this and the previous solutions are significantly slower than using the match variables $`, $& and $'. However, as tye mentioned, these variables will slow down EVERY other regular expression without capturing parentheses.

The following benchmark, which searches for a short DNA sequence (11 base pairs) in a 2000 bp DNA sequence, compares all four solutions:

                  Rate     regex   context  at_minus matchvars
    regex      17271/s        --      -22%      -66%      -84%
    context    22239/s       29%        --      -56%      -79%
    at_minus   50420/s      192%      127%        --      -53%
    matchvars 107527/s      523%      384%      113%        --

Note that this benchmark uses match variables and thus slows down all four solutions. The results without the match variable solution are:

                Rate    regex  context at_minus
    regex    17544/s       --     -24%     -69%
    context  23112/s      32%       --     -60%
    at_minus 57361/s     227%     148%       --
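
Not covered by the benchmark above: Perl 5.10 and later also provide the /p modifier together with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, which are documented in perlvar as equivalents of $`, $& and $' that are only guaranteed to be set for matches run with /p. A minimal sketch, assuming Perl 5.10 or later; it was not part of the measurements here, so no speed claim is made for it:

```perl
use strict;
use warnings;
use 5.010;    # /p and the ${^...} variables need Perl 5.10 or later

my $str = 'left center right';
my ( $left, $match, $right );

# /p asks Perl to preserve the matched string for this one regex only.
if ( $str =~ /center/p ) {
    ( $left, $match, $right ) = ( ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} );
    print "Left: <$left>\nMatch: <$match>\nRight: <$right>\n";
}
```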

Appendix: Source code of the benchmark

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(:all);
    use Regexp::MatchContext;

    my $count = 300000;

    # to test that all solutions produce the same output
    my $VERBOSE = 0;
    $count = 2 if $VERBOSE;

    # one continuous sequence, split across lines here for readability
    my $seq = join '', qw(
        GGGTTGAAGTTTAGACCGCTCACAGTAGTTCTACCTATAGAAAAGATCATGAAAGAGGCGATC
        AGAATGGTACTCGAATCCATTTACGATCCCGAGTTTCCAGACACATCGCATTTCCGCTCGGGTCAAGGC
        TGCCACTCGGTCCTAAGACGGATCAAAGAAGAGTGGGGAATCTCTCGCTGGTTTTTAGAATTCGACATC
        AGGAAGTGTTTTCACACCATCGACCGACATCGACTCATCCAAATTTTGAAGGAAGAGATCGACGATCCC
        AAGTTCTTTTACTCCATTCAGAAAGTATTTTCCGCCGGACGACTCGTAGGAGTTGAGAGGGGCCCTTAC
        TCCGTCCCACACAGTGTACTACTATCGGCCCTACCAGGCAACATCTACCTACACAAGCTCGATCAGGAG
        ATAGGGAGGATCCGACAGAAGTACGAAATTCCGATTGTTCAGAGAGTCAGATCGGTTCTATTAAGGACA
        GGTCGTCGTATTGATGACCAAGAAAACCCTGGAGAAGAAGCAAGCTTCAACGCTCCCCAAGACAACAGA
        GCCATCATTGTGGGGAGCGTTAAGAGCATGCAACGCAAAGCGGCCTTTCATTCCCTTGTTTCGTCGTGG
        CACACCCCCCCCACAAGCACCCTCCGGCTCAGGGGGGACCAGAAAAGGCCTTTCGTTTTCCCCCCTTCG
        TCGGCCCTTGCCGTCTTCCTTAACAAGCCCTCGAGCCTTCTTTGCGCCGCCTTCCTCATAGAAGCCGCC
        GGGTTGACCCCGAAGGCTGAATTCTATGGTGGAGAACGCTGTAATAATAATTGGGCCATGAGAGACCTT
        CTTAAGTATTGCAAAAGAAAGGGCCTGCTGATAGAGCTGGGCGGGGAGGCGATACTAGTTATCAGGTCA
        GAGAGAGGCCTGGCCCGTAAGCAGGCCCCCTTAAAAACCCATTACTTAATAAGGATTTGTTACGCGCGA
        TATGCCGACGACTTACTACTGGGAATCGTGGGTGCCGTAGAGCTTCTCATAGAAATACAAAAACGTATC
        GCCCATTTCCTACAATCTGGCCTGAACCTTTGGGTAGGCTCCGCAGGATCAACAACAATAGCTGCACGG
        AGTACGGTAGAATTCCTTGGTACGGTCATTCGGGAAGTCCCTCCGAGGACGACTCCCATACAATTTTTG
        CGAGAGCTGGAAAAGCGTCTACGGGTAAAGCACCGTATCCATATAACTGCTTGCCACCTACGCTCCGCC
        ATCCATTCAAAGTTTAGGAACCTAGGTGATAGTATCCCGATCAAACAGCTGACGAAGGGGATGAGCAAA
        ACAGGGAGTCTACAGGACGGGGTTCAACTAGCGGAGACTCTTGGAACAGCTGGAGTCAGAAGTCCCCAA
        GTTAGCGTATTATGGGGGACCGTCAAGCACATCCGGCAAGGATCAAGGGGGATCTCGTTCTTGCATAGC
        TCAGGTCGGAGCAACGCGTCATCGGACGTTCAACAGGTAGTCTCACGATCGGGCACTCATGCCCGTAAG
        TTGTCATTGTATACTCCCCCGGGTCGGAAGGCGGCGGGGGAGGGAGGAGGACACTGGGCGGGATCTATC
        AGCAGCGAATTCCCCATAAAGATAGAGGCACCTATAAAAAAGATACTCCGAAGGCTTCGGGATCGAGGT
        ATCATTAGCCGAAGAAGACCCTGGCCAATCCACGTGGCCTGTTTGACGAACGTCAGCGACGAAGACATC
        GTAAATTGGTCCGCGGGCATCGCGATAAGTCCTCTGTCCTACTACAGGTGCCGCGACAACCTTTATCAA
        GTCCGAACGATTGTCGACCACCAGATTCGCTGGTCTGCAATATTCACCCTAGCCCACAAGCACAAATCC
        TCGGCGCCGAATATAATCCTCAAGTACTCCAAAGACTCAAATATTGTAAATCAAGAAGGTGGCAAGATC
        CTTGCAGAGTTCCCCAACAGCATAGAGCTTGGGAAGCTCGGACCCGGTCAAGACCTGAACAAGAAGGAA
        CACTCAACTACTAGTCTAGTCTAG
    );

    cmpthese(
        $count,
        {
            'regex' => sub {
                my ( $prematch, $match, $postmatch )
                    = $seq =~ m{(\A .*?) (CTGGCCCGTAA) (.*\z) }xms;
                warn "$prematch $match $postmatch" if $VERBOSE;
            },
            'matchvars' => sub {
                $seq =~ m{CTGGCCCGTAA}xms;
                my ( $prematch, $match, $postmatch ) = ( $`, $&, $' );
                warn "$prematch $match $postmatch" if $VERBOSE;
            },
            'context' => sub {
                $seq =~ m{(?p)CTGGCCCGTAA}xms;
                my ( $prematch, $match, $postmatch )
                    = ( PREMATCH(), MATCH(), POSTMATCH() );
                warn "$prematch $match $postmatch" if $VERBOSE;
            },
            'at_minus' => sub {
                $seq =~ m{CTGGCCCGTAA}xms;
                my $prematch  = substr( $seq, 0, $-[0] );
                my $match     = substr( $seq, $-[$#-], $+[$#-] - $-[$#-] );
                my $postmatch = substr( $seq, $+[$#+] );
                warn "$prematch $match $postmatch" if $VERBOSE;
            },
        }
    );
