http://www.perlmonks.org?node_id=151889

benlaw has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
I don't understand this regular expression:
my ($label) = /\.([^\.]*)$/;
What does it mean? The data for the regular expression is:
foreach (qw/one one.five two two.one two.ten two.ten.12 three three.nine four five/)
Thanks! The full code is here:
http://benlaw2.topcities.com/code.txt

Re: About regular expression
by BeernuT (Pilgrim) on Mar 15, 2002 at 02:53 UTC
    Perhaps you should look over perl regular expressions and Tutorials (check out the few perlre tuts there).
    Here is the output from japhy's YAPE::Regex::Explain mod
    pattern: /\.([^\.]*)$/

    (?-imsx:        group, but do not capture (case-sensitive) (with ^ and $ matching normally) (with . not matching \n) (matching whitespace and # normally):
      \.            '.'
      (             group and capture to \1:
        [^\.]*      any character except: '\.' (0 or more times (matching the most amount possible))
      )             end of \1
      $             before an optional \n, and the end of the string
    )               end of grouping


    -bn
Re: About regular expression
by Trimbach (Curate) on Mar 15, 2002 at 02:49 UTC
    First, you're missing a ~ character. It should be: (Not necessary. There's an implicit match against $_. --geb)

    (code deleted)

    Second, let's break it down:

    my ($label) =~ /   # begin
        \.             # find a literal .
        (              # start capturing
        [^\.]          # Any character that isn't a \ or .
        *              # zero or more
        )              # stop capturing
        $              # only match from the end of the string
        /              # end;
    In a nutshell this will match the "extension" of the data you provided, i.e. the "five" in "one.five" or the "12" in "two.ten.12". What's matched is captured into $label.
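To see the "extension" behaviour on the data from the question, here is a minimal sketch. The `%label_for` hash is a hypothetical name used only for illustration, and the match is written without `=~`, as in the original code, so it runs against `$_`.

```perl
use strict;
use warnings;

# Sample data taken from the question above.
my @items = qw/one one.five two two.one two.ten two.ten.12 three three.nine four five/;

my %label_for;    # hypothetical name: item => captured label
foreach (@items) {
    # List-context match against $_. It returns the captured group,
    # or an empty list (leaving $label undef) when the item has no dot.
    my ($label) = /\.([^\.]*)$/;
    next unless defined $label;

    $label_for{$_} = $label;
    print "$_ => $label\n";
}
```

Items without a dot ("one", "two", and so on) are skipped, since the pattern requires a literal dot before the captured part.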

    There's lots of info on regexes both here and in the regular Perl documentation. They can be challenging at first, but are well worth the effort to figure out. :-D

    Gary Blackburn
    Trained Killer

    Edited

      Sorry, but I don't think there is a missing '~'. $label is assigned the text matched between the parens (everything after the dot). It's not being matched by the re.

      Aziz,,,

        Yeah, you're right. There's an implicit match against $_ there that I didn't see. I guess it just wasn't implied enough. :-D

        Gary Blackburn
        Trained Killer
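A small sketch of the implicit match discussed here, assuming `$_` holds one of the sample strings. It compares the implicit form, the explicit `$_ =~` form, and the same match in scalar context (no parentheses around the variable).

```perl
use strict;
use warnings;

$_ = 'two.ten.12';

# No =~ here: the pattern is matched against $_ implicitly.
my ($implicit) = /\.([^\.]*)$/;

# The same match with the target spelled out.
my ($explicit) = $_ =~ /\.([^\.]*)$/;

# Without the parentheses the match runs in scalar context and
# yields a true/false value instead of the captured text.
my $matched = /\.([^\.]*)$/;

print "implicit: $implicit\n";
print "explicit: $explicit\n";
print "scalar:   $matched\n";
```

The parentheses in `my ($label)` are what put the match in list context, which is why `$label` receives the capture rather than a success flag.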

      [^\.] # Any character that isn't a \ or .

      Not exactly. The backslash is used as an escape character, not only in the regex itself, but in character classes too. Even though it's not necessary at all to escape a dot in a character class, Perl removes the backslash itself. (Perl always does something when you use a backslash in an interpolated string, unlike some languages where "\q\n" is backslash, q, newline. In Perl, "\q\n" is q, newline.)

      [^\.] # Any character that is not .
      [^.]  # Any character that is not .
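One way to check this is a quick sketch with a hypothetical test string that has a literal backslash after the last dot. If `[^\.]` really excluded the backslash, the first match below would fail.

```perl
use strict;
use warnings;

# Hypothetical test string: the five characters a . b \ c
my $str = 'a.b\\c';

my ($escaped)   = $str =~ /\.([^\.]*)$/;   # dot escaped inside the class
my ($unescaped) = $str =~ /\.([^.]*)$/;    # dot left unescaped

print "escaped:   $escaped\n";
print "unescaped: $unescaped\n";
print "both classes captured the same text\n" if $escaped eq $unescaped;
```

Both patterns capture the text after the dot, backslash included. Excluding the backslash as well would take a class such as `[^\\.]`.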

      U28geW91IGNhbiBhbGwgcm90MTMgY
      W5kIHBhY2soKS4gQnV0IGRvIHlvdS
      ByZWNvZ25pc2UgQmFzZTY0IHdoZW4
      geW91IHNlZSBpdD8gIC0tIEp1ZXJk
      

Re: About regular expression
by abstracts (Hermit) on Mar 15, 2002 at 03:03 UTC
    Hello

    I'll try explaining what this code does. But first: about regular expressions.

    Without getting too technical, Regular Expressions are one way of describing a set of words. So, when we say

    if ($str =~ /some_re/){ do something ... }
    what we're really doing is asking if the string belongs to the set described by the regular expression.
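As a rough illustration of the "set of words" idea, the sketch below picks out the members of a word list that belong to the set described by `/\./` (words containing a dot). The list is the data from the question; `@in_set` is a hypothetical name.

```perl
use strict;
use warnings;

my @words = qw/one one.five two two.one two.ten two.ten.12 three three.nine four five/;

# grep keeps each word for which the match against $_ succeeds,
# i.e. each word that is in the set the pattern describes.
my @in_set = grep { /\./ } @words;

print "$_\n" for @in_set;
```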

    I won't (and can't) explain the whole of perl's regular expressions, but I'll explain enough for you to understand the code segment.

    • /A/
      describes all words that contain the capital letter A.
    • /abc/
      describes all words that contain the sequence 'abc' ("abc", "sjdhabcsd", "aaabc", ...).
    • /^abc/
      describes all words that begin with the sequence "abc" ("abc", "abcdefg", "abcccccc").
    • /abc$/
      describes all words that end with the sequence "abc" ("abc", "sjdhjabc", "aaabc").
    • /a*/
      describes all words that have zero or more of the letter 'a' ("", "abcj", or anything really).
    • /(ab)*/
      describes all words that have zero or more of the sequence 'ab' ("", "ab", "abab", "cdr", "tryabtry", ...).
    • /[abc]/
      contains any of 'a', 'b', or 'c'.
    • /[^abc]/
      none of these: for example
      /a[^bc]d/
      means 'a' followed by anything but 'b' or 'c', followed by a 'd'.
    • /./
      matches any single character (except a newline, by default), so
      /a.c/
      describes all words that contain 'a' followed by any one character, followed by a 'c' ("aac", "abc", "a c", ...). To match a dot '.', you need to escape it with a \.
    foreach (qw/one one.five two two.one two.ten two.ten.12 three three.nine four five/){
        print "$_\n";
        if($_ =~ /\./){    # if the word has a dot ".".
            # Keep everything after the last "." in the string
            my ($label) = /\.([^\.]*)$/;    # a dot, followed by zero
                                            # or more letters that are
                                            # not dots, attached to the
                                            # end of the string.
                                            # return those letters
                                            # since they're between
                                            # brackets.
            $tree->add($_, -text => $label ,    # add the label to $tree
                       -image => $mw->Getimage(folder));
        }
    }
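The patterns in the list above can be tried out with a small table-driven sketch like the following. The test strings and the `@checks` and `$failures` names are made up for illustration; the expected column records what the descriptions in the list predict.

```perl
use strict;
use warnings;

# Each entry: [ description, pattern, test string, expected (1 = match) ]
my @checks = (
    [ 'contains abc',     qr/abc/,     'sjdhabcsd', 1 ],
    [ 'begins with abc',  qr/^abc/,    'sjdhabcsd', 0 ],
    [ 'ends with abc',    qr/abc$/,    'aaabc',     1 ],
    [ 'zero or more a',   qr/a*/,      'xyz',       1 ],
    [ 'a, not b or c, d', qr/a[^bc]d/, 'abd',       0 ],
    [ 'a, not b or c, d', qr/a[^bc]d/, 'axd',       1 ],
    [ 'a, any char, c',   qr/a.c/,     'a c',       1 ],
    [ 'a, literal dot, c', qr/a\.c/,   'abc',       0 ],
    [ 'a, literal dot, c', qr/a\.c/,   'a.c',       1 ],
);

my $failures = 0;
for my $check (@checks) {
    my ($what, $re, $str, $expected) = @$check;
    my $got = ($str =~ $re) ? 1 : 0;
    $failures++ if $got != $expected;
    printf "%-18s %-10s %s\n", $what, "'$str'", $got ? 'match' : 'no match';
}
print "unexpected results: $failures\n";
```

Note that `/a*/` matches "xyz" because zero occurrences of 'a' is still a match, which is the "or anything really" case from the list.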
    Hope this helped

    Aziz,,,

    Update: Thanks blakem for pointing out the need for escapes for \] in <pre> tags.

Re: About regular expression
by Anonymous Monk on Mar 15, 2002 at 03:24 UTC
    Thx for all ! it 's v.clear ! ^^"
Re: About regular expression
by trammell (Priest) on Mar 15, 2002 at 20:48 UTC
    Note that escaping the period in the character class is not necessary; [^\.] is the same as [^.] .