PerlMonks  

regular expression not matching

by ghosh123 (Monk)
on Mar 13, 2013 at 06:25 UTC ( #1023116=perlquestion )
ghosh123 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I am parsing the following file:

Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)

But the below code is not able to match when there is a '_' or '-' in the string just after "Users of". For example 'Fusion' is matching but not 'Galaxy-AdvTech' and 'Enterprise_VO' etc.

while(<FILE>) {
    if(/^Users of ([\w+\_\-]):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/) {
        print "$1 |$2|$3 \n";
    }
}

Re: regular expression not matching
by Anonymous Monk on Mar 13, 2013 at 06:34 UTC

    So you want to figure out why, is that your question?

    Can you explain what  ([\w+\_\-]) means?

      Oops, I am sorry, I pasted the wrong code.
      I have actually been using the line below. It worked fine as long as there was no '_' or '-' in the name. How should it be changed to handle '_' and '-'?

      if(/^Users of (\w+):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/)

Re: regular expression not matching
by kcott (Abbot) on Mar 13, 2013 at 07:42 UTC

    G'day ghosh123,

    "But the below code is not able to match when there is a '_' or '-' in the string just after "Users of". For example 'Fusion' is matching but not 'Galaxy-AdvTech' and 'Enterprise_VO' etc."

    Rubbish! The code you posted isn't matching any of your sample data:

    $ perl -Mstrict -Mwarnings -e '
        while (<>) {
            if(/^Users of ([\w+\_\-]):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/ ) {
                print "$1 |$2|$3 \n";
            }
            else {
                print "No match!\n";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    No match!
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    No match!

    This is because, firstly, [\w+\_\-] matches only a single character, which will be one of: a word character (alphanumeric, underscore [and others - see perlrecharclass - Backslash sequences]), a plus sign, or a hyphen. Both backslashes are unnecessary, and the underscore is already covered by \w; [\w-]+ is what you want here. Secondly, the first instance of licenses does not match the word "license" (as in "1 license issued"); licenses? is what you want here. Putting all that together:

    $ perl -Mstrict -Mwarnings -e '
        while (<>) {
            if(/^Users of ([\w-]+):\s+\(Total of ([0-9]+) licenses? issued;\s+Total of ([0-9]+) (licenses|license) in use/) {
                print "$1 |$2|$3 \n";
            }
            else {
                print "No match!\n";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    Enterprise_VO |1|0
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    Fusion |4|0
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvCTS |5|0
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvTech |5|0
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    Galaxy-Common |30|1

    However, most of that regex is superfluous: /^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/ matches all of your supplied data:

    $ perl -Mstrict -Mwarnings -E '
        while (<>) {
            if (/^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/) {
                say "$1 $2 $3";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    Enterprise_VO 1 0
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    Fusion 4 0
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvCTS 5 0
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvTech 5 0
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    Galaxy-Common 30 1
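
    If the one-liner grows into a script, the shorter regex above could be wrapped in a small subroutine. This is only a sketch: the name parse_license_line and the @sample list are made up for illustration, and it assumes the input lines look like the sample data in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns
# (feature, issued, in_use) for a "Users of ..." line, or an
# empty list when the line does not match.
sub parse_license_line {
    my ($line) = @_;
    # In list context a successful match returns its captures;
    # a failed match returns the empty list.
    return $line =~ /^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/;
}

# Two lines taken from the question, plus one line that should be skipped.
my @sample = (
    'Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)',
    'Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)',
    'some unrelated line',
);

for my $line (@sample) {
    # Skip lines that produced no captures.
    my @fields = parse_license_line($line) or next;
    print join('|', @fields), "\n";
}
```

    Returning the captures as a list keeps the caller free of $1, $2 and $3, which are easy to misuse after a failed match.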

    -- Ken

Re: regular expression not matching
by igelkott (Curate) on Mar 13, 2013 at 07:46 UTC

    I understand that you posted the wrong code but a small change in the original will work as (apparently) intended. Specifically, the first '+' should be moved outside the square brackets since "\w", "_", "-" are all in the set you're looking for.

    In other words, change ([\w+\_\-]) to ([\w\_\-]+). Better yet, since "_" is included in "\w" and "-" doesn't need to be escaped if it's first or last, you can simplify this to ([\w-]+).
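
    A minimal sketch of the difference, using one name from the sample data. The variable names are made up for illustration; only the two character classes come from the thread.

```perl
use strict;
use warnings;

my $name = 'Galaxy-AdvTech:';

# Quantifier inside the brackets: the class matches exactly one
# character (a word character, a literal '+', '_' or '-'), so a
# multi-character name followed by ':' cannot match.
my $inside = $name =~ /^([\w+\_\-]):/ ? $1 : 'no match';

# Quantifier outside the brackets: one or more word characters or hyphens.
my $outside = $name =~ /^([\w-]+):/ ? $1 : 'no match';

print "inside:  $inside\n";
print "outside: $outside\n";
```

    With the '+' inside the brackets the pattern can only match a one-character name, which is why none of the sample lines matched.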

Re: regular expression not matching
by AnomalousMonk (Abbot) on Mar 13, 2013 at 15:58 UTC
