PerlMonks  

regular expression not matching

by ghosh123 (Monk)
on Mar 13, 2013 at 06:25 UTC (#1023116=perlquestion)
ghosh123 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I am parsing the following file:

Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)

But the code below does not match when there is a '_' or '-' in the string just after "Users of". For example, 'Fusion' matches, but 'Galaxy-AdvTech', 'Enterprise_VO', etc. do not.

while (<FILE>) {
    if (/^Users of ([\w+\_\-]):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/) {
        print "$1 |$2|$3 \n";
    }
}

Re: regular expression not matching
by Anonymous Monk on Mar 13, 2013 at 06:34 UTC

    So you want to figure out why, is that your question?

    Can you explain what  ([\w+\_\-]) means?

      Oops... I am sorry, I just pasted the wrong code.
      Actually, I have been using the line below.
      It worked fine as long as there was no '_' or '-'. How should it be changed to handle '_' and '-'?

      if (/^Users of (\w+):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/)

Re: regular expression not matching
by kcott (Abbot) on Mar 13, 2013 at 07:42 UTC

    G'day ghosh123,

    "But the code below does not match when there is a '_' or '-' in the string just after "Users of". For example, 'Fusion' matches, but 'Galaxy-AdvTech', 'Enterprise_VO', etc. do not."

    Rubbish! The code you posted doesn't match any of your sample data:

    $ perl -Mstrict -Mwarnings -e '
        while (<>) {
            if (/^Users of ([\w+\_\-]):\s+\(Total of ([0-9]+) licenses issued;\s+Total of ([0-9]+) (licenses|license) in use/) {
                print "$1 |$2|$3 \n";
            }
            else {
                print "No match!\n";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    No match!
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    No match!
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    No match!

    This is because, firstly, [\w+\_\-] matches only a single character, which will be one of: a word character (alphanumeric, underscore [and others - see perlrecharclass - Backslash sequences]), a plus sign, or a hyphen. Both backslashes are unnecessary and the underscore is already covered by \w, so [\w-]+ is what you want here. Secondly, the first instance of licenses does not match the word "license"; licenses? is what you want here. Putting all that together:

    $ perl -Mstrict -Mwarnings -e '
        while (<>) {
            if (/^Users of ([\w-]+):\s+\(Total of ([0-9]+) licenses? issued;\s+Total of ([0-9]+) (licenses|license) in use/) {
                print "$1 |$2|$3 \n";
            }
            else {
                print "No match!\n";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    Enterprise_VO |1|0
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    Fusion |4|0
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvCTS |5|0
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvTech |5|0
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    Galaxy-Common |30|1

    However, most of that regex is superfluous: /^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/ matches all of your supplied data:

    $ perl -Mstrict -Mwarnings -E '
        while (<>) {
            if (/^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/) {
                say "$1 $2 $3";
            }
        }
    '
    Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)
    Enterprise_VO 1 0
    Users of Fusion: (Total of 4 licenses issued; Total of 0 licenses in use)
    Fusion 4 0
    Users of Galaxy-AdvCTS: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvCTS 5 0
    Users of Galaxy-AdvTech: (Total of 5 licenses issued; Total of 0 licenses in use)
    Galaxy-AdvTech 5 0
    Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)
    Galaxy-Common 30 1
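
    As a standalone script, the shorter regex might look something like the sketch below. The helper name parse_license_line and the hard-coded sample lines are assumptions for illustration only; a real script would read the lines from the license file instead.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread).
# Returns (feature, issued, in_use) for a matching line, or an empty list.
sub parse_license_line {
    my ($line) = @_;
    if ($line =~ /^Users of ([^:]+): \(Total of (\d+) [^0-9]+ (\d+)/) {
        return ($1, $2, $3);
    }
    return;
}

# Sample lines copied from the question; a real script would read the file.
my @lines = (
    'Users of Enterprise_VO: (Total of 1 license issued; Total of 0 licenses in use)',
    'Users of Galaxy-Common: (Total of 30 licenses issued; Total of 1 license in use)',
);

for my $line (@lines) {
    my @fields = parse_license_line($line);
    next unless @fields;            # skip lines that do not match
    print join(' ', @fields), "\n";
}
```

    Returning a list keeps the caller free of $1, $2 and $3, which are only meaningful straight after a successful match.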

    -- Ken

Re: regular expression not matching
by igelkott (Curate) on Mar 13, 2013 at 07:46 UTC

    I understand that you posted the wrong code, but a small change to the original capture group will make it work as (apparently) intended. Specifically, the first '+' should be moved outside the square brackets, since "\w", "_" and "-" are all in the set you're looking for.

    In other words, change ([\w+\_\-]) to ([\w\_\-]+). Better yet, since "_" is included in "\w" and "-" doesn't need to be escaped if it's first or last, you can simplify this to ([\w-]+).
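
    To see the difference between the three capture groups mentioned in this thread, a quick comparison along these lines may help. It is only a sketch: the helper name first_capture is made up for the example, and the test strings are cut down to the "Users of NAME:" prefix so that only the capture group is being exercised.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture if $text matches $re,
# otherwise undef.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

# Label/pattern pairs, kept in an array so the order is stable.
my @patterns = (
    [ 'as posted: ([\w+\_\-])' => qr/^Users of ([\w+\_\-]):/ ],
    [ 'in use:    (\w+)'       => qr/^Users of (\w+):/       ],
    [ 'suggested: ([\w-]+)'    => qr/^Users of ([\w-]+):/    ],
);

for my $pair (@patterns) {
    my ($label, $re) = @$pair;
    for my $name (qw(Fusion Enterprise_VO Galaxy-AdvTech)) {
        my $got = first_capture($re, "Users of $name:");
        printf "%-24s %-15s %s\n", $label, $name,
            defined $got ? "captured '$got'" : 'no match';
    }
}
```

    The posted class matches none of the names because it accepts exactly one character before the colon. (\w+) accepts 'Fusion' and 'Enterprise_VO' but not 'Galaxy-AdvTech', and ([\w-]+) accepts all three. So with (\w+), the 'Enterprise_VO' line fails only because of the singular "license" that kcott points out, not because of the underscore.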

Re: regular expression not matching
by AnomalousMonk (Abbot) on Mar 13, 2013 at 15:58 UTC
