PerlMonks  

Problem in pattern matching with alternation

by perladdict (Chaplain)
on Aug 12, 2007 at 14:06 UTC ( #632055=perlquestion )
perladdict has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, below are the lines from which I have to extract the storage path of each vob tag.
* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)
* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)
* /v_dialermidtier /usr/addon/puccase_vob01/ccvob01/v_dialermidtier.vbs public (replicated)
* /v_dialer /usr/add-on/puccase_vob01/ccvob01/v_dialer.vbs public (replicated)
* /vobs/UMTools /user/addon/puccase_vob01/ccvob01/UMtools.vbs replicated)

I am trying to extract the storage path of each vobtag with the regular expression below:

    #!/usr/bin/perl
    $arg=$ARGV[0];
    $cmd1="cleartool lsvob $arg";
    $arr=`$cmd1`;
    $storage=$1 if($arr=~/^*\s+\/\w+\/\w+\s+(.+)\s+\w+|^*\s+\/\w+\s+(.+)\s+\w+\);
    print "$storage\n";

The above code works only for the tags which have 2 "\\" in them, but I also have to match the storage path of vob tags which do not have 2 "\\" in them.
I am trying to match the storage path of those vobtags as well, using alternation as below:

    $storage=$1 if($arr=~/^*\s+\/\w+\/\w+\s+(.+)\s+\w+|^*\s+\/\w+\s+(.+)\s+\w+\|^*\s+\/\w+\s+(.+)\s+\w+/);

Can anyone help me find where I am making a mistake in the above script?
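
A side note on the alternation itself, as a minimal sketch rather than a fix for the script above: capture groups are numbered left to right across the whole pattern, so when the second alternative is the one that matches, its capture lands in $2 and $1 stays undefined. The helper names storage_plain and storage_reset are made up for this illustration, and the branch-reset form (?|...) assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Capture groups are numbered across the whole pattern, so the
# second alternative fills $2, not $1.
sub storage_plain {
    my ($line) = @_;
    return unless $line =~ m{^\*\s+/\w+/\w+\s+(\S+)|^\*\s+/\w+\s+(\S+)};
    return defined $1 ? $1 : $2;
}

# Branch reset (Perl 5.10 and later): both alternatives capture into $1.
sub storage_reset {
    my ($line) = @_;
    return $line =~ m{^\*\s+(?|/\w+/\w+\s+(\S+)|/\w+\s+(\S+))} ? $1 : undef;
}

for my $line (
    '* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)',
    '* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)',
) {
    print storage_plain($line), "\n";
    print storage_reset($line), "\n";
}
```

Either form returns the same path for one-part and two-part tags; the replies below avoid the issue altogether by not using alternation.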

Re: Problem in pattern matching with alternation
by naikonta (Curate) on Aug 12, 2007 at 14:24 UTC
    It's a bit unclear to me which part of your example path lines you consider the "vob tags". The first regex won't even compile. Is the `$cmd1` output you capture in $arr the same as the path lines you list? If not, what can $arr possibly look like? I don't see any "\\" there either.

    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

      I have to extract the following storage paths from the lines, as below
      /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public
      /usr/add-on/puccase_vob01/ccvob01/scm.vbs
      /usr/add-on/puccase_vob01/ccvob01/v_dialerclient_rel.vbs
      The arguments, meaning the vobtags, are as follows
      /vobs/bt_rel
      /scm
      /vobs/bt_rel
      /v_dialerclient_rel
      The script is working fine for the tags "/vobs/bt_rel" and "/vobs/UMTools", but it's not working for the tags /scm and /v_dialer, which have only one slash(\) in them.
      I am using the alternation operator within the match, but it's not working, and I have been unable to find the mistake after investing an hour in this task.
      I think with this detail you may be able to help me find the mistake in my script.
        First of all, "/" is slash, "\" is backslash. Having two slashes in path (for Unix like) means that the path has two parts. Saying "2 '\\'" means (at least to me) there are two pairs of adjacent '\', that is "\\" and "\\" :-)

        OK, back to the problem...
        I tend to think that you want the last part of the path, for which you can use basename-like functionality. So,

        while (<DATA>) {
            chomp;
            (my $tag = (split)[0]) =~ s!.*(/.*)\.vbs$!$1!;
            print $tag, "\n";
        }
        __DATA__
        /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public
        /usr/add-on/puccase_vob01/ccvob01/scm.vbs
        /usr/add-on/puccase_vob01/ccvob01/v_dialerclient_rel.vbs

        would result in

        /bt_rel
        /scm
        /v_dialerclient_rel
        But I'm confused by /vobs/bt_rel. Where does the /vobs part come from?

        Update: The problem with your code is that you are trying to track the path level manually, and using regex complicates the situation.

        Update2: I just noticed that the leading asterisks are part of the lines, followed by some space(s) then the tags. Here is my modified code:

        $ cat extract-vobtags.pl
        while (<DATA>) {
            chomp;
            my($tag, $storage) = (split)[1,2];
            printf "%20s: %s\n", $tag, $storage;
        }
        __DATA__
        * /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)
        * /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)
        * /v_dialermidtier /usr/addon/puccase_vob01/ccvob01/v_dialermidtier.vbs public (replicated)
        * /v_dialer /usr/add-on/puccase_vob01/ccvob01/v_dialer.vbs public (replicated)
        * /vobs/UMTools /user/addon/puccase_vob01/ccvob01/UMtools.vbs replicated)

        $ perl extract-vobtags.pl
                /vobs/bt_rel: /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs
                        /scm: /usr/add-on/puccase_vob01/ccvob01/scm.vbs
            /v_dialermidtier: /usr/addon/puccase_vob01/ccvob01/v_dialermidtier.vbs
                   /v_dialer: /usr/add-on/puccase_vob01/ccvob01/v_dialer.vbs
               /vobs/UMTools: /user/addon/puccase_vob01/ccvob01/UMtools.vbs


        I appreciate that you are trying, but you are still not making any sense. You are not using "code" tags enough in your posts, you are not giving us anything we can try to run ourselves to demonstrate your problem, and you keep confusing "slash" with "\".

        I'll propose the following, which is based on code and data in the original post at the top of this thread. Please try this out, tell us whether it works for you, and if it doesn't (and you are still stumped about how to make it work), tell us exactly how it should work.

        #!/usr/bin/perl
        use strict;
        use warnings;

        while (<DATA>) {
            print "$1\n" if ( m{^\*\s+(?:/\w+)+\s+(.+?)\s+} );
        }
        __DATA__
        * /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)
        * /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)
        * /v_dialermidtier /usr/addon/puccase_vob01/ccvob01/v_dialermidtier.vbs public (replicated)
        * /v_dialer /usr/add-on/puccase_vob01/ccvob01/v_dialer.vbs public (replicated)
        * /vobs/UMTools /user/addon/puccase_vob01/ccvob01/UMtools.vbs replicated)

        The differences between that and the OP code are:
        • it loops over each line of input, instead of reporting only a single output from all lines of input
        • it escapes the initial "*" character in the regex, to avoid a syntax error
        • it puts curlies around the regex, to avoid \/ (toothpick syndrome)
        • it allows the first string following "*" to contain any number (1 or more) of adjacent "/word" patterns (but does not capture this string)
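
        The first point in the list matters when the lines come from a captured command rather than from DATA. The following is a minimal sketch under that assumption: the sub name storage_paths and the $output variable are made up for illustration, and the heredoc stands in for whatever the backticks returned in the original post.

```perl
use strict;
use warnings;

# Return the storage path from every "* tag path ..." line in $text.
sub storage_paths {
    my ($text) = @_;
    my @paths;
    for my $line (split /\n/, $text) {
        # Same pattern as above: escaped "*", one or more "/word" parts,
        # then a lazy capture that stops at the next run of whitespace.
        push @paths, $1 if $line =~ m{^\*\s+(?:/\w+)+\s+(.+?)\s+};
    }
    return @paths;
}

# Stand-in for the text captured from the command (two lines from the post).
my $output = <<'END';
* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)
* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)
END

print "$_\n" for storage_paths($output);
```

        Splitting the captured text into lines first keeps the pattern identical to the per-line version, so nothing about the regex has to change.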

        If you are just trying to get the third space-delimited token from each line of input, you could do this for each line, instead of the regex match:

        print +(split /\s+/)[2]; # print third "word" of line
        (updated to include "+" outside the parens -- thanks, naikonta!)
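
        Building on the split idea, a small sketch that keeps both the tag and the storage path so the path can be looked up by tag. The sub name tag_map and the hash %storage_for are hypothetical, and the code assumes every line starts with "*" as in the posted data, so the tag is always the second field.

```perl
use strict;
use warnings;

# Build a tag => storage-path map from listing lines using split.
sub tag_map {
    my (@lines) = @_;
    my %storage_for;
    for my $line (@lines) {
        # Field 0 is the leading "*", field 1 the tag, field 2 the path.
        my (undef, $tag, $storage) = split ' ', $line;
        $storage_for{$tag} = $storage if defined $storage;
    }
    return %storage_for;
}

my %storage_for = tag_map(
    '* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)',
    '* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)',
);

print "$storage_for{'/scm'}\n";
```

        Because split works on whitespace only, it does not care how many slashes the tag contains.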

        If the data shown above is not correct, show us the actual data (inside <code> tags, please). If the output is not what you want, show us exactly what you want (based on the correct input data, again using <code> tags).

Re: Problem in pattern matching with alternation
by FunkyMonk (Canon) on Aug 12, 2007 at 14:32 UTC
    Duplicate posting. Sorry.
      I still think that it's better to update or replace your original post (632059) than to post a notification node (632060) only to replace it later. And now the first posting (632059) is missing forever unless restored. That's not the same thing as a duplicate posting caused by rapidly double-clicking the create button.


        I agree.

        If only it had been that simple:(

        I clicked Create when I meant to click Preview. I quickly clicked Preview (PM has been very slow for me this afternoon - except at that point). I thought I was updating the original node when I copied the textbox to the clipboard and posted the "Coming soon" message. I pasted my original reply back in, modified it, and then posted it.

        That's what I intended. I have no idea how I managed to post two nodes.

        Apologies all round.

Re: Problem in pattern matching with alternation
by FunkyMonk (Canon) on Aug 12, 2007 at 14:34 UTC
    The above code works only for the tags which have 2 "\\" in them, but I also have to match the storage path of vob tags which do not have 2 "\\" in them.
    There aren't any "\\" in the data you provided. There isn't even a single "\". Do you mean vob tags with two "/" in them?

    A few other comments...

    • What is a vob tag anyway, and which part of your data is the vob tag?
    • Didn't you get a "^* matches null string many times in regex" warning when you ran this? I did.
    • The * is special in a regexp. It means "match the previous thing zero or more times". In this case the thing is ^ - beginning of line. Your regex starts by trying to match the start of line zero or more times!
    • Do you know that you're not forced to use // as regex delimiters? By using some other delimiter (eg m{} - you must include the "m") you won't have to backslash all those forward slashes.
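
    The last two points can be sketched side by side. This is only an illustration on one line of the posted data; the variable names $with_slashes and $with_braces are made up here.

```perl
use strict;
use warnings;

my $line = '* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)';

# Escaped asterisk, classic // delimiters: every "/" needs a backslash.
my ($with_slashes) = $line =~ /^\*\s+\/\w+\s+(\S+)/;

# Same pattern with m{} delimiters: no backslashed slashes required.
my ($with_braces) = $line =~ m{^\*\s+/\w+\s+(\S+)};

print "$with_slashes\n$with_braces\n";
```

    Both matches capture the same storage path; only the readability of the pattern differs.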

    We will try and help you, but you've got to give us a chance.

Re: Problem in pattern matching with alternation
by liverpole (Monsignor) on Aug 12, 2007 at 14:47 UTC
    Hi perladdict,

    You've already been given some very good general advice by toolic, but you're still not using it.

    As naikonta correctly points out, your first regex doesn't compile, and you don't have any "\\" pattern that you talk about.

    Additionally, the beginning of your regex doesn't even make sense -- /^*/ tries to match zero or more of the beginning of the line, which is nonsensical (and gives you warnings if you would only use warnings).

    It's not enough to say "it's not working right", you have to say how it's not working right.

    Please rewrite your program to use strict and warnings, so that it runs without warnings/errors.  Please make your program self-contained (ie. all necessary data is within the program), so that it doesn't depend on an environment which has cleartool, nor an environment which matches yours.  Please rephrase your question so that it's crystal clear what you're getting vs. what you want to get:

    Here's the data that I'm expecting to get:
    ...
    ...
    ...
    But in fact, I'm actually getting the following:
    ...
    ...
    ...
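
    One possible shape for such a self-contained program, offered as a sketch only: the input lines are taken from the original post, while the helper storage_of, its pattern, and the @cases layout are assumptions made for this example and are not anyone's posted solution.

```perl
use strict;
use warnings;

# Hypothetical helper: the extraction under test.
sub storage_of {
    my ($line) = @_;
    return $line =~ m{^\*\s+\S+\s+(\S+)} ? $1 : undef;
}

# Each case pairs an input line with the output expected for it.
my @cases = (
    [ '* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)',
      '/usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs' ],
    [ '* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)',
      '/usr/add-on/puccase_vob01/ccvob01/scm.vbs' ],
);

my $failures = 0;
for my $case (@cases) {
    my ($input, $expected) = @$case;
    my $got = storage_of($input);
    $got = '(no match)' unless defined $got;
    next if $got eq $expected;
    # Report exactly what was expected versus what came back.
    $failures++;
    print "input:    $input\nexpected: $expected\ngot:      $got\n";
}
print $failures ? "$failures case(s) failed\n" : "all cases passed\n";
```

    A program like this needs neither cleartool nor the poster's environment, and any mismatch it prints is already in the "expected vs. got" form asked for above.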

    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: Problem in pattern matching with alternation
by NetWallah (Abbot) on Aug 12, 2007 at 15:25 UTC
    The following re seems to meet your requirements:
    my ($tag,$storage)=$arr=~/^\*\s+(\S+)\s+(\S+)\s+\w+/;
    Update: Added tag extraction. Assumes there are no spaces in the path.

    ($arr seems strangely named, considering it is a scalar).
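
    If $arr ever holds several lines at once, the pattern above matches only at the start of the string. A minimal sketch of one way around that, assuming multi-line text in $arr (the heredoc below stands in for the captured output, and %storage_for is a made-up name): add /m so ^ matches at every line start and /g to walk through all matches. The trailing \s+\w+ is dropped and [ \t]+ replaces \s+ so a match cannot spill onto the next line.

```perl
use strict;
use warnings;

# Stand-in for the captured output, using lines from the original post.
my $arr = <<'END';
* /vobs/bt_rel /usr/add-on/puccase_vob01/ccvob01/bt_rel.vbs public (replicated)
* /scm /usr/add-on/puccase_vob01/ccvob01/scm.vbs public (replicated)
* /v_dialer /usr/add-on/puccase_vob01/ccvob01/v_dialer.vbs public (replicated)
END

my %storage_for;
# /m lets ^ match at the start of every line; /g steps through all matches.
while ($arr =~ /^\*[ \t]+(\S+)[ \t]+(\S+)/mg) {
    $storage_for{$1} = $2;
}

print "$_ => $storage_for{$_}\n" for sort keys %storage_for;
```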

    I'm a former clearcase admin, so you have my sympathies.

         "An undefined problem has an infinite number of solutions." - Robert A. Humphrey
         "If you're not part of the solution, you're part of the precipitate." - Henry J. Tillman
