PerlMonks  

first two words-pattern matching

by shamala (Acolyte)
on May 12, 2004 at 05:51 UTC ( [id://352650]=perlquestion )

shamala has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I have a file whose contents are as below:

    <first line> :ctx ctx_3 ctx_3 040425 0 0 07-May-2004 03:29 07-May-2004 03:29 0.00 hobbit N/A failed PLE-QA N/A 0
    <second line> Not Published 466008691
    <third line>plsql tkmain_ps4 tkmain_ps4 040425 0 0 07-May-2004 03:29 07-May-2004 03:29 0.00 hobbit N/A failed PLE-QA N/A 0
    <...> Not Published 466008691
    <nth line>plsql tkmain_pu3 tkmain_pu3 040425 0 0 07-May-2004 03:29 07-May-2004 03:29 0.00 hobbit N/A failed PLE-QA N/A 0
    Not Published 466008691
    rdbms tkmain_2n1 tkmain_2n1 040425 0 0 07-May-2004 01:37 07-May-2004 01:37 0.00 hobbit N/A failed PLE-QA N/A 0
    Not Published 465894765

I want only the first two words (for example: ctx ctx_53) from each line that begins with words like those on the third line (plsql tkmain_ps4), and I want this to be matched. I have written the following, but it isn't working. Could any of you please help? I am a beginner at Perl.

    #! /usr/local/bin/perl
    #use strict;
    open(rerun,"rerunlrg")
    while(my $pattern=<rerun>){
        if(my $pattern=~ m/^(\w\s+\w\s).*?/){
            print "matched\n";
            print "$1\n";
        }

Thanks ppl

Edited by Chady -- added code tags around file contents.

Replies are listed 'Best First'.
Re: first two words-pattern matching
by davido (Cardinal) on May 12, 2004 at 06:14 UTC
    Can we assume that <nth line> really is just your notation for the beginning of a new line, where lines end with "\n"?

    If that's the case:

        use strict;    # Don't comment it out.

        open RERUN, "rerunlrg" or die "Couldn't open input file.\n$!";
        while ( my $line = <RERUN> ) {
            print "Matched:\n$1\n\n" if $line =~ m/^(\S+\s+\S+)/;
        }
        close RERUN;

    Go ahead and give that a try.

    One problem your regexp has is that you're not putting a quantifier after your \w metachars. That means that you're trying to match a single word character, followed by any positive amount of whitespace, followed by a single word character, followed by a single whitespace, followed by any amount of anything. You really want to be matching words of arbitrary length, presumably longer than a single character each.
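
To see the difference the quantifiers make, here is a minimal sketch that runs both patterns against one line modelled on the question's data. The exact whitespace in the sample line is an assumption, and first_two_words() is a hypothetical helper name, not something from the original post.

```perl
use strict;
use warnings;

# Sample line modelled on the question's data (exact whitespace is assumed).
my $line = "plsql tkmain_ps4 tkmain_ps4 040425 0 0";

# first_two_words() is a hypothetical helper: returns the first two
# whitespace-separated words, or undef when the line has fewer than two.
sub first_two_words {
    my ($text) = @_;
    return $text =~ m/^(\S+\s+\S+)/ ? $1 : undef;
}

# The pattern from the question: a single word character, whitespace,
# a single word character, one whitespace character.
my $original = $line =~ m/^(\w\s+\w\s)/ ? "match" : "no match";

print "original pattern: $original\n";
print "with quantifiers: ", first_two_words($line), "\n";
```

On this sample the original pattern fails because the character after "p" is "l", not whitespace.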

    Another problem your script has is that you're not checking for success when you open the file. You should probably be invoking die if there is a failure to open the file. See the example I provided. You can read up on this issue in perlopentut.
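
As a hedged variation on the same idea, the sketch below uses the three-argument form of open with a lexical filehandle and still checks for failure. To stay runnable anywhere it reads from an in-memory string instead of the real file; the sample contents and the helper name collect_first_two() are assumptions for illustration only.

```perl
use strict;
use warnings;

# Stand-in for the real file so the sketch runs anywhere; with the real
# data the open call would use '<', 'rerunlrg' instead of \$sample.
my $sample = "plsql tkmain_ps4 tkmain_ps4 040425\nNot Published 466008691\n";

# collect_first_two() is a hypothetical helper: it reads every line from
# the given filehandle and collects the first two words of each match.
sub collect_first_two {
    my ($fh) = @_;
    my @found;
    while ( my $line = <$fh> ) {
        push @found, $1 if $line =~ m/^(\S+\s+\S+)/;
    }
    return @found;
}

# Check the result of open and report the system error on failure.
open my $fh, '<', \$sample or die "Couldn't open input: $!";
print "$_\n" for collect_first_two($fh);
close $fh;
```

Note that the "Not Published" continuation lines also have two words, so they match as well unless they are filtered out separately.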


    Dave

Re: first two words-pattern matching
by matija (Priest) on May 12, 2004 at 06:19 UTC
    ^(\w\s+\w\s)
    That matches a single word character followed by one or more whitespace characters, followed by another single word character, followed by one whitespace character.

    What you probably want is something along the lines of

    ^(\w+\s+\w+\b)
    Note that I removed the final space from your match by replacing it with \b - I assume you don't really need that final space.
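
One thing to watch with \w: it does not match punctuation such as ':'. The first sample line in the question appears to start with ":ctx", although that colon may just be part of the poster's notation. A small sketch, assuming the colon is really in the file, compares the \w+ pattern with a \S+ pattern. match_with() is a hypothetical helper, and the // operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Two lines modelled on the question's data. The leading ':' is copied from
# the first sample line and may or may not be present in the real file.
my @lines = ( ":ctx ctx_3 ctx_3 040425", "plsql tkmain_ps4 tkmain_ps4 040425" );

# match_with() is a hypothetical helper: applies a compiled regex and
# returns the first capture group, or undef when there is no match.
sub match_with {
    my ( $regex, $text ) = @_;
    return $text =~ $regex ? $1 : undef;
}

for my $line (@lines) {
    my $word     = match_with( qr/^(\w+\s+\w+\b)/, $line ) // "(no match)";
    my $nonspace = match_with( qr/^(\S+\s+\S+)/,   $line ) // "(no match)";
    print "\\w+ version: $word | \\S+ version: $nonspace\n";
}
```

If the colon is not in the real data, both patterns behave the same on these lines.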
Re: first two words-pattern matching
by TilRMan (Friar) on May 12, 2004 at 06:29 UTC
        use strict;
        use warnings;

        open RERUN, "rerunlrg" or die "Can't open rerunlrg: $!";
        while (<RERUN>) {
            next if $. % 2 == 0;    # Skip the second, fourth, ... line
            if (m[(\S+)\s+(\S+)]) {
                print "Matched: $1 $2\n";
            }
        }
        close RERUN;
      Hey TilRMan, that was neat! Thanks.
Re: first two words-pattern matching
by saskaqueer (Friar) on May 12, 2004 at 06:18 UTC
        while (<DATA>) {
            # We could use split( /\s+/, $_, 3 ) to explicitly
            # set the max split limit to 3, but split() is smart
            # enough to see we are capturing into 2 variables and
            # automatically sets the limit to 3. Smart perl!
            my ($first, $second) = split( /\s+/ );
            print "Line $.: '$first $second'\n";
        }

        __DATA__
        this is the first line right here
        line two comes next way down here
        then third line is here
        fourth comes next
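
A caveat with split( /\s+/ ): if a line starts with whitespace, the first field comes back empty. Whether the real file has indented lines is not known from the question, so the input below is an assumption. The sketch shows the special single-space form, split ' ', which skips leading whitespace. first_two_fields() is a hypothetical helper name.

```perl
use strict;
use warnings;

# first_two_fields() is a hypothetical helper. split ' ' (a single-space
# string, not a regex) is the awk-like mode that skips leading whitespace.
sub first_two_fields {
    my ($line) = @_;
    my ( $first, $second ) = split ' ', $line;
    return ( $first, $second );
}

# An indented line (the indentation is assumed, for illustration).
my @fields = first_two_fields("  plsql tkmain_ps4 tkmain_ps4 040425\n");
print "'$fields[0] $fields[1]'\n";

# For comparison, /\s+/ keeps an empty leading field on the same kind of input.
my ($lead) = split /\s+/, "  plsql tkmain_ps4";
print "leading field with /\\s+/: '$lead'\n";
```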
Re: first two words-pattern matching
by snadra (Scribe) on May 12, 2004 at 10:52 UTC
    Hello,

    your file seems to be a CSV file, one that is separated by tabs rather than commas. But the separator is not important. Maybe the Text::CSV module or other CSV modules can help you as well.

    snadra
