PerlMonks  

Parsing a text file in Perl.

by ramki067 (Acolyte)
on May 26, 2014 at 10:23 UTC (#1087412=perlquestion)
ramki067 has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file as below. I need to find the text "(number) tests from (number) test cases ran" and store the first number. In the case below that number is 67. Secondly, I need to find the "PASSED" keyword followed by "(number) tests". In the case below it is 67 tests. How can I do it? Thanks, Sharath
< 0x00000: 5b 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 5d 20 31 34 20 [----------] 14 
< 0x00010: 74 65 73 74 73 20 66 72 6f 6d 20 44 49 53 41 42 tests from DISAB
< 0x00020: 4c 45 44 5f 47 65 6e 65 72 69 63 44 52 4d 54 65 LED_GenericDRMTe
< 0x00030: 73 74 20 28 31 32 38 34 39 20 6d 73 20 74 6f 74 st (12849 ms tot
< 0x00040: 61 6c 29 0d 0a 0d 0a 5b 2d 2d 2d 2d 2d 2d 2d 2d al)....[--------
< 0x00050: 2d 2d 5d 20 47 6c 6f 62 61 6c 20 74 65 73 74 20 --] Global test 
< 0x00060: 65 6e 76 69 72 6f 6e 6d 65 6e 74 20 74 65 61 72 environment tear
< 0x00070: 2d 64 6f 77 6e 0d 0a 5b 3d 3d 3d 3d 3d 3d 3d 3d -down..[========
< 0x00080: 3d 3d 5d 20 36 37 20 74 65 73 74 73 20 66 72 6f ==] 67 tests fro
< 0x00090: 6d 20 34 20 74 65 73 74 20 63 61 73 65 73 20 72 m 4 test cases r
< 0x000a0: 61 6e 2e 20 28 34 33 38 32 33 20 6d 73 20 74 6f an. (43823 ms to
< 0x000b0: 74 61 6c 29 0d 0a 5b 20 20 50 41 53 53 45 44 20 tal)..[  PASSED 
< 0x000c0: 20 5d 20 36 37 20 74 65 73 74 73 2e 0d 0a  ] 67 tests...

Re: Parsing a text file in Perl.
by Anonymous Monk on May 26, 2014 at 10:28 UTC
      I wrote the code below, but it's not matching:
      open (FILE, '<', '123.log') or die "Could not open 123.log: $!";
      while (<FILE>) {
          #print $_ if (/^[==========]/ .. /^tests.../);
          if (/^[==========]/ .. /^tests.../){
              print "Line Found:".$_."\n";
          }
      }
      close (FILE) or die "Could not close 123.log: $!";
        I tried one more shot with the below but no luck.
        open (FILE, '<', '123.log') or die "Could not 123.log: $!";
        my $i=0;
        while (<FILE>) {
            #print $_ if (/^[==========]/ .. /^tests./);
            if (/^[0-9] tests from [0-9] test cases ran/ .. /^[0-9] tests\./){
                print "$i.Match Found:".$_."\n";
                $i++;
            }
        }
        Any help?

        If you want to match [==========] in a regex, you need to escape the square brackets; otherwise they form a character class that matches a single = character.
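As a minimal sketch of the difference (the variable names and the sample string are only for illustration; the string is the decoded log line, not a line of the hexdump itself):

```perl
use strict;
use warnings;

# A decoded log line, used here only to illustrate the regex behaviour.
my $line = "[==========] 67 tests from 4 test cases ran. (43823 ms total)";

# Unescaped: [==========] is a character class equivalent to [=],
# so the pattern asks for a single '=' at the start of the string.
my $unescaped = $line =~ /^[==========]/ ? 1 : 0;

# Escaped: matches the literal text "[==========]".
my $escaped = $line =~ /^\[==========\]/ ? 1 : 0;

# \Q...\E quotes the metacharacters for you.
my $quoted = $line =~ /^\Q[==========]\E/ ? 1 : 0;

print "unescaped=$unescaped escaped=$escaped quoted=$quoted\n";
# prints "unescaped=0 escaped=1 quoted=1"
```

Escaping alone does not fix the original script, for the reason given next.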

        However, this approach cannot work: you are reading the data file line-by-line, which to Perl means from one newline (\n) to the next, but a typical line of input looks like this:

        < 0x00070: 2d 64 6f 77 6e 0d 0a 5b 3d 3d 3d 3d 3d 3d 3d 3d -down..[========

        As can be seen, there are no lines which match [==========]. Likewise, the special regex character ^ matches at the beginning of a line, and neither [==========] nor tests appears at the beginning of a line.

        You will need a different strategy, along the lines outlined by hippo below.

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Parsing a text file in Perl.
by hippo (Curate) on May 26, 2014 at 10:40 UTC

    Looks like a two stage process to me:

    1. Re-assemble the original data from the hexdump into a single string
    2. Use regexes with capture groups to retrieve the required data

    That's how I would do it, anyway.
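The two stages above could be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are the last six rows of the dump in the question, pasted in as strings rather than read from 123.log, each row is assumed to carry at most 16 hex bytes after the "0x...:" offset, and the sub name text_from_hexdump is made up for this example.

```perl
use strict;
use warnings;

# Last six rows of the dump from the question (assumed layout:
# "< 0xOFFSET: up to 16 hex bytes, then the printable column").
my @dump = (
    '< 0x00070: 2d 64 6f 77 6e 0d 0a 5b 3d 3d 3d 3d 3d 3d 3d 3d -down..[========',
    '< 0x00080: 3d 3d 5d 20 36 37 20 74 65 73 74 73 20 66 72 6f ==] 67 tests fro',
    '< 0x00090: 6d 20 34 20 74 65 73 74 20 63 61 73 65 73 20 72 m 4 test cases r',
    '< 0x000a0: 61 6e 2e 20 28 34 33 38 32 33 20 6d 73 20 74 6f an. (43823 ms to',
    '< 0x000b0: 74 61 6c 29 0d 0a 5b 20 20 50 41 53 53 45 44 20 tal)..[  PASSED ',
    '< 0x000c0: 20 5d 20 36 37 20 74 65 73 74 73 2e 0d 0a  ] 67 tests...',
);

# Stage 1: rebuild the original text from the hex bytes of every row.
# (Hypothetical helper, not part of any module.)
sub text_from_hexdump {
    my @rows = @_;
    my $text = '';
    for my $row (@rows) {
        # Skip rows that do not look like "< 0xOFFSET: hh hh ...".
        next unless $row =~ /^<?\s*0x[0-9a-fA-F]+:\s+((?:[0-9a-fA-F]{2}(?:\s|$)){1,16})/;
        # Turn each two-digit hex value back into its character.
        $text .= join '', map { chr(hex($_)) } split ' ', $1;
    }
    return $text;
}

my $text = text_from_hexdump(@dump);

# Stage 2: regexes with capture groups on the re-assembled string.
my ($ran, $cases) = $text =~ /(\d+) tests from (\d+) test cases ran/;
my ($passed)      = $text =~ /\[\s*PASSED\s*\]\s+(\d+) tests/;

print "ran=$ran cases=$cases passed=$passed\n";
# prints "ran=67 cases=4 passed=67"
```

Decoding the hex column rather than the printable column keeps the original line endings (0d 0a), so the patterns run against the text exactly as it was logged. On a short final row the cap of 16 bytes does not protect against printable text that happens to look like hex pairs, so treat the row pattern as a starting point.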

    Update: OP cross-posted at SO

Re: Parsing a text file in Perl.
by Anonymous Monk on May 26, 2014 at 11:58 UTC
    It would be much easier to deal with the original file of which this is a hexdump. Can't you get that?
Re: Parsing a text file in Perl.
by Old_Gray_Bear (Bishop) on May 26, 2014 at 23:19 UTC
    Since you know where the human-readable text starts, extract it from each line and build up a string of readable text. Extract the data you need and go on to the next line.

    Note: each new entry in the testing log starts with the characters '[-', so you know when you have reached the end of the first line and the beginning of the next.
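A rough sketch of that idea, working from the printable column instead of the hex bytes. The assumptions are the same as for any fixed-layout dump: each row holds at most 16 hex bytes and the readable column follows them directly; the rows below are copied from the question and the variable names are illustrative only. Non-printable bytes such as CR and LF show up as dots in this column, so the patterns must not rely on line endings.

```perl
use strict;
use warnings;

# Last six rows of the dump from the question.
my @dump = (
    '< 0x00070: 2d 64 6f 77 6e 0d 0a 5b 3d 3d 3d 3d 3d 3d 3d 3d -down..[========',
    '< 0x00080: 3d 3d 5d 20 36 37 20 74 65 73 74 73 20 66 72 6f ==] 67 tests fro',
    '< 0x00090: 6d 20 34 20 74 65 73 74 20 63 61 73 65 73 20 72 m 4 test cases r',
    '< 0x000a0: 61 6e 2e 20 28 34 33 38 32 33 20 6d 73 20 74 6f an. (43823 ms to',
    '< 0x000b0: 74 61 6c 29 0d 0a 5b 20 20 50 41 53 53 45 44 20 tal)..[  PASSED ',
    '< 0x000c0: 20 5d 20 36 37 20 74 65 73 74 73 2e 0d 0a  ] 67 tests...',
);

my $readable = '';
for my $row (@dump) {
    # $1 = the hex bytes (at most 16), $2 = whatever follows them.
    next unless $row =~ /^<?\s*0x[0-9a-fA-F]+:\s+((?:[0-9a-fA-F]{2}\s){1,16})(.*)$/;
    my ($hex, $ascii) = ($1, $2);

    # The readable column has one character per byte on the row.
    my @bytes = split ' ', $hex;
    $readable .= substr $ascii, 0, scalar @bytes;
}

# Search the joined text, since the phrases span several dump rows.
my ($ran)    = $readable =~ /(\d+) tests from \d+ test cases ran/;
my ($passed) = $readable =~ /PASSED\s*\]\s*(\d+) tests/;

print "ran=$ran passed=$passed\n";
# prints "ran=67 passed=67"
```

The matching is done once on the joined string rather than row by row, because "67 tests from 4 test cases ran" is split across three rows of the dump.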

    Now, the real question -- why on God's Green Earth are you not using the original log? Going through all this mishigas to interpret a dump-format?!?? Sheesh, do it the simple way....

    ----
    I Go Back to Sleep, Now.

    OGB

Re: Parsing a text file in Perl.
by soonix (Curate) on May 27, 2014 at 11:43 UTC

    Your text file looks like the output of a diff between two hexdumps of log files.

    If you are doing this for learning purposes or are creating a Rube Goldberg machine for showing off :-) then go ahead, but, as others have already written, it would not only be easier, but also more stable, if you could start from an "easier" format, e.g. a diff between the "original" logs...

Node Type: perlquestion [id://1087412]
Approved by Corion