PerlMonks  

Parsing Multiple Lines.

by /dev/trash (Curate)
on May 24, 2004 at 00:57 UTC ( #355801=perlquestion )
/dev/trash has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file that I am trying to parse. Each "record" starts with filename.jpg: followed by either a blank line or multiple lines of data. What I want to do is take the info that is associated with a filename and keep it in a variable to work with later.
This is what I have so far:
#!/usr/bin/perl
use warnings;
use strict;

my $fh;
open($fh, "</home/me/bulk.txt") or die "Can't open: $!";
while (my $line = <$fh>) # was $line
{
    if ($line=~/(jpg:\Z)/) {
        print "\n";
        print $1;
        if ($line=~/(\w)/) {
            print "\n";
            print $line;
            print "\n";
        }
    }
}
The part I am stuck on: after finding the *.jpg filename, I want to check whether the next line is blank or has data. This is an example of the text file:
rib.jpg:

May.jpg:
Camera-Specific Properties:
Equipment Make: OLYMPUS OPTICAL CO.,LTD
Camera Model: C860L,D360L
Camera Software: OLYMPUS CAMEDIA Master
Maximum Lens Aperture: f/2.8
Image-Specific Properties:
Image Orientation: Top, Left-Hand
Horizontal Resolution: 72 dpi
Vertical Resolution: 72 dpi
Image Created: 2001:04:13 23:59:14
Exposure Time: 1/11 sec
F-Number: f/2.8
Exposure Program: Normal Program
ISO Speed Rating: 500
Exposure Bias: 1/2 EV
Metering Mode: Pattern
Light Source: Fluorescent
Flash: Flash
Focal Length: 5.50 mm
Color Space Information: sRGB
Image Width: 228
Image Height: 380
Compression Setting: SQ
Macro Mode: Normal
oher.jpg:

Re: Parsing Multiple Lines.
by Zaxo (Archbishop) on May 24, 2004 at 01:11 UTC

    Your routine already distinguishes between data lines and blank (well, nonword) ones. You just haven't used the information. Add an else clause to the end of the if ($line=~/(\w)/) { statement, like:

    } else { print "This line intentionally left blank.\n"; }
    For other ideas, you could chomp and then test length, or else test for not matching non-whitespace: $line !~ /\S/. Each suggestion accommodates a slightly different notion of which lines are considered blank.
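To make the difference between those notions of "blank" concrete, here is a minimal sketch that runs all three tests on a few sample lines. The sub names and the sample strings are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Each sub returns 1 when the line counts as blank under that test, else 0.
# (Hypothetical helper names, chosen for this illustration only.)
sub blank_by_length   { my ($line) = @_; chomp $line; return length($line) == 0 ? 1 : 0 }
sub blank_by_nonspace { my ($line) = @_; return $line !~ /\S/ ? 1 : 0 }
sub blank_by_word     { my ($line) = @_; return $line !~ /\w/ ? 1 : 0 }

# An empty line, a whitespace-only line, a punctuation-only line, and a data line.
for my $sample ("\n", "   \n", "---\n", "data\n") {
    (my $shown = $sample) =~ s/\n/\\n/;    # make the newline visible when printing
    printf "%-10s length=%d nonspace=%d word=%d\n", "'$shown'",
        blank_by_length($sample), blank_by_nonspace($sample), blank_by_word($sample);
}
```

Only the truly empty line is blank under all three tests. A whitespace-only line fails the length test, and a punctuation-only line such as "---" is blank only under the \w test.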

    You probably mean to print whole lines, rather than just what you captured ($1).

    After Compline,
    Zaxo
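For the original goal of keeping each file's info in a variable, one possible approach (a sketch, not something proposed in this thread) is to remember the most recent filename header and push every following non-blank line onto a hash entry for it. The sub name parse_records, the header regex, and the in-memory sample are assumptions for illustration; the real script would open /home/me/bulk.txt instead.

```perl
use strict;
use warnings;

# Hypothetical helper: group lines by the "name.jpg:" header that precedes them.
# Returns a hash ref of filename => array ref of data lines.
sub parse_records {
    my ($fh) = @_;
    my (%info, $current);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^(.+\.jpg):\s*$/) {        # assumed header format: "name.jpg:" alone on a line
            $current = $1;
            $info{$current} = [];                # a record with no data stays an empty list
        }
        elsif (defined $current && $line =~ /\S/) {
            push @{ $info{$current} }, $line;    # non-blank line belongs to the current record
        }
    }
    return \%info;
}

# Demo on an in-memory excerpt of the sample data.
my $sample = "rib.jpg:\n\nMay.jpg:\nCamera-Specific Properties:\n"
           . "Equipment Make: OLYMPUS OPTICAL CO.,LTD\noher.jpg:\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";
my $info = parse_records($fh);
close $fh;

for my $name (sort keys %$info) {
    printf "%s has %d line(s) of data\n", $name, scalar @{ $info->{$name} };
}
```

Because the loop never reads ahead, it does not need to ask whether the "next" line is blank: a record with no data simply ends up with an empty array.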

      After posting my question, I did one more search and came up with this reply to a question: Re: multi-line regex match quest. It works to a point, but I get this:
      Use of uninitialized value in pattern match (m//) at parse.pl line 16, <$fh> line 731.

        Use of uninitialized value in pattern match (m//) at parse.pl line 16, <$fh> line 731

        This warning tells you that the variable used in the pattern match was undefined at the point given
        (line 16 in your code, after reading line 731 of the file you're reading from).
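The revised script is not shown in the thread, so the cause here is an assumption: one common way to get this warning is to read the next line with <$fh> inside the loop and match against it when the header was the last line of the file, in which case <$fh> returns undef. A minimal sketch of guarding that read-ahead with defined (the sub name and sample data are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: count headers followed by a blank line or by end of file.
sub count_empty_records {
    my ($fh) = @_;
    my $empty = 0;
    while (my $line = <$fh>) {
        next unless $line =~ /jpg:\s*$/;
        my $next = <$fh>;                  # undef when the header is the last line
        # Check defined first, so the pattern match never sees undef.
        # Note: the line read ahead is consumed, so a header that directly
        # follows another header would be skipped by this simple loop.
        if (!defined $next || $next !~ /\S/) {
            $empty++;
        }
    }
    return $empty;
}

my $sample = "rib.jpg:\n\nMay.jpg:\nEquipment Make: OLYMPUS OPTICAL CO.,LTD\noher.jpg:\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";
my $count = count_empty_records($fh);
close $fh;
print "$count\n";    # prints 2: rib.jpg (blank line follows) and oher.jpg (end of file)
```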

Re: Parsing Multiple Lines.
by NetWallah (Abbot) on May 24, 2004 at 03:59 UTC
    You have declared the filehandle used in "open" (my $fh). That makes $fh a symbolic reference to the file handle, and I don't believe you are trying to do that. More likely, this is the result of misunderstanding this statement in the docs:
    If FILEHANDLE is an undefined lexical (my) variable the variable is assigned a reference to a new anonymous filehandle....

    Just use an UNDEFINED name like FH (no dollar sign), and you'll be OK.
    Update: OK - seems like I need to re-read the docs myself. See notes below.

    Offense, like beauty, is in the eye of the beholder, and a fantasy.
    By guaranteeing freedom of expression, the First Amendment also guarantees offense.

      Er, no, open(my $fh, "...") is correct usage. $fh is autovivified into an actual filehandle (not just a symbolic reference). It is the preferred method for opening a filehandle without clobbering an existing one. The excerpt you are referring to does mean an undefined scalar variable ($fh), not a bareword (FH).

      perldoc perlopentut provides several examples of this in the Indirect Filehandles section.

      Bzzzt. Not a symref.

      $ perl -e'my $fh;open($fh, "< foo") or die $!; print "$fh"'
      GLOB(0x804b3f8)$
      Nothing wrong with OP's lexical filehandle. It is good practice to localize a global handle such as you recommend within some scope. Then you don't need to worry about name uniqueness.

      After Compline,
      Zaxo
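The two styles discussed above can be sketched side by side: a lexical filehandle autovivified by open, and a bareword handle localized to a sub. This is an illustrative sketch only; it reads from an in-memory string rather than a real file, and the sub name first_line_bareword is hypothetical.

```perl
use strict;
use warnings;

my $text = "first line\n";

# open() autovivifies the undefined lexical $fh into a reference to an anonymous glob.
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
my $kind = ref($fh);
print "$kind\n";                 # prints GLOB: a real reference, not a symbolic one
my $line = <$fh>;
close $fh;

# The bareword alternative, with the global handle restricted to this sub's scope.
sub first_line_bareword {
    my ($text_ref) = @_;
    local *FH;                   # any outer FH is restored when the sub returns
    open(FH, '<', $text_ref) or die "Can't open in-memory file: $!";
    my $first = <FH>;
    close FH;
    return $first;
}

print first_line_bareword(\$text);
```

The lexical form closes itself when $fh goes out of scope and can be passed to subs like any other scalar, which is why it is usually the more convenient of the two.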

Node Type: perlquestion [id://355801]
Approved by Steve_p