Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

To propose a really fitting solution, we need to see some code.

How do you extract the title? Do you read it in a separate run through the file? Do you have a loop that does one of several things depending on what the current line starts with? What else is your code doing? The solution will differ depending on your existing implementation.

I am guessing that: the information is all located in a single file, you're only doing one iteration over it, and all pieces of information follow the format you already showed (ie if broken across multiple lines, the following lines start with the same tag followed by a line number).

In that case, the way I'd handle this is to read the lines batchwise, reconstruct them into a single line, then hand it off to the appropriate handler.

my %handler = ( HEADER => sub { ... }, TITLE => sub { ... }, COMPND => sub { ... }, ); my ($tag, $text) = ("")x2; while(<>) { chomp; my ($curr_tag, $curr_text) = split /\s+/, $_, 2; if($curr_tag ne $prev_tag) { $handler{$tag}->($tag, $text) if exists $handler{$tag}; # complain_about_unknown() if not exists $handler{$tag}; ? ($tag, $text) = ($curr_tag, ""); } else { my $curr_linenr; ($curr_linenr, $curr_text) = split /\s+/, $curr_text, 2; # perform validation on line nr here? } $text .= " " . $curr_text; }
So now we have a parser that lets us write handlers for the tags that don't individually need to worry about multiple line text. And then the distinction is painless:
my %record; my %handler = ( # ... TITLE => sub { $record{TITLE} = $_[1] unless exists $record{TITLE} + }, # ... );
Or if there are multiple records per file:
my $curr_rec = 0; my @record; my %handler = ( # ... HEADER => sub { ++$curr_rec }, TITLE => sub { $record[$curr_rec]->{TITLE} = $_[1] unless exists $record[$curr_rec]->{TITLE} }, # ... );
You get the idea.

Makeshifts last the longest.

In reply to Re: Finding first block of contiguous elements in an array by Aristotle
in thread Finding first block of contiguous elements in an array by FamousLongAgo

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others drinking their drinks and smoking their pipes about the Monastery: (6)
    As of 2020-06-01 16:33 GMT
    Find Nodes?
      Voting Booth?
      Do you really want to know if there is extraterrestrial life?

      Results (4 votes). Check out past polls.