PerlMonks  

How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?

by perly_white (Initiate)
on Apr 22, 2016 at 23:37 UTC ( [id://1161285]=perlquestion )

perly_white has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file of the form:

    Filename1 Item1 - Answer
    Filename1 Item2 - Answer
    ....
    Filename1000 Item1 - Answer

I am trying to create an individual HTML file with a table for each item for each file. Obviously I need a loop. However, I am unsure how to read and ignore the repetitive format of the file, in which Filename1 occurs on every line item related to Filename1, and so on. I need to know the filename because whenever I encounter a new filename, it is time to save the existing HTML file and begin a new table for the next file's items. I don't want a table with the filename in each row, so I want to ignore it after the first occurrence. However, I need to keep reading because I need the Item value from each line. Any suggestions on how to handle this? Thanks so much!

Replies are listed 'Best First'.
Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by Athanasius (Archbishop) on Apr 23, 2016 at 03:40 UTC

    Hello perly_white, and welcome to the Monastery!

    The requirements are not entirely clear. If filenames can appear out-of-order in the input file:

        Filename1 Item1 - Answer
        Filename2 Item1 - Answer
        Filename1 Item2 - Answer

    then you will need to either (1) read the whole file into a suitable data structure before writing tables, or (2) keep track of each open file, associating the filename with the handle. For (1), you could use a hash of arrays [1] like this:

        my %files;
        $files{Filename1} = [ 'Item1 - Answer' ];
        push @{ $files{Filename1} }, 'Item2 - Answer';
        ...

    For (2), you would need a simple hash with filename/filehandle key/value pairs.
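    Approach (2) could be sketched roughly as follows. This is only an illustration under assumptions: the names get_handle, write_rows and %fh_for, the output directory argument, and the ".html" suffix are made up for the example, and the row markup is deliberately minimal.

```perl
use strict;
use warnings;
use autodie;

my %fh_for;    # filename => open output handle (hypothetical name)

# Return the handle for $name, opening "$dir/$name.html" the first time
# that filename is seen.
sub get_handle {
    my ($dir, $name) = @_;
    if ( !exists $fh_for{$name} ) {
        open my $fh, '>', "$dir/$name.html";
        print {$fh} "<table>\n";
        $fh_for{$name} = $fh;
    }
    return $fh_for{$name};
}

# Read "Filename Item - Answer" lines in any order and append one row
# to the table belonging to each line's filename.
sub write_rows {
    my ($in_fh, $dir) = @_;
    while ( my $line = <$in_fh> ) {
        chomp $line;
        my ($name, $item) = split ' ', $line, 2;
        next unless defined $item;    # skip blank or malformed lines
        my $fh = get_handle($dir, $name);
        print {$fh} "<tr><td>$item</td></tr>\n";
    }

    # All input has been read: close every table and every file.
    my @names = sort keys %fh_for;
    for my $name (@names) {
        my $fh = $fh_for{$name};
        print {$fh} "</table>\n";
        close $fh;
    }
    %fh_for = ();
    return @names;
}
```

    One caveat with this design: every distinct filename keeps a handle open until the end of the input, so with around a thousand filenames you may run into the operating system's limit on open files. In that case approach (1) is probably the safer choice.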

    However, it appears from the question that you know in advance that filenames cannot appear out-of-order. If that’s the case, the following skeleton script should provide a straightforward approach:

        use strict;
        use warnings;
        use autodie;

        # open the data file for reading
        my $data_filename = 'data.txt';
        open my $in_fh, '<', $data_filename;

        # output files
        my $current_filename = '';
        my $out_fh;

        while (<$in_fh>)    # process one line of data
        {
            my ($new_filename, $item) = split ' ', $_, 2;

            if ($new_filename ne $current_filename)
            {
                finalize_table($out_fh) if defined $out_fh;
                open $out_fh, '>', $new_filename;
                $current_filename = $new_filename;
                initialize_table($out_fh);
            }

            add_row($out_fh, $item);
        }

        close $in_fh;
        finalize_table($out_fh) if defined $out_fh;

        sub initialize_table { ... }
        sub add_row          { ... }

        sub finalize_table
        {
            my ($fh) = @_;
            # ...
            close $fh;    # close the handle that was passed in
        }
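    The three stub subroutines in the skeleton could be filled in along these lines. This is a minimal sketch, not the only way to do it: the escape_html helper is hypothetical, the two-column layout is a guess at what the table should look like, and it assumes that " - " separates the item from its answer.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML text.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Write everything that comes before the first table row.
sub initialize_table {
    my ($fh) = @_;
    print {$fh} "<html>\n<body>\n<table>\n";
}

# Write one row. $item is the rest of the line after the filename,
# for example "Item1 - Answer\n" (assumed format).
sub add_row {
    my ($fh, $item) = @_;
    return unless defined $item && $item =~ /\S/;    # nothing to write
    chomp $item;
    my ($label, $answer) = split /\s+-\s+/, $item, 2;
    $answer = '' unless defined $answer;
    printf {$fh} "<tr><td>%s</td><td>%s</td></tr>\n",
        escape_html($label), escape_html($answer);
}

# Write the closing tags and close the file.
sub finalize_table {
    my ($fh) = @_;
    print {$fh} "</table>\n</body>\n</html>\n";
    close $fh;
}
```

    Because the filename is consumed by the split in the main loop and never passed to add_row, it does not appear in any row, which seems to be what the question asks for.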

    Update: [1] See perldsc#HASHES-OF-ARRAYS.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by tangent (Parson) on Apr 23, 2016 at 10:45 UTC
    This is a way you could do it using Template Toolkit. Using a template system allows you to keep the HTML out of your code and will also handle writing the new files for you:
        use Template;

        my $template = Template->new;
        my $tmpl = 'table.tmpl';
        my %names;

        while ( my $line = <DATA> ) {
            chomp $line;
            my ($name,$item,$answer) = ($line =~ m/^(\w+)\s+(\w+)\s*-\s*(.*)/);
            push( @{ $names{$name} }, { item=>$item, answer=>$answer } );
        }

        for my $name ( keys %names ) {
            my $table = { title=>$name, rows=>$names{$name} };
            $template->process( $tmpl, $table, "$name.html" )
                || die $template->error();
        }

        __DATA__
        Filename1 Item1 - Answer
        Filename1 Item2 - Answer
        Filename2 Item1 - Answer
        Filename2 Item2 - Answer
    This will create two new files "Filename1.html" and "Filename2.html".

    The content part of the template file "table.tmpl" would look like this:
        <h1>[% title %]</h1>
        <table>
        [% FOREACH row IN rows %]
          <tr>
            <td>[% row.item %]</td>
            <td>[% row.answer %]</td>
          </tr>
        [% END %]
        </table>
    Obviously, you need to add the html and body tags around this, and you can also add CSS and other static elements.
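    To see what the parsing loop above hands to the template, it may help to run the same regex on its own, without Template installed. The sketch below wraps it in a hypothetical parse_lines subroutine that also skips lines the regex does not match; both the name and the skipping are additions for illustration.

```perl
use strict;
use warnings;

# Parse "Filename Item - Answer" lines into a hash of arrays:
#   name => [ { item => ..., answer => ... }, ... ]
sub parse_lines {
    my (@lines) = @_;
    my %names;
    for my $line (@lines) {
        chomp $line;
        my ($name, $item, $answer) = $line =~ m/^(\w+)\s+(\w+)\s*-\s*(.*)/
            or next;    # skip lines that do not match the expected form
        push @{ $names{$name} }, { item => $item, answer => $answer };
    }
    return \%names;
}

my $names = parse_lines(
    "Filename1 Item1 - Answer\n",
    "Filename1 Item2 - Answer\n",
    "Filename2 Item1 - Answer\n",
);

# keys() returns names in no particular order, so sort them if the
# order in which the files are written matters to you.
for my $name ( sort keys %$names ) {
    printf "%s: %d row(s)\n", $name, scalar @{ $names->{$name} };
}
```

    Each filename appears exactly once, as a hash key, so the template can print it as the title while the rows carry only the item and its answer.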
Re: How to ignore and retrieve certain values from text file which is to be split into multiple HTML files with tables?
by Anonymous Monk on Apr 23, 2016 at 00:13 UTC
    provide representative sample data in code tags, 20 lines max
