Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I am new to Perl scripting. I wrote a small script to access a web link and save the output in a file named output.txt.

My output.txt has several lines in it, but I need only a few lines from the whole file. I know how to search for one line and print it to my final output file.

Here is my question.

I want to print all of the lines I need from my output file into my final output file.

Here is the code I am using to search for and print one line from my output file:

my $output = $mech->content();
open(OUTFILE, ">$outfile");
print OUTFILE "$output";
if ( $output=~ /(line i want )/ ) {
    open( FINAL, ">", "final output.txt" );
    print FINAL "$1";
}
close FINAL;

Please advise.

Thanks in advance.

Replies are listed 'Best First'.
Re: searching for multiple lines from outfile and printing them all in the final outfile
by ww (Archbishop) on May 16, 2009 at 12:01 UTC

    OP and OP's additional notes suggest that the objective is a multi-level or "tiered" selection of data.

    1. Select from among many inputs to one "output file"
    2. Then sub-select lines from "output file" to a "final output file."

    The problem, then, is that the OP has not specified the criteria by which one discards some lines from the "output file" and/or selects others for inclusion in the "final output file."

    If this is indeed what you intend, Anonymous Monk, please let us know. Is the basis for sub-selection to be:

    • Every third line
    • Random selection
    • Some criterion susceptible to use of a regex
    • Other?
      It is to sub-select lines from the "output file" to a "final output file," by random selection from the output file.

      Am I clear? Please let me know if you need more information.

      Thanks a lot.

        Assumes @output is populated from your source; prints to console:

        for (0..9) {
            print $output[rand(@output)] . "\n";   # rand(@output) can reach every index, including the last
        }
        • Printing to your final_output.txt file is left as an exercise for the student (since you've already solved that).
        • This is likely to select duplicate lines for your final file, which is still permitted by your spec (but which, I think, is not actually your intent). Search the Monastery for unique or @seen and similar for the many solutions to avoiding duplication.
        • There is no need to create an output file (thereby eating cycles and disk space); do your initial selection from the source to an array @output.
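        One way to avoid the duplicates mentioned above is to shuffle the array once and take a slice, rather than calling rand repeatedly. The following is only a sketch: pick_random_lines is a hypothetical helper name, and @output is filled with stand-in data instead of real page content.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: return up to $count lines chosen at random,
# never picking the same array position twice.
sub pick_random_lines {
    my ($count, @lines) = @_;
    my @shuffled = shuffle(@lines);
    $count = @shuffled if $count > @shuffled;   # do not ask for more than exist
    return @shuffled[0 .. $count - 1];
}

# Stand-in data; in the real script @output would hold the fetched lines.
my @output = map { "line $_" } 1 .. 100;

my @picked = pick_random_lines(10, @output);
print "$_\n" for @picked;
```

        Note that this only guarantees distinct positions. If the source itself contains identical lines, the same text can still appear twice.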
Re: searching for multiple lines from outfile and printing them all in the final outfile
by lakshmananindia (Chaplain) on May 16, 2009 at 04:03 UTC

    What advice do you need?

    What is the problem with your code?

    --Lakshmanan G.

    The great pleasure in my life is doing what people say you cannot do.


      Thanks for responding.

      Suppose my output file has 100 lines, but out of those 100 lines I want only 10 to be printed to my final output file. I want to search for all 10 lines and print them to a separate final file.

      I know how to search for one line and print it to the final output file, but not how to do that for multiple lines.

      My current code only searches for one line and prints it to the final output file.

      Could you please advise how to do that? Thanks!
        The canonical way is:
        open my $in_file, '<', 'some_file.txt' or die "$!\n";
        while (my $line = <$in_file>) {
            chomp($line);                   # If necessary
            print $line if $line =~ /foo/;  # Or some other conditional
        }
        Update: Note that in the above example you wouldn't actually want the chomp, because it strips the newline from each line before it is printed, but it's usually a good idea if you are doing more than just printing matching lines.

        Regards,
        Darren
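        To extend the loop above to several search patterns and to write into a second file handle, something along these lines might do. It is a sketch under assumptions: copy_matching_lines is a hypothetical helper, the patterns foo and bar are placeholders, and in-memory file handles stand in for output.txt and the final output file so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line that matches at least one of the
# given patterns from the $in handle to the $out handle.
sub copy_matching_lines {
    my ($in, $out, @patterns) = @_;
    my $copied = 0;
    while (my $line = <$in>) {
        if (grep { $line =~ $_ } @patterns) {
            print {$out} $line;    # line still carries its newline
            $copied++;
        }
    }
    return $copied;
}

# Stand-in text; the real script would open output.txt for reading instead.
my $page  = "alpha\nfoo one\nbeta\nbar two\nfoo three\n";
my $final = '';

open my $in,  '<', \$page  or die "Cannot read sample text: $!\n";
open my $out, '>', \$final or die "Cannot open in-memory output: $!\n";
my $copied = copy_matching_lines($in, $out, qr/foo/, qr/bar/);
close $in;
close $out;

print $final;
```

        Because the output handle is opened once, before the loop, every matching line ends up in the same file instead of each match overwriting the previous one.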

        If the lines you require are consecutive, then you could use the range operator, along these lines (untested):
        use warnings;
        use strict;
        use autodie;

        open INFILE, "<infile";
        open OUTFILE, ">outfile";
        while (<INFILE>) {
            print OUTFILE if /start regex/ .. /end regex/;
        }
        close INFILE;
        close OUTFILE;
        A user level that continues to overstate my experience :-))
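        For readers unfamiliar with the range operator in this role: in scalar context it acts as a flip-flop that turns on at the line matching the first pattern and stays on through the line matching the second, so both boundary lines are included. A small sketch on made-up data (the BEGIN and END markers are assumptions, not part of the OP's file):

```perl
use strict;
use warnings;

# Made-up sample lines; BEGIN and END are placeholder markers.
my @lines = ("intro", "BEGIN", "a", "b", "END", "outro");

my @block;
for my $line (@lines) {
    # True from the first BEGIN line through the next END line, inclusive.
    push @block, $line if $line =~ /^BEGIN$/ .. $line =~ /^END$/;
}

print "$_\n" for @block;
```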
Re: searching for multiple lines from outfile and printing them all in the final outfile
by Marshall (Canon) on May 16, 2009 at 16:17 UTC
    I am also unsure what problem you are having. As a suggestion, I would move the open of FINAL outside the "if". Opening with ">" creates a new, empty file (truncating any existing one); ">>" would append to an existing file. If you have a bunch of these "if" statements and keep reopening the FINAL file with ">" for every "if", you will only get the output of the last one.

    Also, since it sounds like the output file doesn't have many lines in it, you may find it more convenient to finish writing it, close it, then open it for reading, possibly just reading it all into memory at once. Then do the search for what you want all at once rather than on a line-by-line basis as you go. This depends upon what you are looking for in the output.
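    As a rough illustration of searching the whole content in memory at once, the text can be split into lines and filtered with grep. This is a sketch only: the sample text and the pattern /price/ are invented, and $output stands in for what $mech->content() returns in the original script.

```perl
use strict;
use warnings;

# Stand-in for the page content returned by $mech->content().
my $output = "header\nprice: 10\nnoise\nprice: 12\nfooter\n";

# Split into lines, then keep every line that matches the pattern.
my @wanted = grep { /price/ } split /\n/, $output;

print "$_\n" for @wanted;
```

    Printing to a file handle opened once on the final file, instead of to the console, works the same way.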

    I've got a number of gizmos that parse webpages, and often as another poster suggested, it is easier to do this in steps (eg, don't try to do it all with one regex). Do it in a couple of smaller steps.

    Update: Oh, another point: I see you are on Windows, which allows spaces in file names. You will save yourself a lot of grief if you start avoiding that feature. Your code will be more portable, and you won't have to double-quote the file name in, say, a type command on Windows. Use an underscore instead of the space, for example final_output.txt.

    my $output = $mech->content();
    open(OUTFILE, ">", "$outfile");
    print OUTFILE "$output";
    open( FINAL, ">", "final output.txt" );
    if ( $output=~ /(line i want )/ ) {
        print FINAL "$1";
    }
    close FINAL;