Hello All,

I've spent the last couple of hours developing a small utility for my personal use at home and work. Basically, it is a program that asks for a file to read from, a file to write to, and two strings containing patterns. Optionally, the input file can be given as the first command-line argument and the output file as the second. If only one argument is given, it is assumed to be the input file.

Anyways, the program calls a subroutine called pry to move through the input file line by line. It looks for the first pattern and, upon finding it, prints each subsequent non-blank line to the output file until a line matching the second pattern is found.
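For comparison, the same start/stop idea can be sketched with Perl's built-in range operator used in scalar context (often called the "flip-flop"). This is only an illustration, not how PINSS itself works: the sample lines and the START/STOP markers are made up, and unlike pry the captured range here includes both marker lines.

```perl
use strict;
use warnings;

# Sample data standing in for the input file (made up for illustration).
my @lines = ("header", "START here", "one", "two", "STOP here", "footer");

my @captured;
for my $line (@lines) {
    # In scalar context, ".." becomes true on the first line matching
    # /START/ and stays true through the next line matching /STOP/,
    # inclusive of both marker lines.
    if ($line =~ /START/ .. $line =~ /STOP/) {
        push @captured, $line;
    }
}

print "$_\n" for @captured;
```

The operator keeps its own on/off state, which replaces the explicit flag variable at the cost of less control over what happens on the marker lines.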

I intend to upgrade and tweak this utility as life goes on, but for now it's a decent, bare-bones tool. I will probably add an option to make the start and stop patterns case insensitive, as well as an option to specify how many sections within a text file the program should extract (for instance, whether it should match all the groups it finds, or just some, or just the first, or the last, and so on).
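A minimal sketch of how those two options might look, assuming a hypothetical helper named extract_sections that works on an array of lines instead of files. The name, the argument order, and the convention that a limit of 0 means "no limit" are all assumptions made for this example, not part of PINSS as posted.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PINSS as posted): returns up to $max
# captured sections from a list of lines. $max = 0 means "no limit".
# A section whose stop line never appears is discarded; that is a choice
# made for this sketch only.
sub extract_sections {
    my ($lines, $start, $stop, $ignore_case, $max) = @_;

    # Compile the patterns once, adding /i only when requested.
    my $start_re = $ignore_case ? qr/$start/i : qr/$start/;
    my $stop_re  = $ignore_case ? qr/$stop/i  : qr/$stop/;

    my (@sections, $current);
    for my $line (@$lines) {
        if ($current) {
            if ($line =~ $stop_re) {
                push @sections, $current;    # section complete
                undef $current;
                last if $max && @sections >= $max;
                next;
            }
            push @$current, $line;
        }
        elsif ($line =~ $start_re) {
            $current = [];                   # begin a new section
        }
    }
    return @sections;
}

my @lines = ('intro', 'BEGIN', 'a', 'b', 'end', 'noise', 'begin', 'c', 'END');
my @first = extract_sections(\@lines, 'begin', 'end', 1, 1);
print scalar(@first), " section(s): @{ $first[0] }\n";
```

Returning the sections as a list of array references leaves it to the caller to pick the first, the last, or any other subset.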

I will probably also try to turn it into a module so that I can just load it into my other programs and pry chunks out of text files as I see fit. Of course, I don't know much about module development yet (I'm only on chapter 7 of the Camel book and chapter 7 of Simon Cozens' book), so that is for a later time.
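As a rough idea of what the module version could look like, here is a sketch using the core Exporter module. The package name Text::Pry is hypothetical, and the interface (taking an already opened filehandle and returning the captured lines rather than printing them) is an assumption chosen to keep the example short.

```perl
package Text::Pry;    # hypothetical name; nothing in the post names the module

use strict;
use warnings;
use Exporter 'import';

# Export pry only when the caller asks for it by name.
our @EXPORT_OK = qw(pry);

# Read lines from $in_fh and return those found between a line matching
# $start and the next line matching $stop (both marker lines excluded).
sub pry {
    my ($in_fh, $start, $stop) = @_;
    my ($capturing, @found) = (0);
    while (my $line = <$in_fh>) {
        chomp $line;
        if ($capturing) {
            if ($line =~ /$stop/) { $capturing = 0; next; }
            push @found, $line;
        }
        elsif ($line =~ /$start/) {
            $capturing = 1;
        }
    }
    return @found;
}

1;    # a module file must end with a true value
```

Saved as Text/Pry.pm somewhere in @INC, a script could then load it with `use Text::Pry qw(pry);` and call `pry($fh, $start, $stop)` on any open filehandle.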

I imagine someone already has a whole package and/or utility that does this or something similar, and it is probably a lot more professional looking as well. Nonetheless, this is a fun little program I wrote just to see if I could, and to test my skills thus far. If anyone else would find it useful, I want them to have access to it as well. I welcome any and all criticisms and thoughts.

Without further ado, I give you the PINSS (PINSS Is Not a Sentient Searcher) program.

```perl
#!/usr/bin/perl
# PINSS.plx
# Short for "PINSS Is Not a Sentient Searcher"
use strict;
use warnings;

print "Please input the phrase or Perl regular expression you want to use to begin capturing data: ";
chomp(my $start_exp = <STDIN>);
print "Please input the phrase or Perl regular expression you want to use to stop capturing data: ";
chomp(my $stop_exp = <STDIN>);

# Take the input and output files from the command line when given;
# otherwise prompt for whichever one is missing.
my $file_in  = shift @ARGV;
my $file_out = shift @ARGV;

unless (defined $file_in) {
    print "Please specify an input file to read from: ";
    chomp($file_in = <STDIN>);
}
unless (defined $file_out) {
    print "Please specify an output file to print to (type 'screen' to print to terminal screen): ";
    chomp($file_out = <STDIN>);
}

print "\nINPUT FILE: $file_in\n";
print "OUTPUT FILE: $file_out\n";
pry($file_in, $file_out, $start_exp, $stop_exp);
print "\n\nSee $file_out for results\n\n";

sub pry {
    my ($in_file, $out_file, $start, $stop) = @_;
    my $capturing = 0;    # true while we are between the start and stop patterns

    open my $in, '<', $in_file
        or die "Cannot open input file to read from: $!";

    # Only the exact word 'screen' (any case) selects the terminal, so a
    # file name that merely contains "screen" is still written to disk.
    my $to_screen = $out_file =~ m/\Ascreen\z/i;
    my $out;
    if ($to_screen) {
        $out = \*STDOUT;
    }
    else {
        open $out, '>', $out_file
            or die "Cannot open output file to write to: $!";
    }

    while (my $line = <$in>) {
        chomp $line;
        next if $line =~ m/^\s*$/;    # Skip all blank and whitespace-only lines
        if ($capturing) {
            if ($line =~ m/$stop/) {
                $capturing = 0;
                print {$out} "CAPTURE ENDED AT: $line\n";
                next;
            }
            print {$out} "$line\n";
        }
        if ($line =~ m/$start/) {
            $capturing = 1;
            print {$out} "\nCAPTURE STARTED AT: $line\n";
        }
    }

    close $in;
    close $out unless $to_screen;    # leave STDOUT open for the final message
    return;
}
```


In reply to Text File Section Extractor by BJ_Covert_Action
