Splitting a file based on matched conditions

by skumar_pm (Initiate)
on Sep 19, 2017 at 08:59 UTC (#1199658=perlquestion)
skumar_pm has asked for the wisdom of the Perl Monks concerning the following question:

I want to split a file's contents into several output files by comparing the real number present in the last line of each section. For example, the file contains more than 100K lines like this:

    Start:/abc/def
    .....
    End 1.2
    Start:/xyz/uvw
    .....
    End 2.8

I want to print every line from "Start" to "End" to OUTFILE1 if "End" contains a value between 1 and 1.9, or to OUTFILE2 if "End" contains a value between 2 and 2.9. Likewise, multiple output files have to be generated.

    foreach $lineIn (@file1_list) {
        $_ = $lineIn;
        if (/Start:/) {
            $pattern1 = 1;
        }
        elsif (/End\s/) {
            my @slackno = split /\s+/, $_;
            $pattern2 = 1;
            push (@buflines, $_);
        }
        if ($pattern1 =~ 1 and $pattern2 =~0) {
            push (@buflines, $_);
        }
        else {
            $pattern1 = 0;
            $pattern2 = 0;
        }
    }
    if ($slackno[3] >= 2.0 and $slackno[3] <= 2.9) {
        foreach ( @buflines ) {
            print FILE2 $_;
        }
    }
    close(FILE2);

2017-09-20 Athanasius added code tags

Replies are listed 'Best First'.
Re: Splitting a file based on matched conditions
by tybalt89 (Priest) on Sep 19, 2017 at 14:24 UTC
    #!/usr/bin/perl

    use strict;
    use warnings;
    use Path::Tiny;

    local $/ = "\nEnd";
    while(<DATA>) # read section to (and including) End
    {
        $_ .= do { local $/ = "\n"; <DATA> }; # complete last line
        /\nEnd (\d+).*$/ and path("OUTFILE$1")->append($_);
    }

    __DATA__
    Start:/abc/def
    .....
    End 1.2
    Start:/xyz/uvw
    .....
    End 2.8
Re: Splitting a file based on matched conditions
by thanos1983 (Vicar) on Sep 19, 2017 at 09:15 UTC

    Hello skumar_pm,

    Welcome to the Monastery. Please refrain from duplicating your questions (see Splitting a file based on matched conditions); it would be better to update your question instead of creating a new one.

    Use <code> tags so we can view your code and download it. Review your code, format it, and update your post.

    Looking forward to your update, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Splitting a file based on matched conditions
by kcott (Chancellor) on Sep 19, 2017 at 06:36 UTC
Re: Splitting a file based on matched conditions
by Laurent_R (Canon) on Sep 19, 2017 at 14:07 UTC
    Hi skumar_pm,

    I think you could read the file line by line, pushing each line onto an array until you meet an End tag. When you reach an End tag, determine which output file the section needs to go to and print the array to the proper file handle. Then empty the array and read lines into it again until you reach another End tag, and so on.

    With a file containing 100K lines, it should not be a problem to store the lines of one section (a Start..End chunk) in memory.
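A minimal sketch of that buffering approach follows. It assumes each End line looks like "End 1.2" and that the integer part of the number selects the output file (1 to 1.9 goes to OUTFILE1, 2 to 2.9 to OUTFILE2); the name `split_sections` and the temporary output directory are illustrative choices, not something from the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Copy each Start..End section from $in to "$dir/OUTFILE<n>", where <n> is
# the integer part of the number on the section's End line.
sub split_sections {
    my ($in, $dir) = @_;
    my (%fh, @buf);
    while (my $line = <$in>) {
        push @buf, $line;
        next unless $line =~ /^End\s+(\d+)/;    # assumed End line format
        my $n = $1;
        unless ($fh{$n}) {
            open $fh{$n}, '>', "$dir/OUTFILE$n"
                or die "Cannot write $dir/OUTFILE$n: $!";
        }
        print { $fh{$n} } @buf;                 # flush the finished section
        @buf = ();                              # start collecting the next one
    }
    close $_ or die "close failed: $!" for values %fh;
    return sort { $a <=> $b } keys %fh;         # bucket numbers that were written
}

# Demo on an in-memory copy of the sample data from the question.
my $sample = "Start:/abc/def\n.....\nEnd 1.2\nStart:/xyz/uvw\n.....\nEnd 2.8\n";
open my $in, '<', \$sample or die "Cannot open sample: $!";
my $dir = tempdir(CLEANUP => 1);
my @written = split_sections($in, $dir);
close $in;
print "Wrote $dir/OUTFILE$_\n" for @written;
```

Only one section is held in memory at a time, and each output file is opened once, the first time a section lands in it. Lines after the last End line are silently dropped in this sketch.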

      Thanks for your suggestion.
Re: Splitting a file based on matched conditions
by talexb (Canon) on Sep 19, 2017 at 15:21 UTC

    I would recommend using Tie::File for this -- you'll be able to easily look at the last line of the file, make your decision on how to split the file, then write the portions of the file out as necessary.
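As a rough illustration of the Tie::File idea, the sketch below ties a small temporary file to an array, looks at its last line, and records the line range and End value of each section. The sample file, the `@sections` structure, and the assumed "End <number>" format are all illustrative assumptions.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Build a small sample file so the sketch is self-contained.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "Start:/abc/def\n.....\nEnd 1.2\nStart:/xyz/uvw\n.....\nEnd 2.8\n";
close $tmp or die "close failed: $!";

# Tie the file to an array; Tie::File removes the line endings by default.
tie my @lines, 'Tie::File', $filename
    or die "Cannot tie $filename: $!";

print "Last line: $lines[-1]\n";

# Walk the array, remembering where each section starts, and record the
# index range and End value of every complete section.
my @sections;
my $start;
for my $i (0 .. $#lines) {
    $start = $i if $lines[$i] =~ /^Start:/;
    if (defined $start && $lines[$i] =~ /^End\s+(\d+(?:\.\d+)?)/) {
        push @sections, { from => $start, to => $i, value => $1 };
        undef $start;
    }
}
untie @lines;

printf "lines %d-%d end with %s\n", @{$_}{qw(from to value)} for @sections;
```

With the ranges in hand, each slice of the tied array could be written to whichever output file its End value calls for.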

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re: Splitting a file based on matched conditions
by Marshall (Abbot) on Sep 19, 2017 at 18:27 UTC
    Below is some "starter code" for you showing another method of doing this. Rather than setting flags in the main body of the code, calling a subroutine to handle processing the record can be a good way. If you are in the subroutine, then that means that you are inside the record, so there is no need to have a specific flag for that status.

    100K lines isn't big enough to worry much about seeking around. I'd just save the data in a buffer and then decide what to do with it when you see the End line.

    You can also look at: Flipin good, or a total flop?

    I use a variety of methods for similar tasks with some attention to who I'm writing the code for and the level of Perl expertise I expect them to have. Most of the time the number of lines of code is meaningless as long as it is clear and well structured.

    #!/usr/bin/perl
    use strict;
    use warnings;

    #### Not complete code ####
    #### This is just a starting framework ###

    while (my $line = <DATA>)
    {
        handle_section ($line) if $line =~ /^Start:/;
    }

    sub handle_section
    {
        my $start_line = shift;
        my @buf;
        my ($parm) = $start_line =~ /Start:(.*)/;
        push @buf, $start_line;
        print "Starting Parm is: $parm\n";
        my $line;
        while (defined ($line = <DATA>) and $line !~ /End/)
        {
            push @buf, $line;
        }
        # do something based upon End line (still in $line)
        # If not an End line, then malformed record
        # end of data was reached without an End line
    }
    __DATA__
    Start:/abc/def
    .....
    End 1.2
    Start:/xyz/uvw
    .....
    End 2.8
Re: Splitting a file based on matched conditions
by roboticus (Chancellor) on Sep 19, 2017 at 15:56 UTC


    You could just read the file in as a single string, then break it into chunks with regular expressions. Then examine the chunks to decide which array you want to put them in.
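One way the slurp-and-regex idea might look, as a sketch only: the function name `chunk_by_end_value`, the grouping by the integer part of the End value, and the assumed "End <number>" line format are illustrative, not taken from the thread.

```perl
use strict;
use warnings;

# Split a whole-file string into Start..End chunks and group them by the
# integer part of the End value. Returns a hash ref: bucket => [chunks].
sub chunk_by_end_value {
    my ($text) = @_;
    my %bucket;
    # /m lets ^ match at every line start; /s lets . cross newlines.
    while ($text =~ /(^Start:.*?^End[ \t]+(\d+)[^\n]*\n?)/msg) {
        push @{ $bucket{$2} }, $1;
    }
    return \%bucket;
}

my $text = <<'END_SAMPLE';
Start:/abc/def
.....
End 1.2
Start:/xyz/uvw
.....
End 2.8
END_SAMPLE

my $bucket = chunk_by_end_value($text);
for my $n (sort { $a <=> $b } keys %$bucket) {
    printf "bucket %s: %d chunk(s)\n", $n, scalar @{ $bucket->{$n} };
}
```

For a real file, `$text` would come from slurping the file, for example with `do { local $/; <$fh> }`.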

    Alternatively, you could change the input record separator to "\nStart:" and read the file a chunk at a time, then again examine each chunk to decide where to put it.
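A sketch of the record-separator variant, under the same assumed "End <number>" format. The detail to watch is that the separator is consumed by each read: every chunk but the last ends in "\nStart:", and every chunk but the first has lost its leading "Start:". The `%bucket` hash is an illustrative stand-in for whatever destination you choose.

```perl
use strict;
use warnings;

my $sample = "Start:/abc/def\n.....\nEnd 1.2\nStart:/xyz/uvw\n.....\nEnd 2.8\n";
open my $in, '<', \$sample or die "Cannot open sample: $!";

my %bucket;
{
    local $/ = "\nStart:";      # each read stops just after the next "Start:"
    while (my $chunk = <$in>) {
        chomp $chunk;           # strips a trailing "\nStart:", newline included
        $chunk .= "\n" unless $chunk =~ /\n\z/;    # put the newline back
        # Every chunk after the first lost its "Start:" to the previous read.
        $chunk = "Start:$chunk" if $. > 1;
        if ($chunk =~ /^End\s+(\d+)/m) {
            push @{ $bucket{$1} }, $chunk;
        }
    }
}
close $in;

for my $n (sort { $a <=> $b } keys %bucket) {
    print "--- bucket $n ---\n", @{ $bucket{$n} };
}
```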


    When your only tool is a hammer, all problems look like your thumb.

Re: Splitting a file based on matched conditions
by Anonymous Monk on Sep 19, 2017 at 13:02 UTC
    To read "the last line" of any file, seek() to a position relative to the end of the file that you know is far enough back to encompass the last few lines, then start reading. (If the file might be UTF-8, you need to be prepared to ignore UTF-8 errors on the off chance that your seek put you in the middle of a multibyte sequence.) Make sure that you read more than one line this way, so that you can be certain the last line you read is complete.
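A sketch of the seek-from-the-end technique. The function name `last_line` and its `$window` parameter (how many bytes to look back, 4096 by default) are hypothetical. The file is read as raw bytes, which sidesteps the multibyte issue at the cost of leaving any decoding to the caller.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);
use File::Temp qw(tempfile);

# Return the last non-blank line of $file, reading only its final $window
# bytes. Returns undef if the window holds no complete line.
sub last_line {
    my ($file, $window) = @_;
    $window ||= 4096;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    if (-s $fh > $window) {
        seek($fh, -$window, SEEK_END) or die "seek failed: $!";
        my $partial = <$fh>;    # discard the first, probably incomplete, line
    }
    my $last;
    while (my $line = <$fh>) {
        $last = $line if $line =~ /\S/;
    }
    close $fh;
    return $last;
}

# Demo on a small temporary file with a deliberately tiny window.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "Start:/abc/def\n.....\nEnd 1.2\nStart:/xyz/uvw\n.....\nEnd 2.8\n";
close $tmp or die "close failed: $!";

print "Last line: ", last_line($filename, 16);
```

The window must be larger than the longest line you expect near the end of the file; otherwise the only line in range is the partial one that gets discarded.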

Node Type: perlquestion [id://1199658]
Approved by Corion