PerlMonks  

Re: split of files

by Moron (Curate)
on Jun 22, 2007 at 09:53 UTC ( #622761=note )


in reply to split of files

If you mean splitting a file the way unix split does, but on pattern-matched boundaries instead of byte counts, I rather imagine something like the code below. (Update: some split implementations, BSD's for one, take -p to split on a regexp, and GNU csplit does much the same job; the perl script would then only have to shell out to that and clean up afterwards: glob for the per-sequence files, cat them together 1000 at a time under some second naming convention, and remove each batch of 1000 as it goes. On second thoughts I prefer what follows after all!)

use strict;
use warnings;

my $suffix   = 'z';     # incremented before first use, so the first file gets .aa
my $sequence = 0;
my $maxseq   = 1000;    # sequences per output file
my $input    = shift @ARGV or die "usage: $0 inputfile\n";
open my $ifh, '<', $input or die "$!: $input\n";
my $ofh;
while ( <$ifh> ) {
    /\AINPUT\sSEQUENCE/
        and SwitchFile( $input, \$ofh, \$suffix, \$sequence, $maxseq );
    $ofh or die "Unexpected prelude: $_";
    print $ofh $_;
}
close $ofh if $ofh;

sub SwitchFile {
    my ( $input, $oref, $sref, $qref, $max ) = @_;
    if ( defined( $$oref ) ) {
        ( ++$$qref < $max ) and return;    # current file still has room
        $$qref = 0;
        close $$oref;
    }
    my $newfile = "$input." . ++$$sref;
    open my $ofh, '>', $newfile or die "$!: $newfile\n";
    $$oref = $ofh;
}
This would create the 270 files with suffixes .aa through .kj
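
In case the suffix naming looks like magic: it is. Perl's string auto-increment turns 'z' into 'aa', then 'ab', and so on, which is why $suffix starts at 'z' and gets incremented before it is first used. Here is a minimal sketch of just that part; suffix_after is a made-up helper for illustration only and is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): return the suffix reached
# after $count string increments, starting from $start.
sub suffix_after {
    my ( $start, $count ) = @_;
    my $suffix = $start;
    ++$suffix for 1 .. $count;    # magic increment: z -> aa -> ab -> ...
    return $suffix;
}

print suffix_after( 'z', 1 ),  "\n";    # aa, the first output file
print suffix_after( 'z', 27 ), "\n";    # ba, the first file once .aa to .az are used
```

The magic only applies while the variable has been used purely as a string, so keep $suffix away from numeric context.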

__________________________________________________________________________________

^M Free your mind!

