
Re: split of files

by Moron (Curate)
on Jun 22, 2007 at 09:53 UTC (#622761)


in reply to split of files

If you mean splitting a file the way Unix split does, but on pattern-matched boundaries instead of byte counts, I imagine something like the code below. (Update: some versions of split, BSD's for instance, accept -p to split on a regexp, and csplit does the same job on Linux. The Perl script then only has to shell out to that and clean up afterwards: glob for the per-sequence files, cat them together 1000 at a time under some second naming convention, and remove each batch of 1000 per iteration. On second thoughts, I prefer what follows after all!)

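use strict;
use warnings;

my $suffix   = 'z';     # the first ++ turns 'z' into 'aa'
my $sequence = 0;       # sequences written to the current output file
my $maxseq   = 1000;    # sequences per output file
my $input    = shift @ARGV or die "usage: $0 inputfile\n";

open my $ifh, '<', $input or die "$!: $input\n";
my $ofh;
while ( <$ifh> ) {
    # a sequence header may trigger a switch to a new output file
    /\AINPUT\sSEQUENCE/
        and SwitchFile( $input, \$ofh, \$suffix, \$sequence, $maxseq );
    $ofh or die "Unexpected prelude: $_";
    print $ofh $_;
}
close $ofh if $ofh;

sub SwitchFile {
    my ( $input, $oref, $sref, $qref, $max ) = @_;
    if ( defined( $$oref ) ) {
        # stay in the current file until it holds $max sequences
        ( ++$$qref < $max ) and return;
        $$qref = 0;
        close $$oref;
    }
    my $newfile = "$input." . ++$$sref;
    open my $ofh, '>', $newfile or die "$!: $newfile";
    $$oref = $ofh;
}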
This would create the 270 files with suffixes .aa through .kj
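The suffix naming relies on Perl's magic string increment: applying ++ to 'z' yields 'aa', then 'ab', and so on. As a rough sketch of that behaviour (suffix_for is a hypothetical helper, not part of the script above), the suffix of the n-th output file can be computed like this:

```perl
use strict;
use warnings;

# Hypothetical helper: return the suffix that the n-th output file
# (1-based) would get, assuming the starting value 'z' used above.
sub suffix_for {
    my ($n) = @_;
    my $suffix = 'z';
    $suffix++ for 1 .. $n;    # magic increment: 'z' -> 'aa' -> 'ab' -> ...
    return $suffix;
}

# Show the first two suffixes and the rollover from 'az' to 'ba'.
print join( ' ', map { suffix_for($_) } 1, 2, 26, 27 ), "\n";
# prints: aa ab az ba
```

Starting from 'z' rather than 'a' is what makes the first file come out as .aa instead of .b, so every suffix has two letters for up to 676 files.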

__________________________________________________________________________________

Free your mind!

