PerlMonks  

How do I split a file into a given number of parts?

( #90768=categorized question )
Contributed by turumkhan on Jun 22, 2001 at 19:56 UTC

Description:

I have a file named foo. I have to split it into a number of parts, say 5, and name them foo1, foo2, foo3 and so on up to foo5. How do I do that? - Turumkhan

Answer: How do I split a file into a given number of parts?
contributed by mr_mischief

This solution takes the number of parts as its first argument; the remaining arguments are the files to split. It does not delete the originals.

#!/usr/bin/perl -w
use strict;    ### a must

my $parts = shift;    ### how many parts to split
my @file  = @ARGV;    ### the files to split

foreach ( @file ) {

    ### how big should the new file be?
    my $size     = -s $_;
    my $buf_size = int( $size / $parts );    ### whole bytes only

    ### ready a buffer for the data
    my $buf = '';

    ### open the input file
    open ( In, $_ ) || warn "Cannot read $_: $!\n";
    binmode ( In );

    ### for as many parts as there are, read
    ### the amount of data, then write it to
    ### the appropriate output file.
    for ( my $i = 0; $i < $parts; $i++ ) {

        ### read an output file worth of data
        read ( In, $buf, $buf_size ) || warn "Read zero bytes from $_!\n";

        ### write the output file (foo1 .. fooN, as the question asks)
        my $out = $_ . ( $i + 1 );
        open ( Out, "> $out" ) || warn "Cannot write to $out: $!\n";
        binmode ( Out );
        print Out $buf;

        ### if this is the last segment,
        ### grab the remainder
        if ( $i == ( $parts - 1 ) ) {
            my $rem = $size % $parts;
            if( $rem ) {
                read ( In, $buf, $rem ) || warn "Read zero bytes from $_!\n";
                print Out $buf;
            }
        }

        ### we are done with the current output file
        close ( Out );
    }

    ### we're done splitting the input file
    close ( In );
}
exit;


Its weakness is that it loads each output segment into a single buffer; for greater usability it should probably loop over smaller reads when a segment is huge. I don't see this being a big problem given the nature of the program, though: the split files are presumably headed somewhere too small for the original file, so the parts should be small enough to fit in main memory. I'll therefore leave more exotic buffer manipulation as an exercise.

Chris
Boo!
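
The looping that the answer above leaves as an exercise might look roughly like the sketch below. It is not mr_mischief's code: the split_file helper, its $chunk parameter and the 64 KB default are assumptions made for illustration. Each part is copied in pieces of at most $chunk bytes, so memory use stays bounded however large the parts are.

```perl
#!/usr/bin/perl
use strict;
use warnings;

### Hypothetical helper (not from the original answer): split $file
### into $parts pieces named "${file}1" .. "${file}N", never holding
### more than $chunk bytes in memory at a time.
sub split_file {
    my ( $file, $parts, $chunk ) = @_;
    $chunk ||= 64 * 1024;    ### assumed default: 64 KB per read

    open my $in, '<', $file or die "Cannot read $file: $!\n";
    binmode $in;

    my $size = -s $in;
    my $base = int( $size / $parts );    ### whole bytes per part
    my $rem  = $size % $parts;           ### extra bytes for the last part

    for my $part ( 1 .. $parts ) {

        ### the last part also takes the remainder
        my $left = $part == $parts ? $base + $rem : $base;

        open my $out, '>', "$file$part"
            or die "Cannot write to $file$part: $!\n";
        binmode $out;

        ### copy this part in pieces of at most $chunk bytes
        while ( $left > 0 ) {
            my $want = $left < $chunk ? $left : $chunk;
            my $got  = read $in, my $buf, $want;
            die "Read error on $file: $!\n" unless defined $got;
            last unless $got;    ### unexpected end of file
            print {$out} $buf or die "Write error on $file$part: $!\n";
            $left -= $got;
        }
        close $out or die "Cannot close $file$part: $!\n";
    }
    close $in;
    return;
}

### usage: perl split_chunked.pl 5 foo
if ( @ARGV ) {
    my $parts = shift;
    split_file( $_, $parts ) for @ARGV;
}
```

Saved under a name such as split_chunked.pl (also an assumption), running it as "perl split_chunked.pl 5 foo" should leave foo untouched and write foo1 through foo5, with any leftover bytes going to foo5.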
Answer: How do I split a file into a given number of parts?
contributed by CharlesClarkson

We could adjust the answer given by mr_mischief by pulling the if block out of the inner for loop and using the whole file's size as the read length for the last part, instead of calculating a remainder.

#!/usr/bin/perl
use strict;
use warnings;

my $parts = shift;    ### how many parts to split
my @file  = @ARGV;    ### the files to split

foreach ( @file ) {

    ### how big should the new file be? (whole bytes only)
    my $size = int( ( -s $_ ) / $parts );

    ### open the input file, skipping it if that fails
    open my $in_fh, '<', $_ or do { warn "Cannot read $_: $!"; next };
    binmode $in_fh;

    ### for all but the last part, read
    ### the amount of data, then write it to
    ### the appropriate output file.
    for my $part (1 .. $parts - 1) {

        ### read an output file worth of data
        read $in_fh, my $buffer, $size or warn "Read zero bytes from $_: $!";

        ### write the output file
        open my $fh, '>', "$_$part" or warn "Cannot write to $_$part: $!";
        binmode $fh;
        print $fh $buffer;
    }

    # for the last part, read the rest of
    # the file. Buffer will shrink
    # to the actual bytes read.
    read $in_fh, my $buffer, -s $_ or warn "Read zero bytes from $_: $!";
    open my $fh, '>', "$_$parts" or warn "Cannot write to $_$parts: $!";
    binmode $fh;
    print $fh $buffer;
}

__END__

The same weaknesses exist for memory usage. There could be better checks for file existence, too.
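
As one possible reading of "better checks", the sketch below validates the arguments before any splitting starts. The split_problems helper and the rule about refusing to overwrite existing parts are assumptions for illustration, not part of either answer above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

### Hypothetical helper: return a list of reasons why $file
### cannot be split into $parts parts (empty list means OK).
sub split_problems {
    my ( $file, $parts ) = @_;
    my @problems;

    push @problems, "part count must be a positive integer"
        unless defined $parts && $parts =~ /^[1-9][0-9]*$/;

    ### "_" reuses the stat result from the preceding -e test
    if    ( !-e $file ) { push @problems, "$file does not exist" }
    elsif ( !-f _ )     { push @problems, "$file is not a plain file" }
    elsif ( !-r _ )     { push @problems, "$file is not readable" }

    return @problems if @problems;

    ### assumed policy: refuse to overwrite parts from an earlier run
    for my $part ( 1 .. $parts ) {
        push @problems, "$file$part already exists" if -e "$file$part";
    }
    return @problems;
}

### usage: perl check_split.pl 5 foo
if ( @ARGV ) {
    my $parts = shift;
    for my $file ( @ARGV ) {
        warn "Skipping $file: $_\n" for split_problems( $file, $parts );
    }
}
```

A splitting script could call split_problems for each file and only go on to open it when the returned list is empty.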

Answer: How do I split a file into a given number of parts?
contributed by trentonl

Since you're just writing them out, perhaps you can use the Unix 'split' command?

# files each with no more than 10000 lines
split --lines=10000 foo

# files containing at most 1 megabyte each
split --bytes=1m foo

Answer: How do I split a file into a given number of parts?
contributed by fundflow

On Unix, you can use the 'split' command.
