PerlMonks  

How do i split a file into given number of parts.

by turumkhan (Novice)
on Jun 22, 2001 at 19:56 UTC ( [id://90768]=perlquestion )

turumkhan has asked for the wisdom of the Perl Monks concerning the following question:

I have a file named foo. I have to split it into a number of parts, say 5, and name them foo1, foo2, foo3 and so on up to foo5. How do I do that? - Turumkhan

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: How do i split a file into given number of parts.
by mr_mischief (Monsignor) on Jun 23, 2001 at 01:29 UTC
    This solution takes as its first argument the number of ways to split a file, and the remainder of arguments are taken to be files which need to be split. It does not delete the originals.

    #!/usr/bin/perl -w
    use strict;    ### a must

    my $parts = shift;    ### how many parts to split
    my @file  = @ARGV;    ### the files to split

    foreach ( @file ) {
        ### how big should the new file be? (whole bytes only)
        my $size     = -s $_;
        my $buf_size = int( $size / $parts );

        ### ready a buffer for the data
        my $buf = '';

        ### open the input file, skipping it if that fails
        open ( In, $_ ) || do { warn "Cannot read $_: $!\n"; next };
        binmode ( In );

        ### for as many parts as there are, read
        ### the amount of data, then write it to
        ### the appropriate output file.
        for ( my $i = 0; $i < $parts; $i++ ) {
            ### read an output file worth of data
            read ( In, $buf, $buf_size ) || warn "Read zero bytes from $_!\n";

            ### write the output file
            open ( Out, "> $_$i" ) || warn "Cannot write to $_$i: $!\n";
            binmode ( Out );
            print Out $buf;

            ### if this is the last segment,
            ### grab the remainder (the bytes left
            ### over by the integer division)
            if ( $i == ( $parts - 1 ) ) {
                my $rem = $size % $parts;
                if ( $rem ) {
                    read ( In, $buf, $rem ) || warn "Read zero bytes from $_!\n";
                    print Out $buf;
                }
            }

            ### we are done with the current output file
            close ( Out );
        }

        ### we're done splitting the input file
        close ( In );
    }

    exit;


    Its weakness is that it reads each output segment into a single buffer; for greater usability it should probably loop over smaller reads when a segment is huge. I don't see this being a big problem given the nature of the program, though: the split files are presumably headed somewhere too small for the original file, so the parts should be small enough for main memory. I'll therefore leave more exotic buffer manipulation as an exercise.
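    One way the buffered loop left as an exercise might look is sketched below. This is not code from the thread: the sub name split_file, the $chunk parameter and its 64 KB default are assumptions made for illustration. It numbers the parts from 1, as the question asked, and gives the last part the bytes left over by the integer division.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the posts above): split $file into $parts
# pieces named "${file}1" .. "${file}N", holding at most $chunk bytes in
# memory at any time.
sub split_file {
    my ( $file, $parts, $chunk ) = @_;
    $chunk ||= 64 * 1024;    # assumed buffer size; tune as needed

    die "Need at least one part\n" unless $parts >= 1;
    my $size = -s $file;
    die "Cannot stat $file: $!\n" unless defined $size;
    my $part_size = int( $size / $parts );

    open my $in, '<', $file or die "Cannot read $file: $!\n";
    binmode $in;

    for my $part ( 1 .. $parts ) {
        # the last part also takes the bytes left over by the division
        my $left = $part == $parts
            ? $size - $part_size * ( $parts - 1 )
            : $part_size;

        open my $out, '>', "$file$part"
            or die "Cannot write to $file$part: $!\n";
        binmode $out;

        # copy this part in chunk-sized reads
        while ( $left > 0 ) {
            my $want = $left < $chunk ? $left : $chunk;
            my $got  = read( $in, my $buf, $want );
            die "Read error on $file: $!\n" unless defined $got;
            last unless $got;    # unexpected end of file
            print {$out} $buf or die "Write error on $file$part: $!\n";
            $left -= $got;
        }
        close $out or die "Cannot close $file$part: $!\n";
    }
    close $in;
    return;
}

# usage: perl script.pl 5 foo bar ...
if (@ARGV) {
    my $parts = shift @ARGV;
    split_file( $_, $parts ) for @ARGV;
}
```

    Memory use is then bounded by the chunk size rather than by the size of a part, at the cost of more read and print calls.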

    Chris
    Boo!
Re: How do i split a file into given number of parts.
by CharlesClarkson (Curate) on Jun 23, 2001 at 12:52 UTC

    We could adjust the answer given by mr_mischief by pulling the if block out of the inner for loop and asking the final read for the whole file's size, so that it picks up whatever is left, instead of calculating a remainder.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $parts = shift;    ### how many parts to split
    my @file  = @ARGV;    ### the files to split

    foreach ( @file ) {
        ### how big should the new file be? (whole bytes only)
        my $size = int( (-s $_) / $parts );

        ### open the input file, skipping it if that fails
        open my $in_fh, $_ or do { warn "Cannot read $_: $!"; next };
        binmode $in_fh;

        ### for all but the last part, read
        ### the amount of data, then write it to
        ### the appropriate output file.
        for my $part (1 .. $parts - 1) {
            ### read an output file worth of data
            read $in_fh, my $buffer, $size
                or warn "Read zero bytes from $_: $!";

            ### write the output file
            open my $fh, "> $_$part"
                or warn "Cannot write to $_$part: $!";
            binmode $fh;
            print $fh $buffer;
        }

        # for the last part, read the rest of
        # the file. Buffer will shrink
        # to the actual bytes read.
        read $in_fh, my $buffer, -s $_
            or warn "Read zero bytes from $_: $!";
        open my $fh, "> $_$parts"
            or warn "Cannot write to $_$parts: $!";
        binmode $fh;
        print $fh $buffer;
    }

    __END__

    The same weaknesses exist for memory usage. There could be better checks for file existence, too.
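    As a rough idea of what such checks could look like, here is a minimal sketch using Perl's file test operators. The sub name split_problems and the particular set of checks are assumptions for illustration, not something taken from the posts above; it also refuses to overwrite output files that already exist, which neither script does.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of reasons not to split $file into
# $parts pieces, or an empty list if the request looks safe.
sub split_problems {
    my ( $file, $parts ) = @_;
    my @problems;

    my $parts_ok = defined $parts && $parts =~ /^[1-9][0-9]*$/;
    push @problems, "part count must be a positive integer"
        unless $parts_ok;

    # "_" reuses the stat information gathered by the -e test
    if    ( !-e $file ) { push @problems, "$file does not exist" }
    elsif ( !-f _ )     { push @problems, "$file is not a plain file" }
    elsif ( !-r _ )     { push @problems, "$file is not readable" }
    elsif ( -z _ )      { push @problems, "$file is empty" }

    # refuse to clobber output files that are already there
    if ($parts_ok) {
        for my $part ( 1 .. $parts ) {
            push @problems, "$file$part already exists"
                if -e "$file$part";
        }
    }
    return @problems;
}
```

    A caller might run this for each file before opening anything, warn with the returned messages, and skip the file if the list is not empty.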

Re: How do i split a file into given number of parts.
by trentonl (Initiate) on Dec 21, 2004 at 20:20 UTC
    Since you're just writing them out, perhaps you can use the unix 'split' command?
    # n files each with no more than 10000 lines
    split --lines=10000 
    
    # files containing at most 1 meg each
    split --bytes=1m
    
    
Re: How do i split a file into given number of parts.
by fundflow (Chaplain) on Jun 23, 2001 at 14:29 UTC
    On unix, you can use the 'split' command.
Re: How do i split a file into given number of parts.
by mr_mischief (Monsignor) on Jun 22, 2001 at 22:51 UTC
    This solution takes as its first argument the number of ways to split a file, and the remainder of arguments are taken to be files which need to be split. It does not delete the originals.

    #!/usr/bin/perl -w
    use strict;    ### a must

    my $parts = shift;    ### how many parts to split
    my @file  = @ARGV;    ### the files to split

    foreach ( @file ) {
        ### how big should the new file be? (whole bytes only)
        my $size     = -s $_;
        my $buf_size = int( $size / $parts );

        ### ready a buffer for the data
        my $buf = '';

        ### open the input file, skipping it if that fails
        open ( In, $_ ) || do { warn "Cannot read $_: $!\n"; next };
        binmode ( In );

        ### for as many parts as there are, read
        ### the amount of data, then write it to
        ### the appropriate output file.
        for ( my $i = 0; $i < $parts; $i++ ) {
            ### read an output file worth of data; the last
            ### part also takes any leftover bytes
            my $want = $i == $parts - 1 ? $size - $buf_size * $i : $buf_size;
            read ( In, $buf, $want ) || warn "Read zero bytes from $_!\n";

            ### write the output file
            open ( Out, "> $_$i" ) || warn "Cannot write to $_$i: $!\n";
            binmode ( Out );
            print Out $buf;
            close ( Out );
        }

        ### we're done splitting the input file
        close ( In );
    }

    exit;

    Its weakness is that it reads each output segment into a single buffer; for greater usability it should probably loop over smaller reads when a segment is huge. I don't see this being a big problem given the nature of the program, though: the split files are presumably headed somewhere too small for the original file, so the parts should be small enough for main memory. I'll therefore leave more exotic buffer manipulation as an exercise.

    Chris
    Boo!

    Originally posted as a Categorized Answer.
