http://www.perlmonks.org?node_id=493374
Category: utility script
Author/Contact Info tomas cebrian tomas.cebrian@terra.es
Description: I have a very slow connection at home, but I have access to ADSL elsewhere, so I can download demos and big programs there. Unfortunately, there is no CD recorder there, so I have to split the files into smaller ones that fit on my 64 MB memory stick. That is what the "korta" script does: it splits files so that I can join them again at home.
#
#    name:    korta.pl
#    script:  1 August 2005
#    program: tomas cebrian
#
#    I have found this script useful for carrying demos from a computer
#    with internet access to one without. I only have a 64 MB memory
#    stick, so I have to split a big file into parts and carry them
#    over one by one.
#
#    Until I buy a bigger memory stick, it works well for me.


#    splits a file into several tmp files (tmp0.tmp, tmp1.tmp, ...)
#    using the option "-partir"

#    joins the tmp files back into the original, using the option "-juntar"

#    usage:  perl korta.pl [-partir][-juntar] file [-tbytes]
#    if the option -t(bytes) is not used, the file will be split
#    into files of 1400 bytes, to fit onto a floppy

#    the argument "file" is necessary for the input and for the output
#    as well.


#    Just one more thing: I'm not a programmer, I only drive a Caterpillar
#    loader in a grain warehouse, so excuse my big errors (I'm sure there
#    are some), and if you want to improve the script or correct me,
#    please do. I'm at tomas.cebrian@terra.es



my %hash;
my $parte;
my @array;
my $nro_parte=0;



$MAX_SIZE_PART=1400;



my $file=$ARGV[1];


#    changing the max size of the tmp files

if(($ARGV[2]=~m/-t/)==1){
    $MAX_SIZE_PART=substr($ARGV[2],2);    
}


my $nro_files=0;


#    nonsense to know the file size

if($ARGV[0]eq"-info"){
    $len=length($file);
    print $len."\n";
    exit 1;
}



#    in case someone tries it

if($ARGV[0]eq"-h" || $ARGV[0] eq "-help"){
    print "Usage:  perl -w korta.pl [-info] [-partir] [-juntar] file -t(maximum block size)\n";
    print "By default, files are split into parts that fit on a floppy\n\n";
    exit 1;
}



#    if we want to split

if($ARGV[0]eq"-partir"){

    open(INPUT, "$file")||die "$!\n";

    binmode INPUT;

    while(read(INPUT, $parte, $size)){

        push @array,$parte;

        if($nro_partes>$MAX_SIZE_PART){

            open(OUTPUT,">tmp$nro_files.tmp")||die "$!\n";
            binmode OUTPUT;

            $_=join('',@array);
            print OUTPUT  $_;

            close(OUTPUT);
            undef @array;
            $nro_partes=0;
            $nro_files+=1;
        }

        $nro_partes+=1;
    }

    open(OUTPUT,">tmp$nro_files.tmp")||die "$!\n";
    binmode OUTPUT;

    $_=join('',@array);
    print OUTPUT  $_;

    close(OUTPUT);

    close(INPUT);

    exit;
}


#    and to join

if($ARGV[0]eq"-juntar"){

    $final=$ARGV[1];

    my  @files=<*.tmp>;

    foreach$file(@files){

        open(INPUT,"$file")||die "$!\n";

        binmode INPUT;
        while(read(INPUT, $chunk,$size)){
            push @farray, $chunk;
        }

        close(INPUT);
    }

    open(OUTPUT,">$final")||die"$!\n";
    binmode OUTPUT;

    $_=join '',@farray;
    print OUTPUT $_;

    close OUTPUT;

    exit;
}


#    if we only type "perl korta.pl" on the command line, we'll see
#    this and exit from the script

print "Usage:  perl korta.pl [-info] [-partir] [-juntar] file -t(maximum block size)\n";
print "By default, files are split into parts that fit on a floppy\n\n\n";
exit 1;

2005-09-20 Retitled by g0n, as per Monastery guidelines
Original title: 'korta'

Re: korta - split large files
by jdporter (Paladin) on Sep 20, 2005 at 12:29 UTC
    You might also look at the Perl implementation of the standard split command in the Perl Power Tools project.
      Also the File::Split module.

      Caution: Contents may have been coded under pressure.
Re: korta - split large files
by graff (Chancellor) on Sep 20, 2005 at 16:56 UTC
    In this little chunk of your code:
    # nonsense to know the file size
    if($ARGV[0]eq"-info"){
        $len=length($file);
        print $len."\n";
        exit 1;
    }
    you are simply reporting the number of characters in the file name that the user has given on the command line. In order to state the actual byte count of data in the file, you want:
    $len = -s $file; # not length($file)
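To make the difference concrete, here is a minimal sketch. The helper name `file_size` and the file name `demo.zip` are only illustrations, not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: size of a file in bytes.
# -s gives undef if the file does not exist, and a false value if it is empty.
sub file_size {
    my ($path) = @_;
    return -s $path;
}

my $name = 'demo.zip';          # assumed example file name

# length() counts the characters of the NAME, not the bytes on disk.
print length($name), "\n";      # prints 8

my $bytes = file_size($name);
print defined $bytes ? "$name holds " . ($bytes || 0) . " bytes\n"
                     : "$name not found\n";
```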
    I would recommend that since you are already using "my" declarations a lot, you should go ahead and include "use strict" as well. It might help catch some problems like:

    $size is first mentioned in the "read()" call for the "-partir" operation; this means that you are passing an undefined value to "read()" for its "LENGTH" parameter, which means you don't read anything. Maybe you meant to put something like "$MAX_SIZE_PART*1024" there instead?
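As a rough illustration of what "use strict" would have reported, the sketch below compiles a one-line copy of that read() call inside a string eval, so the compile-time error can be printed instead of stopping the program. The variable names `$snippet` and `$error` are mine.

```perl
use strict;
use warnings;

# Mimics the read() call from the posted script, where $size was never
# declared. The string eval inherits "use strict" from this file, so the
# snippet fails to compile and is never run.
my $snippet = 'my $parte; read(STDIN, $parte, $size); 1;';

my $ok    = eval $snippet;
my $error = $ok ? '' : $@;

print "strict refused to compile it:\n$error" if $error;
```

Without strict the same call compiles quietly, and read() is simply asked for zero bytes.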

    You never know what mistakes users will make on command line args, so I would advise more careful sanity checks on the @ARGV values. Consider using one of the Getopt modules. At a minimum, this:

    if(($ARGV[2]=~m/-t/)==1){
        $MAX_SIZE_PART=substr($ARGV[2],2);
    }
    should be:
    if ( $ARGV[2] =~ /^-t(\d+)/ ) {
        $MAX_SIZE_PART = $1;
    }
    else {
        die "Usage: blah blah\n";
    }
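For the Getopt route, a minimal sketch with the core Getopt::Long module could look like the following. The sub name `parse_args` and the hash keys are hypothetical, and the size is written as a separate value (`-t 2048`), which differs from the script's original `-t2048` form.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical parser: returns a hash ref on success, nothing on bad input.
sub parse_args {
    my @args = @_;
    my %opt  = (size => 1400);    # default taken from $MAX_SIZE_PART

    GetOptionsFromArray(
        \@args,
        'partir'   => \$opt{split},
        'juntar'   => \$opt{join},
        'size|t=i' => \$opt{size},    # "=i" accepts integers only
    ) or return;

    # exactly one of -partir / -juntar must be given
    return unless ($opt{split} xor $opt{join});

    # a positive size and exactly one file name
    return unless $opt{size} > 0 && @args == 1;

    $opt{file} = $args[0];
    return \%opt;
}

my $opt = parse_args(qw(-partir demo.zip -t 2048));
print "mode: split, file: $opt->{file}, size: $opt->{size}\n" if $opt;
```

The script itself would call `parse_args(@ARGV)` and print its usage message when nothing comes back.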
    Also, it looks like you are reading the entire content of the file into memory. Since this is for handling really big files, it would be better to read and write a chunk at a time, and not save each successive chunk by pushing them all onto an array.
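A minimal sketch of the chunk-at-a-time idea, assuming the same tmp0.tmp, tmp1.tmp naming as the posted script. The sub name `split_file`, its byte-based `$part_size` argument and the 64 KB default buffer are my assumptions, not the author's code.

```perl
use strict;
use warnings;

# Split $file into pieces of at most $part_size bytes, written to
# tmp0.tmp, tmp1.tmp, ... in the current directory. Only one buffer of
# $chunk_size bytes is held in memory. Returns the number of parts.
sub split_file {
    my ($file, $part_size, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;    # assumed buffer size
    die "part size must be positive\n" unless $part_size > 0;

    open(my $in, '<', $file) or die "Cannot open $file: $!\n";
    binmode $in;

    my ($part, $written, $out) = (0, 0, undef);
    while (1) {
        # never read past the end of the current part
        my $room = $part_size - $written;
        my $want = $room < $chunk_size ? $room : $chunk_size;

        my $got = read($in, my $buf, $want);
        die "Read error on $file: $!\n" unless defined $got;
        last if $got == 0;        # end of input

        if (!$out) {
            my $name = "tmp$part.tmp";
            open($out, '>', $name) or die "Cannot write $name: $!\n";
            binmode $out;
        }
        print {$out} $buf or die "Write error: $!\n";
        $written += $got;

        if ($written >= $part_size) {    # this part is full
            close($out) or die "Close error: $!\n";
            undef $out;
            $part++;
            $written = 0;
        }
    }

    my $count = $out ? $part + 1 : $part;
    if ($out) { close($out) or die "Close error: $!\n" }
    close $in;
    return $count;
}
```

Called as `split_file('demo.zip', 60 * 1024 * 1024)`, this would write parts of at most 60 MB each.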

    Then there's also something very confusing about your use of $nro_partes (which probably should be spelled the same as the one declared at the top as "my $nro_parte;") -- you increment this by one on each read, but you compare it to $MAX_SIZE_PART.
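To see what that counter actually does, here is a small simulation of the posted loop. It performs no file I/O and only counts how many read() calls end up in each part. The sub name `reads_per_part` is hypothetical, and it assumes the loop body is exactly as posted.

```perl
use strict;
use warnings;

# Replays the counter logic of the posted -partir loop.
# $limit plays the role of $MAX_SIZE_PART, $total_reads is the number of
# successful read() calls. Returns the number of reads stored in each part.
sub reads_per_part {
    my ($limit, $total_reads) = @_;
    my ($counter, $in_part, @parts) = (0, 0);

    for (1 .. $total_reads) {
        $in_part++;                       # push @array, $parte
        if ($counter > $limit) {          # if($nro_partes>$MAX_SIZE_PART)
            push @parts, $in_part;        # write tmpN.tmp
            ($in_part, $counter) = (0, 0);
        }
        $counter++;                       # $nro_partes+=1
    }
    push @parts, $in_part;                # the final write after the loop
    return @parts;
}

print join(',', reads_per_part(3, 10)), "\n";    # prints 5,4,1
```

So the limit counts reads, not bytes, and the size of a part in bytes is the number of reads times the length passed to read(). With the default limit of 1400, the first part gets 1402 reads and each later full part gets 1401.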

    Assuming that you have been using some version of this code that actually works, I believe that you have posted some different version, because I doubt that the code as originally posted will work.

      Thank you very much for correcting me... As I explain in my comments, I'm not really a programmer, and, well, I'm just beginning with Perl. You are right! I have downloaded the script and found a difference from the one I use: the posted one doesn't declare "my $size=1024". If you add this, the script will work. I think I posted the code before correcting the bug :-(

      As for $nro_partes, what the script does is read parts of $MAX_SIZE_PART. When $nro_partes is less than $MAX_SIZE_PART, it writes out the rest of the file. E.g.: 120 MB into 50, 50 and 20.

      Thank you for your advice about @ARGV... I'm not very clear about the use of the arguments yet. I still have a lot to learn ;-) And it's true, I'd do better to read and write a chunk at a time... In fact, my hard disk sounds as if it's going to explode each time I run the script.

      I know that the first thing to do when you want to program in Perl is to look at CPAN, but I also know that the best way to learn programming is to write programs... And I was sooo proud of this script! Well, I'll try to be more careful with the code before posting it. Thank you.
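The join side can be streamed in the same way. One thing worth noting: a plain `<*.tmp>` glob returns names in string order, so tmp10.tmp comes before tmp2.tmp once there are more than ten parts. The sketch below sorts by the number in the name. The sub names `part_files` and `join_files` and the 64 KB default buffer are my assumptions.

```perl
use strict;
use warnings;

# tmpN.tmp files in the current directory, in numeric order of N.
sub part_files {
    my @names = grep { /^tmp\d+\.tmp$/ } glob('tmp*.tmp');
    return sort { ($a =~ /(\d+)/)[0] <=> ($b =~ /(\d+)/)[0] } @names;
}

# Append every part to $final, one buffer at a time.
sub join_files {
    my ($final, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;    # assumed buffer size

    open(my $out, '>', $final) or die "Cannot write $final: $!\n";
    binmode $out;

    for my $name (part_files()) {
        open(my $in, '<', $name) or die "Cannot open $name: $!\n";
        binmode $in;
        while (1) {
            my $got = read($in, my $buf, $chunk_size);
            die "Read error on $name: $!\n" unless defined $got;
            last if $got == 0;    # end of this part
            print {$out} $buf or die "Write error: $!\n";
        }
        close $in;
    }

    close($out) or die "Close error: $!\n";
    return;
}
```

With this, `join_files('demo.zip')` rebuilds the file from whatever tmpN.tmp parts are in the current directory.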
Re: korta - split large files
by monarch (Priest) on Sep 20, 2005 at 13:41 UTC
    ..because, of course, the early 90s weren't proliferated with file splitting utilities on bulletin board systems.. *sigh*
Re: korta - split large files
by elwarren (Priest) on Sep 21, 2005 at 21:04 UTC
    I got three old Zip drives sitting in my basement collecting dust. You can keep one at home, take one to the office, and keep the third in your bag. You can have them super cheap :-) 100mb > 64mb.