Perl script to split a file and process then concatenate based on size and a string
by sharp859 (Initiate)
on Feb 02, 2013 at 06:49 UTC
sharp859 has asked for the
wisdom of the Perl Monks concerning the following question:
Could somebody help me work out the logic for the following in Perl? I am using Windows 7.
C:\script>perl split_concatenate.pl large_file a or b (the input is a large file plus a value, a or b, that decides how it is processed later).
Check whether the file is greater than 40KB (or some similar size). If it is not, and the choice is "a", run the command
command -i large_file.txt -o large_file_new -a
or else, if the choice is "b", run
command -i large_file.txt -o large_file_new -b
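A minimal sketch of the size check and the a/b dispatch described above. It assumes the 40KB threshold is fixed, and `command` is only the placeholder name from the question, not a real program. The helper names `is_small` and `build_command` are made up for illustration.

```perl
use strict;
use warnings;

my $LIMIT = 40 * 1024;    # 40KB threshold taken from the question

# Return true when the file is small enough to process in one go.
sub is_small {
    my ($path) = @_;
    my $size = -s $path;                  # undef if the file cannot be stat'ed
    die "Cannot stat '$path'\n" unless defined $size;
    return $size <= $LIMIT;
}

# Build the argument list for the external tool.
# "command" is the placeholder name used in the question.
sub build_command {
    my ($in, $out, $choice) = @_;
    die "Choice must be 'a' or 'b'\n" unless $choice =~ /\A[ab]\z/;
    return ('command', '-i', $in, '-o', $out, "-$choice");
}
```

The list could then be handed to `system`, for example `system(build_command('large_file.txt', 'large_file_new', 'a')) == 0 or die "command failed: $?\n";`, which keeps the arguments separate instead of building one command string.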
If it is greater than, say, 40KB, split the file at around every 40KB (the first piece will be part1), append to part1 the first "particular string" that occurs in part2, and save it for processing. If there are multiple "particular strings", create subsequent files, each of which should end with the next "particular string" found in the following part. (A "particular string" starts with a fixed string but ends with a varying value.) So the script should search for further "particular strings" in part2 and beyond and append the first one available; if only one is available, there is no need to do anything except split, because the file should always end with a particular string.
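One possible reading of the splitting rule, sketched on an in-memory string: each part is at least the size limit and is then extended up to and including the next marker, so every part ends on a "particular string". This moves the cut point forward rather than duplicating the marker, which is an assumption. The name `split_at_marker` and the idea of passing the marker as a compiled regex (for example `qr/END\d+/`) are hypothetical. A marker that straddles the size boundary is skipped in favour of the next one.

```perl
use strict;
use warnings;

# Split $text into parts of at least $limit characters (bytes if the
# file was read in raw mode), each extended so that it ends just after
# the next match of $marker. The last part takes whatever remains.
sub split_at_marker {
    my ($text, $limit, $marker) = @_;
    my @parts;
    my $start = 0;
    my $len   = length $text;
    while ($len - $start > $limit) {
        pos($text) = $start + $limit;     # start searching at the size boundary
        if ($text =~ /$marker/g) {
            my $end = pos($text);         # offset just past the marker
            push @parts, substr($text, $start, $end - $start);
            $start = $end;
        }
        else {
            last;                         # no further marker: keep the rest together
        }
    }
    push @parts, substr($text, $start) if $start < $len;
    return @parts;
}
```

Each returned part could then be written to filepart1.txt, filepart2.txt and so on; joining the parts back together gives the original text unchanged.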
Then run the same command on each part. For choice "a":
command -i filepart1.txt -o filepart1.dat -a
command -i filepart2.txt -o filepart2.dat -a (if needed)
For choice "b":
command -i filepart1.txt -o filepart1.dat -b
command -i filepart2.txt -o filepart2.dat -b (if needed)
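The per-part processing could be sketched like this, assuming the parts are named filepartN.txt and the outputs filepartN.dat as in the lines above. `part_commands` is a hypothetical helper, and `command` is still just the placeholder from the question.

```perl
use strict;
use warnings;

# Map each part file to the argument list that would process it.
sub part_commands {
    my ($choice, @parts) = @_;
    die "Choice must be 'a' or 'b'\n" unless $choice =~ /\A[ab]\z/;
    my @cmds;
    for my $in (@parts) {
        my $out = $in;
        # filepart1.txt -> filepart1.dat; append .dat if there is no .txt suffix
        $out .= '.dat' unless $out =~ s/\.txt\z/.dat/;
        push @cmds, ['command', '-i', $in, '-o', $out, "-$choice"];
    }
    return @cmds;
}
```

A caller might loop over the result with `for my $cmd (part_commands('a', @parts)) { system(@$cmd) == 0 or die "failed: @$cmd\n" }`.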
After this, the outputs need to be concatenated:
filepart1.dat + filepart2.dat + ... + filepartN.dat = large_file.dat
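The concatenation step could look roughly like this, assuming the .dat outputs may be binary and should be copied byte for byte. `concat_files` is a hypothetical name, and the caller is assumed to pass the parts in the right order.

```perl
use strict;
use warnings;

# Write every part file into $out, in the order given, in binary mode.
sub concat_files {
    my ($out, @parts) = @_;
    open my $dst, '>:raw', $out or die "Cannot write '$out': $!\n";
    for my $part (@parts) {
        open my $src, '<:raw', $part or die "Cannot read '$part': $!\n";
        local $/ = \65536;                # read in 64KB blocks instead of lines
        while (my $block = <$src>) {
            print {$dst} $block;
        }
        close $src;
    }
    close $dst or die "Cannot close '$out': $!\n";
    return;
}
```

Note that a plain `sort` on the file names would put filepart10.dat before filepart2.dat, so with ten or more parts the list needs to be ordered by part number.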
I started by finding the size first, using the code below:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::stat;
    my $filesize = stat("Full_File.txt")->size;
    print "Size: $filesize\n";
    exit 0;
It would be great if someone could help so that I can learn. If this is not possible: the file reaches about 40KB at every 500th line, so I think it would be easier to work by lines. At every 500th line, append the next available "particular string", split there, and run the commands above. If the file has fewer than 1000 lines, only two parts are needed, and nothing has to be appended to part2 because it already has one at its end of file. Maybe that is easier? Thanks a lot.
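The line-based alternative could be sketched as follows, assuming each "particular string" sits on a line of its own that a regex can recognise. `chunk_lines` and its parameters are hypothetical names. A chunk is closed at the first marker line on or after the 500th line, so it always ends on a marker.

```perl
use strict;
use warnings;

# Group lines into chunks of at least $max_lines lines. Each chunk keeps
# growing until it reaches a line matching $marker, so it ends on one.
# Any leftover lines form the final chunk.
sub chunk_lines {
    my ($lines, $max_lines, $marker) = @_;
    my (@chunks, @current);
    for my $line (@$lines) {
        push @current, $line;
        if (@current >= $max_lines && $line =~ $marker) {
            push @chunks, [@current];     # store a copy of the finished chunk
            @current = ();
        }
    }
    push @chunks, [@current] if @current;
    return @chunks;
}
```

With the file read into an array, `chunk_lines(\@lines, 500, qr/^PARTICULAR/)` would return one array reference per part; the pattern here is only a stand-in for the real marker.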