 
PerlMonks  

Merging files, 1 line for every 10

by Luxin (Initiate)
on Sep 06, 2011 at 21:16 UTC ( #924466=perlquestion )
Luxin has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I need to merge two files. File1 has 30,000 rows. File2 has 3,000. I need to evenly merge File2 throughout File1, with File3 as the output. Any ideas on how to accomplish this?

I did multiple searches but didn't find anything that would solve this problem.

Thanks!

Re: Merging files, 1 line for every 10
by charlesboyo (Beadle) on Sep 06, 2011 at 21:20 UTC
    Hi Luxin. What have you tried? Show some code and you'll get help. No code, no play.
Re: Merging files, 1 line for every 10
by CountZero (Bishop) on Sep 06, 2011 at 21:27 UTC
    What a very strange concept. I can see no real practical use for such a thing.

    Homework?

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      LOL - No, not homework. I am performance testing a system that uses multiple data repositories. In Production, about 10% of requests get serviced by system A, and 90% get serviced by system B. To properly mimic production I need to load test the system with similar requests, hence the 10/90 split.
        OK. But doing a "1 line from file A followed by 9 lines from file B", you are actually handling a degenerate case where all requests are artificially spaced in an even way. Wouldn't it be more "real" to randomly determine from which file to read the next request:
        if (int(rand(10)) == 0) {
            # read from file A
        } else {
            # read from file B
        }
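A minimal sketch of that random choice, assuming both request lists fit in memory. The helper name random_merge and its probability parameter are made up for illustration; they are not from the thread. It takes from the smaller list with probability $p while both lists still have items, then appends whatever is left.

```perl
use strict;
use warnings;

# Hypothetical helper: interleave two lists at random, taking the next
# item from @$minor with probability $p while both lists have items.
sub random_merge {
    my ($major, $minor, $p) = @_;
    my @major = @$major;    # work on copies so the caller's arrays survive
    my @minor = @$minor;
    my @out;
    while (@major && @minor) {
        push @out, rand() < $p ? shift @minor : shift @major;
    }
    # One list ran dry: append the remainder of the other.
    push @out, @major, @minor;
    return \@out;
}

my $merged = random_merge([ map { "B$_" } 1 .. 90 ], [ map { "A$_" } 1 .. 10 ], 0.1);
print scalar(@$merged), " requests\n";
```

The relative order inside each source is preserved; only the interleaving is random, so the 10/90 split holds on average rather than exactly in every window of ten.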

        CountZero

Re: Merging files, 1 line for every 10
by JavaFan (Canon) on Sep 06, 2011 at 21:53 UTC
    I wouldn't use Perl for something so trivial - that's easily done in a shell one-liner. I'll give a shell (bash) solution, porting it to Perl is left as an exercise for the reader. Assume the names of the two files are in variables $FILE1 and $FILE2.
    sort -n -k1,1 <(nl -ba -i2 -v1 $FILE1) <(nl -ba -i20 -v0 $FILE2) | cut -f2-
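For readers who want to see the numbering trick spelled out, here is a rough Perl sketch of the same idea, assuming both files are already read into arrays. The name merge_by_key is invented for this example. Lines of the first file get the keys 1, 3, 5, ... and lines of the second file get 0, 20, 40, ..., so a numeric sort puts one line of the second file in front of every ten lines of the first.

```perl
use strict;
use warnings;

# Hypothetical helper: tag every line with a numeric key, sort by key,
# then strip the key again (the role of `cut -f2-` in the shell version).
sub merge_by_key {
    my ($lines1, $lines2) = @_;
    my @tagged = (
        (map { [ 1 + 2 * $_, $lines1->[$_] ] } 0 .. $#$lines1),    # 1, 3, 5, ...
        (map { [ 20 * $_,    $lines2->[$_] ] } 0 .. $#$lines2),    # 0, 20, 40, ...
    );
    return [ map { $_->[1] } sort { $a->[0] <=> $b->[0] } @tagged ];
}

my $out = merge_by_key([ map { "f1-$_" } 1 .. 20 ], [ 'f2-1', 'f2-2' ]);
print "$_\n" for @$out;
```

Odd keys for one file and even keys for the other mean no two lines ever share a key, so the result does not depend on the sort being stable.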
Re: Merging files, 1 line for every 10 (one-liner)
by BrowserUk (Pope) on Sep 06, 2011 at 22:10 UTC

    A one-liner:

    perl -ple"printf q[%s], scalar <STDIN> unless $n++ % 10" file1 < file2

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
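The one-liner packs a lot into the -p loop, so here is an expanded sketch of roughly what it does. The sub name interleave and its arguments are assumptions for illustration, and unlike the one-liner it silently skips the extra line once the small input runs out. Before lines 1, 11, 21, ... of the large input it emits one line of the small input.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the one-liner's loop.
sub interleave {
    my ($big_fh, $small_fh, $out_fh, $every) = @_;
    my $n = 0;
    while (my $line = <$big_fh>) {
        unless ($n++ % $every) {
            my $extra = <$small_fh>;
            print {$out_fh} $extra if defined $extra;    # small input may run out
        }
        print {$out_fh} $line;
    }
}

# Demo on in-memory filehandles instead of real files.
my $big   = join '', map { "big $_\n" }   1 .. 20;
my $small = join '', map { "small $_\n" } 1 .. 2;
open my $big_fh,   '<', \$big   or die $!;
open my $small_fh, '<', \$small or die $!;
my $result = '';
open my $out_fh, '>', \$result or die $!;
interleave($big_fh, $small_fh, $out_fh, 10);
close $out_fh;
print $result;
```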
Re: Merging files, 1 line for every 10
by ForgotPasswordAgain (Deacon) on Sep 06, 2011 at 22:14 UTC

    Say that you read each file into an array.

    use File::Slurp;
    my @file1 = read_file('file1');
    my @file2 = read_file('file2');

    Where would you go from there? Eventually you could have a new array, write it out like this:

    my $lines = merge_lines(\@file1, \@file2);
    write_file('file3', @$lines);

    How would you write 'merge_lines'?

    sub merge_lines {
        my ($lines1, $lines2) = @_;
        my @lines;
        ....
        return \@lines;
    }

    Not the most efficient way to do it, but...
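One possible way to fill in the body, offered as a sketch rather than the intended answer. The optional third parameter $every is an assumption added here and defaults to 10. After every $every lines of the first list it appends the next unused line of the second list, and any lines of the second list that are left over go at the end.

```perl
use strict;
use warnings;

sub merge_lines {
    my ($lines1, $lines2, $every) = @_;
    $every ||= 10;    # assumed default: one extra line per 10
    my @lines;
    my $next = 0;     # index of the next unused line in $lines2
    for my $i (0 .. $#$lines1) {
        push @lines, $lines1->[$i];
        if (($i + 1) % $every == 0 && $next < @$lines2) {
            push @lines, $lines2->[ $next++ ];
        }
    }
    # Keep any lines of the second list that did not fit the pattern.
    push @lines, @{$lines2}[ $next .. $#$lines2 ];
    return \@lines;
}

my $merged = merge_lines([ map { "a$_\n" } 1 .. 20 ], [ "b1\n", "b2\n", "b3\n" ]);
print @$merged;
```

Since read_file in list context keeps the trailing newlines, the merged array can be handed to write_file unchanged.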

Re: Merging files, 1 line for every 10
by Kc12349 (Monk) on Sep 06, 2011 at 22:45 UTC

    I'm with others here wondering what possible purpose this serves, but I would probably do something like the below. $in_file1 being the larger of the two files.

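    use feature qw(say state);
    open(my $in_fh1, '<', $in_file1) or die $!;
    open(my $in_fh2, '<', $in_file2) or die $!;
    open(my $out_fh, '>', $out_file) or die $!;
    while (my $line1 = <$in_fh1>) {
        chomp($line1);
        state $i;
        $i++;
        say {$out_fh} $line1;
        unless ($i % 10) {    # every 10th line of file 1
            my $line2 = <$in_fh2>;
            next unless defined $line2;    # file 2 exhausted
            chomp($line2);
            say {$out_fh} $line2;
        }
    }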
      I'd recommend simpler code:
      open my $in_fh1, '<', $in_file1 or die $!;
      open my $in_fh2, '<', $in_file2 or die $!;
      open my $out_fh, '>', $out_file or die $!;
      while (defined(my $line1 = <$in_fh1>)) {
          print {$out_fh} $line1;
          unless ($. % 10) {    # $. is the line number of the handle last read, here $in_fh1
              my $line = <$in_fh2>;
              print {$out_fh} $line if defined $line;
          }
      }

        The use of $. does make things simpler. I'll have to take note of that for similar situations in the future. What is the benefit of the added defined call?
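On the defined question: as far as I know (see "I/O Operators" in perlop), Perl already wraps a bare `while (my $line = <$fh>)` condition in defined, so the explicit call spells out what happens anyway and is harmless here. The sketch below, with a made-up helper count_lines, checks this on an input whose last line is a bare "0" with no newline, a string that is false in boolean context.

```perl
use strict;
use warnings;

# Count the lines a plain `while (my $line = <$fh>)` loop sees.
sub count_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    while (my $line = <$fh>) {    # treated as defined(my $line = <$fh>)
        $count++;
    }
    close $fh;
    return $count;
}

# The final "0" is still counted, so the loop stops only at end of file.
print count_lines("first\n0"), "\n";
```

The explicit defined starts to matter when the read is part of a larger condition, where the implicit wrapping no longer applies.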

Re: Merging files, 1 line for every 10
by Marshall (Prior) on Sep 07, 2011 at 04:06 UTC
    You have not adequately described the problem. Free Dictionary defines "merge" to mean : merge. In this case, perhaps combining two flows stepwise.

    This term: "evenly merge" does not have a clear definition that I know of.

    Did you mean: ten rows from File1, then one row from File2?
    That seems rather easy - too easy for this question - so what is it that you mean by "evenly merge"?

    Ooops, update: I guess it really is simple. My bad. Loop for 10 lines from File1 and send each line to File3. Send one line from File2 to File3. Loop back for the next 10 lines from File1. What's the problem?

Node Type: perlquestion [id://924466]
Approved by Corion
Front-paged by Corion