PerlMonks  

Re: Appending single record to CSV (or TDV, not too late to switch) - filling 13 fields from 13 files, one of which is split into array of 2 and I just need half of it...

by kcott (Archbishop)
on Jan 23, 2018 at 08:25 UTC ( [id://1207733]=note )


in reply to Appending single record to CSV (or TDV, not too late to switch) - filling 13 fields from 13 files, one of which is split into array of 2 and I just need half of it...

G'day hrholmer,

Welcome to the Monastery.

[This post was difficult to read, which you acknowledge. For future reference, avoid prosaic descriptions and the chatty style you've adopted. Choose concise statements to describe your problem; use short dot points instead of drawn-out paragraphs; put code in blocks; and use pseudocode if you don't know what syntax you need.]

The following script, pm_1207659_csv_append.pl, provides some techniques which you may find useful (assuming I've got the basic idea of what you're trying to accomplish).

#!/usr/bin/env perl -l

use strict;
use warnings;
use autodie;

use Text::CSV;
use Inline::Files;

my $csv = Text::CSV::->new();
my $csv_file = 'pm_1207659_out.csv';
my %seen;

open my $csv_fh, '<', $csv_file;
while (my $row = $csv->getline($csv_fh)) {
    $seen{$row->[0]} = 1;
}
close $csv_fh;

my @in_fhs = (\*FILE1, \*FILE2, \*FILE3);

open my $out_fh, '>>', $csv_file;

while (1) {
    my @data = map scalar readline $_, @in_fhs;
    last unless defined $data[0];
    chomp @data;

    my $key = (split /_/, $data[0])[1];
    next if $seen{$key}++;

    $csv->print($out_fh, [$key, @data[1 .. $#in_fhs]]);
}

close $out_fh;

__FILE1__
A_B
B_B
C_A
B_C
C_D
__FILE2__
F2-1
F2-2
F2-3
F2-4
F2-5
__FILE3__
F3-1
F3-2
F3-3
F3-4
F3-5

Notes:

  • Let Perl do your I/O checking with the autodie pragma.
  • Use Text::CSV as ++Tux has already discussed. As the name suggests, comma-separated is the default; use the sep_char attribute if you want something different (e.g. "\t" for tabs).
  • I've only used 3 (not 13) files. Inline::Files is for demonstration purposes. You'll need to actually open real files.
  • I've created 'pm_1207659_out.csv' with some initial data to show handling of existing duplicates. The new data also contains duplicates, which are handled using the same %seen hash. As stated above, this is intended to demonstrate a technique: you'll need to adapt it to your needs.
  • You should also note that I've made gross assumptions about the data. For instance, all files have the same number of records; and all records contain only valid data. You'll need to add appropriate validation and checking for any production code.
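The notes about opening real files and validating record counts can be sketched together. This is a minimal illustration only, assuming plain text inputs with one record per line; the helper name read_parallel and the throwaway files f1.txt and f2.txt are hypothetical and not part of the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;                  # open/close die with a useful message on failure
use File::Temp 'tempdir';

# Hypothetical helper: read one line from each file in lockstep and
# return a list of array refs, one per record position.
sub read_parallel {
    my @paths = @_;
    my @fhs = map { open my $fh, '<', $_; $fh } @paths;

    my @records;
    while (1) {
        my @data  = map { scalar readline($_) } @fhs;
        my $ended = grep { !defined $_ } @data;
        last if $ended == @fhs;                          # every file is exhausted
        die "Files differ in record count\n" if $ended;  # only some ran out
        chomp @data;
        push @records, [@data];
    }
    close $_ for @fhs;
    return @records;
}

# Demonstration with throwaway files standing in for the real inputs.
my $dir = tempdir(CLEANUP => 1);
my %content = ('f1.txt' => "A_B\nB_B\n", 'f2.txt' => "F2-1\nF2-2\n");
for my $name (sort keys %content) {
    open my $fh, '>', "$dir/$name";
    print {$fh} $content{$name};
    close $fh;
}

my @records = read_parallel(map { "$dir/$_" } sort keys %content);
print join(',', @$_), "\n" for @records;
```

In real use you would pass your 13 file paths to the helper and hand each record to $csv->print() rather than printing it directly; whether dying on a short file is the right response depends on your data.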

Here's a sample run with before and after data:

$ cat pm_1207659_out.csv
X,Y,Z
C,what,ever
some,thing,else
$ pm_1207659_csv_append.pl
$ cat pm_1207659_out.csv
X,Y,Z
C,what,ever
some,thing,else
B,F2-1,F3-1
A,F2-3,F3-3
D,F2-5,F3-5

You should be able to follow this through and see how duplicate data (existing and new) is handled.
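If it helps, here is a small trace of the duplicate handling on its own, using only the keys from the sample data. It is a sketch for following the logic, not a replacement for the script: the existing keys are hard-coded rather than read with Text::CSV, and the @kept array is introduced purely for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Keys already in the first column of the existing CSV (the "before" listing).
my %seen = map { $_ => 1 } qw{X C some};

# First-file records, copied from the script's __FILE1__ section.
my @file1 = qw{A_B B_B C_A B_C C_D};

my @kept;
for my $i (0 .. $#file1) {
    my $key = (split /_/, $file1[$i])[1];    # text after the underscore

    # Post-increment returns the old count: false only on first sighting.
    if ($seen{$key}++) {
        print "record ", $i + 1, ": key '$key' skipped (already seen)\n";
        next;
    }
    print "record ", $i + 1, ": key '$key' appended\n";
    push @kept, $key;
}
print "appended keys: @kept\n";
```

Records 1, 3 and 5 are the ones appended, which is why the new rows in the sample run carry the F2-1/F3-1, F2-3/F3-3 and F2-5/F3-5 values.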

— Ken
