Can Text::CSV_XS return key-value pairs?

by choroba (Bishop)
on Jun 11, 2017 at 12:40 UTC

in reply to Can Text::CSV_XS return key-value pairs?

You can use Text::CSV_XS for both steps: splitting on "|" and splitting on ";". I wasn't able to achieve this using the csv function, so I had to go more procedural:
#!/usr/bin/perl
use warnings;
use strict;

use Text::CSV_XS qw(csv);
use Data::Dumper;

my $file = shift;

# Outer parser: splits each line on "|".
my $csv = 'Text::CSV_XS'->new({
    sep_char         => '|',
    quote_char       => undef,
    empty_is_undef   => 1,
    allow_whitespace => 1,
}) or die 'Text::CSV_XS'->error_diag;

open my $CSV, '<', $file or die $!;

my %structure;

# Inner parser: splits the second field on ";". Built once, outside the loop.
my $inner = 'Text::CSV_XS'->new({
    sep_char         => ';',
    quote_char       => undef,
    allow_whitespace => 1,
}) or die 'Text::CSV_XS'->error_diag;

while (my $row = $csv->getline($CSV)) {
    $inner->parse($row->[1]);
    $structure{ $row->[0] } = [ $inner->fields ];
}

print Dumper(\%structure);

Update: Fixed following the suggestions by Tux below.

Can Text::CSV_XS return key-value pairs?
by Tux (Abbot) on Jun 11, 2017 at 12:45 UTC

    That is close, but not what the OP requested. The second Text::CSV_XS->new also needs allow_whitespace => 1. I'd also put the constructor *outside* of the second loop. No need to create it on every iteration. And yes, it is safer than using split, but I did not want to be pedantic.

    Enjoy, Have FUN! H.Merijn
Can Text::CSV_XS return key-value pairs?
by Lady_Aleena (Curate) on Jun 13, 2017 at 14:05 UTC

    choroba, could you please include where you got the headings/headers? You used the method new() instead of the function csv(). Since new() doesn't have a header option, I don't know where the headers are invoked if they are not the first line of the file being parsed.

    As for the secondary new() splitting (or whatever does it) at ;, that would be triggered whenever a header has a + at the end of it (with the + being removed from the header after parsing). So, for example, my movies.txt has the following headers...

    'headers' => ['title','start year','end year',qw(media format+ Wikipedia allmovie IMDb Flixster genre+ source company)],

    format and genre would be parsed at the ; while all other fields are strings.

    So for a file where I only want a key-value pair for each line, it looks like I will have to put in 2 headers, and if the second one has a + it is to be parsed with the value being split (or whatever) on the ;.

    I gravitated directly to csv() because it looked easier to use than new() since I could not figure out what did what. Like what makes an array of hashes and what makes a hash of hashes (what I use mostly). The whatchamacallits (like getline, parse, etc) are not grouped together in such a way as to make it obvious to me.

    So, would you please expand the code so I can see everything you are doing? I am a bit lost.

    No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
    Lady Aleena
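
On the question of what makes an array of hashes and what makes a hash of hashes: with the csv function, the shape of the result depends on the headers and key options. The following is only a minimal sketch. The sample data, the column names, and the read_movies helper are made up for illustration, and the parser attributes are copied from the code earlier in this thread.

```perl
use warnings;
use strict;

use Text::CSV_XS qw(csv);
use Data::Dumper;

# Hypothetical sample data and column names, for illustration only.
my $data    = "Alien | 1979 | horror\nBrazil | 1985 | satire\n";
my @headers = ('title', 'start year', 'genre');

# Parser attributes matching the ones used earlier in the thread.
my %attr = (sep_char => '|', quote_char => undef, allow_whitespace => 1);

# Hypothetical helper: parse the sample data, passing extra options to csv().
sub read_movies {
    my (%extra) = @_;
    open my $fh, '<', \$data or die $!;
    return csv(in => $fh, headers => [@headers], %attr, %extra);
}

my $aoh = read_movies();                # headers only: array of hashes
my $hoh = read_movies(key => 'title');  # headers plus key: hash of hashes

print Dumper($aoh, $hoh);
```

Passing only headers gives one hash per line, collected in an array; adding key makes the named column the key of an outer hash instead.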
      I used new instead of csv because it gives you more control over what you can do. Here, no headers are defined at all, which means each row is split into an array reference, $row. The second field of the row is processed by the very same module, as you recognised, to split the string on semicolons. If I understand your comment correctly, in the general case you'd need to do that for each column whose header ends in +. The output structure is built on line 31, in the assignment to %structure inside the while loop: the first field of the line is used as the key, and the result of the secondary processing is used as the value.
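
The general case described above, where every column whose header ends in + is split on semicolons, could be sketched roughly as follows. This is not code from the thread: the parse_lines helper, the header list, and the sample line are assumptions for illustration, and the result is an array of hashes rather than the key-value structure built earlier.

```perl
use warnings;
use strict;

use Text::CSV_XS;
use Data::Dumper;

# Hypothetical helper: read "|"-separated lines from $fh and return an
# array of hashes. Columns whose header ends in "+" become array refs.
sub parse_lines {
    my ($fh, @headers) = @_;

    my $outer = 'Text::CSV_XS'->new({
        sep_char => '|', quote_char => undef, allow_whitespace => 1,
    }) or die 'Text::CSV_XS'->error_diag;
    my $inner = 'Text::CSV_XS'->new({
        sep_char => ';', quote_char => undef, allow_whitespace => 1,
    }) or die 'Text::CSV_XS'->error_diag;

    my @records;
    while (my $row = $outer->getline($fh)) {
        my %record;
        for my $i (0 .. $#headers) {
            my $name  = $headers[$i];    # a copy, so @headers stays intact
            my $value = $row->[$i];
            if ($name =~ s/\+\z//) {     # strip the trailing "+" marker
                my @parts;
                if (defined $value && length $value) {
                    $inner->parse($value) or die "Cannot parse '$value'";
                    @parts = $inner->fields;
                }
                $record{$name} = \@parts;
            }
            else {
                $record{$name} = $value;
            }
        }
        push @records, \%record;
    }
    return \@records;
}

# Made-up sample line and headers, for illustration only.
my $data = "Alien | 1979 | DVD; Blu-ray | horror; science fiction\n";
open my $fh, '<', \$data or die $!;
print Dumper(parse_lines($fh, 'title', 'start year', 'format+', 'genre+'));
```

Both parsers are created once, before the loop, as Tux suggested, and the inner parser is reused for every marked column.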

