Can Text::CSV_XS return key-value pairs?

Lady_Aleena has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I'm currently rewriting one of my more loathed subs with Text::CSV_XS. I was wondering if there were a way to write it so that it returns key-value pairs instead of an array or hash of hashes when there are only two fields in the file.

My code so far...

#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;
use Text::CSV_XS qw(csv);

sub make_list {
  my (%opt) = @_;

  my $headers = $opt{'headings'};
  my $filters;
  for my $header (@$headers) {
    if ($header =~ s/\+$//) {
      $filters->{$header} = sub { $_ = [split(/;\s*/, $_)]; $_ };
    }
  }

  my $list = csv (
    in               => $opt{file},
    headers          => $headers,
    key              => $opt{key} ? $headers->[0] : undef,
    filter           => $filters,
    sep_char         => '|',
    quote_char       => undef,
    empty_is_undef   => 1,
    allow_whitespace => 1,
  ) || die Text::CSV_XS->error_diag;
  
  return $list;
}

my $file = 'Artifacts.txt';
my $headers = ['item', 'seen in+'];

print Dumper make_list( file => $file, headings => $headers);
[download]

The file...

Crystal Skull|Stargate SG-1, Crystal Skull; The Librarian: Return to K
+ing Soloman's Mines; Indiana Jones and the Kingdom of the Crystal Sku
+ll
Spear of Destiny|The Librarian: Quest for the Spear; Hellboy; Constant
+ine
Book or Key of Soloman|The Librarian: Return to King Soloman's Mines; 
+Season of the Witch
Necronomicon|H.P. Lovecraft
Ark of the Covenant|Raides of the Lost Ark; The Librarian: Quest for t
+he Spear
TARDIS|Doctor Who; The Sarah Jane Adventures; The Librarians, And the 
+Final Curtain
Doc Brown's Delorean|Back to the Future; Back to the Future Part II; B
+ack to the Future Part III; The Librarians, And the Final Curtain
Pandora's Box|Warehouse 13; The Librarian: Quest for the Spear
Excalibur|Excalibur; The Last Legion; The Librarian: Quest for the Spe
+ar
H.G. Wells' Time Machine|The Librarians; Warehouse 13
Holy Grail|Indiana Jones and the Last Crusade; The Librarian: Quest fo
+r the Spear
[download]

So what I want from this file is the first field to be the keys and the second field to be an array ref. Here is an example of what I want.

$VAR1 = {
          'Ark of the Covenant' => [
                                     'Raides of the Lost Ark',
                                     'The Librarian: Quest for the Spe
+ar'
                                   ],
          'Necronomicon' => [
                              'H.P. Lovecraft'
                            ],
          'Holy Grail' => [
                            'Indiana Jones and the Last Crusade',
                            'The Librarian: Quest for the Spear'
                          ],
          'H.G. Wells\' Time Machine' => [
                                           'The Librarians',
                                           'Warehouse 13'
                                         ],
          'Spear of Destiny' => [
                                  'The Librarian: Quest for the Spear'
+,
                                  'Hellboy',
                                  'Constantine'
                                ],
          'Book or Key of Soloman' => [
                                        'The Librarian: Return to King
+ Soloman\'s Mines',
                                        'Season of the Witch'
                                      ],
          'Crystal Skull' => [
                               'Stargate SG-1, Crystal Skull',
                               'The Librarian: Return to King Soloman\
+'s Mines',
                               'Indiana Jones and the Kingdom of the C
+rystal Skull'
                             ],
          'Doc Brown\'s Delorean' => [
                                       'Back to the Future',
                                       'Back to the Future Part II',
                                       'Back to the Future Part III',
                                       'The Librarians, And the Final 
+Curtain'
                                     ],
          'TARDIS' => [
                        'Doctor Who',
                        'The Sarah Jane Adventures',
                        'The Librarians, And the Final Curtain'
                      ],
          'Pandora\'s Box' => [
                                'Warehouse 13',
                                'The Librarian: Quest for the Spear'
                              ],
          'Excalibur' => [
                           'Excalibur',
                           'The Last Legion',
                           'The Librarian: Quest for the Spear'
                         ]
        };
[download]

Is it possible with Text::CSV_XS?

No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!

Lady Aleena

Comment on Can Text::CSV_XS return key-value pairs? Select or Download Code

Replies are listed 'Best First'.
Re: Can Text::CSV_XS return key-value pairs? by Tux (Canon) on Jun 11, 2017 at 11:48 UTC
Something like this? `use 5.18.2; use warnings; use Data::Peek; use Text::CSV_XS qw(csv); my $file = "pm-1192516.csv"; sub make_list_2 { my %opt = @_; my $headers = $opt{headings}; my %foo; my $list = csv ( in => $opt{file}, headers => $headers, sep_char => "\|", quote_char => undef, empty_is_undef => 1, allow_whitespace => 1, auto_diag => 1, on_in => sub { push @{$foo{$_{item}}}, split m/;\s*/ => $_{"seen in+"} }, ); return \%foo; } DDumper make_list_2 (file => $file, headings => [ "item", "seen in+" ] +);` [download] -> { 'Ark of the Covenant' => [ 'Raides of the Lost Ark', 'The Librarian: Quest for the Spear' ], 'Book or Key of Soloman' => [ 'The Librarian: Return to King Soloman\'s Mines', 'Season of the Witch' ], 'Crystal Skull' => [ 'Stargate SG-1, Crystal Skull', 'The Librarian: Return to King Soloman\'s Mines', 'Indiana Jones and the Kingdom of the Crystal Skull' ], 'Doc Brown\'s Delorean' => [ 'Back to the Future', 'Back to the Future Part II', 'Back to the Future Part III', 'The Librarians, And the Final Curtain' ], Excalibur => [ 'Excalibur', 'The Last Legion', 'The Librarian: Quest for the Spear' ], 'H.G. Wells\' Time Machine' => [ 'The Librarians', 'Warehouse 13' ], 'Holy Grail' => [ 'Indiana Jones and the Last Crusade', 'The Librarian: Quest for the Spear' ], Necronomicon => [ 'H.P. Lovecraft' ], 'Pandora\'s Box' => [ 'Warehouse 13', 'The Librarian: Quest for the Spear' ], 'Spear of Destiny' => [ 'The Librarian: Quest for the Spear', 'Hellboy', 'Constantine' ], TARDIS => [ 'Doctor Who', 'The Sarah Jane Adventures', 'The Librarians, And the Final Curtain' ] } [download] Enjoy, Have FUN! H.Merijn	[reply] [d/l] [select]
Re^2: Can Text::CSV_XS return key-value pairs? by Lady_Aleena (Priest) on Jun 11, 2017 at 23:32 UTC
Tux, can I add that to what I already have? Also, the value will not always be split. So can the value be first put through the filters then go to the `on_in`? And if this needs to use an additional hash, then how do I get everything to use the outside hash, the one you called `%foo`? sub make_list { my (%opt) = @_; my $file = $opt{file} && ref($opt{file}) eq 'ARRAY' ? data_file(@{$o +pt{file}}) : $opt{file}; my $headers = $opt{'headings'} ? $opt{'headings'} : ['heading']; my $filters; for my $header (@$headers) { if ($header =~ s/\+$//) { $filters->{$header} = sub { $_ = [split(/;\s/, $_)]; $_ }; } } my $list = csv ( in => $file, headers => $headers, key => $opt{key} ? $headers->[0] : undef, filter => $filters, sep_char => '\|', quote_char => undef, empty_is_undef => 1, allow_whitespace => 1, auto_diag => 1, ); return $list; } [download] I did not want to include the code I am trying to replace with an all in one, but it looks like I have to. Read more... (2 kB) No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!* Lady Aleena	[reply] [d/l] [select]
Re^3: Can Text::CSV_XS return key-value pairs? by Tux (Canon) on Jun 13, 2017 at 15:06 UTC
Like this then? (only requirement is to have "item" a required field use 5.18.2; use warnings; use Data::Peek; use Text::CSV_XS qw(csv); my $file = "pm-1192516.csv"; sub make_list { my %opt = @_; my %plus; my @head = map { s/\+$// && $plus{$_}++; $_ } @{$opt{headings}}; my %foo; my $list = csv ( in => $opt{file}, headers => \@head, sep_char => "\|", quote_char => undef, empty_is_undef => 1, allow_whitespace => 1, auto_diag => 1, on_in => sub { foreach my $f (@head) { $f eq "item" and next; push @{$foo{$_{item}}{$f}}, $plus{$f} ? split m/;\s*/ +=> $_{$f} : $_{$f}; } }, ); return \%foo; } DDumper make_list (file => $file, headings => [ "item", "seen in+" ]); [download] -> { 'Ark of the Covenant' => { 'seen in' => [ 'Raides of the Lost Ark', 'The Librarian: Quest for the Spear' ] }, 'Book or Key of Soloman' => { 'seen in' => [ 'The Librarian: Return to King Soloman\'s Mines', 'Season of the Witch' ] }, 'Crystal Skull' => { 'seen in' => [ 'Stargate SG-1, Crystal Skull', 'The Librarian: Return to King Soloman\'s Mines', 'Indiana Jones and the Kingdom of the Crystal Skull' ] }, 'Doc Brown\'s Delorean' => { 'seen in' => [ 'Back to the Future', 'Back to the Future Part II', 'Back to the Future Part III', 'The Librarians, And the Final Curtain' ] }, Excalibur => { 'seen in' => [ 'Excalibur', 'The Last Legion', 'The Librarian: Quest for the Spear' ] }, 'H.G. Wells\' Time Machine' => { 'seen in' => [ 'The Librarians', 'Warehouse 13' ] }, 'Holy Grail' => { 'seen in' => [ 'Indiana Jones and the Last Crusade', 'The Librarian: Quest for the Spear' ] }, Necronomicon => { 'seen in' => [ 'H.P. Lovecraft' ] }, 'Pandora\'s Box' => { 'seen in' => [ 'Warehouse 13', 'The Librarian: Quest for the Spear' ] }, 'Spear of Destiny' => { 'seen in' => [ 'The Librarian: Quest for the Spear', 'Hellboy', 'Constantine' ] }, TARDIS => { 'seen in' => [ 'Doctor Who', 'The Sarah Jane Adventures', 'The Librarians, And the Final Curtain' ] } } [download] Enjoy, Have FUN! H.Merijn	[reply] [d/l] [select]
Re^4: Can Text::CSV_XS return key-value pairs? by Lady_Aleena (Priest) on Jun 15, 2017 at 00:18 UTC
Re: Can Text::CSV_XS return key-value pairs? by choroba (Cardinal) on Jun 11, 2017 at 12:40 UTC
You can use Text::CSV_XS for handling both the steps, splitting on "\|" and splitting on ";". I wasn't able to achieve this using the `csv` method, so I had to go more procedural: #!/usr/bin/perl use warnings; use strict; use Text::CSV_XS qw(csv); use Data::Dumper; my $file = shift; my $csv = 'Text::CSV_XS'->new({ sep_char => '\|', quote_char => undef, empty_is_undef => 1, allow_whitespace => 1, }) or die 'Text::CSV_XS'->error_diag; open my $CSV, '<', $file or die $!; my %structure; my $inner = 'Text::CSV_XS'->new({ sep_char => ';', quote_char => undef, allow_whitespace => 1, }) or die 'Text::CSV_XS'->error_diag; while (my $row = $csv->getline($CSV)) { $inner->parse($row->[1]); $structure{ $row->[0] } = [ $inner->fields ]; } print Dumper(\%structure); [download] Update: Fixed along the suggestions by Tux below. ($q=q:Sq=~/;[c](.)(.)/;chr(-\|\|-\|5+lengthSq)`"S\|oS2"`map{chr \|+ord }map{substrSq`S_+\|`\|}3E\|-\|`7**2-3:)=~y+S\|`+$1,++print+eval$q,q,a, [download]	[reply] [d/l] [select]
Re^2: Can Text::CSV_XS return key-value pairs? by Tux (Canon) on Jun 11, 2017 at 12:45 UTC
That is close, but not what the OP requested. The second `Text::CSV_XS->new` also needs an `allow_whitespace => 1`. I'd also put the constructor outside of the second loop. No need to create it on every iteration. And yes, it is safer than using `split`, but I did not want to be pedantic Enjoy, Have FUN! H.Merijn	[reply] [d/l] [select]
Re^2: Can Text::CSV_XS return key-value pairs? by Lady_Aleena (Priest) on Jun 13, 2017 at 14:05 UTC
choroba, could you please include where you got the headings/headers? You used the method `new()` instead of the function `csv()`. Since `new()` doesn't have a header option, I don't know where the headers are invoked if they are not the first line of the file being parsed. As for the secondary `new()` splitting (or whatever does it) at `;`, that would be triggered whenever a header has a `+` at the end of it (and the `+` being removing from the header after parsing). So, for example, my movies.txt has the following headers... `'headers' => ['title','start year','end year',qw(media format+ Wikiped +ia allmovie IMDb TV.com Flixster genre+ source company)],` [download] `format` and `genre` would be parsed at the `;` while all other fields are strings. So for a file where I only want a key-value pair for each line, it looks like I will have to put in 2 headers, and if the second one has a `+` it is to be parsed with the value being split (or whatever) on the `;`. I gravitated directly to `csv()` because it looked easier to use than `new()` since I could not figure out what did what. Like what makes an array of hashes and what makes a hash of hashes (what I use mostly). The whatchamacallits (like getline, parse, etc) are not grouped together in such a way as to make it obvious to me. So, would you please expand the code so I can see everything you are doing? I am a bit lost. *No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!* Lady Aleena	[reply] [d/l] [select]
Re^3: Can Text::CSV_XS return key-value pairs? by choroba (Cardinal) on Jun 13, 2017 at 23:19 UTC
I used `new` instead of `csv` because it gives you more control over what you can do. Here, no headers are defined at all, which means the row is split into an array reference `$row`. The second part of the row is processed by the very same module, as you recognised, to split the string on semicolons—if I understand your comment correctly, you'd need to do that for each column with the + in the general case. The output structure is being built on line 31, where the first part of the line is used as the key, and the result of the secondary processing is used as the value. ($q=q:Sq=~/;[c](.)(.)/;chr(-\|\|-\|5+lengthSq)`"S\|oS2"`map{chr \|+ord }map{substrSq`S_+\|`\|}3E\|-\|`7**2-3:)=~y+S\|`+$1,++print+eval$q,q,a, [download]	[reply] [d/l] [select]

Back to Seekers of Perl Wisdom