http://www.perlmonks.org?node_id=1163588

andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have been tasked with converting CSV to XML and have turned to XML::CSV, which has not been updated in 15 years. It has a number of issues: lack of support for accented characters and poor error handling are two of them. What would you recommend for converting CSV into a custom, possibly rather complex XML format?
use warnings;
use strict;
use Test::More;
use File::Spec::Functions;
use XML::XPath;
use lib catdir qw ( lib );

plan tests => 8;
use_ok q{XML::CSV};

my $base = q{input};
my $csvfile = catdir q(csv), qq{$base.txt};
my $xmlfile = catdir q(xml), qq{$base.xml};

my $default_obj_xs = Text::CSV_XS->new({sep_char => ",", quote_char => '"', encoding => "utf8"});
my $csv_obj = XML::CSV->new( { csv_xs => $default_obj_xs, error_out => 1 });

# convert csv to xml, print to file
my @arr_of_headings = map { "Col$_" } (1..9);
$csv_obj->{column_headings} = \@arr_of_headings;
$csv_obj->parse_doc($csvfile);
$csv_obj->declare_xml({version => '1.0', encoding => 'UTF-8', standalone => 'yes'});
$csv_obj->print_xml($xmlfile, {format => " ", file_tag => "Import", record_tag => "Row"});

# test xml output
my $xp = XML::XPath->new(filename => $xmlfile);
my @nodes = $xp->findnodes(q{/Import/record});
cmp_ok(scalar(@nodes), q[==], 3, q[We have 3 nodes]);
@nodes = $xp->findnodes(q{/Import/record[1]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 1]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{AB12345});
@nodes = $xp->findnodes(q{/Import/record[3]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 3]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{EF12345});
__END__

C:\IT\Temp\>prove t\02_poc.t
t\02_poc.t .. 1/8
#   Failed test at t\02_poc.t line 40.
#          got: ''
#     expected: 'EF12345'
# Looks like you failed 1 test of 8.
t\02_poc.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/8 subtests

Test Summary Report
-------------------
t\02_poc.t (Wstat: 256 Tests: 8 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
Files=1, Tests=8,  1 wallclock secs ( 0.06 usr +  0.00 sys =  0.06 CPU)
Result: FAIL
C:\IT\Temp\>

C:\IT\Temp\>type "csv\input.txt"
1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC,Bridge 55,19.8,04.04.2016 06:55:41
2,CD12345,01.04.2016 16:39:15,-76775.70,Toll road INC,River Kwai,8.1,04.04.2016 06:27:36
3,EF12345,01.04.2016 16:39:15,-76775.70,Toll road INC,Champs-Élysées,8.1,04.04.2016 06:27:36

C:\IT\Temp\>type "xml\input.xml"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Import>
 <record>
  <Col1>1</Col1>
  <Col2>AB12345</Col2>
  <Col3>03.04.2016 15:43:14</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>Bridge 55</Col6>
  <Col7>19.8</Col7>
  <Col8>04.04.2016 06:55:41</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1>2</Col1>
  <Col2>CD12345</Col2>
  <Col3>01.04.2016 16:39:15</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>River Kwai</Col6>
  <Col7>8.1</Col7>
  <Col8>04.04.2016 06:27:36</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1></Col1>
  <Col2></Col2>
  <Col3></Col3>
  <Col4></Col4>
  <Col5></Col5>
  <Col6></Col6>
  <Col7></Col7>
  <Col8></Col8>
  <Col9></Col9>
 </record>
</Import>
C:\IT\Temp\>
--
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Replies are listed 'Best First'.
Re: Convert CSV to XML
by Corion (Patriarch) on May 20, 2016 at 07:04 UTC

    If you have the XSD of the target XML, I recommend XML::Compile (I first wrote XML::Writer here). You have to mush your input data into the data structure corresponding to the output XML data, but that's all you have to do. XML::Compile will then create the appropriate XML from the XSD and your data structure.

    I've also had good experiences creating a template with HTML::Template(::Compiled) and filling in the data from a flat data structure, but that really depends on the output XML format.
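
A minimal sketch of the XSD-driven approach with XML::Compile, for illustration only. The namespace, the inline schema, and the two-column record layout are assumptions made up for this example; a real target format would supply its own XSD, and the Perl data structure has to follow whatever that schema defines.

```perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::Compile::Util qw(pack_type);
use XML::LibXML;

# Hypothetical namespace and schema, invented for this sketch only.
my $ns  = 'http://example.com/import';
my $xsd = <<'XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/import"
           xmlns="http://example.com/import"
           elementFormDefault="qualified">
  <xs:element name="Import">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="record" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Col1" type="xs:string"/>
              <xs:element name="Col2" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Compile a writer for the root element declared in the schema.
my $schema = XML::Compile::Schema->new($xsd);
my $writer = $schema->compile(WRITER => pack_type($ns, 'Import'));

# The data mirrors the schema: a repeating element becomes an array ref.
my $data = {
    record => [
        { Col1 => '1', Col2 => 'AB12345' },
        { Col1 => '2', Col2 => 'CD12345' },
    ],
};

# The writer returns an XML::LibXML node that is attached to the document.
my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $writer->($doc, $data);
$doc->setDocumentElement($root);

my $xml = $doc->toString(1);
print $xml;
```

The rows parsed from the CSV file would take the place of the hard-coded `record` list. Data that does not fit the schema should be rejected by the writer instead of producing a silently empty record.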

      Dear Corion,

      Thanks for your helpful reply. However, I cannot find any reference to XSDs in the documentation for XML::Writer. Do you have any pointers? Or did you perhaps mean XML::Compile as described in this blog post? Thanks again.

      Best regards
Re: Convert CSV to XML
by tangent (Parson) on May 20, 2016 at 12:44 UTC
    Using a combination of Text::CSV and Template Toolkit would be a flexible solution. If you have a complex format, a templating system gives you the fine control you need.

    Use Text::CSV to convert your file to a structure like:
    @rows = (
        {
            Col1 => 1,
            Col2 => 'AB12345',
            Col3 => '03.04.2016 15:43:14',
            Col4 => '-76775.70',
            Col5 => 'Toll road INC',
        },
        {
            Col1 => '2',
            Col2 => 'CD12345',
            Col3 => '01.04.2016 16:39:15',
            Col4 => '-76775.70',
            Col5 => 'Toll road INC',
        },
    );
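    Then pass that structure to Template Toolkit:
    use Template;

    my $params = { rows => \@rows };
    my $template = Template->new;
    # process() returns false on failure; error() holds the reason
    $template->process('template.xml', $params, 'output.xml')
        or die $template->error;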
    Your template file 'template.xml' could look something like:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Import>
    [% FOREACH row IN rows %]
      <record>
        <Col1>[% row.Col1 %]</Col1>
        <Col2>[% row.Col2 %]</Col2>
        [% IF row.Col3 %]
        <Col3>[% row.Col3 %]</Col3>
        [% END %]
        <Col4>[% row.Col4 %]</Col4>
        <Col5>[% row.Col5 %]</Col5>
      </record>
    [% END %]
    </Import>
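
The Text::CSV step described above could look roughly like the sketch below. The helper name `read_rows`, the five-column layout, and the in-memory sample data are assumptions for illustration; the real input would be a file opened with the encoding it was actually saved in.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper (not from the thread): read every CSV record from an
# open filehandle and return a list of hashes keyed by the given column names.
sub read_rows {
    my ($fh, @names) = @_;

    # binary => 1 permits non-ASCII characters in fields;
    # auto_diag => 1 warns about parse errors instead of failing silently.
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 })
        or die "Cannot create Text::CSV object: " . Text::CSV->error_diag;

    $csv->column_names(@names);

    my @rows;
    while (my $row = $csv->getline_hr($fh)) {
        push @rows, $row;
    }
    return @rows;
}

# Two sample records as UTF-8 bytes; \xC3\x89 encodes É and \xC3\xA9 encodes é.
my $data = "1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC\n"
         . "3,EF12345,01.04.2016 16:39:15,-76775.70,Champs-\xC3\x89lys\xC3\xA9es\n";

# An in-memory filehandle stands in for the real input file here.
open my $fh, '<:encoding(UTF-8)', \$data or die "Cannot open input: $!";
my @rows = read_rows($fh, map { "Col$_" } 1 .. 5);
close $fh;

printf "%d rows, last Col2 = %s\n", scalar @rows, $rows[-1]{Col2};
```

The resulting `@rows` has the shape expected by the template. Since field values are inserted into XML as-is, it is probably worth piping them through Template Toolkit's `xml` filter (for example `[% row.Col5 | xml %]`) so that characters such as `&` and `<` are escaped.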
Re: Convert CSV to XML
by Anonymous Monk on May 20, 2016 at 07:04 UTC