Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options

Convert CSV to XML

by andreas1234567 (Vicar)
on May 20, 2016 at 06:58 UTC ( #1163588=perlquestion: print w/replies, xml ) Need Help??

andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am tasked to convert CSV to XML and have turned to the not-updated-in-15-years XML::CSV. It has a number of issues - lack on support for accented characters and poor error handling are some of them. What would you recommend for converting CSV into a custom, possibly rather complex XML format?
use warnings; use strict; use Test::More; use File::Spec::Functions; use XML::XPath; use lib catdir qw ( lib ); plan tests => 8; use_ok q{XML::CSV}; my $base = q{input}; my $csvfile = catdir q(csv), qq{$base.txt}; my $xmlfile = catdir q(xml), qq{$base.xml}; my $default_obj_xs = Text::CSV_XS->new({sep_char => ",", quote_char => + '"', encoding => "utf8"}); my $csv_obj = XML::CSV->new( { csv_xs => $default_obj_xs, error_out => + 1 }); # convert csv to xml, print to file my @arr_of_headings = map { "Col$_" } (1..9); $csv_obj->{column_headings} = \@arr_of_headings; $csv_obj->parse_doc($csvfile); $csv_obj->declare_xml({version => '1.0', encoding => 'UTF-8', standalo +ne => 'yes'}); $csv_obj->print_xml($xmlfile, {format => " ", file_tag => "Import", re +cord_tag => "Row"}); # test xml output my $xp = XML::XPath->new(filename => $xmlfile); my @nodes = $xp->findnodes(q{/Import/record}); cmp_ok(scalar(@nodes), q[==], 3, q[We have 3 nodes]); @nodes = $xp->findnodes(q{/Import/record[1]/Col2}); cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 1]); ok(exists($nodes[0])); cmp_ok($nodes[0]->string_value(), q{eq}, q{AB12345}); @nodes = $xp->findnodes(q{/Import/record[3]/Col2}); cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 3]); ok(exists($nodes[0])); cmp_ok($nodes[0]->string_value(), q{eq}, q{EF12345}); __END__ C:\IT\Temp\>prove t\02_poc.t t\02_poc.t .. 1/8 # Failed test at t\02_poc.t line 40. # got: '' # expected: 'EF12345' # Looks like you failed 1 test of 8. t\02_poc.t .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/8 subtests Test Summary Report ------------------- t\02_poc.t (Wstat: 256 Tests: 8 Failed: 1) Failed test: 8 Non-zero exit status: 1 Files=1, Tests=8, 1 wallclock secs ( 0.06 usr + 0.00 sys = 0.06 CPU +) Result: FAIL C:\IT\Temp\> C:\IT\Temp\>type "csv\input.txt" 1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC,Bridge 55,19.8,0 +4.04.2016 06:55:41 2,CD12345,01.04.2016 16:39:15,-76775.70,Toll road INC,River Kwai,8.1,0 +4.04.2016 06:27:36 3,EF12345,01.04.2016 16:39:15,-76775.70,Toll road INC,Champs-Élysées,8 +.1,04.04.2016 06:27:36 C:\IT\Temp\>type "xml\input.xml" <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Import> <record> <Col1>1</Col1> <Col2>AB12345</Col2> <Col3>03.04.2016 15:43:14</Col3> <Col4>-76775.70</Col4> <Col5>Toll road INC</Col5> <Col6>Bridge 55</Col6> <Col7>19.8</Col7> <Col8>04.04.2016 06:55:41</Col8> <Col9></Col9> </record> <record> <Col1>2</Col1> <Col2>CD12345</Col2> <Col3>01.04.2016 16:39:15</Col3> <Col4>-76775.70</Col4> <Col5>Toll road INC</Col5> <Col6>River Kwai</Col6> <Col7>8.1</Col7> <Col8>04.04.2016 06:27:36</Col8> <Col9></Col9> </record> <record> <Col1></Col1> <Col2></Col2> <Col3></Col3> <Col4></Col4> <Col5></Col5> <Col6></Col6> <Col7></Col7> <Col8></Col8> <Col9></Col9> </record> </Import> C:\IT\Temp\>
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Replies are listed 'Best First'.
Re: Convert CSV to XML
by Corion (Pope) on May 20, 2016 at 07:04 UTC

    If you have the XSD of the taget XML, I recommend XML::Writer XML::Compile. You have to mush your input data into the data structure corresponding to the output XML data, but that's all you have to do. XML::Writer will then create the appropriate XML from the XSD and your data structure.

    I've also had good experience in creating a template using HTML::Template(::Compiled) and filling in the data from a flat datastructure, but that really depends on the output XML format.

      Dear Corion,

      Thanks for your helpful reply. However, I cannot find any reference to XSD's in the documentation to XML::Writer. Do you have any pointers? Or did you perhaps mean XML::Compile as described in this blog post? Thanks again.

      Best regards
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
Re: Convert CSV to XML
by tangent (Vicar) on May 20, 2016 at 12:44 UTC
    Using a combination of Text::CSV and Template Toolkit would be a flexible solution. If you have a complex format a templating system gives you the very fine control you need.

    Use Text::CSV to convert your file to a structure like:
    @rows = ( { Col1 => 1, Col2 => 'AB12345', Col3 => '03.04.2016 15:43:14', Col4 => '-76775.70', Col5 => 'Toll road INC', }, { Col1 => '2', Col2 => 'CD12345', Col3 => '01.04.2016 16:39:15', Col4 => '-76775.70', Col5 => 'Toll road INC', }, );
    Then pass that structure to Template Toolkit
    my $params = { rows => \@rows }; my $template = Template->new; $template->process('template.xml', $params, 'output.xml');
    Your template file 'template.xml' could look something like:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Import> [% FOREACH row IN rows %] <record> <Col1>[% row.Col1 %]</Col1> <Col2>[% row.Col2 %]</Col2> [% IF row.Col3 %] <Col3>[% row.Col3 %]</Col3> [% END %] <Col4>[% row.Col4 %]</Col4> <Col5>[% row.Col5 %]</Col5> </record> [% END %] </Import>
Re: Convert CSV to XML
by Anonymous Monk on May 20, 2016 at 07:04 UTC

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1163588]
Approved by kevbot
Front-paged by kcott
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (7)
As of 2019-06-18 11:42 GMT
Find Nodes?
    Voting Booth?
    Is there a future for codeless software?

    Results (81 votes). Check out past polls.

    • (Sep 10, 2018 at 22:53 UTC) Welcome new users!