PerlMonks  

Convert CSV to XML

by andreas1234567 (Vicar)
on May 20, 2016 at 06:58 UTC

andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have been tasked with converting CSV to XML and have turned to XML::CSV, which has not been updated in 15 years. It has a number of issues, among them a lack of support for accented characters and poor error handling. What would you recommend for converting CSV into a custom, possibly rather complex XML format?
use warnings;
use strict;
use Test::More;
use File::Spec::Functions;
use XML::XPath;
use lib catdir qw ( lib );

plan tests => 8;

use_ok q{XML::CSV};

my $base    = q{input};
my $csvfile = catdir q(csv), qq{$base.txt};
my $xmlfile = catdir q(xml), qq{$base.xml};

my $default_obj_xs = Text::CSV_XS->new({sep_char => ",", quote_char => '"', encoding => "utf8"});
my $csv_obj = XML::CSV->new( { csv_xs => $default_obj_xs, error_out => 1 });

# convert csv to xml, print to file
my @arr_of_headings = map { "Col$_" } (1..9);
$csv_obj->{column_headings} = \@arr_of_headings;
$csv_obj->parse_doc($csvfile);
$csv_obj->declare_xml({version => '1.0', encoding => 'UTF-8', standalone => 'yes'});
$csv_obj->print_xml($xmlfile, {format => " ", file_tag => "Import", record_tag => "Row"});

# test xml output
my $xp = XML::XPath->new(filename => $xmlfile);
my @nodes = $xp->findnodes(q{/Import/record});
cmp_ok(scalar(@nodes), q[==], 3, q[We have 3 nodes]);

@nodes = $xp->findnodes(q{/Import/record[1]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 1]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{AB12345});

@nodes = $xp->findnodes(q{/Import/record[3]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 3]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{EF12345});

__END__
C:\IT\Temp\>prove t\02_poc.t
t\02_poc.t .. 1/8
#   Failed test at t\02_poc.t line 40.
#          got: ''
#     expected: 'EF12345'
# Looks like you failed 1 test of 8.
t\02_poc.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/8 subtests

Test Summary Report
-------------------
t\02_poc.t (Wstat: 256 Tests: 8 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
Files=1, Tests=8,  1 wallclock secs ( 0.06 usr +  0.00 sys =  0.06 CPU)
Result: FAIL

C:\IT\Temp\>type "csv\input.txt"
1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC,Bridge 55,19.8,04.04.2016 06:55:41
2,CD12345,01.04.2016 16:39:15,-76775.70,Toll road INC,River Kwai,8.1,04.04.2016 06:27:36
3,EF12345,01.04.2016 16:39:15,-76775.70,Toll road INC,Champs-Élysées,8.1,04.04.2016 06:27:36

C:\IT\Temp\>type "xml\input.xml"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Import>
 <record>
  <Col1>1</Col1>
  <Col2>AB12345</Col2>
  <Col3>03.04.2016 15:43:14</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>Bridge 55</Col6>
  <Col7>19.8</Col7>
  <Col8>04.04.2016 06:55:41</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1>2</Col1>
  <Col2>CD12345</Col2>
  <Col3>01.04.2016 16:39:15</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>River Kwai</Col6>
  <Col7>8.1</Col7>
  <Col8>04.04.2016 06:27:36</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1></Col1>
  <Col2></Col2>
  <Col3></Col3>
  <Col4></Col4>
  <Col5></Col5>
  <Col6></Col6>
  <Col7></Col7>
  <Col8></Col8>
  <Col9></Col9>
 </record>
</Import>

C:\IT\Temp\>
--
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Replies are listed 'Best First'.
Re: Convert CSV to XML
by Corion (Pope) on May 20, 2016 at 07:04 UTC

    If you have the XSD of the target XML, I recommend XML::Compile (I originally wrote XML::Writer here by mistake). You have to mush your input data into the data structure corresponding to the output XML data, but that's all you have to do. XML::Compile will then create the appropriate XML from the XSD and your data structure.
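A minimal sketch of what this could look like with XML::Compile::Schema and XML::LibXML. The inline XSD is hypothetical, written only for illustration and trimmed to three of the columns from the question; a real project would load the actual schema file instead. The sketch also assumes a schema without a target namespace.

```perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::Compile::Util qw(pack_type);
use XML::LibXML;

# Hypothetical schema for illustration only: an <Import> root holding any
# number of <record> elements, each with three string columns.
my $xsd = <<'END_XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Import">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="record" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Col1" type="xs:string"/>
              <xs:element name="Col2" type="xs:string"/>
              <xs:element name="Col6" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
END_XSD

# Data shaped like the schema: repeated elements become array references.
# \x{C9} and \x{E9} are the accented characters from the sample data.
my @rows = (
    { Col1 => 1, Col2 => 'AB12345', Col6 => 'Bridge 55' },
    { Col1 => 3, Col2 => 'EF12345', Col6 => "Champs-\x{C9}lys\x{E9}es" },
);

my $schema = XML::Compile::Schema->new($xsd);

# Compile a writer for the root element; the schema has no namespace.
my $writer = $schema->compile(WRITER => pack_type('', 'Import'));

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $writer->($doc, { record => \@rows });
$doc->setDocumentElement($root);

# toString returns bytes in the document encoding (UTF-8 here).
my $xml = $doc->toString(1);
print $xml;
```

The writer checks the data against the schema as it goes, so a hash key that the schema does not know about is reported as an error rather than silently written.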

    I've also had good experience creating a template using HTML::Template(::Compiled) and filling in the data from a flat data structure, but that really depends on the output XML format.

      Dear Corion,

      Thanks for your helpful reply. However, I cannot find any reference to XSDs in the documentation for XML::Writer. Do you have any pointers? Or did you perhaps mean XML::Compile, as described in this blog post? Thanks again.

      Best regards
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]
Re: Convert CSV to XML
by tangent (Vicar) on May 20, 2016 at 12:44 UTC
    Using a combination of Text::CSV and Template Toolkit would be a flexible solution. If you have a complex format, a templating system gives you the fine control you need.

    Use Text::CSV to convert your file to a structure like:
    @rows = (
        {
            Col1 => 1,
            Col2 => 'AB12345',
            Col3 => '03.04.2016 15:43:14',
            Col4 => '-76775.70',
            Col5 => 'Toll road INC',
        },
        {
            Col1 => '2',
            Col2 => 'CD12345',
            Col3 => '01.04.2016 16:39:15',
            Col4 => '-76775.70',
            Col5 => 'Toll road INC',
        },
    );
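One way to build that structure, sketched with Text::CSV's `column_names` and `getline_hr`. The helper name `read_rows` and the in-memory filehandle are assumptions made to keep the example self-contained; in practice you would open csv/input.txt with the same `:encoding(UTF-8)` layer.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Text::CSV;

# The three sample lines from the question, encoded to UTF-8 bytes so the
# scalar can stand in for csv/input.txt. \x{C9} and \x{E9} are the accented
# characters in the street name on line 3.
my $bytes = encode('UTF-8', join '',
    "1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC,Bridge 55,19.8,04.04.2016 06:55:41\n",
    "2,CD12345,01.04.2016 16:39:15,-76775.70,Toll road INC,River Kwai,8.1,04.04.2016 06:27:36\n",
    "3,EF12345,01.04.2016 16:39:15,-76775.70,Toll road INC,Champs-\x{C9}lys\x{E9}es,8.1,04.04.2016 06:27:36\n",
);

# Hypothetical helper: read every line from $fh into a hash keyed by @names.
sub read_rows {
    my ($fh, @names) = @_;

    # binary => 1 lets fields contain non-ASCII characters;
    # auto_diag => 1 reports parse errors instead of failing silently.
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
    $csv->column_names(@names);

    my @rows;
    while (my $row = $csv->getline_hr($fh)) {
        push @rows, $row;
    }
    return @rows;
}

# Decode UTF-8 while reading, so the fields hold characters, not bytes.
open my $fh, '<:encoding(UTF-8)', \$bytes
    or die "Cannot open in-memory file: $!";
my @rows = read_rows($fh, map { "Col$_" } 1 .. 8);
close $fh;

printf "%d rows, row 3 Col2 = %s\n", scalar @rows, $rows[2]{Col2};
```

The sample file has eight fields per line, so the sketch names eight columns rather than the nine used in the question.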
    Then pass that structure to Template Toolkit
    my $params = { rows => \@rows };
    my $template = Template->new;
    $template->process('template.xml', $params, 'output.xml');
    Your template file 'template.xml' could look something like:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Import>
    [% FOREACH row IN rows %]
      <record>
        <Col1>[% row.Col1 %]</Col1>
        <Col2>[% row.Col2 %]</Col2>
        [% IF row.Col3 %]
        <Col3>[% row.Col3 %]</Col3>
        [% END %]
        <Col4>[% row.Col4 %]</Col4>
        <Col5>[% row.Col5 %]</Col5>
      </record>
    [% END %]
    </Import>
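A self-contained variant of the same idea, as a sketch: the template lives in a scalar instead of 'template.xml', the result is collected in a string instead of 'output.xml', and each value is passed through Template Toolkit's `xml` filter so that characters such as `&` and `<` in the data cannot break the output. The second row is made-up data, added only to show the escaping.

```perl
use strict;
use warnings;
use Template;

my @rows = (
    { Col1 => 1, Col2 => 'AB12345', Col5 => 'Toll road INC' },
    { Col1 => 2, Col2 => 'CD12345', Col5 => 'Smith & Sons' },   # hypothetical row
);

# Inline template; "-%]" chomps the newline after a directive so the
# FOREACH and END lines do not leave blank lines in the output.
my $source = <<'END_TT';
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Import>
[% FOREACH row IN rows -%]
 <record>
  <Col1>[% row.Col1 | xml %]</Col1>
  <Col2>[% row.Col2 | xml %]</Col2>
  <Col5>[% row.Col5 | xml %]</Col5>
 </record>
[% END -%]
</Import>
END_TT

my $template = Template->new or die Template->error;

# Passing scalar references makes process() read the template from
# $source and append the result to $output.
my $output = '';
$template->process(\$source, { rows => \@rows }, \$output)
    or die $template->error;

print $output;
```

When writing to a file instead, `process` accepts `binmode => ':utf8'` after the output argument, which matters once the data contains accented characters.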
Re: Convert CSV to XML
by Anonymous Monk on May 20, 2016 at 07:04 UTC
