PerlMonks  

how to convert multiple xml files with different structure into CSV files

by pavithravenky (Initiate)
on Jul 17, 2014 at 05:23 UTC ( [id://1093982]=perlquestion )

pavithravenky has asked for the wisdom of the Perl Monks concerning the following question:

My requirement is to convert XML files into CSV files using Perl. I currently receive multiple XML files with different schema structures. How can I read XML files whose structure varies? For example, I can have two files as shown below.

XML File 1:
<root>
  <Report>
    <EMP>
      <ID>1</ID>
      <Name>XXX</Name>
      <Company>YYY</Company>
      <Join_Date>2010-10-16</Join_Date>
    </EMP>
    <DEP>
      <DEP>
        <DEP_ID>2</DEP_ID>
        <DEP_Name>XX</DEP_Name>
        <Jobtitle></Jobtitle>
        <Jobdescr></Jobdescr>
      </DEP>
      <DEP>
        <DEP_ID>3</DEP_ID>
        <DEP_Name>YY</DEP_Name>
        <Jobtitle></Jobtitle>
        <Jobdescr></Jobdescr>
      </DEP>
    </DEP>
  </Report>
</root>

XML File 2:
<root>
  <Order>
    <OrderId>2</OrderId>
    <TxnNumber>1</TxnNumber>
    <OrderDesc>General Information</OrderDesc>
    <Location>ABC</Location>
  </Order>
</root>

How can I read such XML files dynamically and convert them to CSV files without any XSD being defined?

Replies are listed 'Best First'.
Re: how to convert multiple xml files with different structure into CSV files
by Corion (Patriarch) on Jul 17, 2014 at 06:35 UTC

    For each different XML schema, you will have to define what constitutes a "row", and then read the XML file in, and for each "row" element, output all tag values that you have seen so far.
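To make the idea above concrete, here is a minimal sketch (not code from this thread). It assumes the CPAN module XML::LibXML is installed; the `%config` keys `row_xpath` and `shared_xpath` and the helpers `csv_line`, `child_fields` and `xml_to_csv` are hypothetical names chosen for illustration. The sample data is XML File 1 from the question with its tags balanced. The header is taken from the first row, which assumes every row has the same child elements.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: installed from CPAN, not a core module

# Sample modelled on XML File 1 from the question, with the tags balanced.
my $xml = <<'XML';
<root>
  <Report>
    <EMP>
      <ID>1</ID>
      <Name>XXX</Name>
      <Company>YYY</Company>
      <Join_Date>2010-10-16</Join_Date>
    </EMP>
    <DEP>
      <DEP>
        <DEP_ID>2</DEP_ID>
        <DEP_Name>XX</DEP_Name>
        <Jobtitle></Jobtitle>
        <Jobdescr></Jobdescr>
      </DEP>
      <DEP>
        <DEP_ID>3</DEP_ID>
        <DEP_Name>YY</DEP_Name>
        <Jobtitle></Jobtitle>
        <Jobdescr></Jobdescr>
      </DEP>
    </DEP>
  </Report>
</root>
XML

# Hypothetical per-schema configuration, written by a human:
# which element is a "row", and where (relative to each row)
# the values shared by several rows can be found.
my %config = (
    row_xpath    => '/root/Report/DEP/DEP',
    shared_xpath => '../../EMP',
);

# Join fields into one CSV line, quoting fields that need it.
sub csv_line {
    my @fields = map {
        my $f = defined $_ ? $_ : '';
        $f =~ s/"/""/g;
        $f =~ /[",\r\n]/ ? qq{"$f"} : $f;
    } @_;
    return join ',', @fields;
}

# Return (names, values) for the direct child elements of a node.
sub child_fields {
    my ($node) = @_;
    my (@names, @values);
    for my $child ($node->findnodes('./*')) {
        push @names,  $child->nodeName;
        push @values, $child->textContent;
    }
    return (\@names, \@values);
}

# Produce a header line plus one CSV line per "row" element.
sub xml_to_csv {
    my ($xml_string, $cfg) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);

    my @lines;
    for my $row ($doc->findnodes($cfg->{row_xpath})) {
        my ($shared) = $row->findnodes($cfg->{shared_xpath});
        my ($s_names, $s_values) =
            defined $shared ? child_fields($shared) : ([], []);
        my ($r_names, $r_values) = child_fields($row);
        push @lines, csv_line(@$s_names, @$r_names) unless @lines;    # header once
        push @lines, csv_line(@$s_values, @$r_values);
    }
    return @lines;
}

print "$_\n" for xml_to_csv($xml, \%config);
# Prints:
# ID,Name,Company,Join_Date,DEP_ID,DEP_Name,Jobtitle,Jobdescr
# 1,XXX,YYY,2010-10-16,2,XX,,
# 1,XXX,YYY,2010-10-16,3,YY,,
```

Only the `%config` hash changes from one schema to the next; for XML File 2 the row would be `/root/Order` with no shared element. That hash is exactly the human input that has to be supplied per schema.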

      The XML files can have different structures, therefore the XML schema cannot be defined in advance. Is there any option to write generic code without explicitly defining the rows for each XML file?

        That is not possible in the general case.

        I've done something like counting the tag that occurs the most and then using the topmost tag of such tags as the "row" tag, but if you haven't spent enough time on the problem and don't know the limitations of this approach, it just won't work for you.

        My opinion is that without human input, it is a futile approach to try to convert XML to CSV and to expect good results.

        For learning about how to convert XML to CSV, consider opening the XML files in Excel 2010 and watching how Excel converts them to a tabular structure. If you find that Excel converts all your files to a suitable (and stable) tabular structure, you will find that my approach of using the most frequent tag as the "row" tag will likely work too.
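One possible reading of the "most frequent tag" heuristic is sketched below; it is an interpretation, not the code referred to in the reply. It counts element paths rather than bare tag names (an assumption, made so that the nested `DEP` inside `DEP` in XML File 1 is not confused with its wrapper), keeps the most frequent paths, and prefers the shallowest of them. It assumes the CPAN module XML::LibXML is installed, and `guess_row_path` is a hypothetical name.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: installed from CPAN, not a core module

# Guess the "row" element: count every element path, then pick the path
# with the highest count, preferring the shallowest one on a tie.
sub guess_row_path {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);

    my (%count, %depth);
    my @todo = ([ $doc->documentElement, '', 1 ]);    # [node, parent path, depth]
    while (my $item = shift @todo) {
        my ($node, $parent, $depth) = @$item;
        my $path = "$parent/" . $node->nodeName;
        $count{$path}++;
        $depth{$path} = $depth;
        push @todo, [ $_, $path, $depth + 1 ] for $node->findnodes('./*');
    }

    my ($best) = sort {
        $count{$b} <=> $count{$a}          # most frequent first
            || $depth{$a} <=> $depth{$b}   # then the shallowest
            || $a cmp $b                   # then by name, for a stable result
    } keys %count;
    return $best;
}

# Trimmed-down sample in the shape of XML File 1 from the question.
my $xml = <<'XML';
<root>
  <Report>
    <EMP><ID>1</ID><Name>XXX</Name></EMP>
    <DEP>
      <DEP><DEP_ID>2</DEP_ID><DEP_Name>XX</DEP_Name></DEP>
      <DEP><DEP_ID>3</DEP_ID><DEP_Name>YY</DEP_Name></DEP>
    </DEP>
  </Report>
</root>
XML

print guess_row_path($xml), "\n";    # prints /root/Report/DEP/DEP
```

The limitation shows up as soon as a file holds a single record: for a file shaped like XML File 2, every path occurs once, so this sketch returns `/root` rather than `/root/Order`. That is one reason the guess still needs a human to check it.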

Re: how to convert multiple xml files with different structure into CSV files
by Laurent_R (Canon) on Jul 17, 2014 at 06:51 UTC
    Please use code tags (<code> and </code>) to display your XML content properly.
Re: how to convert multiple xml files with different structure into CSV files
by RichardK (Parson) on Jul 17, 2014 at 09:25 UTC

    So what do you expect the CSV to look like for the example XML you've shown?

    If you can show us the output format you want, then hopefully someone will be able to suggest some approaches you could try.

      The Output of the XML should be as specified below for XML File 1:
      ID  Name  Company  Join_Date  DEP_ID  DEP_Name  Jobtitle  Jobdescr
      1   xxx   abc      1/1/2014   1       zzz       xx        desc1
      1   xxx   abc      1/1/2014   2       yyy       yy        desc2

        So where did all that extra junk come from? There are no elements or columns called '1 xxx abc' etc.

        You are not going to get very far in solving this problem if you cannot clearly and logically state your requirements.

        Also, use code tags!
