Re: xml join

by graff (Chancellor)
on Oct 02, 2013 at 22:35 UTC ( #1056688=note )


in reply to xml join

This seems like it's not a perl question (esp. since you didn't post any code).

But, by way of a non-perl answer, I have found that it can be very useful to concatenate a bunch of xml files that use a common schema/dtd, e.g. to get a global summary of contents.

All I need is some arbitrary tag to serve as the outer-most container for the concatenated set. Sometimes the files have this at the beginning:

<?xml version="1.0"?>
(and sometimes that includes an 'encoding' attribute as well); those need to be filtered out. So the process boils down to a simple, 3-step sequence of shell commands (assuming there's one directory containing all the xml files of interest, and a separate path to use for output):
echo '<arbitrary_tag>' > outpath/combined.xml
cat inpath/*.xml | fgrep -v '<?xml' >> outpath/combined.xml
echo '</arbitrary_tag>' >> outpath/combined.xml
That also assumes that the order of file names you get from a default sort will put the files in the desired sequence (if that matters at all). If you want them in a sequence that differs from a default sort on the file names, you'll need to create a separate text file that lists the xml file names in the desired order, then pipe that list to "xargs cat" (instead of doing "cat inpath/*.xml").
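For the custom-order case, here is a minimal sketch of the "xargs cat" variant, wrapped in a shell function. The function name combine_xml, its arguments, and the list file are made up for illustration; "grep -F" is just the current spelling of "fgrep". Two caveats: xargs splits on whitespace, so this assumes file names without spaces or quotes, and the filter drops any whole line containing '<?xml', so it assumes the declaration sits on a line by itself.

```shell
# Hypothetical helper: combine_xml LISTFILE OUTFILE [TAG]
# LISTFILE holds one xml file name per line, in the desired order.
combine_xml() {
    listfile=$1
    outfile=$2
    tag=${3:-arbitrary_tag}    # outer-most container tag (any name will do)

    echo "<$tag>" > "$outfile"
    # xargs hands the listed names to cat in list order; lines that
    # contain '<?xml' are removed whole before appending to the output.
    xargs cat < "$listfile" | grep -F -v '<?xml' >> "$outfile"
    echo "</$tag>" >> "$outfile"
}
```

Usage would be something like: combine_xml order.txt outpath/combined.xml arbitrary_tag, where order.txt is the text file listing the xml files in the sequence you want.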

That also assumes you're on a system where the unix/linux/osx/cygwin "cat", "echo", "fgrep" and "xargs" commands are available.

