PerlMonks  

help need when processing xml using perl

by veerubiji (Sexton)
on Nov 27, 2011 at 15:55 UTC (#940269=perlquestion)
veerubiji has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I have a string containing XML data. I parse it with the XML::Simple module, pass the result to a Template Toolkit template, and convert the output to PDF with pdflatex. My problem is that some information is repeated in the XML, so it is repeated in the PDF as well. How can I remove it? My data looks like this:

    <specification>
      <details>
        <name>johan</name>
        <address>Langgt 23</address>
        ---more info---
      </details>
      <details>
        <name>venu</name>
        <address>storgatan 27</address>
        ---more info---
      </details>
      <details>
        <name>kent</name>
        <address>nygatan 46</address>
        ---more info----
      </details>
      <details>
        <name>johan</name>
        <branch>ece</branch>
        ---more info--
      </details>
    </specification>

In the XML above, the name johan appears in two nodes. The first node has his address details; the repeated node has other details instead. I need only the nodes that contain address information: if a node has no address element, the entire node should be removed. I am new to Perl. How do I find the nodes that have no address information, and how do I delete them? Please help me.

Somebody suggested that XML::Simple can automatically delete repeated information with some specific configuration, but I could not find it in that module's documentation.

Please help me. I have been working on this task for the last 10 days and have been unable to remove this information.

This is how I am processing the data:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;
    use XML::Simple;
    use Template;

    my $xml_string = "some xml data";    # the XML document as a string

    # ForceArray keeps <details> a list even when only one is present;
    # KeyAttr => [] stops XML::Simple from folding the list into a hash keyed on <name>.
    my $xs   = XML::Simple->new;
    my $data = $xs->XMLin($xml_string, ForceArray => ['details'], KeyAttr => []);
    #print Dumper($data);

    my $template = Template->new();
    my $filename = 'output1.tex';
    $template->process(\*DATA, $data, $filename)
        || die "Template process failed: ", $template->error(), "\n";
    system("pdflatex $filename");

    __DATA__
    \documentclass[a4paper,leqno,twoside]{article}
    \usepackage[latin1]{inputenc}
    \usepackage[english]{babel}
    \usepackage{multirow}
    \renewcommand{\familydefault}{\sfdefault}
    \usepackage{color}
    \usepackage[colorlinks=true]{hyperref}
    \begin{document}
    [% FOREACH st IN details %]
    [% st.name %] [% st.address %]
    ---more info---
    [% END %]
    \end{document}

Any help is appreciated.

Thanks in advance

Re: help need when processing xml using perl
by LesleyB (Friar) on Nov 27, 2011 at 16:07 UTC

    It would be helpful to see the Data::Dumper output.
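For readers following along, here is a minimal sketch of producing such a dump. The `$data` literal is a hypothetical stand-in for what `XMLin` might return for the sample XML, assuming it was called with `ForceArray => ['details']` and `KeyAttr => []`; the real structure depends on the options actually used, which is exactly why the dump is worth looking at.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the parsed XML (assumption: <details> is an
# array of hashes, one hash per record, with scalar name/address values).
my $data = {
    details => [
        { name => 'johan', address => 'Langgt 23' },
        { name => 'venu',  address => 'storgatan 27' },
        { name => 'kent',  address => 'nygatan 46' },
        { name => 'johan', branch  => 'ece' },
    ],
};

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare
$Data::Dumper::Indent   = 1;    # fixed, compact indentation

my $dump = Dumper($data);
print $dump;
```

The dump shows at a glance whether `details` is an array or a hash and which records lack an `address` key.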

Re: help need when processing xml using perl
by roboticus (Canon) on Nov 27, 2011 at 17:11 UTC

    veerubiji:

    Before processing the template, scan through your $data structure and delete all the duplicate items--specifically the ones that don't have complete information for your template.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Can you suggest which module would be useful and how to delete such information? If possible, could you provide an example?

        veerubiji:

        Come on, now. It's just a data structure. Print it out with something like Data::Dumper, then write some code to iterate over the data structure and delete the things you don't want. Give it a try.

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.
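One way the "iterate and delete" advice above could look, as a rough sketch rather than the poster's own code: it assumes `details` is an array of hashes (for example from `XMLin` with `ForceArray => ['details']` and `KeyAttr => []`), and the `$data` literal below is hypothetical sample data standing in for the parsed XML.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure returned by XMLin
# (assumption: one hash per <details> element).
my $data = {
    details => [
        { name => 'johan', address => 'Langgt 23' },
        { name => 'venu',  address => 'storgatan 27' },
        { name => 'kent',  address => 'nygatan 46' },
        { name => 'johan', branch  => 'ece' },
    ],
};

# Keep only records with a usable address. References are rejected too,
# because XML::Simple represents an empty element as an empty hash
# reference by default.
my @with_address = grep {
    defined $_->{address} && !ref $_->{address} && length $_->{address}
} @{ $data->{details} };

# Replace the original list, so the template only ever sees complete records.
$data->{details} = \@with_address;

printf "%s: %s\n", $_->{name}, $_->{address} for @{ $data->{details} };
```

Building a new list with `grep` avoids the bookkeeping of splicing elements out of an array while looping over it. The filtered `$data` can then be passed to `$template->process` unchanged.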

Re: help need when processing xml using perl
by Anonymous Monk on Nov 27, 2011 at 19:01 UTC

    Maybe

    http://stackoverflow.com/questions/8224980/how-to-remove-duplicate-nodes-from-xml-file-using-perl

    http://stackoverflow.com/questions/8260877/how-can-i-delete-duplicate-information-with-perl

Re: help need when processing xml using perl
by CountZero (Bishop) on Nov 27, 2011 at 22:38 UTC
    You can easily add a test in your template to check if st.address is not empty.
        [% FOREACH st IN details %]
        [% IF st.address %]
        [% st.name %] [% st.address %]
        ---more info---
        [% END %]
        [% END %]

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      Thank you very much for your solution; it's working.

Re: help need when processing xml using perl
by TJPride (Pilgrim) on Nov 28, 2011 at 05:16 UTC
    CountZero's solution is of course far more elegant. I thought I'd see if I could code a filter using just regex, however:

        use strict;
        use warnings;

        my ($xml, $content, @xml);

        # Slurp the whole document into one string.
        $xml = join '', <DATA>;

        # Walk over each <details> block and keep only those with a non-empty <address>.
        while ($xml =~ m/(<details>.*?<\/details>)/sg) {
            $content = $1;
            next if $content !~ /<address>.+?<\/address>/;
            push @xml, $content;
        }

        # Re-wrap the surviving blocks in the root element.
        $xml = "<specification>\n" . join("\n", map { " $_" } @xml) . "\n</specification>";
        print $xml;

        __DATA__
        <specification>
          <details>
            <name>johan</name>
            <address>Langgt 23</address>
            ---more info---
          </details>
          <details>
            <name>venu</name>
            <address>storgatan 27</address>
            ---more info---
          </details>
          <details>
            <name>kent</name>
            <address>nygatan 46</address>
            ---more info----
          </details>
          <details>
            <name>johan</name>
            <branch>ece</branch>
            ---more info--
          </details>
        </specification>

      Thanks for your solution, but I don't know how to vote for you.

        You have to get to level 2 or 3 before you start getting votes, and until then you won't see the vote option.
