PerlMonks  

XML::Twig output problem

by plockhart (Initiate)
on Jan 05, 2012 at 17:33 UTC ( [id://946440]=perlquestion )

plockhart has asked for the wisdom of the Perl Monks concerning the following question:

XML::Twig is returning only the first node rather than all matching nodes, apparently because it carries over the previous JobActiveDate value. If I flush inside the sub DumpOldAds, it throws out all the ads!

I have this xml I'm massaging for a feed of job ads:

<?xml version="1.0" encoding="UTF-8"?>
<nis:Jobs xsi:schemaLocation="http://schemas.monster.com/Monster http://schemas.monster.com/Current/XSD/Job.xsd http://schemas.monster.com/Monster/NIS http://schemas.monster.com/Current/Extensions/NIS/XSD/NISJob.xsd"
          xmlns:nis="http://schemas.monster.com/Monster/NIS"
          xmlns="http://schemas.monster.com/Monster"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nis:NISJob>
    <Job jobRefCode="NewsMinimumXMLSample" jobAction="addOrUpdate">
      <RecruiterReference>
        <UserName>nis123456_jsmith</UserName>
      </RecruiterReference>
      <CompanyReference>
        <CompanyXCode>xnis123456_jsmithx</CompanyXCode>
        <CompanyName><![CDATA[John Smith Press]]></CompanyName>
      </CompanyReference>
      <Channel monsterId="4770"/>
      <JobInformation>
        <JobTitle><![CDATA[Printer]]></JobTitle>
        <Contact>
          <StructuredName>
            <GivenName><![CDATA[John]]></GivenName>
            <FamilyName><![CDATA[Smith]]></FamilyName>
          </StructuredName>
          <Address>
            <StreetAddress>1 Main Street</StreetAddress>
            <City>Maynard</City>
            <State>MA</State>
            <CountryCode>US</CountryCode>
            <PostalCode>01754</PostalCode>
          </Address>
          <Phones>
            <Phone phoneType="contact">8885551212</Phone>
          </Phones>
          <E-mail>news.dev.sample@monster.com</E-mail>
        </Contact>
        <JobBody><![CDATA[<font face="StdNewtonABold" size=1 color="#000000"><dl><dl><dd>Job Body goes here</font></dl></dl>]]></JobBody>
      </JobInformation>
      <JobPostings>
        <JobPosting>
          <Location>
            <CountryCode>US</CountryCode>
            <PostalCode>01754</PostalCode>
            <Continent>NA</Continent>
          </Location>
          <JobCategory monsterId="2"/>
          <JobOccupations>
            <JobOccupation monsterId="11713"/>
          </JobOccupations>
          <BoardName monsterId="1"/>
          <JobPostingDates>
            <JobActiveDate>2011-11-28T00:00:00</JobActiveDate>
            <JobExpireDate>2011-12-28T00:00:00</JobExpireDate>
          </JobPostingDates>
        </JobPosting>
      </JobPostings>
    </Job>
  </nis:NISJob>
</nis:Jobs>

Using this code:

#!/usr/bin/perl -w
use strict;
use warnings;
use XML::Twig;
use Date::Manip;
use Data::Dumper;

my $datetoday = UnixDate("today","%m-%d-%Y");
chomp $datetoday;
my $xmlFile = "WacoTri_pre_$datetoday.xml";
my $xmlOut = "WacoTri_$datetoday.xml";
open (XMLOUT, ">$xmlOut");

my $twig = new XML::Twig(
    twig_handlers => {
        'CompanyXCode'    => \&lccXCode,
        'JobPostingDates' => \&DoDate,
        'nis:NISJob'      => \&DumpOldAds
    },
    pretty_print => 'indented',
);
$twig->parsefile ($xmlFile);
$twig->flush;
$twig->print (\*XMLOUT);
close XMLOUT;

sub DoDate {
    my ( $twig, $JobPostingDates)= @_;
    my $dtidate = $JobPostingDates->first_child('JobActiveDate')->text;
    my $dater = substr $dtidate, 0, 10;
    my $newDate = (DateCalc($dater,"+ 7 days"));
    my $outputdate = UnixDate($newDate, "%Y-%m-%d");
    chomp $outputdate;
    substr $dtidate, 0, 10, "$outputdate";
    $JobPostingDates->first_child('JobExpireDate')->set_text($dtidate);
    #$twig->flush;
}

sub lccXCode {
    my ( $twig, $CompanyXCode ) = @_;
    my $XCode = $CompanyXCode->text;
    my $lowercaseXCode = lc ($XCode);
    $CompanyXCode->set_text ($lowercaseXCode);
}

sub DumpOldAds {
    my ($twig,$nisNISJob)=@_;
    my $date = UnixDate("today", "%Y-%m-%d".'T00:00:00');
    chomp $date;
    my $adate = $nisNISJob->findvalue('.//JobActiveDate');
    print "\n$adate -|- $date\n";
    if( $adate ne $date) {
        $nisNISJob->delete;
    }
    else {
        print $nisNISJob;
    };
}

Everything works as expected, except that the output file contains only the first ad, not all the ads with a JobActiveDate of today as specified by the conditional in sub DumpOldAds. I should get:

<?xml version="1.0" encoding="UTF-8"?>
<nis:Jobs xsi:schemaLocation="http://schemas.monster.com/Monster http://schemas.monster.com/Current/XSD/Job.xsd http://schemas.monster.com/Monster/NIS http://schemas.monster.com/Current/Extensions/NIS/XSD/NISJob.xsd"
          xmlns:nis="http://schemas.monster.com/Monster/NIS"
          xmlns="http://schemas.monster.com/Monster"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nis:NISJob>
    <Job jobRefCode="NewsMinimumXMLSample" jobAction="addOrUpdate">
      <RecruiterReference>
        <UserName>nis123456_jsmith</UserName>
      </RecruiterReference>
      <CompanyReference>
        <CompanyXCode>xnis123456_jsmithx</CompanyXCode>
        <CompanyName><![CDATA[John Smith Press]]></CompanyName>
      </CompanyReference>
      <Channel monsterId="4770"/>
      <JobInformation>
        <JobTitle><![CDATA[Printer]]></JobTitle>
        <Contact>
          <StructuredName>
            <GivenName><![CDATA[John]]></GivenName>
            <FamilyName><![CDATA[Smith]]></FamilyName>
          </StructuredName>
          <Address>
            <StreetAddress>1 Main Street</StreetAddress>
            <City>Maynard</City>
            <State>MA</State>
            <CountryCode>US</CountryCode>
            <PostalCode>01754</PostalCode>
          </Address>
          <Phones>
            <Phone phoneType="contact">8885551212</Phone>
          </Phones>
          <E-mail>news.dev.sample@monster.com</E-mail>
        </Contact>
        <JobBody><![CDATA[<font face="StdNewtonABold" size=1 color="#000000"><dl><dl><dd>Job Body goes here</font></dl></dl>]]></JobBody>
      </JobInformation>
      <JobPostings>
        <JobPosting>
          <Location>
            <CountryCode>US</CountryCode>
            <PostalCode>01754</PostalCode>
            <Continent>NA</Continent>
          </Location>
          <JobCategory monsterId="2"/>
          <JobOccupations>
            <JobOccupation monsterId="11713"/>
          </JobOccupations>
          <BoardName monsterId="1"/>
          <JobPostingDates>
            <JobActiveDate>2011-11-28T00:00:00</JobActiveDate>
            <JobExpireDate>2011-12-28T00:00:00</JobExpireDate>
          </JobPostingDates>
        </JobPosting>
      </JobPostings>
    </Job>
  </nis:NISJob>
  <nis:NISJob> ... Job with today's active date </nis:NISJob>
  <nis:NISJob> ... Job with today's active date </nis:NISJob>
</nis:Jobs>

I can't seem to find the solution, though I feel like it is some simple thing staring me in the face! I seek the wisdom!

Replies are listed 'Best First'.
Re: XML::Twig output problem
by toolic (Bishop) on Jan 05, 2012 at 18:12 UTC
    The code you posted has compile errors for me (DumpOldAds and extra semicolons). Are you sure this is the code you are using? If not, please update the OP.

    Also, please post the exact output you expect.

      The output should contain every ad (<nis:NISJob>...</nis:NISJob>) that has a JobActiveDate of today. What I get in the output file is just the last ad (<nis:NISJob>...</nis:NISJob>) in the file with today's JobActiveDate. Yup, I forgot to take some testing trash out of the sub:

      sub DumpOldAds {
          my ($twig,$nisNISJob)=@_;
          my $date = UnixDate("today", "%Y-%m-%d".'T00:00:00');
          chomp $date;
          my $adate = $nisNISJob->findvalue('//JobActiveDate');
          print "\n$adate -|- $date\n";
          if( $adate ne $date) {
              $nisNISJob->delete;
          };
          #else {$twig->flush;};
      }

        Make that the first ad, not the last; sorry, long week. If I flush it, it deletes all the ads even though the comparison says to keep them!
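A note on the XPath used in DumpOldAds: as far as I know, XML::Twig treats a path that starts with / or // as anchored at the root of the twig even when it is called on an element, while .// searches only below the current element, and findvalue joins the text of every match. The sketch below compares the two forms inside a handler. The cut-down, namespace-free document, the dates, and the @absolute / @relative arrays are stand-ins for illustration, not the real feed.

```perl
use strict;
use warnings;
use XML::Twig;

# Simplified stand-in for the feed: two ads with different active dates.
my $xml = <<'XML';
<Jobs>
  <NISJob><JobActiveDate>2011-11-28T00:00:00</JobActiveDate></NISJob>
  <NISJob><JobActiveDate>2012-01-05T00:00:00</JobActiveDate></NISJob>
</Jobs>
XML

my (@absolute, @relative);

my $twig = XML::Twig->new(
    twig_handlers => {
        NISJob => sub {
            my ($t, $job) = @_;
            # Root-anchored: searches everything built in the tree so far.
            push @absolute, $job->findvalue('//JobActiveDate');
            # Relative: searches only inside the current NISJob.
            push @relative, $job->findvalue('.//JobActiveDate');
        },
    },
);
$twig->parse($xml);

# Show what each form returned for each ad, in document order.
for my $i (0 .. $#relative) {
    print "ad $i: absolute='$absolute[$i]' relative='$relative[$i]'\n";
}
```

If the root-anchored form also picks up dates from ads kept earlier in the tree, the string stops matching today's date once one ad has been kept, which would be consistent with only the first ad surviving.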

Re: XML::Twig output problem
by plockhart (Initiate) on Jan 06, 2012 at 19:58 UTC

    I found a fix, though I am not understanding why it worked. I am new to both Perl and XML::Twig, so it should be no surprise. I updated the code to reflect the fix. If someone could explain what I am missing, I would certainly appreciate it!

      Um, what did you update, where?

        I updated the code that I was having trouble with, in particular:

        sub DumpOldAds {
            my ($twig,$nisNISJob)=@_;
            my $date = UnixDate("today", "%Y-%m-%d".'T00:00:00');
            chomp $date;
            my $adate = $nisNISJob->findvalue('.//JobActiveDate');
            print "\n$adate -|- $date\n";
            if( $adate ne $date) {
                $nisNISJob->delete;
            }
            else {
                print $nisNISJob;
            };
        }

        When I replaced "else {$twig->flush}" with "else {print $nisNISJob}", it worked. What am I missing about why flush did not work? Thanks, P
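On the flush question: per the XML::Twig documentation, flush prints the twig up to the current element and then drops that part from memory, and with no argument it prints to the currently selected filehandle (normally STDOUT) rather than to XMLOUT. That would leave little or nothing for the final print to write into the file. A minimal sketch of passing the handle to every flush call follows. It assumes a simplified namespace-free document, a fixed stand-in for today's date ($today), and an in-memory filehandle ($fh writing into $out) in place of the real output file.

```perl
use strict;
use warnings;
use XML::Twig;

# Simplified stand-in for the feed: one old ad, one ad dated "today".
my $xml = <<'XML';
<Jobs>
  <NISJob><JobActiveDate>2011-11-28T00:00:00</JobActiveDate></NISJob>
  <NISJob><JobActiveDate>2012-01-05T00:00:00</JobActiveDate></NISJob>
</Jobs>
XML

my $today = '2012-01-05T00:00:00';   # hypothetical fixed date for the example

# In-memory filehandle standing in for the real output file.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        NISJob => sub {
            my ($t, $job) = @_;
            if ($job->findvalue('.//JobActiveDate') ne $today) {
                $job->delete;        # drop ads that are not dated today
            }
            else {
                $t->flush($fh);      # write kept ads to the output handle
            }
        },
    },
);
$twig->parse($xml);
$twig->flush($fh);                   # flush the end of the document
close $fh;

print $out;
```

With this arrangement a separate print of the twig at the end is not needed, because everything that was kept has already been written to the handle by flush.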
