XPath not behaving as expected...

by spstansbury (Monk)
on Jan 11, 2011 at 17:31 UTC (id://881710)

spstansbury has asked for the wisdom of the Perl Monks concerning the following question:

I am processing a batch of vulnerability reports, and my XPath syntax is not working as I expect.

The block of code:

for ($xc->findnodes( 'fndvuln', $host)) {
    $fnd_vuln_id = $_->findvalue('./@id');
    print "\n";
    print $fnd_vuln_id . "\n";
    $commonRecord{"nCircleVulnID"} = $fnd_vuln_id;

    # The vulnerability descriptions are in /audit/vulnerabilities
    for $vuln ( $xc->findnodes("/audit/vulnerabilities/vuln[\@id = '$fnd_vuln_id']")) {
        $commonRecord{"nCircleVulnName"}     = $xc->findvalue('vname', $vuln);
        $commonRecord{"nCircleVulnScore"}    = $xc->findvalue('vscore', $vuln);
        $commonRecord{"nCircleVulnRisk"}     = $xc->findvalue('risk', $vuln);
        $commonRecord{"nCircleVulnSkill"}    = $xc->findvalue('skill', $vuln);
        $commonRecord{"nCircleVulnStrategy"} = $xc->findvalue('strategy', $vuln);
        $commonRecord{"nCircleVulnDesc"}     = &clean( $xc->findvalue( 'vdescription', $vuln));

        # This is where the issue is:
        if ( $xc->findnodes( 'advisories/cve', $vuln )) {
            for ( $xc->findvalue( 'advisories/cve', $vuln )) {
                print $_ . "\n";
                push ( @cve_records, $_ );
            }
        }
    }

And the XML that it is reading:

<audit>
  <devices>
    <host id="125861" persistent_id="20164">
      <fndvuln id="3522" port="161" proto="udp"/>
    </host>
  </devices>
  <vulnerabilities>
    <vuln id="3522">
      <vname> SNMP System Description Available (system.sysDescr)</vname>
      <vscore>48</vscore>
      <risk>Exposure</risk>
      <skill>Automated Exploit</skill>
      <strategy>Network Reconnaissance</strategy>
      <vdescription>
        The SNMP System Description (sys.sysDescr, OID=.iso.3.6.1.2.1.1.1.0) is remotely available. This can give detailed operating system, build, and version information about the target.
      </vdescription>
      <advisories>
        <cve>CVE: CVE-1999-0516</cve>
        <cve>CVE: CVE-1999-0517</cve>
      </advisories>
    </vuln>

Note that there are multiple <cve> elements, but their values are concatenated in the output:

3522
CVE: CVE-1999-0516CVE: CVE-1999-0517

How do I make the for loop read the <cve> elements individually so that I can push them onto the array?

As always, thanks for any input...

Scott

Replies are listed 'Best First'.
Re: XPath not behaving as expected...
by ikegami (Patriarch) on Jan 11, 2011 at 18:10 UTC

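    s/findvalue/findnodes/. findvalue returns a single string (the text of every matching node, concatenated), while findnodes returns the nodes themselves. It's also very silly to do the search twice.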

    if ( $xc->findnodes( 'advisories/cve', $vuln )) {
        for ( $xc->findvalue( 'advisories/cve', $vuln )) {
            push ( @cve_records, $_ );
        }
    }

    should be

    for my $cve_node ($xc->findnodes('advisories/cve', $vuln)) {
        push @cve_records, $cve_node->textContent();
    }
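To see the difference side by side, here is a minimal, self-contained sketch. It assumes XML::LibXML is installed and uses a trimmed copy of the <advisories> fragment from the question; the variable names ($joined and friends) are only illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of the <vuln> element from the question.
my $xml = <<'XML';
<vuln id="3522">
  <advisories>
    <cve>CVE: CVE-1999-0516</cve>
    <cve>CVE: CVE-1999-0517</cve>
  </advisories>
</vuln>
XML

my $doc    = XML::LibXML->load_xml(string => $xml);
my $xc     = XML::LibXML::XPathContext->new($doc);
my ($vuln) = $xc->findnodes('/vuln');

# findvalue yields ONE string: the text of every match, run together.
my $joined = $xc->findvalue('advisories/cve', $vuln);

# findnodes (list context) yields one node per match.
my @cve_records;
for my $cve_node ($xc->findnodes('advisories/cve', $vuln)) {
    push @cve_records, $cve_node->textContent();
}

print "findvalue: $joined\n";
print "findnodes: $_\n" for @cve_records;
```

With this input, the findvalue line prints both CVE strings joined into one, while the findnodes loop prints them one per line.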

    By the way,

    $fnd_vuln_id = $_->findvalue('./@id');

    should be

    my $fnd_vuln_id = $_->findvalue('./@id');

    and it can be simplified to

    my $fnd_vuln_id = $_->findvalue('@id');

    and even to

    my $fnd_vuln_id = $_->getAttribute('id');
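The three spellings can be compared in a small sketch like the one below. It assumes XML::LibXML and a cut-down <host> element; the attribute name "nope" is made up purely to show what each call gives back when the attribute is absent.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string =>
    '<host id="125861"><fndvuln id="3522" port="161" proto="udp"/></host>');
my ($fnd) = $doc->findnodes('/host/fndvuln');

# All three read the same attribute.
my $via_dot   = $fnd->findvalue('./@id');
my $via_xpath = $fnd->findvalue('@id');
my $via_dom   = $fnd->getAttribute('id');

# They differ when the attribute does not exist ("nope" is hypothetical):
my $missing_xpath = $fnd->findvalue('@nope');     # empty string
my $missing_dom   = $fnd->getAttribute('nope');   # undef

print "$via_dot $via_xpath $via_dom\n";
print "missing via XPath: '", $missing_xpath, "'\n";
print "missing via DOM:   ", (defined $missing_dom ? "'$missing_dom'" : 'undef'), "\n";
```

getAttribute avoids compiling and evaluating an XPath expression, and its undef return makes a missing attribute easy to tell apart from an empty one.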

    By the way, it would probably make more sense to load the nodes /audit/vulnerabilities/vuln into a hash (keyed by the id attribute) before the loop instead of repeatedly searching the tree inside the loop.
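One way that hash-based lookup might look is sketched below. It is not the original poster's code: the XML is a shortened copy of the sample from the question, and %vuln_by_id is a name chosen here for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<audit>
  <devices>
    <host id="125861" persistent_id="20164">
      <fndvuln id="3522" port="161" proto="udp"/>
    </host>
  </devices>
  <vulnerabilities>
    <vuln id="3522">
      <vname>SNMP System Description Available (system.sysDescr)</vname>
      <advisories>
        <cve>CVE: CVE-1999-0516</cve>
        <cve>CVE: CVE-1999-0517</cve>
      </advisories>
    </vuln>
  </vulnerabilities>
</audit>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my $xc  = XML::LibXML::XPathContext->new($doc);

# Index every <vuln> node by its id attribute, once, up front.
my %vuln_by_id;
for my $vuln ($xc->findnodes('/audit/vulnerabilities/vuln')) {
    $vuln_by_id{ $vuln->getAttribute('id') } = $vuln;
}

my @cve_records;
for my $host ($xc->findnodes('/audit/devices/host')) {
    for my $fnd ($xc->findnodes('fndvuln', $host)) {
        # Hash lookup instead of re-searching the whole tree.
        my $vuln = $vuln_by_id{ $fnd->getAttribute('id') } or next;
        for my $cve_node ($xc->findnodes('advisories/cve', $vuln)) {
            push @cve_records, $cve_node->textContent();
        }
    }
}

print "$_\n" for @cve_records;
```

The index costs one pass over /audit/vulnerabilities; after that each fndvuln is resolved with a hash lookup rather than a fresh XPath search from the document root.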

    Update: Added notes.

      Once again, thank you very much!

      Scott...

Re: XPath not behaving as expected...
by bart (Canon) on Jan 11, 2011 at 22:53 UTC
    In the spirit of TIMTOWTDI, you can probably use this as a selector for the individual nodes:
    'advisories/cve[1]'
    for the first, and
    'advisories/cve[2]'
    for the second.

    I am guessing that, when extracting the text, the XPath processor simply concatenates everything it finds.
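A small sketch of the positional-predicate approach follows, assuming XML::LibXML and the same trimmed <vuln> fragment. The index 3 is deliberately out of range to show what comes back when no such element exists.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<vuln id="3522">
  <advisories>
    <cve>CVE: CVE-1999-0516</cve>
    <cve>CVE: CVE-1999-0517</cve>
  </advisories>
</vuln>
XML

my $doc    = XML::LibXML->load_xml(string => $xml);
my $xc     = XML::LibXML::XPathContext->new($doc);
my ($vuln) = $xc->findnodes('/vuln');

# XPath positions are 1-based; each predicate selects a single <cve>.
my $first  = $xc->findvalue('advisories/cve[1]', $vuln);
my $second = $xc->findvalue('advisories/cve[2]', $vuln);

# No third <cve> exists, so the node set is empty and the value is ''.
my $third  = $xc->findvalue('advisories/cve[3]', $vuln);

print "first:  $first\n";
print "second: $second\n";
print "third:  '$third'\n";
```

This works when the number of <cve> elements is known in advance; for a variable number, iterating over findnodes is usually the simpler route.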
