Re^2: Extracting tagged data from a XML file

by theroninwins (Friar)
on Aug 31, 2004 at 09:45 UTC (#387142)

in reply to Re: Extracting tagged data from a XML file
in thread Extracting tagged data from a XML file

For some reason this works, but when I leave out the NEIGHBOUR part it no longer does. How come? I know regular expressions, and from my point of view "$_ =~ ( /<IPAddress>(.+?)<\/IP.+?/ );" should work fine. Where is the mistake? I am leaving the rest as it is. The problem is that I only need the IPAddress and not the NEIGHBOURIPAddress, and the string IPAddress appears in both. How can I get around that and match only the first one? Otherwise every IP ends up in the list a thousand times. Sorry for only mentioning this now; I just noticed it.

Re^3: Extracting tagged data from a XML file
by Random_Walk (Prior) on Aug 31, 2004 at 10:38 UTC
    Here is a naughty one-liner to extract all occurrences of IP addresses between <IP-ADDRESS> tags, in any case and across lines. It is very inefficient for a large file because it reads the whole file into memory. Change the word data to the name of your file.
    perl -le 'local$/;open F,data;$_=<F>;s/\n//g;while(/<ip-address>(.+?)<\/ip/gi){print $1}'
    Here is the same, but to grab IP addresses from either <IP-ADDRESS> or <IP-NEIGHBOUR> tags.
    perl -le 'local$/;open F,data;$_=<F>;s/\n//g;while(/<ip-(neighbour|address)>(.+?)<\/ip/gi){print $2}'


    For some reason I got marked -1 on this. If anyone can explain what I did wrong here, I'd love to know. I realise the code is naughty for eating the file in one gulp, but if the file is small this can't do much harm, and it makes for a very simple solution to the possible problem of addresses being broken across line breaks.

    Anyway, looking at this thread, the OP appears to have changed his mind: he no longer wants to capture the <IP-NEIGHBOUR> addresses, and he wants only unique addresses returned. I will update this space soon with a version that is more memory-friendly and possibly redeem myself in the eyes of the monastery.
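
    A minimal sketch of what such a memory-friendlier version might look like. It is still regex-based, so it is not a real XML parser. It assumes the tag is spelled <IP-ADDRESS> and that an element never spans a line break; the helper name unique_ip_addresses and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle one line at a time and return each
# distinct address found between <IP-ADDRESS> tags, in first-seen order.
# Assumption: an element never spans a line break.
sub unique_ip_addresses {
    my ($fh) = @_;
    my (%seen, @unique);
    while (my $line = <$fh>) {
        # Matching the full tag name keeps <IP-NEIGHBOUR> elements out.
        while ($line =~ m{<ip-address>\s*(.+?)\s*</ip-address>}gi) {
            push @unique, $1 unless $seen{$1}++;
        }
    }
    return @unique;
}

# Usage: an in-memory filehandle stands in for the real file.
my $sample = <<'XML';
<IP-ADDRESS>10.0.0.1</IP-ADDRESS>
<IP-NEIGHBOUR>10.0.0.2</IP-NEIGHBOUR>
<ip-address>10.0.0.1</ip-address>
<IP-ADDRESS>10.0.0.3</IP-ADDRESS>
XML
open my $fh, '<', \$sample or die "Cannot open sample: $!";
print "$_\n" for unique_ip_addresses($fh);
close $fh;
```

    Only one line is held in memory at a time, and the %seen hash drops repeats. The trade-off is that an address split across a line break would be missed, which is the case the slurping one-liner was meant to handle.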

    Further update

    OK, I have had the error of my ways pointed out: thou shalt not parse XML with regexp. I shall stop sinning now; no further unholy code shall follow.
      if anyone can explain what I did wrong here I'd love to know

      You tried to parse XML with regular expressions.
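
      For comparison, a hedged sketch of the parser-based route, assuming the XML::LibXML module from CPAN is installed. The element names IPAddress and NeighbourIPAddress, the surrounding hosts/host structure, and the helper name ip_addresses_from_xml are guesses for illustration; the real file's layout is not shown in this thread.

```perl
use strict;
use warnings;
use XML::LibXML;    # CPAN module, not part of core Perl

# Hypothetical helper: return the distinct text values of every element
# named exactly IPAddress, in document order.
sub ip_addresses_from_xml {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my %seen;
    return grep { !$seen{$_}++ }
           map  { $_->textContent }
           $doc->findnodes('//IPAddress');
}

# Usage with made-up sample data.
my $xml = <<'XML';
<hosts>
  <host>
    <IPAddress>10.0.0.1</IPAddress>
    <NeighbourIPAddress>10.0.0.2</NeighbourIPAddress>
  </host>
  <host>
    <IPAddress>10.0.0.1</IPAddress>
  </host>
</hosts>
XML
print "$_\n" for ip_addresses_from_xml($xml);
```

      Because the XPath expression selects elements by exact name, the neighbour element is never considered, and line breaks inside the document stop being a concern.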


      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg
