PerlMonks  

Re^2: Regular Expression: I need a regex to fetch data from an html file

by Anonymous Monk
on Feb 27, 2012 at 10:27 UTC


in reply to Re: Regular Expression: I need a regex to fetch data from an html file
in thread Regular Expression: I need a regex to fetch data from an html file

<tr><td id='Auf'>50956866</td> <td id='Ku'>D510848</td> <td id='Rec'>18.10.2011</td> <td id='Re'>EUR 118,95</td> <td id='Za'>EUR 0,00</td> <td id='Off'>EUR 118,95</td>
This was the HTML file I wanted to extract the data from. I finally have a solution and just wanted to share it with you monks. The regex:
<td id='AuftragsId'>(.*)</td>\s*<td id='KundenNr'>(.*)</td>\s*<td id='RechnungsDatum'>(.*)</td>\s*<td id='RechnungsBetragAktuell'>(.*)</td>\s*<td id='ZahlungsBetragAktuell'>(.*)</td>\s*<td id='OffenePosten'>(.*)</td>
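For anyone who wants to try this, here is a minimal sketch of applying such a pattern in Perl. It assumes the abbreviated ids from the sample row above ('Auf', 'Ku', and so on) rather than the full names used in the regex, and it swaps (.*) for ([^<]*) so a capture cannot run past the end of its cell. The variable names are only illustrative.

```perl
use strict;
use warnings;

# Sample row copied from the post above (abbreviated ids).
my $html = q{<tr><td id='Auf'>50956866</td> <td id='Ku'>D510848</td> <td id='Rec'>18.10.2011</td> <td id='Re'>EUR 118,95</td> <td id='Za'>EUR 0,00</td> <td id='Off'>EUR 118,95</td>};

# Assumed column ids, in the order they appear in the row.
my @ids = qw(Auf Ku Rec Re Za Off);

# Build one pattern with a capture group per cell, cells separated by
# optional whitespace. [^<]* stops each capture at the next tag.
my $pattern = join '\s*', map { sprintf q{<td id='%s'>([^<]*)</td>}, $_ } @ids;

my %row;
if (my @values = $html =~ /$pattern/) {
    @row{@ids} = @values;    # hash slice: id => cell text
}

print "$_ = $row{$_}\n" for grep { exists $row{$_} } @ids;
```

The pattern still depends on the exact attribute quoting and cell order, so it only holds as long as the page layout stays the same.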

Re^3: Regular Expression: I need a regex to fetch data from an html file
by bitingduck (Chaplain) on Feb 27, 2012 at 16:26 UTC

    There are a lot of nice modules on CPAN that will do your extraction in a more robust way, i.e. they won't break if the maker of the table makes small changes in the text.

    Some places to start:
    HTML::TableExtract
    HTML::TreeBuilder
    HTML::TokeParser

    Unless you're trying to do something really out there (and maybe even then), someone has probably already solved more than half of your problem and posted a module that does it reliably.
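As a rough illustration of the module route, here is a minimal sketch using HTML::TokeParser on the sample row from the parent post. It assumes the module is installed (it ships with the HTML-Parser distribution on CPAN, not with core Perl) and that every cell of interest carries an id attribute; the %row hash is just an illustrative name.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Sample row copied from the parent post.
my $html = q{<tr><td id='Auf'>50956866</td> <td id='Ku'>D510848</td> <td id='Rec'>18.10.2011</td> <td id='Re'>EUR 118,95</td> <td id='Za'>EUR 0,00</td> <td id='Off'>EUR 118,95</td>};

# Passing a scalar reference makes the parser read from the string.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my %row;
# get_tag('td') skips ahead to the next <td> start tag and returns
# [tagname, attribute hashref, attribute order, original text].
while (my $tag = $parser->get_tag('td')) {
    my $id = $tag->[1]{id};
    next unless defined $id;
    # Collect the cell text up to the closing </td>, whitespace trimmed.
    $row{$id} = $parser->get_trimmed_text('/td');
}

print "$_ = $row{$_}\n" for sort keys %row;
```

Because the cells are looked up by tag and id rather than by position, this keeps working if the cell order, the whitespace between cells, or the attribute quoting changes.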
