PerlMonks  

Re^2: Regular Expression: I need a regex to fetch data from an html file

by Anonymous Monk
on Feb 27, 2012 at 10:27 UTC (#956417)


in reply to Re: Regular Expression: I need a regex to fetch data from an html file
in thread Regular Expression: I need a regex to fetch data from an html file

<tr><td id='Auf'>50956866</td> <td id='Ku'>D510848</td> <td id='Rec'>18.10.2011</td> <td id='Re'>EUR 118,95</td> <td id='Za'>EUR 0,00</td> <td id='Off'>EUR 118,95</td>
This was the HTML file I wanted to extract the data from. I finally have a solution and just wanted to share it with you monks. The regex:
<td id='AuftragsId'>(.*)</td>\s*<td id='KundenNr'>(.*)</td>\s*<td id='RechnungsDatum'>(.*)</td>\s*<td id='RechnungsBetragAktuell'>(.*)</td>\s*<td id='ZahlungsBetragAktuell'>(.*)</td>\s*<td id='OffenePosten'>(.*)</td>
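
For anyone who wants to try the regex, here is a minimal sketch of one way to apply it. It assumes a sample row that carries the full id names the regex expects (the row quoted above uses shortened ids), and the names @ids and %record are only illustrative. It also swaps (.*) for ([^<]*), so that a capture cannot run past the end of its own cell when several rows sit on one line.

```perl
use strict;
use warnings;

# Assumed sample row: same values as in the post, but with the full id
# names that the regex looks for.
my $row = q{<tr><td id='AuftragsId'>50956866</td> <td id='KundenNr'>D510848</td> <td id='RechnungsDatum'>18.10.2011</td> <td id='RechnungsBetragAktuell'>EUR 118,95</td> <td id='ZahlungsBetragAktuell'>EUR 0,00</td> <td id='OffenePosten'>EUR 118,95</td>};

# Column ids in the order they appear in the row.
my @ids = qw(
    AuftragsId KundenNr RechnungsDatum
    RechnungsBetragAktuell ZahlungsBetragAktuell OffenePosten
);

# Build one pattern from the id list: a capturing cell per id, with
# optional whitespace between cells. [^<]* stops at the next tag.
my $pattern = join '\s*', map { sprintf(q{<td id='%s'>([^<]*)</td>}, $_) } @ids;

# In list context a successful match returns the six captures.
my %record;
if (my @values = $row =~ /$pattern/) {
    @record{@ids} = @values;
}

print "$_ = $record{$_}\n" for grep { exists $record{$_} } @ids;
```

Like the original regex, this still depends on the exact quoting, attribute order, and column order of the markup.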


Re^3: Regular Expression: I need a regex to fetch data from an html file
by bitingduck (Friar) on Feb 27, 2012 at 16:26 UTC

    There are a lot of nice modules on CPAN that will do your extraction in a more robust way, i.e. they won't break if the maker of the table makes small changes in the text.

    Some places to start:
    HTML::TableExtract
    HTML::TreeBuilder
    HTML::TokeParser

    Unless you're trying to do something really out there (and maybe even then), someone has probably already solved more than half of your problem and posted a module that does it reliably.
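
    As a rough illustration of the module route, here is a minimal sketch using HTML::TokeParser, one of the modules listed above. It assumes the cells carry the full id names from the regex in the parent post, and the hash name %field is only illustrative. It collects each cell's text keyed by its id attribute, so it does not depend on column order, whitespace, or quoting style.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Assumed sample row, using the id names from the regex in the parent post.
my $html = q{<tr><td id='AuftragsId'>50956866</td> <td id='KundenNr'>D510848</td> <td id='RechnungsDatum'>18.10.2011</td> <td id='RechnungsBetragAktuell'>EUR 118,95</td> <td id='ZahlungsBetragAktuell'>EUR 0,00</td> <td id='OffenePosten'>EUR 118,95</td></tr>};

# A reference to a string makes the parser read from memory, not a file.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my %field;
# get_tag('td') skips ahead to the next <td> start tag and returns
# [ tag name, attribute hash, attribute order, original text ].
while (my $tag = $parser->get_tag('td')) {
    my $id = $tag->[1]{id};
    next unless defined $id;    # ignore cells without an id
    # Text up to the closing </td>, with surrounding whitespace trimmed.
    $field{$id} = $parser->get_trimmed_text('/td');
}

print "$_ = $field{$_}\n" for sort keys %field;
```

    To read from a file instead, pass the file name to new() in place of the string reference.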
