PerlMonks  

Parsing URL using regular expression

by knsridhar (Scribe)
on Jul 20, 2005 at 15:12 UTC ( [id://476519]=perlquestion )

knsridhar has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I have written a script which parses an mbox and writes the messages to a database. When I get a mail with an ISO-8859 character set or in quoted-printable format, I decode it with:

$message_body =~ s/=([ 0-9A-Fa-f ]{2})/chr(hex($1))/ge;
$message_body =~ s/=\n//g;
$message_body =~ s/=3D/=/g;
$message_body =~ s/>/>\n/g;
Now the problem is that when I parse mails whose body contains hyperlinks with key/value pairs, like
http://www.foo.com/bar.pl?aid=3D2744&bid=3D&cn=3DOM= Ddemo2&rid=3D0
the message body written to the db doesn't contain any hyperlinks. Is there any way to solve this problem?

Thanks
Sridhar

Replies are listed 'Best First'.
Re: Parsing URL using regular expression
by Sec (Monk) on Jul 20, 2005 at 15:24 UTC
    I suspect your mistake lies somewhere else. The above-mentioned snippet works as expected and keeps the URL parameters:
    #!perl -l
    $message_body="http://www.foo.com/bar.pl?aid=3D2744&bid=3D&cn=3DOM= Ddemo2&rid=3D0";
    $message_body =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    $message_body =~ s/=\n//g;
    $message_body =~ s/=3D/=/g;
    $message_body =~ s/>/>\n/g;
    print $message_body;
    results in http://www.foo.com/bar.pl?aid=2744&bid=&cn=OMDdemo2&rid=0
    While we're at it: the third substitution (the one with =3D) seems unnecessary, as the first substitution already transforms =3D into =.
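For comparison, here is a minimal sketch of the same decoding with the soft line breaks removed first, checked against decode_qp from the core MIME::QuotedPrint module. The helper name decode_qp_by_hand is made up for illustration, and the sample assumes that the "= D" in the posted URL was originally a soft line break ("=" followed by a newline), which the thread does not confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);   # ships with Perl since 5.8

# Hypothetical helper: hand-rolled quoted-printable decoding.
# Soft line breaks are removed BEFORE the =XX escapes are decoded, so a
# literal "=" produced from "=3D" at the end of a line is never mistaken
# for a soft line break afterwards.
sub decode_qp_by_hand {
    my ($text) = @_;
    $text =~ s/=\r?\n//g;                           # join soft-wrapped lines
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # =3D becomes "="
    return $text;
}

# Assumed sample: the URL from the question, soft-wrapped after "OM".
my $raw = "http://www.foo.com/bar.pl?aid=3D2744&bid=3D&cn=3DOM=\nDdemo2&rid=3D0";

print decode_qp_by_hand($raw), "\n";
print decode_qp($raw), "\n";
```

Both lines should show the URL with its key/value pairs intact. The order matters in one corner case: if the escapes are decoded first, a line that ends in "=3D" turns into "=" plus newline, and the following s/=\n//g then swallows both.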
Re: Parsing URL using regular expression
by Jaap (Curate) on Jul 20, 2005 at 15:22 UTC
    What do you mean by "the message body written to db doesnt contain any hyperlinks"? Do you mean the entire URL is not in the database?
      Yes, the entire URL is not written to the database.
Re: Parsing URL using regular expression
by Anonymous Monk on Jul 20, 2005 at 16:00 UTC
    What is your question actually? How to add URLs to a database? How to parse URLs? How to extract URLs from a body of text? How to deal with quoted-printable?

    Confused Monk

      My requirement is to parse a mail which contains a hyperlink with key/value pairs in its message body. Sorry if I was not clear.
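If the requirement is to pull such hyperlinks and their key/value pairs out of an already decoded body, a rough sketch could look like the following. The helper names extract_urls and query_pairs are invented for illustration, the URL pattern is deliberately simple (it stops at whitespace, angle brackets and double quotes, so trailing punctuation would be kept), and a CPAN module such as URI would be a sturdier choice for real mail.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every http/https URL found in a decoded body.
sub extract_urls {
    my ($body) = @_;
    return $body =~ m{(https?://[^\s<>"]+)}g;
}

# Hypothetical helper: split the query string of one URL into ordered
# [key, value] pairs, percent-decoding both sides.
sub query_pairs {
    my ($url) = @_;
    my ($query) = $url =~ /\?([^#]*)/ or return;
    my @pairs;
    for my $part (split /[&;]/, $query) {
        next unless length $part;               # skip empty segments ("&&")
        my ($key, $value) = split /=/, $part, 2;
        $value = '' unless defined $value;      # "key" with no "=" at all
        for ($key, $value) {
            tr/+/ /;                            # "+" means space in a query
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }
        push @pairs, [$key, $value];
    }
    return @pairs;
}

# Assumed sample body, already quoted-printable decoded.
my $body = "Please see http://www.foo.com/bar.pl?aid=2744&bid=&cn=OMDdemo2&rid=0 for details";

for my $url (extract_urls($body)) {
    print "URL: $url\n";
    printf "  %s => '%s'\n", @$_ for query_pairs($url);
}
```

Keeping the pairs as an ordered list rather than a hash preserves repeated keys and empty values such as bid.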

Node Type: perlquestion [id://476519]
Approved by Tanalis