PerlMonks  

regular expression

by ansh batra (Friar)
on Jan 07, 2013 at 09:06 UTC (#1011981)
ansh batra has asked for the wisdom of the Perl Monks concerning the following question:

My data is:

</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>

This is one line of input. I want to get the value of the src attribute of the img tag, i.e. http://www.laptop-keys.com/images/KeyboardImages/A8A.png

My code:

if ($line =~ /<img .* src=\'(.*?)\' .*\/><\/a>/) {
    print "got2\n";
    $kb_lay_img_url = $&;
}

Please help.

Replies are listed 'Best First'.
Re: regular expression
by frozenwithjoy (Priest) on Jan 07, 2013 at 09:38 UTC
    You almost have it right. If I change $kb_lay_img_url=$&; to $kb_lay_img_url=$1;, it works for me. (see: Special Variables)

    EDIT: Here, I've made this change (and added Anon's suggestion, which makes the regex safer/better):

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature 'say';

    my $line = "</div><div class='ClipPicture'><a href='http://www.laptop-keys.com/KeyboardKeys/Cart/Acer/A_Series/A110/A8A'><img alt='Clip Style Pic' src='http://www.laptop-keys.com/images/KeyboardImages/A8A.png' width='659' height='135'/></a></div><div class='RadioButtonContainerClipStyle'>";

    if ($line =~ /<img .* src='([^']+)'.*\/><\/a>/) {
        print "got2\n";
        my $kb_lay_img_url = $1;
        say $kb_lay_img_url;
    }

    __END__
    got2
    http://www.laptop-keys.com/images/KeyboardImages/A8A.png
      thanks. :)
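The difference between $& and $1 discussed in the reply above can be seen in a minimal sketch. The helper name match_parts and the shortened example.com input are illustrative assumptions, not part of the original post; the regex is the one from the corrected script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the whole match ($&) and the first capture ($1),
# or an empty list if the pattern does not match.
sub match_parts {
    my ($line) = @_;
    if ($line =~ /<img .* src='([^']+)'.*\/><\/a>/) {
        # Copy the special variables before leaving the scope of the match.
        my ($whole, $first) = ($&, $1);
        return ($whole, $first);
    }
    return;
}

# Assumed, shortened stand-in for the line in the question.
my $line = "<a href='http://example.com/page'><img alt='pic' src='http://example.com/a.png' width='10'/></a>";

my ($whole, $src) = match_parts($line);
print "whole match: $whole\n";   # everything the pattern matched
print "capture 1:   $src\n";     # only the text inside the parentheses
```

$& holds everything the pattern matched, which here is the img tag through the closing </a>, while $1 holds only what the first pair of parentheses captured, which is the src value.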
Re: regular expression
by choroba (Chancellor) on Jan 07, 2013 at 09:36 UTC
    Do not use regular expressions to parse HTML. Use a proper tool. For example, using XML::XSH2, you can load the whole file and find the src attribute easily:
    open :F html 1011981.html ; echo //img[@alt="Clip Style Pic"]/@src ;
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: regular expression (blah blah blah)
by Anonymous Monk on Jan 07, 2013 at 09:07 UTC
     '([^']+)'

      It is fetching the whole img tag.
      I need only the value of the src attribute.

        Most likely what Anonymous Monk showed you was not supposed to be the complete regular expression but intended to be used for matching the src attribute only. You will have to apply that hint appropriately.

        Did you try to understand my regex pattern, for example with YAPE::Regex::Explain? Put it in your regex, match up the quotes, delete your stuff, use my stuff.
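Why the negated character class is the safer way to match a quoted attribute value can be shown with a small sketch. The two-image input string, the trailing " width" in the patterns, and the helper names lazy_src and negated_src are assumptions made up for illustration; only the '([^']+)' idea comes from the thread.

```perl
use strict;
use warnings;

# Assumed input: two img tags, only the second has a width attribute.
my $html = "<img src='a.png' alt='x'/><img src='b.png' width='1'/>";

# Lazy dot: prefers a short match, but will still run past a closing
# quote if that is the only way to make the rest of the pattern match.
sub lazy_src {
    my ($text) = @_;
    my ($value) = $text =~ /src='(.*?)' width/;
    return $value;
}

# Negated class: can never contain a quote, so the capture always
# stays inside a single attribute value.
sub negated_src {
    my ($text) = @_;
    my ($value) = $text =~ /src='([^']+)' width/;
    return $value;
}

print "lazy:    ", lazy_src($html), "\n";
print "negated: ", negated_src($html), "\n";
```

With this input the lazy version starts at the first src=' and keeps extending until it finds "' width", so its capture spills across both tags. The negated version gives up on the first tag and captures b.png from the second.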
