PerlMonks  

Re: grabbing link and 3 regexes to save HTML to disk

by Athanasius (Prior)
on Mar 22, 2013 at 13:01 UTC


in reply to grabbing link and 3 regexes to save HTML to disk

Hello Discipulus,

I don’t have an answer to your question, sorry, just a few comments on syntax:

  • The comma operator has a lower precedence than ||, so a line such as:

    open RENDER, "> $ENV{TEMP}/_temp.html" || die "unable to write to %TEMP%\\_temp.html";

    actually parses as:

    open RENDER, ( "> $ENV{TEMP}/_temp.html" || die "unable to write to %TEMP%\\_temp.html" );

    which is not what you want. Either change || to the lower-precedence or, or put the arguments to open into parentheses.

  • In a regex, (:?X) captures X preceded by zero or one literal colons. For clustering (which is non-capturing), you need (?:X).

  • You can avoid “leaning toothpick syndrome” by using regex delimiters other than the forward slash:

    s{src="([^"]*)/}{src="./_temp_files/}gm
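The three points above can be checked with a small script. This is only a sketch under stated assumptions: the path `/nonexistent-dir-for-demo/_temp.html` and the sample `<img>` tag are made up for illustration (the path is assumed not to exist, so that `open` fails), and the script uses the three-argument form of `open` with a lexical filehandle rather than the bareword `RENDER` from the original code.

```perl
use strict;
use warnings;

# --- 1. Precedence of || versus or ---------------------------------
# Hypothetical path, assumed NOT to exist, so that open() fails.
my $missing = '/nonexistent-dir-for-demo/_temp.html';

# With ||, die is attached to the (always true) filename string,
# so it is never reached and the failed open goes unnoticed.
my $silent = eval { open my $fh, '>', $missing || die "never reached"; 1 };

# With parentheses and the low-precedence "or", die sees open's result.
my $caught = !eval {
    open( my $fh, '>', $missing ) or die "unable to write to $missing: $!";
    1;
};
my $err = $@;

# --- 2. (:?X) versus (?:X) -----------------------------------------
# (:?foo) is a capturing group matching an optional colon, then "foo".
my ( $typo_1, $typo_2 ) = ':foobar' =~ /(:?foo)(bar)/;

# (?:foo) only clusters, so the first capture is "bar".
my ($cluster_1) = ':foobar' =~ /(?:foo)(bar)/;

# --- 3. Alternative delimiters -------------------------------------
my $html = '<img src="http://example.com/pics/logo.png">';   # sample input

( my $toothpicks = $html ) =~ s/src="([^"]*)\//src=".\/_temp_files\//gm;
( my $braces     = $html ) =~ s{src="([^"]*)/}{src="./_temp_files/}gm;

print "silent failure with ||: ", ( $silent ? "yes" : "no" ), "\n";
print "failure caught with or: ", ( $caught ? "yes" : "no" ), "\n";
print "(:?foo) captured: $typo_1\n";
print "(?:foo) first capture: $cluster_1\n";
print "rewritten: $braces\n";
```

Both substitutions produce the same string; the brace-delimited version simply needs no backslashes in front of the slashes.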

Hope that helps,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,


Re^2: grabbing link and 3 regexes to save HTML to disk
by Discipulus (Deacon) on Mar 22, 2013 at 20:41 UTC
    Hello Athanasius (I wish our nicks came true)

    Many thanks for your points:
    • I never realized this about precedence: commas bite me every time. I reopened perldoc and I see that every example uses parentheses! I never used them (bad). I'll take care in the future.
    • As you can see, I am lacking a lot with regexes (I was trying to install YAPE::Regex::Explain but got stuck on a 5.8 version..)
    • LOL .. imagine a non-native English speaker mentally translating this syndrome.. lol, now I know it is an idiom born in Perl's culture. I'm chronic with that syndrome because I have always used a colorized Perl IDE.. but I'll try
    Thanks a lot for the kindness, even if OT.

    L*

    there are no rules, there are no thumbs..