PerlMonks  

Re: Is there a Limit on Matching .*

by sauoq (Abbot)
on Jul 15, 2003 at 00:01 UTC (#274232)


in reply to Is there a Limit on Matching .*

Is that what's happening here?

No. There's no arbitrary limit on the number of characters dot-star can match.

What's happening in your case, I'll bet, is that you have newlines in your $chunk and forgot that a dot doesn't match a newline unless you include the /s modifier on the regex.
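A minimal sketch of the difference, assuming a made-up `$chunk` with a newline inside the heading (the original poster's data and exact pattern aren't shown here, so both the string and the regex are only illustrative):

```perl
use strict;
use warnings;

# Hypothetical data: a heading that spans two lines.
my $chunk = "<h1>Getting\nStarted</h1>";

# Without /s the dot will not cross the newline, so this fails to match
# and $without_s stays undefined.
my ($without_s) = $chunk =~ m{<h1>(.*)</h1>}i;

# With /s the dot matches any character, newline included.
my ($with_s) = $chunk =~ m{<h1>(.*)</h1>}is;

print defined $without_s ? "without /s: matched\n" : "without /s: no match\n";
print defined $with_s    ? "with /s: matched\n"    : "with /s: no match\n";
```

The only change between the two matches is the trailing `s`, so it's an easy thing to try first.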

Be careful about setting your input record separator to '</h1>' too. That's an exact string and will be case sensitive.
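Here's a small sketch of what that looks like, reading from an in-memory filehandle. The `$html` string is invented for the example; it just mixes a lowercase and an uppercase closing tag:

```perl
use strict;
use warnings;

# Hypothetical input: one lowercase and one uppercase closing tag.
my $html = "<h1>One</h1>text<H1>Two</H1>more";

open my $fh, '<', \$html or die "Cannot open in-memory file: $!";

my @records;
{
    # $/ is a literal string, not a regex, so only the
    # lowercase '</h1>' ends a record.
    local $/ = '</h1>';
    @records = <$fh>;
}
close $fh;

printf "%d records\n", scalar @records;
print "[$_]\n" for @records;
```

The uppercase `</H1>` doesn't end a record, so everything after the first `</h1>` comes back as one piece.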

I guess I'd be remiss without including some standard scolding like, "you should parse HTML with an HTML parser, not a regex."

-sauoq
"My two cents aren't worth a dime.";


Re: Re: Is there a Limit on Matching .*
by BUU (Prior) on Jul 15, 2003 at 03:43 UTC
    But he's not "parsing" html at all, all he's doing is extracting a certain pattern. Sounds like what a regex was designed for to me.
      But he's not "parsing" html at all, all he's doing is extracting a certain pattern.

      It's a question of whether he should be parsing it instead of using a regex to extract a chunk. I don't know; I'm not working on his project. (And that's why I tossed it in as an afterthought.) TIMTOWTDI, YMMV, etc., etc., and so forth.

      -sauoq
      "My two cents aren't worth a dime.";
      
Re: Re: Is there a Limit on Matching .*
by svsingh (Priest) on Jul 15, 2003 at 14:43 UTC
    That did it. Thank you everyone! Also, thanks for the tip on the input separator. The HTML files are being generated by RoboHELP and the tags are consistently lowercase. The insensitive match is more of a habit than anything else. I think I should take that out of this script for efficiency.
