PerlMonks  

Re: Is there a Limit on Matching .*

by sauoq (Abbot)
on Jul 15, 2003 at 00:01 UTC (#274232)


in reply to Is there a Limit on Matching .*

Is that what's happening here?

No. There's no arbitrary limit on the number of characters dot-star can match.

What's happening in your case, I'll bet, is that you have newlines in your $chunk and forgot that a dot doesn't match a newline unless you include the /s modifier on the regex.
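
To illustrate the point about /s, here is a minimal sketch. The contents of $chunk and the <h1> pattern are made-up stand-ins, since the original code isn't shown in this reply; only the behaviour of the dot with and without /s is the point.

```perl
use strict;
use warnings;

# Hypothetical sample data: a heading whose text spans two lines.
my $chunk = "<h1>Getting\nStarted</h1>";

# Without /s, the dot will not cross the newline, so the match fails
# and the list assignment leaves $without_s undefined.
my ($without_s) = $chunk =~ m{<h1>(.*)</h1>}i;

# With /s, the dot matches any character, newline included.
my ($with_s) = $chunk =~ m{<h1>(.*)</h1>}is;

print defined $without_s ? "without /s: matched\n" : "without /s: no match\n";
print defined $with_s    ? "with /s: matched\n"    : "with /s: no match\n";
```

If the chunk can hold more than one heading, a non-greedy .*? is probably what you want, so the capture stops at the first closing tag.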

Be careful about setting your input record separator to '</h1>' too. That's an exact string and will be case sensitive.
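
A small sketch of why the separator's case matters, assuming made-up input read from an in-memory filehandle (the real script presumably reads files from disk). Because $/ is a literal string rather than a regex, the uppercase </H1> is not treated as a record boundary.

```perl
use strict;
use warnings;

# Hypothetical input with one uppercase closing tag in the middle.
my $html = "<h1>One</h1>\n<H1>Two</H1>\n<h1>Three</h1>\n";

open my $fh, '<', \$html or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = '</h1>';    # exact, case-sensitive string; not a pattern
    @records = <$fh>;
}
close $fh;

# The "Two" heading is not split off on its own; it lands in the same
# record as "Three", because only the lowercase tag ends a record.
for my $i (0 .. $#records) {
    (my $shown = $records[$i]) =~ s/\n/\\n/g;
    print "record $i: $shown\n";
}
```

Using local inside a block keeps the changed separator from leaking into other reads in the same program.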

I guess I'd be remiss without including some standard scolding like, "you should parse HTML with an HTML parser, not a regex."

-sauoq
"My two cents aren't worth a dime.";

Replies are listed 'Best First'.
Re: Re: Is there a Limit on Matching .*
by BUU (Prior) on Jul 15, 2003 at 03:43 UTC
    But he's not "parsing" html at all, all he's doing is extracting a certain pattern. Sounds like what a regex was designed for to me.
      But he's not "parsing" html at all, all he's doing is extracting a certain pattern.

      It's a question of whether he should be parsing it instead of using a regex to extract a chunk. I don't know; I'm not working on his project. (And that's why I tossed it in as an afterthought.) TIMTOWTDI, YMMV, etc., etc., and so forth.

      -sauoq
      "My two cents aren't worth a dime.";
      
Re: Re: Is there a Limit on Matching .*
by svsingh (Priest) on Jul 15, 2003 at 14:43 UTC
    That did it. Thank you everyone! Also, thanks for the tip on the input separator. The HTML files are being generated by RoboHELP and the tags are consistently lowercase. The insensitive match is more of a habit than anything else. I think I should take that out of this script for efficiency.
