PerlMonks  


A little more detail on what you're doing with the code would be helpful. For example, are you just testing for existence, or are you extracting pieces of data? Do these regular expressions use metacharacters such as . * [ ] ?, or are they constant strings?

Each of these answers may help us help you optimise your code appropriately. For example, constant strings are generally faster to find with index than with a regular expression. But if you have thousands of them, and you use a regular expression optimizer of some sort from CPAN, you may be able to get a reasonable state machine for finding your data.
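As a rough sketch of the constant-string case, assuming all you need is to know whether any string from a list occurs in the text: index returns the offset of the first occurrence, or -1 if there is none, so the regex engine is never involved. The helper name first_literal_match and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first literal string found in $text,
# or undef if none occurs. index() returns -1 when there is no match.
sub first_literal_match {
    my ($text, @needles) = @_;
    for my $needle (@needles) {
        return $needle if index($text, $needle) >= 0;
    }
    return;
}

# Sample data. Every string is treated literally, so '.' and '*'
# have no special meaning here.
my @needles = ('foo.bar', 'a*b', 'needle');
my $hit = first_literal_match('hay needle hay', @needles);
print defined $hit ? "found: $hit\n" : "no match\n";
```

If you would rather make one pass with a single pattern, another option is to join the quotemeta'd strings with | and compile the result once with qr//. Perl 5.10 and later can turn an alternation of literal strings into a trie internally, though whether that beats a loop over index for your data is something to benchmark rather than assume.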

On the other hand, if you're trying to extract data (which I kind of doubt) and your regular expressions actually use regexp metacharacters, you're probably best off looping through the list:

    my @regexps = load_regexps();
    @regexps = map { qr/$_/ } @regexps;   # pre-compile 'em all.
    foreach my $re (@regexps) {
        if ($text =~ $re) {
            # do stuff based on match.
        }
    }

Here we precompile each one, and then try them one after another. The compiled regular expressions should execute a bit faster when the same list is run against many chunks of text: qr// compiles each pattern once, so it doesn't have to be recompiled every time a pattern string is interpolated into a match. Note that if you only check a single chunk of text, you won't save anything by pre-compiling the regular expressions.
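To make the reuse concrete, here is a minimal sketch that runs the same compiled list against several chunks and pulls out the captures from the first pattern that matches. The two sample patterns stand in for whatever load_regexps() would return, and first_captures is a hypothetical helper, not something from the original code. It assumes every pattern has at least one capture group.

```perl
use strict;
use warnings;

# Sample patterns standing in for the output of load_regexps().
my @sources = ('(\d{4})-(\d{2})-(\d{2})', 'id=(\w+)');
my @regexps = map { qr/$_/ } @sources;    # compile each pattern once

# Hypothetical helper: try each compiled pattern against one chunk and
# return the captures from the first pattern that matches.
# A match in list context returns the captured groups, or an empty
# list on failure.
sub first_captures {
    my ($text, @res) = @_;
    for my $re (@res) {
        my @captures = $text =~ $re;
        return @captures if @captures;
    }
    return;
}

# The compiled patterns are reused for every chunk.
for my $chunk ('date 2005-03-01', 'id=abc42', 'nothing here') {
    my @got = first_captures($chunk, @regexps);
    print "$chunk => ", (@got ? join(',', @got) : 'no match'), "\n";
}
```

The compile cost is paid once in the map, and the loop over chunks only pays for matching.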


In reply to Re: pattern matching with large regex by Tanktalus
in thread pattern matching with large regex by Anonymous Monk
