PerlMonks  

Re^4: Find Prefix if regex didn't match

by demoralizer (Acolyte)
on Oct 31, 2012 at 14:05 UTC (#1001669)


in reply to Re^3: Find Prefix if regex didn't match
in thread Find Prefix if regex didn't match

Ah, now I get your idea. Not bad, but it's too simplified ;)

For example, extracting a \w+ prefix from the expression doesn't work with something like this:
my $search="(AB)+.*Z";
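To make the limitation concrete, here is a minimal sketch. The helper name naive_prefix is hypothetical and only stands in for the "\w+ prefix" idea from the parent post; it is not code from this thread. It returns 'ABC' for "ABC.*Z" but an empty string for "(AB)+.*Z", because that pattern starts with a parenthesis rather than a word character.

```perl
use strict;
use warnings;

# Hypothetical helper: treat the leading run of word characters
# in the pattern source as the literal "prefix".
sub naive_prefix {
    my ($pattern) = @_;
    return $pattern =~ /\A(\w+)/ ? $1 : '';
}

for my $search ('ABC.*Z', '(AB)+.*Z') {
    printf "%-10s => '%s'\n", $search, naive_prefix($search);
}
```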

The problem is that the search string is given, so I have no influence on it. Maybe you were confused by my ".*ABC" example, but what I meant there was that in such a case there is no unmatchable prefix that can be cut away.


Re^5: Find Prefix if regex didn't match
by Anonymous Monk on Oct 31, 2012 at 14:11 UTC
    forget about cutting, cutting will never speed anything up
      In my case I see that cutting works...

      Maybe some more explanation is necessary: I'm reading (non-blocking) from a socket to which a process running on another machine sends logging output. Sometimes I get many single characters, sometimes I get large blocks. I don't know when I will receive the next packet. A user can give me a regular expression to watch for and a timeout value. If I can match, I return immediately; if not, I return after the given timeout with an error.

      Expressions like "ABC" match quite fast and don't cause any problems with the timeout, but expressions like "AB.*Z" do. They only work as long as I get a few characters within a few packets, not with thousands of them.

      If I check the length of the received string and cut it once it gets larger than, e.g., 20 characters, I have no timeout problems any more and everything works fine.
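The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper feed, the constant $MAX_KEEP, and the choice to keep the last 20 characters are hypothetical, since the post does not say which part of the buffer is cut. Socket handling and the timeout are left out, and the chunks are simulated with a list.

```perl
use strict;
use warnings;

my $MAX_KEEP = 20;    # assumed cap, taken from the "20 characters" above

# Hypothetical helper: append a chunk, try the pattern, and trim the
# buffer to its last $MAX_KEEP characters if nothing matched.
# Returns 1 on a match, 0 otherwise.
sub feed {
    my ($buf_ref, $chunk, $re) = @_;
    $$buf_ref .= $chunk;
    return 1 if $$buf_ref =~ $re;
    substr($$buf_ref, 0, length($$buf_ref) - $MAX_KEEP, '')
        if length($$buf_ref) > $MAX_KEEP;
    return 0;
}

# Feed a list of simulated chunks; stop at the first match.
sub watch {
    my ($re, @chunks) = @_;
    my $buf = '';
    for my $chunk (@chunks) {
        return 1 if feed(\$buf, $chunk, $re);
    }
    return 0;
}

my $re = qr/AB.*Z/;
print watch($re, 'W' x 50, 'A', 'B', 'x' x 5, 'Z') ? "matched\n" : "no match\n";
print watch($re, 'AB', 'x' x 30, 'Z')              ? "matched\n" : "no match\n";
```

The second call shows the risk of a blind cut: the "AB" is trimmed away before the "Z" arrives, so a match that exists in the full stream is missed. That is why the thread asks for the unmatchable prefix rather than a fixed length.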
        Show code and prove it, it's easy
        #!/usr/bin/perl --
        use Benchmark qw/ cmpthese /;
        print "$]\n";

        cmpthese(
            -3,
            {
                circumcised => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABC';
                    pos($what) = 8;
                    $what =~ /AB.*Z/gc;    # fail another
                    return;
                },
                uncut => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABC';
                    $what =~ /AB.*Z/gc;    # fail another
                    return;
                },
            },
        );
        print "\n\n";

        cmpthese(
            -3,
            {
                circumcised => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABCZ';
                    pos($what) = 8;
                    $what =~ /AB.*Z/gc;    # match one
                    return;
                },
                uncut => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABCZ';
                    $what =~ /AB.*Z/gc;    # match one
                    return;
                },
            },
        );
        print "\n\n";

        cmpthese(
            -3,
            {
                circumcised => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABCZ';
                    substr $what, 0, 8, '';
                    $what =~ /AB.*Z/gc;    # match one
                    return;
                },
                uncut => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/gc;    # fail one
                    $what .= 'DBBBABCZ';
                    $what =~ /AB.*Z/gc;    # match one
                    return;
                },
            },
        );
        print "\n\n";

        cmpthese(
            -3,
            {
                circumcised => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/;    # fail one
                    $what .= 'DBBBABCZ';
                    substr $what, 0, 8, '';
                    $what =~ /AB.*Z/;    # match one
                    return;
                },
                uncut => sub {
                    my $what = 'WWWA';
                    $what =~ /AB.*Z/;    # fail one
                    $what .= 'DBBBABCZ';
                    $what =~ /AB.*Z/;    # match one
                    return;
                },
            },
        );
        print "\n\n";
        __END__
        5.014001
                         Rate circumcised       uncut
        circumcised  370001/s          --        -70%
        uncut       1245010/s        236%          --

                         Rate circumcised       uncut
        circumcised  301014/s          --        -41%
        uncut        506153/s         68%          --

                         Rate circumcised       uncut
        circumcised  398024/s          --        -19%
        uncut        492750/s         24%          --

                         Rate circumcised       uncut
        circumcised  554597/s          --        -26%
        uncut        749156/s         35%          --
