in reply to Re^5: Find Prefix if regex didn't match in thread Find Prefix if regex didn't match
In my case I see that cutting works...
May be some more explanation is necessary: I'm reading (non-blocking) to a socket where a process running on another machine sends logging stuff to me. Sometimes I get many single characters sometimes I get large blocks. I don't know when I will receive the next package. A user can give me a regular expression I have to watch for and a timeout value. If I can match I return immediately if not I return after given timeout with an error.
Using expressions like "ABC" works quite fast and don't cause any problems with the timeout but not expressions like "AB.*Z". They only work as long as I get a few characters wihtin a few packages but not with thousands of them.
If I check the received string length and cut it after e.g. it gets larger than 20 characters I have no timeout problems any more and eth. works fine.
Re^7: Find Prefix if regex didn't match
by Anonymous Monk on Oct 31, 2012 at 14:36 UTC
|
Show code and prove it, its easy
#!/usr/bin/perl --
use Benchmark qw/ cmpthese /;
print "$]\n";
cmpthese(
-3,
{
circumcised => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABC';
pos($what) = 8;
$what =~ /AB.*Z/gc; # fail another
return;
},
uncut => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABC';
$what =~ /AB.*Z/gc; # fail another
return;
},
},
);
print "\n\n";
cmpthese(
-3,
{
circumcised => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABCZ';
pos($what) = 8;
$what =~ /AB.*Z/gc; # match one
return;
},
uncut => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABCZ';
$what =~ /AB.*Z/gc; # match one
return;
},
},
);
print "\n\n";
cmpthese(
-3,
{
circumcised => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABCZ';
substr $what, 0, 8, '';
$what =~ /AB.*Z/gc; # match one
return;
},
uncut => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/gc; # fail one
$what .='DBBBABCZ';
$what =~ /AB.*Z/gc; # match one
return;
},
},
);
print "\n\n";
cmpthese(
-3,
{
circumcised => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/; # fail one
$what .='DBBBABCZ';
substr $what, 0, 8, '';
$what =~ /AB.*Z/; # match one
return;
},
uncut => sub {
my $what = 'WWWA';
$what =~ /AB.*Z/; # fail one
$what .='DBBBABCZ';
$what =~ /AB.*Z/; # match one
return;
},
},
);
print "\n\n";
__END__
5.014001
Rate circumcised uncut
circumcised 370001/s -- -70%
uncut 1245010/s 236% --
Rate circumcised uncut
circumcised 301014/s -- -41%
uncut 506153/s 68% --
Rate circumcised uncut
circumcised 398024/s -- -19%
uncut 492750/s 24% --
Rate circumcised uncut
circumcised 554597/s -- -26%
uncut 749156/s 35% --
| [reply] [d/l] [select] |
|
Sorry, I haven't found any time spending on this topic the last days...
Your benchmarks are not exactly what I thought about, what I have in my mind is:
#!/usr/bin/perl --
use Benchmark qw/ cmpthese /;
print "$]\n";
my $what1 = 'WWWA';
my $what2 = $what1;
cmpthese(
-3,
{
circumcised => sub {
$what1 =~ /EB.*Z/; # fail one
$what1 .='DBBBABC';
substr $what1, 0, 8, '';
$what1 =~ /EB.*Z/; # fail another
return;
},
uncut => sub {
$what2 =~ /EB.*Z/; # fail one
$what2 .='DBBBABC';
$what2 =~ /EB.*Z/; # fail another
return;
},
},
);
print "\n\n";
__END__
5.010001
Rate uncut circumcised
uncut 2735/s -- -100%
circumcised 3398892/s 124179% --
| [reply] [d/l] |
|
Your benchmarks are not exactly what I thought about, what I have in my mind is: I see :) #!/usr/bin/perl --
use Benchmark qw/ cmpthese /;
print "$]\n";
my $what1 = my $what2 = my $orig = 'LMNOPQRSTUVWYXZWWWA';
my $pos1 = 0;
cmpthese(
-3,
{
circumcised => sub {
$what1 =~ /AB.*Z/gc; # fail one
$what1 .='DBBBABC';
$pos1+=8;
pos($what1) = $pos1;
$what1 =~ /AB.*Z/gc; # fail another
return;
},
uncut => sub {
$what2 =~ /AB.*Z/gc; # fail one
$what2 .='DBBBABC';
$what2 =~ /AB.*Z/gc; # fail another
return;
},
},
);
print "length circumcised(@{[length $what1 ]}) uncut(@{[ length $what2
+ ]})\n\n";
$what1 = $what2 = $orig;
$pos1 = 0;
cmpthese(
-3,
{
circumcised => sub {
$what1 =~ /AB.*Z/gc; # fail one
$what1 .='DBBBABCZ';
$pos1+=8;
pos($what1) = $pos1;
$what1 =~ /AB.*Z/gc; # match one
return;
},
uncut => sub {
$what2 =~ /AB.*Z/gc; # fail one
$what2 .='DBBBABCZ';
$what2 =~ /AB.*Z/gc; # match one
return;
},
},
);
print "length circumcised(@{[length $what1 ]}) uncut(@{[ length $what2
+ ]})\n\n";
$what1 = $what2 = $orig;
cmpthese(
-3,
{
circumcised => sub {
$what1 =~ /AB.*Z/gc; # fail one
$what1 .='DBBBABCZ';
substr $what1, 0, 8, '';
$what1 =~ /AB.*Z/gc; # match one
return;
},
uncut => sub {
$what2 =~ /AB.*Z/gc; # fail one
$what2 .='DBBBABCZ';
$what2 =~ /AB.*Z/gc; # match one
return;
},
},
);
print "length circumcised(@{[length $what1 ]}) uncut(@{[ length $what2
+ ]})\n\n";
$what1 = $what2 = $orig;
cmpthese(
-3,
{
circumcised => sub {
$what1 =~ /AB.*Z/; # fail one
$what1 .='DBBBABCZ';
substr $what1, 0, 8, '';
$what1 =~ /AB.*Z/; # match one
return;
},
uncut => sub {
$what2 =~ /AB.*Z/; # fail one
$what2 .='DBBBABCZ';
$what2 =~ /AB.*Z/; # match one
return;
},
},
);
print "length circumcised(@{[length $what1 ]}) uncut(@{[ length $what2
+ ]})\n\n";
__END__
5.014001
Rate uncut circumcised
uncut 2305/s -- -100%
circumcised 517638/s 22360% --
length circumcised(13175874) uncut(219812)
Rate uncut circumcised
uncut 2843/s -- -99%
circumcised 361434/s 12614% --
length circumcised(10770163) uncut(396371)
Rate uncut circumcised
uncut 2855/s -- -100%
circumcised 579256/s 20188% --
length circumcised(19) uncut(395315)
Rate uncut circumcised
uncut 2176/s -- -100%
circumcised 435567/s 19915% --
length circumcised(19) uncut(235675)
But, still, neither results seems very important , network is always much slower than both
| [reply] [d/l] |
|
|
|