Thanks for your fast answer!
Some good approaches, but no hit...
pos() doesn't work here: if the string contains only a prefix of the given expression (which is exactly the case I'm interested in), I get undef instead of the position I need. Or am I overlooking something?
My "window" can grow as large as it likes, that doesn't matter, but I have to make sure that no match is missed; that's quite important! The goal is just to make the search faster: knowing that the first n characters can be thrown away because they can never be part of a match would do the job.
Any alternative suggestions?
"pos() doesn't work here: if the string contains only a prefix of the given expression (which is exactly the case I'm interested in), I get undef instead of the position I need. Or am I overlooking something?"
Once again, in English, please?
"Any alternative suggestions?"
Not really. Removing a prefix requires a regex match, and then you do the real matching. I doubt there are any savings to be had by matching twice, or by actually cutting the string, even with pos():
my $search="AB.*Z";
my $string="WWWA";
my ($search_prefix) = $search =~ /^(\w+)/;    # leading literal part of the pattern
warn $search_prefix;
my $prefix_offset = index ( $string, $search_prefix );    # -1 here: "AB" does not occur in "WWWA"
substr $string, 0, $prefix_offset , '';    # with -1 this strips everything but the last character
warn $string;
$string = "WWWADBBBABC";
$prefix_offset = index ( $string, $search_prefix );
warn $string;
substr $string, 0, $prefix_offset , '';
warn $string;
$string = "WWWADBBBABC";
pos( $string ) = $prefix_offset;
warn pos( $string );
## next match m//g starts at offset
__END__
AB at jank line 5.
A at jank line 8.
WWWADBBBABC at jank line 11.
ABC at jank line 13.
8 at jank line 16.
For some idea of why I think so, see perhaps "Why does global match run faster than none global?" and "Multiple Regex evaluations or one big one?".
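To illustrate the "next match m//g starts at offset" remark, here is a minimal sketch. The helper name match_from is made up for this example; it only relies on the documented behaviour that assigning to pos() sets the starting point of the next m//g on that string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): try $re against $string,
# starting the scan at $offset instead of at position 0.
sub match_from {
    my ($string, $re, $offset) = @_;
    pos($string) = $offset;              # next m//g begins scanning here
    if ($string =~ /$re/g) {
        # $-[0] is where the match starts, pos() is just past its end
        return ($-[0], pos($string));
    }
    return;                              # empty list: no match from $offset on
}

my ($start, $end) = match_from("WWWADBBBABC", qr/AB.*C/, 8);
print "match spans $start..$end\n" if defined $start;
```

Everything before the offset is simply never looked at, so nothing has to be cut out of the string.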
I'm trying to match only once. But to get faster, I try to cut the unmatchable stuff off the beginning of my search string, so that the next attempt doesn't need to start from the beginning again.
At the moment there are some doubts about whether cutting really makes things better, but that is what I see here.
Hello demoralizer and welcome to PerlMonks. May I ask you to post some (working) code here? Just post what you have already come up with, even if it is slow. At least we will get a better idea of the context, and then we can figure out a way to optimize it. Thanks!
There are no stupid questions, but there are a lot of inquisitive idiots.
Hi greengaroo,
showing working code will be a little bit hard because I'm listening to sockets... but here is something that should at least show what I'm trying to do:
use Time::HiRes qw(gettimeofday);    # assumption: gettimeofday() comes from Time::HiRes
# $term is the socket
# $text contains all received but not yet scanned text
# $scanned contains all scanned text
my $rec;
my $time = gettimeofday();
while (1)
{
    # something to read?
    if (read($term, $rec, 0xFFFF))
    {
        # collect what has been read
        $text .= $rec;
        # expression found?
        if ($text =~ s/(.*)($expect)//s)
        {
            $scanned .= $1;
            $scanned .= "MATCH";
            $scanned .= $2;
            return 0;
        }
        # shorten the string to speed things up: keep only the last 19 characters
        elsif (length($text) >= 20)
        {
            $scanned .= "CUTTED";
            $scanned .= substr($text, 0, length($text) - 20 + 1);
            $text = substr($text, length($text) - 20 + 1);
        }
    }
    # timeout?
    if (gettimeofday() - $time > $timeout)
    {
        $scanned .= "TIMEOUT";
        return 1;
    }
}
It seems that the "elsif (length($text) >= 20)" branch makes things faster, but it doesn't do exactly what I want, because this way I can lose possible matches :(
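One possible way to cut without losing matches, as a rough sketch only: it assumes every match has to begin with a fixed literal string (like the "AB" in "AB.*Z" from earlier in the thread), which will not hold for arbitrary expressions. The helper name safe_cut is made up for this example. Everything before the first occurrence of that literal can go; if the literal has not shown up yet, only the last length-minus-one characters could still be the start of it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split $text into a head that can
# never be part of a match and a tail that must be kept.
# Assumption: every match starts with the literal string $prefix.
sub safe_cut {
    my ($text, $prefix) = @_;
    my $at = index($text, $prefix);
    if ($at < 0) {
        # Literal not seen yet: only the last length($prefix) - 1 characters
        # can still turn into its beginning once more data arrives.
        my $keep = length($prefix) - 1;
        $keep = length($text) if $keep > length($text);
        $at = length($text) - $keep;
    }
    return (substr($text, 0, $at), substr($text, $at));
}

my ($head, $tail) = safe_cut("WWWADBBBABC", "AB");
print "discard '$head', keep '$tail'\n";
```

In the loop above, the head would be appended to $scanned and the tail would become the new $text, instead of cutting at a fixed length of 20.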