Re: Regex anchor speed ?: Is /^.*X/ faster than /X/ ?

in reply to Regex anchor speed ?: Is /^.*X/ faster than /X/ ?

Actually, you aren't matching e-mail addresses. You are simply checking for the existence of the @ symbol. If you know the only lines that contain an @ symbol are e-mail addresses, you're fine.
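To make that distinction concrete, here is a minimal sketch. The sample lines are made up for illustration; only the first one is an e-mail address, yet the pattern accepts two of them.

```perl
use strict;
use warnings;

# Hypothetical sample lines; only the first is an e-mail address.
my @lines = (
    'ovid@example.com',
    'meet me @ noon',
    'no at-sign here',
);

# /\@/ is true for any line that contains an @, address or not.
my @hits = grep { /\@/ } @lines;

print "$_\n" for @hits;
```

If your data can contain lines like the second one, an existence check alone is not enough.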

Fastolfe is correct that a straight match will be faster. The /^.*\@/ is forced to match to the end of the string (or to a newline) and then backtrack to the @ symbol. This is extra overhead and will slow it down.

If you use minimal matching with /^.*?\@/, the regex consumes one character at a time (anything except a newline) and, after each one, checks whether the next character is the @ symbol. Again, you have extra overhead.

A simple scan (/\@/) just looks for the @ symbol and returns true if found. That is the fastest way to scan for the character. See Death to Dot Star! if you wish to understand more about the dot metacharacter in regexes.
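Rather than take anyone's word for it, you can measure. The following is a rough sketch using the core Benchmark module; the test string and the labels plain, greedy, and nongreedy are my own assumptions, and the numbers will depend on your perl version and on what your real data looks like.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a long line with the @ near the end.
my $line = ('x' x 200) . '@example.com';

# The three patterns discussed in this thread.
my %candidates = (
    plain     => sub { $line =~ /\@/ },
    greedy    => sub { $line =~ /^.*\@/ },
    nongreedy => sub { $line =~ /^.*?\@/ },
);

# A negative count means "run each for at least that many CPU seconds".
cmpthese(-1, \%candidates);
```

Try it with short lines, long lines, and lines that contain no @ at all before drawing conclusions.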

Replies are listed 'Best First'.
RE: Answer: Regex anchor speed ?: Is /^.*X/ faster than /X/ ?
by Dominus (Parson) on Nov 12, 2000 at 22:25 UTC

Ovid said:

> The /^.*\@/ is forced to match to
> the end of the string (or to a newline) and then backtrack to the @ symbol.

But that is not exactly true. In general, Perl does behave that way. But in some cases, such as this one, there is an optimization: Perl sees that the string can't match unless it contains a @ character, so it looks for the @ first, and works outwards from there. In particular, it does not let .* match all the way to the end of the string and then backtrack it; it gets the right length for .* on the first try.

Isn't that interesting?

The nongreedy .*? version is optimized similarly, so I would be surprised if it performed any differently in this example.
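If you want to see what the regex engine does with this pattern, one option is the re pragma's debug mode. This is only a sketch: the input string is made up, the output goes to STDERR, and its exact wording differs between perl versions, so treat whatever it prints as a hint rather than a specification. What to look for is any sign that the engine identifies the literal @ as something the string must contain before it runs the match proper.

```perl
use strict;
use warnings;

my $line = 'some text ovid@example.com';   # hypothetical input
my $matched;

{
    # Lexically scoped: only regexes compiled and run inside this
    # block produce debugging output on STDERR.
    use re 'debug';
    $matched = $line =~ /^.*\@/;
}

print $matched ? "matched\n" : "no match\n";
```

Swapping in /^.*?\@/ or /\@/ and comparing the output is a quick way to check whether the patterns are treated differently on your perl.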

