PerlMonks
Re: Regex simplification
by Arien (Pilgrim) on Aug 26, 2002 at 08:27 UTC ( [id://192797]=note )
Extracting the lines that match from an array of lines using the Perl function grep (as opposed to the program) is no more complicated than this:

    my @matches = grep /PATTERN/, @lines;

Now, since you will be extracting the usernames from these matches as well, you might as well do that while matching, as explained by Popcorn Dave.

Don't use "dot star" (.*) in your regex (although some regexes above do), because it will cause unnecessary backtracking. Dot matches anything but a newline by default, and the star indicates "zero or more of the preceding". So, when the regex engine gets to "dot star" while trying to match a line, it first matches all the way to the end of the line; after that it gives back characters one at a time until the rest of the pattern can match. Things get worse when "dot star" makes more appearances in the regex.

As far as the regex goes, it seems from your code that this will do just fine:

    /<!-- USER \d+ - (\S+) -->/i

That is, match <!-- USER followed by a space, one or more digits, a space, a minus, a space, one or more non-whitespace characters (captured as the username), a space, and finally -->. All this case-insensitively.

Although non-backtracking subpatterns admittedly will help you somewhat in making your code faster, I would not use them if they're not really needed: they would just obscure what is happening.

Putting it all together, you would end up with something like this:
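A minimal sketch of such a combined match-and-extract loop, assuming the input is already in @lines. The sample lines and usernames below are made up for illustration, and the post's own listing may well have looked different:

```perl
use strict;
use warnings;

# Hypothetical sample input; real data would come from a file or filehandle.
my @lines = (
    '<!-- USER 1 - alice -->',
    'some unrelated line',
    '<!-- user 42 - bob -->',
);

my @users;
for my $line (@lines) {
    # In list context a successful match returns its captures,
    # so $user is set (and the condition is true) only on a match.
    if ( my ($user) = $line =~ /<!-- USER \d+ - (\S+) -->/i ) {
        push @users, $user;
    }
}

print "$_\n" for @users;    # prints alice, then bob
```

Capturing into a lexical rather than reading $1 afterwards is only one way to write this; it avoids picking up a stale $1 if the match is later moved or refactored.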
You may see people doing the same thing like this:

    my @users = map { /<!-- USER \d+ - (\S+) -->/i ? $1 : () } @lines;

What is happening here is that for each element of @lines you check whether the line matches your regex. If so, you add the value of $1 (the username) to the list of @users; if not, you add an empty list (i.e. nothing) to @users. This might come in handy when reading other people's code.

Hope this helps.

-- Arien

Edit: Also, if you know that what you are looking for can only appear at the start of the line, you can speed things up by anchoring your regex (using ^) like this:

    /^<!-- USER \d+ - (\S+) -->/i
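To see what the anchor changes in practice, here is a small sketch that runs the map form with and without ^ over the same input. The sample lines are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample input, made up for illustration.
my @lines = (
    '<!-- USER 7 - carol -->',
    'text before <!-- USER 8 - dave -->',
    'no marker here',
);

# Unanchored: the comment may appear anywhere in the line.
my @anywhere = map { /<!-- USER \d+ - (\S+) -->/i ? $1 : () } @lines;

# Anchored with ^: only lines that start with the comment match.
my @at_start = map { /^<!-- USER \d+ - (\S+) -->/i ? $1 : () } @lines;

print "anywhere: @anywhere\n";    # anywhere: carol dave
print "at start: @at_start\n";    # at start: carol
```

Note that the anchored version is not just faster but also stricter: it skips the second line, so it is only safe when the comment really does always start the line.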
In Section
Seekers of Perl Wisdom