Re: Search for account number in a file name

For that specific task, I'd probably use index rather than a regex.

if (index($filename, $account) >= 0) {
    print $filename."\n";
}
else {
    print "No match for $account\n";
}

One consideration may be where $account appears in $filename.

If speed is important, you can use the Benchmark module to compare the alternatives.
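As a rough sketch of what such a comparison might look like (the filename below is made up for illustration, and the regex candidate is just one possible pattern, not necessarily the one you'd end up using):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data, not taken from the original question.
my $account  = '7766541';
my $filename = '0007766541_statement.txt';

# Two candidates that should give the same answer.
# \Q...\E makes the regex treat $account as a literal string, like index does.
my %candidates = (
    index => sub { return index($filename, $account) >= 0 },
    regex => sub { return $filename =~ /\Q$account\E/ },
);

# Timing code that gives wrong answers is pointless, so check first.
for my $name (sort keys %candidates) {
    die "$name failed to match\n" unless $candidates{$name}->();
}

# A negative count tells Benchmark to run each sub for at least
# that many CPU seconds, then print a comparison table.
cmpthese(-1, \%candidates);
```

The numbers will vary with your perl version, hardware and data, so run it against filenames that are representative of your real input.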

By the way, the code you posted doesn't compile: you possibly meant \b instead of /b (which gives the syntax error: Unknown regexp modifier "/b" ...); however, fixing that gives: No match for 7766541.
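A likely reason for the failed match, assuming the account number is preceded by other digits in the filename (the name below is hypothetical, not the one from your post): \b only matches between a word character and a non-word character, and digits and underscores are both word characters.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical filename; the one from the original post is not reproduced here.
my $account  = '7766541';
my $filename = '0007766541_report.txt';

# '0' sits before the '7' and '_' sits after the '1'. Both are word
# characters, so there is no word boundary on either side of $account.
my $bounded   = $filename =~ /\b$account\b/ ? 1 : 0;
my $unbounded = $filename =~ /$account/     ? 1 : 0;

print "with \\b: $bounded, without \\b: $unbounded\n";
# prints: with \b: 0, without \b: 1
```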

-- Ken

Re^2: Search for account number in a file name by Anonymous Monk on Jun 21, 2013 at 02:19 UTC
Would "index" be faster than using a regular expression? Yes, I meant "\b \b".
Re^3: Search for account number in a file name by kcott (Archbishop) on Jun 21, 2013 at 03:32 UTC
> Would "index" be faster than using a regular expression?

That would depend on a number of factors. I provided the Benchmark link so that you could determine this for yourself.

Before worrying too much about speed, ask yourself how important that is. If you can process all your data in 100ms, how much effort are you prepared to put in to get it to run in, say, half that time; and would anyone notice the difference? What are you doing with the results? Sending them to a terminal, a file, a database, a printer, across a network: all of these will probably take much longer than any processing occurring in the CPU.

If you're just looking for a function that searches for one string inside another, that's what index does and what I'd probably choose for that task; if patterns are involved, that's what a regexp engine does and, in that case, that's what I'd probably use.

If you do decide to optimise, you need to start with a regex that works correctly and consistently. toolic has provided code (in Re: Search for account number in a file name) that returns a correct result for your single example based on `$account` appearing anywhere within `$filename`. You didn't specify anything beyond this; however, I raised the issue that its position might be meaningful.

The regexp engine will typically find an anchored pattern faster than an unanchored one. If you know that `$filename` will always start with zero or more `0`s immediately followed by `$account`, then you can write `/^0*$account/` which would probably be faster than `/$account/`; if you know it will always start with exactly three zeros, then `/^000$account/` may be faster still.

Similarly, you'll need to look at the code you're using with index. If you have information regarding the position of `$account`, then maybe, instead of `index($filename, $account) >= 0`, you'd want `index($filename, $account) >= 3` or `index($filename, $account) == 3` or something else.
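To make the position-aware variants concrete, here's a small sketch. The filename is hypothetical (three leading zeros followed by the account number), and `$account` is assumed to contain digits only; if it could contain regex metacharacters, you'd want `\Q$account\E` in the patterns.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical data: three leading zeros, then the account number.
my $account  = '7766541';
my $filename = '0007766541_report.txt';

# Regex variants, from least to most specific.
my $anywhere    = $filename =~ /$account/     ? 1 : 0;
my $after_zeros = $filename =~ /^0*$account/  ? 1 : 0;
my $after_three = $filename =~ /^000$account/ ? 1 : 0;

# index returns the zero-based offset of the first occurrence, or -1.
my $pos = index($filename, $account);

print "regex: anywhere=$anywhere after_zeros=$after_zeros after_three=$after_three\n";
print "index: pos=$pos found=", ($pos >= 0 ? 1 : 0),
      " at_or_after_3=", ($pos >= 3 ? 1 : 0),
      " exactly_3=",     ($pos == 3 ? 1 : 0), "\n";
# prints:
# regex: anywhere=1 after_zeros=1 after_three=1
# index: pos=3 found=1 at_or_after_3=1 exactly_3=1
```

Which test is the right one depends entirely on what you know about your filenames; all of them pass here only because the sample name happens to satisfy every condition.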
Perhaps you'd use the optional third argument: `index($filename, $account, $position)`.

When you have two (or more) pieces of code that are working correctly, then you can compare them. That's when you'd use Benchmark.

-- Ken