PerlMonks  

Exact pattern match inside index()

by jayarkay (Initiate)
on Oct 15, 2012 at 20:08 UTC (#999166, perlquestion)
jayarkay has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am trying to use the index() function and I want to find the position of a word inside a string, only when it is an exact match. For example:

My string is STRING="CATALOG SCATTER CAT CATHARSIS"

And my search string is KEY=CAT

I want to say something like index($STRING, $KEY) and get the position of CAT, and not CATALOG. How do I accomplish this? The documentation says "The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match", which makes me think that it may not be that straightforward, but my Perl skills are limited :). Is it possible to do what I am trying to do?

Hopefully, I was able to articulate my question well. Thanks in advance for your help!

Re: Exact pattern match inside index()
by Anonymous Monk on Oct 15, 2012 at 20:20 UTC
    You want to match \bCAT\b with a regular expression. From perldoc perlre:
    A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W". (Within character classes "\b" represents backspace rather than a word boundary, just as it normally does in any double‐quoted string.)

    \w Match a "word" character (alphanumeric plus "_")
    \W Match a non-"word" character
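As a rough illustration of the word-boundary rule quoted above, the sketch below filters a list of words through \bCAT\b. The helper name words_with_key and the extra test words CAT-5 and CAT_5 are assumptions added for the example; they are not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the whitespace-separated words of $string
# in which $key appears with a word boundary on both sides.
sub words_with_key {
    my ($string, $key) = @_;
    # \Q...\E quotes any regex metacharacters in $key.
    return grep { /\b\Q$key\E\b/ } split ' ', $string;
}

my @hits = words_with_key('CATALOG SCATTER CAT CATHARSIS CAT-5 CAT_5', 'CAT');
print "@hits\n";    # prints "CAT CAT-5"
```

CAT-5 matches because "-" is a \W character, so a boundary exists after the T. CAT_5 does not match because "_" counts as a \w character.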
Re: Exact pattern match inside index()
by toolic (Chancellor) on Oct 15, 2012 at 20:21 UTC
    You could use a regex with pos:
    use warnings;
    use strict;

    my $s = 'CATALOG SCATTER CAT CATHARSIS';
    my $k = 'CAT';
    if ($s =~ /\b$k\b/g) {
        print "match started at position ", (pos($s) - length($k)), "\n";
    }

    __END__
    match started at position 16

    UPDATE: kennethk's @- is much cleaner.

Re: Exact pattern match inside index()
by kennethk (Monsignor) on Oct 15, 2012 at 20:27 UTC
    index does a check for literal characters. By specifying you want CAT but not CATALOG, you are adding criteria that cannot be expressed using literal characters only. You could get closer by using $KEY = " CAT ", since your string is whitespace delimited, but this would fail if CAT is either your first or last entry.
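A minimal sketch of the space-padding idea and where it breaks down. The helper name padded_index and the two shorter test strings are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: search for $key surrounded by single spaces,
# using only index().
sub padded_index {
    my ($string, $key) = @_;
    return index($string, " $key ");
}

print padded_index('CATALOG SCATTER CAT CATHARSIS', 'CAT'), "\n";  # 15
print padded_index('CAT CATALOG', 'CAT'), "\n";                    # -1
print padded_index('CATALOG CAT', 'CAT'), "\n";                    # -1
```

Note that the successful case returns 15, the offset of the leading space, not 16 where CAT itself starts, so the caller would also have to add one. The two -1 results show the failure at the start and end of the string.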

    The only way to implement this robustly would be to use a tool that can check for non-literal character criteria, e.g. regular expressions. For example, you could use the word anchor \b to specify you want word boundaries on either side of CAT -- see Using character classes. You can then use $-[0] (see @-) to learn the position of that match. So you might say

    my $string = "CATALOG SCATTER CAT CATHARSIS";
    my $key = 'CAT';
    # Only read $-[0] after a successful match.
    if ($string =~ /\b\Q$key\E\b/) {
        print "$key found at $-[0]\n";
    }
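If an index()-like interface is wanted, the same idea could be wrapped in a small function. This is only a sketch: the names word_index and word_indexes are hypothetical, and returning -1 on failure is an assumption chosen to mirror what index() does.

```perl
use strict;
use warnings;

# Hypothetical helper: like index(), but only matches $key as a whole word.
# Returns the 0-based offset of the first match, or -1 if there is none.
sub word_index {
    my ($string, $key) = @_;
    return $string =~ /\b\Q$key\E\b/ ? $-[0] : -1;
}

# Hypothetical helper: offsets of every whole-word occurrence of $key.
sub word_indexes {
    my ($string, $key) = @_;
    my @offsets;
    # With /g in scalar context, each iteration resumes after the last match.
    while ($string =~ /\b\Q$key\E\b/g) {
        push @offsets, $-[0];
    }
    return @offsets;
}

print word_index('CATALOG SCATTER CAT CATHARSIS', 'CAT'), "\n";   # 16
print word_index('CATALOG SCATTER', 'CAT'), "\n";                 # -1
print join(',', word_indexes('CAT CATALOG CAT', 'CAT')), "\n";    # 0,12
```

One caveat: \b only behaves as a "whole word" test when the key itself starts and ends with \w characters. A key such as "C++" would need a different delimiter check.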

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Exact pattern match inside index()
by jayarkay (Initiate) on Oct 17, 2012 at 15:48 UTC
    Thank you all for your responses. I apologize for the delay in responding. Both toolic's and kennethk's suggestions worked. And as toolic added in the update, kennethk's option of using $-[0] seemed cleaner. Thanks also for the references.
