PerlMonks  

Substring comparison

by schtinkfist893 (Novice)
on May 01, 2012 at 13:41 UTC (#968251=perlquestion)
schtinkfist893 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am fairly new to Perl and have been working on this problem for a couple of days now and am a little lost. I need to check whether a string in one list is, or is not, a substring of an element of another list.
For example:
list1[0] = '123'
list2[0] = '333 444 555 666'
list2[1] = '333 444 555 666 123'
Returns should be
list1[0] =~ list2[0] = false
list1[0] =~ list2[1] = true
and vice versa for !~
Using =~ and !~ with m/$string/i has not been helpful: it returns false for everything. I have tried to split on whitespace using split(/\s+/, $list2[$i]) but I get strange results. I just glanced over 2D arrays and thought that this might be an option. Any guidance would be appreciated! Thanks!

Re: Substring comparison
by Corion (Pope) on May 01, 2012 at 13:47 UTC

    For plain substring searching, maybe index is enough for you?

    As you don't show any code, it's hard to tell where you went wrong, but the regex approach should work as intended as long as the string being searched is on the left and the substring is in the pattern: $list2[0] =~ /\Q$list1[0]\E/.
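A minimal sketch of the regex form, using the sample data from the question. The helper name matches_literal is hypothetical and not from the thread. It assumes a plain literal substring test is wanted, so the needle is wrapped in \Q...\E to keep any regex metacharacters from being interpreted.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $needle occurs literally inside $haystack.
sub matches_literal {
    my ($haystack, $needle) = @_;
    # The string being searched goes on the left of =~; the needle goes in the pattern.
    # \Q...\E quotes metacharacters so the needle is matched as plain text.
    return $haystack =~ /\Q$needle\E/ ? 1 : 0;
}

my @list1 = ('123');
my @list2 = ('333 444 555 666', '333 444 555 666 123');

for my $haystack (@list2) {
    my $verdict = matches_literal($haystack, $list1[0]) ? 'true' : 'false';
    print "'$list1[0]' in '$haystack': $verdict\n";
}
```

Swapping the two operands (needle on the left, long string in the pattern) gives false for both entries, which would be consistent with the "false for anything" symptom described in the question.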

Re: Substring comparison
by JavaFan (Canon) on May 01, 2012 at 13:47 UTC
    Use the index function. It does exactly this. See the manual page.
Re: Substring comparison
by dorko (Parson) on May 01, 2012 at 13:55 UTC
    Please go take a look at List::Compare. I think it will do everything you need and more.

    Cheers,

    Brent

    -- Yup, I'm a Delt.
Re: Substring comparison
by schtinkfist893 (Novice) on May 01, 2012 at 15:12 UTC
    I am working with index(), but the way I am using it with arrays is not working:

    foreach (@Array1) {
        $string1 = $_;
        foreach (@Array2) {
            my $string2 = $_;
            my $result = index($string2, $string1);
            if ($result <= 0) {
                print $string1, " is not found in ", $string2, "\n";
            }
        }
    }

    My $Array1[0] (and so $string1) is 'FFF'
    My $Array2[0] (and so $string2) is 'FFF NNN JKK III LLL QQQ'
    However my return on the above is

    FFF is not found in FFF NNN JKK III LLL QQQ

    Clearly 'FFF' is part of string 'FFF NNN JKK III LLL QQQ' and should return a result >= 0

    This code works, however, when I am not using arrays:

    my $string = "FFF NNN JKK III LLL QQQ";
    my $substr = "LLL";
    my $result = index($string, $substr);
    if ($result > 0) {
        print "Result: $result\n";
    } else {
        print "not found";
    }

    Result: 16
    Is there something incorrect in how I am handling my arrays?
      Index returns 0 if the match is at the front of the string, i.e. string offset 0.

      perl -le 'print index("foobar","foo")'

      Clearly 'FFF' is part of string 'FFF NNN JKK III LLL QQQ' and should return a result >= 0
      Indeed, and it does. However, the negation of result >= 0 is not if($result <= 0), which is what you wrote in your program. There is a value (0) that's both >= 0 and <= 0.

      Use if ($result == -1) in your program.
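A minimal sketch of the point made in the replies above, using the strings from this thread. The helper name contains_substr is hypothetical. It wraps the comparison against -1 so that a match at offset 0 is not mistaken for a miss.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $needle occurs anywhere in $haystack.
sub contains_substr {
    my ($haystack, $needle) = @_;
    # index() returns the zero-based offset of the first match, or -1 if there is none.
    return index($haystack, $needle) != -1;
}

my $line = 'FFF NNN JKK III LLL QQQ';

print index($line, 'FFF'), "\n";   # 0: match at the very start of the string
print index($line, 'LLL'), "\n";   # 16: match further in
print index($line, 'BBB'), "\n";   # -1: no match

for my $needle (qw(FFF LLL BBB)) {
    printf "%s %s found in %s\n", $needle,
        (contains_substr($line, $needle) ? 'IS' : 'IS NOT'), $line;
}
```

Testing `$result <= 0` or `$result > 0` both misclassify the offset-0 case; only -1 means "not found".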

Re: Substring comparison
by schtinkfist893 (Novice) on May 01, 2012 at 15:29 UTC
    For some better context here are the Arrays:
    Array1
    XXX FFF ZZZ AAA BBB QQQ LLL JKK III CCC DDD DCD

    Array2
    XXX BBB AAA CCC DDD EEE
    FFF NNN JKK III LLL QQQ

    The output

    FFF is not found in FFF NNN JKK III LLL QQQ
    ZZZ is not found in FFF NNN JKK III LLL QQQ
    AAA is not found in FFF NNN JKK III LLL QQQ
    BBB is not found in FFF NNN JKK III LLL QQQ
    LLL is not found in FFF NNN JKK III LLL QQQ
    JKK is not found in FFF NNN JKK III LLL QQQ
    III is not found in FFF NNN JKK III LLL QQQ
    CCC is not found in FFF NNN JKK III LLL QQQ
    DDD is not found in FFF NNN JKK III LLL QQQ
    DCD is not found in FFF NNN JKK III LLL QQQ
Re: Substring comparison
by schtinkfist893 (Novice) on May 01, 2012 at 15:42 UTC
    A closer look with some debug output shows that my string QQQ hit and returned an index value:

    BBB and result: -1
    BBB is not found in FFF NNN JKK III LLL QQQ
    QQQ and result: 20
    LLL and result: -1
    LLL is not found in FFF NNN JKK III LLL QQQ
    JKK and result: -1
    JKK is not found in FFF NNN JKK III LLL QQQ

    We expect BBB to not return a result, however JKK and LLL should return.
      As trammell pointed out, index() returns 0 if the sought-for string is at the beginning. Try this:

      @Array1 = qw{XXX FFF ZZZ AAA BBB QQQ LLL JKK III CCC DDD DCD};
      @Array2 = ("XXX BBB AAA CCC DDD EEE", "FFF NNN JKK III LLL QQQ");
      foreach (@Array1) {
          $string1 = $_;
          foreach (@Array2) {
              my $string2 = $_;
              my $result = index($string2, $string1);
              print $string1, ($result >= 0 ? " IS" : " IS NOT"),
                  " found in ", $string2, "\n";
          }
      }

      Excellent! flaviodesousa's solution works for explicitly defined arrays:

      my @Array1 = qw{XXX FFF ZZZ AAA BBB QQQ LLL JKK III CCC DDD DCD};
      my @Array2 = ("XXX BBB AAA CCC DDD EEE", "FFF NNN JKK III LLL QQQ");

      XXX IS found in XXX BBB AAA CCC DDD EEE
      XXX IS NOT found in FFF NNN JKK III LLL QQQ
      FFF IS NOT found in XXX BBB AAA CCC DDD EEE
      FFF IS found in FFF NNN JKK III LLL QQQ

      However my arrays are being read in from a text file
      my $file1 = '/tmp/file1.txt';
      my @Array1;
      open FILE, $file1 or die $!;
      while(<FILE>) {
          @Array1 = <FILE>;
      }
      my $file2 = '/tmp/file2.txt';
      my @Array2;
      open FILE, $file2 or die $!;
      while(<FILE>) {
          @Array2 = <FILE>;
      }
      print @Array1;
      print @Array2;
      /tmp/file1.txt is as stated above
      XXX FFF ZZZ AAA BBB QQQ LLL JKK III CCC DDD DCD

      /tmp/file2.txt is
      XXX BBB AAA CCC DDD EEE
      FFF NNN JKK III LLL QQQ

      This outputs
      FFF ZZZ AAA BBB QQQ LLL JKK III CCC DDD DCD FFF NNN JKK III LLL QQQ

      So it looks like the way I am reading in the arrays is not as thorough as I expected it to be.
        while(<FILE>) { @Array1 = <FILE>; }

        That is not going to do quite what you'd hoped. The first time into the loop it will read the first line of the file into the $_ variable, which you never use, so that line will be lost. Then, inside the loop, you read the file handle in list context, as you have an array on the LHS. That will have the effect of reading the remaining lines of the file, one line per element, into the array; then the loop will exit on the next iteration as EOF has been reached. Your array will contain all lines but the first. As you've discovered in your subsequent post, you will have to chomp to get rid of line terminators. The correct way to populate your array in a loop would be to use push.

        my @arr;
        while ( <$fh> ) {
            chomp;
            push @arr, $_;
        }

        However, there is an easier way, as chomp will also operate on arrays:

        my @arr = <$fh>; chomp @arr;

        or even

        chomp( my @arr = <$fh> );

        I hope this is helpful.

        Cheers,

        JohnGG
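A self-contained sketch of the slurp-and-chomp idiom described above. The helper name read_lines is hypothetical, and the temporary file is only a stand-in for /tmp/file2.txt so that the example runs on its own. It also assumes a lexical filehandle and three-argument open, which the original posts do not use.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return every line of $path with its line ending removed.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    # List context reads all lines at once; chomp then strips the newline from each.
    chomp( my @lines = <$fh> );
    close $fh;
    return @lines;
}

# Stand-in for /tmp/file2.txt, holding the two lines shown earlier in the thread.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "XXX BBB AAA CCC DDD EEE\nFFF NNN JKK III LLL QQQ\n";
close $tmp_fh;

my @Array2 = read_lines($tmp_path);
print scalar(@Array2), " lines read\n";
print "[$_]\n" for @Array2;   # brackets make any stray whitespace visible
```

Because nothing reads from the handle before the list-context read, the first line is kept, unlike the while(<FILE>) { @Array1 = <FILE>; } version.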

Re: Substring comparison
by schtinkfist893 (Novice) on May 01, 2012 at 16:48 UTC
    FOUND IT!
    Rookie (which I am) mistake: I needed to chomp() the arrays, since the extra whitespace (the trailing newlines) was really messing with the string comparisons.
    Thanks to everyone's help and ideas!
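A small sketch of why the unchomped lines behaved this way. It assumes each element read from the files still ended in "\n", which is consistent with the debug output earlier in the thread, where only QQQ matched (at offset 20).

```perl
use strict;
use warnings;

# Lines as they come back from the filehandle, newline still attached (assumed).
my $haystack_raw = "FFF NNN JKK III LLL QQQ\n";
my $lll_raw      = "LLL\n";
my $qqq_raw      = "QQQ\n";

# "LLL\n" does not occur in the line: LLL is followed by a space there, not a newline.
print index($haystack_raw, $lll_raw), "\n";   # -1

# "QQQ\n" happens to match because QQQ is the last token before the newline.
print index($haystack_raw, $qqq_raw), "\n";   # 20

# After chomp, the comparison is between the visible text only.
chomp( my $haystack = $haystack_raw );
chomp( my $lll      = $lll_raw );
print index($haystack, $lll), "\n";           # 16
```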
