PerlMonks  

count the maximum no.of occurence

by sarvan (Sexton)
on Jun 30, 2011 at 04:39 UTC (#912086=perlquestion)
sarvan has asked for the wisdom of the Perl Monks concerning the following question:

Hi there,

I have a program that compares elements of an array against a sentence and prints each match it finds.

I am not comparing single elements but three consecutive elements at a time.

$str1='It is a guide to action which ensures that the military always obey the commands of the party.';
chomp($str1);
$str2='It is a guide to action that ensures that the military will forever heed Party commands is a guide.';
chomp($str2);
@arr1=split(/\s+/, $str1);
$n=0;
for($i=0; $i<$#arr1;$i++) {
    $t1="$arr1[$i] $arr1[$i+1] $arr1[$i+2]";
    if($str2=~/$t1/) {
        print "$t1\12";
        $n++;
    }
}
print "\n No of matching is : $n";

The output is:

It is a
is a guide
a guide to
guide to action
ensures that the
that the military

 No of matching is : 6
What I want is this: when three elements from $str1, e.g. "is a guide", are compared with $str2, the program should find two matches, but it currently reports only one. I also want to count the number of matches for each combination, i.e. in this case countOF(is a guide) should equal 2. Any suggestions?

Re: count the maximum no.of occurence
by jwkrahn (Monsignor) on Jun 30, 2011 at 05:34 UTC

    You need something like this:

    use warnings;
    use strict;

    my $str1 = q/It is a guide to action which ensures that the military always obey the commands of the party./;
    my $str2 = q/It is a guide to action that ensures that the military will forever heed Party commands is a guide./;

    my $n = 0;
    while ( $str1 =~ /(?=(\S+\s+\S+\s+\S+))/g ) {
        my $t1 = $1;
        while ( $str2 =~ /($t1)/g ) {
            print $1;
            $n++;
        }
    }
    print "No of matching is : $n";
      But the output of the above code looks like this:

      It is at is ais a guideis a guides a guides a guidea guide toguide to actionuide to actionide to actionde to actione to actionensures that thensures that thesures that theures that theres that thees that thes that thethat the militaryhat the militaryat the militaryt the military

      This is not the expected output. What I expected is:

      It is a
      is a guide
      is a guide
      a guide to
      guide to action
      ensures that the
      that the military

      I also want the match count for each phrase, i.e. in this case "is a guide" is matched two times, so its count is 2. Any suggestions?

        Sorry, you need to add newlines, like so:

        use warnings;
        use strict;

        my $str1 = q/It is a guide to action which ensures that the military always obey the commands of the party./;
        my $str2 = q/It is a guide to action that ensures that the military will forever heed Party commands is a guide./;

        my $n = 0;
        while ( $str1 =~ /(?=(\S+\s+\S+\s+\S+))/g ) {
            my $t1 = $1;
            while ( $str2 =~ /($t1)/g ) {
                print "$1\n";
                $n++;
            }
        }
        print "No of matching is : $n\n";
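A note for later readers: the output quoted earlier in this subthread also contains fragments such as "t is a" and "s a guide", because a bare look-ahead is tried at every character position, not only at word starts. The following is only a sketch of one possible way to get word-aligned phrases with a per-phrase count. The helper name count_trigrams is hypothetical (it does not come from the thread), and the choices to anchor with (?<!\S), to quote the phrase with \Q...\E, and to allow trailing punctuation with (?!\w) are assumptions about what the OP wants.

```perl
use strict;
use warnings;

my $str1 = 'It is a guide to action which ensures that the military always obey the commands of the party.';
my $str2 = 'It is a guide to action that ensures that the military will forever heed Party commands is a guide.';

# count_trigrams: hypothetical helper, not part of the original thread.
# Returns the phrases found (in order of first appearance) and their counts.
sub count_trigrams {
    my ( $source, $target ) = @_;
    my ( @order, %count, %seen );

    # (?<!\S) only lets a candidate start at the beginning of a word;
    # the look-ahead captures three words without consuming them,
    # so consecutive windows overlap.
    while ( $source =~ /(?<!\S)(?=(\S+\s+\S+\s+\S+))/g ) {
        my $phrase = $1;
        next if $seen{$phrase}++;    # count each distinct phrase once

        # \Q...\E quotes regex metacharacters (e.g. the "." in "party.").
        # (?!\w) lets "is a guide" also match before the final ".".
        my $hits = () = $target =~ /(?<!\S)\Q$phrase\E(?!\w)/g;
        next unless $hits;

        push @order, $phrase;
        $count{$phrase} = $hits;
    }
    return ( \@order, \%count );
}

my ( $order, $count ) = count_trigrams( $str1, $str2 );
print "$_ => $count->{$_}\n" for @$order;
```

With the two strings from the question this should list six phrases, with "is a guide" counted twice and the others once.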
Re: count the maximum no.of occurence
by wind (Priest) on Jun 30, 2011 at 06:12 UTC

      I am reading perlretut, and the /g modifier and the zero-width look-ahead assertion (?=) are what I am reading about now... This is a good example for me.

      use warnings;
      use strict;

      my $str1 = q/It is a guide to action which ensures that the military always obey the commands of the party./;
      my $str2 = q/It is a guide to action that ensures that the military will forever heed Party commands is a guide./;

      my $n=0;
      while ( $str1 =~ /(?=(\S+\s+\S+\s+\S+))/g ) {
      #while ( $str1 =~ /(\S+\s+\S+\s+\S+)/g ) {
          my $t1 = $1;
          my $line="";
          while ( $str2 =~ /($t1)/g ) {
              $line .= $line eq '' ? '' : ',';
              $line.="str=$t1, $1($-[0])";
              $n++;
          }
          print "$line\n" if $str2 =~/$t1/;
      }
      print "No of matching is : $n\n";
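To make the difference between the two while lines above easier to see, here is a small sketch on a toy string (the string 'a b c d e' and the variable names are made up for illustration). It contrasts a consuming /g match with a look-ahead /g match and records $-[0], the start offset of the last successful match. The (?<!\S) anchor is an addition of this sketch, used so that windows start only at word beginnings.

```perl
use strict;
use warnings;

my $text = 'a b c d e';

# Consuming match: each match eats three words, so the next attempt
# starts after them and the windows cannot overlap.
my @consuming = $text =~ /(\S+\s+\S+\s+\S+)/g;

# Look-ahead match: the capture sits inside (?=...), which consumes
# nothing, so the next attempt starts one position later and the
# windows overlap.
my @overlapping = $text =~ /(?<!\S)(?=(\S+\s+\S+\s+\S+))/g;

# $-[0] holds the start offset of the last successful match.
my @offsets;
while ( $text =~ /(?<!\S)(?=\S+\s+\S+\s+\S+)/g ) {
    push @offsets, $-[0];
}

print "consuming:   ", join( ' | ', @consuming ),   "\n";   # a b c
print "overlapping: ", join( ' | ', @overlapping ), "\n";   # a b c | b c d | c d e
print "offsets:     @offsets\n";                            # 0 2 4
```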
Re: count the maximum no.of occurence
by sundialsvc4 (Abbot) on Jun 30, 2011 at 12:05 UTC

    For “reasonable amounts” of data, hashrefs are a good way to count things and, to that end, Perl has the very nice voodoo that says that you can simply do $$myhash{$mykey}++ “and It Just Works,™” whether a hash entry at $mykey previously existed or not. (If it did, it is incremented ... if not, it automagically appears with a value of 1.) A very nice example of Perl’s tendencies toward “DWIM = Do What I Mean.™” It is a very pragmatic language, built and enhanced by people who had a (paying) job to do and who wanted a rugged language with which to do it ... far removed from the hallways of academia. (You probably won’t write your doctoral thesis about Perl, but you are very likely to complete your doctoral thesis using it.)
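Applied to the question in this thread, the hashref idea might look roughly like the sketch below. The tokenising with /(\w+)/g (which drops punctuation such as the final ".") and the variable names are assumptions of this sketch, not something taken from the posts above.

```perl
use strict;
use warnings;

my $str2 = 'It is a guide to action that ensures that the military will forever heed Party commands is a guide.';

# Split into words; \w+ drops punctuation, so "guide." becomes "guide".
my @words = $str2 =~ /(\w+)/g;

my $count = {};    # hashref of phrase => number of occurrences
for my $i ( 0 .. $#words - 2 ) {
    my $phrase = join ' ', @words[ $i .. $i + 2 ];

    # No need to test for existence first: a missing entry springs
    # into being and ends up as 1. Same as $$count{$phrase}++.
    $count->{$phrase}++;
}

# Most frequent phrases first, ties in alphabetical order.
for my $phrase ( sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count ) {
    print "$phrase: $count->{$phrase}\n";
}
```

Here "is a guide" should come out with a count of 2 and every other three-word phrase with 1.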

    If on the other hand you are dealing with very large amounts of data, the hands-down way to do it, IMHO, is with an external disk sort, followed by logic that simply compares “this” record to “the preceding record, if any.”   Yes, it is precisely the algorithm that was used with all those punched-cards and, later, reel-to-reel magnetic tapes.   Disk sorts are unexpectedly-fast, and I have watched sorts of (a couple...) billions of records be completed in an acceptable amount of time on a relatively modest machine.   The sort time is more-or-less O(logn), and the subsequent processing of course is pure-linear, O(n).   When data volumes reach the point where virtual-memory paging becomes a significant consideration (and speeds “hit the wall” because of it ...) when using hashes, this strategy remains graceful.
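The "sort, then compare each record with the preceding one" idea can be sketched as follows. This is only an illustration: Perl's in-memory sort stands in for the external disk sort, and the @records data and variable names are invented for the example.

```perl
use strict;
use warnings;

# Made-up sample records; in the scenario described above these would
# come from an externally sorted file instead of an in-memory list.
my @records = qw(pear apple fig apple pear apple);

my @counts;          # list of [ record, count ] pairs, in sorted order
my ( $prev, $run );  # previous record and length of the current run

for my $rec ( sort @records ) {
    if ( defined $prev && $rec eq $prev ) {
        $run++;      # same as the preceding record: extend the run
    }
    else {
        # A different record: the previous run (if any) is complete.
        push @counts, [ $prev, $run ] if defined $prev;
        ( $prev, $run ) = ( $rec, 1 );
    }
}
push @counts, [ $prev, $run ] if defined $prev;   # flush the last run

print "$_->[0]: $_->[1]\n" for @counts;
# apple: 3
# fig: 1
# pear: 2
```

Only the current record and the previous one need to be held at any time during the counting pass, which is why this approach does not depend on a hash fitting in memory.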

      In what way does this answer the OP's question?
        Maybe sundialsvc4 is testing out a semi-random reply generating AI program?

        Elda Taluta; Sarks Sark; Ark Arks
        My deviantART gallery

      The sort time is more-or-less O(logn),

      No, you moron. Sorting is, at best, O(N log N).


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Node Type: perlquestion [id://912086]
Approved by Corion