Re: partial match between 2 files

by Kenosis (Priest)
on Dec 05, 2012 at 05:53 UTC


in reply to partial match between 2 files

You said that you were interested in "...a partial character by character match between 2 files until a non matching character occurs..." By the desired output, it looks like you want a character-by-character match between two words. Here, I believe, is what you've provided:

first_file    second_file    output
~~~~~~~~~~    ~~~~~~~~~~~    ~~~~~~
amayaM     -> amayamAn    -> amaya+mAn
souraM     -> vismayamAn  -> soura+mA
kamalZ     -> souramA     ->
           -> kamalAn     ->

The output from the first pair of words makes sense, but I don't see a pattern between the words and the output after that. Please reformat your data using <code> tags and include enough to show the pattern. Also, please show the code that you have tried.


Re^2: partial match between 2 files
by lakssreedhar (Acolyte) on Dec 05, 2012 at 06:26 UTC

    The output should be:

    first_file    second_file    output
    ~~~~~~~~~~    ~~~~~~~~~~~    ~~~~~~
    amayaM     -> amayamAn    -> amaya+mAn
    vismayaM   -> vismayamAn  -> vismaya+mAn
    souraM     -> souramA     -> soura+mA
    kamalZ     -> kamalAn     -> kamal+An

    The code I wrote won't make any sense.

    #!/usr/bin/perl
    #read dictionary
    open(RE,"file1");
    while(<RE>)
    {
        chomp;
        my @tmp =split(/\,/,$_);
        $key="$tmp[0]";
        #print "$key\n ";
        my @words=split(//,$key);
    }
    close(RE);
    my $length1 = $#words;
    #check for a partial match
    open(RE1,"file2");
    while(<RE1>)
    {
        $inp_word4 = $_;
        my @inp_word1 =split(//,$inp_word4);
        #print "@inp_word1";
    }
    close(RE1)
    my $length2=$#inp_word1;
    if($length1<$length2)
    {
        compare the array elements in another loop
    }

      The following code shows one way to tackle this problem:

      #! perl
      use Modern::Perl;

      my $file1 = 'amayaM vismayaM souraM kamalZ';
      my $file2 = 'amayamAn vismayamAn souramA kamalAni';

      my %words1 = map { $_ => undef } split /\s+/, $file1;
      my @words2 = split /\s+/, $file2;

      for my $word2 (@words2)
      {
          for my $word1 (keys %words1)
          {
              my $stem = substr($word1, 0, -1);
              my $len  = length $stem;

              if (substr($word2, 0, $len) eq $stem)
              {
                  say $word1, ' -> ', $word2, ' -> ', $stem, '+', substr($word2, $len);
                  last;
              }
          }
      }

      Output:

      19:21 >perl 415_SoPW.pl
      amayaM -> amayamAn -> amaya+mAn
      vismayaM -> vismayamAn -> vismaya+mAn
      souraM -> souramA -> soura+mA
      kamalZ -> kamalAni -> kamal+Ani

      19:22 >

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      The exclusive-or operator (^) applied to two strings returns \x00 at each position where the characters match, and a non-zero byte where they differ. Thus, 'Perl' ^ 'Perl' returns "\x00\x00\x00\x00". Matching the returned string against [^\x00] finds where the strings differ. In your case, only the first difference is requested. Given this, consider the following, which uses your data:

      use warnings;
      use strict;

      open my $fh1, '<', 'first_file.txt'  or die $!;
      open my $fh2, '<', 'second_file.txt' or die $!;

      while ( my $s1 = <$fh1> ) {
          chomp $s1;
          chomp( my $s2 = <$fh2> );

          ( $s1 ^ $s2 ) =~ /[^\x00]/;
          substr( $s2, $-[0], 0 ) = '+' if defined $-[0];
          print $s2, "\n";
      }

      close $fh2;
      close $fh1;

      Output:

      amaya+mAn
      vismaya+mAn
      soura+mA
      kamal+An

      The variable $-[0] holds the offset at which the last successful match started, which here is the position of the first difference between the two strings. That offset is passed to substr to insert a + at that location.
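
As a self-contained illustration of the XOR technique described above, here is a minimal sketch that wraps it in two helpers. The names first_diff_offset and mark_first_diff are hypothetical and do not appear in the thread; the sketch also tests the match result directly instead of checking $-[0] afterwards, which is a small variation on the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the offset of the first
# differing character, or undef (in scalar context) when no difference exists.
sub first_diff_offset {
    my ( $s1, $s2 ) = @_;

    # String XOR yields "\x00" wherever the two strings agree. The shorter
    # operand is treated as padded with "\x00", so extra trailing characters
    # in the longer string count as a difference.
    return unless ( $s1 ^ $s2 ) =~ /[^\x00]/;

    my $offset = $-[0];    # start offset of the successful match
    return $offset;
}

# Hypothetical helper: returns a copy of $s2 with '+' inserted at the
# position where it first differs from $s1.
sub mark_first_diff {
    my ( $s1, $s2 ) = @_;
    my $pos = first_diff_offset( $s1, $s2 );
    substr( $s2, $pos, 0 ) = '+' if defined $pos;
    return $s2;
}

print mark_first_diff( 'amayaM', 'amayamAn' ), "\n";    # amaya+mAn
print mark_first_diff( 'kamalZ', 'kamalAn' ),  "\n";    # kamal+An
```

Testing the match in a condition means the offset is only read after a successful match, so a stale value can never be picked up.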

        The code is working fine for those 2 files, but if I add more words to file 1 that do not match any of the file 2 words, an error occurs:

        Use of uninitialized value $s2 in chomp at triedsplit.pl line 9, <$fh2> line 10.
        Use of uninitialized value $s2 in bitwise xor (^) at triedsplit.pl line 11, <$fh2> line 10.
        Use of uninitialized value $s2 in print at triedsplit.pl line 13, <$fh2> line 10
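
These warnings are what Perl emits when <$fh2> returns undef at end of file, which happens once the first file has more lines than the second, because the posted code pairs the files line by line. One possible guard, offered only as a sketch, is to stop when either handle runs out. The helper name mark_pairs and the in-memory sample data are assumptions for illustration; if the real goal is to look each word up among all words of the second file, the nested-loop approach earlier in the thread may be a better fit.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): reads two handles in lockstep
# and returns the lines of the second with '+' at the first difference.
# Stops as soon as either handle is exhausted.
sub mark_pairs {
    my ( $fh1, $fh2 ) = @_;
    my @out;

    while ( defined( my $s1 = <$fh1> ) ) {
        my $s2 = <$fh2>;
        last unless defined $s2;    # second file has no more lines

        chomp( $s1, $s2 );
        if ( ( $s1 ^ $s2 ) =~ /[^\x00]/ ) {
            substr( $s2, $-[0], 0 ) = '+';
        }
        push @out, $s2;
    }
    return @out;
}

# In-memory stand-ins for the two files (sample data is assumed).
my $first  = "amayaM\nkamalZ\nextraWord\n";
my $second = "amayamAn\nkamalAn\n";

open my $fh1, '<', \$first  or die $!;
open my $fh2, '<', \$second or die $!;

print "$_\n" for mark_pairs( $fh1, $fh2 );
```

With this data the unmatched third line of the first input is skipped silently rather than triggering the uninitialized-value warnings.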
