PerlMonks  

compare columns in all possible combination

by Anonymous Monk
on Sep 20, 2014 at 13:21 UTC ( [id://1101382]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi! I am trying to compare columns in all possible combinations. I have two files: one with all the possible combinations and the other with the data.

file2

    1 2 3 4
    1 2 3 5
    1 2 4 5
    1 3 4 5
    2 3 4 5

file1

    A B C D E
    0 0 0 + 0
    + 0 + + +
    0 + + + +
    0 0 + + 0
    + + + + +
    0 + + + +
    + 0 + + 0

So far I have tried this, but it seems that only the first line of file2 reaches the while loop:

    open (BC,"file2.txt")||die("cannot open");
    open (AB,"file1.txt")||die("cannot open");
    @file=<BC>;
    chomp(@file);
    foreach $fl(@file)
    {
        if($fl=~/(.*?)\s+(.*?)\s+(.*?)\s+(.*)/)
        {
            $w=$1-1;
            $x=$2-1;
            $y=$3-1;
            $z=$4-1;
        }
        while(<AB>)
        {
            @data=split("\t",$_);
            chomp(@data);
            push(@col1,$data[$w]);
            push(@col2,$data[$x]);
            push(@col3,$data[$y]);
            push(@col4,$data[$z]);
        }
    }
    for($i=1;$i<@col1;$i++)
    {
        if(($col1[$i] eq '+') && ($col2[$i] eq '+') && ($col3[$i] eq '+'))
        {
            $j++;
        }
        if(($col1[$i] eq '+') && ($col2[$i] eq '+') && ($col4[$i] eq '+'))
        {
            $k++;
        }
        if(($col3[$i] eq '+') && ($col2[$i] eq '+') && ($col4[$i] eq '+'))
        {
            $l++;
        }
        if(($col3[$i] eq '+') && ($col2[$i] eq '+') && ($col4[$i] eq '+')&&($col1[$i] eq '+') )
        {
            $m++;
        }
        print $col1[0],"\t",$col2[0],"\t",$col3[0],"\t\t",$j,"\n";
        print $col1[0],"\t",$col2[0],"\t",$col4[0],"\t\t",$k,"\n";
        print $col4[0],"\t",$col2[0],"\t",$col3[0],"\t\t",$l,"\n";
        print $col1[0],"\t",$col2[0],"\t",$col3[0],"\t",$col4[0],"\t",$m,"\n";
    }

desired output

    A B C 1
    A B D 1
    D B C 3
    A B C D 1
    A B C 1
    A B E 1
    E B C 3
    A B C E 1
    A B D 1
    A B E 1
    E B D 3
    A B D E 1
    A C D 3
    A C E 2
    E C D 4
    A C D E 2
    B C D 3
    B C E 3
    E C D 4
    B C D E 3

thank you

Replies are listed 'Best First'.
Re: compare columns in all possible combination
by roboticus (Chancellor) on Sep 20, 2014 at 13:29 UTC

    You're putting a while loop to read a file inside of a for loop that reads another file. However, you've not told perl to restart the file at the beginning just before your while loop. So the first time through your for loop, you read one line of the first file and the entire second file. At the next iteration of your for loop, you read the second line of the first file, and *nothing* from the second file because you've already read it all.

    The simplest (and least efficient) fix would be to close the second file after your while loop, and open it just before the while loop.
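
A minimal sketch of the effect described above. It assumes an in-memory filehandle as a stand-in for file1.txt, and it rewinds with Perl's built-in `seek` rather than the close-and-reopen approach suggested in the reply; `lines_per_pass` is a hypothetical helper used only for this illustration.

```perl
use strict;
use warnings;

# Stand-in for file1.txt (an assumption) so the sketch runs on its own.
my $data = "A\tB\n0\t+\n+\t+\n";
open(my $ab, '<', \$data) or die "cannot open: $!";

# Hypothetical helper: count the lines seen on each pass over the handle,
# rewinding before a pass only when $rewind is true.
sub lines_per_pass {
    my ($fh, $passes, $rewind) = @_;
    my @counts;
    for my $pass (1 .. $passes) {
        seek($fh, 0, 0) if $rewind;    # jump back to the start of the file
        my $n = 0;
        while (my $line = <$fh>) {
            $n++;
        }
        push @counts, $n;
    }
    return @counts;
}

my @without = lines_per_pass($ab, 2, 0);   # second pass finds nothing left
seek($ab, 0, 0);                           # reset before the next experiment
my @with = lines_per_pass($ab, 2, 1);      # every pass sees the whole file

print "without rewind: @without\n";        # without rewind: 3 0
print "with rewind:    @with\n";           # with rewind:    3 3
```

Either way the file is re-read once per line of the other file, which is why this is the less efficient fix.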

    A better fix would be to read the second file into an array first, and then in your for loop you can simply loop over the array containing the second file.
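
The read-once approach could look roughly like the sketch below. The tiny in-memory inputs and the names `@rows`, `@idx`, and `@report` are assumptions for illustration, not taken from the original script.

```perl
use strict;
use warnings;

# Stand-ins for file2.txt (combinations) and file1.txt (data); both are
# assumptions so the sketch runs without files on disk.
my $combos = "1 2\n1 3\n2 3\n";
my $table  = "A\tB\tC\n+\t0\t+\n+\t+\t+\n";

open(my $bc, '<', \$combos) or die "cannot open: $!";
open(my $ab, '<', \$table)  or die "cannot open: $!";

# Read the data file exactly once: one array ref of fields per line.
chomp(my @lines = <$ab>);
my @rows = map { [ split /\t/ ] } @lines;
close $ab;

my @report;
while (my $combo = <$bc>) {
    my @idx = map { $_ - 1 } split ' ', $combo;   # 1-based -> 0-based
    # The inner loop walks the array, so it starts from the top every time.
    for my $row (@rows) {
        push @report, join(' ', @{$row}[@idx]);
    }
}
close $bc;

print "$_\n" for @report;
```

Because `@rows` lives in memory, every combination line sees all of the data lines, including the header row.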

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: compare columns in all possible combination
by LanX (Saint) on Sep 20, 2014 at 13:35 UTC
    I don't understand your task description, and sorry, your code is too ugly to be easily understood.

    But it's certainly not a good idea to read <AB> again and again in a loop that processes the lines from <BC> (i.e. @file).

    So put the <AB> stuff into an initialization part populating @coll just once, that is before looping.
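
One way to read that advice is to build the column arrays in a single pass before any other loop runs. The sketch below assumes a small in-memory table in place of file1.txt and uses one array of columns, `@col`, instead of the separate `@col1` .. `@col4` from the question; the column pairs at the end are arbitrary examples.

```perl
use strict;
use warnings;

# Stand-in for file1.txt (an assumption): header line plus tab-separated rows.
my $table = "A\tB\tC\n0\t+\t+\n+\t+\t0\n+\t+\t+\n";
open(my $ab, '<', \$table) or die "cannot open: $!";

# Initialization: fill one array per column, once, before any other loop.
my @col;    # $col[$c] holds every value of column $c, header first
while (my $line = <$ab>) {
    chomp $line;
    my @fields = split /\t/, $line;
    push @{ $col[$_] }, $fields[$_] for 0 .. $#fields;
}
close $ab;

# Any later loop can now pick columns by index without touching the file.
for my $pair ([0, 1], [1, 2]) {
    my ($x, $y) = @$pair;
    # Count data rows (index 1 and up) where both columns hold '+'.
    my $both = grep { $col[$x][$_] eq '+' && $col[$y][$_] eq '+' }
               1 .. $#{ $col[$x] };
    print "$col[$x][0] $col[$y][0]\t$both\n";
}
```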

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

Re: compare columns in all possible combination
by Lennotoecom (Pilgrim) on Sep 20, 2014 at 19:08 UTC
    Hello mate,
    please describe, in plain steps,
    exactly how the algorithm that produces the desired output works.

    update
    check if this is what you want
        use Inline::Files;
        push @b, [split ' '] for <FILEB>;
        while (@a = split ' ', <FILEA>){
            --$_ for @a;
            print "@{$b[0]}[@a]\n";
            for $i (1.. $#b) {
                undef @h{@{$b[$i]}[@a]};
                $s++ if scalar keys %h == 1;
                undef %h;
            }
            print "$s\n" if $s;
            undef $s;
        }
        __FILEA__
        1 2 3 4
        1 2 3 5
        1 2 4 5
        1 3 4 5
        2 3 4 5
        __FILEB__
        A B C D E
        0 0 0 + 0
        + 0 + + +
        0 + + + +
        0 0 + + 0
        + + + + +
        0 + + + +
        + 0 + + 0

      Hello! Sorry for the late reply.

      The algorithm includes the following steps:

      1. Access each line of file2.

      2. Pass these elements to the while loop (to access the respective columns).

      3. Compare those columns in the for loop.

        No, I meant:
        how does every line like 1 2 3 4
        relate to file1?
        Do we take only columns 1 2 3 4 from file1?
        A B C D? What is the logic we're trying to achieve at the end?
        To collect each column's pluses and zeroes in different arrays?
        Maybe it is an XY problem we're facing here.
        At this point I got this:
        we take the first line 1 2 3 4, decrement every element,
        then in a loop we push 4 elements into four different arrays according to these values.
        At the end we get four arrays that look like this:
        A0+000+0A0+000+0A0+000+0A0+000+0A0+000+0.
        So what's the end goal?

Re: compare columns in all possible combination
by Cristoforo (Curate) on Sep 22, 2014 at 04:02 UTC
    Maybe this solution will do. It does not rely on file2 for the combinations, however; those are generated with Algorithm::Combinatorics instead.
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Algorithm::Combinatorics 'subsets';

        my @headers = split ' ', scalar <DATA>;
        my @data = map [split], <DATA>;

        my @subsets;
        for my $k (3, 4) {
            push @subsets, subsets([0 .. $#headers], $k);
        }

        for my $indices (@subsets) {
            my $count = 0;
            for my $aref (@data) {
                ++$count unless grep $_ eq '0', @$aref[ @$indices ];
            }
            print "@headers[ @$indices ]\t$count\n";
        }

        __DATA__
        A B C D E
        0 0 0 + 0
        + 0 + + +
        0 + + + +
        0 0 + + 0
        + + + + +
        0 + + + +
        + 0 + + 0
    The output is
        A B C 1
        A B D 1
        A B E 1
        A C D 3
        A C E 2
        A D E 2
        B C D 3
        B C E 3
        B D E 3
        C D E 4
        A B C D 1
        A B C E 1
        A B D E 1
        A C D E 2
        B C D E 3
      Thank you very much. Your code is working perfectly fine on my dataset.
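
For readers who would rather keep file2 as the source of the combinations, here is a sketch that uses only core Perl. It assumes in-memory copies of the two sample files from the question, and `count_all_plus` is a hypothetical helper name. It prints only the count for each four-column line of file2, not the three-column subsets listed in the desired output.

```perl
use strict;
use warnings;

# Sample data copied from the question; in-memory handles stand in for the files.
my $file2 = <<'END';
1 2 3 4
1 2 3 5
1 2 4 5
1 3 4 5
2 3 4 5
END
my $file1 = <<'END';
A B C D E
0 0 0 + 0
+ 0 + + +
0 + + + +
0 0 + + 0
+ + + + +
0 + + + +
+ 0 + + 0
END

# Read the data once: the header line first, then every data row.
open(my $ab, '<', \$file1) or die "cannot open: $!";
my @headers = split ' ', scalar <$ab>;
my @rows    = map { [ split ' ' ] } <$ab>;
close $ab;

# Hypothetical helper: number of rows holding '+' in every listed column.
# Column indices are 0-based.
sub count_all_plus {
    my ($rows, @idx) = @_;
    my $count = 0;
    for my $row (@$rows) {
        my @not_plus = grep { $_ ne '+' } @{$row}[@idx];
        $count++ unless @not_plus;
    }
    return $count;
}

my %count;
open(my $bc, '<', \$file2) or die "cannot open: $!";
while (my $line = <$bc>) {
    my @idx   = map { $_ - 1 } split ' ', $line;   # 1-based -> 0-based
    my $label = "@headers[@idx]";
    $count{$label} = count_all_plus(\@rows, @idx);
    print "$label\t$count{$label}\n";
}
close $bc;
```

On the sample data the counts agree with the four-column lines of the output above (for example, A C D E gives 2 and B C D E gives 3).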
