PerlMonks  

Re: Find overlap

by jwkrahn (Monsignor)
on Oct 14, 2012 at 10:01 UTC ( #998945=note )


in reply to Find overlap

Here is one way to do it:

    #!/usr/bin/perl
    use warnings;
    use strict;

    @ARGV = ( '148Nsorted.bed', '162Nsorted.bed', '174Nsorted.bed', '175Nsorted.bed' );

    my %data;
    while ( <> ) {
        /^\s*(\S+)\s+(\d+)\s+(\d+)\s*$/ or next;
        $data{ $1 } |= '0' x ( $2 - 1 ) . '1' x ( $3 - ( $2 - 1 ) );
    }

    keys( %data ) == 1 or die "Error: too many keys.\n";

    my ( $name, $string ) = each %data;

    $string =~ /10+1/ and die "Error: no overlap.\n";
    $string =~ /^0*1/ and my $start = $+[ 0 ];
    $string =~ /.*1/ and my $end = $+[ 0 ];

    print "$name\t$start\t$end\n";


Re^2: Find overlap
by Anonymous Monk on Oct 14, 2012 at 16:32 UTC
    Hi, Thanks but this doesn't work for these files. I get an error "Too many keys"
Re^2: Find overlap
by linseyr (Acolyte) on Oct 14, 2012 at 16:46 UTC
    Sorry, but I'm pretty new to Perl, so I don't really know what this code does. Could you give me some explanation? And why do I get the error "Too many keys" when I try to run it? Thanks.

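      The code uses the bitwise OR operator (|) on strings to set every byte inside each start-end range to the character '1'; bytes between ranges stay '0'. A gap between two ranges therefore shows up as a run of '0' between two '1's, which is what the /10+1/ test looks for.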
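      To see the string OR in isolation, here is a minimal sketch. The helper name mark_range is made up for this illustration and is not part of the script above; it builds the same '0'/'1' string for one range and ORs it into an existing mask.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): OR a run of '1'
# characters for the 1-based, inclusive range [$start, $end] into the
# coverage string $mask.
sub mark_range {
    my ( $mask, $start, $end ) = @_;
    my $range = '0' x ( $start - 1 ) . '1' x ( $end - $start + 1 );
    # On two strings, | works byte by byte; the shorter string is
    # treated as if it were padded with NUL bytes.
    return $mask | $range;
}

my $mask = '';
$mask = mark_range( $mask, 3, 6 );
$mask = mark_range( $mask, 5, 9 );
print "$mask\n";    # prints 001111111
```

      Because '0' | '1' is '1' and '0' | '0' is '0', overlapping ranges simply merge into one run of '1's.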

      And why do I get the error: Too many keys when I try to run it?

      The keys of %data represent the first column of the data files ('chr1') so if you get that error message it means that there was something other than 'chr1' in one of the files.
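      One way to find out which names are involved is to count the first-column values directly. This is only a sketch: count_names is a hypothetical helper, and the sample lines are invented for the demonstration rather than taken from your files.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each first-column name occurs
# in a list of BED-style lines (name, start, end).
sub count_names {
    my @lines = @_;
    my %seen;
    for my $line ( @lines ) {
        # Same pattern as the script: name, start, end and nothing else.
        $line =~ /^\s*(\S+)\s+(\d+)\s+(\d+)\s*$/ or next;
        $seen{ $1 }++;
    }
    return %seen;
}

# Invented sample data, for illustration only.
my %seen = count_names( "chr1\t100\t200\n", "chr1\t150\t250\n", "chr2\t10\t20\n" );
print "$_: $seen{ $_ }\n" for sort keys %seen;
```

      Feeding it the lines of the real files (for example by collecting them from <> into an array first) would list every name that ends up as a key in %data.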

      After thinking about the problem, and rereading it, it seems that ALL files must overlap so this may work better:

      #!/usr/bin/perl
      use warnings;
      use strict;

      @ARGV = ( '148Nsorted.bed', '162Nsorted.bed', '174Nsorted.bed', '175Nsorted.bed' );

      my ( $bit_mask, %data ) = 1;
      while ( <> ) {
          /^\s*(\S+)\s+(\d+)\s+(\d+)\s*$/ or next;
          $data{ $1 } |= chr( 0 ) x ( $2 - 1 ) . chr( $bit_mask ) x ( $3 - ( $2 - 1 ) );
          $bit_mask <<= 1;
      }

      keys( %data ) == 1 or die "Error: too many keys: @{[ keys %data ]}\n";

      my ( $name, $string ) = each %data;

      $string =~ /\x0f/ or die "Error: no overlap on all files.";
      $string =~ /[^\0]/ and my ( $start, $end ) = ( $+[ 0 ], length $string );

      print "$name\t$start\t$end\n";
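      Note that the script shifts $bit_mask after every matching line, so the /\x0f/ test only fits four files with one range each, and /[^\0]/ finds the first position covered by any file rather than by all of them. Below is a hedged sketch of the same one-bit-per-file idea that moves to the next bit once per file and reports only the positions where every bit is set. The name common_regions and its input format (one list of [start, end] pairs per file) are assumptions made for this example, and one byte per position limits it to eight files.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one array ref of [start, end] pairs per
# file (1-based, inclusive, all on the same chromosome) and returns
# the [start, end] regions that are covered by every file.
sub common_regions {
    my @files = @_;
    die "at most 8 files\n" if @files > 8;    # one bit of a byte per file

    my $string = '';
    my $bit    = 1;
    for my $intervals ( @files ) {
        for my $iv ( @$intervals ) {
            my ( $start, $end ) = @$iv;
            $string |= chr( 0 ) x ( $start - 1 ) . chr( $bit ) x ( $end - $start + 1 );
        }
        $bit <<= 1;    # next file gets the next bit
    }

    my $all = chr( $bit - 1 );    # byte with every file's bit set
    my @regions;
    while ( $string =~ /\Q$all\E+/g ) {
        # @- and @+ hold 0-based offsets; convert to 1-based, inclusive.
        push @regions, [ $-[ 0 ] + 1, $+[ 0 ] ];
    }
    return @regions;
}

# Invented ranges, for illustration only.
my @regions = common_regions( [ [ 100, 200 ] ], [ [ 150, 250 ] ], [ [ 120, 180 ] ] );
print "chr1\t$_->[0]\t$_->[1]\n" for @regions;    # prints chr1, 150, 180
```

      With real data, each inner list would be filled from one .bed file, for example by starting a new list whenever eof is true inside the while ( <> ) loop.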
