PerlMonks  

Re: Population of HoAoA based on file contents

by iangibson (Scribe)
on May 14, 2012 at 19:29 UTC


in reply to Population of HoAoA based on file contents

Okay, I think I've made some progress, but I'm still not quite there yet. Here's what I now have:

#!/usr/bin/perl
use warnings;
use strict;
use v5.14;
use Getopt::Long;
use Bio::PopGen::IO;
use Bio::PopGen::Statistics;

die "need two arguments (i.e. chr cont) at invocation" unless @ARGV == 2;

chomp( my $chr_num = shift );
chomp( my $cont = shift );

open my $out_file, ">", "chr${chr_num}_exome_snps_processed_${cont}_STATS"
    or die "Can't open output file: $!\n";
open my $in_file, "<", "chr${chr_num}_exome_snps_processed_$cont"
    or die "Can't open input file: $!\n";

my %data;
my @snp_bins;
my @individuals;
my @all_snps;

while (<$in_file>) {
    chomp;
    if (/^SAMPLE/) {
        my ( $placeholder, @coords ) = split /,/;
        foreach my $coord (@coords) {
            push @snp_bins, int( $coord / 100_000 );
        }
    }
    else {
        my ( $id, @snps ) = split /,/;
        push @individuals, $id;
        push @all_snps[$. - 2], join(',', @snps);
    }
}

foreach my $individual (@individuals) {
    foreach my $index ( 0 .. $#snp_bins ) {
        push( @{ $data{$individual}[ $snp_bins[$index] ] }, $all_snps[$index] );
    }
}

close $in_file;

But there's still (at least) a problem with the line

push @all_snps[$. - 2], join(',', @snps);

I hope I'm otherwise headed in the right direction..?
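As far as I can tell, that line is rejected because `@all_snps[$. - 2]` is a one-element array slice, while `push` wants a real array (or a dereferenced array reference) as its first argument. Here is a minimal sketch of two ways to store one row per individual. The sub names and sample values are made up for illustration and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: store one individual's SNPs as a comma-joined string.
sub store_joined {
    my ( $all_snps, $row, @snps ) = @_;
    $all_snps->[$row] = join ',', @snps;    # plain element assignment, no push
    return;
}

# Hypothetical helper: keep the SNPs as a nested array instead.
sub store_nested {
    my ( $all_snps, $row, @snps ) = @_;
    push @{ $all_snps->[$row] }, @snps;     # autovivifies the inner array
    return;
}

my ( @joined, @nested );
store_joined( \@joined, 0, qw(A G T) );
store_nested( \@nested, 0, qw(A G T) );

print "$joined[0]\n";       # A,G,T
print "$nested[0][1]\n";    # G
```

The nested form keeps each SNP addressable by column index, which is what the later binning loop needs. The joined form would have to be split again first.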

In regard to what I will do with undefined bins: I will iterate through all the bins, and any that don't have a minimum number of elements simply won't be passed as data to the bioperl popgen stats methods, later on in the program.
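A rough sketch of that filtering step, assuming each individual's entry in %data is an array of bins where unpopulated bins are undef. The sub name `usable_bins`, the threshold, and the sample data are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the indices of bins holding at least
# $min_snps entries. Undefined (never populated) bins are skipped.
sub usable_bins {
    my ( $bins, $min_snps ) = @_;
    return grep { defined $bins->[$_] && @{ $bins->[$_] } >= $min_snps } 0 .. $#{$bins};
}

# Made-up data: bin 1 was never filled, bin 2 is too small.
my %data = ( sample1 => [ [qw(A G T)], undef, [qw(C)], [qw(T T A G)] ] );
my $min_snps = 3;    # assumed threshold

for my $id ( sort keys %data ) {
    my @keep = usable_bins( $data{$id}, $min_snps );
    print "$id: bins @keep\n";    # sample1: bins 0 3
}
```

Only the bin indices returned here would then be handed on to the popgen statistics code.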

Re^2: Population of HoAoA based on file contents
by state-o-dis-array (Hermit) on May 15, 2012 at 14:46 UTC
    It seems to me that this would do what you are trying to accomplish.
    while (<$in_file>) {
        chomp;
        if (/^SAMPLE/) {
            my ( $placeholder, @coords ) = split /,/;
            foreach my $coord (@coords) {
                push @snp_bins, int( $coord / 100_000 );
            }
        }
        else {
            my ( $id, @snps ) = split /,/;
            #need to check here that $#snps == $#snp_bins ?
            foreach my $index ( 0 .. $#snp_bins ) {
                push( @{ $data{$id}[ $snp_bins[$index] ] }, $snps[$index] );
            }
        }
    }
    Unless you need them elsewhere, I don't see a reason in your code snippet to store @individuals and @all_snps. The "need to check" comment above is based on my attempt to understand your last statement above. If it's possible that @snps might not have an entry for each @snp_bins, then you'll want this check.
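One way that check might look, as a sketch only. The sub name `add_row` and the sample values are hypothetical, and skipping a mismatched row (rather than dying) is an assumption about what you'd want.

```perl
use strict;
use warnings;

# Hypothetical helper: add one individual's SNPs to %$data, binned by
# @$snp_bins. Returns false (and stores nothing) when the row length
# does not match the header.
sub add_row {
    my ( $data, $snp_bins, $id, @snps ) = @_;
    return 0 unless @snps == @$snp_bins;
    for my $index ( 0 .. $#snps ) {
        push @{ $data->{$id}[ $snp_bins->[$index] ] }, $snps[$index];
    }
    return 1;
}

my %data;
my @snp_bins = ( 0, 0, 2 );    # made-up bins for three coordinates

add_row( \%data, \@snp_bins, 'id1', qw(A G T) )
    or warn "id1: column count mismatch\n";
add_row( \%data, \@snp_bins, 'id2', qw(A G) )
    or warn "id2: column count mismatch\n";    # this row is skipped

print join( ',', @{ $data{id1}[0] } ), "\n";    # A,G
```

Comparing the arrays in numeric context (`@snps == @$snp_bins`) is equivalent to comparing `$#snps` and `$#snp_bins`.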
