PerlMonks  

Dumb problem with dumper.

by oxydeepu (Novice)
on Nov 26, 2012 at 18:51 UTC

oxydeepu has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I am here with a new query. Hope to get a quick reply.
I am trying to count the occurrences of 10 hexamer words in a big file of FASTA sequences.
I want the number of occurrences of each hexamer at each position up to 5000, which is the length of the longest sequence.
The code I wrote is given below:

#!/usr/bin/perl
use Data::Dumper;

open $f,$ARGV[0];
while(<$f>)
{
    chomp;
    $id = $_;    #eat up the header line
    chomp($s = <$f>);
    if($s !~ /^>/)
    {
        push @seq,$s;
    }
}

open $fa, "top_10_hexamers.txt";
while(<$fa>)
{
    chomp;
    $hx{$_}++;    #hash of 10 hexamer
}
@words = sort keys %hx;

for($i = 1;$i < 5000;$i++)
{
    for($j = 0;$j <= $#words;$j++)
    {
        $result[$i][$j] = 0;    # array of 5000 columns of position and words as columns
    }
}

foreach $x(@seq)
{
    for($j = 0; $j < (length($x) - 6);$j++)
    {
        $wrd = substr $x,$j,6;    #getting the hexamer combinations
        foreach $w(@words)
        {
            if($w eq $wrd)    # comparing with the word
            {
                $result->{$j}->{$wrd}++;    # trying to get position, word and frequency
            }
        }
    }
    $wrd = "";
}
print Dumper $result;

It worked for me when my sequences were 50 bases long: I used to get a hash named $VAR1. But now I am working with sequences of 5000 bases, and it prints $VAR1 = undef;
Can anyone help me with this problem?
Thank you in advance,

Best regards,
Deepak

Replies are listed 'Best First'.
Re: Dumb problem with dumper.
by toolic (Bishop) on Nov 26, 2012 at 19:14 UTC

    Tip #2 from the Basic debugging checklist: start printing array and hash values as soon as you populate them:

    print Dumper(\@seq);
    open $fa, "top_10_hexamers.txt";
    while(<$fa>)
    {
        chomp;
        $hx{$_}++;    #hash of 10 hexamer
    }
    print Dumper(\%hx);

      Thank you guys.
      Sorry for spamming.
      I figured it out. It works fine.
      Thank you
      Deepak

Re: Dumb problem with dumper.
by aitap (Curate) on Nov 26, 2012 at 19:40 UTC

    I am trying to count the occurrences of 10 hexamer words in a big file of FASTA sequences.
    Did you try using BioPerl (http://bioperl.org)?

    for($i = 1;$i < 5000;$i++)
    {
        for($j = 0;$j <= $#words;$j++)
        {
            $result[$i][$j] = 0;    # array of 5000 columns of position and words as columns
        }
    }
    If you want to initialise your array for better handling of large data, look there: Re^3: how apply large memory with perl?. By the way, this initialisation is lost because you initialise the array @result but later work with the hash reference $result, which is a different variable. strict would never allow it.
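
    To illustrate the point about the lost initialisation, here is a minimal sketch using only core Perl and Data::Dumper. The 3x2 size is an arbitrary stand-in for the 5000-row array in the question; the point is only that @result and $result are separate variables, so filling one leaves the other untouched.

```perl
use strict;
use warnings;
use Data::Dumper;

# @result (an array) and $result (a scalar) are two unrelated variables.
my @result;
my $result;

# Fill the array, as the initialisation loop in the question does.
# The 3x2 size is an arbitrary stand-in for the real dimensions.
for my $i (0 .. 2) {
    for my $j (0 .. 1) {
        $result[$i][$j] = 0;
    }
}

# The scalar was never assigned, so Dumper reports it as undef.
print Dumper($result);

# Only the array holds the zeroes.
print scalar(@result), " rows in \@result\n";
```

    Under strict, each variable has to be declared with my, which makes it visible that there are two of them.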

    Sorry if my advice was wrong.
Re: Dumb problem with dumper.
by tobyink (Canon) on Nov 26, 2012 at 19:20 UTC

    You appear to be using the $result variable for two separate purposes; here as an array of arrays:

    $result[$i][$j]

    and here as a hash of hashes

    $result->{$j}->{$wrd}++

    Make up your mind! Short of some freaky overload magic, that's not likely to work.

    Meh. The first is, of course, @result. Either way, the rest of my advice still stands...

    I'd strongly recommend you use strict; near the top of your script. (In particular, it's the vars feature of strict that you want, but might as well enable the whole strict pragma.) This will force you to declare the variables you use; and thus encourage you to get them straight in your own head.
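
    As a rough sketch of what a strict-clean version of the counting loop might look like, assuming a single hash of hashes keyed by position and then by hexamer: the @seq and %wanted contents below are made-up stand-ins for the FASTA file and top_10_hexamers.txt, and the names %wanted and %count are my own, not from the original post. Note that the loop bound here includes the final window of each sequence, which the `<` comparison in the original skips.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-ins for the sequences read from the FASTA file
# and for the hexamers read from top_10_hexamers.txt.
my @seq    = ('ACGTACGTAC', 'TTACGTACGG');
my %wanted = map { $_ => 1 } qw(ACGTAC CGTACG);

# One structure, one purpose: $count{position}{hexamer} = occurrences.
my %count;

for my $s (@seq) {
    # The last valid start offset is length - 6; the range is empty
    # for sequences shorter than six bases.
    for my $pos (0 .. length($s) - 6) {
        my $word = substr $s, $pos, 6;
        # A hash lookup replaces the inner loop over all words.
        $count{$pos}{$word}++ if $wanted{$word};
    }
}

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper(\%count);
```

    Because positions are created only when a wanted hexamer is actually seen, no up-front initialisation of 5000 rows is needed.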

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
