PerlMonks  

Re: motif finding

by quester (Vicar)
on Jan 31, 2012 at 07:58 UTC (#950903)


in reply to motif finding

A more "Perl-ish" way of doing the same thing...

    use strict;
    use warnings;
    use Term::ANSIColor;
    use autodie;

    # Program to find motif site in a given protein sequence using files
    my $motif = "AGGGGG";
    open( my $read, "<dna.txt" );
    my @e = <$read>;
    $_ = join( " ", @e );
    s/\s+//g;

    my @c;
    push @c, pos( ) - length( $motif ) + 1 while /$motif/g;
    s/$motif/color( 'bold green' ) . $motif . color( 'black' )/eg;

    print $_, "\n";
    print "Number of sites the motif (AGGGGG) is present: ", scalar @c, "\n";
    print "And the positions in the string are: ", join( ',', @c ), "\n\n";

This eliminates counting characters one at a time, as in the $i loop in the original, in favor of using pattern matching on the entire character string. I have found that eliminating loop counters wherever possible greatly reduces the number of bugs in my code.
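
To see the position bookkeeping in isolation, here is a minimal sketch that runs on a literal string instead of dna.txt. The sub name motif_positions and the sample sequence are made up for illustration. It uses $-[0] (the 0-based start offset of the last match) rather than pos( ) minus the motif length, which should give the same 1-based start for a fixed-length motif.

```perl
use strict;
use warnings;

# Return the 1-based start position of every occurrence of $motif in $seq.
# (motif_positions is a hypothetical helper name, not from the thread.)
sub motif_positions {
    my ($seq, $motif) = @_;
    my @starts;
    # \Q...\E quotes the motif so it is matched as a literal string.
    while ($seq =~ /\Q$motif\E/g) {
        # $-[0] is the 0-based offset where the current match starts.
        push @starts, $-[0] + 1;
    }
    return @starts;
}

my @sites = motif_positions('TTAGGGGGCCAGGGGGA', 'AGGGGG');
print "Found ", scalar(@sites), " sites at: ", join(',', @sites), "\n";
```

Note that //g resumes scanning after the end of the previous match, so overlapping occurrences would not be reported. That cannot happen with AGGGGG, but it could with a motif such as AGA.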


Re^2: motif finding
by educated_foo (Vicar) on Jan 31, 2012 at 13:55 UTC
    Or, even more Perl-ish, with a bit less extra work (e.g. only one //g loop):
        use Term::ANSIColor;
        open(READ, "<dna.txt") or die "Cannot open dna.txt: $!";
        $m = 'AGGGGG';
        $_ = do { local $/; <READ> };                # read whole file
        s/\s+//g;                                    # remove blanks
        s{$m}{                                       # search the string
            push @c, $-[0] + 1;                      # remember 1-based start ($-[0] is the match's start offset)
            color('bold green').$m.color('reset');   # remember to reset!
        }eg;
        print "$_\n";                                # print transformed string
        print "NUMBER OF SITES THE MOTIF ($m) IS PRESENT: ".@c."\n";
        print "AND THE POSITION IN THE STRING IS:", join(',', @c), "\n\n";

      My motif input is a file. How can I modify the program to make it work?
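
One possible reading of this question is that the motifs sit in a separate file, one per line. Under that assumption, a rough sketch could read them into a list and run the same //g search once per motif. The file layout, the name motifs.txt, and the helpers read_motifs and find_motifs are all hypothetical. An in-memory filehandle stands in for the file so the example runs on its own.

```perl
use strict;
use warnings;

# Read motifs, one per line, from an open filehandle; skip blank lines.
# (Assumes a one-motif-per-line layout, which the thread does not confirm.)
sub read_motifs {
    my ($fh) = @_;
    my @motifs;
    while (my $line = <$fh>) {
        $line =~ s/\s+//g;                  # drop newline and stray whitespace
        push @motifs, $line if length $line;
    }
    return @motifs;
}

# Map each motif to the list of its 1-based start positions in $seq.
sub find_motifs {
    my ($seq, @motifs) = @_;
    my %sites;
    for my $motif (@motifs) {
        $sites{$motif} = [];
        while ($seq =~ /\Q$motif\E/g) {
            push @{ $sites{$motif} }, $-[0] + 1;
        }
    }
    return %sites;
}

# In real use this would be something like: open(my $fh, '<', 'motifs.txt')
my $motif_data = "AGGGGG\nCCA\n\n";
open(my $fh, '<', \$motif_data) or die "Cannot open in-memory file: $!";
my @motifs = read_motifs($fh);
close $fh;

my %sites = find_motifs('TTAGGGGGCCAGGGGGA', @motifs);
for my $motif (@motifs) {
    print "$motif: ", join(',', @{ $sites{$motif} }), "\n";
}
```

If the file instead holds a single motif spread over several lines, slurping it and stripping whitespace, as is done for dna.txt, would be the closer fit.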
Re^2: motif finding
by RichardK (Vicar) on Jan 31, 2012 at 14:19 UTC

    I think using File::Slurp is even easier and more perl-ish :)

        use File::Slurp;

        # read file as a string
        my $text = read_file('dna.txt');

        # now remove whitespace including line breaks
        $text =~ s/\s+//g;

        # ...do stuff

    (update : removed a stray space)

      Thank you very much for the reply. In the code that I used, I supply the input (the motif sequence) myself. Treating the entire genome as a single string, how can I find the most repeated elements of, say, 20 base pairs in that string?

        I'm not sure what you're looking for. Can you explain with a simple example?

        Are you looking for repeats of a given string or something more complex?
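
If the question is read as "which substrings of a fixed length occur most often", one common approach is to slide a window over the string and count each substring in a hash. The sketch below assumes that reading. The helper names kmer_counts and most_repeated and the short sample sequence are made up for illustration, and it uses a length of 4 rather than 20 to keep the example small.

```perl
use strict;
use warnings;

# Count every substring of length $k in $seq, using overlapping windows.
sub kmer_counts {
    my ($seq, $k) = @_;
    my %count;
    # The range is empty when $seq is shorter than $k.
    for my $i (0 .. length($seq) - $k) {
        $count{ substr($seq, $i, $k) }++;
    }
    return \%count;
}

# Return the highest count, followed by every substring that reaches it.
sub most_repeated {
    my ($seq, $k) = @_;
    my $count = kmer_counts($seq, $k);
    my $max = 0;
    for my $n (values %$count) {
        $max = $n if $n > $max;
    }
    my @top = grep { $count->{$_} == $max } sort keys %$count;
    return ($max, @top);
}

my ($max, @top) = most_repeated('ACGTACGTTTACGT', 4);
print "Most repeated 4-mers (seen $max times): @top\n";
```

For a whole genome with a length of 20, the hash can hold close to one entry per position, so memory use may become the limiting factor.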
