Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Reverse Complement

by stamp1982 (Novice)
on Jun 29, 2013 at 22:29 UTC ( #1041523=perlquestion: print w/ replies, xml ) Need Help??
stamp1982 has asked for the wisdom of the Perl Monks concerning the following question:

I am sure I am overthinking this. I have been working on it for 5days now and cant seem to get it to work. The code is supposed to check if 2 DNA sequences given as a user input are reverse complements to each other. Can you tell me where I am going wrong/suggestions?

use warnings; use strict; #Check if there is an input of sequences if((scalar @ARGV)==0){ print "Please enter two sequences as arguments on the command line"; exit; } #get the input from command line my @two_seq=@ARGV; my $sequence=join('',@two_seq); #Remove newline and/or blank space between sequences chomp $sequence; $sequence=~s/^\n*$//g; $sequence=~tr/atcg/ATCG/; #Check if there are only two sequences entered if((scalar @two_seq)>2){ print "Sorry,no more than two sequences please.\nNote:Use space to s +eperate two sequences.\n"; exit; } #Check if two sequences are of the same length if((length $two_seq[0])!=(length $two_seq[1])){ print "Sorry, the two sequences you have just entered are of differe +nt lengths.\nPlease try again on the command line."; exit; } #Initialize joined sequences to be checked my @check_seq=split('',$sequence); #Compare reverse complement sequences for(my $i=0;$i<(length $sequence)/2;$i++) { #Start checking from both end of the sequence my $start_base=shift @check_seq; my $end_base=pop @check_seq; $end_base=~tr/ATCG/TAGC/; if($start_base eq $end_base){ next; }else{ print "Unfortunately, the two sequences are not reverse-complement +.\n"; exit; } } #W reverse-complements print "Yes, the two sequences are reverse-complement of each other.\n" +; exit;

Comment on Reverse Complement
Download Code
Re: Reverse Complement
by LanX (Canon) on Jun 29, 2013 at 22:31 UTC
    Use: <p> text here (a paragraph) </p> and: <code> code here </code> to format your post; it's "PerlMonks-approved HTML":

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      My apologies
Re: Reverse Complement
by rjt (Deacon) on Jun 30, 2013 at 00:00 UTC

    Here are some suggestions. The actual transformation is extremely simple; most of the program is input validation and output. Throughout, I have used die for fatal errors, which will use STDERR for output, and exit with a non-zero exit code. That's normally a good thing, but if that is not what you want, you can go back to using print and exit.

    There is no need to concatenate both strings and split it later, going over character by character. reverse will reverse a string for you.

    Providing sample input and output would have been really helpful. Fortunately, I was able to look up what a reverse complement was fairly easily.

    I've kept the same basic structure and order as your code, so I suggest you compare the two and if anything is unclear, let us know.

    #!/usr/bin/env perl use warnings; use strict; die "Please enter two sequences as arguments on the command line\n" unless @ARGV == 2; my ($orig, $comp) = map { uc } @ARGV; die "Please enter only ATCG sequences\n" if grep { /[^ATCG]/ } $orig, +$comp; die "Sorry, the two sequences you have just entered are of different l +engths." . "\nPlease try again on the command line.\n" if length $orig != length $comp; # $comp will transform back to $orig iff it is the reverse complement $comp =~ y/ATCG/TAGC/; if ($orig eq reverse $comp) { print "Yes, the two sequences are reverse-complement of each other +.\n"; exit; } else { die "Unfortunately, the two sequences are not reverse-complement.\ +n"; }

    Output:

    $ perl 1041523.pl GGGGaaaaaaaatttatatat atatataaattttttttcccc Yes, the two sequences are reverse-complement of each other. $ perl 1041523.pl GGGGaaaaaaaatttatatat atatataaattttatttcccc Unfortunately, the two sequences are not reverse-complement.
      Do you have to use STDIN in this case or do I have to store it a file? How did you get your outputs? I run the code and it did not allow any user input.

      please two sequences as arguments on the command line. Press any key to continue

      what am I missing and why cant I run the code?

      Stamp1982

        EDIT: (More of a complete 180, really): I somehow completely missed the fact you already asked the exact same question in Re^2: Reverse Complement and had a lengthy exchange with AnomylousMonk, who was very helpful.

        I actually used the exact same structure (and even used the exact same error messages!) that you provided in the original post. Did you write that code yourself? Anyway, I also gave a direct example of how to run the script. Assuming you saved the script as 1041523.pl (an odd name, to be sure—feel free to choose a better one), type the following:

        perl 1041523.pl GGGGaaaaaaaatttatatat atatataaattttttttcccc

        Of course, substitute any two sequences for the two shown here.

Re: Reverse Complement
by AnomalousMonk (Monsignor) on Jun 30, 2013 at 01:54 UTC

    Here's a variation on rjt's approach that uses a different method to compare the reversed, complemented strings, and then flags locations of non-reverse-complementarity:

    >perl -wMstrict -le "die 'Please enter exactly two sequences' unless @ARGV == 2; ;; my ($seq1, $seq2) = map uc, @ARGV; die 'Please enter only ATCG sequences' if grep /[^ATCG]/, $seq1, $seq +2; ;; die 'Sequence lengths differ' if length $seq1 != length $seq2; ;; (my $comp = $seq2) =~ tr/ATCG/TAGC/; my $mask = $seq1 ^ reverse $comp; ;; if ($mask !~ m{ [^\000] }xms) { print 'The sequences are reverse-complementary'; exit; } ;; $mask =~ tr{\000-\377}{_^}; print 'Sequences are non-reverse-complementary at these locations:'; print qq{@ARGV}; print qq{$mask }, ''.reverse $mask; " GGGGaaaaaaCatttatatat atatataaatttttATtcccc Sequences are non-reverse-complementary at these locations: GGGGaaaaaaCatttatatat atatataaatttttATtcccc ______^___^__________ __________^___^______
      Do you have to use STDIN in this case or do I have to store it a file or just run the code as it is? How did you get your outputs? I run the code and it did not allow any user input

      please two sequences as arguments on the command line. Press any key to continue

      what am I missing and why cant I run the code?

      Stamp1982

        As posted, the code is a cut-and-paste of a command line session. It is intended to show source code and invocation switches, pragmata, etc., command line input (to @ARGV), and command line output. The source code is everything between the pair of  " (double-quotes). (Note that these are the only double-quotes in the posted code section.)

        I don't know what you are doing to run the code, but try this:

        1. Download the posted code section using the  [download] link just after it.
        2. Edit out the first " and everything before it, and also the last " and everything after it. What remains should be the entire source code.
        3. Save the edited file.
        4. Invoke the edited file with the samealmost the same command line invocation used originally (Update: do not include the  -e switch as I originally wrote below!) and with the sample input you want, e.g.:
              perl -wMstrict -l  your_file.pl  GGGGaaaaaaCatttatatat  atatataaatttttATtcccc
        5. Examine output. Experiment with various different input data. Play with variations on the source code. Enjoy!

        In fact, here's the edited section of code that should be all ready to go, just use the  [download] link and save to a file of your choice:

        die 'Please enter exactly two sequences' unless @ARGV == 2; ;; my ($seq1, $seq2) = map uc, @ARGV; die 'Please enter only ATCG sequences' if grep /[^ATCG]/, $seq1, $seq2 +; ;; die 'Sequence lengths differ' if length $seq1 != length $seq2; ;; (my $comp = $seq2) =~ tr/ATCG/TAGC/; my $mask = $seq1 ^ reverse $comp; ;; if ($mask !~ m{ [^\000] }xms) { print 'The sequences are reverse-complementary'; exit; } ;; $mask =~ tr{\000-\377}{_^}; print 'Sequences are non-reverse-complementary at these locations:'; print qq{@ARGV}; print qq{$mask }, ''.reverse $mask;

        Update: I just downloaded and saved the code block immediately above in this reply and went through the steps listed above (except it doesn't have to be edited), and it works. Be sure not to use the  -e switch when invoking a file; it is used for command-line source code statements. HTH.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1041523]
Approved by davido
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (6)
As of 2014-08-30 02:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (291 votes), past polls