PerlMonks  

Re^3: Filter and writing error log file

by choroba (Abbot)
on Jul 23, 2014 at 13:29 UTC (#1094786)


in reply to Re^2: Filter and writing error log file
in thread Filter and writing error log file

To check whether a string contains something other than A, C, T, or G, search for the offending character. The following is true only when no such character is found, so in your condition, use

$seq !~ /[^ACTG]/

Note that | is not needed in a character class (in fact, it matches literally, so avoid it if you don't want to match it).
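A minimal sketch of both points, assuming a hypothetical helper named is_valid_dna (the name is not from the thread). It wraps the condition above and then shows that a class written with | also accepts a literal pipe character:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when $seq contains no character
# other than A, C, T, or G (an empty string also counts as valid here).
sub is_valid_dna {
    my ($seq) = @_;
    return $seq !~ /[^ACTG]/ ? 1 : 0;
}

# Inside a character class, | is an ordinary character, not alternation,
# so this class accepts the pipe in the middle of the string.
my $pipe_matches = 'AC|T' =~ /^[A|C|T|G]+$/ ? 1 : 0;

print is_valid_dna('ACGT'), "\n";   # 1
print is_valid_dna('ACNT'), "\n";   # 0
print "$pipe_matches\n";            # 1
```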

لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ


Re^4: Filter and writing error log file
by newtoperlprog (Novice) on Jul 23, 2014 at 13:44 UTC

    Thanks for the suggestions. One question: why do we have to use '^' in the match rather than just

    [ATGC]
      See perlre. The caret negates the class, so the regular expression matches non-ACTG characters, but I used !~ to negate that. It's like the difference between

      "The sequence doesn't contain invalid characters"

      and

      "The sequence contains valid characters"

      These two are not equivalent, as the second lacks the word "only".
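The difference can be seen on a small example. This is only a sketch; the input string 'ACXT' is made up for illustration, and the anchored pattern in the last test is one possible way to express "only", not something proposed in the thread:

```perl
use strict;
use warnings;

my $seq = 'ACXT';   # hypothetical input with one invalid character (X)

# "The sequence doesn't contain invalid characters": false here because of X.
my $no_invalid = $seq !~ /[^ACTG]/ ? 1 : 0;

# "The sequence contains valid characters": true as soon as one
# A, C, T, or G is found anywhere in the string.
my $some_valid = $seq =~ /[ACTG]/ ? 1 : 0;

# Anchoring both ends adds the missing "only" to the positive form.
my $only_valid = $seq =~ /\A[ACTG]+\z/ ? 1 : 0;

print "no invalid: $no_invalid\n";   # no invalid: 0
print "some valid: $some_valid\n";   # some valid: 1
print "only valid: $only_valid\n";   # only valid: 0
```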
      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        Thanks Choroba for the suggestions.

        I am checking a rule against a sequence string (19 letters long) in a while loop, and I wrote this filter.

        Basically, I want to match A at position 3, T at position 10, [ACT] at position 13, [AT] at position 19, and at least 3 A's or 3 T's in positions 15-19; $gcper should be between 30 and 52.

        if (($seq =~ /\w{2}A\w{6}T\w{2}[ACT]\w[GCA{4}T{4}]{4}[AT]/) && ($gcper >= 30 && $gcper <= 52)) {
            print "$seq\t$seqpos\t$gcper\n";
        }

        I checked the result and it seemed to work. I need help with whether this code is OK or whether I can improve it.

        Another thing I want to check is that there is no GC stretch more than 9 letters long, but I don't know how to add that check to the code above.

        GCAGGTGGATCTATTTCAT  3201-3220  42.11
        TAAGAGGTGTTATTTGGAA  3268-3287  31.58
        ATACGATGCTTCAAGAGAA  3346-3365  36.84
        CAAGCTCATCATACTGGCT  1201-1220  47.37
        GGTACTGACTTTGCTTGCT  2923-2942  47.37
        CGTAGTGTTAAGTTATAGT  3003-3022  31.58
        GTATGGGTAGGGTAAATCA  3248-3267  42.11
        CCTGCTGTGATACGATGCT  3337-3356  52.63
        CCTGCGCGCGCGCGATGCT  3300-3318  50.63
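One thing to note about the pattern above: inside a character class, {4} is not a quantifier, so [GCA{4}T{4}] matches any single one of G, C, A, T, {, 4, or }, and the class does not count A's or T's. A possible way to write the described rule is sketched below. The sub names passes_filter and has_long_gc_run are hypothetical, and the sketch assumes 1-based positions, reads "at least 3 A's or 3 T's" as three or more A's or three or more T's among the five letters at positions 15-19, and reads "GC stretch" as a run of consecutive G or C letters:

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $seq has a run of ten or more consecutive
# G/C letters (i.e. a GC stretch more than 9 letters long).
sub has_long_gc_run {
    my ($seq) = @_;
    return $seq =~ /[GC]{10,}/ ? 1 : 0;
}

# Hypothetical helper implementing the rule as read in the lead-in.
sub passes_filter {
    my ($seq, $gcper) = @_;

    return 0 unless length($seq) == 19;
    return 0 if $seq =~ /[^ACGT]/;           # only A, C, G, T allowed

    # Fixed positions: A at 3, T at 10, [ACT] at 13, [AT] at 19.
    return 0 unless $seq =~ /\A.{2}A.{6}T.{2}[ACT].{5}[AT]\z/;

    # Count A's and T's in positions 15-19 (offset 14, length 5);
    # tr/// with an empty replacement only counts.
    my $tail    = substr($seq, 14, 5);
    my $count_a = ($tail =~ tr/A//);
    my $count_t = ($tail =~ tr/T//);
    return 0 unless $count_a >= 3 || $count_t >= 3;

    return 0 if has_long_gc_run($seq);

    return ($gcper >= 30 && $gcper <= 52) ? 1 : 0;
}

print passes_filter('GCAGGTGGATCTATTTCAT', 42.11), "\n";   # 1
print has_long_gc_run('CCTGCGCGCGCGCGATGCT'), "\n";         # 1
```

Splitting the rule into several small tests keeps each condition readable and makes it easy to change one of them if the intended reading differs from the assumptions above.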
Reaped: Re^4: Filter and writing error log file
by NodeReaper (Curate) on Jul 23, 2014 at 13:57 UTC
Re^4: Filter and writing error log file
by newtoperlprog (Novice) on Jul 23, 2014 at 15:20 UTC

    I have one loop-related question. I have defined an array of alphabet from ("A" .. "Z"), but after reading a long file the alphabets run out and the program reports errors about uninitialized values.

    My question is: how can I define an array of alphabets that continues to AA, BB, CC, and so on when "A" .. "Z" ends?

      Alphabet stands for the whole series of letters. Use the word "letter" for a single character like "A" or "Z", please, not alphabet.

      Don't create an array. Just start with

      my $letter = 'A';

      In every iteration, do

      $letter++;

      and let the magic do all the work.
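A short sketch of what the magic string increment does once "Z" is reached. The loop bound of 28 and the array @labels are made up for illustration:

```perl
use strict;
use warnings;

my $letter = 'A';
my @labels;

# Collect 28 labels; ++ on a string like 'Z' carries over to 'AA'.
for my $i (1 .. 28) {
    push @labels, $letter;
    $letter++;
}

print "@labels[0, 25, 26, 27]\n";   # A Z AA AB
```

Note that the sequence after Z continues AA, AB, AC, and so on, rather than AA, BB, CC.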

      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Thanks for explaining the '^' behaviour and how to loop through the letters.

      I have some more questions about another program I have written. It's very crude, and I would like some help making it more robust.

      Is it OK to post it here and get some help?

      Thanks
