PerlMonks  

Re: Defining substring matches

by BrowserUk (Pope)
on Sep 21, 2013 at 00:42 UTC


in reply to Defining substring matches

I wish to use the substr() function to search for the particular motif.

substr is *not* designed for (nor capable of) searching for anything; so why are you specifying that particular function?
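For contrast, here is a minimal sketch of what substr does next to a literal search with index. The sequence is made up for illustration; index only finds fixed strings, so it does not handle ambiguity codes by itself.

```perl
use strict;
use warnings;

my $seq = 'GATTACAGATTACA';    # made-up sequence, for illustration only

# substr extracts text at an offset you already know.
my $piece = substr $seq, 3, 4;    # 4 characters starting at offset 3

# index searches for a literal string and returns its offset, or -1 if absent.
my $first = index $seq, 'TACA';
my $next  = index $seq, 'TACA', $first + 1;    # resume just past the first hit

print "substr: $piece\n";
print "index:  $first, $next\n";
```

Finding a motif with substr alone means testing every offset by hand, which is the work index or the regex engine already does.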

You've defined your IUPAC codes in terms of regex character classes; so why are you eschewing the regex engine?

Given your table, it is trivial to convert IUPAC codes into a regex and use the regex engine to search your fasta file:

    my %IUPAC = (
        A => '[A]',   C => '[C]',   G => '[G]',   T => '[T]',
        R => '[AG]',  Y => '[CT]',  M => '[AC]',  K => '[GT]',
        W => '[AT]',  S => '[GC]',  B => '[CGT]', D => '[AGT]',
        H => '[ACT]', V => '[ACG]', N => '[ACGT]',
    );

    my( $file, $motif ) = @ARGV;

    my $re = join '', map $IUPAC{ $_ }, split '', $motif;

    open FASTA, '<', $file or die $!;
    getc( FASTA ); ## discard first '>'

    until( eof( FASTA ) ) {
        chomp( my $id = <FASTA> ); ## read ident
        my $seq = do{ local $/ = '>'; <FASTA> };
        $seq =~ tr[\n>][]d;

        while( $seq =~ m[($re)]g ) {
            printf "Found: '$1' at '$id':%d\n", $-[0];
        }
    }

NB: The above is untested code typed directly into my browser.
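To try the motif-to-regex conversion without a fasta file, something like the following sketch may help. The helper name iupac_to_regex, the motif and the sequence are all invented for the example; it also rejects unknown codes, which the code above does not attempt.

```perl
use strict;
use warnings;

# Same table idea as above; single bases simply map to themselves here.
my %IUPAC = (
    A => 'A',     C => 'C',     G => 'G',     T => 'T',
    R => '[AG]',  Y => '[CT]',  M => '[AC]',  K => '[GT]',
    W => '[AT]',  S => '[GC]',  B => '[CGT]', D => '[AGT]',
    H => '[ACT]', V => '[ACG]', N => '[ACGT]',
);

# Hypothetical helper: turn an IUPAC motif into a compiled regex.
sub iupac_to_regex {
    my ($motif) = @_;
    my $pattern = '';
    for my $code ( split //, uc $motif ) {
        die "Unknown IUPAC code '$code'\n" unless exists $IUPAC{$code};
        $pattern .= $IUPAC{$code};
    }
    return qr/$pattern/;
}

my $re  = iupac_to_regex('GAYW');    # G, A, then C or T, then A or T
my $seq = 'TTGACAGGATTC';            # made-up sequence

# Collect [ matched text, start offset ] for each non-overlapping hit.
my @hits;
while ( $seq =~ /($re)/g ) {
    push @hits, [ $1, $-[0] ];
}

printf "Found '%s' at offset %d\n", @$_ for @hits;
```

With this input the loop should report GACA at offset 2 and GATT at offset 7.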


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Defining substring matches
by jwkrahn (Monsignor) on Sep 21, 2013 at 01:24 UTC
    until( eof( FASTA) {

    You have two left parentheses but only one right parenthesis.

    $seq =~ tr[\n][];

    You are replacing the newline character with a newline character?

      You are replacing the newline character with a newline character?

      Indeed, to count newline characters, but not save the count? Hmmm:)

      See above:
      NB: The above is untested code typed directly into my browser.

      Corrected.
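      For anyone following along, a small sketch of the difference being discussed: tr with an empty replacement list and no /d leaves the string alone and only returns a count, while /d deletes the listed characters. The input string is made up.

```perl
use strict;
use warnings;

my $seq = "ACGT\nTTGA\n>";    # made-up chunk as read up to the next '>'

# Empty replacement list, no /d: each character maps to itself,
# so the string is unchanged and tr just returns how many it matched.
( my $kept = $seq ) =~ tr[\n][];
my $count = ( $seq =~ tr[\n][] );

# With /d, characters in the search list with no replacement are deleted.
( my $clean = $seq ) =~ tr[\n>][]d;

print "newlines counted: $count\n";
print "cleaned sequence: $clean\n";
```

      Here $kept ends up identical to $seq, $count is 2, and $clean is ACGTTTGA.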


Re^2: Defining substring matches
by AnomalousMonk (Abbot) on Sep 21, 2013 at 14:32 UTC
    while( $seq =~ m[($re)]g ) { printf "Found: '$1' at '$id':%d\n", $-[0]; }

    A question of idle (and perhaps rather trivial) curiosity: In the quoted code, you use $-[0], the "offset of the start of the last successful match" (see @- in perlvar). My reflexive choice would have been $-[1], since capture group 1 is being matched. There's no difference in the behavior of the code, since capture group 1 is all that's matched, but was there a particular reason you chose as you did?

      Um. I cannot remember ever using indexes above 0 with @- or @+. I'm sure I probably have at some point, but I don't remember doing so.

      When I first typed it, I didn't use capture brackets and only printed the id and position. I added the capture brackets and '$1' as an afterthought.
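      A quick sketch of when the two indexes agree and when they differ; the sequence and patterns are invented for the example.

```perl
use strict;
use warnings;

my $seq = 'TTGACAGG';    # made-up sequence

# The capture group spans the whole match, so both offsets agree.
$seq =~ /(GACA)/ or die "no match";
my @same = ( $-[0], $-[1] );

# The group starts two characters into the match, so they differ.
$seq =~ /TT(GACA)/ or die "no match";
my @diff = ( $-[0], $-[1] );

print "same: @same\n";
print "diff: @diff\n";
```

      In the first case both values are 2; in the second $-[0] is 0 and $-[1] is 2, so the choice only matters once the pattern has text outside the capture group.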


Re^2: Defining substring matches
by drhicks (Novice) on Sep 21, 2013 at 15:40 UTC

    This code works perfectly for the job. It is so much simpler and does more than the script I had written using substr(). Thanks Derrick
