PerlMonks  

Re: Defining substring matches

by BrowserUk (Pope)
on Sep 21, 2013 at 00:42 UTC (#1055090)


in reply to Defining substring matches

I wish to use the substr() function to search for the particular motif.

substr is *not* designed for (nor capable of) searching for anything; so why are you specifying that particular function?
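To make the distinction concrete, here is a minimal sketch (the sample sequence and motif are made up for illustration): substr() only extracts text at an offset you already know, index() searches for a literal string, and a regex is what handles ambiguity codes.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original question.
my $seq   = 'TTGACATT';
my $motif = 'GACA';

# substr() extracts by position: the offset must already be known.
my $piece = substr( $seq, 2, 4 );                    # 'GACA'

# index() searches for a literal string; it returns the offset, or -1 if absent.
my $pos = index( $seq, $motif );                     # 2

# A regex copes with ambiguity codes, e.g. GAYR with Y => [CT], R => [AG].
my $found = $seq =~ m/GA[CT][AG]/ ? $-[0] : -1;      # 2

print "substr: $piece, index: $pos, regex: $found\n";
```

Neither substr() nor index() can express "C or T at this position" without extra looping, which is the part the regex engine does for free.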

You've defined your IUPAC codes in terms of regex character classes; so why are you eschewing the regex engine?

Given your table, it is trivial to convert IUPAC codes into a regex and use the regex engine to search your fasta file:

my %IUPAC = (
    A => '[A]', C => '[C]', G => '[G]', T => '[T]',
    R => '[AG]', Y => '[CT]', M => '[AC]', K => '[GT]', W => '[AT]', S => '[GC]',
    B => '[CGT]', D => '[AGT]', H => '[ACT]', V => '[ACG]',
    N => '[ACGT]',
);

my( $file, $motif ) = @ARGV;

my $re = join '', map $IUPAC{ $_ }, split '', $motif;

open FASTA, '<', $file or die $!;
getc( FASTA ); ## discard first '>'

until( eof( FASTA ) ) {
    chomp( my $id = <FASTA> ); ## read ident
    my $seq = do{ local $/ = '>'; <FASTA> };
    $seq =~ tr[\n>][]d;
    while( $seq =~ m[($re)]g ) {
        printf "Found: '$1' at '$id':%d\n", $-[0];
    }
}

NB: The above is untested code typed directly into my browser.
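As a small worked example of the translation step, the sketch below wraps it in a helper and runs it on an in-memory string rather than a fasta file. The name iupac_to_regex, the sample motif and sequence, the upper-casing of the motif, and the die on unknown codes are all assumptions added for illustration; they are not part of the code above.

```perl
use strict;
use warnings;

# Same IUPAC table as in the post above.
my %IUPAC = (
    A => '[A]',   C => '[C]',   G => '[G]',   T => '[T]',
    R => '[AG]',  Y => '[CT]',  M => '[AC]',  K => '[GT]',
    W => '[AT]',  S => '[GC]',  B => '[CGT]', D => '[AGT]',
    H => '[ACT]', V => '[ACG]', N => '[ACGT]',
);

# Hypothetical helper: translate an IUPAC motif into a regex string.
# Upper-casing and rejecting unknown codes are assumptions of this sketch.
sub iupac_to_regex {
    my( $motif ) = @_;
    my $re = '';
    for my $code ( split //, uc( $motif ) ) {
        die "Unknown IUPAC code '$code'\n" unless exists $IUPAC{ $code };
        $re .= $IUPAC{ $code };
    }
    return $re;
}

my $re  = iupac_to_regex( 'GAYR' );    # '[G][A][CT][AG]'
my $seq = 'TTGACATT';                  # made-up sequence

while( $seq =~ m[($re)]g ) {
    printf "Found: '%s' at offset %d\n", $1, $-[0];    # Found: 'GACA' at offset 2
}
```

Passing $1 as a printf argument, rather than interpolating it into the format string, sidesteps any stray % characters in the data.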


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Defining substring matches
by jwkrahn (Monsignor) on Sep 21, 2013 at 01:24 UTC
    until( eof( FASTA) {

    You have two left parentheses but only one right parenthesis.

    $seq =~ tr[\n][];

    You are replacing the newline character with a newline character?

      You are replacing the newline character with a newline character?

      Indeed, to count newline characters, but not save the count? Hmmm:)
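For anyone following along, a minimal sketch of the difference being discussed (the sample string is made up): with an empty replacement list and no /d, tr leaves the string alone and just returns how many characters matched; with /d, the matched characters are deleted.

```perl
use strict;
use warnings;

# Made-up chunk as it might be read up to the next '>' separator.
my $seq = "ACGT\nTTGA\n>";

# No /d, empty replacement list: nothing changes, tr returns the match count.
my $newlines = ( $seq =~ tr[\n][] );      # 2, and $seq is untouched

# With /d: every newline and '>' is removed (done on a copy here).
( my $clean = $seq ) =~ tr[\n>][]d;       # 'ACGTTTGA'

print "newlines: $newlines, cleaned: $clean\n";
```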

      See above:
      NB: The above is untested code typed directly into my browser.

      Corrected.


Re^2: Defining substring matches
by AnomalousMonk (Abbot) on Sep 21, 2013 at 14:32 UTC
    while( $seq =~ m[($re)]g ) { printf "Found: '$1' at '$id':%d\n", $-[0]; }

    A question of idle (and perhaps rather trivial) curiosity: In the quoted code, you use $-[0], the "offset of the start of the last successful match" (see @- in perlvar). My reflexive choice would have been $-[1], since capture group 1 is being matched. There's no difference in the behavior of the code, since capture group 1 is all that's matched, but was there a particular reason you chose as you did?

      Um. I cannot remember ever using indexes above 0 with @- or @+. I'm sure I probably have at some point, but I don't remember doing so.

      When I first typed it, I didn't use capture brackets and only printed the id and position. I added the capture brackets and '$1' as an afterthought.
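A short sketch of when the two indexes agree and when they do not (the sequence and patterns are invented for illustration): $-[0] is the start of the whole match, $-[1] the start of capture group 1, so they only diverge once the pattern contains something before the group.

```perl
use strict;
use warnings;

my $seq = 'TTGACATT';    # made-up sequence

# Case 1: the capture group spans the whole pattern, as in the quoted loop.
$seq =~ m/([G][A][CT][AG])/ or die "no match";
my @same = ( $-[0], $-[1] );      # ( 2, 2 )

# Case 2: a literal prefix precedes the capture group.
$seq =~ m/TT([G][A][CT][AG])/ or die "no match";
my @differ = ( $-[0], $-[1] );    # ( 0, 2 )

printf "whole pattern at %d, group 1 at %d\n", @same;
printf "whole pattern at %d, group 1 at %d\n", @differ;
```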


Re^2: Defining substring matches
by drhicks (Novice) on Sep 21, 2013 at 15:40 UTC

    This code works perfectly for the job. It is so much simpler, and does more, than the script I had written using substr(). Thanks, Derrick
