
Using a regexp to extract sequence fields

by tamaguchi (Pilgrim)
on Feb 13, 2006 at 09:52 UTC ( #529772=perlquestion )

tamaguchi has asked for the wisdom of the Perl Monks concerning the following question:

I have a sequence that starts with '>' and is delimited by '|', so that the sequence looks like this:

    >blabla1|anyting1|blabla2|anyting2|blabla3|

Now I need to match, for example, the subsequence between the second '|' and the third '|' to get 'blabla2' out. Unfortunately my knowledge of regexps has become a little rusty; could you help me out? I can of course not match 'anything' or 'blabla' directly, since they are arbitrary sequences in reality. Thank you very much for any help.

2006-02-13 Retitled by Arunbear, as per Monastery guidelines
Original title: 'Regexp'


Replies are listed 'Best First'.
Re: Using a regexp to extract sequence fields
by McDarren (Abbot) on Feb 13, 2006 at 10:03 UTC
    Whenever you have delimited data, you probably want to consider using split.

    For example:

    my $foo = ">blabla1|anyting1|blabla2|anyting2|blabla3| ";
    my @fields = split /\|/, $foo;

    In this case, you said you wanted the 3rd "field", so you would just access it as $fields[2].

    Hope this helps,
    Darren :)

      Yes, thank you. I know there are many ways to get, for example, 'blabla2' out. But the expression will not be used directly in a program. In the context where I have to use it, it must match the sequence directly, so that I get blabla2 out in $&.
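
One way to get only the third field into $& is to match the first two fields and then discard them from the reported match. The following is a minimal sketch, assuming Perl 5.10 or later (where \K is available, so it would not have worked on the Perl releases current when this thread was written). The variable names $seq and $third are only illustrative.

```perl
use strict;
use warnings;

my $seq = '>blabla1|anyting1|blabla2|anyting2|blabla3|';
my $third;

# (?:[^|]*\|){2} consumes the first two "field|" groups.
# \K then drops everything matched so far from $&,
# so $& holds only what [^|]* matches afterwards: the third field.
if ($seq =~ /^(?:[^|]*\|){2}\K[^|]*/) {
    $third = $&;
    print "$third\n";   # prints "blabla2"
}
```

Changing the {2} quantifier selects a different field, e.g. {0} for the first field (which then still includes the leading '>').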
Re: Using a regexp to extract sequence fields
by svenXY (Deacon) on Feb 13, 2006 at 09:58 UTC
    I'd do this with split, although regex is possible as well:
    #!/usr/bin/perl
    use strict;
    my $string = '>blabla1|anyting1|blabla2|anyting2|blabla3|';
    # with split
    my $third = (split(/\|/, $string))[2];
    # with regex
    my ($third_regexed) = $string =~ /^[^\|]+\|[^\|]+\|([^\|]+)\|.*/;
    print "splitted: $third\nregexed: $third_regexed\n";

    results in:
    splitted: blabla2
    regexed: blabla2


    Update: added regex solution
Re: Using a regexp to extract sequence fields
by Samy_rio (Vicar) on Feb 13, 2006 at 10:04 UTC

    Hi tamaguchi, try this:

    use strict;
    use warnings;
    $_ = '>blabla1|anyting1|blabla2|anyting2|blabla3|';
    if (m/>[^\|]+\|[^\|]+\|([^\|]+)\|/) {
        print $1;
    }


    Velusamy R.

    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

      Why the /si? It's not necessary at all.
Re: Using a regexp to extract sequence fields
by misterb101 (Sexton) on Feb 13, 2006 at 10:23 UTC
    Or of course only with regexp:

    # In list context, a /g match returns every run of alphanumerics
    my @matches = ">blabla1|anyting1|blabla2|anyting2|blabla3|" =~ m/[a-zA-Z0-9]+/xg;
    print $matches[2];
Re: Using a regexp to extract sequence fields
by MCS (Monk) on Feb 13, 2006 at 15:02 UTC

    The problem with regular expressions (well, not really a problem in general, but it is one given your example) is that they are very specific. In order for us to help you more, we need a little more information.

    Are you just trying to match a specific piece of text (i.e. 'blabla2'), or are you trying to match blabla[some number]? If you are looking for an exact string, your best bet is to use split (as per the other examples here) and then iterate over the array (say, with a foreach) and compare each item to your string.
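
The split-then-compare approach could be sketched roughly as follows. The names $seq, $wanted, @fields and $found_at are hypothetical and chosen only for this illustration.

```perl
use strict;
use warnings;

my $seq    = '>blabla1|anyting1|blabla2|anyting2|blabla3|';
my $wanted = 'blabla2';

# split drops the trailing empty field after the last '|'
my @fields = split /\|/, $seq;

# Walk the fields and remember the index of the first exact match
my $found_at;
for my $i (0 .. $#fields) {
    if ($fields[$i] eq $wanted) {
        $found_at = $i;
        last;
    }
}

print defined $found_at
    ? "found '$wanted' at index $found_at\n"
    : "'$wanted' not found\n";
```

Note that the first field still carries the leading '>', so an exact comparison against 'blabla1' would not succeed without stripping it first.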

    Of course, since you say that 'anything' or 'blabla' are arbitrary sequences, I'm guessing you're not trying to match an exact string but rather a position. In that case, use split; you will then have your data in an array and can go to any position: $yourarray[2] for 'blabla2', $yourarray[0] for 'blabla1' (with its leading '>' still attached), etc.

    If you are looking for 'blabla#' where # is a number you can use the regex: $mystring =~ /blabla\d+/. If you need more information than I've provided, you're going to have to provide more information about your requirements.
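
As a small illustration of the blabla\d+ pattern, the sketch below collects every such hit from the sample string. $mystring is the name used in the post; @hits is a hypothetical name introduced here.

```perl
use strict;
use warnings;

my $mystring = '>blabla1|anyting1|blabla2|anyting2|blabla3|';

# In list context with /g, the match returns all non-overlapping hits
my @hits = $mystring =~ /blabla\d+/g;

print "$_\n" for @hits;
```

This finds fields by their content, not by their position, so it only helps if the real data has a recognizable shape like 'blabla' followed by digits.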

Node Type: perlquestion [id://529772]
Approved by Corion