PerlMonks  

parse problem

by Anonymous Monk
on Apr 20, 2003 at 01:35 UTC (#251753=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to extract a number from an HTML source file. The data looks like:
gi|12345678|ref|NP_001234.1|
and I want to get 12345678. I tried the following:
$data = 'gi|12345678|ref|NP_001234.1|';
@data = split ('gi|',$data);
@data1 = split ('|ref',$data[1]);
$number = $data1[0];

This gave me e, g, ... and other stray letters. When I changed the code to:
$data = 'gi|12345678|ref|NP_001234.1|';
@data = split ('gi',$data);
@data1 = split ('ref',$data[1]);
$number = $data1[0];
I got |12345678|. I then tried a regular expression to remove the |:
$number =~ m/[0-9]*/;

I still got |12345678|. What can I do? Please help, and thanks in advance!

Re: parse problem
by dpuu (Chaplain) on Apr 20, 2003 at 01:46 UTC
    Your problem may be that the first argument to split is a regular expression, and in a regular expression the vertical bar separates alternative patterns. In your patterns it has an empty expression on one side ('gi|' has it on the right, '|ref' on the left), and an empty expression can always match. If you only want the one number you show, then you could use:
    $data =~ /gi\|(\d+)\|ref/ and $number = $1;
    Note that the vertical bar is escaped using the backslash. --Dave
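
As a minimal sketch of the point about the vertical bar (the helper names below are made up for illustration and do not come from the thread): a string passed as the first argument to split is still compiled as a regular expression, so 'gi|' means "gi or nothing". Quoting the delimiter with \Q...\E, or escaping the bar by hand, makes it literal. The last helper also illustrates that a bare m// only tests for a match and leaves the string alone; the digits have to be captured to be extracted.

```perl
use strict;
use warnings;

# Illustrative helpers; the names are hypothetical.

# Unescaped bar: 'gi|' is "gi" OR the empty pattern, so split can also
# break the string between individual characters.
sub split_unescaped {
    my ($text) = @_;
    return split 'gi|', $text;
}

# \Q...\E quotes regex metacharacters, so the delimiter is taken literally.
sub split_literal {
    my ($delimiter, $text) = @_;
    return split /\Q$delimiter\E/, $text;
}

# m// alone only reports whether the pattern matched; it leaves the string
# unchanged. Capturing parentheses pull the digits out.
sub first_digits {
    my ($text) = @_;
    my ($digits) = $text =~ /(\d+)/;
    return $digits;
}

my $data = 'gi|12345678|ref|NP_001234.1|';

my @broken = split_unescaped($data);
my @fixed  = split_literal('gi|', $data);

print scalar(@broken), " fields with the unescaped bar\n";
print "after 'gi|': $fixed[1]\n";
print "digits from '|12345678|': ", first_digits('|12345678|'), "\n";
```

Running this should make the difference visible: the unescaped pattern yields many tiny fields, while the literal delimiter yields the text that follows gi|.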
Re: parse problem
by DrManhattan (Chaplain) on Apr 20, 2003 at 02:05 UTC
    The first argument to split() needs to be a regular expression matching the string that delimits the fields in your data. In your case, the fields in your line are separated by a '|', so the code could look like this:
    #!/usr/bin/perl
    use strict;
    my $data = 'gi|12345678|ref|NP_001234.1|';
    my @data = split /\|/, $data;
    my $number = $data[1];
    Or more concisely:
    #!/usr/bin/perl
    use strict;
    my $data = 'gi|12345678|ref|NP_001234.1|';
    my $number = (split(/\|/, $data))[1];

    -Matt

Re: parse problem
by artist (Parson) on Apr 20, 2003 at 04:50 UTC
    Hi
    You have already received good solutions.
    Your algorithm should be:
    A. split the data on the delimiter pattern (the pipe symbol in your case)
    B. get the second item from the result of the above split.
    Learn more about split.

    artist
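
Steps A and B can be sketched as a small subroutine. This is only one possible reading: the name gi_number, the digits-only check, and the second and third sample records are assumptions added for illustration, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper following steps A and B.
# Returns the second pipe-delimited field, or nothing if that field is
# missing or is not made up entirely of digits.
sub gi_number {
    my ($record) = @_;
    my @fields = split /\|/, $record;    # step A: split on a literal pipe
    my $number = $fields[1];             # step B: take the second item
    return unless defined $number && $number =~ /\A\d+\z/;
    return $number;
}

# The first record is from the question; the other two are made up.
my @records = (
    'gi|12345678|ref|NP_001234.1|',
    'gi|87654321|ref|NP_000000.0|',
    'no delimiters here',
);

for my $record (@records) {
    my $number = gi_number($record);
    print defined $number
        ? "$number\n"
        : "no number found in: $record\n";
}
```

The digits-only check is a design choice rather than a requirement: it keeps a malformed line from silently producing a non-numeric "number".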
