
Re: ... architecting & implementing help w/ Perl...

by aitap (Curate)
on Oct 25, 2012 at 17:16 UTC ( #1000905=note )

in reply to ... architecting & implementing help w/ Perl...

  • Use File::Find instead of `find`
  • Use split to, well, split strings by the "/" character and get the second piece: my $pmid = (split "/",$string,3)[1]
  • Use LWP::Simple for HTTP GET requests (LWP::UserAgent if you need POST), or search CPAN for REST or SOAP API modules to interact with that site
Sorry if my advice was wrong.
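
A minimal sketch of the first two points, using only core modules. The helper names path_piece and xml_files and the sample paths are made up for illustration; the right index depends on where the ID sits in your real paths.

```perl
use strict;
use warnings;
use File::Find;

# Return the piece at position $index after splitting $path on "/".
# An absolute path starts with "/", so its first piece is the empty
# string and every index shifts up by one.
sub path_piece {
    my ($path, $index) = @_;
    return (split m{/}, $path)[$index];
}

# Collect every *.xml file below $dir with File::Find instead of `find`.
# Inside the callback $_ is the bare file name (File::Find has already
# changed into the file's directory); $File::Find::name is the full path.
sub xml_files {
    my ($dir) = @_;
    my @found;
    find( sub { push @found, $File::Find::name if /\.xml\z/ && -f $_ }, $dir );
    my @sorted = sort @found;
    return @sorted;
}

# Hypothetical layout: <collection>/<id>/<file>.xml
print path_piece( 'medline/18507872/article.xml',  1 ), "\n";
print path_piece( '/medline/18507872/article.xml', 2 ), "\n";
```

Both print statements pick out the same piece; the second needs index 2 only because of the leading slash.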

Re^2: ... architecting & implementing help w/ Perl...
by rickkar (Initiate) on Oct 25, 2012 at 18:04 UTC

    something like this...?

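    use Carp;
    use File::Find;
    use REST::Client;

    # $an_url, $base and do_stuff_with_content() still have to be defined
    my $client = REST::Client->new( $an_url );

    File::Find::find( sub {
        return unless m/\.xml$/;

        # File::Find has already chdir'ed into the file's directory,
        # so open $_ (the bare name) and skip the file on failure
        my $fh;
        unless ( open( $fh, '<', $_ ) ) {
            carp "Could not open $File::Find::name!";
            return;
        }

        my $doi;
        while ( <$fh> ) {
            next unless ( $doi ) = m{[^/]*/([^/]*)};
            $client->GET( join( '/', $base, $doi ) );
            do_stuff_with_content( $client->responseContent );
        }
        close $fh;
    } => '.' );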

      Yes, looks correct.

      You may want to use an XML parser (XML::Twig, for example) to pull data out of the XML files, unless you are completely sure that the layout of the XML will never change.

      You can also use File::Slurp to read the files in fewer lines of code.

      Sorry if my advice was wrong.

      i'm able to refine the problem...

      Statement of the Problem: parse Medline/Pubmed file paths on a Unix system in order to finally pass the PMID from each path to a pmid2doi conversion website < > ... and output companion DOIs…

      (1) parse this link and fetch the pmid; "/xxxxx/xxxxx/xxxxx/xxxxx/xxxxx/UNC00000000000042/00223468/v45i3/S0022346809003820";

      (2) submit a query to the conversion site, fetch the returned content and parse out the DOI value.

      If you simply point your browser to:

      then your browser will display the result in the form: {"pmid":18507872,"doi":"10.1186/gb-2008-9-5-r89"}

      and then you need to parse this JSON format.
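
      One way to do the parsing step, as a rough sketch: decode the body with JSON::PP (in core since Perl 5.14) rather than a regex, so it does not matter in which order the keys arrive. The sub name doi_from_response is made up for illustration, and the body is assumed to look like the sample above.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Decode a response body and return its "doi" value.
# Returns undef if the body is not a JSON object or has no "doi" key.
sub doi_from_response {
    my ($body) = @_;
    my $data = eval { decode_json($body) };   # decode_json dies on bad input
    return undef unless ref $data eq 'HASH';
    return $data->{doi};
}

# Sample body copied from the result shown above
my $body = '{"pmid":18507872,"doi":"10.1186/gb-2008-9-5-r89"}';
my $doi  = doi_from_response($body);
print defined $doi ? "$doi\n" : "no DOI found\n";
```

      Returning undef for anything unexpected lets the caller decide whether to warn and move on or to stop.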

      Examples of how to do that are at:

      #!/usr/local/bin/perl
      use strict;
      use warnings;
      use 5.010;
      use LWP::Simple;

      # Fetch line from <DATA>
      while ( <DATA> ) {
          # PMID is an 8-digit string, surrounded by "/" and "/"
          # (capture in list context: "my $pmid = $1 if ..." has undefined behaviour)
          my ($pmid) = m{/(\d{8})/} or next;

          # Query pmid in
          my $ret = get("$pmid");
          unless (defined $ret) {
              # LWP::Simple's get() only returns undef on failure; $! is not meaningful here
              warn "Failed to get doi for '$pmid'\n";
              next;
          }

          # Parse query result, which would be like:
          # {"pmid":18507872,"doi":"10.1186/gb-2008-9-5-r89"}
          if ( $ret =~ /"doi":"(.*?)"}/ ) {
              my $doi = $1;
              # Output
              say $pmid, "\t=>\t", $doi;
          }
          else {
              say "doi not found in '$ret'";
          }
      }
      exit 0;

      __DATA__
      /xxxxx/xxxxx/xxxxx/xxxxx/xxxxx/UNC00000000000042/00223468/v45i3/S0022346809003820

      i'd appreciate any critiques/insights -- thx!
