PerlMonks  

Re: searching keywords

by haukex (Archbishop)
on Feb 28, 2017 at 08:13 UTC ( [id://1183098]=note )


in reply to searching keywords

Welcome to the Monastery, pdahal. To get started, have a look at the following; the better the information you give us, the faster and better we can help: How do I post a question effectively? and Short, Self-Contained, Correct Example.

But even if the abstract doesn't contain NOV, it lists NOV if there are words like novel.

Since you haven't shown any code or sample input, I can only guess. Are you using a regular expression like /NOV/i? If so, then perhaps adding a "word boundary" anchor \b (see perlretut) will help: /\bNOV\b/i.
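To illustrate the difference, here is a minimal sketch. The sample strings and the two helper subs (matches_plain, matches_bounded) are made up for this example, since the original post did not include any data or code at this point.

```perl
use strict;
use warnings;

# Plain match: also hits words that merely contain "nov", such as "novel".
sub matches_plain   { my ($text) = @_; return $text =~ /NOV/i     ? 1 : 0 }

# Word-boundary match: only hits NOV when it stands as a whole word.
sub matches_bounded { my ($text) = @_; return $text =~ /\bNOV\b/i ? 1 : 0 }

# Hypothetical sample abstracts, not taken from the original question.
my @abstracts = (
    'We describe a novel kinase inhibitor.',
    'Expression of NOV was reduced in treated cells.',
);

for my $text (@abstracts) {
    printf "plain=%d bounded=%d  %s\n",
        matches_plain($text), matches_bounded($text), $text;
}
```

For the first string only the plain pattern matches (because of "novel"); for the second string both patterns match.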

Update: More is not better...

Re^2: searching keywords
by pdahal (Acolyte) on Feb 28, 2017 at 08:19 UTC

    Here I have attached my code.

    use warnings;
    use XML::Simple;
    use LWP::UserAgent;
    use HTTP::Request::Common;
    use URI::Escape;
    use Data::Dumper;
    use Text::CSV;

    my @keywords;
    my $file ="proteinlist.csv";
    my $ua = LWP::UserAgent->new;
    my $csv = Text::CSV->new({ sep_char => ',' });

    #Open result CSV file.
    open(my $fh, ">", "result1.csv");
    print $fh "Pubmed ID, Drug name, Keyword(s) that matches, List of proteins in the abstract\n";

    #Open the CSV file containing list of PubMed IDs
    open(my $data, '<', "pmid.csv");
    while (my $line = <$data>) {
        chomp $line;
        if ($csv->parse($line)) {
            #Skip first line
            next if ($. == 1);
            my @fields = $csv->fields();
            #Replace (-) with (,)
            $fields[0] =~ tr/-/,/;
            $fields[1] =~ tr/-/,/;
            #Split alt name
            my @id = split /[+]/, $fields[1];
            for (my $i = 0; $i < scalar @id; $i++){
                #Initialize http request
                my $args = "db=pubmed&id=$id[$i]&retmode=text&rettype=abstract";
                my $req = new HTTP::Request POST => 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
                $req->content_type('application/x-www-form-urlencoded');
                $req->content($args);
                #Get response
                my $response = $ua->request($req);
                my $content = $response->content;
                $fields[0] =~ tr/,/-/;
                my $keystr = "";
                #open csv file containing the protein list and compare with the content of abstract
                open(my $data, "<", $file) or die "Could not open '$file' $!\n";
                while (my $readinline = <$data>) {
                    chomp $readinline;
                    #initialize the first data of csv as the first keyword
                    my @fields = split "," , $readinline;
                    $keywords[$i] = $fields[0];
                    if (regex(lc $content,lc $keywords[$i]) != -1) {
                        if ($keystr eq ""){
                            $keystr = $keywords[$i];
                        }else{
                            $keystr = $keystr . "+$keywords[$i]";
                        }
                    }
                }
                if ($keystr ne ""){
                    print $fh "$id[$i],$fields[0],$keystr,Yes\n";
                    print "$id[$i],$fields[0],$keystr,Yes\n";
                }else{
                    print $fh "$id[$i],$fields[0],No keyword matches,No\n";
                    print "$id[$i],$fields[0],No keyword matches,No\n";
                }
            }
        }
    }
    close($fh);

      Where is the "regex()" function declared and what does it do?

      if (regex(lc $content,lc $keywords[$i]) != -1) {

      Also, what is the input data to your program?

      Is the XML part and the download part necessary for your problem or could you maybe show just the relevant data and include that data in the program directly?

        The input data to my program is the csv file that contains the list of keywords.
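As a rough sketch of what a reduced, self-contained test case could look like, the following puts the data directly in the program instead of downloading it. The sub matching_keywords, the sample abstract, and the keyword list are all hypothetical; this is not the regex() function from the posted code, whose definition was not shown.

```perl
use strict;
use warnings;

# Return the keywords that occur as whole words in $text (case-insensitive).
# Hypothetical helper, written only for this illustration.
sub matching_keywords {
    my ($text, @keywords) = @_;
    # \Q...\E escapes any regex metacharacters inside the keyword;
    # \b anchors the match at word boundaries on both sides.
    return grep { $text =~ /\b\Q$_\E\b/i } @keywords;
}

# Made-up stand-ins for one downloaded abstract and the protein list.
my $abstract = 'We report a novel role for NOV and TP53 in tumour growth.';
my @keywords = qw(NOV TP53 EGFR);

my @found = matching_keywords($abstract, @keywords);
print @found ? join('+', @found) : 'No keyword matches', "\n";
```

With this sample data the script prints NOV+TP53; "novel" alone does not count as a hit for NOV. One caveat: \b only behaves as expected when the keyword begins and ends with a word character (letter, digit, or underscore), so keywords with leading or trailing punctuation would need a different anchor.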
