PerlMonks  

conditional statement in substitution expression

by newbio (Beadle)
on Aug 04, 2009 at 16:31 UTC

newbio has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Example:

Input: **Type_1_deiodinase** , **D1** , metabolizes different forms, **A** , **B** of thyroid hormones to control levels of T3 , the active ligand for **thyroid_hormone_receptors** , **TR**

Output: **Type_1_deiodinase_(D1)** metabolizes different forms, **A_(B)** of thyroid hormones to control levels of T3 , the active ligand for **thyroid_hormone_receptors_(TR)**

If I do the following:  $line =~ s/\*\*([^\*]+)\*\*\s\,\s\*\*([^\*]+)\*\*(\s\,)?/**$1_($2)**/g;

it merges all $1 and $2 in the sentence. However, I want to merge terms only if either of $1 or $2 contains '_'.

That is, in the above sentence, **Type_1_deiodinase_(D1)** and **thyroid_hormone_receptors_(TR)** are OK, while **A_(B)** is not. Is there a way to apply an 'if' condition in the substitution expression above so that I merge only those adjacent terms where at least one contains '_'?

Thanks a lot.

Replies are listed 'Best First'.
Re: conditional statement in substitution expression
by Roy Johnson (Monsignor) on Aug 04, 2009 at 17:16 UTC
    s/\*\*([^\*]+)\*\*\s\,\s\*\*([^\*]+)\*\*(\s\,)?/(grep {index($_,'_')>=0} $1,$2)?"**$1_($2)**":$&/ge;
    This incurs the performance penalty for using $&, so you might prefer
    while(<DATA>) { s/(\*\*([^\*]+)\*\*\s\,\s\*\*([^\*]+)\*\*(\s\,)?)/(grep {index($_,'_')>=0} $2,$3)?"**$2_($3)**":$1/ge; print; }

    Caution: Contents may have been coded under pressure.
Re: conditional statement in substitution expression
by jethro (Monsignor) on Aug 04, 2009 at 16:52 UTC
    You might split the regex into two. In the first regex, change the first ([^\*]+) to ([^\*]*_[^\*]*), so that the first term must contain an underscore. In the second regex, make the same change to the second ([^\*]+) instead. Note that this will also substitute if BOTH substrings contain underscores, but you could use ([^\*_]+) for the other term to prevent that.
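A minimal sketch of the two-regex idea, not jethro's own code. The sub name `merge_two_pass` is hypothetical, and the sketch makes one assumption of its own: the second pass uses `[^*_]+` for its first term so that it only handles pairs the first pass did not touch (pairs where both terms contain an underscore are merged by the first pass).

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the two-pass approach.
sub merge_two_pass {
    my ($line) = @_;

    # Pass 1: the first term must contain an underscore.
    $line =~ s/\*\*([^*]*_[^*]*)\*\*\s,\s\*\*([^*]+)\*\*(?:\s,)?/**${1}_($2)**/g;

    # Pass 2: the first term has no underscore, the second term must have one.
    $line =~ s/\*\*([^*_]+)\*\*\s,\s\*\*([^*]*_[^*]*)\*\*(?:\s,)?/**${1}_($2)**/g;

    return $line;
}

print merge_two_pass('**foo_bar** , **FB** and **A** , **B**'), "\n";
# prints: **foo_bar_(FB)** and **A** , **B**
```

Writing `${1}_` rather than `$1_` is only for readability; the merged term is the same either way.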
Re: conditional statement in substitution expression
by jwkrahn (Abbot) on Aug 04, 2009 at 17:04 UTC
    $line =~ s/\Q**\E(([^*]+)\Q**\E\s,\s\Q**\E([^*]+))\Q**\E(\s,)?/ "$2$3" =~ tr!_!! ? "**$2_($3)**" : "**$1**" /eg;
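For anyone who wants to try the /e approach from the replies on the sample line, here is a commented sketch using the /x flag. The sub names `merge_pair` and `merge_terms` are hypothetical and not taken from any reply. The whole match is captured so that it can be put back unchanged when neither term contains an underscore, which also avoids $&.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how to rewrite one matched pair of terms.
sub merge_pair {
    my ($whole, $first, $second) = @_;
    # Merge only when at least one of the two terms contains an underscore.
    return "$first$second" =~ /_/ ? "**${first}_($second)**" : $whole;
}

sub merge_terms {
    my ($line) = @_;
    $line =~ s{
        (                       # whole match, reused when no merge happens
            \*\* ([^*]+) \*\*   # first term
            \s,\s
            \*\* ([^*]+) \*\*   # second term
            (?:\s,)?            # optional trailing comma
        )
    }{ merge_pair($1, $2, $3) }gex;
    return $line;
}

my $line = '**Type_1_deiodinase** , **D1** , metabolizes different forms, **A** , **B** of thyroid hormones to control levels of T3 , the active ligand for **thyroid_hormone_receptors** , **TR**';
print merge_terms($line), "\n";
```

On the sample input this leaves **A** , **B** untouched and merges the other two pairs, matching the output the question asks for.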
