replace part of a line with part of the following line after if is TRUE

by N311V (Initiate)
on Oct 02, 2013 at 01:23 UTC ( [id://1056562]=perlquestion )

N311V has asked for the wisdom of the Perl Monks concerning the following question:

Hi PerlMonks, I'm having trouble writing a script so I've come here seeking assistance. I need to replace part of a line with part of the following line if a requirement is met. Example of file:

<Hit_num>6</Hit_num>
<Hit_id>gnl|BL_ORD_ID|3665984</Hit_id>
<Hit_def>gi|158703516|gb|ABW77886.1| odorant-bin...
<Hit_accession>3665984</Hit_accession>

I need to replace the "gnl|BL_ORD_ID|3665984" part of the second line with the "gi|158703516|gb|ABW77886.1|" part of the third line. My current script is:

#!/usr/bin/perl -w
use strict;
use warnings;

my $file = "$ARGV[0]";
open IN, ">fixed_$file";
open A, $file;
my $replace = ();
while (<A>) {
    if ($_ =~ /Hit_id/) {
        next;
        $replace =~ m/gi|.|/;
    }
    $_ =~ s/\>.\</\>$replace\</;
    print IN "$_";
}
close IN;
close A;
exit;

I don't think next is doing what I want, nor do I know if my regular expressions are correct. Any help would be appreciated. Thanks.

Replies are listed 'Best First'.
Re: replace part of a line with part of the following line after if is TRUE
by aaron_baugher (Curate) on Oct 02, 2013 at 02:04 UTC

    You're right, some of these lines aren't doing what you think. next immediately ends the current iteration of the while loop and starts the next one at the top, so the statement after it is never executed. Because that statement never runs, it never tells you that $replace is undefined, so what you're trying to do with it won't work.

    In general, when your text is arranged in multi-line chunks, it works best to process it that way rather than line-by-line. You can use the input record separator variable $/ to split your input file on something that ends each chunk. That way you don't have to dance around with saving each line long enough to look at the next line and see if something needs to be done to the previous line before outputting it.

    Here's an example of what I'm talking about; you can adjust it to use your input and output files. It reads in each chunk, looks to see if it has the piece you want to copy, and copies it into the other spot if it's there.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    $/ = '/Hit_accession>';
    while(<DATA>){
        if( m{<Hit_def>(.+?) odorant-bin} ){
            my $match = $1;
            s{(<Hit_id>).+?(</Hit_id>)}{$1$match$2};
        }
        print $_;
    }
    __DATA__
    <Hit_num>6</Hit_num>
    <Hit_id>gnl|BL_ORD_ID|3665984</Hit_id>
    <Hit_def>gi|158703516|gb|ABW77886.1| odorant-bin...
    <Hit_accession>3665984</Hit_accession>
    <Hit_num>7</Hit_num>
    <Hit_id>gnl|BL_ORE_ID|3665984</Hit_id>
    <Hit_def>gi|153326716|gb|ABF88997.2| odorant-bin...
    <Hit_accession>3665984</Hit_accession>
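    The same chunk-at-a-time idea can be wrapped in a subroutine that works on any pair of filehandles. This is only a sketch: the name fix_hit_ids is hypothetical, and it assumes the ID always starts with "gi|" and runs up to the first whitespace in <Hit_def>, instead of relying on the literal text "odorant-bin".

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): for each record read from $in,
# copy the leading "gi|...|" token of <Hit_def> into <Hit_id>, then write
# the record to $out.
sub fix_hit_ids {
    my ($in, $out) = @_;

    # Read one whole record per iteration; a record ends at this closing tag.
    local $/ = '</Hit_accession>';

    while (my $chunk = <$in>) {
        # Assumption: the ID is the first whitespace-free token in Hit_def.
        if ($chunk =~ m{<Hit_def>(gi\|\S+)}) {
            my $id = $1;
            $chunk =~ s{(<Hit_id>).+?(</Hit_id>)}{$1$id$2};
        }
        print {$out} $chunk;
    }
    return;
}

# Demonstration on an in-memory copy of the sample record.
my $sample = <<'END';
<Hit_num>6</Hit_num>
<Hit_id>gnl|BL_ORD_ID|3665984</Hit_id>
<Hit_def>gi|158703516|gb|ABW77886.1| odorant-bin...
<Hit_accession>3665984</Hit_accession>
END

open my $demo_in, '<', \$sample or die "Cannot open in-memory sample: $!";
fix_hit_ids($demo_in, \*STDOUT);
close $demo_in;
```

    To use it on real files, open the input and output handles yourself and pass them in place of $demo_in and \*STDOUT.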

    Aaron B.
    Available for small or large Perl jobs; see my home node.

      @Anonymous Monk, thanks for the links; I'll be sure to study them.

      @Aaron B., thank you, this has solved my problem beautifully.

Re: replace part of a line with part of the following line after if is TRUE
by ww (Archbishop) on Oct 02, 2013 at 03:28 UTC

    I really think we (and perhaps, you) need a clearer definition of what you're trying to do... and what rule establishes that the "requirement is met?"

    If it's a one-off replacement, done once, you can do it manually -- with an editor -- far more readily than with a script, regexen and an ill-specced appeal to the Monks.

    So, judging that what you've shown is one sample of multiple replacements you need to do, give us the rule for replacement: what does the "following line" have to contain (and what is NOT allowed in it) to justify replacing the preceding line?

    A good spec makes it easy for us to help. Posting a poor, incomplete or inaccurate spec just leads to frustration, all around.

Re: replace part of a line with part of the following line after if is TRUE
by Laurent_R (Canon) on Oct 02, 2013 at 06:34 UTC

    Just one additional remark on top of the previous posts.

    open IN, ">fixed_$file";

    Since you are opening this file for output, IN is a very poor choice of identifier.
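    A common way to avoid that kind of confusion is to use lexical filehandles whose names say what they are for, the three-argument form of open, and a check on each open. The following is a minimal sketch; open_in_and_out is a hypothetical helper, and it assumes $file is a bare file name with no directory part, since "fixed_" is simply prepended as in the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: open $file for reading and "fixed_$file" for writing,
# and return both handles under names that say which is which.
sub open_in_and_out {
    my ($file) = @_;
    my $fixed = "fixed_$file";    # assumes $file has no directory component

    # Three-argument open keeps the mode separate from the file name,
    # and "or die" reports a failed open instead of silently continuing.
    open my $in_fh,  '<', $file  or die "Cannot read '$file': $!";
    open my $out_fh, '>', $fixed or die "Cannot write '$fixed': $!";

    return ($in_fh, $out_fh);
}
```

    A caller would then write something like my ($in_fh, $out_fh) = open_in_and_out($ARGV[0]); and print to $out_fh with print {$out_fh} $line;.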

Re: replace part of a line with part of the following line after if is TRUE
by Anonymous Monk on Oct 02, 2013 at 01:52 UTC

    nor do I know if my regular expressions are correct

    Would you like to find out? Here is how:

    use re 'debug';
    $_ = q{<Hit_id>gnl|BL_ORD_ID|3665984</Hit_id>};
    s/\>.\</\>REPLACE\</;
    __END__
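    For comparison, here is a small sketch of two things the debug output points to in the original patterns: an unescaped | means alternation, and a single . matches exactly one character. The variable names are made up for illustration, and the pattern for the ID is an assumption about the data ("gi|" followed by non-space characters, ending in "|").

```perl
use strict;
use warnings;

my $hit_def = '<Hit_def>gi|158703516|gb|ABW77886.1| odorant-bin...';
my $hit_id  = '<Hit_id>gnl|BL_ORD_ID|3665984</Hit_id>';

# Unescaped, each | separates alternatives: "gi", or any one character,
# or the empty string. The match succeeds at the very first character.
my ($loose) = $hit_def =~ m/(gi|.|)/;

# Escaped as \|, the pipes are literal characters.
# Assumption: the ID is "gi|" plus non-space characters, ending in "|".
my ($id) = $hit_def =~ m/(gi\|\S+\|)/;

# ">.<" needs exactly one character between > and <, so it cannot match
# the Hit_id line; ">.+?<" allows one or more characters.
my $single_char_matches = $hit_id =~ />.</ ? 1 : 0;
(my $fixed = $hit_id) =~ s/>.+?</>$id</;

print "loose capture:       '$loose'\n";
print "id capture:          '$id'\n";
print "single-char matches: $single_char_matches\n";
print "fixed line:          $fixed\n";
```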

    I don't think next is doing what I want

    What is that (what do you want)? How did you learn about next?

    See the documentation for next, perlsyn, and the "Loop control" section in Learn Perl in about 2 hours 30 minutes.

    See also the "Loop control" subsection in the "Control Flow" section of Chapter 3 in the free book Modern Perl, "a loose description of how experienced and effective Perl 5 programmers work.... You can learn this too."

    So, to learn about next, try something like this:

    for my $ix (1..10){
        print "ix($ix)\n";
        if( $ix == 3 ){
            print "its threeee\n";
            next;
            print "PANCAKES\n";
        }
        print "its not threee\n";
    }
    __END__

    After trying the above two programs, what did you learn? What are you going to try next?
