Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Replacing Text At Specific Positions

by sheasbys (Initiate)
on Jun 05, 2007 at 15:31 UTC ( [id://619420]=perlquestion: print w/replies, xml ) Need Help??

sheasbys has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed Monks, I have files where I need to replace values in specific places. The standard line lengths ar 210 characters and I need to change the characters in positions 100 and 149 only. I also need to ignore any line that begins with a 2. When this outputs I need to preserve the lines beginning with a 2 and output each line until completion but I cannot get any "foreach" statement to function correctly. I would rather not load the file in memory as the files will often have up to a million lines each and this definitely takes a hit on CPU performance. Any suggestions, solutions, or better ways of doing this are welcome. Thanks in advance -- Perl Newbie
use File::Copy; my($input_file) = $ARGV[0]; my($output_file) = $ARGV[1]; if ( !defined($input_file) || !defined($output_file) ) { print "Error: usage: ss7jurisdiction input_file output_file\n"; } else { # -----Backup the input files in case of error----- copy( $input_file, $input_file . ".bak" ) or die "Could not backup file $input_file to $input_file.bak: $!\ +n"; # -----Attempt to open all of the files----- open( INFILE, $input_file ) || die( "Could not read input file 1 ( +$input_file): $!" ); open( OUTPUT, "> " . $output_file ) || die( "Could not open output + file ($output_file): $!" ); while (<INFILE>) { my $line = $_; chomp($line); # -----A line starting with a '2' is a header and is left unch +anged if ( $line !~ m/^2/ ) { # -----Locate postions 100 and 149 $juris1 = substr( $line, 99, 1 ); $juris2 = substr( $line, 148, 1 ); $juris1 = "9"; $juris2 = "Z"; # -----Generate the output string----- $output_line = substr( $line, 0, 98 ) . $juris1 . substr( $line, 100, 48 ) . $juris2 . substr( $line, 149, 61 ) . "\n"; print OUTPUT $output_line; last; } } # -----Close all of the files----- close( INFILE ); close( OUTPUT ); }

Replies are listed 'Best First'.
Re: Replacing Text At Specific Positions
by Util (Priest) on Jun 05, 2007 at 16:05 UTC

    Working, tested code:

    use strict; use warnings; my $usage_message = "usage: $0 input_file output_file\n"; die $usage_message if @ARGV != 2; my ( $input_file, $output_file ) = @ARGV; open my $in_fh, '<', $input_file or die "Could not open input file 1 '$input_file': $!"; open my $out_fh, '>', $output_file or die "Could not open output file '$output_file': $!"; while ( my $line = <$in_fh> ) { chomp $line; warn unless length($line) == 210; if ( $line =~ m/^2/ ) { # Do Nothing! # A line starting with a '2' is a # header and is left unchanged. } else { # Change postions 100 and 149 substr( $line, 100-1, 1, '9' ); substr( $line, 149-1, 1, 'Z' ); } print {$out_fh} $line, "\n"; } close $in_fh or warn "Could not close input file 1 '$input_file': $!"; close $out_fh or warn "Could not close output file '$output_file': $!";
    Or use this one-liner:
    perl -wlpe 'if(!/^2/){substr($_,99,1,"9");substr($_,148,1,"Z");}' <in. +dat >out.dat

      Util,

      Thanks so much for the code. As a Newbie this has taught me a lot and I am astounded at the concise code to accomplish what originally started with 100 lines or so on my first convoluted attempt and now it can be distilled into one line.

      Thanks to you and to the other contributers,

      Stephen
Re: Replacing Text At Specific Positions
by ikegami (Patriarch) on Jun 05, 2007 at 15:48 UTC
    $output_line = substr( $line, 0, 98 ) . $juris1 . substr( $line, 100, 48 ) . $juris2 . substr( $line, 149, 61 ) . "\n";

    can be made a bit more readable:

    $output_line = "$line\n"; substr($output_line, 99, 1) = $juris1; substr($output_line, 148, 1) = $juris2;

    or

    $output_line = "$line\n"; substr($output_line, 99, 1, $juris1); substr($output_line, 148, 1, $juris2);

    Otherwise, it looks like the only problem you have is that you're not printing out the lines starting with "2", and you end the loop prematurly using last.

    while (<INFILE>) { my $line = $_; if ( $line !~ m/^2/ ) { substr($line, 99, 1, "9"); substr($line, 148, 1, "Z"); } print OUTPUT $line; }

    Are you having any other problems?

      Or if you're trying to preserve all the lines starting with the line starting with "2", then:
      my $found_2; while (<INFILE>) { my $line = $_; if ( !$found_2 ) { if ( $line ~= m/^2/ ) { $found_2 = 1; } else { substr($line, 99, 1, "9"); substr($line, 148, 1, "Z"); } } print OUTPUT $line; }
Re: Replacing Text At Specific Positions
by shmem (Chancellor) on Jun 05, 2007 at 15:48 UTC
    Reformatting your code a bit reveals
    while (<INFILE>) { my $line = $_; chomp($line); # -----A line starting with a '2' is a header and is l +eft unchanged if ( $line !~ m/^2/ ) { # -----Locate postions 100 and 149 $juris1 = substr( $line, 99, 1 ); $juris2 = substr( $line, 148, 1 ); $juris1 = "9"; $juris2 = "Z"; # -----Generate the output string----- $output_line = substr( $line, 0, 98 ) . $juris1 . substr( $line, 100, 48 ) . $juris2 . substr( $line, 149, 61 ) . "\n"; print OUTPUT $output_line; last; } }

    that there's no 'else' for your 'if'. What are you supposed to do when you hit /^2/, again? if you've found the /^2/ line, just stream through the rest of the file?

    while (<INFILE>) { my $line = $_; chomp($line); # -----A line starting with a '2' is a header and is l +eft unchanged if ( $line !~ m/^2/ ) { # -----Locate postions 100 and 149 $juris1 = substr( $line, 99, 1 ); $juris2 = substr( $line, 148, 1 ); $juris1 = "9"; $juris2 = "Z"; # -----Generate the output string----- $output_line = substr( $line, 0, 98 ) . $juris1 . substr( $line, 100, 48 ) . $juris2 . substr( $line, 149, 61 ) . "\n"; print OUTPUT $output_line; # last; # <-- this last exits the input loop, +no more lines read. # Is that really what you want to do? # exit the loop after the first non - +/^2/ - line? } else { print OUTPUT while <INFILE>; } }

    But your negated 'if' condition and the comment above it confuses me a bit. Shouldn't that be

    # ----- Do as long as line doesn't start with '2'

    or such?

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Replacing Text At Specific Positions
by FunkyMonk (Chancellor) on Jun 05, 2007 at 15:49 UTC
    I'm confused by "I cannot get any "foreach" statement to function correctly". Do you mean you want to replace

    while (<INFILE>) { my $line = $_;

    with foreach? If that is the case, there's no need for foreach, just replace those two lines with:

    while (my $line = <INFILE>) {

    Did you know that substr can take a 4th argument?

    $juris1 = substr( $line, 99, 1 ); $juris2 = substr( $line, 148, 1 ); $juris1 = "9"; $juris2 = "Z"; #or # #substr( $line, 99, 1, '9' ); # 4th argument replaces #substr( $line, 148, 1, 'Z' );# selected sub-string

    It doesn't look like you're using strict and warnings. They are real timesavers and you should look in to using them.

Re: Replacing Text At Specific Positions
by naikonta (Curate) on Jun 05, 2007 at 15:52 UTC
    If understand your problem correctly, I think you can use one-liner to solve it.
    perl -pi.bak -e '/^2/ or substr $_, 99, 1, 9 and substr $_, 148, 1, "Z +"' input_file_name
    I might be wrong about the positions and the replacements but you can adjust that. The original file will be automatically copied to input_file_name.bak.

    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://619420]
Approved by almut
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (5)
As of 2024-04-19 02:34 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found