PerlMonks  

modifying a string in place

by davidj (Priest)
on Jan 19, 2006 at 18:11 UTC ( [id://524306]=perlquestion )

davidj has asked for the wisdom of the Perl Monks concerning the following question:

My fellow monks,
I have an interesting text processing task before me (not homework). What I need to do is open a file, skip the first 4 lines, then on all the remaining lines duplicate each character except for the '^' and '#' characters, and rewrite the file.

On an input file of:
    andromeda:davidj perl_test > cat f.txt
    ^this^
    ^is^
    ^a^
    ^test^
    ^david#jenkins^
    ^ cinea#jenkins ^
the output should be:
    andromeda:davidj perl_test > cat out.txt
    ^this^
    ^is^
    ^a^
    ^test^
    ^ddaavviidd#jjeennkkiinnss^
    ^ cciinneeaa#jjeennkkiinnss ^
I currently have the following code which works perfectly well:
    #!/usr/bin/perl
    use strict;

    open(FILE, "<f.txt");
    open(OUT, ">out.txt");
    while(<FILE>) {
        my $str = "";
        chomp $_;
        if( 1 .. 4 ) {
            print OUT "$_\n";
            next;
        }
        while( $_ =~ m/(.)/g ) {
            if( $1 =~ m/(\^|\#)/ ) {
                $str .= "$1";
            }
            else {
                $str .= "$1$1";
            }
        }
        print "$str\n";
        print OUT "$str\n";
    }
    close(FILE);
    close(OUT);
I didn't like the idea of creating a temporary string, so I have the following which modifies the text as it is processing it, and also works perfectly well:
    #!/usr/bin/perl
    use strict;

    open(FILE, "<f.txt");
    open(OUT, ">out.txt");
    while(<FILE>) {
        chomp $_;
        if( 1 .. 4 ) {
            print OUT "$_\n";
            next;
        }
        for( my $i = 0; $i < length($_); $i++ ) {
            if( substr($_, $i, 1) =~ m/(\^|\#)/ ) {
                substr($_, $i, 1) = "$1";
            }
            elsif( substr($_, $i, 1) =~ m/(.)/ ) {
                substr($_, $i, 1) = "$1$1";
                $i++;
            }
        }
        print OUT "$_\n";
    }
    close(FILE);
    close(OUT);
I don't like this solution because it breaks the cardinal rule of not modifying a for loop counter inside the loop. (Not that I'm any kind of coding purist, mind you :)

Benchmarking the solutions indicates that (not surprisingly) using a temporary string is quicker. The following results are on 250000 iterations of a file with 1750 lines, each line no more than 50 characters.
    andromeda:davidj perl_test > perl test.pl
                  Rate 2nd string   In place
    2nd string 28969/s         --       -17%
    In place   35112/s        21%         --
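A comparison of this kind can be set up with the core Benchmark module. What follows is only a sketch, not the test.pl that produced the table above: the sub names, the single sample line, and the one-second run time are all assumptions made for illustration. The in-place variant uses a while loop with 4-argument substr, which also sidesteps changing a for loop counter inside the loop.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sample = '^david#jenkins^';    # assumed test line, not the real data

# Hypothetical helper: builds a second string character by character.
sub second_string {
    my ($in) = @_;
    my $str = '';
    while ( $in =~ m/(.)/g ) {
        $str .= ( $1 eq '^' || $1 eq '#' ) ? $1 : "$1$1";
    }
    return $str;
}

# Hypothetical helper: rewrites its own copy of the string in place
# using 4-argument substr (offset, length, replacement).
sub in_place {
    my ($s) = @_;
    my $i = 0;
    while ( $i < length $s ) {
        my $c = substr( $s, $i, 1 );
        if ( $c eq '^' || $c eq '#' ) {
            $i++;
        }
        else {
            substr( $s, $i, 1, "$c$c" );
            $i += 2;    # step over both copies of the character
        }
    }
    return $s;
}

# Run each candidate for at least one CPU second and print a rate table.
cmpthese( -1, {
    '2nd string' => sub { second_string($sample) },
    'In place'   => sub { in_place($sample) },
} );
```

The rates printed will depend on the machine and the input, so they should not be expected to match the figures above.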
Now to my curiosity: Both of these solutions work and I am satisfied with using either of them. What I'd like to have, purely for the educational value, is a more "Perlish" way of doing this, and/or a more efficient way.

as always thank you for your assistance,

davidj

Replies are listed 'Best First'.
Re: modifying a string in place
by Roy Johnson (Monsignor) on Jan 19, 2006 at 18:15 UTC
    while (<>) { s/([^^#\n])/$1$1/g if ($. > 4); print; }
    or just perl -pe 's/([^^#\n])/$1$1/g if ($. > 4)' file > newfile

    Caution: Contents may have been coded under pressure.
Re: modifying a string in place
by japhy (Canon) on Jan 19, 2006 at 18:16 UTC
    This substitution, s/([^^#])/$1$1/g, will do what you want. I'd expect it to be the fastest. It replaces any non-^ non-# character with itself twice.
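A minimal sketch of the substitution applied to the sample lines from the question. The sub name double_chars, the in-memory line list, and the counter $n (standing in for $.) are assumptions made for illustration. The character class here also excludes \n, so a line that has not been chomped keeps a single trailing newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): doubles every character
# except '^' and '#', leaving any newline untouched.
sub double_chars {
    my ($line) = @_;
    # [^^#\n] matches any character that is not '^', '#', or a newline.
    ( my $copy = $line ) =~ s/([^^#\n])/$1$1/g;
    return $copy;
}

# Sample input taken from the question.
my @lines = ( "^this^\n", "^is^\n", "^a^\n", "^test^\n",
              "^david#jenkins^\n", "^ cinea#jenkins ^\n" );

my $n = 0;
for my $line (@lines) {
    $n++;    # stands in for $. when reading from a real filehandle
    # The first four lines pass through unchanged.
    print $n > 4 ? double_chars($line) : $line;
}
```

Note that a space is an ordinary character to this pattern, so any spaces after line 4 are doubled as well.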

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: modifying a string in place
by graff (Chancellor) on Jan 20, 2006 at 04:12 UTC
    Thank you for demonstrating this use of the range operator:
        for ( @whatever ) {
            if ( 1 .. 4 ) {
                # enter this block during the first four iterations, then never again
            }
        }
    It's documented in the perlop man page, but I had never seen it before, and when I first looked at your post, I thought "that can't be right -- how could that possibly work". But once I ran it myself, and studied the man page carefully, I saw the beauty of it, and I'm grateful for that.

    Update: Having said that, it seems I'm still missing something. I stepped through the OP code with "perl -d", and sure enough, the "if ( 1 .. 4 )" worked as the OP says it should: the "if" block is entered on the first four iterations, then the "else" block is entered on the remaining iterations.

    But when I tried the simplest possible snippet to do the same basic thing, it didn't work that way:

        $_ = 0;
        while ($_<6) {
            $_++;
            if ( 1 .. 4 ) {
                print "$_: True\n";
                next;
            }
            print "$_: False\n";
        }
        __OUTPUT__
        1: False
        2: False
        3: False
        4: False
        5: False
        6: False
    When I study the man page again, this is sort of what I should have expected (but I think I should have gotten at least one "True" output):
    In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. It doesn't become false till the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once.
    I get even more puzzled when I try this variation, which should evaluate to false on the first iteration (but doesn't -- and it doesn't flip-flop either):
        $_ = 0;
        while ($_<6) {
            $_++;
            if ( 0 .. 3 ) {
                print "$_: True\n";
                next;
            }
            else {
                print "$_: False\n";
            }
        }
        __OUTPUT__
        1: True
        2: True
        3: True
        4: True
        5: True
        6: True
    What am I doing wrong here?

      The constant form of the flip-flop is only ever compared against $. (i.e., the line number of the file currently being read), not against $_ or a loop counter. Hence the first loop below produces the expected output, but the second does not.

          #! perl -slw
          use strict;

          while( <DATA> ) {
              print if 1..4;
          }

          for ( 1 .. 8 ) {
              print if 1..4;
          }

          __DATA__
          line 1
          line 2
          line 3
          line 4
          line 5
          line 6
          line 7
          line 8
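To get the first-four-iterations behaviour in a loop that is not reading from a filehandle, the operands can be written as explicit comparisons instead of constants, since only constant operands are implicitly compared against $. in scalar context. A sketch, where the sub name range_flags and the counter $i are illustrative assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: returns "n: True" or "n: False" for n = 1 .. $max,
# using a flip-flop whose operands are explicit comparisons.
sub range_flags {
    my ($max) = @_;
    my @out;
    for my $i ( 1 .. $max ) {
        # Non-constant operands are evaluated as ordinary booleans,
        # so $. plays no part in this test.
        if ( ( $i == 1 ) .. ( $i == 4 ) ) {
            push @out, "$i: True";
        }
        else {
            push @out, "$i: False";
        }
    }
    return @out;
}

print "$_\n" for range_flags(6);
```

With a limit of 6 this reports True for 1 through 4 and False for 5 and 6, which is the behaviour the earlier snippets were aiming for.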

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Node Type: perlquestion [id://524306]
Approved by Roy Johnson