PerlMonks  

My fellow monks,
I have an interesting text processing task before me (not homework). What I need to do is open a file, skip the first 4 lines, then on all the remaining lines duplicate each character except for the '^' and '#' characters, and rewrite the file.

On an input file of:
andromeda:davidj perl_test > cat f.txt
^this^
^is^
^a^
^test^
^david#jenkins^
^ cinea#jenkins ^
the output should be:
andromeda:davidj perl_test > cat out.txt
^this^
^is^
^a^
^test^
^ddaavviidd#jjeennkkiinnss^
^ cciinneeaa#jjeennkkiinnss ^
I currently have the following code which works perfectly well:
#!/usr/bin/perl
use strict;

open(FILE, "<f.txt");
open(OUT, ">out.txt");

while(<FILE>) {
    my $str = "";
    chomp $_;
    if( 1 .. 4 ) {
        print OUT "$_\n";
        next;
    }
    while( $_ =~ m/(.)/g ) {
        if( $1 =~ m/(\^|\#)/ ) {
            $str .= "$1";
        }
        else {
            $str .= "$1$1";
        }
    }
    print "$str\n";
    print OUT "$str\n";
}

close(FILE);
close(OUT);
I didn't like the idea of creating a temporary string, so I have the following which modifies the text as it is processing it, and also works perfectly well:
#!/usr/bin/perl
use strict;

open(FILE, "<f.txt");
open(OUT, ">out.txt");

while(<FILE>) {
    chomp $_;
    if( 1 .. 4 ) {
        print OUT "$_\n";
        next;
    }
    for( my $i = 0; $i < length($_); $i++ ) {
        if( substr($_, $i, 1) =~ m/(\^|\#)/ ) {
            substr($_, $i, 1) = "$1";
        }
        elsif( substr($_, $i, 1) =~ m/(.)/ ) {
            substr($_, $i, 1) = "$1$1";
            $i++;
        }
    }
    print OUT "$_\n";
}

close(FILE);
close(OUT);
I don't like this solution because it breaks the cardinal rule of not modifying a for loop counter inside the loop. (Not that I'm any kind of coding purist, mind you :)

Benchmarking the solutions indicates that (not surprisingly) using a temporary string is quicker. The following results are on 250000 iterations of a file with 1750 lines, each line no more than 50 characters.
andromeda:davidj perl_test > perl test.pl
              Rate 2nd string   In place
2nd string 28969/s         --       -17%
In place   35112/s        21%         --
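
The test.pl script itself is not shown in the post, so the following is only a guess at how such a comparison could be set up with the core Benchmark module. The sub names, the sample line, and the iteration count are all assumptions. It times the string work alone, without any file I/O, and makes no claim about which approach comes out ahead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample line taken from the input file above.
my $sample = '^david#jenkins^';

# Approach 1: build the result in a second string.
sub second_string {
    my ($line) = @_;
    my $str = '';
    while ( $line =~ m/(.)/g ) {
        $str .= ( $1 eq '^' || $1 eq '#' ) ? $1 : "$1$1";
    }
    return $str;
}

# Approach 2: grow the string in place with 4-argument substr.
# (It works on a private copy, so every timed call starts fresh.)
sub in_place {
    my ($line) = @_;
    for ( my $i = 0; $i < length $line; $i++ ) {
        my $ch = substr $line, $i, 1;
        next if $ch eq '^' || $ch eq '#';
        substr( $line, $i, 1, "$ch$ch" );
        $i++;    # skip over the copy just inserted
    }
    return $line;
}

# Both approaches must agree before timing them means anything.
die "results differ" unless second_string($sample) eq in_place($sample);

# The iteration count is an arbitrary assumption, not the figure from the post.
cmpthese( 200_000, {
    '2nd string' => sub { second_string($sample) },
    'In place'   => sub { in_place($sample) },
} );
```

Timing one short in-memory line mostly measures per-call overhead, so numbers from a sketch like this may differ a good deal from a run over a whole file.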
Now to my curiosity: Both of these solutions work and I am satisfied with using either of them. What I'd like to have, purely for the educational value, is a more "Perlish" way of doing this, and/or a more efficient way.

As always, thank you for your assistance,

davidj
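
One possibly more "Perlish" take on the task described above is a single global substitution per line, which modifies the string in place and needs neither a temporary string nor a loop counter. This is a minimal sketch, not a tested replacement for either script: the @lines array stands in for the contents of f.txt (an assumption made to keep the example self-contained), and the character class is just one way to spell "anything except '^', '#' and newline".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the contents of f.txt (assumption: one entry per line).
my @lines = (
    "^this^\n",
    "^is^\n",
    "^a^\n",
    "^test^\n",
    "^david#jenkins^\n",
    "^ cinea#jenkins ^\n",
);

# Leave the first four lines alone; on every later line, double each
# character that is not '^', '#', or the trailing newline.
for my $i ( 4 .. $#lines ) {
    $lines[$i] =~ s/([^\^#\n])/$1$1/g;
}

print @lines;
```

Because spaces are not in the excluded set, they are doubled along with the letters. When reading from a real file, the same substitution could be applied inside a `while (<FILE>)` loop guarded by `$. > 4`, which skips the first four lines without a separate counter.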

In reply to modifying a string in place by davidj
