
String Manipulation

by LeeC79 (Acolyte)
on Aug 27, 2003 at 17:23 UTC

LeeC79 has asked for the wisdom of the Perl Monks concerning the following question:

I want to replace the spaces, i.e. " ", in a string with dashes, "-". I'm sure this is extremely easy; one function should do it. I'm new to Perl, so please be patient.

Re: String Manipulation
by jdtoronto (Prior) on Aug 27, 2003 at 17:26 UTC
      Better is $string =~ tr/ /-/; Don't use substitution when transliteration is applicable.

      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

        Better is $string =~ tr/ /-/; Don't use substitution when transliteration is applicable.

        Why? I often hear this advice and it usually stems from the fiction that "tr/// is always faster than s///".

        A better rule, IMHO, is to use the tool that fits best. In this case, both fit equally well. I personally prefer s/ /-/g because it will be recognized more widely.

        If I felt that the requirement was likely to become something like "change ' ' to '-' and tab to '_'", then I might start with tr/ /-/ in expectation of changing it to something like tr/ \t/-_/ (which could be done with s/// but not so cleanly). While if I felt that the requirement was likely to become something like "change whitespace to '-'", then I'd start with s/ /-/g in expectation of changing it to something like s/\s+/-/g (which could be done with tr/// but not so cleanly).
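To make those two growth paths concrete, here is a small sketch. The sample string and variable names are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample input: single spaces, a double space, and a tab.
my $input = "a b  c\td";

# Exact one-for-one replacement: both operators give the same result.
(my $with_tr = $input) =~ tr/ /-/;
(my $with_s  = $input) =~ s/ /-/g;

# Requirement grows to "space -> '-', tab -> '_'": tr/// maps char to char.
(my $mapped = $input) =~ tr/ \t/-_/;

# Requirement grows to "any run of whitespace -> one '-'": s/// with \s+.
(my $collapsed = $input) =~ s/\s+/-/g;

print "$with_tr\n$with_s\n$mapped\n$collapsed\n";
```

The first two results are identical ("a-b--c" followed by a tab and "d"), the third is "a-b--c_d", and the fourth is "a-b-c-d".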

        In the very rare case where the performance difference between the two matters, which to use depends on your input. Benchmarking with one 10kB string I get:

                 Rate  1tr  0tr   0s   1s
        1tr 35435/s   --  -1% -27% -30%
        0tr 35863/s   1%   -- -26% -29%
        0s  48562/s  37%  35%   --  -4%
        1s  50833/s  43%  42%   5%   --
        [ Note that "0s" and "1s" are identical as are "0tr" and "1tr". I usually include such so that runs of each case are interleaved so I get an idea how much variability there is between runs vs. real differences in performance. ]

        With a different 10kB string I get:

                 Rate   0s   1s  0tr  1tr
        0s  20623/s   --  -2% -38% -38%
        1s  20993/s   2%   -- -37% -37%
        0tr 33175/s  61%  58%   --  -1%
        1tr 33522/s  63%  60%   1%   --
        Note that in both cases, the speed difference between s/// and tr/// is only a few microseconds on a 10kB string, so this is extremely unlikely to matter either way for the vast majority of uses.
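A comparison chart of this shape can be produced with the core Benchmark module. The following is only a sketch of how such a run might be set up, not the code behind the numbers above: the test string is an assumption, and as noted, the proportion of spaces in the input changes which operator wins.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input of roughly 10 kB: 2048 four-letter words joined by
# single spaces (2048*4 + 2047 = 10239 bytes). Real data will differ.
my $text = join ' ', ('word') x 2048;

# Each case works on a fresh copy so every iteration does the same work.
# The duplicated cases ("0"/"1") show run-to-run variability.
my %cases = (
    '0s'  => sub { (my $copy = $text) =~ s/ /-/g; $copy },
    '1s'  => sub { (my $copy = $text) =~ s/ /-/g; $copy },
    '0tr' => sub { (my $copy = $text) =~ tr/ /-/; $copy },
    '1tr' => sub { (my $copy = $text) =~ tr/ /-/; $copy },
);

# A negative count means "run each case for at least that many CPU seconds".
cmpthese(-1, \%cases);
```

The copy is included in the timing for both operators, so it dilutes the relative difference somewhat but does not favor either side.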

                        - tye
        Remember, I'm new. Be gentle. I'm simply trying to create a log file that will have the current date and timestamp as the filename. But I can't figure out what is wrong with my code. The only thing I can think of is that the filename I'm trying to use is too long.
        #!/usr/local/bin/perl -w

        use strict;

        my $localtime = scalar localtime;
        my $tmp = ".txt";
        my $logfile = $localtime.$tmp;

        $logfile =~ tr/ /-/;

        open( OUTFILE, ">$logfile" );
        print OUTFILE "Hello";
        close(OUTFILE);
        And this is the error I get when trying to run:
        print() on closed filehandle OUTFILE at line 12.
        Any hints?
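One likely culprit, offered as a guess rather than a diagnosis: the open() call is failing and the script never checks its return value, so print ends up writing to a filehandle that was never opened. The string from scalar localtime looks like "Wed Aug 27 17:23:00 2003", and the colons in it are not allowed in filenames on some systems (Windows, for example). The sketch below checks open() and reports $!, and builds the name with POSIX::strftime so it contains neither spaces nor colons. The sub name and the timestamp format are illustrative choices, not something from the thread.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a name such as "2003-08-27_17-23-00.txt": no spaces, no colons.
# Accepts a localtime-style list for testing; defaults to the current time.
sub log_filename {
    my @when = @_ ? @_ : localtime;
    return strftime('%Y-%m-%d_%H-%M-%S', @when) . '.txt';
}

my $logfile = log_filename();

# Always check open(); $! holds the operating system's reason for a failure.
open(my $out, '>', $logfile)
    or die "Cannot open '$logfile' for writing: $!";
print {$out} "Hello\n";
close($out) or die "Cannot close '$logfile': $!";
```

Adding only the "or die" clause to the original open() line would already show the real reason for the failure, whatever it turns out to be.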
Re: String Manipulation
by Theo (Priest) on Aug 28, 2003 at 23:58 UTC
    I'm pretty new to perl too, so I don't understand everything about this command line example, but it seems like an easy way to solve your replacement task.
    perl -pi.bak 's/ /-/' file.out


      I like the s/ /-/, but some of the surrounding syntax needs work. I'll present my version(s) first, then explain things.
      perl -pe 's/ /-/;' > file.out
      or
      perl -pi.bak -e 's/ /-/;' file.out
      • All one-liners need the -e flag - that's what tells Perl that you are including the program on the command line. The -e needs to be right next to the code, with a space being optional. See perlrun if my silly explanation isn't sufficient.
      • I included the semi-colon as a good habit. Some one-liners can have more than one command in them, and the semi-colon can only be omitted from the last command. So I always include it.
      • In my first version I replaced the -i.bak with shell redirection. I just don't like in-place editing, no matter how backups are done. (If you erase the data in your file while in-place editing, and you just re-execute the one-liner, you are "editing" an empty file and overwriting your file.bak with emptiness. Result: you clobber all of your data, and you had better hope your backups are quite recent.)
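For readers new to the flags, this is roughly what -p does when written out as an ordinary script: it wraps the -e code in a loop that reads each line and prints it afterwards. Note also that s/ /-/ without /g changes only the first space on each line, so the sketch uses /g to match the original question. The sub name and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Rough script equivalent of:  perl -pe 's/ /-/g;'
# Reads lines from $in, replaces every space, and writes them to $out.
sub dashify {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/ /-/g;      # /g: all spaces, not just the first one
        print {$out} $line;
    }
    return;
}

# Demonstrate with in-memory filehandles so no real file is touched.
my $input  = "one two three\nfour five\n";
my $output = '';
open(my $in,  '<', \$input)  or die "Cannot open in-memory input: $!";
open(my $out, '>', \$output) or die "Cannot open in-memory output: $!";
dashify($in, $out);
close($out);

print $output;
```

Passing \*STDIN and \*STDOUT instead of the in-memory handles gives the same filter behavior as the shell-redirection version of the one-liner.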

      I think that's it. You might want to check out -l for making all the line endings work nicely. This came up on the SPUG mailing list recently (SPUG: Docs on "-l" wrong?), where I described -l as "Automagically takes care of line endings, so you don't have to think about chomp or \n - high DWIMage factor." That's how it was first explained to me, and it covers the basics.

      We had one-liners come up on the SPUG mailing list in April (SPUG:Best One-Liners and Scripts for UNIX) and the discussion was quite instructive - that's where I started learning about the flags.

      Oh, yeah. One more flag which you need to know: -c will check your code to see if it compiles. Not strictly one-linerish, but it is technically a command line flag. :)

      Perl programming and scheduling in the corporate world, as explained by dragonchild:
      "Uhh ... that'll take me three weeks, broken down as follows: 1 day for coding, the rest for meetings to explain why I only need 1 day for coding."