PerlMonks  

a farewell to chop

by John M. Dlugosz (Monsignor)
on Sep 11, 2002 at 15:14 UTC (#196971=perlmeditation)

I read that chop is going away in Perl 6, because chomp is almost always what is wanted.

Almost? I just came across a counterexample in an old script. It splits a line and then removes the last character from one of the resulting values.

Well, chop is so simple and presumably fast, that it seems awkward to do without it. I suppose that substr is the general case, but I would have to look up the arguments in the docs, to tell it to locate from index -1 through the end, and replace with nothing. Replacing a trivial, easy-to-understand, and very efficient function with one that's rarely used (so I have to look it up) just doesn't sit well with me.

Obviously, I'd define my own sub chop that does this, just to keep the point of usage self-documenting.
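
A minimal sketch of what such a user-defined replacement might look like in Perl 5. The name chop_last is hypothetical, chosen only to avoid clashing with the builtin, and the empty-string guard is an assumption about the desired behaviour:

```perl
use strict;
use warnings;

# Hypothetical helper (not a core function): removes the last character of
# its argument in place and returns that character, much like builtin chop.
sub chop_last {
    # Guard: 4-arg substr with offset -1 would die on an empty string.
    return '' unless defined $_[0] && length $_[0];
    # $_[0] aliases the caller's variable, so the edit is visible outside.
    return substr($_[0], -1, 1, '');
}

my $field   = 'value;';
my $removed = chop_last($field);
print "$field [$removed]\n";
```

The point of use stays self-documenting, at the cost of a function call for a one-character edit.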

But, if people are going to do this, what's the point of removing it from the language? Maybe the string "class" can have members for friendly-named common things to do, even if they could all be expressed with regexes or substr calls. Perhaps $s.chop() because we already know what it means, and more generally $s.chop($length) will efficiently delete the last n items (bytes, chars, glyphs, depending on the same criteria that affect a regex at that point) from the string.

—John

Replies are listed 'Best First'.
use Perl5 'chop' ; [Re: a farewell to chop]
by bronto (Priest) on Sep 11, 2002 at 15:28 UTC

    Good point, brother John

    Uhm... maybe, instead of wiping away some functions, they could be made "optional" and imported via a use, like

    use Perl5 'chop' ;

    I wonder what the Perl6 people think about that...

    Ciao!
    --bronto

    UPDATE I'm just using chop right now after years! I have a string that matches /^\d+[MG]$/i, a disk quota given in Megs or Gigs, and I have to convert it to Kbytes:

    if (defined $quota) {
        my $factor = chop $quota ;
        my $softquota = $quota*$QuotaConversionFactor{uc($factor)} ;
        ...

    Don't let it go away, please... :-(
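
    A self-contained sketch of the same idea, for anyone who wants to try it. The contents of %QuotaConversionFactor are an assumption (Kbytes per unit); the original post does not show them:

    ```perl
    use strict;
    use warnings;

    # Assumed table: Kbytes per unit, keyed by the upper-cased suffix.
    my %QuotaConversionFactor = (M => 1024, G => 1024 * 1024);

    my $quota = '20g';
    my $softquota;
    if (defined $quota && $quota =~ /^\d+[MG]$/i) {
        # chop strips the unit suffix and returns it in one step.
        my $factor = chop $quota;
        $softquota = $quota * $QuotaConversionFactor{ uc $factor };
    }
    print "$softquota\n";
    ```

    Here chop's return value does real work: one call both yields the unit and leaves the bare number behind.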

    --
    # Another Perl edition of a song:
    # The End, by The Beatles
    END {
      $you->take($love) eq $you->made($love) ;
    }

      For things that are no-longer built into the core, certainly a standard module can supply useful implementations of them.

      I like the idea of calling it Perl5 or somesuch, and importing just the ones I need.

Re: a farewell to chop
by Aristotle (Chancellor) on Sep 11, 2002 at 15:29 UTC

    Personally, I haven't ever used chop except in the occasional obfu or golf.. having to look up the substr arguments wouldn't be any noteworthy pain since I just don't need chop that often. Actually, I need substr fairly frequently, so I wouldn't have to look it up at all. :-)

    I suppose a few people might miss chop, but I won't be one of them. Just like some people have less use for substr than I do. I do think it is one of those things you might just as well keep because we've always had it as ditch because it's hardly much use.

    So the question is, would it hurt to keep around? I think one argument in favour of ditching is that it's one less thing to advise beginners about.

    I don't know. I see your point that this one specific operation will be a lot more awkward afterwards, and I concede that. What I'm not sure about is the significance of that argument. Count me as undecided, with a tendency towards ditching.

    Makeshifts last the longest.

Re: a farewell to chop (good)
by Ovid (Cardinal) on Sep 11, 2002 at 22:46 UTC

    It's about the risk/reward ratio. Why go to all of the trouble to include a seldom used keyword that introduces so many bugs? If you need chop, there are plenty of ways to duplicate the functionality. Further, if you go to the trouble of duplicating that, it probably means that it's really what you need.

    So many things are being added to Perl, it makes sense to remove items that are seldom used and prone to cause problems. I rarely see an instance of chop that isn't a bug (you should see all of the code reviews I've done on applicants lately!).

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

      That is somewhat troubling to me, actually. Most times when I see chop being misused, it's on irc, glancing over people's pre-made web scripts that confused chop and chomp. But knowing that chop is used incorrectly by professional applicants (hopefully just in typographical errors?), is mildly discomforting in comparison. I believe that I've read before that you are hiring for positions heavily immersed in Perl, and this seems to me like something most people with a moderate amount of Perl in them would notice (again, forgiving typos).
      It seems silly to me to remove it. It does exactly what it says. Semantically, renaming chomp to trim() would be a better idea, though I think everything is fine the way it is.

      -Lee

      "To be civilized is to deny one's nature."
Re: a farewell to chop
by Molt (Chaplain) on Sep 11, 2002 at 15:30 UTC

    I can well understand them getting rid of chop, although I do think the main problem is having it named so similarly to chomp.

    When I was beginning Perl I came across chop and chomp, and whilst I could remember the fact that one wasn't fussy what it removed and one only removed end-of-line characters it took me a remarkable amount of time to learn which was which. Caused some nasty bugs, too.

    Since I learnt the names properly I don't think I've ever touched chop for anything. So far you've only said you've been able to find one example, and people can do it with substr's (substr ($foo,-1) = '', admittedly messy), or a simple regexp $foo =~ s/.$//; which to me is perfectly readable. I really don't see the advantage of keeping chop paying off against the risk of having the confusing (and easily mis-typed) chomp/chop pair.

    I also can't see many people are going to go and write their own version of chop, to be honest. It's a simple enough thing to 'just do' and the function call imposes a much higher overhead than the operation itself.

      or a simple regexp $foo =~ s/.$//; which to me is perfectly readable

      ...but will not handle multi-byte characters. chop will.

      ~Particle *accelerates*

        Which, fortunately, is why we've got the marvellous \X sequence: s/\X$// will do what you want.

        --
        Tommy
        Too stupid to live.
        Too stubborn to die.

        Doesn't the dot handle multi-byte UTF-8 characters when the string is of the character persuasion and "use utf8" is in scope?

        Update Perhaps you meant multiple codepoints used to "compose" one glyph, rather than multiple bytes to form one codepoint. The former is what \X does. Perl5 regex only does the latter; Perl6 is said to do the former too (u0, u1, and u2 levels if memory serves).

        ...but will not handle multi-byte characters.

        It will in Perl 6, and already does under the utf8 pragma in Perl 5.6+. Besides, as a Perl 5 regex it doesn't make sense, because $ matches before a trailing \n.
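
        A small sketch of that difference in Perl 5. The variable names are illustrative only, and the s/.\z//s variant is one possible closer match to chop, not something proposed in the thread:

        ```perl
        use strict;
        use warnings;

        my $line = "abc\n";

        # '.' does not match "\n" and '$' matches before a trailing newline,
        # so this removes the "c" and leaves the newline in place.
        (my $by_regex = $line) =~ s/.$//;

        # chop always removes the final character, here the newline.
        my $by_chop = $line;
        chop $by_chop;

        # With /s, '.' matches "\n", and \z anchors at the true end of string.
        (my $by_z = $line) =~ s/.\z//s;

        printf "s/.\$//    keeps newline: %s\n", $by_regex eq "ab\n" ? 'yes' : 'no';
        printf "chop      gives 'abc':   %s\n", $by_chop  eq 'abc'  ? 'yes' : 'no';
        printf "s/.\\z//s  gives 'abc':   %s\n", $by_z     eq 'abc'  ? 'yes' : 'no';
        ```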

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

Re: a farewell to chop
by cephas (Pilgrim) on Sep 11, 2002 at 16:05 UTC
    So far, from a brief look through stuff here at work, I've counted 43 occurrences of chop(). I think we'll miss it a bit.

    I'd be happy if they would simply rename chop(). That way we solve the problems of confusing beginners, but keep the functionality that seems to be useful to some of us. Maybe it could be renamed cut() at the risk of confusing any shell scripters out there.

    cephas
      I like truncate, and it could work for arrays and strings. Hmm, isn't that what pop does for an array? Just allow pop to work on a string, too.

      How about calling it hack? :-)

      Makeshifts last the longest.

      Ah, but there is already a perl6 regular expression assertion named <cut>, which does something different. Having a prefix and an assertion that are named the same yet do different things would lead to even more confusion. Perhaps trim()?
        Trim() has a well-understood meaning in other languages and I think it would be misleading to redefine it for Perl. How about scalp()?

        --
        May the Source be with you.

        You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

        What Solo said.. or maybe bite? Or nibble? :-)

        Makeshifts last the longest.

Make things a *little* easier
by charnos (Friar) on Sep 11, 2002 at 20:37 UTC
    Mostly, I agree with your point of view. chop isn't always needed, but killing off a function after years of use (especially one that presumably takes up so little space in the scheme of things) when it is still somewhat relevant and has no exact replacement makes little sense to me. However, regarding this statement: "...but I would have to look up the arguments in the docs, to tell it to locate from index -1 through the end, and replace with nothing", I believe this can be accomplished a little more simply (at least to me), like this:

    $str = 'abcde';
    $str = substr $str, 0, (length($str) - 1);
    print $str;
    Which effectively truncates the last character, as chop() does. To me, that makes almost as much sense as chop does though, so beyond golf or obfuscation, I personally won't miss it much (I don't have that many legacy scripts that need to hack off one char). A chop() method in the string class with variable length would be nice as well :)
    -Marc
      Or, how about this, using lvalue substr:
      $str = 'abcde';
      substr($str, -1) = '';
      print $str, $/;
      Update: just in case anyone was wondering (as Solo did in a /msg), this is real Perl 5 code. I believe lvalue substr was added in some patchlevel release of Perl 5.005.

      Update^2: the advantage this has over the 4-arg substr is that you don't need to put the number of chars to replace in the string (which would always be the absolute value of the negative offset in situations like this); it just replaces everything up through the end of the string.

        the advantage this has over the 4-arg substr is that you don't need to put the number of chars to replace in the string (which would always be the absolute value of the negative offset in situations like this);

        I noticed that. If you leave off the argument, it automatically takes the rest of the chars to the end. If you want to give a 4th argument, you can't leave off the 3rd.

        But, shouldn't passing undef for the 3rd argument mean the same thing as leaving it off? That's what I would expect.

      Copy all except the last vs remove the last; the latter could be much more efficient for a large string.
      substr ($str, -1, 1, '');
      is what I was thinking.
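
      For comparison, a short sketch putting the three substr forms from this subthread side by side. The variable names are illustrative, and the strings are assumed non-empty, since an offset of -1 on an empty string falls outside the string:

      ```perl
      use strict;
      use warnings;

      my ($copy, $lvalue, $four_arg) = ('abcde') x 3;

      # Copy everything except the last character back into the variable.
      $copy = substr $copy, 0, length($copy) - 1;

      # Lvalue substr: assign '' to the last character, editing in place.
      substr($lvalue, -1) = '';

      # 4-arg substr: replace in place and get the removed character back.
      my $removed = substr($four_arg, -1, 1, '');

      print "$copy $lvalue $four_arg $removed\n";
      ```

      Only the 4-arg form also hands back the removed character, which is the part of chop's behaviour the other two drop.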
Re: a farewell to chop
by blakem (Monsignor) on Sep 11, 2002 at 22:04 UTC
    The last time I used chop() was to solve a word puzzle. Find pairs of words that can be turned into one another by slicing in "half" and swapping the pieces... i.e.
    vessel becomes selves:
      vessel => SLICE => 'ves' | 'sel'
                SWAP  => 'sel' | 'ves'
                NEW   => 'selves'
    loyal becomes alloy:
      loyal  => SLICE => 'loy' | 'al'
                SWAP  => 'al' | 'loy'
                NEW   => 'alloy'
    The code that looped through the transformations looked something like:
    $word = chop($word) . $word;
    I can't say that I'll miss chop() but if it was up to me, I wouldn't remove it.
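
    A minimal sketch of how that one-liner could drive the loop. The surrounding loop and the @rotations array are assumptions, since the original puzzle code is not shown:

    ```perl
    use strict;
    use warnings;

    my $word = 'vessel';
    my @rotations;

    # Each pass moves the last letter to the front, producing every
    # "slice and swap" of the word in turn.
    for (1 .. length($word) - 1) {
        $word = chop($word) . $word;
        push @rotations, $word;
    }

    print "@rotations\n";
    ```

    Each rotation can then be looked up in a word list; for 'vessel' the third one is 'selves'.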

    -Blake

      That also shows the similarity between chop on a string and pop on a list.

      Why not provide all the same commands (perhaps as members) that treat a string semantically as a list of characters? Have push, pop, shift, and unshift. substr is like splice.
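
      A rough sketch of how those four operations might look in Perl 5 today, built on substr. The str_* names are hypothetical, and str_pop and str_shift assume a non-empty string:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helpers: each edits its first argument in place,
      # relying on @_ aliasing the caller's variable.
      sub str_pop     { substr($_[0], -1, 1, '') }                  # like chop
      sub str_shift   { substr($_[0],  0, 1, '') }                  # remove first char
      sub str_push    { $_[0] .= $_[1]; length $_[0] }              # append
      sub str_unshift { substr($_[0], 0, 0, $_[1]); length $_[0] }  # prepend

      my $s = 'bcd';
      str_push($s, 'e');        # $s is now 'bcde'
      str_unshift($s, 'a');     # $s is now 'abcde'
      my $last  = str_pop($s);
      my $first = str_shift($s);

      print "$s $first $last\n";
      ```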

      —John

Re: a farewell to chop
by sauoq (Abbot) on Sep 12, 2002 at 06:31 UTC

    I read this earlier and for some reason, I haven't been able to shake it out of my head all day. I've finally settled on why. I would rather see chomp go.

    It seems to me that chop is the more general function in that it'll remove any character. Furthermore, it has a more useful return value. Lastly, chomp could easily be replaced with a nice simple regex s|$/$||; which, in scalar context, would return a value about as useful as chomp's.

    Don't get me wrong. I understand the benefits of chomp (for cross platform code in particular) and I know chomp is ubiquitous in existing code. I don't really think chomp should be removed either.

    It's just that, of the two, I like chop more. It's an old friend that I still fondly remember from the perl4 days before chomp.

    If the standalone chop has to go, I like John's idea of making it a method in the string class. That said, I strongly agree that chop should be kept. I don't think it should be renamed. Without it, Perl will feel just a little less like Perl to me.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: a farewell to chop
by Juerd (Abbot) on Sep 11, 2002 at 22:46 UTC

    I think that for the very rare situations when chop is actually useful, s/.$// will suffice. All sorts of optimizations will make it fast. Something used this little does not need to occupy space in the core namespace (PHP has to be "better" for something).

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      Something used this little does not need to occupy space in the core namespace (PHP has to be "better" for something).

      Well, PHP puts everything it can think of in the core namespace (and has no other) and they didn't get chop right, either.

      ;-)

      — Arien

Re: a farewell to chop
by BrowserUk (Pope) on Sep 13, 2002 at 04:28 UTC

    I've never needed to use chop(), so I'm not really fussed about whether it disappears or not, but this does give me the opportunity to mention my great wish for Perl6.

    Please let me treat my strings as arrays of char!

    Yes, I know I could probably write a module, maybe use overload; (can it handle paired operators such as []?), or just split to an array and then join (or unpack and pack, or substr as rvalue and lvalue) etc., but it would be so useful sometimes to be able to say $string[n]... which, with Perl6's new treatment of sigils, would no longer be ambiguous.

    Then chop (getting smoothly back on topic:) simply becomes $#string-- or whatever $# will be in the new money.



    Well It's better than the Abottoire, but Yorkshire!
Re: a farewell to chop
by helgi (Hermit) on Sep 12, 2002 at 13:21 UTC
    I like using chop to do this sort of thing (I am a bioinformaticist):

    my $dna = 'gagagtatgcgattaatgcatattataaaaagcggcatgacggca';
    for (1..10) {
        my $base = chop($dna);
        print "$base\n";
    }
    Regards,

    Helgi Briem

      You can do better without chop there, though:
      print "$_\n" for reverse split //, substr($dna, -10);
      And if you need $dna to lose the last characters:
      print "$_\n" for reverse split //, substr($dna, -10, 10, '');
      Brackets not necessary here, but added for readability.

      Makeshifts last the longest.
