PerlMonks  

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

by Anonymous Monk
on Nov 16, 2017 at 03:09 UTC ( [id://1203546]=note )


in reply to Strange behaviour of tr function in case the set1 is supplied by a variable

Looks like in tr function a scalar variable is accepted as the first argument, but is not compiled properly into set of characters

:)

You're guessing how tr/// works. You're assuming it works like s/// or m///, but it doesn't: tr/// does not interpolate variables. Read perldoc -f tr for the details.
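A minimal sketch of the difference, assuming the goal is to count characters from a set held in a variable. The variable names ($set, $text) are made up for illustration. The string eval builds the operator at run time, which is the workaround perlop describes; it should only be used with a trusted $set, since the string is compiled as code.

```perl
use strict;
use warnings;

my $set  = 'abc';
my $text = 'aabbcc $set';   # single quotes: the text really contains "$set"

# tr/// does not interpolate: this counts the literal characters $, s, e, t.
my $literal_count = ($text =~ tr/$set//);

# Workaround: let a string eval interpolate $set before tr/// is compiled.
my $interp_count = eval "\$text =~ tr/$set//";
die $@ if $@;

print "literal: $literal_count, interpolated: $interp_count\n";
# prints "literal: 4, interpolated: 6"
```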

Re^2: Strange behaviour of tr function in case the set1 is supplied by a variable
by likbez (Sexton) on Nov 16, 2017 at 04:41 UTC
    You're guessing how tr/// works. You're assuming it works like s/// or m///, but it doesn't: tr/// does not interpolate variables. Read perldoc -f tr for the details.
    Houston, we have a problem ;-)

    First of all, that limits tr's area of applicability.

    Second, it's not that I was guessing; I just (wrongly) extrapolated regex behavior to tr, since people use regexes more often than tr. Funnily enough, searching my old code and its comments makes it clear that I knew about this nuance several years ago (and probably discovered it the hard way, not by reading the documentation ;-). Not now. Completely forgotten. Erased from memory. And that tells you something about Perl's complexity (tr is actually not used that frequently by most programmers, especially for counting characters).

    And that's a real situation that we face with Perl in other areas too (and not only with Perl): Perl exceeds the typical human memory capacity to hold information about the language. That's why we need "crutches" like strict.

    You simply can't remember all the nuances of more than a dozen string-related built-in functions, can you? You probably can (and should) for index/rindex and substr, but that's about it.

    So there are two problems here:

    1. Are / / strings interpreted uniformly in the language, or is there a "gotcha" because they are interpreted differently by tr (essentially as single-quoted strings) and by regexes (as double-quoted strings)?

    2. If so, what is the quality of the warnings about this gotcha? No warning is issued, even if you use strict and warnings. BTW, it looks like $ can be escaped:

    main::(-e:1):   0
      DB<5> $_='\$bba\$'
      DB<6> tr/\$/?/
      DB<7> print $_
    \?bba\?

    Right now there are zero warnings issued with use strict and use warnings enabled. It looks like the idea of using =~ for tr was not so good after all. Regular syntax like tr(set1, set2) would be much better. But it's too late to change that, so now we need warnings to be implemented.
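The debugger session above can be reproduced as a small script. This is only a sketch with made-up variable names; it shows that '$' in the search list is taken literally whether or not it is backslashed.

```perl
use strict;
use warnings;

# Single quotes keep the backslashes, so the string is: \$bba\$
my $escaped = '\$bba\$';

# \$ in the search list matches a literal '$'.
(my $with_backslash = $escaped) =~ tr/\$/?/;

# An unescaped '$' is also taken literally; nothing is interpolated.
(my $without_backslash = $escaped) =~ tr/$/?/;

print "$with_backslash\n";      # \?bba\?
print "$without_backslash\n";   # \?bba\?
```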

      Are / / strings uniformly interpreted in the language?
      // is an abbreviation for m// (be careful of context). But // can be replaced by (almost?) any delimiter, by using m or s or tr.

      Quote-Like-Operators shows 2 interesting examples with tr:

          tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/.

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        // is an abbreviation for m// (be careful of context). But // can be replaced by (almost?) any delimiter, by using m or s or tr.

        You make a very good point. Now I am starting to understand why they put the description of tr, which is actually a function, in this strange place:

        http://perldoc.perl.org/perlop.html#Quote-Like-Operators
        Strings with arbitrary delimiters after tr, m, s, etc. are a special, additional type of literal, each with its own rules. And those rules are different from the rules for single-quoted strings, double-quoted strings or regexes (the three most popular types of literals in Perl).

        For example, the treatment of the backslash in a "tr literal" is different from its treatment in single-quoted strings, which perlop describes as follows:

        "A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated."
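A small sketch of that difference, with illustrative variable names. In a single-quoted string \n stays as two characters, while in a tr list it is an escape for a newline, and \- stands for a literal hyphen rather than a range.

```perl
use strict;
use warnings;

# Single-quoted: backslash and 'n' are kept as two separate characters.
my $single = 'a\nb';
my $len    = length $single;             # 4

# In a tr list, \n is the newline escape.
(my $joined = "a\nb") =~ tr/\n/ /;       # newline becomes a space

# In a tr list, \- is a literal hyphen, not a range operator.
(my $mapped = 'a-z') =~ tr/a\-z/A_Z/;

print "$len|$joined|$mapped\n";          # prints "4|a b|A_Z"
```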

        This means that in Perl there are a dozen or so different types of literals, each with its own idiosyncratic rules. That creates confusion even for long-time Perl users, as they tend to forget the details of constructs they rarely use and extrapolate them from constructs they use more often.

        For example, in my case, I was burned by the fact that "m literals" allow interpolation of variables, but "tr literals" do not. I even created a test case to study this behavior :-)

        In other words, the nature of those "context-dependent literals" (at the level of the lexical scanner they are all literals) is completely defined not by the delimiters they use (which are arbitrary), but by the operator that precedes them. If there is none and the delimiters are slashes, m is assumed.

        This "design decision" (in retrospect it is a design decision, although in reality it was an "absence of design decision" situation ;-) adds unnecessary complexity to the language and several new (and completely unnecessary) types of bugs.

        This "design decision" is also poorly documented, and for typical "possible blunders" (for tr that would be using "[", "$" or "@" without a preceding backslash) there are no warnings.

        The trick of putting the description of tr into http://perldoc.perl.org/perlop.html, which I mentioned before, can now be viewed as an attempt to hide this additional complexity. It might be beneficial to revise the docs along the lines I proposed.

        In reality, in Perl qq, qr, m, s and tr are functions, each of which accepts (and interprets) a specific, unique type of "context-dependent literal" as its argument. q can be interpreted (its final string representation obtained) at compile time, so it is not a function. That's the reality of this pretty unique situation with the language, as I see it.

        Quote-Like-Operators shows 2 interesting examples with tr:
        tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/.
        The second variant looks like a perversion to me. I never thought that this was possible. I thought that the "arbitrary delimiter" is "caught" after the operator and that after that the delimiters should be uniform within the operator ;-).
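Both variants from perlop can be tried directly. The sample strings below are made up for illustration; the tr expressions are the ones quoted above.

```perl
use strict;
use warnings;

# Bracketing delimiters for both lists.
(my $vowels = 'quantum') =~ tr[aeiouy][yuoiea];

# Parentheses around the search list, slashes around the replacement list.
# The hyphen is backslashed so that it is not read as a range.
(my $ops = '1+2-3*4/5') =~ tr(+\-*/)/ABCD/;

print "$vowels\n";   # qeyntem
print "$ops\n";      # 1A2B3C4D5
```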

        And the first is not without problems either: if you "extrapolate" your regex skills to tr, then instead of

        tr[aeiouy][yuoiea]

        you can write the obviously incorrect

        tr/[aeiouy]/[yuoiea]/

        which will still work fine as long as both lists are of equal length, because the brackets are then simply mapped to themselves.
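A sketch of where the regex-style brackets are harmless and where they bite. The sample strings are made up; the behavior for a shorter replacement list (its last character is repeated) is the one documented in perlop.

```perl
use strict;
use warnings;

# Equal-length lists: '[' and ']' map to themselves, so both forms agree.
(my $regex_style = 'yes [maybe]') =~ tr/[aeiouy]/[yuoiea]/;
(my $plain_style = 'yes [maybe]') =~ tr[aeiouy][yuoiea];
print "$regex_style | $plain_style\n";   # aus [myabu] | aus [myabu]

# Unequal lengths: the brackets shift the mapping, and the final ']' of the
# replacement list is repeated for the remaining search characters.
(my $shifted = 'audio') =~ tr/[aeiou]/[AE]/;
print "$shifted\n";                      # A]d]]

# Counting: the brackets are counted as ordinary characters too.
my $text       = 'a[b]c';
my $with_class = ($text =~ tr/[a-z]//);  # 5
my $without    = ($text =~ tr/a-z//);    # 3
print "$with_class vs $without\n";
```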

        I am talking about adding a simple warning if the symbols "[" and "$" in tr set1 or set2 are not backslashed (backslashing them is allowed in the current version of Perl, so this does not break compatibility with old code).

        That's around a dozen lines of C code.

        IMHO adding such a warning makes sense, and this is a constructive suggestion, not a blanket critique of the language, as you assume. Your mileage may vary.
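To make the proposal concrete, here is a rough sketch of such a check written in Perl rather than C. It is not part of Perl or of any module; suspicious_tr_chars is a hypothetical helper that takes the text of a tr list as a plain string and reports unbackslashed "[", "$" and "@" characters.

```perl
use strict;
use warnings;

# Hypothetical lint helper: returns the unescaped '[', '$' and '@'
# characters found in the text of a tr search or replacement list.
sub suspicious_tr_chars {
    my ($list) = @_;

    # Drop every backslash together with the character it escapes,
    # so that \[ , \$ and \\ are not reported.
    (my $bare = $list) =~ s/\\.//gs;

    # In list context, a /g match without captures returns all matches.
    my @found = $bare =~ /[\[\$\@]/g;
    return @found;
}

for my $list ('[a-z]', '$set', '\$\[', 'a-z') {
    my @hits = suspicious_tr_chars($list);
    printf "%-8s %s\n", $list, @hits ? "suspicious: @hits" : 'ok';
}
```

A real warning would have to live in the tokenizer and could not be this simple-minded, but the sketch shows that the check itself is small.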
