in reply to Strange behaviour of tr function in case the set1 is supplied by a variable
Looks like in tr function a scalar variable is accepted as the first argument, but is not compiled properly into a set of characters
:)
you're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess, it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details
Re^2: Strange behaviour of tr function in case the set1 is supplied by a variable
by likbez (Sexton) on Nov 16, 2017 at 04:41 UTC
you're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess, it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details
Houston, we have a problem ;-)
First of all, that limits tr's area of applicability.
Second, it's not that I am guessing; I just (wrongly) extrapolated regex behavior onto tr, as people use regexes more often than tr. Funny, but searching my old code and the comments in it, it is clear that I remembered this nuance several years ago (probably discovered the hard way, not by reading the documentation ;-). Not now. Completely forgotten. Erased from memory. And that tells you something about Perl complexity (actually tr is not that frequently used by most programmers, especially for counting characters).
And that's a real situation that we face with Perl in other areas too (and not only with Perl): Perl exceeds typical human memory capacity to hold the information about the language. That's why we need "crutches" like strict.
You simply can't remember all the nuances of more than a dozen string-related built-in functions, can you? You probably can (and should) for index/rindex and substr, but that's about it.
So there are two problems here:
1. Are / / strings uniformly interpreted in the language, or is there a "gotcha" because they are interpreted differently by tr (essentially as single-quoted strings) and by regex (as double-quoted strings)?
2. If so, what is the quality of warnings about this gotcha? There is no warning issued, even if you use strict and warnings. BTW, it looks like $ can be escaped:
main::(-e:1): 0
DB<5> $_='\$bba\$'
DB<6> tr/\$/?/
DB<7> print $_
\?bba\?
Right now zero warnings are issued with use strict and use warnings enabled.
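The behaviour described above can be reproduced outside the debugger. This is a minimal sketch; the variable names ($set, $text, $copy, $count, $esc) are made up for illustration.

```perl
use strict;
use warnings;

my $set  = 'abc';            # hypothetical variable holding the wanted set
my $text = 'aabbcc $set';    # single-quoted: contains a literal "$set"

# tr/// does not interpolate. The search list here is the four literal
# characters '$', 's', 'e', 't', not the contents of $set.
(my $copy = $text) =~ tr/$set/_/;
my $count = ($text =~ tr/$set//);    # empty replacement: count only

# A backslashed '$' means the same literal character (as in the session above).
(my $esc = '\$bba\$') =~ tr/\$/?/;

print "$copy\n$count\n$esc\n";
```

The letters a, b and c are left alone, while the characters of the variable's name are the ones that get replaced and counted.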
Looks like this idea of using =~ for tr was not so good, after all. Regular syntax like tr(set1, set2) would be much better. But it's too late to change, and now we need warnings to be implemented.
// is an abbreviation for m// (be careful of context). But // can be replaced by (almost?) any delimiter, by using m or s or tr.
You make a very good point. Now I am starting to understand why they put the description of tr, which is actually a function, into this strange place
http://perldoc.perl.org/perlop.html#Quote-Like-Operators
Strings with arbitrary delimiters after tr, m, s, etc. are a special, additional type of literal, each with its own rules. And those rules are different from the rules that exist for single-quoted strings, double-quoted strings or regexes (the three most popular types of literals in Perl).
For example, the treatment of backslash in a "tr literal" is different from that in single-quoted strings:
"A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated."
This means that in Perl there are a dozen or so different types of literals, each with its own idiosyncratic rules. This creates confusion even for long-time Perl users, as they tend to forget the details of constructs they use rarely and extrapolate them from constructs they use more often.
For example, in my case, I was burned by the fact that "m literals" allow interpolation of variables, but "tr literals" do not. And I even created a test case to study this behavior :-)
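To put the two side by side: perlop notes that the tr table is built at compile time, and suggests a string eval when the lists have to come from variables. A small sketch of that, with made-up variable names; the eval turns the variable into Perl source, so it is only reasonable for trusted input.

```perl
use strict;
use warnings;

my $vowels = 'aeiou';             # hypothetical set held in a variable
my $text   = 'transliteration';

# An "m literal" interpolates: this matches every character listed in $vowels.
my @found = $text =~ /[$vowels]/g;

# A "tr literal" does not, so the operator is built as a string and
# compiled with eval. The \$ keeps $text itself from being interpolated.
my $count = eval "\$text =~ tr/$vowels//";
die $@ if $@;

print scalar(@found), " vowels by m//, $count by tr///\n";
```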
In other words, the nature of those "context-dependent literals" (on the level of the lexical scanner they are all literals) is completely defined not by the delimiters they use (which are arbitrary), but by the operator used before them. If there is none, m is assumed.
This "design decision" (in retrospect this is a design decision, although in reality it was an "absence of design decision" situation ;-) adds unnecessary complexity to the language and several new (and completely unnecessary) types of bugs.
This "design decision" is also poorly documented, and for typical "possible blunders" (for tr that would be usage of "[", "$", "@" without a preceding backslash) there are no warnings.
This trick of putting tr description into http://perldoc.perl.org/perlop.html that I mentioned before now can be viewed as an attempt to hide this additional complexity. It might be beneficial to revise the docs along the lines I proposed.
In reality, in Perl qq, qr, m, s, tr are functions, each of which accepts (and interprets) a specific, unique type of "context-dependent literal" as the argument. q can be interpreted (final string representation obtained) at compile time, so it is not a function. That's the reality of this pretty unique situation with the language, as I see it.
Quote-Like-Operators shows 2 interesting examples with tr:
tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/.
The second variant looks like a perversion to me. I never thought that this was possible. I thought that the "arbitrary delimiter" is "caught" after the operator and after that the delimiters should be uniform within the operator ;-).
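Both perlop examples do run as written. A quick sketch to check them; the input strings 'query' and '1+2-3*4/5' are my own, chosen only for illustration.

```perl
use strict;
use warnings;

# Bracket delimiters for both lists.
(my $word = 'query') =~ tr[aeiouy][yuoiea];

# Parentheses around the search list, slashes around the replacement.
# The hyphen is backslashed so that it is not read as a range.
(my $expr = '1+2-3*4/5') =~ tr(+\-*/)/ABCD/;

print "$word\n$expr\n";
```

Note that the slash can appear unescaped in the first list only because that list is delimited by parentheses.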
And the first is not without problems either: if you "extrapolate" your skills with regex into tr you can write instead of
tr[aeiouy][yuoiea]
obviously incorrect
tr/[aeiouy]/[yuoiea]/
that will work fine as long as strings are of equal length.
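Where the bracketed form starts to bite is counting, or a replacement list shorter than the search list, because "[" and "]" are then ordinary members of the set. A small sketch with a made-up input string:

```perl
use strict;
use warnings;

my $text = 'a[e]';    # hypothetical input that itself contains brackets

# Counting with the intended set versus the regex-style set.
my $vowels_only   = ($text =~ tr/aeiouy//);
my $with_brackets = ($text =~ tr/[aeiouy]//);

# With a one-character replacement list the brackets are rewritten too.
(my $masked_ok  = $text) =~ tr/aeiouy/*/;
(my $masked_bad = $text) =~ tr/[aeiouy]/*/;

print "$vowels_only $with_brackets $masked_ok $masked_bad\n";
```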
What is a "// string"?
We have tr///ansliteration operator,
m//atch operator,
s///ubstitution operator,
qr//egex operator,
q//uote and qq//uote operator,
and only the firstr// one is rare.
A special warnings is moot?
Cause I mean you found out
without a special warnings tr/// wasnt doing what you wanted --
oh noes I made mistake, perl should have held my hand and kissed my booboo and spoon fed me wisdom and knowledge and wrote my program for me.
You can have Tr('list','replacement','options') anytime you want,
simply stop shifting responsibility for your ignorance on the designers of the language.
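One way such a function-call form could look, as a rough sketch only. The name Tr comes from the line above, but the signature used here (string first, returning a modified copy) and the restriction to the c, d and s options are assumptions of this sketch. It relies on string eval, so it should only ever see trusted lists.

```perl
use strict;
use warnings;

# Hypothetical helper: Tr($string, $list, $replacement, $options)
# returns a transliterated copy of $string.
sub Tr {
    my ($string, $list, $replacement, $options) = @_;
    $options = '' unless defined $options;
    die "unsupported options: $options\n" unless $options =~ /\A[cds]*\z/;

    # Escape the delimiter so that a '/' in either list cannot end it early.
    s{/}{\\/}g for $list, $replacement;

    # The tr table is built at compile time, so compile it via string eval.
    # The trailing "1" distinguishes "zero characters changed" from failure.
    my $ok = eval "\$string =~ tr/$list/$replacement/$options; 1";
    die $@ unless $ok;
    return $string;
}

print Tr('hello world', 'a-z', 'A-Z'), "\n";
print Tr('a/b', '/', '-'), "\n";
```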
I've been in the same exact boat more than once,
wrote something,
couldn't figure out what mistake I was making , posted about it,
what you have to do is learn from it,
take responsibility for your own knowledge,
write a perlcritic policy (it's easy) to remind you in case you forget.
Information is not Knowledge. Knowledge is not Wisdom. Wisdom is not Truth. Truth is not Beauty. Beauty is not Love. Love is not Music. Music is the best.
I am talking about adding a simple warning if the symbols "[" and "$" in tr set1 or set2 are not backslashed (backslashing them is allowed in the current version of Perl, so there is no break of compatibility with old code here).
That's around a dozen lines of code in C.
IMHO adding such a warning makes sense, and this is a constructive suggestion, not blanket criticism of the language, as you assume. Your mileage may vary.
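To make the suggestion concrete, here is the check itself sketched in Perl rather than C. It is not a patch to the interpreter and not a Perl::Critic policy; suspicious_tr_chars is a hypothetical helper that takes the text of a tr list and returns the "[", "$" and "@" characters that appear without a preceding backslash.

```perl
use strict;
use warnings;

# Hypothetical helper: report unescaped '[', '$' and '@' in a tr list.
sub suspicious_tr_chars {
    my ($list) = @_;
    my @hits;
    # First alternative swallows any backslash escape (e.g. \$ or \\),
    # second alternative captures a bare suspicious character.
    while ($list =~ /(\\.)|([\[\$\@])/gs) {
        push @hits, $2 if defined $2;
    }
    return @hits;
}

for my $list ('[a-z]', '\$a\[', '$set') {
    my @hits = suspicious_tr_chars($list);
    print "$list: ", (@hits ? "warn about @hits" : "ok"), "\n";
}
```

A real warning would of course have to run where the tr lists are parsed; the sketch only shows that the rule itself is small.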