Of course it does...
ALL of the linux core utils know how -- Perl is just braindead by choice of its creators.
cat, chmod, chown, chroot, cp, cut, dirname (all of the file name routines work with UTF-8); tac, wc, uniq, sort, sed, awk, grep.
Perl's "correctness" used to be measured by whether or not it produced the same output as the core utilities that it was based on. Perl derived from those core utils -- and their behavior set the standard for how perl ran.
Perl fails randomly and often on compatibility with the utils that it was designed to be a combination of.
Simple word count program:
#!/usr/bin/perl -w ## 'pwc'
use 5.14.0;
my ($l,$w,$c)=(0,0,0);
while (<>) {
    ++$l;
    $c += length $_;
    while ( m{^\W*(\w+)(.*)$} ) {
        ++$w;
        $_=$2;
    }
}
printf "%d\t%d\t%d\n", $l, $w, $c;
A text file:
> file /tmp/txt
/tmp/txt: UTF-8 Unicode text
> wc -lwm /tmp/txt
3 5 38 /tmp/txt
> pwc /tmp/txt
3 24 64
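For reference, the 64 that pwc reports is consistent with a byte count rather than a character count, since nothing in the script asks perl to decode its input. Below is a minimal sketch of how the same counting logic behaves on raw bytes versus explicitly decoded text. It assumes the input is valid UTF-8; the sample string and the helper name count_text are made up for illustration (the real /tmp/txt isn't shown), and the word loop is condensed to a global match on \w+.

```perl
#!/usr/bin/perl
use 5.014;
use warnings;
use Encode qw(decode);

# Count lines, words (runs of \w) and characters in a string.
# count_text is a hypothetical helper, not part of the original pwc.
sub count_text {
    my ($text) = @_;
    my ($l, $w, $c) = (0, 0, 0);
    for my $line (split /^/, $text) {   # keep each trailing newline, like <>
        ++$l;
        $c += length $line;             # bytes if undecoded, characters if decoded
        ++$w while $line =~ /\w+/g;     # one hit per run of word characters
    }
    return ($l, $w, $c);
}

# Hypothetical sample: the UTF-8 bytes of "naive cafe\n",
# written with an i-diaeresis and an e-acute (two bytes each).
my $bytes = "na\xC3\xAFve caf\xC3\xA9\n";

my @raw     = count_text($bytes);                   # no decoding, as in pwc
my @decoded = count_text(decode('UTF-8', $bytes));  # explicit decoding step

printf "raw:     %d\t%d\t%d\n", @raw;      # 1 3 13
printf "decoded: %d\t%d\t%d\n", @decoded;  # 1 2 11
```

In a script that reads its files through <>, the corresponding step should be declaring an input layer near the top, e.g. `use open qw(:std :utf8);`, or running perl with the -CSD switch described in perlrun. Either way the decoding has to be requested; it is not the default.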
---
(There are 5 words in /tmp/txt, but I can't post it here, as the bb-software for perlmonks, like perl, isn't UTF-8 safe/compatible.)
It gets closer with an autosplit version:
(from http://www.catonmat.net/download/perl1line.txt)
# Find the total number of fields (words) on all lines
> perl -alne '$t += @F; END { print $t}' /tmp/txt
4
(It was only off by 1.)
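One possible explanation for the missing field, and it is only a guess since the file can't be shown, is a separator that counts as whitespace in Unicode but not in ASCII. The -a switch fills @F with an awk-style `split ' '`, and on undecoded input a multi-byte space is just a few non-space bytes. A small sketch with a made-up sample (an EM SPACE, U+2003, between the first two words):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: "one<U+2003>two three\n" as UTF-8 bytes.
my $bytes = "one\xE2\x80\x83two three\n";

# split ' ' is the same awk-style split that -a uses to fill @F.
my @raw_fields     = split ' ', $bytes;                   # bytes: EM SPACE not seen
my @decoded_fields = split ' ', decode('UTF-8', $bytes);  # characters: EM SPACE splits

printf "raw: %d fields, decoded: %d fields\n",
    scalar @raw_fields, scalar @decoded_fields;   # raw: 2 fields, decoded: 3 fields
```

If that is what is happening with /tmp/txt, adding the -CSD switch from perlrun to the one-liner (`perl -CSD -alne ...`) would be the thing to try; whether it closes the gap depends on what the file actually contains.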
I could spend weeks detailing all the broken semantics, but it would be a waste of my time. You just have to learn all the bugs in perl so you can work around them. As stated in a previous post, people told me that labeling dysfunctional behavior was the sign of a bad craftsman (i.e. they blame their tools), which is a meaningless statement considering it is also said that a good craftsman knows their tools (which means characterizing their behavior).
So the idea that it is "too hard" for perl to know how to correctly interpret text data is patently, and easily provably, false, as millions of other programs get it right. Perl's algorithms in this area are governed by ideologues who have beliefs about how the world should be run and enforce them on everyone else. There are multiple examples where they reduce choice -- take away choices from the users because the users are presumed to be too stupid to make their own decisions (yet these same people will complain when MS does similar).
Perl could be a lot more intelligent in a lot of areas than it is -- in some cases it would involve not implementing code, but ***removing*** code that was added to deliberately limit perl's functionality or to cause erroneous behavior.
But one can spend all their time pointing out the numerous flaws of the language, or attempt to work around them and get work done. The two are not completely mutually exclusive, but to some extent they are, as they draw on the same resource: time.
Until those in charge allow change, it won't happen. And it is a matter of "allow" -- since one change that was asked for came down to "well, no one who is capable of making the change wants it enough to do it". The proponent of the idea asked, "If someone who was capable of making the change submitted a patch, does that imply there would be no problem adding it into the source base?"
The conversation was terminated at that point as the question was not answerable with a simple yes/no.