If you've discovered something amazing about Perl that you just need to share with everyone,
this is the right place.
This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)
Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").
First of all, forgive me if this isn't quite the right place. I am a new user and am not entirely sure about the rubric for this area of the site, though I'm pretty sure this is the right place.
About a decade ago, when I was in my late teens and early twenties, I was very proficient in and eager to use Perl. Though it was a little idiosyncratic, it was certainly much less tedious to get things done in than C, which I had used earlier. Gradually I drifted away towards Python and now I use it for most things. I've since forgotten virtually everything I knew about Perl. I know that Python will still be obviously superior for, for example, most aspects of scientific computing (possibly excepting bioinformatics?) and machine learning, but where does Perl really shine these days? That goes equally for the more conventional Perl 5 as well as the newer Perl 6. Also, what are hot items on CPAN these days?
Recently, I needed to compare two versions of the same perl module containing many subs. I wanted to see the difference between the contents of subs with the same name. This happened because I had forked the same code on two different machines and made changes to the modules on both.
For this purpose I wrote the following basic script which I place here for public use but also for comments from the monastic community.
The simple script makes use of two excellent modules, namely PPI and Text::WordDiff. PPI parses perl code and is capable of extracting subs and their contents. Text::WordDiff outlines (and color-codes) the differences between two blocks of text (the contents of identically-named subs in two files).
Unix's diff is a fine tool in general, but code has a few idiosyncrasies that sometimes make diffing impractical, for example when identically named subs appear in a different order in their respective files.
That said, I wanted a quick tool to check my two versions of the perl module, find the enhancements I had made in either file, and produce a final version.
Here is the script:
#!/usr/bin/env perl
use strict;
use warnings;
use PPI;
use Text::WordDiff;
if( scalar(@ARGV) != 2 ){ print usage($0) . "\n"; exit(0); }
my ($infile1, $infile2) = @ARGV;
my $doc = PPI::Document->new($infile1);
if( ! defined($doc) ){ print STDERR "$0 : call to ".'PPI::Document->new()'." has failed for input file '$infile1'.\n"; exit(1); }
my (%subs1, %subs2, $asub);
for $asub ( @{ $doc->find('PPI::Statement::Sub') || [] } ) {
# loop over all subs in file
unless ( $asub->forward ) {
# store sub's contents in hash keyed on sub's name
$subs1{ $asub->name } = $asub->content;
}
}
$doc = PPI::Document->new($infile2);
if( ! defined($doc) ){ print STDERR "$0 : call to ".'PPI::Document->new()'." has failed for input file '$infile2'.\n"; exit(1); }
for $asub ( @{ $doc->find('PPI::Statement::Sub') || [] } ) {
# loop over all subs in file
unless ( $asub->forward ) {
# store sub's contents in hash keyed on sub's name
$subs2{ $asub->name } = $asub->content;
}
}
my ($k, $v1, $v2, $res, $anitem);
my @dont_exist = ();
my %allkeys = map { $_ => 1 } (keys %subs1, keys %subs2);
foreach $k (sort keys %allkeys){
if( ! defined($v1=$subs1{$k}) ){
push(@dont_exist, "$k : Does not exist in '$infile1'\n");
next
} elsif( ! defined($v2=$subs2{$k}) ){
push(@dont_exist, "$k : Does not exist in '$infile2'\n");
next
}
    # sub (same name) exists in both files, diff sub's contents in files:
$res = Text::WordDiff::word_diff(
\$v1,
\$v2,
);
# print diff results
print
"----- begin '$k' -----\n"
. $res
. "\n----- end '$k' ------\n"
;
}
# and also print the subs which exist in one file but not the other file
print join("", @dont_exist);
exit(0);
sub usage {
return "Usage : ".$_[0]." file1 file2\nColor output guide:\n\tRED
+: file1\n\tGREEN: file2\n"
}
I don't know how well known this is (it's probably old news for you), but recently I had the problem of wanting to run a Perl script on a non-rooted Android tablet and came across the fabulous Termux project.
Installing this app gives you a terminal environment, and with a simple
In order to deal with Japanese orders, I recently had to convert my whole system to UTF-8. A day or two's job, I thought. Two and a half weeks later, I'm finally there. There is a lot of material on PerlMonks and the internet in general about this, but it is hard to understand and even harder to implement. Most of the advice I read was along the lines of RTFM or did not give the whole story. It's pretty clear this is a common problem, too. PerlMonks has helped me a lot, so I wanted to give something back to the community by sharing some insights that I hope will be practical and useful.
There is a lot out there telling you to use decode/encode and giving lectures on Perl's internal representation of UTF-8 and whatnot. In the end I only had to use decode in one place, where data comes in from elsewhere. If you get all the other pieces right, I believe you shouldn't need many (if any) instances of decode/encode.
Our system involves a local website using MySQL, a live website, static webpages, generated webpages, various text files and CGI website forms. All of this needed changes. Here are the things I had to do:
Checklist of changes to make
* Firstly, every script file is converted to UTF-8 format. Easy.
* Every script to have this at the top: use utf8;
This tells perl that the script itself is in UTF-8 format, so a £ in the script will be interpreted as a UTF-8 £. It's no good just putting this in the calling script, as it only seems to extend to the scope of that script itself, not any other scripts imported with require...
* Ideally each database table must be turned to UTF-8 format. This turns out to be difficult and time-consuming because any tables with foreign keys won't convert unless you first delete the foreign keys. For those that won't easily convert, you can convert only the fields that might hold UTF-8 encoded characters to UTF-8 format. Also BLOB fields are a problem unless the whole table is UTF-8. I had to convert problem BLOB fields to TEXT fields and then convert them to UTF-8 format (a 2 step process, doing both in 1 step fails).
* Rose::DB (or whatever database interface you are using) needs to be told that incoming data from the database is UTF-8. For Rose::DB, add this to the connector in DB.pm and then regenerate: connect_options => {mysql_enable_utf8 => 1}
* binmode(STDOUT, ":utf8"); # Put this at the top of a script - it tells Perl to output UTF-8 to STDOUT. I'm not sure whether this is needed only once in the opening script or in any requires, too.
* Webpages must have this in the head section:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
* use CGI qw(-utf8); to treat incoming CGI parameters as UTF-8. Getting this working was subtle - test carefully.
* When outputting a CGI webpage, the first thing to do is to output the http header and this needs to be told about UTF8 too:
Personally I found that print header(-type=>'text/html', -cookie=>'', -charset=>'utf-8'); gave problems with cookies so ended up outputting it direct:
print "Content-type: text/html; charset=utf-8\n$cookie\n\n";
* use open ':encoding(utf8)'; # tells Perl to read and write all files as UTF-8. In fact, I was more careful with this and did not use it globally. Instead, I specifically opened each file that needed it with open($fh, '<:encoding(UTF-8)', $filename);, because some files I have to deal with were not given to me in UTF-8 format. Careful - this can fail if the $filename variable is not also UTF-8!
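Putting the script-side items of the checklist together, a minimal sketch might look like this (the scratch file name is made up for the demonstration; the script writes a UTF-8 file and reads it back):

```perl
#!/usr/bin/env perl
# Sketch only: combines the script-side items of the checklist above.
# The scratch file name is hypothetical.
use strict;
use warnings;
use utf8;                               # this source file itself is UTF-8
use open ':encoding(UTF-8)';            # default encoding layer for open()
binmode STDOUT, ':encoding(UTF-8)';     # UTF-8 output to STDOUT

my $price = "£42";                      # decoded correctly thanks to use utf8
my $file  = 'utf8_demo.txt';            # hypothetical scratch file

open my $out, '>', $file or die "cannot write '$file': $!";
print {$out} "$price\n";                # written as UTF-8 via the default layer
close $out;

open my $in, '<', $file or die "cannot read '$file': $!";
my $line = <$in>;
close $in;
unlink $file;

print $line;                            # prints £42, no "Wide character" warning
```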
Identifying Errors
In doing this, you will make mistakes and see weird characters appearing in unexpected places. I developed my own personal understanding of how to deal with them. These are my own notes for practical situations, so please bear with me if the explanations are not exactly correct - this was about fixing stuff, not being a Perl rocket scientist.
You see £ displayed as 'Â£'
If the £ sign is coming from the database and is stored correctly there, and the webpage correctly displays UTF-8 characters from elsewhere (e.g. write Japanese text into the Perl script and print it), then the data is not being retrieved from the database as UTF-8 (presumably it is being assumed to be Latin-1).
The £ is within a UTF-8 encoded Perl script, but use utf8; is not set at the top of the script.
The £ is displayed correctly in a form initially, but when the form is saved/updated, the £ then displays as 'Â£'. Use the -utf8 CGI pragma to treat incoming parameters as UTF-8: use CGI ('-utf8');
£ is displayed on a webpage as �
This happens when the http header Content-Type does not declare UTF-8 and the meta tag similarly omits the charset: <meta http-equiv="Content-Type" content="text/html" />
£ or other characters are being displayed as a diamond with ? inside it
StackOverflow:...usually the sign of an invalid (non-UTF-8) character showing up in an output (like a page) that has been declared to be UTF-8. Can be fixed by putting the following at the top of script: binmode(STDOUT, ":utf8");
Error message: Wide character in print
This means a print statement (to STDOUT or a file) is sending a character above 255 through a filehandle that has no encoding layer. To fix, add ':encoding(UTF-8)' to the open statement, or use binmode(STDOUT, ":utf8");
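A minimal way to reproduce and silence the warning, as a sketch:

```perl
use strict;
use warnings;
use utf8;                          # the literal below is decoded text

my $s = "日本語";                  # characters above 255: "wide" characters

# print "$s\n";                    # without an encoding layer, this line
                                   # emits "Wide character in print"

binmode STDOUT, ':encoding(UTF-8)';
print "$s\n";                      # now encoded properly on the way out
```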
#prepare
my $sth = $dbh->prepare('SELECT * FROM people WHERE lastname = ? AND firstname = ?');
#execute with list of bindvars
$sth->execute( $lastname, $firstname );
But it's a bit cumbersome to adjust the bind values if the order changes.
It's even more work if you have to use an array of values like inside an IN ( ?, ?, ?) operation.
I started to hack something to auto-generate placeholders, for a string passed inside a code-block:
$scalars from the closure are replaced with a placeholder ?
@arrays are replaced with a list of comma separated placeholders ?,?,?
underscored _var_names are ignored ( placeholders can't be everywhere)
The second returned parameter is a list of var-refs in the correct order, such that the bind variables can be safely changed.
Parsing the output of B::Deparse is even more fragile than I thought; the next version will walk the OP-tree directly. (For instance, parsing multiline SQL doesn't work yet.)
I'm not yet sure how to combine this in the best way with DBI.
This is a one-day job, in the spirit of "release often".
Comments?
update
Hmmm ... I can probably avoid the hassle of parsing the OP-tree by tying the variables ...
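The tie idea could look roughly like the following sketch (my own illustration, not the announced implementation; PlaceholderTie and the variable names are made up). A tied scalar returns '?' when interpolated and records a reference to the real value, so the bind order falls out of the interpolation order:

```perl
use strict;
use warnings;

# Sketch of the tie idea: interpolating a tied scalar yields '?' and
# records a reference to the real value. PlaceholderTie is a made-up name.
package PlaceholderTie;

sub TIESCALAR {
    my ($class, $value_ref, $order) = @_;
    return bless { ref => $value_ref, order => $order }, $class;
}

sub FETCH {
    my ($self) = @_;
    push @{ $self->{order} }, $self->{ref};   # remember the bind position
    return '?';                               # what the SQL string sees
}

sub STORE {
    my ($self, $value) = @_;
    ${ $self->{ref} } = $value;               # writes go to the real value
}

package main;

my @bind_order;                # refs to real values, in placeholder order
my $name      = 'Alice';
my $real_name = $name;         # keep the real value before tying
tie $name, 'PlaceholderTie', \$real_name, \@bind_order;

my $sql = "SELECT * FROM people WHERE name = $name";   # FETCH returns '?'
untie $name;

print "$sql\n";                       # SELECT * FROM people WHERE name = ?
print ${ $bind_order[0] }, "\n";      # Alice
```

No B::Deparse and no OP-tree walking needed: interpolation itself reports which variables were used, and in what order.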
use strict;
use warnings;
use B::Deparse;
use PadWalker qw/closed_over peek_sub set_closed_over/;
use Data::Dump qw/pp/;
# ========= Tests
use Test::More;
# lexicals for placeholders
my $a = 'A';
my @list = qw/L I S T/;
my $x = 'X';
# no placeholders for underscore vars
my @_table = "any_table";
my $sql = sub { "SELECT * FROM @_table WHERE a = $a AND b IN (@list) AND c = $x" };
my @stm = holderplace($sql);
is_deeply( \@stm,
[
"SELECT * FROM any_table WHERE a = ? AND b IN (?, ?, ?, ?
+) AND c = ?",
[\"A", ["L", "I", "S", "T"], \"X"]
],
"statement with placeholders plus bind variables"
);
# change bind variables
$a = 'AA';
@list = qw/LL II SS TT/;
$x = 'XX';
is_deeply( \@stm,
[
"SELECT * FROM any_table WHERE a = ? AND b IN (?, ?, ?, ?
+) AND c = ?",
[\"AA", ["LL", "II", "SS", "TT"], \"XX"]
],
"statement with placeholders plus changed variables"
);
done_testing();
# ========== Code
sub holderplace {
my ($lambda)=@_;
my $h_vars = closed_over($lambda);
my %new_vars;
my @value_refs;
for my $key ( keys %$h_vars) {
my $sigil = substr $key,0,1;
# exclude variables starting with _
next if $key =~ m/^\Q${sigil}\E_/;
if ( '$' eq $sigil ) {
$new_vars{$key} = \'?';
} elsif ( '@' eq $sigil ) {
$new_vars{$key} = [ join ", ", ("?") x @{$h_vars->{$key} } ];
} else {
next; # Error?
}
}
# Create Statement with placeholders
set_closed_over( $lambda, \%new_vars );
my $newstr = $lambda->();
# Variable refs in order of placeholders
my @var_refs =
map { $h_vars->{$_} }
grep { $new_vars{$_} }
@{ get_vars($lambda) };
return ("$newstr", \@var_refs );
}
sub get_vars {
# scans output of B::Deparse to get interpolated vars in order
my ($lambda)=@_;
# deparse sub body
my $source = B::Deparse->new('-q')->coderef2text($lambda);
# returns something like:
# {
# use warnings;
# use strict;
  #    'SELECT * FROM ' . join($", @_table) . ' WHERE x = ' . $a . ' AND b IN (' . join($", @list) . ') ' . $x;
# }
# truncate {block} and use statements
$source =~ s/^{\s*(use.*?;\s*)*//s;
$source =~ s/;\s*}$//s;
#warn $source;
my %quotes = qw"[ ] ( ) < > { } / /";
$quotes{'#'}='#';
# single quotes like q(...)
my $re_q = join "|", map { "q\\$_.*?\\$quotes{$_}" } keys %quotes;
#warn pp
  my @parts = split /\s* (?: '(?:\\'|[^'])*?' | $re_q )\s*/msx, $source;
for my $part (@parts) {
next unless $part =~ /^\..*\.?$/;
if ( $part =~ /^\. join\(.*? (\@\w+)\)( \.)?$/) {
$part = $1; # array
}
elsif ( $part =~ /^\. (\$\w+)( \.)?$/) {
$part = $1; # scalar
}
}
return \@parts;
}
I post this under Meditations because it is not really a question from me, even if the odd question mark appears in the text.
Art needs restoration, and restoration needs keen eyes and gentle hands. Here at the monastery we have precious and invaluable masterpieces decaying as time passes.
The most incredible thing I have ever seen in Perl is 3-D Stereogram, Self replicating source. by the ingenious monk Toodles, but unfortunately it only runs on perl 5.8 (or, as someone said, on 5.10).
Is some monk able to spot why this happens?
Is someone able to make this masterpiece run as expected on modern Perl versions? In fact it doesn't fail on 5.26, but it doesn't produce the stereogram effect: only sparse lines.
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Thank you to all who helped me get this started. Even though my name is on it that just means that I borrowed most of it from others. You will see your suggested way of doing things throughout this post.
I will describe how I created the postgresql database. Any and all comments are welcome.
I did not use these tables but I am strongly considering them.
CREATE TABLE seven_stud_spread_stakes(bring_in int, fourth_street int check(fourth_street >= bring_in), fifth_street int check(fifth_street >= bring_in), sixth_street int check(sixth_street >= bring_in), seventh_street int check(seventh_street >= bring_in));
CREATE TABLE colorado_limit_stakes(small_blind int, big_blind int check(big_blind >= small_blind), preflop int check(preflop <= 100), flop int check(flop <= 100), turn int check(turn <= 100), river int check(river >= big_blind));
CREATE TABLE spread_limit_stakes(small_blind int, big_blind int check(big_blind >= small_blind), preflop int check(preflop >= small_blind), flop int check(flop >= small_blind), turn int check(turn >= small_blind), river int check(river >= small_blind));
CREATE TABLE seven_stud_stakes(bring_in int, fourth_street int check(fourth_street >= bring_in), fifth_street int check(fifth_street >= fourth_street), sixth_street int check(sixth_street >= fifth_street), seventh_street int check(seventh_street >= sixth_street));
CREATE TABLE stakes(small_blind int, big_blind int check(big_blind >= small_blind), preflop int check(preflop >= big_blind), flop int check(flop >= preflop), turn int check(turn >= flop), river int check(river >= turn));
INSERT INTO stakes VALUES(1, 3, 3, 3, 6, 9);
I am using the following tables. Note that I am using v_limits since limit is a reserved word in postgresql.
CREATE TABLE v_limits(v_limit TEXT PRIMARY KEY);
CREATE TABLE states(abbreviation TEXT PRIMARY KEY, state TEXT);
CREATE TABLE cities(city TEXT PRIMARY KEY);
CREATE TABLE games(game TEXT PRIMARY KEY);
CREATE TABLE hi_lows(hi_lo TEXT PRIMARY KEY);
CREATE TABLE kills(kill TEXT PRIMARY KEY);
CREATE TABLE stakes(stake TEXT PRIMARY KEY);
CREATE TABLE venues(venue TEXT PRIMARY KEY);
CREATE TABLE visits(id INT PRIMARY KEY, arrival_date DATE, departure_date DATE, arrival_time TIME, departure_time TIME, venue TEXT REFERENCES venues(venue), city TEXT REFERENCES cities(city), state TEXT REFERENCES states(abbreviation), game TEXT REFERENCES games(game), stake TEXT REFERENCES stakes(stake), kill TEXT REFERENCES kills(kill), hi_lo TEXT REFERENCES hi_lows(hi_lo), v_limit TEXT REFERENCES v_limits(v_limit), buy_in MONEY, cash_out MONEY);
I'm about to email the pumpking for an intervention as a personal favour. Because I'm convinced that a half-arsed solution is better than no solution, it's past time that this more than 20-year-old embarrassment gets fixed:
› ver
Microsoft Windows [Version 10.0.16299.125]
› chcp 65001
Aktive Codepage: 65001.
› type αω.bat
@echo hiαω
› node -p "require('child_process').execSync('αω.bat').toString()"
hiαω
› perl6 -e "run 'αω.bat'"
hiαω
› php -r "system('αω.bat');"
hiαω
› python -c "import subprocess; subprocess.call('αω.bat')"
hiαω
› ruby -e "system 'αω.bat'"
hiαω
› perl -Mutf8 -e "system 'αω.bat'"
Der Befehl "a?.bat" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
[Translation: The command "a?.bat" is either misspelled or could not be found.]
› perl -Mutf8 -MWin32::Unicode::Process=systemW -e "systemW('αω.bat')"
Der Befehl "a?.bat" ist entweder falsch geschrieben oder
konnte nicht gefunden werden.
Plan of attack: use 5.028 enables use feature 'just-make-it-work-already-dammit', which checks $^O eq 'MSWin32' and then replaces all the broken chdir, mkdir, open, opendir, rename, rmdir, system, unlink, utime, -X stat etc. with the working equivalent code from Win32::Unicode, and somehow also works for -e one-liners, not just for code executed from files.
Now tell me why this is a stupid idea, but keep in mind that
if all the other languages can hack it, then so can we, no matter how shitty and insufficient you think the initial patch is
the better is the enemy of the good and a "better" solution did not turn up for decades
if I simply file a perlbug it just gets marked by p5p as a duplicate of a discussion whose proposed "better" solution did not turn up for decades
Periodically, I'll go through my posts via the normal mechanism and sort by lowest-highest-this-that-other, and sometimes, like today, I find one that I'd like to reply to.
Now, this post has a (relatively) high XP count, and the responses have 15-25% higher than that.
I want to reply to a poster on such a thread legitimately (the original post's count was ~45, replies were 60+), but I don't want it to be taken as if I'm doing it in order to get exposure on the overall hierarchical post.
How do Monks handle these situations? Shatter humility or what?
-stevieb
ps. I have been accused of raising posts for XP+ before, but those who raised that are irrelevant to me.
I tried to avoid eval to evaluate the expressions; at the same time, I didn't want to implement a traditional full math expression parser, as only two operations of the same precedence were in use.
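For illustration, an eval-free left-to-right evaluator for two equal-precedence operations might look like this (the choice of + and - is my assumption; the original does not say which two operations it handles):

```perl
use strict;
use warnings;

# Sketch of an eval-free evaluator for expressions that use only two
# operators of equal precedence (assumed here to be + and -), applied
# strictly left to right. The operator set is an assumption.
sub evaluate {
    my ($expr) = @_;
    # split on the operators, keeping them via the capture group
    my ($result, @rest) = split /\s*([+-])\s*/, $expr;
    while (@rest) {
        my ($op, $operand) = splice @rest, 0, 2;
        $result = $op eq '+' ? $result + $operand
                             : $result - $operand;
    }
    return $result;
}

print evaluate('10 + 5 - 3 + 2'), "\n";   # 14
```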
I have produced my first (at least in intention) serious module: Win32::Event2Log, for the moment on github (current version). I tried to follow all
best practices for module creation (a long read..) and I announced it on prepan last week,
but I had no comments back.
The Windows Event Viewer, in my experience, is good only for giving you carpal tunnel syndrome, so in the past I arranged a bunch of Perl programs to inspect its registries using Win32::EventLog to trigger some action. That approach is difficult, and every time I had to restart from scratch. So I had the (cool) idea to write an engine that reads events and, if a given rule matches, writes them to a plain logfile; from there the road is smooth for a Perl programmer.
presentation
Essentially, what the module does, as explained in its POD, is use Win32::EventLog to parse Windows events and write them to plain logfiles. The module is rule based: a rule is a minimal set of conditions that must be met for an event to be written to a logfile. You must add valid rules before starting the engine. Once started, the engine checks events every x seconds (specified with the interval argument) and, for every registry (System, Application, Security, Installation or a user-defined one) that is requested by at least one rule, checks for the event source specified and optionally for some text contained in the event's description.
The resulting engine is designed to survive shutdowns and user interruptions (CTRL-C in the console or a kill of the PID): the next run of the program will read just the events not yet parsed. This is achieved by storing the number of the last event read (for each registry) in a file specified with the lastreadfile argument.
A simple example of its usage (as in the example section of the module) is the following:
use strict;
use warnings;
use Win32::Event2Log;
my $main_log = $0.'.mainlog.log';
my $last_numbers_log = $0.'.last_numbers.log';
my $sys_errors_log = $0.'.System_err_warn.log';
my $engine = Win32::Event2Log->new(
interval => 60,
endtime => 0,
mainlog => $main_log,
verbosity => 2,
lastreadfile=> $last_numbers_log,
);
$engine->add_rule (
registry => 'System',
eventtype=> 'error|warning',
source => qr/./,
log => $sys_errors_log,
name => 'System errors and warnings',
);
$engine->start;
But since I've only ever produced modules as private containers of loosely related functions, I'm a bit of a newbie with regard to CPAN standards. In fact I plan to release it on CPAN soon, but not before listening to your advice. So my requests for comments are:
RFC
1) name:
I think the Win32 namespace is naturally the correct one, but what about Event2Log? It seemed the best choice to me.
2) testing:
in this field I have read a lot in the past but, my sin, practiced almost never. I've done my best writing 01-basic.t (here(current version)). How can the test be improved?
Do I need to bail out of the test if $^O is not MSWin32? I tested only the public methods I offer: should I also test private functions?
3) design and enhancements:
Even if the module runs well enough in my tests on various scenarios, I already plan to modify it. In fact, the core of the engine is currently a while (1) {..} loop where new events are checked and rules applied (you can see it here(current version)).
I plan to abstract the reading part, maybe adding a Win32::Event2Log::Reader submodule. In fact, I also want the user to be able to choose whether to use Win32::EventLog as the reader or a wrapper around wevtutil.exe that I plan to write soon. How should I achieve this? By having Win32::Event2Log::Reader use Win32::EventLog by default, and Win32::Event2Log::Reader::Wevtutil subclass Win32::Event2Log::Reader? What is the cleanest design for such a modification? Which tests must I add?
4) design of an eventual Win32::Event2Log::Reader:
This seemed to me a good use for an iterator: $reader->next will replace a lot of odd code in my current module. What I'm wondering about is the wrapper around the wevtutil.exe system call.
Since system calls are expensive, I plan to have the iterator query all previous events the first time it is initialized and then return them one at a time: this first array of events can be many MB, while subsequent calls may yield just a few bytes. This seems to go against the good design of a lightweight iterator. Is it justifiable, to avoid possibly many system calls?
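One way to reconcile the two concerns is to let the iterator fetch in batches: the first next() triggers the single expensive query, and subsequent calls just shift cached events until the cache runs dry. A rough sketch (the class name and the fetch_cb callback are my inventions, not part of the module):

```perl
package My::EventIterator;   # hypothetical name, for illustration only
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    # fetch_cb is a code ref performing the expensive query (e.g. one
    # wevtutil.exe call) and returning the events newer than last_id
    my $self = {
        fetch_cb => $args{fetch_cb},
        last_id  => $args{last_id} // 0,
        cache    => [],
    };
    return bless $self, $class;
}

sub next {
    my ($self) = @_;
    # refill the cache with one (possibly large) batch when it is empty
    if ( !@{ $self->{cache} } ) {
        @{ $self->{cache} } = $self->{fetch_cb}->( $self->{last_id} );
        return undef unless @{ $self->{cache} };
    }
    my $event = shift @{ $self->{cache} };
    $self->{last_id} = $event->{id};    # advance the high-water mark
    return $event;
}

1;
```

The large first batch lives only inside the iterator; callers still see one event at a time, and later refills stay small because last_id advances with every event handed out.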
Thanks for reading.
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Maybe you are just starting out with Perl and don't have a computer set up
for Perl. Unix computers already include Perl, but maybe you don't have the
permissions to run it on your computer.
Maybe you just want to try a Perl module for some
external program that you don't want to install on your
machine. Maybe you just got a bug report for Linux that you
can't easily replicate. Maybe you are online and don't have access
to your home machine. Maybe you just need 5GB of storage quickly.
That's how you pay for it - with information about yourself. Google will monitor what programs you invoke
but not the command line parameters. In return, you get 5GB of permanent storage and a 2GB RAM virtual machine
that includes Perl 5.24, other programming languages, the Google Cloud SDKs and other stuff.
Using the Cloud Shell as a web development environment
The cloud shell also comes with an included web proxy so that you (and only you) can try out web applications
served from any web server on that machine. This makes the Cloud Shell a convenient testbed to try out web frameworks like Mojolicious, Dancer2, Dancer or even CGI::Application in PSGI mode.
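As a minimal smoke test of that setup, even a bare PSGI app with no framework at all will do. Save this as hello.psgi, run plackup -p 8080 hello.psgi (the port is an assumption; use whatever your web preview expects), and open the web preview:

```perl
# hello.psgi - a bare PSGI application, no framework required
my $app = sub {
    my ($env) = @_;
    return [
        200,
        [ 'Content-Type' => 'text/plain; charset=utf-8' ],
        [ "Hello from the Cloud Shell!\n" ],
    ];
};
$app;   # a .psgi file must return the application code ref
```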
"Your code sucks." I've said that more than once, some times quite literally, more recently, I tend to wrap it in a minimal bit of politeness. And much more often, I say "This [3rd party] code sucks". In fact, saying "your code sucks" and even "your design sucks" is part of my work.
YEAH! I'M PAID FOR BEING AN ASSHOLE!
Well, actually not.
Being paid for being a beancounter
At work, we write code for our embedded systems that work in industrial, medical and aerospace environments. Some of our systems are quite harmless, less dangerous than a lamp. To cause damage or to harm people, you would have to throw the systems at people. But most systems have real-time requirements, control potentially dangerous machines or oxygen supplies, or similar stuff. So errors may cause real damage, hurt or kill people. One way to reduce risks is to do peer reviews, starting way before we even think about writing code. Of course, code is also peer reviewed in nearly all of our projects. We are quite used to poke in other people's code, search weak points, and do bean counting. It improves not only our products, but also the way we write code.
Saying "your code completely sucks" is extremely rare. In fact, most times, it's the little details. Last minute changes in the code, hastily and/or interrupted, leaving a little bit of mess. Misleading names, documentation that was not updated to match changed code, left-over comments from previous iterations, a missing case in a switch, ignoring the style guide, you name it. At the end of a peer review, we have a list of problems in the code, and usually, author and reviewer agree without discussion that and how those problems have to be fixed. Sometimes, the author has to justify why and how (s)he has written a piece of code, and that this way is in fact correct. In those cases, the usual problem is lack of comments and/or documentation in the code.
Impedance mismatch
"Your code sucks" does not mean "you suck".
A while ago, we had a project that had grown too big for our little team, so we decided to subcontract a small, quite independent part of the project to an external developer. We drafted a minimal requirements list and an interface specification, added our style guide, and had a meeting with the external developer. We gave him a suitable development board, hacked to the point that relevant parts were similar to the real product, a lot of ready-to-use hardware drivers, and waited for him to come back with working code.
A second aspect of this approach was to search for someone who could help us in future projects by taking over parts of the development in busy times.
What came back was a big mess of spaghetti code, completely ignoring the style guide, lacking documentation, and hardly working at all. A classic case for "your code sucks big time", but let's face it: If you search an external developer for long-term relations, you try to be positive and helpful: "Look, we need the code to match our style guide. That's written in the contract with our customer. We need documentation, and it has at least to compile on the target CPU. Yes, your dev board has a different CPU. Use #define and #ifdef instead of hardcoding. Compile for our target, even if you can't run that on the CPU we gave you. Do this, add that, remove those, don't copy and paste, use functions, bla bla bla. This is how to use doxygen: Just add an extra asterisk at the start of the comment, bla bla bla."
Wash, rinse, repeat. The next iteration still sucked. And so did the third one. My written response to the fifth or sixth iteration was (not literally!) "your code sucks". I explained that every iteration took me more than an hour just to make it compile. I explained that we agreed on the expected behaviour of the code, but the code did not show that behaviour. That the behaviour and the form of his code were not acceptable. And that he was hired to save us time, not to cost us time.
Half an hour later, my boss came around, telling me that the external developer had cancelled the contract because of my mail. The external developer had read it as "you suck". Well ...
We agreed that my mail was not very polite, but also that the entire mail (and all previous ones) just criticized the code and documentation. My boss phoned him, and discussed more than an hour. They finally agreed on a final day in our office for handing over the code and make it run on the target system. The external developer worked with me on my computer, and we made his part of the software work on the target and added a lot of documentation.
End of the story: We had the required part of software, in a state that worked, but was still ugly. We didn't change much after that day, and so that part is still ugly. It works, and I would like to clean up the last dirty corners, but it's not worth the time. Oh, and that external developer won't be hired for new jobs.
You are too academic
In a previous job in a medical environment, I had to write an interface between an existing piece of software and a new laboratory machine that replaced an older one. The machine reported its data via RS232, and a simple external program wrote the data into a file on a file server. So I copied the old driver into a new file and tried to make sense of the existing code.
The system was written in what was originally a subset of C, but has evolved into some mix of the Hunchback of Notre-Dame, Gollum, and Salvatore from "The Name of the Rose". Not quite ideal conditions for writing safe code for experienced developers, but usable. Unfortunately, the software was written by a salesman that originally just sold the IDE for Hunchback-Gollum-Salvatore (HGS - not the real name, of course). He was hired to use HGS to write that medical software, ignoring all rules for developing medical software, and bypassing the in-house IT department. He had no idea how to write software, he had no idea how to plan his time, and gave unreasonable promises of what the software would be able to do in no time. To make things even worse, a research diver was hired to help him developing the software.
I was hired to replace the salesman-developer.
So I read the driver code. It was just a single function. 1500 or 2000 lines of code in a single function. Some parts were copied five or six times instead of moving them to functions. And I found errors. Many errors. Obvious errors. Errors that no sane developer would make. Well, I was new on the job, and I was not sure if I understood all of HGS. So I RTFM, twice to be sure. I found that HGS documented that comments are "like in C". In C, comments don't nest. In the HGS compiler, comments don't nest. But in the editor of the HGS IDE, they do nest. So you end up with code that looks like it is commented out, but it is not. The compiler happily compiles it, and the runtime executes what looks like a comment in the IDE. I found that fopen returns a handle, or 0 on error. But alas, there is no way to find out what error has happened. Permissions, lack of privileges, locking, network error, non-existing file? You just get a 0 back from fopen(). There is no errno. No try-catch. No exceptions. And I found at least four more bugs in HGS itself.
Back to the driver code. There were errors about every 10 lines of code, and they were real errors, even in HGS. I looked at some of the other code, and found about the same error rate. So I used a little bit of Perl to simply extract all of the code that the salesman and the diver had written, and made Perl count the lines of code. Then I multiplied the lines of code by the error rate from the driver code. Tens of thousands of potential errors in medical software does not sound sane, does it?
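The original counting script is long gone, but the idea only takes a few lines of Perl. This is a sketch, not the original code: the helper name, the sample source text, and the one-error-per-ten-lines rate are illustrative (in the real script the source would have been slurped from all the HGS files on disk).

```perl
use strict;
use warnings;

# Count non-blank lines in a chunk of source text.
sub count_code_lines {
    my ($source) = @_;
    return scalar grep { /\S/ } split /\n/, $source;
}

# Roughly one real error every 10 lines, as seen in the driver code.
my $errors_per_line = 1 / 10;

# Stand-in for the slurped HGS sources.
my $source = <<'END_SRC';
x := 1

y := 2
END_SRC

my $loc = count_code_lines($source);
printf "%d lines of code, roughly %d potential errors\n",
    $loc, $loc * $errors_per_line;
```

Multiply a real code base's line count by that rate and the "tens of thousands" figure falls out on its own.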
At the next management meeting, I raised the issue. I explicitly stated that the total number of errors was a rough estimate that might be too high by a factor of 10 if we were lucky. But even then, thousands of potential errors would remain in software where a single error could lead to severe medical complications in an emergency situation. I recommended rewriting the software from scratch, because the existing code base was in a horrible state and HGS did nothing to improve the situation.
I was told by the managing director that I was "too academic". Well, if you prefer having the software help kill people ...
A merger changed a lot of priorities, and so that piece of crap was assigned to someone else, in a different federal state. I was quite happy with that decision, and even more when I heard that they had decided to outsource that project and have it reworked.
About two years later, the new old software was presented to the management, the IT, and the laboratory teams. I sat in the rear corner, not really interested in that management show. "Look, new shiny buttons that look and work exactly like the old ones." But then someone from the laboratory team asked how much of the scary salesman code had survived. The presenter smiled. "We have removed almost all of that crap. It was actually easier just to start from zero than to fix the code. The software still looks and feels the same, but the errors are gone." I could not help smiling from ear to ear. The laboratory manager noticed that, raised his hand and said: "Look at Alexander, watch him smile. He told you to do exactly that two years ago."
Well, not exactly. I had recommended getting rid of HGS, but the company that reworked the software was an HGS shop, so HGS stayed. I have never read a line of the new code, but I'm sure that even the new code has race conditions and trouble coping with I/O errors, because it is very hard to avoid race conditions in HGS, and it is nearly impossible to do sane error handling in HGS.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
There are two hard problems in computer science: cache invalidation and naming things*. Questions about naming things, for me, show up the most when designing a new class (and I wrap nearly all my code into a class, these days). Since I used to earn my rent by writing code in C++, my method names in Perl tend to reflect that heritage.
The most obvious example of this pattern is the way I name object constructors. If I have a package, Xyzzy, the constructor for that class is usually called Xyzzy::new. When the initialization of an object is expensive, I would wrap the constructor in a singleton design pattern, and call that method new. A simplified implementation of this constructor might look like the following:
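A minimal sketch of such a new-wrapped-singleton constructor, using the Xyzzy package from above (the `_expensive_init` helper is a hypothetical stand-in for whatever costly setup the real class does):

```perl
package Xyzzy;
use strict;
use warnings;

my $instance;    # cached singleton, shared by every caller

sub new {
    my ( $class, %args ) = @_;

    # Return the cached object if one exists; otherwise build it once.
    $instance //= do {
        my $self = bless {%args}, $class;
        $self->_expensive_init;
        $self;
    };
    return $instance;
}

# Stand-in for expensive setup (loading config, opening connections, ...).
sub _expensive_init {
    my ($self) = @_;
    $self->{initialized} = 1;
    return;
}

1;
```

Every call to `Xyzzy->new` after the first silently hands back the same cached object, which is exactly the concealment described above.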
This design pattern allowed me to conceal the singleton nature of a Xyzzy, and I used to think that was a good thing.
Recently, however, the needs of my job called for me to write a substantial quantity of code in the Programming Language That Shall Not Be Named. That language was written with a philosophy that directly opposes TMTOWTDI: for any task you want to perform, there is One True Way you must do it. It is a philosophy that complicates the implementation of simple one-liners, but greatly reduces, I suspect, the time spent grading test questions which must be answered by writing code in that language.
One of the True Way conflicts I encountered while working in this programming language was the implementation of singleton constructors. You cannot choose a different name for your constructor, and the memory allocation of the object is done externally before your constructor code gets invoked. In short, there is simply no way you can override the constructor with a singleton allocator, and any class method that implements a singleton design pattern must be explicitly invoked by the caller. This leaves me with something like
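Translated back into Perl terms, a sketch of that explicit-singleton naming, again with Xyzzy as the example class, could be:

```perl
package Xyzzy;
use strict;
use warnings;

my $instance;    # cache for the explicitly requested singleton

# Ordinary constructor: every call builds a fresh object.
sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Explicit accessor: callers who want the shared object must ask by name.
sub instance {
    my ( $class, @args ) = @_;
    $instance //= $class->new(@args);
    return $instance;
}

1;
```

Now `Xyzzy->instance` always returns the one shared object, while `Xyzzy->new` hands back a fresh, private one, and nothing is concealed from the caller.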
Having switched to this new nomenclature, I find that singleton instances are reused only when I want them to be. True, that’s almost always, but this naming technique does leave me the option of constructing a fresh object instance if I want to do something ugly to it and don’t want to risk polluting the cache. On the other hand, if I want to apply this pattern to a class that is already widely used, I have to go on a global search-and-destroy mission, replacing constructor calls with calls to instance(), before I can benefit from the performance improvement that comes from using a singleton.
These days, I still find myself banging my head on the desk when none of the four different ways I might solve a problem in Perl can be applied in The Other Language, but I think this one particular technique is beginning to grow on me. And I’m glad Perl follows TMTOWTDI; it allows me to bring these new techniques back into my regular job, and benefit from them here, as well.
*Also, off-by-one errors. But if I had said that up above, someone may have accused me of being swayed by that Other Programming Language into parroting the Spanish Inquisition sketch – and that is a dead parrot.