Says Tilly:
Essentially any API which makes data and
metadata easily confused should be viewed with suspicion.
Dan Bernstein says something similar in his
explanation of the design of qmail:
Don't parse.
I have discovered that
there are two types of command interfaces in the world of computing:
good interfaces and user interfaces.
The essence of user interfaces is parsing:
converting an unstructured sequence of commands,
in a format usually determined more by psychology than by solid engineering,
into structured data.
When another programmer wants to talk to a user interface,
he has to quote:
convert his structured data into an unstructured sequence of commands
that the parser will, he hopes,
convert back into the original structured data.
This situation is a recipe for disaster.
The parser often has bugs:
it fails to handle some inputs according to the documented interface.
The quoter often has bugs:
it produces outputs that do not have the right meaning.
Only on rare joyous occasions does it happen that
the parser and the quoter both misinterpret the interface in the same way.
When the original data is controlled by a malicious user,
many of these bugs translate into security holes.
Some examples:
the Linux login -froot security hole;
the classic find | xargs rm security hole;
the Majordomo injection security hole.
Even a simple parser like getopt
is complicated enough for people to screw up the quoting.
In qmail, all the internal file structures are incredibly simple:
text0 lines beginning with single-character commands.
(text0 format means that
lines are separated by a 0 byte instead of line feed.)
The program-level interfaces don't take options.
All the complexity of parsing RFC 822 address lists and rewriting headers
is in the qmail-inject program,
which runs without privileges and is essentially part of the UA.
Looked at from Dan's point of view, the second argument of two-argument
open is part of the user interface, a piece of unstructured
data (a string) which Perl must parse into structured data (a record that
lists the filename and the open mode). A program that wants to use
open to open a certain filename with a certain mode must
quote this information, turning it into a string, which is then given to
open, which then parses it back into a name and a mode.
As Dan predicts, the quoting and parsing processes don't always interpret
the interface in the same way. Three-argument open
allows the structured data to remain more structured throughout, because the
mode and filename never need to be combined into a single
unstructured string.
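
The difference can be seen in a small experiment. What follows is only a sketch, not anything from Dan or Tilly: the filename ">notes.txt" and the scratch directory are hypothetical, chosen so that the name begins with a character the two-argument parser treats as a mode, and it assumes a Unix-like filesystem where ">" is legal in filenames.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so the demo cannot touch real files.
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# Hypothetical filename whose first character looks like an open mode.
my $name = ">notes.txt";

# Three-argument open: mode and name stay separate, so this creates
# a file literally named ">notes.txt".
open(my $out, '>', $name) or die "create $name: $!";
print {$out} "hello\n";
close $out or die "close $name: $!";

# Two-argument open: the same string is parsed, the leading ">" is taken
# as the mode, and a different file, "notes.txt", is opened for writing.
if (open(my $fh2, $name)) {
    close $fh2;
}
my $stray_file_created = -e "notes.txt" ? 1 : 0;

# Three-argument open for reading finds the file we actually meant.
open(my $in, '<', $name) or die "read $name: $!";
my $line = <$in>;
close $in;

# Leave the scratch directory so it can be cleaned up at exit.
chdir $orig_dir or die "chdir $orig_dir: $!";

print "two-arg open created stray notes.txt: ",
      ($stray_file_created ? "yes" : "no"), "\n";
print "three-arg open read: $line";
```

The program never asked to write to notes.txt; the two-argument parser decided that on its own, which is the quoter and the parser disagreeing about the interface. A caller who wanted two-argument open to behave would have to quote the name first (for example by prefixing "./"), and that is exactly the step that three-argument open makes unnecessary.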
--
Mark Dominus
Perl Paraphernalia