Re^2: Avoiding compound data in software and system design

Re^3: Avoiding compound data in software and system design by BrowserUk (Patriarch) on Apr 21, 2010 at 20:47 UTC
"You are confusing a complex of objects with compound data."

No, I'm not. You are making an artificial separation where none exists.

Take URLs. These are both complex and compound. And simple. Whilst there are (many) modules like URI* that allow you to treat these as objects and access all their internal bits separately, the vast majority of modules that take URLs as inputs (e.g. LWP) take them in their simple string form. Why? Because they do not care what is inside, and do not want to have to deal with it.

For most applications of those latter modules, the user will be supplying a 'simple string', picked out of a text file (log file; HTML; whatever), and all they need or want to know is: can I reach it? If they had to tease apart the myriad forms of url/uri/urn formats in order to populate a URL object in order to pass it to LWP--which would promptly just stick all the bits back together again--it would be an entirely unnecessary waste of time & resources. Complexity without merit or benefit.

Same goes for file system entities. We pass open a string, not some kind of FileSystem::Object. Because for the most part, they are simply opaque scalar entities that we use. Not pick apart and fret over.

And the same goes for your example of DBI data source names. At the DBI level, and below, they are simply opaque entities to be gathered and passed through uninspected. Requiring some kind of object be used for them would create unnecessary and useless complexity.

They do not even have a consistent constitution. Your example breaks them down as

`dbi mysql database host port`

And then as

`__PACKAGE__->register_db( driver => 'pg', database => 'my_db', host => 'localhost', username => 'joeuser', password => 'mysecret', );`

but you've lost two parts (dbi/port) and gained two parts (user/pass).

And then you get something like DBD::WMI, which doesn't need and cannot use most of those--either set of 5. And DBD::SQLite also has no use for most of those fields. And these came into being long after the DBI/DBD interfaces were designed and implemented.

Rather than something to be "avoided", DBI's use of a string for the data source name is the sign of a well-thought-through, flexible interface. One that recognises that you cannot fit the world into labelled boxes, and that in many situations, there is no purpose in trying.

You should be celebrating the vision and skill of those authors for designing an interface so flexible it can accommodate future developments without requiring constant re-writes as time passes and uses evolve. Not decrying them.

Consider: Will your interfaces survive so long, so well?

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy
Re^4: Avoiding compound data in software and system design by metaperl (Curate) on Apr 22, 2010 at 15:51 UTC
"You are making an artificial separation where none exists."

We will see about that (grin). But the distinction is simple: conceptual elements belong in separate data elements or in a single element with straightforward access. The DBI DSN string has several conceptual elements which are not in separate data elements. And access is not straightforward: had a hash reference been used, access would be more straightforward, with no loss in API quality.

But like I said in the opening post of this thread: "Typically people either know this and don't need to be told or they don't know it and don't care :)"

So it's almost like screaming at a wall.

But your comments about URLs are well taken. I thought about that this morning when I woke up. And in a sense, you could consider DSNs as a form of URL. In fact, SQLAlchemy uses URLs instead of DSNs.

"Rather than something to be 'avoided', DBI's use of a string for the data source name is the sign of a well-thought-through, flexible interface. One that recognises that you cannot fit the world into labelled boxes, and that in many situations, there is no purpose in trying."

I don't agree: it requires more parsing to decide which DBD to dispatch to this way.

"You should be celebrating the vision and skill of those authors for designing an interface so flexible it can accommodate future developments without requiring constant re-writes as time passes and uses evolve. Not decrying them."

`$dsn` as a hash reference would have been just as flexible and much finer grained. And it would not suffer from a case of compound data. And the code to decide which DBD to dispatch to would've been more succinct. And I would not have had to write DBIx::DBH in order to work with Rose::DB and DBI interchangeably.

The Rose::DB API has finer granularity and does not suffer from the compound data issues that the DBI one does: connection info from Rose::DB can be converted into DBI connection info in a simple fashion; the reverse is not so.

The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development. -- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"
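The "hash to DSN" direction claimed above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: `dsn_from_hash` is a hypothetical helper (it is not part of DBI, Rose::DB, or DBIx::DBH), and it assumes a driver that accepts semicolon-separated `key=value` pairs after the driver name, which not every DBD does.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of DBI or Rose::DB: build a DSN string
# from a hash of connection info. Assumes the driver takes
# semicolon-separated key=value pairs after "dbi:<driver>:".
sub dsn_from_hash {
    my (%info) = @_;
    my $driver = delete $info{driver};

    # Sort the keys so the resulting string is deterministic.
    my $rest = join ';', map { "$_=$info{$_}" } sort keys %info;

    return "dbi:$driver:$rest";
}

my $dsn = dsn_from_hash(
    driver => 'Pg',
    dbname => 'my_db',
    host   => 'localhost',
);

print "$dsn\n";    # prints dbi:Pg:dbname=my_db;host=localhost
```

Going the other way means parsing the string, and the parse rules depend on which driver the string was written for.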
Re^5: Avoiding compound data in software and system design by BrowserUk (Patriarch) on Apr 22, 2010 at 23:45 UTC
"Typically people either know this and don't need to be told or they don't know it and don't care :) So it's almost like screaming at a wall."

You are deluding yourself. The DBI interface has been around for 15 years. And you are the first person to see this 'need'?

"And I would not have had to write DBIx::DBH in order to work with Rose::DB and DBI interchangeably."

Have you looked inside Rose::DB? Have you looked at all the code and utterly pointless machinations it goes through in dealing with that hash in order to do what? To tack all the bits together into a string and pass it on to DBI!

And what does it achieve? Nothing! Just a couple of hundred extra lines of code that complicate the interface and slow things down for no net gain whatsoever.

Rose::DB is essentially a wrapper over DBI. And you're writing a wrapper over that wrapper so that you can "use them interchangeably".

Sir! Your logic is flawed. Even though you cannot see it. Your logic is flawed.
Re^6: Avoiding compound data in software and system design by siracusa (Friar) on Apr 28, 2010 at 04:29 UTC
Re^7: Avoiding compound data in software and system design by BrowserUk (Patriarch) on Apr 28, 2010 at 06:33 UTC
Re^4: Avoiding compound data in software and system design by metaperl (Curate) on Apr 28, 2010 at 14:23 UTC
"No I'm not. You are making an artificial separation where none exists."

Everything a human does is 'artificial' - I think what you mean is superficial or arbitrary. And as this thread shows, even EF Codd was somewhat vague and arbitrary in specifying what constituted atomic data.

So yes, you're right, the definitions are vague and somewhat subjective. But throwing some light and angst on the issue should make us more aware and intelligent in future API decisions.
Re^5: Avoiding compound data in software and system design by BrowserUk (Patriarch) on Apr 28, 2010 at 17:06 UTC
EF Codd, eh?

Circa 1981, I had to do a CS project, and having read an article (in Byte, I think) on Codd's paper, I wrote up the proposal for my project as: "A simple exploration of the Relational Model". To be written in BASIC Plus 2. And yes, BASIC.

I had one term to write it. It took 6 weeks for the college library to obtain a photocopy of the paper--it had to come from the British Library in London, the only people in the UK who had a copy. It was a photocopy, of a photocopy, of a bound paper, with all the distortions and fuzzy greyness that entails. It took me two whole weeks to read it--I understood very little of it. So there I was with half my time gone and nothing to show for it.

Back to the point. And that is, all DBI needs to know is the first two fields of the DSN. The first must match 'dbi' (case-insensitively); the second must match a module "DBD::<2ndfield>" that is installed locally. What comes after that is none of its concern. It just gets passed through to the loaded DBD driver.

And the forms of that opaque token are myriad. A quick survey turns up:

    $dbh = DBI->connect("dbi:Informix:$database", $user, $pass, %attr);
    $dbh = DBI->connect("DBI:Unify:dbname[;options]" [, user [, auth [, attr]]]);
    $dbh = DBI->connect("dbi:Oracle:host=$host;sid=$sid", $user, $passwd);
    $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","");
    $dbh = DBI->connect("DBI:drizzle:database=test;host=localhost", "joe", "joe's password", {'RaiseError' => 1});
    $dbh = DBI->connect('dbi:ODBC:DSN', 'user', 'password');
    $dbh = DBI->connect("dbi:Pg:dbname=$dbname", '', '', {AutoCommit => 0});
    $dbh = DBI->connect('DBI:RAM:','usr','pwd',{RaiseError=>1});
    $dbh = DBI->connect("DBI:Wire10:host=$host", $user, $password, {'RaiseError' => 1, 'AutoCommit' => 1});
    $dbh = DBI->connect("DBI:CSV:f_dir=/home/joe/csvdb");
    $dbh = DBI->connect("dbi:JDBC:hostname=$hostname;port=$port;url=$url", $user, $password);
    $dbh = DBI->connect("dbi:Sqlflex:$database", $user, $pass, %attr);
    $dbh = DBI->connect("dbi:DB2:db_name", $username, $password);
    $dbh = DBI->connect("DBI:mysql:database=test");
    $dbh = DBI->connect('DBI:DBMaker:' . $database, $user, $pass);
    $dbh = DBI->connect("dbi:PgPP:dbname=$dbname", '', '');
    $dbh = DBI->connect('dbi:PgLite:dbname=file');
    $dbh = DBI->connect("dbi:ADO:Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\test.mdb", $usr, $pwd, $att);
    $dbh = DBI->connect("DBI:Ingres:dbname[;options]", user [, password], \%attr);
    $dbh = DBI->connect('DBI:Solid:TCP/IP somewhere.com 1313', $user, $pass, 'Solid');
    $dbh = DBI->connect("dbi:Google:", $KEY);

Look at the variations once you get beyond the first two fields. Yes, you could keep these all separate in a hash, but to what end? You (as a DBI user) cannot do anything useful with them because there is insufficient consistency to make even validation judgements, much less anything else.

Even where several DBDs require, for example, a "dbname", for some this will have SQL identifier limitations--though even they aren't consistent across all SQL-like DBs. For some it will be a filename (with local filesystem semantics--case dependence (or not); reserved characters (or not); length limitations (or not)). For some, it's a hostname and port. For some--see the ADO example--it's a whole bunch of stuff entirely unique to that DBD. For some the subfields have to be prefixed with their tagname; others are position-dependent.

Why stick all these disparate bits into a hash and then have DBI concatenate the bits--risking getting it wrong because (for example) it adds tagnames where none are required, or the hash ordering screws up the position dependence; or ...?

To achieve all that, you'd need more than just a hash. You'd need one flag per field to decide whether the key name should be prepended to the field's value. You'd need another value to ensure ordering. You'd need yet another flag to ensure that (for example) backslashes in pathnames got escaped for interpolation.

And all of that complexity buys you what? The user can far more easily know what the requirements are for the DBD (or two; or three) he is going to use, than any programmer can try to unify into one generic interface structure that will stand the test of time.
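The "first two fields" rule described above can be sketched as follows. This is a simplified illustration, not DBI's actual code: `split_dsn` is a hypothetical helper, and it ignores the optional attribute list that a real DSN may carry in parentheses after the driver name. For real use, DBI provides its own `parse_dsn` method.

```perl
use strict;
use warnings;

# Hypothetical helper, not DBI's implementation: interpret only the
# scheme and the driver name, and hand back everything after the
# second colon untouched.
sub split_dsn {
    my ($dsn) = @_;

    # A limit of 3 keeps any further colons inside the opaque remainder.
    my ($scheme, $driver, $rest) = split /:/, $dsn, 3;

    return unless defined $driver && lc $scheme eq 'dbi';
    return ($driver, defined $rest ? $rest : '');
}

for my $dsn (
    'dbi:SQLite:dbname=test.db',
    'DBI:mysql:database=test;host=localhost',
    'DBI:Solid:TCP/IP somewhere.com 1313',
) {
    my ($driver, $rest) = split_dsn($dsn);
    print "DBD::$driver gets '$rest'\n";
}
```

Nothing in the sketch needs to know whether the remainder is tagged, positional, a filename, or a host and port; that knowledge stays with the driver.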
Re^6: Avoiding compound data in software and system design by siracusa (Friar) on Apr 29, 2010 at 12:33 UTC
Re^7: Avoiding compound data in software and system design by BrowserUk (Patriarch) on Apr 29, 2010 at 17:04 UTC
Re^6: Avoiding compound data in software and system design by Hue-Bond (Priest) on Apr 29, 2010 at 13:18 UTC
Re^3: Avoiding compound data in software and system design by ikegami (Patriarch) on Apr 28, 2010 at 18:23 UTC
"You did this: `my $vals = "a:1,b:2"` then you would have an apparently atomic data item that is really not atomic, because you would have to do string-twiddling to extract relevant subparts."

I don't see why searching through an associative array stored as "a:1,b:2" makes the type not atomic when the example you used for an atomic type ({a=>1,b=>2}) is an associative array that requires searching through a list of buckets then through a linked list.
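For comparison, here is a minimal sketch of the two forms under discussion, assuming the packed string uses `,` between pairs and `:` between key and value, as in the quoted example. Both end up answering the same lookup; the difference is who does the unpacking and when.

```perl
use strict;
use warnings;

my $vals = "a:1,b:2";

# Packed string: split into pairs, then split each pair into key and value.
my %parsed = map { split /:/, $_, 2 } split /,/, $vals;

# Hash reference: Perl's own hashing does the searching.
my $hash = { a => 1, b => 2 };

print "$parsed{b} $hash->{b}\n";    # prints 2 2
```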

