Re: Avoiding compound data in software and system design

by BrowserUk (Pope)
on Apr 21, 2010 at 00:20 UTC ( #835918=note )


in reply to Avoiding compound data in software and system design

Sorry, but this smacks of: I just got bitten by something, so now I'm gonna demonise it.

  1. Are hashes evil? They consist of keys and values.
  2. Floats? Exponent and mantissa.
  3. Integers? Magnitude and sign.
  4. Bytes? Many bits.
  5. Strings? ...
  6. Objects? ...

I can't use my new vacuum cleaner in its box, but I'm glad it came in one.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Avoiding compound data in software and system design
by herveus (Parson) on Apr 21, 2010 at 12:54 UTC
    Howdy!

    Those are all red herrings.

    Hashes are not scalar data; they are a collection of values indexed by key. Hashes using the old sub-key thingy that predated references would be an example, but not because they are hashes.

    The scalar data types are usefully atomic. If you need to work with the sub-parts of the underlying representation, you get to disassemble them yourself. Strings, per se, are only compound insofar as you define the values to be so and need to work with individual parts. Objects, more or less by definition, *can* have numerous attributes, but the parts are explicit and individually addressable (for most sane implementations).

    I see the point; it needs to be applied judiciously.

    yours,
    Michael

      The DSN data type is usefully atomic. If you need to work with the sub-parts of the underlying representation, you get to disassemble them yourself.

      Seeing as you can swap in DSN for a scalar data type, what you said of scalar data types applies to DSNs as well.

      By your logic, the problem isn't the compoundness of DSNs, it's the lack or perceived lack of tools to manipulate DSNs.

Re^2: Avoiding compound data in software and system design
by metaperl (Curate) on Apr 21, 2010 at 14:13 UTC
    Sorry, but this smacks of: I just got bitten by something, so now I'm gonna demonise it
    Yes, it's called evolution. Intelligence is the ability to identify, formulate and resolve problems. So this post was made to identify and formulate a problem in hopes that it is not repeated. And yes, I did get bitten by the DBI API and now I have to go redo something so it works with Rose.

    Continuing, let me present the definition of compound data to you once again:

    A compound datum is an apparently atomic data item that is really not atomic.
    Are hashes evil? They consist of keys and values.
    Evil? You brought demons into the picture, not me. The point at hand is "apparently atomic". Hashes are not apparently atomic: you dissected them into their parts yourself.

    Now, if instead of this hash:

    %a = (a => 1, b => 2);
    You did this: my $vals = "a:1,b:2", then you would have an apparently atomic data item that is really not atomic, because you would have to do string-twiddling to extract the relevant subparts.
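To make the string-twiddling concrete, here is a minimal sketch in plain Perl. It assumes the packed format uses ',' between pairs and ':' between key and value, which is only what the example string suggests; nothing in the thread defines that format.

```perl
use strict;
use warnings;

# The packed form: looks like one scalar, but hides two key/value pairs.
my $vals = "a:1,b:2";

# Unpacking requires knowing the (assumed) format:
# ',' separates pairs, ':' separates a key from its value.
my %unpacked = map {
    my ($k, $v) = split /:/, $_, 2;
    ($k => $v);
} split /,/, $vals;

# The hash form exposes the same parts with no parsing step.
my %a = (a => 1, b => 2);

print "packed b = $unpacked{b}, hash b = $a{b}\n";
```

Every consumer of the packed string has to repeat, or share, that format knowledge; the hash carries it in its structure.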
    Floats? Exponent and mantissa.
    Seems atomic to me. And the subparts you mention, can they be easily accessed/used?
    Integers? Magnitude and sign.
    or 32 bits (grin).
    my $int = Integer->new(magnitude => 12, sign => '+');
    ah, perfect decomposition!

    My post did not say it listed all examples of compound data. And if there are more, then fine. Besides, the focus was on software and system design, not language elements.

    Bytes? Many bits.
    Again, complex data is not 'compound data'. Compound is a specific term referring to a specific mistake in software and system design.
    Strings?
    Yes, they are complex, but only compound when mis-used.
    Objects? ...
    Yes, an object is atomic, not apparently atomic. It may have subparts, but each has a well-defined means of accessing/changing it.
    I can't use my new vacuum cleaner in its box, but I'm glad it came in one.
    You are confusing a complex of objects with compound data. The vacuum cleaner's relation to the box was meaningful and useful. Packing multiple datums into a string is counter-productive to flexible software and system design, as was demonstrated.



    The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development.

    -- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"

      You are confusing a complex of objects with compound data.

      No I'm not. You are making an artificial separation where none exists.

      Take urls. These are both complex and compound. And simple.

      Whilst there are (many) modules like URI* that allow you to treat these as objects and access all their internal bits separately, the vast majority of modules that use urls as inputs (e.g. LWP*) take them in their simple string form. Why?

      Because they do not care what is inside, and do not want to have to deal with it. For most applications of those latter modules, the user will be supplying a 'simple string', picked out of a text file (log file; html; whatever), and all they need or want to know is, can I reach it?

      If they had to tease apart the myriad forms of url/uri/urn formats in order to populate a ur* object in order to pass it to LWP*--that would promptly just stick all the bits back together again--it would be an entirely unnecessary waste of time & resources. Complexity without merit or benefit.

      Same goes for file system entities. We pass open() a string, not some kind of FileSystem::Object. Because for the most part, they are simply opaque scalar entities we use. Not pick apart and fret over.

      And the same goes for your example of DBI data source names. At the DBI level, and below, they are simply opaque entities to be gathered and passed through uninspected. Requiring some kind of object be used for them would create unnecessary and useless complexity.

      They do not even have a consistent constitution. Your example breaks them down as

      dbi mysql database host port

      And then as

      __PACKAGE__->register_db( driver => 'pg', database => 'my_db', host => 'localhost', username => 'joeuser', password => 'mysecret', );

      but you've lost two parts (dbi/port) and gained two parts (user/pass).

      And then you get something like DBD::WMI, which doesn't need and cannot use most of those--either set of 5. And DBD::SQLite that also has no use for most of those fields. And these came into being long after the DBI/DBD interfaces were designed and implemented.

      Rather than something to be "avoided", DBI's use of a string for the data source name is the sign of a well-thought-through, flexible interface. One that recognises that you cannot fit the world into labelled boxes, and that in many situations, there is no purpose in trying.

      You should be celebrating the vision and skill of those authors for designing an interface so flexible it can accommodate future developments without requiring constant re-writes as time passes and uses evolve. Not decrying them.

      Consider: Will your interfaces survive so long, so well?


        You are making an artificial separation where none exists.
        we will see about that (grin)

        But the distinction is simple: conceptual elements belong in separate data elements or in a single element with straightforward access. The DBI DSN string has several conceptual elements which are not in separate data elements. And access is not straightforward: had a hash reference been used, access would be more straightforward, with no loss in API quality.
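As a rough illustration of the access difference being claimed, the sketch below pulls the host out of a DSN string by hand and out of a hash reference. The DSN value and the hash-reference layout are assumptions made for the example: the text after the driver name is driver-specific, and the hash-reference form is hypothetical, not something DBI accepts.

```perl
use strict;
use warnings;

# Hypothetical DSN; the key=value layout after the driver name is driver-specific.
my $dsn = 'dbi:mysql:database=my_db;host=localhost;port=3306';

# String form: split off scheme and driver, then parse the driver-specific rest.
my ($scheme, $driver, $rest) = split /:/, $dsn, 3;
my %from_string = map {
    my ($k, $v) = split /=/, $_, 2;
    ($k => $v);
} split /;/, $rest;

# Hash-reference form (hypothetical, for comparison only; not a DBI feature).
my $dsn_ref = {
    driver   => 'mysql',
    database => 'my_db',
    host     => 'localhost',
    port     => 3306,
};

print "string form host:  $from_string{host}\n";
print "hashref form host: $dsn_ref->{host}\n";
```

The hand-rolled parse only works for drivers that happen to use this `key=value;key=value` layout, which is the other side of the argument: a driver whose DSN tail looks different would need different parsing, or none.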

        But like I said in the opening post of this thread: Typically people either know this and don't need to be told, or they don't know it and don't care :) So it's almost like screaming at a wall.

        But your comments about URLs are well-taken. I thought about that this morning when I woke up. And in a sense, you could consider DSNs as a form of URL. In fact, SQLAlchemy uses URLs instead of DSNs.

        Rather than something to be "avoided", DBI's use of a string for the data source name is the sign of a well-thought-through, flexible interface. One that recognises that you cannot fit the world into labelled boxes, and that in many situations, there is no purpose in trying.
        I don't agree: it requires more parsing to decide which DBD to dispatch to this way.
        You should be celebrating the vision and skill of those authors for designing an interface so flexible it can accommodate future developments without requiring constant re-writes as time passes and uses evolve. Not decrying them.
        $dsn as a hash reference would have been just as flexible and much finer grained. And it would not suffer from a case of compound data. And the code to decide which DBD to dispatch to would've been more succinct. And I would not have had to write DBIx::DBH in order to work with Rose::DB and DBI interchangeably.

        The Rose::DB API has finer granularity and does not suffer from the compound data issues that the DBI one does: connection info from Rose::DB can be converted into DBI connection info in a simple fashion, vice versa not so.
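The "simple fashion" direction, separate fields into a DSN string, might look like the sketch below. The helper name `dsn_from_fields` is hypothetical and this is not how Rose::DB or DBIx::DBH actually does it; the attribute names a driver accepts are also driver-specific, so the keys here are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: joins separate connection fields into a DBI-style DSN.
# Not taken from Rose::DB or DBIx::DBH; attribute names are driver-specific.
sub dsn_from_fields {
    my (%f) = @_;
    my $driver = delete $f{driver};

    # Sort the remaining keys so the resulting string is deterministic.
    my $attrs = join ';', map { "$_=$f{$_}" } sort keys %f;
    return "dbi:$driver:$attrs";
}

my $dsn = dsn_from_fields(
    driver   => 'mysql',
    database => 'my_db',
    host     => 'localhost',
);

print "$dsn\n";   # dbi:mysql:database=my_db;host=localhost
```

Going this way is a join; going the other way means knowing each driver's string layout, which is the asymmetry the post describes.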




      You did this: my $vals = "a:1,b:2", then you would have an apparently atomic data item that is really not atomic, because you would have to do string-twiddling to extract the relevant subparts.

      I don't see why searching through an associative array stored as "a:1,b:2" makes the type not atomic when the example you used for an atomic type ({a=>1,b=>2}) is an associative array that requires searching through a list of buckets then through a linked list.
