Re: OO concepts and relational databases
by stvn (Monsignor) on Aug 02, 2004 at 16:40 UTC
Very nice Meditation! I very much agree with you on most points. It has long been discussed that there is a mismatch between Relational databases and OO programming. A quick aside to link to some of that information for those interested:
And now back to your regularly scheduled meditation response :)
1. Database decomposition is very different from class decomposition
I can't argue with your first point, although I think that what really needs to happen on both ends is a relaxing of the almost maniacal drive for "purity". I tend to agree with the author of the article mentioned above that some of this mismatch may be cultural, or at least that the cultural aspects of it cannot be ignored if we want to solve it.
2. Well-written OO code is inefficient
I cannot argue the details of this, but I do think that maybe it is not as true as it used to be. In the old days of LISP, the big argument was that LISP was slow since it was interpreted, so we should all use FORTRAN or COBOL or some other language-of-the-day. Over the years, though, LISP has gotten a lot faster, due not only to a number of very smart people working on the problem for 30 or so years now, but also because compiler technology and optimization ideas have evolved with it. Take a look at Standard ML's compilation model to see where functional languages have gone.
I think the same is happening with OO, although again, I cannot argue it in detail. But it seems to me that there is plenty of work on the ideas of improving the OO compilation model, and optimizing things like method dispatch, and such. After all OO is all about abstraction, so eventually the optimization of OO code will get abstracted away by the compiler.
However, your meditation is not about OO compilation techniques :)
"Sounds simple, right? Well, each database hit might have actually been up to five (or more) queries."
I would argue that if it takes up to 5 queries for each DB hit, you might want to rethink your approach to this.
Personally, for things like a Bank object, I like to use long-lived Singletons, which hold all but the more volatile data in their instance (of course, this will not work in all scenarios, but for the sake of argument/example ...).
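To make the Singleton idea a bit more concrete, here is a minimal sketch in Python using an in-memory SQLite database. Everything in it (the `Bank` class, the `bank` and `account` tables, the `query_count` counter) is made up for illustration and is not taken from the original meditation. The stable data is read once when the instance is created, while the volatile balance is always read live.

```python
import sqlite3


class Bank:
    """Hypothetical long-lived singleton: stable data cached, volatile data read live."""

    _instance = None

    def __init__(self, conn):
        self.conn = conn
        self.query_count = 0  # only here so the effect of caching is visible
        # Stable data is fetched once and kept in the instance.
        row = self._query("SELECT name, routing_number FROM bank WHERE id = 1")[0]
        self.name, self.routing_number = row

    @classmethod
    def instance(cls, conn):
        # Create the object on first use, then keep handing back the same one.
        if cls._instance is None:
            cls._instance = cls(conn)
        return cls._instance

    def _query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()

    def balance(self, account_id):
        # Volatile data is never cached; it always comes from the database.
        rows = self._query("SELECT balance FROM account WHERE id = ?", (account_id,))
        return rows[0][0]


# Assumed schema and sample rows, purely for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bank (id INTEGER PRIMARY KEY, name TEXT, routing_number TEXT);
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
    INSERT INTO bank VALUES (1, 'First Example Bank', '000000000');
    INSERT INTO account VALUES (42, 1500);
""")

bank = Bank.instance(conn)
same_bank = Bank.instance(conn)  # no second load of the stable data
current = bank.balance(42)       # one query, for the volatile part only
```

Asking for `bank.name` any number of times after this costs no further queries. The obvious catch is the one already noted: if the "stable" data changes behind the object's back, something has to invalidate it.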
I also have experimented with creating multiple "views" of Person-like objects, in which I retrieve certain sub-sets of the data. Obviously when I am looking for the checking account balance, I don't care much about personal information like the address of the Person. Of course this ends up being a compromise in the end, because I end up having to supply more information than normal about my "Person" needs. But, like I said, I am still playing with the idea, so we shall see.
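A rough sketch of what such "views" could look like, again in Python with SQLite. The `person` table, the view names and the `load_person` function are all hypothetical. The point is only that the caller has to say which slice of the Person it needs, which is exactly the extra information mentioned above.

```python
import sqlite3

# Hypothetical named views: each one lists the columns a kind of caller needs.
PERSON_VIEWS = {
    "contact": ("name", "address"),
    "banking": ("name", "checking_balance"),
}


def load_person(conn, person_id, view):
    """Fetch only the columns belonging to the requested view."""
    # Column names come from the fixed table above, never from user input.
    columns = PERSON_VIEWS[view]
    sql = f"SELECT {', '.join(columns)} FROM person WHERE id = ?"
    row = conn.execute(sql, (person_id,)).fetchone()
    return dict(zip(columns, row))


# Assumed schema and sample row, purely for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT,
        address TEXT,
        checking_balance INTEGER
    );
    INSERT INTO person VALUES (1, 'Alice', '1 Main St', 250);
""")

banking = load_person(conn, 1, "banking")  # no address fetched
contact = load_person(conn, 1, "contact")  # no balance fetched
```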
My point here is that if this is a non-trivial application, and you have chosen to go with an OO-Relational mapping paradigm, you would likely not implement it in a fashion that would be naive to your performance/scalability needs.
Getting the most out of a database is HARD
Of course, that's why you hire a DBA :-P
"Unless you're a math major, you may have never heard of set theory, let alone understand how it can affect you."
Funny, I was a Fine-Arts Painting major (a.k.a. studying to be a waiter), and I have heard of set theory :).
"If you think about SQL as a way of using set theory, it's a lot easier to work with."
This is very true, although it only works if you understand set theory. I also think it helps to know about state machines and Finite State Automata to understand regular expressions. But I think for many people, knowing "theory" is more a barrier to entry than it is a helping hand.
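For anyone who has not seen the connection spelled out, here is a small illustration (Python with SQLite; the `checking` and `savings` tables and the customer names are invented). The SQL compound operators `INTERSECT`, `EXCEPT` and `UNION` line up with the set operations of the same names.

```python
import sqlite3

# Assumed tables and rows, purely for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE checking (customer TEXT);
    CREATE TABLE savings (customer TEXT);
    INSERT INTO checking VALUES ('ann'), ('bob'), ('cy');
    INSERT INTO savings VALUES ('bob'), ('cy'), ('dee');
""")


def names(sql):
    """Run a query and return its first column as a Python set."""
    return {row[0] for row in conn.execute(sql)}


# Intersection: customers with both kinds of account.
both = names("SELECT customer FROM checking INTERSECT SELECT customer FROM savings")

# Difference: customers with a checking account but no savings account.
checking_only = names("SELECT customer FROM checking EXCEPT SELECT customer FROM savings")

# Union: customers with at least one account (duplicates collapse).
either = names("SELECT customer FROM checking UNION SELECT customer FROM savings")

# The same three results, computed with plain Python sets.
checking = names("SELECT customer FROM checking")
savings = names("SELECT customer FROM savings")
matches = (
    both == checking & savings
    and checking_only == checking - savings
    and either == checking | savings
)
```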
"Why do I bring this up? Well, very few people think about objects as representing sets or set operations, even when they really are."
In my (admittedly limited) experience, I have more often seen the relationship between object theory and set theory referred to when discussing inheritance and other such "under-the-hood" things. So I am not sure exactly what you mean by this.
"And, if you are looking at a database as an extension of your objects, you're going to mis-design your database schema."
The inverse is true as well: if your objects are to be subservient to your data, then your class design will likely be bad. Again, it's about compromise IMO. You can't have the best of both, so you need to pick and choose what best fits your application's requirements/needs.
4. Classes are not tables and attributes are not columns
Of course they aren't. If they were, things would be a whole lot easier, and you would never have needed to write this meditation. You present several good ideas here, the Entity attribute table in particular. I have seen similar things both praised and shunned by OO programmers and DBAs.
In the design and development of a non-trivial application, it is naive to assume class-to-table and attribute-to-column relationships. I again will say that this comes down to compromise, and doing what's best for the problem at hand.
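For readers who have not met the idea, this is one common reading of an entity-attribute table (often called entity-attribute-value), sketched in Python with SQLite. It is an assumption about the general shape, not a reproduction of the schema in the meditation, and the table and function names are invented. Instead of one column per attribute, each row stores one attribute of one entity.

```python
import sqlite3

# Assumed schema and sample rows, purely for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity_attribute (
        entity_id INTEGER NOT NULL,
        attribute TEXT NOT NULL,
        value TEXT,
        PRIMARY KEY (entity_id, attribute)
    );
    INSERT INTO entity_attribute VALUES
        (1, 'name', 'Alice'),
        (1, 'address', '1 Main St'),
        (2, 'name', 'Bob');
""")


def load_entity(conn, entity_id):
    """Collect all attribute rows for one entity into a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM entity_attribute WHERE entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)


alice = load_entity(conn, 1)
bob = load_entity(conn, 2)  # has no address row at all, and needs no NULL column
```

This shows both sides of the praise and the shunning: adding a new attribute needs no schema change, but every value is stored as text and the database can no longer enforce per-attribute types or constraints.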
"But, I suspect this argument is going to go the way of optimizing C code with inlined ASM. As Moore's Law continues its inexorable march, higher and higher level tools that are more wasteful of CPU, RAM, and I/O will become the norm. And, there may be good reason to embrace this changing landscape. As software becomes more and more complex in its requirements, interfaces, and functioning, treating software as a brick wall to be built in components may be more cost-effective than squeezing an extra 30% from the CPU."
I disagree with this. I think that to say this will in the end come down to an issue of optimization is possibly oversimplifying the argument (and surely I too am simplifying your closing statement).
I think that the problem with OO-Relational mapping (both the available tools and some of the current prevailing "wisdom" on the subject) is that they are trying to create a technological middle-man between two technologies/paradigms/ways-of-thinking that don't fit together easily. The logical and rational roads that lead to better, cleaner, more elegant solutions in both tend to move in (somewhat) opposite directions from one another. To me this says that any and all attempts to keep this middle-man solution happy will require too much work to keep up with the changing landscapes on either side.
IMO, what is required is not the middle-man approach, but instead a partial re-thinking of both. A compromise, which is not ad hoc, but a smart combining of the best of both worlds. Keep in mind that this is all still a relatively young field (as is Computer Programming in general), and we are just seeing the beginning of it now.
Again, excellent meditation, one of my favorite topics of late. Thanks for the post.