|Problems? Is your data what you think it is?|
eduardo's scratchpadby eduardo (Curate)
|on Jun 03, 2004 at 13:22 UTC||Need Help??|
Why did we use a relational database... well, there is really a simple answer for that. We didn't for 1/2 of our project, and we did for the other 1/2. :) Let's look at what we built, shall we?
The "project" at surfari consisted off two main products, a website and a search engine. The website needed to keep basic user information, such as username, password, favorite sites, personal bookmarks, etc... Now, I understand your "stunned outrage" at a statement like: Since we didn't need transactions... but, let's analyze why it is that our esteemed friend jeffa would make such a dangerous statement.
When does one "need" transactions? Well, first and foremost, when your data is valuable :) Let's analyze the different dimensions of data that we were storing, and let's see how transactions would have added value...
Hm... So, why were we using a relational database in the first place?
We were using the ability to store "fake foreign keys" (referential integrity wasn't insured by MySQL, but that's ok, it didn't change the value of our data) with great ease. We had an *incredibly* nice interface to a very customizable persistent data store... We had a really nice query language for our data store... and When we started doing collaborative filtering (Other users that like this stuff liked this other stuff), having it all in a relational database made that development *super fast*! Not to mention that with "time to market" constraints of Internet time, the fact that it provided all of this super duper functionality in an API we all already knew and loved made choosing MySQL a no-brainer. None of the features of the, admittedly incredible, Postgress would have made 1 lick of difference... after all, it's not like we were:
:) God I love classical examples.
So, where didn't we use relational databases? In our search engine! We took a DEC Alpha optimized B-Tree library, put an advanced forest-and-trees tree balancing and distributing decision walker in front of it, added a *sweet* splay-like aging cache mechanism in front of it to alleviate the von-neumann bottleneck, and took advantage of log2(n) as much as we could :)
So... did we care about the integrity of our data? Sure! We were going to retire from the IPO... however, in *so many* of the applications I have faced in my professional life, (which granted, have *not* been in the financial sector), all "logical transactions" have been of single statement cardinality.
I advocate right tool for the right job... and in this case, the only reason a relational database was used was the really sweet query interface to a persistent data store, and the handy dandy functionaity provided by auto-increments, indexing, etc... I can guarantee you that any time I see a more complex data model, I reach for a transactional data store...