|Problems? Is your data what you think it is?|
Google like scalability using perl?by dpavlin (Friar)
|on Oct 08, 2009 at 23:48 UTC||Need Help??|
I have had enough with CouchDB. Really. I just couldn't wait for queries to run for the first time any longer. And I really wanted to write my views in perl. Because I was mostly manipulating hashes which came from perl anyway.
Then I started dreaming. Disks are slow. Only right way to read them is sequentially, so it rules out disk as storage if you want to have fast access. So, what is fastest way to run perl snippets? In memory.
But, I would run out of memory eventually. So I would have to support sharding across bunch of machines... That would also enable me to scale linearly just by adding a new machine or two...
I just had a bunch of web kiosks running Debian live around, so I decided to try does this idea sound sane in a week. And sure enough, I had working version after a week.
Then I decided to rip it apart and rewrite into simpler one which is modeled around management of individual nodes and much simpler network protocol. And I discovered that parsing file on master node isn't bad because it cuts down on required dependencies on deployment. And with a little bit of caching it even provides speedup on startup.
In this stage, it's quite usable toy. I would really appreciate feedback about it. Is it interesting for other people to try it? Or did I just learned a few lessons about scalability that everybody else already knows?
And lastly how would you release such a toy? CPAN doesn't really seems like a right place, so is Ohloh enough? It's not in git, so github isn't really an option.
Update: (much) more details about Sack implementation and reasons behind them in comments thanks to everybody who asked and forced me to clarify :-)