Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Re^2: Large chunks of text - database or filesystem?

by bpphillips (Friar)
on Mar 19, 2005 at 13:41 UTC ( #440922=note: print w/replies, xml ) Need Help??


in reply to Re: Large chunks of text - database or filesystem?
in thread Large chunks of text - database or filesystem?

BLOBs are variable length (at least in MySQL, blob storage requirements). There are also MEDIUMBLOBs which hold up to 16MB and LONGBLOBS which hold up to 4GB, all as variable length fields requiring at most 4 bytes more than the data to store the length of the data.

If you happen to be fortunate to have a hosting account with MySQL v4.1.1 or greater you might be able to use the (de)compress() function to (de)compress the data on the fly in MySQL. Or for that matter, you could just compress it using perl before inserting it in the table.

For searching, plucene is a really cool product but I've only used it in relatively small settings (i.e. indexes of under 30 MB). I've heard differing opinions on plucene's speed so it my not be an acceptable option for your case. MySQL also offers full-text searching functionality since v3.23.23 but that would prevent you from doing compression of the data.

As another alternative, I recently heard of another effort (still in alpha) that was inspired by Plucene but with the goal of overcoming Plucene's performance problems. It's called Kinosearch. I have no actual experience with it but it appears to be promising (I'd be interested in hearing from anyone else that's actually used it).
  • Comment on Re^2: Large chunks of text - database or filesystem?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://440922]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (4)
As of 2022-05-25 17:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Do you prefer to work remotely?



    Results (90 votes). Check out past polls.

    Notices?