
Re: Large chunks of text - database or filesystem?

by mattr (Curate)
on Mar 22, 2005 at 16:10 UTC ( #441522=note )

in reply to Large chunks of text - database or filesystem?

My 2 cents, having written forum software for a popular website and having seen wwwforum etc. (though I don't remember it too well right now).

If you are doing it from scratch, keep it simple, powerful and flexible. If I were doing it all over again I'd do it in MySQL (which I know). It's easy to add information about parents/children for threading, you can back it up with a single command, and you can use LIKE to search it. I've used filesystem-based things in the past, and (well, this was a while ago) even when they didn't blow up the box or fail to port to another OS, it was hard to guarantee integrity (who will miss a few files that got deleted by accident?) and conceptually difficult to access. Sure, you can lift weights, but a database is easy.
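A minimal sketch of the threading and LIKE points above. It assumes a single posts table with a parent_id column, and it uses Python's built-in sqlite3 in place of MySQL only so that it runs without a server. The table, column and function names are made up for illustration; the SQL should carry over with small changes.

```python
import sqlite3

# SQLite stands in for MySQL here so the sketch runs anywhere.
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES posts(id),  -- NULL for the first post of a thread
        author    TEXT NOT NULL,
        body      TEXT NOT NULL,
        posted_at TEXT NOT NULL
    )
""")

def add_post(author, body, posted_at, parent_id=None):
    """Insert a post and return its id."""
    cur = conn.execute(
        "INSERT INTO posts (parent_id, author, body, posted_at) VALUES (?, ?, ?, ?)",
        (parent_id, author, body, posted_at),
    )
    return cur.lastrowid

def replies_to(post_id):
    """Ids of the direct children of a post, oldest first."""
    rows = conn.execute(
        "SELECT id FROM posts WHERE parent_id = ? ORDER BY posted_at, id",
        (post_id,),
    )
    return [r[0] for r in rows]

def search(term):
    """Substring search with LIKE, escaping % and _ typed by the user."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM posts WHERE body LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    )
    return [r[0] for r in rows]

root = add_post("alice", "Database or filesystem?", "2005-03-22 15:00")
reply = add_post("bob", "Use a database.", "2005-03-22 16:10", parent_id=root)
```

Threading is then just a matter of following parent_id, and a deleted post shows up as a dangling parent rather than a silently missing file.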

Recently I used Class::DBI and MySQL to build an appointment tracking system with messaging, sort of like forums. It was only really possible with a DB. If you need to add something to a text-file hierarchy it gets kludgy.

Storage efficiency isn't such a big deal, I don't think, because of variable field lengths in a database (VARCHAR or VARCHAR BINARY or BLOB, read the online manual at

The speed overhead might be an issue with, say, Class::DBI (though it is probably okay for a simple forum).

Be sure to spend plenty of time on your admin interface so you can take down things that are inappropriate. I also had to search for "badwords", and even ban users and log IPs. Think about using HTML::Template, and modularize so you aren't writing the same display code 4 different times. Don't put HTML in heredocs in the code. I'm saying this because I started with someone else's code and tore most of it apart. It was badly done and I even had to rewrite the file saving/locking. Do yourself a favor and start clean and secure.
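A rough sketch of the kind of "badwords" and ban check an admin interface might run before a post goes live. The word list, the IP address (taken from the documentation-only range) and the function names are all hypothetical; this is not the code from the site described above.

```python
import re

# Hypothetical moderation data; a real forum would load these from the DB.
BADWORDS = {"spam", "scam"}
BANNED_IPS = {"203.0.113.7"}

# Match whole words only, so "scampi" does not trip the "scam" entry.
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in sorted(BADWORDS)) + r")\b",
    re.IGNORECASE,
)

def flagged_words(text):
    """Sorted list of distinct badwords found in the text."""
    return sorted({m.group(1).lower() for m in _pattern.finditer(text)})

def should_hold(text, ip):
    """True if the post should wait for an admin instead of going live."""
    return ip in BANNED_IPS or bool(flagged_words(text))
```

Holding a post for review rather than rejecting it outright keeps the decision with the admin, which matters because word lists produce false positives.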

To save work I made the admin log in to get an admin cookie that let him edit the live threads. It worked.

Also I think that over time you will need to retire posts, or maybe even archive by month or season. Say you have 10,000 posts: you don't want to tell people they have 200 pages to surf and show that many navigational links, right? With a DB it is easier to sort by date because it's orthogonal. Even paging is easier (a Pager for clasists). With text files you are going to be building all kinds of hash index files and it will be a mess in your head all the time. Go with a DB and stop thinking about all this nitty-gritty. You need to focus on the admin and the real-world issues that come up, which, unless you are really experienced with this, will likely hit you like a sack of bricks. (If you have run a busy IRC channel then maybe you already know a bunch of this.)
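The paging and archive-by-month ideas can be sketched with plain SQL. This assumes 50 posts per page (which is what 10,000 posts over 200 pages works out to) and a hypothetical posts table with an ISO date column; sqlite3 again stands in for MySQL so the example runs on its own.

```python
import sqlite3
from datetime import date, timedelta

PER_PAGE = 50  # 10,000 posts / 200 pages

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT, posted_at TEXT)")

# 120 made-up posts, one per day from 2005-01-01.
start = date(2005, 1, 1)
conn.executemany(
    "INSERT INTO posts (body, posted_at) VALUES (?, ?)",
    [(f"post {i}", (start + timedelta(days=i)).isoformat()) for i in range(120)],
)

def page_count():
    """Number of pages needed to show every post."""
    total = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return (total + PER_PAGE - 1) // PER_PAGE

def fetch_page(page):
    """Post ids for one page, newest first; pages are numbered from 1."""
    rows = conn.execute(
        "SELECT id FROM posts ORDER BY posted_at DESC, id DESC LIMIT ? OFFSET ?",
        (PER_PAGE, (page - 1) * PER_PAGE),
    )
    return [r[0] for r in rows]

def posts_per_month():
    """(YYYY-MM, count) pairs, the raw material for a monthly archive index."""
    rows = conn.execute(
        "SELECT substr(posted_at, 1, 7) AS month, COUNT(*) "
        "FROM posts GROUP BY month ORDER BY month"
    )
    return rows.fetchall()
```

With a monthly index like this, the navigation can list a handful of months instead of hundreds of page links.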

Finally, you probably want to hide email addresses but keep them in the DB. For example, we had the star of the site, a sort of Martha Stewart of Japan (well, she is now; she wasn't a few years ago), email people back if she wanted to do so. Also, if I remember right, our board allowed a user to store a "save password" that (s)he could use to delete his or her post (well, maybe I tore that out). Of course the hierarchy would have to be mended afterwards. So clients might ask you for this kind of thing one day.
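One way the hidden email and the per-post delete password could look, as a sketch only. Storing a salted hash instead of the password itself is an assumption added here, not something the original board is known to have done, and the field and function names are hypothetical.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest) for a delete password; a fresh salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

def may_delete(password, salt, digest):
    """True if the supplied password matches the stored digest."""
    return hmac.compare_digest(hash_password(password, salt)[1], digest)

def public_view(post):
    """Copy of a post record with the private fields left out for display."""
    private = {"email", "pw_salt", "pw_digest"}
    return {k: v for k, v in post.items() if k not in private}

salt, digest = hash_password("sakura")
post = {
    "id": 7,
    "author": "hanako",
    "body": "Hello",
    "email": "hanako@example.com",  # kept in the DB, never rendered
    "pw_salt": salt,
    "pw_digest": digest,
}
```

The email stays available to the admin side (for replying privately) while every page template only ever receives the output of public_view.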

Also do a lot of user testing before the launch. Please. All kinds of little things will bite you.
