DB_File vs flat file in terms of speed

kiat has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I'm new to Perl's database modules and I'm now experimenting with 'use DB_File'.

I'm used to parsing data from a flat text file but am considering using a proper database such as DB_File.

Let's say I have 1000 entries stored as follows:

orange:fruit
bicycle:transport
movie:entertainment
abc:alphabet
.
.
and so on.

Let's say I want to retrieve the entry 'movie' and modify it. With a flat file, I would need to loop through the entire file to find a match and then make the modification.
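For comparison, the flat-file approach described here can be sketched roughly as follows. This is only an illustration: the helper name update_flat_file and the idea of rewriting the whole file in place are assumptions, not something taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a "key:value" flat file, changing one entry.
# Every line has to be read, so the cost grows with the size of the file.
sub update_flat_file {
    my ($path, $wanted, $new_value) = @_;

    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    my $changed = 0;
    for my $line (@lines) {
        chomp $line;
        # Split on the first colon only, so values may contain colons.
        my ($key, $value) = split /:/, $line, 2;
        if (defined $key && $key eq $wanted) {
            $line = "$key:$new_value";
            $changed++;
        }
        $line .= "\n";
    }

    # Write every line back, including the ones that did not change.
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";

    return $changed;    # number of entries that were modified
}
```

A call such as update_flat_file('entries.txt', 'movie', 'film') would then change the 'movie' entry, where 'entries.txt' is a made-up file name.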

With 'use DB_File' and using 'tie', I can locate the entry simply by using the key 'movie' as in

$key='movie';
$retrieved = $hash{$key};

to retrieve its corresponding value.

It appears then that using DB_File is a lot faster and more efficient. Am I on the right track with regard to how DB_File via 'tie' works and the merits of DB_File as a method of storing and retrieving data?
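The snippet above leaves out the tie call itself. A minimal sketch of the whole round trip might look like this, assuming DB_File is installed and a hash-type database is wanted; the helper name update_db_entry and the file mode are illustrative choices, not part of the original post.

```perl
use strict;
use warnings;
use Fcntl;      # provides O_RDWR and O_CREAT
use DB_File;

# Hypothetical helper: change one entry in a DB_File hash database
# and return the value it held before (undef if the key was new).
sub update_db_entry {
    my ($path, $key, $new_value) = @_;

    # Bind %hash to the file; it is created if it does not exist yet.
    tie my %hash, 'DB_File', $path, O_RDWR|O_CREAT, 0644, $DB_HASH
        or die "Cannot tie $path: $!";

    my $old_value = $hash{$key};    # looked up by key, no scan of other entries
    $hash{$key}   = $new_value;     # stored in the database file

    untie %hash;                    # flush and close the database
    return $old_value;
}
```

Unlike the flat-file loop, only the entry for the requested key is read and written; the other entries are not touched.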

I look forward to hearing your feedback and comments :)

cheers,

kiat


Replies are listed 'Best First'.
Re: DB_File vs flat file in terms of speed by MZSanford (Curate) on Oct 25, 2002 at 15:44 UTC
"Am I on the right track with regard to how DB_File via 'tie' works and the merits of DB_File as a method of storing and retrieving data?" You have it working, so all of the syntax is correct. As for the usage of `DB_File`, it seems to me you have found its exact usefulness. I tend to work in RDBMSs, so I don't normally use `DB_File`, but you seem to have its number ... keep it up. from the frivolous to the serious
Re: Re: DB_File vs flat file in terms of speed by kiat (Vicar) on Oct 25, 2002 at 16:11 UTC
Thanks, MZSanford! I'll check out RDBMSs :)
Re: DB_File vs flat file in terms of speed by perrin (Chancellor) on Oct 25, 2002 at 15:51 UTC
You are correct. Here's another tip for you: SDBM_File is much faster than DB_File when dealing with small keys (less than 2k each). Also, if you want to use a dbm file for both reading and writing in a multi-process situation (like CGI), use MLDBM::Sync.
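Switching to SDBM_File needs very little change, since it is tied the same way. The sketch below is only an illustration with a hypothetical helper name. Note that SDBM_File creates two files (a .pag and a .dir file) for the given base name, and its documentation states that the combined length of a key and its value is limited (1008 bytes in a default build), so it only suits small records.

```perl
use strict;
use warnings;
use Fcntl;        # provides O_RDWR and O_CREAT
use SDBM_File;    # ships with the standard Perl distribution

# Hypothetical helper: store one pair in an SDBM database and read it back.
sub sdbm_store_and_fetch {
    my ($path, $key, $value) = @_;

    # $path is a base name; SDBM creates "$path.pag" and "$path.dir".
    tie my %hash, 'SDBM_File', $path, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $path: $!";

    $hash{$key} = $value;
    my $fetched = $hash{$key};

    untie %hash;
    return $fetched;
}
```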
Re: Re: DB_File vs flat file in terms of speed by kiat (Vicar) on Oct 25, 2002 at 16:08 UTC
Thanks, perrin! Is SDBM_File a standard Perl module just like DB_File?
Re: Re: Re: DB_File vs flat file in terms of speed by perrin (Chancellor) on Oct 25, 2002 at 16:54 UTC
Yes, it is.
Re: (nrd) DB_File vs flat file in terms of speed by newrisedesigns (Curate) on Oct 25, 2002 at 15:49 UTC
MZSanford++ Not only will DB_File keep your little duckling hashes in the proverbial row, but it's a much easier method of retrieval than a flat file database. There is a small file size difference. I recently moved away from flat files for storing my simple programs' information. My data files are now around 40KB for files that were once 2-4KB flat files. Granted, on most PCs today, this kind of difference is insignificant, but on a hosted webserver, it may become a nuisance (if you have a lot of databases). Best of luck in your database endeavors. John J Reiser newrisedesigns.com
Re: Re: (nrd) DB_File vs flat file in terms of speed by kiat (Vicar) on Oct 25, 2002 at 16:05 UTC
Thanks, John! The increase in file size (from 2 to 40 KB) is atrocious :) So assuming server space is at a premium, one will have to weigh the speed and efficiency gained against the increase in file size...?
Re^3: (nrd) DB_File vs flat file in terms of speed by newrisedesigns (Curate) on Oct 25, 2002 at 16:13 UTC
I don't believe the database size is directly related to the flat file size. If you tie a %hash, put nothing in it, then untie, the db file will be around 32KB (at least for me: Win2K/ASPerl). I assume this 32KB is the data structure with no data. It grows slowly (not exponentially, thank God) as you put more data into the tied hash. Size isn't a pressing issue, just a thought. I have 125MB of space on my webserver, but have only used about 10MB for the past year. Next on your plate should be learning SQL and the DBI module, so you can interface with MySQL, Postgres, Oracle, and Access. Dominus has a good article on SQL and DBI here. Glad to be of service, John J Reiser newrisedesigns.com
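The size of an empty database is easy to check on your own system. The following is a rough sketch with a hypothetical helper name; the numbers it reports depend on the platform and on the Berkeley DB version that DB_File was built against, so the 32KB figure above should not be expected everywhere.

```perl
use strict;
use warnings;
use Fcntl;      # provides O_RDWR and O_CREAT
use DB_File;

# Hypothetical helper: create a DB_File hash database holding $entries
# small records and return the size of the resulting file in bytes.
sub db_file_size_after {
    my ($path, $entries) = @_;

    tie my %hash, 'DB_File', $path, O_RDWR|O_CREAT, 0644, $DB_HASH
        or die "Cannot tie $path: $!";
    $hash{"key$_"} = "value$_" for 1 .. $entries;   # no-op when $entries is 0
    untie %hash;                                    # flush before measuring

    return -s $path;
}
```

Calling it once with 0 entries and once with 1000 entries (using two different file names) shows how much of the file is fixed overhead and how much is data.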
Re^2: (nrd) DB_File vs flat file in terms of speed by Anonymous Monk on Dec 19, 2007 at 23:18 UTC
Hi John. My app has 100k entries. In a flat file it looks like this (matched users): 3052002:1054002 3052002:2050002 1054004:2050002 1054004:1054002 and its size is 4M. But when I use SDBM like this: $all_match{'3052002:1054002'}=1; $all_match{'3052002:2050002'}=1; $all_match{'1054004:2050002'}=1; $all_match{'1054004:1054002'}=1; the file size grows to 1G. Any idea how I can get it smaller? Thanks, Eyal
Re^2: (nrd) DB_File vs flat file in terms of speed by Anonymous Monk on Dec 19, 2007 at 23:21 UTC
Why is it all on one line? This is the format of one line: 3052002:1054002
Re: DB_File vs flat file in terms of speed by princepawn (Parson) on Oct 25, 2002 at 16:23 UTC
I can't help but answer a question you didn't ask. Now, if flexibility of slicing and dicing the data were the issue, then DBD::AnyData would put you far ahead of the game with flat files. But then again, wait, if DBD::AnyData is indeed for AnyData, then why not dbm_files as well?!

