http://www.perlmonks.org?node_id=914545


in reply to Design flat files database

If you are using spinning disks (not solid-state storage) to store your files and directories, then a useful rule of thumb is that a 7200 rpm disk spins 120 times a second, so any given byte passes under the head once every 0.0083 seconds (8.3 milliseconds). On average you wait half a revolution, so you can do no better than about 4 milliseconds to fetch the byte(s) you are after in a random access, and that is before counting any head seek time. (You can get a lot of neighboring bytes with the same read, so data density helps with transfer time, but not at all with disk latency.) With processor cycle times in the neighborhood of a nanosecond, you can execute millions of instructions in 4 milliseconds. You could search a few thousand bytes of directory data, even with a linear search, in far less time than it takes to fetch that data from disk. So don't make your directory hierarchy too deep. Each subdirectory that is not already cached is going to cost you at least 4 milliseconds to read. A few levels may get cached, but that's equally true of shallow hierarchies. Keep disk accesses in mind when you design your system.
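To put rough numbers on that rule of thumb, here is a small back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes the worst case: nothing is cached, every directory level costs exactly one random read, and seek time is ignored. Real numbers will differ, so treat the output as a lower bound rather than a measurement.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to come around: half of one revolution, in ms."""
    revolutions_per_second = rpm / 60.0
    ms_per_revolution = 1000.0 / revolutions_per_second
    return ms_per_revolution / 2.0


def uncached_lookup_ms(depth: int, rpm: float = 7200.0) -> float:
    """Lower bound for reaching a file `depth` directories down.

    Assumption: one random read per directory level plus one for the
    file itself, with no caching and no seek time counted.
    """
    return (depth + 1) * avg_rotational_latency_ms(rpm)


if __name__ == "__main__":
    for depth in (1, 3, 6):
        print(f"depth {depth}: at least {uncached_lookup_ms(depth):.1f} ms")
```

Under those assumptions, each extra level adds about another 4 milliseconds, so a flatter layout with a few thousand entries per directory can easily beat a deep one with a handful of entries per level.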