Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re: Joint Database Technology

by erichansen1836 (Acolyte)
on Jun 17, 2017 at 16:52 UTC ( #1193017=note: print w/replies, xml ) Need Help??


in reply to Joint Database Technology

Flat File Databases Indexed by SDBM Databases - An ISAM, NoSQL, Embedded Database System:

Follow first post in thread to find code examples: http://www.perlmonks.org/?node_id=1121222

SDBM databases of Key/Value pairs (tied to program hash tables) may be used to persistently index the fixed-length records of a "text" Flat File database for immediate random access. Historically, Flat File databases and SDBM databases have proven to be relatively weak when used alone, but when combined, the two offer a relatively strong relational database system. The binary SDBM files can store as the KEY (in the Key/Value pair), one or more fields and/or partial fields (concatenated together) contained within the Flat File, fixed-length records. The VALUE would hold the byte location offset of each record indexed, for positioning the file pointer for Read/Write operations. Edits to records are made "in place" either overwriting the existing record in its entirety, or overwriting only a single field contained within a record. Multiple Indexes (varying Key formats and their accompanying byte offset, Values) may be stored in a single SDBM file. Example: a Unique Primary Key Index which may be used is Social Security Number, and relative to that, an Alternate Key Index (with Duplicates) may be used on a combination of the fields and partial fields {BirthDate|LastNameFirst4Chars|FirstNameInitial|SeqNbr} where Sequence Number(SeqNbr) is an incremented number used to make each duplicate instance of the KEY {BirthDate, LastNameFirst4Chars, FirstNameInitial} unique. This database system would be most practical and easy to implement for Read Only, Data Warehousing and Data Marts, but with the addition of exclusively locked and released semaphore files, employed to prevent race conditions from occurring to protect the Flat File data source, a concurrent multi-user, Read/Write Database System could conceivably be implemented. A Read/Write Database System of this sort would need to have controlled access by a Front-End Database Application User-Interface which enforced a manual record locking and release strategy, and which managed the use of semaphore files.

SEE Semaphore Files Perl article at: http://www.informit.com/articles/article.aspx?p=28258&seqNum=3

PARENT/CHILD one-to-many record relationships may also be managed. The Parent records would be stored in a separate Flat File from that of the Children. Example: If the Parent records are LOANS, and the Child records are the individual COLLATERAL (say motor vehicles) securing those loans, then the Unique Primary Key for the Child records could be on VIN (Vehicle Identification Number), and an Alternate Key with Duplicates could be the Loan Number of the Parent record. Added to the Loan Number (Alternate Key with Duplicates), would be a sequence number incremented for each Child collateral item added to the database for each Loan (Loan_Nbr|SeqNbr). Thus, the Children are tied back to the Parent by the Loan Number (and vice versa). You would also want to store the LOAN NBR of the parent records within each of their child records so that you maintain the ties between parent and child Flat File records in the event the binary SDBM indexing becomes corrupt and needs to be rebuilt from the Flat File data.

Redundant data typically stored in Flat File databases can also be eliminated. Now, a Part Number alone need be stored in each record of a Flat File typically containing both Part Numbers and Part Descriptions. A Look Up of a single instance of the Part Description (by Part Number) is all that is required.

Replies are listed 'Best First'.
Re^2: Joint Database Technology
by erix (Parson) on Jun 17, 2017 at 18:41 UTC

    ... can really be ...
    ... would be ...
    ... could be designed to ...

    Apparently, Joint Database Technology is still a castle in the air, isn't it?

    I don't see a single item in your argumentation that isn't very likely better (and /much/ better) done by an already existing system like PostgreSQL, which already exists(!), has been developed and improved on for 20+ years(!), and is standard-compliant like few other systems. There is a large community of people improving postgres, producing a new version (with new features) every year. Bugs often get fixed within 24 hours. PostgreSQL is completely free and you may use it to build a closed-source system.

    Here is a challenge: can you publish here such a system of your design, giving full code, and using data in the form of free text files (or even self-generated ones), so that any one of us can reproduce it and test its performance? (multi-platform, please: windows-only is not good enough).

    If you do that I'll try to counter it with a postgres-based system to parallel it. Only then can we see what are the pros and cons of your SDBM/hash/perl/text-system versus poor old conventional PostgreSQL.

      Think outside the box. Gene Cranz (NASA Flight Director - Apollo 13) "I don't care what anything was designed to do, I care about what it can do".

        Think outside the box.

        Hear, hear.

        I couldn't agree more -- let's see what this mysterious, unpublished system can do.

        If it's clearly better than the usual suspects, I'll switch to it without any hesitation.

Re^2: Joint Database Technology
by roboticus (Chancellor) on Jun 18, 2017 at 18:08 UTC

    erichansen1836:

    Sounds much like another implementation of an ISAM-like system, an early step along the road to relational database systems.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Original Thread to follow for Perl Code examples and initial discussion/description = http://www.perlmonks.org/?node_id=1121222

      It has been pointed out that what I have referred to here on this thread as Joint Database Technology (Flat "text" Files/SDBM Files tied to Hash Tables) is really a type of NoSQL/ISAM (Indexed Sequential Access Method) technology.

      The records in the Flat Files do not need to be maintained in a sorted record order either, although this may be beneficial for sequential access. For random access, it is not necessary. HOWEVER..YOU COULD sort the Keys in the hash table tied to any of the SDBM files used for random access indexing. Then process the unsorted Flat File in sequential order because you access each record randomly by the sorted KEYs.

      What I have done recently which works well is to create an SDBM file with only the byte offsets stored which point to the first verse in each chapter of each book of the Bible. I also store the number of verses within each of those chapters in the SDBM file, so that I read the first verse of each chapter of each book randomly, then read the rest of the verses in a specific chapter sequentially from there. This makes the SDBM file much smaller in size since I am not storing a byte offset for every verse (but only the first verse in each chapter).

      My latest enhancement has been to read an entire chapter of Bible verses in one(1) read statement execution, and use the unpack statement to break out each verse into an array for processing. This code enhancement (TreeView DblClick Event) was posted to this thread for the KJV Bible Navigator DEMO software posted here. DEMO refers to a demonstration of the Joint Database Technology technique, not that the Perl code posted here is only a DEMO of a more fully functional KJV Bible Navigtator software which might exist. This software is not for sale, but only provided to demonstrate the FlatFile/SDBM joint/dual/tandem database system technology technique within a useful application program. This code could be tweaked to make it display a database of Customers/Invoices/Items within a TreeView/RichEdit widget GUI user-interface. Or whatever data you like.

        Is POSTGRESQL totally FREE to implement in a large organization? Is it as simple to maintain as ISAM (Flat Files/SDBM)?

        Yes and yes. And it has transactions. And replication. And multiple concurrent updates. Why on earth would you not use a database when a database is what you need?

        Your capitalisation is incorrect, BTW. It's "PostgreSQL".

        [...]have need of a READ ONLY database[...]

        Ah, so now your Great Program turns out to be only a read-only tool?

        I will now stop wasting my time, and no longer assume that your posts propose a serious database alternative. Have fun, though :-)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1193017]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (8)
As of 2018-11-20 15:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    My code is most likely broken because:
















    Results (225 votes). Check out past polls.

    Notices?