Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Flat file DB

by Anonymous Monk
on Aug 30, 2001 at 01:02 UTC ( #108902=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm hoping the problem I'm requesting help with here makes sense. I posed this problem to a few of my collegues then watched the expressions on their faces change to the blue screen of death (not totally supprising considering they are no were close to being Perl guru's, or anything guru's).

I'm relativly new to Perl, and have never worked with flat file databases before.

I was given a flat file, pipe delimited database & need to generate reports based on the data said database. I'm having difficulty in determining the most efficient (but right now I'd settle for in-effecient if it gets the job done) way to sort through and pull out two entries from the database using a couple of search keys. I'll then format the data into a HTML based report in the fasion chosen by my PHB.

Any clues or tips would be better than Tylenol for my headache. Thanks

Comment on Flat file DB
Re: Flat file DB
by jryan (Vicar) on Aug 30, 2001 at 01:36 UTC

    I assume your database looks something like this?

    entry1|entry2|entry3|entry4|entry5| entry6|entry7|entry8|entry9|entry10| entry11|entry12|entry13|entry14|entry15|

    The easiest way to do it would be to open the file, put the data into an array, split each array element on the pipe, and then go from there.

    open (DATA, "data.dat") @data = <DATA>; close (DATA); foreach $item (@data) { @elements = split (/\|/, $item); # do your comparisons to each item in @elements here }

    This is really an inefficient way to store data... I suggest you upgrade to a (my|postgre|donthurtme) database. Perl works really well with them, has a lot of support, and they are much faster

      Actually, the data looks something like this:

      entry1_stuff1|entry1_stuff2|entry1_stuff3 entry2_stuff1|entry2_stuff2|entry2_stuff3 entry3_stuff1|entry3_stuff2|entry3_stuff3

      and so on...

Re: Flat file DB
by cLive ;-) (Parson) on Aug 30, 2001 at 01:43 UTC
    Or perhaps using DBI with DBD::CSV to do the select as if it was a 'real' database.

    cLive ;-)

Re: Flat file DB
by tachyon (Chancellor) on Aug 30, 2001 at 01:58 UTC

    Here is is in a nutshell. This example reads from the DATA filehandle. By default this is the stuff after __DATA__ at the end of the program but if you open $file onto it it will be the contents of that file. To do this just uncomment the # open DATA, $file....line. If you do a Super Search for "flatfile pipe delimited database" you will find heaps of stuff on this. If you have other problems supply your problem code and some sample data. Maybe have a look at New Monks if you want to good rundown on the Monastery and Perl all on one page.

    #!/usr/bin/perl -w use strict; my $file = 'c:/data.txt'; my @data; my $line_cnt = -1; # get data in a 2D array @data[$record][$field] # open DATA, $file or die "Oops, Perl says $!\n"; while (<DATA>) { chomp; next unless $_; my @fields = split /\|/, $_; push @data, \@fields; $line_cnt++; } close DATA; # access data in our 2D array, note offsets are from 0 not 1 print "Line 2 element 3 is ", $data[1][2], "\n\n"; # dump all the data as an (indented :-) HTML table print "<table border='1'>\n"; for my $line ( 0 .. $line_cnt ) { print " <tr>\n"; my @fields = @{$data[$line]}; for my $field ( 0 .. $#fields ) { print " <td>$fields[$field]</td>\n"; } print " </tr>\n"; } print "</table>\n"; __DATA__ 1|A|B|C 2|D|E|F 3|G|H|I

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Flat file DB
by rbi (Monk) on Aug 30, 2001 at 15:41 UTC
    Hi,
    I once wrote this code to learn some few things (I'm rather new to Perl, too).
    Save it into an executable file (I named it "agenda"), and it should work as a flat DB reader and printer. It is intended as an address manager... Probably silly (grep and few other things can do the same), but you can maybe grab some simple idea for your DB needs.
    ciao Roberto
    #!/usr/bin/perl use strict; my $ref_values; my $ref_names; ### Database file my $db_file = './phone.txt' || $ARGV[1]; ### Field eparator my $separator = '#' || $ARGV[2]; ### Set the primary key my $pri = 1 || $ARGV[3]; my $primary = $pri - 1; my $keyword = $ARGV[0] || die qq( \n), qq(use: agenda keyword (databasefile separator pos.key)\n), qq( \n), qq( default database: $db_file\n), qq( default separator: $separator\n), qq( default pos.key: $pri\n), qq( \n), qq( examples of use:\n), qq( agenda smith (shows all '*smith*' in field 'pos.key')\n), qq( agenda . (shows all records)\n), qq( \n), qq( To create a database, edit a text file writing in the firs +t\n), qq( line the names of the fields, separated by some *separator +*\n), qq( ('#' for example), and related entries in the following re +cords\n), qq( \n); ### Load the variables ($ref_values,$ref_names) = db_read($db_file, $separator, $primary); ### Print the database content db_print($ref_values,$ref_names,$keyword); ############ sub db_read{ ############ ### ### This routine reads the content of an ASCII database ### with field names as first record, storing the content ### into an hash of arrays. ### ### use strict; my $db_file = shift(); ### database file name my $separator = shift(); ### field separator my $primary = shift(); ### primary key my %values; my @fields; my @names; my $i; ### Open the database for reading open DB, "<$db_file" or die "Can't open $db_file: $!\n"; ### Read the first record containing the names of the fields my $first_record = <DB>; ### Remove the newline chomp $first_record; ### Split the record to get the field names @names = split($separator,$first_record); ### Now read the rest of the file and store ### the records into an hash of arrays while (<DB>) { ### Remove the new line character at end of record chomp; ### Split the record into fields @fields = split /$separator/; ### Populate the hashes defined by the fields in $first_record foreach $i (0..$#names-1) { $values{$fields[$primary]}[$i] = $fields[$i]; } } ### Close database close DB; ### Return the the reference to the hash of arrays ### and the array of field names return (\%values,\@names); } ############# sub db_print{ ############# ### ### This routine prints the content of an hash of arrays ### sorted by its keys. The hash of arrays is created by ### db_read subruotine. ### use strict; my %values = %{ shift() }; my @names = @{ shift() }; my $keyword = shift(); my $item; my $features; for $item ( sort keys %values ) { ### Scan the hash of arrays searching for $keyword my $found = 0; for $features (0..$#names-1) { if ($item =~ /$keyword/i or $values{$item}[$features]=~ /$keyword/i) { $found = 1; } } ### Print the hash of arrays if anything was found if ($found) { printf qq(\n ** keyword: %s\n\n), $item; for $features (0..$#names-1) { printf qq( %s = %s\n), $names[$features], $values{$item}[$features]; } $found = -1; } } }
    a sample phone.txt file:
    1#COGNOME#NOME#ENTE#UFFICIO#FAX#CELLULARE#CASA 2#Smith#John#CNR ICG#012 306816#012 304056## 3#Doe#John#AAA Corp.#012 306816#012 304056#335 2321212#

      Allow me to comment a little bit on your code, you are sometimes making your life harder than it has to be ;-) I will make the comments sequentially with the source code.

      • For multiline strings use a HERE document like this
        my $keyword = $ARGV[0] || die << "USAGE"; use: agenda keyword (databasefile separator pos.key) default database: $db_file default separator: $separator default pos.key: $pri and so on USAGE # the line above marks the end
      • It is sufficient to use strict only once at the top of your script, you don't have to repeat it in every subroutine.
      • open DB, "<$db_file" or die "Can't open $db_file: $!\n"; # leave out the final \n open DB, "<$db_file" or die "Can't open $db_file: $!";
        Then perl appends automatically the line number where the error occured - very helpful for debugging. And don't worry, perl will put a newline at the end of the statement.
      • Use less comments!! Yes, too many comments can be as bad as too little comments as the code gets cluttered and makes reading more difficult, e.g.
        ### Close database close DB;
        is really unnecessary, both lines read exactly the same. There has been quite some discussion about commenting correctly here at PM, have a look with Super Search.
      • This can be contracted
        my $first_record = <DB>; chomp $first_record; # in one line chomp(my $first_record = <DB>);
      • You are throwing the last field of the every entry away, $#names contains the last index of the array @names, not the number of elements. To avoid such one-off errors don't use loops which use indices - if avoidable. Here you can write this instead of the loop: @{ $values{$fields[$primary]} } = @fields;
      • Use compiled regexes with the /o modifier if the variable you are interpolating doesn't change, e.g. $item =~ /$keyword/io;
      • Don't use printf unless necessary, in your cases you can simply use the string interpolation and write
        print "\n ** keyword: $item\n\n"; # and print " $names[$features] = $values{$item}[$features]\n";

      There are more things, e.g. I don't think the hash of arrays is the best datastructure for this task, you are not using the main feature of a hash, the direct access to an element by a key.

      It has probably been some time since you wrote this script, your perl has improved in the meantime. So as a good exercise I'd suggest to rewrite the whole thing, rethinking your data structure and incorporating my advices. If you then post your script here you will get further suggestions for improvement. And at the end of the day you will have learnt a lot :-)

      -- Hofmator

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://108902]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (10)
As of 2014-10-20 21:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (92 votes), past polls