Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

DBD::Anydata

by idsfa (Vicar)
on May 14, 2004 at 18:04 UTC ( #353439=modulereview: print w/ replies, xml ) Need Help??

Item Description: The DBD::AnyData module provides a DBI/SQL interface to data in many formats and from many sources.

Review Synopsis:

Module Author: Jeff Zucker <jeff@vpservices.com>
Documentation

Abstract

Excellent tool for developing programs with limited "database" needs, prototyping full-on RDBMS applications and pulling in common data interchange formats. If you don't need /want the SQL baggage, try AnyData instead.

Pre-Requisites

Overview

DBD::AnyData is a DBI/SQL wrapper around AnyData which allows the author to use many SQL constructs on traditionally non-SQL data sources. Descendant from DBD::RAM, DBD::AnyData also implements that module's ability to load data from multiple formats and treat them as if they were SQL tables. This table can be held entirely in memory or tied to the underlying data file. Tables can also be exported in any format which the module supports.

Review

The variety and number of file formats in use is staggeringly large and continues to grow. Perl hackers are often faced with the job of being syntactic glue between applications, translating output from one program into the necessary input for another. Abstracting the exact format of these data allows the programmer to rise above mere hacking and actually craft something (re)usable. Separating the logic from the presentation improves the clarity of both.

DBD::AnyData attempts to provide this abstraction by presenting a DBI/SQL interface. It layers over the required/companion AnyData module, which presents a tied hash interface. The perl purist will most likely prefer to stick with AnyData, minus the DBD. The extra layer of abstraction will be most useful if you are more comfortable with SQL or your application design requires it. To my mind, the niftiest use of this module is the ability to prototype your code as if you had a whole relational database, but have the ease of a few simple CSVs actually holding the data.

The list of supported formats is impressive, and continues to expand. CPAN currently lists:

  • perl data structures and __DATA__ segments
  • Delimited text (Comma/Pipe/Tab/Colon/whatever separated)
  • Fixed length records
  • HTML Tables
  • INI Files
  • passwd Files
  • MP3 Files (specifically, their ID3 tags)
  • Paragraph Files
  • Web Server Logs
  • XML Files
  • DBI Connections (to leverage existing modules)

With more on the way.

DBD::AnyData has three basic modes of operation: file access, in-memory access and format conversion. These modes are implemented as five extension methods over a standard DBD.

In file access mode, the data file is read on each request and written on each change. The entire file is never read into memory (unless requested) and so this method is suitable for large data files. Be aware that these are not atomic commits, so your database could end up in an inconsistent state. This mode is not supported for remote files or certain formats (DBI, XML, HTMLtable, MP3 or perl ARRAYs).

In-Memory mode loads the entire data source into memory. Obviously a problem for huge data sets, but then you probably have those in a relational database already. This method is ideal for querying a remote data source, handled in the background by good old LWP.

Conversion mode takes data from an input (which can be local or remote, and in any supported format) and writes it to a local file, perl string or perl array. This function alone would be reason enough for the module to exist, and it's really more of an afterthought.

Caveats

  • Again, if you don't need SQL, use AnyData instead
  • Currently, DBD::AnyData will not allow SQL against multiple tables in the same SQL statement (no JOINs) Updated: per jZed this feature is now available
  • It isn't a real RDBMS. Don't expect atomicity, journals, etc etc
  • Not all formats are fully featured, and most require more modules

Summary

DBD::AnyData is one of those fun modules that lets you shove the crud work off on someone else (the author of the AnyData::Format:: module) and get on with crafting good code. I've found it especially helpful when putting together tiny web apps that might end up getting huge (and thus require a moving to a true database). Anything that lets me stop writing file format converters is worth checking out in my book.

Comment on DBD::Anydata
Re: DBD::Anydata
by artist (Parson) on May 14, 2004 at 19:44 UTC
    Currently, DBD::AnyData will not allow SQL against multiple tables in the same SQL statement (no JOINs)

    I am looking for non-RDBMS(mysql, access) based module which can allow the same.

Re: DBD::Anydata
by lwicks (Pilgrim) on May 19, 2004 at 10:14 UTC
    A good review, I like the AnyData module. JZed is also very good at helping out for us noobs, so that is another plus for the module.

    There is also a stand-alone version of themodule, which you can ftp up onto a server and have running in minutes. For me this has been really handy for getting database driven prototypes up and running in short amounts of time on pretty much any webserver.

    For a noob like myself, the module allows me to start out using SQL in my code from the beginning. Which means that I have more transportable code. I can move it to a "real" database pretty easily.

    Joins would be good and maybe will come soon? Not that I really know how to use them! :-)

    Kia Kaha, Kia Toa, Kia Manawanui!
    Be Strong, Be Brave, Be perservering!

Re: DBD::Anydata
by jZed (Prior) on Oct 01, 2004 at 18:35 UTC
    I can't believe I missed this when it was posted :-(. Thanks for all the kind words.

    The only correction I can see is that DBD::AnyData *does* now support two-table joins including both implicit joins specified in the WHERE clause as well as explicit INNER and OUTER joins and some variations of ON and USING clauses.

    Thanks again!

    -- jZed = Jeff Zucker

Re: DBD::Anydata
by rongoral (Beadle) on Nov 17, 2004 at 21:41 UTC

    Greetings,

    I hope that I am not going to be wasting anyone's time here, however, I have not seen this addressed.

    I am trying to implement jZed's DBD::AnyData module to parse a flat file db. My test code is:

    #!/usr/bin/perl use strict; use warnings;use CGI::Carp qw(fatalsToBrowser); use DBI; use DBD::AnyData; print qq[content-type: text/html\n\n]; my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):'); my $table = 'mothers'; my $format = 'Pipe'; my $file = 'databases/mothers.tbl'; my $flags = ''; # The first row holds the field names $dbh->func( $table, $format, $file, $flags, 'ad_catalog'); my $sth = $dbh->prepare("SELECT * FROM mothers WHERE name = 'Sandra'") +; $sth->execute(); while (my $row = $sth->fetch) {print "@$row<br />";} 1;

    The "table" consists of one record:

    name|age|address|phone Sandra | 46 | 1252 Some St, Anywhere, WI 12345 | 123-456-7890

    I was getting some wierd results and so went in and activated the debugging code left in place. Specifically, the code is located at line 529 in the open_table sub as follows:

    print join("\n", $format,@$file,$createMode), "\n";

    This would return the error -

    DBD::AnyData::st execute failed: Can't use string ("databases/mothers.tbl") as an ARRAY ref while "strict refs" in use at DBD/AnyData.pm line 529, <GEN0> line 1.

    So, I replaced the print and join statement with the following -

    if (ref($file) eq 'ARRAY'){print join("\n", $format, @$file, $createMo +de), "\n";}<br /> else {print join("\n", $format, $file, $createMode), "\n";}

    This then allows the printing of $file when it was not an array ref.

    As I said, I hope that I have not wasted anyone's time with this. And btw, where is that stand alone version? My host does not have this pm or many of the supporting pms.

    Humbly -
    Ron Goral

      Rather than finding out the values with that debug statement, you can use the ad_get_catalog() function to see what values are available for a given table:
      $dbh->func( $table, $format, $datasource, 'ad_catalog'); my $catalog = $dbh->func('ad_get_catalog'); use Data::Dumper; print Dumper $catalog;

Back to Reviews

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: modulereview [id://353439]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (6)
As of 2014-07-30 00:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (229 votes), past polls