File Loader (load the content of a file and insert into DB)

by r34d0nl1 (Pilgrim)
on May 20, 2005 at 20:08 UTC ( [id://459113]=perlquestion )

r34d0nl1 has asked for the wisdom of the Perl Monks concerning the following question:

I'm going to develop (with your help, I hope) a loader that picks up files dropped
in a directory X, reads each file, runs some checks on its content,
and inserts the information from the text file into a database.
Which do you think would be better: a script that runs every minute from crontab,
or a program running as a service?
It's going to run under HP-UX with Perl version 5.6.1.
I would also like to see some examples of loaders (this one will 'talk' to Oracle);
I've never written one before.
I wonder if you could give me some links and directions.

Thanks in advance for your help, masters.

Replies are listed 'Best First'.
Re: File Loader (load the content of a file and insert into DB)
by jpeg (Chaplain) on May 20, 2005 at 20:40 UTC
    Hi,
    In my experience, writing a program to stay open and monitor a directory is easier: you readdir the contents, sleep, and readdir again, comparing the arrays and firing off a sub to process the new files. Create a hash and get the mtimes from stat if you want to check if files inside the dir have been changed.
    If you would rather write a script that exits and is restarted by cron, you could look at Storable to keep the list of files already seen between runs.
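The readdir-and-compare approach described above could be sketched roughly as follows. This is only an illustration, not jpeg's actual code: the sub names `scan_dir` and `new_or_changed` are made up for this example, and it sticks to constructs that should work on Perl 5.6.

```perl
use strict;
use warnings;

# Return a hash ref of { filename => mtime } for the plain files in $dir.
sub scan_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my %seen;
    foreach my $name (readdir $dh) {
        my $path = "$dir/$name";
        next unless -f $path;             # skips '.', '..' and subdirectories
        $seen{$name} = (stat $path)[9];   # element 9 of stat() is the mtime
    }
    closedir $dh;
    return \%seen;
}

# Names that are new, or whose mtime differs from the previous scan.
sub new_or_changed {
    my ($old, $new) = @_;
    my @hits = grep {
        !exists $old->{$_} || $old->{$_} != $new->{$_}
    } keys %$new;
    @hits = sort @hits;
    return @hits;
}

# Hypothetical main loop for the long-running variant:
#   my $previous = {};
#   while (1) {
#       my $current = scan_dir('/path/to/incoming');
#       process_file($_) for new_or_changed($previous, $current);
#       $previous = $current;
#       sleep 60;
#   }
```

For the cron variant, the `$previous` hash is the piece that would have to be saved at the end of one run and restored at the start of the next, for instance with Storable's `store` and `retrieve`.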

    As far as talking to Oracle - it's easy enough; perldoc DBD::Oracle should tell you what you need to know. It's not much different than talking to any other DB.

    $dbh = DBI->connect("dbi:Oracle:host=$orasrvr;sid=$sid", $uname, $pass);
    should get you started.

    Loaders can be as simple or complicated as the business logic dictates. You process each line and decide what to do with it. I've had to pull a table from a db into a hash and check if each line of a file existed in the hash, and I've had simple tasks like building a SQL INSERT query around data in each line.
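As a rough sketch of the simple case (validate each line, then INSERT it), something like the following might do. Everything specific here is an assumption for illustration only: the pipe-delimited `id|name|amount` layout, the table name `loader_demo`, and the sub names `parse_line` and `load_file`. It uses DBI placeholders rather than pasting the data into the SQL string.

```perl
use strict;
use warnings;

# Split one 'id|name|amount' line (hypothetical format) and validate it.
# Returns the three fields, or an empty list if the line is rejected.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @f = split /\|/, $line, -1;          # -1 keeps trailing empty fields
    return () unless @f == 3;
    return () unless $f[0] =~ /^\d+$/;      # id must be an integer
    return () unless length $f[1];          # name must not be empty
    return () unless $f[2] =~ /^-?\d+(?:\.\d+)?$/;   # amount must be numeric
    return @f;
}

# Insert every valid line read from $fh using one prepared statement.
# $dbh is an already-connected DBI handle; 'loader_demo' is a made-up table.
# Returns (number loaded, number rejected).
sub load_file {
    my ($dbh, $fh) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO loader_demo (id, name, amount) VALUES (?, ?, ?)'
    );
    my ($ok, $bad) = (0, 0);
    while (my $line = <$fh>) {
        my @row = parse_line($line);
        if (@row) {
            $sth->execute(@row);
            $ok++;
        }
        else {
            warn "Rejected line $.: $line";
            $bad++;
        }
    }
    return ($ok, $bad);
}
```

This assumes the handle was connected with `RaiseError => 1`, so a failed `execute` dies instead of being counted as loaded. With `AutoCommit => 0` you would call `$dbh->commit` once the whole file has gone in, or `$dbh->rollback` if the file should be rejected as a unit.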

    --
    jpg
Re: File Loader (load the content of a file and insert into DB)
by holli (Abbot) on May 20, 2005 at 20:25 UTC
Re: File Loader (load the content of a file and insert into DB)
by jhourcle (Prior) on May 20, 2005 at 21:46 UTC
    Depending on how fast you're going to be pumping data into Oracle, you might be better served by creating a text file from the new data (after verifying it, etc.) and then calling Oracle's sqlldr (a.k.a. SQL*Loader).
      If it's an Oracle 9i db you could use external tables, which is essentially sqlldr reading files as they are needed. It's quite fast as well. Just define the directory, the filename and the file format and you're done.

      But that's getting off topic. I'd go with cron rather than a looping daemon. Just make sure you don't read the same file twice if the first invocation of your code hasn't finished before the next run starts...
Re: File Loader (load the content of a file and insert into DB)
by bgreenlee (Friar) on May 20, 2005 at 20:32 UTC

    As long as your app is ok with the one-minute delay, I'd just use the crontab. The advantages are:

    - don't have to worry about what happens if your service dies, or setting it up so that it starts automatically if the box reboots

    - you get email notification of errors for free (assuming that you're checking the mailbox of the user who owns the crontab)

    I'm not sure what you mean by "loaders". Do you mean a database API? You probably want DBD::Oracle.

    -b

      There are definitely pitfalls in using cron for this kind of thing in production code, particularly when the cycles are short.

      You should consider what will happen if the task takes longer than the period you've set it to run at from cron. Overlapping runs are usually bad, if for no other reason than the downward performance spiral when the system gets REALLY busy. Various locking schemes can prevent overlapping runs, but most of them are trickier than they seem at first, and there are various race conditions to consider.

      But, you make good points about the simplicity of a crontab-based solution.
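One of the simpler locking schemes for the overlapping-run problem is a non-blocking `flock` on a dedicated lock file. The sketch below is only one option among several; the sub name `acquire_run_lock` and the lock file path are invented for the example, and it assumes the lock file lives on a local filesystem where `flock` behaves sensibly.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try to take an exclusive, non-blocking lock on $lockfile.
# Returns the open handle on success (keep it in scope for the whole run),
# or undef if another process already holds the lock.
sub acquire_run_lock {
    my ($lockfile) = @_;
    open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

# Hypothetical use at the top of the cron script:
#   my $lock = acquire_run_lock('/var/tmp/loader.lock')
#       or exit 0;    # previous run still busy, try again next minute
#   ... process the files ...
#   # the lock goes away when $lock is closed or the process exits
```

Because the operating system drops the lock when the process dies, a crashed run does not leave a stale lock behind, which is the usual trap with "create a file, delete it when done" schemes.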

Re: File Loader (load the content of a file and insert into DB)
by Anonymous Monk on May 23, 2005 at 17:14 UTC
    One thing you have to watch out for here is that the file has actually been fully written. I recently wrote something like this using SGI::FAM, and every once in a while I would get blank contents... I would highly advise using flock to make sure that the new file has finished being copied/moved/written before you start processing it.
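A minimal sketch of the flock check suggested above might look like this. It only helps under one big assumption: the process writing the file must itself hold an exclusive `flock` while it writes, since these locks are advisory. The sub name `open_if_unlocked` is made up for the example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Open $path for reading only if no writer currently holds an exclusive lock.
# Returns a read handle holding a shared lock, or undef if the file is
# missing or still busy. Assumes the writer cooperates by using flock too.
sub open_if_unlocked {
    my ($path) = @_;
    open(my $fh, '<', $path) or return undef;
    unless (flock($fh, LOCK_SH | LOCK_NB)) {
        close $fh;        # writer still has it, try again on the next pass
        return undef;
    }
    return $fh;
}
```

If the writer cannot be changed to take a lock, a common alternative is to have it write under a temporary name and rename the file into the watched directory once it is complete, so the loader never sees a half-written file.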