Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Insert BLOB into DB without using system memory, but using filehandle.

by songahji (Friar)
on Sep 18, 2009 at 14:46 UTC ( [id://796131]=perlquestion: print w/replies, xml ) Need Help??

songahji has asked for the wisdom of the Perl Monks concerning the following question:

Ladies and Gents,

I am trying to insert img, pdf, etc files into DB Blog.
It is working now.
I slurp the data into $data and do

sub _insert_to_db { my $self = shift; my $sql = "INSERT INTO DBI_LOB(ID_KEY, FILENAME, TIME_STAMP, FILE_DA +TA) VALUES (:ID, :FN, SYSDATE, :BUFF)"; my $sth = $self->{'dbh'}->prepare($sql); $sth->bind_param(":ID", $self->{'id'}); $sth->bind_param(":FN", basename($self->{'filename'})); # NOTE this statement on the $data $sth->bind_param(":BUFF", $self->{'data'}, { ora_type=>ORA_BLOB }); + eval { $sth->execute(); }; if ($@) { croak "Failed to insert $sql: \n $@"; } }

You see that the $self->{'data'} is comes from

open(local *IN, '<', "file.pdf"); $self->{'data'} = do {local $/, <IN>}; close IN;

This I read the content into Memory, what if the file is 2G? I will hog the memory, am I not? Is there anyway I do it with filehandle, and some how stream it? Would DBI or DBD::Oracle let me do it?

Best regards,
Songahji

Replies are listed 'Best First'.
Re: Insert BLOB into DB without using system memory, but using filehandle.
by ikegami (Patriarch) on Sep 18, 2009 at 15:42 UTC
    If you're dealing with files that big, why are you putting them in a database?
    1. Get an id from a database sequence,
    2. save the file using a file name based on the id, then
    3. save the id in the database.
      I wish I could, but the data must be in the DB. Disk space is not an issue in this project.

        Repeat after me:

        1. I will not store files inside of the database.
        2. I will not store files inside of the database.
        3. I will not store files inside of the database.
        4. I will not store files inside of the database.
        5. I will not store files inside of the database.
        6. I will not store files inside of the database.
        7. I will not store files inside of the database.
        8. I will not store files inside of the database.

        Storing files inside of the database is just a Bad Idea. You are much better off doing something like the following:

        1. Insert a new record into the database (filename, size, source, date, etc).
        2. Get the ID of the new record.
        3. Take the md5 hash signature of the ID + some secret code.
        4. Your hash is something like 0123456789abcdef0123456789abcdef
        5. Update the database record to show "hash = '0123456789abcdef0123456789abcdef'"
        6. Make folder /var/media/01/23/45
        7. Write file contents to /var/media/01/23/45/0123456789abcdef0123456789abcdef

        NOTE: /var/media/ could be a local volume or a remote-mounted SAN. You choose.

        Now, you can partition the files across multiple servers. You can store the files on a SAN and have (many) dummy file servers in front of it.

        This is (basically) how wikimedia does it. And just look at how many files they have. Storing the files inside of the database is just Not How It's Done.

Re: Insert BLOB into DB without using system memory, but using filehandle.
by Joost (Canon) on Sep 18, 2009 at 16:39 UTC
      Many thanks. ++
Re: Insert BLOB into DB without using system memory, but using filehandle.
by talexb (Chancellor) on Sep 18, 2009 at 16:10 UTC
      This I read the content into Memory, what if the file is 2G? I will hog the memory, am I not? Is there anyway I do it with filehandle, and some how stream it? Would DBI or DBD::Oracle let me do it?

    You might hog memory -- but I'm sure the operating system will do something intelligent about that.

    If you're worried about 2G files (I'm assuming that's the same as the amount of physical RAM you have), perhaps you can a) limit the size of the file you accept, b) get more memory, c) make your cache larger. In any case, I would Try it And See .. a 2G file is pretty large lump of stuff to be moving around -- I assume you have sufficient disk space and backup capacity to handle all that.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Insert BLOB into DB without using system memory, but using filehandle.
by stonecolddevin (Parson) on Sep 18, 2009 at 20:56 UTC

    Check out DBIx::Class::InflateColumn::FS.  It automatically writes your file to the file system and provides a Path::Class object as the record retrieved from the database.  This keeps your database clean and stores only the pointers, and allows the filesystem to take care of the image files as it should.

    mtfpny

Re: Insert BLOB into DB without using system memory, but using filehandle.
by Anonymous Monk on Sep 18, 2009 at 15:15 UTC
    Check if the file is too big, and refuse to insert it.
      Sorry, I must insert the data into DB. It's the requirements. No worry the DB has so huge space.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://796131]
Approved by ikegami
Front-paged by broomduster
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (6)
As of 2024-04-18 22:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found