Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Bulk data insertion into MySql

by sowraaj (Novice)
on Sep 02, 2011 at 13:26 UTC ( #923843=perlquestion: print w/ replies, xml ) Need Help??
sowraaj has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perlmonks,

i need to modify this code to enhance it speed by using bulk insertion concept, i can't able to understand that concept at the same time i got an error message that shows @INC error while using use DBIx::BulkLoader::MySql, Pls help myself to enhance the speed of this code

#!/usr/bin/perl use strict; use warnings; use DBI; my %info; my $dbh = DBI-> connect('dbi:mysql:raj','root', '12345', {PrintError = +> 0, RaiseError =>0}); my $sth1 =$dbh->prepare("Truncate TABLE SOWND1"); $sth1->execute(); my $sth2 = $dbh->prepare("insert into SOWND1 values(?,?,?)"); my @files = <~/Desktop/Log_data/SCHOOL/DATA/*>; for my $filee (@files) { local $/ =''; open my $fh,'<', $filee or die "$!\n"; while(my $record = <$fh>) { $record =~ s/^=++ SCHOOL \(B\): MUMBAI =++/Type : BOYS/ig; $record =~ s/^=++ SCHOOL \(G\): MUMBAI =++/Type : GIRLS/ig; map { my($keys,$value)=split /\s*:\s*/,$_; $keys=~ s/\s//g; $info{$keys} = $value; }split /\n/, $record; $sth2->execute($info{Type},$info{School},$info{Name}); } }

Thanks in advance

Comment on Bulk data insertion into MySql
Download Code
Re: Bulk data insertion into MySql
by Anonymous Monk on Sep 02, 2011 at 13:30 UTC
Re: Bulk data insertion into MySql
by sundialsvc4 (Monsignor) on Sep 02, 2011 at 14:54 UTC

    When the use of a command-line or other tool is (for whatever reason) infeasible, bulk-insertion operations are categorically improved through the judicious use of three techniques:

    1. Table locking, or the use of transactions with a very strict “isolation level,” to preclude any conflicting operations by other users... or even access by other users.
    2. Transactions, in general, to permit the database system to do “lazy writes” while carrying out the operation.   The transaction is periodically committed and reopened so that the rollback-file does not become huge.
    3. Disabling index updates and referential integrity checks until the entire operation is finished.

    There is a very-considerable amount of material on this subject on the Internet, and you need not be concerned if the article you read is not specifically talking about Perl.   The core concepts are universal.

      Transactions make insert operations in most database, including MySQL slower. There's no such thing as a "lazy write" - in order to remain consistent, data needs to be written to disk as soon as possible.

      Now, I'd still recommend putting everything in a transaction, but not for performance reasons.

Re: Bulk data insertion into MySql
by CountZero (Bishop) on Sep 02, 2011 at 16:25 UTC
    But your code does not use DBIx::BulkLoader::MySql at all! You are using straight DBI.

    The fastest option for your bulk loading is probably using the LOAD DATA INFILE statement from MySQL.That statement expect a file with the data and does a very fast insert into the database. You can use Perl to transform your data into a format expected by the LOAD DATA INFILE statement.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      Dear friend,

      When i use 'DBIx::BulkLoader::MySql' i got the error that shows @INC...., .pm file not find. I noted that it may due to http://code.activestate.com/ppm/DBIx-BulkLoader-Mysql/. if you don't mine explain the LOAD DATA INFILE concept in a small program(Now i insert 10,00,000rows/3min)I would like to reduce that time to seconds

      Thanks

        I doubt that even the LOAD DATA INFILE statement will insert 10 lakh records in a few seconds.

        As far as DBIx::BulkLoader::Mysql is concerned, did you try installing it through cpan?

        CountZero

        A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Bulk data insertion into MySql
by Marshall (Prior) on Sep 03, 2011 at 00:02 UTC
    Your code looks like straight DBI. The easiest way to make this a LOT faster is to make the whole operation one single transaction.

    $dbh->do("SET autocommit = 0"); .... $dbh->do("COMMIT");
    I took the above "SET autocommit = 0" from some MySQL code that worked in my app. If you commit every 1/4-1/2 million records or so, the performance will be fine. There can be other ways to start the transaction, perhaps BEGIN or other keywords.

    I do find this map pretty ugly:

    map { my($keys,$value)=split /\s*:\s*/,$_; $keys=~ s/\s//g; $info{$keys} = $value; }split /\n/, $record; #Don't use map when you mean foreach (or for) foreach (split /\n/, $record) { .... }

      Dear friend,

      I tried the method Auto_commit& Commit in my coding but i noted that there is not large difference compare to my previous code. If you know any method which is helped to enhance the insertion time by bulk loading . (Now i insert 10,00,000rows/2min)I would like to reduce that time to seconds

      Thanks

        If you can already insert one million rows in two minutes, that seems like a very good speed to me.

        Your task looks like a one-time script to import ("bulk load") a number of files into the database. Why do you need it faster? How often do you need to run it?

Re: Bulk data insertion into MySql
by Anonymous Monk on Sep 03, 2011 at 11:27 UTC

      Hi friend,

      Thanks for your comment, i need to say one thing that i don't have any experience in Perl i just have 10days experience. I understand the logic what you have post on my previous thread, but i don't understand where i introduce your code instead of my previous code

      Thanks

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://923843]
Approved by chrestomanci
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (6)
As of 2014-08-21 00:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (126 votes), past polls