Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Re: Inserting Hash Information Into MySQL

by CountZero (Bishop)
on Jun 20, 2012 at 16:36 UTC ( #977400=note: print w/ replies, xml ) Need Help??


in reply to Inserting Hash Information Into MySQL

A database cannot save a Perl hash (and certainly not a hash of array-refs)in its native format.

You can save the "id" in the database, but the "array" which is the "value" part of the hash cannot be saved as such. Of course this "array" is really an array-ref and you will have to dereference it to get the data itself. But that means you will have a multiple of data-items which of course will never fit in one field in your database.

You will have to review and revise your database schema to make this work.

CountZero

A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics


Comment on Re: Inserting Hash Information Into MySQL
Re^2: Inserting Hash Information Into MySQL
by erix (Vicar) on Jun 20, 2012 at 17:07 UTC
    A database cannot save a Perl hash (and certainly not a hash of array-refs)in its native format.

    A database should have no problem with a hash. For instance, PostgreSQL has hstore [1], which not only makes implementing a hash easy, but can also index them (with either GIN (speed of lookup) or GiST (speed of writing)).

    You will have to review and revise your database schema

    Perhaps revise not 'database schema' but just 'database'... ;-)

    [1] http://www.postgresql.org/docs/current/static/hstore.html

      Thank you for the information, that is really interesting. I wonder how it integrates with SQL.

      CountZero

      A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      My blog: Imperial Deltronics

        (PostgreSQL:) Once you have a table 't' with a column 'hs' of type hstore, you can use statements like:

        select * from t where hs ? 'x';

        to do 'hash-lookups' (but there are many more hstore operators).

        Below is a test program in bash/sql. It creates a table with an hstore with a million entries and indexes it.

        Then a few searches to show the retrieval performance; same search repeated three times without index and then repeated three times with index:

        $ pg_sql/pgsql.HEAD/hstore/create_hstore.sh rowcount | size ----------+-------- 1000000 | 249 MB (1 row) Time: 183.026 ms Time: 184.673 ms Time: 183.903 ms Time: 0.295 ms Time: 0.245 ms Time: 0.274 ms

        You see that the speedup is quite considerable :)

        Here is that create_hstore.sh program:

        (You must have hstore installed - see 'contrib' in older postgres or extensions in newer Pg's)

        #!/bin/sh schema=public table=testhstore t=$schema.$table echo " --/* drop table if exists $t; create table $t ( hs hstore ); insert into $t select hstore(md5(cast(f.x as text)), cast(f.x as text)) from generate_series(1, 1000000) as f(x); create index ${table}_hs_idx on $t using gin (hs); analyze $t; --*/ select count(*) as rowcount, pg_size_pretty(pg_total_relation_size( '$ +{t}' )) as size from $t; " | psql -q v=9509342c6a6b283d07a3ce406b06eb1e # discard one run for cache effects, then run three times, no index: echo "set enable_bitmapscan=0; \o /dev/null select * from $t where hs ? '$v'; \o \timing on select * from $t where hs ? '$v'; select * from $t where hs ? '$v'; select * from $t where hs ? '$v';" \ | psql -q | grep -E '^Time:' # discard one run for cache effects, then run three times with index: echo "set enable_bitmapscan=1; \o /dev/null select * from $t where hs ? 'x'; \o \timing on select * from $t where hs ? '$v'; select * from $t where hs ? '$v'; select * from $t where hs ? '$v';" \ | psql -q | grep -E '^Time:'

        Obviously we're far away from standard SQL here. Also from the OP's question, I think, but since you expressed interest I thought I'd construct an example. As always, there is much more in the Fine Manual: http://www.postgresql.org/docs/current/static/hstore.html.

        (run on a pretty average 8120 desktop, slow single SATA disk, Centos 6.2 Linux, postgresql 9.3devel)

        update: Changed to /dev/null redirection with \o inside a psql session. Also changed Timings from EXPLAIN ANALYZE to those from psql's '\timing on' (so I'm showing the slowest);

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://977400]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (7)
As of 2014-09-20 11:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (158 votes), past polls