Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?

Re^4: Inserting Hash Information Into MySQL (PostgreSQL's hstore)

by erix (Parson)
on Jun 23, 2012 at 21:40 UTC ( #978017=note: print w/replies, xml ) Need Help??

in reply to Re^3: Inserting Hash Information Into MySQL
in thread Inserting Hash Information Into MySQL

(PostgreSQL:) Once you have a table 't' with a column 'hs' of type hstore, you can use statements like:

select * from t where hs ? 'x';

to do 'hash-lookups' (but there are many more hstore operators).

Below is a test program in bash/sql. It creates a table with an hstore with a million entries and indexes it.

Then a few searches to show the retrieval performance; same search repeated three times without index and then repeated three times with index:

$ pg_sql/pgsql.HEAD/hstore/ rowcount | size ----------+-------- 1000000 | 249 MB (1 row) Time: 183.026 ms Time: 184.673 ms Time: 183.903 ms Time: 0.295 ms Time: 0.245 ms Time: 0.274 ms

You see that the speedup is quite considerable :)

Here is that program:

(You must have hstore installed - see 'contrib' in older postgres or extensions in newer Pg's)

#!/bin/sh schema=public table=testhstore t=$schema.$table echo " --/* drop table if exists $t; create table $t ( hs hstore ); insert into $t select hstore(md5(cast(f.x as text)), cast(f.x as text)) from generate_series(1, 1000000) as f(x); create index ${table}_hs_idx on $t using gin (hs); analyze $t; --*/ select count(*) as rowcount, pg_size_pretty(pg_total_relation_size( '$ +{t}' )) as size from $t; " | psql -q v=9509342c6a6b283d07a3ce406b06eb1e # discard one run for cache effects, then run three times, no index: echo "set enable_bitmapscan=0; \o /dev/null select * from $t where hs ? '$v'; \o \timing on select * from $t where hs ? '$v'; select * from $t where hs ? '$v'; select * from $t where hs ? '$v';" \ | psql -q | grep -E '^Time:' # discard one run for cache effects, then run three times with index: echo "set enable_bitmapscan=1; \o /dev/null select * from $t where hs ? 'x'; \o \timing on select * from $t where hs ? '$v'; select * from $t where hs ? '$v'; select * from $t where hs ? '$v';" \ | psql -q | grep -E '^Time:'

Obviously we're far away from standard SQL here. Also from the OP's question, I think, but since you expressed interest I thought I'd construct an example. As always, there is much more in the Fine Manual:

(run on a pretty average 8120 desktop, slow single SATA disk, Centos 6.2 Linux, postgresql 9.3devel)

update: Changed to /dev/null redirection with \o inside a psql session. Also changed Timings from EXPLAIN ANALYZE to those from psql's '\timing on' (so I'm showing the slowest);

Replies are listed 'Best First'.
Re^5: Inserting Hash Information Into MySQL (PostgreSQL's hstore)
by BrowserUk (Pope) on Jun 24, 2012 at 06:25 UTC

    Hm. I'm not sure that doing 4 lookups of the same value from the same process and then discarding the first one is much of a demonstration.

    It is effectively the same thing as doing this:

    c:\test> perl -MDigest::MD5=md5_hex -E"printf qq[%s %07d\n], md5_hex($ +_),$_ for 1..1e6" >hstore c:\test > type #! perl -slw use strict; use Time::HiRes qw[ time ]; use Digest::MD5 qw[ md5_hex ]; my $table = do{ local( @ARGV, $/ ) = 'hstore'; <> }; my %cache; my $v = '9509342c6a6b283d07a3ce406b06eb1e'; my $val = ( $cache{ $v } ) //= $table =~ m[$v (\d+)$]; for ( 1 .. 3 ) { my $start = time; my $val = ( $cache{ $v } ) //= $table =~ m[$v (\d+)$]; printf "Time: %.9f\n", time() - $start; } c:\test> hstore Time: 0.000007868 Time: 0.000009060 Time: 0.000001907

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://978017]
marto waves
[Marshall]: marto howdy!
[marto]: Hey there
[Marshall]: hope we get a good answer to Script stopped working... this OP is making serious efforts to solve problem on his own.

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (4)
As of 2018-03-18 12:51 GMT
Find Nodes?
    Voting Booth?
    When I think of a mole I think of:

    Results (230 votes). Check out past polls.