[Bizgres-general] A bitmap index access method is available in
bizgres CVS tip
Mark Kirkwood
markir at paradise.net.nz
Mon Dec 19 06:12:50 GMT 2005
Mark Kirkwood wrote:
> Jie Zhang wrote:
>
>> A bitmap index access method is now available in bizgres CVS tip. The
>> bitmap index is a highly efficient structure to index high dimensional
>> data, especially when the attributes to be indexed have very low
>> cardinalities.
>>
>> To create a bitmap index on desirable attributes, use the following
>> command:
>>
>> create index <index_name> on <table_name> using bitmap(<att_list>);
>>
>> Here, <att_list> can be a single attribute or multiple attributes in
>> table <table_name>. Multiple attributes are separated by a comma.
>>
>> Note: to use this feature, you need to do initDB.
>>
>
> Very cool,
>
> I thought I would have a play (see below).
>
> What I am seeing is that vanilla Pg 8.1.1 with btree indexes on several
> low cardinality columns seems to be considerably faster than Bizgres
> with on-disk bitmaps (for my test case). I would have thought that it
> was an ideal candidate to show the advantages of on-disk representation
> (several low cardinality columns) - any ideas?
>
> The only contributors I can think of that are helping 8.1.1 are the
> count optimization and the virtual tuples (or are these in Bizgres
> already?), however previous testing did *not* show a factor of 2 boost
> from these two alone.
>
Looking at the original setup again (the 3 indexes), a quick display of
the pagesizes of the various relations shows:
Bizgres current:
bitmap=# select relname,relpages from pg_class
where relname like 'bitmaptest%'
or relname like 'bm_internal_t2200bitmaptest%';
relname | relpages
------------------------------------+----------
bitmaptest | 93458
bitmaptest_id | 21899
bitmaptest_val0 | 19363
bitmaptest_val1 | 19360
bitmaptest_val2 | 70125
bm_internal_t2200bitmaptest_val0_0 | 1
bm_internal_t2200bitmaptest_val1_0 | 1
bm_internal_t2200bitmaptest_val2_0 | 1
(8 rows)
Postgres 8.1:
bitmap=# select relname,relpages from pg_class
where relname like 'bitmaptest%'
bitmap-# ;
relname | relpages
------------------------------------+----------
bitmaptest | 88496
bitmaptest_id | 21899
bitmaptest_val0 | 21899
bitmaptest_val1 | 21899
bitmaptest_val2 | 21899
(5 rows)
That (bitmapped) bitmaptest_val2 is relatively big at ~600Mb, so I'm
wondering if the 100 distinct values is just at the point where the
bitmap is getting inflated. Thoughts?
Cheers
Mark
More information about the Bizgres-general
mailing list