[Bizgres-general] A bitmap index access method is available in bizgres CVS tip

Mark Kirkwood markir at paradise.net.nz
Mon Dec 19 06:12:50 GMT 2005


Mark Kirkwood wrote:
> Jie Zhang wrote:
> 
>> A bitmap index access method is now available in bizgres CVS tip. The
>> bitmap index is a highly efficient structure to index high dimensional
>> data, especially when the attributes to be indexed have very low
>> cardinalities.
>>
>> To create a bitmap index on desirable attributes, use the following 
>> command:
>>
>> create index <index_name> on <table_name> using bitmap(<att_list>);
>>
>> Here, <att_list> can be a single attribute or multiple attributes in
>> table <table_name>. Multiple attributes are separated by a comma.
>>
>> Note: to use this feature, you need to do initDB.
>>
> 
> Very cool,
> 
> I thought I would have a play (see below).
> 
> What I am seeing is that vanilla Pg 8.1.1 with btree indexes on several 
> low cardinality columns seems to be considerably faster than Bizgres 
> with on-disk bitmaps (for my test case). I would have thought that it 
> was an ideal candidate to show the advantages of on-disk representation 
> (several low cardinality columns) - any ideas?
> 
> The only contributors I can think of that are helping 8.1.1 are the 
> count optimization and the virtual tuples (or are these in Bizgres 
> already?), however previous testing did *not* show a factor of 2 boost 
> from these two alone.
> 

Looking at the original setup again (the 3 indexes), a quick display of 
the pagesizes of the various relations shows:

Bizgres current:
bitmap=# select relname,relpages from pg_class
          where relname like 'bitmaptest%'
          or relname like 'bm_internal_t2200bitmaptest%';
               relname               | relpages
------------------------------------+----------
  bitmaptest                         |    93458
  bitmaptest_id                      |    21899
  bitmaptest_val0                    |    19363
  bitmaptest_val1                    |    19360
  bitmaptest_val2                    |    70125
  bm_internal_t2200bitmaptest_val0_0 |        1
  bm_internal_t2200bitmaptest_val1_0 |        1
  bm_internal_t2200bitmaptest_val2_0 |        1
(8 rows)


Postgres 8.1:
bitmap=# select relname,relpages from pg_class
          where relname like 'bitmaptest%'
bitmap-# ;
      relname                        | relpages
------------------------------------+----------
  bitmaptest                         |    88496
  bitmaptest_id                      |    21899
  bitmaptest_val0                    |    21899
  bitmaptest_val1                    |    21899
  bitmaptest_val2                    |    21899
(5 rows)


That (bitmapped) bitmaptest_val2 is relatively big at ~600Mb, so I'm 
wondering if the 100 distinct values is just at the point where the 
bitmap is getting inflated. Thoughts?

Cheers

Mark


More information about the Bizgres-general mailing list