[Bizgres-general] A bitmap index access method is available in bizgres CVS tip

Luke Lonergan llonergan at greenplum.com
Wed Dec 21 10:37:47 GMT 2005


Jie,

On 12/20/05 10:31 PM, "Jie Zhang" <jzhang at greenplum.com> wrote:

> There are total 10,000,000 uncompressed bits for each value. Consider the
> word size is 32, then there are 10,000,000/32 = 312,500 words. Since the
> values are evenly distributed in the table, then every 4 words(120 bits)
> will be compressed into about 2 words, because bit 1 will happen in every
> 100 bits. So there are 312,500/2 = 156,250 words in a compressed bit vector.
> Currently, each bitmap page can store about 225 words. Therefore, the number
> of bitmap pages for a value can be 156,250/225=695. That is about
> 695*8192=5.693M. So for 100 distinct values, the bitmap index will take up
> about 569M, which matches what you get there, and this should be the worse
> case.

Yup - thx for the calculation.

We should try the 8-bit words to increase the compression - though the
access speed might suffer a bit as per the above:

8-bit words:
13 words = 104 bits
Compressed size is 2 words

The disadvantage is that our header word now only carries 4 references, so
the header overhead would be 1 header per 4 words instead of the current 1
per 16 - but it sounds like we'd gain big overall storage efficiency for
these low cardinality cases.

- Luke




More information about the Bizgres-general mailing list