[Bizgres-general] A bitmap index access method is available in bizgres CVS tip

Jie Zhang jzhang at greenplum.com
Thu Dec 22 06:44:48 GMT 2005



On 12/21/05 7:40 AM, "Jim C. Nasby" <jnasby at pervasive.com> wrote:

> On Tue, Dec 20, 2005 at 10:31:37PM -0800, Jie Zhang wrote:
>> There are total 10,000,000 uncompressed bits for each value. Consider the
>> word size is 32, then there are 10,000,000/32 = 312,500 words. Since the
>> values are evenly distributed in the table, then every 4 words(120 bits)
>> will be compressed into about 2 words, because bit 1 will happen in every
>> 100 bits. So there are 312,500/2 = 156,250 words in a compressed bit vector.
>> Currently, each bitmap page can store about 225 words. Therefore, the number
>> of bitmap pages for a value can be 156,250/225=695. That is about
>> 695*8192=5.693M. So for 100 distinct values, the bitmap index will take up
>> about 569M, which matches what you get there, and this should be the worse
>> case.
> 
> I'm probably missing something obvious here, but why would a 100 value
> index be 100 times bigger than a 10 value index, instead of just 10x?

Based the data that Mark provided,

> bitmap:
>   val0  178 s, 150M    # 10 distinct
>   val1  150 s, 150M    # 10 distinct
>   val2  224 s, 550M    # 100 distinct

a 100 value index is about 4 times bigger than a 10 value index. Since
values are uniformly distributed, all bitmaps in a 10 value index are not
compressed at all. So each bitmap in a 10 value index is twice as big as
that in a 100 value index, while the number of distinct values in 10 value
index is 10 times less than that in a 100 value index. Therefore, a 100
value bitmap index is roughly about 10/2=5 times as big as a 10 value index.

Best,
Jie




More information about the Bizgres-general mailing list