[Bizgres-general] Re: Non equality operators and bitmap indexes

Mischa Sandberg mischa.sandberg at telus.net
Thu Dec 29 22:02:53 GMT 2005


> From: "Jie Zhang" <jzhang at greenplum.com>
> Subject: [Bizgres-general] Re: Non equality operators and bitmap
> 	indexes (was : A bitmap index access method is available...)
> To: "Mark Kirkwood" <markir at paradise.net.nz>,	"bizgres-general"
> 
> On 12/28/05 3:49 PM, "Mark Kirkwood" <markir at paradise.net.nz> wrote:
> > I noticed that queries like:
> > bitmap=# EXPLAIN SELECT * FROM bitmaptest WHERE val0 <= 1;
> > will not use the available (bitmap) index on val0, whereas, an
> > equivalent query recoded to use equality operators will happily do
> > I'm guessing it is a TODO item, but thought I may as well mention
> it :-)
> 
> There are three known encoding schemes for the bitmap index: the
> equality scheme (each bitmap is associated with each one of all
> distinct values), the interval scheme (each bitmap is associated with
> several distinct values that lie within an interval), and the range
> scheme (each bitmap is associated with distinct values that are
> less/greater than a given value). Each of these schemes has different
. performance advantages for different queries.
> Currently, we only support the equality scheme. I think that the
> better way is to let users decide which encoding scheme to use. 
> Any comments on this?

This sounds like the same issues with using a normal btree index,
with stats histograms. Suppose you picked a performance-affecting bound
characteristic, like the max size of a bitmap, and use that to force the
key value sets to split and merge?

You can merge a set of keys by common bit prefix (coarse range) and set
of suffixes. "Bit" works for both strings and normalized numbers. Limit
the size of any set, including all overhead, to (say) one disk page.

Just a thought... this is sort of how I organized a fast index for "LIKE
'%xyz%' searches, where every 4byte substring of every key was an index
key to a bitmap. A bitmap was limited to 2K. If it grew beyond that, ALL
records containing its 4byte key were re-scanned for the pairs of
overlapping 5byte keys; those 5byte keys and their bitmaps were added to
the index, and the original 4byte key turned into a pointer to the set
of 5byte keys. Yes, single-row inserts and deletes truly sucked for
speed. But searching for a string, and'ing together bitmaps for keys
shorter than the search string, and or'ing together bitmaps for keys
longer than the search string, was very fast.



More information about the Bizgres-general mailing list