[Bizgres-general] Re: Indexes on joins

Josh Berkus jberkus at greenplum.com
Thu Oct 13 17:15:34 GMT 2005


Matt,

> I have a question regarding the lack of index usage when doing joins.  I 
> don't know if this is a problem with Bizgres or Postgres, but I figured 
> y'all would be the ones to ask.

You've run into a limitation of the current partitioning/CE 
implementation.  Use of indexes and elimination of partitions happens at 
  query planning time.  However, at plan time we have no reliable way to 
discern what values will be coming back from the other side of the join 
so as to pass them down to the partitioned relation.  There are two 
seperate flaws here:

1) partitions cannot be eliminated since the estimated values may differ 
from the real ones;

2) PostgreSQL isn't sure if you're selecting most of the values in each 
partition or only a few of them, so can't accurately estimate costs on 
indexes vs. seq scan.

(2) will be the next issue to get fixed, most likely (Simon?), because 
it's just a flaw in the cost estimation system on partitioned relations. 
  What the planner is confused by is that:

select partitioned_table.* from paritioned_table
where id IN (select some condition from other table
    which covers 70% of most partitions)

... deserves a much different exectution plan from ....

select partitioned_table.* from partitioned_table
where id IN (select some condition from another table
     that corresponds to only a few rows in one partition)

Currently, the planner isn't "smart" enough to distinguish the two and 
thus tends to fall back on the fool-proof, if very slow, seqscan+append 
plan.

To sum up:  we're aware of the problem, it's a TODO.

--Josh



More information about the Bizgres-general mailing list