[Bizgres-general] Re: Indexes on joins
Josh Berkus
jberkus at greenplum.com
Thu Oct 13 17:15:34 GMT 2005
Matt,
> I have a question regarding the lack of index usage when doing joins. I
> don't know if this is a problem with Bizgres or Postgres, but I figured
> y'all would be the ones to ask.
You've run into a limitation of the current partitioning/CE
implementation. Use of indexes and elimination of partitions happens at
query planning time. However, at plan time we have no reliable way to
discern what values will be coming back from the other side of the join
so as to pass them down to the partitioned relation. There are two
seperate flaws here:
1) partitions cannot be eliminated since the estimated values may differ
from the real ones;
2) PostgreSQL isn't sure if you're selecting most of the values in each
partition or only a few of them, so can't accurately estimate costs on
indexes vs. seq scan.
(2) will be the next issue to get fixed, most likely (Simon?), because
it's just a flaw in the cost estimation system on partitioned relations.
What the planner is confused by is that:
select partitioned_table.* from paritioned_table
where id IN (select some condition from other table
which covers 70% of most partitions)
... deserves a much different exectution plan from ....
select partitioned_table.* from partitioned_table
where id IN (select some condition from another table
that corresponds to only a few rows in one partition)
Currently, the planner isn't "smart" enough to distinguish the two and
thus tends to fall back on the fool-proof, if very slow, seqscan+append
plan.
To sum up: we're aware of the problem, it's a TODO.
--Josh
More information about the Bizgres-general
mailing list