[Bizgres-general] Question: Is 64-bit cache support valuable?

Josh Berkus jberkus at greenplum.com
Tue Sep 27 18:03:53 GMT 2005


Jim,

> So maybe the direction that PostgreSQL should head is that it has two
> over-arching memory limits: one for shared memory and one for non-shared
> memory. All shared memory objects would pull out of the shared memory
> pool, with all left-over space being used for buffers (of course all
> these other shared memory objects would still need limits on them so we
> didn't run out of buffers). All other memory consuming
> objects/operations, such as sort, would have to obey a total memory
> limit, designed to ensure that the machine doesn't swap. ISTM that
> whatever memory we're not using up to that limit could be considered the
> sort pool.

We discussed this. The problem is that it's not practical given the 
completely anonymous per-query-node allocation of sort mem. That is, sort 
mem limits are checked as each node is planned, memory is allocated as each 
node is executed, and it is then released, all without ever checking with 
any central authority other than the OS. Adding a centralized "sort mem 
pool" to this would require major re-engineering of both the planner and 
executor, as well as months of troubleshooting locking problems.
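
To put rough numbers on why the uncoordinated per-node limit is hard to 
size, here is a back-of-the-envelope sketch in Python. The function name 
and all figures are made up for illustration; the only thing it takes from 
the above is that each sort node may use up to sort_mem on its own, with 
nobody adding up the total.

```python
def worst_case_sort_mem_kb(sort_mem_kb: int,
                           sort_nodes_per_query: int,
                           concurrent_queries: int) -> int:
    """Upper bound on sort memory if every node uses its full limit.

    Each node is limited independently, so the bound is a plain product;
    no central authority caps the sum.
    """
    return sort_mem_kb * sort_nodes_per_query * concurrent_queries


# Hypothetical workload: 16 MB sort_mem, 3 sort nodes per query,
# 100 queries executing at once.
total_kb = worst_case_sort_mem_kb(16 * 1024, 3, 100)
print(total_kb // 1024, "MB")
```

Setting sort_mem low enough to make that product safe is what leaves 
memory idle whenever fewer queries are actually running.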

There are also some logical issues with centralized sort_mem. At plan time, 
the planner costs query structures based on the sort_mem available (the 
fixed hard limit in the current code). Were sort_mem to be drawn from a 
pool, you'd face the likely possibility that the amount of sort_mem 
available at plan time would be different from the amount available at 
execution time. With prepared plans, doubly so.
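
A toy illustration of that mismatch, in Python. This is not how the 
planner is written; SortPlan, plan_sort and plan_still_valid are 
hypothetical names, and the two-strategy choice is a deliberate 
simplification of real costing.

```python
from dataclasses import dataclass


@dataclass
class SortPlan:
    strategy: str             # "in_memory" or "external_merge"
    planned_sort_mem_kb: int  # sort memory the planner assumed


def plan_sort(estimated_input_kb: int, sort_mem_kb: int) -> SortPlan:
    """Pick a strategy from the sort memory visible at plan time."""
    if estimated_input_kb <= sort_mem_kb:
        return SortPlan("in_memory", sort_mem_kb)
    return SortPlan("external_merge", sort_mem_kb)


def plan_still_valid(plan: SortPlan, estimated_input_kb: int,
                     sort_mem_at_exec_kb: int) -> bool:
    """True if planning again with execution-time memory picks the same strategy."""
    replanned = plan_sort(estimated_input_kb, sort_mem_at_exec_kb)
    return replanned.strategy == plan.strategy


# Hypothetical pool: 64 MB free when the query is planned,
# only 8 MB free by the time it executes.
plan = plan_sort(estimated_input_kb=32 * 1024, sort_mem_kb=64 * 1024)
print(plan.strategy)
print(plan_still_valid(plan, 32 * 1024, 8 * 1024))
```

With a fixed limit the two numbers are the same by construction, so the 
check can never fail; with a pool it can fail for any plan, and a prepared 
plan simply has a longer window in which the pool can change.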

So we started looking at other methods.  As Luke points out, a 
well-established way of dealing with the same basic issue of resource 
allocation on mainframes was queuing. After some discussion, we decided 
that query queuing would reduce the "under-allocation problem" and provide 
some other benefits as well. It would also be implementable in a way 
that touched less of the central PostgreSQL code (i.e., you could turn it 
off if you didn't want it).

The tentative idea is to implement several queues. Each queue would allow 
a certain number of concurrently executing queries, and put any additional 
queries on a wait list. This would allow allocation of sort mem based on the 
number of concurrent queries (a more reliable estimate than the number of 
connections) as well as separating the queues and sort mem limits based on 
the expected type and criticality of queries in each queue.
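
A minimal sketch of one such queue, in Python rather than anything 
resembling backend code. The class, its fields, and the idea of giving 
each queue a fixed sort memory budget divided by its concurrency limit are 
assumptions made for illustration, not a settled design.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueryQueue:
    name: str
    max_concurrent: int       # queries allowed to execute at once
    sort_mem_budget_kb: int   # sort memory set aside for this queue (assumed)
    running: set = field(default_factory=set)
    waiting: deque = field(default_factory=deque)

    def sort_mem_per_query_kb(self) -> int:
        # Depends only on configuration, so the value is the same
        # at plan time and at execution time.
        return self.sort_mem_budget_kb // self.max_concurrent

    def submit(self, query_id: str) -> bool:
        """Return True if the query starts now, False if it is wait-listed."""
        if len(self.running) < self.max_concurrent:
            self.running.add(query_id)
            return True
        self.waiting.append(query_id)
        return False

    def finish(self, query_id: str):
        """Free a slot and admit the oldest waiting query, if there is one."""
        self.running.remove(query_id)
        if self.waiting:
            admitted = self.waiting.popleft()
            self.running.add(admitted)
            return admitted
        return None


# Hypothetical reporting queue: 2 concurrent queries sharing 512 MB.
reports = QueryQueue("reports", max_concurrent=2, sort_mem_budget_kb=512 * 1024)
started = [reports.submit(q) for q in ("q1", "q2", "q3")]
next_up = reports.finish("q1")
print(started, next_up, reports.sort_mem_per_query_kb())
```

Because the divisor is the queue's concurrency limit rather than the 
number of connections, a queue for a few heavy queries can hand each one a 
large allowance while a queue for many small queries keeps its own limit 
low.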

I don't know that we came to a consensus on the best way to divide queues; 
Simon favored ROLES while Luke favored estimate-based allocation.   
However, we think that queuing is definitely the way to go.

-- 
Josh Berkus                GreenPlum Inc
Community Liaison      www.greenplum.com
415-752-2500       jberkus at greenplum.com


