[Bizgres-general] Question: Is 64-bit cache support valuable?

Jim C. Nasby jnasby at pervasive.com
Wed Sep 28 19:45:33 GMT 2005


I tend to agree about the queuing, but here's something else to
consider:

Currently, it's next to impossible to allow the backend to handle all
caching via the buffers. I know that the consensus in the past has been
that this is bad, but I've seen nothing to show that's still the case
with the more advanced buffer management algorithms we've got now. On
top of that, the database is *always* going to have a much better idea
of its needs for data than the OS will, and it's likely that buffer
management will become more advanced in the coming versions.

But as long as things like sort-mem are hanging off the side, instead of
being properly incorporated into our memory management, it's going to be
damn near impossible to do a good job of effectively utilizing memory on
a system. And keep in mind that this goes beyond just buffers and sort
mem. The FSM comes immediately to mind. Right now, it's really easy to
run it out of room, yet if you set it very large a lot of the time
you're just wasting memory. It would be much more efficient to allow
them to share memory.

What this leads to is having most (if not all) memory be pulled from
shared memory. That way, when sort memory isn't in use, the database can
use it for buffering.
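
To make that concrete, here is a rough toy model of the idea, not a
proposal for actual backend code. Every name in it (SharedMemoryPool,
min_buffer_kb, acquire_sort, and so on) is made up for illustration, and
it ignores locking entirely. Sorts borrow from one shared budget, and
whatever they aren't holding counts as buffer space:

```python
class SharedMemoryPool:
    """Toy model: one shared budget split between sorts and buffers.

    All names here are hypothetical; nothing like this exists in the
    backend today.
    """

    def __init__(self, total_kb, min_buffer_kb):
        self.total_kb = total_kb
        # Floor on buffer space, so sorts can never starve the buffers.
        self.min_buffer_kb = min_buffer_kb
        # Memory currently lent out to sorts.
        self.sort_kb = 0

    def buffer_kb(self):
        """Everything not lent to sorts is usable for buffering."""
        return self.total_kb - self.sort_kb

    def acquire_sort(self, want_kb):
        """Lend up to want_kb to a sort; may grant less, or nothing."""
        available = self.total_kb - self.min_buffer_kb - self.sort_kb
        granted = max(0, min(want_kb, available))
        self.sort_kb += granted
        return granted

    def release_sort(self, kb):
        """Return sort memory to the pool so buffers can use it again."""
        if kb > self.sort_kb:
            raise ValueError("releasing more sort memory than was lent")
        self.sort_kb -= kb
```

Note that a sort can be granted less than it asked for, which is exactly
the plan-time vs. execution-time problem Josh raises below.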

Obviously, much of this is beyond the 8.2 or even 8.3 timeframe, but I'm
worried about us getting painted into a corner with some of this stuff.
I'm not saying that queuing does this, but we still need to be careful.

Finally, I think this needs to hit -hackers in the near future. We saw
how well not discussing newsysviews there worked out...

On Tue, Sep 27, 2005 at 11:03:53AM -0700, Josh Berkus wrote:
> Jim,
> 
> > So maybe the direction that PostgreSQL should head is that it has two
> > over-arching memory limits: one for shared memory and one for non-shared
> > memory. All shared memory objects would pull out of the shared memory
> > pool, with all left-over space being used for buffers (of course all
> > these other shared memory objects would still need limits on them so we
> > didn't run out of buffers). All other memory consuming
> > objects/operations, such as sort, would have to obey a total memory
> > limit, designed to ensure that the machine doesn't swap. ISTM that
> > whatever memory we're not using up to that limit could be considered the
> > sort pool.
> 
> We discussed this.   The problem is that it's not practical given the 
> completely anonymous per-query-node allocation of sort mem.  That is, sort 
> mem limits are checked as each node is planned, and then allocated as each 
> node is executed, and memory is released, all without ever checking with 
> any central authority other than the OS.  Adding a centralized "sort mem 
> pool" to this would require major re-engineering of both the planner and 
> executor, as well as months of troubleshooting locking problems.
> 
> There's also some logical issues with centralized sort_mem.  At plan time, 
> the planner costs query structures based on the sort_mem available (the 
> fixed hard limit in the current code).  Were sort_mem to be drawn from a 
> pool, you'd face the likely possibility that the amount of sort_mem 
> available at plan time would be different than the amount available at 
> execution time.  With prepared plans, doubly so.
> 
> So we started looking at other methods.  As Luke points out, a 
> well-established way of dealing with the same basic issue of resource 
> allocation on mainframes was queueing.   After some discussion, we decided 
> that query queuing would reduce the "under-allocation problem" and provide 
> some other benefits as well.   It would also be implementable in a way 
> that touched less of the central PostgreSQL code (i.e., you could turn it 
> off if you didn't want it).
> 
> The tentative idea is to implement several queues.  Each queue would allow 
> a certain number of concurrently executing queries, and put any additional 
> on a wait list.  This would allow allocation of sort mem based on the 
> number of concurrent queries (a more reliable estimate than the number of 
> connections) as well as separating the queues and sort mem limits based on 
> the expected type and criticality of queries in each queue.  
> 
> I don't know that we came to a consensus on the best way to divide queues; 
> Simon favored ROLES while Luke favored estimate-based allocation.   
> However, we think that queuing is definitely the way to go.
> 
> -- 
> Josh Berkus                GreenPlum Inc
> Community Liaison      www.greenplum.com
> 415-752-2500       jberkus at greenplum.com
> 
> _______________________________________________
> Bizgres-general mailing list
> Bizgres-general at pgfoundry.org
> http://pgfoundry.org/mailman/listinfo/bizgres-general
> 
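
For reference, the queue idea quoted above could be sketched roughly as
follows. This is only my reading of the description, not the Bizgres
design: the class and method names are hypothetical, and the even split
of sort memory across concurrent slots is an assumption (the thread
leaves open how queues would be divided).

```python
from collections import deque


class QueryQueue:
    """Toy query queue: a fixed number of running slots plus a wait list.

    Hypothetical sketch; names and the per-query split are assumptions.
    """

    def __init__(self, name, max_active, sort_mem_kb):
        self.name = name
        self.max_active = max_active    # concurrently executing queries
        self.sort_mem_kb = sort_mem_kb  # sort memory assigned to this queue
        self.active = set()
        self.waiting = deque()

    def per_query_sort_mem_kb(self):
        """Sort memory per query, based on concurrent slots rather than
        on the number of connections."""
        return self.sort_mem_kb // self.max_active

    def submit(self, query_id):
        """Run the query now if a slot is free, otherwise wait-list it."""
        if len(self.active) < self.max_active:
            self.active.add(query_id)
            return True
        self.waiting.append(query_id)
        return False

    def finish(self, query_id):
        """Free a slot and start the oldest waiting query, if any."""
        self.active.remove(query_id)
        if self.waiting:
            next_id = self.waiting.popleft()
            self.active.add(next_id)
            return next_id
        return None
```

Several such queues, each with its own max_active and sort_mem_kb, would
give the separation by query type and criticality described above.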

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby at pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

