[Bizgres-general] Question: Is 64-bit cache support valuable?

Simon Riggs simon at 2ndquadrant.com
Thu Jul 21 10:23:56 GMT 2005


On Wed, 2005-07-20 at 10:33 -0700, Luke Lonergan wrote:
> Simon, Jim,
> 
> >> Also, something that's always irked me is how sorts are handled. The
> >> problem is that unless you know your workload really, really well, you
> >> can't give large sorts as much memory as they'd like to have without a
> >> serious risk of running the system out of memory. Is there something
> >> that can be done to put a global limit on sort memory (and possibly
> >> other memory uses)? The idea I've proposed in the past is that as each
> >> sort consumes or releases memory, it updates a global counter that
> >> tracks how much sort memory is in use. As this approaches the set limit,
> >> sorts would be allowed to allocate less memory. Ultimately, the idea is
> >> to do everything possible to allow sorts that can fit entirely in memory
> >> to do so, while forcing sorts that obviously can't fit in memory to
> >> disk. So, like buffer management, how hard would it be to have
> >> different/additional sort memory management from the base postgresql
> >> code?
> > 
> > Almost exactly what I had in mind. Spooky, but good. I'll not discuss
> > the full design just yet, but it has the features you mention, plus some
> > extra control features.

OK...heres thoughts to date...

The challenge is to come up with a scheme that has these features:
- prevents overallocation of memory which results in swapping
- maximises allocation of large memory servers
- allows heavy memory users to "go large" when they need to
- prevents low memory users from conflicting with heavy users/each other
- allows the administrator to have control over the system
- allows some level of control at user level e.g. "need lots, please"
- must be easy enough to get right in real-world use, not benchmarks
- also need to take into account temp_buffers settings
- there may also be a requirement for predictability, so that important
regular tasks can complete in known durations or by deadlines

Most important thing is that the memory management needs to be a Top-
down Memory Allocation scheme. So, all work_mem requests get allocated
from a single top-level pool. This is defined as total_work_mem which is
a SUSET parameter allowing the administrator to change it while the
server is running. This is possible because work_mem is a *dynamic*
memory limit, not a pre-allocation, like shared_buffers. We may not use
all of total_work_mem at any point in time.

Next, we need to consider memory limit policy. How will individual users
have their limits set? It is important that we do not think of this as a
hard allocation because we may not use dynamic memory right up to the
limit. i.e. we may request 16384 then use only 9600 KB.

Currently, since we do not attempt to manage memory centrally an
individual backend doesn't report back when it has reached the actual
peak of the memory usage required for a particular operation. (Not
checked that *all* operations actually would know when they got there,
but would definitely work for sorts...) We would need additional code in
a number of places in the executor to report back, allowing us to
calculate how much could be reallocated to others. We would also need to
report back following freeing of the memory.

Memory allocation calculations are used in the planner to decide whether
to perform Sort/Merge joins, materializations, hash joins and hashable
sub-queries. The performance of those features is very dependent upon
getting the memory allocation correct. So we need a way of saying "what
limit can I use for now for planning that you can guarantee me will be
the minimum I receive when we try to execute this query?"
If the query is being immediately executed then we can set the dynamic
memory limit during planning and then begin using it later in the
executor. ...But the planner does not currently know about
prepared/immediate plans. 

Other thoughts...

Limit policy: alternatives are
  Absolute - gives us the amount we request e.g. 16384

  Variable - gives us a proportion of the total available

Some form of variable scheme, perhaps more than one. An idea might be to
allocate 1/2 of the available memory to first applicant, then 1/4, then
1/8... or something less severe.

Another way to do that might be to define "huge", "large", "medium" etc
in a variable and relative way so that we have a level of abstraction
between the request and what actual size is returned.

  Relative - gives us a proportion of the total e.g. 10%

Note that we might have backends report back that they have not used all
their quota, allowing us to reallocate. That might end up with, say, 11
allocations of 10%... even though we still respect the overall limit.

Limit action
What should we do if the requested size is unavailable?
- Overallocate (current action)
- Trimming - Give back the maximum available, however much that is...
- Step down - Return the next smaller unit of size, in a variable scheme
- Block on a Wait loop until limit request succeeds... but might wait a
long time depending upon workload
- Throw error and allow user to resubmit later - if we did that it would
be important to throw and error quickly, rather than execute for an
hour, hit a limit and then abort

We might wish to consider having the top-level pool broken down further
into several smaller pools which could be targeted by particular user
groups.

For backwards compatibility, if total_work_mem is not set then memory
allocation will use standard absolute memory allocation policy.

It should be fairly straightforward to implement a memory limit request
system using shared memory. We would probably need a system view defined
for it, to allow diagnostics and monitoring.

[This is for ideas only at this stage - I'm not intending to discuss
this in detail or to take this through to detailed design]

Best Regards, Simon Riggs




More information about the Bizgres-general mailing list