[Bizgres-general] Statement Queuing take II - Resource Scheduling

Mark Kirkwood mkirkwood at greenplum.com
Tue Jun 13 01:47:43 UTC 2006


Jim C. Nasby wrote:
> On Mon, Jun 12, 2006 at 02:59:03PM +1200, Mark Kirkwood wrote:
>   
>> Transaction level is attractive, because it is considerably simpler to
>> implement and avoids any deadlock issues[1]. However it is highly
>> desirable to be able to make additional decisions based on the nature of
>> the statements contained in the transaction, which is not possible using
>> this level of granularity alone. Also there is the possibility of
>> idle backends occupying a slot in the queue until they timeout, which is
>> undesirable.
>>     
>  
> How horrible would it be to have a mode where an entire transaction was
> entered into the backend before any actual execution took place? This
> would allow for estimating the total work required by the transaction. I
> think it could also have other benefits down the road, such as knowing
> what tables would not be involved in a transaction, which means that
> vacuum could ignore this long-running transaction when dealing with
> them.
>
>   
It's a nice idea - I suspect difficult to do - as the whole client <-> 
server communication protocol seems to be built on the idea of 
submitting commands/statements and getting a response back *before* 
going on to the next one - so, ahem, might make messing with the 
deadlock detector look like an easy task by comparison :-).
>> A challenge for this sort of resource scheduling is in obtaining the
>> information from the host OS platform in order to determine the limits - 
>> a plugin or port may be needed for each one.
>>     
>  
> There are other things that would definitely benefit from having this
> information. I know there's been numerous times when it would be nice to
> know if something had come from the OS cache or not, as one example.
>
>   
Yes - I guess there will need to be a discussion about which platforms 
are the most strategic to build this sort of stuff for (at a guess: 
Linux first then one of FreeBSD, Solaris). Aside - I was looking at 
doing something like this for FreeBSD a while ago (extending 
pg_buffercache to work with PG + OS caches) - it looked like it was 
going to be a good way to get familiar with the OS kernel code :-).
>> It is envisaged that there will be new parameters for each additional 
>> limiter (e.g. max_work_memory=K for a memory limiter), these will also 
>> queue similarly to the active statement case.
>>     
>  
> One thing that would be extremely handy to have very early on (which
> wouldn't require any OS-specific plugins) is the ability to limit based
> on work_mem consumption. The idea is to configure a role that's using
> queueing so that it has a much larger work_mem setting available for
> its queries; but you would then need to limit when queries could run
> based on both estimated and actual work_mem consumption.
>
>   
Yeah - I would suspect something like that is likely to be the next 
limiter, as work_mem blowout is probably the no. 1 bad guy for this sort 
of thing!
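
To make the idea concrete, here is a rough sketch (in Python, purely for
illustration - nothing like this exists in the code yet) of a memory
limiter that queues statements the same way the active statement limiter
does. The class name, the per-statement estimate and the strict FIFO
admission rule are all assumptions of mine; only the max_work_memory
style cap comes from the proposal.

```python
from collections import deque


class WorkMemLimiter:
    """Hypothetical limiter: admit statements while the sum of their
    estimated work_mem stays within a cap (cf. max_work_memory=K)."""

    def __init__(self, max_work_memory_kb):
        self.max_work_memory_kb = max_work_memory_kb
        self.in_use_kb = 0
        self.waiting = deque()  # FIFO of (statement_id, estimated_kb)
        self.running = {}       # statement_id -> estimated_kb

    def submit(self, statement_id, estimated_kb):
        """Queue a statement and return the ids that may start now."""
        self.waiting.append((statement_id, estimated_kb))
        return self._admit()

    def finish(self, statement_id):
        """Release a finished statement's memory and admit waiters."""
        self.in_use_kb -= self.running.pop(statement_id)
        return self._admit()

    def _admit(self):
        admitted = []
        # Strict FIFO: stop at the first statement that does not fit, so
        # nothing runs out of order. Note that a statement whose estimate
        # exceeds the cap would block the queue; a real design needs a
        # timeout or an error for that case.
        while self.waiting:
            statement_id, estimated_kb = self.waiting[0]
            if self.in_use_kb + estimated_kb > self.max_work_memory_kb:
                break
            self.waiting.popleft()
            self.running[statement_id] = estimated_kb
            self.in_use_kb += estimated_kb
            admitted.append(statement_id)
        return admitted
```

This only limits on the *estimated* consumption; limiting on actual
consumption would need the executor to report back as it goes.
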
> Ideally, this would eventually extend down into operations that use
> work_mem, so that if a per-statement or per-queue work_mem limit was
> going to be exceeded, the backend would switch over to using disk.
>  
>   
>> 1/ This resource management functionality is targeted at DSS/Data
>> Warehouse workloads/systems - is it suitable for batch or financial 
>>     
>  
> In my experience, batch systems typically have their own set of controls
> in place to limit concurrency, so I think the immediate benefit won't be
> as large. The real question is: would this queueing system allow those
> batch systems to be simplified.
>
>   
Right.
> Once we have the ability to hold statements (or transactions) based on
> things like CPU utilization or I/O bandwidth, I think it becomes even
> more valuable to batch processing.
>
>   
Yeah - there are two metrics here:  estimated (cpu or IO) utilization 
for the statement itself, and the current utilization of the system 
right now - it would probably be nice to be able to limit based on both 
of these!
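
As a sketch of what limiting on both could look like (Python for
illustration only; the function, its parameters and the idea of
expressing utilization as a fraction between 0 and 1 are my
assumptions, not anything that exists yet):

```python
def may_run(estimated_utilization, current_utilization,
            max_statement_utilization, max_system_utilization):
    """Hypothetical admission test using both metrics.

    All arguments are fractions of the resource (cpu or IO) in 0..1.
    """
    # Metric 1: the statement's own estimated utilization.
    if estimated_utilization > max_statement_utilization:
        return False
    # Metric 2: what the system is doing right now, plus this statement.
    if current_utilization + estimated_utilization > max_system_utilization:
        return False
    return True
```

A statement failing either test would stay queued and be re-examined
when another statement finishes or the system load drops.
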
>> year-end type workloads too? (It is envisaged that there *will* be a 
>> performance hit if enabled on typical OLTP systems needing a high 
>> level of concurrent activity).
>>     
>
> Will this hit exist even for roles that have queueing disabled?
>
>   
No (he says, waving hands) - for those it's just an if test to check the 
value corresponds to 'no limit' (tho' see below about roles in roles....).
>> 2/ Is there a need for separate timeout parameters for these resource
>> locks (or are two very similar parameters even more likely to confuse
>> than a possible behavior change to the existing ones?).
>>     
>  
> One issue I'm worried about is a large statement in the middle of a
> transaction getting queued and not running for a very long time (hours).
> A timeout would remove that risk. Another possibility is a 'time-in',
> where if a statement has been in the queue for too long, start running
> everything in front of it in the queue, as well as itself (I don't think
> we'd want to run something from the queue out of order).
>
>   
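
The 'time-in' idea could be sketched roughly like this (Python, for
illustration; the function name and the (statement_id, enqueued_at)
queue layout are hypothetical). Everything up to and including the
last overdue statement is released in queue order, so nothing runs out
of order:

```python
from collections import deque


def release_timed_in(queue, now, time_in_seconds):
    """Pop and return the ids of statements that should start running.

    queue is a deque of (statement_id, enqueued_at) in FIFO order. If a
    statement has waited at least time_in_seconds, it is released
    together with everything in front of it.
    """
    last_overdue = -1
    for index, (_statement_id, enqueued_at) in enumerate(queue):
        if now - enqueued_at >= time_in_seconds:
            last_overdue = index

    released = []
    for _ in range(last_overdue + 1):
        statement_id, _enqueued_at = queue.popleft()
        released.append(statement_id)
    return released
```

A plain timeout would instead remove the overdue statement from the
queue and raise an error for it.
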
>> 3/ The resource locks will be released at statement finish - is this
>> possible to detect reliably?
>>     
>  
> Not sure what you're asking here... surely you can tell when a statement
> finishes?
>
>   
Just wondering aloud if there is any subtlety there (have not looked at 
that section of the code yet!)
>> 4/ Do we want this ROLE related control, or should there be a global
>> parameter that controls all connections except the superuser?
>>     
>
>
> One question is what about roles assigned to other roles? Do you always
> use the limits of the lowest role in the tree, or are limits cumulative?
> In the future, I can see cumulative limits being very useful. I don't
> think this should be implemented now, but it's probably worth
> considering in the design and syntax.
>
>   
Yes - I was thinking about this myself, wondering if finding 'all roles 
assigned to role x' is an expensive operation....(as it could get 
performed for every statement)...however - presumably something like 
this happens at connection startup, so we may get it for free (I need to 
check the code).
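
For the sake of discussion, here is a small sketch of the two
behaviours (Python for illustration; the member_of and limits
dictionaries are made-up stand-ins for the catalog data, and reading
'cumulative' as 'every limit in the membership chain applies' is my
interpretation). The membership walk is the part we would want to do
once at connection startup rather than for every statement:

```python
def roles_reachable_from(role, member_of):
    """Return role plus every role it is directly or indirectly in."""
    seen = []
    stack = [role]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(member_of.get(current, []))
    return seen


def applicable_limits(role, member_of, limits, cumulative):
    """Return {role_name: limit} that a statement run by role must meet.

    cumulative=False: only the limit on the role itself is used.
    cumulative=True: limits on all containing roles apply as well.
    """
    if cumulative:
        candidates = roles_reachable_from(role, member_of)
    else:
        candidates = [role]
    return {r: limits[r] for r in candidates if r in limits}
```
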
>> 5/ There is to be one lock per resource limit *and* ROLE, is this
>> necessary? - could we work out the ROLE whenever we examine the queue?
>> (would that be too big a performance hit)?
>>     
>  
> Is it bad to have one per limit and role? With only one kind of limit, I
> don't see this as being a big deal.
>
>   
Yeah -  I guess I was thinking about the general case when we possibly 
have several limit types for each role...
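
A sketch of that general case, with one entry per (role, limit type)
pair (Python for illustration; the class, the limit type names and the
all-or-nothing acquire are my assumptions, not a design decision):

```python
class ResourceLockTable:
    """Hypothetical table holding one counter per (role, limit type)."""

    def __init__(self):
        self.entries = {}  # (role, limit_type) -> {"limit", "in_use"}

    def define(self, role, limit_type, limit):
        self.entries[(role, limit_type)] = {"limit": limit, "in_use": 0}

    def try_acquire(self, role, requests):
        """requests maps limit_type -> amount wanted by one statement.

        Either every limit defined for the role is satisfied and all
        counters are bumped, or nothing changes and the caller queues.
        Limit types with no entry for the role mean 'no limit'.
        """
        keys = [(role, t) for t in requests if (role, t) in self.entries]
        for key in keys:
            entry = self.entries[key]
            if entry["in_use"] + requests[key[1]] > entry["limit"]:
                return False
        for key in keys:
            self.entries[key]["in_use"] += requests[key[1]]
        return True

    def release(self, role, requests):
        """Give back what try_acquire took, at statement finish."""
        for limit_type, amount in requests.items():
            key = (role, limit_type)
            if key in self.entries:
                self.entries[key]["in_use"] -= amount
```

With only one limit type this collapses to one lock per role, which is
the case Jim describes.
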

Cheers

Mark

