[Bizgres-general] Statement Queuing take II - Resource Scheduling
Mark Kirkwood
mkirkwood at greenplum.com
Tue Jun 13 01:47:43 UTC 2006
Jim C. Nasby wrote:
> On Mon, Jun 12, 2006 at 02:59:03PM +1200, Mark Kirkwood wrote:
>
>> Transaction level is attractive, because it is considerably simpler to
>> implement and avoids any deadlock issues[1]. However it is highly
>> desirable to be able to make additional decisions based on the nature of
>> the statements contained in the transaction, which is not possible using
>> this level of granularity alone. Also there is the possibility of
>> idle backends occupying a slot in the queue until they time out, which is
>> undesirable.
>>
>
> How horrible would it be to have a mode where an entire transaction was
> entered into the backend before any actual execution took place? This
> would allow for estimating the total work required by the transaction. I
> think it could also have other benefits down the road, such as knowing
> what tables would not be involved in a transaction, which means that
> vacuum could ignore this long-running transaction when dealing with
> them.
>
>
It's a nice idea - I suspect difficult to do - as the whole client <->
server communication protocol seems to be built on the idea of
submitting commands/statements and getting a response back *before*
going on to the next one - so, ahem, might make messing with the
deadlock detector look like an easy task by comparison :-).
>> A challenge for this sort of resource scheduling is in obtaining the
>> information from the host OS platform in order to determine the limits -
>> a plugin or port may be needed for each one.
>>
>
> There are other things that would definitely benefit from having this
> information. I know there's been numerous times when it would be nice to
> know if something had come from the OS cache or not, as one example.
>
>
Yes - I guess there will need to be a discussion about which platforms
are the most strategic to build this sort of stuff for (at a guess:
Linux first, then one of FreeBSD, Solaris). Aside - I was looking at
doing something like this for FreeBSD a while ago (extending
pg_buffercache to work with PG + OS caches) - it looked like it was
going to be a good way to get familiar with the OS kernel code :-).
>> It is envisaged that there will be new parameters for each additional
>> limiter (e.g. max_work_memory=K for a memory limiter), these will also
>> queue similarly to the active statement case.
>>
>
> One thing that would be extremely handy to have very early on (which
> wouldn't require any OS-specific plugins) is the ability to limit based
> on work_mem consumption. The idea is to configure a role that's using
> queueing so that it has a much larger work_mem setting available for
> its queries; but you would then need to limit when queries could run
> based on both estimated and actual work_mem consumption.
>
>
Yeah - I would suspect something like that is likely to be the next
limiter, as work_mem blowout is probably the no. 1 bad guy for this sort
of thing!
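
To make the work_mem limiter idea a bit more concrete, here is a minimal sketch in Python (not backend code). Everything in it is an assumption for illustration: the class name WorkMemLimiter, the submit/finish methods, and the choice to let an oversized statement run alone when nothing else is running. It only models the estimated work_mem case, queued FIFO so nothing runs out of order.

```python
from collections import deque


class WorkMemLimiter:
    """Hypothetical limiter: admits statements while the sum of their
    estimated work_mem stays within max_work_memory_kb."""

    def __init__(self, max_work_memory_kb):
        self.max_kb = max_work_memory_kb
        self.in_use_kb = 0
        self.running = {}       # statement id -> estimated kB
        self.waiting = deque()  # FIFO of (statement id, estimated kB)

    def _fits(self, kb):
        # Assumption: a statement larger than the whole limit is allowed
        # to run alone rather than wait forever.
        return self.in_use_kb == 0 or self.in_use_kb + kb <= self.max_kb

    def _start(self, stmt_id, kb):
        self.running[stmt_id] = kb
        self.in_use_kb += kb

    def submit(self, stmt_id, estimated_kb):
        """Return True if the statement may run now, False if it was queued."""
        # New arrivals go behind existing waiters, so the queue stays in order.
        if not self.waiting and self._fits(estimated_kb):
            self._start(stmt_id, estimated_kb)
            return True
        self.waiting.append((stmt_id, estimated_kb))
        return False

    def finish(self, stmt_id):
        """Release a statement's memory; return the ids woken from the queue."""
        self.in_use_kb -= self.running.pop(stmt_id)
        woken = []
        while self.waiting and self._fits(self.waiting[0][1]):
            next_id, kb = self.waiting.popleft()
            self._start(next_id, kb)
            woken.append(next_id)
        return woken
```

Limiting on actual (rather than estimated) consumption would need the executor to report usage back, which this sketch does not attempt.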
> Ideally, this would eventually extend down into operations that use
> work_mem, so that if a per-statement or per-queue work_mem limit was
> going to be exceeded, the backend would switch over to using disk.
>
>
>> 1/ This resource management functionality is targeted at DSS/Data
>> Warehouse workloads/systems - is it suitable for batch or financial
>>
>
> In my experience, batch systems typically have their own set of controls
> in place to limit concurrency, so I think the immediate benefit won't be
> as large. The real question is: would this queueing system allow those
> batch systems to be simplified.
>
>
Right.
> Once we have the ability to hold statements (or transactions) based on
> things like CPU utilization or I/O bandwidth, I think it becomes even
> more valuable to batch processing.
>
>
Yeah - there are two metrics here: estimated (cpu or IO) utilization
for the statement itself, and the current utilization of the system
right now - it would probably be nice to be able to limit based on both
of these!
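
As a rough illustration of limiting on both metrics, something like the following Python sketch could express the check. The function name, the parameter names and the idea of returning the list of blocking limits are all made up for this example; how the estimate and the current utilization would actually be obtained is the open (platform-specific) part.

```python
def blocking_limits(estimated_cost, current_utilization,
                    max_statement_cost, max_utilization):
    """Return the names of the limits that would hold this statement.

    An empty list means the statement may run now. All four inputs are
    hypothetical: estimated_cost would come from the planner, and
    current_utilization from whatever the host OS can report.
    """
    blocked_by = []
    if estimated_cost > max_statement_cost:
        blocked_by.append("estimated")
    if current_utilization >= max_utilization:
        blocked_by.append("current")
    return blocked_by
```

For example, a cheap statement arriving on a busy system would be held by the "current" limit only, and could be retried once utilization drops.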
>> year-end type workloads too? (It is envisaged that there *will* be a
>> performance hit if enabled on typical OLTP systems needing a high
>> level of concurrent activity).
>>
>
> Will this hit exist even for roles that have queueing disabled?
>
>
No (he says, waving hands) - for those it's just an if test to check the
value corresponds to 'no limit' (tho' see below about roles in roles....).
>> 2/ Is there a need for separate timeout parameters for these resource
>> locks (or are two very similar parameters even more likely to confuse
>> than a possible behavior change to the existing ones?).
>>
>
> One issue I'm worried about is a large statement in the middle of a
> transaction getting queued and not running for a very long time (hours).
> A timeout would remove that risk. Another possibility is a 'time-in',
> where if a statement has been in the queue for too long, start running
> everything in front of it in the queue, as well as itself (I don't think
> we'd want to run something from the queue out of order).
>
>
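One way to read the 'time-in' suggestion, sketched in Python under some assumptions: each queue entry carries its own maximum wait (otherwise the statement at the front is always the oldest), and the names time_in, enqueued_at and max_wait are invented for the example. When any entry has waited too long, it and everything in front of it are released together, so queue order is preserved.

```python
def time_in(queue, now):
    """Split a FIFO queue into (statements to run, statements still waiting).

    queue is a list of (stmt_id, enqueued_at, max_wait) tuples, front first.
    The entries released are the front of the queue up to and including the
    last entry that has waited longer than its own max_wait.
    """
    cutoff = 0
    for position, (_, enqueued_at, max_wait) in enumerate(queue):
        if now - enqueued_at > max_wait:
            cutoff = position + 1
    to_run = [stmt_id for stmt_id, _, _ in queue[:cutoff]]
    return to_run, queue[cutoff:]
```

With a queue of a (patient), b (impatient), c (patient), an overdue b releases a and b but leaves c waiting.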
>> 3/ The resource locks will be released at statement finish - is this
>> possible to detect reliably?
>>
>
> Not sure what you're asking here... surely you can tell when a statement
> finishes?
>
>
Just wondering aloud if there is any subtlety there (have not looked at
that section of the code yet!)
>> 4/ Do we want this ROLE related control, or should there be a global
>> parameter that controls all connections except the superuser?
>>
>
>
> One question is what about roles assigned to other roles? Do you always
> use the limits of the lowest role in the tree, or are limits cumulative?
> In the future, I can see cumulative limits being very useful. I don't
> think this should be implemented now, but it's probably worth
> considering in the design and syntax.
>
>
Yes - I was thinking about this myself, wondering if finding 'all roles
assigned to role x' is an expensive operation....(as it could get
performed for every statement)...however - presumably something like
this happens at connection startup, so we may get it for free (I need to
check the code).
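
A small Python sketch of the 'work it out once at connection startup' idea follows. It is not based on the backend code: the member_of mapping, the Session class and the limit dictionary are hypothetical, and picking the most restrictive limit among the roles is just one possible answer to the lowest-role versus cumulative question.

```python
def all_roles(role, member_of):
    """Return role plus every role it is directly or indirectly a member of.

    member_of maps a role name to the roles it has been granted.
    """
    seen = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(member_of.get(current, ()))
    return seen


class Session:
    """Hypothetical per-connection state, built once at startup."""

    def __init__(self, role, member_of, limits):
        roles = all_roles(role, member_of)
        configured = [limits[r] for r in roles if r in limits]
        # Assumption: the most restrictive limit wins; None means 'no limit'.
        self.active_statement_limit = min(configured) if configured else None
```

Each statement then only has to test the cached value, so the role lookup cost is paid per connection rather than per statement.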
>> 5/ There is to be one lock per resource limit *and* ROLE, is this
>> necessary? - could we work out the ROLE whenever we examine the queue?
>> (would that be too big a performance hit)?
>>
>
> Is it bad to have one per limit and role? With only one kind of limit, I
> don't see this as being a big deal.
>
>
Yeah - I guess I was thinking about the general case when we possibly
have several limit types for each role...
Cheers
Mark