[Bizgres-general] Statement Queuing take II - Resource Scheduling
Mark Kirkwood
mkirkwood at greenplum.com
Fri Jun 16 01:23:43 UTC 2006
Josh Berkus wrote:
> So, I want to suggest an alternative approach to something:
>
> Mark suggested a two-variable approach:
>
> concurrent_queries == max number of simultaneously executing queries, per
> ROLE.
> trivial_query_threshold == estimated query cost below which queries don't
> count towards the above
>
> From a user perspective, it would be better to simplify this as:
>
> total_query_cost == total cost of all queries to run simultaneously; beyond
> this cost, queries will queue up.
>
Nice idea, I might do it this may instead. thoughts?
> Regardless, I see some potential pitfalls with the whole resource
> allocation thing:
>
> Locking: PostgreSQL already has issues with locking limiting our
> multi-processor scalability. These resource limits would cause
> additional lock contention, since before executing each query you'd need
> to take one away from a list, and add it back after completion.
> This isn't as much of a concern with the low-volume, high-query-cost
> environment of DSS. However, the contention would mean that this approach
> would be useful *only* for DSS, and would need to be disabled for OLTP and
> Web systems. Further, that might relegate the allocator to /contrib,
> never to make it into the main code.
>
>
Yeah - good point. I understand that some people will want to do some
OLTP and some DSS on the same system. I'm thinking that would work under
restricted circumstances like:
- OLTP is low volume (e.g. users updating their DSS data on-line) or
- roles using OLTP are distinct from those using DSS (so OLTP roles can
have all the resource queuing stuff disabled).
There are however, wider issues with doing OLTP and DSS on the same
system - e.g:
- DSS systems are typically optimized for sequential throughput
- OLTP systems are typically optimized for random throughput
- DSS systems are typically optimized for a few large memory sized, high
cpu using, low cache hit ratio threads of execution
- OLTP systems are typically optimized for many small memory sized, high
cache hit ratio threads of execution
This means that you probably shouldn't put your airline reservation
system and the data warehouse for it on the same box and expect them
both to work well!
> Permissions: For RA to work, we need to be able to make some of the RA
> variables superuser-only-set. This is an issue because PostgreSQL does
> not have a way, currently, to create superuser-only runtime-change
> variables, nor a way for a superuser to say "regular users can't change
> this userset variable". Mind you, the latter would be good for
> PostgreSQL system management in general; it's *always* been a problem that
> Joe User can decide to up his work_mem to 524,000 if he wants to.
>
>
Yeah - but for most users their entire interface to the data will be
their ad-hoc tool or web page, which will not have this ability (one
would hope :-) ). I'm thinking that folks using psql or pgadmin3 to get
their DSS reports are "expert", and can be told nicely "don't do that"
(maybe followed by DROP ROLE if the nice approach fails ;-) )
> Deadlocking: this has already been discussed. This is a reason to do
> per-role thresholds; in an extreme case, one could give SPs their own
> Roles to prevent deadlocks.
>
> Client Protocol Issues: we would need to add a message to the client
> protocol which indicates "query accepted and queued". Drivers would need
> to support this. Then comes the question of, do we return control to the
> client? If so, do they get to send another query? What happens if
> results for the 1st query come back while the 2nd query is executing? If
> we don't return control, don't we risk running out of connections?
> This is all the multiple-query-client protocol which has been previously
> discussed, and is a significant project in its own right.
>
I was not thinking of amending the client protocol - the idea being that
queuing would be "transparent" to clients (there was the discussion
point about whether there should be resource queuing specific timeouts).
Cheers
Mark
More information about the Bizgres-general
mailing list