[Bizgres-general] Off-list Re: [ENG] Re: Statement Queuing take II - Resource Scheduling (Running with cost and cursors)
Luke Lonergan
llonergan at greenplum.com
Sat Aug 5 05:37:47 UTC 2006
Jim,
On 8/4/06 8:55 AM, "Jim C. Nasby" <jnasby at pervasive.com> wrote:
> Sorry for the noise, but I thought of a real-world use case (it would be
> a good idea to find more of these).
Thanks for the useful noise!
In discussions with each of three top analyst firms covering enterprise data
warehousing, they emphasized the importance of handling "mixed workload" in
enterprise warehouse environments. None of the three analyst firms is being
paid by us, so this is unbiased research, and they all brought the topic up
first.
All divide workload into four categories, starting with the usual three:
- Ad-hoc queries, involving sorting, sequential scans, etc.
- Reporting queries, involving large numbers of users at peak times
- Loading, involving ELT queries and summarization
Then there's a fourth they are tracking:
- Queries directly involved in web infrastructure, aka "active warehouse"
Teradata is considered the king of managing mixed workloads, followed by IBM,
which is playing catch-up. Oracle doesn't rate.
In order to handle mixed workloads, we will have to be able to tolerate
simultaneous streams of ad-hoc queries, some taking tens of minutes, some
taking days to execute, alongside reporting queries that have to execute in
large numbers simultaneously.
We also need to be able to time-share statement workload within different
statement queues:
Consider the situation where we set up a queue for ad-hoc queries that
recognizes the use of sorting in statements and allows one query
to run at a time. There may be other queues for lighter-weight queries.
During normal use, ad-hoc queries will run consecutively, which is OK when
they are all of reasonable duration, say ten minutes each.
Now someone runs an ad-hoc query that is going to take a much longer time to
run, say one day. Every following ad-hoc query will wait the whole day for
the queue to clear.
The proven way to remove this issue is to time share the single queue slot
among some number of runnable entries, say two. The runnable statements take
turns in execution, each for one time slot, say one minute. After its time
is done, a suspend signal (e.g. SIGSTOP) is sent to its PID and the next
runnable statement rotates in, and so on.
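To put rough numbers on the difference, here is a minimal Python sketch that simulates the single ad-hoc slot both ways: run-to-completion, and time shared among two runnable entries with a one-minute slice. It is only a simulation under simplifying assumptions (all statements arrive at time zero, durations are known in minutes, and suspending or resuming costs nothing). The function names and parameters are made up for illustration and are not part of any Bizgres code.

```python
from collections import deque


def fifo_finish_times(durations):
    """Finish time (in minutes) of each statement when the slot runs
    one statement at a time to completion, in arrival order."""
    clock = 0
    finish = []
    for d in durations:
        clock += d
        finish.append(clock)
    return finish


def time_shared_finish_times(durations, runnable=2, time_slice=1):
    """Finish times when up to `runnable` statements rotate through the
    single slot, each running for at most `time_slice` minutes per turn."""
    remaining = list(durations)
    finish = [None] * len(durations)
    waiting = deque(range(len(durations)))  # queued, not yet runnable
    active = deque()                        # runnable entries sharing the slot
    clock = 0
    while waiting or active:
        # Admit queued statements until the runnable set is full.
        while waiting and len(active) < runnable:
            active.append(waiting.popleft())
        idx = active.popleft()              # next statement rotates in
        run = min(time_slice, remaining[idx])
        clock += run
        remaining[idx] -= run
        if remaining[idx] == 0:
            finish[idx] = clock             # done, frees a runnable entry
        else:
            active.append(idx)              # "suspended", back of the rotation
    return finish


# One day-long query (1440 minutes) followed by two ten-minute queries.
workload = [1440, 10, 10]
print(fifo_finish_times(workload))         # [1440, 1450, 1460]
print(time_shared_finish_times(workload))  # [1460, 20, 40]
```

In this toy workload the ten-minute queries finish after 20 and 40 minutes instead of waiting more than a day, while the day-long query finishes only 20 minutes later than it would have alone.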
- Luke