[Bizgres-general] Off-list Re: [ENG] Re: Statement Queuing take II - Resource Scheduling (Running with cost and cursors)
Luke Lonergan
LLonergan at greenplum.com
Mon Aug 7 02:31:43 UTC 2006
Mark,
> Right, but if our time slicing conflicts or confuses the
> OS's, or is not really lessening the resource load, then we
> are losing out.
Of course, but your points below miss the objective of the queue - we're
not worried about CPU or memory sharing, it's disk sharing of a certain
kind that causes non-linear effects.
Sorting queries reading and writing at the same time are too difficult
for the predictive read-ahead algorithms in the OS I/O scheduler to
handle. As a consequence, two sorting queries running at once will slow
each other down by a non-linear factor caused by the change from
sequential access to random access. We regularly see a factor of 3-4
slowdown (above and beyond the linear) in practice.
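
A toy model of the effect described above, as a rough sketch only: it charges a fixed seek whenever the disk head moves from one query's stream to another's. The function name `total_time` and all parameter values (seek time, per-block transfer time, chunk size) are illustrative assumptions, not measurements from any real system.

```python
def total_time(num_streams, blocks_per_stream, chunk,
               transfer_ms=0.1, seek_ms=8.0):
    """Milliseconds to read num_streams sequential streams from one disk,
    switching streams every `chunk` blocks (round robin).

    A seek is charged each time the head moves to a different stream.
    All costs are assumed values for illustration.
    """
    total = 0.0
    current = None
    remaining = [blocks_per_stream] * num_streams
    while any(remaining):
        for s in range(num_streams):
            if remaining[s] == 0:
                continue
            if current != s:          # head has to move: pay a seek
                total += seek_ms
                current = s
            n = min(chunk, remaining[s])
            total += n * transfer_ms  # sequential transfer within the chunk
            remaining[s] -= n
    return total


solo = total_time(1, 10_000, chunk=32)
linear = 2 * solo                      # what plain linear sharing would cost
interleaved = total_time(2, 10_000, chunk=32)
print(f"slowdown beyond linear: {interleaved / linear:.2f}x")
```

In this model the slowdown beyond linear depends entirely on how much data is transferred between seeks, which is why small interleaved requests hurt far more than large ones.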
> If we have enough resources to let K elements of a size N
> queue be time sliced, we have enough to let K be active (i.e.
> unsliced - managed by the queue's "active count" limit). The
> reasoning for this is essentially my point last time - time
> slicing K queries each of which needs M of a resource is
> typically going to use KM of the resource - and letting 'em
> run as usual will typically use KM of the resource - probably
> more efficiently.
If we were only talking about CPU and memory utilization (your N is
memory in this case), then you would be right. In the case of interest,
where the non-linear disk sharing is the problem, this is describing an
overhead.
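
One possible reading of the time-slicing argument, sketched under stated assumptions: if a single query owns the disk for a whole slice, the cost of switching is paid once per slice instead of once per request. The function `switch_overhead` and the 10 ms switch cost are hypothetical, chosen only to show how the overhead shrinks as the slice grows.

```python
def switch_overhead(slice_seconds, switch_cost_ms=10.0):
    """Fraction of wall-clock time lost to switching when one query
    owns the disk for slice_seconds at a time.

    switch_cost_ms is an assumed per-switch penalty (seek plus lost
    read-ahead), not a measured figure.
    """
    switch_s = switch_cost_ms / 1000.0
    return switch_s / (slice_seconds + switch_s)


for slice_seconds in (0.01, 1.0, 30.0):
    pct = switch_overhead(slice_seconds)
    print(f"slice {slice_seconds:6.2f}s  overhead {pct:.4%}")
```

This ignores cache effects entirely; it only illustrates that the per-switch disk penalty is amortized over the slice length.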
> Now we could save to disk (i.e. serialize or create a continuation) the
> resource(s) at the end of each query's time slice - (i.e. the
> sort set, executor state and data for the query progress,
> plus the partially generated tuple results sets, hash joins,
> loops etc) - but this could well cause more problems than we
> are solving (e.g. it might take KM of the resource to save
> each query's M of it!). However even assuming it is not that
> bad - for some resources (e.g. memory), it may not be
> released back to the OS in time for the next few time slices to use!
> (I've just noticed that Gavin has just discussed this issue
> more completely)
Again - we don't have to worry about this - the system will have enough
RAM to share among the different runnables by definition. No need to
talk about the swapping problem.
> With respect to read-ahead, ISTM that one of the main reasons
> for this queuing work is that there is typically always
> memory pressure. With respect to the buffer cache, 30
> seconds is enough time for query N to completely flush the
> buffer cache records that query N-k was using, thus every
> query will be typically seeing a completely cold cache for
> its time slice. This is going to increase system resource
> usage and generally result in poorer performance than
> executing the queries sequentially.
Nope. I think that's the basic misunderstanding here - that's one of
the issues we're addressing. The time slicing addresses another.
> I don't think so (despite the fact that it would be fun to
> program :-) ).
Let's continue this discussion - I definitely see misunderstandings in
the above.
> I think the suggestion for defining queues of statement
> workload is sound, and the current queue design with its
> various limits (cost, active count, +) should work well with
> it, and do pretty much what you are wanting to achieve.
Yes, they will. That leaves the second problem I am describing, which is
very familiar to OS people.
> We
> probably need to thrash around the whole user
> -> queue vs statement -> queue association a bit more to see if we can
> somehow get the best of both approaches! (I think Gavin hinted at this
> too...)
Sure.
- Luke