[Bizgres-general] Off-list Re: [ENG] Re: Statement Queuing take II - Resource Scheduling (Running with cost and cursors)

Luke Lonergan LLonergan at greenplum.com
Mon Aug 7 02:31:43 UTC 2006


Mark,
  
> Right, but if our time slicing conflicts or confuses the 
> OS's, or is not really lessening the resource load, then we 
> are losing out.

Of course, but your points below miss the objective of the queue - we're
not worried about CPU or memory sharing, it's disk sharing of a certain
kind that causes non-linear effects.

Sorting queries reading and writing at the same time are too difficult
for the predictive read-ahead algorithms in the OS I/O scheduler to
handle.  As a consequence, two sorting queries running at once will slow
each other down by a non-linear factor caused by the change from
sequential access to random access.  We regularly see a factor of 3-4
slowdown (above and beyond the linear) in practice.
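
To make the non-linear effect concrete, here is a back-of-the-envelope sketch in Python. It is a toy model only, not how the OS I/O scheduler works: the seek time, transfer rate and chunk size are assumed values chosen for illustration, and the function names are hypothetical. It compares running the streams one after another (pure sequential transfer) with running them together, where every chunk is assumed to cost a seek because the other stream has moved the disk head.

```python
# Toy disk model. All numeric defaults are illustrative assumptions,
# not measurements from this thread.

def sequential_time(total_mb, rate_mb_s):
    """Seconds to stream total_mb with no seeks at all."""
    return total_mb / rate_mb_s


def interleaved_time(total_mb, rate_mb_s, chunk_mb, seek_s):
    """Seconds to move total_mb when every chunk is preceded by a seek,
    i.e. another stream displaced the head between chunks."""
    chunks = total_mb / chunk_mb
    return chunks * (seek_s + chunk_mb / rate_mb_s)


def slowdown(n_streams, mb_per_stream,
             rate_mb_s=60.0, chunk_mb=0.25, seek_s=0.008):
    """Ratio of 'all streams at once' time to 'one after another' time.
    A ratio of 1.0 means sharing the disk costs nothing extra."""
    one_after_another = n_streams * sequential_time(mb_per_stream, rate_mb_s)
    all_at_once = interleaved_time(n_streams * mb_per_stream,
                                   rate_mb_s, chunk_mb, seek_s)
    return all_at_once / one_after_another


if __name__ == "__main__":
    print(round(slowdown(2, 1000.0), 2))
```

With these assumed numbers the ratio comes out at about 2.9, and it grows as the chunk size shrinks or the seek time rises. The model ignores read-ahead, caching and write-back entirely, so it only shows the direction of the effect, not its real magnitude.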

> If we have enough resources to let K elements of a size N
> queue be time sliced, we have enough to let K be active (i.e.
> unsliced - managed by the queue's "active count" limit). The
> reasoning for this is essentially my point last time - time
> slicing K queries, each of which needs M of a resource, is
> typically going to use KM of the resource - and letting 'em
> run as usual will typically use KM of the resource - probably
> more efficiently.

If we were only talking about CPU and memory utilization (your N is
memory in this case), then you would be right.  In the case of interest,
where the non-linear disk sharing is the problem, what you describe is
an overhead.

> Now we could save to disk (i.e serialize or create continuation) the
> resource(s) at the end of each query's time slice - (i.e. the 
> sort set, executor state and data for the query progress, 
> plus the partially generated tuple results sets, hash joins, 
> loops etc) - but this could well cause more problems than we 
> are solving (e.g. - it might take KM of the resource to save
> each query's M of it!).  However even assuming it is not that
> bad - for some resources (e.g. memory), it may not be 
> released back to the OS in time for the next few time slices to use! 
> (I've just noticed that Gavin has just discussed this issue 
> more completely)

Again - we don't have to worry about this - the system will have enough
RAM to share among the different runnables by definition.  No need to
talk about the swapping problem.

> With respect to read-ahead, ISTM that one of the main reasons 
> for this queuing work is that there is typically always 
> memory pressure. With respect to the buffer cache,  30 
> seconds is enough time for query N to completely flush the 
> buffer cache records that query N-k was using, thus every 
> query  will be typically seeing a completely cold cache for 
> its time slice. This is going to increase system resource 
> usage and generally result in poorer performance than 
> executing the queries sequentially.

Nope.  I think that's the basic misunderstanding here - that's one of
the issues we're addressing.  The time slicing addresses another.
 
> I don't think so (despite the fact that it would be fun to 
> program :-) ).

Let's continue this discussion - I definitely see misunderstandings in
the above.
 
> I think the suggestion for defining queues of statement 
> workload is sound, and the current queue design with its 
> various limits (cost, active count, +) should work well with 
> it, and do pretty much what you are wanting to achieve.

Yes, they will.  That leaves the second problem I am describing, which
is very familiar to OS people.

> We 
> probably need to thrash around the whole user 
> -> queue vs statement -> queue association a bit more to see if we can
> somehow get the best of both approaches! (I think Gavin hinted at this
> too...)

Sure.

- Luke


