[Bizgres-general] Off-list Re: [ENG] Re: Statement Queuing take II - Resource Scheduling (Running with cost and cursors)
Mark Kirkwood
mkirkwood at greenplum.com
Mon Aug 7 02:22:20 UTC 2006
Luke Lonergan wrote:
>
> The purpose of having our own timeslicing scheduler instead of just
> using the one built into the OS is to avoid the I/O conflicts that the
> OS is not aware of, but we are.
>
>
Right, but if our time slicing conflicts with or confuses the OS's, or is
not really lessening the resource load, then we are losing out.
> Point by point:
> - Postgres can and will only use O(1GB) of RAM per query within its
> process segment and we can establish the amount beforehand. Therefore
> we can control the number of runnables in the queues to ensure that
> there will not be swapping.
>
>
If we have enough resources to let K elements of a size N queue be time
sliced, we have enough to let K be active (i.e. unsliced - managed by the
queue's "active count" limit). The reasoning for this is essentially my
point last time - time slicing K queries, each of which needs M of a
resource, is typically going to use KM of the resource - and letting 'em
run as usual will typically use KM of the resource - probably more
efficiently.
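A toy model of the KM argument above, as a rough sketch only. It assumes
one unit of work gets done per tick in total, and that a query holds its
M units of the resource from its first slice until it finishes (nothing
is released while it is paused). The function name, the modes and the
tick model are made up for illustration and are not part of the queue
implementation.

```python
from fractions import Fraction


def simulate(work, mem, mode):
    """Return (peak resource held, ticks taken) for a batch of queries.

    work: units of work each query needs (one unit is done per tick overall)
    mem:  resource held by a query once started, until it finishes (assumed)
    mode: "sequential", "time_sliced" (one query runs per tick, round robin)
          or "concurrent" (all live queries share each tick equally)
    """
    remaining = [Fraction(w) for w in work]
    started = [False] * len(work)
    peak = ticks = turn = 0
    while any(r > 0 for r in remaining):
        live = [i for i, r in enumerate(remaining) if r > 0]
        if mode == "sequential":
            running = [live[0]]
        elif mode == "time_sliced":
            running = [live[turn % len(live)]]
            turn += 1
        else:
            running = live
        for i in running:
            started[i] = True
        # Paused queries still hold their resource, so count every
        # started-but-unfinished query, not just the running ones.
        held = mem * sum(1 for i, r in enumerate(remaining)
                         if started[i] and r > 0)
        peak = max(peak, held)
        share = Fraction(1, len(running))
        for i in running:
            remaining[i] -= share
        ticks += 1
    return peak, ticks


K, M = 3, 1
for mode in ("sequential", "time_sliced", "concurrent"):
    print(mode, simulate([3] * K, M, mode))
# sequential (1, 9)
# time_sliced (3, 9)
# concurrent (3, 9)
```

Under these assumptions slicing K queries peaks at KM, exactly like
letting K run unsliced; only running them one after another gets the
peak down to M.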
Now we could save to disk (i.e. serialize or create a continuation of) the
resource(s) at the end of each query's time slice (i.e. the sort set,
executor state and data for the query progress, plus the partially
generated tuple result sets, hash joins, loops etc.) - but this could
well cause more problems than we are solving (e.g. it might take KM of
the resource to save each query's M of it!). However, even assuming it
is not that bad - for some resources (e.g. memory), it may not be
released back to the OS in time for the next few time slices to use!
(I've just noticed that Gavin has just discussed this issue more completely)
> - I/O readahead buffering is O(16MB) and is retained in the OS I/O cache
> unless there is memory pressure. Even if an I/O buffer is flushed, it
> takes milliseconds to refill one, so our intra-queue timeslicing can be
> chosen at a number like 30 seconds to ensure that we're not wasting
> work.
>
>
With respect to read-ahead, ISTM that one of the main reasons for this
queuing work is that there is almost always memory pressure. With
respect to the buffer cache, 30 seconds is enough time for query N to
completely flush the buffer cache records that query N-k was using, so
every query will typically see a completely cold cache for its
time slice. This is going to increase system resource usage and
generally result in poorer performance than executing the queries
sequentially.
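The cold cache effect can be illustrated with a small LRU simulation.
This is only a sketch under made-up assumptions: each query re-reads its
own working set of pages once per time slice, the cache is a plain LRU
that holds one working set but not all of them, and the sizes below are
arbitrary. It is not a model of the real buffer cache or OS I/O cache.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU page cache that counts misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.misses = 0

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)        # hit: mark most recently used
        else:
            self.misses += 1
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used


def run(schedule, working_set, capacity):
    """Run one pass over a query's working set per schedule entry."""
    cache = LRUCache(capacity)
    for query in schedule:
        for page in range(working_set):
            cache.access((query, page))
    return cache.misses


queries, passes, working_set, capacity = 3, 4, 100, 150

# Sequential: each query does all of its passes before the next one starts.
sequential = [q for q in range(queries) for _ in range(passes)]
# Time sliced: one pass per slice, rotating through the queries.
time_sliced = [q for _ in range(passes) for q in range(queries)]

print(run(sequential, working_set, capacity))   # 300
print(run(time_sliced, working_set, capacity))  # 1200
```

With these numbers the sequential run only misses on each query's first
pass, while the time-sliced run misses on every page of every slice,
because the other queries have evicted the pages in between.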
> Do these address the issues you identified?
>
>
>
I don't think so (despite the fact that it would be fun to program :-) ).
I think the suggestion for defining queues of statement workload is
sound, and the current queue design with its various limits (cost,
active count, +) should work well with it, and do pretty much what you
are wanting to achieve. We probably need to thrash around the whole user
-> queue vs statement -> queue association a bit more to see if we can
somehow get the best of both approaches! (I think Gavin hinted at this
too...)
Cheers
Mark