[Skytools-users] Cooperative Consumer

Marko Kreen markokr at gmail.com
Tue Jul 29 11:08:17 UTC 2008


On 7/24/08, Dimitri Fontaine <dim at tapoueh.org> wrote:
>  As I understand it, pick open batch for consumer means fetching the non-null
>  pgq.subscription.sub_batch value, and if none exist asking for the
>  next_batch() of the 'master' consumer, update pgq.subscription for the
>  consumer and returning this.
>
>  I'm not sure about the absence of open batches for the master consumer. I
>  think it means I should finish_batch() as soon as it's copied over to 'real'
>  consumer, but that means:
>   - pgq_coop have to fetch another id from pgq.batch_id_seq
>   - batch is already finished for master as soon as a consumer begins its work

You could copy the values over, then "close" the master's batch by hand.
The 'finish_batch' logic is quite simple.

>  I'd still want a way for a consumer to abort gracefully a batch, and I think
>  automatically make events available to other consumers is the way to go.
>
>  We could have pgq_coop.release_batch() mark all batch events to get retried
>  later, that's simple.

I think it would be a good idea to have working retry events anyway,
regardless of how release_batch() is handled.  But the problem is - they
are per-consumer, so that a strict batch-based consumer and a relaxed one
that uses retry events can coexist.  And I would not like to lose that
property.

After some thinking I now have a slightly ugly (but small) hack for
approaching this - we need to make pgq.subscription.sub_id non-unique,
so that all consumers in one coop group share the same id.  That way we
can attach events to the common id, so all consumers in the group can see
them.  If there are any places in pgq that expect a unique sub_id, they
need to be rewritten to use (sub_queue, sub_consumer), as those stay
unique.  There should not be many of them.
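
A minimal sketch of the shared-id idea, as an in-memory toy model rather
than actual pgq code.  The field names sub_id, sub_queue and sub_consumer
come from the discussion above; the SubscriptionTable class, its methods
and the retry-event storage are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    sub_queue: str
    sub_consumer: str
    sub_id: int  # NOT unique: shared by every member of one coop group


class SubscriptionTable:
    """Toy stand-in for pgq.subscription plus retry events (hypothetical)."""

    def __init__(self):
        self._rows = {}   # (sub_queue, sub_consumer) -> Subscription
        self._retry = {}  # sub_id -> list of retry event ids

    def subscribe(self, queue, consumer, sub_id):
        key = (queue, consumer)
        # (sub_queue, sub_consumer) is the key that stays unique.
        if key in self._rows:
            raise ValueError("duplicate (sub_queue, sub_consumer): %r" % (key,))
        self._rows[key] = Subscription(queue, consumer, sub_id)

    def add_retry_event(self, queue, consumer, ev_id):
        # The event is attached to the sub_id, not to the consumer name.
        sub = self._rows[(queue, consumer)]
        self._retry.setdefault(sub.sub_id, []).append(ev_id)

    def retry_events_for(self, queue, consumer):
        # Every consumer with the same sub_id sees the same retry events.
        sub = self._rows[(queue, consumer)]
        return list(self._retry.get(sub.sub_id, []))


table = SubscriptionTable()
table.subscribe("q", "grp.c1", 7)   # two coop consumers share sub_id 7
table.subscribe("q", "grp.c2", 7)
table.subscribe("q", "strict", 8)   # an ordinary consumer keeps its own id
table.add_retry_event("q", "grp.c1", 101)
```

In this model an event put up for retry by grp.c1 is visible to grp.c2,
while the unrelated 'strict' consumer still has its own private retry set,
so the per-consumer property is kept outside the group.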

Although it's a small change, it's slightly dangerous, so I think we should
cut a 2.2.x branch for the pgq_coop work.  Then 2.1.8 will be the last
'feature' release in the 2.1 branch; after that it will be bugfixes-only.

>  Or pgq_coop could keep its own subscription_aborted table where to track such
>  case (filled at pgq_coop.release_batch() time), which pgq_coop.next_batch()
>  would take into account. This way aborted batches from any consumer of the
>  group is available for any other consumer.

This could work, but note that you cannot just store the batch_id there:
either you copy the tick_ids too, or you let the batch stay 'open' in
pgq.subscription under the same consumer.  I prefer the latter, because
the former hides the batch from core pgq.

Both ways could work, but as this is supposed to be a rare event, I would
prefer to avoid introducing a separate subsystem for it.
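
A rough sketch of the "leave the batch open" variant, again as a toy model
and not as pgq code.  Only sub_batch is a name from the discussion above;
the tick_from / tick_to fields, the 'released' flag and both functions are
hypothetical.  The point it illustrates is that a batch is the id plus its
tick range, so all of it has to move together.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SubRow:
    sub_batch: Optional[int] = None   # id of the open batch, if any
    tick_from: Optional[int] = None   # hypothetical: tick range of the batch
    tick_to: Optional[int] = None
    released: bool = False            # hypothetical: set by release_batch()


def release_batch(rows: Dict[str, SubRow], consumer: str) -> None:
    """Give up the batch but leave it 'open' in the consumer's own row."""
    row = rows[consumer]
    if row.sub_batch is None:
        raise ValueError("consumer %r has no open batch" % consumer)
    row.released = True


def take_released_batch(rows: Dict[str, SubRow], consumer: str) -> Optional[int]:
    """Let another group member adopt a released batch, ticks included."""
    me = rows[consumer]
    if me.sub_batch is not None:
        return me.sub_batch  # already working on a batch
    for name, other in rows.items():
        if name != consumer and other.released and other.sub_batch is not None:
            # Move the batch id together with its tick range.
            me.sub_batch = other.sub_batch
            me.tick_from, me.tick_to = other.tick_from, other.tick_to
            other.sub_batch = other.tick_from = other.tick_to = None
            other.released = False
            return me.sub_batch
    return None  # nothing released; caller would ask for a fresh batch
```

Until somebody adopts it, the released batch is still an ordinary open
batch in the subscription row, so nothing is hidden from the rest of the
system in this model.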

Btw - to make your life simpler, maybe you should ignore both retry
events and .release_batch() in the first version and just concentrate on
getting the basic API to work.  We can work on the hard parts later.

>  And maybe we just need a cooperative group name rather than a master consumer,
>  so we don't ever copy batches around between master and coop consumer.

Yes.   Simply 'group' and 'consumer'?  I'm open to better suggestions.

But one thing I would like to see - a plain 'pgqadm status' should show
that there is a group of consumers.  Multi-part names would be simplest:

  <group>                     (as the 'master' consumer)
  <group>.<consumer1>         (coop consumer)
  <group>.<consumer2>
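
As an illustration of how a status listing could recover the grouping from
such names, here is a small sketch.  Both helper functions are
hypothetical and not part of pgqadm; the sketch also assumes that a dot
appears in a consumer name only as the group separator.

```python
def split_consumer_name(name):
    """Split '<group>.<consumer>' into (group, consumer).

    A name without a dot is treated as the group row itself and
    yields (name, None).
    """
    group, sep, member = name.partition(".")
    return (group, member if sep else None)


def group_consumers(names):
    """Map each group name to the list of its coop consumers."""
    groups = {}
    for name in names:
        group, member = split_consumer_name(name)
        members = groups.setdefault(group, [])
        if member is not None:
            members.append(member)
    return groups
```

For example, group_consumers(["grp", "grp.c1", "grp.c2"]) collects c1 and
c2 under the group 'grp', which is what a status display would print as
one group with two members.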

-- 
marko
