
Figure out how to do stream positions in the syncserver #211

Closed

erikjohnston opened this issue Sep 6, 2017 · 4 comments

Comments

@erikjohnston
Member

erikjohnston commented Sep 6, 2017

Right now it's a hack, so we should look into getting something proper in place before we build too much on top of it.

Do we want to allow users to switch from one instance of the sync server to another? This would allow us to auto fail over if one of the sync servers died, but would entail having stream positions be globally defined, rather than specific to a particular instance.

We also need to figure out how to do this efficiently with respect to database design.

@NegativeMjark
Contributor

For more context, the current syncserver uses a single auto-incrementing integer to represent the position a client is at in the stream of events. This integer is incremented whenever it receives a message from Kafka.

However, if we ran multiple syncserver instances, they could receive the messages from Kafka in a different order. (Kafka guarantees the order within a partition of a topic, but not the order between partitions.) This would result in the different instances assigning different client API stream positions to the same message.

So it would be impossible for a client to switch which server it was querying, because the new server would not be able to interpret the client's stream position.
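To make the divergence concrete, here is a minimal sketch in Go of the per-instance counter described above (not the actual syncserver code; the naiveSyncServer type and onKafkaMessage function are hypothetical names), showing how two instances that interleave partitions differently assign different positions to the same message:

```go
package main

import "fmt"

// naiveSyncServer models the current scheme: each instance keeps its own
// auto-incrementing counter and assigns the next value to whatever message
// it happens to receive from Kafka.
type naiveSyncServer struct {
	pos int64 // local, per-instance stream position
}

// onKafkaMessage assigns the next local position to the incoming message.
// Two instances consuming the same topics can interleave partitions
// differently, so the same message may get a different position on each.
func (s *naiveSyncServer) onKafkaMessage(msgID string) int64 {
	s.pos++
	fmt.Printf("assigned position %d to %s\n", s.pos, msgID)
	return s.pos
}

func main() {
	a, b := &naiveSyncServer{}, &naiveSyncServer{}
	// Instance A sees partition 0 first; instance B sees partition 1 first.
	a.onKafkaMessage("p0/evt1") // position 1 on A
	a.onKafkaMessage("p1/evt1") // position 2 on A
	b.onKafkaMessage("p1/evt1") // position 1 on B -- same message, different position
	b.onKafkaMessage("p0/evt1") // position 2 on B
}
```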

@erikjohnston
Member Author

> However, if we ran multiple syncserver instances, they could receive the messages from Kafka in a different order.

Presumably this is also applicable if we partitioned the room server?

@kegsay
Member

kegsay commented Aug 26, 2020

As Erik points out, this is not unique to sharded syncservers, but also applies to sharded roomservers. This was considered when device keys were added, whereby I added the format $topic_id-$partition-$offset to sync tokens, e.g. dl-0-1452, with the intention that:

  • Sharded key servers would write to different partitions, resulting in extra positions in the token, e.g. dl-0-1452.dl-1-1322,dl-2-784.
  • The token's IsAfter function returns true if any offset is higher (or if there are additional partitions).

This guarantees that we observe all updates from all sharded upstream components, but does nothing to guarantee ordering. That is a larger piece of work, fundamentally around graph linearisation -- matrix-org/gomatrixserverlib#187. Assuming we do things "properly", we will have a deterministic algorithm to linearise the DAG, meaning the ordering we observe from Kafka is irrelevant from the syncapi's point of view. Using a similar technique for things outside the room DAG (e.g. key updates) would probably work well, but ordering for key updates is less important than just being told about them in the first place.

This does mean that linearisation is not fixed and is instead fluid depending on subsequent messages, making it harder to pre-calculate and optimise.
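For illustration, a rough sketch in Go (with hypothetical type and field names, not Dendrite's real sync token code) of a token carrying one $topic_id-$partition-$offset position per partition, with an IsAfter that behaves as described above:

```go
package main

import "fmt"

// logPosition records how far we have read in one Kafka partition of a topic.
type logPosition struct {
	topic     string
	partition int32
	offset    int64
}

// syncToken carries one position per partition, mirroring the serialised
// form dl-0-1452.dl-1-1322 with one triple per partition.
type syncToken struct {
	positions map[int32]logPosition
}

// IsAfter reports whether t has progressed past other: a higher offset in any
// partition, or a partition present in t that other has not seen, counts.
func (t syncToken) IsAfter(other syncToken) bool {
	for part, pos := range t.positions {
		prev, seen := other.positions[part]
		if !seen || pos.offset > prev.offset {
			return true
		}
	}
	return false
}

func main() {
	old := syncToken{positions: map[int32]logPosition{
		0: {topic: "dl", partition: 0, offset: 1452},
	}}
	// After a sharded key server has also written to a second partition.
	cur := syncToken{positions: map[int32]logPosition{
		0: {topic: "dl", partition: 0, offset: 1452},
		1: {topic: "dl", partition: 1, offset: 1322},
	}}
	fmt.Println(cur.IsAfter(old)) // true: an extra partition appeared
	fmt.Println(old.IsAfter(cur)) // false: nothing in old is ahead
}
```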

@neilalexander any thoughts?

@kegsay
Member

kegsay commented Dec 5, 2022

Years later, Dendrite is not focused on providing sharded syncapis at present. When we do, we will likely pin users to a specific syncapi instance, similar to Synapse. In a sliding sync future, this becomes much easier as there is no long-term positional information being kept on the client.

@kegsay kegsay closed this as completed Dec 5, 2022