Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partially-stated room that we fail to join updates current state and indicates that it's in the room #17602

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Aug 23, 2024

(This is more bug report than PR, please someone else take this on)

Partially-stated room that we fail to join, updates current state and indicates that it's in the room (delta_state.no_longer_in_room = False).

Test to reproduce:

SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_federation.PartialJoinTestCase.test_failed_partial_join_is_clean

In _trial_temp/test.log, we can observe:

asdf _calculate_state_delta existing_state {}
asdf _calculate_state_delta current_state immutabledict({})
asdf state_delta: DeltaState(to_delete=[], to_insert={}, no_longer_in_room=False)

Wtf? no_longer_in_room=False seems to indicate that they currently in the room 😖. There is no current_state_events for this room so this doesn't line up.

This behavior was first introduced in matrix-org/synapse#13403 and the only reason it doesn't end up calling _update_current_state_txn(...) now is because #17215 added some if state_delta.is_noop(): protection (for something unrelated AFAICT).

It's very confusing for a partially stated room to come through this flow indicating that it is still in the room (delta_state.no_longer_in_room = False).

Normally, no_longer_in_room is set at

if delta_ids is not None:
# If there is a delta we know that we've
# only added or replaced state, never
# removed keys entirely.
delta = DeltaState([], delta_ids)
elif current_state is not None:
with Measure(self._clock, "persist_events.calculate_state_delta"):
delta = await self._calculate_state_delta(room_id, current_state)
if delta:
# If we have a change of state then lets check
# whether we're actually still a member of the room,
# or if our last user left. If we're no longer in
# the room then we delete the current state and
# extremities.
is_still_joined = await self._is_server_still_joined(
room_id,
ev_ctx_rm,
delta,
)
if not is_still_joined:
logger.info("Server no longer in room %s", room_id)
delta.no_longer_in_room = True

But there is a different flow that the partial-stated rooms use that doesn't bother setting delta_state.no_longer_in_room at all,

async def _update_current_state(
self, room_id: str, _task: _UpdateCurrentStateTask
) -> None:
"""Callback for the _event_persist_queue
Recalculates the current state for a room, and persists it.
"""
state = await self._calculate_current_state(room_id)
delta = await self._calculate_state_delta(room_id, state)
await self.persist_events_store.update_current_state(room_id, delta)

Seems like we should update this flow to accurately say delta_state.no_longer_in_room = True and make sure any current state events are cleared out. Although, I'm not sure current_state_events are ever added for a partially-stated room in this case.

Overall, it's very easy to overlook this kind of thing because we default no_longer_in_room to False so you don't need to provide it when creating a DeltaState(...).


Found while working on #17512

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct
    (run the linters)

…er_in_room: False`

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_federation.PartialJoinTestCase.test_failed_partial_join_is_clean
```
@MadLittleMods
Copy link
Contributor Author

MadLittleMods commented Aug 27, 2024

Closing as this should ideally be fixed but we can avoid the problem with a simple check

# If no state is changing, we don't need to do anything. This can happen when a
# partial-stated room is re-syncing the current state.
if not to_insert and not to_delete:
    # Return early
    return

And there is already a test to exercise that you're handling this scenario (tests.handlers.test_federation.PartialJoinTestCase.test_failed_partial_join_is_clean).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant