Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2753: Peeking via sync (take 2) #2753

Open
wants to merge 21 commits into
base: old_master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
301 changes: 301 additions & 0 deletions proposals/2753-peeking-via-sync-v2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,301 @@
# MSC2753: Peeking via Sync (Take 2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something this should also handle is supporting /event and /context into peeked rooms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that be accepted/handled automatically as long as a peek is active?


## Problem

[Room previews](https://matrix.org/docs/spec/client_server/r0.6.1#id116), more
commonly known as "peeking", has a number of usecases, such as:

* Look at a room before joining it, to preview it.
* Look at a user's profile room (see
[MSC1769](https://github.com/matrix-org/matrix-doc/issues/1769)).
* Browse the metadata or membership of a space (see
[MSC1772](https://github.com/matrix-org/matrix-doc/issues/1772)).
* Monitor [moderation policy lists](https://matrix.org/docs/spec/client_server/r0.6.1#moderation-policy-lists).

Currently, peeking relies on the deprecated v1 `/initialSync` and `/events`
APIs.

This poses the following issues:

* Servers and clients must implement two separate sets of event-syncing logic,
doubling complexity.
* Peeking a room involves setting a stream of long-lived `/events` requests
going. Having multiple streams is racey, competes for resources with the
`/sync` stream, and doesn't scale given each room requires a new `/events`
stream.
* `/initialSync` and `/events` are deprecated and not implemented on new
servers.

This proposal suggests a new API in which events in peeked rooms would be
returned over `/sync`.

richvdh marked this conversation as resolved.
Show resolved Hide resolved
## Proposal
richvdh marked this conversation as resolved.
Show resolved Hide resolved

Peeking into a room remains per-device: if the user has multiple devices, each
of which wants to peek into a given room, then each device must make a separate
request.

To help avoid situations where clients create peek requests and forget about
them, each peek request is given a lifetime by the server. The client must
*renew* the peek request before this lifetime expires. The server is free to
pick any lifetime.

### Starting a peek

We add an CS API called `/peek`, which starts peeking into a given
room. This is similar to
[`/join/{roomIdOrAlias}`](https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-join-roomidoralias)
but has a slightly different API shape.

For example:

```
POST /_matrix/client/r0/peek/{roomIdOrAlias} HTTP/1.1

{
"servers": [
"server1", "server2"
]
}
```

A successful response has the following format:

```
{
"room_id": "<resolved room id>",
"peek_expiry_ts": 1605534210000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expecting clients to maintain a clock and be NTP synced feels unnecessary. it's also inconsistent with the renewal_interval approach that #2444 takes. can't we just return a lifetime for the peek?

Copy link
Contributor

@neilalexander neilalexander Nov 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Don't we already expect clients to maintain a clock to keep TLS alive?)

Copy link
Member

@richvdh richvdh Nov 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just return a lifetime for the peek?

yes we can, and I was in somewhat two minds about it. renewal_interval? ttl for consistency with /voip/turnServer ? lifetime for consistency with m.call.invite ? <something>_ms for consistency with retry_after_ms ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Don't we already expect clients to maintain a clock to keep TLS alive?)

I'm not following this. Most client impls have nothing to do with TLS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following this. Most client impls have nothing to do with TLS.

I mean that if the client's clock is that far out of sync, that they aren't going to be able to set up a TLS connection to the CS API anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I see. Possibly, yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so can we make it a renewal_interval for consistency with #2444 (and to avoid people having to be NTP synced?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well I agree that it may as well be consistent with #2444, but we can change #2444 to match whatever we decide here.

renewal_interval implies a fixed interval, whereas the server might want to vary the lifetime depending on what else is going on. I'd prefer lifetime.

OTOH I don't feel that strongly and don't want to bikeshed it, so if you are keen on renewal_interval we can go with that.

}
```

The `servers` parameter is optional and, if present, gives a list of servers to
try to peek through.

richvdh marked this conversation as resolved.
Show resolved Hide resolved
XXX: should we limit this API to room IDs, and require clients to do a `GET
/_matrix/client/r0/directory/room/{roomAlias}` request if they have a room
alias? (In which case, `/_matrix/client/r0/room/{room_id}/peek` might be a
better name for it.) On the one hand: cleaner, simpler API. On the other: more
requests needed for each operation.
Comment on lines +74 to +78
Copy link
Member

@richvdh richvdh Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on this particularly welcome. (I in no way promise to follow the result, but feel free to express opinion with a reaction on this comment: 👍 for "require an ID", 👎 for "allow either ID or alias")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require an ID for me. If a client is trying to peek an alias, they will have likely already (or should have) called the directory API to confirm it's existence, IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want peeks to be as fast as possible, and adding in a roundtrip to first lookup the alias is completely unnecessary and so adds latency for no good reason. So i reckon we should keep it supporting aliases.

Copy link
Member

@richvdh richvdh Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a) if the peek is over federation, there are going to be multiple requests anyway, and it's going to be slow anyway, and having two API calls gives better feedback to the client as to which part failed. (b) As H-S says, in most cases we'll know the ID anyway (in particular #1772 uses IDs everywhere).

I'm about 80% in favour of requiring an ID but can see both arguments.


Both `room_id` and `peek_expiry_ts` are required in the
response. `peek_expiry_ts` gives a timestamp (milliseconds since the unix
epoch) when the server will *expire* the peek if the client does not renew it.

The server ratelimit requests to `/peek` and returns a 429 error with

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The server ratelimit requests to `/peek` and returns a 429 error with
The server should ratelimit requests to `/peek` and returns a 429 error with

`M_LIMIT_EXCEEDED` if the limit is exceeded.

Otherwise, the server first resolves the given room alias to a room ID, if
needed.

If there is already an active peek for the room in question, it is renewed and
a successful response is returned with the updated `peek_expiry_ts`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
a successful response is returned with the updated `peek_expiry_ts`.
a successful response is returned with an updated `peek_expiry_ts`.

"an" seems to imply that the client is providing the expiry which is not the case here.


If the user is already *joined* to the room in question, the server returns a
400 error with `M_BAD_STATE`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to have a specific error code here. Especially if an alias is accepted as the client may not be able that it is joined to the room already. In particular it seems that at least the resolved room ID should be returned.


If the room in question does not exist, the server returns a 404 error with
`M_NOT_FOUND`.

If the room does not allow peeking (ie, it does not have `history_visibility`
of `world_readable` <sup id="a1">[1](#f1)</sup>), the server returns a 403
error with `M_FORBIDDEN`.
Comment on lines +99 to +101
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could a rule be added where a user can peek the room if they have been invited to that room? (If history_visibility is shared or invited)

This would allow "previewing the room before joining it", i.e. "do you want to join this room" with some context.

This would also allow clients to implement "accept/reject DM request", and display DMs in a similar manner as they do in many IM clients today; by adding the DM to the DM list as any other DM.

Such a rule could probably synergize with MSC2199 and MSC2444, providing users more context about the rooms/DMs they're invited to, and be able to accept/reject with more information at hand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mentioned, but it would be better to explicitly state under what conditions a user can peek into a given room.


Otherwise, the server starts a peek for the calling device into the given room,
and returns a 200 response as above.

When a peek first starts, the current state of the room is returned in the
`peek` section of the next `/sync` response.

### Stopping a peek

To stop peeking, the client calls `rooms/<id>/unpeek`:

```
POST /_matrix/client/r0/rooms/{room_id}/unpeek HTTP/1.1

{}
```

The body must be a JSON dictionary, but no parameters are specified.

A successful response has an empty body.

If the room is unknown or was not previously being peeked the server returns a
400 error with `M_BAD_STATE`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leads to an unhappy situation where a delayed request may fail because the peek has expired. IIUC this would be hard for for the client to handle gracefully in any other way then just assuming success. Since this is the case why do we differentiate here? Why not make this return success in any case where the client is not peeking when the response is returned. AKA why not make this idempotent.


### `/sync` response

We add a new `peek` section to the `rooms` field of the `/sync`
response. `peek` has the same shape as `join`.

While a peek into a given room is active, any new events in the room cause that
room to be included in the `peek` section (but only for the device with the
active peek). When a peek first starts, the entire state of the room is
included in the same way as when a room is first joined.

If the client requests lazy-loading via `lazy_load_members`, then redundant
membership events are excluded in the same way as they are for joined rooms.

If a user subsequently `/join`s a room they are peeking, the room will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone /peeks, they'll see the room in the peek section and eventually the leave section.
If someone /peeks, then eventually joins the room, it will further appear in the join section.
Now, what if we have these sequence of events.

  • Device A /peeks into a room.
  • Device A /syncs once and gets the (initial) state of the peeked room.
  • Device B /joins the room that Device A is peeking.
  • Device B leaves the room almost immediately.
  • Device A /syncs again ... and gets what?

According to this paragraph, it's reasonable to infer that Device A will see the room appear in the leave section, since the user has left the room but this client has some state that needs updating.

Does this mean my (Device A) device specific /peek has been cancelled by this other random device?
Sure, I could /peek again but room state can be expensive to get. I already have fairly fresh state locally that just needs an incremental update.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other question is what is the desired UX. I would expect that if I was peeking a room in the context of considering joining I would expect that joining on another device transitioned the view into the room itself as a member. In that case leaving the room should bring me to whatever the "left room" experience is.

However there are other cases for peeking such as spaces or "X as rooms" (such as "profiles as rooms") where you would not want to be unpeeked based on joining the space. However this likely applies for both this device and another device joining the space.

It seems that the best option is the the room is left in the peek list if it is left, whether on this client or another one. Then this client can decide if it still needs that peek (for example it is peeking the space for membership info) or if it is no longer necessary (for example it was for a room preview, and after joining and leaving the user should now be in a "left room" UX where no new info is required).

thenceforth appear in the `join` section instead of `peek`. For devices which
were already peeking into the room, the server should *not* include the entire
room state for the room in the `/sync` response, allowing the client to build
on the state and history it has already received without re-sending it down
`/sync`.

When a room stops being peeked (either because the client called `/unpeek` or
because the server timed out the peek), the room will be included in the
`leave` section of the `/sync` response, including any events that occured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm peeking into a room, then someone invites me to it, then rescinds the invitation, has my peek been cancelled?
How do I differentiate between my invite being rescinded and my peek ending?

Also, if my peek ended, I have to start again, etc. More on this in the previous comment.

between the previous `/sync` and the the peek ending. If there are no such
events, the room's entry in the `leave` section will be empty.

For example:

```js
{
"rooms": {
"join": { /* ... */ },
"leave": {
"!unpeeked:example.org": {
"timeline": {
"events": [
{ "type": "m.room.message", "content": {"body": "just one more thing"}}
]
}
},
"!alsounpeeked:example.com": {}
}
}
}
```

## Encrypted rooms

(this para taken from MSC #2444):

It is considered a feature that you cannot peek into encrypted rooms, given
the act of peeking would leak the identity of the peeker to the joined users
in the room (as they'd need to encrypt for the peeker). This also feels
acceptable given there is little point in encrypting something intended to be
world-readable.

## Future extensions

* "snapshot" API, for a one-time peek operation which returns the current
state of the room without adding the room to future `/sync` responses. Might
be useful for certain usecases (eg, looking at a user's public profile)?
Comment on lines +184 to +186
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and for things like matrix-static which do not want to maintain /sync and actively peek over a large set of rooms.


* "bulk peek" API, for peeking into many rooms at once. Might be useful for
flair (which requires peeking into lots of users' profile rooms), though
realistically that usecase will need server-side support.

* "cross-device" peeks could be useful for microblogging etc?

## Potential issues
richvdh marked this conversation as resolved.
Show resolved Hide resolved

* Expiring peeks might be hard for clients to manage?

## Alternatives

### Keep /peek closer to /join

Given that peeking has parallels to joining, it might be preferable to keep the
API closer to `/join`. On the other hand, they probabably aren't actually
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

Suggested change
API closer to `/join`. On the other hand, they probabably aren't actually
API closer to `/join`. On the other hand, they probably aren't actually

similar enough to make it worth propagating the oddities of `/join` (in
particular, the use of query-parameters
([matrix-doc#2864](https://github.com/matrix-org/matrix-doc/issues/2864)).

### Filter-based API

[MSC1776](https://github.com/matrix-org/matrix-doc/pull/1776) defined an
alternative approach, where you could use filters to add peeked rooms into a
given `/sync` response as needed. This however had some issues:

* You can't specify what servers to peek remote rooms via.
* You can't identify rooms via alias, only ID
* It feels hacky to bodge peeked rooms into the `join` block of a given
`/sync` response
* Fiddling around with custom filters feels clunky relative to just calling a
`/peek` endpoint similar to `/join`.

### Use the `join` block

It could be seen as controversial to add another new block to the `/sync`
response. We could use the existing `join` block, but:

* it's a misnomer (given the user hasn't joined the rooms)
* `join` refers to rooms which the *user* is in, rather than that they are
peeking into using a given *device*
* we risk breaking clients who aren't aware of the new style of peeking.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surely not given that it would still be per-device and those clients wouldn't be calling the new /peek API

* there's already a precedent for per-device blocks in the sync response (for
to-device messages)

### Per-account peeks

It could be seen as controversial to make peeking a per-device rather than
per-user feature. When thinking through use cases for peeking, however:

1. Peeking a chatroom before joining it is the common case, and is definitely
per-device - you would not expect peeked rooms to randomly pop up on other
devices, or consume their bandwidth.
2. [MSC1769](https://github.com/matrix-org/matrix-doc/pull/1769) (Profiles as
rooms) is also per device: if a given client wants to look at the Member
Info for a given user in a room, it shouldn't pollute the others with that
data.
3. [MSC1772](https://github.com/matrix-org/matrix-doc/pull/1772) (Groups as
rooms) uses room joins to indicate your own membership, and peeks to query
the group membership of other users. Similarly to profiles, it's not clear
that this should be per-user rather than per-device (and worse case, it's a
matter of effectively opting in rather than trying to filter out peeks you
don't care about).

### Alternatives to expiring peeks

Having servers expire peek requests could be fiddly, so we considered a number
of alternatives:

* Allow peeks to stack up without limit and trust that clients will not forget
about them: after all, it is in clients' best interest not to leak
resources, to reduce the amount of data to be handled, and it is not obvious
that leaking peeks is easier than leaking joins.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client cost to "forgetting" a peek is very low. However the server may need to do some work (if /sync is poll-based for all rooms). I think the only client cost is if you are offline for a while and get a huge number of events before you get the chance to unpeek a room you no longer need.

One option would be to return all peeked rooms in the sync response even if they have no events. The upside is that you constantly (or periodically) remind the clients what they are peeking but the downside is some extra transfer and parsing (although this can be made arbitrarily low by only sending a "reminder" for each room occasionally).


Ultimately this does not align with our experience of administering
`matrix.org`: it seems that where a resource *can* be leaked, it ultimately
will be, and it is better to design the API to prevent it.

* Limit the number of peeks that can be active at once, to force clients to be
fastidious in their peek cleanups. However, it is hard to see what a good
limit would be. Furthermore: peeks could be lost through no fault of the
client (for example: when a `/peek` request succeeds but the client
does not receive the response), and these leaked peaks could stack up until
peeking becomes inoperative.

* Automatically clear active peeks when a `/sync` request is made without a
`since` parameter. However, this feels like magic at a distance, and also
means that if you initial-sync separately (e.g. you stole an access token
from the DB to manually debug something) then existing clients will be
broken.

* Have the client resubmit the list of active peeks every time it wants to add
or remove one. This could amount to a sigificant quantity of data.

## Security considerations

Servers should ratelimit calls to `/peek` to stop someone DoSing the
server.

## Unstable prefix

The following mapping will be used for identifiers in this MSC during
development:

Proposed final identifier | Purpose | Development identifier
------------------------------- | ------- | ----
`/_matrix/client/r0/peek` | API endpoint | `/_matrix/client/unstable/org.matrix.msc2753/peek`
`/_matrix/client/r0/rooms/{roomId}/unpeek` | API endpoint | `/_matrix/client/unstable/org.matrix.msc2753/rooms/{roomId}/unpeek`

## Footnotes

<a id="f1"/>[1]: `join_rules` do not affect peekability - it's
possible to have an invite-only room which joe public can still peek into, if
`history_visibility` has been set to `world_readable`.[↩](#a1)