Emit event standard for games #373

SebastienGllmt opened this issue May 27, 2024

To avoid having to create custom indexers for every behavior, smart contract chains typically have an event standard, so that dApps can emit events in a way that every indexer and tool understands

Examples:

We should implement the same functionality in Paima for the same reason

Blockers

How this could work

  1. Games have a new folder (similar to /api) called /events
  2. Devs specify events their game can emit in this folder by using typebox or typia (see alternatives for tradeoffs)
  3. Specify both event types and the object types usable inside events. This includes a way for devs to specify which fields in a type are indexed (we need to ensure this is persisted to the json schema as well). Ex:
import { Type } from '@sinclair/typebox';

// custom type example
const Temperature = Type.Object({ sensor: Type.String(), temperature: Type.Number() });

// custom event example: an ABI-style descriptor referencing typebox schemas for parameter types
const TemperatureChange = {
    name: "TemperatureChange",
    inputs: [
        {
            indexed: false,
            name: "old",
            type: Temperature,
        },
        {
            indexed: false,
            name: "new",
            type: Temperature,
        },
    ],
};

export default {
    types: {
        Temperature,
    },
    events: {
        TemperatureChange,
    },
};
  4. Paima Engine, when compiling a project, generates 2 outputs:
    1. (lower priority) Json schema definitions and places the definition in the packaged folder (with all the other Paima Engine build outputs)
      1. (in the typebox case) fs.writeFileSync('user-schema.json', JSON.stringify(Type.Strict(typeboxObject), null, 2)); (we'll need something like this that aggregates all the different objects defined into a single file probably)
      2. (in the typia case) https://typia.io/docs/json/schema/
    2. (higher priority) a JS file for Paima Engine to load later (similar to how we load gameCode)
  5. Events and custom types should be enforced to be immutable by Paima Engine (you cannot change or delete old types). This can be done by storing the hash of the json-schema of these custom types in the Paima Engine DB and checking if the hashes match on startup
    1. For modification:
      1. You can't change custom types because this would break indexers (see section on topic name generation) as the shortname notation would no longer represent the same type (i.e. TemperatureChange(Temperature, Temperature) would change meaning if under-the-hood you changed the Temperature definition)
      2. There is no concept of changing an event (since they are overloadable). You would essentially just be deleting one topic and creating a new one (that happens to have the same name)
    2. For deletion: this would just make it pointlessly hard to process historical data (note: you can define a type as deprecated in json-schema)
  6. Paima Engine has a new event table that is just a <id: bigint, topic: string, address: varchar, data: jsonb, block_height: bigint, tx: int, idx: int> type (note: one block can contain many STF calls which each may trigger multiple logs. For explorer purposes, we'll want to know which STF in which block created which log)
    1. id: an incrementing primary key in the table
    2. topic: a hash uniquely representing the event type. See below section for how this hash is calculated
    3. address: the address that initiated this event (see compatibility section below for how we can set this field)
    4. data: json-encoded data that was emitted
    5. block_height: block height in the rollup where this event was emitted
    6. tx: transaction index inside the block where this event was emitted (see compatibility section about transaction hashes. We may want to add the hash here as well, but maybe not)
    7. idx: the position of this event within the transaction (some transactions can trigger multiple events). This is meant to help with pagination, but maybe it's not necessary.
  7. When Paima Engine launches, it creates database indices based on the indexed property. Ex: CREATE INDEX topic_fieldName ON event (topic, (data->>'fieldName')). Some tricky things to consider (a table/index sketch follows after this list):
    1. I'm not sure if Postgres allows you to create indices on complex data types (i.e. nested JSON objects). The way this works in EVM is that it creates an index on the hash of the object type instead and we can use the same fallback if needed
    2. We need to handle adding/removing indices since games can release new versions that add/remove which fields in an event are indexed. Note that, in EVM, indexed isn't part of the topic hash generation (presumably partially to help facilitate this)
    3. Note: we don't need to handle type definitions changing, as type definition changes generate different topics
  8. Currently, GameStateTransitionFunction returns SQLUpdate[]. This needs to be updated to { stateTransitions: SQLUpdate[], events: { topic: string, address: string, data: string }[] } (we should be able to extract which fields are indexed based on the topic+data given the engine can access the events json-schema, but if this is too complex we can consider adding the indexed fields to the payload)
  9. Create a websocket server in Paima Engine on startup using MQTT.js (see alternatives for other options)
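
As a rough illustration of items 6 and 7 above, here is a minimal sketch of the event table and one per-indexed-field expression index. The column names mirror the tuple above, but the table layout and index naming are assumptions, not a final schema:

// sketch: schema setup Paima Engine could run on startup (table/index names are placeholders)
import { Pool } from 'pg';

const pool = new Pool(); // connection settings come from the usual PG* env variables

async function ensureEventSchema(): Promise<void> {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS event (
      id           BIGSERIAL PRIMARY KEY, -- incrementing log id
      topic        TEXT    NOT NULL,      -- hash uniquely identifying the event type
      address      VARCHAR NOT NULL,      -- address that initiated the event
      data         JSONB   NOT NULL,      -- json-encoded payload
      block_height BIGINT  NOT NULL,
      tx           INT     NOT NULL,      -- tx index within the block
      idx          INT     NOT NULL       -- event index within the tx
    );
  `);

  // one expression index per (topic, indexed field) pair derived from the json-schema;
  // the extra parentheses around data->>'old' are required by Postgres for expression indexes
  await pool.query(`
    CREATE INDEX IF NOT EXISTS event_topic_old
      ON event (topic, (data->>'old'));
  `);
}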

(for websockets)
10. When games emit events, they all get added to a queue (not published via websockets right away)
11. When the SQL command of processing a block is done (processSyncBlockData), we iterate over all events emitted in the block and emit them using MQTT. Note: this has to be done in a way where Paima Engine shutting down after a block is processed (SQL txs written to database) and before events are emitted does not lead to missed events (possibly by keeping track of the most recent log ID successfully broadcast to clients)
12. When game frontends want to listen to events, they subscribe via MQTT (they can import the types from their events/ folder and decode the data using Value.Decode for typebox or is for typia). A sketch of both sides follows below.
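
As referenced in item 12, here is a minimal sketch of both sides of the websocket flow, assuming MQTT over websockets. The broker URL, topic layout, and payload shape are assumptions for illustration only:

// sketch: engine publishes queued events once a block's SQL tx has committed; a frontend decodes them
import mqtt from 'mqtt';
import { Type } from '@sinclair/typebox';
import { Value } from '@sinclair/typebox/value';

// same Temperature schema as in the /events example above
const Temperature = Type.Object({ sensor: Type.String(), temperature: Type.Number() });

// --- engine side (called after processSyncBlockData commits) ---
const publisher = mqtt.connect('ws://localhost:8883');

interface QueuedEvent { logId: number; topic: string; address: string; data: object; }

function flushEventQueue(queue: QueuedEvent[]): void {
  for (const event of queue) {
    publisher.publish(
      `logs/${event.topic}/addr/${event.address}`,
      JSON.stringify({ logId: event.logId, data: event.data })
    );
  }
}

// --- frontend side ---
const client = mqtt.connect('ws://localhost:8883');
client.subscribe('logs/+/addr/#');
client.on('message', (_topic, payload) => {
  const { data } = JSON.parse(payload.toString());
  // Value.Decode throws if the payload does not match the schema
  const oldReading = Value.Decode(Temperature, data.old);
  console.log('old temperature reading:', oldReading);
});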

(for rest API)
13. Paima Engine exposes a getLogs endpoint that accepts (fromBlock, toBlock). Topics are specified as part of the URL using either URL path parameters or MQTT topic syntax (for consistency with the WebSocket implementation)
14. Frontends can use https://github.com/erfanium/fetch-typebox to both fetch events and decode them properly at once (a sketch follows below)
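
As referenced in item 14, a sketch of what consuming the getLogs endpoint could look like with plain fetch + typebox (fetch-typebox could combine the two steps). The endpoint path, query parameters, and response shape are assumptions:

// sketch: fetch historical logs over REST and validate them against a typebox schema
import { Type } from '@sinclair/typebox';
import { Value } from '@sinclair/typebox/value';

const Log = Type.Object({
  topic: Type.String(),
  address: Type.String(),
  data: Type.Unknown(),
  blockHeight: Type.Number(),
  tx: Type.Number(),
  idx: Type.Number(),
});
const LogsResponse = Type.Array(Log);

async function getLogs(baseUrl: string, fromBlock: number, toBlock: number) {
  const res = await fetch(`${baseUrl}/getLogs?fromBlock=${fromBlock}&toBlock=${toBlock}`);
  const body = await res.json();
  // throws if the node returns something that doesn't match the expected shape
  return Value.Decode(LogsResponse, body);
}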

Alternative ideas:

  1. Instead of using typebox, we could allow users to write Typescript types directly using typia. It's a more typescript-native solution, but has a few issues:
    1. It depends on typescript internals, so typescript updates can cause it to fail (makes updating typescript versions harder)
    2. It only works properly with the standard tsc compiler (i.e. things like esbuild, webpack have minimal support). We can mitigate the impact of this because events/ is in a separate folder and so it can have a separate tsc build system if needed, but this adds compiler setup complexity
    3. It's a bit less json-schema friendly (easier to accidentally write something not supported by json-schema). Although it still has support for the core features we need (ex: here)
    4. Although it uses Typescript natively, it still requires a certain amount of wizard syntax knowledge to use properly (it only really helps for the simplest use-cases)
  2. Instead of broadcasting in processSyncBlockData, it could be a separate process/thread (possibly with a listener list like we did with wallet_connect_change). This would avoid the main block syncing slowing down if there are many users to send updates to. If we go this direction, we'd need a broker (ex: Mosquitto for MQTT, or EMQX which is much more general but also supports MQTT), which adds extra setup complexity (more things to run when you need to run Paima Engine, more things to try to reset when you reset a chain in developer environments, etc.) so this may end up being overkill. If we end up using something like EMQX as a broker, we'd have to decide if we want logs to be stored in Postgres at all (maybe only store them in the broker)
  3. There are a few options for libraries to use:
    1. socket.io: this doesn't follow the websocket standard so it could become tedious to work with. It supports listening to sub-topics through a regex system (io.of(/^\/dynamic-\d+$/))
    2. ws: really barebones (doesn't handle reconnection, etc.). We can manually implement a subscription system even though it doesn't have one.
    3. faye, which is a pub-sub system based on websockets. It supports sub-topic broadcasts using wildcard notation (docs) and can run as a separate process if needed. Again, it requires running custom stuff and it won't support guaranteed delivery / guaranteed order without extensions.
    4. MQTT.js: this is more general pub-sub. It uses websockets under the hood, but requires clients to run a special MQTT client software to connect with it. Unfortunately, MQTT only preserves ordering in the optimistic case (not if messages are dropped and then resent)
    5. hyper-express: supports MQTT syntax subscriptions, but it only supports QoS 0 and its support is deprecated by its underlying uWebSockets library

Event listening system requirements?

Properties we would need from an event listener system:

  1. You can subscribe to topics including a subset of sub-topics (not just a single namespace)
  2. Guaranteed delivery
  3. Messages for a specific listener have to preserve ordering
  4. (ideally) Listeners across multiple topics preserve ordering

As you can see, achieving all of these while maintaining good performance is hard. For example, let's start by assuming we have a basic event system where you have:

  • no guarantee messages won't get missed (ex: user internet going offline may cause them to miss a message)
  • no guarantee messages appear in the same order the server sent them
  1. Fixing guaranteed delivery: you can change some settings to guarantee a message won't be missed, but it works by receiving an ack from users confirming they received the message (doubling the communication overhead)
  2. Fixing guaranteed order: guaranteed order is hard because imagine the server
  • sends msg 1
  • then sends msg 2
  • then it realizes msg 1 got dropped so it resends msg 1
    Now, from the client's perspective, it received msg 2 before msg 1. To fix the order, you have to throttle to only send 1 msg at a time (i.e. don't send msg 2 until the msg 1 ack is received) or do some rearranging at the client level like TCP does, but this has a performance implication (a minimal client-side reordering sketch follows below)
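
For illustration, a minimal sketch of that client-level rearranging, assuming every message carries a monotonically increasing sequence number (here a hypothetical logId field):

// sketch: buffer out-of-order messages and deliver them strictly in logId order
class OrderedDelivery<T extends { logId: number }> {
  private nextExpected: number;
  private buffer = new Map<number, T>();

  constructor(firstLogId: number, private deliver: (msg: T) => void) {
    this.nextExpected = firstLogId;
  }

  onMessage(msg: T): void {
    if (msg.logId < this.nextExpected) return; // duplicate (e.g. a resend), drop it
    this.buffer.set(msg.logId, msg);
    // drain as long as the next expected message has arrived
    while (this.buffer.has(this.nextExpected)) {
      this.deliver(this.buffer.get(this.nextExpected)!);
      this.buffer.delete(this.nextExpected);
      this.nextExpected += 1;
    }
  }
}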

WebSocket satisfies both of these in theory because it's based on TCP, which guarantees both. However, WebSockets will not re-initialize by default if the user's internet connection is lost, though some WebSocket libraries provide automatic reconnection.

Meanwhile

  • MQTT doesn't provide this by default (can be done by configuring QoS to 2)
  • Faye doesn't provide this by default (can be done through extensions)

Handling shutdowns in event listeners

The 2 main cases to test for handling shutdowns are:

  • what happens if the node gets restarted
  • what happens if the client gets restarted

Notably, there are two systems that need to handle this gracefully:

  1. The pub-sub protocol itself (ex: MQTT)
  2. Paima Engine itself

Notably, Paima Engine cannot omit messages in the following scenario:

  1. Paima Engine node finishes processing the state transition function and saves the result to the database
  2. Paima Engine node gets shut down before it has a chance to send out the MQTT updates over the websocket
  3. Paima Engine node gets restarted. Since (1) completed successfully, it will go directly to processing the next block, skipping (2)

We can solve this by adding another check that re-emits events that were not properly sent (and acknowledged). Since a message may have been acknowledged by only a subset of listening clients (assuming you're not connecting to just a single broker), this can still lead to duplicate messages, and it would be up to the client-side MQTT to detect and discard any duplicate messages (i.e. duplicate log IDs).
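
A minimal sketch of that re-emission check, assuming a hypothetical last_broadcast bookkeeping table and the logs/{topic}/addr/{addr} topic layout discussed later (neither exists in Paima Engine today):

// sketch: on startup, re-publish any logs written to the DB after the last successfully broadcast one
import { Pool } from 'pg';
import { MqttClient } from 'mqtt';

async function rebroadcastMissedEvents(pool: Pool, client: MqttClient): Promise<void> {
  const { rows: [cursor] } = await pool.query(
    'SELECT COALESCE(MAX(log_id), 0) AS last_sent FROM last_broadcast'
  );
  const { rows: missed } = await pool.query(
    'SELECT * FROM event WHERE id > $1 ORDER BY id',
    [cursor.last_sent]
  );
  for (const log of missed) {
    // clients deduplicate on log id, so re-sending an already-delivered log is harmless
    client.publish(`logs/${log.topic}/addr/${log.address}`, JSON.stringify(log));
    await pool.query('INSERT INTO last_broadcast (log_id) VALUES ($1)', [log.id]);
  }
}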

Calculating topics

First, some facts about how EVM logs handle topic generation:

  • In EVM, if you have an event with signature event PersonCreated(uint indexed age, uint height), then the topic is calculated by keccak256("PersonCreated(uint256,uint256)")
  • Topic overloading is allowed. That is, you can have different events with the same name that have different parameters (ex: PersonCreated(uint256) in the same contract is also allowed)
  • Events are actually encoded as a JSON object (in the ABI), but PersonCreated(uint256,uint256) acts as the short human-readable form

For us, if we use json-schema for the event type, there are a few challenges we have to solve:

  1. Deciding a canonical representation that is human-readable (something short like in the EVM case instead of some massive JSON object). That is to say, the json-schema is kind of like the ABI and we need a short-form notation
  2. The topic itself doesn't need to be recursively computed (ex: PersonCreated(Person) is valid syntax) because Paima Engine enforces object types (Person) to be immutable

Topics should be calculated using the same algorithm as EVM. It will help with compatibility and tooling for the ecosystem, and there is nothing fundamentally wrong with the EVM approach that would require changing things.
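
A sketch of what that could look like, assuming js-sha3 for the hash and a short-form signature built from the event name plus its parameter type names (this signature format is a proposal, not an existing Paima API):

// sketch: compute a topic hash from a short-form signature, mirroring the EVM approach
import { keccak256 } from 'js-sha3';

interface EventInput { name: string; indexed: boolean; typeName: string; }

function calculateTopic(eventName: string, inputs: EventInput[]): string {
  // indexed-ness is deliberately excluded, matching EVM, so toggling indexed doesn't change the topic
  const signature = `${eventName}(${inputs.map(i => i.typeName).join(',')})`;
  return '0x' + keccak256(signature);
}

// e.g. calculateTopic('TemperatureChange', [
//   { name: 'old', indexed: false, typeName: 'Temperature' },
//   { name: 'new', indexed: false, typeName: 'Temperature' },
// ]) hashes the string "TemperatureChange(Temperature,Temperature)"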

Compatibility

Many tools in the blockchain ecosystem are built around eth_getLogs. It would save a massive amount of developer time if Paima Engine logs are consumable via the same API as eth logs (other tools don't have to rewrite custom stuff just to support Paima Engine). This is doable if the Paima Engine node stores logs directly in SQL, but may be harder if we're wrapping stuff with custom brokers.

There are some other points that make this a bit more complicated:

  1. address compatibility: eth_getLogs specifies the address from which the event originated. In EVM, this would always be an EVM address. However, in Paima Engine, we support other chains with different addresses as well. Some options to tackle this:
    1. Filter out events by non-EVM addresses (may cause events to be missed which could lead tools consuming this data to show the wrong thing)
    2. Include the address as-is even though it's not EVM (some tools may just happily accept these, but others may crash)
    3. Include a "special address" to represent other address schemas. ex: all Cardano actions could come from address "0x00" (we'd have to be careful to have a proper precompile standard for Paima to avoid colliding with precompile addresses of other EVM chains)
  2. scheduled transactions: Paima Engine has scheduled transactions (timers, primitives, etc.) which have no direct address that triggered them. It could be up to each type of scheduled transaction to specify its own address
    1. for primitives, this could be the hash of the primitive in the configuration file (so that primitives from different chains give different addresses)
    2. For timers, it could be up to the user to specify some kind of unique ID for each trigger for timers (this somewhat fits into the need for a precompile standard for Paima to be able to properly define what these addresses are without collisions)
  3. contract origin: part of the benefit of being able to specify addresses for EVM is that the address can correspond to contracts. For example, if you want to know every time somebody moved an NFT, you would get logs from that specific EVM contract address. However, in Paima Engine, there are no contract addresses like this, so all events would get associated to the user's address (and something else for scheduled transactions). If we have a precompile standard though, we could allow a syntax when emitting events to override the address corresponding to the event
  4. transaction hash: eth_getLogs includes a transaction hash, but we have no concept of a transaction hash in Paima (yet). Even if we do include a tx hash standard for Paima, we'd have to be careful to make sure it's either the same length as the EVM tx hash, or that we map it to the same length (ex: by wrapping it in another hash with the correct length output). Wrapping it may be partially unavoidable if we want to (in the future) support other wrappers for different getLog calls for different chains (a sketch of the mapping follows after this list)
  5. EVM has a max number of indexed fields in an event (up to 3), but we don't have the same restriction
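
To make the compatibility goal concrete, here is a sketch of mapping a Paima event row to an eth_getLogs-style log object. The placeholder address for non-EVM senders and the tx-hash wrapping are illustrative choices, not decisions:

// sketch: map a Paima event row to an eth_getLogs-compatible shape
import { keccak256 } from 'js-sha3';

interface PaimaEventRow {
  id: number; topic: string; address: string; data: object;
  block_height: number; tx: number; idx: number;
}

interface EthCompatibleLog {
  address: string;          // EVM address, or a designated placeholder for non-EVM senders
  topics: string[];         // topic hash first, indexed fields after (EVM caps indexed fields at 3; we don't)
  data: string;             // json-encoded payload instead of ABI-encoded bytes
  blockNumber: string;
  transactionIndex: string;
  transactionHash: string;  // wrapped to 32 bytes since Paima has no native tx hash yet
  logIndex: string;
}

function toEthLog(row: PaimaEventRow): EthCompatibleLog {
  const isEvmAddress = /^0x[0-9a-fA-F]{40}$/.test(row.address);
  return {
    address: isEvmAddress ? row.address : '0x' + '00'.repeat(20), // hypothetical "special address" scheme
    topics: [row.topic],
    data: JSON.stringify(row.data),
    blockNumber: '0x' + row.block_height.toString(16),
    transactionIndex: '0x' + row.tx.toString(16),
    transactionHash: '0x' + keccak256(`${row.block_height}:${row.tx}`), // illustrative wrapping only
    logIndex: '0x' + row.idx.toString(16),
  };
}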

Reusability

There are other cases where we want event-like behavior:

  1. In the batcher, to send events to the client on progress of their tx being included
  2. From explorers, where we care about showing blocks/txs/events as they happen

Here is how these should be handled for the batcher:

  1. User sends a transaction to the batcher.
  2. The batcher exposes an SSE endpoint to monitor the tx status and sends updates to the user (ex: tx got submitted to the chain successfully)
  3. The node itself sends an event whenever a tx is processed (see MQTT subscription architecture)
  4. The batcher subscribes to all tx events from the node to know when a tx got processed by the state machine (a sketch of this follows below)
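
A sketch of the batcher side of steps 3-4, assuming the node publishes per-tx status events on an MQTT topic (the topic name, payload shape, and tx identifier are assumptions):

// sketch: the batcher listens to the node's per-tx events and forwards updates to whichever
// SSE connection is tracking that tx
import mqtt from 'mqtt';

type StatusCallback = (status: string) => void;
const pendingTxs = new Map<string, StatusCallback>(); // txId -> push callback for the user's SSE stream

const node = mqtt.connect('ws://localhost:8883');
node.subscribe('tx/#');

node.on('message', (_topic, payload) => {
  const { txId, status } = JSON.parse(payload.toString());
  const notify = pendingTxs.get(txId);
  if (notify != null) {
    notify(status); // e.g. write an SSE "data:" frame to the client
    if (status === 'processed') pendingTxs.delete(txId);
  }
});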

For explorers, different pages can subscribe to different events (ex: homepage subscribes to the block event)

Rationale for batcher connection

For the batcher, SSE is better suited in theory in my opinion because:

  • All we need is one-way connection (game -> user)
  • The connection is short-lived (just sending updates about tx processing state)

However, there are a few options:

  1. Polling: very inefficient
  2. Long-polling: could miss updates and requires pointless client overhead. However, it's relatively easy to implement
  3. Websocket: would require either
    1. Client opens a websocket on every tx. Client sends the tx over the websocket, then batcher sends updates. Close the socket afterwards. This is a slightly awkward API since the client can only actually send 1 message, and this creates multiple sockets open on the same port in parallel (which is not invalid though)
    2. Client has a long-running websocket with the batcher open. This exhausts connection limits and requires ugly multiplexing
  4. SSE: makes sense (it's better than polling since you only get updates when really needed, and easier to debug than websockets), but requires many steps to get it working:
    1. Update the batcher (not necessarily the paima node itself) to fastify instead of express, since express doesn't support http2 (not a bad idea in general since express is years behind other libraries in support for things). This matters because browsers cap HTTP/1.1 at only 6 connections per domain (source), which is not enough since there can be many txs sent by the user in parallel across multiple tabs
    2. Setup certificates for the batcher (since http2 requires certificates to be used)
      1. when NETWORK=localhost we can use a locally-signed certificate
      2. for production, we can generate a certificate with Cloudflare and pass it into the batcher as an env variable
    3. Since games connect to our server in 2 hops (game -> cloudflare -> user), we need to ensure Cloudflare properly passes through the SSE. To do this, we can disable "page cache" for the batcher (this requires the batcher to be on a different cloudflare subdomain from other parts of the game)

Therefore, although SSE is the best fit, it's also the most complicated to implement (not just for us, but for anybody else who wants to run their own batcher in the future) since it has 2 hosting-service-dependent steps.

MQTT subscription architecture

For an explorer use-case, I'm not sure how our notation would allow us to live-monitor one-to-many types. For example, all transactions made by a specific address. We could try a hierarchy like block/{address}, but because blocks contain many addresses this would require sending events like block/0x0, block/0x1, etc. which makes subscriptions to block/+ not meaningful.

Convention

As a reference, there are some conventions for MQTT patterns such as the Homie convention

We have a few options:

Top-down

In this mode, we consider everything as having block as the top-level topic. This is a logical hierarchy, but I am a little scared it might be too fragile, although maybe it's not likely we'll make changes to the hierarchy of things

block/{id} → block content
block/{id}/tx/{id}/addr/{addr} → tx content
block/{id}/tx/{id}/addr/{addr}/logs/{topic}/addr/{addr} → log content (recall: tx initiation can be different from log addr)

Separated topics

Instead of having long paths, we split things up into separate top-level namespaces. This makes it easier to refactor and makes subscribing to topics a bit cleaner (instead of having to subscribe to block/+/tx/+/addr/+/logs/{topic}/# to get every log for a topic, they can instead just listen to logs/{topic}/#)

block/{id} → block content
tx/{id}/addr/{addr} → tx content
logs/{topic}/addr/{addr} → log content

Handling log topics

Indexed fields should appear inside the log topic using the following syntax

logs/{topic}/addr/{addr}/index1/index2/index3 → log content

This makes it easy to listen to events from a specific topic filtered by an indexed value using something like logs/{topic}/addr/{addr}/+/value/+

Recall: the index of an object type is the hash of that object
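
A sketch of building and subscribing to these topics, assuming js-sha3 for hashing object-typed indexed values (the helper names are hypothetical; only the convention above is assumed):

// sketch: build a log topic with indexed-field segments, and subscribe with MQTT wildcards
import { keccak256 } from 'js-sha3';
import mqtt from 'mqtt';

function indexSegment(value: unknown): string {
  // object-typed indexed fields are represented by the hash of the object (see the recall note above)
  return typeof value === 'object' && value !== null ? keccak256(JSON.stringify(value)) : String(value);
}

function buildLogTopic(topic: string, addr: string, indexedValues: unknown[]): string {
  return ['logs', topic, 'addr', addr, ...indexedValues.map(indexSegment)].join('/');
}

// subscribing: fixed topic, any address, any first index, second index equal to "value", any third index
const client = mqtt.connect('ws://localhost:8883');
client.subscribe('logs/0xabc123/addr/+/+/value/+');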

Historical data: REST, then socket

Although many use-cases will want websockets to get up-to-date events as they happen, these services may occasionally need to be reset and may not want to miss data during the downtime.

However, our MQTT setup is not well suited for opening a websocket to get blocks from a block hash in the past. In fact, the first piece of data you get from it might be mid-block (there is no guarantee your websocket is established on block boundaries).

To handle this, some use-cases will want to use the REST endpoints to sync to the tip, then switch to websockets afterwards. Although supporting this doesn't have to be in v1 of events, I think the best way we could support this is with helper functions in the middleware that do the following:

  1. Open a websocket and listen to the events that come in (and buffer them) without actually returning them
  2. Start making REST calls to sync historical data up until the first event fetched from the websocket
  3. Once the REST endpoint is caught up, start sending events from the buffer of events fetched from the websocket
  4. Once the buffer is empty, start serving websocket data in realtime (a sketch of this helper follows below)
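
A sketch of such a middleware helper, reusing the hypothetical getLogs call and topic layout from the earlier sketches:

// sketch: buffer live websocket events, catch up over REST, then drain the buffer and go live
import mqtt from 'mqtt';

interface LogEvent { logId: number; blockHeight: number; data: unknown; }

async function syncThenStream(
  getLogs: (fromBlock: number, toBlock: number) => Promise<LogEvent[]>,
  fromBlock: number,
  onEvent: (e: LogEvent) => void
): Promise<void> {
  const buffer: LogEvent[] = [];
  let firstLive: LogEvent | undefined;
  let streaming = false;

  // 1. open the websocket first and buffer everything without delivering it
  const client = mqtt.connect('ws://localhost:8883');
  client.subscribe('logs/#');
  client.on('message', (_topic, payload) => {
    const event: LogEvent = JSON.parse(payload.toString());
    if (streaming) { onEvent(event); return; }
    if (firstLive === undefined) firstLive = event;
    buffer.push(event);
  });

  // wait until at least one live event arrives so we know where REST syncing can stop
  while (firstLive === undefined) await new Promise(resolve => setTimeout(resolve, 100));

  // 2. catch up over REST up to (but not including) the block of the first buffered event
  const historical = await getLogs(fromBlock, firstLive.blockHeight - 1);
  historical.forEach(onEvent);

  // 3. drain the buffer, then 4. deliver live websocket data directly from here on
  buffer.splice(0).forEach(onEvent);
  streaming = true;
}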