Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sketching out a potential new architecture for pygls #418

Closed
wants to merge 17 commits into from

Conversation

alcarney
Copy link
Collaborator

@alcarney alcarney commented Nov 27, 2023

As mentioned in #334, I don't think we need to base our JSON-RPC implementation on an asyncio.Protocol object anymore. Since all we have to to is read and write some bytes, I think it makes more sense to base it on high-level asyncio Reader and Writer objects instead.

I'm opening this PR (very!) early so that people have the chance to provide feedback, but bear in mind that there are still many details to consider! If it looks like this approach might pan out I will, of course, put some more effort into commit messages etc. so that it's more digestible.

I'd like to think that if this new approach proves successful, we can ship it alongside the current one so that people can opt-in to try it out

Goals

  • Type annotate all the things!
  • Draw stronger boundaries between responsibilities - the current JsonRPCProtocol does a lot, including some details that are actually LSP specific
  • Have something ready to try when the next major lsprotocol version is ready - that way if this works out, server authors only have to deal with breaking changes once.
  • Clean up inconsistencies (Explore parity between server.LanguageServer and protocol.LanguageServerProtocol methods #306)
  • See if it's possible to not need "normal" and _async flavours of every method.
  • Fix the pyodide tests once and for all
  • WASM-WASI support (proof of concept here)

So far I only have a proof of concept working, where a JSON-RPC client is able to send a request to a simple JSON-RPC server and get a response.

Happy to elaborate further if needed, or even throw it all in the bin if it turns out to be a horrible idea!

TODO

Code review checklist (for code reviewer to complete)

  • Pull request represents a single change (i.e. not fixing disparate/unrelated things in a single PR)
  • Title summarizes what is changing
  • Commit messages are meaningful (see this for details)
  • Tests have been included and/or updated, as appropriate
  • Docstrings have been included and/or updated, as appropriate
  • Standalone docs have been updated accordingly

@tombh
Copy link
Collaborator

tombh commented Nov 30, 2023

Your "Goals" list is amazing!

The only one I don't understand is:

Have something ready to try when the next major lsprotocol version is ready - that way if this works out, server authors only have to deal with breaking changes once."

Is that a one-off thing that lsprotocol is planning? Or a more general solution you're considering?

@alcarney
Copy link
Collaborator Author

Or a more general solution you're considering?

No, nothing that fancy :)

Just with the breaking changes coming in lsprotocol v2024.x (#410), I think it would make sense to change to the updated approach at the same time

I'm trying to decide if it's worth trying to ship a version of pygls with both approaches in the same package, or if we ship them separate as stable 1.x series and unstable 2.x series, both available at the same time. - Do you have any thoughts on that?

@alcarney
Copy link
Collaborator Author

I now have a failing test, client is able to talk to a server over stdio, but the server is not sending correct capabilities (yet)

I'm trying to decide if it's worth trying to ship a version of pygls with both approaches in the same package

Now that I think about it, we'll probably have to go the pre-release branch approach - we won't be able to have lsprotocol 2023.x and 2024.y installed at the same time

@tombh
Copy link
Collaborator

tombh commented Dec 16, 2023

Ah yes, a pre-release is probably the best approach.

Looking good! So, pygls/lsp/_base_client.py is autogenerated right?

@alcarney
Copy link
Collaborator Author

alcarney commented Jan 3, 2024

So, pygls/lsp/_base_client.py is autogenerated right?

Yes

Ah yes, a pre-release is probably the best approach.

Cool, I'll slowly start replacing the previous approach with the new stuff then, rather than trying to maintain both in the same codebase.

Quick update. I think most of the fundamentals are nearly there, I have both a sync (for wasm) and async (flagship) implementations of the base JSON-RPC server and pygls will choose which automatically based on the current environment - though I'm not sure what the type checkers will think of that.... 🤔

I also have proof of concept end-to-end tests working for CPython, Pyodide, and WASM-WASI over stdio (on my machine at least). I want to see if I can get CI for these working before moving to some of the higher level stuff.

Again happy to elaborate more on any specific details and let me know if you have any feedback! :)

@alcarney alcarney force-pushed the jsonrpc-server branch 22 times, most recently from d17dbec to 873bc59 Compare January 4, 2024 20:04
@alcarney
Copy link
Collaborator Author

@tombh Pushed an update to the extension, if you run it in a local copy of VSCode you should be able to see the server's stderr output in the pygls-server output window - hopefully that makes it easier to figure out what's going on 🤞

image

The stderr won't be available on the web version though :/

@tombh
Copy link
Collaborator

tombh commented Mar 20, 2024

Great thanks, I'll check it out soon.

@tombh
Copy link
Collaborator

tombh commented Mar 28, 2024

I'm still just seeing that same error:

2024-02-21 09:40:58.659 [info] [Error - 09:40:58] Client pygls: connection to server is erroring.
Reader received error. Reason: unknown

But one time I did get this error!

<--- Last few GCs --->

[1663866:0x38004e0000]       42 ms: Mark-Compact (reduce) 0.7 (2.9) -> 0.7 (1.9) MB, 0.53 / 0.00 ms  (average mu = 0.291, current mu = 0.022) last resort; GC in old space requested
[1663866:0x38004e0000]       42 ms: Mark-Compact (reduce) 0.7 (1.9) -> 0.6 (1.9) MB, 0.53 / 0.00 ms  (average mu = 0.169, current mu = 0.004) last resort; GC in old space requested


<--- JS stacktrace --->


#
# Fatal JavaScript out of memory: CALL_AND_RETRY_LAST
#

Unfortunately though I could never reproduce it 🫤

Though my intuition is that it's all just something wrong with my setup, like the virtualenv or paths or something.

@alcarney
Copy link
Collaborator Author

I'm still just seeing that same error:

If I'm not mistaken, that's from the pygls-client output channel - there should now also be something in the pygls-server channel 🤞

Though my intuition is that it's all just something wrong with my setup, like the virtualenv or paths or something.

In theory at least, there shouldn't be any issues like that with this approach, the extension bundles everything it needs to run, well apart from the server's .py file that is 😅

@tombh
Copy link
Collaborator

tombh commented Mar 29, 2024

Oh! I didn't realise there was that drop down for choosing channels. And now I've seen a useful error 🥳

It can't find any of the example servers defined in examples/servers/.vscode/settings.json's "pygls.server.launchScript": "...", because examples/servers/.vscode/launch.json's configurations.pathMappings.localRoot: "${workspaceFolder}" gets set to /workspace.

When I set configurations.pathMappings.localRoot to ".", I can put whatever path I want in "pygls.server.launchScript" and everything works perfectly!

@alcarney
Copy link
Collaborator Author

alcarney commented Apr 3, 2024

It can't find any of the example servers defined in examples/servers/.vscode/settings.json's "pygls.server.launchScript": "...", because examples/servers/.vscode/launch.json's configurations.pathMappings.localRoot: "${workspaceFolder}" gets set to /workspace.

When I set configurations.pathMappings.localRoot to ".", I can put whatever path I want in "pygls.server.launchScript" and everything works perfectly!

Interesting... can't say I've ever come across pathMappings before.... 🤔

Glad to hear that you managed to get it working though!

The previous `JsonRPCProtocol` and `LanguageServerProtocol` classes
did a lot - parsing incoming messages, sending responses, managing
futures, executing handlers and in the case of
`LanguageServerProtocol`, providing the implementation of pygls' built
in features!

Instead, taking inspiration from the Sans-IO[1] approach to
implementing protocols. The base `JsonRPCProtocol` class is focused on
the data of the protocol itself, converting bytes into rich data types
and back again.

In theory at least, this allows someone to implement which ever I/O
and execution framework they like on top of this.

While `pygls.protocol.json_rpc` module provides some fallback types for
`_Request`, `_Notification` and `_Result` messages, it is expected for
protocols (like the `LanguageServerProtocol`) to subclass
`JsonRPCProtocol` and implement the following methods

- `get_notification_type`
- `get_request_type`
- `get_result_type`

in order to provide the type definitions for each message in the protocol.

[1]: https://sans-io.readthedocs.io/
In the `LanguageServerProtocol` (and in `JSON-RPC`) more generally,
both clients and servers need to have a main message handling loop.

The `JsonRPCHandler` class is a generic base for both clients and
servers to provide both the main RPC loop and the orchestration of
executing notification and request handlers.

This commit actually introduces two such handler classes, one built on
`asyncio.Task` objects and async-await, the other on
`concurrent.futures.Future` and is entirely syncronous.

The async version is intended to be the "flagship" version and should
provide equivalent functionality to the existing version of pygls,
while the sync version is a fallback for platforms (like WASM-WASI)
where frameworks like asyncio are not (yet!) available.
This refactors pygls' so that is expects a `JsonRPCHandler` rather
than a `Server`.

It also takes a different approach to collecting user options, rather
than storing an instance of `<MethodName>Options`, the `feature()`
decorator takes options as a set of `**kwargs`.

This is because there are a number of `<MethodName>Options` that have
fields that don't make sense for a user to provide when registering
their feature. For example

- Any `resolve_provider` fields, these can be computed by pygls
- The `workspace_diagnostics` field of `DiagnosticOptions` which
  again, can be computed by pygls

While this will be the preferred method of specifying options, the
existing approach will be honoured as it will help ease the migration
to the next version and it's easy enough to convert an options object
to a set of ``**kwargs``

Switching to `**kwargs` also opens up potential future enhancements
around registering features - perhaps with additional options that
pygls can use to registry features dynamically with the client?
Building on the syncronous and asyncronous `JsonRPCHandlers`, this
replaces the old base `Server` class with a pair of `JsonRPCServer`
classes.
Unlike the server use case, it is not immediately obvious if it makes
sense to provide a client implementation compatible with WebAssembly
runtimes (especially when you consider you cannot spawn subprocesses).

So for now this commit only refactors the previous `JsonRPCClient` to
be based off of the new async `JsonRPCHandler` class.
This commit adds back the `LanguageServerProtocol`, it is now *much*
simpler as it only needs to provide the converter and type definitions
given to us by `lsprotocol`
The `scripts/generate_client.py` script has been renamed to
`scripts/generate_base_lsp_classes.py` as it now not only provides us
with a `BaseLanguageClient`, but it now also provides a
BaseLanguageServer`.

This classes are just a *long* list of method declarations, generated
from `lsprotocol` type defintions giving us nice autocomplete-able
method names to call for each method in the LSP specficiation.

These methods should also be written in a way enabling both sync and
async usage without having to resort to having sync and async versions
of each method!
This refactors the `ServerCapabilitiesBuilder` to align with changes
made to the `FeatureManager`
It does not do much right now, but this creates a class where we can
add functionality that would be useful for any LanguageClient
This commit adds back the `LanguageServer`, in addition to inheriting
all the newly autogenerated method defintions, the implementations of
pygls' built in features have also been migrated here.

Additionally, the implementation of `workspace/executeCommand` now
inspects the handler's type annotations and automatically converts the
command's arguments into instances of those types - or returns a
`JsonRpcInvalidParams` error.
This commit refactors the infrastructure for writing end-to-end tests.

There are now two fixtures `get_client_for` and `uri_for`, as the
names suggest these are functions for obtaining a `LanguageClient`
connected to a server in `examples/servers` and URIs for files in
`examples/servers/workspace` respectively.

This also adds a custom cli flag to our test suite - `--lsp-runtime`
which allows to choose whether the end-to-end tests are run against
the `cpython`, `pyodide` or `wasi` runtimes.

The fixtures mentioned above, take the value of the `--lsp-runtime`
flag into account, allowing for the same test code to work against
each of the supported runtimes.
Instead of trying to run the entire pytest testsuite in the pyodide
runtime in selenium we now only run the end-to-end tests, with the
server component executing in pyodide.

By passing the ``lsp-runtime pyodide`` argument to pytest, rather than
launching the server under test via CPython, the test suite will use
NodeJS and a small wrapper script to execute the given server in the
Pyodide runtime.
This aligns the pyodide job with the updated approach, as well as
adding a job to run the tests against the WASM-WASI build of Python.
@alcarney
Copy link
Collaborator Author

alcarney commented Aug 2, 2024

I'm going to close this.

TLDR: This PR is too big (just look at the list of goals! 😅), not ready and the rest of the world is moving on.

  • Python 3.13 is coming and I expect we'll want to drop 3.8 when that goes EOL
  • We really should be looking at updating to New lsprotocol pre-release published #431 and I would not be surprised if LSP 3.18 is released soon.

I'm glad I tried this, but after some time away from it I think it's worth another shot and hopefully something a little more incremental

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pygls does not handle async user shutdown handler correctly
2 participants