Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance degradation using SSL #133

Closed
theaaron123 opened this issue Nov 26, 2021 · 19 comments · Fixed by #137
Closed

Performance degradation using SSL #133

theaaron123 opened this issue Nov 26, 2021 · 19 comments · Fixed by #137
Labels
bug Something isn't working

Comments

@theaaron123
Copy link

When connecting with HTTPS to a Flask SocketIO server I'm seeing a slowdown of 20 seconds waiting for a response compared to http.

This is happening with header auth and with certificate auth.

let socket = SocketBuilder::new("https://$IP$".to_string())
        .opening_header(
            "Authorization",
            settings.header_auth_value.as_str()
        )
        .on("open", connect_callback)
        .on("submit", submit_callback)
        .on("error", |err, _| {
            warn!("Server Connection Error: {:#?}", err)
        })
        .on("error", |err, _| eprintln!("Error: {:#?}", err))
        // .transport_type(TransportType::WebsocketUpgrade)
        //  .tls_config(tls_connector)
        .connect();

I can see in the nginx logs that the websocket upgrade is switching protocols

client GET /socket.io/?EIO=4&sid=n4wlLsEmzjqGaiGHAAAG&transport=websocket HTTP/1.1 101 45

Any ideas on what could be going wrong?

PS. Not seeing this degradation with python socket io client

@1c3t3a
Copy link
Owner

1c3t3a commented Nov 26, 2021

Mhh... 20 seconds definitely sounds weird... Generally a small degradation can be expected, mostly as the underlying websocket library does not allow a split of the stream into receiver and sender (see here vs here). This would be fixed when this crate gets an async implementation, that could use the tungstenite websocket library, see #115.

Anyways a degradation of up to 20 seconds is strange, in which way is the server configured? Do you see an immediate answer to the upgrade request?

@theaaron123
Copy link
Author

Initial connection is very quick with the websocket upgrade. I've also pulled the branch with the tungstenite websocket library and seeing the same slow down waiting for a message after the server emits.

Server is flask socket io running on Gunicorn (with gevent websocket worker) using nginx to handle ssl and reverse proxy. I also see the slowdown using the debug flask server and not gunicorn. I don't see the slowdown with python socket io client though so I believe the server to be configured correctly.

I'll test deploying your node.js benchmark server remotely and see if I see a slowdown

@theaaron123
Copy link
Author

Running the tests remotely using the secure node server I don't seem to see any delay. I bypassed nginx and used the same certificates on the flask-socketio server and do see it though, no certificate errors etc and forcing websockets not polling

@theaaron123
Copy link
Author

I did some more debugging and added some timers and found the 25 second wait at acquiring the lock on the MutexGuard<Client<TlsStream>>> in WebsocketSecureTransport

    let before = Instant::now();
    let mut writer = self.client.lock()?;
    println!("Waiting acquire mutexguard LOCK for: {:.2?}", before.elapsed());

Once the lock is acquired it sends the message quickly. Any ideas on why the lock would be held for so long?

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 1, 2021

Ah great, thanks for the detailed information! I think I tracked down the issue, and 20 seconds makes total sense in that context. Without splitting, the same connection for sending and receiving is used, but receiving is a blocking operation:

let mut receiver = self.client.lock()?;
// if this is a binary payload, we mark it as a message
let received_df = receiver.recv_dataframe()?;
drop(receiver);
match received_df.opcode {

After the connection is established, the client starts to wait for a message from the server by calling the blocking receive message. At the same time, in another thread (the main one by "managed" by the user) the send method is called and starts to wait for the lock. So now we're in a deadlock situation, that only gets resolved after 20 seconds, the servers ping interval! Then the lock gets dropped and the client can send.

If I am right the interval should get shorter with a shorter ping interval in the server configuration. Does this make sense @nshaaban-cPacket?

Sorry for the inconvenience, this looks like a logical error!

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 1, 2021

A possible solution would be to introduce a timeout on the receiver, so that the lock gets freed regularly.

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 1, 2021

I prepared a really hotfixy hotfix just to proof my point, if you like you can add the rust-socketio dependency as follows:

rust_socketio = { git = "https://github.com/1c3t3a/rust-socketio", branch = "issue-133"}

It runs in my development environment, would be interesting to know if this also holds true for your environment.

@1c3t3a 1c3t3a added the bug Something isn't working label Dec 1, 2021
@ctrlaltf24
Copy link
Collaborator

@theaaron123 to confirm is this on v0.3.0-alpha-2? or v0.2.X?

@theaaron123
Copy link
Author

On v0.3.0-alpha-2. Will test the hotfix cheers!

@theaaron123
Copy link
Author

@1c3t3a I'm getting a panic from the socketbuilder on connect with that branch


Error: String(
    "EngineIO Error",
)
thread '<unnamed>' panicked at 'EngineIO Error', /home/aaron/.cargo/git/checkouts/rust-socketio-ef22eed71d095901/0e3ff32/socketio/src/client/builder.rs:229:21

Error is generic and will need to debug but does connect when I revert back to v.0.3 branch

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 2, 2021

Mhhh... I pushed a new version that logs the error message but never panics but instead retries, I am going to test it in a VM now as well.

@theaaron123
Copy link
Author

Error while polling: WebSocketError: I/O failure

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 2, 2021

Oke, that's what I expected. Does it print the message multiple times?

@theaaron123
Copy link
Author

Yeah multiple prints

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 2, 2021

Alright, I matched on the exact error now and added a comment explaining the issue. Could you confirm that the performance degradation isn't present anymore?

@theaaron123
Copy link
Author

Yeah working as fast as non TLS now! Great, thanks very much for the help.

@theaaron123
Copy link
Author

Hi @1c3t3a sorry to keep this issue alive but on in my environment the latency returns after your refactor i.e

Fast:

rust_socketio = { git = "https://github.com/1c3t3a/rust-socketio", rev = "b5774ea8135940a5bdadfdf0509eb72a5ae280ad"}

Slow:

rust_socketio = { git = "https://github.com/1c3t3a/rust-socketio", rev = "b264dc333c8b197fa609168fc34c813fd3d1adf3"}

Latest also slow just narrowing down to specific commit.

Will do some more debugging but looks like the set_read_timeout may not be working as expected

@1c3t3a
Copy link
Owner

1c3t3a commented Dec 15, 2021

Ok, that's unfortunate. Could you try 43bed7db9f1a87162cbc741b642ed8c7f107873c , I reverted it.

@theaaron123
Copy link
Author

Yeah that commit is working well. Cheers again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants