feat: add "timeoutPerCommand" option to detect dead connection #658

Closed
wants to merge 2 commits into from

luin
Collaborator

@luin luin commented Jul 6, 2018

WIP. Need discussion, tests and documentation.

Usage:

const redis = new Redis({
  timeoutPerCommand: 10000
})

Known issues:

  • Doesn't work with blocking commands

debug('Command timed out. Pending commands: %s', commandQueueLength);
var err = new Error('Command timed out');
self.flushQueue(err, { offlineQueue: false });
self.silentEmit('error', err);

Is emitting an error event sufficient? In my usage, I'm always supplying a callback in the form of:

redis.get(key, function (err, reply) {
    if (err) {
        // handle error
    } else {
        // handle data
    }
});

While I do also handle the error event, that handling is somewhat generic. It would be better if the callback was called with the error so that specific error handling could be invoked.

Collaborator Author

Yes, the callbacks will be called with the error, in addition to the error event being emitted. self.flushQueue() takes care of that.
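
For illustration, here is a minimal sketch of the idea that flushing a queue with an error invokes every pending callback. `PendingQueue` and its methods are hypothetical stand-ins written for this example, not the ioredis internals, and the real `flushQueue()` may behave differently.

```javascript
// Hypothetical stand-in for a client's pending-command queue (not ioredis code).
class PendingQueue {
  constructor() {
    this.items = [];
  }

  // Register a command together with its Node-style callback.
  push(name, callback) {
    this.items.push({ name, callback });
  }

  // Fail every pending command with the same error and empty the queue.
  flush(err) {
    const pending = this.items;
    this.items = [];
    for (const item of pending) {
      item.callback(err);
    }
    return pending.length;
  }
}

const queue = new PendingQueue();
const failures = [];
queue.push('get', (err) => {
  if (err) failures.push(`get: ${err.message}`);
});
queue.push('set', (err) => {
  if (err) failures.push(`set: ${err.message}`);
});

const flushed = queue.flush(new Error('Command timed out'));
console.log(flushed, failures);
```

Each callback receives the same error object, so command-specific handling stays possible alongside a generic error-event handler.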

var err = new Error('Command timed out');
self.flushQueue(err, { offlineQueue: false });
self.silentEmit('error', err);
self.disconnect(true);

I assume that this will trigger an automatic reconnect to that server, correct?

Collaborator Author

Yes, that's correct.

Good, but see my follow-up to #587. If this node was part of a cluster, retryStrategy is overridden and set to null so there would in fact be no automatic reconnect to that node.

Per #587, there will not be any automatic reconnect in a cluster environment.

@vweevers
vweevers commented Jul 6, 2018

Is socket.setKeepAlive(true) not an option? TCP keep-alive probes are very cheap and work well.

self.silentEmit('error', err);
self.disconnect(true);
}
}, 500);

It appears that the minimum practical value for timeoutPerCommand is 500ms. Even if I specify timeoutPerCommand = 100, the check for a timed-out command only runs every 500ms, so the timeout error won't trigger as quickly as I would expect.
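
To make the granularity concrete, here is a rough sketch that computes when a polling check would first notice an expired command. It assumes the check uses a strict `>` comparison (as in the diff) and that ticks fire exactly on schedule; `detectionDelay` and `writeOffsetMs` are names made up for this example.

```javascript
// Hypothetical helper: elapsed time (ms since the command was written) at which
// a periodic check first sees `elapsed > timeoutMs`.
// writeOffsetMs is how long after the previous tick the command was written
// (0 <= writeOffsetMs < intervalMs).
function detectionDelay(timeoutMs, intervalMs, writeOffsetMs = 0) {
  // Ticks occur at intervalMs, 2 * intervalMs, ... measured from the previous tick.
  // Find the smallest tick k with k * intervalMs - writeOffsetMs > timeoutMs.
  const k = Math.floor((timeoutMs + writeOffsetMs) / intervalMs) + 1;
  return k * intervalMs - writeOffsetMs;
}

console.log(detectionDelay(100, 500, 0));   // 500
console.log(detectionDelay(100, 500, 450)); // 550
console.log(detectionDelay(10000, 500, 0)); // 10500
```

Under these assumptions a 100ms timeout is reported somewhere between just over 100ms and 600ms after the write, depending on where the write falls relative to the timer.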

var commandQueueLength = self.commandQueue.length;
if (
commandQueueLength > 0 &&
Date.now() - self.lastWriteTime > self.options.timeoutPerCommand

Since self.lastWriteTime is updated every time a command is added to the commandQueue, a client that issues commands fast enough (more than one every timeoutPerCommand ms) keeps pushing self.lastWriteTime forward. Date.now() - self.lastWriteTime then stays below timeoutPerCommand, and the original command never times out.

Perhaps Redis.prototype.sendCommand should update lastWriteTime only when commandQueue.length === 1.
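
The effect can be reproduced with a small simulation. This is only a sketch: it models a server that never replies, advances a fake clock in 1ms steps, and uses the hypothetical `simulate` function and option names below rather than any ioredis code.

```javascript
// Hypothetical simulation of the periodic timeout check against a dead server.
// Returns the simulated time (ms) at which the timeout fires, or null if it
// never fires within horizonMs.
function simulate({ timeoutMs, sendEveryMs, checkEveryMs, horizonMs, onlyFirstWrite }) {
  let queueLength = 0;
  let lastWriteTime = 0;
  for (let now = 0; now <= horizonMs; now++) {
    if (now % sendEveryMs === 0) {
      queueLength++; // the server never answers, so the queue only grows
      // Original rule: refresh on every write.
      // Suggested rule: refresh only when the queue goes from empty to one entry.
      if (!onlyFirstWrite || queueLength === 1) {
        lastWriteTime = now;
      }
    }
    if (now > 0 && now % checkEveryMs === 0) {
      if (queueLength > 0 && now - lastWriteTime > timeoutMs) {
        return now;
      }
    }
  }
  return null;
}

const base = { timeoutMs: 1000, sendEveryMs: 200, checkEveryMs: 500, horizonMs: 10000 };
console.log(simulate({ ...base, onlyFirstWrite: false })); // null
console.log(simulate({ ...base, onlyFirstWrite: true }));  // 1500
```

The simulation says nothing about how lastWriteTime should advance once replies start arriving again; it only shows the starvation case described above.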

@jcstanaway
@vweevers While a little old, see nodejs/node-v0.x-archive#6194. The main takeaway is that node.js doesn't provide sufficient configurability of the TCP keepalive functionality. Thus we'd have to wait up to 10-11 minutes before detecting that the connection has closed. Searching through the node 10.x documentation, I didn't find anything that allows further configuring TCP keep-alives to detect a broken connection any faster.

@vweevers
vweevers commented Jul 6, 2018

@ccs018 Oh I see, the issue is about detecting dead connections faster. That's cool. I just hope the added functionality will remain opt-in (because for my needs, waiting 10 minutes is perfectly fine) and not add side effects to an already complicated module.

@jcstanaway
It's not so much about detecting dead connections fast. It's that when I issue a Redis command (e.g., get) and the connection dies before the response is received, nothing is reported back to the client: the callback is never invoked.

@luin
Collaborator Author

luin commented Jul 7, 2018

@vweevers This option will be disabled by default and it won't add any side effects when disabled.

@ccs018 If you issue a command, the timeoutPerCommand interval checks every 500ms whether the response has been received. If it has not arrived within the specified time (the timeoutPerCommand option), the callback is invoked with a timeout error.

@vweevers
vweevers commented Jul 7, 2018

@ccs018 This must mean TCP keep-alive is disabled by default (client side).

@luin Does ioredis expose the raw socket somewhere, so that I can call setKeepAlive on it?

@luin
Collaborator Author

luin commented Jul 18, 2018

@vweevers Keep-alive is enabled by default. Refer to https://github.com/luin/ioredis/blob/master/API.md for details on the options for that.

if (typeof self.options.timeoutPerCommand === 'number') {
var timeoutPerCommand = self.options.timeoutPerCommand;
debug('Per-command timeout set to %s', self.options.timeoutPerCommand);
self.timeoutPerCommand = setInterval(() => {

How about using https://nodejs.org/api/net.html#net_socket_settimeout_timeout_callback to generate a timeout event from the socket if it becomes inactive while the queue is being processed?

This timeout guards against a socket that has no activity. The right value probably depends heavily on the use case. In a production environment, there's probably enough traffic to keep such a timeout from triggering. But in a dev environment, there might not be any traffic for relatively long stretches of time, making this an idle connection which then times out. Setting a per-command timeout seems better.

Also, if this socket timeout occurs, it doesn't enable specific error handling / recovery for the specific commands which timed out.

Or were you suggesting this as an additional check and not an alternative solution?

I am suggesting to set this to something reasonable while we are processing commands and to disable it while the socket is not used.
It just seems that it can be easier to implement than setting and cancelling timeouts in the lib. And more powerful as well.
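
A rough sketch of that suggestion, assuming the client can tell when a command is written and when a reply arrives: arm the socket's inactivity timeout while commands are pending and disarm it with `setTimeout(0)` when the queue drains. `ActivityTimeout` and its method names are hypothetical; the only real API relied on is the `setTimeout(ms)` method of `net.Socket`, which is stubbed here so the example runs without a connection.

```javascript
// Hypothetical helper; `socket` only needs a net.Socket-style setTimeout(ms) method.
class ActivityTimeout {
  constructor(socket, timeoutMs) {
    this.socket = socket;
    this.timeoutMs = timeoutMs;
    this.pending = 0;
  }

  // Call when a command is written to the socket.
  commandSent() {
    if (this.pending === 0) {
      this.socket.setTimeout(this.timeoutMs); // arm the inactivity timeout
    }
    this.pending++;
  }

  // Call when a reply for a pending command arrives.
  replyReceived() {
    if (this.pending === 0) return;
    this.pending--;
    if (this.pending === 0) {
      this.socket.setTimeout(0); // 0 disables the inactivity timeout
    }
  }
}

// Stub socket that records calls instead of touching the network.
const calls = [];
const fakeSocket = { setTimeout: (ms) => calls.push(ms) };

const guard = new ActivityTimeout(fakeSocket, 5000);
guard.commandSent();
guard.commandSent();
guard.replyReceived();
guard.replyReceived();
console.log(calls); // [ 5000, 0 ]
```

With a real socket, the 'timeout' event only signals inactivity; the listener would still have to destroy the socket and fail the pending commands itself.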

@stale
stale bot commented Aug 24, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the wontfix label Aug 24, 2018
@silverwind
Contributor

I have a feeling that this should be a per-command option. I could certainly see using different timeouts for different commands.

I'm currently using a wrapper function that includes a Promise constructor and a setTimeout, but seeing that ioredis does not expose a way to cancel a pending command, there is a risk that it would accumulate lots of pending commands over time and eventually exhaust the system's resources.
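
A wrapper along those lines might look like the sketch below. `withTimeout` is a hypothetical name and this is not the wrapper from the comment above; it uses `Promise.race` and, as noted, only stops waiting: the underlying command stays pending.

```javascript
// Hypothetical wrapper: reject if `promise` has not settled within `ms` milliseconds.
// The wrapped operation itself is NOT cancelled; only the caller stops waiting.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer once either side settles so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a command whose reply never arrives.
const neverReplies = new Promise(() => {});

withTimeout(neverReplies, 20).catch((err) => {
  console.log(err.message);
});
```

Usage with a promise-returning client call would be something like `withTimeout(redis.get(key), 1000)`, with a different `ms` per command if needed.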

@stale stale bot removed the wontfix label Aug 27, 2018
@silverwind
Contributor

I guess we could do without this new option if commands were cancellable. A common way to do this on promise interfaces is to expose a .cancel method on the promise. There's also p-cancellable which adds a bit more.

@stale
stale bot commented Sep 26, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the wontfix label Sep 26, 2018
@stale stale bot closed this Oct 3, 2018
@edevil
edevil commented Nov 2, 2018

I still think that a configurable global command timeout is useful when dealing with unresponsive servers because there are some modules that take a Redis proxy object and issue the commands themselves. All these modules would need to be updated to correctly guard for timeouts.

@edevil
edevil commented Jul 23, 2019

ping

@andrewaustin
@luin Any chance of reviving this one?

@BryanDonovan
In case anyone stumbles upon this issue, I think it's closed by #1320

@luin luin deleted the command-timeout branch March 14, 2022 03:27
@seminarian
seminarian commented Mar 21, 2024

It would be nice if the promise that sends the command to the server were cancelled once commandTimeout is hit.
