Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify cluster connection behaviour and error handling #753

Closed
nicokaiser opened this issue Dec 3, 2018 · 4 comments · Fixed by #762
Closed

Clarify cluster connection behaviour and error handling #753

nicokaiser opened this issue Dec 3, 2018 · 4 comments · Fixed by #762

Comments

@nicokaiser
Copy link

It seems to me that starting from version 4 error handling in Redis.Cluster is either incomplete or unclear:

  1. Errors on auto-connect are ignored:
const startupNodes = [
    { host: 'existingnode.domain', port: 6379 },
    { host: 'nonexistingnode.domain', port: 6379 }
];

const redis = new Redis.Cluster(startupNodes, { lazyConnect: false });

redis.on('error', err => {
    // this is never called
});

redis.publish(channel, data)
    .then(() => {
        // this is never called, as there are no timeouts
    })
    .catch(err => {
        // this is also never called
    });

This can be mitigated by explicitly calling connect():

const redis = new Redis.Cluster(startupNodes, { lazyConnect: true });

redis.connect().catch(err => {
    // this is called
});

redis.on('error', err => {
    // this is still never called
});
  1. DNS Errors on reconnect are ignored:

I guess (!) these kind of errors can only occur during initial connect, not during reconnect. Otherwise this issue would be even much more problematic, as errors during reconnect are also ignored: https://github.com/luin/ioredis/blob/v4.2.3/lib/cluster/index.ts#L214

It this behaviour intentional? If so, this should be noted somewhere in the docs, as it is different from how ioredis 3.x behaved. In 3.x, any connection errors are handled by the reconnection procedure and thus connection is re-tried until forever (or at least, until the clusterRetryStrategy returns no number anymore).

I'm not sure if this is related to #709, but it feels like something else, as DNS errors are the only errors that can have Cluster#resolveStartupNodeHostNames() (and thus, Cluster#connect()) reject.

@nicokaiser
Copy link
Author

Update: this can also happen on reconnect, e.g. if the DNS for one cluster node is not available anymore. I consider this as critical, as error handling in these cases is not possible and ioredis stops reconnecting without the application noticing.

ioredis should either emit an "error" event in these cases, or – like in version 3 – retry connecting following the clusterRetryStrategy.

@luin luin self-assigned this Dec 12, 2018
luin added a commit that referenced this issue Dec 12, 2018
@luin luin closed this as completed in #762 Dec 16, 2018
luin added a commit that referenced this issue Dec 16, 2018
Previously, exceptions caused by DNS resolving will not trigger any reconnections and fail silently. That makes errors much harder to be handled. This fix catches these exceptions and triggers a reconnection to keep the same behaviors as v3.

Closes #753
ioredis-robot pushed a commit that referenced this issue Dec 16, 2018
## [4.3.1](v4.3.0...v4.3.1) (2018-12-16)

### Bug Fixes

* **cluster:** handle connection errors by reconnection ([#762](#762)) ([21138af](21138af)), closes [#753](#753)
@ioredis-robot
Copy link
Collaborator

🎉 This issue has been resolved in version 4.3.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

@knoxcard
Copy link
Contributor

knoxcard commented Jul 5, 2019

This is still a problem, why is this soooo difficult?

@abhimanyuPathania
Copy link

Following is still an issue in 4.22.0

redis.on('error', err => {
    // this is never called
});

janus-dev87 added a commit to janus-dev87/ioredis-work that referenced this issue Mar 1, 2024
## [4.3.1](redis/ioredis@v4.3.0...v4.3.1) (2018-12-16)

### Bug Fixes

* **cluster:** handle connection errors by reconnection ([#762](redis/ioredis#762)) ([21138af](redis/ioredis@21138af)), closes [#753](redis/ioredis#753)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants