Auto-reconnect or exit quickly when the Cassandra cluster goes down and then comes back up. #1821
Comments
I'm experiencing the same issue. I run Cassandra as a StatefulSet within my K8s cluster. For some reason my Cassandra cluster became unhealthy. I initiated a restart of the Cassandra pods and the cluster was healthy again, but I got the same error you mention. I was able to get things working again by restarting the Loki pod, but that is not a preferred fix.
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
I also have this error on our live Loki deployment.
**What this PR does / why we need it**: level=error ts=2022-07-20T04:07:11.881370946Z caller=flush.go:146 org_id=166256_8sxv2 msg="failed to flush user" err="store put chunk: gocql: no hosts available in the pool" A restart of Cassandra (on K8s) caused a production incident in which Loki had to be restarted. This happened 4 times in our production environment in just half a year, so I hope this problem can be fixed. The code in this PR is not particularly polished, but it appears to work. If this PR is not merged, I hope other contributors in the Loki community will propose another PR to fix this problem. That a Cassandra restart made Loki unavailable is a conclusion already recorded in our JIRA. **Which issue(s) this PR fixes**: Fixes #1821 #7140
Is your feature request related to a problem? Please describe.
We use a Cassandra cluster as storage. When the Cassandra cluster goes down, this error occurs, and Loki never recovers even after Cassandra is back up.
And if we try to restart Loki, the shutdown never completes because Loki loops forever retrying the flush.
Describe the solution you'd like
Since gocql seems unlikely to fix this itself (e.g. gocql-915, created in 2017 and never closed),
could Loki do something to reconnect? Or at least not retry the flush forever.
Describe alternatives you've considered
Additional context
Add any other context or screenshots about the feature request here.