Add liveness check to pkg/gameservers/controller. #116

Closed
enocom opened this issue Feb 27, 2018 · 7 comments
Comments

@enocom
Contributor

enocom commented Feb 27, 2018

Original Context.

I don't know if this is possible, but the controller should count as healthy only while the goroutines started by this code are still running and none of them has exited.

@enocom enocom added this to the 0.2 milestone Feb 27, 2018
@markmandel markmandel added kind/feature New features for Agones kind/design Proposal discussing new features / fixes and how they should be implemented labels Feb 27, 2018
@markmandel
Member

@enocom mind if I edit this ticket to include the full details? Just in case anyone else happens to want to pick this up? (Unless you want to expand this out, since I know you had ideas)

@enocom
Contributor Author

enocom commented Mar 14, 2018

@markmandel Please do. I think this issue might be good for first-time contributors, too.

@markmandel
Member

Some interesting notes here:

  • The health check handler is passed in here
  • The actual goroutines are run in the workerqueue, so there is likely an opportunity to standardise this across anything that uses a workerqueue.

Final thought: for a readiness check, readiness can be turned on as soon as the webhook that powers the MutatingWebhookConfiguration is ready.

@markmandel
Member

@enocom did you have any thoughts on how to track whether the goroutines are still active? (or maybe a PR 😁)

@enocom
Contributor Author

enocom commented Mar 15, 2018

I imagine adding a Registrar, similar in functionality to a sync.WaitGroup. We pass the Registrar to anything that starts a goroutine; when a goroutine starts, we call registrar.Inc(), with a corresponding defer registrar.Dec() for when it exits. The liveness check could then simply query the Registrar to see whether all registered goroutines are still running.
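
A minimal sketch of the Registrar idea, for illustration only. The `Registrar` type, its `Count()` method, and the `runWorker` helper are hypothetical names and are not part of Agones or the standard library. This version assumes a single atomic counter is enough, so the liveness check would compare `Count()` against the number of goroutines it expects to be running.

```go
package main

import "sync/atomic"

// Registrar is a hypothetical helper that counts the registered
// goroutines that are still running.
type Registrar struct {
	running int64 // goroutines that have called Inc but not yet Dec
}

// Inc records that a goroutine has started.
func (r *Registrar) Inc() { atomic.AddInt64(&r.running, 1) }

// Dec records that a goroutine has exited.
func (r *Registrar) Dec() { atomic.AddInt64(&r.running, -1) }

// Count returns the number of goroutines currently registered as running.
func (r *Registrar) Count() int64 { return atomic.LoadInt64(&r.running) }

// runWorker (hypothetical) starts fn in a new goroutine. It registers
// before the goroutine starts, so Count never misses a worker that has
// been launched, and deregisters when fn returns.
func runWorker(r *Registrar, fn func()) {
	r.Inc()
	go func() {
		defer r.Dec()
		fn()
	}()
}
```

With this shape, a liveness check that started N workers would report healthy only while `Count()` still equals N.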

Thoughts?

@markmandel
Member

markmandel commented Mar 15, 2018

Works for me, although Inc(), Dec() and Count() would need to be thread-safe. I'm assuming a sync.RWMutex?

Thought - would this be part of a Registrar, or maybe just part of the WorkerQueue?

@enocom
Contributor Author

enocom commented Mar 15, 2018

There's some question of where this responsibility belongs.

As for implementation, sync.RWMutex works, and maybe even a plain sync.Mutex would too. It just depends on where the hot paths are.

@enocom enocom self-assigned this Mar 19, 2018
fooock pushed a commit to fooock/agones that referenced this issue Apr 6, 2018
The liveness check is based on the worker queue having all its worker
goroutines running. If one of those goroutines exits, the liveness check
reports an unhealthy status.

Fixes googleforgames#116
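
The approach described in the commit message could look roughly like the sketch below. This is not the code from the referenced commit: `workerQueue`, `Run`, `Healthy`, and `livenessHandler` are hypothetical names, and serving the check with the standard `net/http` package is an assumption.

```go
package main

import (
	"errors"
	"net/http"
	"sync"
)

// workerQueue is a simplified, hypothetical stand-in for a worker queue.
type workerQueue struct {
	mu      sync.Mutex
	workers int // worker goroutines requested via Run
	running int // worker goroutines that have not exited
}

// Run starts n goroutines that each execute worker. The counters are
// updated before any goroutine starts, so Healthy never undercounts.
func (q *workerQueue) Run(n int, worker func()) {
	q.mu.Lock()
	q.workers += n
	q.running += n
	q.mu.Unlock()

	for i := 0; i < n; i++ {
		go func() {
			defer q.workerExited()
			worker()
		}()
	}
}

// workerExited records that one worker goroutine has returned.
func (q *workerQueue) workerExited() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.running--
}

// Healthy returns nil while all workers are running, and an error as
// soon as at least one of them has exited.
func (q *workerQueue) Healthy() error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.running < q.workers {
		return errors.New("a worker goroutine has exited")
	}
	return nil
}

// livenessHandler exposes Healthy as an HTTP liveness endpoint:
// 200 when healthy, 503 otherwise.
func livenessHandler(q *workerQueue) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		if err := q.Healthy(); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}
```

A real implementation would probably also need to distinguish a deliberate shutdown from an unexpected worker exit; this sketch treats any exit as unhealthy.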