Description
Proposed Change
We see an issue with the Route Registry code.
Register
Currently, Register() calls register(), which calls Pool.Put()
Put()
updates the pool if required, and returns the PoolPutResult
to register()
:
- UPDATED, whenever the endpoint has previously been registered in the pool (even if unmodified)
- Exception: UNMODIFIED, if the message is a duplicate (based on ModificationTag)
- ADDED, if the endpoint doesn't exist in the pool yet
register()
creates the pool if required, sets the time of update and forwards the PoolPutResult
to Register()
. Register()
receives and logs the PoolPutResult
, and ingests a generic route registry metric.
Unregister
On the other hand, we have a similar functionality with Unregister(), which calls unregister(), which calls pool.Remove().
However, the structure is quite different here:
Remove()
returns a boolean value reflecting whether the route has been removed from the pool, or not. unregister()
however does some logging based on the result of Remove()
, and removes the pool if required. Unregister()
finally unconditionally ingests a generic route unregistry metric.
Issues
There are some problems with the current code:
- We have a very generic metric for registering and unregistering endpoints - it does not reveal the result of the (un)registration (endpoint added, modified, unchanged, etc). However, tests have shown that the cost for each case differs, and that it is beneficial for operations to have more detailed metrics.
pool.Put()
always returns statusUPDATED
when the route exists in the pool. Hence, the metric doesn't reflect whether a route has received an update, or if not (in case its attributes haven't changed).Unregister()
always ingests anUnregistryMessage
, independently of the result of pool.Remove(). Hence, the metric doesn't reflect whether a route has been removed or not.- We should introduce an additional metric that reflects whether a whole route has been created or removed upon
Register()
/Unregister()
. - The structure of the
Register()
andUnregister()
functions and its related private functions should be refactored, e.g.- Code / function structure should be the same
- Logging should be done in the public function functionality
- Improve code structure in general, make the long if-clauses in registry.go less complicated
Acceptance criteria
- Have better metrics that reflect the actual result of registering and unregistering an endpoint, based on the
PoolPutResult
and aPoolRemoveResult
.- Avoid breaking changes by using tags for the actual result in
CaptureRegistryMessage
andCaptureUnregistryMessage
.
- Avoid breaking changes by using tags for the actual result in
- Fix the wrong
UPDATED
result in case of no-op registry update. - Refactor the code to mitigate the issues described above.
Related links
- Draft Pull-Request Improve gorouter Route Registry Message Metrics gorouter#456: This was created before being able to use metric tags (introduced with support for Prometheus based metrics). This Pull-Request already addresses the issues described here, but needs a final refactoring to have no breaking change due to metrics renaming.