Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Named Services #1509

Merged
merged 8 commits into from
Apr 21, 2021
Merged

Introduce Named Services #1509

merged 8 commits into from
Apr 21, 2021

Conversation

refs
Copy link
Member

@refs refs commented Mar 2, 2021

What?

Currently, the way that services are configured is very binary: if a key is present in the config file, use its value, or else default to the gateway address. This is inflexible in terms of scaling (in the cloud?) and we would rather use names instead of IP addresses.

When communicating with another service the caller needs to know beforehand the address of the target, the existing implementation defaults such address to the gateway. Every service needs to explicitly know the endpoints it needs to communicate with. This is a proposal to move away from hardcoding service IP addresses and rely upon name resolution instead. It delegates the address lookup to, for the sake of this PR, a static in-memory service registry, which can be re-implemented in multiple forms, since there is only an interface to satisfy.

Technical Details

Reva configuration has been extended to support a top-level registry key / value pair. Its value looks like this:

## global registry initialization
[registry.services]
"com.owncloud.authregistry" = [{nodes = [{ address = "localhost:19000" }]}]

Note

A decision was made not to support backward compatibility with the current IP addresses mapping in order to isolate the changes to a single service and fail if misconfigured; backward compatibility can be easily added to provide support for existing functionality by simply failing at querying the registry and defaulting to the current defaults.

Implementation

The glue comes in the form of a new interface introduced in Reva, the Registry interface, that looks like:

type Registry interface {
	Add(Service) error
	GetService(string) ([]*Service, error)
}

a Service type (that will be moved to be an interface)

type Service struct {
	Name  string 	`mapstructure:"name"`
	Nodes []Node 	`mapstructure:"nodes"`
}

and Nodes:

type Node struct {
	ID string 					`mapstructure:"id"`
	Address string 				`mapstructure:"address"`
	Metadata map[string]string 	`mapstructure:"metadata"`
}

This set of interfaces and types would theoretically allow any consumer of Reva that wants to start its own service to inject its own registry. Registry injection happens now as a functional option on runtime.go:

func WithRegistry(r registry.Registry) Option {
	return func(o *Options) {
		o.Registry = r
	}
}

Current Registry Implementation breakdown

The current registry relies on an in-memory implementation of the registry.Registry interface. It will parse the static config from the toml file. This has a set of drawbacks in the way that services outside the process boundary will not know of any registered service, but since the current changes are contained to the gateway, these effects are not yet seen. A good approach would be to have a dedicated service registry service running so it can be queried upon (for more on this, see https://www.nginx.com/blog/service-discovery-in-a-microservices-architecture/).

One drawback of this approach, as previously mentioned, is that we're using a global registry as default, to prove this hypothesis and showcase how it would work on Reva and with library consumers! We do not encourage the use of globals :)

Ideas and further improvements

  • With this in mind we could sanitize the set of defaults on the gateway config.
  • Config would be much simpler as we don't have to carry IP addresses around every time a node is restarted, or we can provide support for dynamic IP addresses; configuration would just be a matter of writing the service name.
  • Client-side load balancing; by adding client awareness of service registries and a selection strategy.

@refs refs requested a review from labkode as a code owner March 2, 2021 10:40
@update-docs
Copy link

update-docs bot commented Mar 2, 2021

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@refs refs marked this pull request as draft March 2, 2021 10:42
@refs
Copy link
Member Author

refs commented Mar 2, 2021

@refs
Copy link
Member Author

refs commented Mar 2, 2021

CI tests are failing because the config is not yet updated :)

@butonic
Copy link
Contributor

butonic commented Mar 2, 2021

For an overview of the basic go-micro interfaces have a look at owncloud/ocis#1749

@ishank011
Copy link
Contributor

@refs @butonic this looks really cool!

One suggestion would be to have one more abstraction for the drivers of a particular service in the code itself. That way, we'll have only one service corresponding to a specific driver. The code would be cleaner and the load-balancing part would be more evident.

So instead of the current mapping map[string][]*Service, which can take the form

authprovider -> [{"auth-basic", [addr1, addr2]}, {"auth-bearer", [addr3, addr4]}, {"auth-basic", [addr5]}]

we can have map[string]map[string]*Service

authprovider -> {"basic" -> {"auth-basic", [addr1, addr2, addr5]}, "bearer" -> {"auth-bearer", [addr3, addr4]}}

Also, didn't we decide on trying out the micro interfaces directly? We're adding similar interfaces here, right?

@ishank011
Copy link
Contributor

Also, I'd be definitely in favour of a dedicated service registry, where independent services can register themselves (over GRPC?) Does micro have that?

@refs
Copy link
Member Author

refs commented Mar 2, 2021

Also, didn't we decide on trying out the micro interfaces directly? We're adding similar interfaces here, right?

We thought this way we wouldn't need to bring go-micro as a dependency and instead use our own interface definitions, a less aggressive approach ^^

@butonic
Copy link
Contributor

butonic commented Mar 2, 2021

The go micro interfaces cover service discovery as well as clients and services. It even covers transport, codec, broker and what not. But one thing at a time.

  1. A service registry could remain static in reva. Ocis could inject a dynamic one that is backed by a go micro implementation, eg. consul, kubernetes, nats, mdns ... That is what this PR shows
  2. The go-micro interfaces use metadata to distinguish services, eg. the authprovider. For the static config we can do what you propose in Introduce Named Services #1509 (comment)
  3. go micro clients can have a selector strategy that chooses the next node in the list of service nodes that was returned by the registry. I think we can replace the todo pool with clients that follow the same mechanism. It requires a registry to allow invalidating cached connections.

If this makes sense, then we can add this in small steps. I'd say after fixing more business logic. But this would be a great way forward!

pkg/registry/memory/node.go Outdated Show resolved Hide resolved
pkg/registry/memory/node.go Outdated Show resolved Hide resolved
pkg/registry/registry.go Outdated Show resolved Hide resolved
cmd/revad/runtime/runtime.go Outdated Show resolved Hide resolved
pkg/registry/memory/service.go Show resolved Hide resolved
@butonic
Copy link
Contributor

butonic commented Apr 6, 2021

@ishank011 I think we should move forward with this kind of registry. We currently do not need other aspects of go micro.

@labkode
Copy link
Member

labkode commented Apr 6, 2021

@butonic yes, we should not involve go-micro related code for now.

@refs
Copy link
Member Author

refs commented Apr 6, 2021

Indeed :) Let's fix the broken parts in this PR and get it merged. I will tackle it after my current task.

@refs refs marked this pull request as ready for review April 19, 2021 09:11
@refs
Copy link
Member Author

refs commented Apr 19, 2021

If these changes are okay, we could merge this one so we get at least the code in and could start using it in the short-term future. I evaluated and considered not to change the current config files because it would be a mess to update all occurrences in the config where we rely on the current way of configuring.

I'm in favor of providing backward compatibility with the current state of the codebase, and follow up with a PR in which we tackle the pool.go file to include the changes that use the new registry functionality.

/cc @ishank011 @labkode @butonic

@refs
Copy link
Member Author

refs commented Apr 20, 2021

ping @ishank011 shall we merge this one?

@refs refs requested a review from ishank011 April 20, 2021 10:52
@ishank011
Copy link
Contributor

@refs will review today

@refs
Copy link
Member Author

refs commented Apr 20, 2021

Thanks!

Copy link
Contributor

@ishank011 ishank011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @refs. One generic question. We always ensure that for a given service name, we have only one value in the map. We just append the nodes of the parameter passed in the Add call. Why not have the map as map[string]registry.Service then instead of a slice?

Also, can we incorporate this comment?

So instead of the current mapping map[string][]*Service, which can take the form

authprovider -> [{"auth-basic", [addr1, addr2]}, {"auth-bearer", [addr3, addr4]}, {"auth-basic", [addr5]}]

we can have map[string]map[string]*Service

authprovider -> {"basic" -> {"auth-basic", [addr1, addr2, addr5]}, "bearer" -> {"auth-bearer", [addr3, addr4]}}

In this implementation, if I get a bearer auth request, I would need to traverse all nodes and check the metadata to find the node corresponding to the driver, plus load balancing would be a bit more cumbersome to implement. If we had the above implementation, the simplest strategy would just be selecting a random element from a slice.

pkg/registry/memory/memory.go Outdated Show resolved Hide resolved
@refs
Copy link
Member Author

refs commented Apr 21, 2021

Why not have the map as map[string]registry.Service then instead of a slice?

You're right, and this was at the core of my concerns for the memory implementation. My implementation seems overly complicated to understand. The original idea was to provide a slight grim of flexibility but at some point along the way things got confusing. I'll simplify it and incorporate the changes in your comment, the arguments in favor are better than my mindless design 👯 I'll ping you when I get them in, thanks for the pair of eyes 👀 !

… map[string]registry.Service instead of map[string][]registry.Service for the memory implementation
@refs refs requested a review from ishank011 April 21, 2021 09:11
@refs
Copy link
Member Author

refs commented Apr 21, 2021

@ishank011 changes are in, could you review again? 🙏

@ishank011
Copy link
Contributor

@refs we aren't using the config anywhere, right? It has all the necessary structs but we're declaring them again in the memory driver.

I have a few thoughts about refactoring this:

  • Remove registry.Config entirely.
  • Change the registry.Service and registry.Node interfaces to structs. The drivers can just import this pkg and directly return these instead of having duplicate definitions.
  • The map should be of type map[string]map[string]*Service in the memory driver instead of Config.

@refs
Copy link
Member Author

refs commented Apr 21, 2021

Remove registry.Config entirely.

We do use it, I just commented it out for this temporary memory implementation:

func New(m map[string]interface{}) registry.Registry {
	// c, err := registry.ParseConfig(m)
	// if err != nil {
	//	return nil
	// }

	return &Registry{
		services: map[string]registry.Service{},
	}
}

Change the registry.Service and registry.Node interfaces to structs. The drivers can just import this pkg and directly return these instead of having duplicate definitions.

The point of using interfaces is not only that Reva packages can work with them seamlessly, it serves as an integration between external service registries (such as go-micro) and Reva's internal. They are basically a contract; the memory implementation defines unexported members, yet still works with the registry interfaces to satisfy the contract, we do the same in the ocis side with the go-micro registry.

A benefit we get from having config parsing at the same level as the registry is that we can literally parse config from files and use the config object independently of the implementation, a bit of duplication there (duplication in the sense of redefining the same structs) ensures that config does not change much, or if it changes, changes are propagated to all drivers since they all use the came config struct.

@ishank011

@ishank011
Copy link
Contributor

@refs ah okay, makes sense. Thanks for the explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants