This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Shared plugin: etcd starting up functions. #1611

Merged
merged 1 commit into master from shared-plugin-etcd-init on Jul 25, 2017

Conversation

luomiao
Contributor

@luomiao luomiao commented Jul 20, 2017

Design:
Swarm leader node starts an etcd cluster when the shared volume driver is initialized.
Swarm manager nodes join the etcd cluster through the leader node.
Swarm worker nodes do nothing.

The etcd cluster resides inside plugin runc container.
Thus when plugin is stopped, the etcd member inside the container dies.
When the plugin/VM is restarted, if the plugin finds that a member with the same peer URL
already exists in the etcd cluster, that stale entry is removed by the current plugin, which then re-joins.
msterin> Who should do it? Is it part of this PR or of future PRs?
miao> This PR already includes the remove and re-join. Modified the statement above to clarify.
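A minimal sketch of this remove-and-rejoin step, using the etcd clientv3 member API; etcdCli is a hypothetical clientv3 client handle and peerAddr is this node's peer URL (naming follows the diff below), and the exact code in this PR may differ:

// Scan the member list for a stale entry left by the previous plugin
// instance (same peer URL), remove it, then add this node back.
lresp, err := etcdCli.MemberList(ctx)
if err != nil {
    return err
}
for _, member := range lresp.Members {
    if member.PeerURLs[0] == peerAddr {
        // stale member from before the plugin/VM restart
        if _, err := etcdCli.MemberRemove(ctx, member.ID); err != nil {
            return err
        }
        break // one peer URL can appear at most once in the member list
    }
}
if _, err := etcdCli.MemberAdd(ctx, []string{peerAddr}); err != nil {
    return err
}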

Note 1:
Due to an etcd limitation, users should make sure the swarm cluster has a number of managers
(including the leader) equal to 1 or greater than 2, in order to avoid quorum loss.
msterin> What happens if they don't? Do we print a log message with a recommendation? What happens if they lose quorum?
miao> I updated the statement above to be more accurate. In our case, quorum loss mostly happens when 1 member is added to a 1-node cluster and then leaves: since etcd requires a quorum of floor(N/2)+1 members, a 2-node cluster needs both members, so that member cannot rejoin after quorum is lost. But if 2 members join a 1-node cluster (making it a 3-node cluster) and 1 member leaves, quorum still exists (etcd-io/etcd#6420). When quorum loss happens, the etcd cluster can still work, but in an unhealthy mode, which means it cannot add or remove members. We log an error if a user tries to start a new manager plugin and it fails. This is not recoverable for now, but etcd has an open, in-progress issue on it. I think it's OK for us to add a recommendation in the README and simply fail the new manager plugin when quorum loss happens, until etcd makes progress on this situation.

Note 2:
For now the etcd cluster start/join only happens when the plugin is initialized.
Thus if a swarm worker node has the plugin installed first and is promoted to manager later,
it won't join the etcd cluster. Until this is resolved, users should reinstall the plugin
on a promoted node.
msterin> Is there an issue for this (or a work item in internal tracking)? Also, can we print a message if this happens? We could do a periodic poll (every 5 min?) to check the node status and the etcd node status, unless the issue is planned to be fixed soon.
miao> This needs more investigation and I will create a work item for this issue. We can do a periodic poll to trigger etcd member operations if the plugin cannot get a notification when a node is promoted.

Note 3:
We are using the default docker swarm network (the IP address from docker info) for now. As a result, the new plugin configuration requires Network type = host.
msterin> Same comment - we need either an issue or an internal work item to track this, and in any case please write up the specific commands to configure the plugin.
miao> Yes, we have a work item for it. No command needs to be run by users, though: the config.json of the plugin build already sets this by default, and it is included with this PR.

Dockerfile change:
Create another Dockerfile template, so we have two separate Dockerfile templates for the two plugins. This is because the new shared plugin requires etcd to be installed, and the difference is too big to share a single template.

Test:
e2e test passed.

etcdPeerPort = ":2380"
etcdClusterToken = "vsphere-shared-etcd-cluster"
etcdListenURL = "0.0.0.0"
etcdScheme = "http://"
Contributor

etcdProtocol? Or etcdTransport?

Contributor Author

"scheme" is the corresponding member name of URL struct: https://golang.org/pkg/net/url/#URL


// InitEtcd starts or joins the etcd cluster, depending on the role of the node
func InitEtcd() error {
// create new docker client
Contributor

I was adding similar things to shared_plugin.go. Should we move the docker Client to VolumeDriver structure and the docker info part to a function in shared_plugin.go?

We might want to call docker info multiple times in various places. Perhaps a better place would be shared_plugin.go. A function getsDockerInfo() could return a structure similar to this:

// Docker swarm information
type SwarmMetadata struct {
    isManager bool
    isLeader  bool
    selfIP    string
    managers  []string
}

Contributor Author

It will depend on how and where we use the docker client.
I am fine with putting it into the VolumeDriver structure later, when it is used in other situations.

I am not sure about the getsDockerInfo() function, though.
The returned info already contains that information, so why bother copying it into another struct?
And that information changes, so we cannot reuse it...

node, _, err := cli.NodeInspectWithRaw(ctx, nodeID)
if err != nil {
log.WithFields(log.Fields{"nodeID": nodeID,
"error": err}).Error("Failted to inspect node ")
Contributor

Failed. Typo.

return err
}

// get the IP address of current node
Contributor

info.Swarm.NodeAddr can give you this node's IP address.

BTW this is docker CLI code. Might be helpful.

https://github.com/docker/cli/blob/2bfac7fcdafeafbd2f450abb6d1bb3106e4f3ccb/cli/command/system/info.go
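For reference, a minimal sketch of reading these fields through the Docker API client; swarmSelf is a hypothetical helper, not code from this PR:

import (
    "context"

    dockerClient "github.com/docker/docker/client"
)

// swarmSelf returns this node's swarm address and whether it is a manager,
// using the Info fields mentioned above.
func swarmSelf() (nodeAddr string, isManager bool, err error) {
    cli, err := dockerClient.NewEnvClient()
    if err != nil {
        return "", false, err
    }
    info, err := cli.Info(context.Background())
    if err != nil {
        return "", false, err
    }
    return info.Swarm.NodeAddr, info.Swarm.ControlAvailable, nil
}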

Contributor Author

Yes, here we can get NodeAddr from Info.Swarm.NodeAddr.
However, later when we need to get the address of the leader from a non-leader manager,
we cannot do "docker info" for a remote node; we can only do "docker node inspect".
The struct returned by NodeInspectWithRaw is the swarm.Node structure, instead of types.Info,
so we have to go through swarm.Node.ManagerStatus.Addr to get the addresses...
That's why I decided to reuse that code :)
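A sketch of that code path; managerAddr is a hypothetical helper (not code from this PR), cli is the Docker API client, and ManagerStatus is nil for worker nodes:

import (
    "context"

    dockerClient "github.com/docker/docker/client"
)

// managerAddr inspects a node by ID and returns its manager address
// (swarm.Node.ManagerStatus.Addr) and whether it is the swarm leader.
func managerAddr(ctx context.Context, cli *dockerClient.Client, nodeID string) (string, bool, error) {
    node, _, err := cli.NodeInspectWithRaw(ctx, nodeID)
    if err != nil {
        return "", false, err
    }
    if node.ManagerStatus == nil {
        // worker node: no manager address available
        return "", false, nil
    }
    return node.ManagerStatus.Addr, node.ManagerStatus.Leader, nil
}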

nodes, err := cli.NodeList(ctx, types.NodeListOptions{})
if err != nil {
log.WithFields(log.Fields{"nodeID": nodeID,
"error": err}).Error("Failted get NodeList from swarm manager")
Contributor

Typo. Failed.

return err
}

peerAddr := etcdScheme + nodeAddr + etcdPeerPort
Contributor

Can peerAddr be renamed to nodeURL or something? This address is the local node's address. The word "peer" is confusing.

Contributor Author
@luomiao luomiao Jul 20, 2017

peerAddr corresponds to the peerURLs field inside the etcd member structure:
https://coreos.com/etcd/docs/latest/v2/members_api.html
which means the addresses for peer traffic.

It also matches the etcd MemberAdd function:
https://github.com/coreos/etcd/blob/d51d381eca406156fd7e44118d4b0dab03589354/clientv3/cluster.go#L36

Since we are dealing with etcd here, I think it's better to keep the names consistent with the etcd structures and functions.


initCluster := ""
if !existing {
peerAddrs := []string{peerAddr}
Contributor

I am guessing MemberAdd() accepts a list of strings, so we converted it to []string. But "peer" is confusing. If peerAddrs holds the address of this node, could the name nodeURLs be used instead?

Contributor Author

https://github.com/coreos/etcd/blob/d51d381eca406156fd7e44118d4b0dab03589354/clientv3/cluster.go#L36
It's the right name in the etcd world...
It specifically means the URLs on which a member listens for peer traffic.

@luomiao luomiao force-pushed the shared-plugin-etcd-init branch 2 times, most recently from dbea5c7 to a811f9a Compare July 20, 2017 08:13
Contributor

@shivanshu21 shivanshu21 left a comment

What I am looking for is a function which returns:

  1. Who am I? Worker/Manager
    As this information is needed in most workflows
  2. My IP address
  3. Manager's IP address

Since you're already fetching this info, if you can put it in the form of a function that returns the same, we can reuse it in many workflows.

Also, in my opinion, code interacting with docker should be moved to shared_plugin.go

Other than that this LGTM.

@luomiao
Contributor Author

luomiao commented Jul 20, 2017

Since we need to call docker info every time anyway and there is no extra processing needed, IMO there is no need to add an extra function wrapper...

How it looks with the function wrapper:

type SwarmMetadata struct {
    isManager bool
    isLeader  bool
    selfIP    string
    managers  []string
}

func getsDockerInfo() SwarmMetadata {
    var metadata SwarmMetadata
    info, err := cli.Info(ctx)
    metadata.isManager = info.Swarm.ControlAvailable
    metadata.selfIP = info.Swarm.NodeAddr
    for _, manager := range info.Swarm.RemoteManagers {
        metadata.managers = append(metadata.managers, manager.Addr)
    }
    ...
    return metadata
}

// Below is the repeatedly used code
// when the new info is needed
metadata := getsDockerInfo()

// when we need to check if this node is a manager
if !metadata.isManager ...

// when we need to get selfIP
something := metadata.selfIP

// when we need to loop over the managers
for _, managerIP := range metadata.managers {
    ...
    do something with managerIP
}

How it looks without the function wrapper:

// Below is the repeatedly used code
// when the new info is needed
info, err := cli.Info(ctx)

// when we need to check if this node is a manager
if !info.Swarm.ControlAvailable ...

// when we need to get selfIP
something := info.Swarm.NodeAddr

// when we need to loop over the managers
for _, manager := range info.Swarm.RemoteManagers {
    ...
    do something with manager.Addr
}

So if we compare the two above, the function wrapper does not save any code; it actually adds an extra loop to collect the manager IP addresses. So I would prefer not to use a wrapper here.

@msterin
Contributor

msterin commented Jul 20, 2017

@luomiao - I have a few questions, added with msterin> to the description. Please take a look

return nil
}

func etcdRoutine(cmd []string) {
Contributor

The name feels wrong. etcdService() or etcdLoop() or even etcdDaemon() would be closer to what's really going on.

@luomiao luomiao force-pushed the shared-plugin-etcd-init branch 2 times, most recently from 119c888 to db8f921 Compare July 24, 2017 07:18
Contributor

@msterin msterin left a comment

Looks good.
A few string/error messages need improvement, but overall the logic is clear. I did not check for releasing objects like ctx or the client (I assume you checked whether it's needed).
One request - please take a look at either flattening out the nesting a bit, or commenting all nodes of the decision tree.
Other than that, ready to merge.

if info.Swarm.ControlAvailable == false {
log.WithFields(
log.Fields{"nodeID": nodeID},
).Info("Swarm node role: worker, return from InitEtcd ")
Contributor

This info message is confusing... maybe "Swarm node role: worker. Action: return from InitEtcd" (same comment applies to the other tracking Info messages).

nodes, err := cli.NodeList(ctx, types.NodeListOptions{})
if err != nil {
log.WithFields(log.Fields{"nodeID": nodeID,
"error": err}).Error("Failed get NodeList from swarm manager")
Contributor

'failed to get...'

peerAddr := etcdScheme + nodeAddr + etcdPeerPort
existing := false
for _, member := range lresp.Members {
if member.PeerURLs[0] == peerAddr {
Contributor

There should be a better way to find it. Have you checked existing packages?

Contributor Author

Etcd clientv3 only has 4 member API functions:
https://github.com/coreos/etcd/blob/d51d381eca406156fd7e44118d4b0dab03589354/clientv3/cluster.go#L31
and there is no API function to locate a specific member...
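Since clientv3 has no lookup call, the scan has to be done by hand; a hedged sketch of how it could be wrapped, with findMemberByPeerURL being a hypothetical helper rather than code from this PR:

import (
    "context"

    etcdClient "github.com/coreos/etcd/clientv3"
    pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
)

// findMemberByPeerURL scans the member list and returns the member whose
// peer URLs include the given URL, or nil if no such member exists.
func findMemberByPeerURL(ctx context.Context, cli *etcdClient.Client, peerURL string) (*pb.Member, error) {
    lresp, err := cli.MemberList(ctx)
    if err != nil {
        return nil, err
    }
    for _, member := range lresp.Members {
        for _, u := range member.PeerURLs {
            if u == peerURL {
                return member, nil
            }
        }
    }
    return nil, nil
}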

}
}

log.Errorf("Failed to create etcd client according to manager info")
Contributor

Need to print extra info here; if we got here, it's likely there was no debug print yet.

}

func addrToEtcdClient(addr string) (*etcdClient.Client, error) {
s := strings.Split(addr, ":")
Contributor

Please comment the function. Especially the string parsing part...
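A hedged sketch of what such comments could say, assuming addr is a swarm manager address of the form "IP:port" (for example "192.168.0.1:2377") and that 2379 is the etcd client port used here; the actual parsing in this PR may differ:

import (
    "strings"
    "time"

    etcdClient "github.com/coreos/etcd/clientv3"
)

// addrToEtcdClient takes a swarm manager address ("IP:port"), keeps the IP,
// swaps in the etcd client port, and builds an etcd clientv3 client for it.
func addrToEtcdClient(addr string) (*etcdClient.Client, error) {
    s := strings.Split(addr, ":") // s[0] is the IP, s[1] is the swarm port
    endpoint := etcdScheme + s[0] + ":2379" // assumed etcd client port
    return etcdClient.New(etcdClient.Config{
        Endpoints:   []string{endpoint},
        DialTimeout: 5 * time.Second,
    })
}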

return err
}
}
break
Contributor

Why break? Generally, this loop nesting looks too deep, so it either needs to be flattened out or (more likely) needs more comments added at each node of the decision tree.

Contributor Author
@luomiao luomiao Jul 25, 2017

Because one IP address can only join once, there is no need to continue the check after a match is found.
I will add some more comments here.

Design:
Swarm leader node starts an etcd cluster when the shared volume driver is initialized.
Swarm manager nodes join the etcd cluster through the leader node.
Swarm worker nodes do nothing.

The etcd cluster resides inside plugin runc container.
Thus when plugin is stopped, the etcd member inside the container dies.
When the plugin/VM is restarted, if the plugin finds that a member with the same peer URL
already exists in the etcd cluster, that entry should be removed and the node re-joined.

Due to an etcd limitation, users should start the swarm cluster with an odd number of managers
(including the leader), in order to avoid quorum loss.

For now the etcd cluster start/join only happens when the plugin is initialized.
Thus if a swarm worker node has the plugin installed first and is promoted to manager later,
it won't join the etcd cluster. Until this is resolved, users should reinstall the plugin
on a promoted node.
@luomiao luomiao merged commit b22d0c0 into vmware-archive:master Jul 25, 2017