Services' published ports are unreachable from other containers and from containers of the same service. #25463

Closed
dariko opened this issue Aug 6, 2016 · 13 comments
Labels: area/networking, area/swarm, kind/bug, status/needs-vendoring, version/1.12

dariko commented Aug 6, 2016

Output of docker version:

$ docker $(docker-machine config swarm) version
Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:13:30 2016
 OS/Arch:      linux/amd64
 Experimental: true

Server:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 23:54:00 2016
 OS/Arch:      linux/amd64

Output of docker info:

$ docker $(docker-machine config swarm) info
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 3
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: active
 NodeID: e84m2k5px9q2hb1ejkpooophr
 Is Manager: true
 ClusterID: 34rz36vpo85vqxcvusj6bumo4
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: 192.168.99.100
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.4.16-boot2docker
Operating System: Boot2Docker 1.12.0 (TCL 7.2); HEAD : e030bab - Fri Jul 29 00:29:14 UTC 2016
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 995.7 MiB
Name: swarm
ID: MQV6:42CG:32ZN:T4VS:CEER:DY3T:XEB7:IEEC:2REO:HRAP:7W5P:H7XA
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 40
 Goroutines: 131
 System Time: 2016-08-06T08:25:31.524129655Z
 EventsListeners: 1
Registry: https://index.docker.io/v1/
Labels:
 provider=virtualbox
Insecure Registries:
 127.0.0.0/8

When a service publishes a port, the corresponding port on the Docker host/swarm node is not reachable from other containers, nor from the containers of the service itself.

To reproduce

create a docker-machine

$ docker-machine create -d virtualbox --virtualbox-cpu-count=2 --virtualbox-memory 1024  swarm
$ docker-machine ip swarm
192.168.99.100

initialize the single manager swarm

$ docker $(docker-machine config swarm) swarm init --advertise-addr $(docker-machine ip swarm)

create a service replying to requests with the container hostname, published to port 80

$ docker $(docker-machine config swarm) service create --replicas 1 --name hostname --publish 80:80 alpine \
    sh -c 'while true; do echo -e "HTTP/1.1 200 OK\n\n$(hostname)" | nc -lp 80; done'
$ docker $(docker-machine config swarm) ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS               NAMES
268e187ddb4b        alpine:latest       "sh -c 'while true; d"   About a minute ago   Up About a minute                       hostname.1.c7r9pr2y86rmmkhhyg0asq9l8

Then, testing the container's reachability:

  • from outside the docker-machine the container is reachable
$ curl -s 192.168.99.100
d532f8bf9c62
  • from inside the docker-machine the container is reachable
$ docker-machine ssh swarm curl -s 192.168.99.100
d532f8bf9c62
  • from the container itself, via localhost, the container is reachable
$ docker $(docker-machine config swarm) exec 268 sh -c "echo|nc localhost 80"
HTTP/1.1 200 OK

268e187ddb4b

I also tried reaching the service from another container, and attaching both containers to a dedicated overlay network, but the results were unchanged.

I checked the docker-machine network configuration and I noticed this iptables rule:

    9       21  1260 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0           

which seems to be the culprit.
In fact, if I delete it, the container becomes reachable again on the external endpoint:

$ docker-machine ssh swarm "sudo /usr/local/sbin/iptables -D FORWARD 9"
$ docker $(docker-machine config swarm) exec 268 sh -c "echo|nc 192.168.99.100 80"
HTTP/1.1 200 OK

268e187ddb4b

I imagine this rule is used for maintaining isolation. If so, I think it needs exceptions for published ports.
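
Rule numbers in the FORWARD chain shift whenever Docker adds or removes networks, so deleting a rule by position is fragile. A minimal sketch, assuming the column layout of `iptables -nvL --line-numbers` shown in this report, that looks up the rule number instead of hard-coding it. The helper name `find_gwbridge_drop` is hypothetical and not part of Docker or iptables.

```shell
# find_gwbridge_drop (hypothetical helper): reads the output of
# `iptables -nvL FORWARD --line-numbers` on stdin and prints the number of
# every DROP rule whose in and out interfaces are both docker_gwbridge.
# Assumed columns: num pkts bytes target prot opt in out source destination
find_gwbridge_drop() {
  awk '$4 == "DROP" && $7 == "docker_gwbridge" && $8 == "docker_gwbridge" { print $1 }'
}

# Possible usage on the machine from this report (not executed here):
#   docker-machine ssh swarm sudo /usr/local/sbin/iptables -nvL FORWARD --line-numbers \
#     | find_gwbridge_drop
```

Removing the rule is only a diagnostic step: it presumably also removes the isolation the rule was meant to provide.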


Here are the docker-machine routing and iptables configuration:

$ docker-machine ssh swarm sudo /usr/local/sbin/iptables -nvL --line-numbers
Chain INPUT (policy ACCEPT 7036 packets, 2871K bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1       21  1260 DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
2       21  1260 DOCKER     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
3        0     0 ACCEPT     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
4        0     0 ACCEPT     all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
5        0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
6        0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
7        0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
8        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
9       21  1260 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 4665 packets, 933K bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (2 references)
num   pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION (1 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 DROP       all  --  docker0 docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
2        0     0 DROP       all  --  docker_gwbridge docker0  0.0.0.0/0            0.0.0.0/0           
3       21  1260 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           




$ docker-machine ssh swarm sudo /usr/local/sbin/iptables -t nat -nvL --line-numbers
Chain PREROUTING (policy ACCEPT 454 packets, 27000 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1      475 28260 DOCKER-INGRESS  all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL
2      471 27972 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 454 packets, 27000 bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 67 packets, 5113 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 DOCKER-INGRESS  all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL
2        0     0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 67 packets, 5113 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 MASQUERADE  all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match src-type LOCAL
2        0     0 MASQUERADE  all  --  *      !docker_gwbridge  172.18.0.0/16        0.0.0.0/0           
3        0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           

Chain DOCKER (2 references)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 RETURN     all  --  docker_gwbridge *       0.0.0.0/0            0.0.0.0/0           
2        0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-INGRESS (2 references)
num   pkts bytes target     prot opt in     out     source               destination         
1       21  1260 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:172.18.0.2:80
2      454 27000 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0   
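
One possible reading of the tables above, offered as an assumption rather than a confirmed diagnosis: the DOCKER-INGRESS chain rewrites traffic for port 80 to 172.18.0.2:80, an address on docker_gwbridge, so a request that originates from a container on that same bridge is forwarded from docker_gwbridge to docker_gwbridge and matches the DROP rule in FORWARD. The packet counters (21 on both the DNAT rule and the DROP rule) are consistent with that. The sketch below lists the published-port mappings from the nat table output; the helper name `list_ingress_dnat` is hypothetical.

```shell
# list_ingress_dnat (hypothetical helper): reads the output of
# `iptables -t nat -nvL DOCKER-INGRESS --line-numbers` on stdin and prints
# "published_port -> target" for each DNAT rule.
# Assumes the target column is field 4 and the match options start at field 11.
list_ingress_dnat() {
  awk '$4 == "DNAT" {
    port = ""; target = ""
    for (i = 11; i <= NF; i++) {
      if ($i ~ /^dpt:/) port = substr($i, 5)    # strip the "dpt:" prefix
      if ($i ~ /^to:/)  target = substr($i, 4)  # strip the "to:" prefix
    }
    if (port != "" && target != "") print port " -> " target
  }'
}

# Possible usage on the machine from this report (not executed here):
#   docker-machine ssh swarm sudo /usr/local/sbin/iptables -t nat -nvL DOCKER-INGRESS --line-numbers \
#     | list_ingress_dnat
```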


$ docker-machine ssh swarm sudo ip route
default via 10.0.2.2 dev eth0  metric 1 
10.0.2.0/24 dev eth0  proto kernel  scope link  src 10.0.2.15 
127.0.0.1 dev lo  scope link 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 
172.18.0.0/16 dev docker_gwbridge  proto kernel  scope link  src 172.18.0.1 
192.168.99.0/24 dev eth1  proto kernel  scope link  src 192.168.99.100 
@thaJeztah added the kind/bug, area/networking, and area/swarm labels Aug 8, 2016
@thaJeztah
Member

ping @sanimej could you have a look?

@rogaha
Contributor

rogaha commented Sep 8, 2016

Any update on this? It looks like it's blocking lots of use cases such as running spark on swarm mode (#24637).

@sanimej

sanimej commented Sep 8, 2016

This has been fixed by libnetwork #1398 and will be available in docker master through #25962

@rogaha
Contributor

rogaha commented Sep 8, 2016

awesome! Thanks for the heads up @sanimej!

@thaJeztah
Member

@sanimej moby/libnetwork#1398 is on the 1.13 milestone; is there a solution for a possible 1.12.2 release?

@thaJeztah
Member

Sorry, meant #25962

@chernals

@thaJeztah Facing the same issue, can't wait to see it fixed. In the meantime, has anyone tested whether building from source (master) properly solves the problem?

@sanimej

sanimej commented Sep 15, 2016

@thaJeztah #25962 brought in docker run on swarm network which is a 1.13 feature. But for 1.12.2 we will cherry-pick the fix for this issue. This can't wait for 1.13.

@icecrime
Contributor

Note for 1.12.2: we need to make sure moby/libnetwork@9dfce0b is in.

@mrjana
Contributor

mrjana commented Sep 26, 2016

Closing this as this is fixed in #25962

@mrjana closed this as completed Sep 26, 2016
@thaJeztah
Member

@mrjana should we wait for the vendor PR for 1.12.2? Or was that opened / merged already?

@mrjana
Contributor

mrjana commented Sep 26, 2016

@thaJeztah Yeah, the vendor PR for 1.12.2 was merged, but shouldn't this be closed regardless, since it was fixed in master via #25962 a long time ago?

@thaJeztah
Member

@mrjana yes, in general, I agree; it was purely so that we don't lose sight of it (that has happened before), but if it's fixed in the 1.12.2 branch, closing is fine 👍
