Ansible and ControlMaster file naming
This is a short story of how Ansible and SSH (using the default `ControlMaster` path format) bit me.
Let's take this network layout as an example, where we have the same IP behind different machines.

                                             +--------------+      +------------+
                                             |  dc1-jumper  |      |   dc1-db   |
                                      +------>   8.8.8.8    +------>  10.0.0.2  |
                                      |      |              |      |            |
    +----------+      +-------------+ |      +--------------+      +------------+
    |          |      |             | |
    | Ansible  +------>  mgmt-host  +-+
    | Client   |      |             | |
    |          |      +-------------+ |      +--------------+      +------------+
    +----------+                      |      |  dc2-jumper  |      |   dc2-db   |
                                      +------>   8.8.4.4    +------>  10.0.0.2  |
                                             |              |      |            |
                                             +--------------+      +------------+
And we have these lines in our SSH config file:

    Host dc1-jumper
        HostName dc1-jumper
        User root
        Port 22
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no

    Host dc2-jumper
        HostName dc2-jumper
        User root
        Port 22
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no

    Host dc1-db
        HostName 10.0.0.2
        User root
        Port 22
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no
        ProxyCommand ssh dc1-jumper -W %h:%p

    Host dc2-db
        HostName 10.0.0.2
        User root
        Port 22
        UserKnownHostsFile /dev/null
        StrictHostKeyChecking no
        ProxyCommand ssh dc2-jumper -W %h:%p

    Host *
        ControlMaster auto
        ControlPath ~/.ssh/master-socket/%r@%h:%p
        ControlPersist 6s
Now the problem is that `dc1-db` and `dc2-db` share the same IP, so the master socket file, whose name is built from the placeholders `%r@%h:%p`, ends up with the same file name for both. If you try to connect to `dc1-db` right after you have connected to `dc2-db`, guess where you will end up?

    $ ll ~/.ssh/master-socket
    0 srw-------. 1 Jan 31 17:42 root@10.0.0.2:22
    0 srw-------. 1 Jan 31 17:41 root@dc1-jumper:22
    0 srw-------. 1 Jan 31 17:41 root@dc2-jumper:22
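To make the collision concrete, here is a minimal Python sketch that mimics how the `%r`, `%h` and `%p` placeholders are filled in. The `expand_control_path` helper and the `HOSTS` table are made up for illustration (they mirror the two db entries from the config above) and are not part of OpenSSH or Ansible.

```python
# Hypothetical table mirroring the two db hosts from the ssh config:
# alias -> (remote user, HostName value, port)
HOSTS = {
    "dc1-db": ("root", "10.0.0.2", 22),
    "dc2-db": ("root", "10.0.0.2", 22),
}


def expand_control_path(template: str, alias: str) -> str:
    """Illustrative stand-in for ssh's expansion of %r, %h and %p."""
    user, hostname, port = HOSTS[alias]
    tokens = {"%r": user, "%h": hostname, "%p": str(port)}
    for token, value in tokens.items():
        template = template.replace(token, value)
    return template


template = "~/.ssh/master-socket/%r@%h:%p"
dc1_socket = expand_control_path(template, "dc1-db")
dc2_socket = expand_control_path(template, "dc2-db")

print(dc1_socket)
print(dc2_socket)
print("collision:", dc1_socket == dc2_socket)
```

Both aliases expand to the same path, so whichever connection is opened first owns the socket and the second alias silently reuses it.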
I only noticed this after I saw my Ansible playbook changing the same file over and over. The file was a simple yum repo definition, which should not change after the first setup. But the repo template used an Ansible fact as a placeholder for the distro version (rhel6/rhel7), and that fact was set from whichever machine (`dc1-db` or `dc2-db`) had been connected to first. So on each run, one of the machines would register a change.
After looking into the SSH documentation, I found that there are other placeholders I can use. There is `%C`, which didn’t help in my case, as it seems to generate the same string for both hosts (it is a hash that includes `%h`, not the alias). Using `%n` did the trick, as it uses the connection name given on the command line rather than the `HostName` value (name or IP) for the socket file.

    ControlPath ~/.ssh/master-socket/%r-%n-%p

    $ ll ~/.ssh/master-socket
    0 srw-------. 1 Jan 31 17:42 root-dc1-db-22
    0 srw-------. 1 Jan 31 17:42 root-dc2-db-22
    0 srw-------. 1 Jan 31 17:41 root-dc1-jumper-22
    0 srw-------. 1 Jan 31 17:41 root-dc2-jumper-22
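A rough Python sketch of why `%n` separates the two hosts while `%h` and `%C` do not. It assumes `%C` is a SHA-1 hash over the local hostname, remote hostname, port and user; the exact inputs differ between OpenSSH versions, so the digests computed here will not match real socket names. The `control_path_tokens` helper and the `mgmt-host` local hostname are made up for illustration.

```python
import hashlib


def control_path_tokens(alias, hostname, user="root", port=22, local_host="mgmt-host"):
    """Return illustrative values for the %n, %h and %C tokens (hypothetical helper)."""
    # Assumption: %C hashes local host + remote host + port + user.
    # The alias typed on the command line (%n) is not part of the hash input.
    digest = hashlib.sha1(f"{local_host}{hostname}{port}{user}".encode()).hexdigest()
    return {"%n": alias, "%h": hostname, "%C": digest}


dc1 = control_path_tokens("dc1-db", "10.0.0.2")
dc2 = control_path_tokens("dc2-db", "10.0.0.2")

for token in ("%h", "%C", "%n"):
    verdict = "same" if dc1[token] == dc2[token] else "different"
    print(f"{token}: {verdict} for dc1-db and dc2-db")
```

Under these assumptions only `%n` differs between the two aliases, which matches what I saw: any `ControlPath` built purely from `%h` or `%C` keeps colliding.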