EC2 agents failing to connect to Controller.
You observe the error below when connecting an EC2 instance to the controller.
The instance EC2 (aws-xxx-xxx-xxxx) - docker-2020-09-18 (i-052d99921a8f6fc42) has a blank console. Maybe the console is yet not available. If enough time has passed, consider changing the key verification strategy or the AMI used by one printing out the host key in the instance console
Failed to connect via ssh: There was a problem while connecting to X.X.X.X:22
Jun 19, 2020 10:54:07 AM hudson.plugins.ec2.EC2Cloud
INFO: The instance console is blank. Cannot check the key. The connection to EC2 (Jenkins) - Default Slave (i-052d99921a8f6fc42) is not allowed
Jun 19, 2020 10:54:07 AM hudson.plugins.ec2.EC2Cloud
INFO: Failed to connect via ssh: There was a problem while connecting to X.X.X.X:22
Jun 19, 2020 10:54:07 AM hudson.plugins.ec2.EC2Cloud
- CloudBees CI (CloudBees Core)
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed Master
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client Master
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
- CloudBees Jenkins Platform - Client Master
- CloudBees Jenkins Platform - Operations Center
- CloudBees Jenkins Distribution
- Jenkins LTS
The above error message is related to the host key verification introduced in version 1.50.3 of the Amazon EC2 plugin.
When you set up a template for a Unix instance (Type AMI field), you can select a strategy that guarantees the instance you're connecting to is the expected one. Four options are available in the Host Key Verification Strategy field, under the Advanced… configuration of every configured AMI:
- Check New Hard: Checks the key presented by the instance against the instance console and stores it to check subsequent connections. If the key is not printed on the console, the connection is not trusted. This is the default behavior for new AMIs.
- Check New Soft: Checks the key against the instance console and stores it to check subsequent connections. If the key is not printed on the console, the connection is trusted anyway. This is the default behavior for existing AMIs (upgraded from a previous plugin version). This prevents future attacks but cannot guarantee the instance is the right one if a man-in-the-middle attack has already taken place.
- Accept New: Accepts the key on the first connection and stores it to check subsequent connections. Unlike Check New Soft, it does not try to check the key against the console.
- Off: Doesn't check the host key on any connection.
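The four strategies above can be summarized as a small decision function. The following is a simplified sketch of the behavior as described in this article, not the plugin's actual code; the names `Strategy` and `is_trusted` are hypothetical, and keys are represented as plain strings for illustration.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    CHECK_NEW_HARD = "check-new-hard"
    CHECK_NEW_SOFT = "check-new-soft"
    ACCEPT_NEW = "accept-new"
    OFF = "off"


def is_trusted(
    strategy: Strategy,
    presented_key: str,
    stored_key: Optional[str] = None,
    console_key: Optional[str] = None,
) -> bool:
    """Model of the trust decision described in the list above (illustrative only)."""
    # Off: the host key is never checked.
    if strategy is Strategy.OFF:
        return True
    # Subsequent connections: compare against the key stored earlier.
    if stored_key is not None:
        return presented_key == stored_key
    # First connection with Accept New: trust without looking at the console.
    if strategy is Strategy.ACCEPT_NEW:
        return True
    # First connection with Check New Hard/Soft and a blank console:
    # only the soft variant trusts the connection anyway.
    if console_key is None:
        return strategy is Strategy.CHECK_NEW_SOFT
    # Console printed a key: it must match the one the instance presents.
    return presented_key == console_key
```

In this model, a blank console (`console_key=None`) on a first connection is the only case where Check New Hard and Check New Soft differ.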
If the Connect by SSH Process field is checked, each strategy maps to the following value of the OpenSSH StrictHostKeyChecking option:
- check-new-hard = yes
- check-new-soft = accept-new
- accept-new = accept-new
- off = no
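The mapping above can be written as a lookup table. This is a minimal sketch for illustration; the dictionary and the helper `ssh_option` are hypothetical names, and only the four value pairs come from the list above.

```python
# Strategy name -> StrictHostKeyChecking value, as listed above.
STRICT_HOST_KEY_CHECKING = {
    "check-new-hard": "yes",
    "check-new-soft": "accept-new",
    "accept-new": "accept-new",
    "off": "no",
}


def ssh_option(strategy: str) -> str:
    """Build the argument that would follow `ssh -o` for a given strategy."""
    return "StrictHostKeyChecking=" + STRICT_HOST_KEY_CHECKING[strategy]
```

Note that Check New Soft and Accept New both resolve to `accept-new`, so both depend on an OpenSSH client that understands that value.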
This error usually occurs when the Check New Hard or Check New Soft option is used and any one of the requirements below is not met:
- The IAM credentials configured for the plugin should have
- The AMI used should print the host key to the instance console. This is common behaviour; for example, the Amazon Linux 2 AMI prints it. Consult the AMI documentation to confirm.
- The launch timeout should be long enough to allow the plugin to check the instance console. With this strategy, the plugin waits for the console to become available, which can take a few minutes. The Launch Timeout in seconds field should hold a value that allows for this, for example 600 (10 minutes). Some EC2 instance types, such as m5.metal, require a longer timeout, approximately 25 to 30 minutes. Setting it lower could lead to unpredictable issues during provisioning. By default there is no timeout, so the default is safe.
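To check the second requirement by hand, you can fetch the instance console text (for example with the AWS CLI command `aws ec2 get-console-output`) and look for host key lines. The sketch below assumes that public host keys appear on the console as lines starting with a standard key type such as `ssh-ed25519` or `ecdsa-sha2-nistp256`; this is an assumption about the AMI's output, not a description of how the plugin parses the console, and `console_host_keys` is a hypothetical helper.

```python
from typing import List

# Assumed prefixes of public host key lines; real console output varies by AMI.
HOST_KEY_PREFIXES = ("ssh-rsa ", "ssh-ed25519 ", "ecdsa-sha2-", "ssh-dss ")


def console_host_keys(console_text: str) -> List[str]:
    """Return the console lines that look like public SSH host keys."""
    keys = []
    for line in console_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(HOST_KEY_PREFIXES):
            keys.append(stripped)
    return keys
```

An empty result for an instance that has finished booting suggests the AMI does not print its host key, in which case Check New Hard cannot succeed.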
The long-term fix is to ensure your environment is set up according to the requirements above.
A temporary workaround is to use the Accept New verification strategy, where the first key is accepted blindly without verifying it against the instance console output. This should only be used as a temporary fix; eventually you should set up your environment to use Check New Hard, which is considered the safest option.
In some environments you may see the error below when using Accept New, even though your environment is configured according to the recommendations above.
[03/23/21 14:36:00] Launching agent
$ ssh -o StrictHostKeyChecking=accept-new -i /tmp/ec2_4261174220549794768.pem ec2-user@X.X.X.X -p 22 java -jar /tmp/remoting.jar -workDir /home/ec2-user
command-line line 0: unsupported option "accept-new".
This normally happens when you are using the Connect by SSH Process option in the Amazon EC2 plugin AMI configuration and the version of OpenSSH running on your controller host is older than 7.6. The accept-new value of the StrictHostKeyChecking option was introduced in OpenSSH 7.6.
You can verify the OpenSSH version by running the ssh -V command. The workaround is either to upgrade OpenSSH on your controller to 7.6 or later, or to uncheck Connect by SSH Process and use the in-process Java SSH client, which does not depend on an external SSH process.
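The version check can also be scripted. This is a minimal sketch that parses a version banner of the form `OpenSSH_7.4p1, ...` and compares it with 7.6; the function names are hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_openssh_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a banner such as 'OpenSSH_7.4p1, OpenSSL ...'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_accept_new(banner: str) -> bool:
    """True if the banner reports OpenSSH 7.6 or later."""
    version = parse_openssh_version(banner)
    return version is not None and version >= (7, 6)
```

`ssh -V` writes its banner to standard error, so on the controller host it could be captured with `subprocess.run(["ssh", "-V"], capture_output=True, text=True).stderr` and passed to `supports_accept_new`.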
When connecting to EC2 agents using the external SSH process (Connect by SSH Process field is checked), the known hosts file is populated using the IP addresses of the EC2 instances. You are likely to face host verification issues if new instances reuse IP addresses that are already in the known hosts file. When this occurs, you will see messages like the one below.
[03/29/21 15:26:34] Launching agent
$ ssh -o StrictHostKeyChecking=accept-new -i /tmp/ec2_4365876265509662772.pem buildagent@X.X.X.X -p 22 java -jar /tmp/remoting.jar -workDir /home/buildagent/jenkins
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:Sd9gv2jpwAkVMXTcIYxEyKnEL8nhd5yWmCoVz8OqfYA.
Please contact your system administrator.
Add correct host key in /home/jenkins/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/jenkins/.ssh/known_hosts:375
ECDSA host key for 172.26.2.172 has changed and you have requested strict checking.
Host key verification failed.
ERROR: Unable to launch the agent for EC2 (jenkins-test-m1-hsv.adtran.com) - lightweight (i-08bbe0ff83e8e603f)
java.io.EOFException: unexpected stream termination
If you encounter this situation, the best workaround is to use the in-process Java SSH client (Connect by SSH Process field is unchecked). It uses the instance ID instead of the IP address to populate the list of known hosts.
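When diagnosing such a failure, it can help to pull the stale entry's location out of the agent launch log. The sketch below assumes the log contains a line of the form `Offending <type> key in <file>:<line>`, as in the message shown above; `find_offending_entry` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Offending ECDSA key in /home/jenkins/.ssh/known_hosts:375"
OFFENDING_RE = re.compile(r"Offending (\S+) key in (\S+):(\d+)")


def find_offending_entry(log_text: str) -> Optional[Tuple[str, str, int]]:
    """Return (key type, known hosts file, line number), or None if absent."""
    match = OFFENDING_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2), int(match.group(3))
```

The returned file and line number identify the entry recorded for the reused IP address. Removing that entry (for instance with OpenSSH's `ssh-keygen -R <host>`) is an option this article does not cover, so treat it as a suggestion to evaluate rather than the recommended fix.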