Troubleshoot Boot and Networking Issues with New EC2 Serial Console

Fixing production issues is one of the key responsibilities of system and network administrators. In fact, I’ve always found it to be one of the most interesting parts of infrastructure engineering. Diving as deep as needed into the problem at hand, not only do you (eventually) have the satisfaction of solving the issue, you also learn a lot of things along the way, which you probably wouldn’t have been exposed to under normal circumstances.

Operating systems certainly present such opportunities. Over time, they’ve grown ever more complex, forcing administrators to master a zillion configuration files and settings. Although infrastructure as code and automation have greatly improved provisioning and managing servers, there’s always room for mistakes and breakdowns that prevent a system from starting correctly. The list is endless: missing hardware drivers, misconfigured file systems, invalid network configuration, incorrect permissions, and so on. To make things worse, many issues can effectively lock administrators out of a system, preventing them from logging in, diagnosing the problem and applying the appropriate fix. The only option is to have an out-of-band connection to your servers, and although customers could view the console output of an EC2 instance, they couldn’t interact with it – until now.

Today, I’m extremely happy to announce the EC2 Serial Console, a simple and secure way to troubleshoot boot and network connectivity issues by establishing a serial connection to your Amazon Elastic Compute Cloud (EC2) instances.

Introducing the EC2 Serial Console
EC2 Serial Console access is available for EC2 instances based on the AWS Nitro System. It supports all major Linux distributions, FreeBSD, NetBSD, Microsoft Windows, and VMware.

Without any need for a working network configuration, you can connect to an instance using either a browser-based shell in the AWS Management Console, or an SSH connection to a managed console server. No need for an sshd server to be running on your instance: the only requirement is that the root account has been assigned a password, as this is the one you will use to log in. Then, you can enter commands as if you had a keyboard and monitor directly attached to one of the instance’s serial ports.

In addition, you can trigger operating system specific procedures:

  • On Linux, you can trigger a Magic SysRq command to generate a crash dump, kill processes, and so on.
  • On Windows, you can interrupt the boot process, and boot in safe mode using Emergency Management Service (EMS) and Special Admin Console (SAC).

Getting access to an instance’s console is a privileged operation that should be tightly controlled, which is why EC2 Serial Console access is not permitted by default at the account level. Once you permit access in your account, it applies to all instances in this account. Administrators can also apply controls at the organization level with Service Control Policies, and at the instance level with AWS Identity and Access Management (IAM) permissions. As you would expect, all communication with the EC2 Serial Console is encrypted, and we generate a unique key for each session.
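To make the instance-level control a bit more concrete, here is a minimal sketch of an IAM policy that scopes serial console access to a single instance. It assumes that the relevant action is ec2-instance-connect:SendSerialConsoleSSHPublicKey (the same call used by the CLI later in this post). The function name serial_console_policy, the Sid, and the account ID in the example are placeholders of mine, not something taken from the service.

```shell
#!/bin/sh
# Hypothetical helper: print an IAM policy document that allows pushing an
# SSH key to the serial console of one specific instance only.
# $1 = region, $2 = AWS account ID, $3 = instance ID (all placeholders)
serial_console_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSerialConsoleOnOneInstance",
      "Effect": "Allow",
      "Action": "ec2-instance-connect:SendSerialConsoleSSHPublicKey",
      "Resource": "arn:aws:ec2:$1:$2:instance/$3"
    }
  ]
}
EOF
}

# Example with a made-up account ID:
# serial_console_policy us-east-1 123456789012 i-003aecec198b537b0 > policy.json
```

Generating the document from a function keeps the instance ARN in one place, which helps if you want one policy per instance. Browser-based access is likely to need additional read permissions (for example to list instances), so treat this as a starting point.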

Let’s do a quick demo with Linux. The process is similar with other operating systems.

Connecting to the EC2 Serial Console with the AWS Management Console
First, I launch an Amazon Linux 2 instance. Logging in to it, I decide to mangle the network configuration for its Ethernet network interface (/etc/sysconfig/network-scripts/ifcfg-eth0), setting a completely bogus static IP address. PLEASE do not try this on a production instance!

Then, I reboot the instance. A few seconds later, although the instance is up and running in the EC2 console and port 22 is open in its Security Group, I’m unable to connect to it with SSH.

$ ssh -i ~/.ssh/mykey.pem
ssh: connect to host port 22: Operation timed out

EC2 Serial Console to the rescue!

First, I need to allow console access in my account. All it takes is ticking a box in the EC2 settings.
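If you prefer the command line to ticking a box, the same switch can be flipped with the AWS CLI. The sketch below assumes the aws ec2 get-serial-console-access-status and aws ec2 enable-serial-console-access commands, and that the CLI is configured with sufficient permissions. The wrapper name ensure_serial_console_access is my own. Note that each call targets one Region.

```shell
#!/bin/sh
# Hypothetical wrapper: enable account-level serial console access in one
# Region, unless it is already enabled.
# $1 = region
ensure_serial_console_access() {
  region="$1"
  # The status call returns a single boolean field, SerialConsoleAccessEnabled.
  enabled=$(aws ec2 get-serial-console-access-status \
    --region "$region" \
    --query SerialConsoleAccessEnabled --output text) || return 1
  case "$enabled" in
    [Tt]rue)
      echo "already enabled in $region"
      ;;
    *)
      aws ec2 enable-serial-console-access --region "$region" >/dev/null || return 1
      echo "enabled in $region"
      ;;
  esac
}

# Example:
# ensure_serial_console_access us-east-1
```

Checking the status first keeps the script safe to run repeatedly, for example from an account bootstrap job.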

Then, right-clicking on the instance’s name in the EC2 console, I select Monitor and troubleshoot, then EC2 Serial Console.

This opens a new window confirming the instance id and the serial port number to connect to. I simply click on Connect.

This opens a new tab in my browser. Hitting Enter, I see the familiar login prompt.

Amazon Linux 2
Kernel 4.14.225-168.357.amzn2.x86_64 on an x86_64
ip-172-31-67-148 login:

Logging in as root, I’m relieved to get a friendly shell prompt.

Enabling Magic SysRq for this session (sysctl -w kernel.sysrq=1), I first list available commands (CTRL-0 + h), and then ask for a memory report (CTRL-0 + m).
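Before sending SysRq commands, it can help to check what the kernel currently allows. Here is a small sketch that reads the setting behind sysctl kernel.sysrq and describes it: 0 means disabled, 1 means all functions are enabled, and any other value is a bitmask of permitted functions. The function name sysrq_status is hypothetical. The optional argument is only there so the logic can be exercised against an ordinary file.

```shell
#!/bin/sh
# Hypothetical helper: describe the current kernel.sysrq setting.
# $1 = file to read (optional, defaults to the procfs entry)
sysrq_status() {
  value=$(cat "${1:-/proc/sys/kernel/sysrq}") || return 1
  case "$value" in
    0) echo "disabled" ;;
    1) echo "all functions enabled" ;;
    *) echo "restricted (bitmask $value)" ;;
  esac
}

# Example, from the serial console session:
# sysrq_status
```

Keep in mind that sysctl -w only changes the running kernel, so the setting goes back to the distribution default after a reboot.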

Pretty cool! This would definitely come in handy to troubleshoot complex issues. No need for this here: I quickly restore a valid configuration for the network interface, and I restart the network stack.
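For reference, restoring the interface could look something like the sketch below. It is an illustration under assumptions, not the exact commands I typed: it assumes a RHEL-style network-scripts layout with a single eth0 interface using DHCP, and that the network service is restarted with systemctl restart network. The function name restore_dhcp_ifcfg is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace an ifcfg file with a minimal DHCP
# configuration, keeping a backup of the previous (broken) version.
# $1 = path to the ifcfg file
restore_dhcp_ifcfg() {
  cfg="$1"
  # Use a .bak suffix so the backup is not mistaken for an interface config.
  [ -f "$cfg" ] && cp -p "$cfg" "$cfg.bak"
  cat > "$cfg" <<EOF
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
EOF
}

# Example, run as root from the serial console (assumed service name):
# restore_dhcp_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0 \
#   && systemctl restart network
```

Keeping the backup means you can still inspect the bad configuration afterwards to understand how it got there.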

Trying to connect to the instance again, I can see that the problem is solved.

$ ssh -i ~/.ssh/mykey.pem

__|   __|_  )
_|   (    / Amazon Linux 2 AMI
[ec2-user@ip-172-31-67-148 ~]$

Now, let me quickly show you the equivalent commands using the AWS Command Line Interface (AWS CLI).

Connecting to the EC2 Serial Console with the AWS CLI
This is equally simple. First, I send the SSH public key for the instance key pair to the serial console. Please make sure to add the file:// prefix.

$ aws ec2-instance-connect send-serial-console-ssh-public-key --instance-id i-003aecec198b537b0 --ssh-public-key file://~/.ssh/ --serial-port 0 --region us-east-1

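Then, I connect to the serial console with SSH, using the instance ID followed by .port and the serial port number (for example, i-003aecec198b537b0.port0) as the user name, and I’m greeted with a login prompt.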

$ ssh -i ~/.ssh/mykey.pem

Amazon Linux 2
Kernel 4.14.225-168.357.amzn2.x86_64 on an x86_64
ip-172-31-67-148 login:
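The two steps above can be wrapped in a small script. This is a sketch under assumptions: it relies on the serial console endpoint being serial-console.ec2-instance-connect.<region>.aws, and on the user name format described above. The function names and the key file paths in the example are hypothetical. AWS documents a short validity window (60 seconds) for the pushed key, which is why the script connects right after sending it.

```shell
#!/bin/sh
# Hypothetical helper: build the SSH destination for a serial console session.
# $1 = instance ID, $2 = serial port number, $3 = region
serial_console_target() {
  echo "$1.port$2@serial-console.ec2-instance-connect.$3.aws"
}

# Hypothetical helper: push a public key, then open the session right away
# with the matching private key.
# $1 = instance ID, $2 = serial port number, $3 = region
# $4 = public key file, $5 = private key file (paths are placeholders)
connect_serial_console() {
  aws ec2-instance-connect send-serial-console-ssh-public-key \
    --instance-id "$1" --serial-port "$2" --region "$3" \
    --ssh-public-key "file://$4" || return 1
  ssh -i "$5" "$(serial_console_target "$1" "$2" "$3")"
}

# Example with made-up key file names:
# connect_serial_console i-003aecec198b537b0 0 us-east-1 \
#   "$HOME/.ssh/mykey.pub" "$HOME/.ssh/mykey.pem"
```

Building the destination in its own function makes the naming convention easy to check on its own, without calling AWS.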

Once I’ve logged in, Magic SysRq is available, and I can trigger it with ~B+command. I can also terminate the console session with the ~. escape sequence (a tilde followed by a period).

Get Started with EC2 Serial Console
As you can see, the EC2 Serial Console makes it much easier to debug and fix complex boot and network issues happening on your EC2 instances. You can start using it today in the following AWS regions, at no additional cost:

  • US East (N. Virginia), US West (Oregon), US East (Ohio)
  • Europe (Ireland), Europe (Frankfurt)
  • Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Singapore)

Please give it a try, and let us know what you think. We’re always looking forward to your feedback! You can send it through your usual AWS Support contacts, or on the AWS Forum for Amazon EC2.

– Julien