The other day I was challenged to restore an EC2 instance from EBS snapshots. I have worked with AMIs and EBS for many years, but I had never tried this before.
I was given a little information about the environment, such as the host OS and the port SSH was listening on, and that was it; I would have to figure out the rest. There were in fact two EBS snapshots: one for the root volume and one for data. And I would need to bring the instance and its data up in the middle of the night…
The first thing I tried was to create an image from the snapshot so that I could launch the instance from that image. However, I stumbled into a very weird issue: the error said the snapshot did not belong to my account. It seemed that the owner would also need to grant the CreateVolume permission to the external account. The only way I was able to make this work was to copy the snapshot into my account and then create the image from my own copy. Luckily I picked the right virtualization type for the image (HVM is a good choice; PV is only for very old instances), and the other default parameters worked. Viewing the instance system log is a good way to confirm whether the instance is booting normally.
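For anyone who prefers to script it, here is a minimal sketch of the copy-then-register step using the boto3 EC2 client methods `copy_snapshot` and `register_image`. The function name, the `x86_64` architecture and the `/dev/xvda` root device name are my assumptions for illustration; they are not details from the original restore and may need adjusting for your image.

```python
def image_from_shared_snapshot(ec2, shared_snapshot_id, region, image_name):
    """Copy a shared snapshot into this account, then register an HVM image from the copy.

    `ec2` is expected to be a boto3 EC2 client (or anything with the same methods).
    """
    # Copy first, so the image is built from a snapshot this account owns.
    copy = ec2.copy_snapshot(
        SourceRegion=region,
        SourceSnapshotId=shared_snapshot_id,
        Description="copy of " + shared_snapshot_id,
    )
    own_snapshot_id = copy["SnapshotId"]

    # Wait until the copy is complete before using it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[own_snapshot_id])

    image = ec2.register_image(
        Name=image_name,
        Architecture="x86_64",        # assumption: adjust to the original host
        VirtualizationType="hvm",     # PV is only for very old instances
        RootDeviceName="/dev/xvda",   # assumption: must match the mapping below
        BlockDeviceMappings=[
            {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": own_snapshot_id}}
        ],
    )
    return image["ImageId"]
```

With real credentials it would be called roughly as `image_from_shared_snapshot(boto3.client("ec2"), "snap-...", "us-east-1", "restored-root")`, where the snapshot ID, region and image name are placeholders.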
Now I had the instance up and running, but I couldn’t access it. The SSH port was open, but I could not log in. After some head scratching, I decided to stop the instance, detach its volume, launch another instance from a standard image, and attach and mount the EBS volume on the newly created instance. Now I could look inside the volume.
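The stop, detach and attach dance can be sketched with boto3 as well. This is only an illustration under my own assumptions: the function name and the `/dev/sdf` device name are hypothetical, and the helper instance has to live in the same availability zone as the volume.

```python
def move_volume_to_helper(ec2, broken_instance_id, volume_id,
                          helper_instance_id, device="/dev/sdf"):
    """Stop the broken instance and attach its volume to a helper instance.

    `ec2` is expected to be a boto3 EC2 client. `device` is an assumed,
    free device name on the helper instance.
    """
    # A root volume can only be detached once its instance is stopped.
    ec2.stop_instances(InstanceIds=[broken_instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[broken_instance_id])

    ec2.detach_volume(VolumeId=volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # Attach as a secondary disk on the helper instance.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=helper_instance_id, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])
    return device
```

After that, the volume still has to be mounted from inside the helper instance. The device name the kernel shows there may differ from the one requested here, so it is worth checking with `lsblk` before mounting.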
I could see that the public key had been added correctly for ec2-user (this is an Amazon Linux image), but for some reason I still could not log in. I had also been given a username and password for the instance, but those didn’t work either.
It turned out that sshd was configured to allow only members of certain groups to log in, and ec2-user was not in any of them! So I had to edit the sshd config file to remove that restriction (I could also have edited /etc/group to add the user to an allowed group). As a side note, when you create a new instance, cloud-init disables password login for SSH, so don’t rely much on password login; it is not safe anyway. I learnt later that you can use cloud-config in the instance user data to re-enable password login.
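To make the sshd fix concrete, here is a small sketch that comments out the group restriction in the config text. I am assuming the restriction was an `AllowGroups` directive, which is the usual way to do this in sshd_config; the function name and the sample content are made up for illustration.

```python
import re

# sshd_config keywords are case-insensitive; AllowGroups is an assumed directive.
_ALLOW_GROUPS = re.compile(r"^\s*AllowGroups\b", re.IGNORECASE)

def lift_group_restriction(sshd_config_text):
    """Return the config text with every active AllowGroups line commented out."""
    out = []
    for line in sshd_config_text.splitlines():
        if _ALLOW_GROUPS.match(line):
            out.append("#" + line)   # keep the old value visible for later
        else:
            out.append(line)
    return "\n".join(out) + "\n"

# Hypothetical sample content, not the real file from the instance.
sample = "Port 2222\nAllowGroups admins\nPasswordAuthentication no\n"
print(lift_group_restriction(sample))
```

On the helper instance this would be applied to the file under the mount point, for example `/mnt/etc/ssh/sshd_config` if the volume is mounted at `/mnt`. For the password side note, the cloud-config key I know of is `ssh_pwauth: true`.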
I reattached the volume to the original instance and started it again, but life was not easy: it let me log in with the key but then asked me to change the user's password 😂. Of course, I was not going to attempt to crack the password. But thanks to a StackOverflow post, I was able to remove the expiration date by editing the passwd file.
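I did not note down exactly which fields were changed, so take the following as a sketch under assumptions rather than a record of what was done. On most Linux distributions the password aging data lives in `/etc/shadow` (next to `/etc/passwd`), where a last-change value of 0 forces a password change at the next login and an empty value disables aging. The function name and the sample lines are hypothetical.

```python
def clear_password_expiry(shadow_text, username):
    """Blank the aging fields of one user in /etc/shadow-style text.

    Fields: name:hash:lastchg:min:max:warn:inactive:expire:reserved
    """
    out = []
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if fields[0] == username and len(fields) >= 8:
            fields[2] = ""   # last change: 0 would force a change at next login
            fields[4] = ""   # maximum password age
            fields[7] = ""   # account expiration date
        out.append(":".join(fields))
    return "\n".join(out) + "\n"

# Hypothetical sample content, not the real file from the instance.
sample = "ec2-user:!!:0:0:99999:7:::\nroot:*:19000:0:99999:7:::\n"
print(clear_password_expiry(sample, "ec2-user"))
```

The password hash itself is never touched, which matches the idea of not trying to recover the password at all.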
I was also able to determine the volume mapping for the other EBS volume by looking at /etc/fstab, so I just needed to attach the other volume with the right device mapping.
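Reading /etc/fstab by eye is enough for two volumes, but the same lookup can be sketched in a few lines. The function name and the sample entries below are made up for illustration; the real file on the volume will look different.

```python
def fstab_mounts(fstab_text):
    """Return (device, mount_point, fs_type) for each entry in fstab-style text."""
    mounts = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        fields = line.split()
        if len(fields) < 3:
            continue                      # skip malformed lines
        device, mount_point, fs_type = fields[:3]
        mounts.append((device, mount_point, fs_type))
    return mounts

# Hypothetical sample content, not the real file from the instance.
sample = """# /etc/fstab
LABEL=/    /      ext4  defaults,noatime  1 1
/dev/xvdf  /data  ext4  defaults,nofail   0 2
"""
for device, mount_point, fs_type in fstab_mounts(sample):
    print(device, mount_point, fs_type)
```

In a case like this sample, the data volume would need to be attached so that it shows up as `/dev/xvdf`, since that is the device the instance expects to mount at `/data`.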
And after 3 hours, I was finally able to log in, verify the data and services of the instance, and get some sleep.
It was actually a fun and interesting assignment. I learned a bit more about the internal structure of Linux. And the trick of using another running instance to look at the volume's contents is really helpful. I hope this post will save someone some sleep 👍.