Jaap Haagmans The all-round IT guy

2 Sep 2013

Why I don’t use custom AMIs for EC2 instances

When I started using AWS, most of the documentation on auto scaling and EC2 in general advised creating AMIs to launch a copy of your instance when, for example, scaling up. And it sounds convenient: you don't need to install packages and configure your server. You simply copy the running instance and you're done. When I started auto scaling, I quickly decided this method was not for me.

I found that every time I changed something on one of my servers, I had to create a new AMI from that instance, create a new launch config and terminate running instances that would then launch from the newly created AMI. While this works, I've discovered that using user-data files can be much cleaner than using AMIs. You can easily switch to a new Amazon AMI when it's released and a user-data file will ensure you only install packages you actually need (while an AMI can build up lots of redundant packages and files over the years).
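To make the user-data approach a bit more concrete, here is a minimal sketch using the AWS CLI. It assumes the aws command is installed and configured with credentials. The function names and every argument value are hypothetical placeholders, not taken from a real setup.

```shell
#!/bin/bash
# Sketch only: function names and all argument values are hypothetical.

# Register a launch configuration that boots a stock AMI and hands the
# instance a user-data script to run on first boot.
create_launch_config() {
  local name="$1" ami="$2" type="$3" script="$4"
  aws autoscaling create-launch-configuration \
    --launch-configuration-name "$name" \
    --image-id "$ami" \
    --instance-type "$type" \
    --user-data "file://${script}"
}

# Point an existing auto scaling group at that launch configuration.
use_launch_config() {
  local group="$1" name="$2"
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$group" \
    --launch-configuration-name "$name"
}
```

With made-up values, calling create_launch_config web-v2 ami-12345678 m1.small launch.sh and then use_launch_config web-group web-v2 would make every newly launched instance boot the stock AMI and run launch.sh. Moving to a newer AMI then only means passing a different image ID.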

But the most important reason for me to do this was simplifying the release of new application versions. When using an AMI, I'd have to update the application code on one instance, create the AMI, create my launch config, update the auto scaling group and terminate all other running instances. Using a launch script, I can simply push my code to the "stable" branch in git and start terminating instances one at a time. The launch script will ensure all instances have the right packages installed and pull the latest code from git. All our webservers are in a private subnet, connecting through a NAT instance, so the git repository can be set up to only allow access from our public NAT IP addresses. In fact, you can have a git repo within the private subnet that isn't password protected for this purpose.

An example launch config script for a Passenger/nginx server that would host a Rails application could be something like this:

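#!/bin/bash
# User-data runs as root without a terminal, so yum needs -y to proceed unattended
yum -y update
yum -y install git ruby19 httpd gcc gcc-c++ ruby19-devel curl-devel openssl-devel zlib-devel
gem1.9 install passenger bundler --no-ri --no-rdoc
# Compile nginx with the Passenger module into /opt/nginx
passenger-install-nginx-module --auto --auto-download --prefix=/opt/nginx
adduser webapp
# Fetch the application and install its gems
cd /home/webapp && git clone YOUR_GIT_REPO:yourapp.git && cd yourapp && bundle install
# Use the nginx config and init script shipped with the app
cat /home/webapp/yourapp/config/custom-configs/nginx.conf > /opt/nginx/conf/nginx.conf
cat /home/webapp/yourapp/config/custom-configs/initd-nginx > /etc/init.d/nginx
# Hand the app to the webapp user and let nginx traverse its home directory
chown webapp:webapp -R /home/webapp/*
chmod 755 /home/webapp
chmod +x /etc/init.d/nginx && chkconfig nginx on && service nginx start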

The two cat lines and the final line might need a bit of explanation. I've chosen to include the nginx configuration and an init.d script with the app. I could easily put those on something like S3, but I felt that since nginx is installed automatically with every deploy, this was just as easy. However, if you make regular changes to your nginx.conf file, you might want to do this differently.

If you combine this with a Capistrano script that iterates through your running instances (tag your auto scaling group to easily and automatically find the right instances) and shuts them down, you have fully automated deployment in a clustered environment, without having to use your own AMIs. It's as simple as git push && cap deploy!
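The post leaves the Capistrano task itself out. As a rough stand-in, the same rolling replacement can be sketched in plain shell with the AWS CLI rather than Capistrano. This assumes a configured aws command and relies on the aws:autoscaling:groupName tag that Auto Scaling puts on the instances it launches. The function names, the group name and the pause length are hypothetical.

```shell
#!/bin/bash
# Sketch only: function names, group name and pause length are hypothetical.

# Print the IDs of running instances that belong to the given group,
# found through the tag Auto Scaling adds to instances it launches.
list_group_instances() {
  local group="$1"
  aws ec2 describe-instances \
    --filters "Name=tag:aws:autoscaling:groupName,Values=${group}" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# Terminate the instances one at a time. The group keeps its desired
# capacity, so it launches a replacement that runs the user-data script.
rolling_replace() {
  local group="$1" pause="${2:-300}" id
  for id in $(list_group_instances "$group"); do
    echo "Terminating ${id}"
    aws autoscaling terminate-instance-in-auto-scaling-group \
      --instance-id "$id" \
      --no-should-decrement-desired-capacity
    sleep "$pause"   # crude wait for the replacement to come into service
  done
}
```

Running rolling_replace web-group 300 after a git push would cycle through the group with five minutes between terminations. A fixed sleep is the simplest possible wait; checking the health of each replacement before moving on would be safer for a real deployment.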