Deployment process requirements

December 17th, 2008

This blog post is mostly targeted at non-Rails developers; Rails devs should know all this by heart :) We often have to explain to our customers what ‘proper deployment’ is and why their current one sucks :) Now we’ll be able to just point them to this post…

Proper deployment is hard to find almost anywhere, at least anywhere we’ve looked. Very few places really ‘get’ it and give it enough importance. Some project owners just don’t know how it is supposed to be done, so they accept their developer’s practice of syncing to the production server over ftp :)

If you are a developer, make sure you implement it all.

If you have a project developed for you, this is a checklist you can bring to your developers/consultants and require a ‘yes’ to every single item.

If your deployment procedure misses any of the qualities listed below, you are asking for trouble.

Production deployment must be:

  • Automated
  • Atomic
  • Clean
  • Reversible
  • Have a release log
  • Manage DB migrations

Automated

This is, I think, the most important requirement. Deployments shouldn’t be done by hand. There are still a lot of companies where the deployment process looks something like this:
  • connect to the remote windows server with “remote desktop”
  • upload a zip file via ftp
  • unzip it on the desktop on the remote pc
  • manually drag it to the production deployment, effectively creating a mix of new and old files
  • manually connect to the sql server and fiddle with schema until it looks similar to the local one.
  • restart something.

This is soooo wrong. The deployment must be done by a single execution of a deployment script that does all the work. The most developer intervention allowed is entering a password.

Atomic

This one is not always possible (for example when doing database migrations), but the idea is that you switch from the old version to the new one instantly, as opposed to syncing a whole tree of files over ftp. With the latter there is a noticeable period of time when some files are already from the new version and others are still from the old one.
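
A minimal Ruby sketch of such an instant switch, assuming the production path is a symbolic link and the filesystem is POSIX, where renaming one link over another replaces it in a single step. The helper name `switch_current` and the directory names are made up for illustration.

```ruby
require "fileutils"
require "tmpdir"

# Point `link_path` at `release_dir` in one step.
# Hypothetical helper; names and paths are illustrative only.
def switch_current(link_path, release_dir)
  tmp_link = "#{link_path}.tmp-#{Process.pid}"
  File.symlink(release_dir, tmp_link)  # build the new link next to the old one
  File.rename(tmp_link, link_path)     # rename(2) swaps the links atomically
end

# Usage in a throwaway directory
Dir.mktmpdir do |root|
  old_release = File.join(root, "releases", "20081201-1415")
  new_release = File.join(root, "releases", "20081210-1703")
  FileUtils.mkdir_p([old_release, new_release])

  current = File.join(root, "current")
  File.symlink(old_release, current)
  switch_current(current, new_release)
  puts File.readlink(current)
end
```

At no point does a request see a half-updated tree: `current` points either at the old release or at the new one.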

Clean

By clean deployment I mean that no files from the old version should be accessible after deploying the new version. Too many times we’ve seen deployments done by simply dumping new files into the production directory, so files that were removed in the new version remain in place.

Reversible

It should be fairly easy to revert to the previous version when a major bug is found in the new one (and if there were no db migrations it should be trivial).
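
To show how trivial the code-only case can be, here is a hedged sketch. It assumes every release lives in its own time-stamped directory under `releases/` and that `current` is a symlink to the active one (the layout described in the HOWTO further down). `previous_release` and `rollback` are hypothetical names, and any database downgrade would have to happen separately.

```ruby
# Find the release deployed just before the active one, relying on
# time-stamped directory names sorting chronologically.
def previous_release(root)
  releases = Dir.children(File.join(root, "releases")).sort
  active   = File.basename(File.readlink(File.join(root, "current")))
  index    = releases.index(active)
  raise "active release not found under releases/" if index.nil?
  raise "no earlier release to roll back to" if index.zero?
  File.join(root, "releases", releases[index - 1])
end

# Repoint `current` at the previous release (code only, no db changes).
def rollback(root)
  target   = previous_release(root)
  tmp_link = File.join(root, "current.tmp")
  File.symlink(target, tmp_link)
  File.rename(tmp_link, File.join(root, "current"))
  target
end
```

Because old releases are never modified, rolling back is just a matter of pointing the link at a directory that is already known to work.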

Have a release log

It should be possible to answer questions like:

  • what version were we running on September 17th last year? (I have a db record time-stamped with this date, and I want to see the code that created it)
  • did we have any code changes 17 days ago at 13:40? (I’m looking at the performance graphs for the application and notice that CPU usage rose by 20% at this time.)
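
Both questions become simple lookups once every deployment leaves a line in a log. A rough sketch follows; the log format (ISO 8601 time, release directory, source revision) and the entries themselves are invented sample data, and all names are hypothetical.

```ruby
require "time"

# Made-up sample log, one line per deployment:
#   <ISO 8601 deploy time> <release directory> <source revision>
SAMPLE_LOG = <<~TEXT
  2008-09-02T10:15:00Z 20080902-1015 r1180
  2008-12-01T14:15:00Z 20081201-1415 r1254
  2008-12-10T17:03:00Z 20081210-1703 r1267
TEXT

Deployment = Struct.new(:deployed_at, :release, :revision)

def parse_log(text)
  text.each_line.map(&:split).reject(&:empty?).map do |time, release, revision|
    Deployment.new(Time.iso8601(time), release, revision)
  end
end

# The deployment that was live at `moment`: the last one made at or before it.
def running_at(deployments, moment)
  deployments.select { |d| d.deployed_at <= moment }.max_by(&:deployed_at)
end

# Deployments made within `window` seconds of `moment`.
def deployed_near(deployments, moment, window)
  deployments.select { |d| (d.deployed_at - moment).abs <= window }
end

deployments = parse_log(SAMPLE_LOG)
live = running_at(deployments, Time.iso8601("2008-09-17T12:00:00Z"))
puts "#{live.release} #{live.revision}"
```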

Manage DB migrations

The database schema must be managed by scripts that run during deployment. You should NEVER manually modify the production database schema in the course of a regular deployment. (You might still need to do it to ‘repair’ an automated migration that failed midway, but this should be the exception, not the rule.)

The reversibility requirement also means that your migration scripts should be able to both upgrade and downgrade the schema.
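
The up/down idea can be sketched in a few lines of plain Ruby. This is not how Rails migrations are implemented; `FakeDatabase` is a stand-in for a real database, the table names are invented, and the sketch assumes version numbers are contiguous integers.

```ruby
# Hypothetical migration: a version number plus paired up/down steps.
Migration = Struct.new(:version, :up, :down)

# Stand-in for a real database: the "schema" is a list of table names and
# `version` plays the role of the stored schema version.
class FakeDatabase
  attr_reader :tables
  attr_accessor :version

  def initialize
    @tables = []
    @version = 0
  end
end

MIGRATIONS = [
  Migration.new(1, ->(db) { db.tables << "users" },   ->(db) { db.tables.delete("users") }),
  Migration.new(2, ->(db) { db.tables << "avatars" }, ->(db) { db.tables.delete("avatars") }),
]

# Move the database to `target`: run `up` steps in ascending order or
# `down` steps in descending order, recording the version after each step.
def migrate(db, migrations, target)
  ordered = migrations.sort_by(&:version)
  if target >= db.version
    ordered.each do |m|
      next unless m.version > db.version && m.version <= target
      m.up.call(db)
      db.version = m.version
    end
  else
    ordered.reverse_each do |m|
      next unless m.version <= db.version && m.version > target
      m.down.call(db)
      db.version = m.version - 1 # assumes contiguous version numbers
    end
  end
  db
end
```

A deployment would call something like `migrate(db, MIGRATIONS, 2)`, and a rollback would call it again with the previous version number.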

Credit and more war stories

To give credit where it is due, Rails + Capistrano / Vlad actually support all of the above almost ‘out of the box’. But even with Rails we’ve seen projects that deploy by running ‘svn up’ in the production directory, or even by ftp-ing a new .css file directly to production without checking it into version control.

The HOWTO

Below is one possible way to do it that satisfies all of the above (this is how it’s usually done with Capistrano or Vlad):

  • The production directory is just a symbolic link to a deployment directory, which is different for every deployment.
  • The files that should be shared between deployments (database configuration file, user-uploaded images etc.) are stored in a separate ‘shared’ directory that can be accessed directly or linked into each release directory during deployment (we do the latter)
  • The exact source code version and exact deployment time-stamp are logged to a simple text file
  • The database has a version attached to it, which is stored in a separate database table (with just a numeric version field). There is a script for each database version that knows how to bring the database up from the previous version or return it to the previous version (i.e. both ‘up’ and ‘down’)
  • The deployment process is performed as follows
    1. new code is uploaded/unpacked/checked out directly from source control into a NEW directory with a unique name (just use a time-stamp)
    2. directory name, code version and current timestamp are logged in the revisions.log file
    3. shared directory is linked into the release directory
    4. deployment-specific configuration files are copied from the shared directory to the release directory
    5. database migrations are run if needed
    6. symlink is replaced to point to the new directory
    7. web server is restarted
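
The steps above can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not what Capistrano or Vlad actually run: `deploy` is a hypothetical function, `source_dir` stands in for an upload or a source-control checkout, the log line format is invented, and steps 5 and 7 are left as comments because they depend on the application.

```ruby
require "fileutils"

# Hypothetical deploy routine following steps 1-4 and 6.
def deploy(root, source_dir, revision, now = Time.now)
  name    = now.strftime("%Y%m%d-%H%M")
  release = File.join(root, "releases", name)
  shared  = File.join(root, "shared")
  raise "release #{name} already exists" if File.exist?(release)

  # 1. put the new code into a NEW, uniquely named directory
  FileUtils.mkdir_p(File.join(root, "releases"))
  FileUtils.cp_r(source_dir, release)

  # 2. record what was deployed and when
  File.open(File.join(root, "revisions.log"), "a") do |log|
    log.puts "#{now.getutc.strftime('%Y-%m-%dT%H:%M:%SZ')} #{name} #{revision}"
  end

  # 3. link the shared directory into the release
  FileUtils.mkdir_p(File.join(release, "config"))
  File.symlink(shared, File.join(release, "config", "shared"))

  # 4. copy deployment-specific configuration
  FileUtils.cp(File.join(shared, "config", "database.yml"),
               File.join(release, "config"))

  # 5. database migrations would run here

  # 6. switch the `current` symlink with a single rename
  tmp_link = File.join(root, "current.tmp")
  File.symlink(release, tmp_link)
  File.rename(tmp_link, File.join(root, "current"))

  # 7. restarting the web server is environment-specific and omitted
  release
end
```

Note that nothing visible to users changes until step 6, so a failure in any earlier step leaves the running version untouched.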

The resulting directory structure might look like this:

deployment-root/
|-- current -> releases/20081210-1703
|-- releases
|   |-- 20081201-1415
|   |   |-- config
|   |   |   |-- database.yml  <== this is copied from the shared/config/database.yml
|   |   |   `-- shared -> ../../../shared
|   |   |-- log -> ../../shared/log
|   |   `-- public
|   |       `-- images
|   |           `-- avatars -> ../../config/shared/avatars
|   `-- 20081210-1703
|       |-- config
|       |   |-- database.yml
|       |   `-- shared -> ../../../shared
|       |-- log -> ../../shared/log
|       `-- public
|           `-- images
|               `-- avatars -> ../../config/shared/avatars
|-- revisions.log
`-- shared
    |-- avatars
    |-- config
    |   `-- database.yml
    `-- log
        `-- production.log

I can haz comments.

December 8th, 2008

Blog moved to Mephisto. So we have comments now.

In the process of installing Mephisto I ran into a problem with the image_science gem. It installed OK, but when trying to require it there was a problem with RubyInline compilation:


astrails@alpha:~$ irb -rrubygems -rimage_science
/home/astrails/.ruby_inline/Inline_ImageScience_aa58.c:3:23:
error: FreeImage.h: No such file or directory
...
CompilationError: error executing gcc -shared   -fPIC -Wall -g -fno-strict-aliasing -O2  -fPIC -I /usr/lib/ruby/1.8/x86_64-linux -I /usr/include -o "/home/astrails/.ruby_inline/Inline_ImageScience_aa58.so" "/home/astrails/.ruby_inline/Inline_ImageScience_aa58.c" -lfreeimage -lstdc++: 256
...

RubyInline is a Ruby gem that allows you to write inline C code in your Ruby source files; the code is compiled and linked in when it is first used. This requires gcc and all build dependencies to be installed on the computer running it (which kind of sucks if you want to use it on an embedded device, but that’s for another post).

Anyway, in this case FreeImage.h was missing, which is part of the FreeImage project.

Unfortunately the ‘freeimage’ Debian package is not available in the stable Etch distribution.

The solution is to compile the source package from Lenny:

Edit your /etc/apt/sources.list file and add the following line to it:
    deb-src http://http.us.debian.org/debian lenny main non-free contrib
Refresh the package lists so that apt knows about the new source:
  apt-get update
Install ‘tofrodos’ (This is freeimage’s build dependency):
  apt-get install tofrodos
Download and build source packages for freeimage:
  mkdir /tmp/freeimage
  cd /tmp/freeimage/
  apt-get source -b freeimage

The “-b” flag tells apt-get to build the package after it downloads the sources.

Install the packages
  dpkg -i *.deb
Test image_science
  irb -rrubygems -rimage_science

We really like Debian and we usually use the current “stable” distribution for our production servers. It all works great with one little problem: if you need very current software, it is probably not in ‘stable’ yet.

The current Debian stable (“etch”) includes rubygems 0.9.0-5 which is way too old. We needed to upgrade to at least 1.2.

There are several ways you can try to solve such a problem. For example, there are backports of selected packages from testing/unstable. Or you can compile the sources yourself, which is less trivial but will help when a binary backport is not available.

We are going to download the sources for the newer package from the unstable (“sid”) distribution and compile them on the stable distribution. The procedure is simple:

  • add ‘deb-src http://http.us.debian.org/debian sid main non-free contrib’ to your /etc/apt/sources.list
  • run ‘apt-get update’
  • run ‘apt-get source rubygems’
  • cd libgems-ruby-1.2.0
  • The unstable package has options to build rubygems1.9 as well as rubygems1.8, so it requires ruby1.9 to be installed. To avoid that, edit debian/control and remove the dependencies on ruby1.9 and the whole section for “Package: rubygems1.9”
  • run “dpkg-buildpackage -b -uc”
  • cd ..

Now you have rubygems*.deb packages.

First remove old packages:

 apt-get remove libgems-ruby1.8 rubygems

Then install the new packages:

 dpkg -i rubygems*.deb

Incorporated...

September 1st, 2008

We just incorporated our own Ltd. company.

It had been coming for a while, but we finally got to it when we started to hire people :)

Memory problems with gems

August 21st, 2008

Once again I’ve hit the problem of installing gems on a machine with very little memory.

Gems can use quite a lot of RAM resolving dependencies and updating the gem index.

This can be a problem if you have very little memory (the usual condition on most cheap VPS accounts).

In my case it was an embedded device with just 235k total memory and only about 150k free.

What happens is that the kernel kills the offending process. You’ll see something like this in the terminal:

…
Bulk updating Gem source index for: http://gems.rubyforge.org/
Killed

The solution is to effectively disable bulk updates by raising the bulk threshold with the -B (--bulk-threshold) parameter:

    gem install rails -B 999999

This will take some time to run and will print a lot of dots but will use less memory.

Being Lazy with Ruby

June 4th, 2008

On one of our projects we needed to do some caching for an action with an expensive db query. Fragment caching took care of the rendering, but we needed a way to skip the db query on a cache hit. And checking for the existence of the fragment file in the controller just didn’t seem right.

Lazy evaluation to the rescue.


    # controller
    @items = Item.lazy.paginate
    # view
    <% cache("fragment...") do %>
      <%= render :collection => @items %>
    <% end %>

@items is a placeholder for the result of calling paginate, which will not actually be executed until @items is used in any way (calling any method on it, like “each” or “to_s”). If the fragment is cached, @items is never accessed and so no db query is made.

We recently re-implemented lazy evaluation for our new project with a much cleaner syntax and a simpler implementation. We packaged it as the lazyeval gem.

Sources can be found at github. The interesting part is in lazyeval.rb which is just 28 lines long. Check it out …

The basic idea is to return a placeholder object that ‘remembers’ what needs to be done once it is used.

There are two ways to use it:

  1. foo.lazy.bar(params) - this will return a placeholder that will call method ‘bar’ on the object ‘foo’ with ‘params’ once used.
  2. foo.lazy {|o| o.bar } - this will return a placeholder that will call the block passing the object ‘foo’ as a parameter.
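
For the curious, here is a rough sketch of how such a placeholder can be built. It is not the lazyeval source: `LazyProxy`, `LazyBuilder` and the `lazily` helper are names made up for this example (it deliberately does not add a `lazy` method to every object), and it assumes Ruby 1.9 or later for BasicObject.

```ruby
# Placeholder that runs its block on first use and then forwards every
# method call to the result. Hypothetical sketch, not the lazyeval gem.
class LazyProxy < BasicObject
  def initialize(&computation)
    @computation = computation
    @evaluated = false
  end

  # BasicObject defines almost no methods, so nearly every call lands here.
  def method_missing(name, *args, &block)
    unless @evaluated
      @value = @computation.call
      @evaluated = true
    end
    @value.__send__(name, *args, &block)
  end
end

# Remembers the receiver; the next method call becomes the deferred work.
class LazyBuilder < BasicObject
  def initialize(receiver)
    @receiver = receiver
  end

  def method_missing(name, *args, &block)
    receiver = @receiver
    ::LazyProxy.new { receiver.__send__(name, *args, &block) }
  end
end

# lazily(foo).bar(params)  -> option 1: defer calling `bar` on `foo`
# lazily(foo) { |o| ... }  -> option 2: defer calling the block with `foo`
def lazily(receiver, &block)
  if block
    LazyProxy.new { block.call(receiver) }
  else
    LazyBuilder.new(receiver)
  end
end

sorted = lazily([3, 1, 2]).sort     # nothing is sorted yet
puts sorted.first                   # first use: the sort runs now
doubled = lazily(21) { |n| n * 2 }  # block form, also deferred
puts doubled.to_s
```

The deferred work runs at most once, which is exactly what the fragment-cache case needs: if the view never touches the placeholder, the query never runs.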