How to Set Up a Horde of Mongrels

Update: Forget this, and check out mod_rails instead

My wife recently commented to me that her website www.crittersdelight.com takes quite a while to load. The site is powered by the Ruby on Rails framework and was served by a single Mongrel instance behind an Apache proxy. That single Mongrel did the job, but not very well: it took anywhere from 10 to 15 seconds for the URL to resolve and begin rendering. After I set up three Mongrel servers, that time was cut down to around a second.

I did some research and quickly found that the appeal of Mongrel is its ability to parse requests and serve a dynamic application with a small memory footprint (each instance of Mongrel consumes anywhere from 15MB to 40MB). The problem, however, is that while Apache can accept multiple connections and process requests in parallel, Mongrel cannot. That means sequential access for each requested file inside of your Rails application. If you have 20 images, 3 CSS stylesheets, and 5 Javascript files, you could be looking at quite a delay.

The de facto solution, then, is to run multiple instances of Mongrel (called a cluster) and use Apache to intelligently route requests based on each instance’s load. This way you get simultaneous connections with parallel processing inside of your Rails application. I was off to Google to learn how to set up this environment when I was confronted by problem #1: there isn’t much on the subject floating around out there. Problem #2: what I did find was geared towards Capistrano (another layer of complexity I was hoping to avoid). Problem #3: documentation specific to Ubuntu is even more scarce.

I pieced together information from various sources and came up with a working solution. Read on to replicate this at home. (Keep in mind that this guide is geared towards someone who already has one instance of Mongrel running their Rails application; for more information on setting up a plain old instance of Mongrel, read this article.)

My setup:

  • Ubuntu 7.10 (any *nix distribution should do – but your files may be located in different places)
  • Apache 2.2 with mod_proxy, mod_proxy_http, mod_proxy_balancer, and mod_rewrite
  • Ruby 1.8.4 with Rails and a working project ready to go live

I started by installing what I needed (it is possible that the only gems really required are mongrel and mongrel_cluster):

sudo gem install daemons gem_plugin mongrel mongrel_cluster --include-dependencies
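
To double-check that the gems landed, a quick listing should show mongrel and mongrel_cluster along with their installed versions:

gem list mongrel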

Next, let’s make sure that the Apache 2 modules we need are enabled (Ubuntu style). Note that the balancer talks to the Mongrel backends over HTTP, so proxy_http needs to be enabled alongside proxy and proxy_balancer:

sudo a2enmod rewrite ; sudo a2enmod proxy ; sudo a2enmod proxy_http ; sudo a2enmod proxy_balancer

I then started my project by navigating to my project’s root directory and issuing the command:

./script/server

This started my project (using Mongrel on port 3000). This was my proof of life before I started mucking around with all kinds of configuration. I connected to the URL to confirm that it worked. Next, let’s generate our mongrel_cluster configuration (it’s pretty straightforward):

mongrel_rails cluster::configure -e production -p 8000 -a 127.0.0.1 -N 3

This should return something like “mongrel_cluster.yml configuration created inside ./config/”. The switches above are largely the same as when starting WEBrick or a single Mongrel. The “e” switch sets your environment (development, production, etc.). The “p” switch specifies the first port to bind; ports are bound sequentially based on the number of server instances (switch “N”) you specify, so in this case Mongrel will use 8000, 8001, and 8002. Finally, the “a” switch locks the Mongrel servers down to listening only on the localhost address, which means only the machine they are running on can access them.
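
For reference, the generated config/mongrel_cluster.yml is plain YAML. With the switches above it should look roughly like the following; the cwd path here is only an example and will be whatever directory you ran the command from:

---
cwd: /var/rails/myapp.com/current
log_file: log/mongrel.log
port: "8000"
environment: production
address: 127.0.0.1
pid_file: tmp/pids/mongrel.pid
servers: 3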

Now that our configuration file has been generated, we can test our progress. Run the following to start the cluster:

mongrel_rails cluster::start

You should see output detailing the servers starting up and the ports they have bound to. For a full list of options, just run “mongrel_rails”; it is very similar to other “init” scripts in *nix. Verify that the instances are running by connecting to the ports manually on your machine, using something like “lynx” with the URL 127.0.0.1:8000. All of them should connect for you at this point.
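
If lynx is not handy, curl works just as well for this sanity check. A minimal sketch, assuming the three ports from the configuration above:

# prints one status line per Mongrel instance, e.g. "HTTP/1.1 200 OK"
for port in 8000 8001 8002; do curl -sI http://127.0.0.1:$port/ | head -n 1; done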

Now comes the hard(er) part: configuring Apache. We need to create a VirtualHost entry inside the “/etc/apache2/sites-available/default” file. On many other distributions, this file will be “httpd.conf”. Inside this file, create something like the following:

<VirtualHost *:80>
  ServerName myapp.com
  DocumentRoot /var/rails/myapp.com/current/public

  <Directory "/var/rails/myapp.com/current/public">
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
  </Directory>

  RewriteEngine On

  # Make sure people go to www.myapp.com, not myapp.com
  RewriteCond %{HTTP_HOST} ^myapp\.com$ [NC]
  RewriteRule ^(.*)$ http://www.myapp.com$1 [R=301,L]

  # Rewrite index to check for static
  RewriteRule ^/$ /index.html [QSA] 

  # Rewrite to check for Rails cached page
  RewriteRule ^([^.]+)$ $1.html [QSA]

  <Proxy *>
    Order Allow,Deny
    Allow from all
  </Proxy>

  # Redirect all non-static requests to cluster
  RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
  RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]

  # Deflate
  AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  BrowserMatch ^Mozilla/4.0[678] no-gzip
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
</VirtualHost>

As shown, everything goes inside a single VirtualHost container. You will need to replace the occurrences of “myapp.com” with your actual DNS name, and DocumentRoot should point to the full path of your Rails project’s public directory. This configuration does a few checks and then proxies the request to our (as yet unwritten) proxy balancer.

Something important that I do not see mentioned often is the access permission needed to use a proxy. Note the <Proxy *>…</Proxy> section: if you do not put this inside your VirtualHost, you will receive a 403 Forbidden error when you attempt to connect.

Next, outside of our VirtualHost container, we need to define the proxy balancer itself. We give it an arbitrary name, “mongrel_cluster”, which is the same name the RewriteRule in the VirtualHost above already refers to. Paste the code below underneath your closing VirtualHost tag:

<Proxy balancer://mongrel_cluster>
  Order Allow,Deny
  Allow from all

  BalancerMember http://127.0.0.1:8000
  BalancerMember http://127.0.0.1:8001
  BalancerMember http://127.0.0.1:8002
</Proxy>

It is important to note that the BalancerMember addresses must use your internal loopback address (127.0.0.1). Originally I was using name-based virtual hosts and assumed these entries would need the public hostname to resolve correctly. However, that creates an outbound request and fails, because Mongrel is only listening for local connections.

Again, for access control, I stuck in the “Order Allow,Deny” and “Allow from all” directives to permit requests to reach this resource. After this, save your file and reload Apache 2:

sudo /etc/init.d/apache2 force-reload
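
If the reload fails, it can save some head-scratching to have Apache check the configuration syntax first; apache2ctl ships with Ubuntu’s apache2 package:

sudo apache2ctl configtest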

With luck, you will be served the product of your hard-working horde of Mongrel servers.
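
For later housekeeping (after a new deploy, for example), mongrel_cluster provides matching stop and restart commands; run them from your project’s root directory, where config/mongrel_cluster.yml lives:

mongrel_rails cluster::stop
mongrel_rails cluster::restart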

2 Comments

  1. John says:

    Good post. You may also want to check out the link I will put below this. It goes into a lot more detail about what everything is doing. Also, you know you could have just called the HUB because I hear they have pages and pages of documentation on this :-).

    http://blog.codahale.com/2006/06/19/time-for-a-grown-up-server-rails-mongrel-apache-capistrano-and-you/


  2. Johnathon says:

    Booooooooooooooooooooooooring.
