Computers, Linux, Open-source, Personal, Software, Web

Apache2 + Trac + LDAP authentication

Setting up Python easy_install:

Trac is a Python project, and if you are familiar with Python, you probably know about “easy_install”. This is a lightweight package-management utility that eases some of the woes associated with software installation. Using “easy_install”, a Python project can be distributed as source code, compiled against your version of Python, and installed to a global Python cache. It is very roughly analogous to “apt-get” for Debian and derivatives, or even “gem” for Ruby. The benefit is that you get the most current version at the time of install, and are not limited to the version that your distribution may offer through its package manager. This can potentially cause conflicts, so weigh the option carefully. Good articles comparing “easy_install” with “apt-get” can be found here, and here.

To install using “easy_install”, you will need the Python setuptools package. This can be downloaded and installed by hand, but don’t be silly. Just “apt-get install python-setuptools”. You will also need Python as a prerequisite for this package; at the time of this writing, 2.4 is the most current version in Debian Etch.
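On Debian that boils down to the following (assuming the stock Etch repositories; the second line is just a sanity check that the tool landed on your PATH):

sudo apt-get install python-setuptools
which easy_install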

Setting up Trac:

Next we need to get the Trac source code (if you go the “easy_install” route). Extract your source package, and inside it, locate the “setup.py” script. This is the starting point. Run it with Python like so: python setup.py bdist_egg

This will create a distribution-agnostic package called a Python Egg. Get it? Snakes, eggs… Anyway, this egg file is what “easy_install” handles for us. Execute the command “easy_install dist/Trac-0.11-xxx”, where “xxx” is the exact build name of your Trac egg. This will copy the egg file into a global Python cache. On Debian this is “/usr/lib/python2.4/site-packages/”.
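Put together, the build-and-install steps look roughly like this – treat the “Trac-0.11” directory and egg names as placeholders, since the exact filenames depend on the Trac release and your Python version:

cd Trac-0.11
python setup.py bdist_egg
sudo easy_install dist/Trac-0.11*.egg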

Running Trac through Apache2:

Trac can be run in stand-alone mode or through Apache. We used to go with standalone because bang, boom, you are done. However, as our requirements grew, we found it easier to go with Apache so that we could tap into our campus LDAP server. This way we eliminate the need for the “user.htdigest” file, with yet another set of arbitrary credentials.

Make sure that you have Apache2 installed, as well as the mod_python modules. On Debian, this can be accomplished by running:

apt-get install libapache2-mod-python libapache2-mod-python-doc

This will automatically load the modules into the Apache configuration as well, and will prompt you to restart the server for the change to take effect. Patience, young grasshopper – we will restart in a moment.

Next, let’s create a new site: “sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/trac”, or something to that extent. This sets up a new VirtualHost directive that you can then do with as you please. You will need to enable this new site configuration by executing “sudo a2ensite trac”.
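In one shot, and assuming you also want the stock site out of the way so it does not claim the same host and path, that is roughly:

sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/trac
sudo a2ensite trac
sudo a2dissite default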

In the VirtualHost context, copy the following:

NameVirtualHost *

  <VirtualHost *>
    <Location />
      SetHandler mod_python
      PythonInterpreter main_interpreter
      PythonHandler trac.web.modpython_frontend
      PythonOption TracEnvParentDir /var/trac/
      PythonOption TracUriRoot /
    </Location>
  </VirtualHost>

If you have set up Apache VirtualHost files before, this syntax should be familiar. You have a Location inside your VirtualHost, and some directives to tell Apache that this is a Python project. The options starting with “Python” are actually variables being passed to the Python environment.

I have multiple Trac projects, and the root is at “/var/trac/”. Suit yours to taste, and restart Apache with this new configuration. In this setup, my location is “/”, meaning the root of my website should now show a list of available Trac projects. Ensure that this is the case.
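If “/var/trac/” does not contain any environments yet, that listing will come up empty. A project can be created with trac-admin – “myproject” is a placeholder name, and www-data (Debian’s Apache user) needs write access:

sudo mkdir -p /var/trac
sudo trac-admin /var/trac/myproject initenv
sudo chown -R www-data:www-data /var/trac/myproject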

If you run into issues, you can add “LogLevel debug” to the VirtualHost context and follow the file “/var/log/apache2/error.log” for more information.
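For example, in a second terminal while you reload the page:

tail -f /var/log/apache2/error.log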

Once Trac works (minus the “Login” bit), then it’s time to drop LDAP into this motha’.

Integrating LDAP with Apache and Trac:

In your same VirtualHost container, append the following code in the Location context:

Order deny,allow
Deny from all
Allow from 168.28.0.0/16
AuthType Basic
AuthName "Trac"
AuthBasicProvider "ldap"
AuthLDAPURL "ldap://your.ldap.instance:389/OU=Departments,DC=ccsunet,DC=clayton,DC=edu?sAMAccountName?sub?(objectClass=user)"
AuthLDAPBindDN "full,path,to,user"
AuthLDAPBindPassword "password"
AuthzLDAPAuthoritative Off
Require ldap-attribute memberOf="CN=CTS_Administrative Systems,OU=Administrative Systems,OU=Communications & Technology,OU=Office of Information Technology and Services,OU=Departments,DC=ccsunet,DC=clayton,DC=edu"
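Note that these directives come from mod_authnz_ldap, which is not enabled out of the box. On Debian the module ships with the Apache2 packages; assuming that is the case on your system, it can be switched on with:

sudo a2enmod ldap authnz_ldap
sudo /etc/init.d/apache2 force-reload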

Let’s start from the top:

Order deny,allow, Deny from all, and Allow from… are all directives to control access to content. Here I have locked our site down to a range of IP addresses to limit access to just on campus. (Actually, our firewall does this as well, but it makes me feel better to have it here too.)

AuthType, AuthName, and AuthBasicProvider instruct Apache that LDAP will be the authoritative source of credentials for our Trac installation.

AuthLDAPURL is the address of your LDAP instance. Port 389 is standard, but you can change it if needed. Also, we have spaces in our LDAP addresses, so I contain them inside double quotes, despite the warnings not to do so in the Apache2 mod_authnz_ldap documentation. Another note: Apache could not find users in the default directory of LDAP. We had to specify “OU=Departments” to get it to find anyone. The ?sub bit is supposed to recursively check the LDAP trees, but no dice. Just put it in and move on. sAMAccountName, as you probably know, is the Microsoft Active Directory naming convention for the unique identifier. If you are not using AD, you may go with uid, or something similar.
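For comparison, a non-Active Directory setup keyed on uid might look something like the line below – the host and base DN here are purely placeholders:

AuthLDAPURL "ldap://your.ldap.instance:389/OU=People,DC=example,DC=edu?uid?sub?(objectClass=person)"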

AuthLDAPBindDN and AuthLDAPBindPassword are the credentials of a read-only account used to access and search the LDAP tree. You should have some kind of reader account information here.

AuthzLDAPAuthoritative Off allows fall-through authentication if another provider is available. In other words, this makes failing LDAP not truly fatal.

Require specifies a ruleset of things to check even after the user has provided a correct combination of credentials. This is how you can restrict access to a subset of people. The options that you can use here are pretty awesome – just take a look at the official docs. I use ldap-attribute to check whether they belong to our department.
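A few other Require variations that mod_authnz_ldap understands, in case your directory does not expose a memberOf attribute (the user names and group DN below are placeholders):

# anyone who authenticates successfully
Require valid-user
# only the named accounts
Require ldap-user jsmith jdoe
# members of a specific group object
Require ldap-group CN=Some Group,OU=Departments,DC=example,DC=edu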

After you specify all of this, restart Apache2 and navigate to your URL. With any luck, you will get a prompt from Apache2 for a username and password, and with even more luck, it will only accept a valid combination of credentials.

To troubleshoot here, remember to set your VirtualHost to LogLevel debug and tail your “/var/log/apache2/error.log” file for all the nitty-gritty details.

Good luck!

Apple, Computers, Events, Linux, Open-source, Ruby, Software, Web

Rails – One Step Closer to Enterprise

I have jumped back into the exciting world of Ruby on Rails the last few days to make a few optimizations to my wife’s website, www.crittersdelight.com

The problem is that Rails runs dog-ass slow compared to other interpreted languages like PHP. After a few heated debates with some Rails fans at work over who is to blame, I concluded that Rails needs the native speed of an Apache module. I wondered how long the Rails community would take to just bite the bullet and develop the module. It’s not their job (exactly), but Apache is also open-source.

Without the module, all of the solutions involve handing the Ruby requests off to yet another server (in my case Mongrel) to process.

That is, until I read about a Ruby gem named Passenger. This was the mod_rails solution I was looking for with Apache natively serving the requests!

Curiosity got the better of me after watching their screencast. I saw that Ubuntu was well supported (among many other OS choices) as an added bonus. The installation was absolutely painless. I performed a few steps to get the website up and running:

sudo gem1.8 install passenger
sudo apt-get install ruby1.8-dev
sudo apt-get install apache2-prefork-dev
sudo apt-get install libapr1-dev
sudo passenger-install-apache2-module

I then pointed to the compiled module from inside Apache’s configuration files (copy and paste from the setup), and deleted a lot of garbage from my old Mongrel cluster VirtualHost section. My new configuration now looks like this:

<VirtualHost 168.28.245.99:80>
     ServerName www.crittersdelight.com
     ServerAlias crittersdelight.com
     DocumentRoot /home/bsimpson/public_html/crittersdelight/public

     <Directory "/home/bsimpson/public_html/crittersdelight">
          Options FollowSymLinks
          AllowOverride None
          Order allow,deny
          Allow from all
     </Directory>

     RewriteEngine On
</VirtualHost>
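For reference, the lines that the installer asks you to paste into Apache’s configuration look roughly like the following. The paths vary with your gem location and Passenger version, so copy whatever passenger-install-apache2-module actually prints rather than these placeholders:

LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-x.y.z/ext/apache2/mod_passenger.so
PassengerRoot /var/lib/gems/1.8/gems/passenger-x.y.z
PassengerRuby /usr/bin/ruby1.8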

A quick Apache restart, and now I am natively serving this Rails application. Thanks to Phusion for taking the biggest Rails pain in the ass away!

Computers, Linux, Open-source, Ruby, Software, Web

How to Set Up a Horde of Mongrels

Update: Forget this, and check out mod_rails instead

My wife recently commented to me that her website www.crittersdelight.com takes quite a while to load. This website is powered by the Ruby on Rails framework, and was served via Mongrel behind an Apache proxy. This single instance of Mongrel did the job, but not very well. It took anywhere between 10 and 15 seconds for the URL to resolve and begin rendering. After I set up 3 Mongrel servers, this time was cut down to around a second.

I did some research, and quickly found that the power of Mongrel is its ability to parse and serve dynamic languages with a small memory footprint (each instance of Mongrel consumes anywhere from 15MB to 40MB). The problem, however, is that where Apache can accept multiple connections and process requests in parallel, Mongrel cannot. This means sequential access for each requested file inside of your Rails application. If you have 20 images, 3 CSS stylesheets, and 5 Javascript files, you could be looking at quite a delay.

The de facto solution, then, was to run multiple instances of Mongrel (called a cluster) and use Apache to intelligently route the requests based on each instance’s load. This way you achieve simultaneous connections with parallel processing inside of your Rails application. I was off to Google to learn how to set up this environment when I was confronted by problem #1: there isn’t much on the subject floating around out there. Problem #2: what I did find was geared towards Capistrano (another layer of complexity I was hoping to avoid). Problem #3: documentation for Ubuntu seems to be even more scarce.

I pieced together information from various sources and came up with a working solution. Read below to replicate this at home. (Keep in mind that this guide is geared towards someone who already has one instance of Mongrel running their Rails application.) For more information on setting up just a plain old instance of Mongrel, read this article.

My setup:

  • Ubuntu 7.10 (any *nix distribution should do – but your files may be located in different places)
  • Apache 2.2 with mod_proxy, mod_proxy_balancer, mod_rewrite
  • Ruby on Rails 1.8.4 and a working project ready to go live

I started with getting what I needed (it is possible that the only thing really needed is mongrel and mongrel_cluster):

sudo gem install daemons gem_plugin mongrel mongrel_cluster --include-dependencies

Next, lets make sure that the modules that we need in Apache2 are enabled (Ubuntu style):

sudo a2enmod rewrite ; sudo a2enmod proxy ; sudo a2enmod proxy_balancer

I then started my project by navigating to my projects root directory and issuing the command:

./script/server

This started my project (using Mongrel on port 3000). This was my proof of life before I started mucking around with all kinds of code. I connected to the URL to confirm that it worked. Next, let’s generate our mongrel_cluster configuration (it’s pretty straightforward):

mongrel_rails cluster::configure -e production -p 8000 -a 127.0.0.1 -N 3

This should return something like “mongrel_cluster.yml configuration created inside ./config/”. The directives above are the same as when starting WEBrick or regular Mongrel. The “-e” switch is for your environment (development, production, etc.). The “-p” switch specifies what port to start on. Ports will be sequentially bound based on the number of server instances (switch “-N”) you specify. In this case, 8000, 8001, and 8002 will be used by Mongrel. And finally, the “-a” switch locks down the Mongrel servers to listen only on the localhost address. This means only the machine this is running on can access these resources.

Now that our configuration file has been generated, we can test our progress. Run the following to start the clusters:

mongrel_rails cluster::start

You should see output detailing the servers starting up and the ports they have bound to. For a full list of options, just run “mongrel_rails”. It’s very similar to other “init” scripts in *nix. Verify that these instances are running by connecting to the ports manually on your machine using something like “lynx” with the URL 127.0.0.1:8000. All should connect for you at this point.
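A couple of quick sanity checks – cluster::status comes with the mongrel_cluster gem, and the curl line mirrors the lynx check above (repeat it for 8001 and 8002):

mongrel_rails cluster::status
curl -I http://127.0.0.1:8000/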

Now comes the hard(er) part – configuring Apache. We need to create a VirtualHost directive inside of the “/etc/apache2/sites-available/default” file. For many distributions, this file will be “httpd.conf”. Inside this file, create something like the following:

ServerName myapp.com
  DocumentRoot /var/rails/myapp.com/current/public

  <Directory "/var/rails/myapp.com/current/public">
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
  </Directory>

  RewriteEngine On

  # Make sure people go to www.myapp.com, not myapp.com
  RewriteCond %{HTTP_HOST} ^myapp.com$ [NC]
  RewriteRule ^(.*)$ http://www.myapp.com$1 [R=301,L]

  # Rewrite index to check for static
  RewriteRule ^/$ /index.html [QSA] 

  # Rewrite to check for Rails cached page
  RewriteRule ^([^.]+)$ $1.html [QSA]

  <Proxy *>
    Order Allow,Deny
    Allow from all
  </Proxy>

  # Redirect all non-static requests to cluster
  RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
  RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]

  # Deflate
  AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  BrowserMatch ^Mozilla/4.0[678] no-gzip
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

Again, this should go inside a VirtualHost container. You will need to replace the occurrences of “myapp.com” with your actual DNS name. DocumentRoot should also point to the full path of your Rails project’s public directory. This does a few checks and then proxies your request to our (as of yet unwritten) proxy balancer.

Something important that I do not see a lot of mention of is the access controls needed to use a proxy. Note the section <Proxy *>…</Proxy>. If you do not put this inside your VirtualHost, you will receive a 403: Access Forbidden error when you attempt to connect.

Next, outside of our VirtualHost container, we will need to create our proxy balancer. Basically, we give this an arbitrary name of “mongrel_cluster” (which is already referenced in our VirtualHost above). Paste the code below underneath your closing tag for VirtualHost:

<Proxy balancer://mongrel_cluster>
  Order Allow,Deny
  Allow from all

  BalancerMember http://127.0.0.1:8000
  BalancerMember http://127.0.0.1:8001
  BalancerMember http://127.0.0.1:8002
</Proxy>

It is important to note that the address to load must be your internal loopback address (127.0.0.1). Originally, I was using name-based virtual hosts and assumed that this would need the name to correctly resolve. However, this creates an outbound request and so fails, because Mongrel is only listening for local connections.

Again, for access control, I stuck in the “Order Allow,Deny” and “Allow from all” directives to explicitly allow access to this resource. After this, save your file and issue a restart command to Apache2:

sudo /etc/init.d/apache2 force-reload

With luck, you will be served the product of your hard-working horde of Mongrel servers.