Computers, Software, Web

Google Music: What Not to Do

I was mildly interested in the launch of Google Music as a platform for purchasing music, with some integration into the Android platform. On the Google Music launch page, Google promises that you can both stream and download content to your computer and your Android devices. I decided to try it out with my wife’s purchase and see what it was made of.

Unfortunately, the launch page introduction was the last cool thing we saw from this product. We made our purchase quickly enough, but then we were stumped. The music would play in the browser, but the launch page promised we could download the music to our computer. After clicking every button in the UI, we resorted to searching for how to download music. Google Music Manager (no direct link) kept coming up as the way to download our music. Since we were never prompted to install this utility during the checkout process, we had to hunt it down.

I found it on a different computer while searching at http://music.google.com, as I had yet to register for an account. My wife had already made a purchase, so her default page had been replaced by her music library. There is seemingly no link to Google Music Manager once your first purchase is made.

I decided to email the file from my computer; however, Gmail will not allow attaching .exe files. I even tried to .zip it, but Gmail still informed me that it detected an .exe file. That is a different story, however…

So finally, I got the file to my wife’s computer, and we installed the application, thinking we were in the home stretch. It turns out that in order to DOWNLOAD your music, you first have to UPLOAD your personal music library. This step is not optional. It cannot be skipped, and you cannot access your purchased content until you have uploaded your personal music. My wife even canceled at the prompt that asked her if she wanted to upload, and it started the process anyway, leaving us to force close the application.

After all this, we were left with only one way to download the album: TRACK. BY. TRACK. You only get two downloads per purchase, so we had already burned through half of them. One of the tracks got “stuck” during download, so we crossed our fingers that the last download would be successful. Fortunately, it was.

So my question to you, Google, is why make the process for downloading our music purchases so painful? I am sure millions of people helped beta test this product, and I don’t believe that I am the first to notice that downloading content is far from easy. What is to be gained by making the process so difficult? Are you pushing a cloud service that not everyone wants? This seems to be a copy from Amazon’s playbook, but Google has failed to execute on the critical piece – a solid way to download your purchase, a la the Amazon MP3 downloader.

I think I will be directing future purchases to Amazon’s music store until Google makes some corrections.

Apple, Computers, Linux, Open-source, Ruby, Software, Thoughts, Web

PostgreSQL for Ruby on Rails on Ubuntu

My new desktop came in at work this week, and the installation was painless thanks to the great driver support of Ubuntu 11.10. For anyone setting up a Rails development box based on Linux, I have some tips to get around some pain points when using a PostgreSQL database.

Installation:

Postgres can be quickly and easily installed using apt-get on Debian or Ubuntu based distributions. Issue the command:

apt-get install postgresql

Ruby Driver

In order for Ruby to connect to PostgreSQL databases, you will need to install the pg gem. This gem will need the development package of PostgreSQL to successfully build its native extension. To install the PostgreSQL development package, issue the following command:

apt-get install libpq-dev # EDIT: postgresql-dev was replaced by this package on Ubuntu 11.10

Setup A PostgreSQL Role

You can configure PostgreSQL to allow your account to have superuser access, allowing your Rails tasks to create and drop databases. This is useful for development, but strongly discouraged for production. That being said, we can create a PostgreSQL role by logging into psql as postgres as follows:

su postgres -c psql

This will open a PostgreSQL prompt as the database owner postgres. Next, we need to create an account for our user. This should match the response from “whoami”:

create role <username> superuser login;

We can now exit psql by issuing “\q”. Try connecting to psql directly by issuing the following command from your shell account:

psql postgres

This should allow you to connect to the default database postgres without being prompted for credentials. You should now be able to issue the rake commands for creating and dropping the database:

rake db:create

RSpec Prompts for Credentials

I was being prompted by RSpec for credentials when running my test suite. If you would like to remove this credential prompt, read on:

There are differences between how the PostgreSQL package is configured by Homebrew on OS X and how it is packaged on Ubuntu and other distributions. One difference is the level of security configured in the pg_hba.conf file. This file is responsible for identifying which sources, using which authentication mechanisms, should be allowed or denied. By default, RSpec will trigger a password prompt even if your shell account has trusted permissions. This is because RSpec connects not as a local process, but to localhost. To allow connections to localhost to be trusted, you will need to modify the pg_hba.conf file.

The file is located at /etc/postgresql/<version>/main/pg_hba.conf

Comment out any existing lines at the bottom of the file and append the following:

local   all             all                                      trust
host    all             all              127.0.0.1/32            trust
host    all             all              ::1/128                 trust

This will allow connections from the shell, as well as connections to 127.0.0.1 (localhost) using both IPv4 and IPv6.

You will need to restart PostgreSQL for the changes in this file to take effect:

/etc/init.d/postgresql restart

PostgreSQL Extensions

If you want to make use of any of the additional extensions to Postgres, including fuzzystrmatch, you will need to install the postgresql-contrib package:

apt-get install postgresql-contrib

The extensions will install to /usr/share/postgresql/<version>/extension/

Using Postgres version 9, you can create these extensions in your database using the new CREATE EXTENSION syntax. In the case of the fuzzystrmatch extension, first connect to your database with psql:

psql <database_name>

Once inside your database:

create extension fuzzystrmatch;
Computers, Linux, Open-source, Ruby, Software, Web

Setup PostgreSQL with Rails on Linux

Today, I found myself needing to set up a Rails application to work with the PostgreSQL database. I found that the documentation on the PostgreSQL website was like drinking from a fire hose. Worse, every community response to an error message has a slightly different approach to the solution. Let’s run through a basic Rails PostgreSQL configuration assuming Rails 3, Postgres 8.x, and Ubuntu 11.04:

Step 1: Installing PostgreSQL and Libraries

Install the PostgreSQL server, the client package to connect (psql), and the pg library needed to compile the Ruby PostgreSQL driver:

$ sudo apt-get install postgresql postgresql-client libpq-dev

After this finishes installing, you can turn to your OS X co-worker and laugh at him while he is still downloading the first tarball file. PostgreSQL will start automatically, under the user postgres. You can verify that the installation is a success by using the psql command line utility to connect as the user postgres. This can be accomplished using the following command:

$ sudo -u postgres psql

This uses sudo to elevate your basic user privileges, and the “-u” switch will execute the following command as an alternate user. As the postgres user, this will run psql. If you connect successfully, you should be at the psql interactive prompt. If not, ensure PostgreSQL is running, and that psql is in the path for postgres.

Note: From the psql interactive prompt, type “\q” to exit.

Step 2: Configure a New PostgreSQL database

From the psql prompt, you can run SQL to view the current PostgreSQL users:

select * from pg_user;

You should see a table of database users returned:

 usename  | usesysid | usecreatedb | usesuper | usecatupd |  passwd  | valuntil | useconfig
----------+----------+-------------+----------+-----------+----------+----------+-----------
 postgres |       10 | t           | t        | t         | ******** |          |
(1 row)

We can see the postgres user that was created automatically during the installation of PostgreSQL. Let’s add another user to be an owner for our Rails database. The path of least resistance is to use your shell account username, since it keeps us from having to change some options in the database configuration file.

$ sudo -u postgres createuser <username>
# Shall the new role be a superuser? (y/n) n
# Shall the new role be allowed to create databases? (y/n) y
# Shall the new role be allowed to create more new roles? (y/n) n

This will create a new database user (named after your shell account) and grant that user access to log in to the database. The command will ask you a few questions about the account. It is important for Rails that you answer “y” when asked whether the user should be able to create databases. If you say no, you will not be able to run any rake tasks that create or drop the database.

We can confirm by selecting from the pg_user table again.

$ sudo -u postgres psql
select * from pg_user;
 usename    | usesysid | usecreatedb | usesuper | usecatupd |  passwd  | valuntil | useconfig
------------+----------+-------------+----------+-----------+----------+----------+-----------
 postgres   |       10 | t           | t        | t         | ******** |          |
 <username> |    16391 | f           | f        | f         | ******** |          |
(2 rows)

Step 3: Configure Rails

Switching to the Rails side, let’s configure our application for Postgres. This requires the pg gem. Open your Gemfile and append:

# Gemfile
gem "pg"

Now run bundle install to update your project gems.

$ bundle install

This should compile the Ruby pg database driver, allowing Ruby to talk to Postgres. Now, let’s tell our Rails application how to access our database. Open up config/database.yml and change the adapter line to read “postgresql”. The database name is, by convention, the name of your project plus “_development”. Finally, your shell username is needed. Because PostgreSQL will authenticate this account locally, you will not need to supply a password option, so delete that line.

# config/database.yml
development:
  adapter: postgresql
  encoding: unicode
  database: <project_name>_development
  pool: 5
  username: <your_username>

To test, run the rake task to create your database:

rake db:create

If everything works, you should have a newly created database owned by your shell account. You can login using psql by passing the name of the database as an option:

$ psql -d <project_name>_development

Happy migrating!

Troubleshooting

If you get the error: “FATAL: Ident authentication failed for user "<username>"”, ensure that you can see your newly created account in the pg_user table of the postgres database. (See Step 2 above)

If you get the error: “PGError: ERROR: permission denied to create database”, then ensure that your database user account has been granted the privilege CREATE. This can be done during the “createuser” command line account creation by answering “y” to the corresponding question about this permission.

If you get the error: “FATAL: role is not permitted to log in”, try manually granting the login privilege on your database user account. This can be done by executing the following as postgres at the psql prompt:

ALTER ROLE <username> LOGIN;

Notes on Alternative Authentications

PostgreSQL integrates very deeply into the Linux authentication world, allowing for quite an array of connection options. By default passwords are not accepted for local connections. Instead, PostgreSQL is configured to use the “ident sameuser” method of user authentication. See more at http://www.postgresql.org/docs/8.4/static/auth-pg-hba-conf.html.

Computers, Open-source, Personal, Ruby, Software, Thoughts, Web

Another Helping of Abstraction, Please

Rails 3.1 is soon to be released, and with it come two new abstraction libraries – CoffeeScript and S(ass)CSS. These libraries generate Javascript code and CSS definitions, respectively. While abstraction libraries are nothing new to Rails, the inclusion of two more got me thinking about the direction the Rails stack is heading.

CoffeeScript’s goal seems to be to make Javascript as Ruby-ish as possible. It describes Javascript’s curly braces and semicolons as embarrassing.

SCSS aims to address some of the repetitive elements of CSS through the use of variables, nesting, and mixins. This feels more acceptable to me than CoffeeScript, but my first encounter left me burned.

A few other abstraction libraries of relevance: Haml aims to generate HTML without the use of HTML tags. Additionally, Squeel‘s (MetaWhere 2.0) aim is to remove all SQL from your projects.

So what am I bitching about? Abstraction is a good thing, right? I see two categories of abstraction. The first is the “good” kind, which allows you to be completely ignorant of the underpinnings – for example, Ruby converting down into machine code.

The “bad” kind of abstraction is the substitution of a language with a DSL. This creates a lot of issues, starting with development and debugging. In the case of CoffeeScript and SASS, you have to compile the DSL files into Javascript and CSS files. I feel like this compile step is a step back from what we gain by working with dynamic languages like Ruby and Javascript in the first place.

Development with these libraries also requires that you understand both the DSL of the library and the language of the generated code. This additional skill requirement adds time to a project and raises the entry bar for new developers. It’s difficult to be proficient in a language and a DSL that generates that language at the same time. A Ruby developer told me yesterday that he was surprised at how rusty his knowledge of SQL had gotten. It’s shocking to me that a web developer would struggle with SQL, but I think it’s an accurate sentiment with which many Rails developers would agree.

Another casualty of abstraction is performance. Not only is the generated code sub-optimal relative to coding it by hand, it is also run through more system calls to get there. You can either compile up front (CoffeeScript, SASS), or you can incur this penalty on-the-fly (Haml, Squeel).

While none of these libraries are a requirement of web development, when working on a team that uses these technologies you are expected to produce consistent code. Even though these libraries let you embed “native” code, doing so is discouraged because of the availability of the DSL. The syntax for embedding native code (if it’s even allowed) is often cumbersome, and loses editor functionality such as syntax highlighting and parsing.

Since when did Ruby on Rails web developers stop working with SQL, CSS, HTML, and Javascript? I am beginning to feel like the Ruby camp is becoming the far-left extremists of the web development world. The web is built on these core technologies, and the benefits of abstracting them don’t seem to outweigh the costs.

Computers, Events, Linux, Open-source, Personal, Software, Web

You Found Me!

Sorry for any confusion to the few who read my slice of the web. My old DNS name, simpson.mine.nu, provided to me through dyndns.org, expired, leaving me stranded. Looking back through my emails, it seems that I had 5 days to reply to continue my account, and I failed to do so. Instead of being a simple fix of creating a new account, they have moved my domain name to a premium service. Rather than forking over my cash, I have decided to stop being lazy and buy a real domain name. So for all who have made it this far, welcome to my new home. The bathrooms are two doors down on the right.

Computers, Personal, Software, Thoughts, Web

Year of the eBook Readers

No doubt that like me, many of you are getting or giving an eBook reader of some kind this holiday season. It seems to be a perfect convergence of technology, price point, services and availability, and consumer demand. Initially I was not very interested in eBook readers, because I saw them less about the experience and more about creating a platform for companies to sell content through (a la iTunes). Coming from that perspective, I was impressed to discover the analog conventions present in the digital framework of eBooks. In particular, the Barnes and Noble Nook allows me to do three things that surprised me:

  1. Lending program: I can take materials that I have on my device and lend them to my friends. This is set up to mimic lending a physical book, although with a few more restrictions (14-day limit, one time, etc). During this time, I am not allowed to read the material on my device (policy over technology). It would be great to see this feature become cross-compatible with other platforms. The Nook app is free on most mobile devices, so sharing should be straightforward.
  2. Integration with public library systems: A big reason I resisted an eReader is that I don’t often purchase my books. I am an active patron at the library (why not? – I pay taxes for something!). This was an analog system that allowed me access to resources free of charge. It turns out that the Old Colony Library Network of Massachusetts allows you to check out a wide range of materials in digital formats. What is even better is that I can do this online! There is a slight complexity in integrating with Adobe’s DRM solution for providing this functionality, but it’s mostly transparent to me.
  3. Previewing books at Barnes and Noble: Another analog convention is that I could go into a bookstore, sit down, and read a few chapters of a book to see if I liked it before I bought it. eReaders allow this as well. I can go to any Barnes and Noble, connect to their wi-fi, and browse for up to an hour a day for free. My eReader even gives me cafe coupons for food and drinks while I browse!

These digital solutions are meant to mirror our existing analog systems. This is a smart move by the people driving the policies of these devices, because it addresses the limitations people see with digital formats. These solutions aren’t perfect, but they are a breath of fresh air amid the typical DRM rhetoric. What are your experiences with reading in a digital format? Has anyone coupled their device with desktop syncing software such as Calibre?

Computers, Open-source, Personal, Ruby, Software, Thoughts, Web

Taking the Magic Out of ActiveRecord – Speed up that Query!

Rails, and other programming frameworks that include an Object Relational Mapper (ORM), make it really easy to not worry about your database. That is, until your application slows way down and you are left scratching your head trying to figure out why. Recently, I have been tasked with optimizing our database code in order to speed up the controller views.

The truth about databases is that you can’t beat them at their own game. Doing so would be like making an FPS for the consoles and releasing it the same day as Modern Warfare 2. A database stores data in a manner that can be easily and efficiently queried. ActiveRecord makes it easy to write code without worrying about the database underneath, but this can create an issue: if you never know what the database is doing, how can you know whether you are doing it efficiently? I am of the firm belief (after witnessing Oracle chomp through 1000-line SQL queries like they weren’t there) that if the database can do something your code can also do, it is probably best to defer to the database. Pick the best tool for the job.

“If all you have is a hammer, everything looks like a nail” – Abraham Maslow

Let’s look at some examples of where some database magic can improve our lives. Take the following code base:

# == Schema Information
#
# Table name: assignments
#
#  id         :integer         not null, primary key
#  post_id    :integer
#  keyword_id :integer
#  created_at :datetime
#  updated_at :datetime
#

class Assignment < ActiveRecord::Base
  belongs_to :post
  belongs_to :keyword
end

# == Schema Information
#
# Table name: keywords
#
#  id         :integer         not null, primary key
#  name       :string(255)
#  created_at :datetime
#  updated_at :datetime
#

class Keyword < ActiveRecord::Base
  has_many :assignments
end

# == Schema Information
#
# Table name: posts
#
#  id         :integer         not null, primary key
#  title      :string(255)
#  created_at :datetime
#  updated_at :datetime
#

class Post < ActiveRecord::Base
  has_many :assignments
end

I have created a “Post” model to represent a blog post, and I can assign keywords to each post. To do this, I have another model, “Keyword”, which is associated to posts through an “Assignment” model. The schema information comments show the database structure.

Now, recently I have come across some code that aimed to collect all the keywords of a post with some special options. These keywords were to be found, their names listed, and sorted without case-sensitivity. Finally, any duplicate keys would be excluded. A programmer’s mind might gravitate towards a solution such as:

Post.find(1).assignments.map {|x| x.keyword.name }.sort {|a,b| a.upcase <=> b.upcase }.uniq

Let’s walk through this bit of code before I discuss better alternatives. To start, we find our Post with an id of “1” so that we can look at just its keywords. Next, we map each assignment to its keyword’s name, and then run a sort on the returned array. The array is sorted by upcasing the strings so that the sort is case-insensitive. Finally, “uniq” is run to exclude any duplicate keywords. While this is a working solution, it doesn’t take advantage of the power and flexibility of what a database can do. A few issues:

  1. This code generates multiple SQL select statements. (N+1)
  2. Sorting can be done through “ORDER” in SQL
  3. Unique records can be generated through DISTINCT

The problem with generating multiple select statements is an example of the “N+1” problem. One query (the “1”) is run to load the assignments. After that, “N” queries are run, one for each keyword referenced. In total, you have “N + 1” queries executed. If you have 5 keywords, the results will be unoptimized, but largely unnoticed. You will have 6 select statements:

  Assignment Load (6.3ms)   SELECT * FROM "assignments" WHERE ("assignments".post_id = 1) 
  Keyword Load (0.8ms)   SELECT * FROM "keywords" WHERE ("keywords"."id" = 1) 
  Keyword Load (0.7ms)   SELECT * FROM "keywords" WHERE ("keywords"."id" = 2) 
  Keyword Load (0.7ms)   SELECT * FROM "keywords" WHERE ("keywords"."id" = 3) 
  Keyword Load (0.7ms)   SELECT * FROM "keywords" WHERE ("keywords"."id" = 4) 
  Keyword Load (0.8ms)   SELECT * FROM "keywords" WHERE ("keywords"."id" = 5)

What happens if you have 500 keywords? 5,000,000 keywords? Your performance bottleneck will quickly shift to the database as multiple queries have to be generated, executed, and returned for each page request.
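To make the shape of the problem concrete, here is a toy sketch in plain Ruby (no database involved, hypothetical query strings only) that counts the statements the pattern above would issue:

```ruby
# Toy model of the N+1 pattern: one query to load the assignments,
# then one additional query per keyword referenced.
assignments = [{ :keyword_id => 1 }, { :keyword_id => 2 }, { :keyword_id => 3 }]

queries = []
queries << "SELECT * FROM assignments WHERE post_id = 1"            # the "1"
assignments.each do |a|
  queries << "SELECT * FROM keywords WHERE id = #{a[:keyword_id]}"  # the "N"
end

queries.size  # 4 statements for 3 keywords: N + 1
```

With 5 keywords this produces the 6 statements shown in the log above; with 500 keywords it would be 501.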

What is the big deal about sorting in your application instead of the database? In order for Rails (ActiveRecord specifically) to “sort” these items, the items must first be returned from a generated select statement against the database. These items are returned as elements in an array, and the array is then sorted by calling the “<=>” method on each element. Further, this is done in Ruby – a dynamic language, which is several orders of magnitude slower than the database (which is most likely written in C). Simply put, the database is made for this kind of work, and outsourcing it to Ruby is just not efficient.

Finally, why not make records unique with Ruby’s “uniq”? Again, it has to do with meddling in database territory. It has the problems inherent to the sorting problem above, but with an additional wrinkle. Let’s say that you return 500 records – and 499 of them are duplicates. Before the “uniq” method is run, Ruby is holding in memory the attributes of 500 ActiveRecord instances. When you call “uniq”, it can drop them. Why go through this memory spike and plummet when the database can be instructed to just return one record? Memory is a valuable resource in a server environment – be frugal with its consumption.
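The Ruby-side work described in the last two paragraphs can be seen in isolation, with plain strings standing in for keyword names (a standalone sketch, no ActiveRecord involved):

```ruby
# The application-side equivalent of ORDER BY UPPER(...) plus DISTINCT:
# every element is upcased during the sort, and duplicates sit in memory
# until uniq discards them.
names = ["Pabst", "guinness", "Amber", "guinness"]

sorted = names.sort { |a, b| a.upcase <=> b.upcase }
# => ["Amber", "guinness", "guinness", "Pabst"]

unique = sorted.uniq
# => ["Amber", "guinness", "Pabst"]
```

Every comparison in the sort pays for two upcased copies of the strings, and the duplicate “guinness” is carried all the way to the final uniq call.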

So, let’s address these issues and refactor our code to take them into account. Starting with the multiple SQL statements, we can combine them into one statement by joining multiple tables. I would encourage you to run “script/dbconsole” and get familiar with SQL. Let me demonstrate what we will be building the SQL way, before we implement it the “Rails” way.

SELECT a.* 
FROM   keywords a, 
       assignments b 
WHERE  a.id = b.keyword_id 
       AND b.post_id = 1; 

Another variant of this is to join the tables together using a “join” syntax. An inner join is the same type of join that we replicated in the WHERE clause above. We can write the same SQL as follows:

SELECT a.* 
FROM   keywords a 
       INNER JOIN assignments b 
         ON a.id = b.keyword_id 
WHERE  (( b.post_id = 1 )) 

We could specify this join using ActiveRecord’s “find_by_sql” if we wanted to build the association by hand. However, this case is trivial enough that ActiveRecord can build it for us using “has_many :through”. I can add the following to my models:

# == Schema Information
#
# Table name: keywords
#
#  id         :integer         not null, primary key
#  name       :string(255)
#  created_at :datetime
#  updated_at :datetime
#

class Keyword < ActiveRecord::Base
  has_many :assignments
  has_many :posts, :through => :assignments
end

# == Schema Information
#
# Table name: posts
#
#  id         :integer         not null, primary key
#  title      :string(255)
#  created_at :datetime
#  updated_at :datetime
#

class Post < ActiveRecord::Base
  has_many :assignments
  has_many :keywords, :through => :assignments
end

Now, I can gather all the keywords for a post by executing the following:

Post.find(1).keywords

Next, let’s address the sorting issue by specifying an ORDER clause. We can tackle another related problem at the same time. Recall that we want to sort in a case-insensitive fashion. If I just call order, then “Pabst” would beat “guinness” simply because of the capitalization (and we all know Guinness is the better beer). The easy solution is to call “UPPER” so the casing is the same when the comparison is made. This also saves computation on the Rails side by not having to do string conversions in our array sort. In SQL, we could append the following to our SELECT statement:

SELECT a.* 
FROM   keywords a 
       INNER JOIN assignments b 
         ON a.id = b.keyword_id 
WHERE  (( b.post_id = 1 )) 
ORDER BY UPPER(a.name)

The “Rails” way is to include this on the association as follows. (Notice that Rails table aliases are not as reliable as spelling out the table name itself; in this case, I have used “keywords.name”.)

# app/models/post.rb
...
has_many :keywords, :through => :assignments, :order => 'UPPER(keywords.name)'
...
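The capitalization quirk described above is easy to reproduce in plain Ruby, since default string comparison ranks every uppercase letter before every lowercase one:

```ruby
# Default string comparison: 'P' (ASCII 80) sorts before 'g' (ASCII 103).
plain = ["Pabst", "guinness"].sort
# => ["Pabst", "guinness"]

# Case-folded comparison, equivalent in spirit to SQL's UPPER():
folded = ["Pabst", "guinness"].sort { |a, b| a.upcase <=> b.upcase }
# => ["guinness", "Pabst"]
```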

Finally, lets address the unique problem. If I have duplicate keywords, I can return only the unique keywords by using the SQL DISTINCT modifier. In SQL, this would look like:

SELECT DISTINCT(a.name) 
FROM   keywords a 
       INNER JOIN assignments b 
         ON a.id = b.keyword_id 
WHERE  (( b.post_id = 1 )) 
ORDER BY UPPER(a.name)

In Rails, we can specify modifications to our SELECT clause by passing the :select key to ActiveRecord’s “find” and “all” methods. This has another benefit, depending on the application: for each column in each record, ActiveRecord has to store information in memory. By choosing only the specific columns that we want returned, we can reduce the memory footprint. This could look something like this:

Post.find(1).keywords.all(:select => 'DISTINCT(keywords.name)')
# 0.16 seconds to complete w/ 200 keywords - thats 3 times faster!

So in summary, we have reduced the number of SQL select statements and removed computationally expensive sorting and unique method calls from our Ruby code, all without any fancy tricks. A sharp developer may point out that embedding SQL functions is bad form because it isn’t database agnostic. The truth is most databases conform to a base set of ANSI SQL standards, and DISTINCT and UPPER are acceptable almost across the board.

A little database magic can make a crawling Rails action snappy again. I am a firm believer that Rails, like any framework, should not be an excuse to be uncaring about your SQL. Databases are the heart of most applications, and usually one of the first performance bottlenecks. Good SQL can be married with good Ruby code, and happiness will ensue. I hope this post was informative for Rails folks who want to get started with SQL optimization.