Open-source, Software

Updating database rows with a default position

I was recently tasked with making a table in a MySQL database sortable by the user. There are a few steps involved to take care of new records, while providing existing records with a default value.

The new records would be created with a ifnull(max(order), 0) + 1 to assign them the next position. This was sufficient since I was positioning based on created_at and a new record was guaranteed to have the highest created_at timestamp.

For the existing records, I wanted to:

  • group by user_id
  • set the order (int) column to an incrementing integer
  • with an initial ordering based on a created_at timestamp

The first approach was to select groups of users, and for each group sort the records then update each record. This approach would work, but it wouldn’t have good performance. I really wanted to do this in a single update statement.

I ended up going with an approach based on this article :

set @user_id = 0, @order = 1

update table1 as dest,
(
  select x.id, x.user_id, x.created_at,
  @order := if(@user_id = x.user_id, @order + 1, 1) as `order`,
  @user_id := x.user_id as dummy
  from (
    select id, user_id, created_at
    from table1
    order by user_id, created_at
  ) as x
) as src
set dest.`order` = src.`order`
where dest.id = src.id;

Let’s break this down:

  1. We are using database session variables to keep track of some state. We default to @user_id = 0 and @order = 1
  2. Working from from the inner-most select, we select the id (primary key), user_id (int)  and created_at (timestamp) as these are the columns needed to do our calculations for the new order value. We order these by user_id and created_at to get the sort ready. This dataset is aliased as x (yes I know I’m terrible at naming)
  3. The next select up pulls the same id, user_id and created_at from the inner-most select and does some calculations persisting the state in the session variables we setup in step 1. The actual work here is:
    1. Calculating the order: @order := if(@user_id = x.user_id, @order + 1, 1) . The := is an assignment in SQL. We set the @order to either a default of 1, or the @order + 1 .
    2. Comparing user to determine what to set the @order to. This is done in a conditional expression:  if(@user = x.user_id). Because the @user_id is set to the x.user_id value on each row, the comparison should return true until we get to the next user, at which point the @order will reset to 1.
  4. Finally the outer update takes the resulting src dataset and does some updating. We constrain to where dest.id = src.id to get the correct record, and then we set dest.order = src.order which is our calculated value.

Performance is looking great. On a dataset of 300k records, this runs in 3 seconds on a 4 CPU Docker container.

Bonus: If you have additional criteria (like only sorting favorite records for example) this can be easily applied to the inner-most SELECT in the where clause!

Happy ordering!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.