Its always interesting for me to work with large data sets. The solutions that work in lower orders of magnitude don’t always scale, and I am left with unusable solutions in production. Often the problems require clever refactoring that at a cursory glance appear identical, but somehow skirt around some expensive operation.
I had a requirement to tune a script that was responsible for inserting 300k records in a database table. The implemented solution of iterating through a collection and calling ‘INSERT’ was not scaling very well and the operation was taking long enough to time out in some runs. This gave me the opportunity to learn about a few things in MySQL including the profiler, and (spoiler!) the INSERT multiple records syntax.
I needed some real numbers to compare the changes I would be making. My plan was to change one thing at a time and run a benchmark to tell if the performance was 1) better 2) worse, or 3) not impacted. MySQL has an easy to use profiler for getting this information. Inside of the MySQL CLI, you can issue the command:
Any subsequent queries you run will now be profiled. You can see a listing of queries you want to know more about by typing:
This command will show an index of queries that have run, along with their associated Query_ID. To view more information about a particular query, issue the following command replacing x with the Query_ID:
SHOW profile FOR QUERY x
Here is an example output:
+------------------------------+----------+ | Status | Duration | +------------------------------+----------+ | starting | 0.000094 | | checking permissions | 0.000003 | | checking permissions | 0.000002 | | checking permissions | 0.000001 | | checking permissions | 0.000003 | | Opening tables | 0.000021 | | System lock | 0.000008 | | init | 0.000039 | | optimizing | 0.000012 | | statistics | 0.000717 | | preparing | 0.000014 | | Creating tmp table | 0.000023 | | executing | 0.000002 | | Copying to tmp table | 0.016192 | | converting HEAP to MyISAM | 0.026860 | | Copying to tmp table on disk | 2.491668 | | Sorting result | 0.269554 | | Sending data | 0.001139 | | end | 0.000003 | | removing tmp table | 0.066401 | | end | 0.000009 | | query end | 0.000005 | | closing tables | 0.000011 | | freeing items | 0.000040 | | logging slow query | 0.000002 | | cleaning up | 0.000015 | +------------------------------+----------+
In one iteration of my SQL query, I was spending an excessive amount of time “copying to tmp table”. After reading the article http://www.dbtuna.com/article/55/Copying_to_tmp_table_-_MySQL_thread_states, I was able to isolate the cause of this to an ORDER clause in my query that wasn’t strictly necessary. In this example, Not too much exciting is going on, which is a Good Thing.
For a comprehensive listing of thread states listed in the Status column, view: http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html
Now that I know my query is as optimized as it can be, its time to pull out the bigger guns. On to plan B – consolidating those INSERT statements!
An INSERT statement, though executing seemingly instantaneously under small loads is comprised of many smaller operations, each with its own cost. The expense of these operations is roughly the following: (http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html)
- Connecting: (3)
- Sending query to server: (2)
- Parsing query: (2)
- Inserting row: (1 × size of row)
- Inserting indexes: (1 × number of indexes)
- Closing: (1)
As you can see, connecting to the server, sending the query, and parsing are relatively expensive operations. In the script I was modifying, 300k INSERT statements were generating 300k records. Fortunately for us, MySQL doesn’t force our records to be 1:1 with our INSERT statements thanks to allowing multiple insertions per INSERT. To use this feature instead of having 3 INSERT statements:
INSERT INTO foo (col1, col2) VALUES (1, 1); INSERT INTO foo (col1, col2) VALUES (2, 2); INSERT INTO foo (col1, col2) VALUES (3, 3);
We can instead coalesce them into a single INSERT statement
INSERT INTO foo (col1, col2) VALUES (1, 1), (2, 2), (3, 3);
How many values can we coalesce into the same INSERT statement? This isn’t driven by a max number of records, but rather a server system variable sysvar_bulk_insert_buffer_size: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_bulk_insert_buffer_size This can be modified, but the default is 8388608 bytes. The exact number of records will vary depending on the number of columns, and the amount of data being inserted into those columns. I conservatively chose to coalesce 5k records at a time. I tried to bump this to 10k, but I encountered an exception when I exceeded this server system variable maximum.
With my INSERTS coalesced, I was able to reduce my total number of INSERT statements to 60 (300k / 5k). This yielded massive performance boosts. I was able to take the query from over an hour to run to completing in just 2 minutes. Quite a nice trick, considering the data is unchanged.
Is there room for improvement? Absolutely. A statement executed 60 times may be worth preparing, or wrapping inside of a transactional block. My real world tests didn’t yield a significant enough performance boost to make these complexities worth implementing. This may not be true with data in higher orders of magnitude, or different schema layouts. MySQL also understands INDEX hints, which allow you to suggest INDEXES that may be missed by the query planner, or force the inclusion or exclusion of beneficial, or detrimental INDEXES despite what the query planner thinks! (http://dev.mysql.com/doc/refman/5.0/en/index-hints.html)
Speaking of INDEX, if any are using UNIQUE, BTREE type, these can be dropped while the mass INSERT is running, then added back later to side-step the 1n INDEX operational hit.
In the next order of magnitude, I will probably have to rethink my approach of using INSERT statements to load data. According to the MySQL documentation, LOAD DATA INFILE is “roughly 20 times faster” than a MySQL INSERT statement. My script would no longer generate statements, but rather output to a file in a comma delimited format. This could then be loaded assuming appropriate permissions are in place.