Fun MySQL fact of the day: it's all secondary

In the last couple weeks ending with yesterday, we'd discussed secondary indices a few times, but we never really discussed why you may want to use them. So, over the next several days, we will do just that.

A secondary index is a MySQL construct that allows storage engines like InnoDB to index table data in various ways. In doing so, it gives us the ability to query the table data on various columns. For example, we may have a user table as (id,full_name,date_of_birth) and we may have different use-cases where we want to find our users based on full name, date of birth, or even find a user based on their full name AND date of birth. To accomplish this, we could create 2 indices:

CREATE INDEX dateofbirth_fullname_idx ON user(date_of_birth,full_name);
CREATE INDEX fullname_idx ON user(full_name);

Why only 2 indices and not 3? Well, when MySQL looks to figure out which key will work best for a query, it does so from a left-to-right basis. That is, MySQL will evaluate each index associated with the table and compare it from left-to-right with the fields specified in your WHERE clause. MySQL will choose the index with the best prefix match. This means you can use dateofbirth_fullname_idx to find users only by date_of_birth as well as to find users by both date_of_birth AND full_name. The order of these columns in your query doesn't matter, but we'll talk about that another day.

What we do not want, however, is to use the dateofbirth_fullname_idx to find users only by full_name. Why not? Because, as you recall from previous Fun MySQL Facts Of The Day, records in an InnoDB index are sorted in ascending key-order. In this case, key-order is date_of_birth then full_name, and to find a user by full_name, we would have to scan every single record in dateofbirth_fullname_idx in date_of_birth order to find matching full_names. If that sounds familiar, it's the exact same index scan we discussed way back on March 19, and it's rarely what we want.