Fun MySQL fact of the day: in defence of
Monday's introduction to
auto_increment maybe wasn't as glowing as it should have been, but we really shouldn't let one little
bug well-documented issue stop us from using it. You see,
auto_increment really is an invaluable tool, and today we'll consider why.
As we discussed yesterday, in InnoDB, a "table" is actually a b+tree index (the
PRIMARY index, or just
PRIMARY) in which "rows" are actually "records". In the
PRIMARY index, records are sorted in ascending order of the primary key. While designing a schema, deciding on a primary key is as much a science as it is an art, and the field(s) you eventually choose is critical to the performance of your database. Today, we're just going to focus on an OLTP (Online Transaction Processing) use case, which primarily optimises for write throughput and read latency of small data sets. Most web-based systems fit this category.
I'll make a bold proclamation: in any system that optimises for write throughput, a surrogate primary key (that is, one that's unrelated to the data) consisting of an
auto_increment value will nearly always win-out in terms of inserts per second. To explain why, let's consider that we currently have a
PRIMARY index with the following keys:
[ 1, 3, 5, 6, 8 ].
Good. Now, let's insert a new record with a key value of
9. Simple: we stick it right at the end. Then
10? Again, right at the end. What about
33977344? Right smack at the end, again. It's "almost" as "simple" as that (read: come along with me on my imaginary optimism trip into make-believe database land!).
Now, what happens if we instead need to insert
4? It goes after
3 but before
5. And what about
7? You get the point: unordered inserts require extra work, which slows down the process of inserts, which, as we recall from the last few weeks, slows everything down.
Of course, there's a whole lot more complexity, but one thing remain constant: inserting (increasing) monotonic records is a heck of a lot more efficient than random inserts. So, should you use an
auto_increment column (which provides monotonic, sequential values) for your primary key? It highly depends on your scenario, but you'd be crazy not to at least consider it.