Fun MySQL fact of the day: tenant lock-out
A couple weeks ago, we very briefly touched on the subject about MySQL/InnoDB locking only single records. And since we're on the topic of multi-tenant schemas, I think it's a good time to focus on this subject. For now, we'll focus on using MySQL's default isolation,
REPEATABLE READ. In MySQL,
REPEATABLE READ prevents phantom reads, even though it is not a defined as a property of the "well-known"
REPEATABLE READ. In simple terms, a phantom read happens when the same query run twice within a transaction sees different results. To see why this all matters, let's look at a fairly common multi-tenant table:
CREATE TABLE list_of_things( id INT AUTO_INCREMENT PRIMARY KEY, account_id INT NOT NULL, # REFERENCES account(id) modified DATETIME NOT NULL DEFAULT now(), thing VARCHAR(18) NOT NULL ); INSERT INTO list_of_things(account_id, thing) VALUES(1, 'account 1''s thing'); INSERT INTO list_of_things(account_id, thing) VALUES(5, 'account 5''s thing'); INSERT INTO list_of_things(account_id, thing) VALUES(9, 'account 9''s thing');
In this table, each
account_id may have multiple
things, and each
thing has a timestamp when it was last modified. Now, imagine several transactions from different accounts try to update their own rows:
account 5> BEGIN; UPDATE list_of_things SET modified = now() WHERE account_id = 5; # ... account 9> BEGIN; UPDATE list_of_things SET modified = now() WHERE account_id = 9; # ... account 1> BEGIN; UPDATE list_of_things SET modified = now() WHERE account_id = 1; # ...
Naturally, we'd expect that these
UPDATEs wouldn't get in each other's way, but what actually happens? After
account_id=5 runs its
UPDATE, the entire table will be locked. Why? Simply, to prohibit phantom reads, an InnoDB transaction must lock each record it encounters, and because the only index InnoDB can use on this table is the
PRIMARY key, InnoDB is forced to perform a full table scan, starting from before the first record to after the last record, locking each and every record as it goes, sequentially searching for all records where
Can we fix this? Yes, in several ways, but we'll start very simple today:
CREATE INDEX listofthings_accountid_idx ON list_of_things(account_id);
How does this help? Instead of doing a full index scan, this index will probabilistically be used to find the necessary records. Does this completely fix the problem? Nope, but the index is still extremely helpful to prevent locking the entire table, effectively serializing all changes.