Fun MySQL fact of the day: foreign keys are important

A Foreign Key is a table constraint that allows a database index to remain consistent based on the existence (or non-existence) of another row in a another table. For example, MySQL will prevent the removal of a referenced row until all other referencing rows have first been removed. It will also prevent the creation of rows that have an invalid reference. This is called Referential Integrity, and it's an incredibly important tool to prevent data anomalies and corruption all while improving performance: in MySQL, a foreign key is an index. This means table joins will never require a full table scan.

On to the example! Consider the very commonly-seen relationship between two tables, account and user:

    account_id INT NOT NULL,
    FOREIGN KEY account_user_fk(account_id) REFERENCES account(id)
INSERT INTO account VALUES(), (), (); # -- ids 1, 2 ,3 respectively.
INSERT INTO user(account_id) VALUES(1), (1), (2), (2), (3), (3);

In this schema, each account may have many (or zero) users, and each user may belong to one and only one account. Here, we have 3 accounts with 2 users each. Let's try to DELETE an account:

DELETE FROM account WHERE id = 3;
(1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`test`.`user`, CONSTRAINT `account_user_fk` FOREIGN KEY (`account_id`) REFERENCES `account` (`id`))')

Thankfully, MySQL stops us from deleting what would otherwise be an account with active users, as our application likely would have behaved oddly (assuming it worked at all!). If we really want to delete, we just need to delete (or move) the users:

DELETE FROM users WHERE account_id = 3;
DELETE FROM account WHERE id = 3;

Simple (and fun!), right? Unfortunately, though, in my career, I have encountered a number of teams and companies that have, for various reasons, decided to eschew referential integrity and not use foreign keys. The arguments typically focus on perceived performance issues stemming from the use of foreign keys, while, in truth, my experience suggests the performance issues are usually the result of poor data access patterns, sub-optimal configuration, or inappropriate schema design.

Sadly, deciding not to use foreign keys can lead to pretty severe consequences, and another almost-universally used feature in MySQL may well be among the biggest risks in a schema lacking foreign keys. "How risky", you ask? Well, as risky as anything that allows for undetectable account/identity takeovers, data leakage, data loss, loss of revenue, corruption, and application bugs. You're almost certainly using this feature, but that is the Fun MySQL Fact for Monday. Until then, just go add those foreign keys.