Fun MySQL fact of the day:
It's 3:37 A.M. when your phone blares out. Your hands are numb from the early onset of carpal tunnel and you can't hush the wailing siren that has become of your phone. With adrenaline coursing through your body, you shove your phone under your pillow, mashing around in the dark, trying desperately to make it stop before it wakes the neighbours. Finally, you succeed, and the hint of groggy accomplishment you just felt quickly fades into the horror in your stomach: an overnight database backfill has gone awry, leaving you with a table of 25,000,000 customers named
"Charlie Charlie". Still wiping the morning gunk and blur from your eyes, you finally notice that the last pre-disaster database back-up you have is almost 28 hours old and getting older by the second. Data loss, you fear, is inevitable as you start to draft a "Dear Charlie Charlie" email you'll send to all 25 million customers today, all, suspiciously, to email@example.com.
But wait! You enabled binary logging on Wednesday afternoon, right?! Of course you did! That was a good call! Now, put away that apology email for now, grab a coffee, and open up a terminal because we've got some work to do.
Out of the box, MySQL ships with
mysqlbinlog, a command-line tool you can use to inspect or transform the binary log into something useful. In this case, "useful" ranges from "helping diagnose an application bug" all the way to "recovering from a disaster". As such,
mysqlbinlog has a number of different modes and features you can and should explore, but for today, we're just going to keep it simple and generate a "patch" containing all of the changes made to our database from the last known-good back-up up to just before the disastrous query. Replaying these changes against our back-up will let us recover our corrupt data, and depending on our needs and situation, possibly even without any downtime.
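Before generating a patch, it's worth actually looking at what's in the binary log so we can pin down when (and as what) the disastrous change was recorded. A minimal sketch, assuming a binary log file named mysql-bin.000042 (your file names will differ): row events are stored as base64 by default, so we ask mysqlbinlog to decode them into pseudo-SQL we can read.

```shell
# Decode row-based events into human-readable pseudo-SQL so we can
# find the offending backfill statement and note its timestamp.
# (mysql-bin.000042 is a placeholder name; use your own binlog files.)
mysqlbinlog --verbose --base64-output=decode-rows mysql-bin.000042 | less
```

Each event in the output is preceded by a comment with its timestamp and log position, which is exactly the information we need for the next step.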
For pure simplicity, let's assume that our back-ups run at midnight every other day, meaning our back-ups are up to date as of 2019-05-23 at 00:00. Let's also assume, for simplicity, that our disastrous change happened at 2019-05-24 03:30. Knowing this, we can use the
mysqlbinlog tool to generate a patch of all queries between these two times and then run it against a new database instance that has been restored from back-ups:
my-new-server01$ mysqlbinlog --start-datetime='2019-05-23 00:00:00' \
                             --stop-datetime='2019-05-24 03:29:59' \
                             mysql-bin-from-old-server.[0-9]* \
                 | mysql
From here, we can proceed to "fix" the data in the original database and go back to sleep, or we can skip over the bad query, bring this database up-to-date, promote it as the master database, and then go back to sleep. Either way, please allow me to emphasise VERY STRONGLY: it likely won't ever be this simple. Specifically, using start/stop times will only ever be "good enough" at best. But, all things considered, in a "real" situation, the approach we need to take isn't going to be much different, and we'll start considering how next week. Until then, play around with
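If we do want to skip over the bad query rather than stop just before it, mysqlbinlog also accepts byte positions within the log via --start-position and --stop-position, which are exact where timestamps are fuzzy. A rough sketch, assuming hypothetical positions (1234 and 5678) bracketing the disastrous statement in a placeholder log file:

```shell
# Replay everything up to (but not including) the bad event...
mysqlbinlog --stop-position=1234 mysql-bin-from-old-server.000042 | mysql

# ...then resume from the event immediately after it.
# (1234 and 5678 are hypothetical positions; read the real ones out of
# the "# at NNN" comments in mysqlbinlog's decoded output.)
mysqlbinlog --start-position=5678 mysql-bin-from-old-server.000042 | mysql
```

This is the same idea as the datetime-bounded patch above, just with a surgical cut around the one event we want to drop.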
mysqlbinlog and add it to your in-case-of-disaster tool-belt.