Sometimes you just have to laugh at the crazy things that can kill a good evening.
I had this brilliant idea to change the replication setup on one of our Master-Master replication pairs this week. I was sick of having to restart MySQL every time we wanted to add a new database and have it included in the list of replicated databases - we were using replicate-do-db in our configs.
So it seemed very straightforward to switch to replicate-ignore-db, or replicate-ignore-table (because of cross-database updates).
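For context, the change amounts to a small my.cnf edit on the slaves - roughly this (option names are from the MySQL replication docs; the database names here are made up for illustration):

```ini
# Before: whitelist approach - every new database needs a config
# change plus a MySQL restart before it starts replicating.
#replicate-do-db = app_db1
#replicate-do-db = app_db2

# After: blacklist approach - replicate everything except what we
# explicitly ignore, so new databases pick up with no restart.
replicate-ignore-db = test
# The wild table form is safer when statements do cross-database
# updates, since replicate-ignore-db only filters on the default db.
replicate-wild-ignore-table = scratch%.%
```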
After a few weeks in QA and a few weeks in staging - no problems, no issues, no complaints... let's go for it!
Yeah, as soon as we deployed and restarted MySQL to pick up the configs, replication failed and stopped!
And of course, to make it lots of fun, replication on a couple of other servers failed at the same time for unrelated reasons, and then a migration of another application that night had issues too.
So I looked around a while, checked the MySQL Monitor ... and, voila - lost+found - "table does not exist" errors!
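For anyone chasing the same thing, the quickest way to see what killed replication is the slave status output (standard MySQL, nothing specific to our setup):

```sql
-- Run on the slave; \G formats the output vertically.
SHOW SLAVE STATUS\G
-- Check Slave_SQL_Running, Last_Errno, and Last_Error -
-- that's where the "table does not exist" message shows up.
```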
Yeah, ext3 rears its ugly head again. I'm sure it's probably my responsibility to make sure that lost+found directories are cleaned up, etc, etc, but it sure made for a headache this week.
The fix (knock on wood) was straightforward - we just added replicate-wild-ignore-table=lost%.% (the wild form, since plain replicate-ignore-table doesn't take patterns).
Seemed to do the trick. Maybe we should check out ReiserFS, XFS, or ZFS.
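In config terms, the workaround is just one more line in my.cnf on the slaves:

```ini
# MySQL treats the ext3 lost+found directory in the datadir as a
# database, so tell the slave to ignore anything that matches it.
# % is the wildcard: this matches every table in any "lost..." db.
replicate-wild-ignore-table = lost%.%
```

Restart the slave (or at least the slave threads) after adding it so the filter takes effect.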
Either way, if there's a way to break replication, I'm sure I'll find it... ;)
On that note - if you love the (Linux-based) file system you're using for your MySQL servers, I'd love to hear your comments (good, bad, or ugly).