Wednesday, April 23, 2008

MySQL: replicate-* rules should be dynamically configurable

I wonder what the best way is to get a feature request more visibility (that is, to convert it into an actual work item).

We use replicate-do-db on all our slave servers, so after many, many restarts of our slave servers, I checked the bug list for any feature requests surrounding this, and about 6 months ago, one was opened:
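For reference, these rules currently live only in the static server configuration and take effect only on restart, which is exactly the pain point. A minimal my.cnf sketch (the database names are just examples):

```ini
# my.cnf on the slave -- replicate-do-db can be repeated, one db per line.
# Changing this list today means editing the file and restarting mysqld.
[mysqld]
replicate-do-db = sales
replicate-do-db = reporting
```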

replicate-* rules should be dynamically configurable

Sadly, there does not appear to be a work log for it, nor any targeted version. It seems to me that it's probably a fairly common use case...

I suppose adding or removing a db from the replicate list might require restarting the slave threads that scan the binlogs for transactions to apply, or at least some way to notify those threads that additional databases need to be accounted for. Then we have to make sure that we start at the right point in the binlog, right? (What if the database has been active for a while, and then I try to dynamically add it to the replicate list?)
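The repositioning half of that problem at least has existing syntax today: after loading a fresh dump of the newly added database onto the slave, you point the slave at the binlog coordinates recorded when the dump was taken. The file name and position below are placeholders; the real values come from SHOW MASTER STATUS (or the dump header) at dump time:

```sql
-- Placeholders: substitute the coordinates captured with the dump.
STOP SLAVE;
CHANGE MASTER TO
    MASTER_LOG_FILE = 'mysql-bin.000042',
    MASTER_LOG_POS  = 107;
START SLAVE;
```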

There is a worklog item for having the master filter the database replication (to reduce network traffic), which has a proposed option of allowing this to be dynamic through the use of a CHANGE MASTER command. That might work for the dynamic replicate-* rules as well...
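If that proposal went through, one could imagine the slave-side rules being changed the same way. To be clear, this is purely hypothetical syntax I'm sketching, not anything MySQL actually supports:

```sql
-- HYPOTHETICAL syntax only: what a dynamic rule change might look like
-- if the CHANGE MASTER approach from that worklog were extended to
-- the replicate-* filters.
STOP SLAVE;
CHANGE MASTER TO REPLICATE_DO_DB = 'sales,reporting';
START SLAVE;
```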

Maybe this is an opportunity to open my MySQL Internals book, and see if I can pull those C programming skills out of my cobweb-filled dba brain...

Anyway - it would be interesting to know who all out there would like this as a feature...

Friday, April 18, 2008

Speaking at the Conference

I had a great time speaking about partitioning at the MySQL conference this week.

Special thanks go out to Mattias Jonsson as well for helping answer some of the questions afterwards. The room was packed - which makes me excited for the future of MySQL partitioning!

I hope everyone got something out of the session, and I'd gladly welcome any comments (positive or constructive).

I plan on doing some comparisons in the next few weeks between MySQL partitioning and PostgreSQL partitioning... (I'll let you all know what I like and don't like).

If you want a copy of the slides, you can download them from Google Docs here

Or just take a look:

Thursday, April 10, 2008

Just a DBA in a developers world...

Sometimes I wonder where this whole web development thing is going...

I was asked to setup some backup and recovery tools for the BerkeleyDB Java Edition (JE).

Hmm... DB... I could probably argue that DB should be dropped from the name if you add Java Edition to it. I suppose maybe it's Oracle's plan to drop BerkeleyDB for the newer, improved BerkeleyDB Java Edition (JE). Maybe that's why there won't be a BDB storage engine for MySQL beyond 5.1...

So after reading way too many pages of Oracle docs, I try just using a shell script with some fancy tar work and cross my fingers. That doesn't work, so I call on the Java developers to write a little copy tool using a backup class buried somewhere way down deep in those Java class libraries, plug that into my script, and now we have backups.
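The core of what that copy tool has to do isn't complicated: grab the environment's .jdb log files in order, oldest first, since JE's log files are its data. Here's a stdlib-only sketch of that ordered-copy idea (the real tool wraps a JE backup class to freeze the log file set first; class name, paths, and file layout here are illustrative, not the production code):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

// Illustrative sketch: copy a BerkeleyDB JE environment's .jdb log files
// in ascending name order (JE names them sequentially), so the backup set
// is taken oldest-first. The real tool should freeze the file set via
// JE's backup helper before copying; this only shows the copy step.
public class JeLogCopy {

    public static List<Path> copyLogs(Path envDir, Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        List<Path> copied = new ArrayList<>();
        try (Stream<Path> files = Files.list(envDir)) {
            List<Path> logs = files
                .filter(p -> p.getFileName().toString().endsWith(".jdb"))
                .sorted(Comparator.comparing(p -> p.getFileName().toString()))
                .collect(Collectors.toList());
            for (Path log : logs) {
                Path dest = backupDir.resolve(log.getFileName());
                Files.copy(log, dest, StandardCopyOption.REPLACE_EXISTING);
                copied.add(dest);
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JeLogCopy /path/to/je-env /path/to/backup
        if (args.length == 2) {
            List<Path> copied = copyLogs(Paths.get(args[0]), Paths.get(args[1]));
            System.out.println("Copied " + copied.size() + " log files");
        }
    }
}
```

A shell script can then tar up the backup directory, which is roughly how the "fancy tar work" plugs back in.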

All this to make sure we have instant DBA recovery for one of our Java applications, which uses Hibernate, which talks to Terracotta, which uses BerkeleyDB JE. No complexity there...

Of course, the next thing they ask is: does that mean we can restore to a point in time?

Well, that all depends. If Terracotta crashes and the filesystem is intact, BerkeleyDB JE has the ability to roll its logs forward and restore to the point in time of the last valid transaction at the time of the crash.

That won't do us much good, however, if we need to actually restore from a backup and then roll the logs forward, because BerkeleyDB JE uses an append-only write model: the log and the data are one and the same, so I can't get copies of the transaction logs on a periodic basis without getting the data as well. So, unless we plan on running backups every 15 minutes or so, we'd better not expect to restore to a point in time.
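And if every-15-minutes really were the compromise, the scheduling side at least is trivial. A crontab sketch (the script path is hypothetical; it would wrap whatever backup tool you end up with):

```
# Hypothetical wrapper script; runs a full JE backup every 15 minutes.
*/15 * * * * /usr/local/bin/je_backup.sh >> /var/log/je_backup.log 2>&1
```

The real cost isn't the cron line, it's copying the full data set that often, since the logs can't be shipped separately.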

So the next question is (of course):

How are we supposed to use Terracotta as a highly recoverable persistent data store then?

My Answer (these are my opinions, as a DBA, of course):

Don't, if you expect your persistent data store to act like a relational database engine.... :)

But don't worry... it's the DBA's responsibility to make sure the data is all recoverable... right?