Tuesday, November 25, 2008

MySQL Replication and bad assumptions

Sometimes I amaze myself with my capacity to make assumptions about how things should work, especially when it comes to test plans... (You know what happens when we assume, right?)

I had this great idea to set up a couple of slaves off a master-master replication set, something like this:



MASTER A <--------------> MASTER B
   |                          |
   |                          |
   V                          V
SLAVE C                    SLAVE D

Makes sense, huh?

Then I'll just point our read/write app servers at MASTER A, half of our read-only app servers at SLAVE C, and the other half of our read-only app servers at SLAVE D. Sweet. Works like a charm.

Oops... Replication shows current everywhere, no alerts going off. But app servers pointed at SLAVE C show higher row counts than app servers on SLAVE D...

Yea, I never bothered to test that. That's what I get for making assumptions - if you could even call it that. When I thought things through, I quickly realized that there was no reason for MASTER B to send any of its replicated updates on to SLAVE D, because I had told it not to write any of the updates it received via replication to its binary log... :)

All you replication experts out there could have probably told me that in your sleep, but of course, I never bothered to ask...

Just a couple lines in the config file could have saved me a few headaches, and a couple embarrassing apologies...

log-slave-updates
replicate-same-server-id = 0
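
For context, here's roughly how those lines would sit in the [mysqld] section of MASTER B's my.cnf (the server-id and log-bin values below are just illustrative, not my actual config):

[mysqld]
server-id                = 2       # MASTER B (example id)
log-bin                  = mysql-bin
log-slave-updates                  # write updates received via replication to the binary log
replicate-same-server-id = 0       # skip events that originated on this server (the default)

A quick SHOW VARIABLES LIKE 'log_slave_updates'; on MASTER B will tell you whether the option actually took effect.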

Figures.. ;)

Saturday, October 04, 2008

MySQL Partitioned Tables with Trigger enforced FK constraints

Well, I suppose it's true you can't use the official MySQL ALTER TABLE statement to do it, but if you're willing to trust your trigger-coding abilities, you can.

All you need is an extra table and a couple triggers.

The concept is fairly straightforward:

  1. Create a before insert trigger on the child table that validates the parent exists in the parent table.
  2. If there is no parent found, then insert a row into a third table that automatically forces an error (say, dup key on index)
  3. Create a before delete trigger on the parent table that validates there are no children in the child table.
  4. If there are children found, then insert a row into the third table that automatically forces an error (dup key on index)
It could also be extended to handle cascading deletes, but I'm personally not a fan of that, so I leave that to you... :)

Here's an example:
- Note: This example uses InnoDB tables, but it can be done with MyISAM tables just as well.



Parent Table (orders) :

CREATE TABLE IF NOT EXISTS orders
(
id INT NOT NULL AUTO_INCREMENT,
customer_name VARCHAR(30) NOT NULL,
total_amount DECIMAL,
order_date DATETIME ,
PRIMARY KEY (id,customer_name)
)
ENGINE=innodb
PARTITION BY RANGE (id)
(
PARTITION p1 VALUES LESS THAN (10),
PARTITION p2 VALUES LESS THAN (20),
PARTITION p3 VALUES LESS THAN (30),
PARTITION p4 VALUES LESS THAN (40),
PARTITION p_max VALUES LESS THAN MAXVALUE
);



Child Table (order_items) :


CREATE TABLE IF NOT EXISTS order_items
(
id INT NOT NULL AUTO_INCREMENT,
order_id INT NOT NULL ,
item_name VARCHAR(30) NOT NULL,
item_desc VARCHAR(4000),
item_amount DECIMAL,
PRIMARY KEY (id,order_id)
)
ENGINE=innodb
PARTITION BY HASH (order_id)
PARTITIONS 20
;



Special Table (fk_trigger) used to enforce FK:


CREATE TABLE IF NOT EXISTS fk_trigger
(
id INT NOT NULL PRIMARY KEY DEFAULT 0
)
ENGINE=innodb
;



Insert the default row into the special table (fk_trigger);
once this row exists, any further default insert will hit a duplicate-key error:


INSERT INTO fk_trigger VALUES();



Child Trigger (order_item_fk_trig) used to check
that parent row exists before insert:


DELIMITER /

CREATE TRIGGER order_item_fk_trig
BEFORE INSERT ON order_items
FOR EACH ROW BEGIN
DECLARE my_count INT;
SELECT count(*)
INTO my_count
FROM orders
WHERE id = NEW.order_id;

-- No parent row found: force a duplicate-key error via fk_trigger
IF (my_count = 0) THEN
INSERT INTO fk_trigger VALUES ();
END IF;
END;
/

DELIMITER ;



Parent Trigger (order_fk_rev_trig) used to check
that no child rows remain before delete:


DELIMITER /

CREATE TRIGGER order_fk_rev_trig
BEFORE DELETE ON orders
FOR EACH ROW BEGIN
DECLARE my_count INT;
SELECT count(*)
INTO my_count
FROM order_items
WHERE order_id = OLD.id;

-- Child rows still exist: force a duplicate-key error via fk_trigger
IF (my_count > 0) THEN
INSERT INTO fk_trigger VALUES ();
END IF;
END;
/

DELIMITER ;



Insert some data into the parent table

INSERT INTO orders
(customer_name,total_amount,order_date)
VALUES
('Customer 1',300.00,now()),
('Customer 2',600.00,now());



Insert some data into the child table

INSERT INTO order_items
(order_id,item_name,item_desc, item_amount)
VALUES
(1,'This 1','Stuff 1',100),
(1,'This 2','Stuff 2',100),
(1,'This 3','Stuff 3',100),
(2,'This 1','Stuff 1',200),
(2,'This 2','Stuff 2',200),
(2,'This 3','Stuff 3',200);



Try to delete order #1 (without deleting its children)

DELETE FROM orders
WHERE id = 1;


Results:

ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'



Try to insert order_items for order #3 (which doesn't exist)

INSERT INTO order_items
(order_id,item_name,item_desc, item_amount)
VALUES
(3,'This 3','Stuff 1',100);

Results:

ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'
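
And just to confirm the happy path still works (a follow-on check of my own, not from the original write-up): delete the children first, then the parent goes through cleanly:

DELETE FROM order_items
WHERE order_id = 1;

DELETE FROM orders
WHERE id = 1;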


Wednesday, October 01, 2008

MySQL Data Woes... or, Making Use of the Information_Schema

To be fair - I highly doubt it has anything to do with MySQL.

Database corruption - that's a different issue, often related to a bug in database code; but this was a data corruption issue, which is always a pain in the you know what to figure out and get fixed.

I spent the last day trying to figure out the weirdness in a couple of our databases, digging around, explaining to developers that there's no way it could be an issue with the database - it must be in their caching layer somewhere.

It never really occurred to me that the issue might be the data itself... duh. Especially when it turned out to be an issue with a few fields in a few rows with a little extra utf8 character in front of the actual data that happened to map to an ASCII zero (end of line char, I think). Nice.

Now we have to go check every stinkin' database, every table, every char-based column, and look for that character.... Ugh.

Oh yea - we have an information_schema now in 5.1.x ... :)

So about an hour later, after a quick code review with the developers and some validation tests, we had a cool little data-issue checker, built on the information_schema database, that we could quickly run everywhere. I'm sure there's a slicker way to do it (and who knows, there might even be one up on MySQL Forge already). I suppose I should have checked there first, but then again, it's always fun to re-invent the wheel...

Here's what we came up with:


DELIMITER /

DROP PROCEDURE IF EXISTS check_invalid_data /

CREATE PROCEDURE check_invalid_data(db_to_check varchar(128)) -- pass in db name
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE sql_str TEXT;
DECLARE exec_cur CURSOR FOR
SELECT check_string FROM validate_data;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

-- Temp table holding one generated check statement per char-based column
DROP TEMPORARY TABLE IF EXISTS validate_data;

CREATE TEMPORARY TABLE validate_data
(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
check_string TEXT
);

-- Temp table holding the columns that actually contain the bad leading character
DROP TEMPORARY TABLE IF EXISTS validate_results;

CREATE TEMPORARY TABLE validate_results
(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
table_info VARCHAR (256),
column_info VARCHAR (256),
column_length INT
);

-- Build one INSERT ... SELECT per character-type column in the target schema,
-- flagging non-empty values whose first character maps to ASCII 0
INSERT INTO validate_data (check_string)
SELECT CONCAT ('INSERT INTO validate_results (table_info,column_info, column_length) SELECT ''',table_name,''',''',
column_name, ''',LENGTH(',column_name,') FROM ', db_to_check ,
'.' , table_name, ' WHERE ASCII (', column_name ,' ) = 0 AND LENGTH (', column_name ,') > 0 ;')
FROM information_schema.columns
WHERE table_schema = db_to_check
AND data_type IN ('char','varchar','text','mediumtext','tinytext','long','long varchar','enum','set');

-- Run each generated statement via a prepared statement
OPEN exec_cur;

REPEAT
FETCH exec_cur INTO sql_str;
SET @exec_str = sql_str;
IF NOT done THEN
PREPARE validate_stmt FROM @exec_str;
EXECUTE validate_stmt;
DEALLOCATE PREPARE validate_stmt;
END IF;
UNTIL done END REPEAT;

CLOSE exec_cur;

SELECT * FROM validate_results;
END
/
DELIMITER ;
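
To run it against a given schema, just call it with the database name (the app_db below is only an example):

mysql> CALL check_invalid_data('app_db');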

Sunday, September 28, 2008

Where are all the good MySQL DBAs?

We've been looking for a good MySQL DBA at work for a little over a year now. At this point I'm beginning to wonder if DBAs are a dying breed altogether...

Maybe it's the fact that we're in the Northwest (Seattle), or perhaps they've all gone to work for Percona?

Perhaps everyone has more fun developing? (We haven't had much trouble filling our Sr. Development positions or QA positions..)

It's getting to the point that I've actually tried to figure out how practical it would be to integrate a remote DBA group like Percona, the Pythian Group, ProvenScaling, etc., etc... but that gets very complicated and impractical from the standpoint that we really need a DBA who can play on both sides of the fence (operations and development) and work on a daily basis with our development group and QA teams on multiple projects using the Agile development methodology.

I know we're not the only ones in this situation up here ( and maybe that's half the problem).

Hmm... maybe there will be a couple good WAMU DBAs looking soon.... ;)

Well, if you're a decent DBA, enjoy working with open source, like the northwest, and are looking for a new, challenging position, send me an e-mail ... ( phil.hildebrand@gmail.com )

Wednesday, June 11, 2008

MySQL: Improving your skills with Forums

I haven't been a big user of the MySQL Forums till recently.

I'm not sure why - partly because I like to think I know something about what I do (I just need to remember that there is always someone else who knows more), partly because I have a need to re-invent the wheel as often as possible, and partly because I don't like waiting around for an answer...

It can be a little overwhelming when you look at the list of all of them... everything from the newbie forum to the forum for creating a custom storage engine. I guess that's also what makes them useful, if you know how to use them to your advantage.

Here are a few of my suggestions/tips for getting the most out of the MySQL Forums:

  1. The Forums are what we make them. The more you use them, the better they get.
  2. Don't be afraid to answer a post. Sure, test your suggestion if you can, or at least think about it before you post it, but if you've run across the issue before, or have a good idea of where to dig in further, then by all means post!
  3. Don't be concerned about someone else having a better idea.... someone probably will, and that's your opportunity to learn (this has held me back from posting many times - thinking that someone might have a better answer, so I'd wait for them to post).
  4. Be as precise as possible when posting a question. If you can, include a test case or a snippet of code.
  5. Include any trouble shooting steps you've already tried in your post.
  6. If you solve your issue, remember to post your solution so that the answer is there when I go looking for the answer to that same issue :)
  7. Be nice... I don't always know what I'm talking about, but at least I'm trying to contribute...
  8. If you post it, everybody sees it... (a note about security, proprietary info, etc)
  9. If you have a lot of experience with one area (say, partitioning...) then add that Forum to your google reader, and try to answer as many posts as often as you can. It will encourage more use of the forum, and help you keep your skills sharp as well!
  10. If you don't get an answer in one Forum, you might want to check another that has a similar subject. ( A replication issue with MyISAM tables might get an answer in the MyISAM forum if one doesn't come out of the Replication forum)
If you have other suggestions - let me know (and especially if you disagree with any of my suggestions....)

Wednesday, April 23, 2008

MySQL: replicate-* rules should be dynamically configurable

I wonder what the best way is to get a feature request more visibility (convert a feature request to an actual work item).

We use replicate-do-db on all our slave servers, so after many, many restarts of our slave servers, I checked the bug list for any feature requests surrounding this; about 6 months ago, one was opened:

replicate-* rules should be dynamically configurable

Sadly, there does not appear to be a work log for it, nor any targeted version. It seems to me that it's probably a fairly common use case...
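
For reference, this is the sort of static filtering I mean - a rough sketch of a slave's my.cnf (the database names are just examples); today, changing any of these lines means bouncing mysqld:

[mysqld]
replicate-do-db = app_db          # only apply replicated statements for these schemas
replicate-do-db = reporting_db    # the option is repeated once per database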

I suppose adding or removing a db from the replicate list might require restarting the slave processes that scan binlogs for transactions to apply, or at least some way to notify the slave processes that additional databases need to be accounted for. Then we have to make sure that we start at the right point in the binlog, right? (What if the database has been active for a while, and then I try to dynamically add it to the replicate list?)

There is a worklog item for having the master filter the database replication (to reduce network traffic), which has a proposed option of allowing this to be dynamic through the use of a CHANGE MASTER command. That might work for the dynamic replicate-* rules as well...

Maybe this is an opportunity to open my MySQL Internals book, and see if I can pull those C programming skills out of my cobweb filled dba brain...

Anyway - it would be interesting to know who all out there would like this as a feature...

Friday, April 18, 2008

Speaking at the Conference

I had a great time speaking about partitioning at the MySQL conference this week.

Special thanks go out to Mattias Jonsson as well for helping answer some of the questions afterwards. The room was packed - which makes me excited for the future of MySQL partitioning!

I hope everyone got something out of the session, and I'd gladly welcome any comments (positive or constructive).

I plan on doing some comparisons in the next few weeks between MySQL partitioning and PostgreSQL partitioning... (I'll let you all know what I like and don't like).

If you want a copy of the slides, you can download them from Google Docs here


Thursday, April 10, 2008

Just a DBA in a developers' world...

Sometimes I wonder where this whole web development thing is going...

I was asked to setup some backup and recovery tools for the BerkeleyDB Java Edition (JE).

Hmm... DB... I could probably argue that DB should be dropped from the name if you add Java Edition to it. I suppose maybe it's Oracle's plan to drop BerkeleyDB for the newer, improved BerkeleyDB Java Edition (JE). Maybe that's why there won't be a BDB storage engine for MySQL beyond 5.1...

So after reading way too many pages of Oracle docs, I try just using a shell script with some fancy tar work and cross my fingers. That doesn't work, so I call on the Java developers to write a little copy tool using a jdb backup class buried way down deep in those Java class libraries somewhere, plug that into my script, and now we have backups.

All this to make sure we have instant DBA recovery for one of our Java applications, that uses Hibernate, that talks to Terracotta, that uses the BerkeleyDB JE. No complexity there...

Of course, the next thing they ask is does that mean we can restore to a point in time?

Well, that all depends. If Terracotta crashes and the filesystem is intact, BerkeleyDB JE has the ability to roll its logs forward and restore to the last valid transaction at the time of the crash.

That won't do us much good, however, if we need to actually restore from a backup and then roll the logs forward, because BerkeleyDB JE uses an append-only write model: the log and the data are one and the same, so I can't get copies of the transaction logs on a periodic basis without getting the data as well. So unless we plan on running backups every 15 minutes or so, we'd better not expect to restore to a point in time.

So the next question is (of course):

How are we supposed to use Terracotta as a highly recoverable persistent data store then?


My Answer (these are my opinions, as a DBA, of course):

Don't, if you expect your persistent data store to act like a relational database engine.... :)


But don't worry... it's the DBA's responsibility to make sure the data is all recoverable... right?

Monday, March 31, 2008

MySQL: Getting Creative with Partitioning (Performance Results)

I decided to run some very basic performance tests comparing a non-partitioned table with a primary key against a partitioned table with a primary key and a unique constraint enforced via a secondary table, as explained in my previous post.

Overall, it appears that with partitioning, as the data/rows scale, the inserts actually get faster :) This is what I would expect theoretically, so score one for partitioning.

That said, I'm not a benchmark tester, I'm a DBA, so I'm sure plenty of folks could explain why my testing isn't exactly accurate. For the purpose of the tests, I turned off the query cache on the servers (though, since the tests were primarily inserts, it may not have made much difference). Tables were created with engine=innodb as the default.

I do believe it's enough to suggest going forward with this and doing scale testing on an application, if you require unique keys that had to be folded into the primary key due to partitioning restrictions in MySQL.

Here are the tables created for the test:

Sequence Tables:

CREATE TABLE my_test_sequence
(
id integer NOT NULL AUTO_INCREMENT,
PRIMARY KEY (id)
) AUTO_INCREMENT = 1001;

CREATE TABLE my_test_part_sequence
(
id integer NOT NULL AUTO_INCREMENT,
PRIMARY KEY (id)
) AUTO_INCREMENT = 1001;


Non Partitioned Table with PK and Unique Constraint:

CREATE TABLE test_base
(
id INTEGER NOT NULL AUTO_INCREMENT,
other_id INTEGER NOT NULL ,
name varchar(64) ,
PRIMARY KEY (id),
UNIQUE KEY (other_id)
);

Partitioned Table with PK without Unique Constraint:

CREATE TABLE test_part
(
id INTEGER NOT NULL AUTO_INCREMENT,
other_id INTEGER NOT NULL ,
name varchar(64),
PRIMARY KEY (id,other_id)
)
PARTITION BY KEY (id)
PARTITIONS 50;

Partitioned Enforcement Table for the Unique Constraint (Partitions Suggested by Mikael Ronstrom):

CREATE TABLE test_unique_part
(
other_id INTEGER NOT NULL,
PRIMARY KEY (other_id)
)
PARTITION BY KEY (other_id)
PARTITIONS 50;

Trigger for checking unique constraint:

DELIMITER /

CREATE TRIGGER enforce_unique
AFTER INSERT ON test_part
FOR EACH ROW BEGIN
INSERT INTO test_unique_part SET other_id = NEW.other_id;
END;
/

DELIMITER ;

test_base.sql insert scripts:

INSERT INTO my_test_sequence VALUES ('');
INSERT INTO test_base (other_id,name) VALUES (last_insert_id(), rand());
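
For the partitioned runs, test_part.sql would presumably just mirror the above against the partitioned tables - something like this (my reconstruction; the original script isn't shown):

INSERT INTO my_test_part_sequence VALUES ('');
INSERT INTO test_part (other_id,name) VALUES (last_insert_id(), rand());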


First 50,000 Rows (0 to start):


Non-Partitioned with just PK:

$ time mysqlslap -u testing --create-schema=test --query=test_base.sql -c 10 -i 5000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.100 seconds
Minimum number of seconds to run all queries: 0.028 seconds
Maximum number of seconds to run all queries: 0.211 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 8m26.624s
user 0m1.154s
sys 0m3.473s

Partitioned with alternate table for unique Index:
$ time mysqlslap -u testing --create-schema=test --query=test_part.sql -c 10 -i 5000 -F ";"

Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s

Second 100,000 Rows (50,000 to start):

Non-Partitioned with just PK:

$ time mysqlslap -u root --create-schema=test --query=test_base.sql -c 10 -i 10000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s

Partitioned with alternate table for unique Index:
$ time mysqlslap -u root --create-schema=test --query=test_part.sql -c 10 -i 10000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s

Third 500,000 Rows (150,000 to start):

Non-Partitioned with just PK:

$ time mysqlslap -u root --create-schema=test --query=test_base.sql -c 10 -i 50000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.081 seconds
Minimum number of seconds to run all queries: 0.018 seconds
Maximum number of seconds to run all queries: 2.069 seconds
Number of clients running queries: 10
Average number of queries per client: 2


real 68m20.086s
user 0m10.791s
sys 0m31.492s

Partitioned with alternate table for unique Index:

$ time mysqlslap -u root --create-schema=test --query=test_part.sql -c 10 -i 50000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.057 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.746 seconds
Number of clients running queries: 10
Average number of queries per client: 2


real 48m50.993s
user 0m8.341s
sys 0m24.062s

Monday, March 24, 2008

MySQL: Getting Creative with Partitioning

Lately, I've been trying to keep up with at least one of the MySQL Forums: Partitioning.

It's a great way to keep on top of issues surrounding partitioning, and also get an idea of what people are trying to do with the new 5.1 feature. Richard came up with an interesting problem that I jumped into only to realize that I hadn't done my homework, and my initial suggestion wouldn't work at all due to documented restrictions on partitioning in MySQL 5.1. :(

So, the question is, if I can't have a unique index on a partitioned table that does not contain all columns of the partition key, is there some other way to enforce that uniqueness?

We could use a trigger on the partitioned table that tries to insert/update into another, non-partitioned table that carries the unique constraint:


CREATE TABLE part_table
(
id INTEGER NOT NULL,
other INTEGER NOT NULL,
PRIMARY KEY (id,other)
)
PARTITION BY KEY (id)
PARTITIONS 20;


The restriction can be seen by trying to create a unique index on column 'other':

mysql> CREATE UNIQUE INDEX test ON part_table (other);
ERROR 1503 (HY000):
A UNIQUE INDEX must include all columns in the table's partitioning function
mysql>


Create a new table for constraining the index that is not partitioned:

CREATE TABLE unique_index_table
(
id INTEGER NOT NULL,
PRIMARY KEY (id)
) ;


Create a trigger on the partitioned table for enforcing the uniqueness:

DELIMITER /

CREATE TRIGGER enforce_unique
BEFORE INSERT ON
part_table
FOR EACH ROW BEGIN
INSERT INTO
unique_index_table SET id = NEW.id;
END;
/

DELIMITER ;


Now, enforcement of the unique index can be done on the partitioned table:

mysql>INSERT INTO part_table VALUES (1,1);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (1,2);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
mysql> INSERT INTO part_table VALUES (1,1);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
mysql> INSERT INTO part_table VALUES (2,1);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (3,3);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (3,2);
ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY'
mysql>


So now it really becomes a question of performance. Functionally it works, but is it
really the best workaround?

I'll run some tests comparing the performance of this partitioned table with a trigger-enforced
unique index against a non-partitioned table with a regular unique index,
and see how the results look.

I'll post them as soon as I have the results. If anyone has a better way, I'm interested in knowing....

Wednesday, March 19, 2008

MySQL: Finally an ability to trace/profile

Finally! The ability to look a little closer into what's happening with SHOW PROFILE


Here’s how it works:

mysql> set profiling=1;
mysql> select count(*) from mysql.user;
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set (0.00 sec)

mysql> show profile;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000020 |
| checking query cache for query | 0.000250 |
| Opening tables | 0.000015 |
| System lock | 0.000010 |
| Table lock | 0.000009 |
| init | 0.000022 |
| optimizing | 0.000007 |
| executing | 0.000018 |
| end | 0.000004 |
| query end | 0.000002 |
| freeing items | 0.000019 |
| logging slow query | 0.000002 |
| cleaning up | 0.000002 |
+--------------------------------+----------+
13 rows in set (0.01 sec)

In this case, we also get info on inserts (where you don’t with explain)

mysql> create table test (id integer not null auto_increment, my_name varchar(20),primary key (id));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test (my_name) values ('Phil');
Query OK, 1 row affected (0.01 sec)

mysql> show profile;

+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000069 |
| checking permissions | 0.000008 |
| Opening tables | 0.000134 |
| System lock | 0.000004 |
| Table lock | 0.000006 |
| init | 0.000017 |
| update | 0.000093 |
| end | 0.000003 |
| query end | 0.000003 |
| freeing items | 0.002233 |
| logging slow query | 0.000003 |
| cleaning up | 0.000002 |
+----------------------+----------+

12 rows in set (0.00 sec)
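
If you've run several statements in the session, SHOW PROFILES lists them with query IDs, and you can pull the profile for a specific one (the query ID 2 below is just an example) or narrow the output to particular resources:

mysql> SHOW PROFILES;
mysql> SHOW PROFILE FOR QUERY 2;
mysql> SHOW PROFILE CPU, BLOCK IO FOR QUERY 2;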

Here are the steps I used to build the binaries for MySQL 5.1.24:

1) install bitkeeper client http://www.bitmover.com/bk-client2.0.shar
shell> /bin/sh bk-client2.0.shar
shell> cd bk-client2.0
shell> make
shell> make install

2) grab the current 5.1 source tree
shell> bkf clone bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

3) Prepare the source for building
shell> cd mysql-5.1
shell> ./BUILD/autorun.sh

4) Build the snapshot
shell> CC=gcc CXX=gcc ./configure --prefix=/opt/app/mysql \
--localstatedir=/opt/app/mysql/data --libexecdir=/opt/app/mysql/bin \
--with-server-suffix="" --enable-thread-safe-client --enable-local-infile \
--with-pic --with-fast-mutexes --with-client-ldflags="-static" \
--with-mysqld-ldflags="-static" --with-zlib-dir=bundled --with-big-tables \
--with-ssl --with-readline --with-embedded-server --with-archive-storage-engine \
--with-blackhole-storage-engine --with-ndbcluster --with-csv-storage-engine \
--with-example-storage-engine --without-falcon --with-federated-storage-engine \
--with-partition --with-innodb --with-extra-charsets=all --enable-community-features \
--with-comment="MySQL Enterprise Server (GPL) - Snapshot from `date +'%m-%d-%Y'`"
shell> make
shell> make test

5) Make the final installer package
shell> ./scripts/make_binary_distribution

6) Then just install the binaries with the file mysql-5.1.24-rc-linux-x86_64.tar.gz


Tuesday, March 18, 2008

Best Practices - Oops...

Yea, yea, yea... best practices.

I spent about a week troubleshooting a MySQL 5.1 replication problem where certain transactions caused duplicate-key-on-index errors that stopped replication. Easy enough to fix, right?


mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> start slave;

Hmm... nice until it happens almost every minute...

So, after reading every link I could find on the internet, digging through forum posts, and upgrading to 5.1.24, I finally started to think I just didn't know what I was doing with replication.

Last ditch effort... ask dev and application engineers to help out...

"Oh... you mean one of the application servers is pointed at the slave server in read/write mode and is trying to insert data there as well?..."

:)

Yea, that's why it's a best practice to have your slave servers set READ ONLY .... duh...
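
For the record, it's a one-liner either way - dynamically or in the slave's my.cnf (note that read_only doesn't apply to users with the SUPER privilege or to the slave SQL thread, so replication keeps applying updates):

mysql> SET GLOBAL read_only = ON;

# or, persistently, in my.cnf on the slave:
[mysqld]
read-only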

Anyway, live and learn..

Friday, January 18, 2008

MySQL: Innodb internals

Went to the MySQL Performance Tuning Class last week. I would highly recommend it to any DBA new to MySQL, or any MySQL DBA looking to boost their tuning skills and understanding of some of the various features as they relate to performance.

The next few posts will have to do with information I gleaned from that class...

It's always good to understand what's happening underneath the covers when it comes to database performance and evaluating database architectures/designs for new products or features. One of the great things about the InnoDB engine is its Oracle-ish approach to transaction management. The key is 'Oracle-ish'. As an Oracle DBA looking at MySQL, it's easy to assume that things like undo tablespaces and rollbacks work the same in InnoDB as they do in Oracle.

Well, you know what happens when we 'Assume'....

One of the key differences is that InnoDB does not have an undo tablespace or rollback segment per se. InnoDB uses the concept of versioned rows, not unlike SQL Server 2005's row versioning. SQL Server implements it by storing versions of the row being changed in the tempdb database (though only if the option is turned on); InnoDB implements it by storing versions of the row being changed within the tablespace itself.

Hmm... the obvious implication is fragmentation. As a row gets updated by multiple threads, or held open by a long-running transaction, InnoDB must keep copies of the row with different version numbers (rowid 1/version 1, rowid 1/version 2, etc.) within the tablespace so that a select statement still has access to that row while it is potentially being changed by some other transaction. This lets selects avoid blocking updates, but when a transaction commits, the old versioned rows are marked for delete, which can then cause a high level of fragmentation within the table and/or tablespace.

So we as DBAs need to consider things like this when we give advice on which engine should be used for a given application or table with MySQL as a back end, as well as where tables should reside on disk, and whether it's OK to keep all tables in one tablespace (the InnoDB default) or to use the file_per_table option so that fragmentation stays local to the associated tables/indexes. It may also affect what kind of maintenance is needed, and how often, with respect to rebuilds/defrags....
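
For what it's worth, enabling the per-table tablespaces mentioned above is just a config line (it only affects tables created after the setting is in place; existing tables stay in the shared tablespace until rebuilt):

[mysqld]
innodb_file_per_table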