Pages

Showing posts with label Partitioning. Show all posts
Showing posts with label Partitioning. Show all posts

Friday, January 23, 2009

Partition Management....(Don't forget the Management)

You'd think by now I'd know better than to cut corners. But for some reason, I still somehow seem to decide once in a while for short term gains over long term benefits...

A while back I was overloaded with stuff so I set up a year's worth of new monthly partitions in one of our reporting databases. I did not, however, chose to create any nice job to automatically create new partitions or anything like that, as I did not have time (or so I thought).

Well, 'A while back' is now 18 months ago.... Hmm... 6 months of data in one partition. Ouch.

Might not be a big deal with Oracle, but it's SQLServer 2005. Sadly, just splitting 1 day of data into a new partition takes 10 minutes and 9 GB of transaction log space. Very depressing. (At least I tested and timed it before I tried to split one month's data into a new partition).

Kills me some times. Don't put off till tomorrow what you can do today, if tomorrow might become next year... ;)

Hopefully I'll come up with a slick way to split those partitions faster... we'll see.

And yes, I'm going to create a job to add partitions in advance.... :)

Saturday, October 04, 2008

MySQL Partitioned Tables with Trigger enforced FK constraints

Well, I suppose its' true you can't use the official MySQL ALTER TABLE statement to do it, but if you're willing to trust your trigger coding abilities you can.

All you need is an extra table and a couple triggers.

The concept is fairly straight forward:

  1. Create a before insert trigger on the child table that validates the parent exists in the parent table.
  2. If there is no parent found, then insert a row into a third table that automatically forces an error (say, dup key on index)
  3. Create a before delete trigger on the parent table that validate there are no children in the child table.
  4. If there are children found, then insert a row into the third table that automatically forces an error (dup key on index)
It could also be extended to handle cascading deletes, but I'm personally not a fan of that, so I leave that to you... :)

Here's an example:
- Note: This example uses Innodb tables, but it can be done with MyISAM tables just as well.



Parent Table (orders) :

CREATE TABLE IF NOT EXISTS orders
(
id INT NOT NULL AUTO_INCREMENT,
customer_name VARCHAR(30) NOT NULL,
total_amount DECIMAL,
order_date DATETIME ,
PRIMARY KEY (id,customer_name)
)
ENGINE=innodb
PARTITION BY RANGE (id)
(
PARTITION p1 VALUES LESS THAN (10),
PARTITION p2 VALUES LESS THAN (20),
PARTITION p3 VALUES LESS THAN (30),
PARTITION p4 VALUES LESS THAN (40),
PARTITION p_max VALUES LESS THAN MAXVALUE
);



Child Table (order_items) :


CREATE TABLE IF NOT EXISTS order_items
(
id INT NOT NULL AUTO_INCREMENT,
order_id INT NOT NULL ,
item_name VARCHAR(30) NOT NULL,
item_desc VARCHAR(4000),
item_amount DECIMAL,
PRIMARY KEY (id,order_id)
)
ENGINE=innodb
PARTITION BY HASH (order_id)
PARTITIONS 20
;



Special Table (fk_trigger) used to enforce FK:


CREATE TABLE IF NOT EXISTS fk_trigger
(
id INT NOT NULL PRIMARY KEY DEFAULT 0
)
ENGINE=innodb
;



Insert default row into special table (fk_trigger)
that will force all inserts to fail:


INSERT INTO fk_trigger VALUES();



Child Trigger (order_item_fk_trig) used to check
that parent row exists before insert:


DELIMITER /

CREATE TRIGGER order_item_fk_trig
BEFORE INSERT ON order_items
FOR EACH ROW BEGIN
DECLARE my_count INT;
SELECT count(*)
INTO @my_count
FROM orders
WHERE id = NEW.order_id;

IF (@my_count = 0) THEN
INSERT INTO fk_trigger VALUES ();
END IF;
END;
/

DELIMITER ;



Parent Trigger (order_fk_trig) used to check
that child row exists before delete:


DELIMITER /

CREATE TRIGGER order_fk_rev_trig
BEFORE DELETE ON orders
FOR EACH ROW BEGIN
DECLARE my_count INT;
SELECT count(*)
INTO @my_count
FROM order_items
WHERE order_id = OLD.id;

IF (@my_count > 0) THEN
INSERT INTO fk_trigger VALUES ();
END IF;
END;
/

DELIMITER ;



Insert some data into the parent table

INSERT INTO orders
(customer_name,total_amount,order_date)
VALUES
('Customer 1',300.00,now()),
('Customer 2',600.00,now());



Insert some data into the child table

INSERT INTO order_items
(order_id,item_name,item_desc, item_amount)
VALUES
(1,'This 1','Stuff 1',100),
(1,'This 2','Stuff 2',100),
(1,'This 3','Stuff 3',100),
(2,'This 1','Stuff 1',200),
(2,'This 2','Stuff 2',200),
(2,'This 3','Stuff 3',200);



Try to delete order #1 (without deleting it's children)

DELETE FROM orders
WHERE id = 1;


Results:

ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'



Try to insert orderitems for order #3 (Which doesn't exist)

INSERT INTO order_items
(order_id,item_name,item_desc, item_amount)
VALUES
(3,'This 3','Stuff 1',100);

Results:

ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'


Friday, April 18, 2008

Speaking at the Conference

I had a great time speaking about partitioning at the MySQL conference this week.

Special thanks go out to Mattias Jonsson as well for helping answer some of the questions afterwards. The room was packed - which makes me excited for the future of MySQL partitioning!

I hope everyone got something out of the session, and I'd gladly welcome any comments (positive or constructive).

I plan on doing some comparisons in the next few weeks between MySQL partitioning and PostgreSQL partitioning... (I'll let you all know what I like and don't like).

If you want a copy of the slides, you can download it from google docs here

Or just take a look:

Monday, March 31, 2008

MySQL: Getting Creative with Partitioning (Performance Results)

I decided to run some very basic performance test comparing the non-partitioned table with a primary key, and a partitioned table with a primary key and a unique constraint enforced via a secondary table explained in my previous post.

Overall, it appears that with partitioning, as the data/rows scale, the inserts actually get faster :) This is what I would expect theoretically, so score one for partitioning.

That said, I'm not a benchmark tester, I'm a DBA, so I'm sure plenty of folks could explain why my testing isn't exactly accurate. For the purpose of the tests, I turned off query cache on the servers (though, since the tests were run with primarily inserts, it may not have made much difference). Tables were created with engine=innodb as default.

I do believe, it's enough to suggest going forward with this and doing scale testing on an application with this if you require unique keys that are part of the primary key due to partitioning restrictions in MySQL.

The tables are created for the test:

Sequence Tables:

CREATE TABLE my_test_sequence
(
id integer NOT NULL AUTO_INCREMENT,
PRIMARY KEY (id)
) AUTO_INCREMENT = 1001;

CREATE TABLE my_test_part_sequence
(
id integer NOT NULL AUTO_INCREMENT,
PRIMARY KEY (id)
) AUTO_INCREMENT = 1001;


Non Partitioned Table with PK and Unique Constraint:

CREATE TABLE test_base
(
id INTEGER NOT NULL AUTO_INCREMENT,
other_id INTEGER NOT NULL ,
name varchar(64) ,
PRIMARY KEY (id),
UNIQUE KEY (other_id)
);

Partitioned Table with PK without Unique Constraint:

CREATE TABLE test_part
(
id INTEGER NOT NULL AUTO_INCREMENT,
other_id INTEGER NOT NULL ,
name varchar(64),
PRIMARY KEY (id,other_id)
)
PARTITION BY KEY (id)
PARTITIONS 50;

Partitioned Enforcement Table with for Unique Constraint (Partitions Suggested by Mikael Ronstrom):

CREATE TABLE test_unique_part
(
other_id INTEGER NOT NULL,
PRIMARY KEY (other_id)
)
PARTITION BY KEY (other_id)
PARTITIONS 50;

Trigger for checking unique constraint:

DELIMITER /

CREATE TRIGGER enforce_unique
AFTER INSERT ON test_part
FOR EACH ROW BEGIN
INSERT INTO test_unique_part SET other_id = NEW.other_id;
END;
/

DELIMITER ;

test_base.sql insert scripts:

INSERT INTO my_test_sequence VALUES ('');
INSERT INTO test_base (other_id,name) VALUES (last_insert_id(), rand());


First 50,000 Rows (0 to start):


Non-Partitioned with just PK:

$ time mysqlslap -u testing --create-schema=test --query=test_base.sql -c 10 -i 5000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.100 seconds
Minimum number of seconds to run all queries: 0.028 seconds
Maximum number of seconds to run all queries: 0.211 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 8m26.624s
user 0m1.154s
sys 0m3.473s

Partitioned with alternate table for unique Index:
$ time mysqlslap -u testing --create-schema=test --query=test_part.sql -c 10 -i 5000 -F ";"

Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s

Second 100,000 Rows (50,000 to start):

Non-Partitioned with just PK:

$ time mysqlslap -u root --create-schema=test --query=test_base.sql -c 10 -i 10000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s
Partitioned with alternate table for unique Index:
$ time mysqlslap -u root --create-schema=test --query=test_part.sql -c 10 -i 10000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.051 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.376 seconds
Number of clients running queries: 10
Average number of queries per client: 2

real 4m23.382s
user 0m0.892s
sys 0m2.369s

Third 500,000 Rows (150,000 to start):

Non-Partitioned with just PK:

$ time mysqlslap -u root --create-schema=test --query=test_base.sql -c 10 -i 50000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.081 seconds
Minimum number of seconds to run all queries: 0.018 seconds
Maximum number of seconds to run all queries: 2.069 seconds
Number of clients running queries: 10
Average number of queries per client: 2


real 68m20.086s
user 0m10.791s
sys 0m31.492s

Partitioned with alternate table for unique Index:

$ time mysqlslap -u root --create-schema=test --query=test_part.sql -c 10 -i 50000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.057 seconds
Minimum number of seconds to run all queries: 0.020 seconds
Maximum number of seconds to run all queries: 0.746 seconds
Number of clients running queries: 10
Average number of queries per client: 2


real 48m50.993s
user 0m8.341s
sys 0m24.062s

Monday, March 24, 2008

MySQL: Getting Creative with Partitioning

Lately, I've been trying to keep up with at least one of the MySQL Forums: Partitioning.

It's a great way to keep on top of issues surrounding partitioning, and also get an idea of what people are trying to do with the new 5.1 feature. Richard came up with an interesting problem that I jumped into only to realize that I hadn't done my homework, and my initial suggestion wouldn't work at all due to documented restrictions on partitioning in MySQL 5.1. :(

So, the question is, if I can't have a unique index on a partitioned table that does not contain all columns of the partition key, is there some other way to enforce that uniqueness?

We could use a trigger on the partitioned table that try's to insert / update into another non-partitioned table that contains that unique constraint:


CREATE TABLE part_table
(
id INTEGER NOT NULL,
other INTEGER NOT NULL,
PRIMARY KEY (id,other)
)
PARTITION BY KEY (id)
PARTITIONS 20;


So the restriction can be seen by trying to create a unique index now on column 'other' :

mysql> CREATE UNIQUE INDEX test ON part_table (other);
ERROR 1503 (HY000):
A UNIQUE INDEX must include all columns in the table's partitioning function
mysql>


Create a new table for constraining the index that is not partitioned:

CREATE TABLE unique_index_table
(
id INTEGER NOT NULL,
PRIMARY KEY (id)
) ;


Create a trigger on the partitioned table for enforcing the uniqueness:

DELIMITER /

CREATE TRIGGER enforce_unique
BEFORE INSERT ON
part_table
FOR EACH ROW BEGIN
INSERT INTO
unique_index_table SET id = NEW.id;
END;
/

DELIMITER ;


Now, enforcement of the unique index can be done on the partitioned table:

mysql>INSERT INTO part_table VALUES (1,1);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (1,2);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
mysql> INSERT INTO part_table VALUES (1,1);
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
mysql> INSERT INTO part_table VALUES (2,1);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (3,3);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO part_table VALUES (3,2);
ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY'
mysql>


So, now it really becomes a question of performance. Functionally it works, but it it
really the best workaround?

I'll try some tests to compare the performance of this table partitioned with a trigger
enforced unique index against a non-partitioned table with a regular unique index
and see how the results are.

I'll post them as soon as I have the results. If anyone has a better way, I'm interested in knowing....

Friday, June 08, 2007

SQLServer 2005: Automated Partitioning Part 1

Partitioning is great, but it has to be automated if you want to do anything regularly with it. And as it typically goes, if you can do a task in repeatable order with T-SQL commands, you ought to be able to automate it, right?

In the next few weeks I'll layout a hopefully straight forward way to automate your partitioning in SQLServer 2005. Though the examples will be done with date range partitioning, the basic methodolgy should be transferable to any types of partitioning.

This week we'll be looking at setting up our partitioning functions, shemes, tables, and initial switching of the current partition(s).

In later weeks' we'll look at automating the merging and/or splitting of partitions as data is rolled out of relevance from a time perspective (IE: X number of months then rolls up into quarters / years as time goes on)

Example: Creating the Partition Function monthly_partition_function
We're going to start with a partition function that partitions by month for one year, then partitions by quarter for one year, then partitions by year. We'll assume that we'll start with 2008-01-01 as our highest defined partition, and 2005-01-01 as our oldest/lowest defined partition.

CREATE PARTITION FUNCTION monthly_partition_function (DATETIME)
AS RANGE RIGHT FOR VALUES
(
'2005-01-01', -- partition by year
'2006-01-01', -- partition by year
'2006-04-01', -- partition by q1
'2006-07-01', -- partition by q2
'2006-10-01', -- partition by q3
'2007-01-01', -- partition by q4
'2007-02-01', -- partition by month
'2007-03-01', -- partition by month
'2007-04-01', -- partition by month
'2007-05-01', -- partition by month
'2007-06-01', -- partition by month
'2007-07-01', -- partition by month
'2007-08-01', -- partition by month
'2007-09-01', -- partition by month
'2007-10-01', -- partition by month
'2007-11-01', -- partition by month
'2007-12-01', -- partition by month
'2008-01-01' -- partition by month
)

Example: Creating the Partition Scheme monthly_partition_scheme
Next create our partition scheme that lays out partitions onto file groups. For our purposes, we'll put all partitions in the primary file group (In reality, you should put them in different file groups based on your disk subsystem design for performance and managability reasons).

CREATE PARTITION SCHEME monthly_partition_scheme AS PARTITION monthly_partition_function ALL TO ([PRIMARY])
Example: Creating the Base Table example_sales_data
Now create our base fact table ( example_sales_data ) that will be partitioned according to the rules defined by our partition function and partition scheme:

CREATE TABLE example_sales_data ( trans_date DATETIME NOT NULL, trans_amt DECIMAL (5,2) NOT NULL, trans_type INT, sales_id INT, store_id INT, cust_id INT ) ON monthly_partition_scheme (trans_date)
Example: Creating the Load Table example_sales_data_load
Next create our load table ( example_sales_data_load ) that will be simillarly partitioned according to the rules defined by our partition function and partition scheme in order to insure partition switching will be seamless:


CREATE TABLE example_sales_data_load ( trans_date DATETIME NOT NULL, trans_amt DECIMAL (5,2) NOT NULL, trans_type INT, sales_id INT, store_id INT, cust_id INT ) ON monthly_partition_scheme (trans_date)
Example: Auto Load Seed Data Into Table example_sales_data
Next auto load our fact table ( example_sales_data) with seed data from previous years and months. We'll load through Feb of 2007 to start with.

DECLARE
@my_trans_date DATETIME ,
@my_trans_amt DECIMAL (5,2),
@my_trans_type INT,
@my_sales_id
INT,
@my_store_id
INT,
@my_cust_id
INT,
@my_cur_date
DATETIME
BEGIN
SET @my_cur_date = GETDATE() - 1200
SET @my_trans_date = @my_cur_date
SET @my_trans_amt = 10.00
SET @my_trans_type = 1
SET @my_sales_id = 1
SET @my_store_id = 1
SET @my_cust_id = 1

WHILE CAST ('2007-03-01' AS DATETIME) > @my_cur_date
BEGIN
WHILE @my_trans_type <= 3
BEGIN

WHILE @my_sales_id <= 5
BEGIN

WHILE @my_store_id <= 6
BEGIN

WHILE @my_cust_id <= 4
BEGIN
WHILE @my_trans_amt <= 500.00
BEGIN

INSERT INTO example_sales_data
VALUES
(
@my_trans_date,
@my_trans_amt,
@my_trans_type,
@my_sales_id,
@my_store_id,
@my_cust_id
)

SET @my_trans_amt = @my_trans_amt * 1.25
END -- @my_trans_amt
SET @my_trans_amt = @my_trans_amt / 10.0
SET @my_cust_id = @my_cust_id + 1
END -- @my_cust_id
SET @my_cust_id = 1
SET @my_store_id = @my_store_id + 1
END -- @my_store_id
SET @my_store_id = 1
SET @my_sales_id = @my_sales_id + 1
END -- @my_sales_id
SET @my_sales_id = 1
SET @my_trans_type = @my_trans_type + 1
END --@my_trans_type
SET @my_trans_type = 1
SET @my_cur_date = @my_cur_date + 7
SET @my_trans_date = @my_cur_date
END --@my_cur_date
END -- Base Table Load
Example: Verify Seed Data Via Partition Counts example_sales_data
Next check that rows were distributed appropriately to table partitions via the sys.dm_db_partition_stats dynamic management object.

SELECT OBJECT_NAME ([object_id]),partition_number,row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('example_sales_data')

Example: Auto Load New Data Into Load Table example_sales_data
Next auto load our load table ( example_sales_data_load) with new data from current month. (This is autogen'd here, but in reality it would be from whatever ETL process).

DECLARE
@my_trans_date DATETIME ,
@my_trans_amt DECIMAL (5,2),
@my_trans_type INT,
@my_sales_id
INT,
@my_store_id
INT,
@my_cust_id
INT,
@my_cur_date
DATETIME
BEGIN
SET @my_cur_date = CAST ('2007-03-01' AS DATETIME)
SET @my_trans_date = @my_cur_date
SET @my_trans_amt = 10.00
SET @my_trans_type = 1
SET @my_sales_id = 1
SET @my_store_id = 1
SET @my_cust_id = 1

WHILE CAST ('2007-04-01' AS DATETIME) > @my_cur_date
BEGIN
WHILE @my_trans_type <= 3
BEGIN

WHILE @my_sales_id <= 5
BEGIN

WHILE @my_store_id <= 6
BEGIN

WHILE @my_cust_id <= 4
BEGIN
WHILE @my_trans_amt <= 500.00
BEGIN

INSERT INTO example_sales_data_load
VALUES
(
@my_trans_date,
@my_trans_amt,
@my_trans_type,
@my_sales_id,
@my_store_id,
@my_cust_id
)

SET @my_trans_amt = @my_trans_amt * 1.25
END -- @my_trans_amt
SET @my_trans_amt = @my_trans_amt / 10.0
SET @my_cust_id = @my_cust_id + 1
END -- @my_cust_id
SET @my_cust_id = 1
SET @my_store_id = @my_store_id + 1
END -- @my_store_id
SET @my_store_id = 1
SET @my_sales_id = @my_sales_id + 1
END -- @my_sales_id
SET @my_sales_id = 1
SET @my_trans_type = @my_trans_type + 1
END --@my_trans_type
SET @my_trans_type = 1
SET @my_cur_date = @my_cur_date + 7
SET @my_trans_date = @my_cur_date
END --@my_cur_date
END -- Base Table Load
Example: Verify Seed Data Via Partition Counts example_sales_data_load
Next check that rows were distributed appropriately to table partitions via the sys.dm_db_partition_stats dynamic management object. All rows should reside in partition 9 now.

SELECT OBJECT_NAME ([object_id]),partition_number,row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('example_sales_data')
Example: Switch partition from LOAD table to MAIN table
Now we can switch partition 9 of the load table with the main table. Let's auto determine which partition to switch though!

To Manually get the partion number to switch:

SELECT $PARTITION.monthly_partition_function ( (
SELECT TOP 1
(trans_date) FROM example_sales_data_load ) )


To Auto Generate the partion number and switch:

DECLARE
@switch_str VARCHAR (512),
@target_partition INT

BEGIN SELECT @target_partition = $PARTITION.monthly_partition_function
((
SELECT TOP 1 (trans_date)
FROM example_sales_data_load
))

SET @switch_str = 'ALTER TABLE example_sales_data_load '
SET @switch_str = @switch_str + 'SWITCH PARTITION ('
SET @switch_str = @switch_str + CAST (@target_partition AS VARCHAR)
SET @switch_str = @switch_str + ') TO example_sales_data'
SET @switch_str = @switch_str + ' PARTITION ('
SET @switch_str = @switch_str + CAST (@target_partition AS VARCHAR)
SET @switch_str = @switch_str + ')'

EXEC (@switch_str);
END


OK - It's a lot in one session, but the scripts work, and now you have automated loading of your monthly partitions. Next time we'll look at automating the quarterly partition merges....

Thursday, May 31, 2007

TUNING: Taking Advantage of New Features

Wouldn't it be great if we always were able to design things from scratch? So many things could be re-hashed and re-designed. One of the real reason's we're (DBA's that is) frustrated all the time with poorly scalable products, is that we don't pro-actively encourage evolving architectures in a meaningful, planned way.

It's always those developers throwing things over the wall, or the feature requests from product managers - never giving the time to design it right the first time. There may be some thread of truth to those statements, but if we take advantage of those new features, bug fixes, maint windows, we could do a lot for ourselves in terms of performance tuning.

Here's a couple examples to get you started thinking:

A new feature:

  1. Does it require new tables?
    • Have we thought about partitioning should the data in those tables scale up?
    • Have we thought about primary keys and which columns should or should not be included given usage and future modifications?
    • Have we thought about foreign keys and associated indexes?
    • Have we thought about normalization?
  2. Is the new feature heavy read or heavy write?
    • Should we be considering additional data files and/or data stores?
    • Are there certain types of indexes that might be more prudent in the long term?
    • How does this fit in with our HA / disaster recovery scenarios?
  3. Does the new feature have any long term reporting benefits?
    • Are changes easily monitored?
    • Is the data easily transformed to dimensions / facts?
    • Should there be some archival process?
Those are just a few questions that might be thought about and pondered when new features or bug fixes come across in our applications from a database perspective. Many times, we rely on developers who are under tight schedules and deadlines to think these things through, when they are really focusing on usability and functionality.

If we want to make our lives easier, we need to be the ones asking those questions, and in such a way that helps the development team see the benefit in making those decisions now, and not a year from now when we've added a new customer with 500,000 new transactions a day...

As DBA's we need to strive to get on the front end of the development cycles to make it as easy as possible for our development teams to take advantage of features and designs that will allow scalability and manageability with our databases - whatever those databases might be.

If you're not doing that, then in my opinion, you might as well outsource your database administration to some company that just gonna monitor alerts and backup your data....

Friday, May 25, 2007

MySQL: getting started for newbies

When MySQL first came out and started popping up all over the web in the e-commerce arena, if you were like me you looked at it as a wannabe. 'Call me in a few years when you grow up'. Yea, sometimes the Oracle in us goes to our heads a little to fast...

MySQL has really gone mainstream in the last few years. Companies that never would have considered MySQL in the past for critical database systems now have it on the forfront of their minds when evaluating different database technologies. Clustering, Replication, Monitoring are all well used now with MySQL and well tested.

Well, if I'm not using it now, how do I get on board?

Don't wait for your boss to evaluate it, that's for sure. Go get your MySQL VM set up on your laptop and stop thinking about it! It took me about 1/2 an hour to get the MySQL VM installed, up and running, and security settings all in place to try out the latest version of the MySQL Administrator up and running to play with.

Most likely you have various programs that are already using MySql as a backend - you just might not be maintaining them or playing with them because they are in your corporate environment instead of your production environment. This is the perfect oportunity to get started with MySQL.

Start by downloading the MySQL VM and get it runing on your PC or laptop.

Then download the MySQL Administrator and set up your security and connetions.

Next get the MySql Migration Toolkit and migrate one of your smaller SQLServer or Oracle databases to it.

Now, get familiar with the tools, features, backup and recovery, etc, etc. Try setting up replication; try out clustering, try out the partitioning....

Lastly, admit your Oracle snobbery and get on board :)




Tuesday, March 13, 2007

SQLServer 2005: Working with partitions

Now that SQLServer finally supports something the database world would really consider a decent first attempt at partitioning, I think it's high time we all start putting it to practice.

Partitioning in the Oracle world is almost taken for granted. It's one of the most straight forward ways to scale a database in terms of size and performance, without looking at additional hardware and/or heavy software changes in terms of data flows and use patterns.

Partitioning does not need to be restricted to data warehousing either. It has very practical applications to OLTP databases. When it comes to SQLServer, in my mind, it has even more impact on performance than it does with Oracle because of SQLServer's implementation of locking (although, technically, with SQLServer 2005, it is possible to implement a reader's don't block writer's transaction scheme).

This is because partitioning allows us to have more control over what rows/pages get locked during inserts and selects if we design it correctly. The next few posts with focus on ways to partition an OLTP application, and my workarounds for unimplemented schemes like hash-partitioning. Most of my initial on the job training I give credit to Frank Bommarito, an incredibly talented Oracle architect.

Friday, October 29, 2004

SQLServer: Partitions

Initial testing with SQLServer 2000 partitions appears to work simillarly with Oracle from a select perspective.

Explain plans show accessing only local partition indexes, and not all partitions (or tables) in the union views. Inserts / updates appear to first check each partition, then perform the transaction only on the necessary partition.

Some of the issues we've specifically found with partitions are all the nitpicky rules associated with them.
  • Underlying tables must have a primary key and the partition constraint must be part of it.
  • Doesn't seem to work with between all the time with the optimizer.
  • Must be running Enterprise Edition of SQLServer

Hmm.... Now that's about the same cost as Oracle ;)


And so on, and so on, and so on...

It'll be nice when SQLServer catches up with Oracle from a partitioning perspective, but at least it's a start.