Pages

Friday, June 08, 2007

SQLServer 2005: Automated Partitioning Part 1

Partitioning is great, but it has to be automated if you want to do anything regularly with it. And as it typically goes, if you can do a task in repeatable order with T-SQL commands, you ought to be able to automate it, right?

In the next few weeks I'll layout a hopefully straight forward way to automate your partitioning in SQLServer 2005. Though the examples will be done with date range partitioning, the basic methodolgy should be transferable to any types of partitioning.

This week we'll be looking at setting up our partitioning functions, shemes, tables, and initial switching of the current partition(s).

In later weeks' we'll look at automating the merging and/or splitting of partitions as data is rolled out of relevance from a time perspective (IE: X number of months then rolls up into quarters / years as time goes on)

Example: Creating the Partition Function monthly_partition_function
We're going to start with a partition function that partitions by month for one year, then partitions by quarter for one year, then partitions by year. We'll assume that we'll start with 2008-01-01 as our highest defined partition, and 2005-01-01 as our oldest/lowest defined partition.

CREATE PARTITION FUNCTION monthly_partition_function (DATETIME)
AS RANGE RIGHT FOR VALUES
(
'2005-01-01', -- partition by year
'2006-01-01', -- partition by year
'2006-04-01', -- partition by q1
'2006-07-01', -- partition by q2
'2006-10-01', -- partition by q3
'2007-01-01', -- partition by q4
'2007-02-01', -- partition by month
'2007-03-01', -- partition by month
'2007-04-01', -- partition by month
'2007-05-01', -- partition by month
'2007-06-01', -- partition by month
'2007-07-01', -- partition by month
'2007-08-01', -- partition by month
'2007-09-01', -- partition by month
'2007-10-01', -- partition by month
'2007-11-01', -- partition by month
'2007-12-01', -- partition by month
'2008-01-01' -- partition by month
)

Example: Creating the Partition Scheme monthly_partition_scheme
Next create our partition scheme that lays out partitions onto file groups. For our purposes, we'll put all partitions in the primary file group (In reality, you should put them in different file groups based on your disk subsystem design for performance and managability reasons).

CREATE PARTITION SCHEME monthly_partition_scheme AS PARTITION monthly_partition_function ALL TO ([PRIMARY])
Example: Creating the Base Table example_sales_data
Now create our base fact table ( example_sales_data ) that will be partitioned according to the rules defined by our partition function and partition scheme:

CREATE TABLE example_sales_data ( trans_date DATETIME NOT NULL, trans_amt DECIMAL (5,2) NOT NULL, trans_type INT, sales_id INT, store_id INT, cust_id INT ) ON monthly_partition_scheme (trans_date)
Example: Creating the Load Table example_sales_data_load
Next create our load table ( example_sales_data_load ) that will be simillarly partitioned according to the rules defined by our partition function and partition scheme in order to insure partition switching will be seamless:


CREATE TABLE example_sales_data_load ( trans_date DATETIME NOT NULL, trans_amt DECIMAL (5,2) NOT NULL, trans_type INT, sales_id INT, store_id INT, cust_id INT ) ON monthly_partition_scheme (trans_date)
Example: Auto Load Seed Data Into Table example_sales_data
Next auto load our fact table ( example_sales_data) with seed data from previous years and months. We'll load through Feb of 2007 to start with.

DECLARE
@my_trans_date DATETIME ,
@my_trans_amt DECIMAL (5,2),
@my_trans_type INT,
@my_sales_id
INT,
@my_store_id
INT,
@my_cust_id
INT,
@my_cur_date
DATETIME
BEGIN
SET @my_cur_date = GETDATE() - 1200
SET @my_trans_date = @my_cur_date
SET @my_trans_amt = 10.00
SET @my_trans_type = 1
SET @my_sales_id = 1
SET @my_store_id = 1
SET @my_cust_id = 1

WHILE CAST ('2007-03-01' AS DATETIME) > @my_cur_date
BEGIN
WHILE @my_trans_type <= 3
BEGIN

WHILE @my_sales_id <= 5
BEGIN

WHILE @my_store_id <= 6
BEGIN

WHILE @my_cust_id <= 4
BEGIN
WHILE @my_trans_amt <= 500.00
BEGIN

INSERT INTO example_sales_data
VALUES
(
@my_trans_date,
@my_trans_amt,
@my_trans_type,
@my_sales_id,
@my_store_id,
@my_cust_id
)

SET @my_trans_amt = @my_trans_amt * 1.25
END -- @my_trans_amt
SET @my_trans_amt = @my_trans_amt / 10.0
SET @my_cust_id = @my_cust_id + 1
END -- @my_cust_id
SET @my_cust_id = 1
SET @my_store_id = @my_store_id + 1
END -- @my_store_id
SET @my_store_id = 1
SET @my_sales_id = @my_sales_id + 1
END -- @my_sales_id
SET @my_sales_id = 1
SET @my_trans_type = @my_trans_type + 1
END --@my_trans_type
SET @my_trans_type = 1
SET @my_cur_date = @my_cur_date + 7
SET @my_trans_date = @my_cur_date
END --@my_cur_date
END -- Base Table Load
Example: Verify Seed Data Via Partition Counts example_sales_data
Next check that rows were distributed appropriately to table partitions via the sys.dm_db_partition_stats dynamic management object.

SELECT OBJECT_NAME ([object_id]),partition_number,row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('example_sales_data')

Example: Auto Load New Data Into Load Table example_sales_data
Next auto load our load table ( example_sales_data_load) with new data from current month. (This is autogen'd here, but in reality it would be from whatever ETL process).

DECLARE
@my_trans_date DATETIME ,
@my_trans_amt DECIMAL (5,2),
@my_trans_type INT,
@my_sales_id
INT,
@my_store_id
INT,
@my_cust_id
INT,
@my_cur_date
DATETIME
BEGIN
SET @my_cur_date = CAST ('2007-03-01' AS DATETIME)
SET @my_trans_date = @my_cur_date
SET @my_trans_amt = 10.00
SET @my_trans_type = 1
SET @my_sales_id = 1
SET @my_store_id = 1
SET @my_cust_id = 1

WHILE CAST ('2007-04-01' AS DATETIME) > @my_cur_date
BEGIN
WHILE @my_trans_type <= 3
BEGIN

WHILE @my_sales_id <= 5
BEGIN

WHILE @my_store_id <= 6
BEGIN

WHILE @my_cust_id <= 4
BEGIN
WHILE @my_trans_amt <= 500.00
BEGIN

INSERT INTO example_sales_data_load
VALUES
(
@my_trans_date,
@my_trans_amt,
@my_trans_type,
@my_sales_id,
@my_store_id,
@my_cust_id
)

SET @my_trans_amt = @my_trans_amt * 1.25
END -- @my_trans_amt
SET @my_trans_amt = @my_trans_amt / 10.0
SET @my_cust_id = @my_cust_id + 1
END -- @my_cust_id
SET @my_cust_id = 1
SET @my_store_id = @my_store_id + 1
END -- @my_store_id
SET @my_store_id = 1
SET @my_sales_id = @my_sales_id + 1
END -- @my_sales_id
SET @my_sales_id = 1
SET @my_trans_type = @my_trans_type + 1
END --@my_trans_type
SET @my_trans_type = 1
SET @my_cur_date = @my_cur_date + 7
SET @my_trans_date = @my_cur_date
END --@my_cur_date
END -- Base Table Load
Example: Verify Seed Data Via Partition Counts example_sales_data_load
Next check that rows were distributed appropriately to table partitions via the sys.dm_db_partition_stats dynamic management object. All rows should reside in partition 9 now.

SELECT OBJECT_NAME ([object_id]),partition_number,row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('example_sales_data')
Example: Switch partition from LOAD table to MAIN table
Now we can switch partition 9 of the load table with the main table. Let's auto determine which partition to switch though!

To Manually get the partion number to switch:

SELECT $PARTITION.monthly_partition_function ( (
SELECT TOP 1
(trans_date) FROM example_sales_data_load ) )


To Auto Generate the partion number and switch:

DECLARE
@switch_str VARCHAR (512),
@target_partition INT

BEGIN SELECT @target_partition = $PARTITION.monthly_partition_function
((
SELECT TOP 1 (trans_date)
FROM example_sales_data_load
))

SET @switch_str = 'ALTER TABLE example_sales_data_load '
SET @switch_str = @switch_str + 'SWITCH PARTITION ('
SET @switch_str = @switch_str + CAST (@target_partition AS VARCHAR)
SET @switch_str = @switch_str + ') TO example_sales_data'
SET @switch_str = @switch_str + ' PARTITION ('
SET @switch_str = @switch_str + CAST (@target_partition AS VARCHAR)
SET @switch_str = @switch_str + ')'

EXEC (@switch_str);
END


OK - It's a lot in one session, but the scripts work, and now you have automated loading of your monthly partitions. Next time we'll look at automating the quarterly partition merges....

No comments: