Practice Your Restores

December 30, 2009 at 7:35 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication)

Steve Jones posted an excellent editorial today all about how your backups are only good if you know that you can restore from them. He couldn’t be more correct. I posted the following thoughts in the comments, but I know not everyone reads the comments on articles & editorials. Although, if it’s a good article, you should read the comments, especially on SQL Server Central. Frequently the discussion about the article can be as enlightening as the article itself. But I digress.

Steve’s point, pretty clearly stated, but I’ll repeat it: backups don’t matter, restores do. I’m going to pile on to this point just a bit, because it can’t be emphasized enough. Nothing is more important than verifying backups, except verifying that you know how to run a restore. You’re absolutely right when you say that backups are no good unless you can restore them, but it goes beyond validating that the backup files themselves are valid and accessible. You need to know that you, and any other DBAs in the organization, can actually run a restore, know how to read the file header, can do a point-in-time recovery, etc. Practicing restores not only validates that the backups are good, but that you’re good as well.
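For anyone who wants to rehearse this, here’s a minimal sketch of the kind of restore drill I mean. Every database name, logical file name, path, and timestamp below is a hypothetical placeholder; swap in your own before running anything.

```sql
-- All names and paths here are placeholders; substitute your own.
-- First, prove the backup file is readable and see what's in it:
RESTORE HEADERONLY FROM DISK = 'C:\Backups\MyDB.bak'
RESTORE VERIFYONLY FROM DISK = 'C:\Backups\MyDB.bak'

-- Practice a full restore to a scratch copy, not over production:
RESTORE DATABASE MyDB_Practice
FROM DISK = 'C:\Backups\MyDB.bak'
WITH MOVE 'MyDB_Data' TO 'C:\Data\MyDB_Practice.mdf'
,MOVE 'MyDB_Log' TO 'C:\Data\MyDB_Practice.ldf'
,NORECOVERY

-- Then practice a point-in-time recovery from a log backup:
RESTORE LOG MyDB_Practice
FROM DISK = 'C:\Backups\MyDB_Log.trn'
WITH STOPAT = '2009-12-30 07:00:00'
,RECOVERY
```

If you can walk through those steps cold, you’ve validated the backup and yourself.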


View vs. Table Valued Function vs. Multi-Statement Table Valued Function

August 13, 2008 at 8:36 am (TSQL)

About five years ago, I was checking an app before it went to production. I hadn’t seen the app before then, and a junior DBA had worked with the developers designing and building it. It didn’t use a single stored procedure or view. Instead, it was built entirely of multi-statement UDFs. These UDFs called other UDFs, which joined to UDFs… It was actually a very beautiful design in terms of using the functions more or less like objects within the database. Amazing. It also would not, and could not, perform well enough to function, let alone scale. It was a horror, because they thought they were done and ready to go to production, but no one had ever tested more than a couple of rows of data in any of the tables. Of course, a couple of rows of data worked just fine. It was when we put in 10, 1,000, a few million, that the thing came to a total and complete halt. We spent weeks arguing about the stupid thing. The developers insisted that since it was “possible” to do what they did, it was, in fact, OK to do what they did.

Anyway, with the help of a Microsoft consultant, we finally cleaned up the app and got it on its feet. Ever since then, I’ve preached the dangers of the multi-statement table valued function. The thing to remember is, there are no statistics generated for these things. That means the optimizer thinks they return a single row of data. When they do only return a few rows, everything is fine. When they return even as few as a hundred rows, like the example I’m posting below, they stink.
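If you want to see that bad estimate for yourself, here’s one way to look at it, using the MultiUDF function from the example below. SET STATISTICS PROFILE returns the actual plan as a result set with both the estimated and actual row counts side by side.

```sql
SET STATISTICS PROFILE ON
SELECT * FROM dbo.MultiUDF()
SET STATISTICS PROFILE OFF
-- In the EstimateRows column, the scan against the table variable the
-- function returns shows an estimate of 1, no matter how many rows
-- actually come back in the Rows column.
```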

Anyway, I boiled up this silly example because some developer accused me and several other DBAs of spreading Fear, Uncertainty, and Doubt because we suggested that the multi-statement UDF is something to avoid if possible. Actually, he all but stated that we didn’t know what we were talking about. I was peeved. Hence this example. Feel free to check it out. Oh, and if you check the execution plans, note that the multi-statement UDF is marked as the least costly even though it actually performs twice as slowly as the others. One more example of execution plans being wrong.

Here are the time results from one run of the view & UDFs:

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.

And the code to test for yourself:

CREATE TABLE dbo.Parent
(ParentId int IDENTITY(1,1)
,ParentDate datetime)

CREATE TABLE dbo.Child
(ChildId int IDENTITY(1,1)
,ParentId int
,ChildDate datetime)
GO

-- Note: @j is never reset inside the outer loop, so child rows are only
-- added on the first pass: 99 parents and 99 children, all with ParentId = 1.
-- That's what produces the 99-row result sets above.
DECLARE @i int
DECLARE @j int
SET @i = 1
SET @j = 1
WHILE @i < 100
BEGIN
    INSERT INTO dbo.Parent
        (ParentDate)
    SELECT GETDATE()
    WHILE @j < 100
    BEGIN
        INSERT INTO dbo.Child
            (ParentId
            ,ChildDate)
        SELECT @i
            ,GETDATE()
        SET @j = @j + 1
    END
    SET @i = @i + 1
END
GO

CREATE VIEW dbo.vJoin
AS
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,c.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId
GO

CREATE FUNCTION dbo.SingleUDF ()
RETURNS TABLE
AS
RETURN
(
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,c.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId
)
GO

CREATE Function dbo.MultiUDF ()
RETURNS @Multi TABLE
(ParentId int
,ParentDate datetime
,ChildId int
,ChildDate datetime)
AS
BEGIN
INSERT INTO @Multi
(ParentId
,ParentDate
,ChildId
,ChildDate)
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,c.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId
RETURN
END
GO

SET STATISTICS TIME ON
SELECT * FROM dbo.vJoin
SELECT * FROM dbo.SingleUDF()
SELECT * FROM dbo.MultiUDF()
SET STATISTICS TIME OFF


Slick New Tool from RedGate

March 20, 2008 at 12:16 pm (Tools)

I have no intention of this becoming “tool of the day” or anything, but I can’t help tooting the horn for a tool from RedGate that I’ve been using a lot. It’s new and in beta right now, but it’s going to be pretty good. It’s a data generation tool called, are you ready, SQL Data Generator. Who saw that coming? OK. I know. I’m not funny.

Anyway, this is a great little tool. I’ve been using it to quickly slap large amounts of data into small sets of tables to test queries that I’m writing or for checking answers that I’m posting over at SQL Server Central. The tool lets you pick which tables, and the columns inside those tables, you want filled with data. It has a number of data generation schemes built in to get either randomly generated data or data that looks like something (names from a list, for example). It’ll also let you pull data from other sources.

And unlike the data load tool that comes with DataDude, this will let you ignore a table but still use data from it. Is that confusing? We have a build process using DataDude that we use to build our databases. Additionally, we load the standard look-up data into the database as part of the build by including it in a post-deployment script. Works great. Unfortunately, when we try to use these tables in the MS data loader, it wants to either replace the data or completely ignore the table (and the associated foreign keys…). This tool from RedGate allows me to skip loading data into the table because it’s already there, but it lets me use that table as a look-up for the foreign key data that it’s generating. Great stuff.

The beta is pretty stable if you want to try it out, and I think it’s going to be released soon.
