Table Valued Functions

January 7, 2009 at 7:43 am (TSQL) (, , )

I’ve been blogging for a whole nine months now. I decided to look back and see what’s bringing people to the site. The number one search phrase is “sql server 2005 service pack 3” but the overwhelming topic that most people are using to get to the site are user defined functions, specifically multi-statement table valued user defined functions.

It’s completely understandable. Ever since I first saw these things in use back in SQL Server 2000, I thought they were slick. Unfortunately appearances can be deceiving. The reason so many people are searching out information on these things is because they just don’t work very well. SQL Server can’t create statistics on the tables generated through the multi-statement UDF. Because it has no statistics to work with, the query optimizer assumes that each of these tables has a single row. Creating execution plans for only a single row of data works well when there is only a single row, or just a few rows, of data. But when you have hundreds, thousands, or more, it breaks down very quickly.

Because of this rather severe short coming, I recommend avoiding the use of multi-statement table valued functions. There are any number of tests (look through Gail’s blog, just for starters) proving this for those who are skeptical and think that, if they just adjust the right little bit of the query, it’ll move millions of rows through 8 layers of functions lickety split. It won’t. It can’t. Stop trying.

The single statement functions, parameterized views as they’re sometimes referred to, work very well, so feel free to use them if they’ll help your code. They resolve out, like views, into actual execution plans based on the statistics within the database.

Permalink Leave a Comment

Multi-Statement Table Value Function Alternative

November 21, 2008 at 1:41 pm (TSQL) (, , , )

I was talking with Andrew Novick at the PASS Summit. We ended up talking about multi-statement table valued functions. I was talking about how much the performance of these things is weak (to be kind). He agreed, but suggested an alertnative that might be worth further exploration, if you really think you need multi-statement UDFs. Andrew said that in his testing, using CLR offered a great alernative to using the UDF. I’m still pretty convinced that any type of programming you’re doing on the SQL Server end that requires a UDF or CLR is probably either just TSQL gone wrong (see Jeff Moden and the RBAR concept) or it’s something that doesn’t belong on the SQL Server but instead should be done on the application layer somewhere. Still, it is an option.

Permalink Leave a Comment

More on Table Valued Functions

August 15, 2008 at 7:29 am (TSQL) (, , , )

From the hits in the search results, this is a popular topic. That being so, I’d like to redirect you to another blog that has some more detailed tests available. These tests show more of the shortcomings of multi-statement table valued functions. Gail Shaw is an MVP and a regular at SQL Server Central. She’s worth tracking and this post shows why.

I want to be clear. I’m not suggesting that you never, ever, use multi-statement table valued functions. There may be places where their use is helpful. I’m saying that using them comes with a very heavy cost, so you better be sure that they are in fact needed in the situation, whatever it is.

There was a long discussion and debate over at SQL Server Central recently on “profane” methods in TSQL. These were defined as the methods that any reasonably experienced DBA or database developer will generally proscribe against, such as TSQL cursors, table valued user defined functions (of either variety), CLR, and a few others. The confusion seemed to arise between the statement “there are usually better ways to access the data than to use X” and the statement “THOU SHALT NEVER USE X.” Apparently saying the first somehow causes people to hear the second. I don’t know why and I kind of don’t care. I think there are perfectly valid uses for cursors and WHILE loops and CLR and triggers and UDF’s… But I also know these things have a cost and if you simply don’t use them, there may be the occasional place where you might have recognized a savings, but overall your database will be better off.

Permalink Leave a Comment

View vs. Table Valued Function vs. Multi-Statement Table Valued Function

August 13, 2008 at 8:36 am (TSQL) (, , , , , )

About five years ago, I was checking an app before it went to production. I hadn’t seen the app before then and a junior dba had worked with the developers designing and building the app. It didn’t use a single stored procedure or view. Instead, it was built entirely of multi-statement UDF’s. These UDF’s called other UDF’s which joined to UDF’s… It was actually a very beautiful design in terms of using the functions more or less like objects within the database. Amazing. It also would not, and could not, perform enough to function, let alone scale. It was a horror because they thought they were done and ready to go to production, but no one had ever tested more than a couple of rows of data in any of the tables. Of course, a couple of rows of data worked just fine. It was when we put in 10, 1000, a few million, that the thing came to a total and complete halt. We spent weeks arguing about the stupid thing. The developers instisted that since it was “possible” to do what they did, that, in fact, it was OK to do what they did.

Anyway, with the help of a Microsoft consultant, we finally cleaned up the app and got it on it’s feet. Ever since then, I’ve preached the dangers of the multi-statement table valued function. The thing to remember is, there are no statistics generated for these things. That means the optimizer thinks they return a single row of data. When they do only return a few rows, everything is fine. When they return even as little as a hundred rows, like the example I’m posting below, they stink.

Anyway, I boiled up this silly example because some developer accused me and several other DBA’s of spreading Fear, Undertainty, and Doubt because we suggested that the multi-statement UDF is something to avoid if possible. Actually, he pretty all but stated that we didn’t know what we were talking about. I was peeved. Hence this example. Feel free to check it out. Oh, and if you check the execution plans, note that the multi-statement UDF is marked as the least costly even though it actually performs twice as slow as the others. One more example of execution plans being wrong.

Here are the time results from one run of the view & UDF’s:

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.

(99 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.

And the code to test for yourself:

CREATE TABLE dbo.Parent
(ParentId int identity(1,1)
,ParentDate datetime)

CREATE TABLE dbo.Child
(ChildId int identity(1,1)
,ParentId int
,ChildDate datetime)

DECLARE @i int
DECLARE @j int
SET @i = 1
SET @j = 1
WHILE @i < 100
BEGIN
INSERT INTO dbo.Parent
(ParentDate)
SELECT GETDATE()
WHILE @j < 100
BEGIN
INSERT INTO dbo.Child
(ParentId
,ChildDate)
SELECT @i
,GETDATE()
SET @j = @j + 1
END
SET @i = @i + 1
END

CREATE VIEW dbo.vJoin
AS
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,C.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId

CREATE FUNCTION dbo.SingleUDF ()
RETURNS TABLE
AS
RETURN
(
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,C.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId
)

CREATE Function dbo.MultiUDF ()
RETURNS @Multi TABLE
(ParentId int
,ParentDate datetime
,ChildId int
,ChildDate datetime)
AS
BEGIN
INSERT INTO @Multi
(ParentId
,ParentDate
,ChildId
,ChildDate)
SELECT p.ParentId
,p.ParentDate
,c.ChildId
,C.ChildDate
FROM dbo.Parent p
JOIN dbo.Child c
ON p.ParentId = c.ParentId
RETURN
END

set statistics time on
select * from vJoin
select * from SingleUDF()
select * from MultiUDF()
set statistics time off

Permalink 6 Comments