SQL Rally: Performance Tuning Abstract

October 28, 2010 at 8:00 am (PASS)

I get the call, you get the call, everyone gets the call. “Hey, my app/procedure/query/report is running slow.” Now what do you do? You go to my full day session at SQL Rally, that’s what. Assuming you vote for it.

I didn't post the abstract I submitted for the SQL Rally before because I thought it would be redundant. However, since it's not visible right off the voting page (unless they've updated it since I voted), here's what I thought I would do for a day, in case you're interested. If it sounds good to you, please go here and vote for it.

One of the most common problems encountered in SQL Server is the slow running query. Once a query is identified as performing poorly, DBAs and developers frequently don't understand how to diagnose the problem and often struggle to fix it. This one-day seminar focuses exclusively on these two topics. Attendees will learn how to identify the queries that are performing badly and learn how to fix them. We will start by learning how to gather performance metrics, both server and query metrics, using tools available directly from Microsoft such as Performance Monitor, DMVs and Profiler. From there we'll move into learning how the optimizer works and how it uses statistics to determine which indexes and other database objects can assist the performance of a query. The session takes considerable time to show exactly how to generate and read execution plans, the single best mechanism for observing how the optimizer works. We'll then look at other DMVs that can assist you when performance tuning queries. With all this knowledge gathered, we'll move into looking at common performance problems, how they evidence themselves in the metrics and execution plans, and how to address them. Finally, we'll explore advanced methods for solving some of the more difficult query performance problems, introducing such concepts as query hints, plan guides and plan forcing. Through all of this, best practices and common techniques will be reviewed. Attendees will go home with a working knowledge of query performance tuning, a set of methods for identifying poorly performing queries, scripts to assist in these processes and the knowledge of how to fix performance problems in their own systems.

To see the other sessions go here:
BI
DBA
Developer
Misc

Although I would prefer that you voted for me, it’s more important that you vote at all (same thing as in real life). Please go here and select the sessions that you want to see.


Permalink 1 Comment

24 Hours of PASS: Summit Preview

August 11, 2010 at 8:58 am (PASS, SQL Server 2008, SQLServerPedia Syndication, TSQL)

Registration is open for the second 24 Hours of PASS this year. This one is going to be a preview of the Summit itself, so all the sessions are tied, in some manner, to sessions being given at the Summit. Here's a link to go and register.

I’m very excited to be able to say that I’ll be presenting in this 24HOP. One of my presentations at the Summit this year is Identifying and Fixing Performance Problems Using Execution Plans. It covers pretty much what it says, methods for fixing performance problems by exploring the information available within execution plans. But, how do you know you have a performance problem? That’s where my preview session comes in. Identifying Costly Queries will show you several ways to gather metrics on your system so that you can understand which queries are causing you the most pain. Once you know which queries need tuning, you can use execution plans to tune them. Whether you’ll be attending the PASS Summit or not, and whether or not you’ll go to my session once you’re there, I think this 24HOP session will be useful to help you understand where the pain points are within your own systems. I hope you’ll attend.

More importantly though, check out all the other great sessions. This is an excellent collection of presenters and presentations. For anyone who has ever said “PASS doesn’t do anything for me,” I want you especially to take a look at the amazing training opportunities being offered by PASS, for free. The volunteers that run PASS do amazing things and this is just one of them. Take advantage of this opportunity and, hopefully, recognize that PASS is doing things for you. This just barely scratches the surface of all that PASS offers.

Permalink 3 Comments

Confio Ignite: First Impressions

April 6, 2010 at 8:28 am (SQL Server 2005, SQL Server 2008)

One of the national sponsors for SQL Saturday is Confio Software. Their sponsorship of SQL Saturday alone makes their product worth checking out, but they also host the speakers' dinner at these events. So, having eaten off their dime, I felt obligated to take a look. I'm glad I did.

Confio Ignite is monitoring software that keeps real-time and historical track of the performance of your SQL Server (and Oracle and DB2) database servers. Its focus is on wait states and queues, a very common method for troubleshooting performance.
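If you haven't dug into waits before, the raw material for this type of analysis is exposed directly by SQL Server through a DMV. Here's a minimal sketch of the underlying idea (this is just the DMV, not anything Ignite actually does, and the exclusion list of benign waits is illustrative, not complete):

-- Cumulative wait information since the last service restart (or stats clear)
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP',
                        'SQLTRACE_BUFFER_FLUSH', 'WAITFOR') -- benign system waits, partial list
ORDER BY wait_time_ms DESC;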

You can get a trial download from their web site to run for a couple of weeks yourself. Everyone's first impression of a piece of software, after the web site of course, comes when you go to do the install. The install routine for Ignite was very easy. It's also a bit of a shock: it actually uses web pages for configuration. This makes sense when you consider that the software originally comes from the Oracle and Unix world. Anyway, despite the slight weirdness of using a web page to configure software, it's a very easy install. Once the install is complete, it's also really simple to add servers for the software to monitor.

I put it to work on a number of development servers to start (I'm not plugging some unknown software into Production without a shake-down. I may be stupid, but I'm not crazy). It's agentless, and it began running queries against the systems immediately. It stores the data to my local desktop. That data is available in a real-time view, a historical view, or a trending view showing how performance is changing over time. It's all through a web site, not dissimilar to Quest's Foglight. You can see a list of servers, their immediate state, action choices and some other general info.

Clicking a server, after it's had time to gather data, puts you into the Trend screen. It's laid out very nicely. You get to break down performance by SQL, Waits, Programs, Databases, Machines, or Users. Then it shows information by Total Wait, Average Wait or Typical Day. The chart gives you a visual representation of queries running against the system, by default numbered and assigned by an internal hashing system that Ignite uses (no, I don't know what the internals look like, I'm inferring that from the information presented). Down at the bottom of the screen you can see Top Query Problems, which shows you queries that are running slow or using too many resources according to configurable settings; or Resources, your standard breakdown of CPU, Memory Paging, Disk Queue and the rest; or SQL Text, literally the queries that have been running on your system.

Almost everything has a drill down that takes you to a lot more information, and there are pop-ups that show you bits of detail.

I've only been running it for a few days, but I can already see how it could be very useful. I'm very happy with the focus on waits and on queries. So often when you're looking at these tools, queries are either an afterthought or ignored completely. You frequently have to spend time monitoring your queries on your own while that very expensive monitoring software tracks page life expectancy and buffer cache hit ratio; both are useful measures, but you still need to know which query is causing the problems. Ignite gets to that information right away and still collects the other measures as well.
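As an aside, the two counters I just mentioned are available straight out of SQL Server as well; a quick sketch (keep in mind that Buffer cache hit ratio only means something when divided by its base counter):

-- Reading the counters without any third-party tool
SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Page life expectancy',
                       'Buffer cache hit ratio',
                       'Buffer cache hit ratio base');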

So far, I’m pretty pleased with the software. I’m going to switch this over to a production system in the next few days and see what I can see with Ignite running against a real system. More to come.

Permalink 2 Comments

nHibernate, First Look at TSQL

April 5, 2010 at 10:19 am (nHibernate)

I’ve blogged in the past about the nHibernate project that has been going on at my company for the last eighteen months. Prior to today, I have only seen the database generated by nHibernate. But today I finally started seeing some TSQL code. My first impressions… oy!

There are two levels of concern coming out of the gate. First, it appears that some of the programming decisions, completely independent of nHibernate, are going to cause problems. Second, it appears we’re going to be hitting issues with nHibernate.

First, the programming decision. I’ve been using Confio to monitor this server for a few days now (look for upcoming posts on my eval of Confio). Only one day has captured any real activity from the nHibernate team (and yes, I’m basically spying on these guys because they are absolutely insistent that my team stay out of their way). The longest running query was two calls to this (names have been changed to protect my job, there are no innocents):

SELECT BookID
FROM Books WITH (updlock,rowlock)
WHERE BookID = 42

 What they’re doing is locking the row so that no other user can get at it while it’s being edited. Total lock time for the first day was 38 seconds for two calls. I’ve posted about this previously, so I won’t go into it again, but this is potentially a disaster.

On to nHibernate. The next query was pretty benign:

SELECT OrgID
,OfficeID
,StartDate
,EndDate
,OtherID
FROM SpecialList
WHERE OfficeID = 42

Unfortunately this table, as created by nHibernate, has no primary key, no clustered index, nor any other type of index, so this is just a table scan. But it's only on 109 rows… in dev, not production; oh, and most of the 109 rows have a null value for OfficeID, but it's all good I suppose… until I notice that this query also had 38 seconds of wait time, but it was called 394,652 times… oy (and no, I'm not even one of the chosen, but that's a wonderful expression for this situation). Looking at the data in cache, this query has been called 598,351 times since it was created in cache on the 2nd, with a total elapsed time on the server of 5,429,689 ms. The average then is 9 ms, but the max was 29,296 ms, or 29 seconds.
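I'm not showing the exact query I used to pull those numbers out of cache, but something along these lines against sys.dm_exec_query_stats will get you execution counts and elapsed times per cached statement (a sketch only):

SELECT dest.text,
       deqs.creation_time,
       deqs.execution_count,
       deqs.total_elapsed_time, -- reported in microseconds
       deqs.max_elapsed_time,   -- reported in microseconds
       deqs.total_elapsed_time / deqs.execution_count AS avg_elapsed
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY deqs.execution_count DESC;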

The next query up looked like this:

SELECT col1
,col2
,col3
FROM Table1
INNER JOIN Table2
ON Table1.ChildID = Table2.OtherID
WHERE Table2.ParentID IN (@p1, @p2, … @p99)

Yep. 99 parameters passed and used against an IN statement. I can't post the execution plans on this stuff without talking it over with the boss, but suffice it to say, it's two table scans and a merge join to get the data back. Oh, and this was called 1,138 times with a minimum elapsed time of 976 ms. Yeah, the minimum time to call this query is about 1 second, and it's been called over a thousand times.

It went downhill from there. I'm looking at other queries, one that hits seven tables, but instead of using JOINs uses a bunch of sub-selects in the WHERE clause. It has a minimum run time of 9 seconds. Luckily it's only been called 64 times.

This is a horror show.

A lot of benefit can be reaped if we’re able to go in and create some indexes on these tables, but that’s going to be an issue anyway because I’m seeing lots of implicit data conversions on the parameters being passed in, as I noted a couple of years ago. At this point in time, I would say, at best, I’m very underwhelmed by what nHibernate is producing. To be perfectly honest, I’m concerned for the health of the project.

Permalink 22 Comments

More Free Training

September 15, 2009 at 9:59 am (SQL Server 2005, SQL Server 2008, TSQL)

Quest Connect 2009, taking place on October 21 for 24 hours, looks like it's going to have 64 different sessions, live and recorded, by a variety of the names in the industry. It's another chance to dig in and learn the details on a variety of topics from some of the top names in the business. Can you say Tom LaRock? How about Tim Ford? I know you want to hear from Brent Ozar. Those are just some of the featured speakers. There's a whole slew of others; it's worth pursuing and, did I mention, the price is right.

I recorded a session for them last night. It’s on the basics of understanding execution plans.

Permalink 2 Comments

No Join Predicate

September 15, 2009 at 8:00 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL)

You could be looking at an execution plan on a query and see this message: Warning, No Join Predicate. With very few exceptions (very few), that's not a warning like "Mind the gap" or "Your Mileage May Vary" or "Never fight a land war in Asia." No. It's a screaming shout saying "BY ALL THE GODS IN VALHALLA, YOU DON'T HAVE ANY JOIN PREDICATES IN THIS QUERY!" Seriously, that's exactly what it says.

But, you might be asking yourself, that's what it says, but what does it mean? Please allow me to explain. The tables in your system, whether it's a classic BI star schema, or a highly normalized OLTP system, or even (shudder) ORM objects, are related to one another. Usually they're related through the use of primary and foreign keys. The primary key uniquely identifies a row, like a particular value in a dimension table. This then is related to some number of rows in another table, like in a fact table, through the foreign key. The foreign key acts to protect the integrity of the data, ensuring that if you insert something that's supposed to relate to another row, then that other row really exists, and if you try to delete that other row, the foreign key will remind you that, hey, that row has things that depend on it existing in another table. All of this can be quite complicated or quite simple. Heck, you may be using an ORM system and not have a single foreign key in sight.

But, when you go to retrieve data from two tables that are related in a TSQL query, you need to define that relationship in the TSQL itself. SQL Server 2008 relies on what was defined as the ANSI standard back in 1992. But, if you learned your SQL coding skills back in the day or on Oracle, you may be used to simply listing tables and then defining their relationships through the WHERE clause. Just a side note: unless you learned your SQL skills in Oracle, if you were born AFTER the 1992 standard was enacted, you don't have an excuse. Anyway, the old way the code looked was this:

SELECT…
FROM TableA AS a, TableB as b, TableC as c
WHERE a.MyID = b.MyID
AND b.YourID *= c.YourID
AND….

What you have there is an inner join, or in the old parlance, an equi-join because of the use of the equals sign, between TableA and TableB, and a left outer join, defined through the use of the '*', between TableB and TableC. Any other criteria that defined the WHERE clause then followed.

Flat out, in SQL Server 2008, you can't do this any more. It will not support that style of outer join. The newfangled method, remember, new as of 1992, is this:

SELECT …
FROM TableA as a
INNER JOIN TableB as b
ON a.MyID = b.MyID
LEFT JOIN TableC as c
ON b.YourID = c.YourID
WHERE….

This forces you to separate the relations between the tables, as defined by the JOINs, from the filtering methods supplied through the WHERE clause. And yes, it's a lot more complicated than I'm making it out to be, but I'm trying to establish the ground rules here. Where was I? Oh yeah, missing join criteria.

What happens is, someone that's used to writing things in the ANSI '89 syntax can either forget to define the JOIN criteria or accidentally remove it while editing or something. Regardless of how it happened, you've ended up with what is known as a Cartesian product; shortest possible explanation, everything joined on everything else. For example, here's a real query against the AdventureWorks2008 database:

SELECT *
FROM Sales.SalesOrderHeader AS soh
,Sales.SalesOrderDetail AS sod
,Production.Product AS p
WHERE soh.SalesOrderID = 43659

That query does not define join criteria between any of the tables involved. This results in the following estimated execution plan (estimated because I got sick of waiting for the query to finish):

[Figure: MissingJoin – estimated execution plan with no join predicates]

If you look at the final Nested Loop operation right before the SELECT statement operator, you’ll notice there is a little exclamation point inside a yellow triangle. This is an indication of a warning in your query.  You can hover over the operator to get the tool tip, which looks like this:

[Figure: NoJoinPredicate – operator tool tip showing the warning]

You can see at the bottom the warning that no join predicate is in use. If you constructed the query appropriately it would look more like this:

SELECT *
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID
JOIN Production.Product AS p
ON sod.ProductId = p.ProductID
WHERE soh.SalesOrderID = 43659;

Which, instead of returning 800,000 rows before I interrupted it, returned only 12 rows and had a very clean execution plan:

[Figure: Join – execution plan with proper join predicates]

I would never consider “No Join Predicate” as a warning. It’s a blaring klaxon of a problem and you need to do something about it, right away.

Permalink 13 Comments

No really. When did this statement start?

July 31, 2009 at 8:07 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication)

I thought I had an interesting answer to the question. Unfortunately Adam Machanic, who has been working in this specific area quite a bit longer than I have, and, let’s face it, is a bit smarter, pointed out (in the comments) the possibility that I didn’t have this quite right.

I ran all his tests and, sure enough, it wasn't working the same way that I saw it work. First, I tried modifying his query so that it ran the SELECT statements from the DMVs between the operations, but this didn't change the results: start_time and last_request_start_time were the same. From a scientific standpoint, if no one can replicate your results, the experiment failed.

So I went back and tried again. To be sure I got things right and wasn't, somehow, querying the wrong server or looking at the wrong spid, I added a couple of checks; specifically, I queried the spid I was working with from my first query: SELECT @@SPID.

Unlike Adam's tests, I continued running mine with two connections. I wanted to see multiple entries while the test was running. The results from sys.dm_exec_requests showed what I had seen before. There were 25 rows, all of them showing the exact same start_time. The command, sql_handle, wait_time, cpu_time and all the other values that you would have expected to change did change. Elapsed time and logical_reads, for example, showed a steadily increasing value, as you would expect.

Turning to sys.dm_exec_sessions, it showed more data, 40 rows instead of 25, simply because it continued to record a session the whole time I was querying the view, since the spid wasn't dropped. I finally saw the problem. It was the way I tested. I started the query against the DMVs before I started the query that was generating the load. Because of this, the initial queries against sys.dm_exec_sessions showed a different start & end time, unrelated to the test. Then the test ran and a second set of values was generated when the test completed. What I had attributed to the time it took for individual statements to complete and then be registered was nothing of the sort. It was just the different batches finishing, not the statements.

Nuts. 100% wrong. That’ll make Tim Ford happy though.

Unfortunately I’m back to where I was. Is there a way to tell when a statement started? I don’t know of a dynamic method for doing so. Obviously you could run a trace to get the statement start event, but you’ll be collecting a lot of data depending on how you set up that trace, so it wouldn’t be a good approach for everyday troubleshooting. I’ll keep looking.

Permalink 13 Comments

When did this statement start?

July 28, 2009 at 10:19 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL)

UPDATE: This post is incorrect. Adam nailed it in the comments. I explain my mistake here.

A question came up over at SQL Server Central where someone was wondering if it was possible to know when a given statement within a batch started. Immediately I thought, oh yeah, that’s easy, use the sys.dm_exec_requests dynamic management view (DMV). Done.

Wrong. The original poster pointed out that I had assumed that the values present in the DMV represented statement level values, but they show the batch. While the DMV shows a start_time, that time is the start of the current batch, not the statement within the batch. Now the question was, where else might I get this data? I next tried sys.dm_exec_sessions because it has the last_request_start_time value. Sure enough this worked.

Don’t believe me? This time I tested it. First, I set up a batch that ran a few longish queries:

BACKUP DATABASE AdventureWorks
TO DISK = 'c:\bu\ad.bak'
WITH INIT;
DBCC CHECKALLOC(AdventureWorks);
DBCC CHECKDB(AdventureWorks);
DBCC CHECKCATALOG(AdventureWorks);
BACKUP DATABASE AdventureWorks
TO DISK = 'c:\bu\ad.bak'
WITH INIT; 

Then I wrote a little loop that would insert data from both DMV’s, once every second for a minute:

SET NOCOUNT ON;
TRUNCATE TABLE dbo.dmR;
TRUNCATE TABLE dbo.dmS;
DECLARE @CheckTime DATETIME = DATEADD(ss,60,GETDATE())
    ,@CurrTime DATETIME = GETDATE();
WHILE @CurrTime < @CheckTime
BEGIN
    INSERT INTO ... SELECT ... FROM sys.dm_exec_sessions;
    INSERT INTO ... SELECT ... FROM sys.dm_exec_requests;
    WAITFOR DELAY '00:00:01';
    SET @CurrTime = GETDATE();
END

Then it was a question of hitting the go button. The results are pretty clear. If you look at the sys.dm_exec_requests data, you'll see the start_time doesn't change, but the work being performed, visible in the command and plan_handle columns, changes. You can also see the statement_start_offset and statement_end_offset changing. These values will allow you to pull data in combination with the sys.dm_exec_sql_text() dynamic management function.

 start_time             command         statement_start_offset statement_end_offset plan_handle
2009-07-28 10:42:09.447 BACKUP DATABASE 0                      138                  0x060001007D931719B8C0BF0D000000000000000000000000
2009-07-28 10:42:09.447 DBCC            140                    206                  0x060001007D931719B8C0BF0D000000000000000000000000
2009-07-28 10:42:09.447 DBCC            140                    206                  0x06000F00AF16F10FB8C04D0D000000000000000000000000
2009-07-28 10:42:09.447 DBCC            208                    268                  0x060001007D931719B8C0BF0D000000000000000000000000

But, as you can see, the start_time stays the same. Looking at the last_request_start_time and last_request_end_time in the sys.dm_exec_sessions DMV, you can see these values changing in conjunction with the changing values in sys.dm_exec_requests.

last_request_start_time last_request_end_time
2009-07-28 10:35:21.500 2009-07-28 10:35:51.170
2009-07-28 10:42:09.447 2009-07-28 10:35:51.170 

I'll leave it as an exercise for the reader (or as another post for myself) to put together the DMVs and DMFs that would allow you to combine this data and show exactly which statement was being executed at which moment (a rough sketch of one approach follows). The point: it is possible to determine exactly which statement is currently being executed within a batch process, and when that statement started.
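For the impatient, here's a rough sketch of one way to do it, using the standard offset arithmetic against sys.dm_exec_sql_text() (the offsets are byte positions into Unicode text, hence the division by two):

SELECT der.session_id,
       der.start_time, -- start of the batch
       SUBSTRING(dest.text, (der.statement_start_offset / 2) + 1,
           (CASE der.statement_end_offset
                WHEN -1 THEN DATALENGTH(dest.text)
                ELSE der.statement_end_offset
            END - der.statement_start_offset) / 2 + 1) AS current_statement
FROM sys.dm_exec_requests AS der
CROSS APPLY sys.dm_exec_sql_text(der.sql_handle) AS dest
WHERE der.session_id > 50; -- skip most system sessions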

Permalink 5 Comments

Execution Plan Compile Termination

July 17, 2009 at 7:22 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL)

Recently I've been seeing a lot of people with bad execution plans, desperately trying to tune them, but they were unable to explain why they had such bad plans. More often than not, these were larger queries, with a number of derived tables, CTEs, CROSS APPLYs, etc. In most cases the statistics appeared to be fine (this is usually checked by comparing estimated & actual rows within the operations in the execution plan) and the plans themselves didn't look crazy, but the execution plans were flat-out not good.

If you're looking at a plan and it doesn't make much sense, there's one possibility that most people don't check: SQL Server didn't have enough time to complete optimization. The optimizer is a pretty amazing bit of code. The scary volume of work it does in frighteningly short periods of time is quite awesome. However, it doesn't always get done. To quickly check this situation you need to determine the "Reason For Early Termination Of Statement Optimization." In the graphical execution plans this is found by going to the final operator in the plan, usually an INSERT, UPDATE, DELETE, or SELECT operator. Most people know to hover the mouse over the operator & get interesting little facts, but most people forget about checking the properties. Right-click on this, or any other operator, and select Properties from the context menu. This will open the properties window, which looks something like this:

[Figure: Properties – SELECT operator properties window showing the early termination reason]

Right near the bottom you can see that this execution plan is showing an early termination reason of "Time Out." That means that the optimizer didn't finish optimizing. When the optimizer is running, it tries a whole series of different joins, join orders, filters, what have you, all through some mathematical process that I'm not even remotely qualified to describe. But if it runs out of time, it goes with the last plan it had. The last plan it had might be the worst possible plan for this query, but because it hit a timeout…

You can also find this right in the XML for the execution plan. In the Batch/Statements/StmtSimple element, find the @StatementOptmEarlyAbortReason attribute. It will show the same thing, "Time Out," when the optimizer has run out of time.
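If you'd rather hunt for these across the plan cache instead of opening plans one at a time, a query along these lines will do it (note that in the XML the value is written without the space, as "TimeOut"):

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT dest.text,
       deqp.query_plan.value('(//StmtSimple/@StatementOptmEarlyAbortReason)[1]',
           'varchar(50)') AS EarlyAbortReason
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE deqp.query_plan.exist('//StmtSimple[@StatementOptmEarlyAbortReason="TimeOut"]') = 1;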

You can get other early abort reasons, but the best one is "GoodEnoughPlanFound." That means that the optimizer was able to complete its job. That doesn't mean you have a good plan, but it means that you don't have some interim pile of junk that is doing you more harm than good.

What can you do to fix this? Write simpler queries.

OK. Maybe that’s not helpful. Try to simplify the query. You may not be able to. If not, you might try rearranging the query, changing the join order. Yes, as everyone says, join order doesn’t matter, but the full statement is “Join order doesn’t normally matter.” Sometimes it can make a real difference. You may also have to supply some query hints, join hints, etc., to try to nudge the optimizer along.

Permalink 14 Comments

Re-evaluating Execution Plans (again)

June 26, 2009 at 7:23 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL)

I keep thinking I've got a handle on the way execution plans are dealt with in SQL Server. I had a pretty simplistic view of things: there's the estimated plan, which comes out of the optimizer, and there's the actual plan, which comes out of the data engine. The one shows the best guess for what will happen based on the query and the underlying statistics. The other shows what actually happened based on the query and the real data, the load on the server, the available memory, you get the idea. This model is easy to explain and understand. Too bad it's not quite the way things work.

If you query the dynamic management function sys.dm_exec_query_plan, you can see a query plan. Once you drill down on the XML, or browse through the GUI, you’ll see that this is an estimated plan, with no execution statistics at all. The safe assumption here is that SQL Server takes the execution plan created by the optimizer and stores it. Ah, but is that exactly how it works? Can you generate an estimated execution plan for this batch statement:

CREATE TABLE dbo.TempTable (id INT IDENTITY(1, 1)
,val VARCHAR(50) ) ;
INSERT INTO dbo.TempTable
(val)
VALUES
('dude') ;
SELECT tt.*
FROM dbo.TempTable AS tt ;
DROP TABLE dbo.TempTable ;

No. You can't. If you try to generate an estimated plan you'll get the error "Invalid object name 'dbo.TempTable'." This is because the algebrizer, the part of the optimization process that resolves object names on the way to generating the estimated plan, can't resolve the table dbo.TempTable because it doesn't exist yet. However, if you execute the batch above and then run this query:

SELECT deqp.query_plan
FROM
sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE
dest.text LIKE 'create table%';

You can see the estimated plan for executing this batch. So… if the optimizer creates estimated plans, but it can’t create an estimated plan for this query since the algebrizer can’t resolve the object name… where did this estimated plan come from? Good question.

Here’s some more interesting info. You know that you can capture both estimated & actual execution plans through trace events, right? Watch what happens if you run the batch statement above and capture both the “Showplan XML” and “Showplan XML Statistics Profile” events, estimated and actual, respectively. You’ll get the estimated plan that is being stored in sys.dm_exec_query_plan and you’ll get an actual execution plan including actual rows vs. estimated, etc. Where did that execution plan come from? Another good question.

A better man than I, Tibor Karaszi, first noticed that there was, other than the execution statistics, no difference between "Showplan XML" and "Showplan XML Statistics Profile." He asked whether or not we should ever expect to see a difference in a forum post. I jumped in with both feet: of course you'll see a difference. After all, one is the estimated plan and one is the actual plan. Mr. Karaszi corrected a couple of my assumptions and suddenly, there I was, not understanding execution plans, again. I didn't know where those estimated plans were coming from, not for sure. I assumed the optimizer, but then how were they getting past the algebrizer?

I had a theory and a mechanism for testing it. If I set up a trace and capture the recompile events using SQL:StmtRecompile and return the EventSubClass column, I can see what caused a recompile. When I run this query and check the trace information, I find that the EventSubClass resolves to 3-Deferred Compile. I'm not 100% certain that I'm correct here, but the following is what I think occurs.

Because of the CREATE statement, the batch is going to fail in the optimizer. But, because it's a DDL statement, it goes on to execute anyway. Then, a recompile puts the batch back through the optimizer. There, a table is now available where none was previously. An estimated execution plan is created. This plan, previously non-existent, gets stored and can be accessed by us.

Within the facts available, this seems to explain the behavior. In fact, it's basically supported by Conor Cunningham, one of the MS SQL Server developers, in some blog posts over at Conor vs. SQL.

Permalink Leave a Comment
