I try to watch the search phrases that point people to the blog because sometimes you get a sense of what problems people are running into. One question I’ve seen a lot lately is along the lines of “how do you know if an execution plan is being reused?”
Since compiling an execution plan can be an extremely expensive operation, it’s worth your time to understand how well a given plan is getting reused. If you’ve seen me present, you’ve probably heard me talk about the application that had a query with an 86-table join. Recompiles on that thing were frequent and extremely costly. The only good news was that they were recompiles, not a fresh compile on every execution; if we hadn’t been getting any plan reuse at all, it would have been an even worse system than it was.
There are a number of ways you can see if a plan is being reused. One of the easiest is to hit the DMOs that look into the plan cache: sys.dm_exec_procedure_stats and sys.dm_exec_query_stats. These DMOs are somewhat different and somewhat the same; they sort of overlap each other. From sys.dm_exec_procedure_stats you get aggregate information about stored procedures. From sys.dm_exec_query_stats you get aggregate information about queries, which may be run within stored procedures. So either or both could be useful depending on what you’re attempting to measure. However, both return a very useful counter, execution_count. Quite simply, that tells you how many times the plan, whether for the statement or the procedure, has been reused. To get the most out of either of these, you’ll want to combine them with other DMOs. Something along these lines can show you the plans for procedures that have been running against a system, ordered by the number of times they’ve been reused:
SELECT deps.execution_count
    ,OBJECT_NAME(deps.object_id, deps.database_id) 'Procedure'
    ,deqp.query_plan
FROM sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
ORDER BY deps.execution_count DESC
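If what you’re after is individual statements rather than whole procedures, a very similar query against sys.dm_exec_query_stats does the job. This is just a sketch; to isolate the exact statement text within a batch you’d also bring in statement_start_offset and statement_end_offset, which I’ve left out for brevity:

SELECT deqs.execution_count
    ,dest.text AS QueryText
    ,deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
ORDER BY deqs.execution_count DESC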
The only thing wrong with using the DMOs like this is that you can only see what’s currently in cache, which means little or no history, depending on how volatile the cache is on your system.
Another way to tell if a plan is being reused is to set up a server-side trace and capture the SP:CacheMiss and SP:CacheHit events. This is pretty straightforward: if the plan for the query was not in cache, it generates a miss event; if it was there, it generates a hit. But the trace has to be running in order for you to know whether you had a hit or a miss. The good news is that, if it is running, you’ve got historical information, since it captures the events as they occur.
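If you want to script that trace rather than clicking through Profiler, a minimal sketch looks something like this. The output file path is an assumption (adjust it for your server), and I’m looking up the event IDs from sys.trace_events by name rather than hard-coding them:

-- Create a stopped trace that writes to a file (SQL Server adds the .trc extension)
DECLARE @TraceID int, @maxfilesize bigint = 5, @on bit = 1;
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\PlanReuse', @maxfilesize, NULL;

-- Look up the event IDs for SP:CacheMiss and SP:CacheHit by name
DECLARE @CacheMiss int, @CacheHit int;
SELECT @CacheMiss = trace_event_id FROM sys.trace_events WHERE name = 'SP:CacheMiss';
SELECT @CacheHit = trace_event_id FROM sys.trace_events WHERE name = 'SP:CacheHit';

-- Capture TextData (1), SPID (12), and StartTime (14) for both events
EXEC sp_trace_setevent @TraceID, @CacheMiss, 1, @on;
EXEC sp_trace_setevent @TraceID, @CacheMiss, 12, @on;
EXEC sp_trace_setevent @TraceID, @CacheMiss, 14, @on;
EXEC sp_trace_setevent @TraceID, @CacheHit, 1, @on;
EXEC sp_trace_setevent @TraceID, @CacheHit, 12, @on;
EXEC sp_trace_setevent @TraceID, @CacheHit, 14, @on;

-- Start the trace
EXEC sp_trace_setstatus @TraceID, 1;

-- Later, read the captured events back from the file
-- SELECT * FROM fn_trace_gettable(N'C:\Traces\PlanReuse.trc', DEFAULT);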
You can also catch recompile events using SQL:StmtRecompile. Yes, you can use SP:Recompile if you’re only ever dealing with procedures, but if you’ve got any kind of ad hoc querying going on in the system, or triggers, you’ll miss recompile events. Basically, because recompiles happen at the statement level starting in SQL Server 2005, BOL recommends only using SQL:StmtRecompile.
That’s pretty much it. Using these methods in the appropriate place will let you know if the plan is being reused or not.
I saw an odd statement the other day, “The size of the name of the parameter does not affect performance.” My first thought was, “Well, duh!” But then, I had one of those, “Ah, but are you sure” thoughts. And you know what, I wasn’t sure.
If the size of the parameter name did affect performance, I figured, the one sure place where that would be evident is in the size of the execution plan. Right? I mean, if there were an impact on memory, and hence on performance, that’s probably where you’d see evidence of it. I wrote two queries:
DECLARE @ThisIsAVeryVeryLongParameterNameThatIsTrulyRidiculousButItIllustratesThePointThatParameterLengthDoesNotAffectPerformance int
SET @ThisIsAVeryVeryLongParameterNameThatIsTrulyRidiculousButItIllustratesThePointThatParameterLengthDoesNotAffectPerformance = 572
SELECT soh.SalesOrderID
    ,sod.SalesOrderDetailID
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
WHERE soh.SalesOrderID = @ThisIsAVeryVeryLongParameterNameThatIsTrulyRidiculousButItIllustratesThePointThatParameterLengthDoesNotAffectPerformance

DECLARE @v int
SET @v = 572
SELECT soh.SalesOrderID
    ,sod.SalesOrderDetailID
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
WHERE soh.SalesOrderID = @v
If you run this against AdventureWorks2008R2 you’ll get two distinct, but identical, execution plans:
You can see that they look identical, but how do I know they’re distinct? If you run this query:
SELECT deqs.creation_time
    ,deqs.query_hash
    ,deqs.query_plan_hash
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE dest.text LIKE '%SELECT soh.SalesOrderID%'
You’ll get this back as a result:
creation_time query_hash query_plan_hash
2010-09-23 18:18:09.347 0x8D0FB9D524B8DD4D 0x13707445560737BA
2010-09-23 18:18:16.223 0x8D0FB9D524B8DD4D 0x13707445560737BA
Two distinct plans (note the different creation times), but with identical hash values, so the plans generated are nearly the same, even though they were compiled separately from different query text, including the monster parameter name. So, how do we see whether there is a difference in the generated plans that could affect performance? How about the execution plan properties? First, the property sheet for the SELECT operator of the query with the long parameter name:
Of particular note is the Cached Plan Size. Let’s compare it to the same property sheet for the query with the short parameter name:
If you compare the two, you’ll see that they’re the same. In fact, if you look at almost all the values, you’ll see that the Compile CPU, Compile Memory, and Compile Time are all identical. Based on all this information, I have to conclude that no, the size of the name of the parameter doesn’t affect performance, positively or negatively. But why?
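Incidentally, you don’t have to rely on the property sheets for this check. Here’s a quick sketch against sys.dm_exec_cached_plans that shows the size of each cached plan directly; the LIKE filter just matches the two test queries above:

SELECT decp.usecounts
    ,decp.size_in_bytes
    ,dest.text
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE dest.text LIKE '%SELECT soh.SalesOrderID%'

If the monster parameter name cost anything extra in the plan cache, you’d see it as a difference in size_in_bytes. Anyway, back to the why.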
I’m actually not 100% sure, but based on some things I know, here’s what I think. The Algebrizer within the Query Optimizer breaks down all the objects referred to within a query plan. It assigns them all values and identifiers for that plan, part of getting the information together to feed the plan into the mathematical part of the Optimizer. I’ll bet it just assigns values to parameters that are the same type of value, if not the same exact value, and that value is the same size from one execution plan to the next.
This means that you don’t save memory by naming parameters @a, @b, and @c when in fact you mean @ReferenceCount, @MaxRetries, and @BeginDate. Do I think you should use parameters of the silly length I used above? No, of course not, because it makes the TSQL code less clear. But so do equally silly, cryptically short parameter names.
Don’t make your TSQL code hard to read. It doesn’t help performance.
I won’t be going to Las Vegas, but I will be presenting to the Las Vegas SQL Server Users Group, S3OLV. The session will be “Introduction to Execution Plans.” Please swing by if you’re in the area.
I’m not sure if they’ll make the LiveMeeting available to the public or if they’ll record the session. But if they do, please attend that way too.
I had no idea, but evidently I started a bit of a donnybrook. Cool!
I’m presenting on Thursday to the Columbus SQL Server Users Group, home of the newly minted MVP, Jeremiah Peschka (congrats again). It’s to be a virtual presentation through Live Meeting. The topic is Understanding Execution Plans. This time I hope that one of us remembers to hit the record button, unlike last time, when I did this presentation for the PASS AppDev Virtual Chapter. If you’re in the area, come on down and meet me, virtually. The show runs from 6:30 to 8:30, although not all of that time is my presentation, so I suspect Jeremiah has some more fun planned.
I don’t generally do lots of blog aggregation and cross-post linking & stuff. It’s just not something I’m that into. However, this time is an exception. Gail Shaw, Gila Monster to those who hang out on SQL Server Central, has posted an excellent explanation of times when estimated and actual row counts vary for a legitimate reason. I’m one of those who emphasizes that a difference between estimated & actual row counts is an indication of… something. It could be out-of-date or missing statistics, or it could be caused by constructs like multi-statement table-valued functions, but it’s usually an indication of a problem. Except when it’s not. Read Gail’s explanation for more.
You could be looking at an execution plan for a query and see this message: Warning, No Join Predicate. With a very few exceptions (very few), that’s not a warning like “Mind the gap” or “Your Mileage May Vary” or “Never fight a land war in Asia.” No. It’s a screaming shout saying “BY ALL THE GODS IN VALHALLA, YOU DON’T HAVE ANY JOIN PREDICATES IN THIS QUERY!” Seriously, that’s exactly what it says.
But, you might be asking yourself, that’s what it says, but what does it mean? Please allow me to explain. The tables in your system, whether it’s a classic BI star schema, or a highly normalized OLTP system, or even (shudder) ORM objects, are related to one another. Usually they’re related through the use of primary and foreign keys. The primary key uniquely identifies a row, like a particular value in a dimension table. That row is then related to some number of rows in another table, like a fact table, through the foreign key. The foreign key acts to protect the integrity of the data, ensuring that if you insert something that’s supposed to relate to another row, then that other row really exists, and if you try to delete that other row, the foreign key will remind you that, hey, that row has things that depend on it existing in another table. All of this can be quite complicated or quite simple. Heck, you may be using an ORM system and not have a single foreign key in sight.
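If it helps to see that relationship in code, here’s a minimal, made-up sketch; the table and column names are invented purely for illustration:

CREATE TABLE dbo.DimCustomer (
    CustomerID int NOT NULL PRIMARY KEY, -- uniquely identifies each row
    CustomerName varchar(50) NOT NULL);

CREATE TABLE dbo.FactSales (
    SalesID int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    CustomerID int NOT NULL
        CONSTRAINT FK_FactSales_DimCustomer
        FOREIGN KEY REFERENCES dbo.DimCustomer (CustomerID), -- protects the relationship
    SalesAmount money NOT NULL);

Try to insert a FactSales row for a CustomerID that doesn’t exist in DimCustomer and the foreign key stops you; try to delete a DimCustomer row that FactSales still points at and it stops that too.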
But when you go to retrieve data from two related tables in a TSQL query, you need to define that relationship in the TSQL itself. SQL Server 2008 relies on what was defined in the ANSI standard back in 1992. But if you learned your SQL coding skills back in the day, or on Oracle, you may be used to simply listing tables and then defining their relationships through the WHERE clause. Just a side note: unless you learned your SQL skills on Oracle, if you were born AFTER the 1992 standard was enacted, you don’t have an excuse. Anyway, the old way the code looked was this:
FROM TableA AS a, TableB as b, TableC as c
WHERE a.MyID = b.MyID
AND b.YourID *= c.YourID
What you have there is an inner join, or in the old parlance an equi-join because of the use of the equals sign, between TableA and TableB, and a left outer join, defined through the use of the ‘*’, between TableB and TableC. Any other criteria that defined the WHERE clause then followed.
Flat out, in SQL Server 2008, you can’t do this any more. It will not support that style of outer join. The newfangled method, remember, new as of 1992, is this:
FROM TableA as a
INNER JOIN TableB as b
ON a.MyID = b.MyID
LEFT JOIN TableC as c
ON b.YourID = c.YourID
This forces you to separate the relationships between the tables, as defined by the JOINs, from the filtering criteria supplied through the WHERE clause. And yes, it’s a lot more complicated than I’m making it out to be, but I’m trying to establish the ground rules here. Where was I? Oh yeah, missing join criteria.
What happens is, someone who’s used to writing things in the ANSI ’89 syntax can either forget to define the JOIN criteria or accidentally remove it while editing or something. Regardless of how it happened, you’ve ended up with what is known as a Cartesian product; shortest possible explanation, every row joined to every other row. Join a 1,000-row table to a 10,000-row table with no join predicate and you get 10,000,000 rows back. For example, here’s a real query against the AdventureWorks2008 database:
FROM Sales.SalesOrderHeader AS soh
,Sales.SalesOrderDetail AS sod
,Production.Product AS p
WHERE soh.SalesOrderID = 43659
That query does not define join criteria between any of the tables involved. This results in the following estimated execution plan (estimated because I got sick of waiting for the query to finish):
If you look at the final Nested Loop operation right before the SELECT statement operator, you’ll notice there is a little exclamation point inside a yellow triangle. This is an indication of a warning in your query. You can hover over the operator to get the tool tip, which looks like this:
You can see at the bottom the warning that no join predicate is in use. If you constructed the query appropriately it would look more like this:
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID
JOIN Production.Product AS p
ON sod.ProductId = p.ProductID
WHERE soh.SalesOrderID = 43659;
Which, instead of the 800,000+ rows the first query had returned before I interrupted it, returned only 12 rows and had a very clean execution plan:
I would never consider “No Join Predicate” as a warning. It’s a blaring klaxon of a problem and you need to do something about it, right away.
I got the question the other day, when are you likely to see a spool in an execution plan? Easy, whenever SQL Server needs to walk through the data multiple times, usually in JOIN operations… Yeah, well, once again, my flip answers are not quite the entire story.
Spool operations are temporary storage of data for later reuse within a query plan. There are two types of spool operations: eager spools and lazy spools. A spool is basically a temporary table created during the execution of the query, used when it’s likely that data will be needed again, and again, during that execution. This is not an explicit #temp table, but a work table for operations within the processing necessary for a given query’s behavior. A spool is created when the optimizer thinks it can do better with a semi-permanent subset of the data rather than performing multiple seeks or scans against a table or index, or in other places where data reuse is important (more in a bit).
So how does this work? Take a look at a simple query:
UPDATE Person.Person SET FirstName = 'Ted' WHERE FirstName = 'Ted';
When the execution plan for this query is generated, it looks like this:
In this case, an eager spool is used as part of the rollback mechanism and to prevent the Halloween problem, where an update could otherwise revisit, and re-update, rows it has already modified. An eager spool is one where the data is retrieved immediately, all at once, when the operator is first called.
It’s possible to see the other type of spool in a query that looks like this (straight out of the Books Online):
WITH DirectReports (ManagerID, EmployeeID, EmployeeLevel)
AS (
    SELECT ManagerID, EmployeeID, 0 AS EmployeeLevel
    FROM HumanResources.Employee
    WHERE ManagerID IS NULL
    UNION ALL
    SELECT e.ManagerID, e.EmployeeID, EmployeeLevel + 1
    FROM HumanResources.Employee e
    INNER JOIN DirectReports d
        ON e.ManagerID = d.EmployeeID
)
SELECT ManagerID, EmployeeID, EmployeeLevel
FROM DirectReports ;
Which would result in this execution plan:
Now you see a table spool that is called a lazy spool. This means that it only loads data as the data is requested. This makes a lot of sense because the lazy spool is operating as the means for gathering the recursive data together. So it’s not going to go and get all the data available, like an eager spool. Instead it’s going to only load the data as needed, lazy.
These two scenarios are much more likely than the typical join to show a table spool. Yes, it can, and does, appear in join operations, but as I said at the beginning, that’s such a flip answer. Much better to try to be complete.
Recently I’ve been seeing a lot of people with bad execution plans, desperately trying to tune them, but unable to explain why they had such bad plans. More often than not these were larger queries, with a number of derived tables, CTEs, CROSS APPLYs, etc. In most cases the statistics appeared to be fine (this is usually checked by comparing estimated & actual rows within the operators in the execution plan) and the plans themselves didn’t look crazy, but they were flat-out not good.
If you’re looking at a plan and it doesn’t make much sense, there’s one possibility most people don’t check… SQL Server didn’t have enough time to complete optimization. The optimizer is a pretty amazing bit of code. The volume of work it does in frighteningly short periods of time is quite awesome. However, it doesn’t always get done. To quickly check for this situation you need to find the “Reason For Early Termination Of Statement Optimization.” In the graphical execution plan this is found on the final operator in the plan, usually an INSERT, UPDATE, DELETE, or SELECT operator. Most people know to hover the mouse over an operator to get interesting little facts, but most people forget about checking the properties. Right-click on this, or any other operator, and select Properties from the context menu. This will open the Properties window, which looks something like this:
Right near the bottom you can see that this execution plan is showing an early termination reason of “Time Out.” That means the optimizer didn’t finish optimizing. When the optimizer runs, it tries a whole series of different join types, join orders, filters, what have you, all through some mathematical process that I’m not even remotely qualified to describe. But if it runs out of time, it goes with the last plan it had. The last plan it had might be the worst possible plan for this query, but because it hit a time out…
You can also find this right in the XML for the execution plan. In the Batch/Statements/StmtSimple element, find the StatementOptmEarlyAbortReason attribute. It will show the same thing, a value of “TimeOut,” when the optimizer has run out of time.
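You can even hunt for these across the whole plan cache with a little XQuery. This is a sketch, assuming the attribute is serialized as “TimeOut” in the plan XML; the showplan namespace has to be declared for the path to match:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT dest.text
    ,deqp.query_plan
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE deqp.query_plan.exist('//StmtSimple[@StatementOptmEarlyAbortReason="TimeOut"]') = 1;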
You can get other early-termination reasons, but the best one is “GoodEnoughPlanFound.” That means the optimizer was able to complete its job. That doesn’t mean you have a good plan, but it does mean you don’t have some interim pile of junk that is doing you more harm than good.
What can you do to fix this? Write simpler queries.
OK. Maybe that’s not helpful. Try to simplify the query. You may not be able to. If not, you might try rearranging the query, changing the join order. Yes, as everyone says, join order doesn’t matter, but the full statement is “Join order doesn’t normally matter.” Sometimes it can make a real difference. You may also have to supply some query hints, join hints, etc., to try to nudge the optimizer along.
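As one small example of that kind of nudge, and this is a sketch rather than a recommendation, FORCE ORDER tells the optimizer to join the tables in exactly the order you wrote them, which removes a big chunk of the search space it has to wade through:

SELECT soh.SalesOrderID
    ,sod.SalesOrderDetailID
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
WHERE soh.SalesOrderID = 43659
OPTION (FORCE ORDER);

Just remember that a hint like that takes responsibility away from the optimizer and hands it to you, so test carefully.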
I keep thinking I’ve got a handle on the way execution plans are dealt with in SQL Server. I had a pretty simplistic view of things, there’s the estimated plan which comes out of the optimizer and there’s the actual plan which comes out of the data engine. The one shows the best guess for what will happen based on the query and the underlying statistics. The other shows what actually happened based on the query and the real data, load on the server, available memory, you get the idea. This model is easy to explain and understand. Too bad it’s not quite the way things work.
If you query the dynamic management function sys.dm_exec_query_plan, you can see a query plan. Once you drill down on the XML, or browse through the GUI, you’ll see that this is an estimated plan, with no execution statistics at all. The safe assumption here is that SQL Server takes the execution plan created by the optimizer and stores it. Ah, but is that exactly how it works? Can you generate an estimated execution plan for this batch statement:
CREATE TABLE dbo.TempTable (id INT IDENTITY(1, 1)
    ,val VARCHAR(50)) ;
INSERT INTO dbo.TempTable
    (val)
VALUES
    ('some value') ;
SELECT tt.id, tt.val
FROM dbo.TempTable AS tt ;
DROP TABLE dbo.TempTable ;
No. You can’t. If you try to generate an estimated plan you’ll get the error “Invalid object name ‘dbo.TempTable’.” This is because the algebrizer, part of the process the optimizer uses to generate the estimated plan, can’t resolve the table dbo.TempTable, because it doesn’t exist yet. However, if you execute the batch above and then run this query:
SELECT deqp.query_plan
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE dest.text LIKE 'create table%'
You can see the estimated plan for executing this batch. So… if the optimizer creates estimated plans, but it can’t create an estimated plan for this query since the algebrizer can’t resolve the object name… where did this estimated plan come from? Good question.
Here’s some more interesting info. You know that you can capture both estimated & actual execution plans through trace events, right? Watch what happens if you run the batch statement above and capture both the “Showplan XML” and “Showplan XML Statistics Profile” events, estimated and actual, respectively. You’ll get the estimated plan that is being stored in sys.dm_exec_query_plan and you’ll get an actual execution plan including actual rows vs. estimated, etc. Where did that execution plan come from? Another good question.
A better man than I, Tibor Karaszi, first noticed that there was, other than the execution statistics, no difference between “Showplan XML” and “Showplan XML Statistics Profile.” He asked in a forum post whether or not we should ever expect to see a difference. I jumped in with both feet: of course you’ll see a difference. After all, one is the estimated plan and one is the actual plan. Mr. Karaszi corrected a couple of my assumptions and suddenly, there I was, not understanding execution plans, again. I didn’t know where those estimated plans were coming from, not for sure. I assumed the optimizer, but then how were they getting past the algebrizer?
I had a theory and a mechanism for testing it. If I set up a trace that captures the SQL:StmtRecompile event and returns the EventSubClass column, I can see what caused a recompile. When I run the batch and check the trace information, I find that the EventSubClass resolves to 3 - Deferred Compile. I’m not 100% certain that I’m correct here, but the following is what I think occurs.
Because of the CREATE statement, the batch is going to fail in the optimizer. But, because it’s a DDL statement, it goes on to execute anyway. Then a recompile puts the batch back through the optimizer. This time a table is available where none was previously, and an estimated execution plan is created. That plan, previously non-existent, gets stored, and that’s what we can access.
Within the facts available, this seems to explain the behavior. In fact, it’s basically supported by Conor Cunningham, one of the Microsoft SQL Server developers, in some blog posts over at Conor vs. SQL.
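By the way, if you want to verify the EventSubClass values for yourself rather than trusting my memory, the trace metadata is exposed through catalog views. Something like this should list the subclasses for SQL:StmtRecompile, including the 3 - Deferred Compile value mentioned above:

SELECT tsv.subclass_value
    ,tsv.subclass_name
FROM sys.trace_subclass_values AS tsv
JOIN sys.trace_events AS te
    ON tsv.trace_event_id = te.trace_event_id
JOIN sys.trace_columns AS tc
    ON tsv.trace_column_id = tc.trace_column_id
WHERE te.name = 'SQL:StmtRecompile'
    AND tc.name = 'EventSubClass';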