Database Screening Questions

April 30, 2009 at 1:58 pm (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL) (, )

With all the cool kids posting about beginners and interview questions, I thought I’d toss my favorites out there, from the brief-case gang point of view. These are the technical phone-screening questions I use after I look at a resume. There are only 10. They’re simple. Stupid simple. Silly even. Yet, I can count on eliminating 4 out 5 people who have a resume that looks like a qualified DBA. I’ve seen people with 10 years experience fail on these questions.

I’m only going to provide the questions. If you can’t find the answers on your own, you’re already disqualified:

  1. What is the difference between a clustered and non-clustered index?
    No, don’t tell me that one is clustered and the other is not. I don’t need specific low-level information on this, just a demonstration of knowledge that the difference is understood.
  2.  What is the difference between a block (b – l – o – c – k) and a deadlock (d – e – a – d – l – o – c – k)?
    And yes, I spell the words. I don’t want any chance of misunderstanding. And yet, most people carefully explain what a block is and then carefully explain what a block is again.
  3. Can you tell me two of the three recovery models in SQL Server and what the difference between them is?
    Again, I don’t need to know what happens differently inside of the checkpoint operation, just tell me what’s different. Yeah, I only ask for two since almost everyone only uses one of the two.
  4. Can you tell me a few things that might cause a query stored in cache to recompile?
    Let me tell you that, yes, they do. I’ve had several people argue with me on that question.
  5. What do you think the query hint NO_LOCK might do?
    This should be a give away. I’m not asking for specifics. I’m assuming you don’t know. Why would you say “I have no idea” to a question like this?
  6. Can you tell me some of the various types of backups that are available in SQL Server?
    If you give me three, I’ll be overjoyed. I need at least two.
  7. How did error handling change in SQL Server 2005?
    Note, not how do you write error handling based on the change in 2005, just, what was the change. I need to know you’re aware there was one.
  8. Do you have any experience working with [latest hot topic] inside SQL Server?
    Our latest is Microsoft Dynamics CRM. We’ve also asked the question about XML and other stuff. It’s just an attempt to understand you. Talk about what you know or don’t know.
  9. Do you have experience with Version X of SQL Server?
    Now I ask about 2008, but before I asked about 2005. “No” is a perfectly acceptable answer. “I’ve never heard of it” or “That’s not out yet” or “No, but I have lots of experience with 2009” are pretty much disqualifiers. Broke my rule about no answers there, but I hate seeing people get this one wrong.
  10. You’re the DBA. The phone rings. One of the users is on the line. They say “The database is slow.” Then they hang up. What do you do?
    My favorite was the guy who wanted to track down the user in order to get his name and his managers name and to fill out a series of forms before he’d even consider the technical aspects of the question. This is the only open-ended question I ask for screening.

Preparing this I went back through my notes. I keep notes on every interview. It’s creepy. Page after page of people who can’t answer even four of these questions. We only want you to correctly answer six before we bring you in for an interview.

So, if you’ve got five or ten years experience as a DBA and you think this was a tough quiz… time to evaluate what you’ve been doing. If you’re just starting out, here are some of the basics that it might be nice to know.

Permalink 32 Comments

Tim Ford’s Top 5 Indexing Best Practices

April 27, 2009 at 8:31 am (SQL Server 2005, SQL Server 2008, TSQL) (, , )

All I can really add to this is, yeah, me too. If you want some absolutely great advice on indexes, read this post. It’s a must.

And might I add, I’ve been the bad guy in Tim’s example. Once, many, many years ago, I was reading from the SQL Server 7.0 documentation. It suggested that compound indexes were no longer needed since the optimizer could build them on the fly using index intersection. I had a performance problem and a consultant was telling me to use a compound index. I swore up and down it wouldn’t work because Microsoft said so. He kept pushing and I kept pushing back. Finally, after a rather heated discussion in which I was convinced I had the upper hand, I got off the phone resolved to show this “ID 10 T” he didn’t know what he was talking about… Let me just say that after running some tests I did NOT enjoy the next phone call. Crow really tastes nasty.

Great post Tim. I’m looking forward to the book.

Permalink 1 Comment

Unpacking the View

April 24, 2009 at 9:16 am (SQL Server 2005, SQL Server 2008, SQLServerPedia Syndication, TSQL) (, , , , )

A view is simply a query that behaves something like a table. Most people know this. Most people also know that a view is simply a mask on top of what might be a very complex query. It all seems really simple. You call the view inside a simple query, the view runs the underlying complex query. Most people might not know that when a view is called and it gets sent to the optimizer, the optimizer unpacks the view and binds the component parts of the query necessary to create an execution plan that will return the data requested. What I didn’t know until recently was that the optimizer is VERY smart. Not only does it unpack the query of the view, but it will change the query that the view uses. Let’s take an example (using AdventureWorks2008):


CREATE VIEW dbo.vTest
AS
 SELECT soh.AccountNumber
,sod.LineTotal
,(SELECT COUNT(pcc.BusinessEntityId)
FROM Sales.PersonCreditCard AS pcc
WHERE pcc.CreditCardID = soh.CreditCardID
) AS PersonCreditCard
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID ;
GO

Whether or not this query is a good one (I’d say it isn’t) and whatever you might think about it, the view definition here is pretty straight forward. If you run a SELECT against this view and take a look at the execution plan, it looks like this:

vtestexecutionplan4

As you can see, the behavior is completely as defined above. The view has been resolved out to it’s component parts in order to build the query and arrive at the data requested. What happens if, instead of selecting all the columns, only a couple are selected? Well, the query that defines the view is not changing, so the optimizer will arrive at the same execution plan as that shown in Figure 1, right? Run this query and get the execution plan:

SELECT
 vt.AccountNumber
,vt.LineTotal
FROM
 dbo.vTest AS vt;

2columexecutionplan

What the heck just happened?

I told you, the optimizer is smart. Compare that execution plan with the one generated from this query:

 SELECT soh.AccountNumber
,sod.LineTotal
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID ;

You’ll find that they’re identical. As stated before, views are just a query stored on the side. The optimizer is smart enough to figure out that it only needs to generate a plan as if the query requested was this. But what happens when you start nesting views? Yes, yes, Yeah… I… Ok… But… YES! That’s not a good idea. So? When has that slowed people down for a New York second? I’ve seen views nested three and four layers deep. I’ve seen table valued user defined functions nested six and seven layers deep. The question is, how well does the optimizer handle something like this? This script sets up a new situation:

CREATE VIEW dbo.vSalesInfo
AS
SELECT soh.AccountNumber
,sod.LineTotal
,soh.CreditCardID
,sod.ProductID
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS  sod
ON soh.SalesOrderID = sod.SalesOrderID ;

CREATE VIEW dbo.vTest3
AS
SELECT vsi.AccountNumber
,vsi.LineTotal,
(
SELECT COUNT(pcc.BusinessEntityId)
FROM Sales.PersonCreditCard AS pcc
WHERE pcc.CreditCardID = vsi.CreditCardID) AS PersonCreditCard
,(SELECT COUNT(v.BusinessEntityId)
FROM Purchasing.Vendor v) AS VendorCount
,(SELECT COUNT(a.AddressId)
FROM Person.Address AS a) AS AddressCount
,(SELECT COUNT(be.BusinessEntityID)
FROM Person.BusinessEntity be) AS BusinessEntityCount
FROM dbo.vSalesInfo AS vsi

From the script you can see that I’ve nested one view inside another and complicated the outer a view a bit. What happens when you run the query that would eliminate tables from the view this time?

 SELECT vt.AccountNumber
,vt.LineTotal
FROM dbo.vTest3 AS vt

 3rdtest

See. The optimizer is very smart. Ah, but before you get all comfy and crack a bottle of some frothy beverage, we need to look at the compile times. Compare the STATISTICS TIME output from a query that pulls from the full view, the limited pull from the view and the pull from the tables:

–View 1
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.

–Table
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.

–View 2
SQL Server parse and compile time:
CPU time = 8 ms, elapsed time = 8 ms.

These are small views and easy queries, so the compile times are not large. But with one consistently 30% larger than the others in terms of execution time and actually requiring CPU cycles when the others require less, you begin to see that, despite the extreme intelligence of the optimizer, there is a cost. TANSTAAFL still applies. Keep all this in mind if you think that nested views are a nifty way to write your code or you’re trying to convince someone that nested views are not a nifty way to write their code.

Permalink 12 Comments

Spatial Data

April 21, 2009 at 7:33 am (spatial data, SQL Server 2008, TSQL) (, )

I work for an insurance company. If you think that maybe, we might be interested in the physical location of the things we insure, you’d be right. Actually, we’re an insurance company predicated on the idea that risk can be managed. That means that not only do we know where your factory is located. We know the wind zone, earthquake zone, flood zone, rain zone and temperature zone it’s in. We send engineers out to the site to inspect it and recommend upgrades. We track the upgrades and the condition of your facility.

With all that location specific information, just how important do you think it is that with SQL Server 2008 we’re finally getting a spatial data type? Yeah, exactly.

We’re in the process of launching our first full implementation of the spatial data type and, quite frankly, I was not ready. I had read several of the BOL entries and a few articles online, but nothing had given me enough information for me to say I understood how spatial data worked. Now I at least feel like I’ve got a basis for understanding. That’s because of Alistair Aitchison‘s book Beginning Spatial.

This was not an easy read for me. The first few chapters are frankly difficult. The concepts behind spatial data are not exactly simple. However, Mr. Aitchison did a great job of presenting the information in well explained, digestible chunks. I was able to get through. Coming out the other side, I’ve still got a lot of questions and concerns (especially around spatial indexes, he could have done another three or four chapters on just that topic), but I’m ready to support our project going forward now.

If you’re dealing with Spatial data and you’re not already a guru on the topic, I’d strongly suggest you pick this up. If you’re an expert, this book isn’t for you. There are plenty of examples and Mr. Aitchison walks you through some important concepts, such as importing spatial data. Did you realize that Microsoft didn’t include any mechanisms for dealing with in SSIS? Instead you either need to get creative with TSQL or, better still, use a third party product. I got a copy of Shape2SQL and I’ve tried it out. It seems to do what we need. We still might end up buying a commercial product (several were listed in the book). But it was the information in the book that told me what I needed to know to get started working with our spatial data.

I need to track down more info on spatial indexes now.

Permalink Leave a Comment

Pushing Connect

April 17, 2009 at 7:28 am (PASS) (, , , )

I have posted multiple times that I think Microsoft Connect is one of the best tools you can use to communicate with Microsoft. It works. Buck Woody is reinforcing my position from the Microsoft side of the fence.

Look, Microsoft is not a small nimble company like Red Gate. And again, unlike Red Gate, you might not find yourself in a conversation with the CEO of Microsoft on the floor of a conference like the PASS Summit. And short of going to PASS or TechEd to track down some of the developers and project managers (many of whom attend these conferences), you need to have a mechanism to communicate with the company. You can try standing on your front door step and screaming (trust me, the neighbors just call the police) or you can go to Connect.

Speaking of the PASS Summit, the Call for Speakers is open for a few more days. Please, don’t go over there and submit fantastic sessions that all of us will benefit from. Please, sit on your hands and ignore me. That way there’s less competition and I’ll get to present!

Permalink Leave a Comment

SQL Quiz Part 4

April 16, 2009 at 7:39 am (Misc, SQLServerPedia Syndication)

Thanks SQLBatman. It’s not like I need to do work or something.

Who has been a great leader in your career and what has made them a great leader?

First, I need to define what I understand a leader to be and then I’ll see if I have any great ones on the list. There are a lot of ways to define what a leader is. You could say simply the managers, team leads, CEO’s and CIO’s that you’ve worked for or with are leaders. But I don’t see that. I think of a leader and I see Henry V and the St. Crispin’s day speech, “…And gentlemen in England now-a-bed shall think themselves accurs’d they were not here, and hold their manhoods cheap whiles any speaks that fought with us upon Saint Crispin’s day.” I think of a leader as the guy who stands up and says, “Hey, let’s go to hell,” and you fall into line behind the guy and off you go. Now, I realize that we’re working in IT and trips to hell, service pack rollouts not withstanding, are not exactly a part of the job description. But, I still have this idea of someone who makes suggestions, gives guidance, and simply commands through his prescence, not because he’s been with the company the longest or was assigned as the lead by someone else with institutional power. Personal power makes a leader. The person who will end up in charge wherever they go and whatever they do. And when they’re in charge, you’ll follow them, not blindly like an idiot, but because they demonstrate qualities that makes you believe that they’re going to take you into hell and back out, possibly in one piece.

So, who has been a great leader in my career? That’s really hard. I have worked with some simply amazing people. I’ve worked for some terrific managers. I’ve also worked with and for people that were wasting perfectly good oxygen. I have one guy that really stands out as being a simply outstanding leader. I say this quite simply because if I got a phone call from him tomorrow asking me to come to work with him, I’d probably go. Vipul Minocha.

Who?

Vipul Minocha.

I only worked for Vipul for about 10 months back in 1998-99. We were working at a start-up that has subsequently failed. Vipul was the head of development and I was one of his VB developers. Vipul was technically smart as hell. He knew programming languages and databases and business management. But he didn’t stand up and declaim his knowledge. Instead he stayed out of the way and cleared paths for the rest of us to get our jobs done. He’d review our work every so often and suggest ways we could do things better, approaches we hadn’t thought of, other stuff. He was really great.  The company on the other hand was a flipping nightmare. In the first four months I worked there we went through two different dba’s. A few weeks after the last one had left I stormed into Vipul’s office, mad as hell. I’d just run into a whole bunch of database problems on the code I was working on. I listed all the stuff that was wrong and what should be done to fix it and I asked Vipul where the heck a DBA was coming from… Yeah. I walked out of the office with a mandate and a new mission. Vipul really helped out with a lot of my first stumbling steps as a DBA. He could have done the job better than I did, but he was actively growing me into something new, a development DBA. Unfortunately, the stresses of working for the start-up were too much AND another start-up offered me a ridiculous amount of money to work for them as their cheif DBA (meaning, only, at first), so I left. I understand that Vipul has gone on to found a consulting company. It’s actually quite a ways away from where I live or might have been tempted to look him up.

Vipul was a great leader because while he set direction, he didn’t micro-manage. He let you swim on your own, but he never let you drown. He had an agenda and a set of goals that he sold to all of us. We followed him willingly because it was pretty clear he was going to get where he wanted to go, either with us, or without us. He encouraged learning and experimentation, but never lost sight of the end-point. He was a real joy to work with and for. It’s a shame we were in such a wretched environment with a bunch of total jerks (who, by the way, undermined Vipul on multiple occasions, they were the opposite of great leaders) because I ended up working more for them than for Vipul. Still, it was a heck of an experience and he set a very high bar that I couldn’t clear at the time, but it gave me something to strive for.

That’s enough, I hope. Time to inflict this on others.

My favorite target: Gail Shaw
I think he’ll have something interesting to say as always: Jack Corbett
Because I’m really interested in his answer: Andy Warren

Update: Apparently I got tagged by Brent Ozar too. And he has already tagged Gail. So, TJay, you’re up.

Permalink 4 Comments

Publishing at PASS

April 14, 2009 at 7:52 am (PASS, SQLServerPedia Syndication) (, , , )

I’ve been working with Andy Warren on several things for PASS through the editorial committee. Some of the work is available now. First, I’ve searched out all the online SQL Server communities that I could track down. I gathered information about the community and whether or not they published articles and what the contact for publication is. All the information about the online communities is gathered in one place. Please look them over. If I missed someone’s favorite community, please let me know. I’ll update it. If I’ve got some bad information up there, again, pass the word and I’ll get the changes in. If you’re looking for somewhere to go for help, support, or just someone to talk to, I think you can find something to suit you there.

PASS is posting Top 10 lists. I put together the top 10 articles on execution plans that I knew about (and I only listed myself once). Jack Corbett, Jessica Moss and Andy Warren also posted some. If you have an idea for a top 10 list, you can get it published there. Just email to address provided. There’s some seriously good reading all gathered in one spot.

There’ll be more coming out of the editorial committee soon.  You can read about some of it here on Andy’s blog.

Permalink Leave a Comment

Paul Randal’s Database Size Survey

April 13, 2009 at 7:35 am (SQL Server 2005, SQL Server 2008, TSQL) (, , , , )

If you have three minutes to spare, swing by Paul Randal’s blog and answer his survey questions about the size and distribution of your database. The results are very interesting. I was most interested in the number of respondents to each of the questions.  As each size category switched, fewer and fewer people responded. However, a lot more people responded than I expected. 94 last I looked had databases under 10gb in size, but 42 had databases over 1tb. Yeah, that’s only 1/2, but, holy cow, it’s 1/2.

I wish I had a database to manage that was over 1tb. Back in the 7.0/2000 days I was at a dot com that was getting close. When I left they had 700gb. I understand they got close to 850 before the company folded. Managing that much data in SQL Server 7 and 2000 was like wrestling a very serious bear. I’d love to see what it’s like now.

Permalink 4 Comments

SQL Server Execution Plans Published… Again

April 11, 2009 at 8:59 am (SQLServerPedia Syndication, TSQL) (, , )

My first book is finally in print. OK. I know. It sounds funny. But my first book was printed in only a limited print run from Red Gate, most of which they gave away at Tech Ed last year. Then they offered it for free in an electronic form. Very few people got a printed copy. Well, if you were waiting around for the dead tree version, it’s here!

Thanks again to Tony Davis & Brad McGehee for all the work they did. It just wouldn’t have been possible without them.

Permalink Leave a Comment

Call for Speakers Extension Explanation

April 9, 2009 at 8:46 am (Uncategorized) (, , , )

It’s not nearly as fun as my own speculations were, but it’s still interesting. Andy Warren explains that the because Microsoft cancelled the 2009 BI Conference and recommended the PASS Summit as an alternative, PASS wisely decided to extend the time to try to get those people to submit abstracts.

Permalink Leave a Comment

Next page »