Spatial Data Hurts My Brain

May 27, 2009 at 1:03 pm (spatial data, SQLServerPedia Syndication, TSQL) (, , , )

I’m still barely scratching the surface working with spatial data in SQL Server 2008. We’ve ported some of the data into a table where we built a geography spatial data column and we’re begginning to work with point data. The requirements from the developers are, so far, very simple. They’ll feed me a point and I find all the locations “close” to it. We had to go round & round on what defines “close” but finally settled on, I think, 15km.

The query to answer a question like this is ridiculously simple (a few object names have been changed):

SELECT ebe.[Location].STDistance(@Location) AS Distance,
ebe.[InterestId],
ebe.[Location].Lat AS Latitude,
ebe.[Location].Long AS Longitude,
ebe.[OrgId]
FROM dbo.[ebe] AS ebe
WHERE ebe[OrgId] = @OrgId
AND ebe.[Location].STDistance(@Location) < @CloseDistance

I’m not even hard-coding the “close” value so it can change when they change their minds. It retrieves exactly what’s needed. But… Well, look at the STATISTICS IO & TIME:

Table 'ebe'. Scan count 3, logical reads 40179
Table 'Worktable'. Scan count 0, logical reads 0
CPU time = 376 ms, elapsed time = 373 ms.

Not exactly snappy, is it? So, the obvious answer is to provide an index it can use so that it doesn’t have to scan the clustered index. Spatial indexes only support certain functions with the spatial column, STIntersects, STEquals, STDistance. Since I’m dealing with STDistance, I have a shot at this working. Here’s the script:

CREATE SPATIAL INDEX [ebe_spatial] ON [dbo].[ebe]
(
[Location]
)
USING GEOGRAPHY_GRID
WITH
( GRIDS =( LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
CELLS_PER_OBJECT= 128,
PAD_INDEX = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]

When I then run the query, well, here’s the STATISTICS IO & TIME again:

Table 'Worktable'. Scan count 0, logical reads 0
Table 'extended_index_469576711_384000'. Scan count 2076
Table 'ebe'. Scan count 0, logical reads 448064
Table 'Worktable'. Scan count 0, logical reads 0
CPU time = 1155 ms, elapsed time = 1393 ms.

You read that correctly, it used the spatial index which caused the performance to decrease. There are adjustments you can make to spatial indexes. You can change the number of cells per object or set the detail level on the grid, but it’s hard to understand what this does to the index.

After fighting with it for a bit, I sent a tweet out on twitter, just whining about the index causing a slow-down. Surprised as anything, I get a response from Paul Randall, uh, wow, suggesting an approach. This is responded to by Isaac Kunen, you know the Microsoft PM & spatial expert. You just have to love Twitter. I took the information they gave me and did a few searches on the web. This lead to Bob Beauchemin’s blog and my introduction to some new Dynamic Management Views, sp_help_spatial_geometry_index and sp_help_spatial_geography_index. Now we’re talking. A mechanism to identify how useful the spatial data index is, similar to looking at sys.dm_db_index_physical_stats to see information about a clustered index. This lead to the following query and output:

DECLARE  @LocationGEOGRAPHY 
SET @Location = GEOGRAPHY::STPointFromText('POINT (-86.674582 36.148843)',4326)
EXEC sp_help_spatial_geography_index @tabname = 'EWH_BIM_Extract2'
,@indexname='ebe_spatial'
,@verboseoutput = 1
,@query_sample = @Location

The question is, how efficient is the index? In my case, the Primary_Filter_Efficiency was listed as 3.33333. Yes, that’s 3% efficient. No wonder it killed the performance. What do I do to make the index more efficient? I haven’t a clue. That may be the next blog post. But at least now I know how to evaluate the usefulness of a spatial index.

Permalink 13 Comments

How Can PowerShell Help You as a DBA

May 27, 2009 at 9:02 am (PowerShell)

Buck Woody has an article up on the Microsoft Scripting web site (which I didn’t know about before today) on how PowerShell will help you as a DBA. That’s a must read for every DBA.

Permalink 2 Comments

WHOOOP!

May 27, 2009 at 8:59 am (PASS, SQL Server 2005, SQL Server 2008, TSQL) (, , , , )

I just got the good word, I submitted an abstract for a spotlight session at the PASS Summit and it was accepted. Jazzed doesn’t begin to describe it. If you’re interested in hearing what I think about Best Practices for working with Execution Plans, please swing by.

Permalink 16 Comments