More Spatial Headaches

June 11, 2009 at 2:38 pm (spatial data, SQL Server 2008, SQLServerPedia Syndication, TSQL) (, , , , )


I keep thinking I’ve got a handle on spatial data and I keep finding out I’m wrong. I rewrote the cursor that my co-worker used into a query that joined the list of test criteria onto our list of locations. I used an OUTER APPLY to more or less run a check for each of the test criteria since, except for the proximity to the locations, there’s no actual relationship between the test criteria and the location data for me to join on.

The query looked something like this:

SELECT a
.CriteriaDesc
,a.CriteriaLoc
,l.[Location].Lat AS LocationLatitude
,l.[Location].Long AS LocationLongitude
,l.LocationDesc
FROM dbo.Alert AS a
OUTER APPLY (SELECT x.[Location]
		FROM dbo.MyLocs x
		WHERE x.OrgID = 42
		AND x.[Location].STDistance(a.AlertLocation) < 50000) AS l

The cursor was taking almost a full minute to return just under 400 rows worth of data. This query took about 25 seconds. SCORE!

Unfortunately when I looked at the execution plan, the optimizer was not able to recognize that my spatial index would increase the performance. So I modified the query to provide a hint:


FROM dbo.MyLocs x WITH (INDEX(xsHigh))

Now the query returned in 5 seconds. BIG SCORE! Then I remembered my co-worker and his low density index. So I quickly rebuilt it and changed the INDEX hint to use the low density index. It ran in just over 2 seconds. Yeah, it’s a score, but why? I thought the high density index was the way to go with point locations. I decided I just wanted to see the middle behavior and see if a medium density index took 3.5 seconds about mid-way between the low and high density indexes… It ran in just over 1 second. WHAT THE F…

I then ran a whole series of experiments with my query. I kept changing the density and the cells per object value for the three indexes and recording times. In every test, the medium density index outperformed the low and high density indexes. It never outperformed the low density index by a whole lot, usually running ~50% faster as opposed to running 5 times faster than the high density index. But it really did run better.

My current working theory, and I’m trying to figure out how I can validate this, is that the low and medium density indexes were more efficient at eliminating the negative values, the places where the criteria wasn’t in proximity to my locations. Then, the medium density index was just better at retrieving the data than the low density index. I base this on the fact that the primary filter efficiency on the medium density index is  better than the low (although not as good as the high).

2 Comments

  1. Desinderlase said,

    Point data will always be covered with a single cell on the highest level.

    Your query window is a.AlertLocation.STBuffer(50000) and it can have better coverage for lower grid density, since your query window is quite big. Use sp_help_spatial_geography_index to see how query window is covered with all four levels.

    If you have only points in a table, it’s better to optimize index density for the query window.

  2. scarydba said,

    Thanks for the info. I do use sp_help_spatial_geography_index to observe the application of the index, thanks.

    In this case, quite a lot of testing suggests that medium density is working better than low density which works better than high density.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: