I keep thinking I’ve got a handle on spatial data and I keep finding out I’m wrong. I rewrote the cursor that my co-worker used into a query that joined the list of test criteria onto our list of locations. I used an OUTER APPLY to more or less run a check for each of the test criteria since, except for the proximity to the locations, there’s no actual relationship between the test criteria and the location data for me to join on.
The query looked something like this:
SELECT a .CriteriaDesc ,a.CriteriaLoc ,l.[Location].Lat AS LocationLatitude ,l.[Location].Long AS LocationLongitude ,l.LocationDesc FROM dbo.Alert AS a OUTER APPLY (SELECT x.[Location] FROM dbo.MyLocs x WHERE x.OrgID = 42 AND x.[Location].STDistance(a.AlertLocation) < 50000) AS l
The cursor was taking almost a full minute to return just under 400 rows worth of data. This query took about 25 seconds. SCORE!
Unfortunately when I looked at the execution plan, the optimizer was not able to recognize that my spatial index would increase the performance. So I modified the query to provide a hint:
FROM dbo.MyLocs x WITH (INDEX(xsHigh))
Now the query returned in 5 seconds. BIG SCORE! Then I remembered my co-worker and his low density index. So I quickly rebuilt it and changed the INDEX hint to use the low density index. It ran in just over 2 seconds. Yeah, it’s a score, but why? I thought the high density index was the way to go with point locations. I decided I just wanted to see the middle behavior and see if a medium density index took 3.5 seconds about mid-way between the low and high density indexes… It ran in just over 1 second. WHAT THE F…
I then ran a whole series of experiments with my query. I kept changing the density and the cells per object value for the three indexes and recording times. In every test, the medium density index outperformed the low and high density indexes. It never outperformed the low density index by a whole lot, usually running ~50% faster as opposed to running 5 times faster than the high density index. But it really did run better.
My current working theory, and I’m trying to figure out how I can validate this, is that the low and medium density indexes were more efficient at eliminating the negative values, the places where the criteria wasn’t in proximity to my locations. Then, the medium density index was just better at retrieving the data than the low density index. I base this on the fact that the primary filter efficiency on the medium density index is better than the low (although not as good as the high).