How do I efficiently search for all the landmarks within a range of a certain landmark?
up vote
14
down vote
favorite
I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.
So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.
Is there a better way to do that?
Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.
Could the Google Maps API help?
c# application-design geolocation google-maps
add a comment |
up vote
14
down vote
favorite
I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.
So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.
Is there a better way to do that?
Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.
Could the Google Maps API help?
c# application-design geolocation google-maps
5
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
3
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
1
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17
add a comment |
up vote
14
down vote
favorite
up vote
14
down vote
favorite
I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.
So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.
Is there a better way to do that?
Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.
Could the Google Maps API help?
c# application-design geolocation google-maps
I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.
So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.
Is there a better way to do that?
Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.
Could the Google Maps API help?
c# application-design geolocation google-maps
c# application-design geolocation google-maps
edited Nov 6 at 10:41
Peter Mortensen
1,11621114
1,11621114
asked Nov 5 at 12:11
Dario Granich
365310
365310
5
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
3
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
1
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17
add a comment |
5
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
3
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
1
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17
5
5
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
3
3
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
1
1
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17
add a comment |
4 Answers
4
active
oldest
votes
up vote
11
down vote
accepted
Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.
There is an existing StackOverflow answer that discusses this in-depth.
A basic query to find the nearest 7 items:
USE AdventureWorks2012
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);
A basic query to find everything within 100m (second answer to the question)
-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something
-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
|
show 6 more comments
up vote
30
down vote
Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).
If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
add a comment |
up vote
11
down vote
Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.
- PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)
- SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017
- Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm
- MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html
What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):
- PostGIS:
ST_DWithin
- SQL Server:
STDistance
(Not clear that index use on the 3D geography version of this function is supported) - Oracle:
SDO_WITHIN_DISTANCE
(This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply anSDO_FILTER
to get it to use the index.) - MySQL: Still figuring this out.
Workaround for triggering index usage
In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin
above does internally anyway.
Distance in GIS
While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)
The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.
This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.
add a comment |
up vote
2
down vote
I would agree that if possible using specific support in a database would be the most sensible way to do this.
However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index onx
andy
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used forBETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
– jpmc26
Nov 5 at 21:04
1
@jcaron actually you can use index for something likey between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task
– Juan Carlos Oropeza
Nov 5 at 21:04
|
show 2 more comments
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
11
down vote
accepted
Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.
There is an existing StackOverflow answer that discusses this in-depth.
A basic query to find the nearest 7 items:
USE AdventureWorks2012
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);
A basic query to find everything within 100m (second answer to the question)
-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something
-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
|
show 6 more comments
up vote
11
down vote
accepted
Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.
There is an existing StackOverflow answer that discusses this in-depth.
A basic query to find the nearest 7 items:
USE AdventureWorks2012
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);
A basic query to find everything within 100m (second answer to the question)
-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something
-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
|
show 6 more comments
up vote
11
down vote
accepted
up vote
11
down vote
accepted
Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.
There is an existing StackOverflow answer that discusses this in-depth.
A basic query to find the nearest 7 items:
USE AdventureWorks2012
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);
A basic query to find everything within 100m (second answer to the question)
-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something
-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100
Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.
There is an existing StackOverflow answer that discusses this in-depth.
A basic query to find the nearest 7 items:
USE AdventureWorks2012
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);
A basic query to find everything within 100m (second answer to the question)
-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something
-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100
answered Nov 5 at 12:24
Flater
5,22111019
5,22111019
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
|
show 6 more comments
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
11
11
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
@KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
– Flater
Nov 5 at 14:58
2
2
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
Where did you read "MS SQL Server" in the question?
– Doc Brown
Nov 5 at 14:58
3
3
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
@Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
– Konrad Rudolph
Nov 5 at 15:09
2
2
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
@jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
– Flater
Nov 6 at 7:22
3
3
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
– jpmc26
Nov 6 at 10:16
|
show 6 more comments
up vote
30
down vote
Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).
If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
add a comment |
up vote
30
down vote
Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).
If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
add a comment |
up vote
30
down vote
up vote
30
down vote
Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).
If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.
Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).
If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.
answered Nov 5 at 12:32
amon
81.3k21153239
81.3k21153239
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
add a comment |
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
4
4
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
There is also a GIS stackexchange
– BlueRaja - Danny Pflughoeft
Nov 5 at 15:11
8
8
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
– jpmc26
Nov 5 at 16:41
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
@amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
– mgarciaisaia
Nov 6 at 11:51
add a comment |
up vote
11
down vote
Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.
- PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)
- SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017
- Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm
- MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html
What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):
- PostGIS:
ST_DWithin
- SQL Server:
STDistance
(Not clear that index use on the 3D geography version of this function is supported) - Oracle:
SDO_WITHIN_DISTANCE
(This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply anSDO_FILTER
to get it to use the index.) - MySQL: Still figuring this out.
Workaround for triggering index usage
In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin
above does internally anyway.
Distance in GIS
While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)
The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.
This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.
add a comment |
up vote
11
down vote
Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.
- PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)
- SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017
- Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm
- MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html
What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):
- PostGIS:
ST_DWithin
- SQL Server:
STDistance
(Not clear that index use on the 3D geography version of this function is supported) - Oracle:
SDO_WITHIN_DISTANCE
(This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply anSDO_FILTER
to get it to use the index.) - MySQL: Still figuring this out.
Workaround for triggering index usage
In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin
above does internally anyway.
Distance in GIS
While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)
The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.
This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.
add a comment |
up vote
11
down vote
up vote
11
down vote
Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.
- PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)
- SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017
- Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm
- MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html
What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):
- PostGIS:
ST_DWithin
- SQL Server:
STDistance
(Not clear that index use on the 3D geography version of this function is supported) - Oracle:
SDO_WITHIN_DISTANCE
(This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply anSDO_FILTER
to get it to use the index.) - MySQL: Still figuring this out.
Workaround for triggering index usage
In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin
above does internally anyway.
Distance in GIS
While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)
The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.
This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.
Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.
- PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)
- SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017
- Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm
- MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html
What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):
- PostGIS:
ST_DWithin
- SQL Server:
STDistance
(Not clear that index use on the 3D geography version of this function is supported) - Oracle:
SDO_WITHIN_DISTANCE
(This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply anSDO_FILTER
to get it to use the index.) - MySQL: Still figuring this out.
Workaround for triggering index usage
In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin
above does internally anyway.
Distance in GIS
While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)
The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.
This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.
edited Nov 6 at 23:56
answered Nov 5 at 17:00
jpmc26
3,81011735
3,81011735
add a comment |
add a comment |
up vote
2
down vote
I would agree that if possible using specific support in a database would be the most sensible way to do this.
However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index onx
andy
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used forBETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
– jpmc26
Nov 5 at 21:04
1
@jcaron actually you can use index for something likey between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task
– Juan Carlos Oropeza
Nov 5 at 21:04
|
show 2 more comments
up vote
2
down vote
I would agree that if possible using specific support in a database would be the most sensible way to do this.
However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index onx
andy
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used forBETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
– jpmc26
Nov 5 at 21:04
1
@jcaron actually you can use index for something likey between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task
– Juan Carlos Oropeza
Nov 5 at 21:04
|
show 2 more comments
up vote
2
down vote
up vote
2
down vote
I would agree that if possible using specific support in a database would be the most sensible way to do this.
However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.
I would agree that if possible using specific support in a database would be the most sensible way to do this.
However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.
answered Nov 5 at 15:58
Peter Green
1,499511
1,499511
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index onx
andy
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used forBETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
– jpmc26
Nov 5 at 21:04
1
@jcaron actually you can use index for something likey between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task
– Juan Carlos Oropeza
Nov 5 at 21:04
|
show 2 more comments
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index onx
andy
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used forBETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
– jpmc26
Nov 5 at 21:04
1
@jcaron actually you can use index for something likey between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task
– Juan Carlos Oropeza
Nov 5 at 21:04
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
– jcaron
Nov 5 at 16:21
@jcaron I believe this query could be optimized with an ordinary B-tree index on
x
and y
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)– jpmc26
Nov 5 at 17:55
@jcaron I believe this query could be optimized with an ordinary B-tree index on
x
and y
. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)– jpmc26
Nov 5 at 17:55
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jpmc26 Nope, it can’t. Think it through, you’ll see.
– jcaron
Nov 5 at 20:52
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for
BETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.– jpmc26
Nov 5 at 21:04
@jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for
BETWEEN
queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.– jpmc26
Nov 5 at 21:04
1
1
@jcaron actually you can use index for something like
y between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task– Juan Carlos Oropeza
Nov 5 at 21:04
@jcaron actually you can use index for something like
y between -68 and -69 and x between 10 and 11
but of course spatial index do a better job for that task– Juan Carlos Oropeza
Nov 5 at 21:04
|
show 2 more comments
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsoftwareengineering.stackexchange.com%2fquestions%2f381001%2fhow-do-i-efficiently-search-for-all-the-landmarks-within-a-range-of-a-certain-la%23new-answer', 'question_page');
}
);
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
5
You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30
3
What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48
1
@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17