How do I efficiently search for all the landmarks within a range of a certain landmark?











up vote
14
down vote

favorite
6












I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.



So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.



Is there a better way to do that?



Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.



Could the Google Maps API help?










share|improve this question




















  • 5




    You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
    – Neil
    Nov 5 at 13:30






  • 3




    What database technology are you using? The answer is not database agnostic.
    – jpmc26
    Nov 5 at 16:48






  • 1




    @Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
    – JimmyJames
    Nov 5 at 20:17















up vote
14
down vote

favorite
6












I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.



So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.



Is there a better way to do that?



Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.



Could the Google Maps API help?










share|improve this question




















  • 5




    You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
    – Neil
    Nov 5 at 13:30






  • 3




    What database technology are you using? The answer is not database agnostic.
    – jpmc26
    Nov 5 at 16:48






  • 1




    @Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
    – JimmyJames
    Nov 5 at 20:17













up vote
14
down vote

favorite
6









up vote
14
down vote

favorite
6






6





I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.



So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.



Is there a better way to do that?



Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.



Could the Google Maps API help?










share|improve this question















I am trying to start with a geo search project that will find all landmarks in the 10 km/miles (not important for this story) of a particular landmark.



So for example, lets say I have a database of a 1,000,000 landmarks. In order to find all landmarks in 10 miles range of a landmark with certain coordinates, I would have to calculate a distance between a landmark from my search and 1,000,000 landmarks.



Is there a better way to do that?



Alternative I was thinking is to categorize landmarks such as country, region, city, neighborhood, business, historical, etc. in such a way that business can be part of a neighborhood or city. City is a part of a region, a country, etc. This can narrow a list of calculations, but it still looks like a lot of work do to in order for search to be fast and accurate.



Could the Google Maps API help?







c# application-design geolocation google-maps






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 6 at 10:41









Peter Mortensen

1,11621114




1,11621114










asked Nov 5 at 12:11









Dario Granich

365310




365310








  • 5




    You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
    – Neil
    Nov 5 at 13:30






  • 3




    What database technology are you using? The answer is not database agnostic.
    – jpmc26
    Nov 5 at 16:48






  • 1




    @Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
    – JimmyJames
    Nov 5 at 20:17














  • 5




    You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
    – Neil
    Nov 5 at 13:30






  • 3




    What database technology are you using? The answer is not database agnostic.
    – jpmc26
    Nov 5 at 16:48






  • 1




    @Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
    – JimmyJames
    Nov 5 at 20:17








5




5




You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30




You could probably eliminate a good many simply by performing a quick Manhattan distance calculation and then performing a second filter afterwards to exclude landmarks which are within a 10km square but are outside the 10km radius.
– Neil
Nov 5 at 13:30




3




3




What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48




What database technology are you using? The answer is not database agnostic.
– jpmc26
Nov 5 at 16:48




1




1




@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17




@Neil As a second pass you can include any landmark that where the x and y both fall with in 7km of the origin without calculating the actual distance.
– JimmyJames
Nov 5 at 20:17










4 Answers
4






active

oldest

votes

















up vote
11
down vote



accepted










Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.



There is an existing StackOverflow answer that discusses this in-depth.



A basic query to find the nearest 7 items:



USE AdventureWorks2012  
GO
DECLARE @g geography = 'POINT(-121.626 47.8315)';
SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
WHERE SpatialLocation.STDistance(@g) IS NOT NULL
ORDER BY SpatialLocation.STDistance(@g);


A basic query to find everything within 100m (second answer to the question)



-- Get the center point
DECLARE @g geography
SELECT @g = geo FROM yourTable WHERE PointId = something

-- Get the results, radius 100m
SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100





share|improve this answer

















  • 11




    @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
    – Flater
    Nov 5 at 14:58






  • 2




    Where did you read "MS SQL Server" in the question?
    – Doc Brown
    Nov 5 at 14:58






  • 3




    @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
    – Konrad Rudolph
    Nov 5 at 15:09






  • 2




    @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
    – Flater
    Nov 6 at 7:22






  • 3




    Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
    – jpmc26
    Nov 6 at 10:16




















up vote
30
down vote













Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).



If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.






share|improve this answer

















  • 4




    There is also a GIS stackexchange
    – BlueRaja - Danny Pflughoeft
    Nov 5 at 15:11






  • 8




    PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
    – jpmc26
    Nov 5 at 16:41












  • @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
    – mgarciaisaia
    Nov 6 at 11:51


















up vote
11
down vote













Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.




  • PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)

  • SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017

  • Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm

  • MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html


What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):




  • PostGIS: ST_DWithin

  • SQL Server: STDistance (Not clear that index use on the 3D geography version of this function is supported)

  • Oracle: SDO_WITHIN_DISTANCE (This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply an SDO_FILTER to get it to use the index.)

  • MySQL: Still figuring this out.


Workaround for triggering index usage



In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin above does internally anyway.





Distance in GIS



While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)



The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.



This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.






share|improve this answer






























    up vote
    2
    down vote













    I would agree that if possible using specific support in a database would be the most sensible way to do this.



    However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.






    share|improve this answer





















    • But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
      – jcaron
      Nov 5 at 16:21










    • @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
      – jpmc26
      Nov 5 at 17:55












    • @jpmc26 Nope, it can’t. Think it through, you’ll see.
      – jcaron
      Nov 5 at 20:52










    • @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
      – jpmc26
      Nov 5 at 21:04








    • 1




      @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
      – Juan Carlos Oropeza
      Nov 5 at 21:04













    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "131"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsoftwareengineering.stackexchange.com%2fquestions%2f381001%2fhow-do-i-efficiently-search-for-all-the-landmarks-within-a-range-of-a-certain-la%23new-answer', 'question_page');
    }
    );

    Post as a guest
































    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    11
    down vote



    accepted










    Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.



    There is an existing StackOverflow answer that discusses this in-depth.



    A basic query to find the nearest 7 items:



    USE AdventureWorks2012  
    GO
    DECLARE @g geography = 'POINT(-121.626 47.8315)';
    SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
    WHERE SpatialLocation.STDistance(@g) IS NOT NULL
    ORDER BY SpatialLocation.STDistance(@g);


    A basic query to find everything within 100m (second answer to the question)



    -- Get the center point
    DECLARE @g geography
    SELECT @g = geo FROM yourTable WHERE PointId = something

    -- Get the results, radius 100m
    SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100





    share|improve this answer

















    • 11




      @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
      – Flater
      Nov 5 at 14:58






    • 2




      Where did you read "MS SQL Server" in the question?
      – Doc Brown
      Nov 5 at 14:58






    • 3




      @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
      – Konrad Rudolph
      Nov 5 at 15:09






    • 2




      @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
      – Flater
      Nov 6 at 7:22






    • 3




      Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
      – jpmc26
      Nov 6 at 10:16

















    up vote
    11
    down vote



    accepted










    Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.



    There is an existing StackOverflow answer that discusses this in-depth.



    A basic query to find the nearest 7 items:



    USE AdventureWorks2012  
    GO
    DECLARE @g geography = 'POINT(-121.626 47.8315)';
    SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
    WHERE SpatialLocation.STDistance(@g) IS NOT NULL
    ORDER BY SpatialLocation.STDistance(@g);


    A basic query to find everything within 100m (second answer to the question)



    -- Get the center point
    DECLARE @g geography
    SELECT @g = geo FROM yourTable WHERE PointId = something

    -- Get the results, radius 100m
    SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100





    share|improve this answer

















    • 11




      @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
      – Flater
      Nov 5 at 14:58






    • 2




      Where did you read "MS SQL Server" in the question?
      – Doc Brown
      Nov 5 at 14:58






    • 3




      @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
      – Konrad Rudolph
      Nov 5 at 15:09






    • 2




      @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
      – Flater
      Nov 6 at 7:22






    • 3




      Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
      – jpmc26
      Nov 6 at 10:16















    up vote
    11
    down vote



    accepted







    up vote
    11
    down vote



    accepted






    Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.



    There is an existing StackOverflow answer that discusses this in-depth.



    A basic query to find the nearest 7 items:



    USE AdventureWorks2012  
    GO
    DECLARE @g geography = 'POINT(-121.626 47.8315)';
    SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
    WHERE SpatialLocation.STDistance(@g) IS NOT NULL
    ORDER BY SpatialLocation.STDistance(@g);


    A basic query to find everything within 100m (second answer to the question)



    -- Get the center point
    DECLARE @g geography
    SELECT @g = geo FROM yourTable WHERE PointId = something

    -- Get the results, radius 100m
    SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100





    share|improve this answer












    Since SQL Server 2008, there is a geography data type which stores locations (lat/lon pairs) and makes it easy for you to write location-related queries.



    There is an existing StackOverflow answer that discusses this in-depth.



    A basic query to find the nearest 7 items:



    USE AdventureWorks2012  
    GO
    DECLARE @g geography = 'POINT(-121.626 47.8315)';
    SELECT TOP(7) SpatialLocation.ToString(), City FROM Person.Address
    WHERE SpatialLocation.STDistance(@g) IS NOT NULL
    ORDER BY SpatialLocation.STDistance(@g);


    A basic query to find everything within 100m (second answer to the question)



    -- Get the center point
    DECLARE @g geography
    SELECT @g = geo FROM yourTable WHERE PointId = something

    -- Get the results, radius 100m
    SELECT * FROM yourTable WHERE @g.STDistance(geo) <= 100






    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Nov 5 at 12:24









    Flater

    5,22111019




    5,22111019








    • 11




      @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
      – Flater
      Nov 5 at 14:58






    • 2




      Where did you read "MS SQL Server" in the question?
      – Doc Brown
      Nov 5 at 14:58






    • 3




      @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
      – Konrad Rudolph
      Nov 5 at 15:09






    • 2




      @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
      – Flater
      Nov 6 at 7:22






    • 3




      Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
      – jpmc26
      Nov 6 at 10:16
















    • 11




      @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
      – Flater
      Nov 5 at 14:58






    • 2




      Where did you read "MS SQL Server" in the question?
      – Doc Brown
      Nov 5 at 14:58






    • 3




      @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
      – Konrad Rudolph
      Nov 5 at 15:09






    • 2




      @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
      – Flater
      Nov 6 at 7:22






    • 3




      Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
      – jpmc26
      Nov 6 at 10:16










    11




    11




    @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
    – Flater
    Nov 5 at 14:58




    @KonradRudolph: As is the case for any SQL column that is used for querying on a table with a massive rowcount. You're correct, but that comment would apply to virtually any SQL query that is posted as an answer.
    – Flater
    Nov 5 at 14:58




    2




    2




    Where did you read "MS SQL Server" in the question?
    – Doc Brown
    Nov 5 at 14:58




    Where did you read "MS SQL Server" in the question?
    – Doc Brown
    Nov 5 at 14:58




    3




    3




    @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
    – Konrad Rudolph
    Nov 5 at 15:09




    @Flater I agree that it would normally obvious and redundant but OP’s wording seems to suggest that they’re unaware of such mechanisms.
    – Konrad Rudolph
    Nov 5 at 15:09




    2




    2




    @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
    – Flater
    Nov 6 at 7:22




    @jpmc26: You're appalled that I listed a valid option and didn't include some other option? What? If you feel it's relevant to add PostGIS, add the answer yourself (which you did) and don't resort to criticizing others for not having the same idea as you.
    – Flater
    Nov 6 at 7:22




    3




    3




    Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
    – jpmc26
    Nov 6 at 10:16






    Your answer appears to me as basically just a MS SQL sales pitch. Your comments suggesting they switch databases to something that would cost 10s of thousands of dollars without actually inquiring about what their situation only makes it appear moreso. It doesn't even describe how the OP can actually implement their query or discuss the fact that doing so and esnuring the spatial index is used is not as straightforward in MS SQL as in other DBs. Nor does it discuss any of the underlying concepts. It's a bad answer, irrespective of whether it's "valid." That's why it bothers me.
    – jpmc26
    Nov 6 at 10:16














    up vote
    30
    down vote













    Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).



    If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.






    share|improve this answer

















    • 4




      There is also a GIS stackexchange
      – BlueRaja - Danny Pflughoeft
      Nov 5 at 15:11






    • 8




      PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
      – jpmc26
      Nov 5 at 16:41












    • @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
      – mgarciaisaia
      Nov 6 at 11:51















    up vote
    30
    down vote













    Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).



    If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.






    share|improve this answer

















    • 4




      There is also a GIS stackexchange
      – BlueRaja - Danny Pflughoeft
      Nov 5 at 15:11






    • 8




      PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
      – jpmc26
      Nov 5 at 16:41












    • @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
      – mgarciaisaia
      Nov 6 at 11:51













    up vote
    30
    down vote










    up vote
    30
    down vote









    Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).



    If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.






    share|improve this answer












    Use a database with support for GIS (geographic information systems) queries. Most databases support this outright or have extensions, but the details will be database-specific (in their answer, Flater shows the syntax for SQL server).



    If you need to implement such queries within your application, you can implement a data structure that allows spatial queries, e.g. a k-d Tree. This is like a binary search tree, except that each level of the tree partitions on a different coordinate dimension. This allows you to restrict the search to a smaller set of feasible candidates. Effectively, you translate your search “10km radius” into bounds for each coordinate dimension, and tighten the bounds as you recurse into the tree.







    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Nov 5 at 12:32









    amon

    81.3k21153239




    81.3k21153239








    • 4




      There is also a GIS stackexchange
      – BlueRaja - Danny Pflughoeft
      Nov 5 at 15:11






    • 8




      PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
      – jpmc26
      Nov 5 at 16:41












    • @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
      – mgarciaisaia
      Nov 6 at 11:51














    • 4




      There is also a GIS stackexchange
      – BlueRaja - Danny Pflughoeft
      Nov 5 at 15:11






    • 8




      PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
      – jpmc26
      Nov 5 at 16:41












    • @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
      – mgarciaisaia
      Nov 6 at 11:51








    4




    4




    There is also a GIS stackexchange
    – BlueRaja - Danny Pflughoeft
    Nov 5 at 15:11




    There is also a GIS stackexchange
    – BlueRaja - Danny Pflughoeft
    Nov 5 at 15:11




    8




    8




    PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
    – jpmc26
    Nov 5 at 16:41






    PostGIS is the premier free option. It supports much, much more than SQL Server's very basic GIS types and functions. But this is basic functionality.
    – jpmc26
    Nov 5 at 16:41














    @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
    – mgarciaisaia
    Nov 6 at 11:51




    @amon I find jpmc26's comment as a good addition, and not that much as criticizing your example. "If you want to start from scratch, you don't need to pay for a licensed DB - this free, open-source one will also do the trick really well".
    – mgarciaisaia
    Nov 6 at 11:51










    up vote
    11
    down vote













    Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.




    • PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)

    • SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017

    • Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm

    • MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html


    What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):




    • PostGIS: ST_DWithin

    • SQL Server: STDistance (Not clear that index use on the 3D geography version of this function is supported)

    • Oracle: SDO_WITHIN_DISTANCE (This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply an SDO_FILTER to get it to use the index.)

    • MySQL: Still figuring this out.


    Workaround for triggering index usage



    In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin above does internally anyway.





    Distance in GIS



    While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)



    The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.



    This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.






    share|improve this answer



























      up vote
      11
      down vote













      Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.




      • PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)

      • SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017

      • Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm

      • MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html


      What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):




      • PostGIS: ST_DWithin

      • SQL Server: STDistance (Not clear that index use on the 3D geography version of this function is supported)

      • Oracle: SDO_WITHIN_DISTANCE (This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply an SDO_FILTER to get it to use the index.)

      • MySQL: Still figuring this out.


      Workaround for triggering index usage



      In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin above does internally anyway.





      Distance in GIS



      While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)



      The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.



      This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.






      share|improve this answer

























        up vote
        11
        down vote










        up vote
        11
        down vote









        Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.




        • PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)

        • SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017

        • Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm

        • MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html


        What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):




        • PostGIS: ST_DWithin

        • SQL Server: STDistance (Not clear that index use on the 3D geography version of this function is supported)

        • Oracle: SDO_WITHIN_DISTANCE (This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply an SDO_FILTER to get it to use the index.)

        • MySQL: Still figuring this out.


        Workaround for triggering index usage



        In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin above does internally anyway.





        Distance in GIS



        While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)



        The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.



        This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.






        share|improve this answer














        Yes, there's a better way. You need to use a spatial index. These indexes organize metadata about geometries to filter out far away geometries very rapidly, saving a lot of CPU cycles by avoiding the computations you describe. You shouldn't bother implementing one yourself as all major relational databases provide a spatial geometry type and indexes to go with them.




        • PostGIS (the GIS extension for PostgreSQL) uses R-Trees: http://postgis.net/docs/using_postgis_dbmanagement.html#idm2246 (The GiST type)

        • SQL Server uses grid indexes: https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-indexes-overview?view=sql-server-2017

        • Oracle uses R-Trees: https://docs.oracle.com/database/121/SPATL/creating-spatial-index.htm

        • MySQL uses R-Trees: https://dev.mysql.com/doc/refman/8.0/en/optimizing-spatial-analysis.html


        What you want to look into are "within distance" queries (queries for geometries within a certain distance of some other geometry). These are very standard and very much a solved problem and are possible in all of the above databases (and built into several):




        • PostGIS: ST_DWithin

        • SQL Server: STDistance (Not clear that index use on the 3D geography version of this function is supported)

        • Oracle: SDO_WITHIN_DISTANCE (This doesn't say explicitly that it will trigger index usage. I'd double check the query plan. You might need to apply an SDO_FILTER to get it to use the index.)

        • MySQL: Still figuring this out.


        Workaround for triggering index usage



        In the worst case where you have trouble getting the system to use the spatial index with these queries, you can add an additional filter. You'd create a square bounding box with sides of length 2*(search distance) centered at your search point and compare the table geometries' bounding boxes against that before checking the actual distance. That's what PostGIS' ST_DWithin above does internally anyway.





        Distance in GIS



        While spatial indexes are fantastic and absolutely the right solution to your problem, distance calculation can get logically complicated. In particular, you need to worry about what projection (basically all the parameters for the coordinate system) your data is stored in. Most 2D projections (things other than angular coordinate systems like the various lat/long projections) distort length significantly. For example, the Web Mercator projection (the one used by Google, Bing, and every other major base map provider) expands areas and distances increasingly as the location gets further from the equator. I might be wrong as I'm not formally educated in GIS, but the best I've seen for 2D projections is some specific ones that promise correct distances from a single, constant point in the entire world. (No, it's not practical to use a different projection for every query; that would render your indexes useless.)



        The bottom line is that you need to make sure your math is accurate. The simplest way of doing so from a development perspective is to use angular projections (These are often referred to as "geographic.") and functions that support doing the math using a spheroid model, but these computations are slightly more expensive than the 2D counterparts and some DBs may not support indexing them. If you can get acceptable performance using them, though, that's probably the way to go. Another common option is regional projections (like UTM zones) that get both distances and areas pretty close to correct if your data is confined to a particular part of the world. What's best for your app will depend on your specific requirements, but be aware that you need to think this through and maybe learn a little bit about it.



        This applies even if you don't use built in spatial indexes. Your data has some projection regardless of what technology or technique you are currently using or use in the future, and it's already currently affecting any queries and computations you're making.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Nov 6 at 23:56

























        answered Nov 5 at 17:00









        jpmc26

        3,81011735




        3,81011735






















            up vote
            2
            down vote













            I would agree that if possible using specific support in a database would be the most sensible way to do this.



            However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.






            share|improve this answer





















            • But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
              – jcaron
              Nov 5 at 16:21










            • @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
              – jpmc26
              Nov 5 at 17:55












            • @jpmc26 Nope, it can’t. Think it through, you’ll see.
              – jcaron
              Nov 5 at 20:52










            • @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
              – jpmc26
              Nov 5 at 21:04








            • 1




              @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
              – Juan Carlos Oropeza
              Nov 5 at 21:04

















            up vote
            2
            down vote













            I would agree that if possible using specific support in a database would be the most sensible way to do this.



            However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.






            share|improve this answer





















            • But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
              – jcaron
              Nov 5 at 16:21










            • @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
              – jpmc26
              Nov 5 at 17:55












            • @jpmc26 Nope, it can’t. Think it through, you’ll see.
              – jcaron
              Nov 5 at 20:52










            • @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
              – jpmc26
              Nov 5 at 21:04








            • 1




              @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
              – Juan Carlos Oropeza
              Nov 5 at 21:04















            up vote
            2
            down vote










            up vote
            2
            down vote









            I would agree that if possible using specific support in a database would be the most sensible way to do this.



            However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.






            share|improve this answer












            I would agree that if possible using specific support in a database would be the most sensible way to do this.



            However if I had to do this on a database without specific support I would start by querying for a square that encloses the circule e.g. (y > (y1 - rad)) AND (y < (y1 + rad)) AND (x > (x1 - rad)) AND (x < (x1 + rad)). Assuming your points have roughly even distribution querying for a square will get you your true matches plus about 30% extra false matches. You can then cull out the false matches.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 5 at 15:58









            Peter Green

            1,499511




            1,499511












            • But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
              – jcaron
              Nov 5 at 16:21










            • @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
              – jpmc26
              Nov 5 at 17:55












            • @jpmc26 Nope, it can’t. Think it through, you’ll see.
              – jcaron
              Nov 5 at 20:52










            • @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
              – jpmc26
              Nov 5 at 21:04








            • 1




              @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
              – Juan Carlos Oropeza
              Nov 5 at 21:04




















            • But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
              – jcaron
              Nov 5 at 16:21










            • @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
              – jpmc26
              Nov 5 at 17:55












            • @jpmc26 Nope, it can’t. Think it through, you’ll see.
              – jcaron
              Nov 5 at 20:52










            • @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
              – jpmc26
              Nov 5 at 21:04








            • 1




              @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
              – Juan Carlos Oropeza
              Nov 5 at 21:04


















            But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
            – jcaron
            Nov 5 at 16:21




            But without an appropriate spatial index, such a query will scan at worst the entire database, at best all items within the given latitude OR longitude range depending on your index, i.e. a "band" rather than a square. If you don't want to kill performance, use a database that supports spatial indexes!
            – jcaron
            Nov 5 at 16:21












            @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
            – jpmc26
            Nov 5 at 17:55






            @jcaron I believe this query could be optimized with an ordinary B-tree index on x and y. (Perhaps combined, perhaps separate. I'd profile a little to figure out which works better in practice.)
            – jpmc26
            Nov 5 at 17:55














            @jpmc26 Nope, it can’t. Think it through, you’ll see.
            – jcaron
            Nov 5 at 20:52




            @jpmc26 Nope, it can’t. Think it through, you’ll see.
            – jcaron
            Nov 5 at 20:52












            @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
            – jpmc26
            Nov 5 at 21:04






            @jcaron Perhaps it would be better if you weren't cryptic about something that is clearly not straightforward. B-trees can be used for BETWEEN queries. I don't see why worst case you couldn't have 2 indexes and then the filtered results from each index get joined together. (That is something RDBMSes do internally when they deem it worth using multiple indexes.) If a combined index works, it should filter out one dimension entirely at the first level and then relatively quickly narrow down in the second level.
            – jpmc26
            Nov 5 at 21:04






            1




            1




            @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
            – Juan Carlos Oropeza
            Nov 5 at 21:04






            @jcaron actually you can use index for something like y between -68 and -69 and x between 10 and 11 but of course spatial index do a better job for that task
            – Juan Carlos Oropeza
            Nov 5 at 21:04




















             

            draft saved


            draft discarded



















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsoftwareengineering.stackexchange.com%2fquestions%2f381001%2fhow-do-i-efficiently-search-for-all-the-landmarks-within-a-range-of-a-certain-la%23new-answer', 'question_page');
            }
            );

            Post as a guest




















































































            這個網誌中的熱門文章

            Xamarin.form Move up view when keyboard appear

            Post-Redirect-Get with Spring WebFlux and Thymeleaf

            Anylogic : not able to use stopDelay()