POSTGIS limiting a distance query by a threshold
I am looking to get pairwise distances between rows from two tables and only return the pairs that are within a threshold distance. The tables have large numbers of objects so performance is a concern.
I found an example of getting the closest distances at
PostGIS minimum distance between two large sets of points
The code there looks like
SELECT
a.id, nn.id AS id_nn,
a.geom, nn.geom_closest,
ST_Distance_Sphere(a.geom, nn.geom_closest) AS min_dist
FROM
table_a AS a
CROSS JOIN LATERAL
(SELECT
b.id,
b.geom AS geom_closest
FROM table_b b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS nn;
I'm terrible with SQL and I understand that the LIMIT 1 is taking the closest when they are ordered.
How do I modify this to give all pairs less than a threshold? I tried using a WHERE clause to limit it to within a value
SELECT
a.id, nn.id AS id_nn,
a.wkb_geometry, nn.geom_closest,
ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) AS min_dist
FROM
mammography21 AS a
CROSS JOIN LATERAL
(SELECT
b.gid as id,
b.wkb_geometry AS geom_closest
FROM cartographic_boundary_us_zcta_2016 b
ORDER BY a.wkb_geometry <-> b.wkb_geometry) AS nn
WHERE ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) <= 10.0;
but this gives the error:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Even it had worked, I would guess it is an inefficient approach. How should I be trying to do this query?
postgis
add a comment |
I am looking to get pairwise distances between rows from two tables and only return the pairs that are within a threshold distance. The tables have large numbers of objects so performance is a concern.
I found an example of getting the closest distances at
PostGIS minimum distance between two large sets of points
The code there looks like
SELECT
a.id, nn.id AS id_nn,
a.geom, nn.geom_closest,
ST_Distance_Sphere(a.geom, nn.geom_closest) AS min_dist
FROM
table_a AS a
CROSS JOIN LATERAL
(SELECT
b.id,
b.geom AS geom_closest
FROM table_b b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS nn;
I'm terrible with SQL and I understand that the LIMIT 1 is taking the closest when they are ordered.
How do I modify this to give all pairs less than a threshold? I tried using a WHERE clause to limit it to within a value
SELECT
a.id, nn.id AS id_nn,
a.wkb_geometry, nn.geom_closest,
ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) AS min_dist
FROM
mammography21 AS a
CROSS JOIN LATERAL
(SELECT
b.gid as id,
b.wkb_geometry AS geom_closest
FROM cartographic_boundary_us_zcta_2016 b
ORDER BY a.wkb_geometry <-> b.wkb_geometry) AS nn
WHERE ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) <= 10.0;
but this gives the error:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Even it had worked, I would guess it is an inefficient approach. How should I be trying to do this query?
postgis
You can use ST_DWithin in theWHERE
clause to limit the search to a certain distance.
– thibautg
Nov 22 '18 at 8:15
add a comment |
I am looking to get pairwise distances between rows from two tables and only return the pairs that are within a threshold distance. The tables have large numbers of objects so performance is a concern.
I found an example of getting the closest distances at
PostGIS minimum distance between two large sets of points
The code there looks like
SELECT
a.id, nn.id AS id_nn,
a.geom, nn.geom_closest,
ST_Distance_Sphere(a.geom, nn.geom_closest) AS min_dist
FROM
table_a AS a
CROSS JOIN LATERAL
(SELECT
b.id,
b.geom AS geom_closest
FROM table_b b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS nn;
I'm terrible with SQL and I understand that the LIMIT 1 is taking the closest when they are ordered.
How do I modify this to give all pairs less than a threshold? I tried using a WHERE clause to limit it to within a value
SELECT
a.id, nn.id AS id_nn,
a.wkb_geometry, nn.geom_closest,
ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) AS min_dist
FROM
mammography21 AS a
CROSS JOIN LATERAL
(SELECT
b.gid as id,
b.wkb_geometry AS geom_closest
FROM cartographic_boundary_us_zcta_2016 b
ORDER BY a.wkb_geometry <-> b.wkb_geometry) AS nn
WHERE ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) <= 10.0;
but this gives the error:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Even it had worked, I would guess it is an inefficient approach. How should I be trying to do this query?
postgis
I am looking to get pairwise distances between rows from two tables and only return the pairs that are within a threshold distance. The tables have large numbers of objects so performance is a concern.
I found an example of getting the closest distances at
PostGIS minimum distance between two large sets of points
The code there looks like
SELECT
a.id, nn.id AS id_nn,
a.geom, nn.geom_closest,
ST_Distance_Sphere(a.geom, nn.geom_closest) AS min_dist
FROM
table_a AS a
CROSS JOIN LATERAL
(SELECT
b.id,
b.geom AS geom_closest
FROM table_b b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS nn;
I'm terrible with SQL and I understand that the LIMIT 1 is taking the closest when they are ordered.
How do I modify this to give all pairs less than a threshold? I tried using a WHERE clause to limit it to within a value
SELECT
a.id, nn.id AS id_nn,
a.wkb_geometry, nn.geom_closest,
ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) AS min_dist
FROM
mammography21 AS a
CROSS JOIN LATERAL
(SELECT
b.gid as id,
b.wkb_geometry AS geom_closest
FROM cartographic_boundary_us_zcta_2016 b
ORDER BY a.wkb_geometry <-> b.wkb_geometry) AS nn
WHERE ST_DistanceSphere(a.wkb_geometry, nn.geom_closest) <= 10.0;
but this gives the error:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Even it had worked, I would guess it is an inefficient approach. How should I be trying to do this query?
postgis
postgis
asked Nov 22 '18 at 3:16
R RommelR Rommel
63
63
You can use ST_DWithin in theWHERE
clause to limit the search to a certain distance.
– thibautg
Nov 22 '18 at 8:15
add a comment |
You can use ST_DWithin in theWHERE
clause to limit the search to a certain distance.
– thibautg
Nov 22 '18 at 8:15
You can use ST_DWithin in the
WHERE
clause to limit the search to a certain distance.– thibautg
Nov 22 '18 at 8:15
You can use ST_DWithin in the
WHERE
clause to limit the search to a certain distance.– thibautg
Nov 22 '18 at 8:15
add a comment |
1 Answer
1
active
oldest
votes
The ordering was used to get the closest point only. If you want more than one point, you don't really care about ordering the points by distance anymore.
Putting the distance validation is the right thing to do... but beware where you do it. Your query is taking too much time because for every point, it computes the distance to every other point (the lateral join) and then it filters the results to keep only the nearby ones.
Since you need more than just 1 point, the later join can be removed. As suggested by @thibautg, st_Dwithin
is preferable as it makes use of the spatial index.
At last, you might want to cast your data to geography to compute the distance in meters (it depends on your data CRS). If it is the case, you would need a spatial index on the geographies too.
SELECT
a.id, nn.id AS id_nn,
a.geom srcGeom, nn.geom nearGeom,
ST_DistanceSphere(a.geom, nn.geom) AS near_dist
FROM
mammography21 AS a,
cartographic_boundary_us_zcta_2016 nn
WHERE ST_DWithin(a.geom, nn.geom, 10);
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53423373%2fpostgis-limiting-a-distance-query-by-a-threshold%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
The ordering was used to get the closest point only. If you want more than one point, you don't really care about ordering the points by distance anymore.
Putting the distance validation is the right thing to do... but beware where you do it. Your query is taking too much time because for every point, it computes the distance to every other point (the lateral join) and then it filters the results to keep only the nearby ones.
Since you need more than just 1 point, the later join can be removed. As suggested by @thibautg, st_Dwithin
is preferable as it makes use of the spatial index.
At last, you might want to cast your data to geography to compute the distance in meters (it depends on your data CRS). If it is the case, you would need a spatial index on the geographies too.
SELECT
a.id, nn.id AS id_nn,
a.geom srcGeom, nn.geom nearGeom,
ST_DistanceSphere(a.geom, nn.geom) AS near_dist
FROM
mammography21 AS a,
cartographic_boundary_us_zcta_2016 nn
WHERE ST_DWithin(a.geom, nn.geom, 10);
add a comment |
The ordering was used to get the closest point only. If you want more than one point, you don't really care about ordering the points by distance anymore.
Putting the distance validation is the right thing to do... but beware where you do it. Your query is taking too much time because for every point, it computes the distance to every other point (the lateral join) and then it filters the results to keep only the nearby ones.
Since you need more than just 1 point, the later join can be removed. As suggested by @thibautg, st_Dwithin
is preferable as it makes use of the spatial index.
At last, you might want to cast your data to geography to compute the distance in meters (it depends on your data CRS). If it is the case, you would need a spatial index on the geographies too.
SELECT
a.id, nn.id AS id_nn,
a.geom srcGeom, nn.geom nearGeom,
ST_DistanceSphere(a.geom, nn.geom) AS near_dist
FROM
mammography21 AS a,
cartographic_boundary_us_zcta_2016 nn
WHERE ST_DWithin(a.geom, nn.geom, 10);
add a comment |
The ordering was used to get the closest point only. If you want more than one point, you don't really care about ordering the points by distance anymore.
Putting the distance validation is the right thing to do... but beware where you do it. Your query is taking too much time because for every point, it computes the distance to every other point (the lateral join) and then it filters the results to keep only the nearby ones.
Since you need more than just 1 point, the later join can be removed. As suggested by @thibautg, st_Dwithin
is preferable as it makes use of the spatial index.
At last, you might want to cast your data to geography to compute the distance in meters (it depends on your data CRS). If it is the case, you would need a spatial index on the geographies too.
SELECT
a.id, nn.id AS id_nn,
a.geom srcGeom, nn.geom nearGeom,
ST_DistanceSphere(a.geom, nn.geom) AS near_dist
FROM
mammography21 AS a,
cartographic_boundary_us_zcta_2016 nn
WHERE ST_DWithin(a.geom, nn.geom, 10);
The ordering was used to get the closest point only. If you want more than one point, you don't really care about ordering the points by distance anymore.
Putting the distance validation is the right thing to do... but beware where you do it. Your query is taking too much time because for every point, it computes the distance to every other point (the lateral join) and then it filters the results to keep only the nearby ones.
Since you need more than just 1 point, the later join can be removed. As suggested by @thibautg, st_Dwithin
is preferable as it makes use of the spatial index.
At last, you might want to cast your data to geography to compute the distance in meters (it depends on your data CRS). If it is the case, you would need a spatial index on the geographies too.
SELECT
a.id, nn.id AS id_nn,
a.geom srcGeom, nn.geom nearGeom,
ST_DistanceSphere(a.geom, nn.geom) AS near_dist
FROM
mammography21 AS a,
cartographic_boundary_us_zcta_2016 nn
WHERE ST_DWithin(a.geom, nn.geom, 10);
answered Nov 22 '18 at 14:29
JGHJGH
3,55441126
3,55441126
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53423373%2fpostgis-limiting-a-distance-query-by-a-threshold%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
You can use ST_DWithin in the
WHERE
clause to limit the search to a certain distance.– thibautg
Nov 22 '18 at 8:15