Doing a mysql like %term% on 1B records (with indexed field)
I have the following query that I'm using and was wondering if it would work performantly, or whether I should use ElasticSearch from the start:
SELECT
*
FROM
entity_access
JOIN entity ON (entity.id=entity_access.entity_id)
WHERE
user_id = 144
AND name LIKE '%format%'
The entity_access
table will have about a billion results. But each user should have 5k entries max. My thinking was that a LIKE %term%
would be trivial on a table of 5k rows (under 50ms), so hopefully it would be the same if I have a good index on a large table before doing it? Or is there something I'm missing here?
mysql sql innodb
add a comment |
I have the following query that I'm using and was wondering if it would work performantly, or whether I should use ElasticSearch from the start:
SELECT
*
FROM
entity_access
JOIN entity ON (entity.id=entity_access.entity_id)
WHERE
user_id = 144
AND name LIKE '%format%'
The entity_access
table will have about a billion results. But each user should have 5k entries max. My thinking was that a LIKE %term%
would be trivial on a table of 5k rows (under 50ms), so hopefully it would be the same if I have a good index on a large table before doing it? Or is there something I'm missing here?
mysql sql innodb
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.
– Tim Biegeleisen
Nov 16 '18 at 3:00
1
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58
add a comment |
I have the following query that I'm using and was wondering if it would work performantly, or whether I should use ElasticSearch from the start:
SELECT
*
FROM
entity_access
JOIN entity ON (entity.id=entity_access.entity_id)
WHERE
user_id = 144
AND name LIKE '%format%'
The entity_access
table will have about a billion results. But each user should have 5k entries max. My thinking was that a LIKE %term%
would be trivial on a table of 5k rows (under 50ms), so hopefully it would be the same if I have a good index on a large table before doing it? Or is there something I'm missing here?
mysql sql innodb
I have the following query that I'm using and was wondering if it would work performantly, or whether I should use ElasticSearch from the start:
SELECT
*
FROM
entity_access
JOIN entity ON (entity.id=entity_access.entity_id)
WHERE
user_id = 144
AND name LIKE '%format%'
The entity_access
table will have about a billion results. But each user should have 5k entries max. My thinking was that a LIKE %term%
would be trivial on a table of 5k rows (under 50ms), so hopefully it would be the same if I have a good index on a large table before doing it? Or is there something I'm missing here?
mysql sql innodb
mysql sql innodb
asked Nov 16 '18 at 2:58
David LDavid L
37816
37816
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.
– Tim Biegeleisen
Nov 16 '18 at 3:00
1
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58
add a comment |
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.
– Tim Biegeleisen
Nov 16 '18 at 3:00
1
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.– Tim Biegeleisen
Nov 16 '18 at 3:00
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.– Tim Biegeleisen
Nov 16 '18 at 3:00
1
1
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58
add a comment |
1 Answer
1
active
oldest
votes
Two things. First, it doesn't matter how many total rows in the table, because the index on user_id
will select only those rows for matching. As you say there are about 5k per user_id, then that's easily managed.
Second, LIKE '%foo%'
will not use an index: the leading '%'
precludes that. If you want to use an index, you'll have to accept a pattern of LIKE 'foo%'
. If that fits the use case, then the query as written will perform fine.
If either of the above conditions doesn't hold, then consider using a dedicated search engine (like Sphinx, or roll-your own with radix trees) or materialize your search into a more indexable format (such as using MySQL Full-Text Search).
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53330789%2fdoing-a-mysql-like-term-on-1b-records-with-indexed-field%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Two things. First, it doesn't matter how many total rows in the table, because the index on user_id
will select only those rows for matching. As you say there are about 5k per user_id, then that's easily managed.
Second, LIKE '%foo%'
will not use an index: the leading '%'
precludes that. If you want to use an index, you'll have to accept a pattern of LIKE 'foo%'
. If that fits the use case, then the query as written will perform fine.
If either of the above conditions doesn't hold, then consider using a dedicated search engine (like Sphinx, or roll-your own with radix trees) or materialize your search into a more indexable format (such as using MySQL Full-Text Search).
add a comment |
Two things. First, it doesn't matter how many total rows in the table, because the index on user_id
will select only those rows for matching. As you say there are about 5k per user_id, then that's easily managed.
Second, LIKE '%foo%'
will not use an index: the leading '%'
precludes that. If you want to use an index, you'll have to accept a pattern of LIKE 'foo%'
. If that fits the use case, then the query as written will perform fine.
If either of the above conditions doesn't hold, then consider using a dedicated search engine (like Sphinx, or roll-your own with radix trees) or materialize your search into a more indexable format (such as using MySQL Full-Text Search).
add a comment |
Two things. First, it doesn't matter how many total rows in the table, because the index on user_id
will select only those rows for matching. As you say there are about 5k per user_id, then that's easily managed.
Second, LIKE '%foo%'
will not use an index: the leading '%'
precludes that. If you want to use an index, you'll have to accept a pattern of LIKE 'foo%'
. If that fits the use case, then the query as written will perform fine.
If either of the above conditions doesn't hold, then consider using a dedicated search engine (like Sphinx, or roll-your own with radix trees) or materialize your search into a more indexable format (such as using MySQL Full-Text Search).
Two things. First, it doesn't matter how many total rows in the table, because the index on user_id
will select only those rows for matching. As you say there are about 5k per user_id, then that's easily managed.
Second, LIKE '%foo%'
will not use an index: the leading '%'
precludes that. If you want to use an index, you'll have to accept a pattern of LIKE 'foo%'
. If that fits the use case, then the query as written will perform fine.
If either of the above conditions doesn't hold, then consider using a dedicated search engine (like Sphinx, or roll-your own with radix trees) or materialize your search into a more indexable format (such as using MySQL Full-Text Search).
answered Nov 16 '18 at 3:23
bishopbishop
24.1k46589
24.1k46589
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53330789%2fdoing-a-mysql-like-term-on-1b-records-with-indexed-field%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
LIKE '%format%'
cannot use an index. If you really have a need to assert this condition quickly, then consider creating a materialized view.– Tim Biegeleisen
Nov 16 '18 at 3:00
1
Qualify your column names. Without knowing where the columns come from, nothing can really be usefully said about performance.
– Gordon Linoff
Nov 16 '18 at 3:58