Stats: Accuracy of the KNN algorithm for K=1
This is really a general question about the KNN algorithm that hopefully you will be able to help me understand.
I use the knn function from the class package in R:
knn <- knn(train=X_train, test=X_test, cl=train_Y, k=3)
I am running the KNN algorithm to classify handwritten digits from 0 to 9. Each observation is a txt file with 1024 0s and 1s that form the picture of a digit, so the dataset has 1024 variables and each of them can be 0 or 1.
I am able to run the algorithm and I get very good results with K=3, which seems reasonable. However, when I loop through different K values looking for an optimal value, the best K turns out to be either 1 or 3, and the accuracy then decreases gradually as K grows.
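A minimal sketch of such a loop over K, assuming the `class` package is installed. The data here are randomly generated binary features standing in for the digit files, and the names `make_data`, `k_values`, `accuracy` and `best_k` are illustrative, not taken from the original code.

```r
library(class)  # provides knn()

set.seed(1)  # knn() breaks ties at random, so fix the seed for repeatability

# Synthetic stand-in for the digit data: 2 classes, 64 binary features.
n_train <- 200
n_test  <- 100
p       <- 64

make_data <- function(n) {
  y <- sample(0:1, n, replace = TRUE)
  # Class 1 has a higher chance of a 1 in every feature than class 0.
  prob <- ifelse(y == 1, 0.7, 0.3)
  # The matrix is filled column by column, so repeating prob p times
  # gives row i the probability that belongs to observation i.
  X <- matrix(rbinom(n * p, 1, rep(prob, times = p)), nrow = n, ncol = p)
  list(X = X, y = factor(y, levels = 0:1))
}

train <- make_data(n_train)
test  <- make_data(n_test)

# Try odd values of K and record the share of correct test predictions.
k_values <- seq(1, 15, by = 2)
accuracy <- sapply(k_values, function(k) {
  pred <- knn(train = train$X, test = test$X, cl = train$y, k = k)
  mean(pred == test$y)
})
names(accuracy) <- k_values

print(accuracy)
best_k <- k_values[which.max(accuracy)]
```

Because the accuracy is measured on the same test set that is used to pick K, small differences between neighbouring K values (such as 1 versus 3) may be within the noise of a single split.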
What I find odd is that K=1 can be an optimal value: it does not seem reasonable to me that looking only at the single closest point should give the best result.
Can you help me with this matter? Could it be because of the argument use.all = TRUE?
algorithm machine-learning statistics knn
"Could it be because of the argument use.all = TRUE?" Did you try with FALSE, and is that what prompted you to ask the question?
– gsamaras
Nov 23 '18 at 15:06
It's interesting. With use.all=FALSE I get a slightly worse result, but again k=1 and k=3 are the most accurate. I get the best results when use.all=TRUE. And if I don't set the argument at all, I get a slightly different result, in between TRUE and FALSE. I tried it many times (as results may vary when using use.all=FALSE).
– Johnny
Nov 24 '18 at 12:38
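For comparing runs like these, it may help to fix the random seed. According to the `class::knn` documentation, `use.all` controls how distance ties at the k-th neighbour are handled (its default is TRUE), and majority-vote ties are broken at random, so repeated calls can differ unless the seed is set. A small sketch on made-up binary data; the helper `run` and the data are hypothetical:

```r
library(class)

# Made-up binary data: with 0/1 features, tied distances are common.
set.seed(0)
X_train <- matrix(rbinom(60 * 8, 1, 0.5), ncol = 8)
y_train <- factor(sample(c("a", "b"), 60, replace = TRUE), levels = c("a", "b"))
X_test  <- matrix(rbinom(20 * 8, 1, 0.5), ncol = 8)

# Hypothetical helper: reset the seed, then classify with the given use.all.
run <- function(seed, use_all) {
  set.seed(seed)
  knn(train = X_train, test = X_test, cl = y_train, k = 3, use.all = use_all)
}

# With the same seed, two runs give identical predictions for either setting.
same_seed_false <- identical(run(42, FALSE), run(42, FALSE))
same_seed_true  <- identical(run(42, TRUE),  run(42, TRUE))
print(c(same_seed_false, same_seed_true))
```

With the seed fixed this way, any remaining difference between use.all=TRUE and use.all=FALSE comes from the tie handling itself rather than from run-to-run randomness.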
edited Nov 23 '18 at 19:53 by juvian
asked Nov 23 '18 at 14:45 by Johnny