Unable to run parameter tuning for XGBoost regression model using caret
I am trying to build a regression model on the Boston Housing data with the caret package. The code is as follows:
library(tidyverse)
library(ggplot2)
library(lubridate)
library(broom)
library(caret)
library(xgboost)
#list.files()
options(scipen = 999)
library(MASS)
data_model <- Boston
data_model <- as.data.frame(data_model)
# based on this link https://stackoverflow.com/questions/51762536/r-xgboost-on-caret-attempts-to-perform-classification-instead-of-regression
data_model$medv <- as.double(data_model$medv)
data_model$zn <- as.double(data_model$zn)
xgb_grid_1 = expand.grid(
nrounds = 1000,
max_depth = c(2, 4, 6, 8, 10),
eta=c(0.5, 0.1, 0.07),
gamma = 0.01,
colsample_bytree=0.5,
min_child_weight=1,
subsample=0.5
)
xgb_trcontrol_1 = trainControl(
method = "cv",
number = 5,
allowParallel = TRUE
)
xgb_train_1 = train(
x = data_model %>% dplyr::select(-medv) %>% as.matrix(),
y = as.matrix(data_model$medv),
trControl = xgb_trcontrol_1,
tuneGrid = xgb_grid_1,
method = "xgbTree",
metric = 'RMSE'
)
sessionInfo()
But when I run the train() function I get the error "Error: Metric RMSE not applicable for classification models". I then tried changing the variables that were integers to double, as suggested by the linked question (see the comment in the code above), but I still get the same error. Am I missing an extra parameter that should take care of this? Thank you in advance! I have also included my session information below in case there is a version conflict that I am not aware of.
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] MASS_7.3-47 bindrcpp_0.2.2 xgboost_0.71.2 caret_6.0-81 lattice_0.20-35 broom_0.4.2 lubridate_1.6.0 dplyr_0.7.8 purrr_0.2.3
[10] readr_1.1.1 tidyr_0.7.2 tibble_1.4.2 ggplot2_2.2.1.9000 tidyverse_1.1.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 class_7.3-14 utf8_1.1.3 assertthat_0.2.0 ipred_0.9-6 psych_1.7.5 foreach_1.4.3 R6_2.2.2
[9] cellranger_1.1.0 plyr_1.8.4 stats4_3.4.0 httr_1.3.1 pillar_1.2.1 rlang_0.3.0.1 lazyeval_0.2.1 readxl_1.0.0
[17] rstudioapi_0.7 data.table_1.10.4 rpart_4.1-11 Matrix_1.2-9 splines_3.4.0 gower_0.1.2 stringr_1.3.0 foreign_0.8-67
[25] munsell_0.4.3 compiler_3.4.0 modelr_0.1.1 pkgconfig_2.0.1 mnormt_1.5-5 nnet_7.3-12 tidyselect_0.2.5 prodlim_2018.04.18
[33] codetools_0.2-15 crayon_1.3.4 withr_2.1.2 recipes_0.1.4 ModelMetrics_1.1.0 grid_3.4.0 nlme_3.1-131 jsonlite_1.5
[41] gtable_0.2.0 magrittr_1.5 waterfalls_0.1.2 scales_0.5.0.9000 cli_1.0.0 stringi_1.1.7 reshape2_1.4.3 timeDate_3012.100
[49] xml2_1.2.0 generics_0.0.1 lava_1.6.1 iterators_1.0.8 tools_3.4.0 forcats_0.2.0 glue_1.3.0 hms_0.3
[57] parallel_3.4.0 survival_2.41-3 colorspace_1.3-2 xgboostExplainer_0.1 rvest_0.3.2 bindr_0.1.1 haven_1.1.0
r r-caret xgboost
It is the Boston data from MASS
– adhok
Nov 23 '18 at 5:38
Like the comment in the link you have mentioned says; change the target variable y = as.matrix(data_model$medv) to double, i.e., y = as.double(data_model$medv)
– discipulus
Nov 23 '18 at 5:44
edited Nov 25 '18 at 14:58 by jmuhlenkamp
asked Nov 23 '18 at 5:16 by adhok
1 Answer
You have already converted data_model$medv (and data_model$zn) to double. So just remove as.matrix() from the argument y = as.matrix(data_model$medv), i.e. pass y = data_model$medv.
answered Nov 23 '18 at 5:50 by TeeKea
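The fix suggested in the answer and in the comment can be sketched as follows. This is a minimal illustration, not part of the original thread. A likely explanation (an assumption, not stated by the posters) is that train() infers regression versus classification from the type of y, and a one-column matrix is not treated like a plain numeric vector. The wrapper fit_xgb and its argument names are hypothetical and exist only so that nothing is fitted until the function is called.

```r
library(MASS)  # provides the Boston data frame

data_model <- as.data.frame(Boston)

# What the question passes as y: a one-column matrix, not a plain vector.
y_matrix <- as.matrix(data_model$medv)
# What the answer and the comment suggest: a plain double vector.
y_vector <- as.double(data_model$medv)

is.matrix(y_matrix)   # TRUE
is.matrix(y_vector)   # FALSE
is.numeric(y_vector)  # TRUE

# Hypothetical wrapper around the corrected call. x, grid and ctrl stand for
# the predictor matrix, xgb_grid_1 and xgb_trcontrol_1 from the question.
fit_xgb <- function(x, y, grid, ctrl) {
  caret::train(
    x = x,
    y = y,               # plain numeric vector instead of as.matrix(...)
    trControl = ctrl,
    tuneGrid = grid,
    method = "xgbTree",
    metric = "RMSE"
  )
}
```

With the objects from the question defined, the call would be fit_xgb(as.matrix(data_model[, names(data_model) != "medv"]), y_vector, xgb_grid_1, xgb_trcontrol_1). Note that the grid in the question (15 parameter combinations, 1000 rounds, 5 folds) can take a while to run.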