How to split a data frame based on one variable and create cross table separately between other two variable...

My dataset: df

PID<-c(1,2,3,4,5,6,7,8,9)
gender<-c(1,1,0,1,0,0,0,1,1)
smoking<-c(1,1,0,0,0,0,1,0,1)
disease<-c(1,0,0,1,1,1,0,1,0)
BMI<-c(24,23,21,28,29,21,18,19,16)
df<-data.frame(PID, gender, smoking, disease, BMI)

I want to split this dataset by gender, then build a crosstab of smoking against disease within each gender group. How can I do this?

Expected outcome (first question):

Gender:1
crosstab between smoking and disease

Gender:2
crosstab between smoking and disease

Expected outcome (second question):

Gender:1
mean of BMI

Gender:2
mean of BMI

  • Have you tried anything yet? I'd think xtabs and aggregate should answer both questions.
    – r2evans
    Nov 5 at 2:28















r

edited Nov 5 at 2:31

asked Nov 5 at 2:15
Bikram Adhitya Adhikari
154


2 Answers

No need for external packages:

xtabs(~smoking+disease+gender,data=df)
# , , gender = 0
#
#        disease
# smoking 0 1
#       0 1 2
#       1 1 0
#
# , , gender = 1
#
#        disease
# smoking 0 1
#       0 0 2
#       1 2 1
aggregate(df$BMI, list(gender=df$gender), FUN=mean)
#   gender     x
# 1      0 22.25
# 2      1 22.00

Similarly (thanks thelatemail):

aggregate(BMI ~ gender, data = df, FUN=mean)
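If separate per-gender objects are wanted, as the question literally asks, a minimal base R sketch could look like the following. The names `by_gender`, `tabs`, and `bmi_means` are illustrative and not from the original answer; the data frame is the one from the question.

```r
# Data frame from the question
df <- data.frame(
  PID     = 1:9,
  gender  = c(1, 1, 0, 1, 0, 0, 0, 1, 1),
  smoking = c(1, 1, 0, 0, 0, 0, 1, 0, 1),
  disease = c(1, 0, 0, 1, 1, 1, 0, 1, 0),
  BMI     = c(24, 23, 21, 28, 29, 21, 18, 19, 16)
)

# Split into one data frame per gender level, then count the rows in each
# smoking x disease cell with a one-sided xtabs formula
by_gender <- split(df, df$gender)
tabs <- lapply(by_gender, function(d) xtabs(~ smoking + disease, data = d))
tabs

# Mean BMI per gender, returned as a named vector
bmi_means <- tapply(df$BMI, df$gender, mean)
bmi_means
```

Each element of `tabs` is an ordinary two-way table, so `tabs[["0"]]` and `tabs[["1"]]` can be used on their own. If some group contained only smokers or only non-smokers, its table would have fewer rows; converting the columns to factors before splitting should keep every level.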





  • aggregate(BMI ~ gender, data=df, FUN=mean) might be a bit more readable and keeps the formula interface consistent. Or even aggregate(df["BMI"], df["gender"], FUN=mean) if formulas are not to your taste.
    – thelatemail
    Nov 5 at 3:56
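To make the comparison concrete, here is a small sketch that runs the three `aggregate` calls mentioned in this thread side by side on the question's data. The result names `a1`, `a2`, and `a3` are only for illustration.

```r
# Data frame from the question
df <- data.frame(
  PID     = 1:9,
  gender  = c(1, 1, 0, 1, 0, 0, 0, 1, 1),
  smoking = c(1, 1, 0, 0, 0, 0, 1, 0, 1),
  disease = c(1, 0, 0, 1, 1, 1, 0, 1, 0),
  BMI     = c(24, 23, 21, 28, 29, 21, 18, 19, 16)
)

# Vector plus grouping list: the value column is named "x"
a1 <- aggregate(df$BMI, list(gender = df$gender), FUN = mean)

# Formula interface: the value column keeps the name "BMI"
a2 <- aggregate(BMI ~ gender, data = df, FUN = mean)

# Data-frame subsetting: also keeps the name "BMI"
a3 <- aggregate(df["BMI"], df["gender"], FUN = mean)

names(a1)
names(a2)
names(a3)
```

All three give the same group means here; the visible difference is the column naming. One further point to keep in mind is that the formula method drops rows with missing values by default (`na.action = na.omit`), which does not matter for this data because it has no `NA`s.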




















Here is a possible way for the first question using the %>% pipe (available via dplyr or magrittr):

library(dplyr)
library(magrittr)

> df %>% split(.$gender) %>% lapply(function(x) xtabs(~smoking+disease, data=x))
$`0`
       disease
smoking 0 1
      0 1 2
      1 1 0

$`1`
       disease
smoking 0 1
      0 0 2
      1 2 1
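One detail worth checking when building per-group tables: in `xtabs`, a variable on the left-hand side of the formula is summed within each cell, whereas a one-sided formula counts rows. A short sketch on the `gender == 0` rows of the question's data shows the difference (`g0`, `summed`, and `counted` are illustrative names):

```r
# Data frame from the question
df <- data.frame(
  PID     = 1:9,
  gender  = c(1, 1, 0, 1, 0, 0, 0, 1, 1),
  smoking = c(1, 1, 0, 0, 0, 0, 1, 0, 1),
  disease = c(1, 0, 0, 1, 1, 1, 0, 1, 0),
  BMI     = c(24, 23, 21, 28, 29, 21, 18, 19, 16)
)

# Rows where gender is coded 0
g0 <- df[df$gender == 0, ]

# Left-hand side present: each cell holds the SUM of gender, which is 0 here
summed <- xtabs(gender ~ smoking + disease, data = g0)

# One-sided formula: each cell holds the NUMBER of rows
counted <- xtabs(~ smoking + disease, data = g0)

summed
counted
```

For the group coded 1 the two forms happen to agree, because summing ones is the same as counting; for the group coded 0 the summed table is all zeros, so the one-sided formula is the safer choice for a crosstab.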





  • You really don't need dplyr here ... xtabs(~smoking+disease+gender,data=df) does effectively the same thing
    – r2evans
    Nov 5 at 3:19
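As a rough illustration of that point, the three-way table can be sliced along its third dimension to recover one smoking x disease table per gender. This is only a sketch on the question's data; `tab3`, `tab_g0`, and `tab_g1` are illustrative names.

```r
# Data frame from the question
df <- data.frame(
  PID     = 1:9,
  gender  = c(1, 1, 0, 1, 0, 0, 0, 1, 1),
  smoking = c(1, 1, 0, 0, 0, 0, 1, 0, 1),
  disease = c(1, 0, 0, 1, 1, 1, 0, 1, 0),
  BMI     = c(24, 23, 21, 28, 29, 21, 18, 19, 16)
)

# Three-way contingency table: smoking x disease x gender
tab3 <- xtabs(~ smoking + disease + gender, data = df)

# Index the third dimension by its label to get one 2 x 2 slice per gender
tab_g0 <- tab3[, , "0"]
tab_g1 <- tab3[, , "1"]

tab_g0
tab_g1
```

Because the slices come from a single table, every slice has the same rows and columns even when a combination never occurs in one of the groups.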











edited Nov 5 at 4:03
answered Nov 5 at 3:19
r2evans
23.7k32856

answered Nov 5 at 3:06
mysteRious
1,9702512
