How to split a data frame based on one variable and create a cross table separately between the other two variables...

My dataset: df

PID <- c(1,2,3,4,5,6,7,8,9)
gender <- c(1,1,0,1,0,0,0,1,1)
smoking <- c(1,1,0,0,0,0,1,0,1)
disease <- c(1,0,0,1,1,1,0,1,0)
BMI <- c(24,23,21,28,29,21,18,19,16)
df <- data.frame(PID, gender, smoking, disease, BMI)

I want to split this dataset by gender and then build a cross table between smoking and disease within each gender group. How can I do this?
Expected outcome (first question):

Gender: 0
crosstab between smoking and disease

Gender: 1
crosstab between smoking and disease

Expected outcome (second question):

Gender: 0
mean of BMI

Gender: 1
mean of BMI

edited Nov 5 at 2:31

asked Nov 5 at 2:15

Bikram Adhitya Adhikari


Have you tried anything yet? I'd think xtabs and aggregate should answer both questions.
– r2evans
Nov 5 at 2:28

2 Answers
No need for external packages:

xtabs(~smoking+disease+gender,data=df)
# , , gender = 0
#
#        disease
# smoking 0 1
#       0 1 2
#       1 1 0
#
# , , gender = 1
#
#        disease
# smoking 0 1
#       0 0 2
#       1 2 1

aggregate(df$BMI, list(gender=df$gender), FUN=mean)
#   gender     x
# 1      0 22.25
# 2      1 22.00

Similarly (thanks thelatemail):

aggregate(BMI ~ gender, data = df, FUN=mean)
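If each gender's table is needed as a separate object rather than as a slice of one printed array, the three-way xtabs result can be indexed along its third dimension. The following is a minimal sketch, assuming the df from the question (rebuilt here so the block runs on its own); the names tab3, tab_g0 and tab_g1 are illustrative only.

```r
# Rebuild the data frame from the question so the block is self-contained
df <- data.frame(
  PID     = 1:9,
  gender  = c(1,1,0,1,0,0,0,1,1),
  smoking = c(1,1,0,0,0,0,1,0,1),
  disease = c(1,0,0,1,1,1,0,1,0),
  BMI     = c(24,23,21,28,29,21,18,19,16)
)

# Three-way contingency table: smoking x disease x gender
tab3 <- xtabs(~ smoking + disease + gender, data = df)

# Pull out the 2 x 2 smoking-by-disease table for each gender level
tab_g0 <- tab3[, , "0"]
tab_g1 <- tab3[, , "1"]

# ftable() prints the same counts in one flat layout,
# with gender and smoking as row variables
ftable(tab3, row.vars = c("gender", "smoking"))
```

Indexing by the level name ("0", "1") rather than by position keeps the code readable if the coding of gender changes.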

edited Nov 5 at 4:03

answered Nov 5 at 3:19

r2evans


aggregate(BMI ~ gender, data=df, FUN=mean) might be a bit more readable and keep the consistent formula interface. Or even aggregate(df["BMI"], df["gender"], FUN=mean) if the formulas are not to your taste.
– thelatemail
Nov 5 at 3:56
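When a plain named vector of group means is enough and a data frame is not required, tapply may be another base R option. This is only a sketch under the assumption that df is defined as in the question; the name bmi_by_gender is made up for illustration.

```r
# Rebuild the relevant columns from the question so the block is self-contained
df <- data.frame(
  gender = c(1,1,0,1,0,0,0,1,1),
  BMI    = c(24,23,21,28,29,21,18,19,16)
)

# tapply() applies mean() to BMI within each gender level and
# returns a vector named by the levels ("0" and "1")
bmi_by_gender <- tapply(df$BMI, df$gender, mean)
bmi_by_gender
```

Compared with aggregate, the result is easier to index by level name, for example bmi_by_gender["1"], but it loses the data frame layout.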

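Here is a possible way for the first question using the pipe from dplyr/magrittr. Note that the xtabs formula has no left-hand side, so it counts rows; putting gender on the left-hand side would sum the gender values instead and give all zeros for gender 0:

library(dplyr)
library(magrittr)

> df %>% split(.$gender) %>% lapply(function(x) xtabs(~smoking+disease, data=x))
$`0`
       disease
smoking 0 1
      0 1 2
      1 1 0

$`1`
       disease
smoking 0 1
      0 0 2
      1 2 1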
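The same split-then-apply idea also covers the second question and does not strictly need a pipe. Below is a minimal base R sketch, assuming the df from the question (rebuilt here so the block is self-contained); by_gender, tabs and bmi_means are illustrative names.

```r
# Rebuild the data frame from the question so the block is self-contained
df <- data.frame(
  PID     = 1:9,
  gender  = c(1,1,0,1,0,0,0,1,1),
  smoking = c(1,1,0,0,0,0,1,0,1),
  disease = c(1,0,0,1,1,1,0,1,0),
  BMI     = c(24,23,21,28,29,21,18,19,16)
)

# Split the rows by gender; the list names are the gender levels "0" and "1"
by_gender <- split(df, df$gender)

# First question: count table of smoking vs. disease within each gender.
# The formula has no left-hand side, so xtabs counts rows.
tabs <- lapply(by_gender, function(d) xtabs(~ smoking + disease, data = d))

# Second question: mean BMI within each gender
bmi_means <- sapply(by_gender, function(d) mean(d$BMI))

tabs
bmi_means
```

One caveat with splitting first: if a smoking or disease value never occurs within one gender, that row or column is missing from that group's table unless the variables are converted to factors beforehand.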

answered Nov 5 at 3:06

mysteRious


You really don't need dplyr here ... xtabs(~smoking+disease+gender,data=df) does effectively the same thing
– r2evans
Nov 5 at 3:19
