Crosstab on multiple columns
I have a dataframe with a name, day, and location. For each name-day-location triple, I want to know what proportion of the rows with that name-day have that location.
In code, I am starting with df and looking for expected.
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left"},
        {"name": "Alice", "day": "friday", "location": "right"},
        {"name": "Bob", "day": "monday", "location": "left"},
    ]
)
print(df)

expected = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left", "row_percent": 50.0},
        {"name": "Alice", "day": "friday", "location": "right", "row_percent": 50.0},
        {"name": "Bob", "day": "monday", "location": "left", "row_percent": 100.0},
    ]
).set_index(['name', 'day'])
print(expected)
Printed:
In [13]: df
Out[13]:
      day location   name
0  friday     left  Alice
1  friday    right  Alice
2  monday     left    Bob

In [12]: expected
Out[12]:
             location  row_percent
name  day
Alice friday     left         50.0
      friday    right         50.0
Bob   monday     left        100.0
python pandas
edited Nov 5 at 4:06 by user3483203
asked Nov 5 at 3:49 by Hatshepsut
1 Answer
Accepted answer (score 3)
Using groupby and value_counts:
df.groupby(['name', 'day']).location.value_counts(normalize=True).mul(100)
name   day     location
Alice  friday  left         50.0
               right        50.0
Bob    monday  left        100.0
Name: location, dtype: float64
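To make it clearer what normalize=True is doing, here is a rough sketch that builds the same percentages by hand: count the rows per name-day-location triple, then divide by the number of rows in the matching name-day group. The variable names counts, totals and row_percent are just illustrative, and this is an equivalent formulation, not what value_counts does internally.

```python
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left"},
        {"name": "Alice", "day": "friday", "location": "right"},
        {"name": "Bob", "day": "monday", "location": "left"},
    ]
)

# Number of rows for each observed (name, day, location) triple.
counts = df.groupby(["name", "day", "location"]).size()

# Number of rows in each (name, day) group, aligned back to the triples.
totals = counts.groupby(level=["name", "day"]).transform("sum")

# Share of each location within its (name, day) group, as a percentage.
row_percent = counts.div(totals).mul(100).rename("row_percent")
print(row_percent)
```

The result is a Series indexed by name, day and location, so the same reset_index step can be used to move location back into a column.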
With a bit more cleaning for your desired output:
out = (df.groupby(['name', 'day']).location.value_counts(normalize=True).mul(100)
.rename('row_percent').reset_index(2))
             location  row_percent
name  day
Alice friday     left         50.0
      friday    right         50.0
Bob   monday     left        100.0
out == expected
              location  row_percent
name  day
Alice friday      True         True
      friday      True         True
Bob   monday      True         True
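Since the title asks about crosstab, here is a sketch of that route as well, assuming the same df as in the question. pd.crosstab accepts a list of columns for the rows, and normalize="index" makes each name-day row sum to 1. The result is wide (one column per location), so it has to be melted back to long form, and combinations that never occur show up as 0 and need to be filtered out. The names wide and long are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "Alice", "day": "friday", "location": "left"},
        {"name": "Alice", "day": "friday", "location": "right"},
        {"name": "Bob", "day": "monday", "location": "left"},
    ]
)

# Wide table: one row per (name, day), one column per location,
# each row normalized to sum to 100.
wide = pd.crosstab(
    [df["name"], df["day"]], df["location"], normalize="index"
).mul(100)

# Back to long form: one row per (name, day, location).
long = wide.reset_index().melt(
    id_vars=["name", "day"], var_name="location", value_name="row_percent"
)

# Drop combinations that never occur (for example Bob/monday/right).
long = (
    long[long["row_percent"] > 0]
    .sort_values(["name", "day", "location"])
    .set_index(["name", "day"])
)
print(long)
```

For this particular shape of output the groupby version is shorter, because value_counts only returns combinations that actually occur; crosstab is probably the better fit when the wide table itself is what you want.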
answered Nov 5 at 3:52 by user3483203