Checking multiple columns condition in pandas
I want to create a new column in my dataframe that holds the name of a column if that column is the only one with a value of 8 in the row; otherwise the new column's value for that row should be "NONE". For the dataframe df below, the new column would be df["New_Column"] = ["NONE","NONE","A","NONE"].
df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})
python pandas
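To pin down the requirement before looking at vectorized answers, here is a minimal row-wise sketch of the rule as stated in the question. The helper name only_column_with_8 and its target parameter are hypothetical, introduced only for illustration; it is a slow reference check, not one of the answers.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})

def only_column_with_8(row, target=8):
    """Return the label of the single column equal to target, else "NONE"."""
    # Collect every column in this row whose value equals the target.
    hits = [col for col, value in row.items() if value == target]
    # Exactly one hit yields its label; zero or several hits yield "NONE".
    return hits[0] if len(hits) == 1 else "NONE"

# Apply the rule to each row (axis=1) to build the expected column.
expected = df.apply(only_column_with_8, axis=1)
print(expected.tolist())  # ['NONE', 'NONE', 'A', 'NONE']
```

Row 3 has an 8 in both B and C, so under the rule in the question it gets "NONE".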
What is the expected outcome when two or more columns have a value of 8 in the same row? Both column names (i.e. "BC")? Only the first/last?
– Tomas Farias
Nov 5 at 2:15
@DYZ are you implying one should replace the value 8 with the name of the column and the rest of rows with None? If that's the case, I got a bit confused by creating a new column instead of, say, modifying a column.
– Tomas Farias
Nov 5 at 2:18
@TomasFarias I think the description of the problem is pretty clear (at least for a new contributor). A new column shall be created.
– DYZ
Nov 5 at 2:20
– rer49 (new contributor), asked Nov 5 at 2:06; edited by DYZ, Nov 5 at 2:10
3 Answers
Accepted answer (score 2)
Cool problem.
- Find the 8-fields in each row:
df==8
- Count them:
(df==8).sum(axis=1)
- Find the rows where the count is 1:
(df==8).sum(axis=1)==1
- Select just those rows from the original dataframe:
df[(df==8).sum(axis=1)==1]
- Find the 8-fields again:
df[(df==8).sum(axis=1)==1]==8
- Find the columns that hold the True values with idxmax (because True>False):
(df[(df==8).sum(axis=1)==1]==8).idxmax(axis=1)
- Fill in the gaps with "NONE"
To summarize:
df["New_Column"] = (df[(df==8).sum(axis=1)==1]==8).idxmax(axis=1)
df["New_Column"] = df["New_Column"].fillna("NONE")
#   A  B  C New_Column
#0  1  0  0       NONE
#1  2  2  0       NONE
#2  8  4  7          A
#3  3  8  8       NONE
# I added another line as a proof of concept
#4  0  8  0          B
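The steps above can be sketched with named intermediates so each stage can be inspected. This is only an illustration: the variable names are mine, the fifth row is the proof-of-concept row mentioned in the answer, and reindex is used here in place of column assignment to align the result back to every row.

```python
import pandas as pd

# Question's data plus the extra proof-of-concept row (0, 8, 0).
df = pd.DataFrame({"A": [1, 2, 8, 3, 0], "B": [0, 2, 4, 8, 8], "C": [0, 0, 7, 8, 0]})

is_eight = df == 8                    # boolean frame: True where a cell is 8
count = is_eight.sum(axis=1)          # number of 8s per row
exactly_one = count == 1              # rows with a single 8

# idxmax returns the first column holding the row maximum; since True > False,
# that is the column containing the lone 8.
names = is_eight[exactly_one].idxmax(axis=1)

# Rows that were filtered out are missing from `names`; restore them as "NONE".
result = names.reindex(df.index).fillna("NONE")
print(result.tolist())  # ['NONE', 'NONE', 'A', 'NONE', 'B']
```

Filtering before calling idxmax matters: on a row with no 8 at all, idxmax would still return the first column label.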
Answer (score 1)
You can accomplish this using idxmax and a mask:
out = (df==8).idxmax(axis=1)
m = ~(df==8).any(axis=1) | ((df==8).sum(axis=1) > 1)
df.assign(col=out.mask(m))
   A  B  C  col
0  1  0  0  NaN
1  2  2  0  NaN
2  8  4  7    A
3  3  8  8  NaN
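The output above leaves NaN where the question asks for "NONE". A small variant, sketched here as an assumption rather than as the answer's own code, uses Series.where to write the fallback string directly. It also relies on the observation that "no 8 at all, or more than one 8" is the same condition as "the count of 8s is not exactly 1".

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})

is_eight = df.eq(8)                        # same as df == 8
first_hit = is_eight.idxmax(axis=1)        # first True column; "A" for all-False rows
exactly_one = is_eight.sum(axis=1).eq(1)   # complement of the mask m above

# Keep the label only where exactly one 8 occurs; otherwise use "NONE".
result = df.assign(New_Column=first_hit.where(exactly_one, "NONE"))
print(result)
```

Because assign returns a new frame, df itself is left unchanged; assign the column directly if in-place modification is wanted.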
Answer (score 1)
Or do:
df2 = df[df == 8]
# Note: .values[0] assumes at least one row contains two or more 8s
df['New_Column'] = (df2[(df2 != df2.dropna(thresh=2).values[0]).all(axis=1)].dropna(how='all')).idxmax(axis=1)
df['New_Column'] = df['New_Column'].fillna('NONE')
print(df)
dropna + dropna again + idxmax + fillna: that's all you need for this.
Output:
   A  B  C New_Column
0  1  0  0       NONE
1  2  2  0       NONE
2  8  4  7          A
3  3  8  8       NONE
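The code above compares every row against the first row that holds two or more 8s, so it depends on such a row existing in the data. A hedged alternative that keeps the same masked frame df2 but counts the surviving cells per row might look like the sketch below; the function name label_single_hit is hypothetical, and this is not the answer's own method.

```python
import pandas as pd

def label_single_hit(frame, target=8):
    """Label rows where exactly one column equals target; others get "NONE"."""
    masked = frame[frame == target]             # NaN wherever the cell is not target
    one_hit = masked.notna().sum(axis=1) == 1   # rows with a single surviving cell
    names = masked[one_hit].idxmax(axis=1)      # label of that surviving cell
    return names.reindex(frame.index).fillna("NONE")

df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})
print(label_single_hit(df).tolist())     # ['NONE', 'NONE', 'A', 'NONE']

# A frame in which no row contains two 8s.
other = pd.DataFrame({"A": [1, 8], "B": [8, 0]})
print(label_single_hit(other).tolist())  # ['B', 'A']
```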
Thank you thank you
– rer49
Nov 5 at 3:31
– DYZ, answered Nov 5 at 2:28, edited Nov 5 at 2:33 (accepted answer)
– user3483203, answered Nov 5 at 2:38 (idxmax and mask answer)
– U9-Forward, answered Nov 5 at 3:28 (dropna answer)