Python: Pandas Module - Nested IF Statement that Fills in NaN (Empty Values) in a Dataframe
I've created a function that tests multiple IF statements given the data in the 'Name' column.
Criteria 1: If 'Name' is blank, return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'.
Criteria 2: If 'Name' == 'GENERAL', return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'
Else: Return the 'Name'
    def account_name(row):
        if row['Name'] == None and row['Secondary_Name'] == None:
            return row['Third_Name']
        elif row['Name'] == 'GENERAL':
            if row['Secondary_Name'] == None:
                return row['Third_Name']
        else:
            return row['Name']
I've tried == None, == np.NaN, == Null, .isnull(), == '', == '0'.
None of these replaces the empty values with what I want.
Edit:
Example of DF
python-3.x pandas dataframe null
edited Nov 7 at 18:50
asked Nov 7 at 18:28
Matthew
42
can you provide us with a sample df?
– wpercy
Nov 7 at 18:38
Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51
2 Answers
Depending on the column dtype, None and NaN are not interchangeable: in a numeric column pandas stores a missing value as NaN, which is a float and not None.
None would mean that the field is blank, which it is not, since NaN is still "a value" of the given dtype.
A simple way to identify NaN is to check whether the value is equal to itself, since NaN does not compare equal to itself:
    def isNaN(value):
        # NaN is not equal to itself, so this is True only for NaN
        return value != value
And to provide an example:
    import pandas as pd

    df = pd.DataFrame(data={'ClientId': [1, 2, 3, 4], 'SomeNULLs': ['main', 'main', None, None], 'NewNULLs': [1, None, 0, 1]})
    # The None in the numeric column 'NewNULLs' is stored as NaN
    df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be:

       ClientId SomeNULLs  NewNULLs   Test
    0         1      main       1.0  False
    1         2      main       NaN   True
    2         3      None       0.0  False
    3         4      None       1.0  False
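Note that the self-comparison test catches NaN but not None, because None == None is true. A minimal sketch, assuming pandas and NumPy are installed, that compares it with pd.isna, which treats both None and NaN as missing. The helper name is_nan_self_compare and the sample values are illustrative only:

```python
import numpy as np
import pandas as pd


def is_nan_self_compare(value):
    # Hypothetical helper: NaN does not compare equal to itself.
    return value != value


samples = [np.nan, None, "main", 0]

# The self-comparison flags NaN only; None compares equal to itself.
self_compare = [is_nan_self_compare(v) for v in samples]

# pd.isna flags both NaN and None as missing.
with_isna = [bool(pd.isna(v)) for v in samples]

print(self_compare)
print(with_isna)
```

If a column can hold either kind of missing value, pd.isna (or Series.isna on the whole column) is probably the safer check.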
edited Nov 7 at 18:49
answered Nov 7 at 18:41
rogersdevop
112
Consider this df:

    df = pd.DataFrame({'Name': ['a', 'GENERAL', None], 'Secondary_Name': ['e', 'f', None], 'Third_Name': ['x', 'y', 'z']})

          Name Secondary_Name Third_Name
    0        a              e          x
    1  GENERAL              f          y
    2     None           None          z
Since you are writing the function in Python, you can use is None:

    def account_name(row):
        if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):
            return row['Third_Name']
        elif row['Name'] is None or row['Name'] == 'GENERAL':
            return row['Secondary_Name']
        else:
            return row['Name']

    df['Name'] = df.apply(account_name, axis=1)
You get:

      Name Secondary_Name Third_Name
    0    a              e          x
    1    f              f          y
    2    z           None          z
You can get the same output using pandas and a nested np.where:

    cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')
    cond2 = (cond1) & (df['Secondary_Name'].isnull())
    np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))
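The nested np.where expression returns an array and does not modify the DataFrame by itself. A minimal, self-contained sketch of the same idea with the result assigned to a column; the column name Account_Name and the extra fourth row (a NaN in 'Name') are assumptions added for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['a', 'GENERAL', None, np.nan],
    'Secondary_Name': ['e', 'f', None, 'g'],
    'Third_Name': ['x', 'y', 'z', 'w'],
})

# Rows where 'Name' is missing (None or NaN) or is the placeholder 'GENERAL'.
cond1 = df['Name'].isna() | (df['Name'] == 'GENERAL')
# Of those rows, the ones where 'Secondary_Name' is missing as well.
cond2 = cond1 & df['Secondary_Name'].isna()

# 'Account_Name' is a hypothetical column name used only for illustration.
df['Account_Name'] = np.where(
    cond2, df['Third_Name'],
    np.where(cond1, df['Secondary_Name'], df['Name']),
)
print(df['Account_Name'].tolist())
```

Because isna works on the whole column, this version treats None and NaN alike and avoids a row-by-row apply.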
answered Nov 7 at 18:52
Vaishali
16.5k3927
Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26
It's working fine with the example df I created. Can you post your sample df in the question, not as an image?
– Vaishali
Nov 7 at 20:27
Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45
That should not make any difference, because you are still applying the function to the original df and only assigning the result to a new column.
– Vaishali
Nov 7 at 22:06
I figured. I'm not sure why there are still null values in the new column when I export the DataFrame. All of them fall under the first condition and should return the 'Third_Name' info.
– Matthew
Nov 7 at 22:15
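One possible explanation for the remaining nulls, assuming the blanks in the real data are stored as NaN (or as empty strings) rather than None: NaN is a float, so row['Name'] is None evaluates to False for it and the first branch is never taken. A minimal sketch of a row function that uses pd.isna instead; the helper is_blank and the sample data are illustrative assumptions, not the asker's actual DataFrame:

```python
import numpy as np
import pandas as pd


def is_blank(value):
    # Hypothetical helper: treat None, NaN and whitespace-only strings as blank.
    return pd.isna(value) or (isinstance(value, str) and value.strip() == '')


def account_name(row):
    name_unusable = is_blank(row['Name']) or row['Name'] == 'GENERAL'
    if name_unusable and is_blank(row['Secondary_Name']):
        return row['Third_Name']
    if name_unusable:
        return row['Secondary_Name']
    return row['Name']


# Illustrative data: blanks stored as NaN and as an empty string.
df = pd.DataFrame({
    'Name': ['a', 'GENERAL', np.nan, ''],
    'Secondary_Name': ['e', np.nan, np.nan, 'g'],
    'Third_Name': ['x', 'y', 'z', 'w'],
})
df['Target_Col'] = df.apply(account_name, axis=1)
print(df['Target_Col'].tolist())
```

Checking the column with df['Name'].isna().sum() before and after applying the function is a quick way to confirm which kind of missing value is actually present.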