Python: Pandas Module - Nested IF Statement that Fills in NaN (Empty Values) in a Dataframe

up vote
0
down vote

favorite

I've created a function that tests multiple IF statements given the data in the 'Name' column.

Criteria 1: If 'Name' is blank, return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'.

Criteria 2: If 'Name' == 'GENERAL', return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'

Else: Return the 'Name'

def account_name(row):

    if row['Name'] == None and row['Secondary_Name'] == None:

        return row['Third_Name']



    elif row['Name'] == 'GENERAL':

        if row['Secondary_Name'] == None:

            return row['Third_Name']



    else:

        return row['Name']

I've tried == None, == np.NaN, == Null, .isnull(), == '', == '0'.

Nothing seems to replace the empty values to what I want.

Edit:

Example of DF

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

can you provide us with a sample df?
– wpercy
Nov 7 at 18:38

Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51

add a comment |

up vote
0
down vote

favorite

I've created a function that tests multiple IF statements given the data in the 'Name' column.

Criteria 1: If 'Name' is blank, return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'.

Criteria 2: If 'Name' == 'GENERAL', return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'

Else: Return the 'Name'

def account_name(row):

    if row['Name'] == None and row['Secondary_Name'] == None:

        return row['Third_Name']



    elif row['Name'] == 'GENERAL':

        if row['Secondary_Name'] == None:

            return row['Third_Name']



    else:

        return row['Name']

I've tried == None, == np.NaN, == Null, .isnull(), == '', == '0'.

Nothing seems to replace the empty values to what I want.

Edit:

Example of DF

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

can you provide us with a sample df?
– wpercy
Nov 7 at 18:38

Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51

add a comment |

up vote
0
down vote

favorite

I've created a function that tests multiple IF statements given the data in the 'Name' column.

Criteria 1: If 'Name' is blank, return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'.

Criteria 2: If 'Name' == 'GENERAL', return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'

Else: Return the 'Name'

def account_name(row):

    if row['Name'] == None and row['Secondary_Name'] == None:

        return row['Third_Name']



    elif row['Name'] == 'GENERAL':

        if row['Secondary_Name'] == None:

            return row['Third_Name']



    else:

        return row['Name']

I've tried == None, == np.NaN, == Null, .isnull(), == '', == '0'.

Nothing seems to replace the empty values to what I want.

Edit:

Example of DF

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

I've created a function that tests multiple IF statements given the data in the 'Name' column.

Criteria 1: If 'Name' is blank, return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'.

Criteria 2: If 'Name' == 'GENERAL', return the 'Secondary_Name'. However, if 'Secondary_Name' is also blank, return the 'Third_Name'

Else: Return the 'Name'

def account_name(row):

    if row['Name'] == None and row['Secondary_Name'] == None:

        return row['Third_Name']



    elif row['Name'] == 'GENERAL':

        if row['Secondary_Name'] == None:

            return row['Third_Name']



    else:

        return row['Name']

I've tried == None, == np.NaN, == Null, .isnull(), == '', == '0'.

Nothing seems to replace the empty values to what I want.

Edit:

Example of DF

python-3.x pandas dataframe null

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

edited Nov 7 at 18:50

asked Nov 7 at 18:28

Matthew

asked Nov 7 at 18:28

Matthew

asked Nov 7 at 18:28

Matthew

can you provide us with a sample df?
– wpercy
Nov 7 at 18:38

Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51

add a comment |

can you provide us with a sample df?
– wpercy
Nov 7 at 18:38

Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51

can you provide us with a sample df?
– wpercy
Nov 7 at 18:38

Example provided in original post: 'Example of DF'
– Matthew
Nov 7 at 18:51

add a comment |

2 Answers
2

active

oldest

votes

up vote
0
down vote

Depending on column dtype, NULLs dont interact well with NaN for whatever reason.
None would mean that the field is blank, which it is not since you have "a value" for a given dtype.

The simple way to ensure you can identify NaNs is to see if the field is equal to itself

def isNaN(value):

if value != value: 

    return True

else:

    return False

And to provide an example:

df = pd.DataFrame(data={'ClientId':[1,2,3,4] , 'SomeNULLs':['main','main',None,None], 'NewNULLs':[1,None,0,1]})



df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be

   ClientId SomeNULLs  NewNULLs   Test

0         1      main       1.0  False

1         2      main       NaN   True

2         3      None       0.0  False

3         4      None       1.0  False

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

add a comment |

up vote
0
down vote

Consider this df

df = pd.DataFrame({'Name':['a', 'GENERAL', None],'Secondary_Name':['e','f',None], 'Third_Name':['x', 'y', 'z']})



    Name    Secondary_Name  Third_Name

0   a       e               x

1   GENERAL f               y

2   None    None            z

Since you are writing the function in python, you can use is None

def account_name(row):



    if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):

        return row['Third_Name']



    elif row['Name'] is None or row['Name'] == 'GENERAL':

        return row['Secondary_Name']



    else:

        return row['Name']



df['Name'] = df.apply(account_name, axis = 1)

You get

    Name    Secondary_Name  Third_Name

0   a       e               x

1   f       f               y

2   z       None            z

You can get same output using pandas and nested np.where

cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')

cond2 = (cond1) & (df['Secondary_Name'].isnull())

np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))

answered Nov 7 at 18:52

Vaishali

16.5k3927

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53195614%2fpython-pandas-module-nested-if-statement-that-fills-in-nan-empty-values-in%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
0
down vote

Depending on column dtype, NULLs dont interact well with NaN for whatever reason.
None would mean that the field is blank, which it is not since you have "a value" for a given dtype.

The simple way to ensure you can identify NaNs is to see if the field is equal to itself

def isNaN(value):

if value != value: 

    return True

else:

    return False

And to provide an example:

df = pd.DataFrame(data={'ClientId':[1,2,3,4] , 'SomeNULLs':['main','main',None,None], 'NewNULLs':[1,None,0,1]})



df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be

   ClientId SomeNULLs  NewNULLs   Test

0         1      main       1.0  False

1         2      main       NaN   True

2         3      None       0.0  False

3         4      None       1.0  False

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

add a comment |

up vote
0
down vote

Depending on column dtype, NULLs dont interact well with NaN for whatever reason.
None would mean that the field is blank, which it is not since you have "a value" for a given dtype.

The simple way to ensure you can identify NaNs is to see if the field is equal to itself

def isNaN(value):

if value != value: 

    return True

else:

    return False

And to provide an example:

df = pd.DataFrame(data={'ClientId':[1,2,3,4] , 'SomeNULLs':['main','main',None,None], 'NewNULLs':[1,None,0,1]})



df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be

   ClientId SomeNULLs  NewNULLs   Test

0         1      main       1.0  False

1         2      main       NaN   True

2         3      None       0.0  False

3         4      None       1.0  False

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

add a comment |

up vote
0
down vote

Depending on column dtype, NULLs dont interact well with NaN for whatever reason.
None would mean that the field is blank, which it is not since you have "a value" for a given dtype.

The simple way to ensure you can identify NaNs is to see if the field is equal to itself

def isNaN(value):

if value != value: 

    return True

else:

    return False

And to provide an example:

df = pd.DataFrame(data={'ClientId':[1,2,3,4] , 'SomeNULLs':['main','main',None,None], 'NewNULLs':[1,None,0,1]})



df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be

   ClientId SomeNULLs  NewNULLs   Test

0         1      main       1.0  False

1         2      main       NaN   True

2         3      None       0.0  False

3         4      None       1.0  False

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

Depending on column dtype, NULLs dont interact well with NaN for whatever reason.
None would mean that the field is blank, which it is not since you have "a value" for a given dtype.

The simple way to ensure you can identify NaNs is to see if the field is equal to itself

def isNaN(value):

if value != value: 

    return True

else:

    return False

And to provide an example:

df = pd.DataFrame(data={'ClientId':[1,2,3,4] , 'SomeNULLs':['main','main',None,None], 'NewNULLs':[1,None,0,1]})



df['Test'] = df.NewNULLs.apply(isNaN)

The resulting dataset should be

   ClientId SomeNULLs  NewNULLs   Test

0         1      main       1.0  False

1         2      main       NaN   True

2         3      None       0.0  False

3         4      None       1.0  False

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

edited Nov 7 at 18:49

answered Nov 7 at 18:41

rogersdevop

112

answered Nov 7 at 18:41

rogersdevop

112

answered Nov 7 at 18:41

rogersdevop

112

add a comment |

up vote
0
down vote

Consider this df

df = pd.DataFrame({'Name':['a', 'GENERAL', None],'Secondary_Name':['e','f',None], 'Third_Name':['x', 'y', 'z']})



    Name    Secondary_Name  Third_Name

0   a       e               x

1   GENERAL f               y

2   None    None            z

Since you are writing the function in python, you can use is None

def account_name(row):



    if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):

        return row['Third_Name']



    elif row['Name'] is None or row['Name'] == 'GENERAL':

        return row['Secondary_Name']



    else:

        return row['Name']



df['Name'] = df.apply(account_name, axis = 1)

You get

    Name    Secondary_Name  Third_Name

0   a       e               x

1   f       f               y

2   z       None            z

You can get same output using pandas and nested np.where

cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')

cond2 = (cond1) & (df['Secondary_Name'].isnull())

np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))

answered Nov 7 at 18:52

Vaishali

16.5k3927

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

add a comment |

up vote
0
down vote

Consider this df

df = pd.DataFrame({'Name':['a', 'GENERAL', None],'Secondary_Name':['e','f',None], 'Third_Name':['x', 'y', 'z']})



    Name    Secondary_Name  Third_Name

0   a       e               x

1   GENERAL f               y

2   None    None            z

Since you are writing the function in python, you can use is None

def account_name(row):



    if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):

        return row['Third_Name']



    elif row['Name'] is None or row['Name'] == 'GENERAL':

        return row['Secondary_Name']



    else:

        return row['Name']



df['Name'] = df.apply(account_name, axis = 1)

You get

    Name    Secondary_Name  Third_Name

0   a       e               x

1   f       f               y

2   z       None            z

You can get same output using pandas and nested np.where

cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')

cond2 = (cond1) & (df['Secondary_Name'].isnull())

np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))

answered Nov 7 at 18:52

Vaishali

16.5k3927

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

add a comment |

up vote
0
down vote

Consider this df

df = pd.DataFrame({'Name':['a', 'GENERAL', None],'Secondary_Name':['e','f',None], 'Third_Name':['x', 'y', 'z']})



    Name    Secondary_Name  Third_Name

0   a       e               x

1   GENERAL f               y

2   None    None            z

Since you are writing the function in python, you can use is None

def account_name(row):



    if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):

        return row['Third_Name']



    elif row['Name'] is None or row['Name'] == 'GENERAL':

        return row['Secondary_Name']



    else:

        return row['Name']



df['Name'] = df.apply(account_name, axis = 1)

You get

    Name    Secondary_Name  Third_Name

0   a       e               x

1   f       f               y

2   z       None            z

You can get same output using pandas and nested np.where

cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')

cond2 = (cond1) & (df['Secondary_Name'].isnull())

np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))

answered Nov 7 at 18:52

Vaishali

16.5k3927

Consider this df

df = pd.DataFrame({'Name':['a', 'GENERAL', None],'Secondary_Name':['e','f',None], 'Third_Name':['x', 'y', 'z']})



    Name    Secondary_Name  Third_Name

0   a       e               x

1   GENERAL f               y

2   None    None            z

Since you are writing the function in python, you can use is None

def account_name(row):



    if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None):

        return row['Third_Name']



    elif row['Name'] is None or row['Name'] == 'GENERAL':

        return row['Secondary_Name']



    else:

        return row['Name']



df['Name'] = df.apply(account_name, axis = 1)

You get

    Name    Secondary_Name  Third_Name

0   a       e               x

1   f       f               y

2   z       None            z

You can get same output using pandas and nested np.where

cond1 = (df['Name'].isnull()) | (df['Name'] == 'GENERAL')

cond2 = (cond1) & (df['Secondary_Name'].isnull())

np.where(cond2, df['Third_Name'], np.where(cond1, df['Secondary_Name'], df['Name']))

answered Nov 7 at 18:52

Vaishali

16.5k3927

answered Nov 7 at 18:52

Vaishali

16.5k3927

answered Nov 7 at 18:52

Vaishali

16.5k3927

answered Nov 7 at 18:52

Vaishali

16.5k3927

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

add a comment |

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

Dear Vaishali, Thank you for the function provided above. However, when I tested it, the first IF statement didn't return the Third_Name. if (row['Name'] is None or row['Name'] == 'GENERAL') and (row['Secondary_Name'] is None): return row['Third_Name']
– Matthew
Nov 7 at 19:26

Its working fine with the example df I created. Can you post your sample df on the question, not as an image?
– Vaishali
Nov 7 at 20:27

Vaishali, Does it make a difference if I use the df.apply on a new column? For instance, df['Target_Col'] = df.apply(account_name, axis=1) as opposed to df['Name']... My target col doesn't return the ['Third_Name'] during the first TRUE condition.
– Matthew
Nov 7 at 20:45

That should not make any difference because your are still apply the function on original df but assigning the result to a new column
– Vaishali
Nov 7 at 22:06

I figured. I'm not sure why when I export the DataFrame, there is still null values in the new column. All of which fall under the first condition and should return 'Third_Name' Info.
– Matthew
Nov 7 at 22:15

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Wsrtjtyk