Python: How to find which values in a column have NaN values in another specific column (dataframes)

Suppose we have df1 that looks like this:

import numpy as np
import pandas as pd

x1 = [{'partner': "Afghanistan", 'commodity': np.nan},
      {'partner': "Zambia", 'commodity': 2},
      {'partner': "Germany", 'commodity': 2},
      {'partner': "Afghanistan", 'commodity': np.nan},
      {'partner': "Canada", 'commodity': np.nan},
      {'partner': "Italy", 'commodity': 3},
      {'partner': "Canada", 'commodity': np.nan},
      {'partner': "USA", 'commodity': np.nan}]

df1 = pd.DataFrame(x1)

I want to list the values in partner that have NaN in commodity, but without listing the same partner twice.

So my preferred result would look like this:

commodity_nan_partners=
Afghanistan
Canada
USA

and not:

Afghanistan
Afghanistan
Canada
Canada
USA









      python pandas dataframe multiple-columns nan














edited Nov 25 '18 at 2:25 by cs95
asked Nov 25 '18 at 2:20 by Hassan Dbouk
























          5 Answers
loc + isnull + drop_duplicates

You can filter your series and then drop duplicates:

res = df1.loc[df1['commodity'].isnull(), 'partner'].drop_duplicates()

print(res)

0    Afghanistan
4         Canada
7            USA
Name: partner, dtype: object






answered Nov 25 '18 at 2:25 by jpp
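As a self-contained illustration of the loc + isnull + drop_duplicates approach above, the sketch below rebuilds the question's data (with NaN written as np.nan, which is an assumption about what the asker meant) and splits the one-liner into its two steps. The name `mask` is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question, with NaN spelled as np.nan.
df1 = pd.DataFrame({
    "partner": ["Afghanistan", "Zambia", "Germany", "Afghanistan",
                "Canada", "Italy", "Canada", "USA"],
    "commodity": [np.nan, 2, 2, np.nan, np.nan, 3, np.nan, np.nan],
})

# Boolean mask: True in the rows where commodity is missing.
mask = df1["commodity"].isna()

# Select partner for those rows, then keep the first occurrence of each name.
commodity_nan_partners = df1.loc[mask, "partner"].drop_duplicates()

# Convert to a plain list when the original row labels are not needed.
print(commodity_nan_partners.tolist())
```

The Series keeps the original row labels (0, 4, 7), which can be handy for tracing a name back to the first row where it was missing.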

























You can look for NaN values using isnull, then get unique values with unique or set:

>>> pd.Series(df1.loc[df1.commodity.isnull(), 'partner'].unique())
0    Afghanistan
1         Canada
2            USA
dtype: object

# or
>>> pd.Series(list(set(df1.loc[df1.commodity.isnull(), 'partner'])))
0         Canada
1    Afghanistan
2            USA
dtype: object






answered Nov 25 '18 at 2:24 by sacuL
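The two outputs in the answer above differ only in ordering. A minimal sketch of why, assuming the question's data: unique() returns values in order of first appearance, while a set has no defined order, so sorting it is one way to get a repeatable result. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "partner": ["Afghanistan", "Zambia", "Germany", "Afghanistan",
                "Canada", "Italy", "Canada", "USA"],
    "commodity": [np.nan, 2, 2, np.nan, np.nan, 3, np.nan, np.nan],
})

# Partner values in the rows where commodity is NaN (duplicates included).
selected = df1.loc[df1["commodity"].isna(), "partner"]

# unique() keeps the values in order of first appearance.
in_order = selected.unique().tolist()

# A set is unordered, so sort it when a stable order matters.
alphabetical = sorted(set(selected))

print(in_order)
print(alphabetical)
```

With this particular data both lists happen to coincide, because the partners first appear in alphabetical order.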























Step 1

Select the partner values in the rows where commodity is NaN:

v = df1.loc[df1.commodity.isna(), 'partner']

Or,

v = df1.partner[df1.commodity.isna()]

print(v)
0    Afghanistan
3    Afghanistan
4         Canada
6         Canada
7            USA
Name: partner, dtype: object

Step 2

Drop duplicates.

If you want a collection,

v.unique()
array(['Afghanistan', 'Canada', 'USA'], dtype=object)

Or,

set(v)
{'Afghanistan', 'Canada', 'USA'}

If you want a Series,

ser = v.drop_duplicates().reset_index(drop=True)
print(ser)
0    Afghanistan
1         Canada
2            USA
Name: partner, dtype: object

If you want a DataFrame,

df = ser.to_frame()






answered Nov 25 '18 at 2:34 by cs95
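The two steps above could be wrapped into one reusable function. This is only a sketch: `partners_with_nan` and its parameter names are hypothetical and do not come from the answer or from pandas.

```python
import numpy as np
import pandas as pd


def partners_with_nan(df, value_col="commodity", key_col="partner"):
    """Return the distinct key_col values whose value_col is NaN, as a Series.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Step 1: keep key_col only in the rows where value_col is missing.
    v = df.loc[df[value_col].isna(), key_col]
    # Step 2: drop repeated names and renumber the index from 0.
    return v.drop_duplicates().reset_index(drop=True)


df1 = pd.DataFrame({
    "partner": ["Afghanistan", "Zambia", "Germany", "Afghanistan",
                "Canada", "Italy", "Canada", "USA"],
    "commodity": [np.nan, 2, 2, np.nan, np.nan, 3, np.nan, np.nan],
})

ser = partners_with_nan(df1)
df = ser.to_frame()  # one-column DataFrame named after the Series
print(df)
```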























You could also use dropna; this is just a different idea:

set(df1.partner.tolist()) - set(df1.dropna().partner.tolist())
Out[94]: {'Afghanistan', 'Canada', 'USA'}






answered Nov 25 '18 at 3:36 by Wen-Ben
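One thing worth checking with the set-difference idea above: it returns partners that never appear in a complete row, which is not quite the same as partners with at least one NaN in commodity. The two coincide for the question's data, but the sketch below uses a small made-up frame (`df2`, not from the question) where a partner has both a missing and a present value. It also passes `subset=` to dropna so that only commodity is considered, which is a change from the answer's plain dropna().

```python
import numpy as np
import pandas as pd

# Made-up data: Canada has one NaN row and one complete row.
df2 = pd.DataFrame({
    "partner": ["Canada", "Canada", "USA", "Italy"],
    "commodity": [np.nan, 5, np.nan, 3],
})

# Set difference: partners with no row where commodity is present.
complete = df2.dropna(subset=["commodity"])
never_complete = set(df2["partner"]) - set(complete["partner"])

# Direct filter: partners with at least one NaN in commodity.
any_nan = set(df2.loc[df2["commodity"].isna(), "partner"])

print(never_complete)
print(any_nan)
```

Here the set difference leaves out Canada, while the direct filter includes it, so the choice depends on which of the two questions is being asked.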























A couple of other alternatives:

>>> df1[df1.isnull().any(axis=1)]['partner'].drop_duplicates()
0    Afghanistan
4         Canada
7            USA
Name: partner, dtype: object

Using loc + np.isnan

>>> df1.loc[np.isnan(df1.commodity), 'partner'].drop_duplicates()
0    Afghanistan
4         Canada
7            USA
Name: partner, dtype: object






answered Nov 25 '18 at 6:40 by pygo
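Both alternatives above give the expected result on the question's data, but each rests on an assumption that may not hold elsewhere. The sketch below uses made-up frames (`df3` with a hypothetical `year` column, and an object-dtype Series `codes`) to show where they diverge from a column-specific isna() check.

```python
import numpy as np
import pandas as pd

# Made-up data with a second column that also contains NaN.
df3 = pd.DataFrame({
    "partner": ["Canada", "USA", "Italy"],
    "commodity": [np.nan, 2.0, 3.0],
    "year": [2018, np.nan, 2017],  # hypothetical extra column
})

# any(axis=1) flags a row if ANY column is NaN, not only commodity.
any_column = df3[df3.isna().any(axis=1)]["partner"].tolist()

# Checking the one column of interest is narrower.
one_column = df3.loc[df3["commodity"].isna(), "partner"].tolist()

# np.isnan is a numeric ufunc; on an object-dtype Series it raises TypeError,
# whereas Series.isna() handles missing values in any dtype.
codes = pd.Series(["A1", None, "B2"], dtype=object)
try:
    np.isnan(codes)
    isnan_worked = True
except TypeError:
    isnan_worked = False
missing = codes.isna().tolist()

print(any_column, one_column, isnan_worked, missing)
```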





























