Checking multiple columns condition in pandas























I want to add a new column to my dataframe that holds, for each row, the name of the column whose value is 8, but only if exactly one column in that row has the value 8. Otherwise the new column's value for that row should be "NONE". For the dataframe df, the expected result is df["New_Column"] = ["NONE","NONE","A","NONE"]



import pandas as pd

df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})





























  • What is the expected outcome when two or more columns have a value of 8 in the same row? Both column names (i.e. "BC")? Only the first/last?
    – Tomas Farias
    Nov 5 at 2:15












  • @DYZ are you implying one should replace the value 8 with the name of the column and the rest of rows with None? If that's the case, I got a bit confused by creating a new column instead of, say, modifying a column.
    – Tomas Farias
    Nov 5 at 2:18










  • @TomasFarias I think the description of the problem is pretty clear (at least for a new contributor). A new column shall be created.
    – DYZ
    Nov 5 at 2:20















python pandas






asked Nov 5 at 2:06 by rer49
edited Nov 5 at 2:10 by DYZ

3 Answers




















accepted










Cool problem.




  1. Find the 8-fields in each row: df==8

  2. Count them: (df==8).sum(axis=1)

  3. Find the rows where the count is 1: (df==8).sum(axis=1)==1

  4. Select just those rows from the original dataframe: df[(df==8).sum(axis=1)==1]

  5. Find the 8-fields again: (df[(df==8).sum(axis=1)==1]==8)

  6. Find the columns that hold the True values with idxmax (because True>False): (df[(df==8).sum(axis=1)==1]==8).idxmax(axis=1)

  7. Fill in the gaps with "NONE": df["New_Column"].fillna("NONE")


To summarize:



df["New_Column"] = (df[(df==8).sum(axis=1)==1]==8).idxmax(axis=1)
df["New_Column"] = df["New_Column"].fillna("NONE")
#    A  B  C New_Column
# 0  1  0  0       NONE
# 1  2  2  0       NONE
# 2  8  4  7          A
# 3  3  8  8       NONE
# I added another row as a proof of concept:
# 4  0  8  0          B
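
As a self-contained check, here is a minimal sketch of the same approach with the extra proof-of-concept row included. The names is_eight, exactly_one and result are illustrative, not part of the answer, and Series.where is used here in place of the separate fillna step:

```python
import pandas as pd

# Sample data from the question, plus one extra row with a single 8 in column B
df = pd.DataFrame({"A": [1, 2, 8, 3, 0], "B": [0, 2, 4, 8, 8], "C": [0, 0, 7, 8, 0]})

is_eight = df == 8                        # boolean frame: True where a cell equals 8
exactly_one = is_eight.sum(axis=1) == 1   # rows that contain exactly one 8

# idxmax gives the first column holding True in each row;
# where() keeps that label only for rows with exactly one 8
new_column = is_eight.idxmax(axis=1).where(exactly_one, "NONE")

result = df.assign(New_Column=new_column)
print(result)
```

Because idxmax also returns a label (the first column) for rows that contain no 8 at all, the exactly_one mask is what keeps those rows at "NONE".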


















    You can accomplish this using idxmax and a mask:



    out = (df == 8).idxmax(axis=1)
    # mask rows with no 8 at all, or with more than one 8
    m = ~(df == 8).any(axis=1) | ((df == 8).sum(axis=1) > 1)

    df.assign(col=out.mask(m))




       A  B  C  col
    0  1  0  0  NaN
    1  2  2  0  NaN
    2  8  4  7    A
    3  3  8  8  NaN
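
The result above leaves NaN where the question asked for "NONE". A minimal sketch of one way to get there, assuming the second argument of Series.mask is used as the replacement value; the names eights and result are illustrative, and the mask is written as "count is not 1", which covers both "no 8" and "more than one 8":

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 8, 3], "B": [0, 2, 4, 8], "C": [0, 0, 7, 8]})

eights = df == 8
out = eights.idxmax(axis=1)      # first column holding an 8 (or the first column if none)
m = eights.sum(axis=1) != 1      # True for rows with zero 8s or more than one 8

# mask() replaces the entries where m is True with "NONE"
result = df.assign(New_Column=out.mask(m, "NONE"))
print(result)
```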


















      Or do:



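      df2 = df[df == 8]  # keep the 8s, every other cell becomes NaN
      df['New_Column'] = (df2[(df2 != df2.dropna(thresh=2).values[0]).all(axis=1)].dropna(how='all')).idxmax(axis=1)
      df['New_Column'] = df['New_Column'].fillna('NONE')
      print(df)


      This combines dropna (twice), idxmax and fillna. Note that df2.dropna(thresh=2).values[0] picks the first row containing at least two 8s and raises an IndexError if there is no such row, so this relies on the shape of the sample data.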



      Output:



         A  B  C New_Column
      0  1  0  0       NONE
      1  2  2  0       NONE
      2  8  4  7          A
      3  3  8  8       NONE
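
The same NaN-based idea can also be written by counting the non-NaN cells per row, which does not depend on a particular row being present. This is a variation sketched under my own assumptions, not the code from the answer; single, new_col and result are illustrative names, and an extra row with a lone 8 in column B is added for testing:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 8, 3, 0], "B": [0, 2, 4, 8, 8], "C": [0, 0, 7, 8, 0]})

df2 = df[df == 8]                        # cells that are not 8 become NaN
single = df2[df2.count(axis=1) == 1]     # keep rows with exactly one non-NaN cell

# idxmax skips NaN, so it returns the column of the only remaining 8;
# reindex brings back the dropped rows as NaN, which fillna turns into "NONE"
new_col = single.idxmax(axis=1).reindex(df.index).fillna("NONE")

result = df.assign(New_Column=new_col)
print(result)
```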






















      • Thank you thank you
        – rer49
        Nov 5 at 3:31











Accepted answer: answered Nov 5 at 2:28 by DYZ (edited Nov 5 at 2:33)

Second answer (idxmax and a mask): answered Nov 5 at 2:38 by user3483203

Third answer (dropna approach): answered Nov 5 at 3:28 by U9-Forward
