String mode aggregation with group by function

I have a DataFrame that looks like this:

Country  City
UK       London
USA      Washington
UK       London
UK       Manchester
USA      Washington
USA      Chicago

I want to group by Country and aggregate to the most frequent City within each country.

My desired output is:

Country  City
UK       London
USA      Washington

London and Washington each appear twice, whereas Manchester and Chicago appear only once.

I tried:

from scipy.stats import mode
df_summary = df.groupby('Country')['City'].apply(lambda x: mode(x)[0][0]).reset_index()

But it doesn't seem to work on strings.

Tags: python, pandas, aggregate, pandas-groupby, mode






asked Nov 22 '18 at 2:26 by Moses Soleman
edited Nov 22 '18 at 2:31 by jpp
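
To make the question reproducible, here is a minimal sketch that rebuilds the sample frame and counts the rows for each country/city pair. The construction is an assumption (the post shows only the printed table), and `counts` is a name introduced here for illustration.

```python
import pandas as pd

# Sample data as printed in the question (typed in by hand; an assumption).
df = pd.DataFrame({
    "Country": ["UK", "USA", "UK", "UK", "USA", "USA"],
    "City": ["London", "Washington", "London", "Manchester", "Washington", "Chicago"],
})

# Number of rows for each (Country, City) pair.
counts = df.groupby(["Country", "City"]).size()
print(counts)
```

The largest count within each country corresponds to the city expected in the desired output.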
























2 Answers

I can't replicate your error, but you can use pd.Series.mode, which accepts strings and returns a Series. Use iat[0] to extract the first value:

res = df.groupby('Country')['City'].apply(lambda x: x.mode().iat[0]).reset_index()

print(res)

  Country        City
0      UK      London
1     USA  Washington

answered Nov 22 '18 at 2:29 by jpp
edited Nov 22 '18 at 2:41
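
One detail worth knowing about this approach: when several values are tied, `Series.mode` returns all of them in sorted order, so `iat[0]` picks the alphabetically first one rather than the first one seen. A small sketch with made-up data (the `cities` series is hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical data with a tie: London and Manchester both appear twice.
cities = pd.Series(["Manchester", "London", "London", "Manchester", "Leeds"])

modes = cities.mode()   # all tied values, returned in sorted order
first = modes.iat[0]    # the first of the sorted modes

print(modes.tolist())
print(first)
```

If a different tie-breaking rule matters for your data, it has to be applied explicitly.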













• Does that mean I have to import pd.Series.mode?
  – Moses Soleman, Nov 22 '18 at 2:39

• @MosesSoleman, no. You are using pd.Series.mode whenever you call x.mode(), so you don't need to import anything extra.
  – jpp, Nov 22 '18 at 2:39

• Thanks, it works on this sample set, but let me try it on a bigger set.
  – Moses Soleman, Nov 22 '18 at 2:49

• When I try it on a bigger set with many unique values, I get this error: index 0 is out of bounds for axis 0 with size 0
  – Moses Soleman, Nov 22 '18 at 2:56

• @MosesSoleman, you can also try scipy, which works fine as well.
  – pygo, Nov 22 '18 at 3:36
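
The error reported in the comments is what `iat[0]` raises on an empty result. A likely cause (an assumption, since the larger dataset is not shown) is a group whose City values are all missing: `mode()` drops NaN by default and then returns an empty Series. A hedged sketch with a guard follows; `first_mode` and the extra `FR` row are hypothetical additions for illustration.

```python
import numpy as np
import pandas as pd


def first_mode(s: pd.Series):
    """Return the first mode of s, or NaN if s has no non-missing values."""
    m = s.mode()  # dropna=True by default, so an all-NaN group gives an empty Series
    return m.iat[0] if not m.empty else np.nan


# Sample data from the question plus one hypothetical country with no city.
df = pd.DataFrame({
    "Country": ["UK", "USA", "UK", "UK", "USA", "USA", "FR"],
    "City": ["London", "Washington", "London", "Manchester",
             "Washington", "Chicago", None],
})

res = df.groupby("Country")["City"].agg(first_mode).reset_index()
print(res)
```

Groups without any city come back as NaN instead of raising, and can be dropped or filled afterwards.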



















Try the following. Note that this returns the modes of the whole City column, without grouping:

>>> df.City.mode()
0        London
1    Washington
dtype: object

OR

import pandas as pd
from scipy import stats

You can use scipy.stats with a lambda:

>>> df.groupby('Country').agg({'City': lambda x: stats.mode(x)[0]})
               City
Country
UK           London
USA      Washington

# Append .reset_index() to turn Country back into a regular column:
# df.groupby('Country').agg({'City': lambda x: stats.mode(x)[0]}).reset_index()

stats.mode also returns the count, which is handy if you want more than just the first value:

>>> df.groupby('Country').agg({'City': lambda x: stats.mode(x)})
                        City
Country
UK           ([London], [2])
USA      ([Washington], [2])

answered Nov 22 '18 at 3:11 by pygo
edited Nov 22 '18 at 3:52
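
If the count is wanted alongside the city, a pandas-only alternative may be easier to work with than the tuple-like cells above. It also avoids relying on `stats.mode` accepting strings; as far as I know, newer SciPy releases restrict it to numeric data, so check that for your version. This sketch swaps in a size/sort/drop_duplicates approach, which is not what the answer uses, and `top` is an illustrative name.

```python
import pandas as pd

# Sample data as printed in the question (typed in by hand; an assumption).
df = pd.DataFrame({
    "Country": ["UK", "USA", "UK", "UK", "USA", "USA"],
    "City": ["London", "Washington", "London", "Manchester", "Washington", "Chicago"],
})

top = (
    df.groupby(["Country", "City"]).size()                   # rows per (Country, City)
      .reset_index(name="count")                             # back to flat columns
      .sort_values(["Country", "count"], ascending=[True, False])
      .drop_duplicates("Country")                            # keep the top row per country
      .reset_index(drop=True)
)
print(top)
```

Each row of `top` holds a country, its most frequent city and how often that city occurs.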





























