New dataframe from grouping together two columns












I have a dataset that looks like the following.



Region_Name  Date    Average
London       1990Q1  105
London       1990Q1  118
...          ...     ...
London       2018Q1  157


I converted the dates into quarters and want to create a new dataframe in which rows with the same region name and quarter are grouped together, with the mean of Average for each group.
What is the best way to accomplish this?



I have been looking at the groupby function but keep getting a traceback.
For example:



new_df = df.groupby(['Resion_Name','Date']).mean()
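
For context, the quarter conversion mentioned above can be sketched as follows. The question does not show the raw dates, so the ISO-style date strings and the name raw here are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical raw data: these date strings are assumed, not taken from the question
raw = pd.DataFrame({
    'Region_Name': ['London', 'London', 'London'],
    'Date': ['1990-01-15', '1990-02-20', '2018-03-01'],
    'Average': [105, 118, 157],
})

# Parse the strings as datetimes, then convert them to quarterly periods
quarters = pd.to_datetime(raw['Date']).dt.to_period('Q')

# Store the quarter labels as plain strings such as '1990Q1'
raw['Date'] = quarters.astype(str)
print(raw)
```

Keeping the quarters as strings matches the sample data in the question; leaving them as periods would also work as a grouping key.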









  • Your groupby contains a typo. Try new_df = df.groupby(['Region_Name', 'Date']).mean()

    – Peter Leimbigler
    Nov 21 '18 at 13:04
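
A minimal sketch of how the typo shows up, assuming a small frame shaped like the one in the question (the three rows are illustrative). Checking the group keys against df.columns before grouping makes this kind of misspelling easy to spot; the helper names keys, missing and raised are made up for the example.

```python
import pandas as pd

# Small illustrative frame with the same columns as in the question
df = pd.DataFrame({
    'Region_Name': ['London', 'London', 'London'],
    'Date': ['1990Q1', '1990Q1', '2018Q1'],
    'Average': [105, 118, 157],
})

keys = ['Resion_Name', 'Date']  # misspelled on purpose

# List any requested group keys that are not columns of df
missing = [k for k in keys if k not in df.columns]
print(missing)

# Grouping by a column that does not exist raises KeyError
try:
    df.groupby(keys).mean()
    raised = False
except KeyError as exc:
    raised = True
    print('KeyError:', exc)
```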











  • Also, note that the default behaviour of groupby is to insert the grouped-by columns into the index of the resulting DataFrame. If you group by multiple columns, you get a MultiIndex. To keep the grouped-by columns as normal columns, use groupby(as_index=False), or if the result is df, run df = df.reset_index().

    – Peter Leimbigler
    Nov 21 '18 at 13:07
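
The index behaviour described in this comment can be sketched like this, again assuming a small illustrative frame; the names indexed, flat and flat_again are hypothetical.

```python
import pandas as pd

# Small illustrative frame with the same columns as in the question
df = pd.DataFrame({
    'Region_Name': ['London', 'London', 'London'],
    'Date': ['1990Q1', '1990Q1', '2018Q1'],
    'Average': [105, 118, 157],
})

# Default: the group keys become a MultiIndex on the result
indexed = df.groupby(['Region_Name', 'Date']).mean()

# as_index=False keeps the group keys as ordinary columns
flat = df.groupby(['Region_Name', 'Date'], as_index=False).mean()

# Calling reset_index() on the default result gives the same layout
flat_again = indexed.reset_index()

print(indexed)
print(flat)
```

Either of the last two forms yields a plain dataframe with Region_Name, Date and Average columns, which is probably what the question is after.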
















python pandas pandas-groupby sklearn-pandas






edited Nov 21 '18 at 13:03 by Peter Leimbigler
asked Nov 21 '18 at 13:03 by sara keyes








1 Answer

import pandas as pd

# Sample data with repeated (Region_Name, Date) pairs
dict3 = {'Region_Name': ['London', 'Newyork', 'London', 'Newyork', 'London', 'London', 'Newyork', 'Newyork', 'Newyork', 'Newyork', 'London'],
         'Date': ['1990Q1', '1990Q1', '1990Q2', '1990Q2', '1991Q1', '1991Q1', '1991Q2', '1992Q2', '1993Q1', '1993Q1', '1994Q1'],
         'Average': [34, 56, 45, 67, 23, 89, 12, 45, 67, 34, 67]}

df3 = pd.DataFrame(dict3)


df3 now looks as follows:

    Region_Name    Date  Average
 0       London  1990Q1       34
 1      Newyork  1990Q1       56
 2       London  1990Q2       45
 3      Newyork  1990Q2       67
 4       London  1991Q1       23
 5       London  1991Q1       89
 6      Newyork  1991Q2       12
 7      Newyork  1992Q2       45
 8      Newyork  1993Q1       67
 9      Newyork  1993Q1       34
10       London  1994Q1       67


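The code looks as follows:

# Group the rows by region and quarter
new_df = df3.groupby(['Region_Name', 'Date'])

# transform('mean') returns the group mean for every original row
new1 = new_df['Average'].transform('mean')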


Result (new1 is a Series holding the group mean for each row of df3):

print(new1)

 0    34.0
 1    56.0
 2    45.0
 3    67.0
 4    56.0
 5    56.0
 6    12.0
 7    45.0
 8    50.5
 9    50.5
10    67.0
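
Note that transform keeps one value per original row. If the goal is a new dataframe with one row per region and quarter, an aggregating mean may be closer to what was asked. The following is a sketch on a shortened copy of the sample data; the names summary and per_row are made up for the comparison.

```python
import pandas as pd

# Shortened copy of the sample data from this answer
df3 = pd.DataFrame({
    'Region_Name': ['London', 'Newyork', 'London', 'Newyork', 'London', 'London'],
    'Date': ['1990Q1', '1990Q1', '1990Q2', '1990Q2', '1991Q1', '1991Q1'],
    'Average': [34, 56, 45, 67, 23, 89],
})

# Aggregation: one row per (Region_Name, Date) pair, keys kept as columns
summary = df3.groupby(['Region_Name', 'Date'], as_index=False)['Average'].mean()

# transform: same length as df3, each row carries its group's mean
per_row = df3.groupby(['Region_Name', 'Date'])['Average'].transform('mean')

print(summary)
print(per_row)
```

Here summary has five rows, because the two London 1991Q1 rows collapse into one, while per_row still has six.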





edited Nov 21 '18 at 15:24 by pygo
answered Nov 21 '18 at 13:50 by Anuprita































