Is it possible to format strings in a pandas DataFrame?























I am new to pandas and would like to ask whether the following is possible.



Here is a sample of my data frame. Both columns hold strings.



id class
A1 X1,41
A1 X1,42
A1 X1,43
A2 X1,41
A2 X1,45


I merged the rows with groupby, using df = df.groupby(['id']).sum(), and the result looks like this:



id class
A1 X1,41X1,42X1,43
A2 X1,41X1,45


But I would like the result to look like this (ideally with the values stored as lists):



id class
A1 [X1,41], [X1,42], [X1,43]
A2 [X1,41], [X1,45]
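
For reference, the concatenated output above can be reproduced with a small sketch like the one below. The frame is rebuilt by hand from the sample rows, and object dtype is forced as an assumption so that the strings behave as plain Python strings; summing strings simply joins them, which is why the boundaries between values disappear.

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand.
# dtype=object is an assumption so the values are plain Python strings.
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    },
    dtype=object,
)

# Summing strings within each group concatenates them with no separator.
summed = df.groupby("id")["class"].sum()
print(summed)
```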

































  • It's possible, but not recommended. Pandas is not designed to hold lists in series. If you're concerned about memory / performance, looking at Categorical Data is a much better idea.
    – jpp
    Nov 7 at 15:27












  • The things you have in the original class column are strings. Not lists! It looks like from your desired output that you have lists of lists where each nested list only has a single string in it. That makes very little sense to me. You need to do a better job of explaining what it is inside your dataframe and what it is you want there when you are done.
    – piRSquared
    Nov 7 at 15:30










  • Please edit your post to show how you used groupby. Maybe someone can answer your question by modifying that code a bit.
    – Edgar R. Mondragón
    Nov 7 at 15:30












  • Thank you for all the comments. I have edited the post.
    – Sujin
    Nov 7 at 15:34


































python pandas dataframe














edited Nov 7 at 16:16 by Seanny123

asked Nov 7 at 15:24 by Sujin
























2 Answers






























I think you are looking for this:



df.groupby('id').apply(lambda x: [[_x] for _x in x['class']])


This groups the rows by the 'id' column and applies the given function to each group. The function here builds a list of single-element lists from the values in that group's 'class' column. The name _x is arbitrary; I chose it only to signal a temporary placeholder for each value.
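
A minimal runnable sketch of the same idea, assuming the sample frame from the question is rebuilt by hand. The variable names nested and flat are only illustrative. Selecting the 'class' column before calling apply keeps the function focused on the values, and agg(list) is shown as a simpler alternative when one flat list per id is enough.

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand.
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    }
)

# Wrap every value of each group in its own single-element list.
nested = df.groupby("id")["class"].apply(lambda col: [[value] for value in col])

# Alternative: collect the values of each group into one flat list.
flat = df.groupby("id")["class"].agg(list)

print(nested)
print(flat)
```

Both results are Series indexed by id, with one Python list per row.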





























  • Could you please explain what _x means?
    – Sujin
    Nov 7 at 15:37































Expanding on @Ethan Koch's answer:



df.groupby('id').apply(lambda x: [[_x] for _x in x['class']])


returns a Series, not a DataFrame. To convert it back to a DataFrame (here df is assumed to hold the Series returned above, not the original frame):



df2 = pd.DataFrame({'id': df.index, 'class': df.values})
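
As a sketch of an alternative, assuming the grouped result is kept in a separate variable (called s here only for illustration), reset_index() can turn the Series back into a two-column DataFrame without rebuilding it by hand:

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand.
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    }
)

# Grouping one column and applying a list-building function gives a Series
# whose index is the id and whose name is 'class'.
s = df.groupby("id")["class"].apply(lambda col: [[value] for value in col])

# reset_index() moves the id index back into an ordinary column.
df2 = s.reset_index()
print(df2)
```

Keeping the Series in its own variable also avoids overwriting the original df.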


































answered Nov 7 at 15:34 by Ethan Koch (edited Nov 7 at 15:39)






















answered Nov 7 at 16:09 by Ricky Kim






























             
