Is it possible to format a string in a pandas DataFrame?
I am new to pandas and would like to know whether the following is possible.
Here is a sample of my data frame. Both columns hold strings.
id  class
A1  X1,41
A1  X1,42
A1  X1,43
A2  X1,41
A2  X1,45
I merged the rows using groupby, and the result looks like this:
df = df.groupby(['id']).sum()
id  class
A1  X1,41X1,42X1,43
A2  X1,41X1,45
But I would like the result to look like this (ideally with the data stored as lists):
id  class
A1  [X1,41], [X1,42], [X1,43]
A2  [X1,41], [X1,45]
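For anyone who wants to reproduce the behaviour described above, here is a minimal sketch. It assumes the frame is built by hand from the sample rows, and it forces `dtype=object` (my assumption, not something stated in the question) so the values are plain Python strings. The name `summed` is only illustrative.

```python
import pandas as pd

# Sample rows from the question. dtype=object is an assumption that keeps
# the values as plain Python strings.
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    },
    dtype=object,
)

# "Summing" strings concatenates them, which is why the values run together.
summed = df.groupby("id").sum()
print(summed)
```

The concatenation is simply what `+` does for strings, so `sum()` is not a suitable aggregation when the individual values need to stay separate.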
python pandas dataframe
asked Nov 7 at 15:24 by Sujin
edited Nov 7 at 16:16 by Seanny123
It's possible, but not recommended. Pandas is not designed to hold lists in a Series. If you're concerned about memory or performance, looking at Categorical Data is a much better idea.
– jpp, Nov 7 at 15:27
The things you have in the original class column are strings, not lists! Judging from your desired output, you have lists of lists where each nested list holds only a single string. That makes very little sense to me. You need to do a better job of explaining what is inside your dataframe and what you want there when you are done.
– piRSquared, Nov 7 at 15:30
Please edit your post to show how you used groupby. Maybe by modifying that code a bit, someone can answer your question.
– Edgar R. Mondragón, Nov 7 at 15:30
Thank you for all the comments. I have edited the post.
– Sujin, Nov 7 at 15:34
2 Answers
I think you are looking for this:
df.groupby('id').apply(lambda x: [[_x] for _x in x['class']])
This groups the rows by the 'id' column and applies the given function to each group. Here the function builds a list of single-element lists from the 'class' values in that group. The name _x could be anything; I chose it to signal that it is a temporary placeholder for each value in turn.
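To make the role of the placeholder more concrete, here is a rough sketch of the same idea, assuming the sample data from the question. It is a variant rather than the exact line above: it selects the 'class' column before calling `apply`, and it also shows `agg(list)`, which gives one flat list per id instead of nested lists. The names `nested`, `flat`, `col` and `value` are only illustrative.

```python
import pandas as pd

# Sample rows from the question (dtype=object is an assumption).
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    },
    dtype=object,
)

# `value` plays the role of `_x`: it is bound to each string in the group
# in turn and wrapped in its own one-element list.
nested = df.groupby("id")["class"].apply(lambda col: [[value] for value in col])

# Alternative: one flat list of strings per id.
flat = df.groupby("id")["class"].agg(list)

print(nested)
print(flat)
```

Whether the nested or the flat form is preferable depends on what is done with the lists afterwards; the flat form is usually easier to work with.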
answered Nov 7 at 15:34, edited Nov 7 at 15:39, by Ethan Koch
Could you please explain what _x means?
– Sujin, Nov 7 at 15:37
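Expanding on @Ethan Koch's answer:
s = df.groupby('id').apply(lambda x: [[_x] for _x in x['class']])
returns a Series, not a DataFrame. To convert it back to a DataFrame, build a new frame from the index and values of that result (stored here as s), not from the original df:
df2 = pd.DataFrame({'id': s.index, 'class': s.values})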
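As a self-contained sketch of that conversion, assuming the sample data from the question: the grouped result is kept in a variable called `s` (an illustrative name) and turned back into a two-column frame. `Series.reset_index(name=...)` is shown as an alternative that should give the same frame in one call.

```python
import pandas as pd

# Sample rows from the question (dtype=object is an assumption).
df = pd.DataFrame(
    {
        "id": ["A1", "A1", "A1", "A2", "A2"],
        "class": ["X1,41", "X1,42", "X1,43", "X1,41", "X1,45"],
    },
    dtype=object,
)

# The grouped result is a Series indexed by id.
s = df.groupby("id")["class"].apply(lambda col: [[value] for value in col])

# Option 1: rebuild a frame from the Series' index and values.
df2 = pd.DataFrame({"id": s.index, "class": s.values})

# Option 2: move the index back into a column and name the value column.
df3 = s.reset_index(name="class")

print(df2)
print(df3)
```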
answered Nov 7 at 16:09 by Ricky Kim