How to change type of column in pandas without column name?
I have a problem with a data type.
Assume this is my sample data frame:
class1  class2  docid
A123    08/9    X123
A123    08/1    X123
A124    08/1    X124
A124    08/2    X124
A125    08/3    X125
I merged class1 and class2 into a single column named class3:
class3     docid
A123,08/9  X123
A123,08/1  X123
A124,08/1  X124
A124,08/2  X124
A125,08/3  X125
Then I built an indicator matrix with get_dummies:
df1 = pd.get_dummies(df.docid).sort_index(level=0).max(level=[0,1])
df1
which gives a result like this:
class3     X123  X124  X125
A123,08/9     1     0     0
A123,08/1     1     0     0
A124,08/1     0     1     0
A124,08/2     0     1     0
A125,08/3     0     0     1
I then dropped class3 and transposed the matrix so that I can calculate the Jaccard similarity by docid:
df1_new = df1.drop(['class3'], axis=1)
df1_new_1 = df1_new.transpose()
df1_new_1
and the result looks like this:
      0  1  2  3  4
X123  1  1  0  0  0
X124  0  0  1  1  0
X125  0  0  0  0  1
In this result, the labels X123, X124, X125 sit in the index, which has no column name. How can I change X123, X124, X125 into 0, 1, 2, or otherwise convert those labels from string to int? When I use this result to calculate the Jaccard similarity, I get:
ValueError: invalid literal for int() with base 10: 'X123'
Thank you in advance.
python pandas numpy
asked Nov 8 at 3:06
Sujin
1 Answer
accepted
If you only need to replace the string index with integer positions (0, 1, 2), you can use
df1_new_1 = df1_new_1.reset_index(drop=True)
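As a minimal sketch of this first option: the frame below is a hypothetical stand-in for the transposed matrix from the question, rebuilt by hand so the snippet runs on its own. Note that reset_index returns a new frame rather than modifying the original, so the result has to be assigned.

```python
import pandas as pd

# Hypothetical stand-in for the transposed frame df1_new_1 from the question.
df1_new_1 = pd.DataFrame(
    [[1, 1, 0, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1]],
    index=["X123", "X124", "X125"],
)

# reset_index returns a new frame; drop=True discards the old string labels
# instead of moving them into a regular column.
df_positional = df1_new_1.reset_index(drop=True)

print(df_positional.index.tolist())  # [0, 1, 2]
print(df1_new_1.index.tolist())      # original is unchanged
```

Without drop=True, the old labels would be kept as a string column named "index", which would likely bring back the same conversion error.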
If you want to keep the numeric part of each label as the new index, you can use
df1_new_1.index.str.extract(r'(\d+)', expand=False).astype(int)
Out:
Int64Index([123, 124, 125], dtype='int64')
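A small sketch of the second option, again on a hand-built stand-in for df1_new_1 (an assumption, not the asker's actual data). It assumes every label contains exactly one run of digits. The pattern needs a backslash and a capture group, and expand=False is what makes extract return an Index rather than a DataFrame. The exact repr of the resulting index differs between pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for the transposed frame df1_new_1 from the question.
df1_new_1 = pd.DataFrame(
    [[1, 1, 0, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1]],
    index=["X123", "X124", "X125"],
)

# Pull the digits out of each label and convert them to integers.
numeric_labels = df1_new_1.index.str.extract(r"(\d+)", expand=False).astype(int)

# Work on a copy so the original string-labelled frame stays available.
df_numeric = df1_new_1.copy()
df_numeric.index = numeric_labels

print(df_numeric.index.tolist())  # [123, 124, 125]
```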
thank you for your comment! i forgot to think about that T T
– Sujin
Nov 8 at 3:34
answered Nov 8 at 3:29
Naga Kiran