Pandas DataFrame: turn a column holding lists of JSON objects into informative rows, per “id”
Consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                   'json_col': [[{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}],
                                [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}],
                                [{'aa': 3, 'ac': 2}]]})
df
Out[134]:
   id                                           json_col
0   1  [{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}]
1   2           [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}]
2   3                               [{'aa': 3, 'ac': 2}]
Each id has a list of JSON objects (dicts). For each id, and for each dict in its list, I'd like one row in a new DataFrame, so that the result looks like this:
   id  aa   ab   ac
0   1   1  1.0  NaN
1   1   3  2.0  6.0
2   2   1  2.0  1.0
3   2   5  NaN  NaN
4   3   3  NaN  2.0
Here, id 1 had 2 dicts in its list and therefore gets 2 rows in the new DataFrame.
Is there a Pythonic way to do this using pandas, NumPy or json functionality?
Adding the run times of the solutions
setup = """
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                   'json_col': [[{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}],
                                [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}],
                                [{'aa': 3, 'ac': 2}]]})
"""
s1 = """
df = pd.concat(
    [pd.DataFrame(j, index=[i]*len(j)) for i, j in enumerate(df['json_col'], 1)],
    sort=False
)
"""
s2 = """
recs = df.apply(lambda x: [{**{'id': x.id}, **d} for d in x.json_col], axis=1).sum()
df2 = pd.DataFrame.from_records(recs)
"""
%timeit(s1, setup)
52.3 ns ± 2.6 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit(s2, setup)
50.6 ns ± 3.28 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
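A note on the measurement: readings of about 50 ns suggest that %timeit(s1, setup) timed only the evaluation of the tuple (s1, setup), not the code inside the strings. A minimal sketch of how the two snippets could be timed with the standard timeit module instead is shown below. It assumes the statements bind their result to a new name (out, chosen here for illustration), because s1 as written overwrites df and would fail on the second repetition; the repetition count n is also an arbitrary choice.

```python
import timeit

setup = """
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                   'json_col': [[{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}],
                                [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}],
                                [{'aa': 3, 'ac': 2}]]})
"""

# Same statements as s1 and s2, but the result is bound to a new name (out)
# so that df still has its json_col column on the next repetition.
s1 = """
out = pd.concat(
    [pd.DataFrame(j, index=[i]*len(j)) for i, j in enumerate(df['json_col'], 1)],
    sort=False
)
"""
s2 = """
recs = df.apply(lambda x: [{**{'id': x.id}, **d} for d in x.json_col], axis=1).sum()
out = pd.DataFrame.from_records(recs)
"""

n = 50  # illustrative repetition count, not taken from the question

# timeit.timeit runs the setup once, then the statement n times,
# and returns the total elapsed time in seconds.
t1 = timeit.timeit(s1, setup=setup, number=n)
t2 = timeit.timeit(s2, setup=setup, number=n)

print(f"s1: {t1 / n * 1e3:.3f} ms per loop")
print(f"s2: {t2 / n * 1e3:.3f} ms per loop")
```

The absolute numbers depend on the machine and the pandas version, so none are quoted here.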
python json pandas numpy
edited Nov 25 '18 at 10:32
asked Nov 25 '18 at 9:09 by Eran Moshe
2 Answers
Here is one quick way: convert each of the json_col lists of dictionaries to a DataFrame and concatenate them, with a small tweak to create the id column:
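In [51]: df = pd.concat(
    ...:     # one frame per row; i numbers the rows 1..n, matching the ids in this example
    ...:     [pd.DataFrame(j, index=[i]*len(j)) for i, j in enumerate(df['json_col'], 1)],
    ...:     sort=False
    ...: )
In [52]: df.index.name = 'id'
In [53]: df.reset_index()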
Out[53]:
   id  aa   ab   ac
0   1   1  1.0  NaN
1   1   3  2.0  6.0
2   2   1  2.0  1.0
3   2   5  NaN  NaN
4   3   3  NaN  2.0
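The enumerate(..., 1) call above works here because the ids happen to be 1, 2, 3. A minimal sketch of a variant, assuming the ids should instead be read from the id column itself, could look like this (the names frames and out are illustrative and not part of the answer):

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'json_col': [[{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}],
                                [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}],
                                [{'aa': 3, 'ac': 2}]]})

# Build one small frame per row, indexed by that row's actual id value
# (repeated once per dict in the list).
frames = [pd.DataFrame(records, index=[row_id] * len(records))
          for row_id, records in zip(df['id'], df['json_col'])]

# Stack the frames, name the index 'id', and turn it into a regular column.
out = pd.concat(frames, sort=False).rename_axis('id').reset_index()
print(out)
```

Pairing the two columns with zip keeps the result correct when the ids are not consecutive or do not start at 1.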
edited Nov 25 '18 at 10:08
answered Nov 25 '18 at 9:29 by Kasrâmvd
Ali's solution works faster, but this one is more Pythonic and easier to understand, so I'll accept this one
– Eran Moshe
Nov 25 '18 at 10:12
@EranMoshe It's very likely that the other answer works a little bit slower though.
– Kasrâmvd
Nov 25 '18 at 10:16
I thought so too, but timeit proved me wrong. I might have misused it. Do you want to give it a go?
– Eran Moshe
Nov 25 '18 at 10:17
@EranMoshe Oh, that's interesting! Can you please update your question with the benchmarks? Thanks.
– Kasrâmvd
Nov 25 '18 at 10:20
After checking it again, they run roughly the same. I'll edit it though.
– Eran Moshe
Nov 25 '18 at 10:30
A short way to accomplish this is the following. I don't personally consider it very Pythonic, as the code is a little hard to read, and it is not terribly performant, but for small data wrangling it should do the trick:
recs = df.apply(lambda x: [{**{'id': x.id}, **d} for d in x.json_col], axis=1).sum()
df2 = pd.DataFrame.from_records(recs)
# outputs:
   aa   ab   ac  id
0   1  1.0  NaN   1
1   3  2.0  6.0   1
2   1  2.0  1.0   2
3   5  NaN  NaN   2
4   3  NaN  2.0   3
How it works:
The applied lambda creates a new dictionary for each dictionary in the list x.json_col (where x is a row) by merging {'id': x.id} into it.
The resulting lists are then summed. Since summing a list of lists concatenates them into one big list of elements, recs has the following form:
[{'id': 1, 'aa': 1, 'ab': 1},
{'id': 1, 'aa': 3, 'ab': 2, 'ac': 6},
{'id': 2, 'aa': 1, 'ab': 2, 'ac': 1},
{'id': 2, 'aa': 5},
{'id': 3, 'aa': 3, 'ac': 2}]
A new data frame is then simply constructed from the records.
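The same records can also be built without apply and sum. The sketch below is one possible variant, not the code from this answer: it assumes a nested list comprehension over the two columns is acceptable, and the loop names row_id, dicts and d are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'json_col': [[{'aa': 1, 'ab': 1}, {'aa': 3, 'ab': 2, 'ac': 6}],
                                [{'aa': 1, 'ab': 2, 'ac': 1}, {'aa': 5}],
                                [{'aa': 3, 'ac': 2}]]})

# One merged dict per (id, dict) pair. The nested comprehension flattens the
# per-row lists in a single pass instead of summing lists together.
recs = [{'id': row_id, **d}
        for row_id, dicts in zip(df['id'], df['json_col'])
        for d in dicts]

df2 = pd.DataFrame.from_records(recs)
print(df2)
```

Summing lists copies the accumulated list at every step, so a single comprehension is likely to scale better on long columns; for a frame this small the difference should be negligible.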
edited Nov 25 '18 at 9:57
answered Nov 25 '18 at 9:27 by Haleemur Ali