Mapping a new column to a DataFrame by rows from another DataFrame
I have a pandas DataFrame stations, indexed by id:
    id  station   lat     lng
    1   Boston    45.343  -45.333
    2   New York  56.444  -35.690
I have another DataFrame df1 that has the following:
    duration  date      station  gender
    NaN       20181118  NaN      M
    9         20181009  2.0      F
    8         20170605  1.0      F
I want to add the station name, lat and lng from stations to df1, matching df1's station value against the id index, so that it looks like the following DataFrame:
    duration  date      station   gender  lat     lng
    NaN       20181118  NaN       M       NaN     NaN
    9         20181009  New York  F       56.444  -35.690
    8         20170605  Boston    F       45.343  -45.333
I tried doing this iteratively, looking up each station in stations one row at a time as shown in the following example, but I have about 2 million rows and it took a long time.
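    import numpy as np

    stat_list = []
    lng_list = []
    lat_list = []
    # Iterate over the station ids; iterating over df1 itself yields column names.
    for stat in df1["station"]:
        if not np.isnan(stat):
            # Look up by id label; .iloc is positional and would be off by one here.
            ref = stations.loc[int(stat)]
            stat_list.append(ref.station)
            lng_list.append(ref.lng)
            lat_list.append(ref.lat)
        else:
            # No station recorded for this row, so keep all three values missing.
            stat_list.append(np.nan)
            lng_list.append(np.nan)
            lat_list.append(np.nan)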
Is there a faster way to do this?
python pandas performance numpy dataframe
asked Nov 25 '18 at 9:02
Swopnil Shrestha
1 Answer
This looks best solved with a merge, which should be significantly faster than a row-by-row Python loop:
    df1.merge(stations, left_on="station", right_index=True, how="left")
Because both frames have a station column, this will leave you with two columns, station_x (the ids from df1) and station_y (the names from stations). If you only want the station column with the string names, you can do:
    df_merged = df1.merge(stations, left_on="station", right_index=True, how="left", suffixes=("_x", ""))
    # drop() keeps the remaining columns in their original order
    df_final = df_merged.drop(columns="station_x")
(or just rename one of them before you merge)
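The rename-before-merge variant mentioned above could look roughly like the sketch below. The sample frames are rebuilt from the values in the question; how they are constructed, the column name station_id, and the final column reordering are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (construction is assumed).
stations = pd.DataFrame(
    {
        "station": ["Boston", "New York"],
        "lat": [45.343, 56.444],
        "lng": [-45.333, -35.690],
    },
    index=pd.Index([1, 2], name="id"),
)
df1 = pd.DataFrame(
    {
        "duration": [np.nan, 9, 8],
        "date": [20181118, 20181009, 20170605],
        "station": [np.nan, 2.0, 1.0],
        "gender": ["M", "F", "F"],
    }
)

# Rename the id column first so the merge has no overlapping column names
# and therefore adds no suffixes. "station_id" is a hypothetical name.
merged = df1.rename(columns={"station": "station_id"}).merge(
    stations, left_on="station_id", right_index=True, how="left"
)

# Drop the numeric id and put the columns in the order shown in the question.
df_final = merged.drop(columns="station_id")[
    ["duration", "date", "station", "gender", "lat", "lng"]
]
print(df_final)
```

With how="left", every row of df1 is kept, and rows whose station id is missing end up with NaN in station, lat and lng.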
edited Nov 25 '18 at 9:20
answered Nov 25 '18 at 9:11
Sven Harris