pandas to_sql, only update new rows to mysql db (primary key & duplicate)





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







0















I am having problems updating pd to mysql db



for stock_code in cleaned:

url = 'https://www.hkex.com.hk/chi/sorc/options/statistics_hv_iv_c.aspx?action=csv&type=3&ucode={0}'.format(stock_code)
df = pd.read_csv(url,index_col='交易日', header=0, skiprows=2)
df.index.names = ['Trade Date']
df.index = pd.to_datetime(df.index, dayfirst=True)
df.insert(loc=0, column ='Stock Code', value=stock_code)
df.columns = ['Stock Code', 'Implied IV (%)','HV10 (%)','HV30 (%)','HV60 (%)','HV90 (%)']
df.to_sql(con=database_connection, name='table', if_exists='append')

database_connection.close()


The url will provide data for nearest 3 months : ie. 2018-08-25 to 2018-11-25,
after today the url will give data 2018-08-26 to 2018-11-26, what I want is to save all the data in db without duplication.



I set 'Trade Date' & 'Stock Code' as primary key, but it yields error :
(mysql.connector.errors.IntegrityError) 1062 (23000): Duplicate entry '2018-11-23-00001' for key 'PRIMARY' [SQL



How can I skip the duplicate entry and only update new rows? Thanks a lot!










share|improve this question























  • set 'Stock Code' + 'Trade Date' as primary key (in order)

    – RomanPerekhrest
    Nov 25 '18 at 7:05













  • Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

    – Anony Yam
    Nov 25 '18 at 10:02


















0















I am having problems updating pd to mysql db



for stock_code in cleaned:

url = 'https://www.hkex.com.hk/chi/sorc/options/statistics_hv_iv_c.aspx?action=csv&type=3&ucode={0}'.format(stock_code)
df = pd.read_csv(url,index_col='交易日', header=0, skiprows=2)
df.index.names = ['Trade Date']
df.index = pd.to_datetime(df.index, dayfirst=True)
df.insert(loc=0, column ='Stock Code', value=stock_code)
df.columns = ['Stock Code', 'Implied IV (%)','HV10 (%)','HV30 (%)','HV60 (%)','HV90 (%)']
df.to_sql(con=database_connection, name='table', if_exists='append')

database_connection.close()


The url will provide data for nearest 3 months : ie. 2018-08-25 to 2018-11-25,
after today the url will give data 2018-08-26 to 2018-11-26, what I want is to save all the data in db without duplication.



I set 'Trade Date' & 'Stock Code' as primary key, but it yields error :
(mysql.connector.errors.IntegrityError) 1062 (23000): Duplicate entry '2018-11-23-00001' for key 'PRIMARY' [SQL



How can I skip the duplicate entry and only update new rows? Thanks a lot!










share|improve this question























  • set 'Stock Code' + 'Trade Date' as primary key (in order)

    – RomanPerekhrest
    Nov 25 '18 at 7:05













  • Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

    – Anony Yam
    Nov 25 '18 at 10:02














0












0








0








I am having problems updating pd to mysql db



for stock_code in cleaned:

url = 'https://www.hkex.com.hk/chi/sorc/options/statistics_hv_iv_c.aspx?action=csv&type=3&ucode={0}'.format(stock_code)
df = pd.read_csv(url,index_col='交易日', header=0, skiprows=2)
df.index.names = ['Trade Date']
df.index = pd.to_datetime(df.index, dayfirst=True)
df.insert(loc=0, column ='Stock Code', value=stock_code)
df.columns = ['Stock Code', 'Implied IV (%)','HV10 (%)','HV30 (%)','HV60 (%)','HV90 (%)']
df.to_sql(con=database_connection, name='table', if_exists='append')

database_connection.close()


The url will provide data for nearest 3 months : ie. 2018-08-25 to 2018-11-25,
after today the url will give data 2018-08-26 to 2018-11-26, what I want is to save all the data in db without duplication.



I set 'Trade Date' & 'Stock Code' as primary key, but it yields error :
(mysql.connector.errors.IntegrityError) 1062 (23000): Duplicate entry '2018-11-23-00001' for key 'PRIMARY' [SQL



How can I skip the duplicate entry and only update new rows? Thanks a lot!










share|improve this question














I am having problems updating pd to mysql db



for stock_code in cleaned:

url = 'https://www.hkex.com.hk/chi/sorc/options/statistics_hv_iv_c.aspx?action=csv&type=3&ucode={0}'.format(stock_code)
df = pd.read_csv(url,index_col='交易日', header=0, skiprows=2)
df.index.names = ['Trade Date']
df.index = pd.to_datetime(df.index, dayfirst=True)
df.insert(loc=0, column ='Stock Code', value=stock_code)
df.columns = ['Stock Code', 'Implied IV (%)','HV10 (%)','HV30 (%)','HV60 (%)','HV90 (%)']
df.to_sql(con=database_connection, name='table', if_exists='append')

database_connection.close()


The url will provide data for nearest 3 months : ie. 2018-08-25 to 2018-11-25,
after today the url will give data 2018-08-26 to 2018-11-26, what I want is to save all the data in db without duplication.



I set 'Trade Date' & 'Stock Code' as primary key, but it yields error :
(mysql.connector.errors.IntegrityError) 1062 (23000): Duplicate entry '2018-11-23-00001' for key 'PRIMARY' [SQL



How can I skip the duplicate entry and only update new rows? Thanks a lot!







python mysql pandas






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 25 '18 at 7:01









Anony YamAnony Yam

103




103













  • set 'Stock Code' + 'Trade Date' as primary key (in order)

    – RomanPerekhrest
    Nov 25 '18 at 7:05













  • Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

    – Anony Yam
    Nov 25 '18 at 10:02



















  • set 'Stock Code' + 'Trade Date' as primary key (in order)

    – RomanPerekhrest
    Nov 25 '18 at 7:05













  • Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

    – Anony Yam
    Nov 25 '18 at 10:02

















set 'Stock Code' + 'Trade Date' as primary key (in order)

– RomanPerekhrest
Nov 25 '18 at 7:05







set 'Stock Code' + 'Trade Date' as primary key (in order)

– RomanPerekhrest
Nov 25 '18 at 7:05















Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

– Anony Yam
Nov 25 '18 at 10:02





Sorry, can u elaborate a bit? Trade Date is index & Stock Code is a column

– Anony Yam
Nov 25 '18 at 10:02












1 Answer
1






active

oldest

votes


















0














As far as I know it does not have a solution for bulk insert (to_sql). You can try this:



for i in range(len(df)):
try:
df.iloc[i:i+1].to_sql(name='table', if_exists='append', con=database_connection)
except IntegrityError:
pass





share|improve this answer
























  • Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

    – Anony Yam
    Dec 2 '18 at 3:44














Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465370%2fpandas-to-sql-only-update-new-rows-to-mysql-db-primary-key-duplicate%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














As far as I know it does not have a solution for bulk insert (to_sql). You can try this:



for i in range(len(df)):
try:
df.iloc[i:i+1].to_sql(name='table', if_exists='append', con=database_connection)
except IntegrityError:
pass





share|improve this answer
























  • Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

    – Anony Yam
    Dec 2 '18 at 3:44


















0














As far as I know it does not have a solution for bulk insert (to_sql). You can try this:



for i in range(len(df)):
try:
df.iloc[i:i+1].to_sql(name='table', if_exists='append', con=database_connection)
except IntegrityError:
pass





share|improve this answer
























  • Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

    – Anony Yam
    Dec 2 '18 at 3:44
















0












0








0







As far as I know it does not have a solution for bulk insert (to_sql). You can try this:



for i in range(len(df)):
try:
df.iloc[i:i+1].to_sql(name='table', if_exists='append', con=database_connection)
except IntegrityError:
pass





share|improve this answer













As far as I know it does not have a solution for bulk insert (to_sql). You can try this:



for i in range(len(df)):
try:
df.iloc[i:i+1].to_sql(name='table', if_exists='append', con=database_connection)
except IntegrityError:
pass






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 25 '18 at 12:17









E. ZeytinciE. Zeytinci

10317




10317













  • Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

    – Anony Yam
    Dec 2 '18 at 3:44





















  • Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

    – Anony Yam
    Dec 2 '18 at 3:44



















Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

– Anony Yam
Dec 2 '18 at 3:44







Thanks for the suggestion, but it created another error : IntegrityError: (mysql.connector.errors.IntegrityError) During handling of the above exception, another exception occurred: NameError

– Anony Yam
Dec 2 '18 at 3:44






















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465370%2fpandas-to-sql-only-update-new-rows-to-mysql-db-primary-key-duplicate%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







這個網誌中的熱門文章

Xamarin.form Move up view when keyboard appear

Post-Redirect-Get with Spring WebFlux and Thymeleaf

Anylogic : not able to use stopDelay()