Kafka reset offsets to earliest



























I'm running Kafka (version 0.10.2) with Spring-data (version 1.5.1.RELEASE), Spring-kafka (version 1.1.1.RELEASE).



I have a topic that a single consumer group polls from. I noticed that sometimes, when one consumer restarts, the topic's lag instantly jumps to a much higher number. After some research I came to the conclusion that Kafka is resetting the offsets, but I can't understand why.



enable.auto.commit = true
auto.commit.interval.ms = 5000
auto.offset.reset = smallest
log.retention.hours=168


The lag is usually very low (below 500) and is consumed within a few ms, so it can't be an out-of-range offset (or can it?)



Does anyone have an idea?










  • What do you mean by restarting the offsets? Reading the same messages again?

    – bittu
    Nov 18 '18 at 18:09











  • By the way, log.retention.hours is a server (broker) config, not a client config; the rest are client settings.

    – cricket_007
    Nov 18 '18 at 18:46













  • Yes, the same messages are being read again...

    – Yuval
    Nov 19 '18 at 7:49
















apache-kafka spring-kafka






asked Nov 18 '18 at 17:44









Yuval













1 Answer
I don't think it's actually committing the offsets as frequently as you expect. Therefore, when a consumer restarts, the group rebalances and then picks up at the most recent auto-committed offset.



Commits happen only periodically (every 5 seconds, per your config), not on a per-message basis. Thus, you should expect to see at most about 5 seconds' worth of duplicated data, but not the beginning of the topic, unless offsets are not being committed at all (you should set up simple log4j logging in the clients to determine this).
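To make the "at most one commit interval of duplicates" point concrete, here is a toy model in plain Java. It is only a sketch under simplifying assumptions (a constant consumption rate, commits landing exactly on the interval boundary, reading from offset 0); the class and method names are made up for illustration and are not part of Kafka or Spring.

```java
// Toy model only: not Kafka code. Names and the constant-rate assumption are hypothetical.
final class AutoCommitReplayModel {

    /** Offset covered by the last periodic commit that happened at or before crashAtMs. */
    static long lastCommittedOffset(long ratePerSec, long commitIntervalMs, long crashAtMs) {
        // Integer division rounds down to the most recent commit boundary.
        long lastCommitTimeMs = (crashAtMs / commitIntervalMs) * commitIntervalMs;
        return lastCommitTimeMs * ratePerSec / 1000;
    }

    /** Messages read after the last commit; these are re-delivered after a restart. */
    static long replayedAfterRestart(long ratePerSec, long commitIntervalMs, long crashAtMs) {
        long position = crashAtMs * ratePerSec / 1000;
        return position - lastCommittedOffset(ratePerSec, commitIntervalMs, crashAtMs);
    }

    public static void main(String[] args) {
        // 100 msg/s, auto.commit.interval.ms = 5000, consumer dies 12.3 s after start.
        System.out.println(replayedAfterRestart(100, 5000, 12_300)); // prints 230
    }
}
```

In this model the replay is always smaller than rate × interval (500 messages here), which is nowhere near a week of data. Seeing week-old messages therefore points to the committed offsets being missing or invalid, not to a slow commit interval.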



If you want finer control, disable auto offset commits and call the commitSync or commitAsync methods of the Consumer object (these are methods of the core Java API; I'm not sure about Spring).
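The shape of a process-then-commit loop can be sketched without a broker. Everything below is a stand-in: `RecordSource` is a hypothetical interface that only mimics the two calls that matter here (`poll` and `commitSync`), and `InMemorySource` is a fake used to make the sketch runnable. With the real client you would set `enable.auto.commit=false` and call `commitSync()` on the `KafkaConsumer` itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ManualCommitSketch {

    /** Hypothetical stand-in for the consumer calls used here; not a Kafka type. */
    interface RecordSource {
        List<String> poll();  // next batch, empty when nothing is left
        void commitSync();    // persist the position reached by the last poll()
    }

    /** Processes each batch completely, then commits. Returns the number of records handled. */
    static int drain(RecordSource source, List<String> sink) {
        int processed = 0;
        List<String> batch;
        // A real consumer loop runs until shutdown; this sketch stops on an empty batch.
        while (!(batch = source.poll()).isEmpty()) {
            for (String record : batch) {
                sink.add(record); // placeholder for real processing
                processed++;
            }
            source.commitSync(); // commit only after the whole batch was processed
        }
        return processed;
    }

    /** In-memory fake that hands out fixed-size batches and records commits. */
    static final class InMemorySource implements RecordSource {
        private final List<String> records;
        private final int batchSize;
        private int position = 0;
        int committed = 0;
        int commitCalls = 0;

        InMemorySource(List<String> records, int batchSize) {
            this.records = records;
            this.batchSize = batchSize;
        }

        @Override
        public List<String> poll() {
            int end = Math.min(position + batchSize, records.size());
            List<String> batch = new ArrayList<>(records.subList(position, end));
            position = end;
            return batch;
        }

        @Override
        public void commitSync() {
            committed = position;
            commitCalls++;
        }
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource(Arrays.asList("a", "b", "c", "d", "e"), 2);
        List<String> sink = new ArrayList<>();
        int processed = drain(source, sink);
        System.out.println(processed + " processed in " + source.commitCalls
                + " commits, committed position " + source.committed);
    }
}
```

Committing after processing gives at-least-once delivery: a crash mid-batch replays that batch only, so the duplicate window is bounded by the batch size and no longer by a timer.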



One option might be to upgrade your Spring clients, as Gary suggests in the comments below. Since you're running Kafka 0.10.2+, this shouldn't be a problem.






  • Nope, definitely the same messages are being read again. Messages from a week ago

    – Yuval
    Nov 19 '18 at 7:49











  • Can you show your code in the question?

    – cricket_007
    Nov 19 '18 at 14:04








  • You should upgrade to 1.3.x; 1.1.x is very old and has a very complicated threading model. KIP-62 allowed us to rewrite the threading model and make it much simpler. The current 1.3.x release is 1.3.7; 1.3.8 will be released next week. With brokers before 2.0.0, the consumer offsets are removed after 24 hours, so if you get no messages in that time (e.g. over a weekend), the offsets will be reset. 2.0.0 changed the default to 7 days.

    – Gary Russell
    Nov 19 '18 at 15:13













  • @Yuval See above ^^

    – cricket_007
    Nov 19 '18 at 16:55






  • Good point; with auto commit, the offsets shouldn't expire. I would still recommend upgrading to a more modern version of spring-kafka, though. 1.1.x is no longer supported; you should go to 1.3.7 at a minimum; the current version is 2.1.0 (2.1.1 next week).

    – Gary Russell
    Nov 20 '18 at 14:01











edited Nov 20 '18 at 11:18

























answered Nov 18 '18 at 18:41









cricket_007












