Heartbeat failed for group <id> because it is rebalancing

Hi,

Our node details are as follows:

Kafka - 3 nodes (shared between different applications)
App - 2 nodes (i.e. 2 consumers per topic, 5 topics in total)
Kafka settings - version 1.0 with default settings, nothing changed

FYI, Kafka settings

broker.id=1
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/lib/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
default.replication.factor=3
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=2
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.retention.hours=120
zookeeper.connection.timeout.ms=6000
confluent.support.metrics.enable=false
group.initial.rebalance.delay.ms=0
confluent.support.customer.id=anonymous
auto.create.topics.enable=false

The consumers run fine if I run one node with one consumer. The moment I start another consumer on the other node, I immediately get the group rebalancing error for some topics.

I have tried the consumer settings below, but I still see the same error.

AIOKafkaConsumer(
        #evgaKafkaName(topic_name),
        topic_name,
        loop=loop,
        #bootstrap_servers=getKafkaBrokers(),
        bootstrap_servers=BROKERS,
        # group_id=evgaKafkaName(group_name),
        group_id=group_name,
        #security_protocol='SSL',
        #ssl_context=getKafkaSslContext(),
        value_deserializer=lambda v: v.decode('utf8'),
        #value_deserializer=deserializer,
        enable_auto_commit=True,
        auto_offset_reset=auto_offset_reset,
        consumer_timeout_ms=consumer_timeout_ms,
        retry_backoff_ms=10,
        #heartbeat_interval_ms=50,
        session_timeout_ms=60000,
        #request_timeout_ms=80 * 1000,
        #fetch_max_wait_ms=3000,
        max_poll_records=50
    )

What could be the issue? Does it have anything to do with rebalance.max.retries, rebalance.backoff.ms, or the session timeout?

Thanks, Bala
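
For the session timeout part of the question, a quick sanity check of the timing values can rule out one class of problems. The sketch below is only an illustration: `check_group_timeouts` is a hypothetical helper, not part of aiokafka, and the broker bounds (6000 ms and 300000 ms) are assumed to be the `group.min.session.timeout.ms` and `group.max.session.timeout.ms` defaults for a Kafka 1.0 broker, so verify them against your own broker config. The "no more than one third" rule follows the general guidance in the Kafka consumer configuration docs.

```python
def check_group_timeouts(session_timeout_ms, heartbeat_interval_ms,
                         broker_min_session_ms=6000,
                         broker_max_session_ms=300000):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper. The broker_* defaults are assumptions about a
    Kafka 1.0 broker running with default group timeout settings.
    """
    problems = []

    # The broker rejects session timeouts outside its configured range.
    if not broker_min_session_ms <= session_timeout_ms <= broker_max_session_ms:
        problems.append(
            "session_timeout_ms=%d is outside the broker range [%d, %d]"
            % (session_timeout_ms, broker_min_session_ms, broker_max_session_ms)
        )

    # Heartbeats must arrive more often than the session expires.
    if heartbeat_interval_ms >= session_timeout_ms:
        problems.append("heartbeat_interval_ms must be lower than session_timeout_ms")
    elif heartbeat_interval_ms * 3 > session_timeout_ms:
        problems.append(
            "heartbeat_interval_ms is usually kept at or below 1/3 of session_timeout_ms"
        )

    return problems


if __name__ == "__main__":
    # Values from the post: session_timeout_ms=60000 with a 3000 ms heartbeat
    # (3000 ms is assumed here; the post leaves heartbeat_interval_ms commented out).
    for problem in check_group_timeouts(60000, 3000):
        print(problem)
```

If this reports nothing, the timeouts are at least mutually consistent, and the log message is more likely the normal rebalance signal than a misconfiguration.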

Issue created 5 years ago. Comments: 9 (4 by maintainers)

Top GitHub Comments

pybala commented, Sep 11, 2018

Yes @tvoinarovskyi, it looks like it was fixed after adding the rebalance listener.
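
The thread does not show the listener that was actually added, so the following is only a minimal sketch of what attaching a rebalance listener in aiokafka can look like. `LoggingRebalanceListener` and `consume` are hypothetical names; the listener here only logs and tracks assignments, whereas a real one might also commit offsets or flush state in `on_partitions_revoked`. The `try/except` import is there only so the listener class can be exercised without aiokafka installed.

```python
import asyncio
import logging

try:
    from aiokafka import AIOKafkaConsumer, ConsumerRebalanceListener
except ImportError:  # fallback so the listener logic can run without aiokafka
    AIOKafkaConsumer = None
    ConsumerRebalanceListener = object

log = logging.getLogger("rebalance")


class LoggingRebalanceListener(ConsumerRebalanceListener):
    """Hypothetical listener: records and logs partition assignment changes."""

    def __init__(self):
        self.assigned = set()
        self.rebalances = 0

    async def on_partitions_revoked(self, revoked):
        # Called before a rebalance takes partitions away from this consumer.
        self.rebalances += 1
        self.assigned -= set(revoked)
        log.info("Partitions revoked: %s", sorted(revoked))

    async def on_partitions_assigned(self, assigned):
        # Called after the rebalance with the new assignment.
        self.assigned = set(assigned)
        log.info("Partitions assigned: %s", sorted(assigned))


async def consume(topic_name, group_name, brokers):
    """Sketch of a consumer loop; requires aiokafka and a reachable broker."""
    consumer = AIOKafkaConsumer(
        bootstrap_servers=brokers,
        group_id=group_name,
        session_timeout_ms=60000,
    )
    # Subscribe explicitly so that the listener can be passed in.
    consumer.subscribe(topics=[topic_name], listener=LoggingRebalanceListener())
    await consumer.start()
    try:
        async for msg in consumer:
            log.info("Got message at %s:%d offset %d", msg.topic, msg.partition, msg.offset)
    finally:
        await consumer.stop()
```

Note that the topic is passed to `subscribe()` rather than to the constructor, since that is where the `listener` argument is accepted.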

tvoinarovskyi commented, Mar 12, 2020

@pod2metra Are you seeing errors of a similar kind in the logs, or are you having problems consuming messages because it is constantly happening?

If it’s just some messages in the logs, that’s OK. That is how Kafka behaves (or at least how it used to before some optimizations were introduced, which sadly are not implemented in aiokafka). When the Kafka broker sees members of the group leave or time out, it returns an error to all remaining participants. Usually this error is seen on the next heartbeat, which results in a log message similar to:

 Heartbeat failed for group <id> because it is rebalancing

That is fine; it just signals the consumer to rejoin the group so that consumption continues with a different member set.
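
The behaviour described above can be illustrated with a toy model. This is not the real Kafka group protocol and not aiokafka code: `ToyCoordinator` is a hypothetical class that only mimics the idea that any membership change puts the group into a rebalancing state, that heartbeats during that state come back with a rebalance error, and that the group becomes stable again once every member has rejoined.

```python
class ToyCoordinator:
    """Toy stand-in for a group coordinator (an assumption-laden simplification)."""

    def __init__(self):
        self.members = set()
        self.rejoined = set()
        self.generation = 0
        self.rebalancing = False

    def _start_rebalance(self):
        if not self.rebalancing:
            self.rebalancing = True
            self.rejoined = set()

    def join(self, member):
        """A member joins, or rejoins after being told the group is rebalancing."""
        self._start_rebalance()
        self.members.add(member)
        self.rejoined.add(member)
        # The rebalance completes once every current member has (re)joined.
        if self.rejoined == self.members:
            self.generation += 1
            self.rebalancing = False
        return self.generation

    def leave(self, member):
        """A member leaves or times out; the remaining members must rejoin."""
        self.members.discard(member)
        self.rebalancing = False
        self._start_rebalance()

    def heartbeat(self, member):
        """Heartbeats fail while a rebalance is pending, otherwise succeed."""
        return "REBALANCE_IN_PROGRESS" if self.rebalancing else "OK"


if __name__ == "__main__":
    group = ToyCoordinator()
    group.join("consumer-1")
    print(group.heartbeat("consumer-1"))  # group is stable with one member
    group.join("consumer-2")              # second consumer starts on the other node
    print(group.heartbeat("consumer-1"))  # first consumer is told to rejoin
    group.join("consumer-1")              # it rejoins; the group settles again
    print(group.heartbeat("consumer-1"))
```

In this model the failed heartbeat is the notification mechanism itself, which matches the point that a single such log line after a second consumer starts is expected rather than a fault.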

If you have a different case, please open a new issue. Sorry for the trouble.
