Scheduler livenessProbe errors on new helm chart
Official Helm Chart version
1.6.0 (latest released)
Apache Airflow version
v2.1.2
Kubernetes Version
v1.22.10 (GKE version v1.22.10-gke.600)
Helm Chart configuration
Only the livenessProbe config, which was the same before and during the issue:
# Airflow scheduler settings
scheduler:
  livenessProbe:
    initialDelaySeconds: 10
    timeoutSeconds: 15
    failureThreshold: 10
    periodSeconds: 60
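These values only tune the probe's timing; the command itself comes from the chart template. As a rough sketch (assumption: the chart wraps the command quoted under "What happened" below in a sh -c exec probe; the exact rendered output may differ), the scheduler probe ends up looking something like:

livenessProbe:
  initialDelaySeconds: 10
  timeoutSeconds: 15
  failureThreshold: 10
  periodSeconds: 60
  exec:
    command:
      - sh
      - -c
      - |
        CONNECTION_CHECK_MAX_COUNT=0 AIRFLOW__LOGGING__LOGGING_LEVEL=ERROR exec /entrypoint \
        airflow jobs check --job-type SchedulerJob --hostname $(hostname)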
Docker Image customisations
Here's the image we use, based on the official apache/airflow image:
### Main official airflow image
FROM apache/airflow:2.1.2-python3.8
USER root
RUN apt update
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
### Add OS packages here
## GCC compiler in case it's needed for installing python packages
RUN apt install -y -q build-essential
USER airflow
### Changing the default SSL / TLS mode for mysql client to work properly
## https://askubuntu.com/questions/1233186/ubuntu-20-04-how-to-set-lower-ssl-security-level
## https://bugs.launchpad.net/ubuntu/+source/mysql-8.0/+bug/1872541
## https://stackoverflow.com/questions/61649764/mysql-error-2026-ssl-connection-error-ubuntu-20-04
RUN echo $'openssl_conf = default_conf\n\
[default_conf]\n\
ssl_conf = ssl_sect\n\
[ssl_sect]\n\
system_default = ssl_default_sect\n\
[ssl_default_sect]\n\
MinProtocol = TLSv1\n\
CipherString = DEFAULT:@SECLEVEL=1' >> /home/airflow/.openssl.cnf
## OS env var to point to the new openssl.cnf file
ENV OPENSSL_CONF=/home/airflow/.openssl.cnf
### Add airflow providers
RUN pip install apache-airflow-providers-apache-beam
### End airflow providers
### Add extra python packages
RUN pip install python-slugify==3.0.3
### End extra python packages
What happened
After upgrading to helm chart 1.6.0, the scheduler pod kept restarting because the livenessProbe was failing.
The new livenessProbe command from helm chart 1.6.0, tested directly on our scheduler pod:
airflow@yc-data-airflow-scheduler-0:/opt/airflow$ CONNECTION_CHECK_MAX_COUNT=0 AIRFLOW__LOGGING__LOGGING_LEVEL=ERROR exec /entrypoint \
> airflow jobs check --job-type SchedulerJob --hostname $(hostname)
No alive jobs found.
Removing the --hostname argument works, and an alive job is found:
airflow@yc-data-airflow-scheduler-0:/opt/airflow$ CONNECTION_CHECK_MAX_COUNT=0 AIRFLOW__LOGGING__LOGGING_LEVEL=ERROR exec /entrypoint \
> airflow jobs check --job-type SchedulerJob
Found one alive job.
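One way to see why the --hostname filter finds nothing is to compare the pod's $(hostname) with the hostname the running SchedulerJob actually registered in the metadata database. A hedged sketch against Airflow's job table (run it with whatever DB client you use, e.g. airflow db shell if your version has it):

SELECT hostname, state, latest_heartbeat
FROM job
WHERE job_type = 'SchedulerJob'
ORDER BY latest_heartbeat DESC
LIMIT 5;

If the hostname column holds an IP address or a different name than what $(hostname) prints in the pod, the probe's filter can never match.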
What you think should happen instead
livenessProbe should not error.
How to reproduce
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Comments
hey @fernhtls, I think the issue might not be directly related to AIRFLOW__CORE__HOSTNAME_CALLABLE; however (this may be obvious, but still), per your findings it is related to the pod hostname not being resolved. So, as a workaround, $(hostname -i) may be used instead of $(hostname) in the liveness probe command.

I have the same issue with airflow 2.4.1 when checking liveness on the dag-processor's pod. The default command is not working for me, and neither is $(hostname -i); it only works after removing the --hostname flag.
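For anyone who wants to verify the workaround before changing chart values, the probe command can be exercised by hand on the pod, mirroring the commands above with the suggested substitution (pod and container names below are placeholders; whether your chart version exposes an override for the probe command is something to check in its values schema):

kubectl exec <scheduler-pod> -c scheduler -- bash -c \
  'CONNECTION_CHECK_MAX_COUNT=0 AIRFLOW__LOGGING__LOGGING_LEVEL=ERROR exec /entrypoint \
   airflow jobs check --job-type SchedulerJob --hostname $(hostname -i)'

If neither $(hostname) nor $(hostname -i) matches what the job registered (as reported above for 2.4.1), dropping --hostname is the remaining option, with the trade-off that the probe then only checks that some alive SchedulerJob exists rather than this pod's specifically.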