Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)


Apache Airflow version

2.0.2

Operating System

PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/

Versions of Apache Airflow Providers

apache-airflow-backport-providers-google==2021.3.3 apache-airflow-backport-providers-amazon==2021.3.3

Deployment

Docker-Compose

Deployment details

No response

What happened

We use Airflow to schedule an ETL pipeline, and for the transformation step we use the EMR step sensor. We very frequently see the error "Received SIGTERM. Terminating subprocesses." https://github.com/apache/airflow/blob/88583095c408ef9ea60f793e7072e3fd4b88e329/airflow/models/taskinstance.py#L1394 From the logs, it looks like the Airflow task sets the operator state to success, and Airflow then treats that same change as an external one: "State of this instance has been externally set to success. Terminating instance." https://github.com/apache/airflow/blob/88583095c408ef9ea60f793e7072e3fd4b88e329/airflow/jobs/local_task_job.py#L211
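
To make the suspected sequence concrete, here is a minimal, generic Python sketch of a process that finishes its work and is then sent SIGTERM by a supervisor before it has exited. This is not Airflow's code: `TaskTerminated`, `handle_sigterm` and `run_task_then_get_signalled` are hypothetical names chosen for illustration, and the sketch assumes a POSIX system and that it runs in the main thread.

```python
import os
import signal


class TaskTerminated(Exception):
    """Hypothetical exception; stands in for whatever a task runner raises."""


def handle_sigterm(signum, frame):
    # Convert the signal into an exception so that cleanup code still runs.
    raise TaskTerminated(f"received {signal.Signals(signum).name}")


def run_task_then_get_signalled():
    """Finish the 'work', then receive SIGTERM before the process has exited."""
    events = []
    previous_handler = signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        events.append("work finished")
        # Stand-in for a supervisor that signals the process after it succeeded.
        os.kill(os.getpid(), signal.SIGTERM)
        events.append("exited cleanly")  # normally skipped: the handler raises first
    except TaskTerminated as exc:
        events.append(f"terminated: {exc}")
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGTERM, previous_handler)
    return events


if __name__ == "__main__":
    for event in run_task_then_get_signalled():
        print(event)
```

The point of the sketch is only that the work can already be complete when the signal arrives, so an error logged by the handler does not by itself mean the work failed.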

The Airflow UI shows no failures, but we see the errors below in the Airflow logs.

[2021-11-08 01:12:13,405] {standard_task_runner.py:52} INFO - Started process 27548 to run task
[2021-11-08 01:12:14,178] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'ETL_PIPELINE_JOB', 'etl_step_sensor_task', '2021-11-08T00:01:00+00:00', '--job-id', '100495', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/etl_pipeline_script.py', '--cfg-path', '/tmp/tmp_6kklqa5', '--error-file', '/tmp/tmpo2ug1xs3']
[2021-11-08 01:12:14,279] {standard_task_runner.py:77} INFO - Job 100495: Subtask etl_step_sensor_task
[2021-11-08 01:12:15,800] {logging_mixin.py:104} INFO - Running <TaskInstance: ETL_PIPELINE_JOB.etl_step_sensor_task 2021-11-08T00:01:00+00:00 [running]> on host e363e366c87b
[2021-11-08 01:12:17,015] {taskinstance.py:1281} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_EMAIL=airflow@airflow.com
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=ETL_PIPELINE_JOB
AIRFLOW_CTX_TASK_ID=etl_step_sensor_task
AIRFLOW_CTX_EXECUTION_DATE=2021-11-08T00:01:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-11-08T00:01:00+00:00
[2021-11-08 01:12:17,174] {base_aws.py:362} INFO - Airflow Connection: aws_conn_id=aws_default
[2021-11-08 01:12:17,462] {base_aws.py:385} WARNING - Unable to use Airflow Connection for credentials.
[2021-11-08 01:12:17,462] {base_aws.py:386} INFO - Fallback on boto3 credential strategy
[2021-11-08 01:12:17,462] {base_aws.py:389} INFO - Creating session using boto3 credential strategy region_name=None
[2021-11-08 01:12:18,035] {emr_step.py:75} INFO - Poking step step-id on cluster cluster_id
[2021-11-08 01:12:18,395] {emr_base.py:68} INFO - Job flow currently COMPLETED
[2021-11-08 01:12:18,395] {base.py:245} INFO - Success criteria met. Exiting.
[2021-11-08 01:12:18,522] {taskinstance.py:1185} INFO - Marking task as SUCCESS. dag_id=ETL_PIPELINE_JOB, task_id=etl_step_sensor_task, execution_date=20211108T000100, start_date=20211108T011212, end_date=20211108T011218
[2021-11-08 01:12:19,269] {local_task_job.py:187} WARNING - State of this instance has been externally set to success. Terminating instance.
[2021-11-08 01:12:19,272] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 27548
[2021-11-08 01:12:19,419] {taskinstance.py:1265} ERROR - Received SIGTERM. Terminating subprocesses.
[2021-11-08 01:12:20,636] {process_utils.py:66} INFO - Process psutil.Process(pid=27548, status='terminated', exitcode=1, started='01:12:12') (27548) terminated with exit code 1
airflow@e363e366c87b:/opt/airflow/logs/ETL_PIPELINE_JOB$ 
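
Since the problem is intermittent, it may help to measure how long after "Marking task as SUCCESS" the "externally set" warning appears in each affected run. The following is a small sketch that assumes the log layout shown above (`[timestamp] {file.py:line} LEVEL - message`); `parse_line` and `seconds_between` are hypothetical helper names, not part of Airflow.

```python
import re
from datetime import datetime

# Matches the "[timestamp] {file.py:line} LEVEL - message" layout of the log above.
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"\{(?P<source>[^}]+)\} (?P<level>[A-Z]+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None for lines without a log prefix."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return timestamp, match["level"], match["message"]


def seconds_between(lines, first_marker, second_marker):
    """Seconds from the first line containing first_marker to the next line
    containing second_marker, or None if the pair is not found."""
    start = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        timestamp, _level, message = parsed
        if start is None:
            if first_marker in message:
                start = timestamp
        elif second_marker in message:
            return (timestamp - start).total_seconds()
    return None


# Three lines copied from the log in this report.
SAMPLE = [
    "[2021-11-08 01:12:18,522] {taskinstance.py:1185} INFO - Marking task as SUCCESS. "
    "dag_id=ETL_PIPELINE_JOB, task_id=etl_step_sensor_task, execution_date=20211108T000100, "
    "start_date=20211108T011212, end_date=20211108T011218",
    "[2021-11-08 01:12:19,269] {local_task_job.py:187} WARNING - State of this instance "
    "has been externally set to success. Terminating instance.",
    "[2021-11-08 01:12:19,419] {taskinstance.py:1265} ERROR - Received SIGTERM. "
    "Terminating subprocesses.",
]

gap = seconds_between(SAMPLE, "Marking task as SUCCESS", "externally set to success")
print(f"{gap:.3f} s between the success mark and the external-state warning")
```

Running the same helper over the logs of several affected task instances would show whether the gap is consistently under a second, which is the kind of detail that can be attached to the report.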

### What you expected to happen

We expect no errors in the Airflow task, in particular not this one:
```
Received SIGTERM. Terminating subprocesses.
```

If this is expected, how can we fix it in the Airflow build?


### How to reproduce

This is an intermittent issue. We are not sure whether these errors can be reproduced with predefined steps.

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (8 by maintainers)

Top GitHub Comments

1 reaction
potiuk commented, Feb 6, 2022

> Hi @potiuk, we have seen a similar issue since we upgraded from Airflow 2.0.2 to Airflow 2.2.3, for example in a BashOperator.

This issue is already closed. Without knowing your configuration, it is impossible to say what the problem is and whether it is the same one. Something sending a TERM signal will usually result in this kind of problem, but there may be multiple reasons for that.

I propose you open a new issue where you describe your circumstances, or you could look at other issues that describe similar problems in various circumstances to see whether they match yours. If you find one that is still open, post your details in it to help diagnose and fix it with more certainty:

Here are all similar issues: https://github.com/apache/airflow/issues?q=is%3Aissue+SIGTERM+label%3Akind%3Abug+
And here are only the open ones: https://github.com/apache/airflow/issues?q=is%3Aissue+SIGTERM+label%3Akind%3Abug+is%3Aopen

If you find that the circumstances are similar in one of those open issues, I suggest you add more details there: your deployment details, when you experience the problem, what remedies you have tried so far, and anything else that might help.

0 reactions
potiuk commented, Apr 13, 2022

There are really, really different reasons why a task might get SIGTERM 😦. We need some detailed logs to act on it.
