failed to start postgres (reaped unknown pid: postmaster is not running)

Hey guys, during server upgrades, when Patroni instances are restarted, this happens on secondary nodes:

2017-12-06 00:40:29,638 INFO: Lock owner: postgres-patroni-2; I am postgres-patroni-0
2017-12-06 00:40:29,665 INFO: starting as a secondary
2017-12-06 00:40:29,705 INFO: postmaster pid=428
2017-12-06 00:40:29 UTC [428]: [1-1] 5a273c7d.1ac 0     LOG:  redirecting log output to logging collector process
2017-12-06 00:40:29 UTC [428]: [2-1] 5a273c7d.1ac 0     HINT:  Future log output will appear in directory "../pg_log".
2017-12-06 00:40:29,924 INFO reaped unknown pid 428
2017-12-06 00:40:29,924 INFO reaped unknown pid 429
2017-12-06 00:40:30,706 ERROR: postmaster is not running
2017-12-06 00:40:30,711 INFO: Lock owner: postgres-patroni-2; I am postgres-patroni-0
2017-12-06 00:40:30,725 INFO: failed to start postgres

It starts fine if I delete all postgres data on these nodes. However, I want replication to kick in and resume from where it left off before the server upgrade.

Any ideas why this happens and what kills the postmaster process? Supervisord? Why?

Created 6 years ago, 14 comments

CyberDem0n commented, Sep 27, 2019

Hi @bappr,

that’s quite an old image; it was built more than a year ago…

You can try to figure out why postgres is failing by exec'ing into the pod and looking at the logs, which are located in /home/postgres/pgdata/pgroot/pg_log. Since it is Friday today, the current log file is postgres-5.csv.
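As a rough sketch of how to find the right file on any given day: the Friday / postgres-5.csv pairing suggests the files are named after the ISO weekday number (1 = Monday … 7 = Sunday), which is what `date +%u` prints. That naming scheme is an assumption inferred from this comment, and the `current_log_file` helper and `LOG_DIR` variable are just illustrative names; run this inside the pod and check the directory listing if the file is not there.

```shell
#!/bin/sh
# Assumption: log files are named postgres-<N>.csv, where N is the ISO
# weekday number (1 = Monday ... 7 = Sunday), so Friday maps to postgres-5.csv.
LOG_DIR="${LOG_DIR:-/home/postgres/pgdata/pgroot/pg_log}"

# Hypothetical helper: print the path of today's log file.
current_log_file() {
    echo "${LOG_DIR}/postgres-$(date +%u).csv"
}

log_file="$(current_log_file)"
if [ -r "$log_file" ]; then
    # Show the most recent entries; the startup failure should be near the end.
    tail -n 50 "$log_file"
else
    echo "log file not found: $log_file" >&2
fi
```

If the file is missing, `ls -lt` on the same directory shows which log was written most recently.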

Since your master is alive you can rebuild replicas with the help of patronictl:

$ kubectl exec -it cluster-name-X -- bash
root@cluster-name-X:/home/postgres# su postgres
postgres@cluster-name-X:~$ patronictl list # check cluster status
postgres@cluster-name-X:~$ patronictl reinit cluster-name cluster-name-X

The last command will wipe PGDATA and take a fresh pg_basebackup from the master.

CyberDem0n commented, Dec 9, 2019

Any reason why the pg_control got fucked ?

You’ll have to figure it out from the logs.
