[GKE] failed to bootstrap from leader 'ticker-patroni-0'


Hello, my Patroni replicas are unable to sync with the leader. I am using Spilo-1.5-p5 and PostgreSQL 11 on Kubernetes, with Kubernetes as the DCS.

ticker-patroni-1 patroni 2019-02-08 17:51:35,705 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
ticker-patroni-1 patroni 2019-02-08 17:51:35,733 - bootstrapping - INFO - Looks like your running google
ticker-patroni-1 patroni 2019-02-08 17:51:35,747 - bootstrapping - INFO - Configuring crontab
ticker-patroni-1 patroni 2019-02-08 17:51:35,747 - bootstrapping - INFO - Configuring wal-e
ticker-patroni-1 patroni 2019-02-08 17:51:35,747 - bootstrapping - INFO - Configuring certificate
ticker-patroni-1 patroni 2019-02-08 17:51:35,748 - bootstrapping - INFO - Generating ssl certificate
ticker-patroni-1 patroni 2019-02-08 17:51:35,850 - bootstrapping - INFO - Configuring patroni
ticker-patroni-1 patroni 2019-02-08 17:51:35,858 - bootstrapping - INFO - Writing to file /home/postgres/postgres.yml
ticker-patroni-1 patroni 2019-02-08 17:51:35,858 - bootstrapping - INFO - Configuring log
ticker-patroni-1 patroni 2019-02-08 17:51:35,858 - bootstrapping - INFO - Configuring bootstrap
ticker-patroni-1 patroni 2019-02-08 17:51:35,858 - bootstrapping - INFO - Configuring patronictl
ticker-patroni-1 patroni 2019-02-08 17:51:35,859 - bootstrapping - INFO - Configuring pgbouncer
ticker-patroni-1 patroni 2019-02-08 17:51:35,859 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
ticker-patroni-1 patroni 2019-02-08 17:51:35,859 - bootstrapping - INFO - Configuring pam-oauth2
ticker-patroni-1 patroni 2019-02-08 17:51:35,859 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
ticker-patroni-1 patroni 2019-02-08 17:51:36,102 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
ticker-patroni-1 patroni 2019-02-08 17:51:36,102 INFO Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
ticker-patroni-1 patroni 2019-02-08 17:51:36,102 INFO Included extra file "/etc/supervisor/conf.d/patroni.conf" during parsing
ticker-patroni-1 patroni 2019-02-08 17:51:36,102 INFO Included extra file "/etc/supervisor/conf.d/pgq.conf" during parsing
ticker-patroni-1 patroni 2019-02-08 17:51:36,110 INFO RPC interface 'supervisor' initialized
ticker-patroni-1 patroni 2019-02-08 17:51:36,110 CRIT Server 'unix_http_server' running without any HTTP authentication checking
ticker-patroni-1 patroni 2019-02-08 17:51:36,110 INFO supervisord started with pid 1
ticker-patroni-1 patroni 2019-02-08 17:51:37,114 INFO spawned: 'cron' with pid 24
ticker-patroni-1 patroni 2019-02-08 17:51:37,117 INFO spawned: 'patroni' with pid 25
ticker-patroni-1 patroni 2019-02-08 17:51:37,119 INFO spawned: 'pgq' with pid 26
ticker-patroni-1 patroni 2019-02-08 17:51:37,683 INFO: Lock owner: ticker-patroni-0; I am ticker-patroni-1
ticker-patroni-1 patroni 2019-02-08 17:51:37,691 INFO: trying to bootstrap from leader 'ticker-patroni-0'
ticker-patroni-1 patroni 2019-02-08 17:51:37,692 ERROR: failed to bootstrap from leader 'ticker-patroni-0'
ticker-patroni-1 patroni 2019-02-08 17:51:37,692 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
ticker-patroni-1 patroni 2019-02-08 17:51:38,694 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ticker-patroni-1 patroni 2019-02-08 17:51:38,694 INFO success: patroni entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ticker-patroni-1 patroni 2019-02-08 17:51:38,694 INFO success: pgq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ticker-patroni-0 patroni 2019-02-08 17:51:39,395 INFO: Lock owner: ticker-patroni-0; I am ticker-patroni-0
ticker-patroni-0 patroni 2019-02-08 17:51:39,406 INFO: no action.  i am the leader with the lock
ticker-patroni-1 patroni 2019-02-08 17:51:47,680 INFO: Lock owner: ticker-patroni-0; I am ticker-patroni-1
ticker-patroni-1 patroni 2019-02-08 17:51:47,687 INFO: trying to bootstrap from leader 'ticker-patroni-0'
ticker-patroni-1 patroni 2019-02-08 17:51:47,688 ERROR: failed to bootstrap from leader 'ticker-patroni-0'
ticker-patroni-1 patroni 2019-02-08 17:51:47,688 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
ticker-patroni-0 patroni 2019-02-08 17:51:49,394 INFO: Lock owner: ticker-patroni-0; I am ticker-patroni-0
ticker-patroni-0 patroni 2019-02-08 17:51:49,407 INFO: no action.  i am the leader with the lock
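
The retry loop in the log above can be summarized with a short script. This is only a sketch: the regular expression assumes the `<pod> patroni <timestamp> ERROR: failed to bootstrap from leader '<name>'` layout seen in this log, and the helper name `count_bootstrap_failures` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# ticker-patroni-1 patroni 2019-02-08 17:51:37,692 ERROR: failed to bootstrap from leader 'ticker-patroni-0'
# The layout is an assumption based on the log excerpt in this issue.
FAILURE_RE = re.compile(
    r"^(?P<member>\S+) patroni \S+ \S+ "
    r"ERROR: failed to bootstrap from leader '(?P<leader>[^']+)'"
)


def count_bootstrap_failures(lines):
    """Count failed bootstrap attempts per (member, leader) pair."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.match(line)
        if match:
            counts[(match.group("member"), match.group("leader"))] += 1
    return counts


# Three lines copied from the log above.
sample = [
    "ticker-patroni-1 patroni 2019-02-08 17:51:37,692 ERROR: failed to bootstrap from leader 'ticker-patroni-0'",
    "ticker-patroni-0 patroni 2019-02-08 17:51:39,406 INFO: no action.  i am the leader with the lock",
    "ticker-patroni-1 patroni 2019-02-08 17:51:47,688 ERROR: failed to bootstrap from leader 'ticker-patroni-0'",
]

failures = count_bootstrap_failures(sample)
for (member, leader), n in failures.items():
    print(f"{member} failed to bootstrap from {leader} {n} time(s)")
```

A count that keeps growing at a steady interval, as it does here roughly every 10 seconds, suggests the replica is stuck in the same failure rather than hitting a transient error.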

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7

Top GitHub Comments

CyberDem0n commented, Feb 9, 2019

Your problem has nothing to do with wal-e. It might be that the pods are somehow configured differently.

You need to check the postgres logs on the master and the patroni config files on all pods (especially kubernetes.labels). Also check the k8s manifests of all pods, the statefulset, services, and endpoints, and make sure that they have labels attached and can be found by the label selector built from kubernetes.labels. If you don’t have any experience with Patroni, please try to start it locally to get some understanding of how it works.
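
The label check described above can be sketched in a few lines of Python. This is a minimal illustration of equality-based selector matching, not Patroni's own code: the label keys and values (`application`, `cluster-name`, `app`, `release`) and the helper names are hypothetical, and in practice the labels would come from your actual manifests or from `kubectl get ... --show-labels`.

```python
from typing import Dict, List, Tuple


def matches_selector(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Equality-based match: every selector key/value must appear in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def find_unmatched(
    selector: Dict[str, str], objects: List[Tuple[str, Dict[str, str]]]
) -> List[str]:
    """Return the names of objects that the selector would NOT find."""
    return [name for name, labels in objects if not matches_selector(selector, labels)]


# Hypothetical contents of kubernetes.labels in the Patroni config.
selector = {"application": "spilo", "cluster-name": "ticker-patroni"}

# Hypothetical labels as they might appear on the deployed objects.
objects = [
    ("pod/ticker-patroni-0", {"application": "spilo", "cluster-name": "ticker-patroni"}),
    ("pod/ticker-patroni-1", {"app": "patroni", "release": "ticker"}),
]

unmatched = find_unmatched(selector, objects)
print(unmatched)
```

Any object listed in the result carries labels that the selector cannot find, which is the kind of mismatch to look for between the generated manifests and kubernetes.labels.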

Kubernetes is not only the most complicated case, it is also buried under the layers of Spilo. I therefore recommend that you figure out how Spilo works and how the environment variables provided to Spilo affect the Patroni config file and all other configs (including wal-e) inside Spilo.

There is also a possibility that the helm chart generates an incorrect statefulset manifest. There is a reference manifest in the spilo repo. It may differ from the helm chart, but I am sure it works.

k1ng440 commented, Feb 9, 2019

You are right, it was the helm chart. I have deployed from the “reference” manifest and it's working as expected.

Thank you
