How to properly restore cluster using WAL-G (WAL-E) to a specific time (PITR) on Postgres 12?
I've gone through the manual pages, resolved issues, and many tutorials, but unfortunately I didn't find a complete, working example of how to restore a Patroni cluster to a specific point in time (PITR). So with a little help I'll prepare a tutorial to cover this topic.
There is no recovery.conf in Postgres 12, so Patroni has to handle the recovery configuration and write it into postgresql.conf.
In plain Postgres you can restore a database to a specific time with this parameter:
recovery_target_time = '2020-05-25 11:20:00'
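For context, in Postgres 12 the old recovery.conf settings moved into postgresql.conf, and recovery mode is signaled by an empty recovery.signal file in the data directory. Outside Patroni, a manual PITR setup would look roughly like this sketch (the restore_command and data directory path are the ones used later in this thread):

```
# postgresql.conf — recovery settings (Postgres 12+, formerly in recovery.conf)
restore_command = 'envdir /etc/wal-g.d/env wal-g wal-fetch "%f" "%p"'
recovery_target_time = '2020-05-25 11:20:00'
recovery_target_action = 'promote'

# then mark the instance for recovery with an empty signal file:
#   touch /data/patroni/recovery.signal
```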
- Setup environment: I'm using a custom bootstrap method, pitr_recovery.
scope: postgres
namespace: /db/
name: postgresql2

restapi:
  listen: 10.0.1.8:8008
  connect_address: 10.0.1.8:8008

etcd:
  host: 10.0.1.7:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  method: pitr_recovery
  pitr_recovery:
    keep_existing_recovery_conf: true
    recovery_conf:
      restore_command: 'envdir /etc/wal-g.d/env wal-g wal-fetch "%f" "%p"'
      recovery_target_timeline: latest
      recovery_target_action: promote
      recovery_target_time: '2020-05-25 11:20:00'
  initdb:
    - encoding: UTF8
    - locale: en_US.UTF-8
    - data-checksums
  pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 10.0.1.4/0 md5
    - host replication replicator 10.0.1.8/0 md5
    - host replication replicator 10.0.1.6/0 md5
    - host all all 0.0.0.0/0 md5

postgresql:
  listen: 10.0.1.8:5432
  connect_address: 10.0.1.8:5432
  data_dir: /data/patroni
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: rep-pass
    superuser:
      username: postgres
      password: secretpassword
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    shared_preload_libraries: 'pg_stat_statements'
    archive_mode: 'on'
    archive_timeout: 300s
    archive_command: 'envdir /etc/wal-g.d/env wal-g wal-push %p'
  recovery_conf:
    restore_command: 'envdir /etc/wal-g.d/env wal-g wal-fetch "%f" "%p"'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
- Steps:
  - stop all Patroni nodes
  - remove the Patroni cluster from the DCS (patronictl remove)
  - delete the contents of the data directory (/data/patroni)
  - fetch the base backup with WAL-G (WAL-E): envdir /etc/wal-g.d/env wal-g backup-fetch /data/patroni LATEST
  - start the Patroni node (master)
- Result: the database is recovered only to the latest base backup; recovery_target_time (in postgresql.conf) is ignored
Could someone help me with that?
Secondary mission 😃
I've also tried putting the backup fetch into the command section, but it doesn't work at all and throws an error:
  method: pitr_recovery
  pitr_recovery:
    command: 'envdir /etc/wal-g.d/env wal-g backup-fetch /data/patroni LATEST'
    keep_existing_recovery_conf: true
    recovery_conf:
      restore_command: 'envdir /etc/wal-g.d/env wal-g wal-fetch "%f" "%p"'
      recovery_target_timeline: latest
      recovery_target_action: promote
      recovery_target_time: '2020-05-25 11:20:00'
patroni.service - Runners to orchestrate a high-availability PostgreSQL
Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2020-06-01 08:26:45 UTC; 1min 36s ago
Process: 3564 ExecStart=/usr/local/bin/patroni /etc/patroni.yml (code=exited, status=1/FAILURE)
Main PID: 3564 (code=exited, status=1/FAILURE)
Jun 01 08:26:45 pg-2 patroni[3564]: info = self._run_cycle()
Jun 01 08:26:45 pg-2 patroni[3564]: File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 1277, in _run_cycle
Jun 01 08:26:45 pg-2 patroni[3564]: return self.post_bootstrap()
Jun 01 08:26:45 pg-2 patroni[3564]: File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 1173, in post_bootstrap
Jun 01 08:26:45 pg-2 patroni[3564]: self.cancel_initialization()
Jun 01 08:26:45 pg-2 patroni[3564]: File "/usr/local/lib/python2.7/dist-packages/patroni/ha.py", line 1168, in cancel_initialization
Jun 01 08:26:45 pg-2 patroni[3564]: raise PatroniException('Failed to bootstrap cluster')
Jun 01 08:26:45 pg-2 patroni[3564]: patroni.exceptions.PatroniException: 'Failed to bootstrap cluster'
Jun 01 08:26:45 pg-2 systemd[1]: patroni.service: Main process exited, code=exited, status=1/FAILURE
Jun 01 08:26:45 pg-2 systemd[1]: patroni.service: Failed with result 'exit-code'.
Thank you for the help! Now it's working, so here is my configuration file and scripts. It's a working combination of PostgreSQL 12 + Patroni + WAL-G (the maintained successor of WAL-E) + Azure / AWS.
And an easy script, clone_with_walg.sh:
Warning: your /data directory must be writable by the postgres user.
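The body of the script isn't reproduced above, so as a hypothetical sketch only: Patroni invokes a custom bootstrap command with --scope=&lt;cluster&gt; and --datadir=&lt;path&gt; flags and expects exit code 0 on success, so a clone_with_walg.sh along these lines would fetch the latest base backup into the data directory Patroni passes in (the wal-g invocation matches the config earlier in this thread):

```shell
#!/bin/bash
# clone_with_walg.sh — hypothetical sketch of a Patroni custom-bootstrap script.
# Patroni calls it as: clone_with_walg.sh --scope=<cluster> --datadir=<path>
set -euo pipefail

parse_args() {
  for arg in "$@"; do
    case "$arg" in
      --scope=*)   SCOPE="${arg#*=}" ;;
      --datadir=*) DATADIR="${arg#*=}" ;;
    esac
  done
}

restore_latest() {
  # Fetch the latest base backup into the (empty) data directory.
  envdir /etc/wal-g.d/env wal-g backup-fetch "$DATADIR" LATEST
}

if [[ $# -gt 0 ]]; then
  parse_args "$@"
  restore_latest
fi
```

Exiting non-zero here makes Patroni cancel initialization, which is exactly the "Failed to bootstrap cluster" traceback shown above.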
Steps to PITR:
- Stop Patroni on all replica nodes, then finally on the master: sudo systemctl stop patroni
- Update the configuration file /etc/patroni.yml: recovery_target_time: '2020-06-08 08:52:00'
- Remove the cluster from etcd: patronictl -c /etc/patroni.yml remove postgres
- Back up and remove the data directory /data/patroni on the master
- Start Patroni on the master; it automatically calls the script clone_with_walg.sh: sudo systemctl start patroni
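The steps above can be strung together into one driver script for the master node. This is a hypothetical sketch, not part of the original answer; the scope, paths, and timestamp are the ones used in this thread, and the script defaults to a dry run that only prints the commands (set APPLY=1 to execute them):

```shell
#!/bin/bash
# PITR driver sketch for the master node (dry run by default).
# Edit recovery_target_time in /etc/patroni.yml by hand before running.
set -euo pipefail

run() {
  # Print the command in dry-run mode; execute it only when APPLY=1.
  if [[ "${APPLY:-0}" = "1" ]]; then "$@"; else echo "+ $*"; fi
}

run sudo systemctl stop patroni                      # replicas first, master last
run patronictl -c /etc/patroni.yml remove postgres   # drop cluster state from etcd
run mv /data/patroni /data/patroni.bak               # keep the old data dir as a backup
run sudo systemctl start patroni                     # bootstraps via clone_with_walg.sh
```

Keeping the old data directory under a .bak name (instead of deleting it) leaves an escape hatch if the restore target turns out to be wrong.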
@jiripsota Thank you for providing the additional details in your last comment. I spent many hours trying to figure out how to do this and was not able to find this information easily.