In-place major upgrade #488
Conversation
* make configure_spilo ignore PGVERSION when generating postgres.yml if it doesn't match $PGDATA/PG_VERSION
* move functions used across different modules to the spilo_commons
* add rsync to Dockerfile
* implemented inplace_upgrade script (WIP)

How to trigger the upgrade? This is a two-step process:
1. Update the configuration (version) and rotate all pods. On start configure_spilo will notice the version mismatch and start the old version.
2. When all pods are rotated, exec into the master container and call `python3 /scripts/inplace_upgrade.py N`, where N is the capacity of the PostgreSQL cluster.

What `inplace_upgrade.py` does:
1. Safety checks:
   * the new version must be greater than the old one
   * the current node must be running as a master with the leader lock
   * the current number of members must match `N`
   * the cluster must not be running in maintenance mode
   * all replicas must be streaming from the master with a small lag
2. Prepare `data_new` by running `initdb` with matching parameters
3. Drop objects from the database which could be incompatible with the new version (e.g. pg_stat_statements wrapper, postgres_log fdw)
4. Memorize and reset custom statistics targets (not yet implemented)
5. Enable maintenance mode (patronictl pause --wait)
6. Do a clean shutdown of postgres
7. Get the latest checkpoint location from pg_controldata
8. Wait for replicas to receive/apply the latest checkpoint location
9. Start rsyncd, listening on port 5432 (we know that it is exposed!)
10. If all previous steps succeeded, call `pg_upgrade`
11. If pg_upgrade succeeded we reached the point of no return! If it failed we need to roll back the previous steps.
12. Rename data directories: `data -> data_old` and `data_new -> data`
13. Update the configuration file (postgres.yaml and wal-e envdir)
14. Call CHECKPOINT on replicas (not yet implemented)
15. Trigger rsync on replicas (COPY (SELECT) TO PROGRAM)
16. Wait for replicas' rsync to complete (the feedback status is generated by the `post-xfer exec` script). Wait timeout is 300 seconds.
17. Stop rsyncd
18. Remove the initialize key from DCS (it contains the old sysid)
19. Restart Patroni on the master with the new configuration
20. Start the master up by calling the REST API `POST /restart`
21. Disable maintenance mode (patronictl resume)
22. Run vacuumdb --analyze-in-stages
23. Restore custom statistics targets and analyze these tables
24. Call the post_bootstrap script (restore dropped objects)
25. Remove `data_old`

Rollback:
1. Stop rsyncd if it is running
2. Disable maintenance mode (patronictl resume)
3. Remove `data_new` if it exists

Replicas upgrade with rsync
---------------------------
There are many options on how to call the script:
1. Start a separate REST API for such maintenance tasks (requires opening a new port and some changes in infrastructure)
2. Allow `pod/exec` (works only on K8s, not desirable)
3. Use the COPY TO PROGRAM "hack"

The `COPY TO PROGRAM` approach seems to be low-hanging fruit. It requires only postgres to be up and running, which is in turn already one of the requirements for the upgrade to start. When started, the script does some sanity checks based on input parameters. Three parameters are required (a rough sketch of such a trigger is shown below):
* new_version - the version we are upgrading to
* primary_ip - where to rsync from
* PID - the pid of the postgres backend that executed COPY TO PROGRAM. The script must wait until that backend exits before continuing. It must also check that its parent (maybe grandparent?) process has the right PID, matching the argument.
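To make the idea concrete, here is a minimal sketch of how the master could trigger the replica-side script via COPY TO PROGRAM. It is an illustration only, not the actual Spilo code: the connection credentials, the script path, and the exact command line are assumptions.

```python
# Hypothetical sketch (not the actual Spilo implementation): trigger the
# replica-side upgrade script from the master via COPY TO PROGRAM.
# Credentials, script path, and command line are assumptions.
import psycopg2


def trigger_replica_upgrade(replica_ip, new_version, primary_ip):
    conn = psycopg2.connect(host=replica_ip, port=5432, dbname='postgres',
                            user='postgres', connect_timeout=5)
    conn.autocommit = True
    with conn.cursor() as cur:
        # Learn the pid of this very backend first; the replica-side script
        # waits for this backend to exit before shutting postgres down.
        cur.execute('SELECT pg_backend_pid()')
        pid = cur.fetchone()[0]
        # COPY ... TO PROGRAM runs the command on the replica as the postgres
        # OS user; nohup + & detaches it so the COPY statement returns quickly.
        cur.execute("COPY (SELECT 1) TO PROGRAM "
                    "'nohup python3 /scripts/inplace_upgrade.py {0} {1} {2} &'"
                    .format(new_version, primary_ip, pid))
    conn.close()
```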
There are some problems with the `COPY TO PROGRAM` approach. The Patroni, and therefore PostgreSQL, environment is cleared before start. As a result, the script started by the postgres backend will not see, for example, $KUBERNETES_SERVICE_HOST and won't be able to work with the DCS in all cases.

Once it has made sure that the client backend is gone, the script will:
1. Remember the old sysid
2. Do a clean shutdown of postgres
3. Rename the data directory: `data -> data_old`
4. Update the configuration file (postgres.yaml and wal-e envdir). We do it before rsync because the initialize key could be cleaned up right after rsync has completed and Patroni would exit!
5. Call rsync. If it failed, rename the data directory back.
6. Now we need to wait for the initialize key to be removed from the DCS. Since we know that this happens before postgres on the master is started, we will try to connect to the master via the replication protocol and check the sysid (a sketch of this check follows below).
7. Restart Patroni.
8. Remove `data_old`
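For step 6, a minimal sketch of what the sysid check over the replication protocol could look like, assuming psycopg2's physical replication connection is available in the image; the user name, port, and timeout are illustrative assumptions.

```python
# Illustrative sketch: poll the primary over the replication protocol and wait
# until its system identifier differs from the one memorized before the upgrade.
import time

import psycopg2
from psycopg2.extras import PhysicalReplicationConnection


def wait_for_new_sysid(primary_ip, old_sysid, timeout=300):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            conn = psycopg2.connect(host=primary_ip, port=5432, user='standby',
                                    connection_factory=PhysicalReplicationConnection,
                                    connect_timeout=3)
            try:
                with conn.cursor() as cur:
                    cur.execute('IDENTIFY_SYSTEM')
                    sysid = cur.fetchone()[0]  # systemid is returned as text
                    if sysid != old_sysid:     # primary already runs the new cluster
                        return True
            finally:
                conn.close()
        except psycopg2.Error:
            pass  # the primary may still be down while pg_upgrade runs
        time.sleep(1)
    return False
```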
* handle custom statistics target (speed up analyze); see the sketch below
* remove more incompatible objects (pg_stat_statements)
* truncate unlogged tables (should we do that?)
* update extensions after upgrade
* exclude pg_wal/* from rsync
* CHECKPOINT on replica before shutdown to make rsync time predictable
* Unpause when we know that Patroni on replicas was restarted
* run pg_upgrade --check after initdb
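For the custom statistics targets item (and step 4 of the main sequence), a rough sketch of how memorizing, resetting, and later restoring per-column targets could work. The helper names are hypothetical and per-database handling is omitted for brevity.

```python
# Sketch: memorize custom per-column statistics targets, reset them to the
# default so the post-upgrade analyze runs faster, and restore them afterwards.
FIND_CUSTOM_TARGETS = """
SELECT attrelid::regclass::text, quote_ident(attname), attstattarget
  FROM pg_attribute
 WHERE attstattarget > 0 AND NOT attisdropped
"""


def reset_custom_statistics_targets(cur):
    cur.execute(FIND_CUSTOM_TARGETS)
    saved = cur.fetchall()
    for table, column, _ in saved:
        # -1 means "use the default statistics target"
        cur.execute('ALTER TABLE {0} ALTER COLUMN {1} SET STATISTICS -1'
                    .format(table, column))
    return saved  # keep this around to restore after pg_upgrade


def restore_custom_statistics_targets(cur, saved):
    for table, column, target in saved:
        cur.execute('ALTER TABLE {0} ALTER COLUMN {1} SET STATISTICS {2}'
                    .format(table, column, target))
```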
* wal-e 1.1.1
* wal-g 0.2.17
* timescaledb 1.7.3
* refactor DCS configuration (close #468)
```python
backup = choose_backup(backup_list, recovery_target_time)
if backup:
    return backup, (name if value != old_value else None)
else:  # We assume that the LATEST backup will be for the biggest postgres version!
```
To what does the "biggest" refer in this comment? Does it mean the LATEST
backup has to be for PG v12 when spilo-13 is deployed?
If we don't have the major version of the source cluster specified explicitly, we try all postgres versions starting from the biggest. I.e., the `get_wale_environments()` function yields tuples:
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/12')
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/11')
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/10')

and so on. For every prefix we call `wal-e backup-list` and try to find a backup suitable for the given `recovery_target_time`. If the `recovery_target_time` is not specified we just pick the LATEST backup.

But! It might be that under the `s3://$bucket/spilo/cluster-name/$uid/wal/` path there are backups for 12 and, let's say, 10. The correct way of selecting the latest backup between two (or more) different versions would be listing backups for all versions and choosing between them. This is too much work for too few benefits. Therefore I made the assumption that if the backup for version 12 is there we don't continue with other versions, because most likely the backup for 10 would be older. Schematically, the selection boils down to the loop sketched below.
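A simplified sketch of that loop, assuming `get_wale_environments()` and `choose_backup()` as described above; `list_backups` is a hypothetical wrapper around `wal-e backup-list` added for illustration.

```python
# Simplified sketch of the backup-selection logic: walk the WAL-E prefixes
# from the newest postgres version down and stop at the first prefix that
# contains a usable backup.
def find_backup(get_wale_environments, list_backups, choose_backup,
                recovery_target_time=None):
    for name, prefix in get_wale_environments():  # .../wal/12, .../wal/11, ...
        backup_list = list_backups(prefix)        # calls `wal-e backup-list`
        backup = choose_backup(backup_list, recovery_target_time)
        if backup:
            # Newest version wins; older prefixes are never inspected.
            return prefix, backup
    return None, None
```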
👍

1 similar comment

👍
@CyberDem0n My story: I forgot to enforce a write lock on a minor version upgrade and ended up with corrupted indices. I had to resolve those corruptions manually, and this SO post [1] was very helpful.
In addition to that, integration tests were implemented. Mostly they test happy-case scenarios. The tests also cover a few unhappy cases, e.g. the in-place upgrade doesn't start if the pre-conditions are not met.