In-place major upgrade #488
Conversation
* make configure_spilo ignore PGVERSION when generating postgres.yml if it doesn't match $PGDATA/PG_VERSION
* move functions used across different modules to the spilo_commons
* add rsync to Dockerfile
* implemented inplace_upgrade script (WIP)

How to trigger the upgrade? This is a two-step process:
1. Update the configuration (version) and rotate all pods. On start configure_spilo will notice the version mismatch and start the old version.
2. When all pods are rotated, exec into the master container and call `python3 /scripts/inplace_upgrade.py N`, where N is the capacity of the PostgreSQL cluster.

What `inplace_upgrade.py` does:
1. Safety checks:
   * the new version must be greater than the old one
   * the current node must be running as a master with the leader lock
   * the current number of members must match `N`
   * the cluster must not be running in maintenance mode
   * all replicas must be streaming from the master with a small lag
2. Prepare `data_new` by running `initdb` with matching parameters
3. Drop objects from the database which could be incompatible with the new version (e.g. pg_stat_statements wrapper, postgres_log fdw)
4. Memorize and reset custom statistics targets (not yet implemented)
5. Enable maintenance mode (patronictl pause --wait)
6. Do a clean shutdown of postgres
7. Get the latest checkpoint location from pg_controldata
8. Wait for replicas to receive/apply the latest checkpoint location
9. Start rsyncd, listening on port 5432 (we know that it is exposed!)
10. If all previous steps succeeded, call `pg_upgrade`
11. If pg_upgrade succeeded we reached the point of no return! If it failed we need to roll back the previous steps.
12. Rename data directories: `data -> data_old` and `data_new -> data`
13. Update the configuration file (postgres.yaml and wal-e envdir)
14. Call CHECKPOINT on replicas (not yet implemented)
15. Trigger rsync on replicas (COPY (SELECT) TO PROGRAM)
16. Wait for replicas' rsync to complete (the feedback status is generated by the `post-xfer exec` script). Wait timeout is 300 seconds.
17. Stop rsyncd
18. Remove the initialize key from DCS (it contains the old sysid)
19. Restart Patroni on the master with the new configuration
20. Start the master up by calling the REST API `POST /restart`
21. Disable maintenance mode (patronictl resume)
22. Run vacuumdb --analyze-in-stages
23. Restore custom statistics targets and analyze these tables
24. Call the post_bootstrap script (restore dropped objects)
25. Remove `data_old`

Rollback:
1. Stop rsyncd if it is running
2. Disable maintenance mode (patronictl resume)
3. Remove `data_new` if it exists

Replicas upgrade with rsync
---------------------------
There are many options on how to call the script:
1. Start a separate REST API for such maintenance tasks (requires opening a new port and some changes in infrastructure)
2. Allow `pod/exec` (works only on K8s, not desirable)
3. Use the COPY TO PROGRAM "hack"

The `COPY TO PROGRAM` approach seems to be low-hanging fruit. It requires only postgres to be up and running, which is in turn already one of the requirements for the upgrade to start. When started, the script does some sanity checks based on input parameters. Three parameters are required (a rough sketch of such a trigger is shown below):
* new_version - the version we are upgrading to
* primary_ip - where to rsync from
* PID - the pid of the postgres backend that executed COPY TO PROGRAM. The script must wait until that backend exits before continuing. It must also check that its parent (maybe grandparent?) process has the right PID, matching the argument.
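To make the idea concrete, here is a minimal sketch of how the master could trigger the replica-side script via COPY TO PROGRAM. It is an illustration only, not the actual Spilo code: the connection credentials, the script path, and the exact command line are assumptions.

```python
# Hypothetical sketch (not the actual Spilo implementation): trigger the
# replica-side upgrade script from the master via COPY TO PROGRAM.
# Credentials, script path, and command line are assumptions.
import psycopg2


def trigger_replica_upgrade(replica_ip, new_version, primary_ip):
    conn = psycopg2.connect(host=replica_ip, port=5432, dbname='postgres',
                            user='postgres', connect_timeout=5)
    conn.autocommit = True
    with conn.cursor() as cur:
        # Learn the pid of this very backend first; the replica-side script
        # waits for this backend to exit before shutting postgres down.
        cur.execute('SELECT pg_backend_pid()')
        pid = cur.fetchone()[0]
        # COPY ... TO PROGRAM runs the command on the replica as the postgres
        # OS user; nohup + & detaches it so the COPY statement returns quickly.
        cur.execute("COPY (SELECT 1) TO PROGRAM "
                    "'nohup python3 /scripts/inplace_upgrade.py {0} {1} {2} &'"
                    .format(new_version, primary_ip, pid))
    conn.close()
```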
There are some problems with the `COPY TO PROGRAM` approach. The Patroni, and therefore PostgreSQL, environment is cleared before start. As a result, the script started by the postgres backend will not see, for example, $KUBERNETES_SERVICE_HOST and won't be able to work with the DCS in all cases.

Once it has made sure that the client backend is gone, the script will:
1. Remember the old sysid
2. Do a clean shutdown of postgres
3. Rename the data directory: `data -> data_old`
4. Update the configuration file (postgres.yaml and wal-e envdir). We do it before rsync because the initialize key could be cleaned up right after rsync has completed and Patroni would exit!
5. Call rsync. If it failed, rename the data directory back.
6. Now we need to wait for the initialize key to be removed from the DCS. Since we know that this happens before postgres on the master is started, we will try to connect to the master via the replication protocol and check the sysid (a sketch of this check follows below).
7. Restart Patroni.
8. Remove `data_old`
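For step 6, a minimal sketch of what the sysid check over the replication protocol could look like, assuming psycopg2's physical replication connection is available in the image; the user name, port, and timeout are illustrative assumptions.

```python
# Illustrative sketch: poll the primary over the replication protocol and wait
# until its system identifier differs from the one memorized before the upgrade.
import time

import psycopg2
from psycopg2.extras import PhysicalReplicationConnection


def wait_for_new_sysid(primary_ip, old_sysid, timeout=300):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            conn = psycopg2.connect(host=primary_ip, port=5432, user='standby',
                                    connection_factory=PhysicalReplicationConnection,
                                    connect_timeout=3)
            try:
                with conn.cursor() as cur:
                    cur.execute('IDENTIFY_SYSTEM')
                    sysid = cur.fetchone()[0]  # systemid is returned as text
                    if sysid != old_sysid:     # primary already runs the new cluster
                        return True
            finally:
                conn.close()
        except psycopg2.Error:
            pass  # the primary may still be down while pg_upgrade runs
        time.sleep(1)
    return False
```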
* handle custom statistics target (speed up analyze); see the sketch below
* remove more incompatible objects (pg_stat_statements)
* truncate unlogged tables (should we do that?)
* update extensions after upgrade
* exclude pg_wal/* from rsync
* CHECKPOINT on replica before shutdown to make rsync time predictable
* Unpause when we know that Patroni on replicas was restarted
* run pg_upgrade --check after initdb
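For the custom statistics targets item (and step 4 of the main sequence), a rough sketch of how memorizing, resetting, and later restoring per-column targets could work. The helper names are hypothetical and per-database handling is omitted for brevity.

```python
# Sketch: memorize custom per-column statistics targets, reset them to the
# default so the post-upgrade analyze runs faster, and restore them afterwards.
FIND_CUSTOM_TARGETS = """
SELECT attrelid::regclass::text, quote_ident(attname), attstattarget
  FROM pg_attribute
 WHERE attstattarget > 0 AND NOT attisdropped
"""


def reset_custom_statistics_targets(cur):
    cur.execute(FIND_CUSTOM_TARGETS)
    saved = cur.fetchall()
    for table, column, _ in saved:
        # -1 means "use the default statistics target"
        cur.execute('ALTER TABLE {0} ALTER COLUMN {1} SET STATISTICS -1'
                    .format(table, column))
    return saved  # keep this around to restore after pg_upgrade


def restore_custom_statistics_targets(cur, saved):
    for table, column, target in saved:
        cur.execute('ALTER TABLE {0} ALTER COLUMN {1} SET STATISTICS {2}'
                    .format(table, column, target))
```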
* wal-e 1.1.1
* wal-g 0.2.17
* timescaledb 1.7.3
* refactor DCS configuration (close #468)
```python
backup = choose_backup(backup_list, recovery_target_time)
if backup:
    return backup, (name if value != old_value else None)
else:  # We assume that the LATEST backup will be for the biggest postgres version!
```
To what does the "biggest" refer in this comment? Does it mean the LATEST
backup has to be for PG v12 when spilo-13 is deployed?
If we don't have the major version of the source cluster specified explicitly, we try all postgres versions starting from the biggest. I.e., the `get_wale_environments()` function yields tuples:
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/12')
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/11')
- ('WALE_S3_PREFIX', 's3://$bucket/spilo/cluster-name/$uid/wal/10')

and so on. For every prefix we call `wal-e backup-list` and try to find a backup suitable for the given `recovery_target_time`. If the `recovery_target_time` is not specified we just pick the LATEST backup.

But! It might be that under the `s3://$bucket/spilo/cluster-name/$uid/wal/` path there are backups for 12 and, let's say, 10. The correct way of selecting the latest backup between two (or more) different versions would be listing backups for all versions and choosing between them. This is too much work for too few benefits. Therefore I made the assumption that if the backup for version 12 is there we don't continue with other versions, because most likely the backup for 10 would be older. Schematically, the selection boils down to the loop sketched below.
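A simplified sketch of that loop, assuming `get_wale_environments()` and `choose_backup()` as described above; `list_backups` is a hypothetical wrapper around `wal-e backup-list` added for illustration.

```python
# Simplified sketch of the backup-selection logic: walk the WAL-E prefixes
# from the newest postgres version down and stop at the first prefix that
# contains a usable backup.
def find_backup(get_wale_environments, list_backups, choose_backup,
                recovery_target_time=None):
    for name, prefix in get_wale_environments():  # .../wal/12, .../wal/11, ...
        backup_list = list_backups(prefix)        # calls `wal-e backup-list`
        backup = choose_backup(backup_list, recovery_target_time)
        if backup:
            # Newest version wins; older prefixes are never inspected.
            return prefix, backup
    return None, None
```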
👍

1 similar comment

👍
@CyberDem0n My story: I forgot to enforce a write lock on a minor version upgrade and ended up with corrupted indices. I had to resolve those corruptions manually, and this SO post [1] was very helpful.
In addition to that, integration tests were implemented. Mostly they test happy-case scenarios. The tests also cover a few unhappy cases, e.g. the in-place upgrade doesn't start if the pre-conditions are not met.