
[Experiment] Use incremental vacuum to speed up delete? #2800

Merged · 3 commits · Jun 29, 2023

Conversation

@chakflying (Collaborator) commented Feb 18, 2023

Description

Background

In SQLite, data is stored in pages. When data is deleted, the pages that stored it are marked as free in the "freelist" and can be reused when new data is added. This means that no actual delete happens on disk.
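For anyone who wants to see this on their own database, here is a minimal sketch using the better-sqlite3 driver (Uptime Kuma itself goes through its own database layer; the file name is a placeholder):

const Database = require("better-sqlite3");
const db = new Database("kuma.db");

// page_count is the size of the file in pages; freelist_count is how many
// of those pages are unused and waiting to be recycled.
console.log("total pages:", db.pragma("page_count", { simple: true }));
console.log("free pages:", db.pragma("freelist_count", { simple: true }));

// With the default auto_vacuum = NONE, a DELETE grows freelist_count but
// leaves page_count (and therefore the file size) unchanged.
db.prepare("DELETE FROM heartbeat WHERE monitor_id = ?").run(1);
console.log("free pages after delete:", db.pragma("freelist_count", { simple: true }));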

In the past, we found that the database size grew a lot with long-term use, and deleting data didn't free up disk space. auto_vacuum = FULL was added to solve this issue.

auto_vacuum

According to the SQLite documentation, when auto_vacuum is set to FULL, freelist pages are moved to the end of the file and truncated away on every transaction commit. If you think about it, this is terrible for our use case, since all our heartbeat data is inserted interleaved:

Heartbeat table
--------------------------------------------Time------------------------------------------->
|monitor1|monitor2|monitor3||monitor1|monitor2|monitor3||monitor1|monitor2|monitor3|monito..
--------------------------------------------Time------------------------------------------->

If monitor 1 is deleted, essentially all the pages in the table need to be shifted to reclaim the free space:

Heartbeat table
--------------------------------------------Time------------------------------------------->
|--------|monitor2|monitor3||--------|monitor2|monitor3||--------|monitor2|monitor3|-----..
--------------------------------------------Time------------------------------------------->
vacuum ->
--------------------------------------------Time------------------------------------------->
|monitor2|monitor3|monitor2||monitor3|monitor2|monitor3|..
--------------------------------------------Time------------------------------------------->

If auto_vacuum is set to INCREMENTAL, this behavior is disabled, and we can run a separate PRAGMA incremental_vacuum command to free up a limited number of pages at a time.
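As a sketch of the mechanics (better-sqlite3 syntax again; the PR itself applies this through Uptime Kuma's database layer):

const Database = require("better-sqlite3");
const db = new Database("kuma.db");

// Switching between FULL and INCREMENTAL can be done at any time; switching
// away from NONE only takes effect after a full VACUUM rewrites the file.
db.pragma("auto_vacuum = INCREMENTAL");

// Later, on our own schedule: return up to 200 freelist pages to the OS.
// Without an argument, incremental_vacuum frees the entire freelist.
db.pragma("incremental_vacuum(200)");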

Nightly cleanup

Currently the server deletes heartbeat data older than the retention period every night. Theoretically, auto_vacuum = FULL has no benefit here either. Assuming the monitor config is not changed, the server will produce roughly the same amount of data each day, so if we clean up the database every night, all the freelist pages can be reused the next day, and the database size should not grow past the size of one retention period of data. There is one caveat: if the retention period is decreased, the database size will not shrink correspondingly. But maybe triggering a VACUUM when the retention period changes could be a solution? (A sketch of this idea follows below.)
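A sketch of that last idea, with hypothetical names (onRetentionChanged is not a real hook in this PR):

// Run a one-off full VACUUM only when the user shrinks the retention period,
// so the file can actually get smaller; otherwise rely on freelist reuse.
function onRetentionChanged(db, oldDays, newDays) {
    if (newDays < oldDays) {
        db.exec("VACUUM"); // rewrites the whole file; expensive, but a rare user action
    }
}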

Experiment

However, I have not researched enough to see whether this is actually what happens on delete, whether the page alignment in the heartbeat table really causes all the pages to be moved like that, or whether it just leads to more fragmentation. I also don't have a big enough database to test this on, and my 4 MB database showed no noticeable improvement.

Anyone who is interested can test this PR to see if delete is faster. I have added a logging function to time how long the delete takes; it's set at the INFO level, so it might not show up by default.
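The timing itself is just a wall-clock measurement around the DELETE, along these lines (illustrative sketch, not the PR's exact code; db and cutoff are assumed to exist):

const start = Date.now();
db.prepare("DELETE FROM heartbeat WHERE time < ?").run(cutoff);
console.info(`Deleted old heartbeats in ${Date.now() - start} ms`);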

@chakflying chakflying force-pushed the experiment/incremental-vacuum branch from e8c09a3 to 7a9bbc3 on March 12, 2023 17:23
@chakflying chakflying mentioned this pull request Mar 12, 2023
@louislam louislam added this to the 1.22.0 milestone Mar 13, 2023
@louislam (Owner)

Sounds interesting. I will test it later. Added it to the 1.22.0 milestone first.

@campfred

Currently testing the experimental fix. I'll report back no later than this Friday on how it's going.

Instructions for how I applied it to the Uptime Kuma container, for anyone who may be interested:

  1. Download the raw server/database.js & server/server.js files on your host
    wget https://github.com/chakflying/uptime-kuma/raw/experiment/incremental-vacuum/server/database.js --output-document=server/database.js
    wget https://github.com/chakflying/uptime-kuma/raw/experiment/incremental-vacuum/server/server.js --output-document=server/server.js
  2. Mount them to your container with the destination paths being /app/server/database.js & /app/server/server.js respectively (make sure they have read+write permissions)
    docker run -d --restart=always -p 3001:3001 -v uptime-kuma:/app/data -v <PREVIOUSLY_DOWNLOADED_FILES_DIR>/server/database.js:/app/server/database.js -v <PREVIOUSLY_DOWNLOADED_FILES_DIR>/server/server.js:/app/server/server.js --name uptime-kuma louislam/uptime-kuma:1

Docker Compose file example:

services:
  uptime-kuma:
    [...]
    volumes:
      - uptime-kuma:/app/data:rw
      - <PREVIOUSLY_DOWNLOADED_FILES_DIR>/server/database.js:/app/server/database.js:rw
      - <PREVIOUSLY_DOWNLOADED_FILES_DIR>/server/server.js:/app/server/server.js:rw
    [...]

@chakflying chakflying force-pushed the experiment/incremental-vacuum branch from 7a9bbc3 to 099e610 on April 18, 2023 19:33
@campfred

Reporting back (late, but still)!

It does seem to have sped up my Uptime Kuma installation, but all the slowdowns/lock-ups still exist; they just happen later, unfortunately.
It still locks up easily when, for example, refreshing the page while it is pulling data for each monitor, i.e. when part of the monitor list still shows "N/A" as the uptime percentage. If authentication is disabled, the login page is shown for a moment before being redirected properly, and then an empty list is shown until the service catches up or is restarted.

I still consider it an improvement, as it has prevented my Uptime Kuma instance from getting so locked up that it couldn't send its monitoring pings to Healthchecks.io (a dead man's switch to check whether it's dead) in a timely manner.

@chakflying (Collaborator, Author)

Thanks for the feedback. I guess the benefit comes mainly from synchronous = NORMAL. We can't do much when even reads cannot keep up.
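For context, the pragma in question looks like this (sketch in better-sqlite3 syntax; pairing it with journal_mode = WAL is an assumption about the setup, since NORMAL is only a safe trade-off in WAL mode):

db.pragma("journal_mode = WAL");   // writers no longer block readers
// In WAL mode, synchronous = NORMAL risks losing only the most recent commits
// on power loss, not corrupting the database, while fsyncing far less often.
db.pragma("synchronous = NORMAL");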

@chakflying chakflying force-pushed the experiment/incremental-vacuum branch from 099e610 to 8080e27 on May 7, 2023 15:15
@louislam louislam modified the milestones: 1.22.0, 1.23.0 May 15, 2023
@CommanderStorm CommanderStorm mentioned this pull request Jun 20, 2023
@otbutz commented Jun 21, 2023

@chakflying could you add PRAGMA optimize; as part of the nightly cleanup?

@otbutz commented Jun 21, 2023

@chakflying I might have missed it, but you're never invoking PRAGMA incremental_vacuum;?

> When the value of auto-vacuum is 2 or "incremental" then the additional information needed to do auto-vacuuming is stored in the database file but auto-vacuuming does not occur automatically at each commit as it does with auto_vacuum=full. In incremental mode, the separate incremental_vacuum pragma must be invoked to cause the auto-vacuum to occur.

I still think we're better off disabling auto_vacuum entirely and running a full-blown VACUUM on monitor deletion (edit: and timer-based).
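A sketch of wiring those missing pieces into the nightly job (nightlyCleanup is a hypothetical name; the PR's actual hook differs):

function nightlyCleanup(db, retentionDays) {
    // Drop heartbeats older than the retention period.
    db.prepare("DELETE FROM heartbeat WHERE time < datetime('now', ?)")
      .run(`-${retentionDays} days`);
    db.pragma("incremental_vacuum"); // actually return freelist pages to the OS
    db.pragma("optimize");           // refresh query-planner statistics
}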

@otbutz commented Jun 21, 2023

The missing executions of optimize and vacuum would also explain the slowdowns observed by @campfred. The database slowly becomes more fragmented, and the query planner has to work with outdated statistics on top of that.

@chakflying chakflying marked this pull request as draft June 25, 2023 21:03
@chakflying chakflying force-pushed the experiment/incremental-vacuum branch from 8080e27 to 101af82 on June 25, 2023 21:03
@louislam (Owner)

I want to start testing it during my development, so I'm merging it first.

@louislam louislam marked this pull request as ready for review June 29, 2023 14:40
@louislam louislam merged commit a386f1f into louislam:master Jun 29, 2023
@chakflying chakflying deleted the experiment/incremental-vacuum branch July 18, 2023 06:12
@Saibamen (Contributor)

@louislam: Any news after testing?

@louislam (Owner)

> @louislam: Any news after testing?

I didn't see any negative effects, so I will keep this.

@otbutz commented Aug 17, 2023

Our uptime-kuma instance with ~80 monitors and 365 days of history feels a lot snappier now. SQLite DB size for reference: 4800 MB.
