Append mode: Do not try to delete objects that can't exist in middle #2006

joto · 2023-07-17T08:13:30Z

When osm2pgsql runs in append mode, it deletes from the middle tables every object for which it gets a new version, and then adds the new version. For a typical diff, many of these deletes are unnecessary because the objects are new. With this commit the behaviour changes slightly: we first get the maximum id from the nodes/ways/relations middle tables. This operation is fast because the PostgreSQL max() function can use the btree index on those tables. Later, before we delete an object, we check its id against that maximum id. If it is larger, the object can't be in the table and we skip the delete.

(Note that in theory we could use the fact that an object has version number 1 to figure out that it must be new. But this is much less robust than what we are doing here, for instance when the diff overlaps with the original import.)
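The check can be sketched roughly as follows. This is not the osm2pgsql implementation (which is C++ against PostgreSQL); it is a minimal Python illustration that uses an in-memory SQLite table as a stand-in. The table name `nodes` and the helpers `load_max_id` and `apply_diff` are hypothetical, and the sketch assumes each id appears at most once in the diff.

```python
import sqlite3


def load_max_id(conn, table):
    """Return the largest id currently stored in `table`, or 0 if it is empty."""
    # `table` comes from a fixed name in this sketch, never from user input.
    row = conn.execute(f"SELECT max(id) FROM {table}").fetchone()
    return row[0] if row[0] is not None else 0


def apply_diff(conn, table, diff, max_id):
    """Store new object versions; delete the old row only if it can exist.

    `diff` is a list of (id, version) pairs. Returns the number of DELETE
    statements that were actually issued.
    """
    deletes_issued = 0
    for osm_id, _version in diff:
        # An id above the maximum seen before the update cannot be in the
        # table, so the delete is skipped for it.
        if osm_id <= max_id:
            conn.execute(f"DELETE FROM {table} WHERE id = ?", (osm_id,))
            deletes_issued += 1
        conn.execute(f"INSERT INTO {table} (id) VALUES (?)", (osm_id,))
    return deletes_issued


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO nodes (id) VALUES (?)", [(1,), (2,), (3,)])

# Node 3 arrives with version 1 although the import already contains it,
# i.e. the diff overlaps with the original import.
diff = [(2, 5), (3, 1), (4, 1), (5, 1)]

max_id = load_max_id(conn, "nodes")
deletes_issued = apply_diff(conn, "nodes", diff, max_id)
stored_ids = [r[0] for r in conn.execute("SELECT id FROM nodes ORDER BY id")]

# Objects a "version == 1 means new" heuristic would wrongly treat as new.
misjudged_by_version = [i for i, v in diff if v == 1 and i <= max_id]
```

In this sketch only nodes 2 and 3 trigger a delete, while nodes 4 and 5 are inserted directly. Node 3 is the overlap case: a version-based check would skip its delete and the insert would then collide with the existing row, whereas the id comparison still handles it.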

The performance improvement is not measurable for small (minutely) diffs; for large diffs it is about 10%.

joto force-pushed the check-max-id-before-delete branch from d276b6e to c885922 July 17, 2023 09:12

lonvia merged commit 4c35c03 into osm2pgsql-dev:master Jul 20, 2023
27 checks passed

joto deleted the check-max-id-before-delete branch July 30, 2023 18:13
