New ram middle #1461
Conversation
Comments for first two commits. I'll do the rest later.
src/options.hpp
{
    bool full_nodes = false;
    bool full_ways = false;
    bool full_relations = false;
Could you please add documentation here. The variable names are not self-explanatory.
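A documented version of these flags might look like the following sketch. The struct name and the comments are assumptions about the intended semantics inferred from the PR description, not the author's actual documentation:

```cpp
// Hypothetical name for the options struct; the real struct in
// src/options.hpp may differ.
struct middle_ram_options
{
    // Store complete node objects (tags and attributes),
    // not just node locations.
    bool full_nodes = false;

    // Store complete way objects (tags and attributes),
    // not just the way node ids.
    bool full_ways = false;

    // Store complete relation objects (tags and attributes),
    // not just the relation member ids.
    bool full_relations = false;
};
```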
src/ordered-index.hpp
index.reserve(block_size);
}

bool full() const noexcept { return index.size() == index.capacity(); }
Is this guaranteed to work, i.e. is a vector allowed to pre-emptively enlarge its capacity before it is full?
https://en.cppreference.com/w/cpp/container/vector says iterators are only invalidated by, for example, push_back() or resize() if the vector changed capacity. Documenting it this way only makes sense if we can know whether a vector changed capacity. So I think we are good here.
I have added a new PR #1464 which contains the first two commits of this one with added docs. Once that is through, I'll reissue this PR here.
This is a very memory-efficient storage which will be used for the new ram middle.
Replaces the somewhat dated middle_ram_t by a completely new implementation for importing small to medium sized files into a non-updateable database. It works completely in memory; no data is written to disk. The following traits of OSM objects can be stored. All are optional:
- Node locations for building geometries of ways.
- Way node ids for building geometries of relations based on ways.
- Tags and attributes for nodes, ways, and/or relations for full 2-stage-processing support.
- Attributes for untagged nodes.
New version of this PR with only the last two commits, slightly updated and rebased.
That's so much more readable.
This implements a completely new ram (non-slim) middle that is compatible with the old one but much more memory-efficient. The first three commits build some infrastructure for it, the last contains the new middle.
The new middle usually stores only the node locations and way nodes which are needed to build way and relation geometries. If the output uses two-stage processing, it can now tell the middle, and complete way objects (including tags and attributes) are then stored as well. The new middle code has provisions for storing node and relation objects, too, but these are not used yet, because two-stage processing does not use them yet.
When not using two-stage processing, the memory requirements are much smaller than with the old ram middle. As a rule of thumb, you'll need about 1GB plus 2.5 times the size of the PBF file as memory. This makes it possible to import even continent-sized data on reasonably sized machines.
When using two-stage processing, the memory requirements are larger than without it. Currently OSM objects are simply stored in the libosmium internal format, which is designed for quick access, not for saving memory. There could be considerable space savings with a better implementation, but considering that two-stage processing is seldom used and you can always fall back to the slim middle, improving this has been left for a later time.