Move index creation inside CREATE TABLE for massive database creation speedup #273
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hi!
We're in the process of migrating to cockroachdb. We found that naively creating dev databases using sqlalchemy and cockroachdb was dramatically slower than postgres (~20 minutes vs ~30 seconds). We narrowed most of this time down to
CREATE INDEX
statements.Talking with @data-matt he suggested doing index creation inside
CREATE TABLE
rather than as separate statements. This makes a huge difference, bringing our overall db initialisation time down to ~2m30.So I've experimented with getting sqlalchemy to do it this way. The only approach I could find is to have visit_create_table do the index creation and visit_create_index be a no-op. I've got some prototype code here which works for how we use sqlalchemy at Wave.
I don't know sqlalchemy's internals very well so I'm kind of uncertain about my approach, in particular:
Do you have any ideas/thoughts?
Then my other question: if this works, how do we release it? Presumably this is a breaking change, so do we need to put it behind some kind of config flag?