Optimize insert query loading #15404

Merged
merged 2 commits into from
Jun 23, 2017
@@ -118,21 +118,35 @@ def create_records!(inventory_collection, all_attribute_keys, batch, attributes_

     return if hashes.blank?

-    ActiveRecord::Base.connection.execute(
+    result = ActiveRecord::Base.connection.execute(
       build_insert_query(inventory_collection, all_attribute_keys, hashes)
     )
     if inventory_collection.dependees.present?
       # We need to get primary keys of the created objects, but only if there are dependees that would use them
-      map_ids_to_inventory_objects(inventory_collection, indexed_inventory_objects, hashes)
+      map_ids_to_inventory_objects(inventory_collection, indexed_inventory_objects, all_attribute_keys, hashes, result)
     end
   end

-  def map_ids_to_inventory_objects(inventory_collection, indexed_inventory_objects, hashes)
-    inventory_collection.model_class.where(
-      build_multi_selection_query(inventory_collection, hashes)
-    ).select(inventory_collection.unique_index_columns + [:id]).each do |inserted_record|
-      inventory_object = indexed_inventory_objects[inventory_collection.unique_index_columns.map { |x| inserted_record.public_send(x) }]
-      inventory_object.id = inserted_record.id if inventory_object
+  def map_ids_to_inventory_objects(inventory_collection, indexed_inventory_objects, all_attribute_keys, hashes, result)
+    # The remote_data_timestamp is adding a WHERE condition to ON CONFLICT UPDATE. As a result, the RETURNING
+    # clause is not guaranteed to return all ids of the inserted/updated records in the result. In that case
+    # we test if the number of results matches the expected batch size. Then if the counts do not match, the only
+    # safe option is to query all the data from the DB, using the unique_indexes. The batch size will also not match
+    # for every remainders(a last batch in a stream of batches)
+    if !supports_remote_data_timestamp?(all_attribute_keys) || result.count == batch_size
Member commented:

@Ladas is it possible the count could be the same as the batch size but still not return the "correct" ids?

Contributor Author replied:

It should not be; we check for uniqueness in at least two places.

+      result.each do |inserted_record|
+        key = inventory_collection.unique_index_columns.map { |x| inserted_record[x.to_s] }
+        inventory_object = indexed_inventory_objects[key]
+        inventory_object.id = inserted_record["id"] if inventory_object
+      end
+    else
+      inventory_collection.model_class.where(
+        build_multi_selection_query(inventory_collection, hashes)
+      ).select(inventory_collection.unique_index_columns + [:id]).each do |inserted_record|
+        key = inventory_collection.unique_index_columns.map { |x| inserted_record.public_send(x) }
+        inventory_object = indexed_inventory_objects[key]
+        inventory_object.id = inserted_record.id if inventory_object
+      end
     end
   end
 end
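
The new branch in `map_ids_to_inventory_objects` can be read as: trust the rows from `RETURNING` only when no conditional update is involved or when the row count equals the expected batch size; otherwise fall back to a query by unique index. Below is a minimal, self-contained Ruby sketch of that decision. `InventoryObject`, `assign_ids_from_returning`, the `conditional_update:` flag, and the sample columns are hypothetical names chosen for illustration, not part of the PR, and plain hashes stand in for the database result.

```ruby
# Hypothetical stand-in for an inventory object; only `id` matters here.
InventoryObject = Struct.new(:name, :id)

# Assigns ids from RETURNING-style rows to objects indexed by their unique key.
# Returns true when the rows were used, false when the caller should fall back
# to a SELECT by unique index (the fallback itself is not shown).
def assign_ids_from_returning(rows, unique_index_columns, indexed_objects, expected_count, conditional_update:)
  # With a conditional ON CONFLICT ... WHERE, skipped rows are missing from
  # RETURNING, so a short result cannot be trusted to cover the whole batch.
  return false if conditional_update && rows.count != expected_count

  rows.each do |row|
    # Rows are keyed by column name as strings, as in the PR's result.each loop.
    key = unique_index_columns.map { |column| row[column.to_s] }
    object = indexed_objects[key]
    object.id = row["id"] if object
  end
  true
end

vm_a = InventoryObject.new("vm-a")
vm_b = InventoryObject.new("vm-b")
indexed = { ["ems-1", "vm-a"] => vm_a, ["ems-1", "vm-b"] => vm_b }
columns = %i[ems_ref name]

full = [
  { "id" => 10, "ems_ref" => "ems-1", "name" => "vm-a" },
  { "id" => 11, "ems_ref" => "ems-1", "name" => "vm-b" }
]
complete = assign_ids_from_returning(full, columns, indexed, 2, conditional_update: true)
```

Note that the PR compares against `batch_size` rather than the number of hashes in the current batch, which is why its comment says the last, shorter batch in a stream always takes the fallback path.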
@@ -38,6 +38,13 @@ def build_insert_query(inventory_collection, all_attribute_keys, hashes)
         WHERE EXCLUDED.remote_data_timestamp IS NULL OR (EXCLUDED.remote_data_timestamp > #{table_name}.remote_data_timestamp)
       }
     end
+
+    if inventory_collection.dependees.present?
+      insert_query += %{
+        RETURNING id,#{inventory_collection.unique_index_columns.join(",")}
+      }
+    end
+
     insert_query
   end
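
In PostgreSQL, `RETURNING` on an `INSERT ... ON CONFLICT DO UPDATE` yields only the rows that were actually inserted or updated, so a `WHERE` on the update (such as the `remote_data_timestamp` check above) can make the result shorter than the batch. The following is a small sketch of how the clause is appended only when ids are needed. The helper name `with_returning`, the `ids_needed:` flag, and the `vms` table with its columns are assumptions made for illustration and do not appear in the PR.

```ruby
# Hypothetical helper: appends a RETURNING clause only when ids are needed,
# mirroring the `dependees.present?` guard in the diff.
def with_returning(insert_query, unique_index_columns, ids_needed:)
  return insert_query unless ids_needed

  # Return the primary key plus the unique index columns, so each returned row
  # can be matched back to the in-memory object it belongs to.
  "#{insert_query} RETURNING id,#{unique_index_columns.join(',')}"
end

# Illustrative query against an assumed `vms` table.
base = "INSERT INTO vms (ems_ref, name) VALUES ('ems-1', 'vm-a') " \
       "ON CONFLICT (ems_ref) DO UPDATE SET name = EXCLUDED.name"

puts with_returning(base, %i[ems_ref], ids_needed: true)
```

Skipping `RETURNING` when nothing depends on the ids keeps the result set empty for collections that do not need it.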
