Unifying a way of indexing our tables that are related to Provider entities #13806

Ladas · 2017-02-08T09:10:21Z

We need to start using unified way of indexing as a first step towards DB unique indexes. Each of our table should have:

1.1) manager_ref -> an identifier inside of the provider, does not have to be unique since this can be a complex key, like stack resource will have [:manager_ref, :stack_id], where manager ref of the resource is unique only under a stack. So main usecase for this one is for provider actions, like update a resource. If the manager_ref is missing, that is because we created a fake model on our side.

1.2) manager_ref_uuid/manager_uuid -> this would be a unique key under a manager, that we will build in the refresh to be unique. So e.g. "#{stack_manager_ref}__#{resource_manager_ref}"

1.3) manager_id -> a foreign key for the Manager

But that should be probably only for first class citizen tables, so not hardware, Disk, etc.?

Then unique key in the DB would be [:manager_id, :manager_ref_uuid]. And e.g. Disk unique key would be [:hardware_id, :device_name] ?

Then for disconnected elements we would have [nil, :manager_ref_uuid]? There could be some issues then, when disconnecting.

durandom · 2017-02-08T09:47:03Z

Manager vs EMS

In the beginning there was ExtManagementSystems - short EMS.
And most objects have and ems_id - which is the rails internal id for the ems that the object is linked to via belongs_to :ems

And we saw it was good.

But then came the managers. Now every manager inherits from ExtManagementSystem. And technically they live in the same db table. But we are shifting the wording from EMS to Manager, hence we want to move from ems_id to manager_id

Refs

Then there was ems_ref - which is used to identify an object with its external provider id. Which can be anything - it's used by the manager to identify the object e.g. during refresh or operations. But as there are managers now, we want to shift towards manager_ref

durandom · 2017-02-08T09:47:56Z

Cant we just assume and require that a manager_ref should always be uniq in the scope of a manager_id? Wouldnt that take away the pain of having 1.2 manager_ref_uuid?

Ladas · 2017-02-08T16:35:23Z

@durandom right, let me try to explain what I mean with manager_ref vs manager_uuid

In the ansible example a unique key in the db [:configuration_script_source_id, :name]

but since we do STI, some other provider could come (happens in cloud) and say: "But my unique key is [:configuration_script_source_id, :name, :version]"

doing this change in the db would affect all the other providers

so what if we said unique key is [:manager_uuid] or [:configuration_script_source_id, :manager_uuid]. Then the purpose of manager_uuid would be to be unique under the given scope and you can build it from any columns to achieve that.

But then, you would still have unchanged key of the provider in :manager_ref. So when obtaining playbook you don't need to do configuration_script_source_ref, playbook_name = playbook.manager_ref.split("__")

but you can just use playbook.configuration_script_source.manager_ref and playbook.manager_ref or even playbook.name and playbook.version

I've experienced that any changes to manager_ref, leads to a pain when talking back to provider. :-) Also, we have cases where we do fake objects, because we need one structure of models, if we show them in 'generic' lists, etc. Then would be nice to have :manager_ref blank, as an easy way of saying "it's fake, so you can't do any action on it" -> "This object is being managed by XY, you cannot modify it separately"

But the alternative is to use complex unique key like [:configuration_script_source_id, :name] and let other providers to 'fake it to uniqueness' it needed :-)

Ladas · 2017-05-11T13:16:33Z

@durandom @agrare @Fryguy

So talking to Adam, we might need to scope the unique index to provider, so e.g. [provider_id, ems_ref] to set an allowed area of disconnect/reconnect. Where the reconnect logic would be done using ON CONFLICT UPDATE, using our unique constraint. So we could never duplicate it under a provider. Thoughts? (Not sure how to tackle the fact that some managers can actually have duplicate data then though)

But nevertheless the [ems_id, ems_ref] is probably not usable, because we disconnect things.

So I am thinking we should have:

:ems_ref (or :name, or combination of keys): to be able to identify the object in the Provider API. It will be blank if it's our fake object
:ems_id(:manager_id in new tables): current manager owning it, nil if disconnected
:provider_id: a provider of the related managers, never nil
:manager_uuid (provider_uuid): the unique identifier under the provider, that we will construct to be unique for every table that is filled via the refresh

unique index [:provider_id, :provider_uuid]

then ON CONFLICT UPDATE on the unique index should take care of the reconnecting, but if the provider_uuid is build badly, the managers under could be stealing their data. So here is the place where the graph DB would shine, since it would just create edges/relations to the resource from many managers, instead of just stealing it.

and the other option would be to do:
[:ems_id, :provider_uuid] and not doing disconnect, but rather soft delete, e.g. using :deleted? attr and scoping all relations. Then the reconnect under 1 manager would be solved by always setting :deleted => false in parser and ON CONFLICT UPDATE would take care of that. And a cross manager disconnect/reconnect would be done as a special case of looking to other managers to take their resources.

durandom · 2017-05-12T07:10:10Z

sounds reasonable, though I have to admit I dont fully understand it.
I'm a big fan of small step iterations.

So wouldnt it be a good first start to

add ems_ref to every table in our DB and force the collector to provide it?
if the relation to a provider/ems/manager isnt enough to scope ems_ref: prepend ems_id to it during save_inventory

Once we have that in place we can tackle the problems you describe.
But maybe I'm missing something crucial that makes such a stepped approach not possible

agrare · 2017-05-15T13:36:13Z

:provider_id: a provider of the related managers, never nil

This sounds like a workaround for the fact that we denote disconnected inventory by deleting the :ems_id

The last option you have (soft delete) sounds like the best

:manager_uuid (provider_uuid)

How about :provider_uid since this is explicitly not "universally unique" but unique within the provider?

miq-bot · 2017-11-27T05:00:22Z

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

Fryguy · 2017-11-27T16:58:35Z

@Ladas @agrare Can this be closed or is this still planned?

JPrause · 2019-01-23T16:28:45Z

@Ladas is this still a valid issue. If not can you close.
If there's no update by next week, I'll be closing this issue.

Ladas · 2019-01-23T20:13:55Z

solved in topology inventory, miq will use the same at some point

Ladas mentioned this issue Feb 8, 2017

Ansible refresh inventory #13807

Merged

blomquisg assigned Ladas Feb 9, 2017

miq-bot added the stale label Nov 27, 2017

Ladas closed this as completed Jan 23, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unifying a way of indexing our tables that are related to Provider entities #13806

Unifying a way of indexing our tables that are related to Provider entities #13806

Ladas commented Feb 8, 2017 •

edited

Loading

durandom commented Feb 8, 2017

durandom commented Feb 8, 2017

Ladas commented Feb 8, 2017 •

edited

Loading

Ladas commented May 11, 2017 •

edited

Loading

durandom commented May 12, 2017

agrare commented May 15, 2017

miq-bot commented Nov 27, 2017

Fryguy commented Nov 27, 2017

JPrause commented Jan 23, 2019

Ladas commented Jan 23, 2019

Unifying a way of indexing our tables that are related to Provider entities #13806

Unifying a way of indexing our tables that are related to Provider entities #13806

Comments

Ladas commented Feb 8, 2017 • edited Loading

durandom commented Feb 8, 2017

Manager vs EMS

Refs

durandom commented Feb 8, 2017

Ladas commented Feb 8, 2017 • edited Loading

Ladas commented May 11, 2017 • edited Loading

durandom commented May 12, 2017

agrare commented May 15, 2017

miq-bot commented Nov 27, 2017

Fryguy commented Nov 27, 2017

JPrause commented Jan 23, 2019

Ladas commented Jan 23, 2019

Ladas commented Feb 8, 2017 •

edited

Loading

Ladas commented Feb 8, 2017 •

edited

Loading

Ladas commented May 11, 2017 •

edited

Loading