Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unifying a way of indexing our tables that are related to Provider entities #13806

Closed
Ladas opened this issue Feb 8, 2017 · 10 comments
Closed
Assignees
Labels

Comments

@Ladas
Copy link
Contributor

Ladas commented Feb 8, 2017

We need to start using unified way of indexing as a first step towards DB unique indexes. Each of our table should have:


1.1) manager_ref -> an identifier inside of the provider, does not have to be unique since this can be a complex key, like stack resource will have [:manager_ref, :stack_id], where manager ref of the resource is unique only under a stack. So main usecase for this one is for provider actions, like update a resource. If the manager_ref is missing, that is because we created a fake model on our side.

1.2) manager_ref_uuid/manager_uuid -> this would be a unique key under a manager, that we will build in the refresh to be unique. So e.g. "#{stack_manager_ref}__#{resource_manager_ref}"

1.3) manager_id -> a foreign key for the Manager


  1. But that should be probably only for first class citizen tables, so not hardware, Disk, etc.?

  1. Then unique key in the DB would be [:manager_id, :manager_ref_uuid]. And e.g. Disk unique key would be [:hardware_id, :device_name] ?

Then for disconnected elements we would have [nil, :manager_ref_uuid]? There could be some issues then, when disconnecting.

@durandom
Copy link
Member

durandom commented Feb 8, 2017

Manager vs EMS

In the beginning there was ExtManagementSystems - short EMS.
And most objects have and ems_id - which is the rails internal id for the ems that the object is linked to via belongs_to :ems

And we saw it was good.

But then came the managers. Now every manager inherits from ExtManagementSystem. And technically they live in the same db table. But we are shifting the wording from EMS to Manager, hence we want to move from ems_id to manager_id

Refs

Then there was ems_ref - which is used to identify an object with its external provider id. Which can be anything - it's used by the manager to identify the object e.g. during refresh or operations. But as there are managers now, we want to shift towards manager_ref

@durandom
Copy link
Member

durandom commented Feb 8, 2017

Cant we just assume and require that a manager_ref should always be uniq in the scope of a manager_id? Wouldnt that take away the pain of having 1.2 manager_ref_uuid?

@Ladas
Copy link
Contributor Author

Ladas commented Feb 8, 2017

@durandom right, let me try to explain what I mean with manager_ref vs manager_uuid

In the ansible example a unique key in the db [:configuration_script_source_id, :name]

but since we do STI, some other provider could come (happens in cloud) and say: "But my unique key is [:configuration_script_source_id, :name, :version]"

doing this change in the db would affect all the other providers

so what if we said unique key is [:manager_uuid] or [:configuration_script_source_id, :manager_uuid]. Then the purpose of manager_uuid would be to be unique under the given scope and you can build it from any columns to achieve that.

But then, you would still have unchanged key of the provider in :manager_ref. So when obtaining playbook you don't need to do configuration_script_source_ref, playbook_name = playbook.manager_ref.split("__")

but you can just use playbook.configuration_script_source.manager_ref and playbook.manager_ref or even playbook.name and playbook.version

I've experienced that any changes to manager_ref, leads to a pain when talking back to provider. :-) Also, we have cases where we do fake objects, because we need one structure of models, if we show them in 'generic' lists, etc. Then would be nice to have :manager_ref blank, as an easy way of saying "it's fake, so you can't do any action on it" -> "This object is being managed by XY, you cannot modify it separately"


But the alternative is to use complex unique key like [:configuration_script_source_id, :name] and let other providers to 'fake it to uniqueness' it needed :-)

@Ladas
Copy link
Contributor Author

Ladas commented May 11, 2017

@durandom @agrare @Fryguy

So talking to Adam, we might need to scope the unique index to provider, so e.g. [provider_id, ems_ref] to set an allowed area of disconnect/reconnect. Where the reconnect logic would be done using ON CONFLICT UPDATE, using our unique constraint. So we could never duplicate it under a provider. Thoughts? (Not sure how to tackle the fact that some managers can actually have duplicate data then though)

But nevertheless the [ems_id, ems_ref] is probably not usable, because we disconnect things.

So I am thinking we should have:

  • :ems_ref (or :name, or combination of keys): to be able to identify the object in the Provider API. It will be blank if it's our fake object
  • :ems_id(:manager_id in new tables): current manager owning it, nil if disconnected
  • :provider_id: a provider of the related managers, never nil
  • :manager_uuid (provider_uuid): the unique identifier under the provider, that we will construct to be unique for every table that is filled via the refresh

unique index [:provider_id, :provider_uuid]


then ON CONFLICT UPDATE on the unique index should take care of the reconnecting, but if the provider_uuid is build badly, the managers under could be stealing their data. So here is the place where the graph DB would shine, since it would just create edges/relations to the resource from many managers, instead of just stealing it.


and the other option would be to do:
[:ems_id, :provider_uuid] and not doing disconnect, but rather soft delete, e.g. using :deleted? attr and scoping all relations. Then the reconnect under 1 manager would be solved by always setting :deleted => false in parser and ON CONFLICT UPDATE would take care of that. And a cross manager disconnect/reconnect would be done as a special case of looking to other managers to take their resources.

@durandom
Copy link
Member

sounds reasonable, though I have to admit I dont fully understand it.
I'm a big fan of small step iterations.

So wouldnt it be a good first start to

  • add ems_ref to every table in our DB and force the collector to provide it?
  • if the relation to a provider/ems/manager isnt enough to scope ems_ref: prepend ems_id to it during save_inventory

Once we have that in place we can tackle the problems you describe.
But maybe I'm missing something crucial that makes such a stepped approach not possible

@agrare
Copy link
Member

agrare commented May 15, 2017

:provider_id: a provider of the related managers, never nil

This sounds like a workaround for the fact that we denote disconnected inventory by deleting the :ems_id

The last option you have (soft delete) sounds like the best

:manager_uuid (provider_uuid)

How about :provider_uid since this is explicitly not "universally unique" but unique within the provider?

@miq-bot
Copy link
Member

miq-bot commented Nov 27, 2017

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

@miq-bot miq-bot added the stale label Nov 27, 2017
@Fryguy
Copy link
Member

Fryguy commented Nov 27, 2017

@Ladas @agrare Can this be closed or is this still planned?

@JPrause
Copy link
Member

JPrause commented Jan 23, 2019

@Ladas is this still a valid issue. If not can you close.
If there's no update by next week, I'll be closing this issue.

@Ladas
Copy link
Contributor Author

Ladas commented Jan 23, 2019

solved in topology inventory, miq will use the same at some point

@Ladas Ladas closed this as completed Jan 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants