Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Create a migration function for datasource to add migrationVersion field #6022

Closed
Tracked by #5752
yibow98 opened this issue Mar 4, 2024 · 1 comment
Closed
Tracked by #5752
Assignees
Labels
bug Something isn't working multiple datasource multiple datasource project RFC Substantial changes or new features that require community input to garner consensus. v2.13.0

Comments

@yibow98
Copy link
Contributor

yibow98 commented Mar 4, 2024

Background

When create a datasource, it does not have migrationVersion causes the migration cannot be triggered.

Currently, the datasource does not have migration function.(reference) When creating the datasource, the migrationVersion will not be created and this variable will be marked as undefined.
So with the below migration logic, the migration result will depend on doc.migrationVersion. If doc.migrationVersion is undefined, then it will always be considered to up to date and does not need to apply migration for these data.

/**
 * Creates a function which migrates and validates any document that is passed to it.
 */
function buildDocumentTransform({
  migrations,
}: {
  opensearchDashboardsVersion: string;
  migrations: ActiveMigrations;
}): TransformFn {
  return function transformAndValidate(doc: SavedObjectUnsanitizedDoc) {
    const result = doc.migrationVersion
      ? applyMigrations(doc, migrations)
      : markAsUpToDate(doc, migrations);

    // In order to keep tests a bit more stable, we won't
    // tack on an empy migrationVersion to docs that have
    // no migrations defined.
    if (_.isEmpty(result.migrationVersion)) {
      delete result.migrationVersion;
    }

    return result;
  };
}

Potential Soluion

  1. Find the root cause and fix it in OSD core.
    1. Pros: will permanently resolve the issue
    2. Cons:
      1. Potential risk is high: the document migrator logic is pretty complicated with long history, it is the common used document migrator for all types of save objects. We may need an extra 4 to 5 weeks for the investigation, design and code fix.
      2. Time consuming
  2. [Preferred] Find the workaround which create a dummy migration function for the datasource to ensure the migrationVersion is not undefined when creating a datasource.
    1. Pros:
      1. easy to implement, we can directly move forward to next step
    2. Cons:
      1. Will not fix the issue from the root.

Design

For now, I will use approach 2 to create a dummy migration function for giving the datasource a migrationVersion. And we need to think about two design related questions:

Q1: which version will we use for the dummy migration function

A: I will use 2.4.0 as the migrationVersion since this is the version we introduced the datasource

Q2: how do we implement the migration function:

A:

  1. Similar to other saved object, it will be implemented in https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/src/plugins/data_source/server/saved_objects/data_source.ts and we can first create a function called migrateDataSource. This is a dummy migration function that returns the document without any changes since we currently do not have any schema change.
  2. Add a migrations configuration that contains migration functions keyed by version. The '2.4.0' migration uses the flow function from lodash to apply the migrateDataSource function, indicating this migration should be applied when upgrading to version 2.4.0

Test for the changes

  1. Start the OSD locally and create a datasource (without migration function)
  2. Find the datasource id and then call GET API to check the datasource information

GET API request:

curl -X GET 'http://localhost:5601/api/saved_objects/data-source/d31cd080-da94-11ee-9155-b95eb77f43e8‘curl -X GET 'http://localhost:5601/api/saved_objects/data-source/d31cd080-da94-11ee-9155-b95eb77f43e8‘

Output is:

{
  "id": "d31cd080-da94-11ee-9155-b95eb77f43e8",
  "type": "data-source",
  "namespaces": [],
  "updated_at": "2024-03-05T02:05:22.696Z",
  "version": "WzIsMV0=",
  "attributes": {
    "title": "testDummyMigration",
    "description": "",
    "endpoint": "https://yibow.com",
    "auth": {
      "type": "no_auth"
    }
  },
  "references": []
}

We can see there is no migrationVersion included in the output.

  1. Add the migration function and restart the OSD, then check the datasource information
{
  "id": "d31cd080-da94-11ee-9155-b95eb77f43e8",
  "type": "data-source",
  "namespaces": [],
  "updated_at": "2024-03-05T02:05:22.696Z",
  "version": "WzIsMV0=",
  "attributes": {
    "title": "testDummyMigration",
    "description": "",
    "endpoint": "https://yibow.com",
    "auth": {
      "type": "no_auth"
    }
  },
  "references": [],
  "migrationVersion": {
    "data-source": "2.4.0"
  }
}

We can see the migrationVersion is successfully added.

@yibow98 yibow98 added bug Something isn't working untriaged labels Mar 4, 2024
@yibow98
Copy link
Contributor Author

yibow98 commented Mar 4, 2024

I can work on the fix for this issue.

@BionIT BionIT added v2.13.0 and removed untriaged labels Mar 4, 2024
@BionIT BionIT added the multiple datasource multiple datasource project label Mar 4, 2024
@yibow98 yibow98 changed the title [BUG] Creating a datasource does not contain the "migrationVersion" field [RFC] Creating a datasource does not contain the "migrationVersion" field Mar 5, 2024
@yibow98 yibow98 changed the title [RFC] Creating a datasource does not contain the "migrationVersion" field [RFC] Create a migration function for datasource to add migrationVersion field Mar 5, 2024
@seraphjiang seraphjiang added the RFC Substantial changes or new features that require community input to garner consensus. label Mar 5, 2024
@kavilla kavilla closed this as completed in 70adcc9 Mar 5, 2024
opensearch-trigger-bot bot pushed a commit that referenced this issue Mar 5, 2024
…eld (#6025)

This PR is to add a migration function with version 2.4.0 for datasource to add a migrationVersion field.
For more information, please refer to the RFC: #6022

Issues Resolved
#6022

Signed-off-by: Yibo Wang <yibow@amazon.com>
(cherry picked from commit 70adcc9)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working multiple datasource multiple datasource project RFC Substantial changes or new features that require community input to garner consensus. v2.13.0
Projects
None yet
Development

No branches or pull requests

3 participants