Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NEAT-237, NEAT-262] 🫠Adding Svein Harald Spreadsheets #459

Merged
merged 25 commits into from
May 21, 2024

Conversation

doctrino
Copy link
Collaborator

@doctrino doctrino commented May 19, 2024

Discovered that these sheets were missing in addition to a set of bugs.

Loading Svein Harald's sheet turned out to completely break the validation of DMS, so had to do NEAT-237 as well.

  • Before this PR, when you dump with last you do not get the Metadata sheet. I have changed this to include it, as I think it shows the linage of the model, and it is also clearer in the validation which is performed on the last model.

Realize that there are several inconsistencies in the code wrt to the metadata properties. Suggest the following:

DataModelType this determines how we validate against the reference sheets.

  1. enterprise. No validation is performed against the reference model, it can be there but is assumed to be external to CDF and cannot be validated against
  2. solution. There must be a Reference model.

SchemaCompleteness This determines the overall consistency checks for the Data Model. This is connected to the Last sheets.

  1. complete Last sheets are ignored. All classes/views/containers should be in the user sheets.
  2. extended Last sheet must be present, and it assumed that user+last = complete. With the additional validation determined by the extension field.
  3. partial: No consistency checks are performed on the model as a whole.

ExtensionType:

  1. Rebuild: Everything is allowed
  2. Reshape: Containers cannot change.
  3. Additition: Containers and views cannot change'

Note New solution model should set their schema to complete. The schema as a whole will still include the reference model.

Copy link

github-actions bot commented May 19, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
20310 13852 68% 60% 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
cognite/neat/rules/analysis/_information_rules.py 79% 🟢
cognite/neat/rules/exporters/_rules2excel.py 87% 🟢
cognite/neat/rules/importers/_dms2rules.py 80% 🟢
cognite/neat/rules/importers/_spreadsheet2rules.py 75% 🟢
cognite/neat/rules/models/dms/_converter.py 93% 🟢
cognite/neat/rules/models/dms/_exporter.py 89% 🟢
cognite/neat/rules/models/dms/_rules.py 94% 🟢
cognite/neat/rules/models/dms/_schema.py 86% 🟢
cognite/neat/rules/models/dms/_serializer.py 97% 🟢
cognite/neat/rules/models/dms/_validation.py 97% 🟢
cognite/neat/rules/models/information/_rules.py 85% 🟢
cognite/neat/rules/models/information/_rules_input.py 96% 🟢
cognite/neat/workflows/steps/lib/current/rules_importer.py 28% 🟢
TOTAL 83% 🟢

updated for commit: 0189804 by action🐍

@@ -143,6 +144,7 @@ def _create_metadata_from_model(
updated = now
return DMSMetadata(
schema_=SchemaCompleteness.complete,
data_model_type=DataModelType.solution if has_reference else DataModelType.enterprise,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This keeps thing simple

@@ -111,17 +111,27 @@ def __init__(
self.required = required
self.metadata = metadata
self._sheet_prefix = sheet_prefix
self._seen_files: set[Path] = set()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for possibility to have partial/modular rules?

def sheet_names(self, role: RoleTypes) -> set[str]:
names = MANDATORY_SHEETS_BY_ROLE[role]
return {f"{self._sheet_prefix}{sheet_name}" for sheet_name in names if sheet_name != "Metadata"}

def read(self, filepath: Path) -> None | ReadResult:
with pd.ExcelFile(filepath) as excel_file:
self._seen_files.add(filepath)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you expect that this is support use case when you split Metadata, Classes, ... into individual files, but not use case where we have portion of data model (and all underlaying sheets) in multiple files.
Maybe tech debt task?

if self.metadata.schema_ is SchemaCompleteness.partial:
return self.issue_list
dms_schema = self.rules.as_schema()
self.issue_list.extend(dms_schema.validate())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this former _validate_schema?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since it turned into a one liner, I thought it was overkill to have a function for it.

@@ -213,14 +220,7 @@ def _validate_extension(self) -> None:
)
)

def _validate_performance(self) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to _validate_filter, a bit misleading name as what we are checking is if filter is applied to one too many containers for the same view

self._consistent_container_properties()

self._referenced_views_and_containers_are_existing()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do any check if SchemaCompleteness is partial ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method will break if .last is missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this:

        if not self.rules.last:
            raise ValueError("The schema is set to 'extended', but no last rules are provided to validate against")

self._consistent_container_properties()

self._referenced_views_and_containers_are_existing()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method will break if .last is missing?

@doctrino
Copy link
Collaborator Author

Issue with GitHub Action, ran tests locally and merging to avoid being blocked.

@doctrino doctrino merged commit 15c0f03 into main May 21, 2024
6 of 7 checks passed
@doctrino doctrino deleted the prep-extending-solution-model branch May 21, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants