Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: Pydantic types #13

Merged
merged 11 commits into from
Feb 16, 2024
Merged

FEAT: Pydantic types #13

merged 11 commits into from
Feb 16, 2024

Conversation

carlwilson
Copy link
Member

@carlwilson carlwilson commented Jan 10, 2024

  • converted struct types to Pydantic; and
  • let the validator handle package unpacking;
  • completed addition of type hints to methods;
  • output of CLI now mostly JSON;
  • terminate processing of bad input files quickly; and
  • removed unused imports.- replaced pprint with print for now.

- created new module for API model types `eark_validator.model` that contains types for:
  - `Level` (requirements level);
  - `PackageDetails`;
  - `Severity` (of test result);
  - `StructResults` (for structural validation results);
  - `StructStatus` (for structural validation status);
  - `TestResult` (for validation test results); and
  - `ValidationReport` (for final validation report);
- refactored structural validation to use new types;
- small fix to schematron test for `CSIP11` in `mets_metsHdr_rules.xml`;
- new `eark_validator.packages` module for package validation;
- introduced a `PackageHandler` type to start abstraction of package parsing (replaces `STRUCT.ArchivePackageHandler`);
- introduced a `ValidationReport` type to handle final aggregation of validation results;
- added missing `__init__.py` files with appropriate commenting;
- removed some defunct types;
- removed unused imports;
- introduced light use of pydantic, better use to come; and
- unit test improvements and fixes.
- converted struct types to Pydantic; and
- fixed constructors where necessary.
- let the validator handle package unpacking;
- added empty list default for `Checksum`; and
- replaced `pprint` with `print` for now.
@carlwilson carlwilson self-assigned this Jan 10, 2024
@carlwilson carlwilson added the enhancement New feature or request label Jan 10, 2024
- added model types for Checkums and ChecksumAlg types;
- removed/refactored defunct Checksum types; and
- refactored tests to use new Checksum types.
- refactored model code to group related types in modules;
- better inclusing of model code in `__init__.py`;
- completed addition of type hints to methods;
- output of CLI now mostly JSON;
- terminate processing of bad input files quickly; and
- removed unused imports.
Copy link
Collaborator

@darrendignam darrendignam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Carl,

To be able to use the package I needed to explicitly set my venv to Python3.11 and I have detailed why in a previous comment.
Having made that change I still had an issue where the xsd resources are not in the installed package:

lxml.etree.XMLSchemaParseError: Failed to locate the main schema resource at "..."

If I manually copy the files into the correct library folder within venv folder then I can use the cli tool via "ip-check" without errors.

In previous python builds you could have these files built into the package by adding

include_package_data = True 

To the setup.cfg file. The process may have changes with the pyproject.toml build configuration file.

But once this has been addressed, I think this will be ready to merge into main branch.

Darren

from eark_validator.model.checksum import Checksum

class ManifestEntry(BaseModel):
path : Path | str
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not valid for Python3.9 so my initial attempts to use the package caused an exception due to this not being a supported use of the | operator.
Installing Python3.10 or above resolves this error. And while Python 3.10 is gaining greater support, Python3.9 is still quite prevalent and is the default Python3.x version on many Linux distributions.
So either a minimum version requirement should be mentioned in the README file, or we make this backwards compatible with Python3.9.

There are other instances of this type hinting usage of the | character, but this was the first for me to raise an exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with you, I think backward compatibility is more important here and 3.11 as mandatory is too restrictive. A build matrix with the appropriate versions here would pick this up automatically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm beginning to wonder, Python 3.9 is EOL in October 2025: https://devguide.python.org/versions/ and the code is fine with 3.10 and beyond: https://github.com/E-ARK-Software/eark-validator/actions/runs/7758461444

Thoughts welcome.

@carlwilson carlwilson changed the base branch from feat/struct-types to integration February 2, 2024 15:53
Copy link
Collaborator

@darrendignam darrendignam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the changes made in this pull request. It's clear that a significant amount of work has gone into this refactor, and I appreciate the effort to improve our code base.

The changes are consistent, the code is cleaner, and the overall structure seems more maintainable. I did not find any issues that would prevent these changes from being merged.

I'm approving these changes and looking forward to seeing them integrated into our main branch, and the continued improvements.

I agree that we should drop support for Python 3.9 and focus on 3.10 and above as the added improvements this brings, especially in light of the fact that 3.9 is heading towards EOL.

@carlwilson carlwilson merged commit da7c45e into integration Feb 16, 2024
4 checks passed
@carlwilson carlwilson deleted the feat/pydantic-types branch February 16, 2024 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants