Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not create hash conflicts #57

Merged
merged 10 commits into from
Sep 2, 2021
Merged

Do not create hash conflicts #57

merged 10 commits into from
Sep 2, 2021

Conversation

kzscisoft
Copy link
Collaborator

@kzscisoft kzscisoft commented Sep 1, 2021

@kzscisoft kzscisoft added the enhancement New feature or request label Sep 1, 2021
Copy link
Member

@richardreeve richardreeve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe this goes too far in the passive direction - we literally want it to do everything (set environment variable, create all of the necessary things(?) in the registry, create a working config and script) except run the script.

I thought it might be useful if it returned the path the config and script folder instead of running the script in fact, so that the output of fair run --ci could be directly used to find the files?

fair/run.py Outdated Show resolved Hide resolved
fair/run.py Outdated Show resolved Hide resolved
@kzscisoft
Copy link
Collaborator Author

I think maybe this goes too far in the passive direction - we literally want it to do everything (set environment variable, create all of the necessary things(?) in the registry, create a working config and script) except run the script.

I thought it might be useful if it returned the path the config and script folder instead of running the script in fact, so that the output of fair run --ci could be directly used to find the files?

It does exactly the same as @DennisReddyhoff solution. All it is doing is skipping execution of an actual bash command, the rest is the same.

@richardreeve
Copy link
Member

It does exactly the same as @DennisReddyhoff solution. All it is doing is skipping execution of an actual bash command, the rest is the same.

Great. Could we add in printing the folder path so that it can be picked up by any script calling it? That wasn't in the original solution, but I think it could be helpful to save investigating environment variables.

@kzscisoft
Copy link
Collaborator Author

It does exactly the same as @DennisReddyhoff solution. All it is doing is skipping execution of an actual bash command, the rest is the same.

Great. Could we add in printing the folder path so that it can be picked up by any script calling it? That wasn't in the original solution, but I think it could be helpful to save investigating environment variables.

Let's definitely discuss this later

@kzscisoft kzscisoft changed the title Fix logging and use CMD_MODE for passive run Fix logging and use CMD_MODE for passive run, do not create hash conflicts and do not create config outside of command run Sep 1, 2021
* rr/fixes:
  Revert bodge to get local uri.
  Linting README
  Fix docs
  Fix initialisation from file
  Remove description key from some other unnecessary locations in code
  Replace --glob with --global
@richardreeve
Copy link
Member

Okay. So I get this, which is great - with a config script of Sonia's that registers a few files:

(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair pull sonia-config1.yaml
Updating registry from config.yaml
(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair pull sonia-config1.yaml
Updating registry from config.yaml
Skipping item 'disease/sars_cov2/simple_model/parameters/static_params' as a hash matched entry is already present with this name
Skipping item 'disease/sars_cov2/simple_model/parameters/rts' as a hash matched entry is already present with this name
Skipping item 'disease/sars_cov2/simple_model/parameters/efoi' as a hash matched entry is already present with this name

However, when I call fair run it's less happy:

(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair run --ci sonia-config1.yaml
Updating registry from config.yaml
InternalError: Expected singular result for version '0.20210901.0'
(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair run --ci --debug sonia-config1.yaml
DEBUG:FAIRDataPipeline:Starting new session.
DEBUG:FAIRDataPipeline:Setting up command execution
Updating registry from config.yaml
DEBUG:FAIRDataPipeline.Run:Run called in passive mode, no command will be executed
DEBUG:FAIRDataPipeline.Run:Using job directory: /Users/richardr/.fair/data/jobs
DEBUG:FAIRDataPipeline.Run:Performing 'register' substitutions
Traceback (most recent call last):
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/bin/fair", line 5, in <module>
    cli()
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/richardr/tmp/FAIR-CLI/fair/cli.py", line 309, in run
    raise e
  File "/Users/richardr/tmp/FAIR-CLI/fair/cli.py", line 304, in run
    _hash = fair_session.run_job(script, mode=_run_mode)
  File "/Users/richardr/tmp/FAIR-CLI/fair/session.py", line 262, in run_job
    _hash = fdp_run.run_command(
  File "/Users/richardr/tmp/FAIR-CLI/fair/run.py", line 211, in run_command
    _conf_yaml = fdp_reg.subst_registrations(local_uri, _work_cfg)
  File "/Users/richardr/tmp/FAIR-CLI/fair/register.py", line 294, in subst_registrations
    raise fdp_exc.InternalError(
fair.exceptions.InternalError: Expected singular result for version '0.20210901.0'

I've put the config file itself on Zulip - here.

@richardreeve
Copy link
Member

@kzscisoft As well as erroring, fair run generates the wrong working config file. It just outputs the original register block I think:

register:
- alternate_identifier_type: simple_model_params
  description: Static parameters of the model
  external_object: disease/sars_cov2/simple_model/parameters/static_params
  file_type: csv
  namespace_full_name: Biomathematics and Statistics Scotland
  namespace_name: BIOSS
  namespace_website: https://www.bioss.ac.uk
  path: FAIRDataPipeline/SimpleModel/main/inst/extdata/static_params_SEInRD.csv
  primary: true
  release_date: 2021-09-01 16:20:26
  root: https://github.com/
  title: Static parameters of the model
  unique_name: Simple model parameters - Static parameters of the model
  version: 0.20210901.0
- alternate_identifier_type: simple_model_params
  description: Values of Rt at time t
  external_object: disease/sars_cov2/simple_model/parameters/rts
  file_type: csv
  namespace_full_name: Biomathematics and Statistics Scotland
  namespace_name: BIOSS
  namespace_website: https://www.bioss.ac.uk
  path: FAIRDataPipeline/SimpleModel/main/inst/extdata/Rt_beep.csv
  primary: true
  release_date: 2021-09-01 16:20:26
  root: https://github.com/
  title: Values of Rt at time t
  unique_name: Simple model parameters - Values of Rt at time t
  version: 0.20210901.0
- alternate_identifier_type: simple_model_params
  description: Effective force of infection at time t
  external_object: disease/sars_cov2/simple_model/parameters/efoi
  file_type: csv
  namespace_full_name: Biomathematics and Statistics Scotland
  namespace_name: BIOSS
  namespace_website: https://www.bioss.ac.uk
  path: FAIRDataPipeline/SimpleModel/main/inst/extdata/efoi_all_dates.csv
  primary: true
  release_date: 2021-09-01 16:20:26
  root: https://github.com/
  title: Effective force of infection at time t
  unique_name: Simple model parameters - Effective force of infection
  version: 0.20210901.0
run_metadata:
  default_input_namespace: richardr
  default_output_namespace: richardr
  description: Simple Model
  latest_commit: 6d34068e0032f4aad5b1969f1b5c7aba75bf60b1
  local_data_registry_url: http://localhost:8000/api/
  local_repo: /Users/richardr/Documents/git/julia/EcoSISTEM
  public: true
  remote_data_registry_url: https://data.scrc.uk/api/
  remote_repo: https://github.com/EcoJulia/EcoSISTEM.jl.git
  script: echo "Hello World!"
  write_data_store: /Users/richardr/.fair/data

* dev:
  Fix run --ci (PASS) mode to still generate the correct config and script files.
  Fixing output from `fair run --ci`.
  Print job directory in CI mode

# Conflicts:
#	fair/cli.py
@richardreeve richardreeve changed the title Fix logging and use CMD_MODE for passive run, do not create hash conflicts and do not create config outside of command run Do not create hash conflicts Sep 1, 2021
@richardreeve
Copy link
Member

richardreeve commented Sep 1, 2021

Okay, I've reconstructed and merged the original version of this PR, and then merged dev back in to reduce the size of this one. There may / are problems with the old PR, but this is a cleaner thing to check over, and then we can identify new bugs.

@sonarcloud
Copy link

sonarcloud bot commented Sep 2, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@richardreeve
Copy link
Member

Okay, all of the issues I can see are now fixed. There are still problems with uniquely identifying items in the registry, but there are no regressions as a result of this PR, so I think it can be merged.

@kzscisoft
Copy link
Collaborator Author

kzscisoft commented Sep 2, 2021

Okay. So I get this, which is great - with a config script of Sonia's that registers a few files:

(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair pull sonia-config1.yaml
Updating registry from config.yaml
(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair pull sonia-config1.yaml
Updating registry from config.yaml
Skipping item 'disease/sars_cov2/simple_model/parameters/static_params' as a hash matched entry is already present with this name
Skipping item 'disease/sars_cov2/simple_model/parameters/rts' as a hash matched entry is already present with this name
Skipping item 'disease/sars_cov2/simple_model/parameters/efoi' as a hash matched entry is already present with this name

However, when I call fair run it's less happy:

(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair run --ci sonia-config1.yaml
Updating registry from config.yaml
InternalError: Expected singular result for version '0.20210901.0'
(fair-6UDMIGpx-py3.8) (base) bleaberry[~/Documents/git/julia/EcoSISTEM] fair run --ci --debug sonia-config1.yaml
DEBUG:FAIRDataPipeline:Starting new session.
DEBUG:FAIRDataPipeline:Setting up command execution
Updating registry from config.yaml
DEBUG:FAIRDataPipeline.Run:Run called in passive mode, no command will be executed
DEBUG:FAIRDataPipeline.Run:Using job directory: /Users/richardr/.fair/data/jobs
DEBUG:FAIRDataPipeline.Run:Performing 'register' substitutions
Traceback (most recent call last):
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/bin/fair", line 5, in <module>
    cli()
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/richardr/Library/Caches/pypoetry/virtualenvs/fair-6UDMIGpx-py3.8/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/richardr/tmp/FAIR-CLI/fair/cli.py", line 309, in run
    raise e
  File "/Users/richardr/tmp/FAIR-CLI/fair/cli.py", line 304, in run
    _hash = fair_session.run_job(script, mode=_run_mode)
  File "/Users/richardr/tmp/FAIR-CLI/fair/session.py", line 262, in run_job
    _hash = fdp_run.run_command(
  File "/Users/richardr/tmp/FAIR-CLI/fair/run.py", line 211, in run_command
    _conf_yaml = fdp_reg.subst_registrations(local_uri, _work_cfg)
  File "/Users/richardr/tmp/FAIR-CLI/fair/register.py", line 294, in subst_registrations
    raise fdp_exc.InternalError(
fair.exceptions.InternalError: Expected singular result for version '0.20210901.0'

I've put the config file itself on Zulip - here.

I think suggests that the objects are not being registered correctly.

@richardreeve
Copy link
Member

I think I've fixed these problems now, but also note that there were some small issues with the registry itself, so you need to download a new copy - the latest is 1.0-rc4.

@richardreeve
Copy link
Member

@kzscisoft Are you checking this now, or are you happy for me to merge it?

@kzscisoft kzscisoft merged commit 7f22e64 into dev Sep 2, 2021
@kzscisoft kzscisoft deleted the kzscisoft/ci-run branch October 22, 2021 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fair pull fails if you try to register a file with a hash already in the registry
2 participants