Add pyiron based example #73

Merged
merged 9 commits into from Mar 14, 2024
Conversation

@jan-janssen (Contributor) commented Mar 12, 2024

This pull request implements the example workflow using the pyiron_base workflow manager, which is developed as part of the pyiron project. The workflow in the workflow.py file can be executed using:

python workflow.py

In addition, the pull request adds the corresponding GitHub Action and a short README that explains the usage of the example.
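
For illustration, a minimal sketch of what a single step defined with pyiron_base's wrap_executable (the method used in workflow.py, visible in the traceback further down) could look like. The job name, keyword arguments, and file names here are illustrative assumptions, not the actual contents of workflow.py:

    from pyiron_base import Project

    pr = Project("workflow")

    # First step: mesh the unit square with gmsh inside its own conda
    # environment. The keyword names and values below are assumptions
    # for illustration; the exact signature may differ.
    gmsh = pr.wrap_executable(
        job_name="gmsh",
        executable_str="gmsh -2 unit_square.geo -o square.msh",
        conda_environment_name="preprocessing",     # assumed env name
        input_file_lst=["source/unit_square.geo"],  # assumed input file
    )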

@jan-janssen reopened this Mar 12, 2024

@joergfunger (Member) left a comment

Thanks a lot for the merge request. The tests are running through (there is a problem with cwl, but that is unrelated to your merge request). There is one main question related to the connection of the workflow steps, in addition to some minor details.

  1. What is not clear to me is how the dependencies are defined. The inputs are specified for each workflow step, but not the outputs, so I am not sure how the execution graph of the workflow is built. This is also relevant in the example, where a hardcoded number is used instead of the output of a previous step (num_dofs).
  2. The current implementation seems to always execute the complete Python script with all steps, whereas most workflow tools build a graph and then execute this graph. This would also be important when building a provenance graph of an execution/run to follow all processing steps and the information that was transferred.
  3. This also relates to the question of avoiding the recomputation of already computed steps, e.g. when working only on the postprocessing. It seems that the complete workflow is re-executed here. Not all tools provide an option to avoid this, but if pyiron does, it would be nice to showcase that here. (In a realistic scenario we would often define the function in a workflow step, e.g. the postprocessing script, as an additional input and thus automatically trigger a recomputation of only that step and the subsequent ones, since that input file has changed.)
  4. That finally leads to a last question, which is not directly related to the merge request but a general pyiron question. We often assume that a workflow step is completely defined by its inputs and the function call, i.e. there are no hidden static variables that are set in one workflow step and then reused in subsequent steps. With file-based interfaces this is relatively easy to ensure, since there are no internal variables attached to a workflow step. How would that be ensured in a system where complete objects/jobs are stored in memory? Related to that: is there, in your opinion, a difference between a workflow system and a complete programming language?

@pdiercks (Collaborator) left a comment

Hi @jan-janssen,

thank you very much for the PR!

I added my review (mostly questions to clarify). I agree with @joergfunger that not explicitly defining the outputs/targets of each job is not ideal (and probably something that should be changed in a future version).

I was not able to run the pyiron workflow on my machine.
The error seems to be related to mpi4py and conda_subprocess.

pyiron_base ❯ python workflow.py 
The job gmsh was saved and received the ID: 1
The job meshio was saved and received the ID: 2
The job poisson was saved and received the ID: 3
2024-03-12 11:54:53,752 - pyiron_log - WARNING - Job aborted
2024-03-12 11:54:53,752 - pyiron_log - WARNING - ERROR: could not import mpi4py!
Traceback (most recent call last):
  File "/home/pdiercks/repos/nfdi4ing/pyiron_example/exemplary_workflow/pyiron/workflow/poisson_hdf5/poisson/poisson.py", line 6, in <module>
    import dolfin as df
  File "/home/pdiercks/mambaforge/envs/processing/lib/python3.9/site-packages/dolfin/__init__.py", line 142, in <module>
    from .fem.assembling import (assemble, assemble_system, assemble_multimesh,
  File "/home/pdiercks/mambaforge/envs/processing/lib/python3.9/site-packages/dolfin/fem/assembling.py", line 34, in <module>
    from dolfin.fem.form import Form
  File "/home/pdiercks/mambaforge/envs/processing/lib/python3.9/site-packages/dolfin/fem/form.py", line 12, in <module>
    from dolfin.jit.jit import dolfin_pc, ffc_jit
  File "/home/pdiercks/mambaforge/envs/processing/lib/python3.9/site-packages/dolfin/jit/jit.py", line 121, in <module>
    def compile_class(cpp_data, mpi_comm=MPI.comm_world):
RuntimeError: Error when importing mpi4py

Traceback (most recent call last):
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 665, in execute_job_with_external_executable
    out = conda_subprocess.run(
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/conda_subprocess/interface.py", line 175, in run
    raise CalledProcessError(
subprocess.CalledProcessError: Command '['/bin/bash', '/tmp/tmprzp6zlv0']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pdiercks/repos/nfdi4ing/pyiron_example/exemplary_workflow/pyiron/workflow.py", line 51, in <module>
    poisson = pr.wrap_executable(
              ^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/project/generic.py", line 436, in wrap_executable
    job.run()
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/utils/deprecate.py", line 171, in decorated
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/generic.py", line 763, in run
    self._run_if_new(debug=debug)
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/generic.py", line 1281, in _run_if_new
    run_job_with_status_initialized(job=self, debug=debug)
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 92, in run_job_with_status_initialized
    job.run()
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/utils/deprecate.py", line 171, in decorated
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/generic.py", line 765, in run
    self._run_if_created()
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/generic.py", line 1292, in _run_if_created
    return run_job_with_status_created(job=self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 115, in run_job_with_status_created
    job.run_static()
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/generic.py", line 799, in run_static
    execute_job_with_external_executable(job=self)
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 610, in wrapper
    func(job)
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 676, in execute_job_with_external_executable
    out, job_crashed = handle_failed_job(job=job, error=e)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdiercks/mambaforge/envs/pyiron_base/lib/python3.12/site-packages/pyiron_base/jobs/job/runfunction.py", line 732, in handle_failed_job
    raise RuntimeError("Job aborted")
RuntimeError: Job aborted

The conda environments were created successfully. Here, I would also suggest using a local prefix for the installation (e.g. conda env create --prefix ./env) instead of a named environment in the global conda installation (--name processing).

Best,
Philipp

jan-janssen and others added 2 commits March 12, 2024 07:44
Co-authored-by: Philipp Diercks <55233917+pdiercks@users.noreply.github.com>
@jan-janssen (Contributor, Author)

> I was not able to run the pyiron workflow on my machine. The error seems to be related to mpi4py and conda_subprocess.

I was not able to reproduce your error. Is it possible that you already had an environment named processing?

@jan-janssen (Contributor, Author)

@pdiercks and @joergfunger, thank you both for the careful review. I updated the pull request and tried to address the majority of the points individually. For the remaining points, here is a brief summary:

  • The workflow graph is currently not created explicitly. We are working on this functionality as part of the pyiron_workflow module.
  • The conda environments in pyiron are stored in the central conda installation. This is an error-prone solution, so I plan to add the functionality to store conda environments in the project directory: More flexible conda environments pyiron/pyiron_base#1371
  • The output of the shell tasks is not stored inside the pyiron object unless the output is redirected to a file.

I think the last two points can be addressed rather quickly (days to weeks), while the first point most likely still takes more time (weeks to months). So I guess my question is: what is the minimal amount of changes needed to merge this pull request?

@joergfunger (Member)

The changes are well addressed and in principle we could merge. Just a couple of remarks:

  1. The output is not yet specified. In the current setup, it is quite difficult for a user to know what files are produced by a job and thus how to use it correctly. In the future, this might also be relevant for knowing which files have to be collected when the job is executed on an external machine.
  2. The question of updates is not yet fully addressed. In case pyiron offers the option to only recompute things that have changed (similar to a makefile), it would be great to include that option.
  3. I had a challenge with an old version of mamba (I still had 1.1 installed, which did not provide the -y option) that was used to build the conda environments in the workflow. Maybe adding that as a dependency would help.

@pdiercks (Collaborator)

> > I was not able to run the pyiron workflow on my machine. The error seems to be related to mpi4py and conda_subprocess.
>
> I was not able to reproduce your error. Is it possible that you already had an environment named processing?

I will investigate, but it's probably a problem with my local setup rather than the workflow. Also, the actions / tests were passing before, so no need for changes on your end.

@pdiercks (Collaborator)

> @pdiercks and @joergfunger, thank you both for the careful review. I updated the pull request and tried to address the majority of the points individually. For the remaining points, here is a brief summary:
>
> * The workflow graph is currently not created explicitly. We are working on this functionality as part of the [pyiron_workflow](https://github.com/pyiron/pyiron_workflow) module.
> * The conda environments in pyiron are stored in the central conda installation. This is an error-prone solution, so I plan to add the functionality to store conda environments in the project directory: [More flexible conda environments pyiron/pyiron_base#1371](https://github.com/pyiron/pyiron_base/issues/1371)
> * The output of the shell tasks is not stored inside the pyiron object unless the output is redirected to a file.
>
> I think the last two points can be addressed rather quickly (days to weeks), while the first point most likely still takes more time (weeks to months). So I guess my question is: what is the minimal amount of changes needed to merge this pull request?

I would also be okay to merge. Thanks @jan-janssen for incorporating the changes and your explanations!

@jan-janssen (Contributor, Author)

> The changes are well addressed and in principle we could merge. Just a couple of remarks:
>
> 1. The output is not yet specified. In the current setup, it is quite difficult for a user to know what files are produced by a job and thus how to use it correctly. In the future, this might also be relevant for knowing which files have to be collected when the job is executed on an external machine.

As pyiron copies all files to the working directory of each job step, we basically define the files on the input level rather than the output level. For a remote setup this means we copy the working directory, including all the required files for a given task, via SSH to a remote computer and potentially even submit it to the scheduler there. Once the job receives run time, the task is executed, and afterwards the results are compressed into one tar archive. This tar archive is then copied back to the local computer, where the output can be reused in the following job steps. A rough sketch of that transfer pattern follows.
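
The snippet below is a hypothetical illustration of the transfer pattern just described, not pyiron's actual remote implementation. The host name, directory layout, and submission script are made up:

    import subprocess

    # Hypothetical illustration only; pyiron's internal remote mechanism
    # may differ. "user@hpc", "jobs/", and "run_queue.sh" are made up.
    remote = "user@hpc"
    workdir = "poisson"

    # Copy the prepared working directory to the remote machine.
    subprocess.run(["scp", "-r", workdir, f"{remote}:jobs/"], check=True)
    # Submit the job to the scheduler there.
    subprocess.run(["ssh", remote, f"cd jobs/{workdir} && sbatch run_queue.sh"], check=True)
    # ... after the job has finished on the remote side ...
    # Compress the results into one tar archive and fetch it back.
    subprocess.run(["ssh", remote, f"tar -czf {workdir}.tar.gz -C jobs {workdir}"], check=True)
    subprocess.run(["scp", f"{remote}:{workdir}.tar.gz", "."], check=True)
    subprocess.run(["tar", "-xzf", f"{workdir}.tar.gz"], check=True)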

> 2. The question of updates is not yet fully addressed. In case pyiron offers the option to only recompute things that have changed (similar to a makefile), it would be great to include that option.

Yes, in pyiron each job receives a unique ID, so if a job was already calculated successfully, it is simply reloaded and no additional execution is necessary. In case a calculation failed, the user has to manually remove the failed calculation before the workflow can be executed again. We added this blocker to remind users to look at their failed calculation and check whether it was just a random error or whether a modification of the input is necessary.
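
As a minimal sketch of that reload-or-remove pattern, assuming the pyiron_base Project methods load and remove_job; the job name and the status string are taken from the log output in this thread:

    from pyiron_base import Project

    pr = Project("workflow")

    # Reload a previously executed job instead of recomputing it; remove
    # it first if it aborted, so the workflow can be re-run. Whether
    # load() returns None or raises for a missing job may differ between
    # pyiron_base versions.
    job = pr.load("gmsh")
    if job is not None and job.status == "aborted":
        # Failed jobs block re-execution until removed manually.
        pr.remove_job("gmsh")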

> 3. I had a challenge with an old version of mamba (I still had 1.1 installed, which did not provide the -y option) that was used to build the conda environments in the workflow. Maybe adding that as a dependency would help.

Based on this feedback, we made mamba an optional dependency and rely on conda as the default option. The conda version is tracked via the conda_subprocess package. pyiron/pyiron_base#1370
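
For reference, conda_subprocess mirrors the subprocess-style run interface while executing the command inside a given conda environment (its run function and CalledProcessError behaviour are visible in the traceback above). A minimal sketch; the prefix_name keyword and env name are assumptions:

    import conda_subprocess

    # Run a command inside the "preprocessing" conda environment.
    # prefix_name and check are assumptions based on the package's
    # subprocess-style API; check its documentation for the exact
    # signature.
    conda_subprocess.run(
        ["gmsh", "--version"],
        prefix_name="preprocessing",
        check=True,
    )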

@joergfunger (Member) left a comment

Thanks, nice work. If you update pyiron, it would be great to integrate the updated versions here as well.

@joergfunger merged commit 652d7c3 into BAMresearch:main Mar 14, 2024
9 of 10 checks passed