[patch] restore forbidding duplicate imports #1453
So I guess for now this is the most pragmatic solution. I still do not perfectly understand what's happening, but it sounds reasonable to me to forbid importing jobs with an existing job name. I put the error in the importing function and not deeper in base, because I thought maybe it is not fundamentally forbidden to have jobs with the same name in pyiron? Maybe @jan-janssen can comment? One way or another, I think this is more of a temporary fix, and we need an overhaul in this PR. What do you think?
I actually wasn't sure what the behaviour would be here, so I checked. The executive summary is:
Demo:
from pyiron_atomistics import Project
pr = Project("test")
pr.remove_jobs(silently=True)
j1 = pr.create.job.Lammps("lmp")
j1.structure = pr.create.structure.bulk("Al")
j2 = pr.create.job.Lammps("lmp")
j2.structure = pr.create.structure.bulk("Cu")
print(j1.structure.get_chemical_formula(), j2.structure.get_chemical_formula())
# Al Cu
print(j1.name, j2.name)
# lmp lmp
print(pr.job_table().job)
# Series([], Name: job, dtype: object)
j1.run()
# Runs as expected
j2.run()
# Prints a warning and doesn't run `job exists already and therefore was not created!`
j3 = pr.create.job.Lammps("lmp")
# Silently (really, I set `warnings.simplefilter("always")`) reloads the job
print(j1.structure.get_chemical_formula(), j2.structure.get_chemical_formula(), j3.structure.get_chemical_formula())
# Al Cu Al
print(pr.job_table().job)
# 0 lmp; Name: job, dtype: object
Only the project+job column gives a unique name, as it's fine to create and run a job in a subproject:
sub_pr = pr.create_group("sub_pr")
k1 = sub_pr.create.job.Lammps("lmp")
k1.structure = sub_pr.create.structure.bulk("Ni")
k1.run()
print(pr.job_table().job)
# 0 lmp; 1 lmp; Name: job, dtype: object
IMO there's nothing faulty with any of this, although it is annoying to find out "late" that your job name is non-unique and won't run; it would be nicer to learn that at creation time. This is impossible in the current setup though, as the project is not truly aware of a job until after it has run. I've attacked this general topic in
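For illustration, the behaviour in the demo can be modelled with a toy registry keyed on the (project path, job name) pair. This is only a sketch under that assumption, not pyiron's implementation: `ToyProject`, its `run` method, and `job_names` are hypothetical names invented here.

```python
import warnings


class ToyProject:
    """Toy stand-in for a project; NOT pyiron's implementation."""

    def __init__(self, path, table=None):
        self.path = path
        # Table shared by a project and its subprojects: (project, job) -> payload
        self._table = {} if table is None else table

    def create_group(self, name):
        # A subproject has its own path but shares the same table
        return ToyProject(f"{self.path}/{name}", self._table)

    def run(self, job_name, payload):
        # The job only becomes known to the table when it runs
        key = (self.path, job_name)
        if key in self._table:
            warnings.warn("job exists already and therefore was not created!")
            return False
        self._table[key] = payload
        return True

    def job_names(self):
        return [job for (_, job) in self._table]


pr = ToyProject("test")
ran_al = pr.run("lmp", "Al")  # first "lmp" in "test": runs
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    ran_cu = pr.run("lmp", "Cu")  # same project and name: refused
ran_ni = pr.create_group("sub_pr").run("lmp", "Ni")  # same name, other project: runs
```

In this model the job name alone appears twice in the table, while the (project, job) pair stays unique, which mirrors the two `lmp` rows in the job table above.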
I am completely behind the idea of having a check here, but I don't think this does what you want and it needs tests to prove it's doing the desired thing.
Yep, so far that was more or less what I had expected. What I don't know is whether it can happen that there are jobs with the same name that have run. I super vaguely remember that this can happen when there are two child jobs which share the same name. In that case it's the same project and the same job name but with different parents, but as I said, I don't clearly remember it.
Oh yeah, I wasn't thinking about parent jobs. I don't intend to dig in and make an example for that though 😝
Agree with @liamhuber's diagnosis. Probably you want to join the `job` and `project` columns in the given `df` first, before making this check.
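A minimal sketch of that suggestion, assuming both tables are pandas DataFrames with string `project` and `job` columns (as in the job table shown in the demo). The helper name `find_duplicate_imports` and the example paths are hypothetical and not taken from the PR.

```python
import pandas as pd


def find_duplicate_imports(df_import, df_existing):
    """Return the rows of df_import whose project+job key already exists."""
    # Join the two columns into one key; assumes project paths end in "/"
    # and job names contain no "/", so the joined key is unambiguous.
    key_new = df_import["project"] + df_import["job"]
    key_old = df_existing["project"] + df_existing["job"]
    return df_import[key_new.isin(key_old)]


existing = pd.DataFrame(
    {"project": ["test/", "test/sub_pr/"], "job": ["lmp", "lmp"]}
)
incoming = pd.DataFrame(
    {"project": ["test/", "test/other/"], "job": ["lmp", "lmp"]}
)

# Check on the joined key: only the job in "test/" clashes
clashes = find_duplicate_imports(incoming, existing)

# Check on the job name alone: both incoming rows would be flagged
name_only = incoming[incoming["job"].isin(existing["job"])]
```

Here `clashes` contains only the `test/` row, whereas the name-only check would also reject the job going into `test/other/`, even though no job of that name exists in that project.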
I guess now I found the workaround and hopefully the tests will pass this time...
In the end, it boiled down to
Looks good to me. @samwaseda Can you merge this?
It's blocked because @liamhuber's approval is missing
This PR partially solves this problem, as it forbids importing a job whose name already exists in the project, but I am marking it as a draft because it does not clarify why the problem is occurring in the first place.