Checks for common errors / defaults #1129

scarlehoff · 2021-03-05T12:08:27Z

I've added a check for the mistake with sum_rules (being under parameters instead of under fitting).
When adding that check I realized a very similar mistake could happen with epochs (i.e., the user sets epochs in the wrong place, gets no error but runs the fit with the wrong number of epochs).

There might be other keywords for which one could make similar mistakes. With the publication of the code we want to make sure that not many of these pitfalls are left.

(I've set reviewers looking for brainstorming rather than for actual reviews)

Zaharid · 2021-03-05T12:52:50Z

I think the best fix would be to have a warning for unused keys in reportengine. But not sure if I am doing something about that any time soon.

wilsonmr · 2021-03-05T12:57:06Z

With all the nested dictionaries in the runcard dict, covering all bases of what could go wrong seems to be quite a lot of checks. However what about:

I wonder if the params dictionary could be properly parsed earlier on, rather than being the dictionary it currently is. It could still be parsed as a dictionary there would be known keys and at least then if there was an additonal "unknown" key present then it would be flagged and raise an error.
Probably we could check if there were duplicate keys in fitting, parameters and in the top level of the runcard and raise an error. I think it's reasonable that this should error because seeing as runcards get c+p a lot, we don't want to propagate useless flags. This would be a more general version of your epoch check. I had a quick scan I'm not sure if there are supposed to be any duplicate keys which actually do different things, but maybe I'm wrong.

Zaharid · 2021-03-05T13:03:16Z

Or just get rid of the dicts in the runcard. They are mostly useless (a bit tricky to do in a backwards compatible way though).

scarlehoff · 2021-03-05T13:04:55Z

With all the nested dictionaries in the runcard dict, covering all bases of what could go wrong seems to be quite a lot of checks. However what about:

There are a number of keys that need to be present in the right place, the problem are the ones that have been added later and (for compatibility with previous versions) take a default when they are not present. Maybe this is a mistake.

1. I wonder if the params dictionary could be properly parsed earlier on, rather than being the dictionary it currently is. It could still be parsed as a dictionary there would be known keys and at least then if there was an additonal "unknown" key present then it would be flagged and raise an error.

We want the hyperparameters to come as a dictionary so even if it is parsed early on it needs to be reconstructed later. Ok about the second point.

2. Probably we could check if there were duplicate keys in fitting, parameters and in the top level of the runcard and raise an error. I think it's reasonable that this should error because seeing as runcards get c+p a lot, we don't want to propagate useless flags. This would be a more general version of your `epoch` check. I had a quick scan I'm not sure if there are supposed to be any duplicate keys which actually do different things, but maybe I'm wrong.

I think if there are multiple parameters with the same name we should do something about them because it will be confusing for sure.
The check about the epochs however is in a different category because the problem is many (old) runcards have "epochs" in it because of nnfit. Maybe the (right) answer is "all those runcards should fail from now on and no duplicate epochs is allowed". I'd be happy with that.

RoyStegeman · 2021-03-05T13:05:29Z

Instead of checking for unused keys in the runcard, we could also remove all default values. Such that if a typo is made, or a parameter is defined in the wrong place, the fit will fail. Of course this leaves the issue with double keywords such as epochs, but for that something like what @wilsonmr proposes in his second item could help.

Zaharid · 2021-03-05T13:17:02Z

Having or not default is something that fluctuates. At some point there was to be no default at all but then that leaves the problem that it is difficult to write trivially correct runcards (for example now people routinely forget to add cfactors in the right place). So there isn't a simple answer. Other than that unused keys should warn or even fail, but then again that is easier said than done.

wilsonmr · 2021-03-05T13:18:52Z

btw the branch this points to is #1081 and that actually expands fitting as a namespace and so technically you could move everything in there outside and it should still work. It would probably break at the validphys level where we assume fitting always exists, but if that used the parsing tools properly I think it still would be fine. *EDIT and you would still need an empty namespace of fitting present which would be expanded..

We want the hyperparameters to come as a dictionary so even if it is parsed early on it needs to be reconstructed later. Ok about the second point.

This is fine, like I said you can keep it as a dict but it should have known keys. The parsing would simply be checking at "compile" time that everything was present and there were no superfluous keys. This would probably also make it easier to lookup what it should contain from scratch and the defaults would be filled here - they can even use the lockfile system if you want reproducibility in the future. It also makes it easier to see which defaults have been filled in this case.

Instead of checking for unused keys in the runcard, we could also remove all default values.

This is the other option. I am biased but I think defaults are fine as long as they are done in a sensible way, which is what the lockfiles are supposed to be addressing. As above, if we parse the runcard fully at the start and we know where the defaults are being filled in and what values they are taking then hopefully there is less unexpected behaviour.

"all those runcards should fail from now on and no duplicate epochs is allowed". I'd be happy with that.

yes this is fine for me, they are two separate codes and after @voisey kindly did #973 I think it's fine to stop supporting older ones.

wilsonmr · 2021-03-05T13:20:58Z

I would echo that removing defaults doesn't actually remove defaults it just means that defaults are filled by copying and pasting and either intentionally or accidentally not editing parts of the pasted runcard..

wilsonmr · 2021-03-05T13:24:41Z

btw the branch this points to is #1081 and that actually expands fitting as a namespace and so technically you could move everything in there outside and it should still work. It would probably break at the validphys level where we assume fitting always exists, but if that used the parsing tools properly I think it still would be fine. *EDIT and you would still need an empty namespace of fitting present which would be expanded..

After a grep, I see we actually don't rely on fitting in too many places, perhaps this could just go!

stuff like

fitting = fit.as_input()["fitting"]
    if fitting["fitbasis"] == basis

but I think this could be made backwards compatible

scarlehoff · 2021-03-05T13:26:24Z

btw the branch this points to is #1081 and that actually expands fitting as a namespace and so technically you could move everything in there outside and it should still work.

We could do that yes. Having "parameters" inside of "fitting" was something that made sense when we started and didn't want to be very different from the nnfit runcard. That structure maybe doesn't make much sense now...

Ok. Then:

We all agree that no duplicates should exist.
Defaults are a problem.

At least we can act on the first point I guess :P

scarlehoff · 2021-03-24T15:26:39Z

Closed in favour of issue #1166

scarlehoff added 2 commits March 5, 2021 12:34

check for common user-based errors

52fb7ad

remove epochs from basic runcard

76d4e93

scarlehoff requested review from Zaharid, voisey, tgiani, wilsonmr and RoyStegeman March 5, 2021 12:08

scarlehoff changed the base branch from master to n3fit-reader-action March 5, 2021 12:08

Base automatically changed from n3fit-reader-action to master March 10, 2021 15:44

scarlehoff mentioned this pull request Mar 24, 2021

Remove useless fit runcard namespaces #1166

Closed

scarlehoff closed this Mar 24, 2021

scarrazza deleted the check_n3fit_defaults branch August 27, 2021 21:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Checks for common errors / defaults #1129

Checks for common errors / defaults #1129

scarlehoff commented Mar 5, 2021

Zaharid commented Mar 5, 2021

wilsonmr commented Mar 5, 2021

Zaharid commented Mar 5, 2021

scarlehoff commented Mar 5, 2021

RoyStegeman commented Mar 5, 2021

Zaharid commented Mar 5, 2021

wilsonmr commented Mar 5, 2021 •

edited

Loading

wilsonmr commented Mar 5, 2021

wilsonmr commented Mar 5, 2021 •

edited

Loading

scarlehoff commented Mar 5, 2021

scarlehoff commented Mar 24, 2021

Checks for common errors / defaults #1129

Checks for common errors / defaults #1129

Conversation

scarlehoff commented Mar 5, 2021

Zaharid commented Mar 5, 2021

wilsonmr commented Mar 5, 2021

Zaharid commented Mar 5, 2021

scarlehoff commented Mar 5, 2021

RoyStegeman commented Mar 5, 2021

Zaharid commented Mar 5, 2021

wilsonmr commented Mar 5, 2021 • edited Loading

wilsonmr commented Mar 5, 2021

wilsonmr commented Mar 5, 2021 • edited Loading

scarlehoff commented Mar 5, 2021

scarlehoff commented Mar 24, 2021

wilsonmr commented Mar 5, 2021 •

edited

Loading

wilsonmr commented Mar 5, 2021 •

edited

Loading