Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uniform initialization interval #244

Open
nomisto opened this issue Nov 17, 2021 · 6 comments
Open

Uniform initialization interval #244

nomisto opened this issue Nov 17, 2021 · 6 comments
Assignees

Comments

@nomisto
Copy link

nomisto commented Nov 17, 2021

Hello,

First of all thanks for your work, very helpful framework!

I've got a quick question about hyperparameter search and the uniform initialization interval. In the configs used for the paper you set

    - name: lookup_embedder.initialize_args.uniform_.a
      type: range
      bounds: [-1.0, -0.00001]

and in the table regarding the hyperparameter search space in the paper you say:

Interval (Unif) [−1.0,1.0]

so I would assume that b = -1 * a, however in the codebase it says the opposite (a = -1 * b), ist this a bug or did I overlook something?

# Automatically set arg a (lower bound) for uniform_ if not given
if initialize == "uniform_" and "a" not in initialize_args:
initialize_args["a"] = initialize_args["b"] * -1
KgeBase._initialize(what, initialize, initialize_args)

@rufex2001
Copy link
Member

This is definitely a bug (whether in our configs or the code is a matter of interpretation). The idea was to have symmetric intervals, i.e. set a = -b, where a and b are lower and upper bound, respectively. The current code only sets the lower bound when it isn't given, meaning our search configs, which are setting the lower bound, are using the default value of the upper bound b = 1.0. The desired behavior can be obtained by setting "b" instead of "a" in the configs.

Codewise, I suggest a general solution that keeps the idea of using symmetric intervals. So, we'd set any non-given bound to be -1 times the given bound. We should use clearer names too, so lower_bound and upper_bound instead of torch's "a" and "b". Finally, we could decide to check that the given lower bound is indeed lower than the upper one, but torch let's that fly, so I suggest we do the same.

@rgemulla thoughts?

@rgemulla
Copy link
Member

I think low/high is better than a/b. It's perhaps cleanest to disable any automatic setting of non-given values (and use defaults 0/1).

@rufex2001
Copy link
Member

I now think the cleanest thing is to just get rid of the automatic setting of non-given values and leave the rest as is. Sure, using lower_bound/upper_bound would be clearer than torch's a and b, but it's nicer to leave a and b so the use of all initialization methods is consistent with torch's documentation.

@rgemulla if you agree, I'll drop the automatic settings. Otherwise, I'll add a line to rewrite the lower_bound/upper_bound keys as a and b to then pass it to torch

@rgemulla
Copy link
Member

rgemulla commented Dec 3, 2021

Yes, agreed. If only one is set, issue a warning with the actual bounds being used. If bounds are inconsistent, raise an error.

@rufex2001
Copy link
Member

rufex2001 commented Dec 3, 2021 via email

@rgemulla
Copy link
Member

rgemulla commented Dec 3, 2021

Fine, as long as it's signaled via an error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants