Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI #115

Merged
merged 27 commits into from
Oct 4, 2023
Merged

Fix CI #115

merged 27 commits into from
Oct 4, 2023

Conversation

simsurace
Copy link
Member

@simsurace simsurace commented Sep 20, 2023

This PR is to try to fix #114. I will start by re-testing on Julia 1.8, because the last green CI run was before 1.9 was released.

@simsurace
Copy link
Member Author

1.6 is expected to fail. But interesting that the old manifest + Julia 1.8 are needed to pass all tests.

Simone Carlo Surace added 2 commits September 22, 2023 01:19
@simsurace
Copy link
Member Author

Ok, adding some rrules was sufficient. They are probably not as fast as they could be and should be upstreamed to ChainRules.jl, but it should do for now.
Let's revert the debugging changes now.

@simsurace simsurace changed the title WIP: Fix CI Fix CI Sep 21, 2023
@simsurace simsurace marked this pull request as ready for review September 21, 2023 23:38
@coveralls
Copy link

coveralls commented Sep 22, 2023

Pull Request Test Coverage Report for Build 6403558960

  • 11 of 11 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+20.1%) to 88.008%

Totals Coverage Status
Change from base Build 6289961279: 20.1%
Covered Lines: 1365
Relevant Lines: 1551

💛 - Coveralls

@simsurace
Copy link
Member Author

simsurace commented Sep 22, 2023

Managed to fix it. There was an API problem with dtc from AbstractGPs 0.5.16->0.5.17 . I'm now using the new API, I think TemporalGPs should probably not be extending the old one (#116).
Will do some cleanup work later to bump compat and get rid of the deprecation warnings before merging, but I ask for a review to see if the changes are ok.

@simsurace
Copy link
Member Author

Unbelieveable, how did it break again?

@simsurace
Copy link
Member Author

simsurace commented Sep 22, 2023

Seems like I repeated the same "mistake" as Theo in this commit. Seems like fixing those deprecation warnings is not so easy. I clearly don't know how to fix all this AD stuff :)

@simsurace
Copy link
Member Author

The examples seem to take prohibitively long and hitting the timeout.

@willtebbutt
Copy link
Member

@simsurace modulo the tests timing out, this looks great.

Before I review properly, I'd like to know whether you would you be willing to upstream the kron rules to ChainRules.jl? This would avoid type-piracy here, and would ensure that everyone gets the benefit of them.

@simsurace
Copy link
Member Author

I meant to do that, but I haven't yet managed to write down rules that pass all tests.

@willtebbutt
Copy link
Member

Would it be helpful if I took a look at testing the code currently on this branch?

@simsurace
Copy link
Member Author

Thanks, I'll open a PR draft and link it here.

@simsurace
Copy link
Member Author

I think getting the ChainRules PR merged will take a while. To not leave TemporalGPs in a broken state, and to be able to release these fixes, it would probably be good to invent an internal method _kron and define rules on that in order to not commit type piracy.

@willtebbutt
Copy link
Member

Agreed. Looking at the Zygote implementation of kron I'm a bit confused why it is not working correctly here though. Do you have any intuition why it isn't?

Sorry to slow things down with additional questions, I just want to make sure that we've not missed anything.

@willtebbutt
Copy link
Member

I think it's just https://github.com/JuliaGaussianProcesses/TemporalGPs.jl/blob/master/bench/single_output_gps.jl ? In any case, it's really computing the logpdf and its gradient as a function of the kernel parameters and observations for the three Matern kernels that we've got implemented, and a variety of different numbers of observations (definitely uses RegularSpacing rather than arbitrary spacing of inputs).

@simsurace
Copy link
Member Author

Two-argument map strikes back. I think they can be replaced by broadcasts or comprehensions...

@simsurace
Copy link
Member Author

Taken together, these optimizations amount to a 2-3 orders of magnitude improvement in approx_space_time_learning for T = 1000.

@simsurace
Copy link
Member Author

Finally all green 🎉. I will now restore the original problem sizes for the space-time examples.

@willtebbutt, are you happy to merge this now?

@simsurace simsurace changed the title Fix CI (except examples) Fix CI Oct 3, 2023
@simsurace
Copy link
Member Author

Fixes #118

@simsurace simsurace linked an issue Oct 3, 2023 that may be closed by this pull request
3 tasks
@simsurace
Copy link
Member Author

I removed the "needs version bump" label because the version that is currently on master hasn't been released yet.

@simsurace
Copy link
Member Author

The SIGTERM in the last example is new.. I will investigate

@simsurace
Copy link
Member Author

Ok, it seems that the augmented_inference example is a bit flaky in CI. Locally it has not failed me. I don't know why it occasionally gets killed. @willtebbutt can we merge?

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sorting this. Yup, please feel to merge!

@simsurace simsurace merged commit f8b8302 into master Oct 4, 2023
12 of 13 checks passed
@simsurace simsurace deleted the fix-CI branch October 4, 2023 11:18
@simsurace simsurace restored the fix-CI branch October 4, 2023 11:18
@simsurace simsurace deleted the fix-CI branch October 4, 2023 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Examples are very slow Broken tests on master
3 participants