
Fix pytorch gradients #1450

Open · wants to merge 14 commits into master

Conversation

Simone-Bordoni
Contributor

Fix #1449

Checklist:

  • Reviewers confirm new code works as expected.
  • Tests are passing.
  • Coverage does not decrease.
  • Documentation is updated.

@Simone-Bordoni Simone-Bordoni added the bug Something isn't working label Sep 16, 2024
@Simone-Bordoni Simone-Bordoni self-assigned this Sep 16, 2024
@renatomello renatomello added this to the Qibo 0.2.13 milestone Sep 17, 2024
@BrunoLiegiBastonLiegi
Contributor

BrunoLiegiBastonLiegi commented Sep 17, 2024

I found a potential problem when you cast the parameters of parametrized gates:

return self.np.tensor(x, dtype=self.dtype, requires_grad=self.requires_grad)

Here you cast to self.dtype, which is usually torch.complex128, but it should actually be torch.float32/64 since this is a parameter. I don't know whether this is the origin of the problem with the gradients, though.

EDIT: nevermind, you necessarily have to cast to complex otherwise when you build the matrix elements pytorch complains about mismatching dtypes...
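A standalone PyTorch sketch of the dtype promotion described above (illustrative only, not the qibo casting code):

import torch

theta = torch.tensor(0.3, dtype=torch.float64, requires_grad=True)

cos = torch.cos(theta / 2)
sin = torch.sin(theta / 2)

# Mixing the real-valued entries with complex constants promotes the whole
# matrix to complex128, which is why the parameter ends up cast to a complex
# dtype before the gate matrix is assembled.
matrix = torch.stack([
    torch.stack([cos + 0j, -1j * sin]),
    torch.stack([-1j * sin, cos + 0j]),
])
print(matrix.dtype)  # torch.complex128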

@Simone-Bordoni
Contributor Author

Simone-Bordoni commented Sep 17, 2024

I found a potential problem when you cast the parameters of parametrized gates:

return self.np.tensor(x, dtype=self.dtype, requires_grad=self.requires_grad)

Here you cast to self.dtype, which is usually torch.complex128, but it should actually be torch.float32/64 since this is a parameter. I don't know whether this is the origin of the problem with the gradients, though.
EDIT: nevermind, you necessarily have to cast to complex otherwise when you build the matrix elements pytorch complains about mismatching dtypes...

Actually, this may be part of the problem. I am rewriting the whole casting function in a better way, differentiating the matrix_dtype from the parameters_dtype.
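A rough sketch of the split being described; the class and method names (matrix_dtype, parameter_dtype, cast_parameter, cast_matrix) are illustrative assumptions, not the actual qibo backend API:

import torch

class TorchCastingSketch:
    # Hypothetical names: separate dtypes for gate matrices and trainable parameters.
    matrix_dtype = torch.complex128    # gate matrices stay complex
    parameter_dtype = torch.float64    # variational parameters stay real

    def cast_parameter(self, x, requires_grad=True):
        # Keep the parameter as a real leaf tensor so optimizers can update it.
        return torch.tensor(x, dtype=self.parameter_dtype, requires_grad=requires_grad)

    def cast_matrix(self, x):
        # Promote to complex only when the gate matrix is actually built.
        return torch.as_tensor(x, dtype=self.matrix_dtype)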

@Simone-Bordoni
Contributor Author

Now the gradients are passing, however their final value is always zero. I have tried visualizing the computational graph (shown below) and all the operations seem to be performed correctly.
I am currently testing a simple circuit with just one rotation gate (test_torch_gradients.py).
In the next days I will not be able to work further on this issue; I will resume working on the gradients next week. In the meantime, if you find any possible reason for this problem, please let me know, it would be very helpful.
[computational graph of the loss]
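One way to produce such a graph (assuming a tool like torchviz; whether that is what was used here is not stated):

import torch
from torchviz import make_dot   # requires the torchviz package

theta = torch.tensor(0.3, dtype=torch.float64, requires_grad=True)
loss = torch.cos(theta / 2) ** 2               # stand-in for the circuit loss
make_dot(loss, params={"theta": theta}).render("loss_graph", format="png")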

@Simone-Bordoni
Contributor Author

Now the gradients are passing, however their final value is always zero. I have tried visualizing the computational graph (shown below) and all the operations seem to be performed correctly. I am currently testing a simple circuit with just one rotation gate (test_torch_gradients.py). In the next days I will not be able to work further on this issue; I will resume working on the gradients next week. In the meantime, if you find any possible reason for this problem, please let me know, it would be very helpful.

I found out that the problem was with the target state. Now the gradients are passing correctly;
as soon as possible I will clean up the code and it should be ready for a review.

@Simone-Bordoni Simone-Bordoni marked this pull request as ready for review September 24, 2024 13:47
@Simone-Bordoni
Contributor Author

I think that now everything has been fixed.
The changes required were bigger than expected: I had to add the backend in the gate decompositions and in other parts.
I have added a test to check the correct backpropagation in test_torch_gradients.py so that these tests can be easily moved to qiboml.
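A sketch of the kind of check mentioned above (the exact contents of test_torch_gradients.py may differ): optimize a single rotation angle with a torch optimizer and assert that a gradient reaches the parameter.

import torch
import qibo
from qibo import gates
from qibo.models import Circuit

qibo.set_backend("pytorch")

target = torch.tensor([0.0, 1.0], dtype=torch.complex128)          # target state |1>
theta = torch.nn.Parameter(torch.tensor(0.1, dtype=torch.float64))
optimizer = torch.optim.SGD([theta], lr=0.1)

for _ in range(50):
    optimizer.zero_grad()
    circuit = Circuit(1)
    circuit.add(gates.RY(0, theta=theta))
    state = circuit().state()
    fidelity = torch.abs(torch.sum(torch.conj(target) * state)) ** 2
    loss = 1 - fidelity
    loss.backward()
    assert theta.grad is not None      # gradients flow back to the parameter
    optimizer.step()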


codecov bot commented Sep 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.94%. Comparing base (c92f49c) to head (ad25c2c).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1450      +/-   ##
==========================================
- Coverage   99.94%   99.94%   -0.01%     
==========================================
  Files          81       81              
  Lines       11740    11715      -25     
==========================================
- Hits        11733    11708      -25     
  Misses          7        7              
Flag Coverage Δ
unittests 99.94% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@BrunoLiegiBastonLiegi
Contributor

Thanks @Simone-Bordoni, #1449 is still failing even with this patch, but I think that's more related to a problem with the circuit.set/get_parameters().

@Simone-Bordoni
Contributor Author

Thanks @Simone-Bordoni, #1449 is still failing even with this patch, but I think that's more related to a problem with the circuit.set/get_parameters().

In the new version, backend.cast has requires_grad=False by default; you need to modify #1449 and activate the gradients when casting.
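In code, a minimal sketch of the difference (assuming the requires_grad keyword described in this PR for cast):

import torch
import qibo

qibo.set_backend("pytorch")
backend = qibo.get_backend()

# Default after this PR: a constant tensor with no gradient tracking.
params = backend.cast([0.1, 0.2], dtype=torch.float64)

# Trainable parameters (as in the #1449 example) must request gradients at cast time.
params = backend.cast([0.1, 0.2], dtype=torch.float64, requires_grad=True)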

@BrunoLiegiBastonLiegi
Contributor

BrunoLiegiBastonLiegi commented Sep 25, 2024

Like this?

self.circuit_parameters = torch.nn.Parameter(BACKEND.cast(circuit.get_parameters(), dtype=torch.float64, requires_grad=True))

because this fails anyway.

EDIT: sorry, maybe I should have been more specific: it doesn't fail in the sense that the gradients don't pass; it actually crashes with an error:

RuntimeError: Output 0 of UnbindBackward0 is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.
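For context, a generic PyTorch reproduction of that class of error (not the qibo code): views returned by a multi-view op such as unbind() cannot be updated in place, and the usual fix is the out-of-place equivalent.

import torch

params = torch.nn.Parameter(torch.tensor([0.1, 0.2], dtype=torch.float64))
a, b = torch.unbind(params)    # a and b are views sharing `params` as their base

# a.mul_(2.0)                  # in-place update on such a view raises a RuntimeError
a = a * 2.0                    # out-of-place update keeps autograd happy
(a + b).sum().backward()       # gradients still flow back to `params`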

@Simone-Bordoni
Contributor Author

Like this?

self.circuit_parameters = torch.nn.Parameter(BACKEND.cast(circuit.get_parameters(), dtype=torch.float64, requires_grad=True))

because this fails anyway.

EDIT: sorry, maybe I should have been more specific: it doesn't fail in the sense that the gradients don't pass; it actually crashes with an error:

RuntimeError: Output 0 of UnbindBackward0 is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

OK, I thought that the problem was the gradient not passing. Is it a different error with respect to what you were getting before this update?

@BrunoLiegiBastonLiegi
Contributor

Yes, if I remember correctly, before it used to run without crashing but had the gradient flow problem.

Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PyTorchBackend backpropagation compatibility with torch.nn.Module and torch.optim
3 participants