Probabilities do not sum to 1 #517

Closed
mlazzarin opened this issue Nov 19, 2021 · 7 comments · Fixed by #528
Labels
bug Something isn't working

Comments

@mlazzarin
Contributor

I have a very interesting bug to report. If I run the following circuit with NQUBITS>=26 and TARGET>=25:

```python
import qibo
from qibo.models import Circuit
from qibo import gates
qibo.set_backend("numpy")
qibo.set_precision("single")

c = Circuit(NQUBITS)
for i in range(NQUBITS):
    c.add(gates.H(i))
output = c.add(gates.M(TARGET, collapse=True))
for i in range(NQUBITS):
    c.add(gates.H(i))
result = c()
```

I obtain the following error:


```
Traceback (most recent call last):
  File ".../circuit.py", line 15, in <module>
    result = c()
  File ".../qibo/abstractions/circuit.py", line 712, in __call__
    return self.execute(initial_state=initial_state, nshots=nshots)
  File ".../qibo/core/circuit.py", line 306, in execute
    state = self._device_execute(initial_state)
  File ".../qibo/core/circuit.py", line 235, in _device_execute
    state = self._execute(initial_state=initial_state)
  File ".../qibo/core/circuit.py", line 221, in _execute
    state = self._eager_execute(state)
  File ".../qibo/core/circuit.py", line 188, in _eager_execute
    state = gate(state)
  File ".../qibo/core/gates.py", line 302, in __call__
    return getattr(self, self._active_call)(state)
  File ".../qibo/core/gates.py", line 291, in _state_vector_call
    return K.state_vector_collapse(self, state, self.result.binary[-1])
  File ".../qibo/core/measurements.py", line 111, in binary
    self._binary = self._convert_to_binary()
  File ".../qibo/core/measurements.py", line 167, in _convert_to_binary
    return K.mod(K.right_shift(self.decimal[:, K.newaxis], _range), 2)
  File ".../qibo/core/measurements.py", line 102, in decimal
    self._decimal = self._sample_shots()
  File ".../qibo/core/measurements.py", line 187, in _sample_shots
    result = K.cpu_fallback(K.sample_shots, self.probabilities, self.nshots)
  File ".../qibo/backends/abstract.py", line 106, in cpu_fallback
    return func(*args)
  File ".../qibo/backends/numpy.py", line 232, in sample_shots
    return self.random.choice(range(len(probs)), size=nshots, p=probs)
  File "mtrand.pyx", line 933, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities do not sum to 1
```

Additional details:

  • The problem appears in single precision, with TARGET >= 25. If TARGET < 25, the problem disappears.
  • The probabilities sum to 0.5.

The main issue appears here:

```python
@check_measured_qubits
def probabilities(self, qubits=None, measurement_gate=None):
    unmeasured_qubits = tuple(i for i in range(self.nqubits)
                              if i not in qubits)
    state = K.reshape(K.square(K.abs(self.tensor)), self.nqubits * (2,))
    return K.sum(state, axis=unmeasured_qubits)
```

In the line that defines `state`, the tensor of probabilities is normalized, but after the call to `K.sum` we obtain a probability distribution which is not normalized.
I will investigate this issue further; in particular, I'm curious to see whether the same problem appears with non-collapse measurements.

@mlazzarin mlazzarin added the bug Something isn't working label Nov 19, 2021
@mlazzarin
Contributor Author

The problem involves all kinds of measurements, with very strange behavior:

```python
import qibo
from qibo.models import Circuit
from qibo import gates
qibo.set_backend("numpy")
qibo.set_precision("single")

NQUBITS = <insert something e.g. 28 or 30>
c = Circuit(NQUBITS)
for i in range(NQUBITS):
    c.add(gates.H(i))
c.add(gates.M(24 + K))
result = c(nshots=1000).frequencies()
```

If K < 0, the probabilities sum to 1, while if K >= 0 they sum to 2 ** (-K).
Notice that no error is raised in this case because #460 normalizes the probabilities.
We introduced that normalization to fix a tolerance issue, but it hides this problem; maybe we should revisit it.

Is anyone able to reproduce this error? Note that you need to inspect the probabilities directly or revert #460, otherwise no error is raised for standard measurements (though it still appears with collapse gates).
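To illustrate how a renormalization step can mask a denormalized distribution, here is a minimal sketch that needs only numpy; the vector and its values are made up for illustration, not taken from Qibo:

```python
import numpy as np

# A float32 "probability" vector that sums to 0.5 instead of 1,
# mimicking the denormalized distribution produced by the lossy sum.
probs = np.full(4, 0.125, dtype=np.float32)

rng = np.random.RandomState(0)
caught = None
try:
    rng.choice(len(probs), size=10, p=probs)
except ValueError as err:
    caught = err  # "probabilities do not sum to 1"
print(caught)

# Renormalizing silences the error without fixing the underlying
# precision loss that produced the wrong total in the first place.
normalized = probs / np.sum(probs)
samples = rng.choice(len(normalized), size=10, p=normalized)
print(samples)
```

This reproduces the `ValueError` from the traceback above, and shows why sampling succeeds once the probabilities are divided by their own sum.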

@mlazzarin
Contributor Author

I prepared a code snippet to isolate the problem. It doesn't even need qibo, only numpy.

```python
import numpy as np

NQUBITS = 26
QUBITS = tuple(range(NQUBITS))  # tuple with the qubit indices

# Prepare and normalize a state vector in single precision
state = np.ones(2**NQUBITS, dtype=np.complex64)
state = state / 2 ** (NQUBITS / 2)

# Compute the vector of probabilities in the computational basis
probs = np.square(np.abs(state))

# Verify that it is normalized
total = np.sum(probs)
print(f"Norm: {total}")

# Reshape the state vector to a tensor with rank NQUBITS
state = np.reshape(state, (2,) * NQUBITS)
probs = np.square(np.abs(state))

# Sum over all axes except the last one, in a single call
total = np.sum(probs, axis=QUBITS[0:25])
print(f"Sum in one step: {total}")

# Do the same in two steps
total = np.sum(probs, axis=QUBITS[0:13])
total = np.sum(total, axis=QUBITS[0:12])
print(f"Sum in two steps: {total}")
```

It looks like summing over the first 25 or more axes in a single call gives unexpected results.
I don't think this is the expected behavior; what do you think?
Let me know if you are able to reproduce these behaviors.
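The effect can be pinned down with a smaller sketch, assuming NumPy's documented behavior that pairwise summation is used only along the fast axis in memory: reducing over the slow axis adds one element at a time into a float32 accumulator, and once the running sum is 2**24 times larger than the addend, every further addition is rounded away.

```python
import numpy as np

# 2**25 copies of 2**-25 in each of two columns:
# every column sums to exactly 1.0 in exact arithmetic.
x = np.full((2**25, 2), 2**-25, dtype=np.float32)

# Reducing over the slow (strided) axis accumulates naively.
# Once the partial sum reaches 0.5, the addend 2**-25 is only
# half an ulp of float32 and round-to-even discards it.
naive = np.sum(x, axis=0)
print(naive)   # stalls at 0.5 per column

# With no axis argument NumPy uses pairwise summation, which is
# exact for these power-of-two values.
total = x.sum()
print(total)   # 2.0
```

The stall at exactly half the true value matches the "probabilities sum to 0.5" observation above.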

@stavros11
Member

> It looks like that summing over the first 25 or more axis in one step gives unexpected results. I don't think it's the expected behavior, what do you think? Let me know if you are able to reproduce these behaviors.

Thank you very much for identifying this and for the very detailed investigation. The numpy script is a bit crazy but I can reproduce it. It seems like a precision issue to me. For example, try the following:

```python
import numpy as np

n = 26
t = 1

x = (np.random.random(2 ** n) ** 2 / 2 ** (n / 2)).astype(np.float32)
x = np.reshape(x, (2 ** (n - t), 2 ** t))

print(np.sum(x))
print(np.sum(np.sum(x, axis=0)))
print(np.sum(np.sum(x, axis=1)))
```

This is just calculating the sum of a random float32 vector in three different ways. For me the results vary depending on how x is generated, but generally the middle value is wrong, while the other two always agree. For this particular example the discrepancy is actually very large, and the funny thing is that the middle value converges to the correct result as t is increased (here we need about t = 10 for acceptable accuracy). Also, with this configuration the issue appears even for n < 26, if t is small enough.

@scarrazza
Member

@mlazzarin could you please open an issue in numpy with this short example?

@mlazzarin
Contributor Author

Done, see numpy/numpy#20458 .

@mlazzarin
Contributor Author

It looks like it's a precision issue indeed. The fact that different summation orders give different results is actually documented in NumPy's docs:

> For floating point numbers the numerical precision of sum (and `np.add.reduce`) is in general limited by directly adding each number individually to the result causing rounding errors in every step. However, often numpy will use a numerically better approach (partial pairwise summation) leading to improved precision in many use-cases. This improved precision is always provided when no `axis` is given. When `axis` is given, it will depend on which axis is summed. Technically, to provide the best speed possible, the improved precision is only used when the summation is along the fast axis in memory. Note that the exact precision may vary depending on other parameters. In contrast to NumPy, Python's `math.fsum` function uses a slower but more precise approach to summation. Especially when summing a large number of lower precision floating point numbers, such as `float32`, numerical errors can become significant. In such cases it can be advisable to use `dtype="float64"` to use a higher precision for the output.

When I have some spare time, I'll try to figure out how to cope with this problem in Qibo.
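One mitigation that follows directly from that documentation note is to accumulate in double precision via the `dtype` argument of `np.sum`, while keeping the state itself in single precision. A sketch on the same saturating example (this is not necessarily the fix that ended up in Qibo):

```python
import numpy as np

# Same configuration as before: each column sums to exactly 1.0.
x = np.full((2**25, 2), 2**-25, dtype=np.float32)

lossy = np.sum(x, axis=0)                       # float32 accumulator
accurate = np.sum(x, axis=0, dtype=np.float64)  # float64 accumulator

print(lossy)     # mass is lost along the strided axis
print(accurate)  # [1. 1.]
```

Accumulating in float64 is exact here because every partial sum is an integer multiple of 2**-25, far within double precision.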

@scarrazza
Member

Ok, then one possibility is to avoid the axis argument and explicitly sum each array column.
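A sketch of that idea: reduce each column with an axis-free sum, which the NumPy docs say always uses pairwise summation (`column_sums` is a hypothetical helper for illustration, not Qibo code):

```python
import numpy as np

def column_sums(a):
    # Hypothetical helper: sum each column with an axis-free
    # reduction so NumPy can apply pairwise summation per column.
    return np.array([a[:, j].sum() for j in range(a.shape[1])])

x = np.full((2**25, 2), 2**-25, dtype=np.float32)
print(np.sum(x, axis=0))  # lossy strided accumulation
print(column_sums(x))     # [1. 1.]
```

The Python loop costs one call per column, which is cheap as long as the number of measured qubits (and hence columns) is small compared to the reduction length.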

3 participants