Parallelize sampling measurement in MPS #1911

Merged: 9 commits, Jan 10, 2024


@Patataman (Contributor) commented Aug 24, 2023

Summary

Hello, this PR parallelizes the measurement sampling algorithm implemented in #1377.

Details and comments

Measurement sampling can be parallelized at the shot level. A small critical section is needed to ensure the random numbers are generated correctly. With this change, performance is better than the current implementation in almost every case.
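To illustrate the structure described above, here is a minimal Python sketch, not the code from this PR. It assumes a toy product state in which each qubit is measured as 1 with an independent probability; `sample_shot`, `sample_counts`, and `p_one` are hypothetical names. The lock plays the role of the critical section: each shot draws all of its random numbers atomically from the shared generator, and the rest of the per-shot work runs without synchronization.

```python
import random
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sample_shot(rands, p_one):
    """Turn one random number per qubit into a bitstring (toy product state)."""
    return "".join("1" if r < p else "0" for r, p in zip(rands, p_one))


def sample_counts(p_one, shots, seed, workers=4):
    """Sample `shots` bitstrings in parallel and return a histogram."""
    rng = random.Random(seed)      # single generator shared by all workers
    rng_lock = threading.Lock()    # stands in for the critical section

    def one_shot(_shot_index):
        # Critical part: draw every random number for this shot atomically.
        with rng_lock:
            rands = [rng.random() for _qubit in p_one]
        # Independent part: the per-shot sampling work needs no locking.
        return sample_shot(rands, p_one)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(one_shot, range(shots)))


counts = sample_counts([0.5, 0.2, 0.9], shots=1000, seed=7)
print(sum(counts.values()))
```

Because every shot takes a contiguous block of numbers from the generator, the histogram for a given seed does not depend on the number of workers; only the assignment of blocks to shots can change. Python threads will not show the speedup reported here, so the sketch is about structure only.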

For the PR and benchmarking, parallelism can be activated or deactivated using the environment variable PRL_PROB_MEAS=1. If you are OK with the parallelization, I will just remove the condition.
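As a rough illustration of such a toggle, the sketch below reads the variable in Python. It assumes that the value "1" enables the parallel path and that anything else (or an unset variable) disables it; how the PR actually parses the variable is not shown here, and `parallel_measurement_enabled` is a hypothetical helper.

```python
import os


def parallel_measurement_enabled(env=None):
    """Return True when PRL_PROB_MEAS is "1" (assumed convention)."""
    env = os.environ if env is None else env
    return env.get("PRL_PROB_MEAS", "0") == "1"


# Example: enable the toggle for this process before starting a benchmark.
os.environ["PRL_PROB_MEAS"] = "1"
print(parallel_measurement_enabled())
```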

Here are some results using Random Quantum Circuits (https://arxiv.org/pdf/2207.14280.pdf) with 30 qubits and different depths:

Server configuration

| Component | Value |
| --- | --- |
| CPU | Intel Xeon Gold 6148 |
| # sockets | 2 |
| # cores | 20 |
| RAM | 192GB |
| GPU | None |
| OS | Ubuntu 22.04.1 LTS |
| Python | 3.10 |
| OpenBLAS/LAPACK | 0.3.21 |
| gcc | v11.3 |

For depths 1 and 3, the main "problem" is the parallelization overhead, since the execution time was already small.

Here are the specific times (in seconds):

| Depth | Base (s) | Parallel (s) |
| --- | --- | --- |
| 1 | 0.030685282 | 0.042850685 |
| 3 | 0.03668642 | 0.053231239 |
| 5 | 0.076008368 | 0.073100615 |
| 10 | 198.6679559 | 55.42104983 |
| 12 | 1220.6618 | 1025.729388 |
| 15 | 59207.97474 | 39454.64986 |
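For a quick read of these numbers, the speedup can be computed as base time divided by parallel time. The short sketch below does only that arithmetic on the values from the table; `speedups` is a hypothetical helper name.

```python
# Times in seconds, copied from the table above: depth -> (base, parallel).
times = {
    1: (0.030685282, 0.042850685),
    3: (0.03668642, 0.053231239),
    5: (0.076008368, 0.073100615),
    10: (198.6679559, 55.42104983),
    12: (1220.6618, 1025.729388),
    15: (59207.97474, 39454.64986),
}


def speedups(times):
    """Return depth -> base / parallel; values above 1 mean parallel is faster."""
    return {depth: base / parallel for depth, (base, parallel) in times.items()}


for depth, factor in speedups(times).items():
    print(f"depth {depth:>2}: {factor:.2f}x")
```

Depths 1 and 3 come out below 1x (the overhead case mentioned above), while depth 10 shows the largest gain at about 3.6x.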

Or, checking with a different number of qubits at the same depth:

@doichanj added the performance label on Aug 24, 2023
@merav-aharoni (Contributor) left a comment


Looks good!
The graphs are a bit strange. I don't see why the depth should be a factor in the performance improvement. Also, increasing the number of qubits is not a direct factor; the main factor should be the number of shots. I think a graph with the number of shots on the x-axis would best show your improvement.
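One way to collect data for such a graph is a small timing harness that sweeps the number of shots. The sketch below is generic: `time_over_shots` and `dummy_run` are hypothetical names, and `dummy_run` is only a stand-in workload, so the simulator call would have to be plugged in as the `run` callable.

```python
import time


def time_over_shots(run, shot_counts, repeats=3):
    """Return [(shots, best_seconds)] for each entry in shot_counts."""
    results = []
    for shots in shot_counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(shots)
            best = min(best, time.perf_counter() - start)
        results.append((shots, best))
    return results


def dummy_run(shots):
    # Stand-in workload whose cost grows with the number of shots.
    return sum(i * i for i in range(shots))


for shots, seconds in time_over_shots(dummy_run, [100, 1_000, 10_000]):
    print(f"{shots:>6} shots: {seconds:.6f} s")
```

Running the harness once with the parallel path enabled and once with it disabled would give the two curves to plot against the number of shots.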

@doichanj added this to the Aer 0.14.0 milestone on Oct 11, 2023
@doichanj merged commit c533946 into Qiskit:main on Jan 10, 2024
31 checks passed