Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ConsolidateBlocks pass #10365

Merged
merged 22 commits into from
Jul 20, 2023
Merged

Conversation

raynelfss
Copy link
Contributor

@raynelfss raynelfss commented Jun 30, 2023

Summary

Tries to address #8779
The following commits attempt to speedup the ConsolidateBlocks pass by calculating the unitary of each block inside of the pass rather than passing it as a QuantumCircuit and converting it to an Operator, which is redundant.

Details and comments

Known issues:

  • When a circuit is generated from a qasm file, some non-traditional gates will not have a defined to_matrix method and the algorithm will throw an exception. Fixed in 653e23c.
  • On Windows devices, the transpiler will throw the following error with certain circuits:
LinAlgError: eig algorithm (geev) did not converge (only eigenvalues with order >= 4 have converged)

Performance

The following is an example with a 1024 qubit random circuit of depth 128.

  • Old ConsolidateBlocks:
    image
  • New ConsolidateBlocks:
    image

@qiskit-bot qiskit-bot added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label Jun 30, 2023
@raynelfss
Copy link
Contributor Author

  • Old ConsolidateBlocks snakeviz: image
  • New ConsolidateBlocks snakeviz:
    image

@jakelishman jakelishman added Changelog: New Feature Include in the "Added" section of the changelog Intern PR PR submitted by IBM Quantum interns performance and removed Community PR PRs from contributors that are not 'members' of the Qiskit repo labels Jun 30, 2023
@raynelfss raynelfss changed the title [WIP] Optimize ConsolidateBlocks pass Optimize ConsolidateBlocks pass Jul 6, 2023
@raynelfss raynelfss marked this pull request as ready for review July 6, 2023 15:57
@raynelfss raynelfss requested a review from a team as a code owner July 6, 2023 15:57
@qiskit-bot qiskit-bot added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label Jul 6, 2023
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@coveralls
Copy link

coveralls commented Jul 6, 2023

Pull Request Test Coverage Report for Build 5616076303

  • 40 of 46 (86.96%) changed or added relevant lines in 3 files are covered.
  • 7 unchanged lines in 4 files lost coverage.
  • Overall coverage remained the same at 85.954%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/passes/optimization/consolidate_blocks.py 10 12 83.33%
qiskit/transpiler/passes/utils/block_to_matrix.py 29 33 87.88%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 93.76%
qiskit/circuit/tools/pi_check.py 1 91.23%
qiskit/transpiler/passes/synthesis/unitary_synthesis.py 1 90.33%
crates/qasm2/src/lex.rs 4 91.39%
Totals Coverage Status
Change from base Build 5615993162: 0.0%
Covered Lines: 72903
Relevant Lines: 84816

💛 - Coveralls

@mtreinish mtreinish added this to the 0.25.0 milestone Jul 11, 2023
@ewinston ewinston self-assigned this Jul 12, 2023
@ewinston
Copy link
Contributor

The _block_to_matrix method seems to be taking over some of the responsibilities which Operator would normally take. If such an improvement is possible then it seems better to place it with Operator so other parts of the code could benefit from it. If it is the case, as suggested, that there is some trade off between matmul and einsum as a function of matrix size it would be nice to understand where that trade-off lies (e.g. n=2, 3,...) and apply the switch there.

Actually I believe @chriseclectic orginally was using matmul in Operator but switched to einsum for speed but perhaps that was just based on larger matrices?

@jakelishman
Copy link
Member

Erick: this PR is deliberately meant to take over some of what Operator does just like #5909 does, because of the overheads from Operator itself. There's potentially the case for some slightly faster paths to be added to Operator for the case of small numbers of qubits, but Operator is rarely performance critical, and is optimised more towards generality. There's overhead in managing things like the OpShape, parsing the qargs into a format, and then of course there's the einsum parts.

einsum lets Operator avoid constructing the implicit full-tensor-product space matrices when composing a smaller operator onto a larger one, and it allows general argument re-ordering without the explicit swaps that the PR in its current form adds in (though of course it's possible to elide the 2 matmul swap ops and do a direct permutation, or the manual 4-tensor transposition in Numpy). For the cases considered by this PR, there's no need for any of Operators general handling, and you can see in the results the huge improvement from avoiding that overhead.

@ewinston
Copy link
Contributor

ewinston commented Jul 14, 2023

Can you show the profile result in table format sorted by cumtime and tottime for the top ~20 rows?

@raynelfss
Copy link
Contributor Author

raynelfss commented Jul 17, 2023

Can you show the profile result in table format sorted by cumtime and tottime for the top ~20 rows?

@ewinston
This is the profile result of the old version:

  • By tottime:
    image
  • By cumtime:
    image

This is for the optimized version:

  • By tottime:
    image
  • By cumtime:
    image

@ewinston
Copy link
Contributor

@raynelfss It appears like the profiling includes other passes besides the one under test which makes the output a little hard to interpret. Don't worry about it for now though since we're just taking the step of moving the block_to_matrix to transpiler/passes/utils.

Returns:
NDArray: Matrix representation of the block of operations.
"""
matrix = identity(2 ** len(block_index_map), dtype=complex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An exception should be raised if the dimension of matrix != 4.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this, it would be better to just throw an exception whenever len(block_index_map) != 2 . Since earlier in the code we know that block_index_map comes from the following operation:

https://github.com/Qiskit/qiskit-terra/blob/c87d00c90af0635561c97a8ae54cd7eeccf78fce/qiskit/transpiler/passes/optimization/consolidate_blocks.py#L106

Which takes block_qargs (The set of all qubits in the block) as an argument and returns a mapped dictionary of the same length. Before we call _block_to_matrix we will check that len(block_qargs) <= 2 to call it, otherwise, the old method will be used. As shown here:

https://github.com/Qiskit/qiskit-terra/blob/c87d00c90af0635561c97a8ae54cd7eeccf78fce/qiskit/transpiler/passes/optimization/consolidate_blocks.py#L112-L123

So it would be better to just add an exception at the beginning of block_to_matrix that says if the length of block_index_length exceeds 2 then throw an error. Which would cover for cases in which the matrix has a dimension that exceeds 2x2.

@raynelfss raynelfss requested a review from ewinston July 20, 2023 21:08
block_index_length = len(block_index_map)
if block_index_length != 2:
raise QiskitError(
"This function can only operate with blocks of 2 qubits or less."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "or less" should be removed.

ewinston
ewinston previously approved these changes Jul 20, 2023
@mtreinish mtreinish enabled auto-merge July 20, 2023 21:16
auto-merge was automatically disabled July 20, 2023 21:28

Head branch was pushed to by a user without write access

@mtreinish mtreinish enabled auto-merge July 20, 2023 21:35
@mtreinish mtreinish added this pull request to the merge queue Jul 20, 2023
Merged via the queue into Qiskit:main with commit 2b225d3 Jul 20, 2023
13 checks passed
to24toro pushed a commit to to24toro/qiskit-terra that referenced this pull request Aug 3, 2023
* Initial: Introducing speedup by calculating matrix

* Lint: Reformat using Black

* Fix: Added `basis_count` and `outside_basis`

* Fix: Use `Operator` if `to_matrix` is unavailable.

* Fix: Default to Operator when necessary
- Will default to previous method when blocks have more than 2 qubits.
    - This is a temporary measure.

* Lint: Removed Cyclic import

* Docs: Added release note.

* Docs: Added reference to the issue.

* Fix: Move `block_to_matrix` to ~.passes.utils

* Lint: Fixed cyclical import and order

* CI: Removed type checking

* Add: Exceptions on `_block_to_matrix`

* Docs: Fix release note.

* Fix: Change import path for block_to_matrix

* Update qiskit/transpiler/passes/utils/block_to_matrix.py

---------

Co-authored-by: ewinston <ewinston@us.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog Community PR PRs from contributors that are not 'members' of the Qiskit repo Intern PR PR submitted by IBM Quantum interns performance
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants