Adding circuit executor classes and shot-branching #1766

Merged

hhorii merged 63 commits into Qiskit:main from doichanj:add_executor

Aug 9, 2023

Collaborator

doichanj commented Mar 29, 2023

Summary

This PR restructures the parallel simulation classes that were implemented in the StateChunk class.
In place of StateChunk, this PR introduces CircuitExecutor classes that live outside of the State classes.

This PR also implements shot-branching, revised from PR #1606.
The implementation is now simplified and lives in the CircuitExecutor::MultiStateExecutor class.
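
For readers new to the term: shot-branching can be pictured as one state carrying a whole group of shots until a probabilistic operation (measurement, reset, sampled noise) is reached, at which point the group is split by outcome and each branch continues with its share of the shots. The sketch below only illustrates that idea. `Branch` and `split_branch` are hypothetical names and are not taken from this PR's code.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical illustration: one simulated state shared by `num_shots` shots.
struct Branch {
  std::size_t num_shots;
};

// Split a branch at a probabilistic operation. Each shot independently picks
// an outcome according to `probs`; shots with the same outcome stay together
// in one child branch, so the state is evolved once per outcome rather than
// once per shot.
std::vector<Branch> split_branch(const Branch &parent,
                                 const std::vector<double> &probs,
                                 std::mt19937_64 &rng) {
  std::vector<Branch> children(probs.size(), Branch{0});
  std::discrete_distribution<std::size_t> pick(probs.begin(), probs.end());
  for (std::size_t i = 0; i < parent.num_shots; ++i)
    children[pick(rng)].num_shots += 1;
  return children;
}
```

Children that end up with zero shots would simply be dropped, which is where the saving over per-shot simulation comes from.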

Details and comments

doichanj added 5 commits

March 17, 2023 10:31


          adding executor classes for parallel simulations

5684f13


          merge main

a03bbb1


          fix merge conflicts

430ed9f


          Fix tensor_network method enablement

4a0b8ec


          simplify sub-classes

e0ba7e6

doichanj requested a review from hhorii

March 29, 2023 07:50

doichanj added 4 commits

March 30, 2023 09:57


          fix unformatted code

1cd3c73


          Merge remote-tracking branch 'upstream/main' into add_executor


          fix unformatted code again

9fcd4e9


          Fix MPI code

f4fe17f

hhorii added the enhancement label

hhorii added this to the Aer 0.13.0 milestone

doichanj added 3 commits

April 5, 2023 18:53


          Fix shot-branching was not enabled with noise sampling

c6c2e6e


          Fix clang format

72faca7


          set_num_qubits to virtual function to set correct num qubits on matrix

hhorii reviewed

Collaborator

hhorii left a comment

I would like to leave comments on aer_controller.hpp first.

src/simulators/simulators.hpp Outdated

Comment on lines 18 to 26

+              #include "simulators/density_matrix/densitymatrix_state.hpp"
+              #include "simulators/extended_stabilizer/extended_stabilizer_state.hpp"
+              #include "simulators/matrix_product_state/matrix_product_state.hpp"
+              #include "simulators/stabilizer/stabilizer_state.hpp"
+              #include "simulators/statevector/qubitvector.hpp"
+              #include "simulators/statevector/statevector_state.hpp"
+              #include "simulators/superoperator/superoperator_state.hpp"
+              #include "simulators/tensor_network/tensor_net_state.hpp"
+              #include "simulators/unitary/unitary_state.hpp"

Collaborator

hhorii Apr 6, 2023 (edited)

The above includes are not necessary if we do not refactor Controller::execute.

src/controllers/aer_controller.hpp

		@@ -917,18 +565,7 @@ Result Controller::execute(std::vector<Circuit> &circuits,

Collaborator

hhorii Apr 6, 2023

I think the Controller::execute(std::vector<Circuit> &circuits, ..) method can be structured as follows:

1. Determine methods (calling simulation_methods())
2. Construct std::vector<std::shared_ptr<Executor>>
3. Determine experiment parallelization (set_parallelization_experiments()) by using Executor.required_memory_mb()
4. Call Executor.run_circuit() in parallel or serially

The Executor can be constructed via a static method in simulators.hpp.
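
As a rough illustration of the flow proposed above, here is a minimal, self-contained sketch. Every type in it (`Circuit`, `Executor`, `DenseExecutor`, `Summary`) and the memory heuristic are simplified, hypothetical stand-ins rather than the Aer classes or the real `set_parallelization_experiments()` logic, and the method-selection step is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// All types below are hypothetical stand-ins, not the Aer classes.
struct Circuit {
  std::size_t num_qubits;
};

struct Executor {
  virtual ~Executor() = default;
  virtual std::size_t required_memory_mb(const Circuit &circ) const = 0;
  virtual std::size_t run_circuit(const Circuit &circ) = 0;
};

// Dense statevector stand-in: 2^n complex<double> amplitudes, 16 bytes each.
struct DenseExecutor : Executor {
  std::size_t required_memory_mb(const Circuit &circ) const override {
    const std::size_t bytes = (std::size_t{1} << circ.num_qubits) * 16;
    return (bytes + (std::size_t{1} << 20) - 1) >> 20; // round up to MiB
  }
  std::size_t run_circuit(const Circuit &circ) override {
    return std::size_t{1} << circ.num_qubits; // placeholder result
  }
};

// Step 2: build one executor per circuit through a factory.
std::shared_ptr<Executor> make_executor(const Circuit &) {
  return std::make_shared<DenseExecutor>();
}

// Step 3: circuits that fit in memory at once, capped by the thread count.
std::size_t parallel_experiments(
    const std::vector<std::shared_ptr<Executor>> &executors,
    const std::vector<Circuit> &circuits, std::size_t max_memory_mb,
    std::size_t max_threads) {
  std::size_t worst = 1;
  for (std::size_t i = 0; i < circuits.size(); ++i)
    worst = std::max(worst, executors[i]->required_memory_mb(circuits[i]));
  const std::size_t fit = std::max<std::size_t>(1, max_memory_mb / worst);
  return std::min({fit, max_threads, circuits.size()});
}

struct Summary {
  std::size_t parallel;
  std::vector<std::size_t> results;
};

// Steps 2 to 4 together. Step 1 (choosing the method) is omitted.
Summary execute(const std::vector<Circuit> &circuits,
                std::size_t max_memory_mb, std::size_t max_threads) {
  std::vector<std::shared_ptr<Executor>> executors;
  for (const auto &circ : circuits)
    executors.push_back(make_executor(circ));
  Summary out{
      parallel_experiments(executors, circuits, max_memory_mb, max_threads),
      {}};
  // Serial loop; with out.parallel > 1 an OpenMP parallel-for could go here.
  for (std::size_t i = 0; i < circuits.size(); ++i)
    out.results.push_back(executors[i]->run_circuit(circuits[i]));
  return out;
}
```

The point of the structure is that the controller only talks to the `Executor` interface, so memory estimation and execution no longer need method-specific branches in the controller itself.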

Collaborator Author

doichanj Apr 7, 2023

I created a Controller::make_circuit_executor function and removed required_memory_mb and validate_state from Controller.

hhorii reviewed

Collaborator

hhorii left a comment

Comments for circuit_executor.hpp are here:

src/simulators/circuit_executor.hpp

+                uint_t distributed_procs_;    // number of processes in communicator group
+                uint_t distributed_group_;    // group id of distribution
+                int_t distributed_proc_bits_; // distributed_procs_=2^distributed_proc_bits_
+                                              // (if nprocs != power of 2, set -1)

Collaborator

hhorii Apr 6, 2023

  uint_t myrank_;               // process ID
  uint_t nprocs_;               // number of processes
  uint_t distributed_rank_;     // process ID in communicator group
  uint_t distributed_procs_;    // number of processes in communicator group
  uint_t distributed_group_;    // group id of distribution
  int_t distributed_proc_bits_; // distributed_procs_=2^distributed_proc_bits_
                                // (if nprocs != power of 2, set -1)

These values are only for MPI, I think. Can we put them inside #ifdef AER_MPI? Maybe some code in aer_controller.hpp can go inside it as well.

Collaborator Author

doichanj Apr 7, 2023

A lot of code depends on these params, so I would like to keep them available in non-MPI environments. But I added #ifdef AER_MPI around the code we can separate.
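
A minimal sketch of the compromise described here, assuming a hypothetical `DistributionInfo` holder: the fields always exist with single-process defaults, and only the calls into MPI sit behind `AER_MPI`. The `proc_bits` helper mirrors the comment on `distributed_proc_bits_` (log2 of the process count, or -1 when it is not a power of two). Neither name comes from the PR.

```cpp
#include <cstdint>
#ifdef AER_MPI
#include <mpi.h>
#endif

// Hypothetical holder for the rank bookkeeping discussed above.
struct DistributionInfo {
  std::uint64_t myrank = 0; // process ID
  std::uint64_t nprocs = 1; // number of processes
};

// The fields always exist, so dependent code compiles everywhere; only the
// calls into MPI are fenced off. Assumes MPI_Init has already been called
// when AER_MPI is defined.
inline DistributionInfo query_distribution() {
  DistributionInfo info;
#ifdef AER_MPI
  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  info.myrank = static_cast<std::uint64_t>(rank);
  info.nprocs = static_cast<std::uint64_t>(size);
#endif
  return info;
}

// log2(nprocs) if nprocs is a power of two, otherwise -1.
inline std::int64_t proc_bits(std::uint64_t nprocs) {
  if (nprocs == 0 || (nprocs & (nprocs - 1)) != 0)
    return -1;
  std::int64_t bits = 0;
  while ((nprocs >>= 1) != 0)
    ++bits;
  return bits;
}
```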

src/simulators/circuit_executor.hpp

+              #endif
+                // settings for cuStateVec
+                bool cuStateVec_enable_ = false;

Collaborator

hhorii Apr 6, 2023

This can also be inside #ifndef AER_THRUST_CPU, I guess.

Collaborator Author

doichanj Apr 7, 2023

I added #ifdef AER_CUSTATEVEC around this param.

src/simulators/circuit_executor.hpp

+                  // Rng engine (this one is used to add noise on circuit)
+                  RngEngine rng;
+                  rng.set_seed(circ.seed);

Collaborator

hhorii Apr 6, 2023

rng.set_seed(circ.seed); is called again later (in run_circuit_with_sampling or run_circuit_shots). It is better to use different seeds for them.

Collaborator Author

doichanj Apr 7, 2023

Now the initial rng used for noise sampling is reused in the run_circuit_xxx functions.
I think this is the strategy used in the older code.
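
To make the difference between the two strategies concrete, here is a small sketch that uses `std::mt19937_64` as a stand-in for Aer's `RngEngine` (the function names are hypothetical). Re-seeding a second engine with the same seed replays the same stream, so the noise-sampling and shot stages would see identical random numbers. Reusing one engine lets the shot stage continue from where noise sampling stopped.

```cpp
#include <cstdint>
#include <random>
#include <utility>

// Re-seeding: the noise-sampling engine and the shot engine both start from
// the same seed, so they replay the identical stream.
std::pair<std::uint64_t, std::uint64_t> draws_with_reseed(std::uint64_t seed) {
  std::mt19937_64 noise_rng(seed);
  const std::uint64_t noise_draw = noise_rng();
  std::mt19937_64 shot_rng(seed);
  const std::uint64_t shot_draw = shot_rng();
  return {noise_draw, shot_draw};
}

// Reuse: one engine is seeded once and handed on, so the shot stage
// continues where noise sampling stopped.
std::pair<std::uint64_t, std::uint64_t> draws_with_reuse(std::uint64_t seed) {
  std::mt19937_64 rng(seed);
  const std::uint64_t noise_draw = rng();
  const std::uint64_t shot_draw = rng();
  return {noise_draw, shot_draw};
}
```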

src/simulators/circuit_executor.hpp Outdated

+                  rng.set_seed(circ.seed);
+                  run_with_sampling(circ, state, result, rng, circ.shots);
+                } else {
+                  // Vector to store parallel thread output data

Collaborator

hhorii Apr 6, 2023

I'm confused about why run_circuit_with_sampling needs to consider parallel_shots_. If we use sampling, the state is calculated once and the bitstrings are then sampled from that state.

Collaborator Author

doichanj Apr 7, 2023

Removed unnecessary functions and code.
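
The sampling path referred to in this thread can be sketched as follows: the final state is computed once, its measurement probabilities are derived, and every shot is then drawn from that one distribution. This is an illustrative stand-alone example with hypothetical function names, not the code in circuit_executor.hpp.

```cpp
#include <complex>
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// Format a basis-state index as a bitstring, most significant qubit first.
std::string to_bitstring(std::size_t index, std::size_t num_qubits) {
  std::string bits(num_qubits, '0');
  for (std::size_t q = 0; q < num_qubits; ++q)
    if ((index >> q) & 1U)
      bits[num_qubits - 1 - q] = '1';
  return bits;
}

// Draw `shots` measurement outcomes from an already computed final state.
std::map<std::string, std::size_t>
sample_counts(const std::vector<std::complex<double>> &state,
              std::size_t num_qubits, std::size_t shots,
              std::mt19937_64 &rng) {
  std::vector<double> probs;
  probs.reserve(state.size());
  for (const auto &amp : state)
    probs.push_back(std::norm(amp)); // |amplitude|^2
  std::discrete_distribution<std::size_t> pick(probs.begin(), probs.end());
  std::map<std::string, std::size_t> counts;
  for (std::size_t s = 0; s < shots; ++s)
    ++counts[to_bitstring(pick(rng), num_qubits)];
  return counts;
}
```

Because the expensive part (computing the state) happens once, there is nothing left to parallelize over shots apart from the cheap sampling loop.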

src/simulators/circuit_executor.hpp

+                Utils::apply_omp_parallel_for((par_shots > 1), 0, par_shots,
+                                              run_circuit_lambda);
+                // gather cregs on MPI processes and save to result

Collaborator

hhorii Apr 6, 2023

The code below can be inside #ifdef AER_MPI (gather_creg_memory as well), I guess.

Collaborator Author

doichanj Apr 7, 2023

#ifdef AER_MPI inserted

doichanj added 3 commits

April 7, 2023 15:15


          reflecting review comments

cb0579c


          reuse of random number generator

b0c79a3


          Merge remote-tracking branch 'upstream/main' into add_executor

6a56514

hhorii reviewed

Collaborator

hhorii left a comment

Reviewed Executors. I'm wondering whether MultiStateExecutor is effective, that is, whether shot-branching improves any of the methods.

src/controllers/aer_controller.hpp Outdated

                   result.metadata.add(max_memory_mb_, "max_memory_mb");
                   result.metadata.add(max_gpu_memory_mb_, "max_gpu_memory_mb");
+              #ifdef AER_MPI

Collaborator

hhorii Apr 14, 2023

This is a very minor comment. We can consolidate the multiple _OPENMP and AER_MPI ifdef blocks.

Collaborator Author

doichanj Apr 17, 2023

merged ifdef blocks

src/framework/results/experiment_result.hpp Outdated

Comment on lines 218 to 219

		for (int_t i = 0; i < nshots; i++)
		data.add_single(datum, key);

Collaborator

hhorii Apr 14, 2023

I'm not sure why we add the same datum multiple times here. Even if a caller wants to add the same data multiple times, it is better to do so outside of save_data_pershot, as follows:

for (size_t i = 0; i < root.num_shots(); ++i) {
    result.save_data_pershot(Base::states_[root.state_index()].creg(),
                             op.string_params[0], amps, op.type, op.save_type);
}

Collaborator Author

doichanj Apr 17, 2023

Removed nshots and the loop from save_data_pershot.

src/noise/noise_model.hpp Outdated

Collaborator

hhorii Apr 14, 2023

I think we can reuse qerror_loc for sampling at runtime. But it is fine to use sample_noise explicitly to specify the points at which to inject noise again in C++ (also, we may be able to cover noise injection without qerror_loc).

src/simulators/batch_shots_executor.hpp Outdated

Collaborator

hhorii Apr 14, 2023

In principle, BatchShotsExecutor should not inherit ParallelStateExecutor, because batch execution and chunk execution are independent. However, batch execution is currently supported only by statevector and densitymatrix, and both also support chunk execution. I agree with this class structure because we should avoid the complexity of multiple inheritance. We will be able to consider another class structure when batch execution is supported in other methods (that do not support chunk execution).
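
The single-inheritance trade-off described here can be pictured with a heavily simplified sketch. The class names follow the comment, but the `run` method, its return values, and the `batch_enabled` flag are hypothetical and only serve to show the fallback from the batch path to the inherited chunk path.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, heavily simplified stand-ins for the classes named above.
class ParallelStateExecutor {
public:
  virtual ~ParallelStateExecutor() = default;
  // Chunk execution: one large state split into chunks.
  virtual std::string run(std::size_t /*shots*/) { return "chunk"; }
};

class BatchShotsExecutor : public ParallelStateExecutor {
public:
  explicit BatchShotsExecutor(bool batch_enabled)
      : batch_enabled_(batch_enabled) {}
  // Batch execution when enabled and useful, otherwise fall back to the
  // inherited chunk path.
  std::string run(std::size_t shots) override {
    if (batch_enabled_ && shots > 1)
      return "batch";
    return ParallelStateExecutor::run(shots);
  }

private:
  bool batch_enabled_;
};
```

A method that supports batch execution but not chunk execution would not fit this chain, which is the situation the comment defers to a later redesign.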

src/simulators/multi_state_executor.hpp

+              protected:
+                void set_config(const Config &config) override;
+                void set_distribution(uint_t num_states);

Collaborator

hhorii Apr 14, 2023

Could you add a comment to explain this method? Originally, I found the following comment in state_chunk.hpp on the method of the same name, except that its argument was named uint_t nprocs.

// set number of processes to be distributed

Collaborator Author

doichanj Apr 17, 2023

added comment

src/controllers/aer_controller.hpp

-                  return state.required_memory_mb(circ.num_qubits, circ.ops);
-                }
+                  return std::make_shared<
+                      CircuitExecutor::Executor<MatrixProductState::State>>();

Collaborator

hhorii Apr 14, 2023

Why is MultiStateExecutor not used for MPS and the others?

Collaborator Author

doichanj Apr 17, 2023

At this time, only simulation methods that support shot-branching inherit MultiStateExecutor.

doichanj added 7 commits

April 17, 2023 16:36


          recover save_data_per_shot

6abd5ab


          Merge remote-tracking branch 'upstream/main' into add_executor

9a9e3e7


          add missed omp threads setting in statevector, change class hierarchy

a7276f3


          Fix performance issue of GPU shot-branching

d02c782


          move fusion outside of loop for non noise dynamic circuits

7fa4240


          Merge branch 'main' into add_executor

846f645


          fix shot-branching options in aer_compiler.py

371dd62

hhorii mentioned this pull request

Sampler with shots=None does not calculate resets exactly #1811

Open


          save codes before merge

16aa8a8

doichanj added 4 commits

July 14, 2023 15:37


          Changed OpenMP threading for shot-branching

138a1a0


          Merge remote-tracking branch 'upstream/main' into add_executor

abbb719


          mutable to matrix and param buffer

61bb0cb


          format

3353d11

hhorii added the Changelog: New Feature label

doichanj and others added 15 commits

July 28, 2023 10:36


          add target_gpus option

07bffc2


          Remove Python 3.7 from Github actions (Qiskit#1819)

c76ed74

Since 0.13.0, Aer does not support Python 3.7.
This commit removes github actions for CI.

* Removing python 3.7 from test workflow
* Removing python 3.7 from build workflow
* Removing python 3.7 from deploy workflow
* Removing python 3.7 from tox
* revert
* Remove python 3.7 from pyproject.toml
* Remove python 3.7 from pyproject.toml - tool
---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>


          Fix missing dynamic link path for CUDA runtime and cuQuantum libraries (Qiskit#1877)

99c9e26

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>


          Fix OpenMP nested parallel (Qiskit#1880)

67e6e8d

* Fix OpenMP nested parallel

* add comment in release note

* fix true and false

* fix format

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>


          Support u3 gate application in Aer runtime API (Qiskit#1876)

8885eae

* Support u3 gate application

* Apply clang-format

* Revert clang-format for aer_runtime_api.h

* Add release note

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>


          Fix required_memory_mb (Qiskit#1881)

0d553b0

* Fix required_memory_mb

* add release note

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>


          format

daf3c19


          format


          format

dbe2300


          comment out target_gpu setting for non-GPU

fabfa63


          comment out target_gpu setting for non-GPU

e66ab86


          Remove PulseSimulator (Qiskit#1884)

c9c6150

Since 0.12, Qiskit-Aer has issued deprecation warnings for use of PulseSimulator. Because 0.13 will be released more than 3 months after 0.12 was released, Qiskit-Aer will stop supporting pulse simulation.

* first pass at removing pulse simulator
* autoformat with black
* remove ref to aer pulse in docs
* fix lint issues
* remove pulse rst
* remove pulse tests
* add release note
* remove open pulse from CMakeLists.txt
* remove pulse tests
* remove remaining pulse codes

---------

Co-authored-by: AngeloDanducci <angelo.danducci.ii@ibm.com>


          Fix an issue in aer_state_initialize() of C API (Qiskit#1885)

e39e458

Correct C API `aer_state_initialize` to take an argument of `handler`.

* update aer_state_initialize API
* add reno


          fix MPI shot-branching sampling

a59e319


          fix unmerged file

6fe5f04

doichanj force-pushed the add_executor branch from 9005ba9 to 6fe5f04 Compare

August 9, 2023 06:58

doichanj added 7 commits

August 9, 2023 16:00


          remove conflict

8f07e97


          rerun tests

e63e1e0


          remove conflict


          recover files

f566872


          remove conflict

12b7d7b


          fix non-gpu

36f0ea8


          update release note

f1ee98d

hhorii approved these changes

Collaborator

hhorii left a comment

Though I was not able to review the implementation in detail due to the size of the changes, I believe that we can merge this PR and have time to refine the implementation before the release of 0.13.0.

hhorii merged commit 9999dfb into Qiskit:main

30 checks passed

Labels

Changelog: New Feature enhancement