Adding circuit executor classes and shot-branching (#1766)
* adding executor classes for parallel simulations

* fix merge conflicts

* simplify sub-classes

* fix unformatted code

* fix unformatted code again

* Fix MPI code

* Fix shot-branching was not enabled with noise sampling

* Fix clang format

* set_num_qubits to virtual function to set correct num qubits on matrix

* reflecting review comments

* reuse of random number generator

* recover save_data_per_shot

* add missed omp threads setting in statevector, change class hierarchy

* Fix performance issue of GPU shot-branching

* move fusion outside of loop for non noise dynamic circuits

* fix shot-branching options in aer_compiler.py

* save codes before merge

* Fix format

* Fix multi-chunk with cuStateVec

* format

* format

* add better multi-GPU distribution for shot-branching

* fix format

* Changed option shot_branching_enable=False by default, add shot_branching_sampling_enable (False by default), add test cases for shot-branching

* format

* format test_shot_branching.py

* Changed OpenMP threading for shot-branching

* mutable to matrix and param buffer

* format

* add target_gpus option

* Remove Python 3.7 from Github actions (#1819)

Since 0.13.0, Aer does not support Python 3.7.
This commit removes github actions for CI.

* Removing python 3.7 from test workflow
* Removing python 3.7 from build workflow
* Removing python 3.7 from deploy workflow
* Removing python 3.7 from tox
* revert
* Remove python 3.7 from pyproject.toml
* Remove python 3.7 from pyproject.toml - tool
---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>

* Fix missing dynamic link path for CUDA runtime and cuQuantum libraries (#1877)

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>

* Fix OpenMP nested parallel (#1880)

* Fix OpenMP nested parallel

* add comment in release note

* fix true and false

* fix format

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>

* Support u3 gate application in Aer runtime API (#1876)

* Support u3 gate application

* Apply clang-format

* Revert clang-format for aer_runtime_api.h

* Add release note

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Fix required_memory_mb (#1881)

* Fix required_memory_mb

* add release note

---------

Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>

* format

* format

* format

* comment out target_gpu setting for non-GPU

* comment out target_gpu setting for non-GPU

* Remove `PulseSimulator` (#1884)

Since 0.12, Qiskit-Aer has emitted deprecation warnings on use of PulseSimulator. Because 0.13 will be released more than 3 months after 0.12 was released, Qiskit-Aer will stop supporting pulse simulation.

* first pass at removing pulse simulator
* autoformat with black
* remove ref to aer pulse in docs
* fix lint issues
* remove pulse rst
* remove pulse tests
* add release note
* remove open pulse from CMakeLists.txt
* remove pulse tests
* remove remaining pulse codes

---------

Co-authored-by: AngeloDanducci <angelo.danducci.ii@ibm.com>

* Fix an issue in `aer_state_initialize()` of C API (#1885)

Correct C API `aer_state_initialize` to take an argument of `handler`.

* update aer_state_initialize API
* add reno

* fix MPI shot-branching sampling

* fix unmerged file

* remove conflict

* rerun tests

* recover files

* remove conflict

* fix non-gpu

* update release note

---------

Co-authored-by: Tung Bui (Leo) <85242618+tungbq@users.noreply.github.com>
Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
Co-authored-by: Ryo Wakizaka <135729070+ibm-wakizaka@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: AngeloDanducci <angelo.danducci.ii@ibm.com>
6 people committed Aug 9, 2023
1 parent e842b4c commit 9999dfb
Showing 53 changed files with 10,881 additions and 6,526 deletions.
3 changes: 3 additions & 0 deletions qiskit_aer/backends/aer_compiler.py
@@ -465,6 +465,8 @@ def compile_circuit(circuits, basis_gates=None, optypes=None):
"chunk_swap_buffer_qubits": (int, np.integer),
"batched_shots_gpu": (bool, np.bool_),
"batched_shots_gpu_max_qubits": (int, np.integer),
"shot_branching_enable": (bool, np.bool_),
"shot_branching_sampling_enable": (bool, np.bool_),
"num_threads_per_device": (int, np.integer),
"statevector_parallel_threshold": (int, np.integer),
"statevector_sample_measure_opt": (int, np.integer),
@@ -488,6 +490,7 @@ def compile_circuit(circuits, basis_gates=None, optypes=None):
"use_cuTensorNet_autotuning": (bool, np.bool_),
"parameterizations": (list),
"fusion_parallelization_threshold": (int, np.integer),
"target_gpus": (list),
}
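
The table above maps each option name to the Python types it accepts. How `aer_compiler.py` consumes the table is not shown in this excerpt, so the following is only a minimal sketch of one way such a table could be used to reject mistyped options. `OPTION_TYPES` and `check_option_types` are hypothetical names, and the sketch uses built-in types only, whereas the real table also accepts NumPy scalar types such as `np.bool_` and `np.integer`.

```python
# Hypothetical, simplified stand-in for the option type table in the diff.
# The real table also lists NumPy scalar types; they are left out here so
# the sketch depends only on the standard library.
OPTION_TYPES = {
    "shot_branching_enable": (bool,),
    "shot_branching_sampling_enable": (bool,),
    "num_threads_per_device": (int,),
    "target_gpus": (list,),
}


def check_option_types(options):
    """Return (name, expected_types, actual_type) for every mistyped option."""
    mismatches = []
    for name, value in options.items():
        expected = OPTION_TYPES.get(name)
        if expected is None:
            continue  # options missing from the table are ignored in this sketch
        if not isinstance(value, expected):
            mismatches.append((name, expected, type(value)))
    return mismatches


if __name__ == "__main__":
    print(check_option_types({"shot_branching_enable": True, "target_gpus": [0, 2]}))
    print(check_option_types({"target_gpus": (0, 2)}))  # a tuple is not a list
```

Under this sketch, passing `target_gpus` as a tuple is reported as a mismatch because the table lists only `list` for that option.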


31 changes: 31 additions & 0 deletions qiskit_aer/backends/aer_simulator.py
@@ -170,6 +170,10 @@ class AerSimulator(AerBackend):
If AerSimulator is built with cuStateVec support, cuStateVec APIs are enabled
by setting ``cuStateVec_enable=True``.
* ``target_gpus`` (list): List of GPU IDs, starting from 0, that sets
the target GPUs used for the simulation.
If this option is not specified, all the available GPUs are used for
chunks/shots distribution.
**Additional Backend Options**
@@ -287,6 +291,30 @@ class AerSimulator(AerBackend):
threads per GPU. This parameter is used to optimize Pauli noise
simulation with multiple-GPUs (Default: 1).
* ``shot_branching_enable`` (bool): This option enables/disables
applying the shot-branching technique to speed up multi-shot simulations
of dynamic circuits or of circuits with noise models.
(Default: False).
Simulation starts from a single state shared by multiple shots, and
the state is branched dynamically at runtime.
This option can reduce the number of shot runs if there are fewer branches
than the total number of shots.
This option is available for ``"statevector"``, ``"density_matrix"``
and ``"tensor_network"``.
* ``shot_branching_sampling_enable`` (bool): This option enables/disables
applying sampling measure if the input circuit has all the measure
operations at the end of the circuit. (Default: False).
Because a measure operation branches the state into 2 states, it is not
efficient to apply branching for measure.
Sampling measure speeds up getting counts for multiple shots
sharing the same state.
Note that the counts obtained by sampling measure may not be the same as
the counts calculated by multiple measure operations,
because sampling measure takes only one random number per shot.
This option is available for ``"statevector"``, ``"density_matrix"``
and ``"tensor_network"``.
* ``accept_distributed_results`` (bool): This option enables storing
results independently in each process (Default: None).
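
The `shot_branching_enable` description says that shots start from one shared state and branch only at runtime, which pays off when there are fewer branches than shots. The toy model below is a sketch of that bookkeeping only, not Aer's implementation: the "state" is reduced to the measurement record so far, `run_with_branching` and the op tuples are hypothetical, and the work counter simply counts how many times an operation is applied to some branch.

```python
import random


def run_with_branching(ops, shots, seed=0):
    """Toy shot-branching: shots that share a history share one branch.

    ops is a list of ("gate",) or ("measure", p_one) tuples (hypothetical format).
    Returns (counts per measurement record, number of operation applications).
    """
    rng = random.Random(seed)
    # Each branch is (measurement record so far, number of shots sharing it).
    branches = [("", shots)]
    applied = 0
    for op in ops:
        next_branches = []
        for record, n in branches:
            applied += 1  # one application serves all n shots in this branch
            if op[0] == "measure":
                p_one = op[1]
                # Decide the outcome of every shot, then split the branch in two.
                ones = sum(rng.random() < p_one for _ in range(n))
                for bit, count in (("0", n - ones), ("1", ones)):
                    if count:
                        next_branches.append((record + bit, count))
            else:
                next_branches.append((record, n))
        branches = next_branches
    return dict(branches), applied


if __name__ == "__main__":
    circuit = [("gate",), ("measure", 0.5), ("gate",), ("measure", 0.5)]
    counts, applied = run_with_branching(circuit, shots=1000)
    print(counts, applied, "vs naive", 1000 * len(circuit))
```

In this toy circuit at most four branches exist at the end, so the operation count stays at six or fewer regardless of the shot count, while running each shot independently would apply `shots * len(ops)` operations.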
@@ -709,6 +737,9 @@ def _default_options(cls):
batched_shots_gpu=False,
batched_shots_gpu_max_qubits=16,
num_threads_per_device=1,
# multi-shot branching
shot_branching_enable=False,
shot_branching_sampling_enable=False,
# statevector options
statevector_parallel_threshold=14,
statevector_sample_measure_opt=10,
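
`shot_branching_sampling_enable`, which defaults to `False` above, is documented as drawing counts by sampling with one random number per shot when all measurements sit at the end of the circuit. A minimal sketch of that idea, assuming the final outcome probabilities are already known, is shown below. `sample_counts` is a hypothetical helper and does not reflect how Aer computes or stores probabilities.

```python
import bisect
import itertools
import random
from collections import Counter


def sample_counts(probabilities, shots, seed=0):
    """Draw `shots` outcomes from a fixed distribution, one random number per shot."""
    rng = random.Random(seed)
    outcomes = sorted(probabilities)
    # Cumulative distribution over the sorted outcomes.
    cumulative = list(itertools.accumulate(probabilities[o] for o in outcomes))
    counts = Counter()
    for _ in range(shots):
        r = rng.random() * cumulative[-1]
        # First outcome whose cumulative probability exceeds r.
        idx = bisect.bisect_right(cumulative, r)
        counts[outcomes[min(idx, len(outcomes) - 1)]] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical final distribution of a 2-qubit Bell state.
    print(sample_counts({"00": 0.5, "11": 0.5}, shots=1000))
```

Because each shot consumes a single random number here, the stream of random numbers differs from a run that measures qubit by qubit, which is consistent with the docstring's note that the two approaches may not return identical counts.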
20 changes: 17 additions & 3 deletions qiskit_aer/backends/wrappers/aer_controller_binding.hpp
@@ -182,6 +182,11 @@ void bind_aer_controller(MODULE m) {
[](Config &config, uint_t val) {
config.num_threads_per_device.value(val);
});
// # multi-shot branching
aer_config.def_readwrite("shot_branching_enable",
&Config::shot_branching_enable);
aer_config.def_readwrite("shot_branching_sampling_enable",
&Config::shot_branching_sampling_enable);
// # statevector options
aer_config.def_readwrite("statevector_parallel_threshold",
&Config::statevector_parallel_threshold);
@@ -403,6 +408,10 @@ void bind_aer_controller(MODULE m) {
[](Config &config, uint_t val) {
config.extended_stabilizer_norm_estimation_default_samples.value(val);
});
aer_config.def_property(
"target_gpus",
[](const Config &config) { return config.target_gpus.val; },
[](Config &config, reg_t val) { config.target_gpus.value(val); });

aer_config.def(py::pickle(
[](const AER::Config &config) {
@@ -488,12 +497,14 @@ void bind_aer_controller(MODULE m) {
write_value(77, config.unitary_parallel_threshold),
write_value(78, config.memory_blocking_bits),
write_value(
- 79,
- config.extended_stabilizer_norm_estimation_default_samples));
+ 79, config.extended_stabilizer_norm_estimation_default_samples),
+ write_value(80, config.shot_branching_enable),
+ write_value(81, config.shot_branching_sampling_enable),
+ write_value(82, config.target_gpus));
},
[](py::tuple t) {
AER::Config config;
- if (t.size() != 79)
+ if (t.size() != 82)
throw std::runtime_error("Invalid serialization format.");

read_value(t, 0, config.shots);
@@ -580,6 +591,9 @@ void bind_aer_controller(MODULE m) {
read_value(t, 78, config.memory_blocking_bits);
read_value(t, 79,
config.extended_stabilizer_norm_estimation_default_samples);
read_value(t, 80, config.shot_branching_enable);
read_value(t, 81, config.shot_branching_sampling_enable);
read_value(t, 82, config.target_gpus);
return config;
}));
}
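
The pickling code above writes each field at a fixed index and checks the tuple length when loading. With entries indexed from 0 through N, the tuple holds N + 1 elements, so a hand-maintained length check can drift away from the highest index. Whether the check shown here matches the written tuple depends on the elided entries and on how `write_value` packs values, neither of which is visible in this excerpt. As a sketch under those caveats, one field list can drive the writer, the reader, and the size check; `FIELDS`, `to_state`, and `from_state` are hypothetical names and the field list is shortened.

```python
# Hypothetical, shortened field list; the real Config has many more entries.
FIELDS = (
    "shots",
    "shot_branching_enable",
    "shot_branching_sampling_enable",
    "target_gpus",
)


def to_state(config):
    """Serialize a config dict into a tuple ordered by FIELDS."""
    return tuple(config[name] for name in FIELDS)


def from_state(state):
    """Rebuild a config dict, rejecting tuples of unexpected length."""
    if len(state) != len(FIELDS):
        raise ValueError("Invalid serialization format.")
    return dict(zip(FIELDS, state))


if __name__ == "__main__":
    config = {
        "shots": 1024,
        "shot_branching_enable": False,
        "shot_branching_sampling_enable": False,
        "target_gpus": [0, 2],
    }
    assert from_state(to_state(config)) == config
```

Adding a field then only requires extending `FIELDS`, and the expected length follows automatically.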
30 changes: 30 additions & 0 deletions releasenotes/notes/add_executor-a03f2d23cf6f4ca9.yaml
@@ -0,0 +1,30 @@
---
features:
- |
This release restructures the ``State`` classes.
It adds circuit executor classes that run a circuit and manage multiple
states for multi-shot simulations or for multi-chunk simulations of a large
number of qubits.
Previously the ``StateChunk`` class managed multiple chunks for multi-shot or
multi-chunk simulations, but now a ``State`` class holds only one state
and all the parallelization code is moved to the ``Executor`` classes.
All ``State`` classes are now independent of parallelization.
Some of the functions in the ``Aer::Controller`` class are also moved to the
``CircuitExecutor::Executor`` class.
- |
The shot-branching technique, which accelerates dynamic circuit simulations,
is implemented with the restructured ``Executor`` classes.
Shot-branching is currently applicable to the statevector, density_matrix
and tensor_network methods.
Shot-branching provides dynamic distribution of multiple shots
by branching states when applying dynamic operations
(measure, reset, initialize, noises).
``shot_branching_enable`` is disabled by default.
By setting ``shot_branching_sampling_enable``, final measures are
done by sampling measure, which speeds up getting counts for multiple shots
sharing the same state.
- |
A new option ``target_gpus`` is added to select the GPUs used for the
simulation. A list of target GPU IDs is passed; for example,
``target_gpus=[0, 2]`` selects 2 GPUs to be used.
Without this option, all the available GPUs are used.
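
The three options introduced in these release notes are passed like any other backend option. The sketch below collects them into keyword arguments for `AerSimulator`. `simulator_options` and `make_backend` are hypothetical helpers, the GPU branch assumes a GPU-enabled build with devices 0 and 2 present, and the backend is only constructed when `make_backend` is called, so the option dictionary can be inspected without `qiskit_aer` installed.

```python
def simulator_options(use_gpu=False):
    """Collect the options added in this commit into AerSimulator keyword arguments."""
    options = {
        "method": "statevector",
        "shot_branching_enable": True,           # off by default
        "shot_branching_sampling_enable": True,  # off by default
    }
    if use_gpu:
        # Assumes a GPU-enabled Aer build; the IDs follow the release-note example.
        options["device"] = "GPU"
        options["target_gpus"] = [0, 2]
    return options


def make_backend(use_gpu=False):
    """Build an AerSimulator with the options above (requires qiskit_aer)."""
    from qiskit_aer import AerSimulator  # imported lazily so the sketch loads without it

    return AerSimulator(**simulator_options(use_gpu))


if __name__ == "__main__":
    print(simulator_options())
    print(simulator_options(use_gpu=True))
```

Leaving `target_gpus` out of the options keeps the documented default, in which all available GPUs are used.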