Small fixes and improvements / sparse coefficient matrices #1088

ischoegl · 2021-08-22T14:56:01Z

Changes proposed in this pull request

This PR contains several minor updates that fix small issues accumulated over several PR's (or address inconsistencies that were left over from the 2.5 code base).

Make Plog interface consistent: existing code mixed std::vector<std::pair> and std::multimap (deprecate std::multimap versions)
Make Chebyshev interface consistent: existing code mixed Array2D and vector_fp (deprecate vector_fp versions)
Update internal Chebyshev routines to consistently use column major arrays
Fix warnings in compilation and test suite
Clarify docstring for SCons legacy_rate_constants
Allow for custom C++ NotImplementedError messages (allows for more granular user feedback)
Renamed eigen_dense.h to eigen_defs.h (prepares for use of sparse matrices)
Implement sparse stoichiometric coefficient matrices (StoichManagerN) via Eigen::SparseMatrix
Expose sparse stoichiometric coefficient matrices to Python API via scipy.sparse

Most of these changes are straight-forward; some are meant to preempt questions that may arise in the context of #1051 and #1081. Specifically, the sparse interface is a preamble to sparse Jacobian output for the Python API in #1081.

If applicable, provide an example illustrating new features this pull request is introducing

Almost all updates are internal. The main API change allows for sparse matrix output (implemented for stoichiometric coefficients):

In [1]: import cantera as ct
   ...: gas = ct.Solution("test/data/kineticsfromscratch.yaml")

In [2]: gas.reactant_stoich_coefficients
Out[2]: 
array([[0., 0., 0., 0., 0., 0.],
       [1., 2., 0., 0., 0., 0.],
       [1., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1.],
       [0., 0., 2., 0., 0., 0.],
       [0., 0., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0.]])

In [3]: ct.use_sparse() # enable sparse output via scipy.sparse

In [4]: gas.reactant_stoich_coefficients
Out[4]: 
<9x6 sparse matrix of type '<class 'numpy.float64'>'
	with 9 stored elements in COOrdinate format>

In [5]: print(gas.reactant_stoich_coefficients)
  (1, 0)	1.0
  (2, 0)	1.0
  (1, 1)	2.0
  (4, 2)	2.0
  (2, 3)	1.0
  (5, 3)	1.0
  (8, 4)	1.0
  (3, 5)	1.0
  (5, 5)	1.0

In [6]: gas.reactant_stoich_coefficients.toarray()
Out[6]: 
array([[0., 0., 0., 0., 0., 0.],
       [1., 2., 0., 0., 0., 0.],
       [1., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1.],
       [0., 0., 2., 0., 0., 0.],
       [0., 0., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0.]])

Checklist

The pull request includes a clear description of this code change
Commit messages have short titles and reference relevant issues
Build passes (scons build & scons test) and unit tests address code coverage
Style & formatting of contributed code follows contributing guidelines
The pull request is ready for review

Other thoughts

I also considered using Eigen::MatrixXd for the external Chebyshev interface (2-D matrix of coefficients), but ended up staying closer to the previous implementation.

~~For simplicity, the Eigen::SparseMatrix interface to Python is implemented using a C preprocessor; the alternative would be to expose some Eigen classes in _cantera.pxd.~~

codecov · 2021-08-22T15:44:15Z

Codecov Report

Merging #1088 (842b149) into main (cf085e4) will increase coverage by 0.03%.
The diff coverage is 91.42%.

❗ Current head 842b149 differs from pull request most recent head c4e4a7b. Consider uploading reports for the commit c4e4a7b to get more accurate results

@@            Coverage Diff             @@
##             main    #1088      +/-   ##
==========================================
+ Coverage   73.46%   73.49%   +0.03%     
==========================================
  Files         364      364              
  Lines       47667    47809     +142     
==========================================
+ Hits        35018    35138     +120     
- Misses      12649    12671      +22

Impacted Files	Coverage Δ
include/cantera/base/Units.h	`100.00% <ø> (ø)`
include/cantera/base/YamlWriter.h	`100.00% <ø> (ø)`
include/cantera/base/ctexceptions.h	`73.68% <ø> (ø)`
include/cantera/kinetics/GasKinetics.h	`100.00% <ø> (ø)`
include/cantera/kinetics/InterfaceKinetics.h	`100.00% <ø> (ø)`
include/cantera/kinetics/KineticsFactory.h	`100.00% <ø> (ø)`
include/cantera/kinetics/ReactionRate.h	`61.33% <ø> (ø)`
include/cantera/thermo/ThermoFactory.h	`100.00% <ø> (ø)`
src/kinetics/ReactionRate.cpp	`94.25% <ø> (ø)`
test/kinetics/rates.cpp	`100.00% <ø> (ø)`
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf085e4...c4e4a7b. Read the comment docs.

ischoegl · 2021-08-24T01:34:25Z

@speth ... as #1081 was starting to exceed 2k lines, I decided to push a portion that includes relevant changes to the API first. In addition, there is some streamlining of code that arose in the context of #1051.

'scons help' reports setting values in terms of 'yes' and 'no' rather than boolean flags that are otherwise used internally. This commit ensures that documentation for the newly added flag 'legacy-rate-constants' is consistent with this convention; accordingly, the setting is either enabled or disabled.

Compilation with GNU g++ currently raises the following warning. warning: 'nSolnValues' may be used uninitialized in this function [-Wmaybe-uninitialized] The warning is resolved by providing an (invalid) initial value; the value is overwritten by a subsequent (pre-existing) if-else tree.

Resolve GNU g++ warning: warning: unused variable 'p' [-Wunused-variable]

speth

I like the addition of the option to use sparse matrices for the stoichiometric coefficients, but I think there are a few tweaks needed to get it to make full use of the sparsity.

The one change here that I don't really understand is to the data structure used to initialize Plog objects.

include/cantera/cython/wrappers.h

src/kinetics/RxnRates.cpp

include/cantera/numerics/eigen_defs.h

interfaces/cython/cantera/kinetics.pyx

ischoegl · 2021-09-03T22:53:31Z

@speth ... thank you for the comments. I'll respond to some of your points right away, and should be able to get back to this before long.

* Use consistent interface based on Array2D * Use column-major consistently in internal calculations * Deprecate Chebyshev::coeffs and Chebyshev::setup

Several internal methods defined for C1, C2, C3, and C_AnyN are unused; as there are no simple accessor methods defined in StoichManagerN, deprecation appears not to be required.

The 'Kinetics::finalizeSetup' method executes code after all reactions have been added. The new flag 'finalize' is added to 'Kinetics::addReaction' which determines whether 'finalizeSetup' is executed. While the default is 'true', imports within 'KineticsManager' use 'addReaction' with the flag set to 'false'.

as Kinetics::m_irrevProductManager is only used in conjunction with Kinetics::m_revProductManager, a new Kinetics::m_productManager provides for simpler code.

Deprecate Tmin/Tmax, Pmin/Pmax as well as constructors to use std::pair<double,double> input/output

Deprecate Kinetics.*_stoich_coeffs() method in favor of property: Behavior will change after Cantera 2.6; for new behavior, new properties Kinetics.*_stoich_coefficients are implemented

This commit unifies parameter handling of Plog rate setters and getters to use std::multimap<...>.

ischoegl · 2021-09-05T15:07:07Z

@speth ... thanks again for the feedback (and I'm happy you liked the sparse coefficient interface, despite the naming squabbles).

I believe I took care of everything you mentioned thus far. The interface to Eigen::SparseMatrix no longer uses excessive memory and the Plog/multimap issue is taken care of. Regarding the Python coefficient getter, I opted for a transitional name, so the old names won't change long-term.

One new significant change is that I streamlined Chebyshev to consistently use range pairs, e.g.

In [1]: import cantera as ct
   ...: gas = ct.Solution("h2o2.yaml")

In [2]:  rxn = ct.ChebyshevReaction(
   ...:             equation="HO2 <=> OH + O",
   ...:             rate={"temperature-range": [290.0, 3000.0],
   ...:                   "pressure-range": [1e3, 1e8],
   ...:                   "data": [[8.2883, -1.1397, -0.12059, 0.016034],
   ...:                            [1.9764, 1.0037, 7.2865e-03, -0.030432],
   ...:                            [0.3177, 0.26889, 0.094806, -7.6385e-03]]},
   ...:             kinetics=gas)

In [3]: rxn.rate.temperature_range # previously rxn.Tmin / rxn.Tmax
Out[3]: (290.0, 3000.0)

In [4]: rxn.rate.pressure_range # previously rxn.Pmin / rxn.Pmax
Out[4]: (1000.0, 100000000.0)

with corresponding changes in the C++ underpinnings (setters/getters/constructors).

Edit: Writing this, I realized that rxn.rate.coeffs really should be rxn.rate.data to be consistent with YAML, which is now fixed:

In [5]: rxn.rate.data  # previously rxn.coeffs
Out[5]: 
array([[ 8.2883e+00, -1.1397e+00, -1.2059e-01,  1.6034e-02],
       [ 1.9764e+00,  1.0037e+00,  7.2865e-03, -3.0432e-02],
       [ 3.1770e-01,  2.6889e-01,  9.4806e-02, -7.6385e-03]])

Finally, the transitional names just end with 3, i.e.

In [6]: ct.use_sparse() # enable sparse output via scipy.sparse
   ...: gas.reactant_stoich_coeffs3
   ...: 
Out[6]: 
<10x29 sparse matrix of type '<class 'numpy.float64'>'
	with 50 stored elements in COOrdinate format>

ischoegl · 2021-09-13T17:30:07Z

🎉 ... finally found something that will help exposing sparse matrices without copying ... it does look like scipy.sparse.csc_matrix can be initialized from the data structure used by compressed sparse matrices inEigen

csc_matrix((data, indices, indptr), [shape=(M, N)])

is the standard CSC representation where the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.

ischoegl · 2021-09-15T03:37:55Z

Switching to CSC (from COO) format in scipy.sparse leverages the internal Eigen storage order, and has some impact on performance:

In [2]: gas = ct.Solution("gri30.yaml")
   ...: dense = gas.reactant_stoich_coeffs3

In [3]: %timeit gas.reactant_stoich_coeffs3
10000 loops, best of 3: 64.3 us per loop

In [4]: ct.use_sparse()

In [5]: sparse = gas.reactant_stoich_coeffs3
   ...: sparse
Out[5]: 
<53x325 sparse matrix of type '<class 'numpy.float64'>'
	with 626 stored elements in Compressed Sparse Column format>

In [6]: (sparse == dense).all()
Out[6]: True

In [7]: %timeit gas.reactant_stoich_coeffs3
The slowest run took 4.78 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 48 us per loop

In [8]: %timeit gas.reactant_stoich_coeffs3_prev # previous implementation outputting COO matrix
The slowest run took 9.96 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 61.6 us per loop

These tests show a 20% speed improvement with CSC (not a major saving); CSC is, however, better suited for matrix-vector products, so the switch makes sense overall.

This change leverages the default internal storage format in Eigen (CSC)

speth

Thanks for the updates, @ischoegl. I'm glad to see the reasonably efficient interface for exposing Eigen's sparse matrices to Python. I think that bodes well for some of the other things that I expect are on the horizon like Jacobians.

The change from Plog::setup to Plog::setRates makes sense, although I think the new function name implies that you could call it again to replace the existing rates and the current implementation doesn't support that as it leaves the existing contents of rates_ and pressures_ in tact.

I don't really understand the motivation for packing Chebyshev rate Tmin/Tmax and Pmin/Pmax into std::pair. All the extra use of std::make_pair and foo.first/foo.second that's now required doesn't really seem like an improvement over the old interface. In Python, on the other hand, implicit tuple packing/unpacking is clean and idiomatic, so I think that's a reasonable API change.

include/cantera/kinetics/StoichManager.h

include/cantera/cython/wrappers.h

include/cantera/kinetics/StoichManager.h

include/cantera/kinetics/Kinetics.h

include/cantera/kinetics/RxnRates.h

include/cantera/kinetics/StoichManager.h

interfaces/cython/cantera/_cantera.pxd

include/cantera/kinetics/Kinetics.h

ischoegl · 2021-09-16T21:33:52Z

@speth ... thank you for your comments!

The change from Plog::setup to Plog::setRates makes sense, although I think the new function name implies that you could call it again to replace the existing rates and the current implementation doesn't support that as it leaves the existing contents of rates_ and pressures_ in tact.

Good catch! Some of the renaming here pre-empts #1051 (which is currently on hold), so it will be possible to change that eventually even in memory.

I don't really understand the motivation for packing Chebyshev rate Tmin/Tmax and Pmin/Pmax into std::pair. All the extra use of std::make_pair and foo.first/foo.second that's now required doesn't really seem like an improvement over the old interface. In Python, on the other hand, implicit tuple packing/unpacking is clean and idiomatic, so I think that's a reasonable API change.

I partially reverted those changes to C++. I guess I wanted to make things as similar as possible for YAML / Python and C++, but for the latter I agree that the resulting API change wasn't great. The updated Python API remains of course.

Regarding Kinetics::finalizeSetup() I see your point, but there is indeed a need to call it at a different time than Kinetics::init() ... let me know if you can think of a better name, but initialize would be imho too confusing.

ischoegl · 2021-09-17T03:21:37Z

@speth ... after settling on Kinetics::resizeReactions (and StoichManagerN::resizeCoeffs), I believe everything is taken care of and this should be ready to be merged. Thanks for the review and the suggestions!

Update nomenclature for previously introduced finalizeSetup method; new names are Kinetics::resizeReactions and StoichManagerN::resizeCoeffs

speth

Thanks, @ischoegl! This looks good to me.

ischoegl · 2021-09-17T16:21:07Z

@speth ... thanks again for the review. Now that this is merged, I will post a new PR wrapping up GasKinetics (much of this is done, but I'll have to clean up some of the commit history).

None of my other pending PR's are close to being merged, as most need to be rebased at this point.

ischoegl force-pushed the small-fixes branch from eccf9fc to 12a4bcd Compare August 22, 2021 15:21

ischoegl force-pushed the small-fixes branch 3 times, most recently from 2c0cc44 to 1b11a99 Compare August 24, 2021 00:34

ischoegl changed the title ~~Small fixes and improvements~~ Small fixes and improvements / sparse coefficient matrices Aug 24, 2021

ischoegl requested a review from a team August 24, 2021 01:29

ischoegl mentioned this pull request Aug 24, 2021

Proof-of-concept: Modify reaction rates in memory #1051

Closed

5 tasks

ischoegl added the Kinetics label Aug 24, 2021

ischoegl force-pushed the small-fixes branch from b1aa24b to 48a77e2 Compare August 25, 2021 17:02

This was referenced Aug 25, 2021

Sparse Jacobians for GasKinetics #1089

Merged

add Tsang closed-form falloff #1090

Merged

ischoegl added 5 commits September 3, 2021 13:18

[UnitTest] Fix true_divide RuntimeWarning in test_kinetics.py

b9a7eb8

Allow for custom NotImplementedError messages

2091536

[Unittests] Resolve unused variable warning in test-surfSolver2

3d03164

Resolve GNU g++ warning: warning: unused variable 'p' [-Wunused-variable]

ischoegl force-pushed the small-fixes branch from bd9dca4 to 04e6b1d Compare September 3, 2021 18:18

speth requested changes Sep 3, 2021

View reviewed changes

ischoegl force-pushed the small-fixes branch 2 times, most recently from f6d7434 to a969367 Compare September 5, 2021 02:15

ischoegl added 7 commits September 5, 2021 07:54

[Numerics] Update eigen_dense.h and introduce eigen_sparse.h

d6afa44

[Kinetics] Update ChebyshevRate interface

3fcbc13

* Use consistent interface based on Array2D * Use column-major consistently in internal calculations * Deprecate Chebyshev::coeffs and Chebyshev::setup

[Kinetics] Remove unused functions from internal StoichManager.h classes

ef7c81b

Several internal methods defined for C1, C2, C3, and C_AnyN are unused; as there are no simple accessor methods defined in StoichManagerN, deprecation appears not to be required.

[Kinetics] Add sparse stoichiometric coefficients in StoichManager.h

c409e25

[Kinetics] Merge irreversible and reversible product StoichManagerN

3f2dc1e

as Kinetics::m_irrevProductManager is only used in conjunction with Kinetics::m_revProductManager, a new Kinetics::m_productManager provides for simpler code.

[Kinetics] Use sparse stoichcoeff matrices for internal calculations

9e31a12

ischoegl added 6 commits September 5, 2021 07:55

[Unittest] Skip some tests if h5py is not installed

6cd9e31

[Kinetics] Streamline Chebyshev interface

b492434

Deprecate Tmin/Tmax, Pmin/Pmax as well as constructors to use std::pair<double,double> input/output

[Unittest] Update Chebyshev tests

2ac8e8c

[Python] Make sparse API more efficient and deprecate *stoich methods

d9a446b

Deprecate Kinetics.*_stoich_coeffs() method in favor of property: Behavior will change after Cantera 2.6; for new behavior, new properties Kinetics.*_stoich_coefficients are implemented

[Unittest] Add test for sparse matrix output

5ff68c3

[Kinetics] Standardize Plog rates access

79e62d8

This commit unifies parameter handling of Plog rate setters and getters to use std::multimap<...>.

ischoegl force-pushed the small-fixes branch from a969367 to 79e62d8 Compare September 5, 2021 14:48

[Kinetics] Access Chebyshev coefficients as 'data'

463598a

ischoegl requested a review from speth September 5, 2021 16:32

[Kinetics] Use compressed sparse matrix format

031a7e9

[Python] Switch sparse matrix output to CSC (from COO) format

5394540

This change leverages the default internal storage format in Eigen (CSC)

ischoegl force-pushed the small-fixes branch from 41338c1 to 5394540 Compare September 15, 2021 03:49

speth requested changes Sep 16, 2021

View reviewed changes

ischoegl force-pushed the small-fixes branch from 7d37cc1 to 85f019d Compare September 16, 2021 21:00

ischoegl added 2 commits September 16, 2021 16:08

Update formatting

a9901f5

[Kinetics] Partially roll back changes to Chebyshev

202c2bf

ischoegl force-pushed the small-fixes branch from 85f019d to 202c2bf Compare September 16, 2021 21:09

[Kinetics] Ensure member variables are reset when calling Plog::setRates

5a2999f

[Kinetics] Improve nomenclature for final step of Kinetics setup

c4e4a7b

Update nomenclature for previously introduced finalizeSetup method; new names are Kinetics::resizeReactions and StoichManagerN::resizeCoeffs

ischoegl force-pushed the small-fixes branch from 976527e to c4e4a7b Compare September 17, 2021 03:24

speth approved these changes Sep 17, 2021

View reviewed changes

speth merged commit c4d6ecc into Cantera:main Sep 17, 2021

ischoegl deleted the small-fixes branch September 17, 2021 16:17

ischoegl mentioned this pull request Sep 18, 2021

Refactor Reaction Rate Evaluators Cantera/enhancements#87

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Small fixes and improvements / sparse coefficient matrices #1088

Small fixes and improvements / sparse coefficient matrices #1088

ischoegl commented Aug 22, 2021 •

edited

Loading

codecov bot commented Aug 22, 2021 •

edited

Loading

ischoegl commented Aug 24, 2021 •

edited

Loading

speth left a comment

ischoegl commented Sep 3, 2021

ischoegl commented Sep 5, 2021 •

edited

Loading

ischoegl commented Sep 13, 2021 •

edited

Loading

ischoegl commented Sep 15, 2021

speth left a comment

ischoegl commented Sep 16, 2021 •

edited

Loading

ischoegl commented Sep 17, 2021 •

edited

Loading

speth left a comment

ischoegl commented Sep 17, 2021

Small fixes and improvements / sparse coefficient matrices #1088

Small fixes and improvements / sparse coefficient matrices #1088

Conversation

ischoegl commented Aug 22, 2021 • edited Loading

codecov bot commented Aug 22, 2021 • edited Loading

Codecov Report

ischoegl commented Aug 24, 2021 • edited Loading

speth left a comment

Choose a reason for hiding this comment

ischoegl commented Sep 3, 2021

ischoegl commented Sep 5, 2021 • edited Loading

ischoegl commented Sep 13, 2021 • edited Loading

ischoegl commented Sep 15, 2021

speth left a comment

Choose a reason for hiding this comment

ischoegl commented Sep 16, 2021 • edited Loading

ischoegl commented Sep 17, 2021 • edited Loading

speth left a comment

Choose a reason for hiding this comment

ischoegl commented Sep 17, 2021

ischoegl commented Aug 22, 2021 •

edited

Loading

codecov bot commented Aug 22, 2021 •

edited

Loading

ischoegl commented Aug 24, 2021 •

edited

Loading

ischoegl commented Sep 5, 2021 •

edited

Loading

ischoegl commented Sep 13, 2021 •

edited

Loading

ischoegl commented Sep 16, 2021 •

edited

Loading

ischoegl commented Sep 17, 2021 •

edited

Loading