
Fix dimensionality #146

Merged · 6 commits · Mar 29, 2022

Conversation

iangrooms
Member

This PR addresses the problem of dimensionality described in #136. A quick summary: If a user tries any of the REGULAR Laplacians and uses a value of dx_min other than 1, they will get wrong answers. This PR will allow users to put in dimensional values of dx_min and filter_scale (in any system of units) and still get the right answers.

I am attempting to be minimally invasive. I've added a boolean class attribute is_nondimensional to all of the Laplacians. If the Laplacian is nondimensional, then at each step of the filter (in filter.py) it re-dimensionalizes by dividing by dx_min ** 2. The filter_func's don't have access to dx_min so I added dx_min_sq = dx_min ** 2 to the filter spec.
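The mechanism described above can be sketched as follows. The class and function names here are illustrative stand-ins, not the actual gcm-filters API:

```python
import numpy as np

# Hypothetical sketch of the fix: a nondimensional Laplacian plus a
# filter step that re-dimensionalizes by dividing by dx_min ** 2.
class RegularLaplacian:
    is_nondimensional = True  # assumes unit-square grid cells

    def __call__(self, field):
        # 5-point nondimensional Laplacian with periodic boundaries
        return (
            np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
            + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
            - 4 * field
        )

def filter_step(laplacian, field, dx_min):
    """One Laplacian evaluation inside the filter loop."""
    tendency = laplacian(field)
    if laplacian.is_nondimensional:
        tendency /= dx_min ** 2  # re-dimensionalize
    return tendency
```

With this sketch, a constant field yields a zero tendency regardless of `dx_min`, and doubling `dx_min` scales the tendency by one quarter.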

I initially started with @NoraLoose's suggestion (from #136) of defining new Laplacian classes, but as I worked on it it seemed like it would require bigger changes. For example, _create_filter_func accepts a BaseScalarLaplacian and I didn't want to have to define four new versions of _create_filter_func (scalar vs vector; dimensional vs non).

I also added a Note in the Basic Filtering section of the docs, saying that any grid_vars should have units that are consistent with filter_scale and dx_min, but that the specific system of units doesn't matter.

Adds a boolean `is_nondimensional` to the Laplacian classes. When True, the filter scales by `dx_min ** 2` to restore dimensionality. Had to add `dx_min_sq` (equal to `dx_min ** 2`) to the `filter_spec` to make it work.
@iangrooms iangrooms added the bug Something isn't working label Feb 15, 2022
@iangrooms iangrooms linked an issue Feb 15, 2022 that may be closed by this pull request
```python
ufield_bar += (
    utemp_l * 2 * np.real(s_b) / np.abs(s_b) ** 2
    + utemp_b * 1 / np.abs(s_b) ** 2
)
```
Member
Lines 271-274 should not be part of the if-statement.

Member Author

Good catch! That would have been bad.

Member

Wondering why the tests did not catch this? 😕

Contributor

@rabernat rabernat left a comment


It's great to see this fix, Ian! I can see that this resolves the original bug.

I'd like to brainstorm a bit whether there is a simpler way to do this, which doesn't require us to carry around this is_nondimensional attribute with all of the Laplacians.

Looking at the code I am struck by the fact that the is_nondimensional condition seems to be closely tied to the value of dx_min. Would it be possible to simply define a nondimensional Laplacian as a Laplacian for which dx_min==1? Are there scenarios that would break this assumption?

A corollary of this is the suggestion that perhaps we want to attach dx_min to the Laplacian / grid, rather than to the Filter itself. After all, dx_min is a property of the grid itself, not the filter.

Thanks for considering these comments. I'm just starting to really dig into this, but I know you have already thought this through deeply.

@NoraLoose
Member

NoraLoose commented Feb 16, 2022

Great work, Ian!

If we stick to the is_nondimensional attribute, we should add some tests to this PR. In an ideal scenario, we want to test every newly added line to the code. Based on this, I can think of 2 tests:

  1. A test that checks that the is_nondimensional attribute is there for each Laplacian and has the expected value (e.g., to make sure a future PR does not accidentally change is_nondimensional from True to False for a Laplacian). Such a test could be added to test_kernels.py and could follow this structure:

     ```python
     @pytest.mark.parametrize("Laplacian, expected", [...])
     def test_nondimensional(Laplacian, expected):
         laplacian = Laplacian()
         assert laplacian.is_nondimensional == expected
     ```

  2. A test for the added lines in filter.py. For example, we could test whether the filter gives the same result if we put in values for dx_min, area, etc. in cm vs. km.
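The second test could be sketched roughly like this. The toy filter below is a hypothetical stand-in for the real one; only the units-invariance logic matters:

```python
import numpy as np

# Hypothetical sketch of the units-invariance test: the same physical
# setup expressed in km and in cm should give the same filtered field.
def nondim_laplacian(field):
    # toy nondimensional 5-point Laplacian, periodic boundaries
    return (np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
            + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
            - 4 * field)

def apply_filter(field, filter_scale, dx_min, n_steps=3):
    # Each step re-dimensionalizes the Laplacian by dx_min**2, so the
    # result depends only on the ratio filter_scale / dx_min.
    field_bar = field.copy()
    for _ in range(n_steps):
        tendency = nondim_laplacian(field_bar) / dx_min ** 2
        field_bar = field_bar + (filter_scale ** 2 / 24) * tendency
    return field_bar

rng = np.random.default_rng(0)
data = rng.random((8, 8))
result_km = apply_filter(data, filter_scale=4.0, dx_min=1.0)      # km
result_cm = apply_filter(data, filter_scale=4.0e5, dx_min=1.0e5)  # cm
np.testing.assert_allclose(result_km, result_cm)
```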

Let me know if you would like help with these tests. But obviously, before implementing these test cases in test_*, we should agree on how we want to implement the code changes in kernels.py and filter.py.

@iangrooms
Member Author

Replying to @rabernat:

Would it be possible to simply define a nondimensional Laplacian as a Laplacian for which dx_min==1?

The nondimensional Laplacians don't just have minimum grid spacing 1; they have every cell being a unit square. Identifying regular Laplacians with dx_min==1 could break if, for example, there was a variable grid whose minimum was coincidentally 1 km (or 1 m or 1 cm, etc). The user would set dx_min=1 and then the code would mistakenly think it was using a regular/nondimensional Laplacian.

perhaps we want to attach dx_min to the Laplacian / grid, rather than to the Filter itself

The Filter does need to know about dx_min. The Filter object obtains a polynomial approximation that is accurate over a range of wavenumbers, with the highest wavenumber in the range set by dx_min.
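To illustrate the point numerically (with assumed numbers and an assumed Gaussian-like target, not the actual gcm-filters internals): the polynomial must approximate the target response over a range of Laplacian eigenvalues whose upper end grows as dx_min shrinks.

```python
import numpy as np

# Illustrative sketch only: the (pi / dx_min)**2 eigenvalue bound and the
# Gaussian-like target response below are assumptions for illustration.
dx_min = 2.0
filter_scale = 8.0
s_max = (np.pi / dx_min) ** 2      # largest Laplacian eigenvalue (assumed)
s = np.linspace(0.0, s_max, 200)   # eigenvalue range the polynomial must cover
target = np.exp(-s * (filter_scale / (2 * np.pi)) ** 2)  # assumed target

# Fit a low-degree Chebyshev polynomial over [0, s_max]; halving dx_min
# quadruples s_max, so the fit must stay accurate over a wider range.
coeffs = np.polynomial.chebyshev.chebfit(s, target, deg=8)
max_err = np.max(np.abs(np.polynomial.chebyshev.chebval(s, coeffs) - target))
```

The design consequence is that dx_min cannot be dropped from the Filter: it fixes the interval on which the polynomial approximation must be accurate.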

perhaps we want to attach dx_min to the Laplacian / grid, rather than to the Filter itself

As noted above, the filter needs dx_min. But maybe your suggestion points to a different approach. My proposed fix has the same effect as changing the grid size in the regular Laplacians from 1 to dx_min, but without having to change the actual Laplacian kernel code. Instead of having the Laplacian kernel know about the grid size (which would be natural), I hacked _create_filter_func to fix the output of the nondimensional Laplacians. Not sure this is the best way, but it does seem to work.

A more natural alternative would be to add dx_min (or maybe just dx) to the grid_vars for the REGULAR Laplacians, and then change the code of the Laplacian kernels so that they divide by dx ** 2 right before returning the result. I didn't choose this path because it would break all of the code that currently uses the REGULAR Laplacians, in addition to requiring a fairly major overhaul of kernels.py. If there's a strong argument for doing it this way even if it breaks backwards compatibility, then I'm happy to code it up. Curious to hear what you think.
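That alternative could look roughly like this (hypothetical names, not the actual kernels.py code):

```python
import numpy as np
from dataclasses import dataclass

# Hypothetical sketch of the alternative: the kernel itself receives dx
# (e.g. via grid_vars) and divides by dx**2 before returning, so the
# filter loop never needs an is_nondimensional check.
@dataclass
class RegularLaplacianWithDx:
    dx: float  # uniform grid spacing, supplied like a grid_var

    def __call__(self, field):
        lap = (np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
               + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
               - 4 * field)
        return lap / self.dx ** 2  # dimensionalize inside the kernel
```

The trade-off mentioned above is visible here: every existing caller would now have to supply dx, which is the backwards-compatibility break.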

@iangrooms
Member Author

I added some tests as suggested by @NoraLoose. In test_kernels.py I add a test that just checks if the is_nondimensional attribute has the correct value. In test_filters.py I run the REGULAR filter on test data with dx_min=1, filter_scale=4 and again with dx_min=2 and filter_scale=8, and make sure the results are the same.

@codecov-commenter

codecov-commenter commented Feb 25, 2022

Codecov Report

Merging #146 (85f38a8) into master (766f624) will decrease coverage by 0.50%.
The diff coverage is 88.00%.

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
- Coverage   98.49%   97.98%   -0.51%     
==========================================
  Files           9        9              
  Lines         997     1044      +47     
==========================================
+ Hits          982     1023      +41     
- Misses         15       21       +6     
| Flag | Coverage | Δ |
|---|---|---|
| unittests | 97.98% <88.00%> | (-0.51%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage | Δ |
|---|---|---|
| gcm_filters/filter.py | 96.46% <70.00%> | (-2.58%) ⬇️ |
| gcm_filters/kernels.py | 98.61% <100.00%> | (+0.04%) ⬆️ |
| tests/test_filter.py | 99.56% <100.00%> | (+0.01%) ⬆️ |
| tests/test_kernels.py | 100.00% <100.00%> | (ø) |

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@iangrooms
Member Author

The reason code coverage decreased is that I implemented the fix for non-dimensional vector Laplacians, but we don't have any non-dimensional vector Laplacians, so those lines don't get touched by the tests.

@rabernat
Contributor

rabernat commented Mar 3, 2022

Hi Ian! Thanks for continuing to work on this PR. I have been a bit overwhelmed with getting ready for / participating in OSM for the past two weeks. Next week I intend to spend some time reviewing this PR in detail. (I am still confused about some parts of this and need to take some time to really understand what is happening.) Thanks for your patience.

Contributor

@rabernat rabernat left a comment


This is great Ian. Sorry for taking so long to review this. I really appreciate your patience.

A minor nit: I would prefer dimensional = True rather than is_nondimensional = False, since I find double negatives confusing. If you agree, you can change the name and invert the parameter. But this is annoying and not that important, so feel free to ignore that suggestion.

The only real suggestion I have is a minor performance optimization.

```diff
@@ -191,13 +194,18 @@ def filter_func(field, *args):
     if filter_spec.is_laplacian[i]:
         s_l = np.real(filter_spec.s[i])
         tendency = laplacian(field_bar)  # Compute Laplacian
+        if Laplacian.is_nondimensional:
+            tendency /= filter_spec.dx_min_sq  # dimensionalize
```
Contributor

It would be great to be able to skip this step if dx_min_sq==1, which will probably often be the case.

Perhaps we could create a helper function like this

```python
def _maybe_dimensionalize(data, dx_min_sq):
    if dx_min_sq == 1:
        return data
    else:
        return data / dx_min_sq
```

and use it wherever the normalization occurs

Member Author

This is great Ian. Sorry for taking so long to review this. I really appreciate your patience.

No problem!

A minor nit: I would prefer dimensional = True rather than is_nondimensional = False, since I find double negatives confusing. If you agree, you can change the name and invert the parameter. But this is annoying and not that important, so feel free to ignore that suggestion.

Good idea. Avoiding the double negative will not fail (!) to make it clearer.

The only real suggestion I have is a minor performance optimization.

I don't understand how adding a function would improve performance here, probably because I have a wrong idea for how to use the new function.

But I do agree that the if statement is going to slow the whole code down unnecessarily. How about wrapping the whole iteration block in an if statement? If dimensional, then we iterate as-is; if not dimensional, then we divide by dx_min_sq at every iteration. Right now the if statement hits every time we use the Laplacian. If we instead put the test outside the loop, then the if statement only hits once.
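The hoisting idea could be sketched like this (hypothetical names, not the actual filter_func):

```python
import numpy as np

# Illustrative sketch of moving the dimensionality check out of the
# filter loop: the branch is evaluated once when the step function is
# built, and the returned callable contains no per-iteration test.
def laplacian(field):
    # toy nondimensional 5-point Laplacian, periodic boundaries
    return (np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
            + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
            - 4 * field)

def make_step(laplacian_func, dx_min_sq, is_nondimensional):
    if is_nondimensional and dx_min_sq != 1:
        return lambda field: laplacian_func(field) / dx_min_sq
    return laplacian_func  # no branch and no division at all

step = make_step(laplacian, dx_min_sq=4.0, is_nondimensional=True)
field = np.arange(16, dtype=float).reshape(4, 4)
out = step(field)  # divides by dx_min_sq without re-testing per iteration
```

When `dx_min_sq == 1`, `make_step` hands back the Laplacian unchanged, so the common nondimensional case pays nothing.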

Contributor

Rather than saying

```python
tendency /= filter_spec.dx_min_sq
```

you would say

```python
tendency = _maybe_dimensionalize(tendency, filter_spec.dx_min_sq)
```

You could accomplish the same thing with yet another nested if block, but I just find that syntax a little easier. Calling the function and then immediately returning the same array back is cheap. The expensive part is dividing a big array by 1 unnecessarily; that's what we want to avoid.

Alternatively, you could change all of the

```python
if Laplacian.is_nondimensional:
```

calls to

```python
if Laplacian.is_nondimensional and filter_spec.dx_min_sq != 1:
```

to accomplish the same thing.

Member Author

Oh, I see. I thought it was the if statement that was the problem.

I think your fix is cleaner than mine, so let's go with that. I can work on both the double negative and the `_maybe_dimensionalize` helper, and push something later today.

Contributor

Awesome. When you're done feel free to self-merge so this is not blocking on me any longer.

Since black is evidently broken at the moment, I had to also pin version 8.0.4.
Once it gets fixed this can be reverted.
@iangrooms
Member Author

Had some problems related to this psf/black#2964. My changes to the pre-commit configuration can be rolled back once this is fixed.

@iangrooms iangrooms merged commit e4177da into ocean-eddy-cpt:master Mar 29, 2022
@iangrooms iangrooms deleted the fix_dimensionality branch March 29, 2022 15:42
Successfully merging this pull request may close these issues.

[BUG] Inconsistent dimensionality of discrete Laplacians