Warnings about bad input data #152

awst-baum · 2018-09-17T14:31:46Z

As far as I can see: if there's "bad" input data, pytesmo usually either drops it or issues a (sometimes quite generic) warning.
Examples are:

- pytesmo.validation_framework.data_manager.DataManager.read_ds: warnings are given, but the exception and sometimes the dataset name and argument information are omitted.
- pytesmo.temporal_matching.df_match, lines 90-117: if there are no matches between data and reference, no warning is given and an empty (or NaN-filled) DataFrame is returned.

Is there a generic philosophy behind this, like "don't bother the user at all, just give them the results we can produce and let them look into missing or faulty data themselves"?

Since we're currently trying to build a user-friendly webservice that uses pytesmo for validations, we'd like to tell the user not only "x% of your input data didn't yield results" but also ideally why that was the case. But that may clash with the more Python-developer-oriented approach pytesmo has?
Would you be open to us adding more warnings? How much would be too much?


cpaulik · 2018-09-17T15:55:28Z

If the dataset reading fails, only the reader class can issue a specific warning, since pytesmo cannot know why the reading failed. We can of course add the requested gpi or lon, lat and the data source name to the pytesmo-level warning.

For the temporal matching we can add a warning if no matches are found, probably in the validation framework, since the temporal matcher does not have all the info needed to issue a good warning. I could also imagine a strict mode or something like that which raises an exception for these failures.

A more general question is whether a warning is enough for your purposes. Would you not prefer a results object with more detailed information about the step at which a validation failed?
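A minimal sketch of what a more informative pytesmo-level warning could look like, using only the standard library. The names `read_with_context` and `failing_reader`, the dataset name, and the message format are all hypothetical and are not part of pytesmo; the real change would presumably go into `DataManager.read_ds`.

```python
import warnings


def read_with_context(reader, dataset_name, *args):
    """Call reader(*args); on failure, warn with full context and return None."""
    try:
        return reader(*args)
    except Exception as exc:  # only the reader knows the specific cause
        warnings.warn(
            f"Reading dataset '{dataset_name}' with arguments {args} failed: "
            f"{type(exc).__name__}: {exc}",
            UserWarning,
            stacklevel=2,
        )
        return None


def failing_reader(gpi):
    # Hypothetical reader that has no data for the requested grid point.
    raise IOError(f"no data for gpi {gpi}")


# Record the warning so that it can be inspected instead of only printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = read_with_context(failing_reader, "ASCAT", 1234)

message = str(caught[0].message)
print(message)
```

The warning now carries the dataset name, the requested arguments, and the original exception text, which is the information a webservice would need to explain a failure to its user.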


awst-baum · 2018-09-17T16:13:29Z

> I could also imagine a strict mode or something like that which raises an exception for these failures.

Might be done with https://docs.python.org/3/library/warnings.html#the-warnings-filter ?

Re results object: I hadn't thought that far. It sounds promising/interesting but may be a major change, right? A tricky part may be storing the results into a netcdf file when they contain error reports as well as results arrays.
For the webservice, we're looking at both short-term and long-term solutions.

PS: I'm currently playing around in a branch here but haven't done too much yet: https://github.com/awst-austria/pytesmo/tree/verbose_warnings
I need to define some unit tests...

cpaulik · 2018-09-18T15:33:32Z

> Might be done with https://docs.python.org/3/library/warnings.html#the-warnings-filter ?

Yes that should work fine.
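As a rough illustration of the warnings-filter idea: the standard-library filter can turn warnings into exceptions, which gives a "strict mode" without changing the code that issues the warning. The functions `match` and `run` below are hypothetical stand-ins, not pytesmo functions.

```python
import warnings


def match(n_matches):
    """Hypothetical stand-in for a step that warns when nothing matched."""
    if n_matches == 0:
        warnings.warn("no temporal matches found", UserWarning)
    return n_matches


def run(strict):
    """Run the step with warnings either raised (strict) or silenced."""
    with warnings.catch_warnings():
        # "error" turns matching warnings into exceptions; "ignore" drops them.
        warnings.simplefilter("error" if strict else "ignore", UserWarning)
        try:
            match(0)
            return "completed"
        except UserWarning as exc:
            return f"failed: {exc}"


print(run(strict=False))  # completed
print(run(strict=True))   # failed: no temporal matches found
```

Because `catch_warnings` restores the previous filter state on exit, the strict behaviour stays local to the validation run and does not affect the caller's own warning settings.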

> Re results object: I hadn't thought that far. It sounds promising/interesting but may be a major change, right?

Using a results object instead of the dictionary we currently use should not be too big of a change. But I could be wrong.

> A tricky part may be storing the results into a netcdf file when they contain error reports as well as results arrays.

We would have to come up with a flagging system where each error has a value. This should then be fairly easy to store according to CF conventions. See http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#flags
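A possible shape for such a flagging system, sketched with the standard library only. The error categories in `ErrorFlag` are invented for illustration; `flag_masks` and `flag_meanings` are the attribute names the CF conventions define for bit-field flags, and the resulting dictionary would be attached to an integer status variable in the netCDF file.

```python
from enum import IntFlag


class ErrorFlag(IntFlag):
    """Hypothetical error bits; the real categories would need to be agreed on."""
    READ_FAILED = 1
    NO_TEMPORAL_MATCH = 2
    TOO_FEW_OBSERVATIONS = 4
    METRIC_FAILED = 8


def cf_flag_attributes(flags):
    """Build CF-style attributes for a status variable that uses bit masks."""
    members = list(flags)
    return {
        "flag_masks": [int(m) for m in members],
        # CF expects a single blank-separated string of meanings.
        "flag_meanings": " ".join(m.name.lower() for m in members),
    }


attrs = cf_flag_attributes(ErrorFlag)

# Several problems at one grid point combine into a single integer value.
status = ErrorFlag.READ_FAILED | ErrorFlag.NO_TEMPORAL_MATCH
print(int(status), attrs)
```

Using one bit per error lets a single integer per grid point record several independent problems, so the status can be stored as an ordinary array next to the result arrays.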

awst-baum · 2018-09-20T08:17:15Z

And the results object would be put together in pytesmo.validation_framework.validation.Validation.perform_validation?

Of course the trick for creating a netcdf output format would be to foresee the problems that occur and categorise them in a useful fashion (NOT so that all practically occurring issues end up in "other errors"). And then to write a reader/writer for it, I guess?

cpaulik · 2018-09-21T13:50:20Z

> And the results object would be put together in pytesmo.validation_framework.validation.Validation.perform_validation?

Yes.

> Of course the trick for creating a netcdf output format would be to foresee the problems that occur and categorise them in a useful fashion (NOT so that all practically occurring issues end up in "other errors"). And then to write a reader/writer for it, I guess?

For every exception that we have we can add an error code/value/bit that we then set in the result. The ResultsManager will have to be updated.
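One way this could look, as a sketch under assumptions: every name below (`ReadError`, `NoMatchError`, `ERROR_BITS`, `ValidationResult`, `validate_point`, `compute`) is hypothetical and does not exist in pytesmo. It only shows the idea of mapping each exception type to a bit that is set in the result instead of dropping the failed point.

```python
from dataclasses import dataclass, field


class ReadError(Exception):
    """Hypothetical exception: a dataset could not be read."""


class NoMatchError(Exception):
    """Hypothetical exception: temporal matching produced no pairs."""


# Hypothetical mapping from exception type to a bit in the status value.
ERROR_BITS = {ReadError: 1, NoMatchError: 2}
OTHER_ERROR = 128  # fallback bit; ideally rarely set


@dataclass
class ValidationResult:
    metrics: dict = field(default_factory=dict)
    status: int = 0  # 0 means no error was recorded


def validate_point(compute, gpi):
    """Run compute(gpi) and record failures as bits instead of dropping them."""
    result = ValidationResult()
    try:
        result.metrics = compute(gpi)
    except Exception as exc:
        result.status |= ERROR_BITS.get(type(exc), OTHER_ERROR)
    return result


def compute(gpi):
    # Hypothetical metric calculation that fails for one grid point.
    if gpi == 2:
        raise NoMatchError("no matches")
    return {"R": 0.8}


results = {gpi: validate_point(compute, gpi) for gpi in (1, 2)}
print({gpi: r.status for gpi, r in results.items()})  # {1: 0, 2: 2}
```

Keeping the fallback `OTHER_ERROR` bit separate makes it easy to check how many failures were not categorised, which addresses the concern about everything ending up in "other errors".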


awst-baum added the question label Sep 17, 2018

tracyscanlon mentioned this issue Jul 15, 2019

Metadata handling #173

Open

s-scherrer added a commit to s-scherrer/pytesmo that referenced this issue Feb 12, 2021

added option to raise warning when no match is found, see TUW-GEO#152

9f5d78e

s-scherrer added a commit to s-scherrer/pytesmo that referenced this issue Feb 12, 2021

option to raise warning when no match is found

e0851da

See TUW-GEO#152
