Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mypy 0.981 compatibility #336

Closed
Dr-Irv opened this issue Sep 27, 2022 · 10 comments · Fixed by #419
Closed

mypy 0.981 compatibility #336

Dr-Irv opened this issue Sep 27, 2022 · 10 comments · Fixed by #419
Assignees

Comments

@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Sep 27, 2022

mypy 0.981 released on 9/26/22. If you do a manual re-run of a job, it picks that up, and we get some failures. Could be mypy bugs, or settings, or errors we need to fix.

Might be able to remove the python 3.10.6 restriction, since I think 0.981 is supposed to fix the issue of using 3.10.7 or later.

@twoertwein
Copy link
Member

twoertwein commented Sep 27, 2022

pandas-stubs/core/series.pyi:440: error: Overloaded function signatures 1 and 2 overlap with incompatible return types [misc]

This is an actual ambiguity/error as None is Hashable. Can probably be silenced in the stubs but should still work for users.

error: No overload variant of "concat" matches argument types "Dict[str, Series[Any]]", "int" [call-overload]

I think there might be two issues, something about using HashableT (replacing it with Any solves many of those errors - not sure what's going on there) and probably a mypy bug as axis=1 seems to be infered as int and not as Literal[1].

edit: opened python/mypy#13750

error: Dict entry 0 has incompatible type "Tuple[str]": "str"; expected "str": "str" [dict-item]

Could be a mypy bug with how it handles HashableT: the typevar is used a second time in the declaration of get_dummies (so the two instances have to have the same type) but the second time it is not used because columns is not passed.

tests/test_resampler.py:277: error: Expression is of type "Union[str, bytes, date, timedelta, complex]", not "Union[str, bytes, date, datetime, timedelta, bool, complex, Timestamp, Timedelta]" [assert-type]

It seems that _typing.Scalar might contain redundant types: pd.Timedelta is part of datetime.timedelta and so on. It seems that mypy compacts Scalar in some cases but not always creating this discrepancy.

I will try to create minimal examples later and report them at mypy.

@twoertwein
Copy link
Member

@hauntsaninja it might be great to add pandas-stubs to mypy-primer. Pandas-stubs has many annotations and some of them might also be dubious - perfect test case for mypy-primer :)

@hauntsaninja
Copy link
Contributor

Thanks, added!

@twoertwein
Copy link
Member

Mypy 0.981 adds support for unpacking kwargs python/mypy#13471

from typing import TypedDict
from typing_extensions import Unpack
    
class Style(TypedDict, total=False):
    margin: int
    sticky: bool
    
def add_button(label: str, **kwds: Unpack[Style]) -> None:
    ...

That could make many overloads much smaller and easier to understand as we can put all the non-overload-related keywords in a TypedDict!

@Dr-Irv
Copy link
Collaborator Author

Dr-Irv commented Sep 29, 2022

tests/test_resampler.py:277: error: Expression is of type "Union[str, bytes, date, timedelta, complex]", not "Union[str, bytes, date, datetime, timedelta, bool, complex, Timestamp, Timedelta]" [assert-type]

It seems that _typing.Scalar might contain redundant types: pd.Timedelta is part of datetime.timedelta and so on. It seems that mypy compacts Scalar in some cases but not always creating this discrepancy.

Created python/mypy#13760

It seems that 0.971 did type narrowing on unions, but 0.981 does it inconsistently.

@Dr-Irv
Copy link
Collaborator Author

Dr-Irv commented Sep 29, 2022

With respect to Hashable, I think we should consider removing all uses of Hashable, because how it is interpreted by type checkers has some ambiguity.

If we want a type to correspond to what could be a column name or index name, we could do something like:

ValidName= Union[str, complex, Timestamp, Timedelta, tuple, None, OTHER_TYPES]
ValidNameT = TypeVar("ValidNameT", bound=ValidName)

where OTHER_TYPES are other types we know are OK for values of column names, index labels, then we replace all uses of HashableT with ValidNameT and all uses of Hashable with ValidName .

This might (but I'm not sure) clean up these mypy issues.

@twoertwein
Copy link
Member

With respect to Hashable, I think we should consider removing all uses of Hashable, because how it is interpreted by type checkers has some ambiguity.

I would first wait for the next mypy release (or check mypy main) to know which errors are actual errors and which aren't. I think pandas-stubs/core/series.pyi:440 is an error/ambiguity. Since type checkers pick the first matching overload, we could live with this ambiguity.

@twoertwein
Copy link
Member

With mypy 0.982, we are now down to only three errors :)

We need to fix this (or ignore it):

pandas-stubs/core/series.pyi:445: error: Overloaded function signatures 1 and 2 overlap with incompatible return types [misc]

Union-simplifications (wait for mypy 0.983?):

tests/test_resampler.py:277: error: Expression is of type "Union[str, bytes, date, timedelta, complex]", not "Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta]" [assert-type]
tests/test_pandas.py:313: error: Expression is of type "Union[ndarray[Any, dtype[Any]], str, bytes, date, timedelta, complex, DataFrame, Series[Any], None]", not "Union[ndarray[Any, dtype[Any]], Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta], DataFrame, Series[Any], None]" [assert-type]

@Dr-Irv
Copy link
Collaborator Author

Dr-Irv commented Oct 3, 2022

With mypy 0.982, we are now down to only three errors :)

We need to fix this (or ignore it):

pandas-stubs/core/series.pyi:445: error: Overloaded function signatures 1 and 2 overlap with incompatible return types [misc]

I think we have to just add a # type: ignore on this one.

Union-simplifications (wait for mypy 0.983?):

tests/test_resampler.py:277: error: Expression is of type "Union[str, bytes, date, timedelta, complex]", not "Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta]" [assert-type]
tests/test_pandas.py:313: error: Expression is of type "Union[ndarray[Any, dtype[Any]], str, bytes, date, timedelta, complex, DataFrame, Series[Any], None]", not "Union[ndarray[Any, dtype[Any]], Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta], DataFrame, Series[Any], None]" [assert-type]

So they claimed to have fixed the bug here: python/mypy#13781

But I looked at the source for tag for 0.982, and it doesn't include that fix, so let's wait for that.

@twoertwein
Copy link
Member

If we had a good solution for optional CI runs that are allowed to fail without screaming in red that they failed, it might be nice to add a run with mypy nightly. This would allow us to more quickly correct errors that newer versions of mypy will detect and if mypy has unexpected behaviors, we can report them before a new version is released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants