Skip to content

Create a dataclass for version info #12205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sirosen
Copy link
Contributor

@sirosen sirosen commented May 25, 2025

@ericholscher and I briefly discussed this at the PyCon US Sprints.
This some basic backlog cleanup.

I have tests passing for me locally except for some which rely on rclone -- I didn't feel like installing that, so I'm just trusting CI for now.

closes #7798
closes #10936

I only noticed #10936 after I finished this. I named and placed the class slightly differently and didn't make some of the formatting changes which that version did (because things have updated anyway).


Convert from fixed-shape dicts to a dataclass. Preserve almost all code semantics identically -- e.g., variable assignments in readthedocs.api.v2.utils may be micro-optimizations which could be important.

The only intentional semantic change included here (other than the change in types) is in readthedocs.projects.tasks.mixins, where a list comprehension is replaced with a generator expression.

@sirosen sirosen requested a review from a team as a code owner May 25, 2025 04:16
@sirosen sirosen requested a review from humitos May 25, 2025 04:16
@humitos humitos requested review from stsewd and ericholscher and removed request for humitos May 26, 2025 08:44
Convert from fixed-shape dicts to a dataclass. Preserve almost all
code semantics identically -- e.g., variable assignments in
`readthedocs.api.v2.utils` may be micro-optimizations which could be
important.

The only intentional semantic change included here (other than the
change in types) is in
`readthedocs.projects.tasks.mixins`, where a list comprehension is
replaced with a generator expression.
@sirosen sirosen force-pushed the create-version-info-datatype branch from 3be7570 to ce29788 Compare May 27, 2025 00:27
@ericholscher
Copy link
Member

ericholscher commented May 28, 2025

This looks good to me, but will ping @stsewd as well here, since I know there were some plans for a larger refactoring. This feels like a nice improvement over the current situation tho, so I'd be 👍 on merging and then refactoring more later.

@stsewd
Copy link
Member

stsewd commented Jun 2, 2025

Since we use celery with the json serializer in production, this breaks builds (we should probably use that same serializer in tests).

EncodeError: Object of type ProjectVersionInfo is not JSON serializable

Doesn't look like there is a clean solution to just change the dataclass to make it json serializable. Named tuples are json serializable, but not sure if they can be serialized back, as they are serialized as a list.

Also, doesn't look like this change has anything to do with #10963.

@sirosen
Copy link
Contributor Author

sirosen commented Jun 2, 2025

Ah, whoops! That's definitely the wrong issue linked. I'll correct not only the PR comment but the commit message (so that GitHub won't auto-close an unrelated issue). There is a PR which addressed this same issue, but I must have fat-fingered the number for it somehow.

Regarding serialization, if the data is being passed around as JSON, I have some notion of how to approach it. I would not setup a custom JSON encoder -- it only real solves the encoding side, not decoding, and I don't think there's a simple path to resolve that asymmetry.
My preference would be to give the dataclass an additional pair of methods:

@dataclasses.dataclass
class ProjectVersionInfo:
    def to_dict(self) -> dict[str, typing.Any]:
        return dataclasses.todict(self)

    @classmethod
    def from_dict(cls, data: dict[str, typing.Any]) -> typing.Self: ...

so that it encapsulates the JSON-safe-dict encoding and decoding logic. (from_dict can have explicit checks on the data, rather than the potential for TypeErrors and whatnot from ProjectVersionInfo(**dict_data))

I assume the tests are pickling? I'll sniff around and see if I can find the appropriate module(s).


I'll address all of the above this evening or in the next couple of days, when I'm not on the clock for work.

@sirosen
Copy link
Contributor Author

sirosen commented Jun 3, 2025

I edited to update. I had transposed digits; the relevant PR I meant to reference is #10936.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use a named tuple for tags and branches data
3 participants