DM-37758: Replace Cachemachine with JupyterLab Controller #199

athornton · 2023-02-03T18:09:02Z

No description provided.

rra · 2023-02-03T23:24:27Z

src/mobu/jupyterclient.py

-            "image_dropdown": "use_image_from_dropdown",
-            "size": self.config.image_size,
+            "image_list": [image.path],
+            "image_dropdown": [image.path],  # Not used


Does this have to be sent? It would be nice if it could be omitted since it's not used.

athornton · Feb 6, 2023

I think I can modify the models on the receiving end to make those optional. At least one of list or dropdown needs to be sent, but I can certainly send dropdown as null.
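A minimal sketch of the rule described in this reply, assuming the receiving model makes both fields optional and rejects a request only when neither is present. It uses a stdlib dataclass as a stand-in for the real Pydantic model; `ImageSelection` and its validation are hypothetical and not taken from the controller's code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageSelection:
    """Hypothetical stand-in for the spawner form fields under discussion."""

    image_list: Optional[List[str]] = None
    image_dropdown: Optional[List[str]] = None

    def __post_init__(self) -> None:
        # Either field may be omitted (None), but not both.
        if self.image_list is None and self.image_dropdown is None:
            raise ValueError("one of image_list or image_dropdown is required")


# The client can then send only the field it actually uses.
selection = ImageSelection(
    image_list=["docker.io/lsstsqre/sciplat-lab:d_2023_02_03"]
)
```

With a model shaped like this, the client could drop the unused `image_dropdown` entry entirely rather than sending a placeholder value.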

rra · 2023-02-03T23:25:20Z

src/mobu/models/jupyter.py

+def to_camel_case(string: str) -> str:
+    """Convert a string to camel case.
+
+    Originally written for use with Pydantic as an alias generator so that the
+    model can be initialized from camel-case input (such as Kubernetes
+    objects).
+
+    Parameters
+    ----------
+    string
+        Input string
+
+    Returns
+    -------
+    str
+        String converted to camel-case with the first character in lowercase.
+    """
+    components = string.split("_")
+    return components[0] + "".join(c.title() for c in components[1:])
+
+
+class CamelCaseModel(BaseModel):
+    """This is what we will use in place of BaseModel for the Spawner
+    Pydantic models.  Any configuration can be given in Helm-appropriate
+    camelCase, but internal Python methods and objects will all be snake_case.
+
+    This isn't actually all that useful here, but since these models are
+    copied from jupyterlabcontroller, which *does* make use of these features,
+    it's easier than messing with the models.
+    """
+
+    class Config:
+        """Pydantic configuration."""
+
+        alias_generator = to_camel_case
+        allow_population_by_field_name = True


I don't think you end up needing any of this code since you use dashify instead.

athornton · Feb 6, 2023

ControllerImages is a CamelCase model and I wanted to minimize the changes when I forklifted the model from jupyterlabcontroller.

That is a question I've got, actually: Ideally models shared between applications (e.g. as producer and consumer, which is what we're doing here) should go in some common library they both reference. What's a good pattern for that?

jonathansick · Feb 7, 2023

I've been pondering the same thing for Pydantic-based Avro for Roundtable Kafka producers and consumers. I was thinking about putting them in Safir?

rra · 2023-02-03T23:27:12Z

src/mobu/models/jupyter.py

+class SpawnerEnum(str, Enum):
+    """This will validate that the name is entirely upper case, and
+    will produce auto() values in lower case with underscores turned to
+    dashes.
+    """

-    RECOMMENDED = "recommended"
-    LATEST_WEEKLY = "latest-weekly"
-    BY_REFERENCE = "by-reference"
+    def _generate_next_value_(  # type: ignore
+        name, start, count, last_values
+    ) -> str:
+        if name != name.upper():
+            raise RuntimeError("Enum names must be entirely upper-case")
+        return dashify(name.lower())


-class JupyterImage(BaseModel):
-    """Represents an image to spawn as a Jupyter Lab."""
+class NubladoEnum(str, Enum):
+    """This will validate that the name is entirely upper case, and
+    will produce auto() values in lower case.  This is exactly StrEnum from
+    Python 3.11, except for the validation step."""

-    reference: str = Field(
+    def _generate_next_value_(  # type: ignore
+        name, start, count, last_values
+    ) -> str:
+        if name != name.upper():
+            raise RuntimeError("Enum names must be entirely upper-case")
+        return name.lower()


I don't understand these changes. They seem like a ton of unnecessary complexity. Why can't we just use simple enums?

athornton · Feb 6, 2023

Same as above: I tried to minimize the changes when pulling this over from JupyterLab Controller. We do care about the actual string values there. I think the right way to go may be to go to Python 3.11 and use its new StrEnum class directly, though. That doesn't get us the keys-are-uppercase validator, but, meh.
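To make the trade-off in this thread concrete, here is a hedged sketch comparing the generated-value approach with plain explicit values. `dashify` is reimplemented here under the assumption that it only replaces underscores with dashes, and `GeneratedEnum`, `GeneratedImageClass`, and `SimpleImageClass` are illustrative names. Note that `auto()` on Python 3.11's `StrEnum` yields the lower-cased member name, so `LATEST_WEEKLY` would become `latest_weekly` there; getting `latest-weekly` still requires either explicit values or a custom `_generate_next_value_`.

```python
from enum import Enum, auto


def dashify(name: str) -> str:
    """Assumed behavior of the PR's helper: underscores become dashes."""
    return name.replace("_", "-")


class GeneratedEnum(str, Enum):
    """Base class whose auto() values are derived from the member name."""

    def _generate_next_value_(name, start, count, last_values):  # type: ignore
        # Called by auto(); lower-case the name and swap "_" for "-".
        return dashify(name.lower())


class GeneratedImageClass(GeneratedEnum):
    RECOMMENDED = auto()
    LATEST_WEEKLY = auto()
    BY_REFERENCE = auto()


class SimpleImageClass(Enum):
    """The plain alternative: every value is spelled out."""

    RECOMMENDED = "recommended"
    LATEST_WEEKLY = "latest-weekly"
    BY_REFERENCE = "by-reference"


generated = [member.value for member in GeneratedImageClass]
simple = [member.value for member in SimpleImageClass]
```

Both classes end up with the same string values; the difference is only whether the values are visible at the point of definition.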

rra · 2023-02-03T23:27:55Z

src/mobu/models/jupyter.py

-    digest: Optional[str] = Field(
+    digest: str = Field(
        ...,
-        title="Hash of the last layer of the Docker container",
-        description="May be null if the digest isn't known",
+        name="digest",
        example=(
-            "sha256:419c4b7e14603711b25fa9e0569460a753c4b2449fe275bb5f89743b"
-            "01794a30"
+            "sha256:e693782192ecef4f7846ad2b21"
+            "b1574682e700747f94c5a256b5731331a2eec2"
        ),
+        title="unique digest of image contents",
+    )


This should be optional since we don't care about it, I think. Then when the image is manually constructed for the case where we explicitly provide an image, we don't need to pass in a dummy string and can just let it default to None.

rra · 2023-02-03T23:29:40Z

src/mobu/models/jupyter.py

+class JupyterImageClass(SpawnerEnum):
+    """Possible ways of selecting an image."""
+
+    RECOMMENDED = auto()
+    LATEST_WEEKLY = auto()
+    BY_REFERENCE = auto()


Suggested change

-class JupyterImageClass(SpawnerEnum):
-    """Possible ways of selecting an image."""
-
-    RECOMMENDED = auto()
-    LATEST_WEEKLY = auto()
-    BY_REFERENCE = auto()
+class JupyterImageClass(Enum):
+    """Possible ways of selecting an image."""
+
+    RECOMMENDED = "recommended"
+    LATEST_WEEKLY = "latest-weekly"
+    BY_REFERENCE = "by-reference"

rra · 2023-02-03T23:30:24Z

tests/business/jupyterpythonloop_test.py

@@ -203,8 +203,7 @@ async def test_long_error(
                "jupyter": {
                    "image_class": "by-reference",
                    "image_reference": (
-                        "registry.hub.docker.com/lsstsqre/sciplat-lab"
-                        ":d_2021_08_30"
+                        "docker.io/lsstsqre/sciplat-lab" ":d_2023_02_03"


These two string literals now fit on one line, so they can be combined into a single literal.
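For reference, a small sketch of why the two forms are interchangeable: Python joins adjacent string literals into one string, so the split form in the diff and a single literal produce the same value. The variable names are illustrative only.

```python
# Adjacent string literals are concatenated by the compiler, so the split
# form from the diff is equivalent to one combined literal.
split_reference = "docker.io/lsstsqre/sciplat-lab" ":d_2023_02_03"
combined_reference = "docker.io/lsstsqre/sciplat-lab:d_2023_02_03"
```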

rra · 2023-02-03T23:33:55Z

src/mobu/jupyterclient.py

-            image = JupyterImage.from_reference(self.config.image_reference)
+            ref = self.config.image_reference
+            image = ControllerImage(
+                path=ref, name=ref, digest="unknown", tags={}


digest and tags should be optional in the model so that you can just omit them here.
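A rough sketch of what this suggestion would allow, using a stdlib dataclass as a stand-in for the Pydantic model. `ImageSketch` is a hypothetical name; in the Pydantic model the equivalent would presumably be an `Optional[str] = None` digest and a `Field(default_factory=dict)` for tags, mirroring the `default_factory` already used for `ControllerImages.all`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ImageSketch:
    """Hypothetical stand-in for ControllerImage with optional metadata."""

    path: str
    name: str
    # Unknown for images that are specified directly by reference.
    digest: Optional[str] = None
    # Each instance gets its own empty mapping by default.
    tags: Dict[str, str] = field(default_factory=dict)


# Constructing by reference no longer needs dummy digest or tags values.
ref = "docker.io/lsstsqre/sciplat-lab:d_2023_02_03"
image = ImageSketch(path=ref, name=ref)
```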

rra · 2023-02-03T23:34:56Z

src/mobu/models/jupyter.py

+class ControllerImages(CamelCaseModel):
+    recommended: Optional[ControllerImage] = None
+    latest_weekly: Optional[ControllerImage] = None
+    latest_daily: Optional[ControllerImage] = None
+    latest_release: Optional[ControllerImage] = None
+    all: List[ControllerImage] = Field(default_factory=list)
+
+    class Config:
+        alias_generator = dashify
+        allow_population_by_field_name = True


This is defined as a CamelCaseModel but then you override everything CamelCaseModel does, so this should just be BaseModel.

athornton · Feb 6, 2023

This comes back to the shared-class thing. It makes sense for it to be a CamelCaseModel in the JupyterLab Controller, because we instantiate it directly from a CamelCase YAML file. It doesn't really make sense here, but I also feel like it should be as nearly as possible the same model in each place, or we should have a repository or set of repositories that hold models that are used as things we're passing around over the network.
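To illustrate the override discussed in this thread, the sketch below compares the aliases the two generators would produce for the `ControllerImages` fields. `to_camel_case` repeats the logic quoted earlier in this review; `dashify` is an assumed reimplementation that only replaces underscores with dashes.

```python
def to_camel_case(string: str) -> str:
    """Same logic as the helper quoted earlier in this review."""
    components = string.split("_")
    return components[0] + "".join(c.title() for c in components[1:])


def dashify(string: str) -> str:
    """Assumed behavior of the PR's helper: underscores become dashes."""
    return string.replace("_", "-")


# Field names of ControllerImages as shown in the diff.
fields = ["recommended", "latest_weekly", "latest_daily", "latest_release", "all"]

camel_aliases = {name: to_camel_case(name) for name in fields}
dashed_aliases = {name: dashify(name) for name in fields}
```

Only the multi-word fields differ (`latestWeekly` versus `latest-weekly`, for example), which is why the inherited camel-case generator has no remaining effect once `dashify` is set on the model's own `Config`.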

rra · 2023-02-03T23:36:00Z

src/mobu/models/jupyter.py

+class PartialImage(CamelCaseModel):
+    path: str = Field(


Not sure why this isn't merged with ContainerImage. I don't think mobu ever cares about PartialImage.

athornton · Feb 6, 2023

One more "well, I wanted to leave the classes from the controller intact."

rra · 2023-02-03T23:36:40Z

src/mobu/models/jupyter.py

+    tags: Dict[str, str] = Field(
+        ...,
+        name="tags",
+        title="Map between tag and its display name",
+    )


Should be optional so that you don't need to supply it when constructing the image by reference.

rra · 2023-03-23T04:44:32Z

This was done using the bot API instead of talking to the lab controller directly in #227.

athornton added 3 commits January 30, 2023 12:40

Freshen for python 3.11

c714bc2

Replace cachemachine with JupyterLab controller

c26b8ad

Fix test suite

6bdd303

athornton requested review from rra and jonathansick February 3, 2023 18:09

rra reviewed Feb 3, 2023

rra closed this Mar 23, 2023

rra deleted the tickets/DM-37758 branch March 23, 2023 04:44
