From 373c8f2a09c6b578742d99964dfcc5d2686255cc Mon Sep 17 00:00:00 2001
From: Elliot Gunton
Date: Thu, 25 May 2023 15:05:26 +0100
Subject: [PATCH] Getting started docs (#634)

- [x] Fixes #627
- [x] ~Tests added~ docs only
- [x] Documentation/examples added
- [x] [Good commit messages](https://cbea.ms/git-commit/) and/or PR title

**Description of PR**
See #627. This PR gets users off the ground. Future PRs to cover features like DAGs, Artifacts etc

---------

Signed-off-by: Elliot Gunton
---
 docs/getting-started/introduction.md          |  84 ++++++++++++
 docs/getting-started/quick-start.md           |  92 +++++++++++++
 .../walk-through/about-hera.md                |  49 +++++++
 .../walk-through/hello-world.md               | 128 ++++++++++++++++++
 docs/hera_getting_started.md                  |  28 ----
 docs/introduction.md                          |  59 --------
 mkdocs.yml                                    |   7 +-
 7 files changed, 359 insertions(+), 88 deletions(-)
 create mode 100644 docs/getting-started/introduction.md
 create mode 100644 docs/getting-started/quick-start.md
 create mode 100644 docs/getting-started/walk-through/about-hera.md
 create mode 100644 docs/getting-started/walk-through/hello-world.md
 delete mode 100644 docs/hera_getting_started.md
 delete mode 100644 docs/introduction.md

diff --git a/docs/getting-started/introduction.md b/docs/getting-started/introduction.md
new file mode 100644
index 000000000..8fd165a32
--- /dev/null
+++ b/docs/getting-started/introduction.md
@@ -0,0 +1,84 @@
+# Introduction
+
+Hera is a Python library that allows you to construct and submit Argo Workflows. It is designed to be intuitive and easy
+to use, while also providing a powerful interface to the underlying Argo API.
+
+## Hera V5 vs V4
+
+Hera v5 is a major release that introduces breaking changes from v4. The main reason for this is that v5 is a complete
+rewrite of the library, and is now based on the OpenAPI specification of Argo Workflows. This allows us to provide a
+more intuitive interface to the Argo API, while also providing full feature parity with Argo Workflows. This means that
+you can now use all the features of Argo Workflows in your workflows. Additionally, it has been re-structured to
+accommodate other Argo projects, such as Argo Events and Argo CD. Currently only Argo Workflows is supported, and there
+is some work in progress to add support for Argo Events.
+
+The codebase is now much more readable, and the focus can be fully dedicated to improving the Python interface to
+various Argo projects rather than maintaining feature parity with the Argo codebase. The library is divided into the
+following components:
+
+- `hera.shared` - This package contains the shared code that will be used by all Argo projects. This includes common
+  global configuration to interact with the Argo API, and common Pydantic base models that are used by all Argo
+  projects.
+
+- `hera.events.models` - This package contains the auto-generated code that allows you to construct Argo Events. It
+  provides Pydantic models for all the Argo Events OpenAPI objects, and allows you to construct events using these
+  models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models used by
+  Argo Events.
+
+- `hera.workflows.models` - This package contains the auto-generated code that allows you to construct Argo Workflows.
+  It provides Pydantic models for all the Argo Workflows OpenAPI objects, and allows you to construct workflows using
+  these models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models
+  used by Argo Workflows.
+
+- `hera.workflows` - This package contains the hand-written code that allows you to construct and submit Argo Workflows.
+  It wraps the auto-generated code, and provides a more intuitive interface to the Argo API. It also provides a number
+  of useful features, such as the ability to submit workflows from a Python function. This package has various extension
+  points that allow you to plug-in the auto-generated models in case you need to use a feature that is not yet supported
+  by the hand-written code.
+
+The major differences between v4 and v5 are:
+
+- The `hera.workflows.models` package is now auto-generated, and is based on the OpenAPI specification of Argo
+  Workflows. This means that all the models are exactly the same as the models used by Argo Workflows, and you can use
+  all the features of Argo Workflows in your workflows written with `hera`.
+
+- The auto-generated models are based on Pydantic, which means that you can use all the features of Pydantic to
+  construct your workflows. This includes better type-checking, auto-completion in IDEs and more.
+
+- All template types are now supported. This means that you can now use all the template types that are supported by
+  Argo Workflows, such as DAGs, Steps, Suspend and more. Previously, only the DAG template type was supported.
+
+- The hand-written code has been rewritten to be extensible. This means that you can now easily extend the library to
+  support new features, or to support features that are not yet supported by the hand-written code. This is done by
+  using the `hera.workflows.models` package, and plugging it into the `hera.workflows` package.
+
+The following example shows how to use the DAG template type.
+
+```python
+from hera.workflows import (
+    DAG,
+    Workflow,
+    script,
+)
+
+
+# Notice that we are using the script decorator to define the function.
+# This is required in order to use the function as a template.
+# The decorator also allows us to define the image that will be used to run the function and
+# other parameters that are specific to the Script template type.
+@script(add_cwd_to_sys_path=False, image="python:alpine3.6")
+def say(message):
+    print(message)
+
+
+with Workflow(generate_name="dag-diamond-", entrypoint="diamond") as w:
+    # Note that we need to explicitly specify the DAG template type.
+    with DAG(name="diamond"):
+        # We can now use the decorated function as tasks in the DAG.
+        A = say(name="A", arguments={"message": "A"})
+        B = say(name="B", arguments={"message": "B"})
+        C = say(name="C", arguments={"message": "C"})
+        D = say(name="D", arguments={"message": "D"})
+        # We can use the `>>` or `.next()` operators to define dependencies between tasks.
+        A >> [B, C] >> D
+```
diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
new file mode 100644
index 000000000..b7e73d45b
--- /dev/null
+++ b/docs/getting-started/quick-start.md
@@ -0,0 +1,92 @@
+# Quick Start
+
+## Install Argo tools
+
+Ensure you have a Kubernetes cluster, kubectl and Argo Workflows installed by following the
+[Argo Workflows Quick Start](https://argoproj.github.io/argo-workflows/quick-start/).
+
+Ensure you are able to submit a workflow to Argo as in the example:
+
+```console
+argo submit -n argo --watch https://raw.githubusercontent.com/argoproj/argo-workflows/master/examples/hello-world.yaml
+```
+
+## Install Hera
+
+[![Pypi](https://img.shields.io/pypi/v/hera.svg)](https://pypi.python.org/pypi/hera)
+
+Hera is available on PyPI as the `hera` package. Add this dependency to your project in your usual way, e.g. pip or
+poetry, or install directly with `pip install hera`.
+
+## Hello World
+
+If you were able to run the `argo submit` command above, copy the following Workflow definition into a local file
+`hello_world.py`.
+
+```py
+from hera.workflows import Steps, Workflow, WorkflowsService, script
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+Run the file
+
+```console
+python -m hello_world
+```
+
+You will then see the Workflow at
+
+## Hello World on an existing Argo installation
+
+If you or your organization are already running on Argo and you're interested in using Hera to write your Workflow
+definitions, you will need to set up some config variables in `hera.shared.global_config`. Copy the following as a basis
+and fill in the blanks.
+
+```py
+from hera.workflows import Steps, Workflow, script
+from hera.shared import global_config
+
+global_config.host = "https://"
+global_config.token = ""  # Copy token value after "Bearer" from the `argo auth token` command
+global_config.image = "/python:3.8"  # set the image if you cannot access "python:3.8" via Docker Hub
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+Run the file
+
+```console
+python -m hello_world
+```
+
+You will then see the Workflow at https://\
diff --git a/docs/getting-started/walk-through/about-hera.md b/docs/getting-started/walk-through/about-hera.md
new file mode 100644
index 000000000..e2b9dcc64
--- /dev/null
+++ b/docs/getting-started/walk-through/about-hera.md
@@ -0,0 +1,49 @@
+# About
+
+Hera is a Python library that allows you to construct and submit Argo Workflows. It is designed to be intuitive and easy
+to use, while also providing a powerful interface to the underlying Argo API.
+
+Hera acts as a domain-specific language on top of Argo, so it is primarily a way to define Workflows. In previous Argo
+Workflows surveys such as [2021](https://blog.argoproj.io/argo-workflows-2021-survey-results-d6fa890030ee), a better
+Python DSL has been highly requested to overcome the YAML barrier to adoption. In the
+[2022 survey results](https://blog.argoproj.io/cncf-argo-project-2022-user-survey-results-f9caf46df7fd#:~:text=Job%20Roles%20%26%20Use%20Cases)
+we can infer from the job roles for people using Argo Workflows that the DevOps Engineers are likely more comfortable
+using YAML than ML Engineers.
+
+> DevOps Engineers: 41%
+> Software Engineer: 20%
+> Architects: 20%
+> Data Engineer / Data Scientist / ML Engineer: 13%
+
+We hope that providing a more intuitive Python definition language will increase the number of Data and ML users of Argo Workflows.
+
+## Feature Parity
+
+A natural concern about an abstraction layer on top of another technology is whether it can function the same as the
+original lower layer. In this case, Hera generates a [library of model classes](../../api/workflows/models.md) using
+Argo's OpenAPI spec, which are wrapped up by Hera's feature-rich classes, while the model classes are available as a
+fallback mechanism. You can check out the extensive
+["upstream" examples](../../examples/workflows/upstream/dag_diamond.md) that contain side-by-side Python and YAML
+definitions for Workflows in
+[the Argo examples folder on GitHub](https://github.com/argoproj/argo-workflows/tree/master/examples). Our CI/CD runs
+through the Argo examples folder to check that we are able to reproduce them using Hera Workflows written by hand (note:
+we have not _yet_ written Hera Workflows for all the examples).
+
+If you are a new user of Argo, we encourage you to become familiar with
+[Argo's Core Concepts](https://argoproj.github.io/argo-workflows/workflow-concepts/), which provide a foundation of
+understanding when working with Hera. Working through the
+[Argo Walk Through](https://argoproj.github.io/argo-workflows/walk-through/) will also help you understand key concepts
+before moving to Python.
+
+## Context Managers
+
+You will notice many classes in Hera implement the context manager interface. This was designed to mirror the YAML
+syntax of Argo, which helps existing users come to Hera from YAML. Users new to both Argo and Hera will also be able
+to interpret and understand most of the existing YAML documentation and resources online from the familiar naming and
+functionality in Hera.
+
+## Orchestrating Scripts
+
+A natural extension of a Python DSL for Argo is tighter integration with Python scripts. This is where Hera improves the
+developer experience through its tailored classes and syntactic sugar to enable developers to easily orchestrate Python
+functions. Check out [Hello World](hello-world.md) to get started!
diff --git a/docs/getting-started/walk-through/hello-world.md b/docs/getting-started/walk-through/hello-world.md
new file mode 100644
index 000000000..735bd98f8
--- /dev/null
+++ b/docs/getting-started/walk-through/hello-world.md
@@ -0,0 +1,128 @@
+# Hello World
+
+Let's take a look at the `hello_world.py` from the [Quick Start](../quick-start.md) guide.
+
+```py
+from hera.workflows import Steps, Workflow, WorkflowsService, script
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+## The imports
+
+As we are using Argo Workflows, we import specialized classes from `hera.workflows`. You will see Argo concepts from the
+Argo spec have been transformed into powerful Python classes; explore them at the
+[Hera Workflows API reference](../../api/workflows/hera.md).
+
+For this Workflow, we want to echo using Python's `print` function, which is wrapped in our convenience `echo` function.
+We use Hera's `script` decorator to turn the `echo` function into what's known as a
+[Script template](https://argoproj.github.io/argo-workflows/workflow-concepts/#script), which is mirrored in Hera by the
+`Script` class. As we're defining the Workflow in Python, Hera is able to infer multiple field values that the developer
+would otherwise have to define when using YAML.
+
+## The script decorator
+
+The `script` decorator can take kwargs that a `Script` can take. Importantly, you can specify the `image` of Python
+to use instead of the default `python:3.8` for your script if required:
+
+```py
+@script(image="python:3.11")
+def echo(message: str):
+    print(message)
+```
+
+Alternatively, you can specify this image once via the `global_config.image` variable, and it will be used for all
+`script`s automatically:
+
+```py
+from hera.shared import global_config
+global_config.image = "python:3.11"
+
+@script()  # "echo" will now run using python:3.11, as will any other scripts you define
+def echo(message: str):
+    print(message)
+
+@script()  # "echo_twice" will also run using python:3.11
+def echo_twice(message: str):
+    print(message)
+    print(message)
+```
+
+## The Workflow context manager
+
+The Workflow context manager acts as a scope under which `template` Hera objects can be declared, which include
+Containers, Scripts, DAGs [and more](https://argoproj.github.io/argo-workflows/workflow-concepts/#template-types). For a
+minimal example, you will need to provide your `Workflow` the initialization values shown here:
+
+```py
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+```
+
+* `generate_name` is taken by Argo upon submission, where it appends a random 5 character suffix, so you may see this
+  Workflow run with a name like `hello-world-vmsz5`.
+* `entrypoint` tells Argo which template to run upon submission.
+* `namespace` refers to the Kubernetes namespace you want to submit to.
+* `workflows_service` is the submission service.
+
+## The Steps context manager
+
+A `Steps` template is the second template type of this example, the first being the `Script`. The `Steps` template,
+along with the `DAG` template, is known as a "template invocator". This is because they are used to arrange other
+templates, mainly Containers and Scripts, to do the actual work. In Hera, the `Steps` class is a context manager as it
+automatically arranges your templates in the order that you add them, with each template invocation known as a `Step`.
+
+```py
+with Steps(name="steps"):
+```
+
+To invoke the `echo` template, you can call it, passing values to its arguments through the `arguments` kwarg, which is
+a dictionary of the _function_ kwargs to values. This is because under a `Steps` or `DAG` context manager, the `script`
+decorator converts a call of the function into a `Script` object, to which you must pass `Script` initialization kwargs.
+
+```py
+echo(arguments={"message": "Hello world!"})
+```
+
+> For advanced users: the exact mechanism of the `script` decorator is to create a `Script` object when declared, so
+> that when your function is invoked you have to pass its arguments through the `arguments` kwarg as a dictionary, and
+> the `Script` object's `__call__` function is invoked with the `arguments` kwarg. The `__call__` function on a
+> `CallableTemplateMixin` automatically creates a `Step` or a `Task` depending on whether the context manager is a
+> `Steps` or a `DAG`.
+
+## Submitting the Workflow
+
+Finally, with the workflow defined, the actual submission occurs with
+
+```py
+w.create()
+```
+
+This uses the `WorkflowsService` to submit to Argo using its REST API, so `w.create()` can be thought of as running
+`argo submit`.
+
+Alternatively, you may want to see what the YAML looks like for this Workflow, which you can print or write to a
+file using `w.to_yaml()`.
+
+```py
+print(w.to_yaml())
+```
diff --git a/docs/hera_getting_started.md b/docs/hera_getting_started.md
deleted file mode 100644
index a79a96385..000000000
--- a/docs/hera_getting_started.md
+++ /dev/null
@@ -1,28 +0,0 @@
-# Hera
-
-## Quick Start
-
-```python
-from hera.workflows import DAG, Container, Parameter, Workflow
-
-with Workflow(
-    generate_name="dag-diamond-",
-    entrypoint="diamond",
-) as w:
-    echo = Container(
-        name="echo",
-        image="alpine:3.7",
-        command=["echo", "{{inputs.parameters.message}}"],
-        inputs=[Parameter(name="message")],
-    )
-    with DAG(name="diamond"):
-        A = echo(name="A", arguments={"message": "A"})
-        B = echo(name="B", arguments={"message": "B"})
-        C = echo(name="C", arguments={"message": "C"})
-        D = echo(name="D", arguments={"message": "D"})
-        A >> [B, C] >> D
-
-w.create()
-```
-
-## Walk Through
\ No newline at end of file
diff --git a/docs/introduction.md b/docs/introduction.md
deleted file mode 100644
index 85cb27e9e..000000000
--- a/docs/introduction.md
+++ /dev/null
@@ -1,59 +0,0 @@
-# Introduction
-
-Hera is a Python library that allows you to construct and submit Argo Workflows. It is designed to be intuitive and easy to use, while also providing a powerful interface to the underlying Argo API.
-
-## Hera V5 vs V4
-
-Hera v5 is a major release that introduces breaking changes from v4. The main reason for this is that v5 is a complete rewrite of the library, and is now based on the OpenAPI specification of Argo Workflows. This allows us to provide a more intuitive interface to the Argo API, while also providing full feature parity with Argo Workflows. This means that you can now use all the features of Argo Workflows in your workflows. Additionally, it has been re-structured to accommodate other Argo projects, such as Argo Events and Argo CD. Currently only Argo Workflows is supported, and there is some work in progress to add support for Argo Events.
-
-The codebase is now much more readable, and the focus can be fully dedicated to improving the Python interface to various Argo projects rather than maintaining feature parity with the Argo codebase.
-The library is divided into the following components:
-
-- `hera.shared` - This package contains the shared code that will be used by all Argo projects. This includes common global configuration to interact with the Argo API, and common Pydantic base models that are used by all Argo projects.
-
-- `hera.events.models` - This package contains the auto-generated code that allows you to construct Argo Events. It provides Pydantic models for all the Argo Events OpenAPI objects, and allows you to construct events using these models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models used by Argo Events.
-
-- `hera.workflows.models` - This package contains the auto-generated code that allows you to construct Argo Workflows. It provides Pydantic models for all the Argo Workflows OpenAPI objects, and allows you to construct workflows using these models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models used by Argo Workflows.
-
-- `hera.workflows` - This package contains the hand-written code that allows you to construct and submit Argo Workflows. It wraps the auto-generated code, and provides a more intuitive interface to the Argo API. It also provides a number of useful features, such as the ability to submit workflows from a Python function. This package has various extension points that allow you to plug-in the auto-generated models in case you need to use a feature that is not yet supported by the hand-written code.
-
-The major differences between v4 and v5 are:
-
-- The `hera.workflows.models` package is now auto-generated, and is based on the OpenAPI specification of Argo Workflows. This means that all the models are exactly the same as the models used by Argo Workflows, and you can use all the features of Argo Workflows in your workflows written with `hera`.
-
-- The auto-generated models are based on Pydantic, which means that you can use all the features of Pydantic to construct your workflows. This includes better type-checking, auto-completion in IDEs and more.
-
-- All template types are now supported. This means that you can now use all the template types that are supported by Argo Workflows, such as DAGs, Steps, Suspend and more. Previously, only the DAG template type was supported.
-
-- The hand-written code has been rewritten to be extensible. This means that you can now easily extend the library to support new features, or to support features that are not yet supported by the hand-written code. This is done by using the `hera.workflows.models` package, and plugging it into the `hera.workflows` package.
-
-The following example shows how to use the DAG template type.
-
-```python
-from hera.workflows import (
-    DAG,
-    Workflow,
-    script,
-)
-
-
-# Notice that we are using the script decorator to define the function.
-# This is required in order to use the function as a template.
-# The decorator also allows us to define the image that will be used to run the function and
-# other parameters that are specific to the Script template type.
-@script(add_cwd_to_sys_path=False, image="python:alpine3.6")
-def say(message):
-    print(message)
-
-
-with Workflow(generate_name="dag-diamond-", entrypoint="diamond") as w:
-    # Note that we need to explicitly specify the DAG template type.
-    with DAG(name="diamond"):
-        # We can now use the decorated function as tasks in the DAG.
-        A = say(name="A", arguments={"message": "A"})
-        B = say(name="B", arguments={"message": "B"})
-        C = say(name="C", arguments={"message": "C"})
-        D = say(name="D", arguments={"message": "D"})
-        # We can also use the `>>` operator to define dependencies between tasks.
-        A >> [B, C] >> D
-```
diff --git a/mkdocs.yml b/mkdocs.yml
index a7c15f043..d48961809 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -5,7 +5,12 @@ copyright: Copyright © 2023 Flaviu Vadan, Sambhav Kothari, Elliot Gunton
 
 nav:
   - Home: README.md
-  - Introduction: introduction.md
+  - Getting Started:
+    - getting-started/introduction.md
+    - getting-started/quick-start.md
+    - Walk Through:
+      - getting-started/walk-through/about-hera.md
+      - getting-started/walk-through/hello-world.md
   - Hera expr transpiler: expr.md
   - Examples:
     - Workflows: