Getting started docs (#634)

- [x] Fixes #627
- [x] ~Tests added~ docs only
- [x] Documentation/examples added
- [x] [Good commit messages](https://cbea.ms/git-commit/) and/or PR title

**Description of PR**

See #627. This PR gets users off the ground. Future PRs to cover features like DAGs, Artifacts etc

---------

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
argoproj-labs · May 25, 2023 · 373c8f2 · 373c8f2
1 parent 36aeef4
commit 373c8f2
Showing 7 changed files with 359 additions and 88 deletions.
diff --git a/docs/getting-started/introduction.md b/docs/getting-started/introduction.md
@@ -0,0 +1,84 @@
+# Introduction
+
+Hera is a Python library that allows you to construct and submit Argo Workflows. It is designed to be intuitive and easy
+to use, while also providing a powerful interface to the underlying Argo API.
+
+## Hera V5 vs V4
+
+Hera v5 is a major release that introduces breaking changes from v4. The main reason for this is that v5 is a complete
+rewrite of the library, and is now based on the OpenAPI specification of Argo Workflows. This allows us to provide a
+more intuitive interface to the Argo API, while also providing full feature parity with Argo Workflows. This means that
+you can now use all the features of Argo Workflows in your workflows. Additionally, it has been re-structured to
+accommodate other Argo projects, such as Argo Events and Argo CD. Currently only Argo Workflows is supported, and there
+is some work in progress to add support for Argo Events.
+
+The codebase is now much more readable, and the focus can be fully dedicated to improving the Python interface to
+various Argo projects rather than maintaining feature parity with the Argo codebase. The library is divided into the
+following components:
+
+- `hera.shared` - This package contains the shared code that will be used by all Argo projects. This includes common
+  global configuration to interact with the Argo API, and common Pydantic base models that are used by all Argo
+  projects.
+
+- `hera.events.models` - This package contains the auto-generated code that allows you to construct Argo Events. It
+  provides Pydantic models for all the Argo Events OpenAPI objects, and allows you to construct events using these
+  models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models used by
+  Argo Events.
+
+- `hera.workflows.models` - This package contains the auto-generated code that allows you to construct Argo Workflows.
+  It provides Pydantic models for all the Argo Workflows OpenAPI objects, and allows you to construct workflows using
+  these models. These models are based on the OpenAPI specification, and are therefore exactly the same as the models
+  used by Argo Workflows.
+
+- `hera.workflows` - This package contains the hand-written code that allows you to construct and submit Argo Workflows.
+  It wraps the auto-generated code, and provides a more intuitive interface to the Argo API. It also provides a number
+  of useful features, such as the ability to submit workflows from a Python function. This package has various extension
+  points that allow you to plug-in the auto-generated models in case you need to use a feature that is not yet supported
+  by the hand-written code.
+
+The major differences between v4 and v5 are:
+
+- The `hera.workflows.models` package is now auto-generated, and is based on the OpenAPI specification of Argo
+  Workflows. This means that all the models are exactly the same as the models used by Argo Workflows, and you can use
+  all the features of Argo Workflows in your workflows written with `hera`.
+
+- The auto-generated models are based on Pydantic, which means that you can use all the features of Pydantic to
+  construct your workflows. This includes better type-checking, auto-completion in IDEs and more.
+
+- All template types are now supported. This means that you can now use all the template types that are supported by
+  Argo Workflows, such as DAGs, Steps, Suspend and more. Previously, only the DAG template type was supported.
+
+- The hand-written code has been rewritten to be extensible. This means that you can now easily extend the library to
+  support new features, or to support features that are not yet supported by the hand-written code. This is done by
+  using the `hera.workflows.models` package, and plugging it into the `hera.workflows` package.
+
+The following example shows how to use the DAG template type.
+
+```python
+from hera.workflows import (
+    DAG,
+    Workflow,
+    script,
+)
+
+
+# Notice that we are using the script decorator to define the function.
+# This is required in order to use the function as a template.
+# The decorator also allows us to define the image that will be used to run the function and
+# other parameters that are specific to the Script template type.
+@script(add_cwd_to_sys_path=False, image="python:alpine3.6")
+def say(message):
+    print(message)
+
+
+with Workflow(generate_name="dag-diamond-", entrypoint="diamond") as w:
+    # Note that we need to explicitly specify the DAG template type.
+    with DAG(name="diamond"):
+        # We can now use the decorated function as tasks in the DAG.
+        A = say(name="A", arguments={"message": "A"})
+        B = say(name="B", arguments={"message": "B"})
+        C = say(name="C", arguments={"message": "C"})
+        D = say(name="D", arguments={"message": "D"})
+        # We can use the `>>` operator or the `.next()` method to define dependencies between tasks.
+        A >> [B, C] >> D
+```
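
As a rough illustration of how a chain like `A >> [B, C] >> D` can record dependencies, here is a minimal, standalone sketch built on Python's `__rshift__` and `__rrshift__` hooks. The `ToyTask` class and its `depends` list are hypothetical names invented for this sketch. It is not Hera's implementation, and it needs neither Hera nor Argo to run.

```python
class ToyTask:
    """Hypothetical stand-in for a DAG task; not a Hera class."""

    def __init__(self, name):
        self.name = name
        self.depends = []  # names of the tasks this one waits for

    def __rshift__(self, other):
        # Handles `task >> task` and `task >> [task, ...]`.
        targets = other if isinstance(other, list) else [other]
        for target in targets:
            target.depends.append(self.name)
        return other  # returning the right-hand side lets the chain continue

    def __rrshift__(self, others):
        # Handles `[task, ...] >> task`: a list has no `__rshift__`,
        # so Python falls back to the right operand's `__rrshift__`.
        for upstream in others:
            self.depends.append(upstream.name)
        return self


A, B, C, D = (ToyTask(n) for n in "ABCD")
A >> [B, C] >> D

for task in (A, B, C, D):
    print(task.name, "depends on", task.depends)
```

In this sketch `B` and `C` each end up depending on `A`, and `D` depends on both `B` and `C`, which is the diamond shape that gives the example its name.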
diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
@@ -0,0 +1,92 @@
+# Quick Start
+
+## Install Argo tools
+
+Ensure you have a Kubernetes cluster, kubectl and Argo Workflows installed by following the
+[Argo Workflows Quick Start](https://argoproj.github.io/argo-workflows/quick-start/).
+
+Ensure you are able to submit a workflow to Argo as in the example:
+
+```console
+argo submit -n argo --watch https://raw.githubusercontent.com/argoproj/argo-workflows/master/examples/hello-world.yaml
+```
+
+## Install Hera
+
+[![Pypi](https://img.shields.io/pypi/v/hera.svg)](https://pypi.python.org/pypi/hera)
+
+Hera is available on PyPI as the `hera` package. Add this dependency to your project in your usual way, e.g. pip or
+poetry, or install directly with `pip install hera`.
+
+## Hello World
+
+If you were able to run the `argo submit` command above, copy the following Workflow definition into a local file
+`hello_world.py`.
+
+```py
+from hera.workflows import Steps, Workflow, WorkflowsService, script
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+Run the file:
+
+```console
+python -m hello_world
+```
+
+You will then see the Workflow at <https://localhost:2746/>
+
+## Hello World on an existing Argo installation
+
+If you or your organization are already running on Argo and you're interested in using Hera to write your Workflow
+definitions, you will need to set up some config variables in `hera.shared.global_config`. Copy the following as a basis
+and fill in the blanks.
+
+```py
+from hera.workflows import Steps, Workflow, script
+from hera.shared import global_config
+
+global_config.host = "https://<your-host-name>"
+global_config.token = ""  # Copy token value after "Bearer" from the `argo auth token` command
+global_config.image = "<your-image-repository>/python:3.8"  # set the image if you cannot access "python:3.8" via Docker Hub
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+Run the file:
+
+```console
+python -m hello_world
+```
+
+You will then see the Workflow at https://\<your-host-name>
diff --git a/docs/getting-started/walk-through/about-hera.md b/docs/getting-started/walk-through/about-hera.md
@@ -0,0 +1,49 @@
+# About
+
+Hera is a Python library that allows you to construct and submit Argo Workflows. It is designed to be intuitive and easy
+to use, while also providing a powerful interface to the underlying Argo API.
+
+Hera acts as a domain-specific language on top of Argo, so it is primarily a way to define Workflows. In previous Argo
+Workflows surveys such as [2021](https://blog.argoproj.io/argo-workflows-2021-survey-results-d6fa890030ee), a better
+Python DSL has been highly requested to overcome the YAML barrier to adoption. In the
+[2022 survey results](https://blog.argoproj.io/cncf-argo-project-2022-user-survey-results-f9caf46df7fd#:~:text=Job%20Roles%20%26%20Use%20Cases)
+we can infer from the job roles for people using Argo Workflows that the DevOps Engineers are likely more comfortable
+using YAML than ML Engineers.
+
+> DevOps Engineers: 41%
+> Software Engineer: 20%
+> Architects: 20%
+> Data Engineer / Data Scientist / ML Engineer: 13%
+
+We hope that providing a more intuitive Python definition language will increase the number of Data and ML users of
+Argo Workflows.
+
+## Feature Parity
+
+A natural concern about an abstraction layer on top of another technology is whether it can function the same as the
+original lower layer. In this case, Hera generates a [library of model classes](../../api/workflows/models.md) using
+Argo's OpenAPI spec which are wrapped up by Hera's feature-rich classes, while the model classes are available as a
+fallback mechanism. You can check out the extensive
+["upstream" examples](../../examples/workflows/upstream/dag_diamond.md) that contain side-by-side Python and YAML
+definitions for Workflows in
+[the Argo examples folder on GitHub](https://github.com/argoproj/argo-workflows/tree/master/examples). Our CI/CD runs
+through the Argo examples folder to check that we are able to reproduce them using Hera Workflows written by hand (note:
+we have not _yet_ written Hera Workflows for all the examples).
+
+If you are a new user of Argo, we encourage you to become familiar with
+[Argo's Core Concepts](https://argoproj.github.io/argo-workflows/workflow-concepts/), which provide a foundation of
+understanding when working with Hera. Working through the
+[Argo Walk Through](https://argoproj.github.io/argo-workflows/walk-through/) will also help you understand key concepts
+before moving to Python.
+
+## Context Managers
+
+You will notice many classes in Hera implement the context manager interface. This was designed to mirror the YAML
+syntax of Argo. It helps existing users come to Hera from YAML, and it lets users new to both Argo and Hera interpret
+most of the existing YAML documentation and resources online through familiar naming and functionality in Hera.
+
+## Orchestrating Scripts
+
+A natural extension of a Python DSL for Argo is tighter integration with Python scripts. This is where Hera improves the
+developer experience through its tailored classes and syntactic sugar to enable developers to easily orchestrate Python
+functions. Check out [Hello World](hello-world.md) to get started!
diff --git a/docs/getting-started/walk-through/hello-world.md b/docs/getting-started/walk-through/hello-world.md
@@ -0,0 +1,128 @@
+# Hello World
+
+Let's take a look at the `hello_world.py` from the [Quick Start](../quick-start.md) guide.
+
+```py
+from hera.workflows import Steps, Workflow, WorkflowsService, script
+
+
+@script()
+def echo(message: str):
+    print(message)
+
+
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+    with Steps(name="steps"):
+        echo(arguments={"message": "Hello world!"})
+
+w.create()
+```
+
+## The imports
+
+As we are using Argo Workflows, we import specialized classes from `hera.workflows`. You will see that Argo concepts
+from the Argo spec have been transformed into powerful Python classes; explore them at the
+[Hera Workflows API reference](../../api/workflows/hera.md).
+
+For this Workflow, we want to echo using Python's `print` function, which is wrapped in our convenience `echo` function.
+We use Hera's `script` decorator to turn the `echo` function into what's known as a
+[Script template](https://argoproj.github.io/argo-workflows/workflow-concepts/#script), which is mirrored in Hera by the
+`Script` class. As we're defining the Workflow in Python, Hera is able to infer multiple field values that the developer
+would otherwise have to define when using YAML.
+
+## The script decorator
+
+The `script` decorator accepts the same kwargs as a `Script`. Importantly, you can specify the `image` of Python
+to use instead of the default `python:3.8` for your script if required:
+
+```py
+@script(image="python:3.11")
+def echo(message: str):
+    print(message)
+```
+
+Alternatively, you can specify this image once via the `global_config.image` variable, and it will be used for all
+`script`s automatically:
+
+```py
+from hera.shared import global_config
+global_config.image = "python:3.11"
+
+@script()  # "echo" will now run using python:3.11, as will any other scripts you define
+def echo(message: str):
+    print(message)
+
+@script()  # "echo_twice" will also run using python:3.11
+def echo_twice(message: str):
+    print(message)
+    print(message)
+```
+
+## The Workflow context manager
+
+The Workflow context manager acts as a scope under which `template` Hera objects can be declared, which include
+Containers, Scripts, DAGs [and more](https://argoproj.github.io/argo-workflows/workflow-concepts/#template-types). For a
+minimal example, you will need to provide your `Workflow` with the initialization values seen below:
+
+```py
+with Workflow(
+    generate_name="hello-world-",
+    entrypoint="steps",
+    namespace="argo",
+    workflows_service=WorkflowsService(host="https://localhost:2746")
+) as w:
+```
+
+* `generate_name` is used by Argo upon submission, where it appends a random 5-character suffix, so you may see this
+  Workflow run with a name like `hello-world-vmsz5`.
+* `entrypoint` tells Argo which template to run upon submission.
+* `namespace` refers to the Kubernetes namespace you want to submit to.
+* `workflows_service` is the submission service.
+
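To get a feel for what `generate_name` produces, the following standalone sketch imitates the naming scheme locally. It is only an illustration: the real suffix is chosen by Argo upon submission, as described above. The helper `toy_generate_name` and the choice of alphabet (lowercase letters and digits) are assumptions made for this sketch.

```python
import random
import string

# Assumption: the suffix is drawn from lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def toy_generate_name(prefix, suffix_length=5):
    """Hypothetical helper: append a random suffix to `prefix`."""
    suffix = "".join(random.choices(ALPHABET, k=suffix_length))
    return prefix + suffix


name = toy_generate_name("hello-world-")
print(name)  # a different name on every run
```

Because the suffix is random, each submission of the same definition gets its own name, so you can submit `hello_world.py` repeatedly without name clashes.
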
+## The Steps context manager
+
+A `Steps` template is the second template type of this example, the first being the `Script`. The `Steps` template,
+along with the `DAG` template, is known as a "template invocator". This is because they are used to arrange other
+templates, mainly Containers and Scripts, to do the actual work. In Hera, the `Steps` class is a context manager as it
+automatically arranges your templates in the order that you add them, with each template invocation known as a `Step`.
+
+```py
+with Steps(name="steps"):
+```
+
+To invoke the `echo` template, you can call it, passing values to its arguments through the `arguments` kwarg, which is
+a dictionary of the _function_ kwargs to values. This is because under a `Steps` or `DAG` context manager, the `script`
+decorator converts a call of the function into a `Script` object, to which you must pass `Script` initialization kwargs.
+
+```py
+echo(arguments={"message": "Hello world!"})
+```
+
+> For advanced users: the exact mechanism of the `script` decorator is to create a `Script` object when declared, so
+> that when your function is invoked you have to pass its arguments through the `arguments` kwarg as a dictionary, and
+> the `Script` object's `__call__` function is invoked with the `arguments` kwarg. The `__call__` function on a
+> `CallableTemplateMixin` automatically creates a `Step` or a `Task` depending on whether the context manager is a
+> `Steps` or a `DAG`.
+
+## Submitting the Workflow
+
+Finally, with the workflow defined, the actual submission occurs on:
+
+```py
+w.create()
+```
+
+This uses the `WorkflowsService` to submit to Argo using its REST API, so `w.create()` can be thought of as running
+`argo submit`.
+
+Alternatively, you may want to see what the YAML looks like for this Workflow. You can print it, or write it to a
+file, using `w.to_yaml()`.
+
+```py
+print(w.to_yaml())
+```
diff --git a/docs/hera_getting_started.md b/docs/hera_getting_started.md