Task agglomeration
Task agglomeration lets HyperFlow submit groups of tasks for execution together. This is particularly useful for small tasks whose job creation time is comparable to their execution time. HyperFlow provides a generic job buffering mechanism which can be used to implement task agglomeration in a particular computing infrastructure. Currently it is implemented for Kubernetes clusters as follows:
- The k8sCommand function uses job buffering to submit multiple jobs as a single Pod (Kubernetes Job)
- The HyperFlow job executor can execute arrays of tasks (sequentially)
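The buffering idea behind these two points can be sketched in a few lines of Python. This is an illustration only, not HyperFlow's actual implementation: the `JobBuffer` class, its method names, and the choice to start the timer when the first job enters an empty buffer are all assumptions made for the example. Time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobBuffer:
    """Hypothetical buffer: collects jobs until `size` is reached or `timeout_ms` has passed."""
    size: int
    timeout_ms: int
    _items: List[str] = field(default_factory=list)
    _first_added_ms: Optional[int] = None

    def add(self, job: str, now_ms: int) -> Optional[List[str]]:
        """Buffer a job; return a full batch if the size limit is reached, else None."""
        if not self._items:
            # Assumption: the timeout is measured from the first buffered job.
            self._first_added_ms = now_ms
        self._items.append(job)
        if len(self._items) >= self.size:
            return self._flush()
        return None

    def poll(self, now_ms: int) -> Optional[List[str]]:
        """Return a partial batch if the timeout has expired, else None."""
        if self._items and now_ms - self._first_added_ms >= self.timeout_ms:
            return self._flush()
        return None

    def _flush(self) -> List[str]:
        """Hand over the buffered jobs and reset the buffer."""
        batch, self._items = self._items, []
        self._first_added_ms = None
        return batch


buf = JobBuffer(size=3, timeout_ms=3000)
buf.add("job-1", now_ms=0)
buf.add("job-2", now_ms=100)
full_batch = buf.add("job-3", now_ms=200)   # size limit reached: all three jobs
buf.add("job-4", now_ms=300)
too_early = buf.poll(now_ms=1000)           # timeout not yet expired
partial_batch = buf.poll(now_ms=3300)       # timeout expired: only job-4
```

In this sketch each returned batch stands for one submission, which on Kubernetes would correspond to one Pod running its tasks sequentially.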
Task agglomeration is configured as follows:
workflow.config.jobAgglomerations.json:
[
  {
    "matchTask": ["mProject"],
    "size": 2,
    "timeoutMs": 3000
  },
  {
    "matchTask": ["mDiffFit"],
    "size": 6,
    "timeoutMs": 3000
  },
  {
    "matchTask": ["mBackground"],
    "size": 3,
    "timeoutMs": 3000
  }
]
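This configuration sets agglomeration rules for different types of tasks. Each rule names the tasks it applies to (matchTask), the maximum group size (size), and the maximum time to wait for a group to fill up (timeoutMs, in milliseconds). For example, mDiffFit tasks are buffered until six of them have been gathered or 3 seconds have elapsed, whichever comes first, and are then submitted as one group.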
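To make the rule semantics concrete, the following Python sketch looks up the rule for a task name and splits a list of tasks into groups of the configured size. It is not HyperFlow code: `find_rule` and `group_by_size` are hypothetical helpers, and exact name matching against the matchTask list is an assumption. The timeout is ignored here; in practice it would be what flushes the final, incomplete group.

```python
import json
from typing import Dict, List, Optional

# Same rules as in workflow.config.jobAgglomerations.json above.
RULES_JSON = """
[
  {"matchTask": ["mProject"], "size": 2, "timeoutMs": 3000},
  {"matchTask": ["mDiffFit"], "size": 6, "timeoutMs": 3000},
  {"matchTask": ["mBackground"], "size": 3, "timeoutMs": 3000}
]
"""


def find_rule(rules: List[Dict], task_name: str) -> Optional[Dict]:
    """Return the first rule listing task_name in matchTask (exact match assumed)."""
    for rule in rules:
        if task_name in rule["matchTask"]:
            return rule
    return None


def group_by_size(tasks: List[str], size: int) -> List[List[str]]:
    """Split tasks into consecutive groups of at most `size` elements."""
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]


rules = json.loads(RULES_JSON)
rule = find_rule(rules, "mDiffFit")
tasks = [f"mDiffFit-{i}" for i in range(14)]
groups = group_by_size(tasks, rule["size"])
print([len(g) for g in groups])  # [6, 6, 2]
```

With 14 mDiffFit tasks, two full groups of six are formed and the remaining two tasks form a smaller group, which would be submitted once the 3-second timeout expires.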
The following example shows the execution traces of the Montage2 workflow of size 1.0 (4800+ tasks) with and without agglomeration, on a Kubernetes cluster with 8 nodes, each with 8 vCPUs. Note the drastic difference in execution times (600 seconds with agglomeration vs. 1800 seconds without). The workflow has many very small jobs (1-2 seconds), and creating a Kubernetes Pod takes a comparable amount of time, so high parallelism cannot be achieved without agglomeration.
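A rough back-of-envelope model shows why grouping helps when Pod creation time is comparable to task time. The sketch below is a simplification, not a reproduction of the measurements above: the 2-second Pod startup and 2-second task duration are assumed values chosen for illustration, 4800 is used as a round task count, and the function names are hypothetical.

```python
import math


def overhead_fraction(pod_startup_s: float, task_s: float, group_size: int) -> float:
    """Share of a Pod's lifetime spent on startup when it runs group_size tasks sequentially."""
    return pod_startup_s / (pod_startup_s + group_size * task_s)


def pods_needed(n_tasks: int, group_size: int) -> int:
    """Number of Pods required to run n_tasks in groups of group_size."""
    return math.ceil(n_tasks / group_size)


POD_STARTUP_S = 2.0  # assumed value, for illustration only
TASK_S = 2.0         # assumed value, within the 1-2 second range mentioned above

for size in (1, 6):
    share = overhead_fraction(POD_STARTUP_S, TASK_S, size)
    print(f"group size {size}: {pods_needed(4800, size)} Pods, "
          f"{share:.0%} of Pod time spent on startup")
```

Under these assumptions, running one task per Pod spends half of each Pod's lifetime on startup, while groups of six reduce that share to about one seventh and cut the number of Pods to create from 4800 to 800.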