TODO tracker #2

Open
7 of 20 tasks
lffg opened this issue Feb 14, 2024 · 0 comments

  • Graceful shutdown
    • Pass cancellation token to job's context (Pass cancellation token to job's context #3)
    • Propagate cancellation confirmation up the task hierarchy tree (4077065)
    • Study whether we should provide a wrapper type over tokio_util's CancellationToken; otherwise, a downstream user who wants to implement graceful shutdown would need to depend directly on tokio_util. (4077065) See the wrapper sketch after this list.
  • Improve error handling (see the error-type sketch after this list)
    • External error type without any external (third-party) types associated with it (bab4625)
    • Internal error type (bab4625)
      • Study whether we should introduce a "level" field on the internal error
      • The actual problem is deciding which internal errors should be reported to the user. I think we should only expose errors related to job executions ("user-provided code")
    • Refactor the current lifecycle implementation to accommodate the new error types (fe6f7c4)
    • Decide how to handle InternalErrors that arise from the job lifecycle implementation
      • We should probably just expose internal errors through the error handler and let the user do something with them (e.g. log them and then file a bug report on fila's issue tracker). I can't see a more sensible option.
    • Handle panics in job executors (ec68b69)
    • Apply timeout
    • Expose error handler
  • Create the Maintainer process tree (a Subscriber subcomponent)
    • Cleaner, to remove completed and cancelled jobs that are past a certain age
    • Rescuer, to recover jobs:
      • That are stuck in the processing state. This mostly happens when the node that was processing a job goes down while the job is executing, so the job lifecycle implementation can't finish.
      • That are stuck in the available state. This happens when a job is published while there aren't any listening subscribers, which causes the Postgres notification to get lost.
    • Scheduler (not for now; see long term)
  • Do some load tests
  • Abstract over the DB driver (this library should be compatible with both sqlx and tokio-postgres, under different feature flags). Traits (see the trait sketch after this list):
    • Executor, to execute queries
    • Transaction: Executor, to run a transaction
    • Pool
    • Listener
  • Testing stuff (under mod fila::test)
    • Test-friendly send API which also allows one to set the state
    • Create a "mock" implementation for the above database traits so that we may mock job publishing during tests without actually talking to the database
    • Create assertion helpers which also query the database state, e.g. to fetch a specific job's status.
      • We'll probably have to make fila::send return an opaque JobId so the job can be identified later.
  • Guide-level documentation
    • Do not forget to mention that the current architecture guarantees only at-least-once job execution. In the future we may introduce a TransactionalJob trait that receives the same transaction as the job lifecycle runner to ensure exactly-once semantics (see the TransactionalJob sketch after this list).
  • Long-term
    • Job scheduling
      • Cron-style scheduling
      • Replace the current default retry strategy (immediate retry) with exponential backoff (see the backoff sketch after this list)
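
Regarding the CancellationToken wrapper item in the graceful-shutdown section above, here is a minimal sketch of how fila could wrap tokio_util::sync::CancellationToken so downstream users don't need a direct tokio-util dependency. The name ShutdownSignal and its method set are assumptions for illustration, not the committed design.

```rust
use tokio_util::sync::CancellationToken;

/// Wrapper that hides the `tokio_util` dependency from downstream users.
#[derive(Clone, Default)]
pub struct ShutdownSignal {
    inner: CancellationToken,
}

impl ShutdownSignal {
    /// Built by fila from its own token; not part of the public surface.
    pub(crate) fn new(inner: CancellationToken) -> Self {
        Self { inner }
    }

    /// Completes once a graceful shutdown has been requested.
    pub async fn cancelled(&self) {
        self.inner.cancelled().await;
    }

    /// Returns `true` if shutdown has already been requested.
    pub fn is_cancelled(&self) -> bool {
        self.inner.is_cancelled()
    }
}
```

A job handler would then receive a ShutdownSignal through its context and select! on signal.cancelled() to wind down cleanly.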
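
Regarding the error-handling items above, here is a sketch of one way the external/internal split could look: the public Error exposes job failures without leaking driver types, while InternalError (with the possible "level" field) is only surfaced through the error handler. All names are illustrative; bab4625 is the authoritative change.

```rust
use std::fmt;

/// Public error type: no sqlx/tokio-postgres types leak through it.
#[derive(Debug)]
pub enum Error {
    /// The job's own ("user-provided") code failed.
    Job(Box<dyn std::error::Error + Send + Sync>),
    /// Something went wrong inside fila; reported via the error handler.
    Internal(InternalError),
}

/// Internal error type, kept opaque to users.
#[derive(Debug)]
pub struct InternalError {
    pub message: String,
    /// The "level" field discussed above, if it turns out to be useful.
    pub level: ErrorLevel,
}

#[derive(Debug, Clone, Copy)]
pub enum ErrorLevel {
    Warning,
    Fatal,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Job(e) => write!(f, "job execution failed: {e}"),
            Error::Internal(e) => write!(f, "internal error: {}", e.message),
        }
    }
}

impl std::error::Error for Error {}
```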
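
Regarding the DB-driver abstraction item above, here is a rough sketch of how the four traits could be shaped so that sqlx- and tokio-postgres-backed implementations can live behind feature flags. The signatures and the DriverError placeholder are guesses for illustration, not the final API.

```rust
use std::future::Future;

/// Placeholder error type for the sketch.
#[derive(Debug)]
pub struct DriverError;

/// Executes individual queries.
pub trait Executor {
    fn execute(&mut self, sql: &str) -> impl Future<Output = Result<u64, DriverError>> + Send;
}

/// An executor that is also a transaction.
pub trait Transaction: Executor {
    fn commit(self) -> impl Future<Output = Result<(), DriverError>> + Send;
    fn rollback(self) -> impl Future<Output = Result<(), DriverError>> + Send;
}

/// Hands out transactions (and, eventually, plain connections).
pub trait Pool {
    type Tx: Transaction;
    fn begin(&self) -> impl Future<Output = Result<Self::Tx, DriverError>> + Send;
}

/// Wraps Postgres LISTEN/NOTIFY so the subscriber can wait for new jobs.
pub trait Listener {
    fn listen(&mut self, channel: &str) -> impl Future<Output = Result<(), DriverError>> + Send;
    fn recv(&mut self) -> impl Future<Output = Result<String, DriverError>> + Send;
}
```

A mock implementation of these traits (per the testing section) could then publish and "execute" jobs entirely in memory, without talking to the database.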
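
Regarding the at-least-once note in the documentation item, this is a hypothetical shape for the future TransactionalJob idea: the job runs against the same transaction as the lifecycle bookkeeping, so its writes commit or roll back together with the state transition. Tx is a stand-in for whatever transaction handle the driver abstraction ends up exposing.

```rust
use std::future::Future;

/// Stand-in for the driver abstraction's transaction handle.
pub struct Tx;

/// A job whose side effects share the lifecycle runner's transaction,
/// giving exactly-once semantics for transactional work.
pub trait TransactionalJob {
    type Error;

    fn run<'a>(
        &'a self,
        tx: &'a mut Tx,
    ) -> impl Future<Output = Result<(), Self::Error>> + Send + 'a;
}
```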
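
Regarding the long-term retry item, here is a sketch of an exponential-backoff delay that could replace the current immediate-retry default. The base delay and cap are illustrative constants.

```rust
use std::time::Duration;

/// Delay before the `attempt`-th retry (attempt starts at 1):
/// doubles every attempt and is capped at a maximum.
pub fn backoff_delay(attempt: u32) -> Duration {
    const BASE: Duration = Duration::from_secs(1);
    const MAX: Duration = Duration::from_secs(5 * 60);

    let exp = attempt.saturating_sub(1).min(16); // keep the shift in range
    BASE.saturating_mul(1u32 << exp).min(MAX)
}
```

With these constants the delays would be 1 s, 2 s, 4 s, ..., capped at five minutes; jitter could be layered on top later.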