Skip to content

EP07 Database transaction API

Paul Haesler edited this page Aug 25, 2022 · 11 revisions

ODC-EP 07 - Database transaction API

Overview

Database transactions are currently handled internally to Index Drivers and not exposed to higher architectural levels.

This EP proposes an API to allow user-managed transactions. The API consists solely of new methods, and new optional arguments to existing methods where the default (i.e. not supplying the new arguments) corresponds to existing behaviour. This proposal is therefore fully backwards compatible.

This proposal is a more detailed writeup of an enhancement already proposed in the ODCv2 Road Map.

Proposed By

Paul Haesler (@SpacemanPaul)

State

  • Under Discussion
  • In Progress
  • Completed
  • Rejected
  • Deferred

Motivation

Currently, high-level indexing tools (e.g. in odc-tools) cannot make use of database transactions to ensure index consistency and have limited efficiency of bulk indexing because each dataset must be indexed in a separate database transaction.

Proposal

  1. A new abstract Transaction class.

    • Has abstract methods begin_transaction, commit, and rollback (intended for internal use - use as context manager preferred).
    • SHALL implement ContextHandler semantics (i.e. __enter__ and __exit__ functions.)
    • SHALL carry awareness of whether the transaction it represents is active, committed or rolledback.
    • SHALL carry methods to construct special exceptions that will be caught in __exit__ and trigger a commit or rollback: commit_exception and rollback_exception.
    • Implemented for postgres and postgis drivers (as wrappers around the existing PostgresDbAPI/PostgisDbAPI classes)
  2. A new abstract method on AbstractIndex: transaction()

    • Returns a Transaction context handler with an active transaction.
    • Implemented for postgres and postgis drivers
    • Raises an exception on Index Drivers that do not support transactions.
  3. A new optional argument to most (if not all) existing Index driver methods transaction: Optional[Transaction] = None.

    • If None, current behaviour: method is executed in it's own transaction (or in multiple transactions).
    • Raise exception if inactive transaction is passed in.
    • If active transaction is passed in, perform all DB reads and writes in the context of that transaction.
    • Implement as a method decorator to minimise repetition.
    • Non-transaction supporting index drivers raise exception if transaction is not None.
    • Passing in a transaction generated by a different index instance raises exception.

Code Examples:

# Context manager
with dc.index.transaction() as trans:
   # Archive old datasets and add new ones in single transaction
   dc.index.datasets.archive([old_ds1.id, old_ds2.id], transaction=trans)
   dc.index.datasets.add(ds1, transaction=trans)
   dc.index.datasets.add(ds2, transaction=trans)
   # If execution gets to here, the transaction is commited.
   # If an exception was raised by any of the above methods, the transaction is rolled back.
# Handling commits and rollbacks with exceptions
with dc.index.transaction() as trans:
   # Archive old datasets and add new ones in single transaction
   dc.index.datasets.archive([old_ds1.id, old_ds2.id], transaction=trans)
   dc.index.datasets.add(ds1, transaction=trans)
   if dont_need_ds2(ds1, ds2):
       # Commit immediately and exit context handler.
       raise trans.commit_exception("Shortcut commit")
   elif problematic(ds1, ds2):
       # Rollback immediately and exit context handler
       raise trans.rollback_exception("Shortcut commit")
   elif really_problematic(ds1, ds2):
       # Other exceptions trigger rollback as well, but
       # are not caught by `__exit__` and so are propagated on
       # up the stack.
       raise ValueError("Uncaught exceptions trigger rollback too.")

   dc.index.datasets.add(ds2, transaction=trans)
   # If execution gets to here, the transaction is commited.
   # Manual low-level transaction management (not recommended, but supported)
   trans = dc.index.transaction()
   dc.index.datasets.update(ds1, transaction=trans)
   dc.index.datasets.update(ds2, transaction=trans)
   trans.rollback()
   # ds1 and ds2 are NOT updated in the database
   try:
      dc.index.datasets.update(ds3, transaction=trans)
   except InactiveTransactionException:
      # Update of ds3 will raise an exception as transaction is not active after rollback
      pass
   # Manually start a new transaction:
   trans.begin_transaction()
   dc.index.datasets.update(ds4, transaction=trans)
   dc.index.datasets.update(ds5, transaction=trans)
   trans.commit()
   # ds4 and ds5 ARE updated in the database
   # Transaction is now inactive. No further cleanup required.

Feedback

Voting

Enhancement Proposal Team

  • Paul Haesler (@SpacemanPaul)

Links

Clone this wiki locally