-
Notifications
You must be signed in to change notification settings - Fork 176
EP07 Database transaction API
Database transactions are currently handled internally to Index Drivers and not exposed to higher architectural levels.
This EP proposes an API to allow user-managed transactions. The API consists solely of new methods, and new optional arguments to existing methods where the default (i.e. not supplying the new arguments) corresponds to existing behaviour. This proposal is therefore fully backwards compatible.
This proposal is a more detailed writeup of an enhancement already proposed in the ODCv2 Road Map.
Paul Haesler (@SpacemanPaul)
- Under Discussion
- In Progress
- Completed
- Rejected
- Deferred
Currently, high-level indexing tools (e.g. in odc-tools
) cannot make use of database transactions to ensure index consistency and have limited efficiency of bulk indexing because each dataset must be indexed in a separate database transaction.
-
A new abstract
Transaction
class.- Has abstract methods
begin_transaction
,commit
, androllback
(intended for internal use - use as context manager preferred). - SHALL implement ContextHandler semantics (i.e.
__enter__
and__exit__
functions.) - SHALL carry awareness of whether the transaction it represents is active, committed or rolledback.
- SHALL carry methods to construct special exceptions that will be caught in
__exit__
and trigger a commit or rollback:commit_exception
androllback_exception
. - Implemented for
postgres
andpostgis
drivers (as wrappers around the existingPostgresDbAPI
/PostgisDbAPI
classes)
- Has abstract methods
-
A new abstract method on
AbstractIndex
:transaction()
- Returns a Transaction context handler with an active transaction.
- Implemented for
postgres
andpostgis
drivers - Raises an exception on Index Drivers that do not support transactions.
-
A new optional argument to most (if not all) existing Index driver methods
transaction: Optional[Transaction] = None
.- If None, current behaviour: method is executed in it's own transaction (or in multiple transactions).
- Raise exception if inactive transaction is passed in.
- If active transaction is passed in, perform all DB reads and writes in the context of that transaction.
- Implement as a method decorator to minimise repetition.
- Non-transaction supporting index drivers raise exception if transaction is not None.
- Passing in a transaction generated by a different index instance raises exception.
Code Examples:
# Context manager
with dc.index.transaction() as trans:
# Archive old datasets and add new ones in single transaction
dc.index.datasets.archive([old_ds1.id, old_ds2.id], transaction=trans)
dc.index.datasets.add(ds1, transaction=trans)
dc.index.datasets.add(ds2, transaction=trans)
# If execution gets to here, the transaction is commited.
# If an exception was raised by any of the above methods, the transaction is rolled back.
# Handling commits and rollbacks with exceptions
with dc.index.transaction() as trans:
# Archive old datasets and add new ones in single transaction
dc.index.datasets.archive([old_ds1.id, old_ds2.id], transaction=trans)
dc.index.datasets.add(ds1, transaction=trans)
if dont_need_ds2(ds1, ds2):
# Commit immediately and exit context handler.
raise trans.commit_exception("Shortcut commit")
elif problematic(ds1, ds2):
# Rollback immediately and exit context handler
raise trans.rollback_exception("Shortcut commit")
elif really_problematic(ds1, ds2):
# Other exceptions trigger rollback as well, but
# are not caught by `__exit__` and so are propagated on
# up the stack.
raise ValueError("Uncaught exceptions trigger rollback too.")
dc.index.datasets.add(ds2, transaction=trans)
# If execution gets to here, the transaction is commited.
# Manual low-level transaction management (not recommended, but supported)
trans = dc.index.transaction()
# Transaction is initially active - no need to call begin_transaction.
dc.index.datasets.update(ds1, transaction=trans)
dc.index.datasets.update(ds2, transaction=trans)
trans.rollback()
# ds1 and ds2 are NOT updated in the database
try:
dc.index.datasets.update(ds3, transaction=trans)
except InactiveTransactionException:
# Update of ds3 will raise an exception as transaction is not active after rollback
pass
# Manually start a new transaction:
trans.begin_transaction()
dc.index.datasets.update(ds4, transaction=trans)
dc.index.datasets.update(ds5, transaction=trans)
trans.commit()
# ds4 and ds5 ARE updated in the database
# Transaction is now inactive. No further cleanup required.
- Robert Woodcock (@woodcockr)
- This looks good to me. Straightforward context and if need be manual handling. Provides the opportunity for the transaction benefit. Further optimisation can be performed through incremental changes to other code. I'm not overly familiar with all the database code in ODC these days but I am familiar with the impact of not having the transaction handling. Get's my vote.
- Paul Haesler (@SpacemanPaul)
Welcome to the Open Data Cube