bug: ibis should release resources when an `ibis.memtable` op is GC'd
#10044
Labels: performance (issues related to ibis's performance)
`ibis.memtable` is useful for constructing small tables out of in-memory data in a backend-agnostic way. When `ibis` executes an expression containing a `memtable` node, a new "table" is created in the backend with the backing data (only if it wasn't created yet), and is then used like any other table when executing the query. How the "table" is created is backend dependent:

- For remote SQL backends (`postgres`, `bigquery`, ...) registering a memtable is equivalent to `con.create_table(unique_name, data, temp=True)`. These are temporary tables that are automatically cleaned up when the session is closed.
- Local backends (`duckdb`) have more native ways of registering in-memory data that don't require a copy. These are also cleaned up when the session ends.

Ibis currently avoids duplicate registrations of the same memtable in the same backend, but we don't do anything to automatically clean up memtables when they go out of scope; instead we rely on them being cleaned up when the connection is closed.
In #10041 (and #10042) it was noted that this behavior is non-ideal in the local case, since local backends generally keep references to the in-memory data alive even when the user no longer has any way of accessing those tables. The same holds for remote SQL backends, except that there local memory usage is unaffected and the cost is an excessive number of temp tables accumulating in the backend. The local case is definitely the bigger issue, but we might want to automatically de-register memtables on GC in all cases.
Sketch of the intended memtable behavior:
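One way to picture it, as a minimal backend-free sketch: `MemTable`, `FakeBackend`, and `_deregister` below are hypothetical stand-ins invented for illustration, not ibis APIs, and `weakref.finalize` is only one possible mechanism for hooking into garbage collection.

```python
import gc
import itertools
import weakref


class MemTable:
    """Hypothetical stand-in for an ibis.memtable op wrapping in-memory data."""

    _ids = itertools.count()

    def __init__(self, data):
        self.data = data
        self.name = f"memtable_{next(self._ids)}"


def _deregister(backend_ref, name):
    # Runs when the MemTable is collected; the backend may already be gone.
    backend = backend_ref()
    if backend is not None:
        backend.tables.pop(name, None)


class FakeBackend:
    """Hypothetical backend that tracks registered in-memory tables by name."""

    def __init__(self):
        self.tables = {}

    def execute(self, memtable):
        # Register the backing data only once per backend.
        if memtable.name not in self.tables:
            self.tables[memtable.name] = memtable.data
            # The callback holds only the table name and a weak reference to
            # the backend, so it keeps neither the memtable nor the backend alive.
            weakref.finalize(memtable, _deregister, weakref.ref(self), memtable.name)
        return self.tables[memtable.name]


con = FakeBackend()
t = MemTable({"a": [1, 2, 3]})
con.execute(t)
con.execute(t)  # second execution reuses the existing registration
assert len(con.tables) == 1

del t         # the user can no longer reach the memtable...
gc.collect()  # ...so once it is collected,
assert len(con.tables) == 0  # the backend has released its data
```

In this sketch the backend stores only the data, never the memtable object itself, so dropping the last user reference is enough to trigger cleanup; a real backend would drop the temp table or unregister the in-memory view in place of the dictionary `pop`.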