Refresh correct partial during refresh on drop #2613

k-rus · 2020-11-03T08:38:53Z

When drop_chunks is called on a hypertable manually or through
a retention policy, it initiates refresh in hypertable's continuous
aggregates. The refresh includes all buckets of each dropped chunk.
Some buckets are inside a dropped chunk only partially. In such case,
this commit fixes to refresh only the correct partials of the buckets.

The fix passes chunk id to refresh on drop of a chunk. The chunk id is
used to identify and update correct partials. This is a workaround
that the invalidation outside of the refreshed chunk doesn't update
the other partial, which is outside the refreshed chunk.

The commit also adds the test demonstrating correct refresh.

Fixes #2592

tsl/src/continuous_aggs/materialize.c

tsl/test/expected/continuous_aggs_drop_chunks.out

codecov · 2020-11-03T14:28:13Z

Codecov Report

Merging #2613 into master will increase coverage by 0.12%.
The diff coverage is 82.14%.

@@            Coverage Diff             @@
##           master    #2613      +/-   ##
==========================================
+ Coverage   90.02%   90.14%   +0.12%     
==========================================
  Files         212      212              
  Lines       34291    34259      -32     
==========================================
+ Hits        30869    30883      +14     
+ Misses       3422     3376      -46

Impacted Files	Coverage Δ
src/cross_module_fn.c	`68.29% <0.00%> (-1.71%)`	⬇️
tsl/src/data_node_dispatch.c	`97.02% <ø> (-0.25%)`	⬇️
tsl/src/nodes/decompress_chunk/planner.c	`97.03% <ø> (ø)`
tsl/src/data_node.c	`94.90% <90.90%> (-0.11%)`	⬇️
src/chunk.c	`94.65% <100.00%> (+0.08%)`	⬆️
tsl/src/continuous_aggs/materialize.c	`67.61% <100.00%> (+1.61%)`	⬆️
tsl/src/continuous_aggs/refresh.c	`99.38% <100.00%> (ø)`
tsl/src/dist_util.c	`93.28% <100.00%> (+0.04%)`	⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f8f53aa...a2aebad. Read the comment docs.

erimatnor · 2020-11-03T15:11:03Z

tsl/test/sql/continuous_aggs_drop_chunks.sql

+        SELECT time_bucket(7, time_int) as bucket,
+            SUM(value), COUNT(value)
+        FROM conditions GROUP BY bucket WITH DATA;
+SELECT materialization_hypertable_schema||'.'||materialization_hypertable_name AS mat_hyper


I don't think mat_hyper is ever used. Is this query used at all?

Thank you! It is left over after I queries the materialized table in the view, but it was redundant after I improved tests to cover corner cases better. I will remove this.

erimatnor · 2020-11-03T15:15:12Z

tsl/test/expected/continuous_aggs_drop_chunks.out

+(1 row)
+
+INSERT INTO conditions
+SELECT time_val, time_val % 4, 3.14 FROM generate_series(0,100,1) AS time_val;


Would be nice to use the new chunks view to show the ranges of the chunks before dropping them so that one can easily compare the ranges to the buckets in the cagg.

erimatnor · 2020-11-03T15:16:11Z

tsl/test/expected/continuous_aggs_drop_chunks.out

+-- This is the simplest case, when the updated bucket is inside a chunk, so it is expected
+-- that the update is always reflected after the refresh on drop.
+UPDATE conditions SET value = 4.00 WHERE time_int = 2;
+-- This case updates values in the bucket, which affects two different partials in


The test case I was referring to previously was to drop these chunks separately, to see that the same bucket is updated twice; first when dropping the first chunk, then updated again when dropping the second chunk.

No, your suggestion was to drop from the chunks, which are dropped together. See your comment with your example: #2613 (comment)

My suggestion was to change the test so that the chunks are not dropped together. But I see that the requested case is now covered in a different way

I was mislead by your example, which was in the bucket of the chunks dropped together. Yes, the other test was also added.

erimatnor · 2020-11-03T15:22:24Z

tsl/test/expected/continuous_aggs_drop_chunks.out

+ bucket |  sum  | count 
+--------+-------+-------
+      0 | 22.84 |     7
+      7 |  23.7 |     7


If you drop the two first chunks separately, I would expect bucket 7 to first refresh to 22.84 and then when the second chunk is dropped it would be 23.7. Why not test that case as I suggested?

Ohh, I think this is tested below... sorry

Your case is already tested and I commented two times that I am adding it. Please, check the code

gayyappan

Comments requested.

gayyappan · 2020-11-04T14:26:52Z

tsl/src/continuous_aggs/materialize.c


 void
 continuous_agg_update_materialization(SchemaAndName partial_view,
 									  SchemaAndName materialization_table, Name time_column_name,
 									  InternalTimeRange new_materialization_range,
-									  InternalTimeRange invalidation_range, int64 bucket_width)
+									  InternalTimeRange invalidation_range, int64 bucket_width,


I had to follow the code path to figure out what it means to pass INVALID_CHUNK_ID as a parameter. Could you please add a comment stating the difference in behavior for valid/invalid chunk ids?

Added a comment

When drop_chunks is called on a hypertable manually or through a retention policy, it initiates refresh in hypertable's continuous aggregates. The refresh includes all buckets of each dropped chunk. Some buckets are inside a dropped chunk only partially. In such case, this commit fixes to refresh only the correct partials of the buckets. The fix passes chunk id to refresh on drop of a chunk. The chunk id is used to identify and update correct partials. This is a workaround that the invalidation outside of the refreshed chunk doesn't update the other partial, which is outside the refreshed chunk. The commit also adds the test demonstrating correct refresh. Fixes timescale#2592

**Minor Features** * timescale#2627 Add optional user mappings support **Bugfixes** * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation * timescale#2631 Fix segfault in decompress_chunk for chunks with dropped columns

**Minor Features** * timescale#2627 Add optional user mappings support * timescale#2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation * timescale#2631 Fix segfault in decompress_chunk for chunks with dropped columns

This release candidate contains bugfixes since the previous release candidate. **Minor Features** * timescale#2627 Add optional user mappings support * timescale#2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation * timescale#2631 Fix segfault in decompress_chunk for chunks with dropped columns

This release candidate contains bugfixes since the previous release candidate, as well as additional minor features including support for "user-mapping" authentication between access/data nodes and an experimental API for refreshing continuous aggregates on individual chunks. **Minor Features** * timescale#2627 Add optional user mappings support * timescale#2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation * timescale#2631 Fix segfault in decompress_chunk for chunks with dropped columns

This release candidate contains bugfixes since the previous release candidate, as well as additional minor features including support for "user-mapping" authentication between access/data nodes and an experimental API for refreshing continuous aggregates on individual chunks. **Minor Features** * #2627 Add optional user mappings support * #2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * #2593 Set explicitly all lock parameters in alter_job * #2604 Fix chunk creation on hypertables with foreign key constraints * #2612 Optimize internal cagg_watermark function * #2613 Refresh correct partial during refresh on drop * #2617 Fix validation of available extensions on data node * #2619 Fix segfault in decompress_chunk for chunks with dropped columns * #2620 Fix DROP CASCADE for continuous aggregate * #2625 Fix subquery errors when using AsyncAppend * #2626 Fix incorrect total_table_pages setting for compressed scan * #2628 Stop recursion in cache invalidation

This release candidate contains bugfixes since the previous release candidate, as well as additional minor features including support for "user-mapping" authentication between access/data nodes and an experimental API for refreshing continuous aggregates on individual chunks. **Minor Features** * timescale#2627 Add optional user mappings support * timescale#2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2610 Support analyze of internal compression table * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation

This release candidate contains bugfixes since the previous release candidate, as well as additional minor features including support for "user-mapping" authentication between access/data nodes and an experimental API for refreshing continuous aggregates on individual chunks. **Minor Features** * timescale#2627 Add optional user mappings support * timescale#2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * timescale#2560 Fix SCHEMA DROP CASCADE with continuous aggregates * timescale#2593 Set explicitly all lock parameters in alter_job * timescale#2604 Fix chunk creation on hypertables with foreign key constraints * timescale#2610 Support analyze of internal compression table * timescale#2612 Optimize internal cagg_watermark function * timescale#2613 Refresh correct partial during refresh on drop * timescale#2617 Fix validation of available extensions on data node * timescale#2619 Fix segfault in decompress_chunk for chunks with dropped columns * timescale#2620 Fix DROP CASCADE for continuous aggregate * timescale#2625 Fix subquery errors when using AsyncAppend * timescale#2626 Fix incorrect total_table_pages setting for compressed scan * timescale#2628 Stop recursion in cache invalidation

This release candidate contains bugfixes since the previous release candidate, as well as additional minor features including support for "user-mapping" authentication between access/data nodes and an experimental API for refreshing continuous aggregates on individual chunks. **Minor Features** * #2627 Add optional user mappings support * #2635 Add API to refresh continuous aggregate on chunk **Bugfixes** * #2560 Fix SCHEMA DROP CASCADE with continuous aggregates * #2593 Set explicitly all lock parameters in alter_job * #2604 Fix chunk creation on hypertables with foreign key constraints * #2610 Support analyze of internal compression table * #2612 Optimize internal cagg_watermark function * #2613 Refresh correct partial during refresh on drop * #2617 Fix validation of available extensions on data node * #2619 Fix segfault in decompress_chunk for chunks with dropped columns * #2620 Fix DROP CASCADE for continuous aggregate * #2625 Fix subquery errors when using AsyncAppend * #2626 Fix incorrect total_table_pages setting for compressed scan * #2628 Stop recursion in cache invalidation

k-rus requested a review from a team as a code owner November 3, 2020 08:38

k-rus requested review from mkindahl, svenklemm, gayyappan, erimatnor, pmwkaa, WireBaron and cevian and removed request for a team November 3, 2020 08:38

k-rus added this to the 2.0.0 milestone Nov 3, 2020

erimatnor reviewed Nov 3, 2020

View reviewed changes

tsl/src/continuous_aggs/materialize.c Show resolved Hide resolved

erimatnor reviewed Nov 3, 2020

View reviewed changes

k-rus force-pushed the cagg-dropchunks branch from 72c7f68 to 8678119 Compare November 3, 2020 14:04

k-rus requested a review from erimatnor November 3, 2020 14:04

k-rus self-assigned this Nov 3, 2020

k-rus added the continuous_aggregate label Nov 3, 2020

erimatnor reviewed Nov 3, 2020

View reviewed changes

k-rus force-pushed the cagg-dropchunks branch from 8678119 to 3e7fa45 Compare November 3, 2020 15:23

k-rus requested a review from erimatnor November 3, 2020 15:27

erimatnor approved these changes Nov 3, 2020

View reviewed changes

k-rus mentioned this pull request Nov 4, 2020

Add test to verify refresh on drop in caggs #2606

Closed

gayyappan approved these changes Nov 5, 2020

View reviewed changes

k-rus force-pushed the cagg-dropchunks branch from 3e7fa45 to 82307c9 Compare November 6, 2020 12:15

k-rus force-pushed the cagg-dropchunks branch from 82307c9 to a2aebad Compare November 6, 2020 12:21

k-rus merged commit 307ac6a into timescale:master Nov 6, 2020

k-rus deleted the cagg-dropchunks branch November 6, 2020 13:12

erimatnor mentioned this pull request Nov 11, 2020

Release 2.0.0-rc3 #2637

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refresh correct partial during refresh on drop #2613

Refresh correct partial during refresh on drop #2613

k-rus commented Nov 3, 2020

codecov bot commented Nov 3, 2020 •

edited

Loading

erimatnor Nov 3, 2020

k-rus Nov 3, 2020

erimatnor Nov 3, 2020

erimatnor Nov 3, 2020

k-rus Nov 3, 2020

erimatnor Nov 3, 2020

k-rus Nov 3, 2020

erimatnor Nov 3, 2020

erimatnor Nov 3, 2020

k-rus Nov 3, 2020

gayyappan left a comment

gayyappan Nov 4, 2020

k-rus Nov 6, 2020

Refresh correct partial during refresh on drop #2613

Refresh correct partial during refresh on drop #2613

Conversation

k-rus commented Nov 3, 2020

codecov bot commented Nov 3, 2020 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gayyappan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Nov 3, 2020 •

edited

Loading