PostgreSQL version (output of postgres --version): 12.4
TimescaleDB version (output of \dx in psql): 1.7.3
Installation method: source
Describe the bug
A LEFT JOIN query against a hypertable returns incorrect results, apparently because TimescaleDB propagates a filter from the join clause too eagerly.
To Reproduce
create table t1_timescale (a int, b int);
create table t1_notimescale (a int, b int);
create table t2 (a int, b int);
select create_hypertable('t1_timescale', 'a', chunk_time_interval=>1000);
insert into t1_timescale select a, -1 from generate_series(1, 100) a, generate_series(1,1000) b;
insert into t1_notimescale select a, -1 from generate_series(1, 100) a, generate_series(1,1000) b;
insert into t2 select a, b from generate_series(1, 100) a, generate_series(1,1000) b;
analyze t1_timescale; analyze t1_notimescale; analyze t2;
select * from t1_timescale
left join t2 on t1_timescale.b=t2.b and t2.b between 10 and 20
where t1_timescale.a=5
;
select * from t1_notimescale
left join t2 on t1_notimescale.b=t2.b and t2.b between 10 and 20
where t1_notimescale.a=5
;
Expected behavior
Both SELECT queries should produce identical results, because the tables contain identical data. The only difference is that the first query reads from a hypertable and the second from a plain table.
Actual behavior
➤ Tue 12:00 postgres@:5432/
=# select * from t1_timescale
-# left join t2 on t1_timescale.b=t2.b and t2.b between 10 and 20
-# where t1_timescale.a=5
-# ;
┌───┬───┬───┬───┐
│ a │ b │ a │ b │
├───┼───┼───┼───┤
└───┴───┴───┴───┘
(0 rows)
Time: 3.410 ms
➤ Tue 12:01 postgres@:5432/
=# select * from t1_notimescale
-# left join t2 on t1_notimescale.b=t2.b and t2.b between 10 and 20
-# where t1_notimescale.a=5
-# ;
┌───┬────┬───┬───┐
│ a │ b │ a │ b │
├───┼────┼───┼───┤
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
│ 5 │ -1 │ ∅ │ ∅ │
... (etc... many more rows)
Looking at query plans:
➤ Tue 12:01 postgres@:5432/
=# explain select * from t1_timescale
left join t2 on t1_timescale.b=t2.b and t2.b between 10 and 20
where t1_timescale.a=5
;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop Left Join (cost=0.29..1991.94 rows=1 width=16) │
│ Join Filter: (_hyper_447_53536_chunk.b = t2.b) │
│ -> Index Scan using _hyper_447_53536_chunk_t1_timescale_a_idx on _hyper_447_53536_chunk (cost=0.29..34.05 rows=1 width=8) │
│ Index Cond: (a = 5) │
│ Filter: ((b >= 10) AND (b <= 20)) │
│ -> Seq Scan on t2 (cost=0.00..1943.00 rows=1191 width=8) │
│ Filter: ((b >= 10) AND (b <= 20)) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(7 rows)
=# explain select * from t1_notimescale
left join t2 on t1_notimescale.b=t2.b and t2.b between 10 and 20
where t1_notimescale.a=5
;
┌───────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────────────────────────────────┤
│ Merge Left Join (cost=3746.28..3763.05 rows=1180 width=16) │
│ Merge Cond: (t1_notimescale.b = t2.b) │
│ -> Sort (cost=1742.43..1744.91 rows=993 width=8) │
│ Sort Key: t1_notimescale.b │
│ -> Seq Scan on t1_notimescale (cost=0.00..1693.00 rows=993 width=8) │
│ Filter: (a = 5) │
│ -> Sort (cost=2003.85..2006.83 rows=1191 width=8) │
│ Sort Key: t2.b │
│ -> Seq Scan on t2 (cost=0.00..1943.00 rows=1191 width=8) │
│ Filter: ((b >= 10) AND (b <= 20)) │
└───────────────────────────────────────────────────────────────────────────────┘
(10 rows)
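The plans show that the join condition t1_timescale.b = t2.b causes the filter t2.b between 10 and 20 to be propagated to t1_timescale: the chunk index scan carries Filter: ((b >= 10) AND (b <= 20)), while the scan on t1_notimescale does not. The planner appears to assume that the join condition implies t1_timescale.b between 10 and 20. For a LEFT JOIN this does not hold, because rows of t1_timescale that have no match in t2 must still be returned, with NULLs in the t2 columns.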
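The asymmetry can be checked in plain PostgreSQL, without TimescaleDB. The following is a minimal sketch, assuming two small temporary tables, outer_side and inner_side (hypothetical names and data, not part of the report). The first query keeps the range filter in the join clause only; the second also applies it to the preserved side by hand, which is what the propagated filter in the hypertable plan appears to do.

```sql
-- Hypothetical stand-ins for t1_* (outer_side) and t2 (inner_side).
create temp table outer_side (a int, b int);
create temp table inner_side (a int, b int);
insert into outer_side values (5, -1), (5, 15);
insert into inner_side values (1, 15), (2, 30);

-- Filter only in the ON clause: every outer row with a = 5 is preserved,
-- and the unmatched row (b = -1) gets NULLs for the inner columns.
select o.a, o.b, i.a as inner_a, i.b as inner_b
from outer_side o
left join inner_side i on o.b = i.b and i.b between 10 and 20
where o.a = 5
order by o.b;

-- Same filter additionally applied to the outer table, mimicking the
-- propagated qual: the unmatched row (b = -1) is dropped from the result.
select o.a, o.b, i.a as inner_a, i.b as inner_b
from outer_side o
left join inner_side i on o.b = i.b and i.b between 10 and 20
where o.a = 5
  and o.b between 10 and 20
order by o.b;
```

With this data the first query returns two rows and the second returns one, so the two forms are not equivalent. Propagating the filter would only be safe for an inner join, where unmatched rows are discarded anyway.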
gayyappan added a commit to gayyappan/timescaledb that referenced this issue on Oct 15, 2020:
time_bucket_annotate_walker passes an incorrect status
for outer join to the function that checks quals eligibility
for propagation.
Fixes timescale#2500