Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash due to txns in emit_log_hook_callback #2987

Merged
merged 1 commit into from
Mar 3, 2021

Conversation

nikkhils
Copy link
Contributor

@nikkhils nikkhils commented Feb 26, 2021

A weird crash was coming up while running the bgw_db_scheduler test
while using log_min_messages with DEBUG3 setting. So what's happening
here is that we register a RegisterXactCallback callback which gets
called at post-commit time. This remote_connection_xact_end callback
was issuing a debug3 level message. As part of its processing, the
emit_log_hook_callback function was trying to write this message
to a catalog table via a transaction. Obviously trying to start a
transaction at this post-commit stage will error out and leads to
weird behavior and crashes. In general it's not a good idea to have
emit_log_hook_callback functions start transactions because messages
can be emitted at any point of a transaction cycle, but we have to live
with the current code for now.

The fix is to reset the emit_log_hook in these xact_end callback
functions. Additionally the emit_log_hook_callback function is only
interested in elog/ereport messages with elevel LOG or above. So add
that check in the emit_log_hook_callback as well.

Fixes https://github.com/timescale/timescaledb-private/issues/899

@nikkhils nikkhils requested a review from a team as a code owner February 26, 2021 09:14
@nikkhils nikkhils requested review from pmwkaa, erimatnor and svenklemm and removed request for a team February 26, 2021 09:14
@codecov
Copy link

codecov bot commented Feb 26, 2021

Codecov Report

Merging #2987 (7186e82) into master (3723104) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2987      +/-   ##
==========================================
+ Coverage   90.28%   90.30%   +0.01%     
==========================================
  Files         213      213              
  Lines       34998    35004       +6     
==========================================
+ Hits        31599    31610      +11     
+ Misses       3399     3394       -5     
Impacted Files Coverage Δ
tsl/src/remote/connection.c 91.65% <100.00%> (+0.08%) ⬆️
src/loader/bgw_message_queue.c 87.74% <0.00%> (+3.22%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 38493e9...10e5799. Read the comment docs.

@nikkhils nikkhils force-pushed the bgw_db_scheduler2 branch 2 times, most recently from e1e5472 to 26f31ce Compare February 26, 2021 09:39
@nikkhils nikkhils changed the title Crash due to emit_log_hook_callback using a transaction of its own Fix crash due to txns in emit_log_hook_callback Feb 26, 2021
@pmwkaa
Copy link
Contributor

pmwkaa commented Feb 26, 2021

Is it possible to add a test case for it?

@nikkhils
Copy link
Contributor Author

Is it possible to add a test case for it?

This came up when I was running bgw_db_scheduler test with debug3 log_min_messages. So the test case is already there

Copy link
Contributor

@mkindahl mkindahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

test/src/bgw/log.c Show resolved Hide resolved
@svenklemm
Copy link
Member

please squash commits before merging

A weird crash was coming up while running the bgw_db_scheduler test
while using log_min_messages with DEBUG3 setting. So what's happening
here is that we register a RegisterXactCallback callback which gets
called at post-commit time. This remote_connection_xact_end callback
was issuing a debug3 level message. As part of its processing, the
emit_log_hook_callback function was trying to write this message
to a catalog table via a transaction. Obviously trying to start a
transaction at this post-commit stage will error out and leads to
weird behavior and crashes. In general it's not a good idea to have
emit_log_hook_callback functions start transactions because messages
can be emitted at any point of a transaction cycle, but we have to live
with the current code for now.

The fix is to reset the emit_log_hook in these xact_end callback
functions. Additionally the emit_log_hook_callback function is only
interested in elog/ereport messages with elevel LOG or above. So add
that check in the emit_log_hook_callback as well.
@nikkhils nikkhils merged commit 152bebe into timescale:master Mar 3, 2021
@nikkhils nikkhils deleted the bgw_db_scheduler2 branch March 3, 2021 14:15
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as trusted extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as trusted extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as trusted extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as trusted extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
@k-rus k-rus mentioned this pull request Mar 29, 2021
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as a trusted extension in PG13, so that
superuser privileges are not required anymore to install the extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
k-rus added a commit that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as a trusted extension in PG13, so that
superuser privileges are not required anymore to install the extension.

**Minor features**
* #2998 Mark timescaledb as trusted extension

**Bugfixes**
* #2948 Fix off by 4 error in histogram deserialize
* #2974 Fix index creation for hypertables with dropped columns
* #2990 Fix segfault in job_config_check for cagg
* #2987 Fix crash due to txns in emit_log_hook_callback
* #3042 Commit end transaction for CREATE INDEX
* #3053 Fix gapfill/hashagg planner interaction
* #3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
k-rus added a commit to k-rus/timescaledb that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as a trusted extension in PG13, so that
superuser privileges are not required anymore to install the extension.

**Minor features**
* timescale#2998 Mark timescaledb as trusted extension

**Bugfixes**
* timescale#2948 Fix off by 4 error in histogram deserialize
* timescale#2974 Fix index creation for hypertables with dropped columns
* timescale#2990 Fix segfault in job_config_check for cagg
* timescale#2987 Fix crash due to txns in emit_log_hook_callback
* timescale#3042 Commit end transaction for CREATE INDEX
* timescale#3053 Fix gapfill/hashagg planner interaction
* timescale#3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
@k-rus k-rus mentioned this pull request Mar 29, 2021
k-rus added a commit that referenced this pull request Mar 29, 2021
This maintenance release contains bugfixes since the 2.1.0 release. We
deem it high priority for upgrading.

The bug fixes in this release address issues with CREATE INDEX and
UPSERT for hypertables, custom jobs, and gapfill queries.

This release marks TimescaleDB as a trusted extension in PG13, so that
superuser privileges are not required anymore to install the extension.

**Minor features**
* #2998 Mark timescaledb as trusted extension

**Bugfixes**
* #2948 Fix off by 4 error in histogram deserialize
* #2974 Fix index creation for hypertables with dropped columns
* #2990 Fix segfault in job_config_check for cagg
* #2987 Fix crash due to txns in emit_log_hook_callback
* #3042 Commit end transaction for CREATE INDEX
* #3053 Fix gapfill/hashagg planner interaction
* #3059 Fix UPSERT on hypertables with columns with defaults

**Thanks**
* @eloyekunle and @kitwestneat for reporting an issue with UPSERT
* @jocrau for reporting an issue with index creation
* @kev009 for fixing a compilation issue
* @majozv and @pehlert for reporting an issue with time_bucket_gapfill
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants