-
Notifications
You must be signed in to change notification settings - Fork 879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check configuration in alter_job and add_job #2689
Conversation
b8bc325
to
0c290a2
Compare
0d7bf7f
to
d6d0403
Compare
9b581a4
to
c09bb38
Compare
Codecov Report
@@ Coverage Diff @@
## master #2689 +/- ##
==========================================
+ Coverage 90.03% 90.19% +0.16%
==========================================
Files 212 212
Lines 34519 34508 -11
==========================================
+ Hits 31078 31125 +47
+ Misses 3441 3383 -58
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a definite step in the right direction.
I would suggest two things, however:
- Since this is already a refactor, I suggest defining PolicyConfig objects for each policy instead of using "ad hoc" parameters when parsing. This makes it much cleaner to parse jsonb->PolicyConfig and opens up for more structured parsing/validation as suggested by example inline. I'll let you decide.
- I would also consider having a generic parse/validate function for custom jobs, which can at least check that an "alter" of the policy is compatible with the existing config (not changing types of existing key-values, etc.). This can be left for follow up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely convinced that we need to create dynamic objects (that is, tagged structures) for it, but maybe it can simplify some things. Will test a few things.
530dab1
to
d05ed7e
Compare
I created structures for each of the policy with the parameters so that they can easily be extracted. However, creating a generic structure like the one you suggest is useful when you work with the structures dynamically, which is done for instance with the nodes in PostgreSQL. However, in this case, we really only parse the configuration parameters from the JSON and this is done inside code specific to each procedure, hence there is no need to look up any form of tag. In other words, we have little need for the dynamic nature that it would result in, at least not for the time being, and would needlessly make the code more complicated. I think this is better done when we really that for some feature. |
2b9410f
to
30dfec0
Compare
|
||
-- First test that add_job checks the config. These should all generate errors. | ||
\set ON_ERROR_STOP 0 | ||
SELECT add_job( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use add_job
here instead of the job-specific wrappers? Should we test both? At least having an explanation for this would make sense.
I am also wondering if we should actually raise an error for this case and point to the policy-specific APIs when people are adding policies using our "official" policy functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to test the job-specific wrappers since they take correctly typed arguments and construct the config. The function add_job
accepts a configuration and we need to do type-checking here so making sure that it works.
It might be reasonable to not allow creating non-custom jobs using this API, but this is outside the scope of this PR and warrant a separate discussion, so we should put this in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a more elaborate description of why we test this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I agree. This seems like a strange way to test validation in add_job
simply because custom jobs (which this function is typically used for) currently doesn't do any validation of the config.
So, why test something that is not the way we'd like people to use this? It seems odd, in particular since we currently have no validation of the config for custom jobs, where it really would matter for add_job
.
I think your comment about job-specific wrappers validating the config in other ways (with typed arguments) says it all: we now have two different ways of adding the same jobs where one way provides must stronger validation through typed arguments. That concerns me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Erik here, unless we check those cases elsewhere we should also test job wrappers to see that they act in the same or valid manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The job wrappers are already tested elsewhere: they build the config so there is nothing to check. The only purpose of this test is to check that users cannot inject an invalid config using this interface. I think that any changes to the current API requires a separate discussion, so it is out of scope for this PR.
f37ea87
to
8243056
Compare
scripts/clang_format_wrapper.sh
Outdated
@@ -67,7 +67,7 @@ echo "formatting" | |||
|
|||
cd ${TEMP_DIR} | |||
|
|||
clang-format ${CLANG_FORMAT_FLAGS} ${FILE_NAMES} | |||
clang-format-8 ${CLANG_FORMAT_FLAGS} ${FILE_NAMES} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change. This might break other people's setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current system for picking clang-format is broken, and I had local changes to handle that, so this was accidentally added.
/* if job does its own transaction handling it might not have set a snapshot */ | ||
if (ActiveSnapshotSet()) | ||
PopActiveSnapshot(); | ||
if (pushed_snapshot && ActiveSnapshotSet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a comment here about why we need both checks, i.e., that we cannot trust the function or procedure that we execute to leave us with an active snapshot, even if we pushed one, since it can be a multi-transaction procedure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't pushed_snapshot
check shoud be enough here? Maybe assert in such case, to make sure snapshot is still active
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The action above can actually start a new transaction with no active snapshots at all (the cagg refresh does this), so we have to check if there is an active snapshot even if we pushed one above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, but there are two things to check:
- Remove change that sets new clang-format
- Potentially add comment explaining condition for popping snapshot
8243056
to
a7030a8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like a good change. I am wondering if we could simplify job interfaces here even further in the future. For example, having unified functions for validating/parsing and executing jobs declared from job definition as function pointers, etc.
tsl/src/bgw_policy/job.c
Outdated
PopActiveSnapshot(); | ||
/* Both checks are needed: if the executed procedure commit the | ||
* transaction---which `continuous_agg_refresh_internal` does, for | ||
* example---it will remove the active snapshot and and start a new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
/* if job does its own transaction handling it might not have set a snapshot */ | ||
if (ActiveSnapshotSet()) | ||
PopActiveSnapshot(); | ||
if (pushed_snapshot && ActiveSnapshotSet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't pushed_snapshot
check shoud be enough here? Maybe assert in such case, to make sure snapshot is still active
|
||
-- First test that add_job checks the config. These should all generate errors. | ||
\set ON_ERROR_STOP 0 | ||
SELECT add_job( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Erik here, unless we check those cases elsewhere we should also test job wrappers to see that they act in the same or valid manner.
If a bad value is given to `alter_job` or `add_job` for a configuration parameter, no error will be given but the job will fail to execute. This commit adds checks of the configuration parameters to the functions so that an error is given immediately when calling it. The commit factors out the extraction of parameters from the configuration from the execution functions into a separate functions and calls them from `alter_job` and `add_job` as well as when executing the job. Only non-custom job checks are done. The commit also moves a few functions that were only used in TSL code from the `src/` directory to the `tsl/src/` directory and also removes a redundant permission check and does a minor refactoring of the `job_execute` function so that an active snapshot is always created regardless of whether a transaction is open or not. The corresponding code in the individual policy functions are removed since they are not needed. Closes timescale#2607
a7030a8
to
c2bd78c
Compare
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It adds checks of configuration, adds support for gapfill on distributed tables, and improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It adds checks of configuration, adds support for gapfill on distributed tables, and improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It adds checks of configuration, adds support for gapfill on distributed tables, and improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * #2689 Check configuration in alter_job and add_job * #2696 Support gapfill on distributed hypertable * #2468 Show more information in get_git_commit * #2678 Include user actions into job stats view * #2664 Fix support for complex aggregate expression * #2672 Add hypertable to continuous aggregates view * #2662 Save compression metadata settings on access node * #2707 Introduce additional db for data node bootstrapping **Bugfixes** * #2688 Fix crash for concurrent drop and compress chunk * #2666 Fix timeout handling in async library * #2683 Fix crash in add_job when given NULL interval * #2698 Improve memory handling for remote COPY * #2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * timescale#2689 Check configuration in alter_job and add_job * timescale#2696 Support gapfill on distributed hypertable * timescale#2468 Show more information in get_git_commit * timescale#2678 Include user actions into job stats view * timescale#2664 Fix support for complex aggregate expression * timescale#2672 Add hypertable to continuous aggregates view * timescale#2662 Save compression metadata settings on access node * timescale#2707 Introduce additional db for data node bootstrapping **Bugfixes** * timescale#2688 Fix crash for concurrent drop and compress chunk * timescale#2666 Fix timeout handling in async library * timescale#2683 Fix crash in add_job when given NULL interval * timescale#2698 Improve memory handling for remote COPY * timescale#2555 Set metadata for chunks compressed before 2.0
This release candidate contains bugfixes since the previous release candidate, as well as additional minor features. It improves validation of configuration changes for background jobs, adds support for gapfill on distributed tables, contains improvements to the memory handling for large COPY, and contains improvements to compression for distributed hypertables. **Minor Features** * #2689 Check configuration in alter_job and add_job * #2696 Support gapfill on distributed hypertable * #2468 Show more information in get_git_commit * #2678 Include user actions into job stats view * #2664 Fix support for complex aggregate expression * #2672 Add hypertable to continuous aggregates view * #2662 Save compression metadata settings on access node * #2707 Introduce additional db for data node bootstrapping **Bugfixes** * #2688 Fix crash for concurrent drop and compress chunk * #2666 Fix timeout handling in async library * #2683 Fix crash in add_job when given NULL interval * #2698 Improve memory handling for remote COPY * #2555 Set metadata for chunks compressed before 2.0
If a bad value is given to
alter_job
oradd_job
for a configurationparameter, no error will be given but the job will fail to execute.
This commit adds checks of the configuration parameters to the
functions so that an error is given immediately when calling it. The
commit factors out the extraction of parameters from the configuration
from the execution functions into a separate functions and calls them
from
alter_job
andadd_job
as well as when executing the job. Onlynon-custom job checks are done.
The commit also moves a few functions that were only used in TSL code
from the
src/
directory to thetsl/src/
directory and also removesa redundant permission check.
Closes #2607