Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect data interval alignment due to assumption on input time alignment #22658

Merged
merged 5 commits into from
Aug 16, 2022

Conversation

wanlce
Copy link
Contributor

@wanlce wanlce commented Mar 31, 2022

This introduces a new _align_to_prev util function for cron-based interval alignment. Previously we assumed _get_prev(_align_to_next(t)) is enough for this purpose, but this proved incorrect (see discussions in #21715), and an additional alignment logic is needed to avoid the inferred interval being off by one period when the input value has unexpected alignment (unexpectedly aligning for a manual run, and unexpected not aligning for an auto one).

See also #21715
Fix #23689

@wanlce wanlce requested a review from uranusjr as a code owner March 31, 2022 16:29
@wanlce
Copy link
Contributor Author

wanlce commented Mar 31, 2022

I am very sorry, because my wrong rebase caused the old request to be closed. 😭
I wasn't sure where these changes needed to be documented, thank you so much for your help!

@wanlce wanlce force-pushed the fix-timetables-interval-start branch from 8a59c00 to c16d11f Compare April 1, 2022 07:34
@wanlce wanlce requested a review from uranusjr April 1, 2022 07:35
airflow/timetables/interval.py Outdated Show resolved Hide resolved
@uranusjr uranusjr added this to the Airflow 2.3.0 milestone Apr 12, 2022
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Apr 12, 2022
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@ashb ashb modified the milestones: Airflow 2.3.0, Airflow 2.3.1 Apr 22, 2022
@eladkal eladkal modified the milestones: Airflow 2.3.1, Airflow 2.3.2 May 23, 2022
@wanlce wanlce closed this Jun 1, 2022
@wanlce wanlce reopened this Jun 1, 2022
wanlce and others added 2 commits June 6, 2022 12:39
This PR fixes the incorrect first run of data_interval_start after
changing the scheduling time.

* Added _align_to_prev function for scheduling alignment after time change.

* renamed _align to _align_to_next.
@uranusjr
Copy link
Member

uranusjr commented Aug 8, 2022

Anyone else wants to take a look at this one? It’s a relatively straightforward change, since we established that this additional method is needed.

@potiuk
Copy link
Member

potiuk commented Aug 8, 2022

I think what woudl be great here is to add a few more tests showing the different cases and maybe explaining the context in the test a bit better?

The case here is either not tested with the edge case or it's not clear that it is tested I think:

    def _align_to_prev(self, current: DateTime) -> DateTime:
        """Get the prev scheduled time.
        This is ``current - interval``, unless ``current`` falls right on the
        interval boundary, when ``current`` is returned.
        """
        prev_time = self._get_prev(current)
        if self._get_next(prev_time) != current:
            return prev_time
        return current

@uranusjr
Copy link
Member

uranusjr commented Aug 9, 2022

Actually this is also the cause of #23689. I’ve added a few test cases here to illustrate those issues, and modified the PR description to better explain the problems this solve.

@uranusjr uranusjr changed the title Fix incorrect data_interval_start due to scheduling time change Fix incorrect data interval alignment due to assumption on input time alignment Aug 9, 2022
@uranusjr uranusjr added the AIP-39 Timetables label Aug 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-39 Timetables full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data Interval wrong when manually triggering with a specific logical date
6 participants