Fix an issue with sync::Barrier leader selection with reuse #102

Merged
jamesbornholt merged 1 commit into awslabs:main on Apr 15, 2023

Conversation

lukenels (Contributor)

The Rust standard library documentation for sync::Barrier states the following about Barrier::wait:

Blocks the current thread until all threads have rendezvoused here.
Barriers are re-usable after all threads have rendezvoused once,
and can be used continuously. A single (arbitrary) thread will
receive a BarrierWaitResult that returns true from
BarrierWaitResult::is_leader() when returning from this function,
and all other threads will receive a result that will return false
from BarrierWaitResult::is_leader().

This documentation is a bit ambiguous. It says that "a single thread" becomes the leader after returning from wait, but it isn't clear if that means a single thread for the existence of the Barrier, or a single thread for each "batch" of threads released by the barrier. The latter interpretation seems to make the most sense given the context of being able to reuse barriers, and it seems more useful to be able to assume that a leader is set for each batch. That is in fact the behavior implemented by the standard library as evidenced by the implementation and this unit test, which reuses a barrier on a new thread:

let barrier = Barrier::new(1);
assert!(barrier.wait().is_leader());

let thd = thread::spawn(move || {
    assert!(barrier.wait().is_leader());
});

thd.join().unwrap();

Let's assume this behavior is correct, and that the ambiguity in the Rust documentation would be resolved in favor of the existing implementation. Then the Shuttle behavior for Barrier is incorrect and causes this test to fail, because there is only a single, global leader field on the barrier, which is set once, the first time the barrier is used.

This commit fixes the problem by adding the notion of an "epoch" to the Barrier implementation, which allows there to be different leaders for each batch / group of threads released by the barrier. (This is similar to the standard library implementation's notion of generation).
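For intuition, here is a minimal sketch of the epoch/generation idea built only on std's Mutex and Condvar. This is an illustration of the mechanism, not Shuttle's actual code; like the standard library's implementation it simply makes the last arriving thread the leader of its batch, whereas the change described here also aims to vary which thread in each batch becomes the leader.

```
use std::sync::{Condvar, Mutex};

// Sketch only (not Shuttle's implementation): a reusable barrier that tracks
// an `epoch` counter so each released batch gets its own leader.
struct EpochBarrier {
    state: Mutex<State>,
    cvar: Condvar,
    size: usize,
}

struct State {
    count: usize, // threads waiting in the current epoch
    epoch: usize, // bumped every time a full batch is released
}

impl EpochBarrier {
    fn new(size: usize) -> Self {
        Self {
            state: Mutex::new(State { count: 0, epoch: 0 }),
            cvar: Condvar::new(),
            size,
        }
    }

    /// Returns `true` for exactly one thread per batch (the leader).
    fn wait(&self) -> bool {
        let mut state = self.state.lock().unwrap();
        let my_epoch = state.epoch;
        state.count += 1;
        if state.count < self.size {
            // Not the last arrival: block until this epoch is released.
            while state.epoch == my_epoch {
                state = self.cvar.wait(state).unwrap();
            }
            false
        } else {
            // Last arrival: release the batch, start a new epoch, and lead it.
            state.count = 0;
            state.epoch = state.epoch.wrapping_add(1);
            self.cvar.notify_all();
            true
        }
    }
}
```

Under this scheme a size-1 barrier returns true from every wait call, which is exactly what the unit test above relies on.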

The commit also adds some new unit tests to capture the issue with the old behavior and prevent regressions. It also tweaks Barrier::wait to become an explicit thread yield point, in order to vary which thread becomes the leader rather than just choosing the first one that happens to call `wait`.
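As a rough sketch of what such a reuse regression test could look like under Shuttle (the test name and exact structure here are hypothetical, not necessarily the tests added by this commit; it assumes the crate-root check_dfs entry point used elsewhere in this thread):

```
use std::sync::Arc;

use shuttle::check_dfs;
use shuttle::sync::Barrier;
use shuttle::thread;

// Hypothetical sketch of a reuse regression test, mirroring the std example above.
#[test]
fn barrier_reuse_new_leader_per_batch() {
    check_dfs(
        || {
            // A size-1 barrier releases a batch of exactly one thread per `wait`
            // call, so every call should observe `is_leader() == true`, even
            // after the barrier has already released an earlier batch.
            let barrier = Arc::new(Barrier::new(1));
            assert!(barrier.wait().is_leader());

            let thd = {
                let barrier = Arc::clone(&barrier);
                thread::spawn(move || {
                    // Second batch: this thread should be its leader.
                    assert!(barrier.wait().is_leader());
                })
            };
            thd.join().unwrap();
        },
        None,
    );
}
```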


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jamesbornholt jamesbornholt (Member) left a comment


I ended up writing this test because I wasn't sure about the case where 1 < barrier_size < num_threads; I think it subsumes some of yours:

// Test that any combination of threads can be chosen as leader of an epoch.
fn barrier_test_nondeterministic_leader(batch_size: usize, num_threads: usize) {
    // Don't test cases that might deadlock; every thread needs to be released eventually
    assert!(num_threads % batch_size == 0);

    let seen_leaders = Arc::new(std::sync::Mutex::new(HashSet::new()));

    {
        let seen_leaders = Arc::clone(&seen_leaders);
        check_dfs(
            move || {
                let barrier = Arc::new(Barrier::new(batch_size));
                let leaders = Arc::new(std::sync::Mutex::new(vec![]));
                let handles = (0..num_threads).map(|i| {
                    let barrier = Arc::clone(&barrier);
                    let leaders = Arc::clone(&leaders);

                    thread::spawn(move || {
                        let result = barrier.wait();
                        if result.is_leader() {
                            leaders.lock().unwrap().push(i);
                        }
                    })
                }).collect::<Vec<_>>();

                for thd in handles {
                    thd.join().unwrap();
                }

                let leaders = Arc::try_unwrap(leaders).unwrap().into_inner().unwrap();
                seen_leaders.lock().unwrap().insert(leaders);
            },
            None,
        );
    }

    let seen_leaders = Arc::try_unwrap(seen_leaders).unwrap().into_inner().unwrap();
    let num_batches = num_threads / batch_size;
    let fact = |i: usize| (2..i+1).reduce(|acc, i| acc * i).unwrap_or(1);

    // Every execution should have one leader per batch
    assert!(seen_leaders.iter().all(|l| l.len() == num_batches));
    // There should be `num_threads permute num_batches` different leader vectors
    assert_eq!(seen_leaders.len(), fact(num_threads) / fact(num_threads - num_batches));
}

#[test]
fn barrier_test_nondeterministic_leader_1() {
    barrier_test_nondeterministic_leader(1, 4);
}

#[test]
fn barrier_test_nondeterministic_leader_2() {
    barrier_test_nondeterministic_leader(2, 4);
}

#[test]
fn barrier_test_nondeterministic_leader_4() {
    barrier_test_nondeterministic_leader(4, 4);
}
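(As a concrete instance of that last check: with batch_size = 2 and num_threads = 4, num_batches = 2, so the assertion expects 4!/2! = 12 distinct leader vectors.)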

Comment on lines 127 to 131
// Mark the task as requesting an explicit yield and schedule a thread switch, otherwise
// some schedulers would nearly always pick to continue executing the current task.
// Requesting a yield allows other threads to be the first one to wake up and grab the
// leader token.
ExecutionState::request_yield();
jamesbornholt (Member)


Unlike #101 I think we don't want to yield in this case. We needed it there for fairness because park gets used in spinloops, which are doomed by unfair schedules. But barriers aren't used like that—threads generally do useful work once unblocked by a barrier. (I do wish we had better rules for thinking about this, but it's all random in the end anyway, so it gets a bit handwave-y).

lukenels (Contributor, author)


My intention here was to prevent an unfair schedule from (almost) always making the last thread to reach the barrier the leader, by giving every other thread a chance to wake up first and become the leader instead. In most situations I can think of, though, it's probably already random enough which thread happens to call wait last, so the "extra" fairness here won't help much. I'll remove the request_yield here.

src/sync/barrier.rs (resolved; outdated)
src/sync/barrier.rs (resolved; outdated)
// Block current thread
assert!(state.waiters.insert(me)); // current thread shouldn't already be in the set
// The current thread shouldn't already be in `waiters`.
assert!(state.waiters.insert(me));
@jamesbornholt jamesbornholt (Member), Apr 11, 2023


I think I marginally preferred the existing code that didn't needlessly add the current thread to the waiters set, so that the invariant was that all waiters are blocked. It ends up having roughly the same number of special cases for me, just different ones. But we should still assert here that me isn't already a waiter.

lukenels (Contributor, author)


It ends up having roughly the same number of special cases for me, just different ones.

What special cases for me do you mean? Assuming I kill assert_eq!(t.blocked(), tid != me) (which is really just for debugging), then the only use of me in the function is in the line you commented on here, which I could just inline as:

assert!(state.waiters.insert(ExecutionState::me()));

[...] the invariant was that all waiters are blocked.

Arguably, that still is an invariant: the barrier's state is never released while this isn't true (and every thread in waiters is always one that called wait).

@@ -22,9 +22,19 @@ impl BarrierWaitResult {

jamesbornholt (Member)


Could you write a comment like the one we have for Condvar explaining how/why the implementation works?

// We implement `Condvar` by tracking the `CondvarWaitStatus` of each thread currently waiting on

It should probably be a doc comment on BarrierState? (the Condvar one should probably have been a doc comment on CondvarState).

lukenels (Contributor, author)


Added a doc comment to BarrierState in latest revision.

The Rust standard library documentation for `sync::Barrier`
(https://doc.rust-lang.org/std/sync/struct.Barrier.html) states the
following about `Barrier::wait`:

> Blocks the current thread until all threads have rendezvoused here.
> Barriers are re-usable after all threads have rendezvoused once,
> and can be used continuously.  A single (arbitrary) thread will
> receive a BarrierWaitResult that returns true from
> BarrierWaitResult::is_leader() when returning from this function,
> and all other threads will receive a result that will return false
> from BarrierWaitResult::is_leader().

This documentation is a bit ambiguous. It says that "a single thread"
becomes the leader after returning from `wait`, but it isn't clear
if that means a single thread for the existence of the `Barrier`,
or a single thread for each "batch" of threads released by the
barrier. The latter interpretation seems to make the most sense
given the context of being able to reuse barriers, and it seems
more useful to be able to assume that a leader is set for each
batch. That is in fact the behavior implemented by the standard
library as evidenced by the implementation
(https://doc.rust-lang.org/src/std/sync/barrier.rs.html#126-143)
and this unit test, which reuses a barrier on a new thread:

```
let barrier = Barrier::new(1);
assert!(barrier.wait().is_leader());

let thd = thread::spawn(move || {
    assert!(barrier.wait().is_leader());
});

thd.join().unwrap();
```

Let's assume this behavior is correct, and that ambiguity in the
Rust documentation would be resolved in favor of the existing
implementation. Then the Shuttle behavior for `Barrier` is incorrect
and causes this test to fail, because there is only a single, global
`leader` field on the barrier which is set once the first time it's
used.

This commit fixes the problem by adding the notion of an "epoch"
to the `Barrier` implementation, which allows there to be different
leaders for each batch / group of threads released by the barrier.
(This is similar to the standard library implementation's notion
of generation).

The commit also adds some new unit tests to capture the issue with
the old behavior and prevent regressions. It also tweaks `Barrier::wait`
to become an explicit thread yield point, in order to vary which
thread becomes the leader rather than just choosing the first one
that happens to call `wait`.

Co-authored-by: James Bornholt <bornholt@amazon.com>
@jamesbornholt jamesbornholt merged commit e939a38 into awslabs:main Apr 15, 2023
jorajeev pushed a commit that referenced this pull request Feb 29, 2024