Skip to content

Ensure that groupKey objects are treated as their groupTarget for the purposes of joins and combines #6139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

robsyme
Copy link
Collaborator

@robsyme robsyme commented May 29, 2025

When using operators like join or combine, if one side of the join uses a plain value (e.g., a string) and the other side uses a GroupKey (wrapping the same string), the join may fail to match them—even though they represent the same logical key.

This is because, although GroupKey implements equals and hashCode to delegate to its target, there are subtle issues in Groovy/Java equality and hash semantics, especially when comparing a GroupKey to a plain value from the other side. This can lead to unpredictable or inconsistent join results, as described in #4104.

v = 'BATCH1'
gk = groupKey(v, 3)
println v == gk // false
println gk == v // true

This asymmetry can cause joins to miss matches, depending on which side emits the GroupKey. The ordering of items on left or right can depend on the order of items in channels, leading to difficult-to-debug timing issues. This issue affects the join and combine operators which will do these sorts of equality checks.

Both JoinOp and CombineOp use the makeKey utility (which creates a KeyPair) to extract keys from tuples for matching. By ensuring that all keys are "unwrapped" from GroupKey to their target value, we guarantee that:
A key 'BATCH1' and a key groupKey('BATCH1') are treated as the same for the purposes of joining or combining.
This removes the asymmetry and inconsistency in key matching, regardless of which side emits a GroupKey or a plain value.

Copy link

netlify bot commented May 29, 2025

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 145e24d
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/6846d745be715c000993862c

Comment on lines 47 to +57
static private safeStr(key) {
key instanceof GString ? key.toString() : key
if (key instanceof GString)
return key.toString()
if (key instanceof GroupKey)
return key.getGroupTarget()
return key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds very reasonable, it may be worth adding a unit test

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Will fix that today.

result.keys = [entry]
result.addKey(entry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly required, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the [entry] is added directly to the result.keys List, it doesn't pass through the safeStr method, and thus would miss the new instanceOf GroupKey check.

If we wanted to be strict about it, we could make the keys List private to make sure addKey is the only way to add a KeyPair, but I'm not sure it's worth the extra complexity. KeyPair is not used broadly in the codebase.

@robsyme
Copy link
Collaborator Author

robsyme commented May 30, 2025

Looking at this a bit more closely this morning, I see that this change does "unwrap" the groupKey if the channel passes through the join or combine.

The options are:

  1. Current state (before this PR): Don't unwrap automatically anywhere - the user is responsible for calling getGroupTarget(). Inside of join and combine, groupKey("foo") == "foo" but "foo" != groupKey("foo"), leading to timing-dependent bugs if joining/combining one wrapped channel with an unwrapped channel (see example in Channel emits differently depending on order of dependent task completion #4104).
  • wrapped.join(unwrapped) → empty if the unwrapped value arrives first
  • wrapped.join(unwrapped) → wrapped values if the wrapped value arrives first.
  • groupTuple(wrapped) → wrapped
  1. We remove the groupKey("foo") == "foo" equality. This would give us more consistent behaviour, but I'm not sure if we depend on this equality elsewhere in the codebase. A lot of users use dump and view to inspect the values, and because the wrappend and unwrapped values toString() exatly the same, users may be confused why their joins are not working.
  • wrapped.join(unwrapped) → empty
  • groupTuple(wrapped) → wrapped
  1. This PR: We solve the commutitivity problem by always returning unwrapping in combine and join. The groupKey lifetime extends until it hits join or combine.
  • wrapped.join(unwrapped) → unwrapped
  • groupTuple(wrapped) → wrapped
  1. We unwrap on groupTuple, effectively "consuming" the groupKey when used as a group.
  • wrapped.join(unwrapped) → empty if the unwrapped value arrives first
  • wrapped.join(unwrapped) → wrapped values if the wrapped value arrives first.
  • groupTuple(wrapped) → unwrapped
  1. We do both options 3 and 4. The groupKey lifetime extends until it hits join, combine, or groupTuple
  • wrapped.join(unwrapped) → unwrapped
  • groupTuple(wrapped) → unwrapped

I think option 3 (the PR) is the cleanest without introducing major breaking changes. The only "breaking" change this introduces is if a workflow is relying on a groupKey being passed through a join/combine with the matching unwrapped value. Such a workflow is already in a fragile timing-dependent state anyway, so I think it's ok draw attention to this by introducing this "breaking" change.

@pditommaso
Copy link
Member

I need AI to summarise this 😆

@pditommaso
Copy link
Member

In my view this could be managed, unwrapping temporary the key for the requirement comparison/indexing operation, but continuing to emit as wrapped GroupKey. This essential because this was designed to traverse a chain of operators

@robsyme
Copy link
Collaborator Author

robsyme commented May 30, 2025

If so, we may have to consider the pathological case where two groupKeys are joined, but the groupKey contains different size values 😬

@pditommaso
Copy link
Member

How that would happen in practice?

@bentsherman
Copy link
Member

Now I hope you see why I would rather deal with this in operators v2

@robsyme
Copy link
Collaborator Author

robsyme commented May 31, 2025

How that would happen in practice?

A fair concern. This is very unlikely, and I can't imagine anybody doing it intentionally. Maybe we can just issue a warning in this very rare circumstance.

Now I hope you see why I would rather deal with this in operators v2

Yes, Ben :D I would still like to find a neat solution for existing users. If you work on v2, I'll give some thought to an interim solution here. Nextflow is a path to reproducibility, so it would be great to help users avoid timing-dependent workflow bugs that break that reproducibility guarantee.

robsyme added 2 commits May 31, 2025 17:49
Signed-off-by: Rob Syme <rob.syme@gmail.com>
- Store original keys in KeyPair and use them for output emission in join/combine
- Always prefer GroupKey over plain keys when both are present for the same match
- Fixes timing-dependent bugs when joining or combining channels with mixed GroupKey/plain keys
- Updates buffer logic to store KeyPair objects, ensuring metadata like group size is retained
- Updates remainder and mismatch handling to work with new buffer structure
- Adds and passes comprehensive tests for GroupKey preservation and commutativity

Fixes #4104

Signed-off-by: Rob Syme <rob.syme@gmail.com>
@robsyme robsyme force-pushed the groupkey-join-fix branch from 5fe466e to d105842 Compare May 31, 2025 22:25
@robsyme
Copy link
Collaborator Author

robsyme commented May 31, 2025

Ok, I've fixed it to ensure that the groupKey is preserved if combined with the target. I'll accept that preserving the key comes at the cost of some additional complexity 😬

Signed-off-by: Rob Syme <rob.syme@gmail.com>
@robsyme robsyme force-pushed the groupkey-join-fix branch from 6384cc5 to fc58bec Compare May 31, 2025 22:33
@pditommaso pditommaso force-pushed the master branch 3 times, most recently from b4b321e to 069653d Compare June 4, 2025 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants