Added functions for join-filtering. #202
Current coverage is 75.26% (diff: 100%)
@@             master    #202   diff @@
==========================================
  Files          61      61
  Lines        2160    2163     +3
  Methods      1998    1977    -21
  Messages        0       0
  Branches      162     186    +24
==========================================
+ Hits         1625    1628     +3
  Misses        535     535
  Partials        0       0
Would not use
it should "support intersectByKey() with duplicate keys" in {
Does it make sense to also test with an empty left-hand and right-hand side?
+1, pls squash commits.
@nevillelyh @rav RFC please. I often see code to filter an SCollection[(K, V)] by another one of type SCollection[K], where (k, v) pairs are filtered out if k is not present on the right hand side. It usually looks like this:
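The original snippet is not preserved in this copy of the thread. As a rough sketch of the kind of pattern being described, here it is modeled on plain Scala collections rather than Scio, so that it runs without any dependencies. `JoinFilterPattern` and its `join` helper are hypothetical stand-ins, not the library's API:

```scala
// Hypothetical stand-in: Seq plays the role of SCollection so the sketch runs without Scio.
object JoinFilterPattern {
  // Minimal inner join on keys, mimicking the shape a pairwise join returns.
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (k, v) <- left
      (k2, w) <- right
      if k == k2
    } yield (k, (v, w))

  def main(args: Array[String]): Unit = {
    val pairs = Seq(("a", 1), ("b", 2), ("c", 3))
    val keys = Seq("a", "c")

    // The pattern: attach a dummy value to every key, join, then strip the dummy again.
    val filtered = join(pairs, keys.map(k => (k, ())))
      .map { case (k, (v, _)) => (k, v) }

    println(filtered)
  }
}
```

The filtering intent is buried in the dummy `()` value and the final `map`. Note also that in this sketch a key repeated on the right-hand side repeats the matching pairs in the output.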
I find this pattern doesn't really speak to the high level intent of the operation, and personally it always takes me a bit of staring to understand what's happening when I see it. So I thought it might be useful to include in the core API. The equivalent of the above would look like this:
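The proposed call is likewise missing from this copy. A minimal sketch of what a key-filtering method could look like, again on plain `Seq` instead of `SCollection`, is below. The method name `intersectByKey` is taken from the test quoted in the review above; `KeyedSeqOps` is hypothetical, and the `Set` lookup is a simplification for local collections, not the join-based implementation this PR builds on:

```scala
object IntersectByKeySketch {
  // Hypothetical enrichment; Seq stands in for SCollection.
  implicit class KeyedSeqOps[K, V](private val self: Seq[(K, V)]) {
    // Keep only the pairs whose key occurs in `that`.
    // Duplicate keys in `that` do not multiply the output.
    def intersectByKey(that: Seq[K]): Seq[(K, V)] = {
      val keep = that.toSet
      self.filter { case (k, _) => keep.contains(k) }
    }
  }

  def main(args: Array[String]): Unit = {
    val pairs = Seq(("a", 1), ("b", 2), ("a", 3))

    // Duplicate keys on either side: both "a" pairs survive exactly once.
    println(pairs.intersectByKey(Seq("a", "a")))

    // Empty right-hand side: nothing survives.
    println(pairs.intersectByKey(Seq.empty))
  }
}
```

With a dedicated method the call site states the intent directly, and the dummy-value plumbing stays inside the library.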
Do you consider this useful at all? Is there a more obvious way of doing this that I'm not thinking about? If you do think it's useful, any suggestions on naming?
filterJoin is the first thing that came to mind, since I'm implementing this on top of join, but there's probably a better name. Similar functionality is achieved in a plain SCollection[K] via intersect, which internally converts both collections to a dummy SCollection[(K, V)], and inner join is the equivalent for SCollection[(K, V)], SCollection[(K, W)]. Maybe keyIntersect is a better name.