-
Notifications
You must be signed in to change notification settings - Fork 486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
2122 querying root dataverse contents (and other permission performance boosts) #4883
Conversation
List<Integer> roles = em.createNativeQuery(powerfull_roles).getResultList(); | ||
String x = "select id from dataverserole where (permissionbits&12)!=0"; | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: delete this method
…b.com/IQSS/dataverse into 2122-querying-root-dataverse-contents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structure of the caching makes sense to me and seems reasonable
import java.util.concurrent.ConcurrentHashMap; | ||
import java.util.logging.Logger; | ||
|
||
public class TimeoutCache<K, V> implements Map<K, V> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a bit more explanation as to the use of this cache in a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured that could wait until we made a final decision to use it.
if (d instanceof DataFile && p.contains(Permission.DownloadFile)) { | ||
// unrestricted files that are part of a release dataset | ||
// automatically get download permission for everybody: | ||
// -- L.A. 4.0 beta12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stale comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't think they're stale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just take out the L.A. 4.0 tag or rewrite it... makes it seem like the code is ancient yet its new.
final Set<Permission> groupPremissions = permissionsForSingleRoleAssignee(g,dvo); | ||
permissions.addAll(groupPremissions); | ||
for (Group g : groupService.groupsFor(req, dvo)) { | ||
permissionsForSingleRoleAssignee(g, dvo, permissions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it kind of confusing that in some versions of "permissionsForSingleRoleAssignee" you have to assign the return value to the permissions while in other cases it is assigned to the passed in object. Maybe its better for them to all be uniform? Or maybe change the naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I just realized I introduced 2 different permissionForSingleRoleAssignee
that pass sets with completely different meanings. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the things that Matthew touched on, I think it looks good. I'm not going to give it approval because I can only follow the java.
|
||
|
||
private static final Set<Permission> PERMISSIONS_FOR_AUTHENTICATED_USERS_ONLY | ||
= EnumSet.copyOf(Arrays.asList(Permission.values()).stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this was done on purpose/is good syntax.
…now have their groupProvider set up properly
User user = req.getUser(); | ||
|
||
// quick cases | ||
if (user.isSuperuser()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should a required isempty also go here?
…ent, or even no statements at all, when the closure seed is empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment in the issue.
The code looks good to me. As much as I can tell from just looking at the code. Considering how many changes in the logic there are in this PR, we probably need to come up with some comprehensive test for it. I don't think it's possible to confirm that it's all still doing what we want otherwise.
Thanks @pameyer, for the detailed report.
|
@oscardssmith
I get back its contents correctly. However, if I try it as the user who created the dataverse and the dataset (and this is what's happening in the test that is failing), I get nothing:
(same thing when I try it without any api key). I can't immediately tell what's going on. I was hoping it was some kind of an obvious conflict with something that's been introduced into the branch since you worked on it, as it was synced up with develop; but I'm not seeing anything obvious. Do you have any pointers? I'm sure I can figure it out. But still hoping it may be something simple you could point out immediately... Hope you're well!
|
@landreev just checking that you know about pull request #4998 which includes commits from @oscardssmith (which are also in this pull request #4883) followed by commits by @michbarsinai |
Oh, yes, what is the relationship between this branch and #4998, does it have everything this branch has? Should we abandon this branch? |
What I would recommend is abandoning both this and michael's PR's. Then cherry pick the docs changes and the change to how group transitive closure is computed (including removing the double recursion), and leaving the rest. Those changes account for almost all of the functional improvements. |
The rewritten method for child permissions might be nice too as that gives a notable speedup for that 1 case. |
@oscardssmith I'd be ok with cherry-picking the changes that we need from the branches. But do you have any ideas/suggestions for the fact that the code in the branch just doesn't seem to be working? I tried building @michbarsinai's branch (#4998), but it appears to be even more broken. There I can't even initiate a new database - trying to create the top-level, root dataverse with /api/dataverses in setup-all fails with an internal server error. The stack trace suggests that it's also permission service related. |
The branches are identical up to commit 05b704b (August 10). I #4998 still has those breaking @NotNull annotations, which may cause the "more broken state".
…-- Michael
On 18 Oct 2018, at 1:26, landreev ***@***.***> wrote:
@oscardssmith <https://github.com/oscardssmith> I'd be ok with cherry-picking the changes that we need from the branches. But do you have any ideas/suggestions for the fact that the code in the branch just doesn't seem to be working?
It's not just the issue of cleaning up/getting rid of the code we don't need. There's something wrong with the core code that we actually want to keep, in its current state.
I tried building @michbarsinai <https://github.com/michbarsinai>'s branch (#4998 <#4998>), but it appears to be even more broken. There I can't even initiate a new database - trying to create the top-level, root dataverse with /api/dataverses in setup-all fails with an internal server error. The stack trace suggests that it's also permission service related.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#4883 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AB2UJFFaPjMI8L0wpsMrwbmSH1S8V7vAks5ul7y-gaJpZM4VbiVN>.
|
@michbarsinai For the record, no this appears to have nothing to do with those NotNull annotations. Adding that fix to the "cleanup" branch doesn't change the behavior in question - you still can't create the initial, root dataverse w/ this build. I'm assuming it's because somewhere in the code something expects the root dataverse to always be present in order to perform any permission checks? - I'm back to debugging the original, "uncleaned-up" branch; but will take another look at this too.
|
Fixed the conflict: src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java
Thanks for getting this across the finish line @landreev. Do we have a final measurement for getting children of root of harvard dataverse with this? Also has anyone checked the effect on re-indexing? |
Now that this pull request has been merged, we can't run the API tests on the phoenix server: #5393 |
Looks like this is failing, using the builtin secret key. Maybe that was overlooked? |
@oscardssmith But, answering your question, yes, I ran some performance tests; I made a script for recursive crawling, and ran it on the entire prod. dataverse, as different users, and on individual dataverses. I'll post the numbers; they seem sensible, but most importantly it is something you can do now - where it was just not practical at all to crawl the entire root dataverse holdings. (I'll post the number in the issue, not here) I haven't tested the effects on reindexing; I probably should, and will. But I honestly don't expect that much of an effect there. Full reindexing is always done as the superuser; and there is no recursive crawling involved (instead of going through dataverses recursively, it just gets a linear list of all the datasets from the database, and goes through them). I also believe we have some evidence that shows that most of the time there is actually spent doing the indexing... But as I said, I'll test it, to know for sure. |
@oscardssmith (I'm being reminded that we are actually indexing the permissions - so it's not just the cost of getting to the object... I still believe most of the time spent there is that of indexing the metadata. But as I said, I'll test and report. |
This speeds up querying root dataverse contents by adding fast paths for non users and superusers who now don't get permissions checked, as well as speedifying
permissionServiceBean
with shortcut has methods that are more efficient than the methods to find all permissions for a user. I also removed redundant recursive calls togroupsFor
which speeds things up.These changes make the query have acceptable performance for non-users and superusers, and speeds up the time of a regular user from ????????? to 26 minutes (on my laptop). This level of speedup is probably also achieved for most other calls to
permissionServiceBean
, which may significantly speed up other unrelated pieces of the code.The main thing this PR still needs is a set of integration tests for permissions involving groups and nested groups with inherited permissions to make sure there are no functional regressions.
In addition caching of recent
groupsFor
calls should be able to speed up the remaining case of users who are not superusers to a reasonable level.connects to Optimize Permissions Lookup #2122:querying root dataverse contents
Integration tests: