Improvements for Collection Operations #2496

Open
2 of 8 tasks
jmchilton opened this issue Jun 13, 2016 · 3 comments

Comments

@jmchilton
Member

jmchilton commented Jun 13, 2016

  • The tool form gives no indication of whether an input is being batched (mapped over) or not when the input type is data_collection (see "No Visual Indication of Mapping over a Collection", #2467).
  • These tools readily produce nested collections such as list:list that the history panel does not render (see "Limited Collection Types Render in History", #2495).
  • Right now the collection outputs place duplicated HDAs (with non-duplicated Datasets) in the History, like other tools. This sucks and would no longer be needed once "Simplified User-Facing Dataset Collection Model" (#1810) is complete; work on #1810 should revise this behavior as part of a broader fix for the collection abstraction's place in the history.
  • Filter failed should be able to consume a list:paired and properly filter it.
  • A detailed review and revision of the naming used by the tools (especially collection parts) should be performed. Some of it doesn't make a ton of sense. While this may be hard to correct at the framework level, these tools provide a direct Python function to produce outputs, so their naming could be done better.
  • Filtering out failed datasets isn't what one wants to do in the case of a cluster failure. "Rerunning a failed dataset collection element should substitute the failed element" (#2235) should be completed, and the tool should be updated with a discussion of when to use which.
  • The other collection operations should be implemented, including those in "Collection Operations" (#1313; grouping and filtering collections) as well as other definitely useful operations such as exploding collections, merging collections, and re-labelling collections with "expressions".
  • Extend the filter failed operation with an option to filter empty datasets. Various datatypes may have empty data files that are not in fact empty files (e.g. they may have structural elements or metadata but still contain no data), so a general-purpose filtering tool that can consume metadata is also important (see above).
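
As a rough illustration of the two filtering items above (filtering a list:paired, and filtering "empty" datasets using metadata), here is a minimal sketch. It is not Galaxy's implementation: `Dataset`, `Collection`, `filter_elements`, and the `data_lines` metadata field are all hypothetical stand-ins, and it assumes that a pair should be dropped as a whole if either of its members fails the check.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Union


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset; not Galaxy's HDA model."""
    name: str
    state: str = "ok"    # e.g. "ok" or "error"
    size: int = 0        # file size in bytes
    data_lines: int = 0  # hypothetical metadata: number of data records


@dataclass
class Collection:
    """Hypothetical nested collection, e.g. a list or a list:paired."""
    name: str
    elements: List[Union["Collection", Dataset]] = field(default_factory=list)


def leaves(element: Union[Collection, Dataset]) -> Iterator[Dataset]:
    """Yield every dataset below an element, however deeply nested."""
    if isinstance(element, Dataset):
        yield element
    else:
        for child in element.elements:
            yield from leaves(child)


def filter_elements(coll: Collection, keep: Callable[[Dataset], bool]) -> Collection:
    """Keep a top-level element only if all datasets below it pass `keep`.

    Assumption: for list:paired, a pair is kept or dropped as a unit.
    """
    kept = [e for e in coll.elements if all(keep(ds) for ds in leaves(e))]
    return Collection(coll.name, kept)


def not_failed(ds: Dataset) -> bool:
    return ds.state == "ok"


def not_empty(ds: Dataset) -> bool:
    # A file can have bytes (a header, say) and still hold no records,
    # so check the metadata as well as the size.
    return ds.size > 0 and ds.data_lines > 0


pairs = Collection("samples", [
    Collection("s1", [Dataset("forward", "ok", 120, 10), Dataset("reverse", "ok", 118, 10)]),
    Collection("s2", [Dataset("forward", "error"), Dataset("reverse", "ok", 90, 8)]),
    Collection("s3", [Dataset("forward", "ok", 45, 0), Dataset("reverse", "ok", 45, 0)]),
])

ok_only = filter_elements(pairs, not_failed)
non_empty = filter_elements(ok_only, not_empty)
print([e.name for e in ok_only.elements])    # ['s1', 's3']
print([e.name for e in non_empty.elements])  # ['s1']
```

Passing the check in as a predicate is one way a single general-purpose filter could cover both the "failed" and "empty" cases; whether pairs should be dropped as a unit is a design choice the issue leaves open.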
@nknox

nknox commented Jul 6, 2016

I agree with the naming issue in collections. It would be great to be able to download individual files from a collection with file names that match the individual dataset names within the collection.

@jennaj
Member

jennaj commented Jan 18, 2018

A user request for the improvement idea "Extend the filter failed operation with option to filter empty datasets." above was made at Biostars. I wasn't sure if that was tracked in a distinct ticket yet (this ticket is older). Please help to link that in if one exists. Thanks!

What are others doing for a workaround? Or did I miss a new alternative method? It would be good to post it here if one exists.

Post: https://biostar.usegalaxy.org/p/26364

@Mataivic
Contributor

Mataivic commented Mar 1, 2018

@jennaj I'm the user :). I've looked a bit at the code today, hoping to be able to contribute to this improvement. I didn't find much, but I left what I found in a comment on an issue I had opened on GitHub (#5090, which @jmchilton had answered) at the same time I asked on Biostars.


4 participants