Triage github tickets #2612

Open
mpenkov opened this issue Sep 29, 2019 · 8 comments

@mpenkov
Collaborator

mpenkov commented Sep 29, 2019

We have a ton of issues on github (over 220) and for me personally, it feels a bit overwhelming. What do you think?

Realistically, that's more than we can ever hope to resolve given our current velocity. I think it's worth performing a bug triage: going through the issues and identifying:

  • Priority
  • Severity
  • Approximate time scope (now / weeks / months / years / never)
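
As a rough illustration of what recording those three fields per ticket could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example, not an agreed scheme: the `Level` and `Scope` enums, the `TriagedIssue` class, the `triage_order` helper, the three-step low/medium/high scale, and the placeholder issue numbers are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale, used for both priority and severity."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Scope(IntEnum):
    """Approximate time scope, ordered from nearest to furthest."""
    NOW = 1
    WEEKS = 2
    MONTHS = 3
    YEARS = 4
    NEVER = 5


@dataclass(frozen=True)
class TriagedIssue:
    number: int       # placeholder issue number, not a real ticket
    priority: Level   # how soon we want it handled
    severity: Level   # how badly it hurts affected users
    scope: Scope      # rough estimate of when it could be done


def triage_order(issues):
    """Sort by highest priority, then highest severity, then nearest scope."""
    return sorted(issues, key=lambda i: (-i.priority, -i.severity, i.scope))


issues = [
    TriagedIssue(1, Level.LOW, Level.HIGH, Scope.MONTHS),
    TriagedIssue(2, Level.HIGH, Level.LOW, Scope.NOW),
    TriagedIssue(3, Level.HIGH, Level.HIGH, Scope.WEEKS),
]
ordered_numbers = [issue.number for issue in triage_order(issues)]
```

Keeping priority and severity as separate fields means a severe but low-priority ticket (placeholder issue 1 here) still sorts after a high-priority one, which is the kind of distinction the triage would need to make explicit.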

Here are some places to start (issue labels may overlap):

You can see the full list of our labels here.

@piskvorky @gojomo What's the best medium to do this? Perhaps a phone call? It does not have to happen now or even in the immediate future, but it should happen sometime.

@piskvorky
Owner

piskvorky commented Sep 29, 2019

For me, fixing bugs, improving docs and streamlining the existing workflows is the priority. "Wishlist" or new feature tickets we can (should) ignore: We don't have the time capacity to add major functionality ourselves, and I have little trust in outside contributions.

It's my impression many of the tickets will be trivial (user mistake), and/or obsolete. Identifying those tickets that are both non-trivial and urgent seems worthwhile. I suspect there won't be that many – yes, let's clear up which is which (for ourselves) in a call.

Then we can come up with a ticket-labeling scheme to better sort & organize the tickets.

@mpenkov
Collaborator Author

mpenkov commented Sep 29, 2019

> For me, fixing bugs, improving docs and streamlining the existing workflows is the priority.

👍

> "Wishlist" or new feature tickets we can (should) ignore: We don't have the time capacity to add major functionality ourselves, and I have little trust in outside contributions.

OK, so do we close all of the wishlist tickets, then?

> It's my impression many of the tickets will be trivial (user mistake), and/or obsolete. Identifying those tickets that are both non-trivial and urgent seems worthwhile. I suspect there won't be that many – yes, let's clear up which is which (for ourselves) in a call.
>
> Then we can come up with a ticket-labeling scheme to better sort & organize the tickets.

👍

@piskvorky
Owner

piskvorky commented Sep 29, 2019

> OK, so do we close all of the wishlist tickets, then?

They don't bother me, I'd keep them open. But the more outlandish / impractical ones we can certainly close. Or would you close them all? We can go over this in our call too.

@mpenkov
Collaborator Author

mpenkov commented Sep 29, 2019

Let's go through them too, perhaps after the bugs are triaged.

@gojomo
Collaborator

gojomo commented Sep 30, 2019

For a project like gensim, I'd prefer to keep marginal/speculative/longshot issues open, but use other labeling to help keep them from distracting people doing prioritized work.

Why? Closed issues can be harder to find, and "closing" can imply disinterest/rejection – when in fact many such issues are just awaiting the arrival of the right, interested volunteers to research/complete them. (Or, awaiting the addition of some additional report/insight that eventually helps connect them to a larger opportunity.) Keeping them open, but well-labeled as low-priority, lets them store & accumulate info for the future.

(This calculation changes for projects with more fixed budgets/timelines, and a smaller set of customer/manager/coworker collaborators. There, quick definitive prioritize-or-close processes can be important, and also reporters will know better how to find/escalate/revive closed issues. But here, even many fringe bugs/ideas could be promising, if they eventually attract a skills-matched, motivated contributor.)

@piskvorky
Owner

piskvorky commented Oct 8, 2019

We went over the first page of bug tickets with Misha today. Recording my impressions here:

  • Bugs are really bugs, need action.
    • Very few nonsense tickets.
    • Some minor ticket hijacking.
  • Largest classes of bugs:
    • API design: the clusterfuck of the #1777 redesign (Re-design "*2vec" implementations). We didn't manage to revert that PR in time, and now there's a steady stream of broken API contracts, things that shouldn't work but do, things that should work but don't, unclear responsibilities.
    • Scaling of LdaModel and LdaMulticore: multiple issues with numerical stability (esp. in combination with large corpora); training getting stuck; training resulting in zero vectors.
    • I/O issues with fastText: loading from native fastText, RAM issues.
    • Wrapper issues: sklearn, mallet, pandas, keras.
  • The rest are more case-by-case errors, with no common pattern.

The most severe errors to me are the first kind (API design). Rather than being a single fixable bug, they're compromising the core of Gensim's mission: topic modeling for humans. They're also the most embarrassing bugs, because they show a lack of engineering skill – a very bad sign for any library.

@mpenkov
Collaborator Author

mpenkov commented Oct 9, 2019

@piskvorky I couldn't articulate the difference between severity and priority during our call, but this article does a decent job: http://tryqa.com/what-is-the-difference-between-severity-and-priority/

Do you think we need to keep both severity and priority labels?

@piskvorky
Owner

piskvorky commented Oct 9, 2019

Thanks, that looks good to me. As long as we're clear about the difference between the various labels: the clearer we can articulate the purpose of each label ("would this ticket fit under this label?"), the better.

Can you make the label names and descriptions more explicit? Otherwise we'll forget again soon :)
