Option to use custom bug fix pattern #16

Open: wants to merge 1 commit into master

9 changes: 7 additions & 2 deletions gitrisky/cli.py
@@ -14,7 +14,9 @@ def cli():


 @cli.command()
-def train():
+@click.option('-p', '--pattern', required=False,
+              help="Bug fix pattern. Ex. BUG,FIX", type=str)
Owner

click lets you specify an option multiple times by passing `multiple=True`. Let's do that so we can accept an arbitrary number of tags, which will be nice.

One thing to watch out for though: using `multiple=True` makes the option always return a tuple even if no arguments were passed, so we will need to change our check in `get_bugfix_commits` from `if pattern is None` to `if len(pattern) == 0`.
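
A minimal sketch of the `multiple=True` behaviour described in this comment, assuming click is installed. The command body and the `run_train` helper are illustrative only and are not part of gitrisky; the point is that an omitted `-p` arrives as an empty tuple rather than `None`.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option('-p', '--pattern', multiple=True,
              help="Bug fix tag; repeat the flag for several, e.g. -p BUG -p FIX")
def train(pattern):
    """Echo the tags that would be used to find bug fix commits."""
    # With multiple=True click always passes a tuple, empty when -p is omitted,
    # so test for emptiness instead of comparing against None.
    if len(pattern) == 0:
        pattern = ("BUG", "FIX")
    click.echo(",".join(pattern))


def run_train(args):
    """Invoke the command in-process and return its stdout (demo helper)."""
    result = CliRunner().invoke(train, args)
    return result.output.strip()


if __name__ == "__main__":
    print(run_train([]))
    print(run_train(["-p", "REGRESSION", "-p", "HOTFIX"]))
```

With this shape, the comma-splitting in `train` would no longer be needed, since each tag is passed with its own `-p`.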

+def train(pattern=None):
     """Train a git commit bug risk model.

     This will save a pickled sklearn model to a file in the toplevel directory
@@ -23,7 +25,10 @@ def train():

     # get the features and labels by parsing the git logs
     features = get_features()
-    labels = get_labels()
+    if pattern is not None:
+        labels = get_labels(pattern.split(','))
+    else:
+        labels = get_labels(pattern)

     # instantiate and train a model
     model = create_model()

16 changes: 12 additions & 4 deletions gitrisky/gitcmds.py
@@ -3,7 +3,7 @@
 import re

 from collections import defaultdict
-from subprocess import check_output
+from subprocess import check_output, CalledProcessError


 def _run_bash_command(bash_cmd):
@@ -20,7 +20,11 @@ def _run_bash_command(bash_cmd):
         The resulting stdout output.
     """

-    stdout = check_output(bash_cmd.split()).decode('utf-8').rstrip('\n')
+    try:
+        stdout = check_output(bash_cmd.split()).decode('utf-8').rstrip('\n')
+    except CalledProcessError as err:
+        print('Failed to execute bash command: {!r}'.format(str(bash_cmd)))
Owner

Do we need to wrap `bash_cmd` in `str()`? It should be a string already, I think.

+        exit(1)
Owner

Good call adding the non-zero exit status
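
One possible shape for the error path discussed in the two comments above, as a sketch rather than the PR's code. It assumes the command is already a `str`, so no `str()` wrapper is used, and it swaps in `sys.stderr` and `sys.exit` for the bare `print` and `exit` in the diff. The name `run_command` is hypothetical.

```python
import sys
from subprocess import CalledProcessError, check_output


def run_command(cmd):
    """Run a whitespace-separated command string and return its stdout."""
    try:
        return check_output(cmd.split()).decode('utf-8').rstrip('\n')
    except CalledProcessError as err:
        # cmd is already a str, so {!r} can format it directly.
        print('Failed to execute command: {!r} (exit status {})'
              .format(cmd, err.returncode), file=sys.stderr)
        # Exit non-zero so callers and shell scripts can detect the failure.
        sys.exit(1)
```

Writing the message to stderr keeps it out of any output that a caller might be capturing from stdout.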


     return stdout

@@ -80,7 +84,7 @@ def get_git_log(commit=None):
     return stdout


-def get_bugfix_commits():
+def get_bugfix_commits(pattern=None):
     """Get the commits whose commit messages contain BUG or FIX.
Owner

We should probably update this docstring too.


     Returns
@@ -89,8 +93,12 @@ def get_bugfix_commits():
         A list of commit hashes.
     """

+    if pattern is None:
+        pattern = ("BUG", "FIX")
+
     # TODO: add option to specify custom bugfix tags
-    bash_cmd = "git log -i --all --grep BUG --grep FIX --pretty=format:%h"
+    bash_cmd = "git log -i --all --grep {} --grep {} --pretty=format:%h"\
+        .format(pattern[0], pattern[1])
Owner

Let's do this in a way that allows for a variable number of bugfix tags, maybe something like:

grep_clause = ' '.join(['--grep {}'.format(tag) for tag in pattern])
bash_cmd = "git log -i --all --pretty=format:%h " + grep_clause
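
A self-contained variant of the suggestion above, offered as a sketch under a few assumptions. It builds the command as an argument list instead of a string, which is a deliberate departure from the snippet in the comment: a tag containing a space would otherwise be broken apart by `bash_cmd.split()`. The function name `build_bugfix_log_command` is hypothetical, and an empty or missing `pattern` is assumed to fall back to the original `BUG` and `FIX` tags.

```python
def build_bugfix_log_command(pattern=()):
    """Return the git log argument list that greps for each tag in pattern."""
    # Fall back to the default tags when nothing was supplied (None or empty).
    tags = tuple(pattern) if pattern else ("BUG", "FIX")
    cmd = ["git", "log", "-i", "--all", "--pretty=format:%h"]
    for tag in tags:
        # One --grep per tag; git log selects commits matching any of them.
        cmd += ["--grep", tag]
    return cmd


if __name__ == "__main__":
    print(build_bugfix_log_command())
    print(build_bugfix_log_command(("REGRESSION", "hot fix")))
```

The resulting list can be handed to `subprocess.check_output` directly, without going through `.split()`.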


     stdout = _run_bash_command(bash_cmd)

4 changes: 2 additions & 2 deletions gitrisky/parsing.py
@@ -135,7 +135,7 @@ def get_features(commit=None):
     return feats


-def get_labels():
+def get_labels(pattern=None):
     """Get a label for each commit indicating whether it introduced a bug.

     Returns
@@ -147,7 +147,7 @@ def get_labels():

     feats = get_features()

-    fix_commits = get_bugfix_commits()
+    fix_commits = get_bugfix_commits(pattern)

     bug_commits = link_fixes_to_bugs(fix_commits)
