Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue-1310 #1332

Merged
merged 5 commits into from
May 23, 2017
Merged

Fix issue-1310 #1332

merged 5 commits into from
May 23, 2017

Conversation

parulsethi
Copy link
Contributor

Renamed out_path. Also, shifted a log line just above its corresponding command execution.

@tmylk
Copy link
Contributor

tmylk commented May 17, 2017

please add logging before every check_output call

@parulsethi
Copy link
Contributor Author

Done

with smart_open(input_fname, 'rb') as r:
with smart_open(output_fname, 'wb') as w:
utils.check_output(w, args=command, stdin=r)

logger.info("Delete frequencies from vocab file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a logging with cmd_del_vocab_freq like line 103

Copy link
Contributor

@menshikh-iv menshikh-iv May 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be necessary to add logging arguments to the function check_output. Then you will not need to add logging to it each time and some questions from the mailing list will be easier to debug, @tmylk what do you think about it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's add a utils logger

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be great

@menshikh-iv
Copy link
Contributor

menshikh-iv commented May 20, 2017

@parulsethi could you please add logging parameters to utils.check_output function, after this you can remove logging before check_output statement

gensim/utils.py Outdated
@@ -1164,11 +1164,12 @@ def check_output(stdout=subprocess.PIPE, *popenargs, **kwargs):
Added extra KeyboardInterrupt handling
"""
try:
cmd = kwargs.get("args")
logger.info("COMMAND: %s", cmd)
Copy link
Contributor Author

@parulsethi parulsethi May 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if some specific information needs to be logged for a command, then the log statement can be specified in the code there and this will just make sure to log the command that has to be run, for debugging purposes. WDYT?

Or should a parameter for log string be provided so that the command specific info is also logged from check_output only

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You should use logger.debug here (because it's needed for debug purposes only)
  2. Need to logging all input parameters (popenargs and kwargs)
  3. I think now is not necessary to logging some "command specific" data (based on troubles from our users in mailing list), logging all arguments will be enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made suggested changes, and removed logging command(cmd) at logger.info level

@menshikh-iv
Copy link
Contributor

menshikh-iv commented May 22, 2017

Looks good, thank you @parulsethi 👍
@tmylk can you merge this please

@menshikh-iv menshikh-iv merged commit a49aa9b into piskvorky:develop May 23, 2017
@@ -1164,6 +1164,7 @@ def check_output(stdout=subprocess.PIPE, *popenargs, **kwargs):
Added extra KeyboardInterrupt handling
"""
try:
logger.debug("COMMAND: %s %s", str(popenargs), str(kwargs))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%s = str, no need to call that function explicitly.

sgd_num=100, lrate=0.001, period=10, iter=90, epsilon=0.75, dump_period=10, reg=0, alpha=100,
beta=99, loss='hinge', memory=4.0, cleanup_files=True, sorted_vocab=1, ensemble=0):
"""
`wr_path` is the path to the Wordrank directory.
`corpus_file` is the filename of the text file to be used for training the Wordrank model.
Expects file to contain space-separated tokens in a single line
`out_path` is the path to directory which will be created to save embeddings and training data.
`out_name` is name of the directory which will be created(in wordrank folder) to save embeddings and training data.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space before bracket (.

input_fnames = [corpus_file.split('/')[-1], corpus_file.split('/')[-1], cooccurrence_file]
output_fnames = [temp_vocab_file, cooccurrence_file, cooccurrence_shuf_file]

logger.info("Prepare training data using glove code")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to make info messages more informative: what data? How much data?

If it's not meant to be seen by end users (internal opaque debug message), it's better to use debug.

for command, input_fname, output_fname in zip(commands, input_fnames, output_fnames):
with smart_open(input_fname, 'rb') as r:
with smart_open(output_fname, 'wb') as w:
utils.check_output(w, args=command, stdin=r)

logger.info("Delete frequencies from vocab file")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete = imperative. Is that intented? Should the user delete something?

Probably meant to be "deleting" or "will delete" instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants