Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] HDP #1055

Merged
merged 9 commits into from
Dec 27, 2016
Merged

[WIP] HDP #1055

merged 9 commits into from
Dec 27, 2016

Conversation

bhargavvader
Copy link
Contributor

This is to address issues #901, and #952, and go towards fixing #945 - basically to attempt to clean up HDP as much as possible.

This includes only the HDP changes from the closed #996.

I'll make the other cosmetic changes, and moving the appropriate methods to utils and matutils in a different PR.

@bhargavvader
Copy link
Contributor Author

@tmylk , only python 2.6 has failed with ImportError: cannot import name interfaces.
Other tests pass fine.

@tmylk
Copy link
Contributor

tmylk commented Dec 23, 2016

That test error will be fixed after #1056 merged in

@@ -56,6 +56,21 @@ def dirichlet_expectation(alpha):
return(sp.psi(alpha) - sp.psi(np.sum(alpha, 1))[:, np.newaxis])


def get_random_state(seed):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why copy-paste from LdaModel? Should it be moved to utils?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move the common ldamodel and hdpmodel methods to the respective utils and matutils files after this PR is merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to make a few other changes to utils and matutils as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done as a part of this PR. Duplicate code will not be merged.

@bhargavvader
Copy link
Contributor Author

bhargavvader commented Dec 23, 2016

@tmylk tests pass!

@tmylk tmylk changed the title HDP [WIP] HDP Dec 25, 2016
@bhargavvader
Copy link
Contributor Author

@tmylk what else would you want done on this PR?

Copy link
Contributor

@tmylk tmylk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove duplicate code

@@ -56,6 +56,21 @@ def dirichlet_expectation(alpha):
return(sp.psi(alpha) - sp.psi(np.sum(alpha, 1))[:, np.newaxis])


def get_random_state(seed):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done as a part of this PR. Duplicate code will not be merged.

def suggested_lda_model(self):
"""
Returns closest corresponding ldamodel object corresponding to current hdp model.
The num_topics is m_T (default is 150) so as to preserve the matrice shapes when we assign alpha and beta.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is it different from hdp_to_lda? Add a comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed duplicate code, added comment.

@bhargavvader
Copy link
Contributor Author

I've moved the get_random_state to utils from ldamodel and made them both call it from utils.

@bhargavvader
Copy link
Contributor Author

@tmylk I've addressed all your comments.

@tmylk tmylk merged commit 32f0f17 into piskvorky:develop Dec 27, 2016
@tmylk
Copy link
Contributor

tmylk commented Dec 27, 2016

Thanks for the PR!

@bhargavvader bhargavvader deleted the hdp_fix branch December 27, 2016 14:08
return numpy.random.RandomState(seed)
if isinstance(seed, numpy.random.RandomState):
return seed
raise ValueError('%r cannot be used to seed a numpy.random.RandomState'
Copy link
Owner

@piskvorky piskvorky Dec 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No vertical indent in gensim, please use hanging indent.

@tmylk this keeps happening over and over -- watch out for this in reviews.

Copy link
Contributor Author

@bhargavvader bhargavvader Dec 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I had just copy-pasted this from the existing ldamodel code and missed this. Fixing it in a new PR where I make some changes to utils.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhargavvader Thanks!

jayantj pushed a commit to jayantj/gensim that referenced this pull request Jan 4, 2017
* Added print methods, lda_model

* Added HDP tests

* Changelog

* Removed duplicate code

* Removed duplicate code

* Added import

* Fixed Changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants