Author-topic model #893

Merged 103 commits on Jan 17, 2017
Commits
2e8f3cb
Initial commit. Very early stages of algorithm development.
olavurmortensen Sep 25, 2016
a21059e
Fixed some errors.
olavurmortensen Sep 25, 2016
a9bddaa
Added online algorithm, removed batch algorithm.
olavurmortensen Sep 27, 2016
7ea76f2
Using max change instead of mean change criterion. Computing a differ…
olavurmortensen Sep 29, 2016
839a8b3
Fixed some things with var_mu. Also was passing the wrong arguments t…
olavurmortensen Sep 30, 2016
bd13c60
Added 'offline' algorithm, and notebook for experiments.
olavurmortensen Sep 30, 2016
ebc808c
Fixed log normalization. Also changed symmetric initialization of hype…
olavurmortensen Oct 9, 2016
16b26f7
Removed offline algorithm class as it is no longer necessary.
olavurmortensen Oct 9, 2016
10d2b36
Changed name of online algorithm class and file.
olavurmortensen Oct 9, 2016
c94f516
Made some changes to how the likelihood is computed.
olavurmortensen Oct 10, 2016
a1d758f
Changed the name of the online algorithm again.
olavurmortensen Oct 10, 2016
46cc8bf
Brought the offline algorithm back.
olavurmortensen Oct 10, 2016
3e53655
Working on bound computation.
olavurmortensen Oct 11, 2016
09666c4
Changed the way the data structure is prepared and how the model acce…
olavurmortensen Oct 12, 2016
a562fca
Cleaned the code up a bit. Added a simple method to get author topics.
olavurmortensen Oct 12, 2016
0de43a5
Removed some comments, mostly TODOs.
olavurmortensen Oct 12, 2016
a892564
Ran some very successful experiments on 286 documents. Offline algori…
olavurmortensen Oct 13, 2016
388a5e9
Changed the online algorithm according to all the changes that have h…
olavurmortensen Oct 14, 2016
2b2a896
Fixed mistake with mu variable.
olavurmortensen Oct 14, 2016
3756435
Fixed lambda update, multiplication by size of corpus was missing. Re…
olavurmortensen Oct 14, 2016
ed3416d
Added a loop for passing over entire corpus. Discarded use of log_nor…
olavurmortensen Oct 14, 2016
994f212
Moved bound computation out of corpus-wide loop.
olavurmortensen Oct 14, 2016
956fbd5
Updated notebook.
olavurmortensen Oct 14, 2016
40bbabf
Computing rho in a different way. Added the possibility to evaluate o…
olavurmortensen Oct 14, 2016
ed96b23
Implemented hyperparam MLE for eta and alpha in offline algo. Removed…
olavurmortensen Oct 17, 2016
a225399
Made it possible to sample a subset of documents in lambda update to …
olavurmortensen Oct 17, 2016
938daff
Now, if LDA topics are supplied lambda is not estimated at all. Added…
olavurmortensen Oct 18, 2016
b43d344
Updating notebook.
olavurmortensen Oct 19, 2016
1dc7e6a
Working on line search for hyperparam MLE.
olavurmortensen Oct 20, 2016
910c626
Made some structural changes to bound and log probability computation.
olavurmortensen Oct 20, 2016
7dbd01f
In process of updating online algo w.r.t. changes in offline algo.
olavurmortensen Oct 20, 2016
9a04533
Mostly updated the online algorithm according to changes that have be…
olavurmortensen Oct 21, 2016
b450609
Fixed a critical mistake in the online algorithm.
olavurmortensen Oct 21, 2016
d3ca917
Removed a redundancy in lambda update. Updated notebook.
olavurmortensen Nov 4, 2016
ba5ba63
Making sure that the model is evaluated after the last iteration, if …
olavurmortensen Nov 7, 2016
afa747d
Updated notebook.
olavurmortensen Nov 8, 2016
693b70b
Fixed mistake in interpolating gamma. Moved lambda update outside of …
olavurmortensen Nov 8, 2016
7783261
Working on an algorithm that tries to process each 'disjoint' set of …
olavurmortensen Nov 8, 2016
868b174
Working on a minibatch algorithm. Updated notebook.
olavurmortensen Nov 9, 2016
1cfd00f
Only updating the necessary expected log theta. Changed the name of O…
olavurmortensen Nov 11, 2016
edd5025
Implemented a new algorithm. It is 5 times faster, more memory effici…
olavurmortensen Nov 13, 2016
fafc20a
Moved all algorithms except the new online one to a 'temp' folder. Ve…
olavurmortensen Nov 15, 2016
32e750d
Changed the name of the main algorithm (and file). Made a new noteboo…
olavurmortensen Nov 15, 2016
4286e90
Cleaning up code. Removed or changed a lot of comments. Removed optio…
olavurmortensen Nov 16, 2016
12f231c
Was computing the norm of phi incorrectly, fixed that, speed-up not a…
olavurmortensen Nov 21, 2016
76764ff
Working on numerically stable phi update and bound computation. Is no…
olavurmortensen Nov 22, 2016
4cb3ee9
Made a separate file for unvectorized code and stored it in , in case…
olavurmortensen Nov 22, 2016
7c14f61
Implemented mini-batch algorithm.
olavurmortensen Nov 22, 2016
eade3e1
Some minor changes to old code.
olavurmortensen Nov 22, 2016
6fe4c0e
Updated notebook.
olavurmortensen Nov 22, 2016
1975321
In mini-batch algo, only the terms seen in the current chunk are upda…
olavurmortensen Nov 23, 2016
e4a0e4b
Updated notebook. Finally getting decent results on the entire NIPS d…
olavurmortensen Nov 23, 2016
526a3bb
Optimized phinorm computation by taking expElogbeta out of the loop.
olavurmortensen Nov 28, 2016
5ee9a95
Computing the bound more efficiently (much faster now). Now not passi…
olavurmortensen Nov 28, 2016
e0d7367
Removed unnecessary temp file. Updated notebook.
olavurmortensen Nov 28, 2016
2f621e2
Merged upstream develop branch into my feature branch.
olavurmortensen Nov 28, 2016
df11bb4
In the process of refactoring (atmodel2.py will become the new atmode…
olavurmortensen Nov 30, 2016
054d37c
Merge branch 'develop' into author-topic_model
olavurmortensen Nov 30, 2016
9d9da44
Refactoring the code. A lot left to do.
olavurmortensen Nov 30, 2016
336ff92
The refactored code now runs, converges almost exactly as the old cod…
olavurmortensen Dec 1, 2016
e5e7722
Refactoring. Various docstring and commenting. Made methods for const…
olavurmortensen Dec 4, 2016
861e81a
New refactored code now in atmodel.py. Old code is in atmodelold.py, …
olavurmortensen Dec 5, 2016
e911aed
Implemented 'continued training' (call update multiple times) and __g…
olavurmortensen Dec 7, 2016
ff7f8e6
A lot of changes. Most notably, added docstrings, and made it possibl…
olavurmortensen Dec 8, 2016
bdac93a
Added unit tests. Basically a retrofit of LDA test; some new tests, s…
olavurmortensen Dec 9, 2016
9429c0a
Updated unit tests. Fixed some mistakes. Added some tests; testing up…
olavurmortensen Dec 9, 2016
e0dc2d9
Forgot to add num_docs to ids of new authors in id2author. Some comme…
olavurmortensen Dec 9, 2016
aabc0f4
Made it possible to use serialized corpora (MmCorpus), and made unit …
olavurmortensen Dec 11, 2016
e526cbc
Removed code in unit tests that silence logging (useful when doing lo…
olavurmortensen Dec 11, 2016
8cb404f
Just removed a comment.
olavurmortensen Dec 12, 2016
6cf4e75
Reverted some changes that were made to ldamodel.py that were no long…
olavurmortensen Dec 12, 2016
94956fa
get_author_topics now takes author name instead of integer ID; change…
olavurmortensen Dec 12, 2016
ebd9679
Logging silencing again causing unit test failures. Fixed.
olavurmortensen Dec 13, 2016
bafb5ef
Updated docstring. Changed __getitem__ method.
olavurmortensen Dec 28, 2016
8cd90cf
Added a new notebook where a stackexchange dataset is used. Started w…
olavurmortensen Dec 28, 2016
ac9ecd4
Updated notebooks (just to trigger rebuild).
olavurmortensen Dec 28, 2016
aa08b49
Updated code w.r.t. comments from Lev (@tmylk).
olavurmortensen Jan 5, 2017
9ce1fd5
Updated all notebooks.
olavurmortensen Jan 5, 2017
f1f9f50
Two algorithms in 'temp' used to test the difference between blocking…
olavurmortensen Jan 5, 2017
cad8f26
Added the deepcopy again. Without it, the program can fail and the sy…
olavurmortensen Jan 5, 2017
6caefd7
Removed minimum_phi_value test (was already commented out).
olavurmortensen Jan 10, 2017
48b6c1a
Comments and docstrings. Responding to comments from Lev, and working…
olavurmortensen Jan 10, 2017
d03e020
Added the author-topic model to the API reference. Also slight change…
olavurmortensen Jan 10, 2017
7ac77b7
Added a test for gamma in persistency.
olavurmortensen Jan 10, 2017
cab716d
Removed test for single author in persistency test (test is simplifie…
olavurmortensen Jan 10, 2017
be7bddf
Removed save and load methods, using LdaModel's methods directly work…
olavurmortensen Jan 10, 2017
ffadaf1
Removed all temporary files.
olavurmortensen Jan 12, 2017
e218883
Made changes to model and test in preparation for a merge with upstream.
olavurmortensen Jan 12, 2017
7d2994f
Merge remote-tracking branch 'upstream/develop' into author-topic_model
olavurmortensen Jan 12, 2017
616a965
Modified the bound method; it was somewhat confusing, and there were …
olavurmortensen Jan 13, 2017
7f98e3a
Simplified sum in phi norm computation.
olavurmortensen Jan 13, 2017
ddfc8f7
Fixed some mistakes introduced in bound method in recent commit.
olavurmortensen Jan 13, 2017
7d03608
Updated algorithm and tests w.r.t. comments from Lev. Other changes a…
olavurmortensen Jan 14, 2017
661e7e5
Updated tutorial. Removed test notebook.
olavurmortensen Jan 14, 2017
85123c0
Updated notebook.
olavurmortensen Jan 14, 2017
91675a5
Updated tutorial.
olavurmortensen Jan 14, 2017
6d961a5
Updated tutorial (introduction).
olavurmortensen Jan 14, 2017
13fa9ee
Changes w.r.t. change requests from @tmylk, plus some other changes.
olavurmortensen Jan 16, 2017
018896c
Added the URL to view notebook in HTML to tutorial.
olavurmortensen Jan 16, 2017
a0a9832
Telling users to view the notebook in nbviewer instead. Docstring lin…
olavurmortensen Jan 16, 2017
5d6944a
Removed a fixme about tutorial link.
olavurmortensen Jan 16, 2017
8e56e9e
Fixed a small mistake in bound method.
olavurmortensen Jan 16, 2017
aecaecb
Added further explanation of tutorial goal in notebook.
olavurmortensen Jan 17, 2017
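
Note on commit 7ea76f2 ("Using max change instead of mean change criterion"): the commit message does not show the code, so the following is only a minimal sketch of how the two convergence checks can differ. The function names, the parameter vectors, and the threshold are all hypothetical and are not taken from this PR.

```python
def mean_change(old, new):
    """Average absolute change across all parameters."""
    return sum(abs(n - o) for o, n in zip(old, new)) / len(old)


def max_change(old, new):
    """Largest absolute change in any single parameter."""
    return max(abs(n - o) for o, n in zip(old, new))


# Hypothetical parameter values before and after one update step.
# Only the last entry is still moving noticeably.
old_params = [0.20, 0.30, 0.50, 0.10]
new_params = [0.20, 0.30, 0.50, 0.18]

threshold = 0.05  # hypothetical convergence tolerance

# The mean is diluted by the three unchanged entries, so it falls
# below the threshold; the max still sees the one large change.
converged_by_mean = mean_change(old_params, new_params) < threshold
converged_by_max = max_change(old_params, new_params) < threshold
```

In this sketch the mean-change check reports convergence while the max-change check does not, which is one plausible reason to prefer the stricter criterion when a few parameters are still being updated.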