LDA: Latent Dirichlet Allocation
---
This repository includes three open source implementations of LDA with collapsed Gibbs sampling, modified by nanjunxiao.

[GibbsLDA++](http://sourceforge.net/projects/gibbslda/files/latest/download): single-threaded, written in C++

[ompi-lda](http://code.google.com/p/ompi-lda/): multi-node/multi-threaded, written in C++

[online_twitter_lda](https://github.com/jhlau/online_twitter_lda): multi-threaded, written in Python

Collapsed Gibbs LDA reference: [my blog](http://nanjunxiao.github.io/2015/08/07/Topic-Model-LDA%E7%90%86%E8%AE%BA%E7%AF%87/)

What's New
---
#### 1. GibbsLDA++
Fixed bugs:

1). Memory leak: use `delete[] p` instead of `delete p` when `p` points to an array.

2). Array index out of bounds: `(double)random() / RAND_MAX` lies in the closed interval [0,1], so when it equals 1 the sampled topic index reaches `K`, one past the last valid index. Dividing by `RAND_MAX + 1` instead keeps the value in [0,1).

```
// before: the ratio can equal 1, so topic can equal K
int topic = (int)(((double)random() / RAND_MAX) * K);
// after: the ratio stays in [0,1); the divisor is computed in double so RAND_MAX + 1 cannot overflow
int topic = (int)(((double)random() / ((double)RAND_MAX + 1)) * K);

// before: u can equal p[K - 1], so no topic satisfies p[topic] > u
double u = ((double)random() / RAND_MAX) * p[K - 1];
// after: u is strictly less than p[K - 1]
double u = ((double)random() / ((double)RAND_MAX + 1)) * p[K - 1];
```

#### 2. ompi-lda
Fixed bugs:

1). Bugs in infer.cc.

2). Removed `sampler.UpdateModel(corpus)` from lda.cc.

Added features:

1). Output of theta and twords files.

2). Added a subset of Boost's hpp/cpp files to the include dir, so the project can be built with `make` directly.

#### 3. online_twitter_lda
Added features:

1). Output of theta and phi mat files.

TODO
---
#### ompi-lda
1). Make twordsnum configurable.

2). Rewrite cmd_flag without Boost, so the include dir can be removed.

3). Rewrite the makefile.
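As an illustration of the GibbsLDA++ out-of-bounds fix listed under What's New, here is a minimal sketch of drawing a topic index from a uniform value in [0,1). The helper names `uniform01`, `initial_topic`, and `sample_topic` are hypothetical and do not come from GibbsLDA++; the raw random value is passed in as a parameter so the boundary cases are easy to check, and `p` is assumed to hold cumulative (unnormalized) topic weights, as the `p[K - 1]` expression in the fix suggests.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Map a raw value r in [0, RAND_MAX] to the half-open interval [0, 1).
// The divisor is computed in double so that RAND_MAX + 1 cannot overflow.
double uniform01(long r) {
    return static_cast<double>(r) / (static_cast<double>(RAND_MAX) + 1.0);
}

// Pick a uniformly random topic in [0, K). Because uniform01 is strictly
// below 1, the result can never equal K.
int initial_topic(long r, int K) {
    assert(K > 0);
    return static_cast<int>(uniform01(r) * K);
}

// Pick a topic from cumulative weights p (p.back() is the total weight).
// Returns the first index whose cumulative weight exceeds u, clamped to the
// last valid index as an extra safeguard.
int sample_topic(long r, const std::vector<double>& p) {
    assert(!p.empty());
    const double u = uniform01(r) * p.back();
    int k = 0;
    while (k < static_cast<int>(p.size()) - 1 && p[k] <= u) {
        ++k;
    }
    return k;
}
```

A call such as `initial_topic(std::rand(), K)` or `sample_topic(std::rand(), p)` would then always yield an index in `[0, K)`, even when the generator returns its maximum value.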