Partially collapsed Gibbs sampling for latent Dirichlet allocation

Park, Hongju; Park, Taeyoung; Lee, Yung-Seop


Cited 20 times in Web of Science

Cited 23 times in Scopus


Authors: Park, Hongju; Park, Taeyoung; Lee, Yung-Seop

Issue Date: 1-Oct-2019

Publisher: PERGAMON-ELSEVIER SCIENCE LTD

Keywords: Bayesian analysis; Latent Dirichlet allocation; Dirichlet process mixture; Partial collapse; Machine learning; Natural language processing

Citation: EXPERT SYSTEMS WITH APPLICATIONS, v.131, pp 208 - 218

Pages: 11

Indexed: SCIE; SCOPUS

Journal Title: EXPERT SYSTEMS WITH APPLICATIONS

Volume: 131

Start Page: 208

End Page: 218

URI: https://scholarworks.dongguk.edu/handle/sw.dongguk/7534

DOI: 10.1016/j.eswa.2019.04.028

ISSN: 0957-4174; 1873-6793

Abstract: A latent Dirichlet allocation (LDA) model is a machine learning technique to identify latent topics from text corpora within a Bayesian hierarchical framework. Current popular inferential methods to fit the LDA model are based on variational Bayesian inference, collapsed Gibbs sampling, or a combination of these. Because these methods assume a unimodal distribution over topics, however, they can suffer from large bias when text corpora consist of various clusters with different topic distributions. This paper proposes an inferential LDA method to efficiently obtain unbiased estimates under flexible modeling for heterogeneous text corpora with the method of partial collapse and Dirichlet process mixtures. The method is illustrated using a simulation study and an application to a corpus of 1300 documents from Neural Information Processing Systems (NIPS) conference articles during the period of 2000-2002 and British Broadcasting Corporation (BBC) news articles during the period of 2004-2005. © 2019 Elsevier Ltd. All rights reserved.

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Natural Science > Department of Statistics > 1. Journal Articles


Related Researcher

Lee, Yung Seop: College of Natural Science (Department of Statistics)


