Discovering research topics from library electronic references using latent Dirichlet allocation

Keywords: Academic libraries, Big data, Accounting research, Latent Dirichlet allocation (LDA), Topic model, Topic trends
Abstract:

Purpose: Discovering the research topics and trends from a large quantity of library electronic references is essential for scientific research. Current research of this kind mainly depends on human justification. The purpose of this paper is to demonstrate how to identify research topics and the evolution of trends from library electronic references efficiently and effectively by employing automatic text analysis algorithms.

Design/methodology/approach: The authors used latent Dirichlet allocation (LDA), a probabilistic generative topic model, to extract the latent topics from a large quantity of research abstracts. Then, the authors conducted a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics.

Findings: First, this paper discovers 32 significant research topics from the abstracts of 3,737 articles published in the six top accounting journals during the period 1992-2014. Second, based on the document-topic distributions generated by LDA, the authors identified seven hot topics and six cold topics from the 32 topics.

Originality/value: The topics discovered by LDA are highly consistent with the topics identified by human experts, indicating the validity and effectiveness of the methodology. Therefore, this paper provides novel knowledge to the accounting literature and demonstrates a methodology and process for topic discovery with lower cost and higher efficiency than the current methods.

Search and indexing: The work is relevant to search and indexing because it involves analyzing a large number of research abstracts to identify and extract themes. This method makes it easier for academics to find the research fields or trends they are interested in. The article uses latent Dirichlet allocation (LDA), a probabilistic generative topic model, to extract hidden themes from a large number of research abstracts. This essentially falls under the category of data exploration, because it involves uncovering implicit patterns or themes from big data.
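
The hot/cold topic step described in the abstract can be illustrated with a minimal sketch. It assumes the document-topic distributions have already been produced by some LDA implementation, and it is not the authors' procedure: the abstract does not specify the regression, so this version fits a plain least-squares line to each topic's yearly mean proportion and labels the topic by the sign of the slope. The function names, the `min_abs_slope` threshold, and the toy data are all hypothetical; a fuller analysis would normally test the slope for statistical significance rather than use a fixed cutoff.

```python
from collections import defaultdict


def yearly_topic_means(years, doc_topic):
    """Average each topic's proportion over the documents published in each year."""
    sums = {}
    counts = defaultdict(int)
    for year, theta in zip(years, doc_topic):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def ols_slope(xs, ys):
    """Slope b of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_topics(years, doc_topic, min_abs_slope=0.01):
    """Label each topic 'hot', 'cold', or 'stable' from its yearly trend.

    min_abs_slope is an illustrative cutoff (an assumption), standing in
    for the significance test a real regression analysis would use.
    """
    means = yearly_topic_means(years, doc_topic)
    xs = list(means)  # years in ascending order
    labels = []
    for k in range(len(doc_topic[0])):
        slope = ols_slope(xs, [means[y][k] for y in xs])
        if slope > min_abs_slope:
            labels.append("hot")
        elif slope < -min_abs_slope:
            labels.append("cold")
        else:
            labels.append("stable")
    return labels


# Toy document-topic matrix (made-up numbers): six documents, three topics.
years = [1992, 1992, 1993, 1993, 1994, 1994]
doc_topic = [
    [0.2, 0.6, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.4, 0.4, 0.2],
    [0.6, 0.2, 0.2],
    [0.6, 0.2, 0.2],
]
labels = classify_topics(years, doc_topic)
print(labels)  # ['hot', 'cold', 'stable']
```

In the toy data, topic 0 gains 0.2 in mean proportion per year, topic 1 loses 0.2, and topic 2 stays flat, which is why they come out as hot, cold, and stable respectively.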