Implementation of Dynamic Topic Modeling to Discover Topic Evolution on Customer Reviews
DOI:
https://doi.org/10.15575/join.v8i2.963
Keywords:
Dynamic topic modeling, BERTopic, Customer reviews, Topic evolution
Abstract
The annotation and analysis of online customer reviews have been identified as significant problems in various domains, including business intelligence, marketing, and e-governance. Over the last decade, various approaches based on topic modeling have been developed to address them. Known solutions, however, often work well only on content with static topics. This makes it challenging to analyze customer reviews, which form dynamic and constantly expanding collections of short and noisy texts. This study proposes a method to handle such dynamic content. The proposed system applies a dynamic topic model using BERTopic to monitor how topics and their words evolve over time, which helps decide when the topic model needs to be retrained to capture emerging topics. Several experiments were conducted to test the practicality and effectiveness of the proposed framework. They demonstrated how a dynamic topic model can handle the emergence of new topics, and of topics correlated over time, in customer review data. Performance improved over the baseline static topic model, with 25% of new segmented texts discovered using the dynamic topic model. The experimental results therefore demonstrate that the proposed framework can be used in practice to develop automatic review annotation tools.
License
Copyright (c) 2023 Jurnal Online Informatika

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.