Text Mining with LDA Topic Modelling of Questions from the National Library of Indonesia's Literacy Talk Shows
DOI:
https://doi.org/10.37014/medpus.v31i3.5237
Keywords:
text mining, topic modelling, Latent Dirichlet Allocation (LDA), literacy talk show questions
Abstract
In 2023, the National Library of Indonesia, through its Library Analysis and Reading Culture Development Centre, organised several literacy talk shows. Each event was documented in minutes (.doc or .pdf format) recording the speakers' material, questions, and answers. As events increased, so did the volume of minutes. This research aimed to identify frequently discussed topics using Text Mining with a Topic Modelling approach. Latent Dirichlet Allocation was applied and evaluated by perplexity values (a measure of model quality). Results showed the optimal number of topics to represent the dataset was three, with the lowest perplexity value of 470.922 at the 30th iteration. The three main topics identified were reading interest and the need for books in schools and regions, libraries’ role in improving children’s literacy, and librarians' role in inclusive literacy programmes for both young and old, including health literacy. Frequent words were literacy, library, reading, books, and children.
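The abstract selects the number of topics by choosing the LDA model with the lowest perplexity. The sketch below illustrates that criterion on toy data only. It is not the authors' implementation: the abstract does not name the software used, and the function `perplexity`, the word ids in `docs`, and the probabilities in `candidates` are all hypothetical values chosen for illustration. Perplexity is computed here as the exponential of the negative average per-token log-likelihood, so lower values mean the model predicts the text better.

```python
import math


def perplexity(docs, doc_topic, topic_word):
    """Perplexity = exp(-(total log-likelihood) / (total token count)).

    docs       : list of documents, each a list of integer word ids
    doc_topic  : doc_topic[d][k]  = P(topic k | document d)
    topic_word : topic_word[k][w] = P(word w | topic k)
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, word_ids in enumerate(docs):
        for w in word_ids:
            # Probability of the word under the document's mixture of topics.
            p_w = sum(theta * phi[w]
                      for theta, phi in zip(doc_topic[d], topic_word))
            log_likelihood += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_likelihood / n_tokens)


# Toy corpus (hypothetical): two documents over a four-word vocabulary.
docs = [[0, 0, 1], [2, 3, 3]]

# Hypothetical fitted models, keyed by number of topics.
candidates = {
    # One topic that spreads probability uniformly over the vocabulary.
    1: ([[1.0], [1.0]],
        [[0.25, 0.25, 0.25, 0.25]]),
    # Two topics, each concentrated on half of the vocabulary.
    2: ([[0.9, 0.1], [0.1, 0.9]],
        [[0.45, 0.45, 0.05, 0.05], [0.05, 0.05, 0.45, 0.45]]),
}

# Score every candidate and keep the topic count with the lowest perplexity.
scores = {k: perplexity(docs, theta, phi)
          for k, (theta, phi) in candidates.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

In this toy setting the uniform one-topic model has a perplexity equal to the vocabulary size, and the two-topic model scores lower, so it is selected. In practice the document-topic and topic-word distributions would come from a fitted LDA model, and perplexity is usually measured on held-out text.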
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.