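Text Mining dengan Topic Modelling LDA dari Pertanyaan Gelar Wicara Literasi Perpustakaan Nasional RI [Text Mining with LDA Topic Modelling of Questions from the National Library of Indonesia's Literacy Talk Shows]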

Authors

  • Mutia Jelita, Perpustakaan Nasional RI

DOI:

https://doi.org/10.37014/medpus.v31i3.5237

Keywords:

text mining, topic modelling, Latent Dirichlet Allocation (LDA), literacy talk show questions

Abstract

In 2023, the National Library of Indonesia, through its Library Analysis and Reading Culture Development Centre, organised several literacy talk shows. Each event was documented in minutes (in .doc or .pdf format) that recorded the speakers' material, the questions, and the answers. As the number of events grew, so did the volume of minutes. This study aimed to identify the most frequently discussed topics by applying text mining with a topic modelling approach. Latent Dirichlet Allocation (LDA) was applied and evaluated using perplexity, a measure of model quality for which lower values are better. The results showed that three topics best represented the dataset, with the lowest perplexity value, 470.922, reached at the 30th iteration. The three main topics were: reading interest and the need for books in schools and regions; the role of libraries in improving children's literacy; and the role of librarians in inclusive literacy programmes for both young and old, including health literacy. The most frequent words were literacy, library, reading, books, and children.
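The abstract reports perplexity without giving its formula. As a rough illustration, the sketch below uses the standard definition: the exponential of the negative average log-likelihood per token, where each word's probability is a mixture of the topic-word distributions weighted by the document's topic proportions. The function name `perplexity`, the argument layout, and the toy data are assumptions made for this example; they are not taken from the article, and the article does not state which tool was used to compute its values.

```python
import math


def perplexity(docs, doc_topic, topic_word):
    """Perplexity of a topic model on tokenised documents.

    docs:       list of documents, each a list of integer word ids
    doc_topic:  doc_topic[d][k] = P(topic k | document d)
    topic_word: topic_word[k][w] = P(word w | topic k)
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Marginal word probability: mix the topics by their weight in this document.
            p_w = sum(theta_k * topic_word[k][w]
                      for k, theta_k in enumerate(doc_topic[d]))
            log_likelihood += math.log(p_w)
            n_tokens += 1
    # Lower perplexity means the model assigns higher probability to the text.
    return math.exp(-log_likelihood / n_tokens)


# Toy data (illustrative only, not from the article): 4-word vocabulary, two documents.
docs = [[0, 0, 1, 0], [2, 3, 3, 2]]

# A one-topic model that spreads probability evenly over the vocabulary.
uniform = perplexity(docs, [[1.0], [1.0]], [[0.25, 0.25, 0.25, 0.25]])

# A two-topic model whose topics roughly match each document's word usage.
fitted = perplexity(
    docs,
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.7, 0.2, 0.05, 0.05], [0.05, 0.05, 0.4, 0.5]],
)

print(f"uniform: {uniform:.3f}, fitted: {fitted:.3f}")
```

In this toy case the uniform model's perplexity equals the vocabulary size (4), and the two-topic model scores lower. Selecting the number of topics by lowest perplexity, as the abstract describes, amounts to repeating this comparison across candidate models.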

Published

2023-12-27

How to Cite

Jelita, M. (2023). Text Mining dengan Topic Modelling LDA dari Pertanyaan Gelar Wicara Literasi Perpustakaan Nasional RI. Media Pustakawan, 31(3), 253–265. https://doi.org/10.37014/medpus.v31i3.5237

Section

Articles