Implementation Information Gain Feature Selection for Hoax News Detection on Twitter using Convolutional Neural Network (CNN)
Abstract
Information and communication technology continues to develop rapidly, especially in social media. Many people now get their information through social media, especially Twitter, because it is easy to access and inexpensive. However, this has a negative side effect: the spread of fake news, or hoaxes, which are difficult to detect. In this research, the authors developed a hoax news detection model using a Convolutional Neural Network (CNN) and the TF-IDF weighting method. Feature selection is performed using Information Gain over several feature types: unigrams, bigrams, trigrams, and a combination of the three. Testing covers three scenarios: classification alone, classification with weighting, and classification with weighting and feature selection. The Information Gain feature selection uses a threshold of 0.8. The results show that classification with weighting and feature selection produced the highest accuracy, 95.56%, on the unigram + bigram features with a 50:50 split between training and test data.
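As a rough illustration of the feature selection step described in the abstract, the sketch below scores unigram and bigram features by Information Gain and keeps those at or above a threshold. It is not the authors' implementation. The toy tweets, the function names (`ngrams`, `information_gain`, `select_features`), and the use of binary term presence are assumptions made for this example. The abstract does not say how the 0.8 threshold is applied, so treating it as a cutoff on the raw score is also an assumption.

```python
import math
from collections import Counter


def ngrams(tokens, n_values=(1, 2)):
    """Return the set of n-grams (space-joined) present in a token list."""
    feats = set()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats.add(" ".join(tokens[i:i + n]))
    return feats


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, feature):
    """IG = H(class) - H(class | feature present or absent)."""
    with_f = [y for d, y in zip(docs, labels) if feature in d]
    without_f = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(with_f) / n) * entropy(with_f) \
        + (len(without_f) / n) * entropy(without_f)
    return entropy(labels) - conditional


def select_features(docs, labels, threshold):
    """Keep features whose Information Gain is at least `threshold`."""
    vocab = set().union(*docs)
    scores = {f: information_gain(docs, labels, f) for f in vocab}
    return {f: s for f, s in scores.items() if s >= threshold}


# Hypothetical toy data, not taken from the paper's dataset.
tweets = [
    ("breaking share this now", "hoax"),
    ("share this secret cure", "hoax"),
    ("official report released today", "real"),
    ("ministry report confirms data", "real"),
]
docs = [ngrams(text.split()) for text, _ in tweets]
labels = [label for _, label in tweets]

# Assumption: the threshold is a cutoff on the raw IG score.
selected = select_features(docs, labels, threshold=0.8)
print(sorted(selected))
```

In this toy set, only n-grams that occur in every tweet of one class and in none of the other pass the cutoff. The surviving features would then be weighted with TF-IDF and passed to the classifier.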
Copyright (c) 2021 Husnul Khotimah Farid
This work is licensed under a Creative Commons Attribution 4.0 International License.
- A manuscript submitted to IndoJC must be an original work of the author(s), must contain no plagiarism, and must not have been published or be under consideration for publication in other journals.
- Copyright on any article is retained by the author(s). Regarding copyright transfers please see below.
- Authors grant IndoJC a license to publish the article and identify itself as the original publisher.
- Authors grant IndoJC commercial rights to produce hardcopy volumes of the journal for sale to libraries and individuals.
- Authors grant any third party the right to use the article freely as long as its original authors and citation details are identified.
- The article and any associated published material are distributed under the Creative Commons Attribution 4.0 License.