Named Entity Recognition for an Indonesian Based Language Tweet using Multinomial Naive Bayes Classifier

  • Ramadhyni Rifani Telkom University
  • Moch Arif Bijaksana Telkom University
  • Ibnu Asror Telkom University

Abstract

In Natural Language Processing (NLP), Named Entity Recognition (NER) is a widely studied subtask. Its main job is to detect the words in a sentence that name an entity and to identify what kind of entity they name. Our data source is real-time Indonesian-language tweets, each limited to 280 characters. Words in these tweets can refer to named entities such as locations or organizations, so deciding whether a word names an entity, and of which kind, requires looking at the word patterns around it. In Indonesia, an account posts on average at least 1-3 tweets per day, and the tweets mix formal and informal content, which makes it difficult to assign the correct entity label. In this research, we label the entities in Indonesian-language tweets using the Multinomial Naive Bayes classifier. The system is evaluated with precision, recall, and F-measure. The resulting entity classifier reaches an F1 score of 80%.
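The approach summarized above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical feature set (the word itself, its left and right neighbours, and a capitalization flag), a three-sentence toy training set invented for illustration, the labels PER, LOC, ORG and O, and Laplace smoothing with `alpha=1.0`. None of these details come from the paper.

```python
import math
from collections import Counter, defaultdict

def features(tokens, i):
    """Hypothetical context features for token i: word, neighbours, capitalization."""
    prev_tok = tokens[i - 1].lower() if i > 0 else "<s>"
    next_tok = tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>"
    return [
        "w=" + tokens[i].lower(),
        "prev=" + prev_tok,
        "next=" + next_tok,
        "cap=" + str(tokens[i][:1].isupper()),
    ]

class MultinomialNB:
    """Multinomial Naive Bayes over feature counts, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.class_counts = Counter(y)          # tokens seen per label
        self.feat_counts = defaultdict(Counter)  # feature counts per label
        self.vocab = set()
        for feats, label in zip(X, y):
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        n = sum(self.class_counts.values())
        v = len(self.vocab)
        best, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            score = math.log(count / n)  # log prior
            for f in feats:
                if f not in self.vocab:
                    continue  # features never seen in training are ignored
                num = self.feat_counts[label][f] + self.alpha
                score += math.log(num / (total + self.alpha * v))
            if score > best_score:
                best, best_score = label, score
        return best

def precision_recall_f1(gold, pred, label):
    """Per-label precision, recall and F1 computed from token-level tags."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy training data, invented for illustration only (not the paper's corpus).
TRAIN = [
    (["Jokowi", "tiba", "di", "Bandung"], ["PER", "O", "O", "LOC"]),
    (["Telkom", "buka", "kantor", "di", "Jakarta"], ["ORG", "O", "O", "O", "LOC"]),
    (["Budi", "kerja", "di", "Telkom"], ["PER", "O", "O", "ORG"]),
]
X = [features(toks, i) for toks, _ in TRAIN for i in range(len(toks))]
y = [tag for _, tags in TRAIN for tag in tags]
model = MultinomialNB().fit(X, y)

tweet = ["Ani", "tinggal", "di", "Bandung"]
predicted = [model.predict(features(tweet, i)) for i in range(len(tweet))]
```

The neighbour features are what let the classifier use "the word patterns around" a token: in the toy data a word preceded by "di" and followed by the end of the sentence is more often a location than anything else. With a real corpus, `precision_recall_f1` would be called once per entity label on held-out tweets.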

Published
2019-09-09
How to Cite
Rifani, R., Bijaksana, M. A., & Asror, I. (2019). Named Entity Recognition for an Indonesian Based Language Tweet using Multinomial Naive Bayes Classifier. Indonesian Journal on Computing (Indo-JC), 4(2), 119-126. https://doi.org/10.34818/INDOJC.2019.4.2.330
Section
Computer Science