Online News Classification Using Multinomial Naive Bayes
Abstract: The wide
availability of text in numerous forms is a valuable information resource
that can be used for various purposes. One text mining method for
analyzing text documents is classification. Text classification is the process of
grouping and categorizing documents based on a trained model. This study
aimed to categorize Indonesian news automatically using Multinomial Naive
Bayes. To obtain better results, feature selection using the Document
Frequency Thresholding (DF-Thresholding) method and term weighting using Term Frequency-Inverse
Document Frequency (TF-IDF) were applied. The experiment showed that
Multinomial Naive Bayes with TF-IDF produced the highest average accuracy,
86.62%, while plain Multinomial Naive Bayes reached 86.28%, Multinomial Naive Bayes
with DF-Thresholding and TF-IDF reached 86.15%, and Multinomial Naive Bayes with
DF-Thresholding reached 85.98%. Feature selection with Document Frequency Thresholding
reduces the dimensionality of the data quite efficiently, as shown by the
insignificant difference in final accuracy relative to plain Multinomial Naive Bayes.
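
The pipeline described in the abstract (DF-Thresholding, then TF-IDF weighting, then Multinomial Naive Bayes) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the tokenizer, the TF-IDF variant, the threshold value, or the dataset, so the function names, the `min_df` parameter, the `tf * (ln(N/df) + 1)` weighting, the Laplace smoothing, and the toy English headlines below are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization (no stemming or stopwords).
    return text.lower().split()

def select_by_df(docs, min_df):
    # DF-Thresholding: keep terms that occur in at least min_df documents.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = {term for term, n in df.items() if n >= min_df}
    return vocab, df

def tfidf(doc, df, n_docs, vocab):
    # Assumed TF-IDF variant: raw term count times (ln(N / df) + 1).
    tf = Counter(term for term in doc if term in vocab)
    return {t: c * (math.log(n_docs / df[t]) + 1.0) for t, c in tf.items()}

def train(docs, labels, min_df=1):
    vocab, df = select_by_df(docs, min_df)
    n_docs = len(docs)
    class_counts = Counter(labels)
    # Sum the TF-IDF weights of each term per class (fractional "counts").
    weights = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, df, n_docs, vocab).items():
            weights[label][term] += w
    model = {"vocab": vocab, "df": df, "n": n_docs, "prior": {}, "loglik": {}}
    for label, count in class_counts.items():
        # Laplace (add-one) smoothing over the selected vocabulary.
        total = sum(weights[label].values()) + len(vocab)
        model["prior"][label] = math.log(count / n_docs)
        model["loglik"][label] = {
            t: math.log((weights[label].get(t, 0.0) + 1.0) / total)
            for t in vocab
        }
    return model

def predict(model, doc):
    feats = tfidf(doc, model["df"], model["n"], model["vocab"])
    scores = {
        label: model["prior"][label]
        + sum(w * model["loglik"][label][t] for t, w in feats.items())
        for label in model["prior"]
    }
    return max(scores, key=scores.get)

# Invented toy headlines, for illustration only (not the study's data).
train_texts = [
    ("team wins football match", "sport"),
    ("football player scores goal", "sport"),
    ("match ends after late goal", "sport"),
    ("bank raises interest rate", "economy"),
    ("central bank holds rate", "economy"),
    ("market reacts to interest rate", "economy"),
]
docs = [tokenize(text) for text, _ in train_texts]
labels = [label for _, label in train_texts]

# min_df=2 drops every term that appears in only one headline.
model = train(docs, labels, min_df=2)
print(sorted(model["vocab"]))
print(predict(model, tokenize("football goal")))
```

With `min_df=2` the 19 distinct terms in the toy corpus shrink to 6, which is the kind of dimensionality reduction the abstract attributes to DF-Thresholding. Setting `min_df=1` keeps every term and corresponds to the TF-IDF-only configuration.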
Authors: Amelia Rahman, Wiranto, Afrizal Doewes
Journal code: jptinformatikadd170125