Analysis of Stemming Influence on Indonesian Tweet Classification
Abstract: Stemming is commonly used by researchers in natural language
processing areas such as text mining, text classification, and information
retrieval. In information retrieval, stemming may help to raise retrieval
performance. However, there are indications that stemming does not have a
significant influence on accuracy in text classification. Therefore, this paper
further analyzes the influence of stemming on tweet classification in Bahasa
Indonesia.
This work compares classification accuracy under two conditions: with stemming
and without stemming in the pre-processing task for tweet classification. The
contribution of this research is to identify a better preprocessing task in
order to obtain good accuracy in text classification.
According to the experiments, accuracy in tweet classification tends to
decrease when stemming is applied. Stemming does not raise accuracy with either
the SVM or the Naive Bayes algorithm. Therefore, this work concludes that the
stemming process does not significantly affect accuracy.
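
The two-condition comparison described in the abstract can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
setup used in the paper: toy_stem is a hypothetical, deliberately crude suffix
stripper (not a real Indonesian stemmer), the Naive Bayes classifier is a small
hand-written multinomial variant, the SVM condition is omitted, and the example
tweets are invented purely for demonstration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical suffix list for illustration only; a real Indonesian
# stemmer handles prefixes, infixes, and many more rules.
SUFFIXES = ("kan", "nya", "an", "i")


def toy_stem(token):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, use_stemming):
    """Lowercase and split on whitespace; optionally stem each token."""
    tokens = text.lower().split()
    return [toy_stem(t) for t in tokens] if use_stemming else tokens


def train_nb(docs, labels):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest add-one-smoothed log score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for t in tokens:
            if t in vocab:  # ignore words never seen in training
                count = word_counts[label][t] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(train, test, use_stemming):
    """Train and evaluate under one pre-processing condition."""
    docs = [preprocess(text, use_stemming) for text, _ in train]
    labels = [label for _, label in train]
    model = train_nb(docs, labels)
    hits = sum(
        predict_nb(model, preprocess(text, use_stemming)) == label
        for text, label in test
    )
    return hits / len(test)


# Invented toy tweets, not data from the paper.
TRAIN = [
    ("makanan enak sekali", "pos"),
    ("filmnya bagus", "pos"),
    ("pelayanan buruk", "neg"),
    ("makanannya basi", "neg"),
]
TEST = [("makanan bagus", "pos"), ("pelayanan basi", "neg")]

if __name__ == "__main__":
    for use_stemming in (False, True):
        print("stemming:", use_stemming, "accuracy:", accuracy(TRAIN, TEST, use_stemming))
```

Only the use_stemming flag differs between the two runs, so any change in
accuracy can be attributed to the stemming step. On a corpus this small the
numbers carry no meaning; the sketch only shows the shape of the comparison.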
Author: Ahmad Fathan Hidayatullah
Journal Code: jptkomputergg160251