Extractive Sentiment Summarization on Twitter Using Hybrid TF-IDF and Cosine Similarity
Abstract: The use of Twitter
by celebrities has become a new trend in impression management strategy. Mining
public reactions on social media is a good way to obtain feedback, but
extracting them is not a trivial matter. Reading hundreds of tweets while
determining their sentiment polarity is time-consuming. An extractive sentiment
summarization system is needed to address this issue. Previous research
generally did not include the sentiment information contained in a tweet as a
weight factor; as a result, only general topics of discussion were extracted.
This research aimed to perform extractive sentiment summarization of both
positive and negative sentiment in tweets mentioning the Indonesian celebrity
Agnes Monica by combining SentiStrength, Hybrid TF-IDF, and Cosine Similarity.
SentiStrength is used to obtain a sentiment strength score and to classify each
tweet as positive, negative, or neutral. Positive and negative sentiment are
then summarized by ranking tweets with Hybrid TF-IDF, using the sentiment
strength score as an additional weight, and then removing similar tweets using
Cosine Similarity.
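
The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a commonly used formulation of Hybrid TF-IDF (term frequency counted over the whole tweet collection, document frequency counted per tweet, and a minimum-length normalization). The abstract does not state how the sentiment strength score enters the weight, so multiplying by (1 + |strength|) is purely an assumption, as are the names hybrid_tfidf_scores, cosine, summarize, min_len, and sim_threshold. The strength values are supplied by hand here rather than computed by SentiStrength.

```python
import math
from collections import Counter


def hybrid_tfidf_scores(tweets, min_len=3):
    """Score each tweet with a common Hybrid TF-IDF formulation (assumed)."""
    docs = [t.lower().split() for t in tweets]
    all_words = [w for d in docs for w in d]
    tf = Counter(all_words)                         # counts over the whole collection
    df = Counter(w for d in docs for w in set(d))   # number of tweets containing w
    total, n = len(all_words), len(docs)
    scores = []
    for d in docs:
        weight = sum((tf[w] / total) * math.log2(n / df[w]) for w in d)
        # Normalize by tweet length, with a floor so short tweets are not favored.
        scores.append(weight / max(min_len, len(d)))
    return scores


def cosine(a, b):
    """Cosine similarity between two tweets using raw term counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def summarize(tweets, strengths, k=2, sim_threshold=0.6):
    """Rank by sentiment-weighted score, then skip near-duplicate tweets."""
    base = hybrid_tfidf_scores(tweets)
    # Assumption: sentiment strength acts as a multiplicative weight.
    weighted = [b * (1 + abs(s)) for b, s in zip(base, strengths)]
    ranked = sorted(range(len(tweets)), key=lambda i: weighted[i], reverse=True)
    chosen = []
    for i in ranked:
        if all(cosine(tweets[i], tweets[j]) < sim_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [tweets[i] for i in chosen]


if __name__ == "__main__":
    # Hypothetical positive-class tweets with hand-assigned strength scores.
    tweets = [
        "agnes concert was great",
        "agnes concert was great",
        "love the new album",
    ]
    print(summarize(tweets, strengths=[3, 3, 2], k=2))
```

In this sketch the summary for one polarity is built by running summarize on the tweets of that class only, so positive and negative summaries would be produced by two separate calls. The duplicate tweet is dropped by the cosine check even though it ranks as highly as the original.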
The test results showed that the combination of SentiStrength, Hybrid
TF-IDF, and Cosine Similarity performs better than Hybrid TF-IDF alone,
giving an average of 60% accuracy and 62% F-measure. This is attributed to the
addition of the sentiment score as a weight factor in sentiment summarization.
Keywords: extractive sentiment
summarization, sentiment analysis, classification, automatic text
summarization, SentiStrength, Hybrid TF-IDF
Authors: Devid Haryalesmana
Wahid, Azhari SN
Journal Code: jptinformatikadd160307