Sentence Feature Analysis for Automatic Text Summarization in the Indonesian Language
Abstract: Automatic Text Summarization (ATS) is a technique for automatically
producing a summary that retains the most important information from the
original document. Sentences are weighted using features, including Log-TFISF
(term frequency-inverse sentence frequency), sentence location, sentence
overlap, title overlap and sentence relative length. This research analyzes
these five features to determine the weight each should receive in order to
produce a coherent summary. The five features are implemented in an automatic
text summarization system for the Indonesian language that was developed using
the relative importance of topics method. Experimental results show that the sentence
location feature has the highest F-measure, 0.46, followed by sentence
overlap, title overlap, sentence relative length and Log-TFISF, with values of
0.42, 0.42, 0.35 and 0.32, respectively. The relative feature weights, from
largest to smallest, are sentence location, sentence overlap, title overlap,
sentence relative length and Log-TFISF, with values of 0.25, 0.22, 0.22, 0.19
and 0.12, respectively. Applying these relative weights in the ATS system
yields an accuracy of 70.62%, which is 2.86% higher than the 67.72% accuracy
obtained without relative weights.
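
As a rough illustration of how the relative weights reported in the abstract
might be applied, the following Python sketch scores each sentence as a
weighted sum of its five feature values and keeps the top-scoring sentences.
The abstract does not state the exact combination formula, so the linear sum,
the assumption that every feature value is already normalized to [0, 1], and
the names WEIGHTS, sentence_score and rank_sentences are illustrative
assumptions, not the authors' implementation.

```python
# Relative feature weights as reported in the abstract.
WEIGHTS = {
    "sentence_location": 0.25,
    "sentence_overlap": 0.22,
    "title_overlap": 0.22,
    "sentence_relative_length": 0.19,
    "log_tfisf": 0.12,
}


def sentence_score(features, weights=WEIGHTS):
    """Weighted sum of feature values (assumption: each value is in [0, 1])."""
    return sum(weights[name] * features[name] for name in weights)


def rank_sentences(feature_rows, top_k, weights=WEIGHTS):
    """Return the indices of the top_k highest-scoring sentences.

    feature_rows holds one feature dict per sentence, in document order.
    The selected indices are returned in document order so that the
    extracted summary preserves the original sentence sequence.
    """
    by_score = sorted(
        range(len(feature_rows)),
        key=lambda i: sentence_score(feature_rows[i], weights),
        reverse=True,
    )
    return sorted(by_score[:top_k])
```

For example, given one feature dictionary per sentence, calling
rank_sentences(rows, 2) returns the positions of the two sentences with the
highest weighted scores, listed in their original order.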
Keywords: Automatic Text Summarization (ATS), Log-TFISF, sentence location,
sentence overlap, title overlap, sentence relative length, Indonesian language
Authors: Badrus Zaman, Edi Winarko
Journal Code: jptinformatikadd110164