Rhetorical Sentence Classification for Automatic Title Generation in Scientific Article

Abstract: In this paper, we present rhetorical corpus construction and sentence classification experiments designed specifically for use in automatic title generation for scientific articles. Rhetorical classification is treated as a sequence labeling task. A rhetorical sentence classification model is useful in tasks that consider a document’s discourse structure. We performed experiments on datasets from two domains: computer science (CS dataset) and chemistry (GaN dataset). We evaluated the models using 10-fold cross-validation (0.70-0.79 weighted average F-measure) as well as on-the-run evaluation (0.30-0.36 error rate at best). We argue that our models perform best when the imbalanced data are handled with the SMOTE filter.
Keywords: rhetorical corpus construction, rhetorical classification, automatic title generation, scientific article
Authors: Jan Wira Gotama Putra, Masayu Leylia Khodra
Journal Code: jptkomputergg170126
