A Novel Part-of-Speech Set Developing Method for Statistical Machine Translation

Abstract: Part of speech (PoS) is one of the features that can be used to improve the quality of statistical machine translation (SMT). Typically, the PoS set of a language is determined from the grammar of that language or adopted from the PoS sets of other languages. This work aims to formulate a model for automatically developing a PoS set to serve as a linguistic factor for improving machine translation quality. The method uses a word similarity approach in which we cluster the words contained in a corpus. The resulting classes are then defined as the PoS set for the given language. We evaluate the computationally derived PoS set using the Moses machine translation system, comparing its results with those of an SMT system that uses a manually defined PoS set; translation quality is assessed with the BLEU metric. The evaluation uses English as the source language and Indonesian as the target language.
Keywords: method; part-of-speech; statistical machine translation; Moses; word similarity
Authors: Herry Sujaini, Kuspriyanto, Arry Akhmad Arman, Ayu Purwarianti
Journal Code: jptkomputergg140077
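
The abstract does not state which similarity measure or clustering algorithm is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that word similarity is the cosine between left/right neighbour-count vectors and that clustering is a greedy single pass with a hypothetical threshold of 0.5. The function names (context_vectors, cluster_words, tag_sentence) and the cluster labels C0, C1, ... are illustrative. The last function writes each token as word|tag, the factor-separated form that Moses reads for factored training data.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Map each word to counts of its immediate left and right neighbours."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        padded = ["<s>"] + list(sentence) + ["</s>"]
        for i in range(1, len(padded) - 1):
            vectors[padded[i]][("L", padded[i - 1])] += 1
            vectors[padded[i]][("R", padded[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_words(sentences, threshold=0.5):
    """Greedy clustering (an assumption): a word joins the first cluster whose
    seed word is similar enough, otherwise it starts a new cluster."""
    vectors = context_vectors(sentences)
    clusters = []  # each cluster is a list of words; element 0 is the seed
    for word in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def tag_sentence(sentence, clusters):
    """Attach the induced class label to each word as a word|tag factor."""
    tag_of = {w: f"C{i}" for i, cluster in enumerate(clusters) for w in cluster}
    return [f"{w}|{tag_of.get(w, 'UNK')}" for w in sentence]


# Toy corpus, for illustration only.
toy_corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["a", "cat", "runs"],
    ["a", "dog", "runs"],
]
toy_clusters = cluster_words(toy_corpus)
# toy_clusters == [["a", "the"], ["cat", "dog"], ["runs", "sleeps"]]
```

On this toy corpus the three induced classes happen to line up with determiners, nouns, and verbs. On a real corpus the number and granularity of classes would depend on the threshold and on the similarity measure chosen, and the resulting tag set would still need to be compared against a manually defined one by BLEU score, as the abstract describes.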
