A Novel Part-of-Speech Set Developing Method for Statistical Machine Translation
Abstract: Part of speech (PoS) is one of the features that can be used to
improve the quality of statistical machine translation (SMT). Typically, the
PoS set of a language is determined from the grammar of that language or
adopted from the PoS set of another language. This work aims to formulate a
model for developing a PoS set automatically, as a linguistic factor for
improving the quality of machine translation. The method uses a word
similarity approach: we cluster the words contained in a corpus, and the
resulting classes are then defined as the PoS set for the given language. We
evaluated the computationally derived PoS set with the Moses machine
translation system, comparing its output against that of an SMT system using
a manually created PoS set, and scored both systems with BLEU. The evaluation
uses English as the source language and Indonesian as the target language.
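
The clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or clustering algorithm was used, so everything below is an assumption made for illustration: words are represented by counts of their immediate left and right neighbours, compared with cosine similarity, and grouped in a single greedy pass. The function names (context_vectors, cosine, cluster_words) and the threshold value are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Map each word to counts of its immediate left/right neighbours."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        # Sentence boundary markers give first and last words a context too.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_words(sentences, threshold=0.5):
    """Greedy single-pass grouping (an assumed stand-in, not the paper's method).

    Each word joins the first cluster whose first member is similar enough;
    otherwise it starts a new cluster. Each cluster plays the role of one
    induced PoS class.
    """
    vectors = context_vectors(sentences)
    clusters = []
    for word in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


corpus = [
    "the cat sleeps",
    "the dog sleeps",
    "a cat runs",
    "a dog runs",
]
classes = cluster_words(corpus)
for index, members in enumerate(classes):
    print(f"C{index}: {members}")
```

On this toy corpus the determiners, nouns, and verbs fall into three separate classes because each group shares identical neighbour contexts. A real corpus would need a more robust clustering procedure and a tuned threshold, and the induced class labels would then be supplied to the translation system as word-level factors.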
Author: Herry Sujaini, Kuspriyanto, Arry Akhmad Arman, Ayu Purwarianti
Journal Code: jptkomputergg140077