A Phrase Combination Approach to Patent SMT
Junguo Zhu 0, Muyun Yang, Tiejun Zhao, Sheng Li, Qi Haoliang
0 Harbin Institute of Technology
Available Online December 2008.
- https://doi.org/10.2991/jcis.2008.99
- statistical machine translation, patent, phrase combination, word segmentation
- This paper presents a phrase combination approach to patent SMT (statistical machine translation) for Japanese to English. To minimize the segmentation problems caused by the many OOV (out-of-vocabulary) words in patent texts, character-based translation phrases are first introduced to avoid segmentation errors in translation modeling. Then word-based translation phrases, which are built to utilize the dependent word-level information, are combined with the character-based translation table by linearly integrating their probabilities. Our experiments on the NTCIR corpus indicate that the proposed method significantly outperformed the original word-based approach.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
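The linear integration of the character-based and word-based phrase tables mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption: the weight `lam`, the function names `surface` and `combine_phrase_tables`, the choice to key phrases by their unsegmented source string, the treatment of a pair missing from one table as probability 0, and the toy entries themselves.

```python
from collections import defaultdict


def surface(tokens):
    """Join source tokens into one unsegmented string, so that a
    character-based and a word-based phrase covering the same text
    share a key (an assumption of this sketch)."""
    return "".join(tokens)


def combine_phrase_tables(char_table, word_table, lam=0.5):
    """Linearly interpolate two phrase tables.

    Each table maps (source_tokens, target_phrase) -> probability.
    `lam` is a hypothetical weight on the character-based table;
    the word-based table gets weight 1 - lam. A pair absent from
    one table contributes 0 from that table.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    combined = defaultdict(float)
    for (src, tgt), prob in char_table.items():
        combined[(surface(src), tgt)] += lam * prob
    for (src, tgt), prob in word_table.items():
        combined[(surface(src), tgt)] += (1.0 - lam) * prob
    return dict(combined)


# Toy entries invented for illustration; not taken from the paper.
char_table = {
    (("特", "許"), "patent"): 0.8,
    (("特", "許"), "license"): 0.2,
}
word_table = {
    (("特許",), "patent"): 0.9,
    (("特許",), "a patent"): 0.1,
}

combined = combine_phrase_tables(char_table, word_table, lam=0.4)
```

In this sketch a translation pair found in only one table still survives with a reduced score, which is one plausible way a character-based table could cover source strings that the word segmenter splits incorrectly.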
Cite this article
TY  - CONF
AU  - Junguo Zhu
AU  - Muyun Yang
AU  - Tiejun Zhao
AU  - Sheng Li
AU  - Qi Haoliang
PY  - 2008/12
DA  - 2008/12
TI  - A Phrase Combination Approach to Patent SMT
BT  - 11th Joint International Conference on Information Sciences
PB  - Atlantis Press
SP  - 590
EP  - 594
SN  - 1951-6851
UR  - https://doi.org/10.2991/jcis.2008.99
DO  - https://doi.org/10.2991/jcis.2008.99
ID  - Zhu2008/12
ER  -