WordNet Expansion with Bilingual Word Embeddings and Neural Machine Translation

Document type: Conference paper

First author: Marta Vazquez Abuin

Authors: Marta Vazquez Abuin 1; Marcos Garcia 1

Author affiliations: 1. Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela

Keywords: WordNet; Lexical semantics; Distributional semantics; Galician

Conference name: EPIA Conference on Artificial Intelligence

Pages: 280-291

Abstract: This paper explores several strategies for expanding Galnet (the Galician WordNet) with both word entries and example sentences from the English WordNet. To obtain translation equivalents for a given word in a synset, we rely on lemmatized and POS-tagged bilingual word embeddings, used as probabilistic dictionaries. For the examples, we use state-of-the-art English-Galician neural machine translation models. Based on these resources, we have designed and evaluated straightforward heuristics to expand Galnet. The proposed approach allows us to obtain more than 13k high-quality example sentences in Galician and more than 4.5k new entries for Galnet. Critically, we have performed a set of careful qualitative analyses to verify the suitability of each step, assessing the adequacy of the obtained word forms and the quality of the automatic translation. The results of these analyses shed light on the performance of each stage of the process, which is also valuable information for adapting our method to other languages.

Classification number: tp18-53
