Publication

Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction

Conference Contribution (Book Chapter)

We propose a simple yet effective approach to learning bilingual word embeddings (BWEs) from non-parallel document-aligned data (based on the omnipresent skip-gram model), and apply it to bilingual lexicon induction (BLI). We demonstrate the utility of the induced BWEs in the BLI task by reporting results on benchmark BLI datasets for three language pairs: (1) we show that our BWE-based BLI models significantly outperform the MuPTM-based and context-counting models in this setting, and obtain the best reported BLI results for all three tested language pairs; (2) we also show that our BWE-based BLI models outperform other BLI models based on recently proposed BWEs that require parallel data for bilingual training.
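As a minimal sketch of the core idea, assuming the commonly described merge-and-shuffle construction for document-aligned data: words from each aligned document pair are mixed into a single pseudo-bilingual document, so that a standard monolingual skip-gram trainer sees cross-lingual word contexts and places both vocabularies in one shared space. The function names and the toy nearest-neighbour lookup below are illustrative, not the paper's actual implementation.

```python
import math
import random

def make_pseudo_bilingual(doc_src, doc_tgt, seed=0):
    """Merge and shuffle two document-aligned token lists into one
    pseudo-bilingual document (hypothetical sketch of the construction)."""
    merged = list(doc_src) + list(doc_tgt)
    random.Random(seed).shuffle(merged)
    return merged

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def induce_translation(word, src_vecs, tgt_vecs):
    """BLI as nearest-neighbour retrieval in the shared BWE space:
    return the target word whose vector is closest to the source word's."""
    query = src_vecs[word]
    return max(tgt_vecs, key=lambda w: cosine(query, tgt_vecs[w]))

# Toy usage: each pseudo-bilingual document would be fed to a standard
# skip-gram trainer (e.g. word2vec); here we only show the data step
# and the retrieval step with made-up 2-d vectors.
pseudo = make_pseudo_bilingual(["the", "cat", "sits"], ["de", "kat", "zit"])

src_vecs = {"cat": [0.9, 0.1]}
tgt_vecs = {"kat": [0.8, 0.2], "zit": [0.1, 0.9]}
print(induce_translation("cat", src_vecs, tgt_vecs))  # → kat
```

Because the shuffled document interleaves both languages, a skip-gram window over it naturally pairs source and target words as contexts, which is what aligns the two vocabularies without any parallel sentences.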
Book: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015)
Pages: 719 - 725
ISBN: 9781941643730
Publication year: 2015
Accessibility: Open