Abstract
We study methods for learning sentence embeddings with syntactic structure. We focus on methods of learning syntactic sentence-embeddings by using a multilingual parallel-corpus augmented by Universal Parts-of-Speech tags. We evaluate the quality of the learned embeddings by examining sentence-level nearest neighbours and functional dissimilarity in the embedding space. We also evaluate the ability of the method to learn syntactic sentence-embeddings for low-resource languages and demonstrate strong evidence for transfer learning. Our results show that syntactic sentence-embeddings can be learned with less training data and fewer model parameters, while achieving better evaluation metrics than state-of-the-art language models.
Citation
Chen Liu, Anderson de Andrade, & Muhammad Osama. (2019). “Exploring multilingual syntactic sentence representations.” EMNLP Workshop on Noisy User-Generated Text.
@inproceedings{DBLP:conf/aclnut/LiuAO19,
  author       = {Chen Liu and
                  Anderson de Andrade and
                  Muhammad Osama},
  editor       = {Wei Xu and
                  Alan Ritter and
                  Tim Baldwin and
                  Afshin Rahimi},
  title        = {Exploring Multilingual Syntactic Sentence Representations},
  booktitle    = {Proceedings of the 5th Workshop on Noisy User-generated Text, W-NUT@EMNLP
                  2019, Hong Kong, China, November 4, 2019},
  pages        = {153--159},
  publisher    = {Association for Computational Linguistics},
  year         = {2019},
  url          = {https://doi.org/10.18653/v1/D19-5521},
  doi          = {10.18653/V1/D19-5521},
  timestamp    = {Tue, 22 Aug 2023 20:03:10 +0200},
  biburl       = {https://dblp.org/rec/conf/aclnut/LiuAO19.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}