Abstract

We study methods for learning sentence embeddings with syntactic structure. We focus on learning syntactic sentence-embeddings from a multilingual parallel corpus augmented with Universal Part-of-Speech tags. We evaluate the quality of the learned embeddings by examining sentence-level nearest neighbours and functional dissimilarity in the embedding space. We also evaluate the ability of the method to learn syntactic sentence-embeddings for low-resource languages and demonstrate strong evidence for transfer learning. Our results show that syntactic sentence-embeddings can be learned with less training data and fewer model parameters while achieving better evaluation metrics than state-of-the-art language models.


Citation

Chen Liu, Anderson de Andrade, & Muhammad Osama. (2019). “Exploring multilingual syntactic sentence representations.” EMNLP Workshop on Noisy User-Generated Text.

@inproceedings{DBLP:conf/aclnut/LiuAO19,
  author       = {Chen Liu and
                  Anderson de Andrade and
                  Muhammad Osama},
  editor       = {Wei Xu and
                  Alan Ritter and
                  Tim Baldwin and
                  Afshin Rahimi},
  title        = {Exploring Multilingual Syntactic Sentence Representations},
  booktitle    = {Proceedings of the 5th Workshop on Noisy User-generated Text, W-NUT@EMNLP
                  2019, Hong Kong, China, November 4, 2019},
  pages        = {153--159},
  publisher    = {Association for Computational Linguistics},
  year         = {2019},
  url          = {https://doi.org/10.18653/v1/D19-5521},
  doi          = {10.18653/V1/D19-5521},
  timestamp    = {Tue, 22 Aug 2023 20:03:10 +0200},
  biburl       = {https://dblp.org/rec/conf/aclnut/LiuAO19.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}