Noëmi Aepli

PhD Student

University of Zurich

CV

That’s me

I’m a PhD student at the Department of Computational Linguistics, working on my SNSF-funded project Natural Language Processing for Low-Resource Language Variations (LORELAI). From August 2022 until March 2023, I’m visiting the Berkeley Speech and Computation Lab at UC Berkeley.

I’m fascinated by dialects and language variations and the challenges they pose for NLP.

Check out my publication Improving Zero-Shot Cross-lingual Transfer Between Closely Related Languages by Injecting Character-Level Noise, an ACL 2022 Findings paper. POS tagging models for Swiss German trained with this approach are available on the Hugging Face model hub.

Interests

  • dialects & language varieties
  • speech processing & multimodal settings
  • transfer learning between closely related languages
  • text normalization & MT between closely related languages
  • low-resource settings & robustness to language varieties

Education

  • Master in Computational Linguistics and Language Technologies, 2018

    University of Zurich

Projects

LORELAI

NLP for Low-Resource Language Variations

NOAH’s Corpus

Part-of-Speech Tagging for Swiss German

ITDI Shared Task 2022

Identification of Languages and Dialects of Italy: Shared Task @ VarDial 2022

Swiss German UD

Universal Dependency Parsing for Swiss German

Selected Publications

Improving Zero-Shot Cross-lingual Transfer Between Closely Related Languages by Injecting Character-Level Noise

a simple yet effective strategy to improve cross-lingual transfer between closely related language varieties

On Biasing Transformer Attention Towards Monotonicity

we introduce a monotonicity bias into attentional sequence-to-sequence models and explore to what extent a model can benefit from it

Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging

data set for developing NLP applications for Swiss German & baseline for dialect identification

Part-of-Speech Tag Disambiguation by Cross-Linguistic Majority Vote

an approach to developing resources for a low-resource language, taking advantage of closely related languages