1 |
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
In: https://hal.inria.fr/hal-03540069 (2022)
2 |
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
3 |
SIGTYP 2020 Shared Task: Prediction of Typological Features ...
4 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information ...
5 |
Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness ...
6 |
Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset ...
7 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
8 |
UniMorph 3.0: Universal Morphology
In: Proceedings of the 12th Language Resources and Evaluation Conference (2020)
9 |
UniMorph 3.0: Universal Morphology ...
McCarthy, Arya D.; Kirov, Christo; Grella, Matteo; Nidhi, Amrit; Xia, Patrick; Gorman, Kyle; Vylomova, Ekaterina; Mielke, Sabrina J.; Nicolai, Garrett; Silfverberg, Miikka; Arkhangelskij, Timofey; Krizhanovsky, Natalya; Krizhanovsky, Andrew; Klyachko, Elena; Sorokin, Alexey; Mansfield, John; Ernštreits, Valts; Pinter, Yuval; Jacobs, Cassandra L.; Cotterell, Ryan; Hulden, Mans; Yarowsky, David. - : ETH Zurich, 2020
Abstract:
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological paradigms for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. We have implemented several improvements to the extraction pipeline which creates most of our data, so that it is both more complete and more correct. We have added 66 new languages, as well as new parts of speech for 12 languages. We have also amended the schema in several ways. Finally, we present three new community tools: two to validate data for resource creators, and one to make morphological data available from the command line. UniMorph is based at the Center for Language and Speech Processing (CLSP) at Johns Hopkins University in Baltimore, Maryland. This paper details advances made to the schema, tooling, and ... : Proceedings of the 12th Language Resources and Evaluation Conference ...
Keyword:
lexical database; morphology; multilinguality
URL: https://dx.doi.org/10.3929/ethz-b-000462327 http://hdl.handle.net/20.500.11850/462327
10 |
The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection ...
11 |
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model ...
13 |
Unsupervised Disambiguation of Syncretism in Inflected Lexicons ...