1 | Universal Dependencies and Semantics for English and Hebrew Child-directed Speech
In: Proceedings of the Society for Computation in Linguistics (2022)

2 | Do Infants Really Learn Phonetic Categories?
In: Open Mind, MIT Press, 2021, 5, pp. 113-131. EISSN: 2470-2986. https://hal.archives-ouvertes.fr/hal-03550830. doi:10.1162/opmi_a_00046 (2021)

3 | Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input
In: Proceedings of the National Academy of Sciences of the United States of America, National Academy of Sciences, 2021, 118 (7), pp. e2001844118. ISSN: 0027-8424; EISSN: 1091-6490. https://hal.archives-ouvertes.fr/hal-03070566. doi:10.1073/pnas.2001844118 (2021)

4 | Black or White but never neutral: How readers perceive identity from yellow or skin-toned emoji ...

6 | Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech ...

7 | Do Infants Really Learn Phonetic Categories?
In: Open Mind (Camb) (2021)

8 | Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input
In: Proc Natl Acad Sci U S A (2021)

9 | Multilingual acoustic word embedding models for processing zero-resource languages ...

10 | Improved acoustic word embeddings for zero-resource languages using multilingual transfer ...

12 | Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals ...

13 | Evaluating computational models of infant phonetic learning across languages ...

14 | Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages
In: http://infoscience.epfl.ch/record/277105 (2020)

15 | On understanding character-level models for representing morphology

16 | Methods for morphology learning in low(er)-resource scenarios
Abstract:
A core issue that hampers the development and use of language technology for under-resourced and morphologically rich languages is data sparsity. In this work, we consider unsupervised morphological analysis and lemmatization, two linguistically motivated ways to combat problems with sparse data. Morphological analysis aims to represent words in terms of the smallest meaningful units of language, morphemes (e.g., acid +ify +ed), while lemmatization concerns relationships among words (e.g., walks, walking and walked are all different forms of the lexeme walk). In this thesis, we focus on morphology learning in low-resource scenarios: we propose algorithms and methods that learn unsupervised morphological analysis and lemmatization with higher accuracy than previous work while keeping training data requirements affordable. Our unsupervised morphological analyzers have similar or better underlying morpheme accuracy than three strong baselines while, on average, inducing a 12.8% more compact representation of the data than the next best system. Our lemmatizers reduce the training data requirements to raw character representations of wordforms in their immediate context, yet yield improvements (especially on unseen and ambiguous words) over systems that learn from complete morphologically annotated sentences.
Keyword:
lemmatization; low-resource learning; morphemes; morphological analysis; morphology; Natural Language Processing
URL: https://hdl.handle.net/1842/37115 https://doi.org/10.7488/era/416

17 | Discovering and analysing lexical variation in social media text

18 | Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection ...

19 | Analyzing ASR pretraining for low-resource speech-to-text translation ...