1.
Measuring Terminology Consistency in Translated Corpora: Implementation of the Herfindahl-Hirshman Index
In: Information; Volume 13; Issue 2; Pages: 43 (2022)
BASE

2.
Constructional equivalence in the Indonesian translations of ROB and STEAL ...
BASE

4.
Le particelle razve e neuželi alla luce del Corpus parallelo russo-italiano
Noseda, Valentina (orcid:0000-0002-5148-1241). Roma, ITA: Aracne, 2021
BASE

5.
Translationese and register variation in English-to-Russian professional translation
In: New Perspectives on Corpus Translation Studies (2021)
Abstract:
This is an accepted manuscript of a chapter published by Springer in Wang V.X., Lim L., Li D. (eds.) New Perspectives on Corpus Translation Studies. New Frontiers in Translation Studies, available online: https://doi.org/10.1007/978-981-16-4918-9_6 The accepted version of the publication may differ from the final published version. For re-use please see the publisher's reuse policy.
This study explores the impact of register on the properties of translations. We compare sources, translations and non-translated reference texts to describe the linguistic specificity of translations, both what is shared across four registers and what is unique to each. Our approach includes bottom-up identification of translationese effects that can be used to define translations in relation to the contrastive properties of each register. The analysis is based on an extended set of features that reflect morphological, syntactic and text-level characteristics of translations. We also experiment with lexis-based features from n-gram language models estimated on large bodies of originally authored texts from the included registers. Our parallel corpora are built from published English-to-Russian professional translations of general domain mass-media texts, popular-scientific books, fiction and analytical texts on political and economic news. The number of observations and the data sizes for parallel and reference components are comparable within each register and range from 166 (fiction) to 525 (media) text pairs and from 300,000 to 1 million tokens. Methodologically, the research relies on a series of supervised and unsupervised machine learning techniques, including those that facilitate visual data exploration. We train a number of text classification models and study their performance to assess our hypotheses. We then analyse the usefulness of the features for these classifications to detect the best translationese indicators in each register.
The multivariate analysis via text classification is complemented by univariate statistical analysis, which helps to explain the observed deviation of translated registers through a number of translationese effects and to detect the features that contribute to them. Our results demonstrate that each register generates a unique form of translationese that can only be partially explained by cross-linguistic factors. Translated registers differ in the amount and type of prevalent translationese. The same translationese tendencies are manifested through different features in different registers. In particular, the notorious shining-through effect is more noticeable in general media texts and news commentary and is less prominent in fiction.
Keyword:
English; language and arts disciplines; machine learning; parallel corpora; register variation; Russian; translation; translationese indicators; translationese trends
URL: https://doi.org/10.1007/978-981-16-4918-9_6 http://hdl.handle.net/2436/624409
BASE

6.
The use of parallel Corpora for a contrastive (Russian-Italian) description of discourse markers: new instruments compared to traditional lexicography
Bonola, Anna Paola (orcid:0000-0003-3931-670X); Noseda, Valentina (orcid:0000-0002-5148-1241). Milano, ITA: Associazione per l’Informatica Umanistica e la Cultura Digitale, 2020
BASE

7.
The use of English, Czech and French punctuation marks in reference, parallel and comparable web corpora: a question of methodology
In: Linguistica Pragensia, Vol 30, Iss 1, Pp 30-50 (2020)
BASE

8.
A Sustainable and Open Access Knowledge Organization Model to Preserve Cultural Heritage and Language Diversity
In: Information (ISSN: 2078-2489), MDPI, 2019, 10 (10), pp.303. ⟨10.3390/info10100303⟩ ; https://hal.archives-ouvertes.fr/hal-02565134 (2019)
BASE

9.
A Sustainable and Open Access Knowledge Organization Model to Preserve Cultural Heritage and Language Diversity
In: Information ; Volume 10 ; Issue 10 (2019)
BASE

10.
Post-French Immersion Student Perceptions of Parallel Concordancing: A Mixed Methods Study
BASE

11.
Extending the Galician Wordnet Using a Multilingual Bible Through Lexical Alignment and Semantic Annotation
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018. OASIcs - OpenAccess Series in Informatics. 7th Symposium on Languages, Applications and Technologies (SLATE 2018), 2018
BASE

12.
A flexible framework for collocation retrieval and translation from parallel and comparable corpora
BASE

13.
NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills
In: Schneider, Gerold; Graën, Johannes (2018). NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills. In: 7th Workshop on NLP for Computer Assisted Language Learning at SLTC 2018 (NLP4CALL 2018), Stockholm, 7 November 2018, 69-78. (2018)
BASE

14.
English to Spanish translated medical forms: A descriptive genre-based corpus study
In: Translation and Interpreting : the International Journal of Translation and Interpreting Research, Vol 10, Iss 2, Pp 122-141 (2018)
BASE

15.
Crossing the Border Twice: Reimporting Prepositions to Alleviate L1-Specific Transfer Errors
In: Graën, Johannes; Schneider, Gerold (2017). Crossing the Border Twice: Reimporting Prepositions to Alleviate L1-Specific Transfer Errors. In: Joint 6th Workshop on NLP for Computer Assisted Language Learning and 2nd Workshop on NLP for Research on Language Acquisition, Gothenburg, 22 May 2017, 18-26. (2017)
BASE

16.
The parallel Polish-Bulgarian-Russian corpus: problems and solutions
BASE

17.
Forms of Address as Discrete Modal Operators
In: Cognitive Studies | Études cognitives; No 16 (2016); 23-32 ; 2392-2397 (2016)
BASE

18.
The Use of the Lexical Exponents of Hypothetical Modality in Polish and Lithuanian
In: Cognitive Studies | Études cognitives; No 16 (2016); 45-56 ; 2392-2397 (2016)
BASE

19.
Parallel corpora. A real-time approach to the study of language change in progress
BASE