3 |
CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – the National Corpus of Contemporary Welsh ...
|
|
Knight, Dawn; Morris, Steve; Fitzpatrick, Tess; Rayson, Paul; Spasić, Irena; Thomas, Enlli Môn; Lovell, Alex; Morris, Jonathan; Evas, Jeremy; Stonelake, Mark; Arman, Laura; Davies, Josh; Ezeani, Ignatius; Neale, Steven; Needs, Jennifer; Piao, Scott; Rees, Mair; Watkins, Gareth; Williams, Lowri; Muralidaran, Vignesh; Tovey-Walsh, Bethan; Anthony, Laurence; Cobb, Thomas M; Deuchar, Margaret; Donnelly, Kevin; McCarthy, Michael; Scannell, Kevin. Cardiff University, 2020
|
|
Abstract:
The CorCenCC corpus contains over 11 million words (circa 14.4m tokens) from written, spoken and electronic (online, digital texts) Welsh language sources, taken from a range of genres, language varieties (regional and social) and contexts. The contributors to CorCenCC are representative of the over half a million Welsh speakers in the country. The creation of CorCenCC was a community-driven project, which offered users of Welsh an opportunity to be proactive in contributing to a Welsh language resource that reflects how Welsh is currently used. To make CorCenCC as representative of contemporary Welsh as possible, the project team designed a bespoke sampling framework. Extracts were collected from sources including, for example, journals, emails, sermons, road signs, TV programmes, meetings, magazines and books. Conversations were recorded by the research team, and a specially designed crowdsourcing app (see: https://www.corcencc.org/app/) enabled Welsh speakers in the community to record and upload samples ...
|
|
Keywords:
Computational Linguistics; Computational/Corpus Linguistics; Language Corpora for ICT; Linguistics General
|
|
URL: https://dx.doi.org/10.17035/d.2020.0119878310 https://research.cardiff.ac.uk/converis/portal/detail/Dataset/119878310?auxfun=&lang=en_GB
|
|
BASE
|
|
4 |
The National Corpus of Contemporary Welsh: Project Report | Y Corpws Cenedlaethol Cymraeg Cyfoes: Adroddiad y Prosiect ...
|
|
|
|
BASE
|
|
5 |
Yr Amliadur: Frequency Lists for Contemporary Welsh (Version 1.0.0) ...
|
|
|
|
BASE
|
|
7 |
The CorCenCC crowdsourcing app: a bespoke tool for the user-driven creation of the national corpus of contemporary Welsh
|
|
|
|
BASE
|
|
8 |
The CorCenCC crowdsourcing app: a bespoke tool for the user-driven creation of the national corpus of contemporary Welsh
|
|
|
|
BASE
|
|
9 |
Welsh for adults teaching and learning approaches, methodologies and resources: a comprehensive research study and critical review of the way forward
|
|
|
|
BASE
|
|
10 |
Welsh for adults teaching and learning approaches, methodologies and resources: a comprehensive research study and critical review of the way forward
|
|
|
|
BASE
|
|
12 |
Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh) [Online resource]
|
|
|
|
IDS-Repository
|
|