4 |
The national corpus of contemporary Welsh, 2016-2020 ...
|
|
Knight, Dawn; Morris, Steve; Fitzpatrick, Tess; Rayson, Paul; Spasić, Irena; Thomas, Enlli Môn; Lovell, Alex; Morris, Jonathan; Evas, Jeremy; Stonelake, Mark; Arman, Laura; Davies, Joshua; Ezeani, Ignatius; Neale, Steven; Needs, Jennifer; Piao, Scott; Rees, Mair; Watkins, Gareth; Williams, Lowri; Muralidaran, Vignesh; Tovey-Walsh, Bethan; Anthony, Laurence; Cobb, Tom; Deuchar, Margaret; Donnelly, Kevin; McCarthy, Michael; Scannell, Kevin. UK Data Service, 2021
|
|
Abstract:
The CorCenCC corpus contains over 11 million words (circa 14.4m tokens). CorCenCC is the first corpus of the Welsh language that covers all three aspects of contemporary Welsh: spoken, written and electronically mediated (e-language). It offers a snapshot of the Welsh language across a range of contexts of use, e.g. private conversations, group socialising, business and other work situations, in education, in the various published media, and in public spaces. It includes examples of news headlines, personal and professional emails and correspondence, academic writing, formal and informal speech, blog posts and text messaging. Language data was sampled from a range of different speakers and users of Welsh, from all regions of Wales, of all ages and genders, with a wide range of occupations, and with a variety of linguistic backgrounds (e.g. how they came to speak Welsh), to reflect the diversity of text types and of Welsh speakers found in contemporary Wales. In this way, the CorCenCC corpus provides the ...
|
|
URL: https://dx.doi.org/10.5255/ukda-sn-854531 http://reshare.ukdataservice.ac.uk/id/eprint/854531
|
|
BASE
|
|
5 |
CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – the National Corpus of Contemporary Welsh ...
|
|
|
|
BASE
|
|
7 |
Code-switching in Irish tweets: a preliminary analysis
|
|
|
|
In: Lynn, Teresa and Scannell, Kevin (ORCID: 0000-0003-4075-9524) (2019) Code-switching in Irish tweets: a preliminary analysis. In: Third Celtic Language Technology Workshop 2019, 19 Aug 2019, Dublin, Ireland.
|
|
BASE
|
|
9 |
Minority language Twitter: part-of-speech tagging and analysis of Irish Tweets
|
|
|
|
In: Lynn, Teresa, Scannell, Kevin and Maguire, Eimear (2015) Minority language Twitter: part-of-speech tagging and analysis of Irish Tweets. In: ACL 2015 Workshop on Noisy User-generated Text 2015 (W-NUT), 31 July 2015, Beijing, China.
|
|
BASE
|
|
12 |
Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh) [Online resource]
|
|
|
|
IDS-Repository
|
|