3.
A large Portuguese corpus on-line: cleaning and preprocessing
4.
Annotating the Interaction between Focus and Modality: the case of exclusive particles
5.
Introducing the Reference Corpus of Contemporary Portuguese On-Line
8.
Proposal for Multi-word Expression annotation in running text
10.
Discovering the Language of Wine Reviews: A Text Mining Account
11.
Modality annotation for Portuguese: from manual annotation to automatic labeling
Abstract:
We investigate modality in Portuguese, combining a linguistic perspective with an application-oriented one. We design an annotation scheme that reflects theoretical linguistic concepts and apply it to a small corpus sample to show how the scheme deals with real-world language usage. We present two schemes for Portuguese, one for spoken Brazilian Portuguese and one for written European Portuguese. Furthermore, we use the annotated data not only to study the linguistic phenomenon of modality, but also to train a practical text mining tool that detects modality in text automatically. The modality tagger uses a machine learning classifier trained on features extracted automatically from the output of a syntactic parser. As only a small annotated sample is available, the tagger was evaluated on 11 modal verbs that are frequent in our corpus and that denote more than one modal meaning. Finally, we discuss several valuable insights into the complexity of the semantic concept of modality that derive from the manual annotation of the corpus and from the analysis of the automatic labeling results: ambiguity, the semantic and syntactic properties typically associated with one modal meaning in context, and the interaction of modality with negation and focus. The knowledge gained from the manual annotation task leads us to propose a new unified scheme for modality that applies to the two Portuguese varieties and covers both written and spoken data.
Keyword:
Corpus annotation; Modality; Portuguese linguistics; Text mining
URL: http://hdl.handle.net/10451/30693
14.
Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus
16.
Towards a unified approach to modality annotation in Portuguese
19.
Enhancing access to online education: quality machine translation of MOOC content
In: Kordoni, Valia; van den Bosch, Antal (orcid:0000-0003-2493-656X); Kermanidis, Katia Lida (orcid:0000-0002-3270-5078); Sosoni, Vilelmini (orcid:0000-0002-9583-4651); Cholakov, Kostadin; Hendrickx, Iris; Huck, Matthias; and Way, Andy (orcid:0000-0001-5736-5930) (2016) Enhancing access to online education: quality machine translation of MOOC content. In: Tenth International Conference on Language Resources and Evaluation (LREC 2016), 23-28 May 2016, Portorož, Slovenia. ISBN 978-2-9517408-9-1