21 | Connecting Resources: Which Issues Have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
In: 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17) ; https://hal.archives-ouvertes.fr/hal-01918880 ; 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), Oct 2017, Bolzano, Italy. pp.52-55 ; https://doi.org/10.5281/zenodo.1040713 (2017)
23 | Connecting Resources: Which Issues Have To Be Solved To Integrate Cmc Corpora From Heterogeneous Sources And For Different Languages? ...
24 | Connecting Resources: Which Issues Have To Be Solved To Integrate Cmc Corpora From Heterogeneous Sources And For Different Languages? ...
26 | Integrating optical character recognition and machine translation of historical documents
In: Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2016) Integrating optical character recognition and machine translation of historical documents. In: COLING, the 26th International Conference on Computational Linguistics, 13-16 Dec 2016, Osaka, Japan. (2016)
27 | Language Technology Resources and Tools for Digital Humanities (LT4DH) : proceedings of the workshop : December 11-16, 2016, Osaka, Japan : LT4DH 2016
35 | Introduction to the Special Issue [Computational, cognitive, and linguistic approaches to the analysis of compounds and collocations]
36 | Accurate linear-time Chinese word segmentation via embedding matching
37 | Computational, Cognitive, and Linguistic Approaches to the Analysis of Compounds and Collocations [Special Issue]
38 | Proceedings of the fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14) : 11-12 December 2015, Warsaw, Poland
Hinrichs, Erhard. - : Warszawa : Institute of Computer Science, Polish Academy of Sciences, 2015
39 | Word Sense Disambiguation with GermaNet ; Disambiguierung von Wortbedeutungen mit GermaNet
Abstract:
The subject of this dissertation is boosting research on word sense disambiguation (WSD) for German. WSD is a very active area of research in computational linguistics, but most of the work is focused on English. One of the factors that has hampered WSD research for other languages such as German is the lack of appropriate resources, particularly in the form of sense-annotated corpus data. Hence, this work inevitably has to start with the preparation of resources before actual WSD experiments can be performed. The work program is fourfold. Firstly, since sense definitions are necessary to distinguish word senses (both for humans and for automatic WSD algorithms), the German wordnet GermaNet is (semi-)automatically extended with sense descriptions. This is done by automatically mapping GermaNet senses to descriptions in the online dictionary Wiktionary. Secondly, since the availability of sense-annotated corpora is a prerequisite for evaluating and developing word sense disambiguation systems, two GermaNet sense-annotated corpora are constructed. One corpus is automatically constructed and the other corpus is manually sense-annotated. Thirdly, several knowledge-based WSD algorithms are applied and evaluated -- using the newly created sense-annotated corpora. These algorithms are based on a suite of semantic relatedness measures, including path-based, information-content-based, and gloss-based methods. Experiments on gloss-based methods also employ the newly harvested definitions from Wiktionary. Fourthly, several supervised machine learning classifiers are applied to the task of German WSD, including rule-based methods, instance-based methods, probabilistic methods, and support vector machines. 
The classifiers rely on a wide range of machine learning features and their evaluation focuses on several aspects, including a comparison of several algorithms, a detailed analysis of the implemented features, and an investigation of the influence of syntax and semantics on the disambiguation performance for verbs.
Keyword:
400; Bedeutung; Bedeutungsdisambiguierung; Computational Linguistics; Computerlinguistik; deutsches Wortnetz; Disambiguierung; German wordnet; GermaNet; lesartenannotierte Korpora; sense-annotated corpora; Wiktionary; Word Sense Disambiguation
URL: http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-632846 http://hdl.handle.net/10900/63284 https://doi.org/10.15496/publikation-4706
40 | Automatic noun compound interpretation using deep neural networks and word embeddings