21 |
An attempt to formalize word sense disambiguation: maximizing efficiency by minimizing computational costs
In: Revista española de lingüística aplicada, ISSN 0213-2028, Vol. 22, 2009, pp. 77-88 (2009)
Abstract:
This paper presents a word sense disambiguation (WSD) algorithm based on collocational data. The algorithm aims to maximize efficiency by minimizing (1) computational costs and (2) reliance on linguistically tagged or annotated corpora. Its formalization rests on discriminant function analysis (DFA), a statistical technique that lets us parameterize each collocate with its corresponding meaning using only plain text. The parameterized data then allow us to classify each case (a sentence containing an ambiguous word) into one value of a categorical dependent variable (one of the meanings of the ambiguous word). To evaluate the validity and efficiency of the algorithm, we first sense-tagged by hand every sentence containing the ambiguous word and then cross-validated the hand sense-tagged data against the automatic WSD output. Finally, we present the overall results of applying the algorithm to a limited sample in both languages, Spanish and English, and highlight the points we consider relevant for future research.
Keyword:
applied linguistics; computational linguistics; corpus linguistics; lexicography; lexicology; word sense disambiguation; WSD
URL: http://dialnet.unirioja.es/servlet/oaiart?codigo=3138264
BASE
22 |
Bases de conocimiento multilíngües para el procesamiento semántico a gran escala ; Multilingual knowledge resources for wide–coverage semantic processing
BASE
24 |
Authors
In: http://www.lsi.upc.edu/~nlp/meaning/documentation/3rdYear/WP6.17.pdf (2005)
BASE
27 |
Mapping WordNet Senses to a Lexical Database of Verbs
In: DTIC (2001)
BASE
28 |
Word Sense Disambiguation Using Automatically Acquired Verbal Preferences
In: ftp://ftp.cogs.sussex.ac.uk/pub/users/dianam/senseval.ps (2000)
BASE
29 |
Semantic rule filtering for web-scale relation extraction
In: http://wwwusers.di.uniroma1.it/~navigli/pubs/ISWC_2013_Moro_etal.pdf
BASE
30 |
Authors
In: http://www.lsi.upc.edu/~nlp/meaning/documentation/2onYear/D2.2.pdf
BASE
31 |
Semantic rule filtering for web-scale relation extraction
In: http://wwwusers.di.uniroma1.it/~navigli/pubs/ISWC_2013_Moro_etal.pdf
BASE
32 |
Long Tail in Weighted Lexical Networks
In: http://www.aclweb.org/anthology/W/W12/W12-5102.pdf
BASE
33 |
Auto-Discovery of NVEF Word-Pairs in Chinese
In: http://www.aclclp.org.tw/rocling/2003/M09.pdf
BASE
34 |
World Wide Web presents several.
In: http://www.softcomputing.net/jucs2013.pdf
BASE
35 |
The UNED systems at SENSEVAL-2
In: http://aclweb.org/anthology-new/S/S01/S01-1018.pdf
BASE
36 |
Long Tail in Weighted Lexical Networks
In: http://www.lirmm.fr/~lafourcade/ML-biblio/COGALEX3/COGALEX2012-ML-v4.pdf
BASE
37 |
USYD: WSD and Lexical Substitution using the Web1T Corpus
Abstract:
This paper describes the University of Sydney’s WSD and Lexical Substitution systems
In: http://www.denizyuret.com/ref/hawker/98.pdf
BASE
38 |
Development of an Approach for Disambiguating Ambiguous Hindi postposition
In: http://www.ijcaonline.org/volume5/number9/pxc3871317.pdf
BASE
39 |
PARALLEL CORPORA, ALIGNMENT TECHNOLOGIES AND FURTHER PROSPECTS IN MULTILINGUAL RESOURCES AND TECHNOLOGY INFRASTRUCTURE
In: http://www.racai.ro/~tufis/papers/Tufis-Ion-SPED2007.pdf
BASE
40 |
Addressing Challenges in Multilingual Machine Translation
In: http://www.ijser.org/researchpaper/Addressing_Challenges_in_Multilingual_Machine_Translation.pdf
BASE