81 |
Why Microsoft Arabic Spell checker is ineffective
|
|
|
|
In: ISSN: 0851-6774 ; Linguistica Communicatio ; https://hal.archives-ouvertes.fr/hal-01081965 ; Linguistica Communicatio, http://www.al-erfan.com/, 2014, Arabic Language in Information Technology, 16, pp.55 ; http://www.al-erfan.com/ (2014)
|
|
Abstract:
Since 1997, the MS Arabic spell checker has been integrated by Coltec-Egypt into the MS-Office suite, yet many Arabic users still find it worthless. In this study, we show why the MS spell checker fails to attract Arabic users. After spell-checking a 10-page Arabic document (3300 words), the assessment procedure spots 78 false-positive errors. They reveal flaws in the lexical resource: unsystematic coverage of the feminine and broken-plural forms of nouns and adjectives, and arbitrary coverage of verbs and nouns with prefixed or suffixed particles. This unsystematic and arbitrary coverage points to the absence of a clear definition of a lexical entry and to an inadequate design of the related agglutination rules. Finally, the assessment reveals, more generally, the failure of scientific and technological policies regarding Arabic in big companies and research institutions.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Arabic; Computational Linguistics; dictionary; NLP; spelling error detection
|
|
URL: https://hal.archives-ouvertes.fr/hal-01081965 https://hal.archives-ouvertes.fr/hal-01081965/file/Why%20Microsoft%20Arabic%20Spell%20checker%20is%20ineffective.pdf https://hal.archives-ouvertes.fr/hal-01081965/document
|
|
BASE
|
|
82 |
WoNeF, an improved, expanded and evaluated automatic French translation of WordNet
|
|
|
|
In: 7th Global Wordnet Conference, GWC 2014 ; https://hal-cea.archives-ouvertes.fr/cea-01844457 ; 7th Global Wordnet Conference, GWC 2014, Jan 2014, Tartu, Estonia. pp.32-39 (2014)
|
|
BASE
|
|
83 |
Discours. A journal of linguistics, psycholinguistics and computational linguistics. ; Discours. Revue de linguistique, psycholinguistique et informatique.
|
|
|
|
In: https://halshs.archives-ouvertes.fr/halshs-01432020 ; France. Presses universitaires de Caen, 2014, Discours. Revue de linguistique, psycholinguistique et informatique., ISSN électronique 1963-1723 (2014)
|
|
BASE
|
|
84 |
Making the Most of It: Word Sense Annotation and Disambiguation in the Face of Data Sparsity and Ambiguity
|
|
|
|
In: Jurgens, David Alan. (2014). Making the Most of It: Word Sense Annotation and Disambiguation in the Face of Data Sparsity and Ambiguity. UCLA: Computer Science 0201. Retrieved from: http://www.escholarship.org/uc/item/2wn4h7ph (2014)
|
|
BASE
|
|
85 |
Corpus parallèles, corpus comparables : quels contrastes ?
|
|
|
|
In: https://hal.archives-ouvertes.fr/tel-01184585 ; Informatique et langage [cs.CL]. Université de Poitiers, 2014 (2014)
|
|
BASE
|
|
88 |
The LIMA multilingual analyzer made free: FLOSS resources adaptation and correction
|
|
|
|
In: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 ; https://hal-cea.archives-ouvertes.fr/cea-01844458 ; Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014, May 2014, Reykjavik, Iceland. pp.2932-2937 (2014)
|
|
BASE
|
|
89 |
τC: C with process network extensions for embedded manycores
|
|
|
|
In: ISSN: 1877-0509 ; EISSN: 1877-0509 ; Procedia Computer Science ; https://hal-cea.archives-ouvertes.fr/cea-01831559 ; Procedia Computer Science, Elsevier, 2014, 29, pp.1100-1112. ⟨10.1016/j.procs.2014.05.099⟩ (2014)
|
|
BASE
|
|
90 |
Instrumentation of annotated C programs for test generation
|
|
|
|
In: 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation ; https://hal-cea.archives-ouvertes.fr/cea-01836306 ; 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation, Sep 2014, Victoria, Canada. pp.105-114, ⟨10.1109/SCAM.2014.19⟩ (2014)
|
|
BASE
|
|
91 |
Ranking canonical English poems
|
|
|
|
In: Literary and Linguistic Computing ; http://llc.oxfordjournals.org/ (2014)
|
|
BASE
|
|
94 |
JOINT_FORCES : unite competing sentiment classifiers with random forest ...
|
|
|
|
BASE
|
|
95 |
Information extraction for the geospatial domain ... : Informationsextraktion für georäumliche Entitäten und Relationen ...
|
|
|
|
BASE
|
|
96 |
Supervised and semi-supervised statistical models for word-based sentiment analysis ... : Überwachte und halbüberwachte statistische Modelle zur wortbasierten Sentimentanalyse ...
|
|
|
|
BASE
|
|
97 |
The Montagovian Generative Lexicon Lambda Ty_n: a Type Theoretical Framework for Natural Language Semantics
|
|
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2014 ; LIPIcs - Leibniz International Proceedings in Informatics. 19th International Conference on Types for Proofs and Programs (TYPES 2013), 2014
|
|
BASE
|
|
98 |
Investigating Context Parameters in Technology Term Recognition
|
|
|
|
BASE
|
|
99 |
The ACL RD-TEC: A Dataset for Benchmarking Terminology Extraction and Classification in Computational Linguistics
|
|
|
|
BASE
|
|
100 |
Combining visual recognition and computational linguistics : linguistic knowledge for visual recognition and natural language descriptions of visual content ...
|
|
|
|
BASE
|
|