
Search in the Catalogues and Directories

Hits 1 – 2 of 2

1
Parallel Strands: A Preliminary Investigation into Mining the Web
In: DTIC (1998)
Abstract: Parallel corpora are a valuable resource for machine translation, but at present their availability and utility are limited by genre- and domain-specificity, licensing restrictions, and the basic difficulty of locating parallel texts in all but the most dominant of the world's languages. A parallel corpus resource not yet explored is the World Wide Web, which hosts an abundance of pages in parallel translation, offering a potential solution to some of these problems and unique opportunities of its own. This paper presents the necessary first step in that exploration: a method for automatically finding parallel translated documents on the Web. The technique is conceptually simple, fully language independent, and scalable, and preliminary evaluation results indicate that the method may be accurate enough to apply without human intervention. The original document contains color images. Report No. CS-TR-3922.
Keyword: *INTERNET; *INTERVENTION; *LANGUAGE; *MACHINE TRANSLATION; *TEXT PROCESSING; ACCURACY; Computer Systems; DATA MINING; DOCUMENTS; Information Science; METHODOLOGY; ONLINE SYSTEMS; PARALLEL ORIENTATION; WWW(WORLD WIDE WEB)
URL: http://www.dtic.mil/docs/citations/ADA458649
http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA458649
BASE
2
Evaluating Multilingual Gisting of Web Pages
In: DTIC (1997)
BASE
