Hits 1 – 20 of 257

1	Raising the Titanic: Prospects for Reviving the Century Dictionary ...
	Triggs, Jeffery A. - : Rutgers University, 2022
	BASE

2	Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR
	Saman Idrees; Hossein Hassani
	In: Applied Sciences ; Volume 11 ; Issue 20 (2021)
	BASE

3	Reconocimiento automático de un censo histórico impreso sin recursos lingüísticos
	Anitei, Dan. - : Universitat Politècnica de València, 2021
	Abstract: Automatic recognition of typeset historical documents is currently a solved problem for many collections of data. However, systems for automatic recognition of such documents still need to address several issues inherent to working with old material. Degradation of the paper or smudges can make characters harder to recognize correctly; these problems can be alleviated by using linguistic resources to train good language models that decrease the character error rate. Nonetheless, there are many collections, such as the one presented in this paper, composed of tables that contain mainly numbers and proper names, for which a language model is neither available nor useful. This paper illustrates that automatic recognition can be done successfully for a collection of documents without using any linguistic resources. The paper covers the information extraction and the targeted OCR process, specially designed for the automatic recognition of a Spanish census from the 19th century, registered in printed documents. Many of the problems related to historical documents are overcome by using a combination of classical computer vision techniques and deep learning. Errors, such as misrecognized characters, are detected and corrected thanks to redundant information that the census contains. Given the importance of this Spanish census for conducting demographic studies, this paper goes a step further and introduces a demonstrator model to facilitate research on this corpus by indexing the data. ; This work has been partially supported by the BBVA Foundation, as a collaboration between the PRHLT team in charge of the HisClima project and the ESPAREL project. ; Anitei, D. (2021). Reconocimiento automático de un censo histórico impreso sin recursos lingüísticos. Universitat Politècnica de València. http://hdl.handle.net/10251/172694 ; TFGM
	Keyword: Censo; Census; Computer Vision; Documentos Históricos Impresos; Historical Printed Documents; LENGUAJES Y SISTEMAS INFORMATICOS; Máster Universitario en Inteligencia Artificial; Optical Character Recognition; Reconocimiento de Formas e Imagen Digital-Màster Universitari en Intel·Ligència Artificial: Reconeixement de Formes i Imatge Digital; Reconocimiento Óptico de Caracteres; Visión por Computador
	URL: http://hdl.handle.net/10251/172694
	BASE

4	Quality Measurement for Optical Character Recognition without ground truth data ...
	Weltevrede, Mike. - : Zenodo, 2020
	BASE

5	Quality Measurement for Optical Character Recognition without ground truth data ...
	Weltevrede, Mike. - : Zenodo, 2020
	BASE

6	AI in gastronomic tourism ...
	Pavlidis, George; Markantonatou, Stella; Toraki, Katerina. - : Zenodo, 2020
	BASE

7	Improving the recognition of Dutch Gothic machine print, at four levels in the processing pipeline, in four days ...
	Schomaker, Lambert; Ameryan, Mahya; Cuper, Mirjam. - : Zenodo, 2020
	BASE

8	AI in gastronomic tourism ...
	Pavlidis, George; Markantonatou, Stella; Toraki, Katerina. - : Zenodo, 2020
	BASE

9	Improving the recognition of Dutch Gothic machine print, at four levels in the processing pipeline, in four days ...
	Schomaker, Lambert; Ameryan, Mahya; Cuper, Mirjam. - : Zenodo, 2020
	BASE

10	NAT: Noise-Aware Training for Robust Neural Sequence Labeling
	Behnke, Sven; Namysl, Marcin; Köhler, Joachim
	In: Fraunhofer IAIS (2020)
	BASE

11	OPTICAL CHARACTER RECOGNITION APPLIED TO ANDROID-BASED BILINGUAL TRANSLATOR APPLICATION (ENGLISH AND INDONESIAN) TO SIGN LANGUAGE ...
	Pratama, Juan Adhiasta. - : Zenodo, 2019
	BASE

12	OPTICAL CHARACTER RECOGNITION APPLIED TO ANDROID-BASED BILINGUAL TRANSLATOR APPLICATION (ENGLISH AND INDONESIAN) TO SIGN LANGUAGE ...
	Pratama, Juan Adhiasta. - : Zenodo, 2019
	BASE

13	Bilingual text detection in natural scene images using invariant moments
	Maheshwari, Karan; Joseph Raj, Alex N.; Mahesh, Vijayalakshmi G. - : Netherlands, IOS Press, 2019
	BASE

14	Wenn Algorithmen Zeitschriften lesen - vom Mehrwert automatisierter Textanreicherung ...
	Wanger, Regina; Gasser, Michael. - : ETH Zurich, 2018
	BASE

15	Generating a training corpus for OCR post-correction using encoder-decoder model
	D'hondt, Eva; Grouin, Cyril; Grau, Brigitte
	In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers) ; International Joint Conference on Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-01831147 ; International Joint Conference on Natural Language Processing, Nov 2017, Taipei, Taiwan ; https://www.aclweb.org/anthology/I17-1101 (2017)
	BASE

16	Corpus linguistics for History ... : the methodology of investigating place-name discourses in digitised nineteenth-century newspapers ...
	Joulain, Amelia Tahirih. - : Lancaster University, 2017
	BASE

17	Radical Recognition in Off-Line Handwritten Chinese Characters Using Non-Negative Matrix Factorization
	Shuai, Xiangying
	In: Senior Projects Spring 2016 (2016)
	BASE

18	Using SMT for OCR error correction of historical texts
	Afli, Haithem; Qui, Zhengwei; Way, Andy...
	In: Afli, Haithem orcid:0000-0002-7449-4707 , Qui, Zhengwei, Way, Andy orcid:0000-0001-5736-5930 and Sheridan, Páraic (2016) Using SMT for OCR error correction of historical texts. In: Tenth International Conference on Language Resources and Evaluation (LREC 2016), 23-28 May 2016, Portorož, Slovenia. ISBN 978-2-9517408-9-1 (2016)
	BASE

19	Augmented reality applied to language translation
	Salvado, Ana Rita de Tróia. - 2016
	BASE

20	Data Cleaning for XML Electronic Dictionaries via Statistical Anomaly Detection ...
	Bloodgood, Michael; Strauss, Benjamin. - : Digital Repository at the University of Maryland, 2016
	BASE
