5 |
Morphological Degradation Models and their Use in Document Image Restoration
|
|
|
|
In: DTIC (2001)
|
|
Abstract:
Document images undergo various degradation processes, and numerous models of these processes have been proposed in the literature. In this paper we propose a model-based restoration algorithm. The algorithm first estimates the parameters of a degradation model and then uses the estimated model to compute the probability of an ideal binary pattern given the noisy observed pattern. This conditional probability is estimated by degrading noise-free document images and counting the frequency of corresponding noise-free and noisy pattern pairs. It is then used to construct a lookup table for restoring the noisy images. The impact of the restoration process is quantified by the decrease in OCR word and character error rates. We find that, given the estimated degradation model parameter values, the restoration algorithm decreases the character error rate by 16.1% and the word error rate by 7.35%. In some categories of degradation (e.g. model parameters that give rise to broken characters) there is a 41.5% reduction in character error rate and a 20.4% reduction in word error rate. Sponsored in part by NSF grant IIS9987944. Additional report CS-TR-4218.
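The lookup-table step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3 x 3 window size, the function names (`window_codes`, `build_lookup_table`, `restore`), the majority-vote rule, and the fallback for unseen patterns are all assumptions made for illustration, and `degrade` stands in for whatever degradation model has been estimated.

```python
import numpy as np

def window_codes(img, k=3):
    """Encode the k x k neighbourhood of every pixel as one integer (row-major bits)."""
    pad = k // 2
    padded = np.pad(img.astype(np.int64), pad, mode="constant")
    h, w = img.shape
    codes = np.zeros((h, w), dtype=np.int64)
    for dy in range(k):
        for dx in range(k):
            codes = (codes << 1) | padded[dy:dy + h, dx:dx + w]
    return codes

def build_lookup_table(ideal_images, degrade, k=3):
    """Count (noisy pattern, ideal centre pixel) pairs and keep the majority vote.

    `degrade` is a hypothetical callable mapping an ideal binary image to a
    degraded one; the source does not specify its interface.
    """
    n = 1 << (k * k)
    counts = np.zeros((n, 2), dtype=np.int64)
    for ideal in ideal_images:
        codes = window_codes(degrade(ideal), k)
        np.add.at(counts, (codes.ravel(), ideal.ravel().astype(np.int64)), 1)
    # Assumption: patterns never observed fall back to the observed centre pixel.
    centre_bit = (k * k) // 2
    fallback = (np.arange(n) >> centre_bit) & 1
    seen = counts.sum(axis=1) > 0
    return np.where(seen, counts[:, 1] > counts[:, 0], fallback).astype(np.uint8)

def restore(noisy, table, k=3):
    """Replace each pixel by the table entry for its observed neighbourhood."""
    return table[window_codes(noisy, k)]
```

In use, one would pass a set of noise-free page images and the estimated degradation function to `build_lookup_table`, then call `restore` on each scanned page before running OCR. Taking the majority vote per pattern corresponds to picking the more probable ideal pixel value under the estimated conditional frequencies.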
|
|
Keywords:
*ALGORITHMS; *DEGRADATION; *IMAGE RESTORATION; *MORPHOLOGY; *NUMERICAL ANALYSIS; CHARACTER RECOGNITION; DOCUMENTS; ERRORS; ESTIMATES; Linguistics; NOISE REDUCTION; Numerical Mathematics; Optics; PARAMETERS; PATTERNS; RATES; TABLES(DATA); WORDS(LANGUAGE)
|
|
URL: http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA458744 http://www.dtic.mil/docs/citations/ADA458744
|
|
BASE
|
|
|
|
6 |
A Downhill Simplex Algorithm for Estimating Morphological Degradation Model Parameters
|
|
|
|
In: DTIC (2001)
|
|
BASE
|
|
|
|
7 |
The Architecture of TrueViz: A Groundtruth/Metadata Editing and Visualizing Toolkit
|
|
|
|
In: DTIC (2001)
|
|
BASE
|
|
|
|
9 |
A Statistical, Nonparametric Methodology for Document Degradation Model Validation
|
|
|
|
In: DTIC (1999)
|
|
BASE
|
|
|
|
10 |
The Bible, Truth, and Multilingual OCR Evaluation
|
|
|
|
In: DTIC (1998)
|
|
BASE
|
|
|
|
|
|