Segmentation of touching characters in printed Devnagari and Bangla scripts using fuzzy multifactorial analysis
Published 2001 in Proceedings of Sixth International Conference on Document Analysis and Recognition
ABSTRACT
Touching characters in scanned documents are a major obstacle to designing an effective character segmentation procedure for OCR systems. This paper presents new techniques for identifying and segmenting touching characters, based on fuzzy multifactorial analysis. A predictive algorithm is developed for effectively selecting the cut-points at which touching characters are segmented. The proposed method has initially been applied to touching characters in Devnagari (Hindi) and Bangla, two major scripts of the Indian subcontinent. Results obtained on a test set of considerable size show that a high recognition rate can be achieved with a reasonable amount of computation.
PUBLICATION RECORD
- Publication year
2001
- Venue
Proceedings of Sixth International Conference on Document Analysis and Recognition
- Publication date
2001-09-10
- Fields of study
Computer Science
- Source metadata
Semantic Scholar