Probabilistic Homogeneity for Document Image Segmentation
Vrije Universiteit Brussel
In this paper we propose a novel probabilistic framework for document segmentation that exploits the way humans perceptually recognize text regions in complicated layouts. In particular, we conceptualize text homogeneity as the Gestalt pattern displayed in text regions, characterized by proximately and symmetrically arranged units with similar morphological and texture features. We model this pattern in the local region of a connected component (CC) ...