Abstract:
A new method based on the Gaussian mixture modeling of neighbor characters is proposed to extract multilingual texts in images. In the training phase, the Gaussian mixture model of three neighbor characters is trained from the examples. Then the texts in an input image are extracted in the following steps. Firstly, the image is binarized using the edge-pixel clustering method and the morphological closing operation is performed on the binary image, in order that each character in it can be treated as a connected component. Secondly, the neighborhood of connected components is established according to the Voronoi partition of the image. Three connected components neighboring with each other constitute a neighbor set. For each neighbor set, a posteriori pseudo-probability is computed based on the Gaussian mixture model of three neighbor characters and used to classify the neighbor set as the case of three neighbor characters. Finally, the text extraction is completed by labeling the connected components as characters or non-characters with the following rule: if a connected component is included in at least one neighbor set classified as the case of three neighbor characters, then the connected component is labeled as a character, or else as a non-character. The proposed method are tested in the applications of Chinese and English text extraction. In the experiments, the expectation-maximization algorithm is employed to train the Gaussian mixture model of three neighbor characters. The experimental results of text extraction show the effectiveness of the method.