CN107301414B - Chinese positioning, segmenting and identifying method in natural scene image - Google Patents
- Publication number
- CN107301414B CN107301414B CN201710484646.5A CN201710484646A CN107301414B CN 107301414 B CN107301414 B CN 107301414B CN 201710484646 A CN201710484646 A CN 201710484646A CN 107301414 B CN107301414 B CN 107301414B
- Authority
- CN
- China
- Prior art keywords
- character
- single character
- candidate
- region
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/245—Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Discrimination (AREA)
Abstract
The invention provides a method for locating, segmenting and recognizing Chinese text in natural scene images. The method first performs initial character localization on the original picture with the FASText model and extracts candidate text regions; it then pre-segments the candidate regions, directly recognizes the single-character parts of the pre-segmented regions, and performs further single-character segmentation and recognition on the field parts. By combining the accurate extraction of character stroke features, the strong character recognition capability of a deep residual neural network, and a path-tree method, the method achieves Chinese text localization and recognition simply and effectively, and can be applied to a variety of natural scenes without supervised training.
Description
Technical Field
The invention belongs to the technical field of image processing, and in particular relates to a method for locating, segmenting and recognizing Chinese text in natural scene images.
Background
Text recognition in natural scenes is an important visual detection task: text in images carries much useful information and is essential for understanding and retrieving visual content. Related applications include reading road signs, license plates, tickets, and the like.
Conventional OCR technology is easily disturbed by the complex backgrounds of natural scenes and often cannot complete such tasks correctly. In general, the task can be divided into two stages: text localization and text recognition. Text localization means precisely locating the position of text in the image, chiefly by extracting character features, such as MSERs, to distinguish text from background. Besides such traditional feature-based detection methods, approaches that train deep neural networks to localize text have also appeared. However, these approaches usually require a large amount of manually labeled data for training, and the trained models are difficult to extend directly to other application scenarios.
Disclosure of Invention
The invention aims to provide a simple and effective method for locating, segmenting and recognizing Chinese text in natural scene images, one that can be extended to a wide range of scenes.
In order to achieve the above purpose, the invention adopts the technical scheme that: a Chinese positioning, segmenting and identifying method in natural scene images comprises the following steps:
1) performing initial character localization on the original picture with the FASText model, and extracting candidate text regions;
2) pre-segmenting the candidate text regions;
3) directly recognizing the single-character parts of the pre-segmented text regions, and performing further single-character segmentation and recognition on the field parts.
Further, the candidate text region is extracted by the getCharSegmentation function of FASText.
Further, the pre-segmentation in step 2) proceeds as follows: label the connected regions of the candidate text region, remove small connected regions (noise), cut out directly as single characters those regions whose aspect ratio matches a Chinese character (close to 1:1), and set aside the remaining connected regions as fields.
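The pre-segmentation step above (connected-region labeling, noise removal, aspect-ratio filtering) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the patent's implementation: the input is assumed to be an already binarized 0/1 mask, and the function names (`label_regions`, `pre_segment`) and thresholds (`min_area`, the 0.5–2.0 aspect-ratio band) are illustrative choices.

```python
# Sketch of the pre-segmentation step: label 4-connected regions in a binary
# mask, drop small (noise) regions, and cut out regions whose bounding box is
# roughly square (Chinese-character-like); the rest are treated as fields.
from collections import deque

def label_regions(mask):
    """4-connected component labeling on a 2D 0/1 grid; returns pixel lists."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                q, comp = deque([(y, x)]), []
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                regions.append(comp)
    return regions

def pre_segment(mask, min_area=4, ratio_band=(0.5, 2.0)):
    """Split regions into single-character boxes (near-1:1) and field boxes."""
    singles, fields = [], []
    for comp in label_regions(mask):
        if len(comp) < min_area:                  # remove small connected regions (noise)
            continue
        ys = [p[0] for p in comp]
        xs = [p[1] for p in comp]
        hgt = max(ys) - min(ys) + 1
        wid = max(xs) - min(xs) + 1
        box = (min(xs), min(ys), wid, hgt)        # (x, y, width, height)
        if ratio_band[0] <= wid / hgt <= ratio_band[1]:
            singles.append(box)                   # aspect ratio near 1:1 -> single character
        else:
            fields.append(box)                    # wide region -> field, segmented later
    return singles, fields
```

In practice this labeling would be done with an optimized routine (e.g. OpenCV's connected-components functions) on the FASText output rather than a Python loop; the sketch only shows the filtering logic.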
Further, the further single-character segmentation of the field parts after pre-segmentation in step 3) proceeds as follows:
(1) training a deep residual neural network to obtain a single-character recognizer, ResNet;
(2) directly recognizing the single-character results of the pre-segmentation with the single-character recognizer;
(3) further segmenting the field results of the pre-segmentation into single characters: using FASText to obtain the region ranges of candidate characters in the field picture, and collecting the vertical edges of all these ranges as the set of candidate segmentation lines for single-character segmentation; then generating all candidate single-character segmentation schemes with a path-tree method, where each path corresponds to one segmentation scheme;
(4) recognizing every candidate single-character segmentation scheme with the trained single-character recognizer ResNet, recording each recognized character and its recognition confidence, and taking the average of the confidences;
(5) selecting the single-character segmentation scheme with the highest average confidence as the optimal segmentation scheme;
(6) taking the single-character recognition results corresponding to the optimal segmentation scheme as the optimal field recognition scheme, and outputting the corresponding field recognition result.
Further, the rectangular boxes corresponding to the candidate single characters on each path in step (3) do not overlap one another and together cover all character strokes detected by FASText.
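The path-tree generation of candidate segmentation schemes can be illustrated with a simplified sketch. It models each candidate segmentation line as a vertical x-position and enumerates every root-to-leaf path, each path yielding a list of non-overlapping x-intervals that covers the whole field width (the interval analogue of the non-overlap and full-coverage constraint above). The name `segmentation_schemes` and the interval model are illustrative assumptions, not the patent's exact data structure.

```python
# Sketch of the path-tree idea: the vertical edges of the candidate character
# boxes give a set of candidate cut lines; each root-to-leaf path through the
# tree of cuts yields one single-character segmentation scheme.
def segmentation_schemes(cuts, start, end):
    """Enumerate all ways to split [start, end] at a subset of cut positions."""
    interior = sorted(c for c in set(cuts) if start < c < end)
    schemes = []

    def grow(left, path):
        # Either close the path by running to the field end...
        schemes.append(path + [(left, end)])
        # ...or extend it to any cut line strictly to the right.
        for c in interior:
            if c > left:
                grow(c, path + [(left, c)])

    grow(start, [])
    return schemes
```

With n interior cut lines this enumerates 2^n schemes (every subset of cuts), so in practice the candidate line set produced by FASText is expected to be small for a single field.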
The invention uses FASText, a character stroke detector based on stroke features, to extract candidate single characters and field regions, and then proposes a path-tree method over the candidate single-character rectangles to generate candidate single-character segmentation schemes. For each scheme, a single-character recognizer ResNet, trained as a deep residual neural network, recognizes all of the scheme's characters and records each single-character recognition confidence; the field recognition confidence of each scheme is then computed, and the scheme with the highest field recognition confidence is selected as the final segmentation and recognition scheme. By combining the accurate extraction of stroke features, the strong recognition capability of the deep residual network, and the path-tree method, the invention achieves Chinese localization and recognition simply and effectively, and can be applied to a variety of natural scenes without supervised training.
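The scheme-selection rule described above (score every candidate segmentation by the mean of its per-character recognition confidences, keep the best) can be sketched as follows. Here `recognize` is a stand-in for the trained ResNet single-character recognizer, and `best_scheme` is a hypothetical helper name.

```python
# Sketch of confidence-based scheme selection: run a character recognizer on
# every segment of every candidate scheme, average the per-character
# confidences, and keep the scheme with the highest mean.
def best_scheme(schemes, recognize):
    """Return (best_scheme, its_text, its_mean_confidence).

    `recognize(segment)` must return a (character, confidence) pair."""
    best = None
    for scheme in schemes:
        results = [recognize(seg) for seg in scheme]            # (char, conf) pairs
        mean_conf = sum(c for _, c in results) / len(results)   # field confidence
        if best is None or mean_conf > best[2]:
            best = (scheme, "".join(ch for ch, _ in results), mean_conf)
    return best
```

The recognizer thus arbitrates between segmentations: an over- or under-segmented scheme produces crops that no character class matches confidently, dragging its mean down, while the correct scheme scores highest on average.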
Compared with the prior art, the invention has the following advantages. First, exploiting the distinctive stroke structure of Chinese characters, the FASText model detects the stroke parts of characters to achieve initial localization of text regions, effectively eliminating the influence of background clutter. Second, the acquired candidate regions contain both single-character and field parts. For the field parts, the invention uses FASText to further split the detected candidate field regions into single characters, and recognizes each separated character with a deep residual neural network. This integrates field segmentation with single-character recognition and finds the optimal scheme after trying all candidate segmentations, giving higher robustness and accuracy.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a diagram illustrating the effect of locating and identifying the Chinese embodiment according to the present invention.
Detailed Description
As shown in FIG. 1, this embodiment provides a method for locating, segmenting and recognizing Chinese text in natural scene images; the process can be divided into the following steps:
1) performing initial character localization on the original picture with the FASText model, and extracting candidate text regions;
2) pre-segmenting the candidate text regions;
3) directly recognizing the single-character parts of the pre-segmented text regions, and performing further single-character segmentation and recognition on the field parts.
As shown in FIG. 2, picture (a) is the original picture. In step 1, candidate text regions are extracted with the getCharSegmentation function of FASText; the result is shown in picture (b). The pre-segmentation of step 2 labels the connected regions extracted in step 1, removes small connected regions (noise), cuts out directly as single characters the regions whose aspect ratio is close to 1:1, and sets aside the remaining connected regions, as shown in picture (c). In step 3, the single-character results of the pre-segmentation can be recognized directly by the single-character recognizer — e.g. in picture (c) the characters glossed "kou", "mao", "yi" and the four characters machine-rendered as "there", "limited", "official" and "department" (the Chinese "Co., Ltd." suffix) — while a field result, such as the field machine-rendered as "Shanghai real in and out" in picture (c), must be further segmented into single characters. For this, the region boxes of the candidate characters (label candidates) in the field picture are first obtained with FASText, as shown in picture (d); the vertical edges of all these region boxes are then collected as the set of candidate segmentation lines, as shown in picture (e). Next, a path-tree method generates all candidate single-character segmentation schemes; for each scheme, the confidence of every segmented region is computed with the single-character recognizer ResNet and the confidences are averaged. The scheme with the highest average confidence is selected as the final segmentation scheme, and its corresponding single-character recognition results are output as the final field recognition result.
The invention has been described in detail above with reference to a specific embodiment. The embodiment helps those skilled in the art to understand the invention further, but does not limit it in any way. It should be noted that persons skilled in the art may make variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention.
Claims (2)
1. A Chinese positioning, segmenting and identifying method in natural scene images is characterized by comprising the following steps:
1) performing primary character positioning on an original picture through a FASText model, and extracting a candidate character region, wherein the candidate character region is extracted through a getCharSegmentation function of the FASText;
2) pre-segmenting the candidate text region, wherein the pre-segmentation proceeds as follows: labeling the connected regions of the candidate text region, removing small connected regions, cutting out directly as single characters the regions whose aspect ratio matches a Chinese character, and setting aside the remaining connected regions;
3) directly recognizing the single-character parts of the pre-segmented text regions, and further segmenting and recognizing the single characters in the field parts; this step proceeds as follows:
(1) training a deep residual error neural network to obtain a single word recognizer ResNet;
(2) directly identifying the single character result obtained after the pre-segmentation through a single character identifier;
(3) further segmenting the field part into single characters: using FASText to obtain the region ranges of the candidate characters in the field picture, and collecting the vertical edges of all the region ranges as the set of candidate segmentation lines for single-character segmentation; generating all candidate single-character segmentation schemes with a path-tree method, wherein each path corresponds to one segmentation scheme;
(4) recognizing any single character segmentation scheme by using a trained single character recognizer ResNet, recording each recognized single character and a corresponding recognition confidence coefficient, and then averaging;
(5) selecting the single character segmentation scheme with the highest average confidence coefficient as the optimal single character segmentation scheme;
(6) taking the single-character recognition results corresponding to the optimal segmentation scheme as the optimal field recognition scheme, and outputting the corresponding field recognition result.
2. The method of claim 1, characterized in that the rectangular boxes corresponding to the candidate single characters on each path in step (3) do not overlap one another and together cover all character strokes detected by FASText.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710484646.5A CN107301414B (en) | 2017-06-23 | 2017-06-23 | Chinese positioning, segmenting and identifying method in natural scene image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107301414A (en) | 2017-10-27
CN107301414B (en) | 2020-07-07
Family
ID=60135051
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710484646.5A Active CN107301414B (en) | 2017-06-23 | 2017-06-23 | Chinese positioning, segmenting and identifying method in natural scene image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107301414B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108458716B (en) * | 2018-02-02 | 2023-04-18 | 北京交通大学 | Electric vehicle charging navigation method based on charging pile dynamic occupancy rate prediction |
CN108229463A (en) * | 2018-02-07 | 2018-06-29 | 众安信息技术服务有限公司 | Character recognition method based on image |
CN110321760A (en) * | 2018-03-29 | 2019-10-11 | 北京和缓医疗科技有限公司 | A kind of medical document recognition methods and device |
CN112789623B (en) * | 2018-11-16 | 2024-08-16 | 北京比特大陆科技有限公司 | Text detection method, device and storage medium |
CN109615719B (en) * | 2019-01-04 | 2021-04-20 | 天地协同科技有限公司 | Freight vehicle non-stop charging system and method based on road safety monitoring system |
CN109977878B (en) * | 2019-03-28 | 2021-01-22 | 华南理工大学 | Vehicle detection method based on heavily weighted Anchor |
CN111062393B (en) * | 2019-11-08 | 2021-12-17 | 西安理工大学 | Natural scene Chinese character segmentation method based on spectral clustering |
CN111488873B (en) * | 2020-04-03 | 2023-10-24 | 中国科学院深圳先进技术研究院 | Character level scene text detection method and device based on weak supervision learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101777124A (en) * | 2010-01-29 | 2010-07-14 | 北京新岸线网络技术有限公司 | Method for extracting video text message and device thereof |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101777124A (en) * | 2010-01-29 | 2010-07-14 | 北京新岸线网络技术有限公司 | Method for extracting video text message and device thereof |
Non-Patent Citations (1)
Title |
---|
FASText: Efficient Unconstrained Scene Text Detector; Michal Bušta, Lukáš Neumann, Jiří Matas; 2015 IEEE International Conference on Computer Vision (ICCV); 2015-12-13; pp. 1206-1213 *
Also Published As
Publication number | Publication date |
---|---|
CN107301414A (en) | 2017-10-27 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| CB02 | Change of applicant information | Address (before and after): Room 3008, Xuan Ye Lou, Pioneer Park, Xiamen Torch High-tech Zone, Xiamen, Fujian 361000. Applicant after: Xiamen Shang Ji Network Technology Co., Ltd. Applicant before: Xiamen Business Consulting Co., Ltd. |
| GR01 | Patent grant | |