CN103617423B - Image segmentation and recognition method based on color parameter - Google Patents

Image segmentation and recognition method based on color parameter

Info

Publication number
CN103617423B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310612649.4A
Other languages
Chinese (zh)
Other versions
CN103617423A (en)
Inventor
王威扬
宫连志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianjin Network Information Technology (Shanghai) Co.,Ltd.
Original Assignee
Qianjin Network Information Technology (shanghai) Co ltd
Application filed by Qianjin Network Information Technology (Shanghai) Co., Ltd.
Priority to CN201310612649.4A
Publication of CN103617423A
Application granted
Publication of CN103617423B

Landscapes

  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

The invention provides an image segmentation and recognition method based on a color parameter. The method includes the steps of (1) carrying out analysis on an image to be recognized on the basis of the color parameter to find a crest point corresponding to the color parameter, (2) segmenting the image to be recognized into a plurality of segmented images according to the crest point, (3) recognizing each segmented image to obtain recognition results r of a plurality of sets of the segmented images, and (4) merging the recognition results r of the segmented images into a recognition result R of the image to be recognized. The method can well handle the situations such as backgrounds, texture and inclination, so that the recognition rate is improved, multicolor pictures with backgrounds can be recognized, a good recognition effect is obtained, and the method is particularly suitable for recognition of business cards and bills with complex images.

Description

Image splitting and recognition method based on color parameter
Technical field
The present invention relates to the field of image recognition in computer science and technology, and in particular to an image splitting and recognition method based on a color parameter, especially a recognition method based on color-level and color-spectrum decomposition.
Background art
In the field of character recognition, OCR (optical character recognition) is a widely applied technology. In traditional recognition, OCR image processing applies rectangle correction and grayscale conversion to the image (the order of these two steps can be exchanged), then binarizes it, and finally feeds it to the OCR core for recognition.
For ordinary text (for example, a simple business card in a single color), this recognition process can complete the task of converting the picture into text. However, when recognizing multicolored business cards, business cards with a background or background pattern, or scanned bills, recognition accuracy is poor, and in many cases the text cannot be recognized at all.
Traditional OCR obtains the image to be recognized by converting the picture to grayscale, but during recognition of such an image the recognition rate drops because of background, texture, and skew. Especially when recognizing printed bills, traditional recognition methods often fail to recognize most of the content effectively because of the complex image, which includes: the background texture of the bill itself; printed characters pressed onto the dividing lines because of printing deviation; and seals that cover the text or form a patterned background.
It is therefore necessary to design a method that can better recognize text in complex images.
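The limitation of the traditional grayscale-then-binarize pipeline can be illustrated with a minimal sketch. The luma weights, the global threshold of 128, and the function names are assumptions chosen for illustration; the patent does not specify them.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel to one gray level using common luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(image, threshold=128):
    """Map every pixel to 0 (ink) or 255 (paper) with one global threshold."""
    return [[0 if to_gray(p) < threshold else 255 for p in row] for row in image]


BLACK, WHITE, RED = (0, 0, 0), (255, 255, 255), (200, 0, 0)

# One row: black text pixel, paper, red seal pixel, paper.
row = [BLACK, WHITE, RED, WHITE]
binary = binarize([row])
print(binary)
```

In this toy row the black text pixel and the red seal pixel both end up as ink, so after binarization a seal that overlaps text can no longer be told apart from the text.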
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide an image splitting and recognition method based on a color parameter, thereby solving the problem of recognizing text that is multicolored or has a background or background pattern, and in particular making it possible to recognize business cards and bills with complex images.
The image splitting and recognition method based on a color parameter provided by the present invention comprises the following steps:
Step 1: perform an analysis based on a color parameter on the image to be recognized, and find the peak points corresponding to the color parameter;
Step 2: split the image to be recognized into multiple split images according to the peak points;
Step 3: recognize each split image to obtain multiple groups of split-image recognition results r;
Step 4: merge the split-image recognition results r into the recognition result R of the image to be recognized.
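The four steps can be read as a pipeline of four interchangeable stages. The sketch below is only an illustration of that structure: the function names and the toy stand-ins (a flat list of gray levels as the "image", string results in place of real OCR output) are hypothetical and not taken from the patent.

```python
def split_and_recognize(image, analyze, split, recognize, merge):
    """Run the four steps with caller-supplied stages."""
    peaks = analyze(image)                       # Step 1: color-parameter analysis
    parts = split(image, peaks)                  # Step 2: one split image per peak
    results = [recognize(p) for p in parts]      # Step 3: results r
    return merge(results)                        # Step 4: result R


# Toy stand-ins: the "image" is a flat list of gray levels, 255 is white paper.
def toy_analyze(image):
    return sorted(set(image))


def toy_split(image, peaks):
    return [[v if v == peak else 255 for v in image] for peak in peaks if peak != 255]


def toy_recognize(part):
    return "layer with %d ink pixels" % sum(1 for v in part if v != 255)


def toy_merge(results):
    return results


R = split_and_recognize([0, 255, 90, 0, 255],
                        toy_analyze, toy_split, toy_recognize, toy_merge)
print(R)
```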
Preferably, the analysis based on a color parameter includes color-level analysis, and correspondingly:
Step 1 includes the step of performing color-level analysis on the image to be recognized and finding the color-level peak points in the color-level image;
Step 2 includes the step of splitting the image to be recognized into multiple split images according to the different color-level peak points.
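One plausible reading of finding color-level peak points is locating local maxima in a gray-level histogram. The sketch below assumes that reading; the 256-level histogram, the simple neighbor comparison, and the names `level_histogram` and `find_peaks` are illustrative assumptions rather than the patented procedure.

```python
def level_histogram(gray_pixels, levels=256):
    """Count how many pixels fall on each gray level."""
    hist = [0] * levels
    for v in gray_pixels:
        hist[v] += 1
    return hist


def find_peaks(hist, min_count=1):
    """Return levels that are strictly higher than their left neighbor
    and at least as high as their right neighbor."""
    peaks = []
    for i, count in enumerate(hist):
        left = hist[i - 1] if i > 0 else -1
        right = hist[i + 1] if i + 1 < len(hist) else -1
        if count >= min_count and count > left and count >= right:
            peaks.append(i)
    return peaks


# Dark text around level 10, a mid-gray stamp at 128, bright paper near 250.
pixels = [9] + [10] * 5 + [11] * 2 + [128] * 3 + [250] * 8 + [251]
print(find_peaks(level_histogram(pixels)))
```

Each returned level could then seed one split image in step 2.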
Preferably, the analysis based on a color parameter includes color-spectrum cluster analysis, and correspondingly:
Step 1 includes the step of performing color-spectrum cluster analysis on the image to be recognized and finding the color-spectrum cluster peak points in the color-spectrum cluster image;
Step 2 includes the step of splitting the image to be recognized into multiple split images according to the different color-spectrum cluster peak points.
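The patent does not say which clustering algorithm is used for the color-spectrum analysis. As a stand-in, the sketch below groups RGB pixels into coarse color bins and keeps the bins that hold a meaningful share of the pixels; the bin width of 64, the 10% share, and the function names are all assumptions.

```python
from collections import Counter


def quantize(pixel, step=64):
    """Map an (R, G, B) pixel to the index triple of its coarse color bin."""
    return tuple(channel // step for channel in pixel)


def dominant_color_bins(pixels, step=64, min_share=0.1):
    """Return color bins holding at least min_share of all pixels,
    most frequent first."""
    counts = Counter(quantize(p, step) for p in pixels)
    total = len(pixels)
    return [bin_ for bin_, n in counts.most_common() if n / total >= min_share]


pixels = (
    [(250, 250, 250)] * 6    # paper
    + [(10, 10, 10)] * 3     # black text
    + [(200, 10, 20)] * 2    # red seal
    + [(0, 0, 255)]          # a single noisy blue pixel
)
print(dominant_color_bins(pixels))
```

The lone blue pixel falls below the share cutoff, so only the paper, text, and seal colors remain as candidate layers.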
Preferably, step 4 comprises the following step:
All split-image recognition results r of the image to be recognized are merged into the recognition result R of the image to be recognized.
Preferably, step 1 comprises the following steps:
Step 11: convert the image to be recognized into digital records;
Step 12: perform a Fourier transform on the digital records of the image to be recognized;
Step 13: differentiate the result of the Fourier transform to obtain the extreme points;
Step 14: cluster according to the extreme points to obtain the color ranges that occur most often.
Preferably, step 2 comprises the following steps:
Step 21: group all digital records according to the color ranges obtained in step 14;
Step 22: in each group of records, for the corresponding color range, keep the values that lie within the color range and change all other values to the value corresponding to white;
Step 23: convert the numbers of each group of records back into a split image.
Preferably, step 3 comprises the following steps:
Step 31: record the picture name of each split image, then perform image recognition;
Step 32: obtain the text content corresponding to each picture, where each text content includes an item name, the specific content, and a confidence.
Preferably, step 4 comprises the following steps:
Step 41: according to the split record of the picture to be recognized, obtain the text content corresponding to each split picture;
Step 42: determine the content to be merged from the text content corresponding to each split image;
Step 43: integrate the content to be merged to obtain the final recognition result for the image to be recognized.
Preferably, step 42 includes any one or more of the following steps:
- for a split image that needs only one text content, select the text content with the highest confidence from the text contents as the content to be merged;
- for a split image that needs multiple text contents, select the text contents whose confidence exceeds a certain threshold from the text contents as the content to be merged.
Compared with the prior art, the present invention has the following beneficial effects:
1. During image recognition, the present invention handles conditions such as background, texture, and skew better, thereby improving the recognition rate.
2. It can effectively recognize bills affected by the background texture of the text itself, by printed characters pressed onto dividing lines because of printing deviation, or by a patterned background formed where a seal covers the text.
3. The present invention can recognize multicolored pictures with backgrounds and obtains a better recognition result; it is particularly suitable for recognizing business cards and bills with complex images.
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawing:
Fig. 1 shows a schematic diagram of the principle of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any form. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the inventive concept. These all fall within the scope of protection of the present invention.
The image splitting and recognition method based on a color parameter provided by the present invention comprises the following steps:
Step 1: perform an analysis based on a color parameter on the image to be recognized, and find the peak points corresponding to the color parameter;
Step 2: split the image to be recognized into multiple split images according to the peak points;
Step 3: recognize each split image to obtain multiple groups of split-image recognition results r;
Step 4: merge the split-image recognition results r into the recognition result R of the image to be recognized; preferably, all split-image recognition results r of the image to be recognized are merged into the recognition result R of the image to be recognized.
In a preferred example, color-level analysis and color-spectrum cluster analysis can be performed on the image to be recognized to find the peak points of the corresponding analysis image. The two analyses can be executed independently, and executing either one can improve the recognition rate of the image to be recognized. Specifically:
The analysis based on a color parameter includes color-level analysis, and correspondingly: step 1 includes the step of performing color-level analysis on the image to be recognized and finding the color-level peak points in the color-level image; step 2 includes the step of splitting the image to be recognized into multiple split images according to the different color-level peak points;
The analysis based on a color parameter includes color-spectrum cluster analysis, and correspondingly: step 1 includes the step of performing color-spectrum cluster analysis on the image to be recognized and finding the color-spectrum cluster peak points in the color-spectrum cluster image; step 2 includes the step of splitting the image to be recognized into multiple split images according to the different color-spectrum cluster peak points.
Further preferably, step 1 comprises the following steps:
Step 11: convert the image to be recognized into digital records.
Step 12: perform a Fourier transform on the digital records of the image to be recognized.
Step 13: differentiate the result of the Fourier transform to obtain the extreme points.
Step 14: cluster according to the extreme points to obtain the color ranges that occur most often.
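Steps 11 to 14 leave open how the Fourier transform and the derivative are combined. The sketch below is one possible interpretation, not the patented procedure: the digital record is taken to be a gray-level histogram, the Fourier transform is used as a low-pass filter to smooth it, sign changes of the discrete derivative give the extreme points, and the minima cut the level axis into color ranges. All names and parameters (`keep`, `top`) are hypothetical.

```python
import cmath


def dft(values):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def low_pass(values, keep):
    """Smooth a sequence by zeroing all but the `keep` lowest DFT frequencies."""
    n = len(values)
    spectrum = dft(values)
    kept = [c if min(k, n - k) <= keep else 0 for k, c in enumerate(spectrum)]
    # Inverse DFT; for real input the imaginary part is only rounding noise.
    return [sum(kept[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def extreme_points(curve):
    """Return (maxima, minima): indices where the discrete derivative changes sign."""
    diff = [b - a for a, b in zip(curve, curve[1:])]
    maxima = [i + 1 for i in range(len(diff) - 1) if diff[i] > 0 and diff[i + 1] <= 0]
    minima = [i + 1 for i in range(len(diff) - 1) if diff[i] < 0 and diff[i + 1] >= 0]
    return maxima, minima


def color_ranges(hist, keep=8, top=2):
    """Cut the level axis at the minima of the smoothed histogram and return
    the `top` half-open ranges (lo, hi) that contain the most pixels."""
    _, minima = extreme_points(low_pass(hist, keep))
    cuts = [0] + minima + [len(hist)]
    ranges = list(zip(cuts, cuts[1:]))
    ranges.sort(key=lambda r: sum(hist[r[0]:r[1]]), reverse=True)
    return sorted(ranges[:top])


# A 32-level toy histogram with one bump around level 6 and one around level 24.
hist = [0] * 32
hist[5:8] = [10, 30, 10]
hist[23:26] = [10, 30, 10]
ranges = color_ranges(hist, keep=4, top=2)
print(ranges)
```

The exact range boundaries depend on the amount of smoothing, so they are printed rather than stated here; the point is that each dominant bump ends up in its own range.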
Step 2 comprises the following steps:
Step 21: group all digital records according to the color ranges obtained in step 14.
Step 22: in each group of records, for the corresponding color range, keep the values that lie within the color range and change all other values to the value corresponding to white.
Step 23: convert the numbers of each group of records back into a split image.
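Steps 21 to 23 amount to masking: each split image keeps the values inside one color range and whitens everything else. A minimal sketch on a grayscale image stored as nested lists follows; the half-open `(lo, hi)` range convention and the name `split_by_ranges` are assumptions.

```python
WHITE = 255


def split_by_ranges(gray_image, ranges):
    """Return one split image per (lo, hi) range: levels in [lo, hi) are kept,
    every other pixel becomes white."""
    return [
        [[v if lo <= v < hi else WHITE for v in row] for row in gray_image]
        for lo, hi in ranges
    ]


image = [
    [20, 20, 255, 120],
    [255, 120, 20, 255],
]
dark, mid = split_by_ranges(image, [(0, 64), (64, 192)])
print(dark)
print(mid)
```

Each split image contains only one layer of the original, so dark text and a mid-gray background pattern would reach the OCR stage separately.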
Step 3 comprises the following steps:
Step 31: record the picture name of each split image, then send it to the OCR core for image recognition.
Step 32: for each picture, the output of the OCR core contains the corresponding text contents, where each text content includes an item name, the specific content, and a confidence.
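The output described in step 32 can be modeled as a small record type. In the sketch below the field names, the `split_NN` naming scheme, and the `fake_ocr` stub are hypothetical; a real OCR core would replace the stub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextContent:
    item_name: str      # which field of the card or bill this text belongs to
    content: str        # the recognized text itself
    confidence: float   # recognizer confidence, assumed to lie in [0, 1]


def recognize_all(split_images, ocr):
    """Record each split image under a name, then collect the OCR output per name."""
    named = {"split_%02d" % i: img for i, img in enumerate(split_images)}
    return {name: ocr(img) for name, img in named.items()}


def fake_ocr(image):
    # Stand-in for an OCR core: pretends every image holds one "amount" field.
    return [TextContent("amount", "text from " + image, 0.5)]


results = recognize_all(["imgA", "imgB"], fake_ocr)
print(results)
```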
Step 4 comprises the following steps:
Step 41: according to the split record of the picture to be recognized, obtain the text content corresponding to each split picture.
Step 42: for an item that needs only one text content, select the text content with the highest confidence from the recognition results.
Step 43: for an item that may need multiple text contents, select from the recognition results those whose confidence exceeds a certain threshold.
Step 44: integrate the results of the multiple text content documents to obtain the final recognition result for the original image.
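The selection rules in steps 42 and 43 can be sketched as follows. The tuple layout `(item_name, content, confidence)`, the `single_items` set, and the threshold of 0.8 are illustrative assumptions; the patent does not fix a threshold value.

```python
def merge_results(results_by_image, single_items, threshold=0.8):
    """Merge per-image results r into one result R keyed by item name."""
    by_item = {}
    for contents in results_by_image.values():
        for item_name, content, confidence in contents:
            by_item.setdefault(item_name, []).append((confidence, content))

    merged = {}
    for item_name, candidates in by_item.items():
        if item_name in single_items:
            # Only one text is needed: take the candidate with the highest confidence.
            merged[item_name] = max(candidates, key=lambda c: c[0])[1]
        else:
            # Several texts may be needed: keep every candidate above the threshold.
            merged[item_name] = [text for conf, text in candidates if conf > threshold]
    return merged


r = {
    "split_00": [("amount", "1,280.00", 0.95), ("remark", "paid", 0.85)],
    "split_01": [("amount", "1,286.00", 0.60), ("remark", "urgent", 0.90),
                 ("remark", "urqent", 0.40)],
}
R = merge_results(r, single_items={"amount"})
print(R)
```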
More specifically, analysis based on color level (grayscale) and color-spectrum clustering can decompose the many colors and gray levels contained in the image to be recognized. For a specific type of bill, the decomposition algorithm can be fixed so that the analysis need not be repeated. The picture to be recognized is then decomposed into multiple split pictures of different gray levels or colors. An OCR module recognizes every split picture, yielding a result for each group of decomposed pictures. A data processing module then merges all the recognition results into the recognition result of the whole picture.
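Fixing the decomposition for a specific type of bill can be sketched as a simple cache keyed by bill type. The cache, the `bill_type` labels, and the `analyze` callback are hypothetical; the patent only states that the analysis need not be repeated.

```python
_PRESETS = {}


def ranges_for(bill_type, image, analyze):
    """Analyze the first image of each bill type; reuse its color ranges afterwards."""
    if bill_type not in _PRESETS:
        _PRESETS[bill_type] = analyze(image)
    return _PRESETS[bill_type]


calls = []


def counting_analyze(image):
    # Stand-in for the color analysis of step 1; records how often it runs.
    calls.append(image)
    return [(0, 64), (64, 192)]


first = ranges_for("vat_invoice", "scan_001", counting_analyze)
second = ranges_for("vat_invoice", "scan_002", counting_analyze)
print(first, second, calls)
```

The second scan of the same bill type reuses the stored ranges, so the analysis runs only once.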
The specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the particular embodiments described above; those skilled in the art can make various variations or modifications within the scope of the claims, and these do not affect the substance of the present invention.

Claims (7)

1. An image splitting and recognition method based on a color parameter, characterized by comprising the following steps:
Step 1: perform an analysis based on a color parameter on the image to be recognized, and find the peak points corresponding to the color parameter;
Step 2: split the image to be recognized into multiple split images according to the peak points;
Step 3: recognize each split image to obtain multiple groups of split-image recognition results r;
Step 4: merge the split-image recognition results r into the recognition result R of the image to be recognized;
wherein the analysis based on a color parameter includes color-level analysis, and correspondingly:
step 1 includes the step of performing color-level analysis on the image to be recognized and finding the color-level peak points in the color-level image;
step 2 includes the step of splitting the image to be recognized into multiple split images according to the different color-level peak points.
2. The image splitting and recognition method based on a color parameter according to claim 1, characterized in that step 4 comprises the following step:
All split-image recognition results r of the image to be recognized are merged into the recognition result R of the image to be recognized.
3. The image splitting and recognition method based on a color parameter according to claim 1, characterized in that step 1 comprises the following steps:
Step 11: convert the image to be recognized into digital records;
Step 12: perform a Fourier transform on the digital records of the image to be recognized;
Step 13: differentiate the result of the Fourier transform to obtain the extreme points;
Step 14: cluster according to the extreme points to obtain the color ranges that occur most often.
4. The image splitting and recognition method based on a color parameter according to claim 3, characterized in that step 2 comprises the following steps:
Step 21: group all digital records according to the color ranges obtained in step 14;
Step 22: in each group of records, for the corresponding color range, keep the values that lie within the color range and change all other values to the value corresponding to white;
Step 23: convert the numbers of each group of records back into a split image.
5. The image splitting and recognition method based on a color parameter according to claim 1, characterized in that step 3 comprises the following steps:
Step 31: record the picture name of each split image, then perform image recognition;
Step 32: obtain the text content corresponding to each picture, where each text content includes an item name, the specific content, and a confidence.
6. The image splitting and recognition method based on a color parameter according to claim 5, characterized in that step 4 comprises the following steps:
Step 41: according to the split record of the picture to be recognized, obtain the text content corresponding to each split picture;
Step 42: determine the content to be merged from the text content corresponding to each split image;
Step 43: integrate the content to be merged to obtain the final recognition result for the image to be recognized.
7. The image splitting and recognition method based on a color parameter according to claim 6, characterized in that step 42 includes any one or more of the following steps:
- for a split image that needs only one text content, select the text content with the highest confidence from the text contents as the content to be merged;
- for a split image that needs multiple text contents, select the text contents whose confidence exceeds a certain threshold from the text contents as the content to be merged.
CN201310612649.4A 2013-11-26 2013-11-26 Image segmentation and recognition method based on color parameter Active CN103617423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310612649.4A CN103617423B (en) 2013-11-26 2013-11-26 Image segmentation and recognition method based on color parameter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310612649.4A CN103617423B (en) 2013-11-26 2013-11-26 Image segmentation and recognition method based on color parameter

Publications (2)

Publication Number Publication Date
CN103617423A (en) 2014-03-05
CN103617423B (en) 2017-01-25

Family

ID=50168126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310612649.4A Active CN103617423B (en) 2013-11-26 2013-11-26 Image segmentation and recognition method based on color parameter

Country Status (1)

Country Link
CN (1) CN103617423B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268512B (en) * 2014-09-17 2018-04-27 清华大学 Character identifying method and device in image based on optical character identification
CN107092903A (en) * 2016-02-18 2017-08-25 阿里巴巴集团控股有限公司 information identifying method and device
CN110187816B (en) * 2019-05-22 2020-11-20 掌阅科技股份有限公司 Automatic page turning method for cartoon type electronic book, computing device and storage medium
CN112215159B (en) * 2020-10-13 2021-05-07 苏州工业园区报关有限公司 International trade document splitting system based on OCR and artificial intelligence technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5134666A (en) * 1990-08-15 1992-07-28 Ricoh Company, Ltd. Image separator for color image processing
CN102122348A (en) * 2011-02-26 2011-07-13 王枚 Practical method for recovering fuzzy license plate image
CN102663411A (en) * 2012-02-29 2012-09-12 宁波大学 Recognition method for target human body
CN103136845A (en) * 2013-01-23 2013-06-05 浙江大学 Renminbi (RMB) counterfeit identifying method based on crown-word image characters

Also Published As

Publication number Publication date
CN103617423A (en) 2014-03-05

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160608

Address after: 200120 Shanghai City, Pudong New Area China Mall (Shanghai) free trade zone lucky road No. 660 building 2307 unit

Applicant after: Qianjin Network Information Technology (Shanghai) Co.,Ltd.

Address before: 200336 Shanghai city Changning District Xianxia Road 579 Lane No. 38 building second room 106

Applicant before: Find forest network technology (Shanghai) Co., Ltd.

C14 Grant of patent or utility model
GR01 Patent grant