CN102567711A - Method and system for making and using scanning recognition template - Google Patents
- Publication number
- CN102567711A (application number CN2010106228013A)
- Authority
- CN
- China
- Prior art keywords
- template
- locating piece
- image
- making
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Input (AREA)
Abstract
The invention relates to a method and a system for making and using a scanning recognition template. The method comprises the following steps: making a recognition template, marking out locating blocks in the template, and setting the attributes of the locating blocks; performing regional analysis on a scanned image, and finding the template whose superposition rate with the image regions reaches a threshold; matching the locating blocks in the template with the regions in the scanned image, and extracting and recognizing the content information of the matched locating blocks; and classifying the recognized content information of the locating blocks. With the method and the system provided by the invention, the efficiency of recognizing regular, complex layouts is greatly improved, and the recognized information is checked and classified automatically.
Description
Technical field
The present invention relates to the technical field of scanning recognition, and in particular to a method and system for making and using a scanning recognition template.
Background art
With the continuous progress of society and the rapid development of digital technology, readers increasingly prefer electronic materials, so more and more paper materials need to be digitized, that is, scanned and recognized.
In digital production, OCR technology is critical: its quality directly determines the quality of the recognized data. Charts, formulas and similar elements in paper materials make automatic recognition by computer considerably more difficult. Pictures in some materials take a great deal of time to recognize, give poor results, and greatly reduce recognition efficiency. Organizing the content after recognition is also a heavy workload: the content is easily jumbled and must be sorted manually, which increases labor cost.
Summary of the invention
In view of the defects of current OCR technology, the object of the invention is to provide a method and system for making and using a scanning recognition template, so as to improve the efficiency and quality of recognizing mixed text and graphics.
The invention provides a method for making and using a scanning recognition template, comprising the following steps:
(S0) making a recognition template, marking out locating pieces in the template, and setting the attributes of the locating pieces;
(S1) performing regional analysis on a scanned image, and finding the template whose coincidence factor with the image regions reaches a set threshold;
(S2) matching the locating pieces in the template against the regions in the scanned image, and extracting and recognizing the content information of the matched locating pieces;
(S3) sorting the recognized content information of the locating pieces.
Further, the method also comprises normalizing the scanned image, where normalization means correcting image deformation caused by scanning.
Further, in step (S0), the template is a closed graphic region including its border and contains one or more locating pieces. A locating piece is a closed rectangular frame inside the template that is used to recognize and mark the content of its matching region.
Further, the template and the locating pieces all have an additional attribute, the matching metric attribute, which measures the coincidence factor between the template and the image and between a locating piece and an image region, and serves as an indicator for manual intervention.
Further, the additional attributes of a locating piece also comprise:
1) recognition content type: text, figure or image;
2) recognition content cluster label: used by the system to sort the recognized content according to this label;
3) content verification rule: a rule for checking the recognized content;
4) automatic deformation attribute: when the locating piece is compared for coincidence with an image region, allows the size and position of the locating piece to be fine-tuned within a set threshold range.
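The specification does not say what form a content verification rule takes. As a minimal sketch, assuming a rule is expressed as a regular expression that the whole recognized string must match, a check could look like the following Python. The function name `verify` and the rule `QUANTITY_RULE` are hypothetical and chosen only for illustration.

```python
import re


def verify(content, rule):
    """Return True when the whole recognized string matches the rule.

    Assumption: a verification rule is a regular expression. The patent
    text does not prescribe any particular rule format.
    """
    return re.fullmatch(rule, content) is not None


# Hypothetical rule for an ingredient quantity such as "250 g" or "1.5 l".
QUANTITY_RULE = r"\d+(\.\d+)?\s?(g|kg|ml|l)"

print(verify("250 g", QUANTITY_RULE))   # True
print(verify("25O g", QUANTITY_RULE))   # False: letter O misread for zero
```

A failed check of this kind is one plausible trigger for the manual intervention mentioned above.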
Further, in step (S2), a locating piece in the template is matched against a region in the scanned image as follows: the region and the locating piece are considered matched when the coincidence factor of the two rectangles reaches the threshold set by the matching metric attribute of the locating piece.
Further, in step (S2), locating pieces may be nested; when locating pieces recognize the content of their regions, recognition proceeds in an order determined by nesting depth, then matching degree, then priority weight.
Further, in step (S2), the size and position of a locating piece are fine-tuned within a set threshold range according to the image content of its matching region.
Further, in step (S2), a locating piece processes the image in its region according to its recognition content type: for example, OCR for text, cropping for an image, and possibly curve fitting for a figure.
A system for making and using a scanning recognition template comprises:
a template making device, used to make templates, mark out the locating pieces of a template, and set the attributes of the locating pieces;
a template management device, used to manage all templates and to find the template whose coincidence factor with the image regions reaches a set threshold;
a recognition execution device, used to match locating pieces against the regions of a scanned image, and to extract and recognize the content information of the matched locating pieces;
a classification device, used to classify the content information that has been recognized.
The beneficial effects of the invention are as follows. For documents published from a template, the invention helps improve recognition efficiency, and it verifies and classifies the recognized information. Where the image regions have distinct features, dividing the image into regions and separating and marking regions of different recognition difficulty not only allows the results to be cross-checked to improve recognition accuracy, but also sorts the recognized content at the same time. The method and system of the invention solve the relative positioning problem of the captured image and significantly reduce the workload of manual sorting.
Description of drawings
Fig. 1 is a structural diagram of the system for making and using a scanning recognition template in an embodiment of the invention;
Fig. 2 is a flow chart of the method for making and using a scanning recognition template in an embodiment of the invention;
Fig. 3 is the original scanned image in the embodiment;
Fig. 4 is the template that best fits Fig. 3 in the embodiment;
Fig. 5 is a schematic diagram of matching locating pieces with image regions in the embodiment.
Detailed description of the embodiments
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the invention provides a system for making and using a scanning recognition template, comprising:
a recognition execution device 13, used to match locating pieces against the regions of a scanned image, and to extract and recognize the content information of the matched locating pieces.
The method implemented by the above system is shown in Fig. 2 and comprises the following steps:
S0: make a recognition template, mark out locating pieces in the template, and set the attributes of the locating pieces.
In the embodiment of the invention, the template is a closed graphic region including its border and contains one or more locating pieces. A locating piece is a closed rectangular frame inside the template that is used to recognize and mark the content of its matching region.
The template and the locating pieces all have an additional attribute, the matching metric attribute, which measures the coincidence factor between the template and the image and between a locating piece and an image region, and serves as an indicator for manual intervention.
The additional attributes of a locating piece also comprise:
1) recognition content type: for example text, figure or image;
2) recognition content cluster label: used by the system to sort the recognized content according to this label;
3) content verification rule: a rule for checking the recognized content;
4) automatic deformation attribute: when the locating piece is compared for coincidence with an image region, allows the size and position of the locating piece to be fine-tuned within a set threshold range.
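One way to picture the template and locating-piece attributes listed above is as a pair of record types. The Python sketch below is an illustration only: the class names, field names, the (left, top, right, bottom) rectangle convention and the pixel-based adjustment limit are all assumptions rather than part of the specification. The 0.85 default reflects the 85% threshold used in the embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed convention: (left, top, right, bottom) in pixels.
Rect = Tuple[int, int, int, int]


@dataclass
class LocatingPiece:
    rect: Rect                                # closed rectangular frame inside the template
    content_type: str                         # "text", "figure" or "image"
    cluster_label: str                        # label used to sort recognized content
    match_threshold: float = 0.85             # matching metric attribute
    verification_rule: Optional[str] = None   # assumed to be a pattern string
    max_adjust_px: int = 0                    # automatic deformation limit; 0 disables it
    children: List["LocatingPiece"] = field(default_factory=list)  # nested pieces


@dataclass
class Template:
    frame: Rect                               # closed boundary of the whole template
    match_threshold: float                    # matching metric attribute of the template
    pieces: List[LocatingPiece] = field(default_factory=list)


# Hypothetical two-piece layout: a picture above a block of text.
recipe = Template(
    frame=(0, 0, 800, 1200),
    match_threshold=0.85,
    pieces=[
        LocatingPiece((40, 40, 760, 500), "image", "photo"),
        LocatingPiece((40, 540, 760, 1100), "text", "method", max_adjust_px=20),
    ],
)
```

Keeping nested pieces in a `children` list is one possible representation of the nesting that step S2 allows; a flat list with a parent reference would serve equally well.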
S1: perform regional analysis on the scanned image, and find the template whose coincidence factor with the image regions reaches the set threshold.
In the embodiment of the invention, connected-component analysis is performed on the scanned image, the image is segmented into regions according to the features of the connected components, the segmented image is matched against the templates in the template management device, and the region coincidence factor is calculated to find the corresponding template. Connected-component analysis and this matching process are known techniques in the art.
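The specification treats connected-component analysis as known art and gives no algorithm. As a minimal sketch, assuming a binarized page stored as a list of rows and 4-connectivity, the bounding boxes of the connected regions can be found with a breadth-first search. The function name and the exclusive right/bottom convention are choices made here for illustration.

```python
from collections import deque


def connected_regions(grid):
    """Return bounding boxes (left, top, right, bottom) of 4-connected ink regions.

    grid is a list of rows; a truthy cell is an ink pixel.
    right and bottom are exclusive.
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not grid[y][x] or seen[y][x]:
                continue
            # Start a new region and flood-fill it.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x + 1, y + 1
            while queue:
                cx, cy = queue.popleft()
                left, top = min(left, cx), min(top, cy)
                right, bottom = max(right, cx + 1), max(bottom, cy + 1)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height \
                            and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
print(connected_regions(page))  # [(0, 0, 2, 2), (4, 1, 5, 3)]
```

A real page would first need binarization and, in practice, merging of nearby components into layout regions; both steps are outside this sketch.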
The embodiment of the invention also normalizes the scanned image, that is, corrects image deformation caused by scanning, typically page skew and slight changes in size. Normalization helps improve the efficiency and accuracy of matching templates to scanned images. Normalization of the scanned page uses known image processing techniques.
S2: match the locating pieces in the template against the regions in the scanned image, and extract and recognize the content information of the matched locating pieces.
In the embodiment of the invention, a locating piece in the template is matched against a region in the scanned image as follows: the region and the locating piece are considered matched when the coincidence factor of the two rectangles reaches the threshold set by the matching metric attribute of the locating piece.
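The specification does not give a formula for the coincidence factor of two rectangles. A minimal sketch, assuming it is the intersection area divided by the union area (other definitions, such as intersection over the locating piece's own area, are equally plausible), is shown below. The function names are hypothetical.

```python
def coincidence_factor(a, b):
    """Intersection-over-union of two rectangles (left, top, right, bottom).

    Assumption: this ratio stands in for the coincidence factor, which the
    specification names but does not define.
    """
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the rectangles do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def matches(piece_rect, region_rect, threshold):
    """A region matches a locating piece when the factor reaches the threshold."""
    return coincidence_factor(piece_rect, region_rect) >= threshold


piece = (0, 0, 100, 100)
print(matches(piece, (0, 0, 100, 90), 0.85))  # True: factor is 0.9
print(matches(piece, (0, 0, 100, 80), 0.85))  # False: factor is 0.8
```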
Further, locating pieces may be nested; when locating pieces recognize the content of their regions, recognition proceeds in an order determined by nesting depth, then matching degree, then priority weight.
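The text names the three ordering criteria but not their direction. The sketch below assumes outer pieces are recognized before nested ones, and that a higher matching degree and a larger priority weight come first; these directions, the `Candidate` type and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    nesting_depth: int    # 0 for a top-level locating piece
    match_degree: float   # coincidence factor with its image region
    priority: int         # priority weight; larger means earlier (assumed)


def recognition_order(candidates):
    """Sort by nesting depth, then matching degree, then priority weight."""
    return sorted(
        candidates,
        key=lambda c: (c.nesting_depth, -c.match_degree, -c.priority),
    )


pieces = [
    Candidate("caption", 1, 0.95, 1),
    Candidate("photo", 0, 0.90, 1),
    Candidate("notes", 0, 0.90, 5),
    Candidate("steps", 0, 0.97, 1),
]
print([c.name for c in recognition_order(pieces)])
# ['steps', 'notes', 'photo', 'caption']
```

Reversing any of the three directions only requires flipping the sign of the corresponding key component.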
Further, the size and position of a locating piece are fine-tuned within a set threshold range according to the image content of its matching region.
Further, a locating piece processes the image in its region according to its recognition content type: for example, OCR for text, cropping for an image, and possibly curve fitting for a figure.
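Processing by content type amounts to a dispatch on the type attribute. In the following sketch the three handlers are placeholders that only return a description of what they would do; a real system would call an OCR engine, an image cropper and a curve-fitting routine in their place. All names are hypothetical.

```python
def recognize_text(region):
    # Placeholder for a call to an OCR engine.
    return {"kind": "text", "value": f"<OCR of {region}>"}


def crop_image(region):
    # Placeholder for cutting the picture out of the scan.
    return {"kind": "image", "value": f"<crop of {region}>"}


def fit_curve(region):
    # Placeholder for curve fitting on a figure.
    return {"kind": "figure", "value": f"<curve fitted to {region}>"}


HANDLERS = {"text": recognize_text, "image": crop_image, "figure": fit_curve}


def process(content_type, region):
    """Run the handler registered for the locating piece's content type."""
    try:
        handler = HANDLERS[content_type]
    except KeyError:
        raise ValueError(f"unknown content type: {content_type!r}") from None
    return handler(region)


print(process("text", (40, 540, 760, 1100))["kind"])  # text
```

A table of handlers keeps the set of content types open: supporting a new type means registering one more function.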
S3: sort the recognized content information of the locating pieces.
For example, the recognized information of some locating pieces is image and that of others is text; these different types of content information are sorted accordingly.
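Sorting the recognized content can be pictured as grouping results by the cluster label of the locating piece that produced them. This is a minimal sketch, assuming each result arrives as a (label, content) pair; the labels and sample strings are invented for illustration.

```python
from collections import defaultdict


def classify(results):
    """Group recognized content by the cluster label of its locating piece."""
    grouped = defaultdict(list)
    for label, content in results:
        grouped[label].append(content)
    return dict(grouped)


recognized = [
    ("ingredients", "2 eggs"),
    ("method", "Beat the eggs."),
    ("ingredients", "100 g flour"),
]
print(classify(recognized))
# {'ingredients': ['2 eggs', '100 g flour'], 'method': ['Beat the eggs.']}
```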
A specific embodiment of the invention is described below to explain the technical details of the method.
Fig. 3 is the original scanned image in the embodiment. As the figure shows, the scan is a page of a recipe, comprising a picture of the finished dish, the ingredients of the recipe, the preparation method and notes.
Fig. 4 is the template that best fits Fig. 3 in the embodiment. The template management device performs regional analysis on Fig. 3 and finds the template whose coincidence factor with the image regions reaches the set threshold; in the present embodiment this is the layout template shown in Fig. 4.
As Fig. 4 shows, the template consists of two parts: a template outer frame 41 and locating pieces 42. The template outer frame 41 sets the size of the whole scanned image, and the locating pieces 42 mark how the content is distributed in the scanned image.
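The lookup performed by the template management device can be sketched as scoring every stored template against the regions found in the scan. The specification does not say how the coincidence factor of a whole template is computed; the sketch assumes it is the mean, over the template's locating pieces, of each piece's best intersection-over-union with any image region. The function names and the sample layouts are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two rectangles (left, top, right, bottom)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def template_score(piece_rects, region_rects):
    """Mean of each piece's best overlap with any region (an assumed measure)."""
    if not piece_rects or not region_rects:
        return 0.0
    best = [max(iou(p, r) for r in region_rects) for p in piece_rects]
    return sum(best) / len(best)


def select_template(templates, region_rects, threshold=0.85):
    """Return the name of the best template reaching the threshold, or None.

    templates maps a name to its list of locating-piece rectangles.
    """
    best_name, best_score = None, 0.0
    for name, piece_rects in templates.items():
        score = template_score(piece_rects, region_rects)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


templates = {
    "recipe": [(0, 0, 100, 50), (0, 60, 100, 100)],
    "letter": [(0, 0, 100, 100)],
}
regions = [(0, 0, 100, 48), (0, 60, 100, 100)]
print(select_template(templates, regions))  # recipe
```

Returning `None` when no template reaches the threshold gives the caller a natural point at which to ask for manual intervention.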
In the present embodiment, every locating piece has the following attributes:
1) recognition content type: for example text, figure or image;
2) recognition content cluster label: used by the system to sort the recognized content according to this label;
3) content verification rule: a rule for checking the recognized content;
4) automatic deformation attribute: when the locating piece is compared for coincidence with an image region, allows the size and position of the locating piece to be fine-tuned within a set threshold range.
Fig. 5 shows the matching of locating pieces with image regions in the embodiment. In the recognition execution device, the locating pieces in the template are first matched with image regions by position: a region and a locating piece are considered matched when the coincidence factor of the two rectangles reaches the set threshold. This position matching is a known technique in the art and is not described further here. In the present embodiment the threshold is set to 85%, that is, a region and a locating piece in the template are considered matched when their coincidence factor reaches 85% or more, as shown in Fig. 5.
After a region and a locating piece have been preliminarily matched, the size and position of the locating piece are fine-tuned within the set threshold range according to the attributes set in the locating piece. For example, after locating piece c is matched with the notes in the image, it automatically shrinks to the extent of the text and ignores the frame around the text.
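The fine-tuning step can be illustrated by shrinking a rectangle toward the ink it contains while limiting how far each edge may move. This is a simplified sketch under stated assumptions: the page is a binary grid, the limit is a pixel count per edge, and any decorative frame around the text has already been removed, since telling a frame apart from text is outside the sketch. The function name is hypothetical.

```python
def shrink_to_content(grid, rect, max_shift):
    """Shrink rect (left, top, right, bottom) toward the ink inside it.

    right and bottom are exclusive. Each edge moves inward by at most
    max_shift pixels; a rectangle with no ink is returned unchanged.
    """
    left, top, right, bottom = rect
    ink = [
        (x, y)
        for y in range(top, bottom)
        for x in range(left, right)
        if grid[y][x]
    ]
    if not ink:
        return rect
    xs = [x for x, _ in ink]
    ys = [y for _, y in ink]
    return (
        min(min(xs), left + max_shift),
        min(min(ys), top + max_shift),
        max(max(xs) + 1, right - max_shift),
        max(max(ys) + 1, bottom - max_shift),
    )


# A 6x6 page with a 2x2 block of ink in the middle.
page = [[0] * 6 for _ in range(6)]
for x, y in ((2, 2), (3, 2), (2, 3), (3, 3)):
    page[y][x] = 1

print(shrink_to_content(page, (0, 0, 6, 6), max_shift=3))  # (2, 2, 4, 4)
print(shrink_to_content(page, (0, 0, 6, 6), max_shift=1))  # (1, 1, 5, 5)
```

With a generous limit the rectangle closes tightly around the ink; with a limit of one pixel it moves only as far as the threshold range allows.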
Next, the content of the matched locating pieces is recognized and recorded in the locating pieces. The recognized content is classified at the same time: for example, the content type recognized by locating piece a is image, and the content type recognized by locating piece b is text. The recognized content information of the locating pieces is thus sorted.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to cover them as well.
Claims (10)
1. A method for making and using a scanning recognition template, comprising the following steps:
(S0) making a recognition template, marking out locating pieces in the template, and setting the attributes of the locating pieces;
(S1) performing regional analysis on a scanned image, and finding the template whose coincidence factor with the image regions reaches a set threshold;
(S2) matching the locating pieces in the template against the regions in the scanned image, and extracting and recognizing the content information of the matched locating pieces;
(S3) sorting the recognized content information of the locating pieces.
2. The method for making and using a scanning recognition template as claimed in claim 1, characterized in that the method also comprises normalizing the scanned image, where normalization means correcting image deformation caused by scanning.
3. The method for making and using a scanning recognition template as claimed in claim 1, characterized in that, in step (S0), the template is a closed graphic region including its border and contains one or more locating pieces, wherein a locating piece is a closed rectangular frame inside the template that is used to recognize and mark the content of its matching region.
4. The method for making and using a scanning recognition template as claimed in claim 3, characterized in that the template and the locating pieces all have an additional attribute, the matching metric attribute, which measures the coincidence factor between the template and the image and between a locating piece and an image region, and serves as an indicator for manual intervention.
5. The method for making and using a scanning recognition template as claimed in claim 4, characterized in that the additional attributes of a locating piece also comprise:
1) recognition content type: text, figure or image;
2) recognition content cluster label: used by the system to sort the recognized content according to this label;
3) content verification rule: a rule for checking the recognized content;
4) automatic deformation attribute: when the locating piece is compared for coincidence with an image region, allows the size and position of the locating piece to be fine-tuned within a set threshold range.
6. The method for making and using a scanning recognition template as claimed in claim 4, characterized in that, in step (S2), a locating piece in the template is matched against a region in the scanned image, and the region and the locating piece are considered matched if the coincidence factor of the two rectangles reaches the threshold set by the matching metric attribute of the locating piece.
7. The method for making and using a scanning recognition template as claimed in claim 6, characterized in that, in step (S2), locating pieces may be nested, and when locating pieces recognize the content of their regions, recognition proceeds in an order determined by nesting depth, then matching degree, then priority weight.
8. The method for making and using a scanning recognition template as claimed in claim 6, characterized in that, in step (S2), the size and position of a locating piece are fine-tuned within a set threshold range according to the image content of its matching region.
9. The method for making and using a scanning recognition template as claimed in claim 6, characterized in that, in step (S2), a locating piece processes the image in its region according to its recognition content type.
10. A system for making and using a scanning recognition template, comprising:
a template making device, used to make templates, mark out the locating pieces of a template, and set the attributes of the locating pieces;
a template management device, used to manage all templates and to find the template whose coincidence factor with the image regions reaches a set threshold;
a recognition execution device, used to match locating pieces against the regions of a scanned image, and to extract and recognize the content information of the matched locating pieces;
a classification device, used to classify the content information that has been recognized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010106228013A CN102567711A (en) | 2010-12-29 | 2010-12-29 | Method and system for making and using scanning recognition template |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010106228013A CN102567711A (en) | 2010-12-29 | 2010-12-29 | Method and system for making and using scanning recognition template |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102567711A true CN102567711A (en) | 2012-07-11 |
Family
ID=46413091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010106228013A Pending CN102567711A (en) | 2010-12-29 | 2010-12-29 | Method and system for making and using scanning recognition template |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102567711A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105809157A (en) * | 2014-12-29 | 2016-07-27 | 北京鸿合智能系统股份有限公司 | Answer sheet modeling method and device |
CN107206587A (en) * | 2014-12-05 | 2017-09-26 | Ars责任有限公司 | Equipment for being oriented to the part especially by crawls such as robot, automation equipments |
CN107517272A (en) * | 2017-09-14 | 2017-12-26 | 新疆圣力信息科技有限公司 | A kind of device, the system and method for automatic data collection set form data |
CN107590495A (en) * | 2017-09-18 | 2018-01-16 | 哈尔滨成长科技有限公司 | Answer sheet picture method for correcting error, device, readable storage medium storing program for executing and electronic equipment |
CN108665439A (en) * | 2017-08-22 | 2018-10-16 | 深圳安博电子有限公司 | Method of testing substrate and terminal device |
CN108875697A (en) * | 2018-07-05 | 2018-11-23 | 南昌市微轲联信息技术有限公司 | Collecting vehicle information method for uploading, device, storage medium and computer equipment |
CN109086738A (en) * | 2018-08-23 | 2018-12-25 | 深圳市深晓科技有限公司 | A kind of character identifying method and device based on template matching |
CN110705610A (en) * | 2019-09-17 | 2020-01-17 | 孔佑强 | Evaluation system and method based on handwriting detection and temporary writing capability |
CN111353611A (en) * | 2018-12-20 | 2020-06-30 | 核动力运行研究所 | Automatic generation system and method for in-service inspection and overhaul inspection report of nuclear power station |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1619580A (en) * | 2004-09-03 | 2005-05-25 | 深圳市海云天科技有限公司 | Information identification method of full-filling information card |
US20090087103A1 (en) * | 2007-09-28 | 2009-04-02 | Hitachi High-Technologies Corporation | Inspection Apparatus and Method |
CN101464951A (en) * | 2007-12-21 | 2009-06-24 | 北大方正集团有限公司 | Image recognition method and system |
CN101882225A (en) * | 2009-12-29 | 2010-11-10 | 北京中科辅龙计算机技术股份有限公司 | Engineering drawing material information extraction method based on template |
CN101923643A (en) * | 2010-08-11 | 2010-12-22 | 中科院成都信息技术有限公司 | General form recognizing method |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107206587A (en) * | 2014-12-05 | 2017-09-26 | Ars责任有限公司 | Equipment for being oriented to the part especially by crawls such as robot, automation equipments |
CN105809157A (en) * | 2014-12-29 | 2016-07-27 | 北京鸿合智能系统股份有限公司 | Answer sheet modeling method and device |
CN108665439A (en) * | 2017-08-22 | 2018-10-16 | 深圳安博电子有限公司 | Method of testing substrate and terminal device |
CN107517272A (en) * | 2017-09-14 | 2017-12-26 | 新疆圣力信息科技有限公司 | A kind of device, the system and method for automatic data collection set form data |
CN107590495A (en) * | 2017-09-18 | 2018-01-16 | 哈尔滨成长科技有限公司 | Answer sheet picture method for correcting error, device, readable storage medium storing program for executing and electronic equipment |
CN108875697A (en) * | 2018-07-05 | 2018-11-23 | 南昌市微轲联信息技术有限公司 | Collecting vehicle information method for uploading, device, storage medium and computer equipment |
CN109086738A (en) * | 2018-08-23 | 2018-12-25 | 深圳市深晓科技有限公司 | A kind of character identifying method and device based on template matching |
CN111353611A (en) * | 2018-12-20 | 2020-06-30 | 核动力运行研究所 | Automatic generation system and method for in-service inspection and overhaul inspection report of nuclear power station |
CN111353611B (en) * | 2018-12-20 | 2023-05-26 | 核动力运行研究所 | Nuclear power station in-service inspection large repair inspection report automatic generation system and method |
CN110705610A (en) * | 2019-09-17 | 2020-01-17 | 孔佑强 | Evaluation system and method based on handwriting detection and temporary writing capability |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102567711A (en) | Method and system for making and using scanning recognition template | |
US8792715B2 (en) | System and method for forms classification by line-art alignment | |
JP5492205B2 (en) | Segment print pages into articles | |
Ray Choudhury et al. | An architecture for information extraction from figures in digital libraries | |
CN102081732B (en) | Method and system for recognizing format template | |
CN104778470B (en) | Text detection based on component tree and Hough forest and recognition methods | |
Rigaud et al. | Robust frame and text extraction from comic books | |
CN101017533A (en) | Recognition method of printed mongolian character | |
CN109325401A (en) | The method and system for being labeled, identifying to title field are positioned based on edge | |
CN100562074C (en) | The method that a kind of video caption extracts | |
CN102332096A (en) | Video caption text extraction and identification method | |
CN1760860A (en) | Device part assembly drawing image search apparatus | |
CN1237742A (en) | Address reader, sorting machine and character string recognition method for mail and the like | |
EP2220590A1 (en) | A method for processing optical character recognition (ocr) data, wherein the output comprises visually impaired character images | |
CN112419260A (en) | PCB character area defect detection method | |
KR101937398B1 (en) | System and method for extracting character in image data of old document | |
CN113723362A (en) | Method and device for detecting table line in image | |
Banerjee et al. | Automatic hyperlinking of engineering drawing documents | |
CN104680142A (en) | Method for comparing four-slap fingerprint based on feature point set segmentation and RST invariant features | |
Sumathi et al. | Techniques and challenges of automatic text extraction in complex images: a survey | |
CN104123527A (en) | Mask-based image table document identification method | |
Lue et al. | A novel character segmentation method for text images captured by cameras | |
CN100356393C (en) | Character recognition method predicted base on font | |
Li et al. | Script identification of camera-based images | |
CN111950556A (en) | License plate printing quality detection method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | Effective date of abandoning: 20161130 |
C20 | Patent right or utility model deemed to be abandoned or is abandoned |