CN107247950A - An ID card image text recognition method based on machine learning - Google Patents

An ID card image text recognition method based on machine learning

Info

Publication number
CN107247950A
Authority
CN
China
Prior art keywords
image
character
self
character area
card image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710416957.8A
Other languages
Chinese (zh)
Inventor
屈鸿
黄鹂
高榕
刘永胜
张翮
史冬霞
陈珊
汪文
汪一文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201710416957.8A
Publication of CN107247950A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses an ID card image text recognition method based on machine learning, belonging to the technical fields of image processing, machine vision, and neural networks. It addresses the problems that arise when prior-art OCR automatically recognizes ID card images against complex backgrounds: long recognition time, low recognition accuracy, and poor robustness to rotation and distortion. The method comprises: acquiring a captured image, preprocessing it, and separating the ID card image in the preprocessed image from the complex background; performing text region detection on the detected ID card image, then segmenting the detected text regions to obtain individual characters; and recognizing the segmented characters with a character recognition model based on deep learning and outputting the recognition result. The invention is used for text recognition on ID card images.

Description

An ID card image text recognition method based on machine learning
Technical field
An ID card image text recognition method based on machine learning, used for text recognition on ID card images, belonging to the technical fields of image processing, machine vision, and neural networks.
Background
Certificate recognition uses optical character recognition (OCR) technology to recognize the text on a certificate. Specifically, it is the process of analyzing and recognizing a scanned or photographed certificate image with OCR in order to obtain the text information on the certificate. Compared with traditional manual entry, automatic OCR entry has a great advantage: it far exceeds human efficiency in both speed and accuracy. As working time increases and fatigue sets in, a person's entry speed drops and accuracy naturally drops with it. Humans cannot outperform machines at mechanical, tedious work. To optimize the allocation of resources, it is imperative to free people from such work so that they can take on other tasks, and OCR technology arose from this demand.
The purpose of an OCR system is to extract the text from an image file and then restore the layout. An OCR system typically comprises four steps: image preprocessing, text region detection, character segmentation, and character recognition:
(1) Image preprocessing
Image preprocessing mainly includes binarization, noise reduction, and skew correction. It is the first step of the recognition process and exists to improve the efficiency and accuracy of the subsequent processing units. Taking an RGB color image as an example, each pixel contains three color components, whereas a binary image needs only one component, so a color image occupies three times the storage of a binary image. Such a large amount of information means both heavy computation and high computational complexity, so the image needs to be binarized. In addition, because image quality varies, preprocessing must first denoise the image to be recognized according to the characteristics of its noise. Furthermore, manually captured images are often skewed, so skew correction is also an important step that makes later text scanning easier. The preprocessing steps are not a fixed pipeline; different recognition requirements call for adjusting the steps according to experimental results. In general, scanned PDF or Word files need relatively simple preprocessing, whereas images of complex environments, such as license plates, ID cards, and street-view billboards, need more elaborate steps.
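The grayscale and binarization steps described above can be illustrated with a minimal sketch. It assumes an image stored as rows of (r, g, b) tuples, the common luminance weights 0.299/0.587/0.114, and a fixed global threshold of 128; the function names `to_gray` and `binarize` are hypothetical and none of these choices are specified by the patent.

```python
def to_gray(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples) to 8-bit gray values."""
    # Luminance weights are a common convention, assumed here for illustration.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Map each gray value to 1 (at or above the threshold) or 0 (below it)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray_image]


# Tiny 2x2 example image: two bright pixels and two dark pixels.
rgb = [[(255, 255, 255), (10, 10, 10)],
       [(200, 180, 160), (0, 0, 0)]]
gray = to_gray(rgb)
binary = binarize(gray)
```

Each pixel shrinks from three components to a single value, which is the storage reduction the paragraph refers to. A production pipeline would more likely pick the threshold adaptively rather than fix it.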
(2) Text region detection
After image preprocessing, the next step is usually to detect the text regions in the image. Traditional text region detection methods include page segmentation based on connected regions and segmentation based on texture features. Object detection methods that have become popular in recent years include methods based on deep neural networks, such as Fast R-CNN.
(3) Character segmentation
Character segmentation is the first step toward character recognition; a robust segmentation algorithm can completely cut out the digits, letters, and Chinese characters on an ID card. Commonly used character segmentation algorithms currently fall into two classes. The first is fixed-spacing segmentation, which cuts the image at a constant spacing to extract candidate characters. This class is well suited to letters or digits as the target, for a simple reason: printed Western text and digits tend to be highly uniform. The second is variable-spacing segmentation, such as the vertical projection method. This class is better suited to Chinese text, which has a distinctive glyph structure, or to segmentation that targets whole words. Because the ID card recognition engine explored by this technique is an integrated system that must recognize letters, digits, and Chinese characters alike, it adopts the second, variable-spacing approach and makes certain improvements on that basis.
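The vertical projection method mentioned above can be sketched as follows. This is a minimal illustration, not the improved method of the patent: it assumes a binary image stored as a list of rows with 1 for text pixels, and the function names and the `min_count` parameter are hypothetical.

```python
def vertical_projection(binary_image):
    """Count foreground pixels (value 1) in every column."""
    return [sum(column) for column in zip(*binary_image)]


def split_columns(binary_image, min_count=1):
    """Return (start, end) column ranges, end exclusive, that contain text."""
    profile = vertical_projection(binary_image)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x                      # a run of text columns begins
        elif count < min_count and start is not None:
            segments.append((start, x))    # the run ended at an empty column
            start = None
    if start is not None:                  # a run reaches the right border
        segments.append((start, len(profile)))
    return segments


# Two strokes separated by two empty columns.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
segments = split_columns(image)
```

Here `segments` holds the two column ranges (1, 3) and (5, 7). Cuts fall wherever the projection drops to zero, so the spacing between cuts follows the glyphs instead of being fixed in advance.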
(4) Character recognition
Character recognition is the final step of the OCR pipeline and a very important one: the recognition accuracy of this module determines whether the whole OCR system is usable. Character recognition algorithms have traditionally been designed from mathematical theory; well-known methods include template matching (that is, structural pattern recognition) and statistical pattern recognition. Since the rise of deep learning, whose deep abstraction of features allows it to extract higher-dimensional features, recognizing characters with deep learning techniques has become a popular field.
The shortcoming of conventional OCR is that it recognizes accurately only formatted documents such as Word documents and cannot handle certificate recognition against complex backgrounds well, which leads to long recognition time, low recognition accuracy, and poor robustness to rotation and distortion.
Summary of the invention
To address the above shortcomings, the present invention provides an ID card image text recognition method based on machine learning. It solves the problems of prior-art OCR when automatically recognizing ID card images against complex backgrounds: long recognition time, low recognition accuracy, and poor robustness to rotation and distortion.
The technical solution adopted by the present invention is as follows:
An ID card image text recognition method based on machine learning, characterized by comprising the following steps:
Step 1: preprocess the acquired captured image, and separate the ID card image in the preprocessed image from the complex background;
Step 2: perform text region detection on the detected ID card image, then segment the detected text regions to obtain individual characters;
Step 3: recognize the segmented characters with a character recognition model based on deep learning, and output the recognition result.
Further, step 1 specifically comprises:
(11) preprocess the captured image using Gaussian blur and grayscale conversion;
(12) perform ID card edge detection on the image preprocessed in step (11) using the Canny operator and the Sobel operator;
(13) using binarization and an AND operation, cut out the region enclosed by the ID card edges detected in step (12) to obtain the ID card image region;
(14) use an SVM classifier to select among the contours of the ID card image region and obtain the correct ID card contour image;
(15) correct the irregularly skewed image obtained in step (14) using the Hough transform and a perspective transform.
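The edge detection in step (12) can be illustrated with a minimal Sobel sketch. It assumes a grayscale image stored as a list of rows and a hand-picked threshold; the function names are hypothetical. The additional stages of the Canny operator (non-maximum suppression and hysteresis thresholding) are deliberately left out, so this shows only the gradient part of that step.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel


def sobel_magnitude(gray):
    """Gradient magnitude for interior pixels; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_mask(gray, threshold):
    """Mark pixels whose gradient magnitude reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(gray)]


# A 4x4 image with a vertical dark-to-bright boundary in the middle.
gray = [[0, 0, 255, 255] for _ in range(4)]
edges = edge_mask(gray, threshold=100)
```

On this example the two interior columns on either side of the boundary are marked as edges, while the border rows and columns stay 0 because the 3*3 kernels are not applied there.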
Further, step 2 specifically comprises:
(21) build a network of three cascaded autoencoders to obtain high-level features, and use these features to judge at the pixel level whether each pixel belongs to a text region, thereby extracting an accurate text region. The specific steps are:
(211) The first autoencoder takes as input 500k blocks of size 5*5 randomly extracted from all the given training images, denoted $x^{(1)}$, where $x^{(1)} \in \mathbb{R}^{75}$, $\mathbb{R}$ denotes the real number space, and $\mathbb{R}^{75}$ means that x is a vector of dimension 75. The number of hidden neurons for the 500k 5*5 input blocks was determined through repeated experiments and finally set to 40. The autoencoder is then trained on the 500k 5*5 blocks with this number of hidden neurons, and after the network converges the output of the encoder part of the first autoencoder is $f^{(1)}$, $f^{(1)} \in \mathbb{R}^{40}$;
(212) The second autoencoder takes as input 500k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (211), denoted $x^{(2)}$. In its defining equation, "+" indicates that $x^{(2)}$ is formed by directly concatenating 9 $x^{(1)}$ and w denotes the weights, with $x^{(2)} \in \mathbb{R}^{360}$. The number of hidden neurons of the second autoencoder is set to 30. The autoencoder is trained on the 500k 3*3 blocks with this number of hidden neurons, and the output of the encoder part of the second autoencoder is $f^{(2)}$, $f^{(2)} \in \mathbb{R}^{30}$;
(213) The third autoencoder takes as input 200k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (212), denoted $x^{(3)}$, $x^{(3)} \in \mathbb{R}^{270}$, where each small block within the 3*3 block overlaps the next small block by 5 pixels. The number of hidden neurons of the third autoencoder is set to 20. After the autoencoder has been trained on the 200k 3*3 blocks with this number of hidden neurons, the output of the encoder part of the third autoencoder is $f^{(3)}$, $f^{(3)} \in \mathbb{R}^{20}$;
(214) Steps (211) to (213) yield three kinds of features for the center point of each 5*5 block. Let $f = f^{(1)} + f^{(2)} + f^{(3)}$, where "+" denotes direct concatenation, forming a 90-dimensional mixed feature. The 90-dimensional mixed feature is fed into an SVM model for classification training, which yields an SVM classification model. After training, the SVM classification model scans the ID card image separated in step 1 and judges whether each pixel is part of a text region, thereby extracting the accurate text region;
(22) after the accurate text region has been extracted, perform character segmentation. The specific steps are:
(221) compute the average Chinese character width $W_1$ and the average digit width $W_2$ in the accurate text region as the segmentation standard;
(222) scan the first text segment and record its width from start point to end point; if the width of the segment is close to the standard average Chinese character width, treat the segment as one Chinese character; otherwise go to step (223);
(223) if the segment width is much smaller than the average digit width, the segment is noise and is discarded; if the segment width is close to the average digit width, pass the segment to a trained SVM digit classifier to determine whether it is a digit; if it is a digit, scan the next segment; otherwise go to step (224);
(224) examine the region to the right of the current segment and try to join the two regions, then judge again whether the joined region is a Chinese character or a digit; if it is still neither, try merging in the region to the right of the previously merged region and judge again.
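The width-based decisions of steps (221) to (224) can be sketched as a small labeling loop. This is one possible reading under stated assumptions: "close to" is taken as a relative tolerance `tol`, "much smaller" as below `noise_ratio` times the digit width, and the trained SVM digit classifier is replaced by a hypothetical stub `is_digit`. None of these thresholds or names come from the patent.

```python
def close_to(value, target, tol):
    """True when value is within a relative tolerance of target."""
    return abs(value - target) <= tol * target


def label_segments(segments, hanzi_w, digit_w, is_digit, tol=0.2, noise_ratio=0.5):
    """Label (start, end) segments as 'hanzi' or 'digit', merging rightward."""
    labels = []
    i = 0
    while i < len(segments):
        start, end = segments[i]
        j = i  # index of the last segment merged into the current candidate
        while True:
            width = end - start
            if close_to(width, hanzi_w, tol):               # step (222)
                labels.append((start, end, "hanzi"))
                break
            if j == i and width < noise_ratio * digit_w:    # step (223): noise
                break
            if close_to(width, digit_w, tol) and is_digit((start, end)):
                labels.append((start, end, "digit"))        # step (223): digit
                break
            # Step (224): give up if nothing is left to merge or it grew too wide.
            if j + 1 >= len(segments) or width > hanzi_w * (1 + tol):
                break
            j += 1
            end = segments[j][1]                            # merge right neighbor
        i = j + 1
    return labels


def stub_digit_classifier(segment):
    """Hypothetical stand-in for the trained SVM digit classifier."""
    return segment == (26, 35)


segments = [(0, 20), (22, 24), (26, 35), (40, 49), (51, 60)]
labels = label_segments(segments, hanzi_w=20, digit_w=10,
                        is_digit=stub_digit_classifier)
```

In this example the 2-pixel segment is dropped as noise, (26, 35) is accepted as a digit, and the two halves (40, 49) and (51, 60) are merged into a single Chinese character because neither half passes on its own.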
Further, step 3 specifically comprises:
(31) build a network model for character recognition, consisting of an input layer, multiple convolutional layers, multiple sampling layers, a fully connected layer, and an output layer;
(32) train the network weight parameters of the network model using the collected training dataset;
(33) recognize the segmented characters using the network model with the trained network weight parameters, and output the result.
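The building blocks named in step (31) can be illustrated with a minimal forward pass through one convolutional layer and one sampling layer. The sketch assumes that the sampling layer is 2*2 max pooling, uses a ReLU activation, and applies a hand-written kernel in place of trained weights; the patent specifies none of these details, and the function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(kernel[j][i] * image[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]


def relu(feature_map):
    """Clamp negative activations to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, len(feature_map[0]) - 1, 2)]
            for y in range(0, len(feature_map) - 1, 2)]


image = [
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5],
]
kernel = [[1, 0], [0, -1]]                 # illustrative, not trained weights
feature_map = conv2d_valid(image, kernel)  # 5x5 input gives a 4x4 map
pooled = max_pool_2x2(relu(feature_map))   # 4x4 map gives a 2x2 map
```

A full model would stack several such convolution and sampling stages, flatten the result into the fully connected layer, and learn the kernels from the training dataset as described in step (32).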
In summary, by adopting the above technical solution, the beneficial effects of the present invention are:
1. The present invention automatically recognizes ID card images against complex backgrounds with short recognition time and high recognition accuracy, and is robust to rotation and distortion.
Brief description of the drawings
Fig. 1 is a detailed flowchart of ID card image detection in the present invention;
Fig. 2 is the overall flowchart of the ID card image text recognition technique of the present invention.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
An ID card image text recognition method based on machine learning, characterized by comprising the following steps:
Step 1: acquire a captured image, preprocess it, and separate the ID card image in the preprocessed image from the complex background. The specific steps are:
(11) preprocess the captured image using Gaussian blur and grayscale conversion;
(12) perform ID card edge detection on the image preprocessed in step (11) using the Canny operator and the Sobel operator;
(13) using binarization and an AND operation, cut out the region enclosed by the ID card edges detected in step (12) to obtain the ID card image region;
(14) use an SVM classifier to select among the contours of the ID card image region and obtain the correct ID card contour image;
(15) correct the irregularly skewed image using the Hough transform and a perspective transform.
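The cut-out in step (13) can be sketched as a masking operation followed by a crop. This reading assumes that the operation paired with binarization is an element-wise AND between the image and a binary mask of the card region, which is an interpretation of the translated text. The mask is supplied by hand here, and the function names are hypothetical.

```python
def and_mask(gray, mask):
    """Keep a pixel where the mask is 1 and zero it elsewhere."""
    return [[v if m else 0 for v, m in zip(gray_row, mask_row)]
            for gray_row, mask_row in zip(gray, mask)]


def bounding_box(mask):
    """Return (top, left, bottom, right), end exclusive, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def crop(image, box):
    """Cut the rectangle described by box out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


gray = [
    [9, 9, 9, 9],
    [9, 50, 60, 70],
    [9, 80, 90, 100],
    [9, 9, 9, 9],
]
mask = [                      # hand-made binary mask of the card region
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
box = bounding_box(mask)
card = crop(and_mask(gray, mask), box)
```

In the method itself the mask would come from the edges detected in step (12), and the cropped region would still need the skew correction of step (15).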
Step 2: perform text region detection on the detected ID card image, then segment the detected text regions to obtain individual characters. The specific steps are:
(21) build a network of three cascaded autoencoders to obtain high-level features, and use these features to judge at the pixel level whether each pixel belongs to a text region, thereby extracting an accurate text region. The specific steps are:
(211) The first autoencoder takes as input 500k blocks of size 5*5 (500k cropped 5*5 image blocks) randomly extracted from all the given training images (about 1000 training images), denoted $x^{(1)}$, where $x^{(1)} \in \mathbb{R}^{75}$, $\mathbb{R}$ denotes the real number space, and $\mathbb{R}^{75}$ means that x is a vector of dimension 75. The number of hidden neurons for the 500k 5*5 input blocks was determined through repeated experiments and finally set to 40. The autoencoder is then trained on the 500k 5*5 blocks with this number of hidden neurons, and after the network converges the output of the encoder part of the first autoencoder is $f^{(1)}$, $f^{(1)} \in \mathbb{R}^{40}$;
(212) The second autoencoder takes as input 500k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (211), denoted $x^{(2)}$. In its defining equation, "+" indicates that $x^{(2)}$ is formed by directly concatenating 9 $x^{(1)}$ and w denotes the weights, with $x^{(2)} \in \mathbb{R}^{360}$. The number of hidden neurons of the second autoencoder is set to 30. The autoencoder is trained on the 500k 3*3 blocks with this number of hidden neurons, and the output of the encoder part of the second autoencoder is $f^{(2)}$, $f^{(2)} \in \mathbb{R}^{30}$;
(213) The third autoencoder takes as input 200k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (212), denoted $x^{(3)}$, $x^{(3)} \in \mathbb{R}^{270}$, where each small block within the 3*3 block overlaps the next small block by 5 pixels. The number of hidden neurons of the third autoencoder is set to 20. After the autoencoder has been trained on the 200k 3*3 blocks with this number of hidden neurons, the output of the encoder part of the third autoencoder is $f^{(3)}$, $f^{(3)} \in \mathbb{R}^{20}$;
(214) Steps (211) to (213) yield three kinds of features for the center point of each 5*5 block. Let $f = f^{(1)} + f^{(2)} + f^{(3)}$, where "+" denotes direct concatenation, forming a 90-dimensional mixed feature. The 90-dimensional mixed feature is fed into an SVM model for classification training, which yields an SVM classification model. After training, the SVM classification model scans the ID card image separated in step 1 and judges whether each pixel is part of a text region, thereby extracting the accurate text region;
(22) after the accurate text region has been extracted, perform character segmentation. The specific steps are:
(221) compute the average Chinese character width $W_1$ and the average digit width $W_2$ in the accurate text region as the segmentation standard;
(222) scan the first text segment and record its width from start point to end point; if the width of the segment is close to the standard average Chinese character width, treat the segment as one Chinese character; otherwise go to step (223);
(223) if the segment width is much smaller than the average digit width, the segment is noise and is discarded; if the segment width is close to the average digit width, pass the segment to a trained SVM digit classifier to determine whether it is a digit; if it is a digit, scan the next segment; otherwise go to step (224);
(224) examine the region to the right of the current segment and try to join the two regions, then judge again whether the joined region is a Chinese character or a digit; if it is still neither, try merging in the region to the right of the previously merged region and judge again.
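The feature dimensions quoted in steps (211) to (214) can be checked with a short bookkeeping sketch. It assumes that the 75 input dimensions come from a 5*5 block with three color channels, which the text does not state explicitly, and it uses zero vectors as placeholders for real encoder outputs. The helper `concat` and the variable names are hypothetical.

```python
def concat(*vectors):
    """Directly concatenate vectors, the role played by "+" in the text."""
    joined = []
    for vector in vectors:
        joined.extend(vector)
    return joined


CHANNELS = 3                       # assumption: RGB input blocks
HIDDEN_1, HIDDEN_2, HIDDEN_3 = 40, 30, 20

dim_x1 = 5 * 5 * CHANNELS          # input of the first autoencoder
dim_x2 = 3 * 3 * HIDDEN_1          # 9 neighboring 40-dimensional codes
dim_x3 = 3 * 3 * HIDDEN_2          # 9 neighboring 30-dimensional codes

# Placeholder encoder outputs for one pixel (zeros instead of trained codes).
f1 = [0.0] * HIDDEN_1
f2 = [0.0] * HIDDEN_2
f3 = [0.0] * HIDDEN_3
f = concat(f1, f2, f3)             # mixed feature passed to the SVM
```

The arithmetic reproduces the stated sizes of 75, 360, 270, and 90. The values 360 = 9 * 40 and 270 = 9 * 30 suggest that each later autoencoder consumes a 3*3 neighborhood of the previous encoder's outputs.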
Step 3: recognize the segmented characters with a character recognition model based on deep learning, and output the recognition result. The specific steps are:
(31) build a network model for character recognition, consisting of an input layer, multiple convolutional layers, multiple sampling layers, a fully connected layer, and an output layer;
(32) train the network weight parameters of the network model using the collected training dataset;
(33) recognize the segmented characters using the network model with the trained network weight parameters, and output the result.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (4)

1. An ID card image text recognition method based on machine learning, characterized by comprising the following steps:
Step 1: preprocess the acquired captured image, and separate the ID card image in the preprocessed image from the complex background;
Step 2: perform text region detection on the detected ID card image, then segment the detected text regions to obtain individual characters;
Step 3: recognize the segmented characters with a character recognition model based on deep learning, and output the recognition result.
2. The ID card image text recognition method based on machine learning according to claim 1, characterized in that step 1 specifically comprises:
(11) preprocess the captured image using Gaussian blur and grayscale conversion;
(12) perform ID card edge detection on the image preprocessed in step (11) using the Canny operator and the Sobel operator;
(13) using binarization and an AND operation, cut out the region enclosed by the ID card edges detected in step (12) to obtain the ID card image region;
(14) use an SVM classifier to select among the contours of the ID card image region and obtain the correct ID card contour image;
(15) correct the irregularly skewed image obtained in step (14) using the Hough transform and a perspective transform.
3. The ID card image text recognition method based on machine learning according to claim 1, characterized in that step 2 specifically comprises:
(21) build a network of three cascaded autoencoders to obtain high-level features, and use these features to judge at the pixel level whether each pixel belongs to a text region, thereby extracting an accurate text region. The specific steps are:
(211) The first autoencoder takes as input 500k blocks of size 5*5 randomly extracted from all the given training images, denoted $x^{(1)}$, where $x^{(1)} \in \mathbb{R}^{75}$, $\mathbb{R}$ denotes the real number space, and $\mathbb{R}^{75}$ means that x is a vector of dimension 75. The number of hidden neurons for the 500k 5*5 input blocks was determined through repeated experiments and finally set to 40. The autoencoder is then trained on the 500k 5*5 blocks with this number of hidden neurons, and after the network converges the output of the encoder part of the first autoencoder is $f^{(1)}$, $f^{(1)} \in \mathbb{R}^{40}$;
(212) The second autoencoder takes as input 500k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (211), denoted $x^{(2)}$. In its defining equation, "+" indicates that $x^{(2)}$ is formed by directly concatenating 9 $x^{(1)}$ and w denotes the weights, with $x^{(2)} \in \mathbb{R}^{360}$. The number of hidden neurons of the second autoencoder is set to 30. The autoencoder is trained on the 500k 3*3 blocks with this number of hidden neurons, and the output of the encoder part of the second autoencoder is $f^{(2)}$, $f^{(2)} \in \mathbb{R}^{30}$;
(213) The third autoencoder takes as input 200k blocks of size 3*3 randomly extracted from the feature map matrix obtained in step (212), denoted $x^{(3)}$, $x^{(3)} \in \mathbb{R}^{270}$, where each small block within the 3*3 block overlaps the next small block by 5 pixels. The number of hidden neurons of the third autoencoder is set to 20. After the autoencoder has been trained on the 200k 3*3 blocks with this number of hidden neurons, the output of the encoder part of the third autoencoder is $f^{(3)}$, $f^{(3)} \in \mathbb{R}^{20}$;
(214) Steps (211) to (213) yield three kinds of features for the center point of each 5*5 block. Let $f = f^{(1)} + f^{(2)} + f^{(3)}$, where "+" denotes direct concatenation, forming a 90-dimensional mixed feature. The 90-dimensional mixed feature is fed into an SVM model for classification training, which yields an SVM classification model. After training, the SVM classification model scans the ID card image separated in step 1 and judges whether each pixel is part of a text region, thereby extracting the accurate text region;
(22) after the accurate text region has been extracted, perform character segmentation. The specific steps are:
(221) compute the average Chinese character width $W_1$ and the average digit width $W_2$ in the accurate text region as the segmentation standard;
(222) scan the first text segment and record its width from start point to end point; if the width of the segment is close to the standard average Chinese character width, treat the segment as one Chinese character; otherwise go to step (223);
(223) if the segment width is much smaller than the average digit width, the segment is noise and is discarded; if the segment width is close to the average digit width, pass the segment to a trained SVM digit classifier to determine whether it is a digit; if it is a digit, scan the next segment; otherwise go to step (224);
(224) examine the region to the right of the current segment and try to join the two regions, then judge again whether the joined region is a Chinese character or a digit; if it is still neither, try merging in the region to the right of the previously merged region and judge again.
4. The ID card image text recognition technique based on machine learning according to claim 1, characterized in that step 3 specifically comprises:
(31) build a network model for character recognition, consisting of an input layer, multiple convolutional layers, multiple sampling layers, a fully connected layer, and an output layer;
(32) train the network weight parameters of the network model using the collected training dataset;
(33) recognize the segmented characters using the network model with the trained network weight parameters, and output the result.
CN201710416957.8A 2017-06-06 2017-06-06 A kind of ID Card Image text recognition method based on machine learning Pending CN107247950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710416957.8A CN107247950A (en) 2017-06-06 2017-06-06 A kind of ID Card Image text recognition method based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710416957.8A CN107247950A (en) 2017-06-06 2017-06-06 A kind of ID Card Image text recognition method based on machine learning

Publications (1)

Publication Number Publication Date
CN107247950A true CN107247950A (en) 2017-10-13

Family

ID=60019054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710416957.8A Pending CN107247950A (en) 2017-06-06 2017-06-06 A kind of ID Card Image text recognition method based on machine learning

Country Status (1)

Country Link
CN (1) CN107247950A (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862314A (en) * 2017-10-25 2018-03-30 武汉楚锐视觉检测科技有限公司 A kind of coding recognition methods and identification device
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN108236784A (en) * 2018-01-22 2018-07-03 腾讯科技(深圳)有限公司 The training method and device of model, storage medium, electronic device
CN108549881A (en) * 2018-05-02 2018-09-18 杭州创匠信息科技有限公司 The recognition methods of certificate word and device
CN108694393A (en) * 2018-05-30 2018-10-23 深圳市思迪信息技术股份有限公司 A kind of certificate image text area extraction method based on depth convolution
CN108875697A (en) * 2018-07-05 2018-11-23 南昌市微轲联信息技术有限公司 Collecting vehicle information method for uploading, device, storage medium and computer equipment
CN108921185A (en) * 2018-05-04 2018-11-30 广州图匠数据科技有限公司 A kind of shelf sales promotion information recognition methods based on image recognition, device and system
CN109241974A (en) * 2018-08-23 2019-01-18 苏州研途教育科技有限公司 A kind of recognition methods and system of text image
CN109376658A (en) * 2018-10-26 2019-02-22 信雅达系统工程股份有限公司 A kind of OCR method based on deep learning
CN109377397A (en) * 2018-11-07 2019-02-22 中国平安财产保险股份有限公司 Insurance business list checking method, device, computer equipment and storage medium
CN109389121A (en) * 2018-10-30 2019-02-26 金现代信息产业股份有限公司 A kind of nameplate recognition methods and system based on deep learning
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN109726719A (en) * 2017-10-31 2019-05-07 比亚迪股份有限公司 Character recognition method, device and computer equipment based on autocoder
CN109886978A (en) * 2019-02-20 2019-06-14 贵州电网有限责任公司 A kind of end-to-end warning information recognition methods based on deep learning
CN109919060A (en) * 2019-02-26 2019-06-21 上海七牛信息技术有限公司 A kind of identity card content identifying system and method based on characteristic matching
CN109916923A (en) * 2019-04-25 2019-06-21 广州宁基智能系统有限公司 A kind of customization plate automatic defect detection method based on machine vision
CN109960707A (en) * 2019-03-20 2019-07-02 上海亿阁信息科技有限公司 A kind of colleges and universities' enrollment data acquisition method and system based on artificial intelligence
CN110020640A (en) * 2019-04-19 2019-07-16 厦门商集网络科技有限责任公司 A kind of method and terminal for correcting ID card information
CN110334142A (en) * 2019-06-18 2019-10-15 北京红云融通技术有限公司 Intelligent data acquisition method, terminal, server and interactive system
CN110378337A (en) * 2019-07-05 2019-10-25 上海交通大学 Metal cutting tool drawing identification information vision input method and system
CN110378338A (en) * 2019-07-11 2019-10-25 腾讯科技(深圳)有限公司 A kind of text recognition method, device, electronic equipment and storage medium
CN110503091A (en) * 2019-07-19 2019-11-26 平安科技(深圳)有限公司 Certificate verification method, device and storage medium based on neural network model
CN110554991A (en) * 2019-09-03 2019-12-10 浙江传媒学院 Method for correcting and managing text picture
CN110674808A (en) * 2019-08-28 2020-01-10 国网天津市电力公司电力科学研究院 Transformer substation pressure plate state intelligent identification method and device
WO2020024939A1 (en) * 2018-08-01 2020-02-06 北京京东尚科信息技术有限公司 Text region identification method and device
CN111079480A (en) * 2018-10-19 2020-04-28 北京金山云网络技术有限公司 Identification method and device of identity card information and terminal equipment
CN111144400A (en) * 2018-11-06 2020-05-12 北京金山云网络技术有限公司 Identification method and device for identity card information, terminal equipment and storage medium
CN111242112A (en) * 2018-11-29 2020-06-05 马上消费金融股份有限公司 Image processing method, identity information processing method and device
CN111325194A (en) * 2018-12-13 2020-06-23 杭州海康威视数字技术股份有限公司 Character recognition method, device and equipment and storage medium
CN111401142A (en) * 2020-02-25 2020-07-10 杭州测质成科技有限公司 Aero-engine blade metal surface etching character recognition method based on deep learning
CN112418158A (en) * 2020-02-11 2021-02-26 支付宝实验室(新加坡)有限公司 System suitable for detecting identity card and device and processing method associated with same
CN113313217A (en) * 2021-07-31 2021-08-27 北京惠朗世纪科技有限公司 Method and system for accurately identifying dip angle characters based on robust template

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020621A (en) * 2012-12-25 2013-04-03 深圳深讯和科技有限公司 Method and device for segmenting Chinese and English mixed typeset character images
CN104680130A (en) * 2015-01-09 2015-06-03 安徽清新互联信息科技有限公司 Chinese character recognition method for identification cards
CN106156712A (en) * 2015-04-23 2016-11-23 信帧电子技术(北京)有限公司 A kind of based on the ID (identity number) card No. recognition methods under natural scene and device
CN106407980A (en) * 2016-11-03 2017-02-15 贺江涛 Image processing-based bank card number recognition method
CN106778748A (en) * 2016-12-30 2017-05-31 江西憶源多媒体科技有限公司 Identity card method for quickly identifying and its device based on artificial neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
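Kai Chen et al.: "Page Segmentation of Historical Document Images with Convolutional Autoencoders", 2015 13th International Conference on Document Analysis and Recognition (ICDAR) *
Liu Fang: "Research on Tibetan character segmentation algorithms in text recognition systems", China Master's Theses Full-text Database, Philosophy and Humanities Series *
Mu Lijuan et al.: "Application of a new-template-based algorithm to license plate character segmentation", Computer Engineering and Applications *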

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862314B (en) * 2017-10-25 2021-04-20 武汉楚锐视觉检测科技有限公司 Code spraying identification method and device
CN107862314A (en) * 2017-10-25 2018-03-30 武汉楚锐视觉检测科技有限公司 A kind of coding recognition methods and identification device
CN109726719A (en) * 2017-10-31 2019-05-07 比亚迪股份有限公司 Character recognition method, device and computer equipment based on autocoder
CN108236784A (en) * 2018-01-22 2018-07-03 腾讯科技(深圳)有限公司 The training method and device of model, storage medium, electronic device
CN108236784B (en) * 2018-01-22 2021-09-24 腾讯科技(深圳)有限公司 Model training method and device, storage medium and electronic device
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN108549881A (en) * 2018-05-02 2018-09-18 杭州创匠信息科技有限公司 The recognition methods of certificate word and device
CN108921185A (en) * 2018-05-04 2018-11-30 广州图匠数据科技有限公司 A kind of shelf sales promotion information recognition methods based on image recognition, device and system
CN108694393A (en) * 2018-05-30 2018-10-23 深圳市思迪信息技术股份有限公司 A kind of certificate image text area extraction method based on depth convolution
CN108875697A (en) * 2018-07-05 2018-11-23 南昌市微轲联信息技术有限公司 Collecting vehicle information method for uploading, device, storage medium and computer equipment
WO2020024939A1 (en) * 2018-08-01 2020-02-06 北京京东尚科信息技术有限公司 Text region identification method and device
US11763167B2 (en) 2018-08-01 2023-09-19 Bejing Jingdong Shangke Information Technology Co, Ltd. Copy area identification method and device
CN109241974A (en) * 2018-08-23 2019-01-18 苏州研途教育科技有限公司 A kind of recognition methods and system of text image
CN109241974B (en) * 2018-08-23 2020-12-01 苏州研途教育科技有限公司 Text image identification method and system
CN109492643B (en) * 2018-10-11 2023-12-19 平安科技(深圳)有限公司 Certificate identification method and device based on OCR, computer equipment and storage medium
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN111079480A (en) * 2018-10-19 2020-04-28 北京金山云网络技术有限公司 Identification method and device of identity card information and terminal equipment
CN109376658A (en) * 2018-10-26 2019-02-22 信雅达系统工程股份有限公司 A kind of OCR method based on deep learning
CN109389121A (en) * 2018-10-30 2019-02-26 金现代信息产业股份有限公司 A kind of nameplate recognition methods and system based on deep learning
CN109389121B (en) * 2018-10-30 2021-11-09 金现代信息产业股份有限公司 Nameplate identification method and system based on deep learning
CN111144400B (en) * 2018-11-06 2024-03-29 北京金山云网络技术有限公司 Identification method and device for identity card information, terminal equipment and storage medium
CN111144400A (en) * 2018-11-06 2020-05-12 北京金山云网络技术有限公司 Identification method and device for identity card information, terminal equipment and storage medium
CN109377397A (en) * 2018-11-07 2019-02-22 中国平安财产保险股份有限公司 Insurance business list checking method, device, computer equipment and storage medium
CN111242112A (en) * 2018-11-29 2020-06-05 马上消费金融股份有限公司 Image processing method, identity information processing method and device
CN111325194B (en) * 2018-12-13 2023-12-29 杭州海康威视数字技术股份有限公司 Character recognition method, device and equipment and storage medium
CN111325194A (en) * 2018-12-13 2020-06-23 杭州海康威视数字技术股份有限公司 Character recognition method, device and equipment and storage medium
CN109886978A (en) * 2019-02-20 2019-06-14 贵州电网有限责任公司 A kind of end-to-end warning information recognition methods based on deep learning
CN109919060A (en) * 2019-02-26 2019-06-21 上海七牛信息技术有限公司 A kind of identity card content identifying system and method based on characteristic matching
CN109960707A (en) * 2019-03-20 2019-07-02 上海亿阁信息科技有限公司 A kind of colleges and universities' enrollment data acquisition method and system based on artificial intelligence
CN110020640B (en) * 2019-04-19 2021-08-24 厦门商集网络科技有限责任公司 Method and terminal for correcting identity card information
CN110020640A (en) * 2019-04-19 2019-07-16 厦门商集网络科技有限责任公司 A kind of method and terminal for correcting ID card information
CN109916923A (en) * 2019-04-25 2019-06-21 广州宁基智能系统有限公司 A kind of customization plate automatic defect detection method based on machine vision
CN110334142A (en) * 2019-06-18 2019-10-15 北京红云融通技术有限公司 Intelligent data acquisition method, terminal, server and interactive system
CN110334142B (en) * 2019-06-18 2022-05-17 北京红云融通技术有限公司 Intelligent data acquisition method, terminal, server and interaction system
CN110378337B (en) * 2019-07-05 2023-03-31 上海交通大学 Visual input method and system for drawing identification information of metal cutting tool
CN110378337A (en) * 2019-07-05 2019-10-25 上海交通大学 Metal cutting tool drawing identification information vision input method and system
CN110378338A (en) * 2019-07-11 2019-10-25 腾讯科技(深圳)有限公司 A kind of text recognition method, device, electronic equipment and storage medium
CN110503091A (en) * 2019-07-19 2019-11-26 平安科技(深圳)有限公司 Certificate verification method, device and storage medium based on neural network model
CN110674808A (en) * 2019-08-28 2020-01-10 国网天津市电力公司电力科学研究院 Transformer substation pressure plate state intelligent identification method and device
CN110554991A (en) * 2019-09-03 2019-12-10 浙江传媒学院 Method for correcting and managing text picture
CN112418158A (en) * 2020-02-11 2021-02-26 支付宝实验室(新加坡)有限公司 System suitable for detecting identity card and device and processing method associated with same
CN111401142A (en) * 2020-02-25 2020-07-10 杭州测质成科技有限公司 Aero-engine blade metal surface etching character recognition method based on deep learning
CN113313217B (en) * 2021-07-31 2021-11-02 北京惠朗世纪科技有限公司 Method and system for accurately identifying dip angle characters based on robust template
CN113313217A (en) * 2021-07-31 2021-08-27 北京惠朗世纪科技有限公司 Method and system for accurately identifying dip angle characters based on robust template

Similar Documents

Publication Publication Date Title
CN107247950A (en) A kind of ID Card Image text recognition method based on machine learning
CN111401372B (en) Method for extracting and identifying image-text information of scanned document
Raghunandan et al. Riesz fractional based model for enhancing license plate detection and recognition
Gebhardt et al. Document authentication using printing technique features and unsupervised anomaly detection
CN105046196B (en) Front truck information of vehicles structuring output method based on concatenated convolutional neural net
CN101142584B (en) Method for facial features detection
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
CN110363199A (en) Certificate image text recognition method and system based on deep learning
Liu et al. A contour-based robust algorithm for text detection in color images
US20070253040A1 (en) Color scanning to enhance bitonal image
CN106446750A (en) Bar code reading method and device
LeBourgeois Robust multifont OCR system from gray level images
He et al. Real-time human face detection in color image
CN102629322B (en) Character feature extraction method based on stroke shape of boundary point and application thereof
CN110766020A (en) System and method for detecting and identifying multi-language natural scene text
CN110728302A (en) Method for identifying color textile fabric tissue based on HSV (hue, saturation, value) and Lab (Lab) color spaces
Belaïd et al. Handwritten and printed text separation in real document
CN104408728A (en) Method for detecting forged images based on noise estimation
Amin et al. A robust system for thresholding and skew detection in mixed text/graphics documents
CN103530625A (en) Optical character recognition method based on digital image processing
CN111339932B (en) Palm print image preprocessing method and system
CN115082776A (en) Electric energy meter automatic detection system and method based on image recognition
KR101151739B1 (en) System for color clustering based on tensor voting and method therefor
CN107609482B (en) Chinese text image inversion discrimination method based on Chinese character stroke characteristics
CN112200789A (en) Image identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171013