CN108898131A - A digital instrument recognition method for complex natural scenes - Google Patents

A digital instrument recognition method for complex natural scenes

Info

Publication number
CN108898131A
CN108898131A (application number CN201810500379.0A)
Authority
CN
China
Prior art keywords
digital instrument
text
feature
digital
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810500379.0A
Other languages
Chinese (zh)
Inventor
张晨民
彭天强
李丙涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Jinhui Computer System Engineering Co Ltd
Original Assignee
Zhengzhou Jinhui Computer System Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Jinhui Computer System Engineering Co Ltd filed Critical Zhengzhou Jinhui Computer System Engineering Co Ltd
Priority to CN201810500379.0A priority Critical patent/CN108898131A/en
Publication of CN108898131A publication Critical patent/CN108898131A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/29Graphical models, e.g. Bayesian networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/02Recognising information on displays, dials, clocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to the field of digital instrument recognition, and in particular to a digital instrument recognition method for complex natural scenes. The method comprises the following steps: locating the digital instrument region in a complex natural scene using the SSD algorithm; extracting features with a ResNet50 neural network and training a bidirectional LSTM network on the extracted features to locate the text lines within the digital instrument region; extracting text-line features with the ResNet50 network, training a BRNN network on those features, and obtaining the digital instrument recognition result with the CTC algorithm. The invention avoids the recognition errors caused by character segmentation in complex backgrounds and improves the recognition accuracy of digital instruments.

Description

A digital instrument recognition method for complex natural scenes
Technical field
The present invention relates to the field of digital instrument recognition technology, and in particular to a digital instrument recognition method for complex natural scenes.
Background art
Digital instrument recognition refers to the technique of automatically locating and recognizing numeric characters in a digital image with a computer; it belongs to the field of pattern recognition. Thanks to their high precision, convenient reading, and easy configuration, digital instruments are widely used in industry and in inspection and measurement.
Currently, digital instrument reading takes two main forms:
1. Manual meter reading. This method relies on human observers to read and record the instrument, a tedious and inefficient process. During manual reading, subjective factors or environmental conditions easily introduce reading errors, lowering measurement accuracy. Moreover, in harsh environments such as chemical plants or power facilities, where toxic gases, extreme temperatures, or high radiation may be present, manual reading of instrument values is not feasible.
2. Machine-vision-based meter reading. This method captures instrument images with a camera and recognizes them with machine-vision algorithms, greatly improving reading efficiency. Replacing manual reading with machine vision not only reduces errors caused by human subjectivity but also eliminates the risk of on-site manual operation. However, existing machine-vision methods can only locate, segment, and recognize digital instruments against simple backgrounds, and most algorithms cannot recognize decimal points or signs. In addition, after locating the instrument region, existing algorithms must segment characters before recognition; in complex backgrounds, character segmentation often fails and some characters go unrecognized. A meter recognition algorithm that handles complex natural scenes without character segmentation is therefore needed.
The Chinese patent application No. 201510920430.X, entitled "A digit recognition method based on intersection-point feature extraction", first binarizes the grayscale image with the maximum between-class variance method to separate the recognition target from the background; it then segments the LED digits to obtain a binary image of the digit display; next, two horizontal lines at 3/4 and 1/4 of the binary image height are scanned from left to right, recording the number of pixel transitions, and a vertical line at 1/2 of the image width is scanned from top to bottom, likewise recording the transitions; finally, the row and column transition counts are compared with those of standard digits and the digit is determined according to a set of logical rules.
The drawback of this method is that it cannot effectively locate digital instruments in complex natural environments; moreover, it recognizes only the digits 0-9 and does not handle decimal points or signs.
The Chinese patent application No. 201611031884.2, entitled "A digital instrument reading image recognition method", extracts a region of interest from a panoramic image by template matching against a pre-calibrated digital instrument image, and then extracts individual character regions and decimal-point candidate regions within the region of interest using the relative positions of the calibrated characters; each character region is recognized with a pre-trained convolutional neural network character model; decimal points are detected in the candidate regions with a pre-trained cascade detector based on block LBP features and an Adaboost classifier, and the detections are post-processed; the final reading is obtained from the character, decimal-point, and sign recognition results.
The drawback of this method is that it recognizes characters individually, so segmentation quality severely affects the recognition result; it can only recognize digital instruments under ideal conditions and cannot effectively handle digital instruments in complex natural scenes.
Summary of the invention
To address the above problems in digital instrument recognition, the present invention proposes a digital instrument recognition method for complex natural scenes that avoids the recognition errors caused by character segmentation in complex backgrounds and improves the recognition accuracy of digital instruments.
To achieve the above goal, the present invention adopts the following technical scheme:
A digital instrument recognition method for complex natural scenes, comprising the following steps:
Step 1: locating the digital instrument region in a complex natural scene using the SSD algorithm;
Step 2: extracting features with a ResNet50 neural network and training a bidirectional LSTM network on the extracted features to obtain the text-line locations within the digital instrument region;
Step 3: extracting text-line features with the ResNet50 neural network, training a BRNN network on the extracted text-line features, and obtaining the digital instrument recognition result with the CTC algorithm.
Further, step 1 comprises:
Step 1.1: preprocessing the sample data to obtain preprocessed sample data;
Step 1.2: building the SSD network model: on the VGG16 base network, converting the 6th and 7th fully connected layers into convolutional layers and adding 3 convolutional layers and an average pooling layer;
Step 1.3: for each post-convolution feature map, using 3 × 3 convolutions to generate the regressed coordinates and class probabilities of the default boxes; the scale of each default box is computed as
s_k = s_min + (s_max − s_min)/(m − 1) × (k − 1), k ∈ [1, m],
where m is the number of feature maps, s_min is the default box scale of the lowest feature map, and s_max that of the topmost feature map;
Step 1.4: defining the pre-annotated pointer instrument regions as ground truth boxes and training the SSD network model with them; using the trained SSD network to accurately locate multi-angle pointer instruments;
The training process is as follows:
the actually selected default boxes (prior boxes) are matched against the ground truth boxes according to IOU; prior boxes with IOU > T1 are positive samples and the rest are negative samples, where T1 is 0.7; the regression losses of the prior boxes are sorted from high to low, and the M prior boxes with the highest regression loss are selected as set D; the positive samples after successful matching form set P; the positive sample set is then P − D ∩ P and the negative sample set is D − D ∩ P; the ratio of positive to negative samples in these sets is 1:4, i.e. M is 1/4 of the number of prior boxes;
the network parameters are adjusted through the loss function to complete the pointer instrument localization;
The loss function is
L(x, c, l, g) = (1/N) (L_conf(x, c) + λ L_loc(x, l, g)),
where c is the class probability, l is the predicted box, and N is the number of prior boxes matched to ground truth boxes; if N = 0, the loss is 0; L_conf is the classification loss; L_loc(x, l, g) is the regression loss between the predicted box l and the g-th ground truth box; λ is the weight of the regression loss, representing its contribution to the overall loss function, and is set to 0.5;
Step 1.5: removing duplicate boxes with the NMS algorithm and selecting the digital instrument region.
Further, step 2 comprises:
Step 2.1: training on the digital instrument sample data with the ResNet50 neural network to obtain a feature map of size W × H × C;
Step 2.2: at each position of the feature map from step 2.1, taking a 3 × 3 × C window feature to predict the class information and location information of the k anchors at that position;
Step 2.3: feeding the 3 × 3 × C features of all windows in each row into the bidirectional LSTM network to obtain an output matrix of size W × 256;
Step 2.4: feeding the W × 256 matrix into a 512-dimensional fully connected layer;
Step 2.5: feeding the fully connected features into the classification and regression layers to obtain the class information and location information of each anchor, thereby obtaining multiple fine-grained digital text detection regions;
Step 2.6: using a threshold method with threshold T2, directly deleting anchors whose scores < T2, where scores is the class probability, and de-duplicating the remaining anchor text boxes with the NMS algorithm; T2 is 0.8;
Step 2.7: merging text lines with the text construction algorithm;
Step 2.8: fine-tuning the horizontal positions of the predicted text boxes with the side-refinement algorithm to obtain the locations of the numeric-character text lines.
Further, step 3 comprises:
Step 3.1: preprocessing the digital text-line image; for an image of size M × N, setting the scaling ratio of M and scaling N by the same ratio;
Step 3.2: feeding the preprocessed sample data into the ResNet50 neural network for feature extraction to obtain a feature map, and converting the feature map into feature vectors column by column;
Step 3.3: recognizing the feature vectors with the bidirectional LSTM algorithm of the BRNN network to obtain the class sequence of each feature column;
Step 3.4: solving for the optimal class sequence with the CTC algorithm to obtain the text-line recognition result.
Compared with the prior art, the present invention has the following beneficial effects:
The invention first locates the digital instrument region with the SSD algorithm, then locates the digital text lines with a ResNet50 neural network combined with a BLSTM network, and finally recognizes the digital text lines with the ResNet50 network combined with a BRNN network, selecting the optimal recognition result with the CTC algorithm. Using the SSD algorithm for instrument-region localization improves the localization accuracy of digital instruments in complex natural environments, and recognizing whole text lines with the BLSTM algorithm, without first segmenting individual characters, avoids the recognition errors caused by character segmentation in complex backgrounds and improves the recognition accuracy of digital instruments.
The digital instrument recognition algorithm of the invention is trained end to end; it can accept image input of any scene and any size and recognize character strings of arbitrary length. The invention can effectively recognize digital instrument readings in natural scenes.
Brief description of the drawings
Fig. 1 is the basic flow chart of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 2 is the overall framework of the digital instrument recognition method for complex natural scenes according to another embodiment of the present invention.
Fig. 3 is a schematic diagram of the SSD network structure of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the digital instrument region localization network structure of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the digital text-line localization network structure of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of the numeric-character region recognition network structure of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 7 shows one of the digital instrument recognition test results of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Fig. 8 shows another digital instrument recognition test result of the digital instrument recognition method for complex natural scenes according to an embodiment of the present invention.
Specific embodiment
The present invention is further explained below with reference to the accompanying drawings and specific embodiments:
Embodiment one:
As shown in Fig. 1, a digital instrument recognition method for complex natural scenes according to the invention comprises the following steps:
Step S101: locating the digital instrument region in the complex natural scene using the SSD algorithm;
Step S102: extracting features with a ResNet50 neural network and training a bidirectional LSTM network on the extracted features to obtain the text-line locations within the digital instrument region;
Step S103: extracting text-line features with the ResNet50 neural network, training a BRNN network on the extracted text-line features, and obtaining the digital instrument recognition result with the CTC algorithm.
Embodiment two:
As shown in Fig. 2, another digital instrument recognition method for complex natural scenes according to the invention comprises:
The neural network model of each stage must be trained offline in advance. Before offline training, the digital instrument images collected in natural scenes are manually annotated, i.e. the position of the digital instrument in each image, the positions of the text lines within the instrument, and the recognition result of each text line are labeled. The networks are trained with the labeled data, and the offline-trained networks are then applied to test samples to perform digital instrument region localization, digital text-line localization, and numeric character recognition. These three processes are described in detail below:
Step S201: digital instrument region localization.
Digital instrument region localization is the process of locating the digital instrument region in the sample data with the trained SSD network. SSD is a method that uses a single deep neural network model for object detection and recognition. As shown in Fig. 3, the SSD network first uses the first 5 layers of the VGG16 base network, then converts the fc6 and fc7 layers into two convolutional layers, and additionally adds 3 convolutional layers and an average pooling layer. Feature maps at different levels are used to predict the default-box offsets and the scores of the different classes, and the final detection result is obtained by NMS.
The network structure for digital instrument region localization is shown in Fig. 4. The specific steps are as follows:
Step S2011: preprocess the sample data to obtain sample data of size 300 × 300 × 3.
Step S2012: build the SSD network model, where the selected feature maps are of sizes 38 × 38, 19 × 19, 10 × 10, 5 × 5, 3 × 3, and 1 × 1, corresponding to block4, block7, block8, block9, block10, and block11 respectively. For each feature map, 3 × 3 convolutions generate the 4 regressed coordinates and the 8 class probabilities of each default box.
Step S2013: feed the preprocessed sample data into the first 5 layers of the VGG16 neural network for convolution, generate default boxes on the resulting convolutional feature layer, then continue the convolution operations and extract default boxes from each subsequent convolutional feature layer in turn. Each convolutional feature map generates k default boxes according to different scales and aspect ratios.
The scale of each default box is computed as
s_k = s_min + (s_max − s_min)/(m − 1) × (k − 1), k ∈ [1, m],
where m is the number of feature maps, s_min is the default box scale of the lowest feature map, and s_max that of the topmost feature map. The original aspect-ratio set a_r of the default boxes is {1, 2, 3, 1/2, 1/3}, so each default box has width s_k·√a_r and height s_k/√a_r. For the aspect ratio of 1, an additional default box with scale s'_k = √(s_k·s_{k+1}) is added, so that each point of every feature map generates 6 default boxes. The center of each default box is set to ((i + 0.5)/|f_k|, (j + 0.5)/|f_k|), where |f_k| is the size of the k-th feature map and i, j index positions on it.
Since the values of s_min and s_max directly affect the computational cost of the instrument localization algorithm, the area occupied by the dial region in the sample instrument images was analyzed statistically, giving s_min = 0.1 and s_max = 0.25. Based on observation of the instrument shapes in the digital instrument images, and in order to cover the various digital instrument dials, the aspect-ratio set of the default boxes was set to {1, 2, 1/2}, further reducing the computational cost of the instrument localization algorithm. Extensive training and testing verified that with s_min = 0.1, s_max = 0.25, and aspect ratios {1, 2, 1/2}, the digital instrument recognition algorithm achieves the lowest time complexity without any loss of precision.
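For illustration only, the following is a minimal NumPy sketch of how default-box scales, sizes, and centers of the kind described above could be generated; the feature-map sizes, s_min = 0.1, s_max = 0.25, and the aspect-ratio set {1, 2, 1/2} follow the embodiment, while the function name and output layout are assumptions rather than part of the patent:

```python
import numpy as np

def default_boxes(feat_sizes=(38, 19, 10, 5, 3, 1),
                  s_min=0.1, s_max=0.25, ratios=(1.0, 2.0, 0.5)):
    """Generate SSD-style default boxes (cx, cy, w, h), relative to the image size.

    With the reduced ratio set {1, 2, 1/2} this yields 4 boxes per location
    (3 ratios plus one extra box for ratio 1).
    """
    m = len(feat_sizes)
    # s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1), here with 0-based k
    scales = [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]
    scales.append(s_max)                               # so s_{k+1} exists for the last map
    boxes = []
    for k, f in enumerate(feat_sizes):
        s_k = scales[k]
        sizes = [(s_k * np.sqrt(r), s_k / np.sqrt(r)) for r in ratios]
        sizes.append((np.sqrt(s_k * scales[k + 1]),) * 2)   # extra box for ratio 1
        for i in range(f):
            for j in range(f):
                cx, cy = (j + 0.5) / f, (i + 0.5) / f        # box center on the k-th map
                boxes.extend((cx, cy, w, h) for (w, h) in sizes)
    return np.array(boxes)
```

Keeping the scales in relative (0–1) coordinates lets the same boxes be applied to any input resolution before matching against ground truth.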
Step S2014: define the manually pre-annotated digital instrument regions in the natural-scene images as ground truth boxes and train the SSD network with them, so that the network can accurately locate the digital instrument region in complex natural scenes, i.e. the classification confidence of the default boxes is guaranteed while the prior boxes are regressed as closely as possible to the ground truth boxes. The ground truth boxes consist of the correctly annotated true position data (ground truth).
First, the positive and negative samples must be determined. The prior boxes are matched against the ground truth boxes according to IOU (Jaccard overlap): prior boxes with IOU > T1 are positive samples and the rest are negative samples. Because the negative samples generated this way far outnumber the positive samples, training would be hard to converge. The regression losses of the prior boxes are therefore sorted from high to low, and the M prior boxes with the highest regression loss form set D. If the successfully matched positive samples form set P, the positive sample set is P − D ∩ P and the negative sample set is D − D ∩ P. The invention controls the ratio of positive to negative samples through the choice of M. Since the values of T1 and M are critical for accurately locating the instrument dial region, extensive comparative experiments on the sample data determined T1 = 0.7 and a positive-to-negative ratio of 1:4, with which the instrument localization algorithm locates the dial region most completely and converges fastest.
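A rough sketch, under the embodiment's description, of how this positive/negative sample selection and hard-negative mining step might be coded; the per-box IoU values and regression losses are assumed to be available already, and all names and the exact bookkeeping are illustrative:

```python
import numpy as np

def select_training_samples(ious, reg_losses, t1=0.7, neg_per_pos=4):
    """Pick positive and hard negative prior boxes.

    ious:       (num_priors,) best IoU of each prior box with any ground truth box
    reg_losses: (num_priors,) current regression loss of each prior box
    """
    pos = np.where(ious > t1)[0]                 # set P: IoU > T1 -> positive samples
    m = neg_per_pos * len(pos)                   # keep positives:negatives at roughly 1:4
    hardest = np.argsort(-reg_losses)[:m]        # set D: highest regression loss
    neg = np.setdiff1d(hardest, pos)             # D - D ∩ P -> negative sample set
    pos = np.setdiff1d(pos, hardest)             # P - D ∩ P -> positive sample set
    return pos, neg
```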
Then the network parameters are adjusted with the loss function so that the default boxes come as close as possible to the ground truth boxes. The loss function, i.e. the sum of the regression loss loss(loc) of the matched default boxes and the classification loss loss(conf), must first be computed. It is defined as
L(x, c, l, g) = (1/N) (L_conf(x, c) + λ L_loc(x, l, g)),
where c is the class probability, l is the predicted box, and N is the number of prior boxes matched to ground truth boxes; if N = 0, the loss is 0. L_conf is the classification loss, measured with the softmax loss function; L_loc(x, l, g) is the regression loss between the predicted box l and the g-th ground truth box; λ is the weight of the regression loss, representing its contribution to the overall loss function. The value of λ has a decisive influence on the instrument localization result; considering the various complex background factors of instruments in natural scenes, extensive training experiments and cross-validation showed that λ = 0.5 gives the best localization result.
The regression loss L_loc is defined as
L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h}} x_ij^p · smooth_L1(l_i^m − ĝ_j^m),
where l is the predicted box, g is the ground truth, i.e. the actual position, encoded relative to the default box d (default bounding box), and p is the class corresponding to x.
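The combined loss can be illustrated with a minimal PyTorch-style sketch; the softmax classification term, the smooth-L1 regression term over the matched boxes, and λ = 0.5 follow the description above, while tensor names and shapes are assumptions:

```python
import torch
import torch.nn.functional as F

def ssd_loss(cls_logits, loc_preds, cls_targets, loc_targets, pos_mask, lam=0.5):
    """cls_logits: (P, num_classes); loc_preds, loc_targets: (P, 4);
    cls_targets: (P,) class indices; pos_mask: (P,) bool mask of matched prior boxes."""
    n = pos_mask.sum()
    if n == 0:                                       # no matched prior box -> loss is 0
        return cls_logits.sum() * 0.0
    conf = F.cross_entropy(cls_logits, cls_targets, reduction='sum')    # L_conf (softmax loss)
    loc = F.smooth_l1_loss(loc_preds[pos_mask], loc_targets[pos_mask],
                           reduction='sum')                              # L_loc over positives only
    return (conf + lam * loc) / n                    # (1/N)(L_conf + λ·L_loc), λ = 0.5
```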
Step S2015: finally, remove duplicate boxes with the NMS algorithm and select the digital instrument region.
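For reference, a compact NumPy sketch of the standard NMS procedure used here to discard duplicate detection boxes (the IoU threshold and variable names are illustrative, not specified by the patent):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """boxes: (N, 4) as (x1, y1, x2, y2); returns indices of the kept boxes."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top-scoring box with the remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou < iou_thresh]          # drop boxes that overlap too much
    return keep
```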
Step S202: digital text-line localization.
Digital text-line localization is the process of locating the character regions inside the digital instrument region detected by the SSD network. First, convolution is applied to the sample image with the ResNet50 neural network, and feature vectors are obtained from the convolutional layers; then features are extracted with a 3 × 3 sliding window and fed into the BLSTM algorithm for training, whose output vectors are fed into the FC (fully connected) layer, yielding three classification/regression layers that determine the class and the position and size of the character text boxes. Finally, the detected character text boxes are de-duplicated with the NMS algorithm and merged into text lines, determining the positions and sizes of the digital text lines in the sample data.
The network structure for digital text-line localization is shown in Fig. 5. The specific steps are as follows:
Step S2021: train on the digital instrument sample data with the ResNet50 neural network to obtain a feature map of size W × H × C;
Step S2022: at each position of the feature map, take a 3 × 3 × C window feature to predict the class information and location information of the k anchors at that position; by analyzing the shapes of the digital text lines in the instrument images through repeated experiments, k = 1 was determined, with an anchor width of 16 and an anchor aspect ratio of 6:1;
Step S2023: feed the 3 × 3 × C features of all windows in each row into the bidirectional LSTM (the BLSTM network) to obtain an output matrix of size W × 256;
Step S2024: feed the W × 256 matrix into the FC layer, i.e. a 512-dimensional fully connected layer;
Step S2025: feed the fully connected features into three classification/regression layers, namely a 2k vertical coordinates layer, a k side-refinement layer, and a 2k scores layer, where the 2k vertical coordinates and the k side-refinement values regress the location information of the k anchors and the 2k scores give their class information, yielding multiple fine-grained digital text detection regions;
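Steps S2021 to S2025 can be summarized in a small PyTorch sketch; the 3 × 3 window features, the W × 256 BLSTM output, the 512-dimensional FC layer, the three output heads, and k = 1 follow the embodiment, while the backbone channel count and the class/module names are assumptions:

```python
import torch
import torch.nn as nn

class TextLineHead(nn.Module):
    """CTPN-style text-line head: 3x3 window features -> BLSTM over each row -> FC -> 3 outputs."""
    def __init__(self, c_in=1024, k=1):
        super().__init__()
        self.window = nn.Conv2d(c_in, 512, kernel_size=3, padding=1)            # 3x3xC window feature
        self.blstm = nn.LSTM(512, 128, bidirectional=True, batch_first=True)    # -> W x 256 per row
        self.fc = nn.Linear(256, 512)                                            # 512-d FC layer
        self.vertical = nn.Linear(512, 2 * k)   # 2k vertical coordinates (anchor center y, height)
        self.side = nn.Linear(512, k)           # k side-refinement offsets
        self.scores = nn.Linear(512, 2 * k)     # 2k text / non-text scores

    def forward(self, feat):                    # feat: (B, C, H, W) backbone feature map
        x = self.window(feat)                   # (B, 512, H, W)
        b, c, h, w = x.shape
        x = x.permute(0, 2, 3, 1).reshape(b * h, w, c)   # each row becomes a length-W sequence
        x, _ = self.blstm(x)                    # (B*H, W, 256)
        x = self.fc(x)                          # (B*H, W, 512)
        return self.vertical(x), self.side(x), self.scores(x)
```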
Step S2026: using a threshold method with threshold T2, directly delete anchors whose scores < T2, where scores is the class probability, then de-duplicate the remaining anchor text boxes with the NMS algorithm. Since the value of T2 is the key factor balancing the recall and precision of text-line localization, analysis of repeated experiments showed that T2 = 0.8 gives the best localization result.
Step S2027: use the text construction algorithm to successively merge adjacent pairs of digital character text boxes until no further merging is possible, i.e. merge them into text lines;
Step S2028: use side-refinement (edge refinement) to obtain relative offsets and fine-tune the horizontal positions of the predicted text boxes with them, obtaining the locations of the numeric-character text lines. Here x_side is the x coordinate of the predicted box edge closest to the actual digital text line, x*_side is the x coordinate of the corresponding edge of the real digital text line, computed in advance from the bounding box of the real text line and the anchor position, c_x^a is the x coordinate of the anchor center, and w_a is the anchor width, fixed at 16 pixels. The relative offset is
o = (x_side − c_x^a) / w_a,   o* = (x*_side − c_x^a) / w_a.
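A small sketch of the relative-offset encoding above and its inverse, as it could be used to refine the left and right edges of a predicted text line (function names are illustrative):

```python
def encode_side_offset(x_side, anchor_cx, anchor_w=16.0):
    """o = (x_side - c_x^a) / w_a: edge position expressed relative to the anchor."""
    return (x_side - anchor_cx) / anchor_w

def refine_side(pred_offset, anchor_cx, anchor_w=16.0):
    """Invert the encoding: move the text-line edge to the predicted horizontal position."""
    return pred_offset * anchor_w + anchor_cx
```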
Step S203: numeric character recognition.
Numeric-character region recognition is the process of recognizing the digital text lines in the sample data. First, the feature vectors of the digital text-line image are extracted with the ResNet50 neural network; then the feature vectors are recognized with the bidirectional LSTM algorithm (the BLSTM algorithm) to obtain the probability distribution of each feature column; finally, the optimal label sequence is solved with the CTC algorithm and the forward-backward algorithm, giving the digital text-line recognition result.
The network structure for numeric-character region recognition is shown in Fig. 6. The specific recognition steps are as follows:
Step S2031: preprocess the digital text-line image; for example, if the digital text-line image size is M × N, set M' = 16 after scaling and scale N by the same ratio as M;
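A one-function illustration of this aspect-preserving resize using OpenCV; the target height of 16 follows the embodiment, everything else is an assumption:

```python
import cv2

def resize_text_line(img, target_h=16):
    """Scale the text-line image to a fixed height, keeping its aspect ratio."""
    h, w = img.shape[:2]
    scale = target_h / h
    return cv2.resize(img, (max(1, int(round(w * scale))), target_h))
```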
Step S2032: feed the preprocessed sample data into the ResNet50 neural network for feature extraction to obtain a feature map, then convert the feature map into feature vectors column by column;
Step S2033: recognize the feature vectors with the bidirectional LSTM algorithm of the RNN network to obtain the label sequence, i.e. the class sequence, of each feature column. The number of output classes of the network must be specified in advance; based on the characteristics of digital instrument readings, the invention sets the number of classes to 14, comprising the digits 0-9, the decimal point, '+', '-', and background.
Step S2034: solve for the optimal label sequence with the CTC algorithm to obtain the text-line recognition result. The specific steps are:
1) Input the obtained label sequence y = y_1, y_2, ..., y_T into the CTC algorithm, where T is the sequence length and each y_t is a probability distribution over the set L' = L ∪ {blank}, L being the set of all possible labels in the task and blank the blank label.
2) Through the sequence-to-sequence mapping function β, repeated labels are first merged and blank labels removed, where π ∈ L'^T, T is the length, and β(π) = l. The conditional probability of a label sequence l is defined as the sum of the conditional probabilities of all paths π mapped to it, i.e.
p(l | y) = Σ_{π: β(π)=l} p(π | y), with p(π | y) = Π_{t=1}^{T} y^t_{π_t},
which is computed efficiently with the forward-backward algorithm.
3) For both the lexicon-free and the lexicon-based model, the label sequence π with the highest predicted probability is chosen and transcribed.
Lexicon-free transcription: l* ≈ β(argmax_π p(π | y));
Lexicon-based transcription: l* = argmax_{l ∈ N_δ(l')} p(l | y), where l' is the lexicon-free prediction;
The candidate set N_δ(l') is computed efficiently with a BK-tree data structure.
4) Network training: the training dataset is X = {I_i, l_i}, where I_i is a training digital text-line image and l_i the corresponding true digital text-line sequence. The objective is to minimize the negative log-likelihood of the true digital text-line sequences, i.e.
O = − Σ_{(I_i, l_i) ∈ X} log p(l_i | y_i),
where y_i is the sequence produced by the network from I_i.
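A minimal PyTorch sketch of this training objective using the built-in CTC loss; the 14 output classes follow the embodiment, while treating the background class as the CTC blank and all tensor and function names are assumptions:

```python
import torch
import torch.nn as nn

# 14 output classes as in the embodiment (digits 0-9, decimal point, '+', '-', background);
# using the background class as the CTC blank at index 0 is an assumption made here.
NUM_CLASSES = 14
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

def ctc_objective(log_probs, targets, input_lengths, target_lengths):
    """log_probs: (T, B, NUM_CLASSES) log-softmax over the per-column class distributions;
    targets: concatenated label indices of the true text lines; lengths given per sample.
    Returns the negative log-likelihood -Σ log p(l_i | y_i), averaged over the batch."""
    assert log_probs.shape[-1] == NUM_CLASSES
    return ctc_loss(log_probs, targets, input_lengths, target_lengths)
```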
The above three parts, digital instrument region localization, digital text-line localization, and numeric character recognition, are combined to realize digital instrument recognition in natural scenes. The digital instrument recognition algorithm of the invention has strong recognition capability for digital instrument readings in natural scenes and is a digital instrument recognition method applicable under natural conditions.
The digital instrument recognition algorithm of the invention is trained end to end; it can accept image input of any scene and any size and recognize character strings of arbitrary length. Figs. 7 and 8 show test results of the invention on digital instruments in natural scenes, from which it can be seen that the invention can effectively recognize digital instrument readings in natural scenes.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (4)

1. A digital instrument recognition method for complex natural scenes, characterized by comprising the following steps:
Step 1: locating the digital instrument region in a complex natural scene using the SSD algorithm;
Step 2: extracting features with a ResNet50 neural network and training a bidirectional LSTM network on the extracted features to obtain the text-line locations within the digital instrument region;
Step 3: extracting text-line features with the ResNet50 neural network, training a BRNN network on the extracted text-line features, and obtaining the digital instrument recognition result with the CTC algorithm.
2. The digital instrument recognition method for complex natural scenes according to claim 1, characterized in that step 1 comprises:
Step 1.1: preprocessing the sample data to obtain preprocessed sample data;
Step 1.2: building the SSD network model: on the VGG16 base network, converting the 6th and 7th fully connected layers into convolutional layers and adding 3 convolutional layers and an average pooling layer;
Step 1.3: for each post-convolution feature map, using 3 × 3 convolutions to generate the regressed coordinates and class probabilities of the default boxes; the scale of each default box is computed as
s_k = s_min + (s_max − s_min)/(m − 1) × (k − 1), k ∈ [1, m],
where m is the number of feature maps, s_min is the default box scale of the lowest feature map, and s_max that of the topmost feature map;
Step 1.4: defining the pre-annotated pointer instrument regions as ground truth boxes and training the SSD network model with them; using the trained SSD network to accurately locate multi-angle pointer instruments;
The training process is as follows:
the actually selected default boxes (prior boxes) are matched against the ground truth boxes according to IOU; prior boxes with IOU > T1 are positive samples and the rest are negative samples, where T1 is 0.7; the regression losses of the prior boxes are sorted from high to low, and the M prior boxes with the highest regression loss are selected as set D; the positive samples after successful matching form set P; the positive sample set is then P − D ∩ P and the negative sample set is D − D ∩ P; the ratio of positive to negative samples in these sets is 1:4, i.e. M is 1/4 of the number of prior boxes;
the network parameters are adjusted through the loss function to complete the pointer instrument localization;
The loss function is
L(x, c, l, g) = (1/N) (L_conf(x, c) + λ L_loc(x, l, g)),
where c is the class probability, l is the predicted box, and N is the number of prior boxes matched to ground truth boxes; if N = 0, the loss is 0; L_conf is the classification loss; L_loc(x, l, g) is the regression loss between the predicted box l and the g-th ground truth box; λ is the weight of the regression loss, representing its contribution to the overall loss function, and is set to 0.5;
Step 1.5: removing duplicate boxes with the NMS algorithm and selecting the digital instrument region.
3. The digital instrument recognition method for complex natural scenes according to claim 1, characterized in that step 2 comprises:
Step 2.1: training on the digital instrument sample data with the ResNet50 neural network to obtain a feature map of size W × H × C;
Step 2.2: at each position of the feature map from step 2.1, taking a 3 × 3 × C window feature to predict the class information and location information of the k anchors at that position;
Step 2.3: feeding the 3 × 3 × C features of all windows in each row into the bidirectional LSTM network to obtain an output matrix of size W × 256;
Step 2.4: feeding the W × 256 matrix into a 512-dimensional fully connected layer;
Step 2.5: feeding the fully connected features into the classification and regression layers to obtain the class information and location information of each anchor, thereby obtaining multiple fine-grained digital text detection regions;
Step 2.6: using a threshold method with threshold T2, directly deleting anchors whose scores < T2, where scores is the class probability, and de-duplicating the remaining anchor text boxes with the NMS algorithm; T2 is 0.8;
Step 2.7: merging text lines with the text construction algorithm;
Step 2.8: fine-tuning the horizontal positions of the predicted text boxes with the side-refinement algorithm to obtain the locations of the numeric-character text lines.
4. The digital instrument recognition method for complex natural scenes according to claim 1, characterized in that step 3 comprises:
Step 3.1: preprocessing the digital text-line image; for an image of size M × N, setting the scaling ratio of M and scaling N by the same ratio;
Step 3.2: feeding the preprocessed sample data into the ResNet50 neural network for feature extraction to obtain a feature map, and converting the feature map into feature vectors column by column;
Step 3.3: recognizing the feature vectors with the bidirectional LSTM algorithm of the BRNN network to obtain the class sequence of each feature column;
Step 3.4: solving for the optimal class sequence with the CTC algorithm to obtain the text-line recognition result.
CN201810500379.0A 2018-05-23 2018-05-23 A digital instrument recognition method for complex natural scenes Pending CN108898131A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810500379.0A CN108898131A (en) 2018-05-23 2018-05-23 A digital instrument recognition method for complex natural scenes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810500379.0A CN108898131A (en) 2018-05-23 2018-05-23 A digital instrument recognition method for complex natural scenes

Publications (1)

Publication Number Publication Date
CN108898131A true CN108898131A (en) 2018-11-27

Family

ID=64343094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810500379.0A Pending CN108898131A (en) 2018-05-23 2018-05-23 A digital instrument recognition method for complex natural scenes

Country Status (1)

Country Link
CN (1) CN108898131A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103164692A (en) * 2012-12-03 2013-06-19 北京科技大学 Special vehicle instrument automatic identification system and algorithm based on computer vision
CN106529537A (en) * 2016-11-22 2017-03-22 亿嘉和科技股份有限公司 Digital meter reading image recognition method
CN108052943A (en) * 2017-12-29 2018-05-18 杭州占峰科技有限公司 A kind of instrument character wheel recognition methods and equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BAOGUANG SHI等: "An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
WEI LIU等: "SSD: Single Shot MultiBox Detector", 《ARXIV》 *
ZHI TIAN等: "Detecting Text in Natural Image with Connectionist Text Proposal Network", 《ARXIV》 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753961A (en) * 2018-12-26 2019-05-14 国网新疆电力有限公司乌鲁木齐供电公司 A kind of substation's spacer units unlocking method and system based on image recognition
CN109918987B (en) * 2018-12-29 2021-05-14 中国电子科技集团公司信息科学研究院 Video subtitle keyword identification method and device
CN109918987A (en) * 2018-12-29 2019-06-21 中国电子科技集团公司信息科学研究院 A kind of video caption keyword recognition method and device
CN113538407B (en) * 2018-12-29 2022-10-14 北京市商汤科技开发有限公司 Anchor point determining method and device, electronic equipment and storage medium
CN113538407A (en) * 2018-12-29 2021-10-22 北京市商汤科技开发有限公司 Anchor point determining method and device, electronic equipment and storage medium
CN109858474A (en) * 2019-01-08 2019-06-07 北京全路通信信号研究设计院集团有限公司 A kind of detection of transformer oil surface temperature controller and recognition methods
CN109858474B (en) * 2019-01-08 2021-10-19 北京全路通信信号研究设计院集团有限公司 Detection and identification method for transformer oil surface temperature controller
CN109886174A (en) * 2019-02-13 2019-06-14 东北大学 A kind of natural scene character recognition method of warehouse shelf Sign Board Text region
CN110059539A (en) * 2019-02-27 2019-07-26 天津大学 A kind of natural scene text position detection method based on image segmentation
CN109948469A (en) * 2019-03-01 2019-06-28 吉林大学 The automatic detection recognition method of crusing robot instrument based on deep learning
CN110059694A (en) * 2019-04-19 2019-07-26 山东大学 The intelligent identification Method of lteral data under power industry complex scene
CN110175520A (en) * 2019-04-22 2019-08-27 南方电网科学研究院有限责任公司 Text position detection method, device and the storage medium of robot inspection image
CN110135431A (en) * 2019-05-16 2019-08-16 深圳市信联征信有限公司 The automatic identifying method and system of business license
CN111967287A (en) * 2019-05-20 2020-11-20 江苏金鑫信息技术有限公司 Pedestrian detection method based on deep learning
CN110399882A (en) * 2019-05-29 2019-11-01 广东工业大学 A kind of character detecting method based on deformable convolutional neural networks
CN110532855A (en) * 2019-07-12 2019-12-03 西安电子科技大学 Natural scene certificate image character recognition method based on deep learning
CN110532855B (en) * 2019-07-12 2022-03-18 西安电子科技大学 Natural scene certificate image character recognition method based on deep learning
CN110443159A (en) * 2019-07-17 2019-11-12 新华三大数据技术有限公司 Digit recognition method, device, electronic equipment and storage medium
CN110929805A (en) * 2019-12-05 2020-03-27 上海肇观电子科技有限公司 Neural network training method, target detection device, circuit and medium
CN110929805B (en) * 2019-12-05 2023-11-10 上海肇观电子科技有限公司 Training method, target detection method and device for neural network, circuit and medium
CN111428727A (en) * 2020-03-27 2020-07-17 华南理工大学 Natural scene text recognition method based on sequence transformation correction and attention mechanism
CN111428727B (en) * 2020-03-27 2023-04-07 华南理工大学 Natural scene text recognition method based on sequence transformation correction and attention mechanism
CN111553345A (en) * 2020-04-22 2020-08-18 上海浩方信息技术有限公司 Method for realizing meter pointer reading identification processing based on Mask RCNN and orthogonal linear regression
CN111553345B (en) * 2020-04-22 2023-10-20 上海浩方信息技术有限公司 Method for realizing meter pointer reading identification processing based on Mask RCNN and orthogonal linear regression
CN116958998A (en) * 2023-09-20 2023-10-27 四川泓宝润业工程技术有限公司 Digital instrument reading identification method based on deep learning
CN116958998B (en) * 2023-09-20 2023-12-26 四川泓宝润业工程技术有限公司 Digital instrument reading identification method based on deep learning

Similar Documents

Publication Publication Date Title
CN108898131A (en) A digital instrument recognition method for complex natural scenes
Xu et al. Detecting tiny objects in aerial images: A normalized Wasserstein distance and a new benchmark
Liao et al. Textboxes: A fast text detector with a single deep neural network
CN109271895B (en) Pedestrian re-identification method based on multi-scale feature learning and feature segmentation
CN111126482B (en) Remote sensing image automatic classification method based on multi-classifier cascade model
CN111680706B (en) Dual-channel output contour detection method based on coding and decoding structure
Hoque et al. Real time bangladeshi sign language detection using faster r-cnn
CN114582470B (en) Model training method and device and medical image report labeling method
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
Shen et al. In teacher we trust: Learning compressed models for pedestrian detection
CN107430678A (en) Use the inexpensive face recognition of Gauss received field feature
CN102708384B (en) Bootstrapping weak learning method based on random fern and classifier thereof
CN110929640A (en) Wide remote sensing description generation method based on target detection
Tereikovskyi et al. The method of semantic image segmentation using neural networks
CN113158777A (en) Quality scoring method, quality scoring model training method and related device
CN112132257A (en) Neural network model training method based on pyramid pooling and long-term memory structure
CN111144462A (en) Unknown individual identification method and device for radar signals
Sun et al. Image target detection algorithm compression and pruning based on neural network
WO2020199498A1 (en) Palmar digital vein comparison method and device, computer apparatus, and storage medium
CN113762151A (en) Fault data processing method and system and fault prediction method
CN110020638A (en) Facial expression recognizing method, device, equipment and medium
CN114387524B (en) Image identification method and system for small sample learning based on multilevel second-order representation
CN111144466A (en) Image sample self-adaptive depth measurement learning method
CN115424000A (en) Pointer instrument identification method, system, equipment and storage medium
CN115936003A (en) Software function point duplicate checking method, device, equipment and medium based on neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181127

RJ01 Rejection of invention patent application after publication