CN107368831A - Method for recognizing English words and digits in natural scene images - Google Patents

Method for recognizing English words and digits in natural scene images

Info

Publication number
CN107368831A
CN107368831A (application number CN201710592890.3A)
Authority
CN
China
Prior art keywords
layer
image
character
short
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710592890.3A
Other languages
Chinese (zh)
Other versions
CN107368831B (en)
Inventor
张军
涂丹
李硕豪
陈旭
雷军
郭强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201710592890.3A priority Critical patent/CN107368831B/en
Publication of CN107368831A publication Critical patent/CN107368831A/en
Application granted granted Critical
Publication of CN107368831B publication Critical patent/CN107368831B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a method for recognizing English words and digits in natural scene images. The problem of recognizing English words and digits in natural scenes is divided into three steps: feature extraction, feature focusing, and feature recognition. A convolutional neural network extracts features from the input image, an attention mechanism focuses on the useful information in the feature sequence, and a long short-term memory (LSTM) network recognizes the feature vectors. By combining the deep neural network with the attention mechanism, the final recognition result is obtained directly when an image is fed into the deep neural network. The invention does not need to slide a window over the input image and recognize the characters inside each window; moreover, the character string it outputs is already the final recognition result, so no merging algorithm is needed to integrate the recognized characters.

Description

Method for recognizing English words and digits in natural scene images
Technical field
The invention belongs to the technical field of character recognition and relates to a method that uses a deep neural network and an attention mechanism to recognize English words and digits in natural scene images.
Background technology
Text in natural scenes often carries important information and can be used to describe the content of an image. Automatically obtaining the text information in an image helps people understand the image more effectively and supports processing such as storage, compression and retrieval of the image. In contrast to natural scene text detection methods, natural scene text recognition methods recognize the text regions that have already been detected. English and digits are used as a universal language and appear widely in scenes all over the world, so recognizing English words and digits is of great significance. However, unlike handwritten digit recognition, the position, size, font, illumination, viewing angle and contour of English words and digits in natural scenes are highly variable, and the background of natural scene text is also very complex, so recognizing English words and digits in natural scenes involves many technical difficulties that need to be overcome.
Existing natural scene text recognition algorithms are generally bottom-up algorithms, see [Neumann L., Matas J., 'Real-time lexicon-free scene text localization and recognition', IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38(9), pp. 1872-1885]. They first use a sliding-window operation and a traditional classifier to recognize each character of the English words and digits in the image (a window does not necessarily contain a character), and then a merging algorithm is needed to integrate the recognized characters into strings. This approach has two limitations: 1. the accuracy of recognizing characters with a sliding window and a traditional classifier is not high; 2. the character recognizer and the merging algorithm are trained separately, so the errors produced by each of them are passed directly into the final recognition result, which lowers the text recognition accuracy.
Summary of the invention
The object of the invention is to overcome these limitations by combining a deep neural network with an attention mechanism and training and applying the combined neural network as a single model. Without any sliding-window operation, a given image containing English words and digits directly yields the recognition result.
The principle of the invention is as follows. First, a convolutional neural network, widely used in computer vision, extracts a two-dimensional feature matrix from the input image; under the action of the convolutional network, each column of the matrix represents the deep features of the corresponding region of the input image, and the matrix is serialized column by column into a feature sequence. Then an attention mechanism extracts the information in the feature sequence that is relevant to the characters and filters out redundant information, producing a feature vector; the so-called attention mechanism observes things in a focused way that imitates the observation pattern of human vision, filtering out useless information, and is a commonly used model in deep learning. Finally, a long short-term memory (LSTM) network recognizes the English words and digits in the image one by one, following the spatial order from left to right.
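For concreteness, the following is a minimal sketch, assuming PyTorch, of how the three stages compose into one end-to-end model; the class, module and variable names are illustrative placeholders (the individual stages are sketched in more detail under steps (1) to (3) below), not the patent's own implementation.

```python
import torch
import torch.nn as nn

class SceneTextRecognizer(nn.Module):
    # Composes the three stages: CNN features -> attention focusing -> LSTM prediction.
    def __init__(self, cnn, attention, lstm_cell, classifier, T=24):
        super().__init__()
        self.cnn, self.attention = cnn, attention
        self.lstm_cell, self.classifier, self.T = lstm_cell, classifier, T

    def forward(self, images):                        # images: (N, 1, 32, 80) gray-scale
        S = self.cnn(images)                          # feature sequence, shape (N, L, 512)
        N = images.size(0)
        h = torch.zeros(N, self.lstm_cell.hidden_size, device=images.device)
        c = torch.zeros_like(h)
        logits = []
        for _ in range(self.T):                       # one character per time step
            V_t = self.attention(S, h)                # focused feature vector, (N, 512)
            h, c = self.lstm_cell(V_t, (h, c))        # LSTM unit of step (3)
            logits.append(self.classifier(h))         # scores over the 37 character classes
        return torch.stack(logits, dim=1)             # (N, T, 37)

# e.g. lstm_cell = nn.LSTMCell(512, 256); classifier = nn.Linear(256, 37)
```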
The technical scheme of the invention is a method for recognizing English words and digits in natural scene images. The input image is a gray-scale image containing English words and digits. The method combines a deep neural network with an attention mechanism, and trains and applies the combined neural network as a single model. Without any sliding-window operation, a given image containing English words and digits directly yields the recognition result. The method specifically includes the following steps:
Step (1): feature extraction from the input image. The convolutional neural network of the deep neural network extracts features from the input image, and the output of the convolutional neural network is taken as the feature-extraction result. Unlike a traditional convolutional neural network, which outputs a three-dimensional feature matrix, the convolutional neural network designed here outputs a two-dimensional feature matrix. From input to output the network consists of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7. The parameters of the convolutional layers, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0). The batch normalization layers adjust the distribution of the intermediate results and have no parameters to specify. The parameters of the pooling layers, in the order (pooling window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0). Before being fed into the convolutional neural network the image is resized to 80×32, so the two-dimensional matrix output by the network has size 512×19. Serializing this two-dimensional feature matrix yields a feature sequence containing 19 vectors of size 1×512, expressed as S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512 (a 1×512 vector), i = 1, 2, ..., L, and L = 19 is the length of the sequence.
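A minimal sketch of this backbone, assuming PyTorch; the layer order and the convolution/pooling parameters follow the text above, while the ReLU activations, the class name and the use of torch.nn are assumptions not stated here.

```python
import torch.nn as nn

class FeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        def conv_bn(c_in, c_out, k, s, p):   # convolution + batch normalization (+ assumed ReLU)
            return [nn.Conv2d(c_in, c_out, k, s, p), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True)]
        # 1x2 pooling window (width 1, height 2) with strides (1, 2) becomes (H, W) = (2, 1) in PyTorch.
        layers  = conv_bn(1,   64, 3, 1, 1) + [nn.MaxPool2d(2, 2)]             # pooling layer 1: 2x2, stride 2
        layers += conv_bn(64, 128, 3, 1, 1) + [nn.MaxPool2d(2, 2)]             # pooling layer 2: 2x2, stride 2
        layers += conv_bn(128, 256, 3, 1, 1)
        layers += conv_bn(256, 256, 3, 1, 1) + [nn.MaxPool2d((2, 1), (2, 1))]  # pooling layer 4
        layers += conv_bn(256, 512, 3, 1, 1)
        layers += conv_bn(512, 512, 3, 1, 1) + [nn.MaxPool2d((2, 1), (2, 1))]  # pooling layer 6
        layers += conv_bn(512, 512, 2, 1, 0)                                   # convolutional layer 7
        self.cnn = nn.Sequential(*layers)

    def forward(self, x):                      # x: (N, 1, 32, 80) gray-scale images resized to 80x32
        f = self.cnn(x)                        # (N, 512, 1, 19)
        return f.squeeze(2).permute(0, 2, 1)   # feature sequence S: (N, L=19, 512)
```

With this layout the 80×32 input shrinks to a 1×19 spatial grid after pooling layers 1, 2, 4 and 6 and the final 2×2 convolution, which matches the 512×19 feature matrix stated above.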
Step (2): feature focusing with the attention mechanism on the feature sequence S containing 19 vectors of size 1×512; the feature vectors output by the attention mechanism are the result of the focusing. The characters in the image are recognized one by one in spatial order from left to right. Since the maximum character length in the training dataset Synth [Jaderberg M., Simonyan K., Vedaldi A., et al., 'Reading text in the wild with convolutional neural networks', International Journal of Computer Vision, 2016, 116(1), pp. 1-20] is 24, the output of the method is a combination of English words and digits of length 24, so the algorithm performs 24 feature-focusing operations, each regarded as one time step. The final output is the set of feature vectors after the 24 focusing operations, V_f = {V_1, V_2, ..., V_T}, T = 24. The feature vector V_t in the set represents the result of the t-th feature focusing and is expressed as

V_t = Σ_{i=1}^{L} α_{t,i} s_i,

where α_t = (α_{t,1}, α_{t,2}, ..., α_{t,L}), with Σ_{i=1}^{L} α_{t,i} = 1, denotes the attention coefficients of the attention mechanism at the t-th feature focusing. The elements of this coefficient vector are obtained from

α_{t,i} = exp(e_{t,i}) / Σ_{k=1}^{L} exp(e_{t,k}),    e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a),

where h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3), and w^T, W_a, U_a and b_a are the parameters of the attention model, trained by the back-propagation algorithm based on stochastic gradient descent.
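A minimal sketch of this attention step, assuming PyTorch and the additive form of the score implied by the parameters w^T, W_a, U_a and b_a above; the layer dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=256, attn_dim=256):
        super().__init__()
        self.W_a = nn.Linear(hidden_dim, attn_dim, bias=False)  # acts on the LSTM state h_{t-1}
        self.U_a = nn.Linear(feat_dim, attn_dim, bias=True)     # acts on s_i; its bias plays the role of b_a
        self.w   = nn.Linear(attn_dim, 1, bias=False)           # the score vector w

    def forward(self, S, h_prev):
        # S: (N, L, 512) feature sequence; h_prev: (N, hidden_dim) LSTM hidden state at time t-1
        e = self.w(torch.tanh(self.W_a(h_prev).unsqueeze(1) + self.U_a(S)))  # scores e_{t,i}: (N, L, 1)
        alpha = torch.softmax(e, dim=1)                                      # attention coefficients alpha_{t,i}
        V_t = (alpha * S).sum(dim=1)                                         # focused feature vector: (N, 512)
        return V_t
```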
Step (3): recognition of the focused feature vectors. The long short-term memory (LSTM) network of the deep neural network recognizes the feature vectors after focusing. In accordance with the assumed maximum character-string length, the LSTM network contains 24 units, and the output of each LSTM unit is one recognized character; each character belongs to one of 37 classes (the 26 English letters, the 10 digits 0-9, and the end marker "-", which indicates the end of the recognized string). The input of the LSTM unit at time t is the feature vector V_t after the t-th feature focusing, and its output is the recognized character class J_t. At each time step the class with the highest probability is chosen as the output of the LSTM unit at that moment:

J_t = argmax_i z_i,    z_i = softmax(h_t),

where h_t is the hidden state of the LSTM unit at time t (explained with Fig. 3). After recognition the output of the whole network is the combination of the 24 characters, and the character string before the end marker is taken as the final recognition result.
The input of step (1) is an image containing English words and digits and its output is the feature sequence; step (2) computes from this feature sequence the feature vectors required as the input of step (3); and step (3) finally outputs the recognized character string. After the three steps are integrated into a single framework, the parameters of the whole model need to be trained. Let X = {I_i, L_i} be the training dataset, where I_i is the i-th image and L_i its label, i.e. the ground-truth character string in the image. The objective function of the training process can then be expressed as

W* = argmin_W Σ_i −log p(J = L_i | I_i),

where W denotes the parameters of the whole model (the parameters of the convolutional neural network, the attention mechanism and the LSTM network) and W* their optimal values. J = {J_1, ..., J_T} is the character string recognized by the model, a string of 24 characters; the probability that the whole string is recognized correctly equals the product of the probabilities that each character in the string is recognized correctly, so −log p(J = L_i | I_i) can be expressed as

−log p(J = L_i | I_i) = Σ_{t=1}^{T} −log p(J_t = L_{i,t} | I_i),

where L_{i,t} is the t-th character of the label of the i-th image. The objective function can therefore be expressed as

W* = argmin_W Σ_i Σ_{t=1}^{T} −log p(J_t = L_{i,t} | I_i).
Once the objective function has been formulated, the network parameters W are trained by the back-propagation algorithm based on stochastic gradient descent; see [Shi B., Bai X., Yao C., 'An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition', arXiv preprint arXiv:1507.05717, 2015].
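A minimal training-step sketch, assuming PyTorch: the cross-entropy loss summed over the 24 output positions corresponds to Σ_t −log p(J_t = L_{i,t} | I_i) in the objective above, minimized by back-propagation with stochastic gradient descent. The model, the optimizer and the label encoding (class indices padded with the end marker) are assumed placeholders.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()                     # -log p of the ground-truth class

def train_step(model, optimizer, images, labels):
    # images: (N, 1, 32, 80); labels: (N, 24) integer class indices, padded with the end marker
    logits = model(images)                            # (N, 24, 37)
    loss = criterion(logits.reshape(-1, 37), labels.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                   # back-propagation
    optimizer.step()                                  # stochastic gradient descent update
    return loss.item()

# e.g. optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
```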
If the input image is a color image, it is converted to gray-scale before the above steps are performed.
Compared with the prior art, the beneficial effects of the invention are as follows:
The invention combines a deep neural network with an attention mechanism, so that the final recognition result is obtained directly when an image is fed into the deep neural network. The invention therefore does not need to slide a window over the input image and recognize the characters in each window. At the same time, the character string output by the invention is already the final recognition result, so no merging algorithm is needed to integrate the recognized characters.
Brief description of the drawings
To describe the embodiments of the invention or the technical solutions of the prior art more clearly, the accompanying drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the overall flow chart of the invention;
Fig. 2 is the design diagram of the convolutional neural network of the invention;
Fig. 3 is the internal structure of a long short-term memory (LSTM) unit;
Fig. 4 is a first example of English words and digits recognized by the invention;
Fig. 5 is a second example of English words and digits recognized by the invention.
Embodiment
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
The overall flow chart of the method of the invention for recognizing English words and digits in natural scene images is shown in Fig. 1; the problem of recognizing English words and digits in natural scenes is divided into three steps: feature extraction, feature focusing and feature recognition.
Step (1): feature extraction. A convolutional neural network extracts features from the input image, which is a natural scene image containing English characters and digits and is resized to 80×32 before being fed into the network. As shown in Fig. 2, unlike a traditional convolutional neural network, which can only output a three-dimensional feature matrix, the convolutional neural network designed here outputs a two-dimensional feature matrix. As illustrated, from top to bottom the network consists of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7. The parameters of the convolutional layers, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0). The batch normalization layers adjust the distribution of the intermediate results and have no parameters to specify. The parameters of the pooling layers, in the order (pooling window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0). The output is a two-dimensional 512×19 feature matrix; serializing it column by column yields a feature sequence containing 19 vectors of size 1×512, expressed as S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512, i = 1, 2, ..., L, and L = 19 is the length of the sequence.
Step (2): feature focusing. The attention mechanism focuses on the useful information in the feature sequence. The input is the feature sequence of 19 vectors of size 1×512 obtained in the feature-extraction stage, and the output is a feature vector. The algorithm recognizes the characters in the image one by one in spatial order from left to right; with the maximum character-string length in an image set to 24, the algorithm performs T = 24 feature-focusing operations, and the final output is the set of feature vectors after the 24 focusing operations, V_f = {V_1, V_2, ..., V_T}. The feature vector V_t, representing the result of the t-th feature focusing, is expressed as

V_t = Σ_{i=1}^{L} α_{t,i} s_i,

where α_t = (α_{t,1}, ..., α_{t,L}), with Σ_{i=1}^{L} α_{t,i} = 1, represents the attention coefficients of the attention mechanism at the t-th feature focusing. The elements of this coefficient vector are obtained from

α_{t,i} = exp(e_{t,i}) / Σ_{k=1}^{L} exp(e_{t,k}),    e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a),

where w^T, W_a, U_a and b_a are the parameters of the attention model, trained by the back-propagation algorithm based on stochastic gradient descent, and h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3), explained with Fig. 3.
Fig. 3 shows the internal structure of a long short-term memory (LSTM) unit. The LSTM network is an improved kind of recurrent neural network that uses gate operations to limit the vanishing-gradient problem from which conventional recurrent neural networks suffer during training. As shown in the figure, the LSTM unit at time t consists of a memory cell c_t and three gates i_t, o_t, f_t. Here i_t is the input gate, which controls how much of the current input enters the unit; o_t is the output gate, which controls how much information the unit outputs at this moment; and f_t is the forget gate, which controls how much of the previous moment's unit output is retained at the current moment. The computation is as follows:
i_t = σ(W_ix V_t + W_im h_{t-1} + b_i)
f_t = σ(W_fx V_t + W_fm h_{t-1} + b_f)
o_t = σ(W_ox V_t + W_om h_{t-1} + b_o)
g_t = tanh(W_gx V_t + W_gm h_{t-1} + b_g)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
where h_t is the hidden state of the LSTM unit at time t, g_t is the candidate memory, σ denotes the sigmoid function, and ⊙ denotes element-wise multiplication. W_ix, W_im, W_fx, W_fm, W_ox, W_om, W_gx, W_gm, b_i, b_f, b_o and b_g are the parameters of the LSTM unit; since the parameters of all units in the LSTM network are shared, they can also be regarded as the parameters of the LSTM network, and in the training stage the invention trains them by the back-propagation algorithm based on stochastic gradient descent.
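A minimal sketch of the LSTM unit defined by the equations above, assuming PyTorch; for brevity the per-gate weights W_ix ... W_gm and biases b_i ... b_g are stacked into one linear layer, which is equivalent, and torch.nn.LSTMCell implements the same update.

```python
import torch
import torch.nn as nn

class LSTMUnit(nn.Module):
    def __init__(self, in_dim=512, hidden_dim=256):
        super().__init__()
        self.hidden_size = hidden_dim
        self.gates = nn.Linear(in_dim + hidden_dim, 4 * hidden_dim)  # stacked W_ix..W_gm, b_i..b_g

    def forward(self, V_t, state):
        h_prev, c_prev = state
        z = self.gates(torch.cat([V_t, h_prev], dim=1))
        i, f, o, g = z.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)  # input, forget, output gates
        g = torch.tanh(g)                                               # candidate memory g_t
        c = f * c_prev + i * g                                          # c_t = f_t * c_{t-1} + i_t * g_t
        h = o * torch.tanh(c)                                           # h_t = o_t * tanh(c_t)
        return h, c
```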
Step (3): character recognition. The LSTM network recognizes the feature vectors: the input is the 24 focused feature vectors and the output is a character string of length 24. In the invention the LSTM network contains 24 LSTM units, i.e. the recognition of the whole character string takes 24 time steps. The input of the LSTM unit at time t is the feature vector V_t after the t-th feature focusing, and its output is the recognized character class J_t, which has 37 classes (the 26 English letters, the 10 digits 0-9 and the end marker "-"). At each time step the class with the highest probability is chosen as the output of the LSTM unit at that moment:

J_t = argmax_i z_i,    z_i = softmax(h_t),

where h_t is the hidden state of the LSTM unit at time t. As shown in Fig. 1, after recognition the output of the whole network is the combination of the 24 characters, e.g. 'a' 'd' 'o' 'n' 'i' 's' '-' '-' '-' ..., and the final recognition result is 'adonis'.
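A small sketch of this greedy read-out, assuming PyTorch: at each of the 24 time steps the class with the highest probability is taken, and only the characters before the first end marker '-' are kept. The alphabet string, class ordering and function name are illustrative.

```python
import torch

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"    # 26 letters, 10 digits, end marker

def decode(logits):                                   # logits: (T=24, 37) for one image
    classes = torch.softmax(logits, dim=1).argmax(dim=1)   # J_t = argmax_i z_i at every step
    text = "".join(ALPHABET[j] for j in classes.tolist())
    return text.split("-", 1)[0]                      # e.g. 'adonis------------------' -> 'adonis'
```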
Fig. 4 shows a first example of English words and digits correctly recognized by the invention; the ground truth and the prediction are both 'brutalities'. It can be seen that the invention can recognize images with considerable character deformation and is therefore fairly robust.
Fig. 5 shows a second example: the ground truth is 'recapitaliozes', the prediction is 'regapitaliozes', and the third letter is recognized incorrectly. The noise in this image is heavy, and the incorrectly recognized character is one that even the human eye can hardly distinguish.

Claims (3)

1. A method for recognizing English words and digits in natural scene images, comprising the following steps:
Step (1): performing feature extraction on the input image with the convolutional neural network of the deep neural network, the output of the convolutional neural network being the feature-extraction result; the convolutional neural network consists, from input to output, of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7; the parameters of convolutional layers 1 to 7, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0); batch normalization layers 1 to 7 adjust the distribution of the intermediate results and have no parameters to specify; the parameters of pooling layers 1, 2, 4 and 6, in the order (pooling window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0); before being fed into the convolutional neural network the image is resized to a resolution of 80×32, and the output of the convolutional neural network is a two-dimensional feature matrix of size 512×19; serializing this two-dimensional feature matrix yields a feature sequence containing 19 vectors of size 1×512, expressed as S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512, i = 1, 2, ..., L, and L = 19 is the length of the sequence;
Step (2): performing feature focusing on the feature sequence S containing 19 vectors of size 1×512 with an attention mechanism: recognizing the characters in the image one by one in spatial order from left to right, setting the maximum character length of the training dataset to 24, and performing 24 feature-focusing operations on the feature sequence S, each focusing operation being one time step; outputting the set of feature vectors V_f = {V_1, V_2, ..., V_T}, T = 24, where the feature vector V_t, t ∈ {1, 2, ..., T}, represents the result of the t-th feature focusing, V_t = Σ_{i=1}^{L} α_{t,i} s_i, and α_t = (α_{t,1}, ..., α_{t,L}), with Σ_{i=1}^{L} α_{t,i} = 1, represents the attention coefficients of the attention mechanism at the t-th feature focusing, with α_{t,i} = exp(e_{t,i}) / Σ_{k=1}^{L} exp(e_{t,k}) and e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a), where h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3), and w^T, W_a, U_a and b_a are the parameters of the attention model, trained by the back-propagation algorithm based on stochastic gradient descent;
Step (3): recognizing the focused feature vectors with the long short-term memory (LSTM) network of the deep neural network: the LSTM network contains 24 units, the input of the LSTM unit at time t is the feature vector V_t after the t-th feature focusing, and its output is the recognized character class J_t; at each time step the character class with the highest probability is chosen as the output of the LSTM unit at that moment, according to J_t = argmax_i z_i, where z_i = softmax(h_t) and h_t is the hidden state of the LSTM unit at time t; after recognition the output of the whole network is the combination of the 24 characters, and the character string before the end marker is taken as the final recognition result; J_t has 37 classes, namely the 26 English letters, the 10 digits 0-9 and the end marker "-", the end marker indicating the end of the recognized character string.
2. The method as claimed in claim 1, characterized in that the parameters of the method are trained as follows: let X = {I_i, L_i} be the training dataset, where I_i is the i-th image and L_i the ground-truth character string of the i-th image; the objective function of the training process is W* = argmin_W Σ_i Σ_{t=1}^{T} −log p(J_t = L_{i,t} | I_i), where W represents the parameters of the convolutional neural network, the attention mechanism and the LSTM network, W* their optimal values, and L_{i,t} the t-th character of the label of the i-th image; the parameters W are trained by the back-propagation algorithm based on stochastic gradient descent.
3. The method as claimed in claim 1, characterized in that the input image is a gray-scale image.
CN201710592890.3A 2017-07-19 2017-07-19 Method for recognizing English words and digits in natural scene images Active CN107368831B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710592890.3A CN107368831B (en) 2017-07-19 2017-07-19 Method for recognizing English words and digits in natural scene images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710592890.3A CN107368831B (en) 2017-07-19 2017-07-19 Method for recognizing English words and digits in natural scene images

Publications (2)

Publication Number Publication Date
CN107368831A true CN107368831A (en) 2017-11-21
CN107368831B CN107368831B (en) 2019-08-02

Family

ID=60308319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710592890.3A Active CN107368831B (en) Method for recognizing English words and digits in natural scene images

Country Status (1)

Country Link
CN (1) CN107368831B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654130A (en) * 2015-12-30 2016-06-08 成都数联铭品科技有限公司 Recurrent neural network-based complex image character sequence recognition system
CN106022363A (en) * 2016-05-12 2016-10-12 南京大学 Method for recognizing Chinese characters in natural scene
CN106157319A (en) * 2016-07-28 2016-11-23 哈尔滨工业大学 The significance detection method that region based on convolutional neural networks and Pixel-level merge
CN106650813A (en) * 2016-12-27 2017-05-10 华南理工大学 Image understanding method based on depth residual error network and LSTM

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LUKAS NEUMANN ET AL: "Real-Time Lexicon-Free Scene Text Localization and Recognition", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
QIANG GUO ET AL: "Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition", 《HTTPS://ARXIV.ORG/ABS/1601.01100》 *
葛明涛 等: "基于多重卷积神经网络的大模式联机手写文字识别", 《现代电子技术》 *

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229469A (en) * 2017-11-22 2018-06-29 北京市商汤科技开发有限公司 Recognition methods, device, storage medium, program product and the electronic equipment of word
CN108154136B (en) * 2018-01-15 2022-04-05 众安信息技术服务有限公司 Method, apparatus and computer readable medium for recognizing handwriting
CN108154136A (en) * 2018-01-15 2018-06-12 众安信息技术服务有限公司 For identifying the method, apparatus of writing and computer-readable medium
CN110321755A (en) * 2018-03-28 2019-10-11 中移(苏州)软件技术有限公司 A kind of recognition methods and device
CN110555433B (en) * 2018-05-30 2024-04-26 北京三星通信技术研究有限公司 Image processing method, device, electronic equipment and computer readable storage medium
CN110555433A (en) * 2018-05-30 2019-12-10 北京三星通信技术研究有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110659641B (en) * 2018-06-28 2023-05-26 杭州海康威视数字技术股份有限公司 Text recognition method and device and electronic equipment
CN110659641A (en) * 2018-06-28 2020-01-07 杭州海康威视数字技术股份有限公司 Character recognition method and device and electronic equipment
CN109242140A (en) * 2018-07-24 2019-01-18 浙江工业大学 A kind of traffic flow forecasting method based on LSTM_Attention network
CN109117846B (en) * 2018-08-22 2021-11-16 北京旷视科技有限公司 Image processing method and device, electronic equipment and computer readable medium
CN109117846A (en) * 2018-08-22 2019-01-01 北京旷视科技有限公司 A kind of image processing method, device, electronic equipment and computer-readable medium
CN111027555B (en) * 2018-10-09 2023-09-26 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN111027555A (en) * 2018-10-09 2020-04-17 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN109522600B (en) * 2018-10-16 2020-10-16 浙江大学 Complex equipment residual service life prediction method based on combined deep neural network
CN109446187B (en) * 2018-10-16 2021-01-15 浙江大学 Method for monitoring health state of complex equipment based on attention mechanism and neural network
CN109446187A (en) * 2018-10-16 2019-03-08 浙江大学 Complex equipment health status monitoring method based on attention mechanism and neural network
CN109522600A (en) * 2018-10-16 2019-03-26 浙江大学 Complex equipment remaining life prediction technique based on combined depth neural network
CN109389091A (en) * 2018-10-22 2019-02-26 重庆邮电大学 The character identification system and method combined based on neural network and attention mechanism
CN109389091B (en) * 2018-10-22 2022-05-03 重庆邮电大学 Character recognition system and method based on combination of neural network and attention mechanism
CN109726712A (en) * 2018-11-13 2019-05-07 平安科技(深圳)有限公司 Character recognition method, device and storage medium, server
CN111222589A (en) * 2018-11-27 2020-06-02 中国移动通信集团辽宁有限公司 Image text recognition method, device, equipment and computer storage medium
CN111222589B (en) * 2018-11-27 2023-07-18 中国移动通信集团辽宁有限公司 Image text recognition method, device, equipment and computer storage medium
CN111352827A (en) * 2018-12-24 2020-06-30 中移信息技术有限公司 Automatic testing method and device
CN109858420A (en) * 2019-01-24 2019-06-07 国信电子票据平台信息服务有限公司 A kind of bill processing system and processing method
CN109992686A (en) * 2019-02-24 2019-07-09 复旦大学 Based on multi-angle from the image-text retrieval system and method for attention mechanism
CN109977969A (en) * 2019-03-27 2019-07-05 北京经纬恒润科技有限公司 A kind of image-recognizing method and device
CN110135427B (en) * 2019-04-11 2021-07-27 北京百度网讯科技有限公司 Method, apparatus, device and medium for recognizing characters in image
CN110135427A (en) * 2019-04-11 2019-08-16 北京百度网讯科技有限公司 The method, apparatus, equipment and medium of character in image for identification
CN110197227B (en) * 2019-05-30 2023-10-27 成都中科艾瑞科技有限公司 Multi-model fusion intelligent instrument reading identification method
CN110197227A (en) * 2019-05-30 2019-09-03 成都中科艾瑞科技有限公司 A kind of meter reading intelligent identification Method of multi-model fusion
CN112101395A (en) * 2019-06-18 2020-12-18 上海高德威智能交通系统有限公司 Image identification method and device
CN110555462A (en) * 2019-08-02 2019-12-10 深圳索信达数据技术有限公司 non-fixed multi-character verification code identification method based on convolutional neural network
CN111027562A (en) * 2019-12-06 2020-04-17 中电健康云科技有限公司 Optical character recognition method based on multi-scale CNN and RNN combined with attention mechanism
WO2021115159A1 (en) * 2019-12-09 2021-06-17 中兴通讯股份有限公司 Character recognition network model training method, character recognition method, apparatuses, terminal, and computer storage medium therefor
CN111242113B (en) * 2020-01-08 2022-07-08 重庆邮电大学 Method for recognizing natural scene text in any direction
CN111242113A (en) * 2020-01-08 2020-06-05 重庆邮电大学 Method for recognizing natural scene text in any direction
CN111523539A (en) * 2020-04-15 2020-08-11 北京三快在线科技有限公司 Character detection method and device
CN111553290A (en) * 2020-04-30 2020-08-18 北京市商汤科技开发有限公司 Text recognition method, device, equipment and storage medium
CN113688822A (en) * 2021-09-07 2021-11-23 河南工业大学 Time sequence attention mechanism scene image identification method

Also Published As

Publication number Publication date
CN107368831B (en) 2019-08-02

Similar Documents

Publication Publication Date Title
CN107368831B (en) Method for recognizing English words and digits in natural scene images
CN111723585B (en) Style-controllable image text real-time translation and conversion method
CN109948714B (en) Chinese scene text line identification method based on residual convolution and recurrent neural network
CN107862261A (en) Image people counting method based on multiple dimensioned convolutional neural networks
CN103605972B (en) Non-restricted environment face verification method based on block depth neural network
CN110414498B (en) Natural scene text recognition method based on cross attention mechanism
CN110929665B (en) Natural scene curve text detection method
CN107480726A (en) A kind of Scene Semantics dividing method based on full convolution and shot and long term mnemon
CN110533737A (en) The method generated based on structure guidance Chinese character style
CN108345850A (en) The scene text detection method of the territorial classification of stroke feature transformation and deep learning based on super-pixel
CN108681735A (en) Optical character recognition method based on convolutional neural networks deep learning model
CN111985525B (en) Text recognition method based on multi-mode information fusion processing
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
CN114048822A (en) Attention mechanism feature fusion segmentation method for image
Hossain et al. Recognition and solution for handwritten equation using convolutional neural network
CN107818299A (en) Face recognition algorithms based on fusion HOG features and depth belief network
Talukder et al. Real-time bangla sign language detection with sentence and speech generation
CN109360179A (en) A kind of image interfusion method, device and readable storage medium storing program for executing
CN110263174A (en) - subject categories the analysis method based on focus
CN109508640A (en) A kind of crowd's sentiment analysis method, apparatus and storage medium
Truong et al. Vietnamese handwritten character recognition using convolutional neural network
CN115205521A (en) Kitchen waste detection method based on neural network
Aksoy et al. Detection of Turkish sign language using deep learning and image processing methods
CN110929013A (en) Image question-answer implementation method based on bottom-up entry and positioning information fusion
Singh et al. A comprehensive survey on Bangla handwritten numeral recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant