CN107368831A - English words and digit recognition method in a kind of natural scene image - Google Patents
English words and digit recognition method in a kind of natural scene image
- Publication number
- CN107368831A CN107368831A CN201710592890.3A CN201710592890A CN107368831A CN 107368831 A CN107368831 A CN 107368831A CN 201710592890 A CN201710592890 A CN 201710592890A CN 107368831 A CN107368831 A CN 107368831A
- Authority
- CN
- China
- Prior art keywords
- layer
- image
- character
- short-term
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
The present invention provides a method for recognizing English words and digits in natural scene images. The recognition problem is divided into three steps: feature extraction, feature focusing, and feature recognition. A convolutional neural network extracts features from the input image, an attention mechanism focuses on the useful information in the feature sequence, and a long short-term memory (LSTM) network recognizes the feature vectors. By combining the deep neural network with the attention mechanism, the final recognition result is obtained directly once the input image is fed to the network. The invention requires no sliding-window operation over the input image and no per-window character classification; the character string it outputs is the final recognition result, so no merging algorithm is needed to integrate the recognized characters.
Description
Technical field
The invention belongs to the technical field of character recognition and relates to a method that uses a deep neural network and an attention mechanism to recognize English words and digits in natural scene images.
Background technology
Text in natural scenes often carries important information and can be used to describe the content of an image. Automatically extracting the text in an image helps people understand images more effectively and supports processing such as image storage, compression, and retrieval. In contrast to natural scene text detection, natural scene text recognition identifies the characters in regions that have already been detected. English letters and digits form a near-universal script that appears widely in scenes around the world, so recognizing English words and digits is of great significance. However, unlike handwritten digit recognition, English words and digits in natural scenes vary widely in position, size, font, illumination, viewing angle, and outline, and the backgrounds of scene text are highly complex, so their recognition involves many technical difficulties that must be overcome.
Existing natural scene text recognition algorithms are generally bottom-up (see [Neumann L, Matas J., 'Real-time lexicon-free scene text localization and recognition', IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38, (9), pp. 1872-1885]): a sliding window and a traditional classifier first recognize each character of the English words and digits in the image, and, since a window does not necessarily contain a character, a merging algorithm must then integrate the recognized characters into strings. This approach has two limitations: 1. the accuracy of character recognition with a sliding window and a traditional classifier is not high; 2. the character recognizer and the merging algorithm are trained separately, so the errors each of them produces are passed directly into the final recognition result, reducing the text recognition accuracy.
Summary of the invention
The object of the present invention is to overcome these limitations: the invention combines a deep neural network with an attention mechanism, trains and applies the combined network as a single model and, with no sliding-window operation at all, directly outputs the recognition result for a given image containing English words and digits.
The principle of the invention is as follows. First, a convolutional neural network, widely used in computer vision, extracts a two-dimensional feature matrix from the input image; each column of the matrix represents the deep features of the corresponding region of the input image, and the matrix is serialized column by column into a feature sequence. Then, an attention mechanism extracts the information in the feature sequence that is relevant to the characters and filters out redundant information, producing feature vectors; the attention mechanism imitates the focused observation pattern of human vision to filter out useless information and is a commonly used model in deep learning. Finally, a long short-term memory (LSTM) network recognizes the English words and digits in the image one character at a time, in left-to-right spatial order.
The technical scheme of the invention is a method for recognizing English words and digits in natural scene images. The input image is a grayscale image containing English words and digits. The method combines a deep neural network with an attention mechanism, trains and applies the combined network as a single model and, with no sliding-window operation, directly outputs the recognition result for a given image containing English words and digits. The method comprises the following steps:
Step (1): extract features from the input image. The invention uses a convolutional neural network (CNN) from among the deep neural networks to extract features from the input image and takes the network output as the result of feature extraction. Unlike a traditional CNN, whose output is a three-dimensional feature tensor, the CNN designed by the invention outputs a two-dimensional feature matrix. From input to output the network consists of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7. The parameters of the convolutional layers, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0). The batch normalization layers adjust the distribution of intermediate activations and have no parameters. The parameters of the pooling layers, in the order (window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0). Before being input to the network the image is resized to 80 × 32, so the two-dimensional matrix output by the network has size 512 × 19. Serializing this matrix column by column yields a feature sequence of 19 vectors of size 1 × 512, written S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512, i = 1, 2, ..., L, and L = 19 is the length of the sequence.
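The geometry above can be checked with a short sketch (an illustration under the stated layer parameters, not the patent's code): tracing an 80 × 32 input through the listed convolution and pooling parameters should leave a 19 × 1 spatial map with 512 channels, i.e. the 512 × 19 feature matrix.

```python
def conv_out(size, kernel, stride, pad):
    """Output size of one convolution/pooling dimension."""
    return (size + 2 * pad - kernel) // stride + 1

def trace_cnn(w=80, h=32):
    # (kernel, channels, stride, padding) for convolutional layers 1-7
    convs = [(3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1),
             (3, 512, 1, 1), (3, 512, 1, 1), (2, 512, 1, 0)]
    # (window_w, window_h, stride_w, stride_h) for the pools after convs 1, 2, 4, 6
    pools = {0: (2, 2, 2, 2), 1: (2, 2, 2, 2), 3: (1, 2, 1, 2), 5: (1, 2, 1, 2)}
    channels = 1
    for i, (k, c, s, p) in enumerate(convs):
        w, h = conv_out(w, k, s, p), conv_out(h, k, s, p)
        channels = c                      # batch normalization layers do not change shape
        if i in pools:
            pw, ph, sw, sh = pools[i]
            w, h = conv_out(w, pw, sw, 0), conv_out(h, ph, sh, 0)
    return channels, w, h

print(trace_cnn())  # (512, 19, 1): the 512 x 19 two-dimensional feature matrix
```

The 1×2 pooling windows halve the height while preserving the width, which is what turns the feature map into a left-to-right sequence of column features.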
Step (2): apply the attention mechanism to the feature sequence S of 19 vectors of size 1 × 512 to perform feature focusing, taking the feature vectors output by the attention mechanism as the result. The invention recognizes the characters in the image in left-to-right spatial order. In the training dataset Synth [Jaderberg M, Simonyan K, Vedaldi A, et al., 'Reading text in the wild with convolutional neural networks', International Journal of Computer Vision, 2016, 116, (1), pp. 1-20] the character length is at most 24, so the output of the invention is a combination of English letters and digits of length 24 and the algorithm performs 24 feature focusings, each focusing corresponding to one time step. The final output is the set of 24 focused feature vectors V_f = {V_1, V_2, ..., V_T}, T = 24. The vector V_t, the result of the t-th focusing, is

V_t = Σ_{i=1}^{L} α_{t,i} s_i,

where α_t = (α_{t,1}, ..., α_{t,L}) are the coefficients of the attention mechanism at the t-th focusing. The elements of the coefficient vector are obtained by

α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{L} exp(e_{t,j}), with e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a),

where h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3), and w, W_a, U_a and b_a are the parameters of the attention model, trained by back-propagation based on stochastic gradient descent.
Step (3): recognize the focused feature vectors. The invention uses the long short-term memory (LSTM) network from among the deep neural networks to recognize the focused feature vectors. Following the assumed maximum string length, the LSTM network contains 24 units; the output of each LSTM unit is a recognized character with 37 classes (26 English letters, the 10 digits 0-9, and the end mark '-', which indicates the end of the recognized string). The input of the LSTM unit at time t is the feature vector V_t from the t-th focusing, and its output is the recognized character class J_t. At each time step the class with the highest probability is chosen as the output of the LSTM unit:

J_t = argmax_i z_i, where z = softmax(h_t),

and h_t is the hidden state of the LSTM unit at time t (see the description of Fig. 3). After recognition the output of the whole network is a combination of 24 characters; the invention takes the character string before the end mark as the final recognition result.
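The per-step selection can be sketched as follows. The patent writes z = softmax(h_t) without detailing how the hidden state is mapped to 37 class scores, so the linear projection (W_cls, b_cls) below is an assumption added for illustration.

```python
import numpy as np

ALPHABET = list("abcdefghijklmnopqrstuvwxyz") + list("0123456789") + ["-"]  # 37 classes

def step_output(h_t, W_cls, b_cls):
    """Map an LSTM hidden state to the most probable of the 37 character classes.

    W_cls, b_cls: hypothetical projection from hidden size to 37 logits.
    """
    logits = W_cls @ h_t + b_cls
    z = np.exp(logits - logits.max())
    z /= z.sum()                       # z = softmax(...): class probabilities
    return ALPHABET[int(np.argmax(z))] # J_t = argmax_i z_i

rng = np.random.default_rng(1)
H = 256                                # toy hidden size
h_t = rng.standard_normal(H)
W_cls, b_cls = rng.standard_normal((37, H)), rng.standard_normal(37)
print(step_output(h_t, W_cls, b_cls) in ALPHABET)  # True
```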
The input of step (1) is an image containing English words and digits and its output is the feature sequence; step (2) turns the feature sequence into the feature vectors required by step (3); step (3) finally outputs the recognized character string. After the three steps are integrated into one framework, the parameters of the whole model must be trained. Let X = {I_i, L_i} be the training dataset, where I_i is the i-th image and L_i its label, i.e. the ground-truth character string of the image. The objective function of training can be expressed as

W* = argmin_W Σ_i -log p(J = L_i | I_i; W),

where W denotes the parameters of the whole model, comprising those of the convolutional neural network, the attention mechanism and the LSTM network, and W* their optimum. J = {J_1, ..., J_T} is the character string recognized by the model, a string of 24 characters. The probability that the whole string is recognized correctly equals the product of the probabilities that each character in the string is recognized correctly, so -log p(J = L_i | I_i) can be expressed as

-log p(J = L_i | I_i) = -Σ_{t=1}^{T} log p(J_t = L_{i,t} | I_i),

where L_{i,t} is the t-th character of the label of the i-th image; the objective function can then be expressed as

W* = argmin_W -Σ_i Σ_{t=1}^{T} log p(J_t = L_{i,t} | I_i; W).

Given the objective function, the invention trains the network parameters W by back-propagation based on stochastic gradient descent; see [Shi B, Bai X, Yao C., 'An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition', arXiv preprint arXiv:1507.05717, 2015].
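Under this factorization, the per-image loss is simply a sum of 24 per-position cross-entropy terms. A minimal NumPy sketch with toy probabilities (not the patent's training code):

```python
import numpy as np

def string_nll(probs, label_ids):
    """-log p(J = L_i | I_i) = -sum_t log p(J_t = L_{i,t} | I_i).

    probs: (T, 37) per-position class probabilities; label_ids: (T,) target classes.
    """
    T = len(label_ids)
    return -sum(np.log(probs[t, label_ids[t]]) for t in range(T))

rng = np.random.default_rng(2)
T = 24                                              # string length assumed by the model
logits = rng.standard_normal((T, 37))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 37, size=T)
loss = string_nll(probs, labels)
print(loss > 0)  # True: the negative log-likelihood of a 24-character string
```

Minimizing this sum over the dataset with stochastic gradient descent is exactly the objective W* above.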
If the input image is a color image, it is converted to grayscale before the above steps are performed.
Compared with the prior art, the beneficial effects of the invention are: the invention combines a deep neural network with an attention mechanism, so the final recognition result is obtained directly once the input image is fed to the network. The invention therefore needs no sliding-window operation over the input image and no per-window character classification. Moreover, the character string output by the invention is the final recognition result, so no merging algorithm is needed to integrate the recognized characters.
Brief description of the drawings
In order to explain the embodiments of the invention or the technical solutions of the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the overall flowchart of the invention;
Fig. 2 is the design diagram of the convolutional neural network of the invention;
Fig. 3 is the internal structure of an LSTM unit;
Fig. 4 is a first example of English word and digit recognition by the invention;
Fig. 5 is a second example of English word and digit recognition by the invention.
Embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
The overall flowchart of the invention, "English words and digit recognition method in a kind of natural scene image", is shown in Fig. 1: the problem of recognizing English words and digits in natural scenes is divided into three steps: feature extraction, feature focusing, and feature recognition.
Step (1): feature extraction. The invention extracts features from the input image with a convolutional neural network. The input is an image containing English characters and digits in a natural scene, resized to 80 × 32 before being fed to the network. As shown in Fig. 2, unlike a traditional convolutional neural network, which can only output a three-dimensional feature tensor, the network designed by the invention outputs a two-dimensional feature matrix. As illustrated, from top to bottom the network consists of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7. The parameters of the convolutional layers, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0). The batch normalization layers adjust the distribution of intermediate activations and have no parameters. The parameters of the pooling layers, in the order (window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0). The output is a two-dimensional 512 × 19 feature matrix; serializing it column by column yields a feature sequence of 19 vectors of size 1 × 512: S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512, i = 1, 2, ..., L, and L = 19 is the length of the sequence.
Step (2): feature focusing. The invention uses the attention mechanism to focus on the useful information in the feature sequence. The input is the feature sequence of 19 vectors of size 1 × 512 obtained in the feature extraction stage; the output is the feature vectors. The characters in the image are recognized in left-to-right spatial order and the maximum string length in an image is set to 24, so the algorithm performs T = 24 feature focusings, and the final output is the set of 24 focused feature vectors V_f = {V_1, V_2, ..., V_T}. The vector V_t, the result of the t-th focusing, is V_t = Σ_{i=1}^{L} α_{t,i} s_i, where α_t = (α_{t,1}, ..., α_{t,L}) are the coefficients of the attention mechanism at the t-th focusing. The elements of the coefficient vector are obtained by α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{L} exp(e_{t,j}) with e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a), where w, W_a, U_a and b_a are the parameters of the attention model, trained by back-propagation based on stochastic gradient descent, and h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3); see Fig. 3.
Fig. 3 shows the internal structure of an LSTM unit. The LSTM network is an improved variant of the recurrent neural network that uses gate operations to limit the gradient-vanishing problem that conventional recurrent networks suffer during training. As shown, the LSTM unit at time t consists of a memory cell c_t and three gates i_t, o_t, f_t. The input gate i_t controls how much of the current input enters the unit; the output gate o_t controls how much information the unit outputs at this moment; the forget gate f_t controls how much of the previous unit's output is retained. The computation is:

i_t = σ(W_ix V_t + W_im h_{t-1} + b_i)
f_t = σ(W_fx V_t + W_fm h_{t-1} + b_f)
o_t = σ(W_ox V_t + W_om h_{t-1} + b_o)
g_t = tanh(W_gx V_t + W_gm h_{t-1} + b_g)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)

where h_t is the hidden state of the LSTM unit at time t, σ is the sigmoid function, and ⊙ denotes elementwise multiplication. W_ix, W_im, W_fx, W_fm, W_ox, W_om, W_gx, W_gm, b_i, b_f, b_o and b_g are the parameters of the LSTM unit; because all units of the LSTM network share them, these are also the parameters of the LSTM network, and in the training stage the invention trains them by back-propagation based on stochastic gradient descent.
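The gate equations above can be written directly in NumPy. This is a sketch with toy dimensions; the candidate g_t and the output h_t = o_t ⊙ tanh(c_t) follow the standard LSTM form implied by the listed parameters (W_gx, W_gm, b_g), since those lines render incompletely in this copy.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(V_t, h_prev, c_prev, P):
    """One LSTM unit step following the gate equations in the text."""
    i = sigmoid(P["Wix"] @ V_t + P["Wim"] @ h_prev + P["bi"])   # input gate
    f = sigmoid(P["Wfx"] @ V_t + P["Wfm"] @ h_prev + P["bf"])   # forget gate
    o = sigmoid(P["Wox"] @ V_t + P["Wom"] @ h_prev + P["bo"])   # output gate
    g = np.tanh(P["Wgx"] @ V_t + P["Wgm"] @ h_prev + P["bg"])   # candidate cell
    c = f * c_prev + i * g                                      # c_t = f (.) c_{t-1} + i (.) g_t
    h = o * np.tanh(c)                                          # h_t = o (.) tanh(c_t)
    return h, c

rng = np.random.default_rng(3)
D, H = 512, 256                                # D = focused-vector size; H assumed
P = {k: rng.standard_normal((H, D)) for k in ("Wix", "Wfx", "Wox", "Wgx")}
P.update({k: rng.standard_normal((H, H)) for k in ("Wim", "Wfm", "Wom", "Wgm")})
P.update({k: rng.standard_normal(H) for k in ("bi", "bf", "bo", "bg")})
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), P)
print(h.shape, bool(np.all(np.abs(h) < 1.0)))  # (256,) True
```

Because h_t passes through a tanh scaled by the output gate, every component stays inside (-1, 1), which is what keeps the recurrence numerically stable across the 24 time steps.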
Step (3): character recognition. The invention recognizes the feature vectors with the LSTM network; the input is the 24 focused feature vectors and the output is a character string of length 24. In the invention, the LSTM network contains 24 LSTM units, i.e. the recognition of the whole string takes 24 time steps. The input of the LSTM unit at time t is the feature vector V_t from the t-th focusing and its output is the recognized character class J_t, which has 37 classes (26 English letters, the 10 digits 0-9, and the end mark '-'). At each time step the class with the highest probability is chosen as the output of the LSTM unit:

J_t = argmax_i z_i, where z = softmax(h_t),

and h_t is the hidden state of the LSTM unit at time t. As shown in Fig. 1, after recognition the output of the whole network is a combination of 24 characters, e.g. 'a' 'd' 'o' 'n' 'i' 's' '-' '-' '-' ..., and the final recognition result is 'adonis'.
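Taking the string before the end mark is a one-line post-processing step; a small sketch using the 'adonis' example from the text:

```python
def trim_at_end_mark(chars, end_mark="-"):
    """Keep only the characters before the first end mark '-'."""
    out = []
    for ch in chars:
        if ch == end_mark:
            break
        out.append(ch)
    return "".join(out)

preds = list("adonis") + ["-"] * 18          # the 24 per-step outputs of the LSTM
print(trim_at_end_mark(preds))  # adonis
```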
Fig. 4 shows a first example of correct recognition of English words and digits by the invention; the ground truth and the prediction are both 'brutalities'. It can be seen that the invention can recognize images with large character deformation, showing good robustness.
Fig. 5 shows a second example of recognition by the invention; the ground truth is 'recapitaliozes' and the prediction is 'regapitaliozes', the third letter being misrecognized. The image is quite noisy, and the misrecognized character is hard to distinguish even for the human eye.
Claims (3)
1. A method for recognizing English words and digits in natural scene images, comprising the following steps:
Step (1): extract features from the input image with a convolutional neural network from among the deep neural networks, taking the network output as the result of feature extraction. From input to output the convolutional neural network consists of: convolutional layer 1, batch normalization layer 1, pooling layer 1, convolutional layer 2, batch normalization layer 2, pooling layer 2, convolutional layer 3, batch normalization layer 3, convolutional layer 4, batch normalization layer 4, pooling layer 4, convolutional layer 5, batch normalization layer 5, convolutional layer 6, batch normalization layer 6, pooling layer 6, convolutional layer 7, batch normalization layer 7. The parameters of convolutional layers 1-7, in the order (kernel size, number of channels, stride, padding), are: (3, 64, 1, 1), (3, 128, 1, 1), (3, 256, 1, 1), (3, 256, 1, 1), (3, 512, 1, 1), (3, 512, 1, 1) and (2, 512, 1, 0). Batch normalization layers 1-7 adjust the distribution of intermediate activations and have no parameters. The parameters of pooling layers 1, 2, 4 and 6, in the order (window, horizontal stride, vertical stride, horizontal padding, vertical padding), are: (2×2, 2, 2, 0, 0), (2×2, 2, 2, 0, 0), (1×2, 1, 2, 0, 0) and (1×2, 1, 2, 0, 0). Before being input to the network the image is resized to a resolution of 80 × 32, and the output of the network is a two-dimensional feature matrix of size 512 × 19. Serializing this matrix yields a feature sequence of 19 vectors of size 1 × 512: S = {s_1, s_2, ..., s_L}, where s_i ∈ R^512, i = 1, 2, ..., L, and L = 19 is the length of the sequence;
Step (2): apply an attention mechanism to the feature sequence S of 19 vectors of size 1 × 512 to perform feature focusing. The characters in the image are recognized in left-to-right spatial order; the maximum character length of the training dataset is set to 24; 24 feature focusings are performed on S, each focusing corresponding to one time step; the output is the set of feature vectors V_f = {V_1, V_2, ..., V_T}, T = 24, where V_t = Σ_{i=1}^{L} α_{t,i} s_i is the result of the t-th focusing, the coefficients of the attention mechanism at the t-th focusing are α_{t,i} = exp(e_{t,i}) / Σ_{j=1}^{L} exp(e_{t,j}) with e_{t,i} = w^T tanh(W_a h_{t-1} + U_a s_i + b_a), h_{t-1} is the hidden state of the LSTM unit at time t-1 in step (3), and w, W_a, U_a and b_a are the parameters of the attention model, trained by back-propagation based on stochastic gradient descent;
Step (3): recognize the focused feature vectors with the long short-term memory (LSTM) network from among the deep neural networks. The LSTM network contains 24 units; the input of the LSTM unit at time t is the feature vector V_t from the t-th focusing and its output is the recognized character class J_t; at each time step the class with the highest probability is chosen as the output of the LSTM unit: J_t = argmax_i z_i, where z = softmax(h_t) and h_t is the hidden state of the LSTM unit at time t; after recognition the output of the whole network is a combination of 24 characters, and the character string before the end mark is taken as the final recognition result; J_t has 37 classes, comprising the 26 English letters, the 10 digits 0-9 and the end mark '-', which indicates the end of the recognized string.
2. the method as described in claim 1, it is characterised in that the method being trained to the parameter in this method is:If X=
{Ii,LiIt is training dataset, IiRepresent i-th of image, LiFor the actual value of character string in i-th of image;In training process
Object function is:Wherein W represents convolutional neural networks,
The parameter of notice mechanism and long memory network in short-term, W*Represent the optimum value of the parameter, Li,tRepresent i-th of image pair
T-th of character in the label answered;Using based on the Back Propagation Algorithm of stochastic gradient descent to being trained.
3. the method as described in claim 1, it is characterised in that the image of the input is gray-scale map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710592890.3A CN107368831B (en) | 2017-07-19 | 2017-07-19 | English words and digit recognition method in a kind of natural scene image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710592890.3A CN107368831B (en) | 2017-07-19 | 2017-07-19 | English words and digit recognition method in a kind of natural scene image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107368831A true CN107368831A (en) | 2017-11-21 |
CN107368831B CN107368831B (en) | 2019-08-02 |
Family
ID=60308319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710592890.3A Active CN107368831B (en) | 2017-07-19 | 2017-07-19 | English words and digit recognition method in a kind of natural scene image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107368831B (en) |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154136A (en) * | 2018-01-15 | 2018-06-12 | 众安信息技术服务有限公司 | For identifying the method, apparatus of writing and computer-readable medium |
CN108229469A (en) * | 2017-11-22 | 2018-06-29 | 北京市商汤科技开发有限公司 | Recognition methods, device, storage medium, program product and the electronic equipment of word |
CN109117846A (en) * | 2018-08-22 | 2019-01-01 | 北京旷视科技有限公司 | A kind of image processing method, device, electronic equipment and computer-readable medium |
CN109242140A (en) * | 2018-07-24 | 2019-01-18 | 浙江工业大学 | A kind of traffic flow forecasting method based on LSTM_Attention network |
CN109389091A (en) * | 2018-10-22 | 2019-02-26 | 重庆邮电大学 | The character identification system and method combined based on neural network and attention mechanism |
- 2017-07-19: Application CN201710592890.3A filed; granted as patent CN107368831B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105654130A (en) * | 2015-12-30 | 2016-06-08 | 成都数联铭品科技有限公司 | Recurrent neural network-based complex image character sequence recognition system |
CN106022363A (en) * | 2016-05-12 | 2016-10-12 | 南京大学 | Method for recognizing Chinese characters in natural scene |
CN106157319A (en) * | 2016-07-28 | 2016-11-23 | 哈尔滨工业大学 | Saliency detection method based on convolutional neural networks fusing region-level and pixel-level features |
CN106650813A (en) * | 2016-12-27 | 2017-05-10 | 华南理工大学 | Image understanding method based on deep residual network and LSTM |
Non-Patent Citations (3)
Title |
---|
LUKAS NEUMANN ET AL: "Real-Time Lexicon-Free Scene Text Localization and Recognition", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
QIANG GUO ET AL: "Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition", 《HTTPS://ARXIV.ORG/ABS/1601.01100》 * |
GE MINGTAO ET AL: "Large-pattern online handwritten character recognition based on multiple convolutional neural networks", 《现代电子技术 (MODERN ELECTRONICS TECHNIQUE)》 * |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229469A (en) * | 2017-11-22 | 2018-06-29 | 北京市商汤科技开发有限公司 | Character recognition method, apparatus, storage medium, program product and electronic device |
CN108154136B (en) * | 2018-01-15 | 2022-04-05 | 众安信息技术服务有限公司 | Method, apparatus and computer readable medium for recognizing handwriting |
CN108154136A (en) * | 2018-01-15 | 2018-06-12 | 众安信息技术服务有限公司 | Method, apparatus and computer-readable medium for recognizing handwriting |
CN110321755A (en) * | 2018-03-28 | 2019-10-11 | 中移(苏州)软件技术有限公司 | Recognition method and device |
CN110555433B (en) * | 2018-05-30 | 2024-04-26 | 北京三星通信技术研究有限公司 | Image processing method, device, electronic equipment and computer readable storage medium |
CN110555433A (en) * | 2018-05-30 | 2019-12-10 | 北京三星通信技术研究有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN110659641B (en) * | 2018-06-28 | 2023-05-26 | 杭州海康威视数字技术股份有限公司 | Text recognition method and device and electronic equipment |
CN110659641A (en) * | 2018-06-28 | 2020-01-07 | 杭州海康威视数字技术股份有限公司 | Character recognition method and device and electronic equipment |
CN109242140A (en) * | 2018-07-24 | 2019-01-18 | 浙江工业大学 | Traffic flow forecasting method based on LSTM_Attention network |
CN109117846B (en) * | 2018-08-22 | 2021-11-16 | 北京旷视科技有限公司 | Image processing method and device, electronic equipment and computer readable medium |
CN109117846A (en) * | 2018-08-22 | 2019-01-01 | 北京旷视科技有限公司 | Image processing method, device, electronic equipment and computer-readable medium |
CN111027555B (en) * | 2018-10-09 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | License plate recognition method and device and electronic equipment |
CN111027555A (en) * | 2018-10-09 | 2020-04-17 | 杭州海康威视数字技术股份有限公司 | License plate recognition method and device and electronic equipment |
CN109522600B (en) * | 2018-10-16 | 2020-10-16 | 浙江大学 | Complex equipment residual service life prediction method based on combined deep neural network |
CN109446187B (en) * | 2018-10-16 | 2021-01-15 | 浙江大学 | Method for monitoring health state of complex equipment based on attention mechanism and neural network |
CN109446187A (en) * | 2018-10-16 | 2019-03-08 | 浙江大学 | Complex equipment health status monitoring method based on attention mechanism and neural network |
CN109522600A (en) * | 2018-10-16 | 2019-03-26 | 浙江大学 | Complex equipment residual service life prediction method based on combined deep neural network |
CN109389091A (en) * | 2018-10-22 | 2019-02-26 | 重庆邮电大学 | The character identification system and method combined based on neural network and attention mechanism |
CN109389091B (en) * | 2018-10-22 | 2022-05-03 | 重庆邮电大学 | Character recognition system and method based on combination of neural network and attention mechanism |
CN109726712A (en) * | 2018-11-13 | 2019-05-07 | 平安科技(深圳)有限公司 | Character recognition method, device, storage medium and server |
CN111222589A (en) * | 2018-11-27 | 2020-06-02 | 中国移动通信集团辽宁有限公司 | Image text recognition method, device, equipment and computer storage medium |
CN111222589B (en) * | 2018-11-27 | 2023-07-18 | 中国移动通信集团辽宁有限公司 | Image text recognition method, device, equipment and computer storage medium |
CN111352827A (en) * | 2018-12-24 | 2020-06-30 | 中移信息技术有限公司 | Automatic testing method and device |
CN109858420A (en) * | 2019-01-24 | 2019-06-07 | 国信电子票据平台信息服务有限公司 | Bill processing system and processing method |
CN109992686A (en) * | 2019-02-24 | 2019-07-09 | 复旦大学 | Image-text retrieval system and method based on multi-angle self-attention mechanism |
CN109977969A (en) * | 2019-03-27 | 2019-07-05 | 北京经纬恒润科技有限公司 | Image recognition method and device |
CN110135427B (en) * | 2019-04-11 | 2021-07-27 | 北京百度网讯科技有限公司 | Method, apparatus, device and medium for recognizing characters in image |
CN110135427A (en) * | 2019-04-11 | 2019-08-16 | 北京百度网讯科技有限公司 | Method, apparatus, device and medium for recognizing characters in image |
CN110197227B (en) * | 2019-05-30 | 2023-10-27 | 成都中科艾瑞科技有限公司 | Multi-model fusion intelligent instrument reading identification method |
CN110197227A (en) * | 2019-05-30 | 2019-09-03 | 成都中科艾瑞科技有限公司 | Multi-model fusion intelligent instrument reading identification method |
CN112101395A (en) * | 2019-06-18 | 2020-12-18 | 上海高德威智能交通系统有限公司 | Image identification method and device |
CN110555462A (en) * | 2019-08-02 | 2019-12-10 | 深圳索信达数据技术有限公司 | Non-fixed multi-character verification code identification method based on convolutional neural network |
CN111027562A (en) * | 2019-12-06 | 2020-04-17 | 中电健康云科技有限公司 | Optical character recognition method based on multi-scale CNN and RNN combined with attention mechanism |
WO2021115159A1 (en) * | 2019-12-09 | 2021-06-17 | 中兴通讯股份有限公司 | Character recognition network model training method, character recognition method, apparatuses, terminal, and computer storage medium therefor |
CN111242113B (en) * | 2020-01-08 | 2022-07-08 | 重庆邮电大学 | Method for recognizing natural scene text in any direction |
CN111242113A (en) * | 2020-01-08 | 2020-06-05 | 重庆邮电大学 | Method for recognizing natural scene text in any direction |
CN111523539A (en) * | 2020-04-15 | 2020-08-11 | 北京三快在线科技有限公司 | Character detection method and device |
CN111553290A (en) * | 2020-04-30 | 2020-08-18 | 北京市商汤科技开发有限公司 | Text recognition method, device, equipment and storage medium |
CN113688822A (en) * | 2021-09-07 | 2021-11-23 | 河南工业大学 | Time sequence attention mechanism scene image identification method |
Also Published As
Publication number | Publication date |
---|---|
CN107368831B (en) | 2019-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107368831B (en) | English word and digit recognition method in natural scene images | |
CN111723585B (en) | Style-controllable image text real-time translation and conversion method | |
CN109948714B (en) | Chinese scene text line identification method based on residual convolution and recurrent neural network | |
CN107862261A (en) | Crowd counting method in images based on multi-scale convolutional neural networks | |
CN103605972B (en) | Face verification method in unconstrained environments based on block deep neural networks | |
CN110414498B (en) | Natural scene text recognition method based on cross attention mechanism | |
CN110929665B (en) | Natural scene curve text detection method | |
CN107480726A (en) | Scene semantic segmentation method based on full convolution and long short-term memory units | |
CN110533737A (en) | Structure-guided Chinese character style generation method | |
CN108345850A (en) | Scene text detection method based on superpixel region classification with stroke feature transform and deep learning | |
CN108681735A (en) | Optical character recognition method based on a convolutional neural network deep learning model | |
CN111985525B (en) | Text recognition method based on multi-modal information fusion processing | |
CN112069900A (en) | Bill character recognition method and system based on convolutional neural network | |
CN114048822A (en) | Attention-mechanism feature fusion segmentation method for images | |
Hossain et al. | Recognition and solution for handwritten equation using convolutional neural network | |
CN107818299A (en) | Face recognition algorithm based on fused HOG features and deep belief network | |
Talukder et al. | Real-time bangla sign language detection with sentence and speech generation | |
CN109360179A (en) | Image fusion method, device and readable storage medium | |
CN110263174A (en) | Topic category analysis method based on focus of attention | |
CN109508640A (en) | Crowd sentiment analysis method, apparatus and storage medium | |
Truong et al. | Vietnamese handwritten character recognition using convolutional neural network | |
CN115205521A (en) | Kitchen waste detection method based on neural network | |
Aksoy et al. | Detection of Turkish sign language using deep learning and image processing methods | |
CN110929013A (en) | Image question-answering implementation method based on bottom-up attention and positioning information fusion | |
Singh et al. | A comprehensive survey on Bangla handwritten numeral recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||