CN110414592A - A digit verification code recognition method based on multi-task learning - Google Patents
A digit verification code recognition method based on multi-task learning
- Publication number
- CN110414592A (application CN201910672921.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Abstract
The invention relates to a digit verification code recognition method based on multi-task learning. The method mainly comprises: first generating a simulated training sample set of 4-digit verification codes; normalizing the digit verification code images; designing a convolutional neural network model in which 2 shared convolution-pooling layers perform feature extraction and 4 parallel branches, each made of 2 fully connected layers, predict the 4 digits of the verification code respectively; randomly initializing the convolutional neural network model and training it with a multi-task loss function to obtain the digit verification code recognition model; and feeding the digit verification code to be recognized into the trained model to obtain the predicted digits. The method needs only a single recognition pass and avoids segmenting the digits in the verification code, which removes the low recognition rate caused by poor segmentation algorithms and improves the robustness of verification code recognition.
Description
Technical field
The invention belongs to the fields of computer vision and artificial intelligence and relates to a digit verification code recognition method; specifically, it relates to a convolutional neural network digit verification code recognition method that uses multi-task learning.
Background art
With the development of science and technology, and of computer science and technology in particular, networks have brought great convenience to people's lives, but security problems have also become increasingly prominent. The network verification code, as the first line of defense of the network account security system, is mainly used to resist malicious programs and prevent abuse of network resources. Automatic verification code recognition technology can improve the security of existing verification codes and help design safer ones, thereby effectively safeguarding network security; it has become an important problem.
With the development of deep learning and artificial intelligence, image recognition with convolutional neural networks has become a hot topic. The most common way of recognizing verification codes with convolutional neural networks is to preprocess the verification code image, segment it into individual digits, and finally feed the digits into a convolutional neural network for recognition. Segmenting the digit verification code into single digits before feeding them to the network is time-consuming, and the quality of the segmentation method directly affects the accuracy of the subsequent recognition. Studying an efficient verification code recognition method therefore has important practical value.
Summary of the invention
The invention provides a digit verification code recognition method based on multi-task learning, to solve the problem that prior-art recognition of digit verification codes with convolutional neural networks requires segmentation and is time-consuming.
To achieve this purpose, the technical solution provided by the invention is:
A digit verification code recognition method based on multi-task learning, comprising the following steps:
Step (1): generate a simulated verification code training sample set in which each code contains 4 digits;
Step (2): one-hot encode the label of each digit of the verification code;
Step (3): normalize the digit verification code training sample set;
Step (4): design a multi-task learning convolutional neural network model;
Step (5): train with the normalized training sample set to obtain a trained digit verification code recognition model;
Step (6): normalize a new, unseen image and recognize it with the trained digit verification code recognition model.
Further, step (1) includes obtaining the Mnist data set, synthesizing the 4-digit verification code data set, and dividing it into a training set and a test set.
Further, step (3) includes the normalization of the digit verification code images.
Further, step (4) includes designing the architecture of the convolutional neural network model and selecting the activation function: 2 convolution-pooling layers are designed as the feature extraction layers, 4 parallel branches of 2 fully connected layers each serve as the multi-task learning layers that predict the 4 digits of the verification code, and relu is chosen as the activation function.
Further, step (5) includes setting the number of training iterations and selecting the parameter initialization method, the loss function and the backpropagation optimization algorithm.
Beneficial effects of the present invention:
The proposed method introduces multi-task learning into digit verification code recognition. First, the verification code image is normalized, which reduces its influence on subsequent recognition. Second, shared convolution is combined with multi-task learning: 2 convolution-pooling layers extract features from the whole image, and 4 parallel branches of 2 fully connected layers each predict the individual digits of the verification code. This avoids the digit segmentation step of conventional digit verification code recognition, greatly reduces recognition time, and improves the robustness of verification code recognition.
Brief description of the drawings
Fig. 1 shows synthesized verification code digits, where Fig. 1(a) is a digit verification code image without blank characters and Fig. 1(b) is a digit verification code image containing blank characters;
Fig. 2 shows the multi-task verification code recognition model based on a convolutional neural network designed by the invention;
Fig. 3 shows a prediction result.
Specific embodiment
The invention is described in further detail below with reference to the drawings and a specific implementation.
The invention provides a digit verification code recognition method based on multi-task learning, comprising the following steps:
Step 1: Download the Mnist data set and synthesize verification code images containing 4 digits. Some of the synthesized images are shown in Fig. 1(a) and Fig. 1(b); a blank position is represented by the label 10.
Download address: http://yann.lecun.com/exdb/mnist/
Step 2: One-hot encode the labels 0~10. For example, 0 is encoded as 10000000000, 1 is encoded as 01000000000, and the blank is encoded as 00000000001.
Step 3: Normalize the synthesized verification code images.
Step 4: Design the multi-task learning convolutional neural network model. It comprises 2 convolution-pooling layers as a shared feature extractor and 4 parallel branches of 2 fully connected layers each as the multi-task output model, which predict the 4 digits of the verification code respectively. The nonlinear activation function of the model is relu.
Step 5: Train with the normalized training sample set to obtain a trained digit verification code recognition model. This includes choosing the number of training iterations, the parameter initialization method, the cost function and the backpropagation parameter update method.
Step 6: Prediction. Normalize a new, unseen image and recognize it with the trained digit verification code recognition model.
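The cost function of Step 5 is described only as a multi-task loss built on cross entropy. One plausible reading, sketched below, is an unweighted sum of the cross-entropy losses of the 4 output branches; the unweighted sum and the helper names `softmax`, `cross_entropy` and `multitask_loss` are assumptions made for illustration and are not taken from the patent.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one branch's 11 raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_one_hot):
    """Cross entropy between one branch's prediction and its one-hot target."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_one_hot, probs) if t)

def multitask_loss(branch_logits, branch_targets):
    """Assumed multi-task loss: plain sum of the per-digit cross entropies."""
    return sum(cross_entropy(l, t) for l, t in zip(branch_logits, branch_targets))

# Example: 4 branches with uniform scores, targets 3, 10 (blank), 7 and 0.
targets = [[1 if i == d else 0 for i in range(11)] for d in (3, 10, 7, 0)]
loss = multitask_loss([[0.0] * 11 for _ in range(4)], targets)
```

With uniform scores every branch contributes ln(11), so the example loss equals 4 ln(11); training drives each branch's term down independently while the convolutional features stay shared.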
Specific embodiments of the present invention are as follows:
Step 1: Download the handwritten digit data set from http://yann.lecun.com/exdb/mnist/. It contains 70000 images of 28*28 pixels in total, of which 60000 are training images and 10000 are test images, covering 10 digit classes. The existing Mnist data are then combined into new verification code images containing 4 digits, which are divided into a training set of 50000, a validation set of 10000 and a test set of 10000. To account for blank characters, the blank character is represented as 10.
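A minimal sketch of how one such 4-digit image could be synthesized is given below. The patent does not say how the Mnist digits are combined, so joining four 28*28 tiles side by side into a 28*112 image, the all-zero blank tile, the `blank_prob` parameter and the constant-intensity stand-in tiles are all assumptions for illustration; real Mnist tiles would replace `fake_tiles`.

```python
import random

TILE = 28          # Mnist digit size stated in the text
BLANK = 10         # label used for the blank character
NUM_POSITIONS = 4  # characters per verification code

def blank_tile():
    """An all-zero 28x28 tile standing in for a blank position (assumption)."""
    return [[0] * TILE for _ in range(TILE)]

def synthesize_code(tiles_by_label, rng, blank_prob=0.1):
    """Build one 28x112 image and its 4 labels by joining tiles side by side."""
    labels, tiles = [], []
    for _ in range(NUM_POSITIONS):
        if rng.random() < blank_prob:
            labels.append(BLANK)
            tiles.append(blank_tile())
        else:
            label = rng.randrange(10)
            labels.append(label)
            tiles.append(rng.choice(tiles_by_label[label]))
    # Concatenate the four tiles row by row to form one wide image.
    image = [sum((tile[row] for tile in tiles), []) for row in range(TILE)]
    return image, labels

# Stand-in data: one constant-intensity tile per digit instead of real Mnist.
fake_tiles = {d: [[[d * 25] * TILE for _ in range(TILE)]] for d in range(10)}
image, labels = synthesize_code(fake_tiles, random.Random(0))
```

Repeating the call 70000 times and slicing the result 50000/10000/10000 would reproduce the split sizes given in Step 1.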
Step 2: One-hot encode the digits 0~9 and the blank character, as shown in the table below.
Symbol | One-hot code |
---|---|
0 | 10000000000 |
1 | 01000000000 |
2 | 00100000000 |
3 | 00010000000 |
4 | 00001000000 |
5 | 00000100000 |
6 | 00000010000 |
7 | 00000001000 |
8 | 00000000100 |
9 | 00000000010 |
Blank character (10) | 00000000001 |
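The encoding in the table can be reproduced with a few lines of Python. This is only an illustrative sketch; the function names `one_hot` and `encode_code` are hypothetical.

```python
NUM_CLASSES = 11  # digits 0-9 plus the blank character (label 10)

def one_hot(label, num_classes=NUM_CLASSES):
    """Return a list with a single 1 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} outside 0..{num_classes - 1}")
    return [1 if i == label else 0 for i in range(num_classes)]

def encode_code(labels):
    """Encode the 4 position labels of one verification code."""
    return [one_hot(label) for label in labels]

# Example: the code "3 blank 7 0" becomes four 11-element vectors.
encoded = encode_code([3, 10, 7, 0])
```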
Step 3: Divide each pixel value of the digit verification code image by 255 so that it is normalized to the range [0, 1].
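As a small illustration of this step, assuming the image is stored as rows of 8-bit grayscale values (the function name `normalize` is hypothetical):

```python
def normalize(image):
    """Scale 8-bit pixel values (0..255) to floats in [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

# Example on a tiny 2x2 image.
scaled = normalize([[0, 255], [51, 102]])
```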
Step 4: Design the multi-task learning convolutional neural network model. It comprises 2 convolution-pooling layers as a shared feature extractor and 4 parallel branches of 2 fully connected layers each as the multi-task output model, which predict the 4 digits of the verification code respectively. The nonlinear activation function of the model is relu. The convolutional neural network digit verification code recognition model based on multi-task learning is shown in Fig. 2.
Step 5: Train with the normalized training sample set to obtain a trained digit verification code recognition model. Specifically, the number of training iterations is set to 20, the parameters are initialized with a truncated normal distribution, the cost function to be optimized is the cross-entropy loss function, and the backpropagation parameter update method is the Adam optimization algorithm. The steps by which Adam updates the parameters are as follows:
(1) Take a mini-batch of m samples $\{x_1, x_2, \ldots, x_m\}$ from the training data; the target corresponding to each sample is denoted $y_i$.
(2) Compute the gradient of the loss with respect to the weight parameters over the m training samples at time step t:
$g = \frac{1}{m} \nabla_w \sum_{i=1}^{m} L_w(x_i, y_i)$
where $L_w$ is the cross-entropy loss function and $w$ denotes the parameters of the convolutional neural network.
(3) Compute the exponentially weighted average of the gradient (momentum):
$s = \rho_1 s + (1 - \rho_1) g$
where $\rho_1$ is typically 0.9 and $s$ is the first moment, initialized to 0.
(4) Compute the accumulated squared gradient:
$r = \rho_2 r + (1 - \rho_2)\, g \odot g$
where $\rho_2$ is typically 0.990 and $r$ is the second moment, initialized to 0.
(5) Apply bias correction to the exponentially weighted average:
$\hat{s} = \frac{s}{1 - \rho_1^{t}}$
(6) Apply bias correction to the accumulated squared gradient:
$\hat{r} = \frac{r}{1 - \rho_2^{t}}$
(7) Compute the update of the weight parameters:
$\Delta w = -\epsilon \frac{\hat{s}}{\sqrt{\hat{r}} + \delta}$
where $\epsilon$ is the learning rate (step size) and $\delta$ is a small constant added for numerical stability, typically $10^{-7}$.
(8) Update the weight parameters:
$w = w + \Delta w$
In this algorithm, the change applied to each parameter depends on that parameter's own exponentially weighted average and accumulated squared gradient, so different parameters effectively receive different learning rates.
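The update steps can be written out for a single scalar weight as follows. This is a sketch of the standard Adam rule with the constants quoted in the text (0.9, 0.990 and 10^-7); the learning rate value 0.001 and the function name `adam_step` are assumptions, since the patent does not state a learning rate.

```python
import math

def adam_step(w, g, s, r, t, lr=0.001, rho1=0.9, rho2=0.990, delta=1e-7):
    """One Adam update for a scalar weight w with gradient g at step t >= 1."""
    s = rho1 * s + (1 - rho1) * g        # first moment (momentum average)
    r = rho2 * r + (1 - rho2) * g * g    # second moment (squared gradient)
    s_hat = s / (1 - rho1 ** t)          # bias-corrected first moment
    r_hat = r / (1 - rho2 ** t)          # bias-corrected second moment
    dw = -lr * s_hat / (math.sqrt(r_hat) + delta)
    return w + dw, s, r

# First update of a weight w = 1.0 whose gradient is 0.5.
w, s, r = adam_step(1.0, 0.5, s=0.0, r=0.0, t=1)
```

At t = 1 the bias correction makes the corrected moments equal to g and g*g, so the first step has magnitude close to the learning rate regardless of the gradient's scale.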
Step 6: Prediction. Normalize a new, unseen image and run the prediction with the trained digit verification code model. One prediction result is shown in Fig. 3.
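Turning the 4 branch outputs into a readable code could look like the sketch below. Taking the highest-scoring class per branch and dropping blank positions from the final string are assumptions made for illustration; the patent does not describe the decoding step, and the names `argmax` and `decode_prediction` are hypothetical.

```python
BLANK = 10  # label used for the blank character

def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def decode_prediction(branch_outputs, blank=BLANK):
    """Pick the best class per branch and join the non-blank digits."""
    labels = [argmax(scores) for scores in branch_outputs]
    text = "".join(str(label) for label in labels if label != blank)
    return labels, text

# Example: branch scores that favor 3, blank, 7 and 0.
outputs = [[1.0 if i == d else 0.0 for i in range(11)] for d in (3, 10, 7, 0)]
labels, text = decode_prediction(outputs)
```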
The method was evaluated by simulation on the synthesized digit verification code data set, achieving a recognition rate of 96.9% on the training set and 95.2% on the validation set. Compared with methods that preprocess and segment the image and then recognize single characters with a convolutional neural network, the invention avoids segmenting the digit verification code, shortens recognition time by combining shared convolution with multi-task learning, avoids the low recognition rate caused by poor segmentation algorithms, and increases the robustness of verification code recognition.
The specific example above illustrates the invention and is intended only to help understand it, not to limit it. Those skilled in the art may make simple deductions, variations or substitutions based on the idea of the invention.
Claims (5)
1. A digit verification code recognition method based on multi-task learning, characterized by comprising the following steps:
Step (1): generate a simulated verification code training sample set in which each code contains 4 digits;
Step (2): one-hot encode the label of each digit of the verification code;
Step (3): normalize the digit verification code training sample set;
Step (4): design a multi-task learning convolutional neural network model;
Step (5): train with the normalized training sample set to obtain a trained digit verification code recognition model;
Step (6): normalize a new, unseen image and recognize it with the trained digit verification code recognition model.
2. The digit verification code recognition method based on multi-task learning according to claim 1, characterized in that step (1) includes obtaining the Mnist data set, synthesizing the 4-digit verification code data set, and dividing it into a training set and a test set.
3. The digit verification code recognition method based on multi-task learning according to claim 1, characterized in that step (3) includes the normalization of the digit verification code images.
4. The digit verification code recognition method based on multi-task learning according to claim 1, characterized in that step (4) includes designing the architecture of the convolutional neural network model and selecting the activation function: 2 convolution-pooling layers are designed as the feature extraction layers, 4 parallel branches of 2 fully connected layers each serve as the multi-task learning layers that predict the 4 digits of the verification code, and relu is chosen as the activation function.
5. The digit verification code recognition method based on multi-task learning according to claim 1, characterized in that step (5) includes setting the number of training iterations and selecting the parameter initialization method, the loss function and the backpropagation optimization algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910672921.5A CN110414592A (en) | 2019-07-24 | 2019-07-24 | A digit verification code recognition method based on multi-task learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110414592A true CN110414592A (en) | 2019-11-05 |
Family
ID=68362936
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910672921.5A Pending CN110414592A (en) | 2019-07-24 | 2019-07-24 | A digit verification code recognition method based on multi-task learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110414592A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085730A (en) * | 2017-03-24 | 2017-08-22 | 深圳爱拼信息科技有限公司 | A kind of deep learning method and device of character identifying code identification |
CN109933969A (en) * | 2017-12-15 | 2019-06-25 | 腾讯科技(深圳)有限公司 | Method for recognizing verification code, device, electronic equipment and readable storage medium storing program for executing |
CN109101810A (en) * | 2018-08-14 | 2018-12-28 | 电子科技大学 | A kind of text method for recognizing verification code based on OCR technique |
Non-Patent Citations (2)
Title |
---|
张国基: "《生物辨识系统与深度学习》", 北京工业大学出版社, pages: 106 - 107 * |
高志强等: "《深度学习 从入门到实战》", 30 June 2018, pages: 118 - 120 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259366A (en) * | 2020-01-22 | 2020-06-09 | 支付宝(杭州)信息技术有限公司 | Verification code recognizer training method and device based on self-supervision learning |
CN112270322A (en) * | 2020-12-17 | 2021-01-26 | 恒银金融科技股份有限公司 | Method for recognizing crown word number of bank note by utilizing neural network model |
CN116824597A (en) * | 2023-07-03 | 2023-09-29 | 金陵科技学院 | Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method |
CN116824597B (en) * | 2023-07-03 | 2024-05-24 | 金陵科技学院 | Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110414592A (en) | A digit verification code recognition method based on multi-task learning | |
CN109815339A (en) | Based on TextCNN Knowledge Extraction Method, device, computer equipment and storage medium | |
CN112000772B (en) | Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer | |
CN108197294A (en) | A kind of text automatic generation method based on deep learning | |
CN109005145A (en) | A kind of malice URL detection system and its method extracted based on automated characterization | |
CN112000771B (en) | Judicial public service-oriented sentence pair intelligent semantic matching method and device | |
CN111651762A (en) | Convolutional neural network-based PE (provider edge) malicious software detection method | |
CN112761628B (en) | Shale gas yield determination method and device based on long-term and short-term memory neural network | |
CN112380319A (en) | Model training method and related device | |
CN113962148B (en) | Yield prediction method, device and equipment based on convolutional coding dynamic sequence network | |
CN110472040A (en) | Extracting method and device, storage medium, the computer equipment of evaluation information | |
CN110287341A (en) | A kind of data processing method, device and readable storage medium storing program for executing | |
CN110955828A (en) | Multi-factor embedded personalized package recommendation method based on deep neural network | |
CN111241550B (en) | Vulnerability detection method based on binary mapping and deep learning | |
CN113434685A (en) | Information classification processing method and system | |
CN110689092B (en) | Sole pattern image depth clustering method based on data guidance | |
CN115130538A (en) | Training method of text classification model, text processing method, equipment and medium | |
CN112182568B (en) | Malicious code classification based on graph convolution network and topic model | |
CN115422518A (en) | Text verification code identification method based on data-free knowledge distillation | |
CN115170403A (en) | Font repairing method and system based on deep meta learning and generation countermeasure network | |
CN109635303B (en) | Method for recognizing meaning-changing words in specific field | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
CN113705215A (en) | Meta-learning-based large-scale multi-label text classification method | |
CN111737688B (en) | Attack defense system based on user portrait | |
CN116541792A (en) | Method for carrying out group partner identification based on graph neural network node classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20191105 |
|