CN110414592A - Digital verification code recognition method based on multi-task learning - Google Patents

Digital verification code recognition method based on multi-task learning

Info

Publication number
CN110414592A
Authority
CN
China
Prior art keywords
verification code
digital verification
identifying code
digital
task learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910672921.5A
Other languages
Chinese (zh)
Inventor
宋晓茹
吴雪
高嵩
陈超波
李继超
彭雨豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Technological University
Original Assignee
Xian Technological University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Technological University
Priority to CN201910672921.5A
Publication of CN110414592A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a digital verification code recognition method based on multi-task learning. The method mainly comprises: first generating a simulated training sample set of 4-digit verification codes; normalizing the digital verification code images; designing a convolutional neural network model in which 2 shared convolution-pooling layers perform feature extraction and 4 parallel branches, each consisting of 2 fully connected layers, predict the 4 digits of the verification code respectively; randomly initializing the convolutional neural network model and training it iteratively with a multi-task loss function to obtain the digital verification code recognition model; and inputting the digital verification code to be recognized into the trained recognition model to obtain the predicted digits. The method needs only a single recognition pass and avoids segmenting the digits in the verification code, which reduces the low recognition rates caused by poor segmentation algorithms and improves the robustness of verification code recognition.

Description

Digital verification code recognition method based on multi-task learning
Technical field
The invention belongs to the fields of computer vision and artificial intelligence and relates to a digital verification code recognition method, specifically a convolutional neural network digital verification code recognition method that uses multi-task learning.
Background
With the development of science and technology, especially computer science and technology, networks have brought great convenience to people's lives, but security problems have also become increasingly prominent. Network verification codes, as the first line of defense of network account security systems, are mainly used to resist malicious programs and prevent the abuse of network resources. Automatic verification code recognition technology can improve the security of existing verification codes and help in designing more secure ones, thereby effectively safeguarding network security, and it has become one of the most important problems in this area.
With the development of deep learning and artificial intelligence, image recognition with convolutional neural networks has become a research hotspot. The most common technique for verification code recognition with convolutional neural networks first preprocesses the verification code image, then segments it into individual digits, and finally feeds the digits into a convolutional neural network for recognition. Segmenting a digital verification code into individual digits before feeding them into the network is time-consuming, and the quality of the segmentation method directly affects the accuracy of the subsequent recognition. Studying an efficient verification code recognition method therefore has important practical value.
Summary of the invention
The invention provides a digital verification code recognition method based on multi-task learning, to solve the problem that prior-art digital verification code recognition with convolutional neural networks requires segmentation and is time-consuming.
To achieve the purpose of the invention, the technical solution provided by the invention is:
A digital verification code recognition method based on multi-task learning, comprising the following steps:
Step (1): generate a simulated verification code training sample set containing 4 digits;
Step (2): one-hot encode the label of each digit of the verification code;
Step (3): normalize the digital verification code training sample set as preprocessing;
Step (4): design the convolutional neural network model for multi-task learning;
Step (5): train with the normalized training sample set to obtain the trained digital verification code recognition model;
Step (6): normalize a new unknown image and recognize it with the trained digital verification code recognition model.
Further, step (1) includes obtaining the Mnist data set, synthesizing a 4-digit verification code data set, and dividing it into a training set and a test set.
Further, step (3) includes the normalization of the digital verification code images.
Further, step (4) includes designing the convolutional neural network model architecture and selecting the activation function: 2 convolution-pooling layers are designed as the feature extraction layers, 4 parallel branches of 2 fully connected layers each serve as the multi-task learning layers that predict the 4 digits of the verification code, and relu is chosen as the activation function.
Further, step (5) includes setting the number of training iterations and selecting the parameter initialization method, the loss function, and the backpropagation optimization algorithm.
Beneficial effects of the invention:
The proposed method introduces multi-task learning into digital verification code recognition. First, the verification code image is normalized to reduce its influence on the subsequent recognition. Second, using shared convolution and multi-task learning, 2 convolution-pooling layers extract features from the whole image, and 4 parallel branches of 2 fully connected layers each predict one digit of the digital verification code. This avoids the digit segmentation step of conventional digital verification code recognition, greatly reduces recognition time, and improves the robustness of verification code recognition.
Brief description of the drawings
Fig. 1 shows the synthesized verification code digits, where Fig. 1(a) is a digital verification code image without blank characters and Fig. 1(b) is a digital verification code image containing blank characters;
Fig. 2 is the multi-task verification code recognition model based on a convolutional neural network designed by the invention;
Fig. 3 shows a prediction result.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and a specific implementation.
The invention provides a digital verification code recognition method based on multi-task learning, comprising the following steps:
Step 1: Download the Mnist data set and synthesize verification code images containing 4 digits. Some of the synthesized images are shown in Fig. 1(a) and Fig. 1(b); blank regions are represented by the number 10.
Download address: http://yann.lecun.com/exdb/mnist/
Step 2: One-hot encode the numbers 0~10; for example, 0 is encoded as 10000000000, 1 is encoded as 01000000000, and the blank is encoded as 00000000001.
Step 3: Normalize the synthesized verification code images.
Step 4: Design the convolutional neural network model for multi-task learning. It comprises 2 convolution-pooling layers as a shared feature extractor and 4 parallel branches of 2 fully connected layers each as the multi-task output model, which predict the 4 digits of the verification code respectively. The nonlinear activation function in the convolutional neural network model is the relu activation function.
Step 5: Train with the normalized training sample set to obtain the trained digital verification code recognition model. This includes choosing the number of training iterations, the parameter initialization method, the cost function, and the method for updating parameters by backpropagation.
Step 6: Prediction: normalize a new unknown image and recognize it with the trained digital verification code recognition model.
A specific embodiment of the invention is as follows:
Step 1: Download the Mnist handwritten digit data set from http://yann.lecun.com/exdb/mnist/. It contains 70000 images of size 28*28 in total, of which 60000 are training images and 10000 are test images, covering 10 digit classes. The existing Mnist data are then combined into new verification code images containing 4 digits, which are divided into a training set of 50000, a validation set of 10000, and a test set of 10000. To account for blank characters, the blank character is represented as 10.
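The text does not say how the 4-digit images are assembled from single Mnist digits. A minimal sketch, assuming the four 28*28 tiles are simply placed side by side to form a 28*112 image and that a blank is an all-zero tile, could look like this. The function name `make_code_image` and the stand-in data `fake_mnist` are hypothetical and used for illustration only.

```python
import random

BLANK = 10  # class index used for the blank character, as in the text
SIZE = 28   # Mnist digit images are 28x28


def make_code_image(digit_images, labels, rng):
    """Place four 28x28 tiles side by side to form one 28x112 image.

    digit_images maps each digit 0-9 to a list of 28x28 images, where an
    image is a list of 28 rows of 28 pixel values in 0..255.
    labels holds four class indices in 0..10; index 10 gives a blank tile.
    The side-by-side layout and the all-zero blank tile are assumptions.
    """
    tiles = []
    for label in labels:
        if label == BLANK:
            tiles.append([[0] * SIZE for _ in range(SIZE)])
        else:
            tiles.append(rng.choice(digit_images[label]))
    # Row r of the wide image is row r of each tile, joined left to right.
    return [sum((tile[r] for tile in tiles), []) for r in range(SIZE)]


# Stand-in data: one constant-valued "image" per digit instead of real Mnist.
rng = random.Random(0)
fake_mnist = {d: [[[d + 1] * SIZE for _ in range(SIZE)]] for d in range(10)}
labels = [3, BLANK, 7, 1]
image = make_code_image(fake_mnist, labels, rng)
```

With real data, `fake_mnist` would be replaced by the Mnist training images grouped by label, and `labels` would be drawn at random for each synthesized sample.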
Step 2: One-hot encode the digits 0~9 and the blank character, as shown in the table below.
0: 10000000000    1: 01000000000
2: 00100000000    3: 00010000000
4: 00001000000    5: 00000100000
6: 00000010000    7: 00000001000
8: 00000000100    9: 00000000010
Blank character (10): 00000000001
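The encoding in the table can be reproduced with a few lines of code. This is a minimal sketch; the helper names `one_hot` and `encode_code`, and the use of a space to stand for the blank character in the input string, are assumptions made for illustration.

```python
NUM_CLASSES = 11  # digits 0-9 plus the blank character
BLANK = 10        # class index of the blank character, as in the table


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a list of 0s with a single 1 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError("class index out of range")
    vec = [0] * num_classes
    vec[index] = 1
    return vec


def encode_code(text):
    """Encode a 4-character code; a space stands for the blank character."""
    if len(text) != 4:
        raise ValueError("expected exactly 4 characters")
    return [one_hot(BLANK if ch == " " else int(ch)) for ch in text]


# One label vector per position, i.e. one target per output branch.
encoded = encode_code("4 07")
```

Each verification code therefore yields four 11-element label vectors, one for each of the four parallel output branches.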
Step 3: Divide each pixel value of the digital verification code image by 255 to normalize it to the range [0, 1].
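As a small illustration of this step, the sketch below scales 8-bit pixel values into [0, 1]. The nested-list image representation and the function name `normalize` are assumptions; an array library would normally be used instead.

```python
def normalize(image):
    """Scale 8-bit pixel values (0..255) to floats in [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


# A tiny 2x3 stand-in image instead of a full verification code image.
image = [[0, 51, 255], [128, 255, 0]]
normalized = normalize(image)
```

The same scaling has to be applied to new images at prediction time so that they match the distribution seen during training.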
Step 4: Design the convolutional neural network model for multi-task learning. It comprises 2 convolution-pooling layers as a shared feature extractor and 4 parallel branches of 2 fully connected layers each as the multi-task output model, which predict the 4 digits of the verification code respectively. The nonlinear activation function in the convolutional neural network model is the relu activation function. The convolutional neural network for digital verification code recognition based on multi-task learning is shown in Fig. 2.
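Because each of the four output branches predicts one digit, a natural way to train them jointly is to add up one cross-entropy term per branch. The text names cross-entropy as the cost function but does not state how the four terms are combined, so the unweighted sum below is an assumption, as are the function names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy of one branch against a one-hot target."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(head_logits, targets):
    """Sum the cross-entropy of all branches (assumed equal weights)."""
    return sum(cross_entropy(l, t) for l, t in zip(head_logits, targets))


# Four branches with 11 classes each; uniform scores give 4 * ln(11).
uniform = [[0.0] * 11 for _ in range(4)]
loss_uniform = multitask_loss(uniform, [3, 10, 7, 1])

# Scores that favour the correct class in every branch give a lower loss.
confident = [[5.0 if i == t else 0.0 for i in range(11)] for t in [3, 10, 7, 1]]
loss_confident = multitask_loss(confident, [3, 10, 7, 1])
```

Since the shared convolution-pooling layers receive gradients from all four terms, they are pushed to learn features useful for every digit position.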
Step 5: Train with the normalized training sample set to obtain the trained digital verification code recognition model. Specifically, the number of training iterations is set to 20; the parameter initialization method is given as the "cutting gearbox method" (apparently a mistranslation, most likely of truncated normal initialization); the cost function to be optimized is the cross-entropy loss function; and the parameters are updated during backpropagation with the Adam optimization algorithm. The steps by which Adam updates the parameters are as follows:
(1) Take a mini-batch of m samples $\{x_1, x_2, \dots, x_m\}$ from the training data, with the corresponding targets denoted $y_i$.
(2) Compute the gradient with respect to the weight parameters over the m training samples at time step t:
$g = \frac{1}{m} \nabla_w \sum_{i=1}^{m} L_w(x_i, y_i)$
where $L_w$ is the cross-entropy loss function and $w$ denotes the parameters of the convolutional neural network.
(3) Compute the exponentially weighted average of the momentum:
$s = \rho_1 s + (1 - \rho_1) g$
where $\rho_1$ is typically 0.9 and $s$ is the first moment, initialized to 0.
(4) Compute the accumulated squared gradient:
$r = \rho_2 r + (1 - \rho_2)\, g \odot g$
where $\rho_2$ is typically 0.990 and $r$ is the second moment, initialized to 0.
(5) Apply bias correction to the exponentially weighted average of the momentum; in the standard Adam algorithm this is:
$\hat{s} = \frac{s}{1 - \rho_1^{t}}$
(6) Apply bias correction to the accumulated squared gradient; in the standard Adam algorithm this is:
$\hat{r} = \frac{r}{1 - \rho_2^{t}}$
(7) Compute the update of the weight parameters; in the standard Adam algorithm this is:
$\Delta w = -\epsilon \frac{\hat{s}}{\sqrt{\hat{r}} + \delta}$
where $\epsilon$ is the learning rate (step size) and $\delta$ is a small constant introduced for numerical stability, typically $10^{-7}$.
(8) Update the weight parameters:
$w = w + \Delta w$
In this algorithm, the change in each parameter depends on that parameter's own exponentially weighted momentum average and accumulated squared gradient, so different parameters effectively adapt to different learning rates.
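The eight update steps above can be sketched in plain Python. The values of rho1, rho2, and delta follow the text; the bias-correction and update formulas are those of the standard Adam algorithm; the learning rate and the toy objective f(w) = (w - 3)^2 are assumptions chosen only to make the example runnable.

```python
import math


class Adam:
    """Minimal Adam optimizer over a flat list of parameters."""

    def __init__(self, lr=0.001, rho1=0.9, rho2=0.99, delta=1e-7):
        self.lr, self.rho1, self.rho2, self.delta = lr, rho1, rho2, delta
        self.s = None  # first moment, initialized to 0 on first use
        self.r = None  # second moment, initialized to 0 on first use
        self.t = 0     # time step

    def step(self, w, g):
        """Return updated parameters given parameters w and gradients g."""
        if self.s is None:
            self.s = [0.0] * len(w)
            self.r = [0.0] * len(w)
        self.t += 1
        new_w = []
        for i, (wi, gi) in enumerate(zip(w, g)):
            self.s[i] = self.rho1 * self.s[i] + (1 - self.rho1) * gi
            self.r[i] = self.rho2 * self.r[i] + (1 - self.rho2) * gi * gi
            s_hat = self.s[i] / (1 - self.rho1 ** self.t)  # bias correction
            r_hat = self.r[i] / (1 - self.rho2 ** self.t)  # bias correction
            dw = -self.lr * s_hat / (math.sqrt(r_hat) + self.delta)
            new_w.append(wi + dw)
        return new_w


# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
opt = Adam(lr=0.1)
w = [0.0]
w = opt.step(w, [2 * (w[0] - 3.0)])
first_step = w[0]
for _ in range(499):
    w = opt.step(w, [2 * (w[0] - 3.0)])
```

On the first step the bias-corrected moments cancel the gradient scale, so the parameter moves by almost exactly the learning rate; this is the per-parameter step-size adaptation that the text describes.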
Step 6: Prediction: normalize a new unknown image and run prediction on the trained digital verification code model. One of the prediction results is shown in Fig. 3.
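At prediction time each of the four branches outputs 11 scores, and the recognized code is read off by taking the highest-scoring class per branch. The sketch below assumes this argmax decoding and renders class 10 as a space; the names `decode` and `peaked` are hypothetical.

```python
BLANK = 10  # class index of the blank character


def decode(head_scores):
    """Turn four lists of 11 class scores into a 4-character string."""
    chars = []
    for scores in head_scores:
        cls = max(range(len(scores)), key=scores.__getitem__)  # argmax
        chars.append(" " if cls == BLANK else str(cls))
    return "".join(chars)


def peaked(index):
    """Stand-in branch output with its highest score at `index`."""
    return [0.9 if i == index else 0.01 for i in range(11)]


predicted = decode([peaked(5), peaked(0), peaked(BLANK), peaked(2)])
```

No segmentation of the input image is needed at this stage, because each branch is already tied to one digit position.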
The method provided by the invention was evaluated by simulation on the synthesized digital verification code data set and achieved a recognition rate of 96.9% on the training set and 95.2% on the validation set. Compared with methods that preprocess the image, segment it, and then recognize single characters with a convolutional neural network, the invention avoids the segmentation of digital verification codes, shortens recognition time by using shared convolution and multi-task learning, reduces the low recognition rates caused by poor segmentation algorithms, and increases the robustness of verification code recognition.
The invention has been illustrated above with a specific example, which is intended only to aid understanding of the invention and not to limit it. Those skilled in the art may make several simple deductions, variations, or substitutions according to the idea of the invention.

Claims (5)

1. A digital verification code recognition method based on multi-task learning, characterized by comprising the following steps:
Step (1): generate a simulated verification code training sample set containing 4 digits;
Step (2): one-hot encode the label of each digit of the verification code;
Step (3): normalize the digital verification code training sample set as preprocessing;
Step (4): design the convolutional neural network model for multi-task learning;
Step (5): train with the normalized training sample set to obtain the trained digital verification code recognition model;
Step (6): normalize a new unknown image and recognize it with the trained digital verification code recognition model.
2. The digital verification code recognition method based on multi-task learning according to claim 1, characterized in that step (1) includes obtaining the Mnist data set, synthesizing a 4-digit verification code data set, and dividing it into a training set and a test set.
3. The digital verification code recognition method based on multi-task learning according to claim 1, characterized in that step (3) includes the normalization of the digital verification code images.
4. The digital verification code recognition method based on multi-task learning according to claim 1, characterized in that step (4) includes designing the convolutional neural network model architecture and selecting the activation function: 2 convolution-pooling layers are designed as the feature extraction layers, 4 parallel branches of 2 fully connected layers each serve as the multi-task learning layers that predict the 4 digits of the verification code, and relu is chosen as the activation function.
5. The digital verification code recognition method based on multi-task learning according to claim 1, characterized in that step (5) includes setting the number of training iterations and selecting the parameter initialization method, the loss function, and the backpropagation optimization algorithm.
CN201910672921.5A 2019-07-24 2019-07-24 A kind of Digital verification code recognition methods based on multi-task learning Pending CN110414592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910672921.5A CN110414592A (en) 2019-07-24 2019-07-24 A kind of Digital verification code recognition methods based on multi-task learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910672921.5A CN110414592A (en) 2019-07-24 2019-07-24 A kind of Digital verification code recognition methods based on multi-task learning

Publications (1)

Publication Number Publication Date
CN110414592A true CN110414592A (en) 2019-11-05

Family

ID=68362936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910672921.5A Pending CN110414592A (en) 2019-07-24 2019-07-24 A kind of Digital verification code recognition methods based on multi-task learning

Country Status (1)

Country Link
CN (1) CN110414592A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085730A (en) * 2017-03-24 2017-08-22 深圳爱拼信息科技有限公司 A kind of deep learning method and device of character identifying code identification
CN109101810A (en) * 2018-08-14 2018-12-28 电子科技大学 A kind of text method for recognizing verification code based on OCR technique
CN109933969A (en) * 2017-12-15 2019-06-25 腾讯科技(深圳)有限公司 Method for recognizing verification code, device, electronic equipment and readable storage medium storing program for executing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张国基: "《生物辨识系统与深度学习》", 北京工业大学出版社, pages: 106 - 107 *
高志强等: "《深度学习 从入门到实战》", 30 June 2018, pages: 118 - 120 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259366A (en) * 2020-01-22 2020-06-09 支付宝(杭州)信息技术有限公司 Verification code recognizer training method and device based on self-supervision learning
CN112270322A (en) * 2020-12-17 2021-01-26 恒银金融科技股份有限公司 Method for recognizing crown word number of bank note by utilizing neural network model
CN116824597A (en) * 2023-07-03 2023-09-29 金陵科技学院 Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method
CN116824597B (en) * 2023-07-03 2024-05-24 金陵科技学院 Dynamic image segmentation and parallel learning hand-written identity card number and identity recognition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191105