CN111652233B - Text verification code automatic identification method aiming at complex background - Google Patents

Info

Publication number
CN111652233B
Authority
CN
China
Prior art keywords
verification code
text
picture
character
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010495757.8A
Other languages
Chinese (zh)
Other versions
CN111652233A (en
Inventor
王瑶
王佰玲
魏玉良
张茗晋
辛国栋
王巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weihai Tianzhiwei Network Space Safety Technology Co ltd
Harbin Institute of Technology Weihai
Original Assignee
Weihai Tianzhiwei Network Space Safety Technology Co ltd
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weihai Tianzhiwei Network Space Safety Technology Co ltd, Harbin Institute of Technology Weihai filed Critical Weihai Tianzhiwei Network Space Safety Technology Co ltd
Priority to CN202010495757.8A priority Critical patent/CN111652233B/en
Publication of CN111652233A publication Critical patent/CN111652233A/en
Application granted granted Critical
Publication of CN111652233B publication Critical patent/CN111652233B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Abstract

The invention relates to a method for automatically recognizing text verification codes with complex backgrounds, comprising the following steps: a verification code denoising module removes the complex security features of the real verification code through a cycle-consistent generative adversarial network (CycleGAN); a character segmentation module uses image processing algorithms to segment the whole verification code picture into single characters; and the segmented characters are fed into a text recognition network to obtain the final output. The method quickly and effectively recognizes text verification codes with background noise, character distortion and blurred edges, generalizes well and is portable, can be easily embedded into a crawler algorithm, and solves the verification code problem encountered during data acquisition.

Description

Text verification code automatic identification method aiming at complex background
Technical Field
The invention relates to a method for automatically recognizing text verification codes with complex backgrounds, and belongs to the technical field of verification code recognition.
Background
In the big data era, data sources are a prerequisite for big data analysis and data mining, and manually searching the Internet for useful data is time-consuming and labor-intensive. Crawler technology can automatically fetch content of interest from the Internet and use it as a data source for more advanced analysis. Verification codes, as a measure against automated programs, are a main constraint on crawling. Character verification codes are still widely used on the network, so a fully automatic end-to-end recognition method for this type of verification code is particularly important.
Existing automatic verification code recognition algorithms generally fall into three categories: attack algorithms for a specific verification code type, algorithms based on character segmentation, and methods based on deep learning. An attack algorithm for a specific type can only recognize a single type of verification code picture (such as the Microsoft verification code) and cannot be generalized to other types, so it is difficult to apply in engineering practice. Algorithms based on character segmentation generally preprocess the verification code picture with traditional image processing (such as graying and binarization); because traditional image processing is limited, background interference cannot be removed effectively, which makes character segmentation difficult and recognition accuracy low. In recent years, with the development of deep learning, verification code recognition based on neural network models has achieved good results, but two main problems remain. First, most existing deep learning methods use supervised learning and require a large amount of labeled training data (generally no fewer than 50,000 images), which is time-consuming and labor-intensive; when labeled samples are insufficient, overfitting occurs very easily, so the model fails to converge and accuracy is very low. Second, existing recognition methods achieve high accuracy on regular, slightly noisy text verification codes, but cannot recognize text verification codes with complex security features well.
In addition, Chinese patent document CN107967475A discloses a verification code recognition method based on window sliding and a convolutional neural network. First, a small number of verification code pictures are collected and, after noise reduction, the character set to be recognized is extracted; each character is rotated, distorted and given added background noise, and a convolutional neural network is trained on the resulting characters to obtain a single-character classifier. Finally, the verification code picture to be recognized is preprocessed and divided into connected domains, a window is slid over each connected domain, and the previously trained single-character classifier classifies each window to obtain the final recognition result. Chinese patent document CN110555298A discloses a verification code recognition model training method, a verification code recognition device and a computing device. The training method includes: acquiring verification code image samples with the same verification code length, and determining the character sample label corresponding to each verification code image sample; determining the verification code characters forming the character sample labels and the attribute values of those characters, and acquiring character type information of the verification code characters; encoding the character sample label according to the character type information and the attribute values to obtain an encoded sample label; and training a verification code recognition model for recognizing verification code images using the verification code image samples and the encoded sample labels.
However, the methods in both patent documents preprocess the verification code picture with traditional image processing algorithms. This is suitable only when there is no obvious noise; for verification code types with complex security features, it cannot effectively remove noise interference, which seriously degrades the accuracy of character segmentation and recognition.
Disclosure of Invention
The invention addresses the problems of existing verification code recognition technology, in particular that noise cannot be removed well from text verification codes with complex security features and that distorted text verification codes are recognized poorly when only a small number of labeled samples are available. To this end, the invention provides a method for automatically recognizing text verification codes with complex backgrounds. The method requires few labeled samples, has a short processing time and achieves high recognition accuracy; it overcomes the need of existing algorithms for a large amount of manual labeling and their poor recognition of characters with complex backgrounds and distortion, and has wide application prospects. The method combines a verification code denoising module, a character segmentation module and a verification code recognition module into a whole, realizing end-to-end automatic recognition of text verification codes. It obtains high recognition accuracy with only a small number of labeled samples (500 images), and recognizes noisy and distorted verification codes well. The method also generalizes well and can be applied to different types of text verification codes while keeping the model structure unchanged. Meanwhile, the model can be easily embedded into a crawler algorithm, so that the text verification code anti-crawler problem that enterprises and individuals encounter when acquiring data is solved quickly and efficiently.
The technical scheme of the invention is as follows:
A method for automatically recognizing text verification codes with complex backgrounds comprises the following steps:
the verification code denoising module removes the complex security features of the real verification code through a cycle-consistent generative adversarial network (CycleGAN);
the character segmentation module uses image processing algorithms to segment the whole verification code picture into single characters;
and the segmented characters are fed into a text recognition network to obtain the final output.
According to the invention, preferably, for text verification code types with large distortion and rotation amplitude, the text recognition network first corrects the input using spatial transformation layers (Spatial Transformer Layers), so that the model has spatial invariance.
The invention aims at automatically recognizing text verification codes with complex security features (such as background noise, edge blurring and character distortion), and is an automatic verification code recognition method based on a small number of training samples. It comprises three parts: a verification code denoising module, a character segmentation module and a text recognition module. The overall solution is shown in fig. 1. The verification code shown on the left side of fig. 1 is from Wikipedia and has blurred edges, noise and text distortion. The noise at the edges does not greatly affect recognition by the human eye, but for a neural network the disordered pixel distribution makes the characters difficult to segment and high recognition accuracy difficult to obtain. Therefore, the invention first denoises the real verification code through the cycle-consistent generative adversarial network, so that its edges become clear and further recognition becomes easier. The whole verification code picture is then segmented into single characters using image processing algorithms. Finally, the segmented characters are fed into the text recognition network to obtain the final output; in particular, for text verification code types with large distortion and rotation amplitude, the network first corrects the input using spatial transformation layers (Spatial Transformer Layers), so that the model has spatial invariance.
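The three-stage flow described above can be sketched as a simple composition of functions. This is only an illustrative skeleton: the function name `recognize_captcha`, the `Image` type and the three callables are hypothetical names introduced here, and the toy stages merely stand in for the trained denoising network, the segmentation algorithm and the recognition network.

```python
from typing import Callable, List, Sequence

# Assumption: a grayscale image is represented as rows of pixel values.
Image = List[List[int]]


def recognize_captcha(
    image: Image,
    denoise: Callable[[Image], Image],
    segment: Callable[[Image], Sequence[Image]],
    classify: Callable[[Image], str],
) -> str:
    """Run the three stages in order and join the per-character predictions."""
    clean = denoise(image)        # stage 1: remove security features
    characters = segment(clean)   # stage 2: split into single characters
    return "".join(classify(ch) for ch in characters)  # stage 3: recognize


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs; real stages would be trained models.
    toy = [[0, 1, 0, 1]]
    print(
        recognize_captcha(
            toy,
            denoise=lambda img: img,
            segment=lambda img: [[[p]] for p in img[0]],
            classify=lambda ch: "X" if ch[0][0] else "-",
        )
    )
```

Passing the stages in as callables keeps the skeleton independent of any particular model, which matches the text's claim that the modules can be swapped per verification code type without changing the overall structure.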
Verification code denoising module
According to the invention, preferably, when denoising the real verification code through the cycle-consistent generative adversarial network, a verification code generator is first used to generate, in batches and through parameter adjustment, pictures whose format is similar to that of the real verification code characters; these pictures and the real verification codes are combined in pairs into a training set that serves as the input of the denoising network.
According to the invention, preferably, the cycle-consistent generative adversarial network (CycleGAN) consists of two generators and two discriminators, forming a dual structure as a whole. Its core goal is to convert a verification code picture with complex security features into a simple verification code with the security features removed, so as to reduce the difficulty of character segmentation and recognition. During model training, as shown in fig. 2, a real input image is first taken from domain A and converted into a simple verification code picture in target domain B by the first generator A→B; this picture is then fed to the second generator B→A, which converts it back to the original complex picture. In addition, two discriminators (Discriminator) are used to determine whether an input picture is a real input picture or a fake picture produced by a generator (Generator).
According to the invention, preferably, the optimization objective of the denoising cycle-consistent generative adversarial network includes two different types of loss functions, namely an adversarial loss (Adversarial Loss) and a cycle consistency loss (Cycle Consistency Loss); the adversarial loss is used to match the pixel distribution of the generated picture to the pixel distribution of pictures in the target domain; the cycle consistency loss is used to keep the image converted back as similar as possible to the image in the source domain;
further preferably, the real verification codes and the generated verification codes are taken as domain X and domain Y respectively, and two style converters are used to convert between domain X and domain Y in both directions; the optimization process is as follows: (1) first, a convolutional neural network extracts features from the input picture to obtain a feature vector; (2) then a ResNet module converts the feature vector of the picture in domain X into a feature vector in domain Y, preserving the features of the original image during conversion; (3) finally, the decoding process restores the converted image from the feature vector through deconvolution operations. The discriminator consists of a multi-layer convolutional neural network; it takes a picture as input and tries to judge whether the picture is a real picture from the original domain or a fake picture generated through conversion, and its final layer outputs the predicted probability that the input is a real picture. The algorithm flow is shown in fig. 3. Unlike the unidirectional conversion of a conventional generative adversarial network, the invention uses two style converters to convert between domain X and domain Y.
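The patent text names the two losses without giving formulas. For orientation, the standard CycleGAN formulation can be written as follows; the symbols are assumptions not taken from the patent: G converts X→Y, F converts Y→X, D_X and D_Y are the two discriminators, and λ is a weighting factor whose value the patent does not state.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{adv}}(G, D_Y)
  &= \mathbb{E}_{y \sim p_Y}\big[\log D_Y(y)\big]
   + \mathbb{E}_{x \sim p_X}\big[\log\big(1 - D_Y(G(x))\big)\big] \\
\mathcal{L}_{\mathrm{cyc}}(G, F)
  &= \mathbb{E}_{x \sim p_X}\big[\lVert F(G(x)) - x \rVert_1\big]
   + \mathbb{E}_{y \sim p_Y}\big[\lVert G(F(y)) - y \rVert_1\big] \\
\mathcal{L}(G, F, D_X, D_Y)
  &= \mathcal{L}_{\mathrm{adv}}(G, D_Y)
   + \mathcal{L}_{\mathrm{adv}}(F, D_X)
   + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G, F)
\end{aligned}
```

The first term pushes generated pictures toward the target-domain distribution, while the second penalizes any round trip X→Y→X or Y→X→Y that fails to reproduce its input, which is what keeps the characters intact while the noise is removed.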
The trained verification code denoising network can effectively identify the complex security features that may interfere with character segmentation and recognition (including background noise, interference lines, character colors, character distortion and blur, and small character spacing) and remove them well, so that the input becomes a simple verification code picture. This effectively reduces the difficulty of character segmentation and recognition, so that high recognition accuracy can be achieved with only a small number of labeled samples. Meanwhile, the denoising network is general: it can be applied to different types of verification code pictures without changing the model structure, which greatly reduces manual intervention.
Character segmentation module
After passing through the verification code denoising network, the original verification code with complex security features has been converted into a simple verification code, which is input into the character segmentation module. Depending on the characteristics of each verification code type, the character string in the picture is segmented into single characters by contour detection, equidistant segmentation or threshold segmentation. Equidistant segmentation is a conventional image processing algorithm that divides the picture into N parts of equal pixel width, but it does not separate the verification code characters well: as shown in fig. 4 (a), two characters may fall into one box. The invention therefore improves this segmentation method. The starting position of the segmentation is moved from (0, 0) to the upper-left pixel of the first character, the segmentation width is set to the approximate width of each character, and the segmentation height is set to the approximate height of each character. The result is shown in fig. 4 (b), where the black frames represent the segmentation result.
According to the invention, preferably, the image processing algorithms are contour detection, an improved equidistant segmentation algorithm and a threshold segmentation algorithm, wherein in the improved equidistant segmentation algorithm the starting position of the segmentation is the upper-left pixel of the first character, the segmentation width is the approximate width of each character, and the segmentation height is the approximate height of each character.
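A minimal sketch of the improved equidistant segmentation, assuming the upper-left corner of the first character and the approximate character width and height are already known (the patent does not say how they are measured). The function names `equidistant_boxes` and `crop` and the nested-list image format are illustrative choices, not part of the patent.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def equidistant_boxes(
    x0: int, y0: int, char_w: int, char_h: int, n_chars: int
) -> List[Box]:
    """Return n_chars boxes of size char_w x char_h starting at (x0, y0).

    (x0, y0) is the upper-left pixel of the first character rather than
    the picture origin (0, 0) used by plain equidistant segmentation.
    """
    return [
        (x0 + i * char_w, y0, x0 + (i + 1) * char_w, y0 + char_h)
        for i in range(n_chars)
    ]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut one character out of an image stored as rows of pixel values."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Four characters, each about 10 x 20 pixels, first one starting at (5, 3).
    for box in equidistant_boxes(5, 3, 10, 20, 4):
        print(box)
```

Because the boxes start at the first character and use the character size rather than the picture width divided by N, margins around the text no longer shift the cuts into the middle of a character.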
For text verification codes whose edges are clear after processing but whose characters are distorted, the conventional segmentation algorithm is not applicable, and the invention preferably uses a contour detection algorithm to segment the characters;
further preferably, the contour detection algorithm scans the pixels of the whole picture, finds the starting point of the outer boundary and the starting point of the hole boundary of each character, numbers the boundary points, and finally connects the outer boundaries through a contour drawing function to obtain the final segmentation result.
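The sketch below illustrates the outcome of this step on a binarized picture. It does not implement the border-following procedure the text describes; it substitutes a plain flood fill over 8-connected foreground pixels, which gives the same bounding box per character as its outer contour would, and it ignores hole boundaries. The function name `component_boxes` and the convention that non-zero pixels are ink are assumptions made for the example.

```python
from collections import deque
from typing import List, Tuple

# (left, top, right, bottom), all inclusive.
Box = Tuple[int, int, int, int]


def component_boxes(binary: List[List[int]]) -> List[Box]:
    """Bounding boxes of 8-connected non-zero regions, sorted left to right."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (
                            0 <= nx < width
                            and 0 <= ny < height
                            and binary[ny][nx]
                            and not seen[ny][nx]
                        ):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


if __name__ == "__main__":
    picture = [
        [1, 1, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 0, 1, 0],
    ]
    print(component_boxes(picture))
```

Unlike fixed-width boxes, this approach follows each character's actual extent, which is why it suits distorted characters, provided the characters do not touch each other after denoising.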
Threshold segmentation is a region-based image segmentation technique, applicable to pictures where the target and background occupy different gray level ranges.
According to the invention, preferably, when the character sizes and spacing in the verification code picture are unequal, a threshold segmentation algorithm is used, with the following flow: first, the picture is binarized; then the pixel values are accumulated along the ordinate of the picture, and the threshold is determined by peak-valley analysis.
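One plausible reading of this flow is a vertical projection: binarize, sum the ink pixels in each column, and cut where the column profile falls into a valley. The sketch below follows that reading. The binarization threshold of 128, the fixed `valley_max` cutoff (a simple stand-in for the peak-valley analysis, which the patent does not detail) and all function names are assumptions.

```python
from typing import List, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map dark pixels (< threshold) to 1 (ink) and the rest to 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def column_profile(binary: List[List[int]]) -> List[int]:
    """Accumulate ink pixels down each column of the picture."""
    return [sum(col) for col in zip(*binary)]


def split_at_valleys(
    profile: List[int], valley_max: int = 0
) -> List[Tuple[int, int]]:
    """Return [start, end) column ranges whose profile exceeds valley_max."""
    spans: List[Tuple[int, int]] = []
    start = None
    for x, value in enumerate(profile):
        if value > valley_max and start is None:
            start = x                      # entering a character (peak region)
        elif value <= valley_max and start is not None:
            spans.append((start, x))       # leaving it at a valley
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


if __name__ == "__main__":
    gray = [
        [255, 0, 0, 255, 255, 0, 255],
        [255, 0, 255, 255, 255, 0, 0],
    ]
    profile = column_profile(binarize(gray))
    print(profile, split_at_valleys(profile))
```

Since the cuts are placed where the profile dips rather than at fixed intervals, characters of different widths and uneven gaps are handled naturally; raising `valley_max` slightly can help when residual noise keeps the valleys from reaching zero.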
Text recognition module
Because the simplified verification code picture has had most of the security features that interfere with segmentation removed, the segmentation module can achieve high segmentation accuracy, and the difficulty of character recognition is reduced at the same time. The invention uses a simple convolutional neural network model as the final text recognition module; the specific model structure is shown in fig. 5.
According to the invention, preferably, the text recognition network is a convolutional neural network comprising convolutional layers, pooling layers, a dropout layer and fully connected layers;
further preferably, the convolutional neural network uses ReLU as the activation function and cross entropy as the loss function, and Adadelta is selected as the optimizer. Because the model has few convolutional layers, it is not prone to overfitting and does not need a large amount of training data. In actual use, high recognition accuracy can be obtained by training on only 500 samples; model training time is greatly reduced, processing during recognition is fast, and engineering requirements can be met.
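For readers unfamiliar with the two named functions, the following pure-Python sketch shows ReLU and the softmax cross-entropy loss on a single sample. It illustrates the definitions only; it is not the patent's network, whose layer sizes are given in fig. 5 and not reproduced here, and the function names are illustrative.

```python
import math
from typing import List


def relu(values: List[float]) -> List[float]:
    """ReLU activation: negative inputs become 0, others pass through."""
    return [max(0.0, v) for v in values]


def softmax(logits: List[float]) -> List[float]:
    """Turn raw class scores into probabilities (shifted for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])


if __name__ == "__main__":
    scores = relu([-1.2, 0.3, 2.5])   # e.g. scores for three character classes
    print(scores, cross_entropy(scores, target=2))
```

The loss is small when the network assigns high probability to the correct character and grows as that probability shrinks, which is the quantity the optimizer reduces during training.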
Matters not described in detail in the invention follow the prior art.
The beneficial effects of the invention are as follows:
1. The method quickly and effectively recognizes text verification codes with background noise, character distortion and blurred edges, generalizes well and is portable, can be easily embedded into a crawler algorithm, and solves the verification code problem encountered during data acquisition.
2. The method achieves high recognition accuracy on text verification codes with complex backgrounds, distorted characters and blurred edges.
3. The invention achieves good recognition results with only a small amount of data annotation, which reduces manual intervention.
4. The invention generalizes well, is portable, and is suitable for different types of text verification codes; model training time is short and processing is fast, so engineering requirements can be met. It supports any web crawler algorithm, can be applied to any website or software that requires automatic verification code recognition, and has wide application prospects.
Drawings
FIG. 1 is a flow chart of a complex verification code identification solution based on a small number of samples according to the present invention.
FIG. 2 is a diagram of the overall structure of the verification code denoising network according to the present invention.
FIG. 3 is a diagram of the cycle consistency loss according to the present invention.
FIG. 4 is a diagram illustrating the results of the character segmentation algorithms, wherein: (a) conventional equidistant segmentation algorithm; (b) improved segmentation algorithm.
Fig. 5 is a network structure diagram of the text recognition module according to the present invention.
Detailed Description
The invention will now be further illustrated by, but is not limited to, the following specific examples in connection with the accompanying drawings.
Example 1
A method for automatically recognizing text verification codes with complex backgrounds comprises the following steps:
the verification code denoising module removes the complex security features of the real verification code through a cycle-consistent generative adversarial network (CycleGAN), and at the same time makes the edges of the characters clear:
First, a verification code generator is used to generate, in batches and through parameter adjustment, pictures whose format is similar to that of the real verification code characters; these pictures and the real verification codes are combined in pairs into a training set that serves as the input of the denoising network. The cycle-consistent generative adversarial network (CycleGAN) consists of two generators and two discriminators, forming a dual structure as a whole. Its core goal is to convert a verification code picture with complex security features into a simple verification code with the security features removed, so as to reduce the difficulty of character segmentation and recognition. During model training, as shown in fig. 2, a real input image is first taken from domain A and converted into a simple verification code picture in target domain B by the first generator A→B; this picture is then fed to the second generator B→A, which converts it back to the original complex picture. In addition, two discriminators (Discriminator) are used to determine whether an input picture is a real input picture or a fake picture produced by a generator (Generator).
The optimization objective of the denoising cycle-consistent generative adversarial network includes two different types of loss functions, namely an adversarial loss (Adversarial Loss) and a cycle consistency loss (Cycle Consistency Loss); the adversarial loss is used to match the pixel distribution of the generated picture to the pixel distribution of pictures in the target domain; the cycle consistency loss is used to keep the image converted back as similar as possible to the image in the source domain. The real verification codes and the generated verification codes are taken as domain X and domain Y respectively, and two style converters are used to convert between domain X and domain Y in both directions. The optimization process is as follows: (1) first, a convolutional neural network extracts features from the input picture to obtain a feature vector; (2) then a ResNet module converts the feature vector of the picture in domain X into a feature vector in domain Y, preserving the features of the original image during conversion; (3) finally, decoding is carried out through deconvolution operations, and the converted image is restored from the feature vector. The discriminator consists of a multi-layer convolutional neural network; it takes a picture as input and tries to judge whether the picture is a real picture from the original domain or a fake picture generated through conversion, and its final layer outputs the predicted probability that the input is a real picture. The algorithm flow is shown in fig. 3.
The character segmentation module uses image processing algorithms to segment the whole verification code picture into single characters:
the image processing algorithms comprise contour detection, an improved equidistant segmentation algorithm and a threshold segmentation algorithm. In the improved equidistant segmentation algorithm, the starting position of the segmentation is the upper-left pixel of the first character, the segmentation width is the approximate width of each character, and the segmentation height is the approximate height of each character; the result is shown in fig. 4 (b), where the black frames represent the segmentation result. For text verification codes whose edges are clear after processing but whose characters are distorted, the conventional segmentation algorithm is not applicable, and the invention preferably uses a contour detection algorithm to segment the characters; the contour detection algorithm scans the pixels of the whole picture, finds the starting point of the outer boundary and the starting point of the hole boundary of each character, numbers the boundary points, and finally connects the outer boundaries through a contour drawing function to obtain the final segmentation result. When the character sizes and spacing in the verification code picture are unequal, the invention uses a threshold segmentation algorithm with the following flow: first, the picture is binarized; then the pixel values are accumulated along the ordinate of the picture, and the threshold is determined by peak-valley analysis.
The segmented characters are fed into a text recognition network to obtain the final output:
the text recognition network is a convolutional neural network comprising convolutional layers, pooling layers, a dropout layer and fully connected layers; the convolutional neural network uses ReLU as the activation function and cross entropy as the loss function, and Adadelta is selected as the optimizer.
The overall solution of the invention is shown in fig. 1. The verification code shown on the left side of fig. 1 is from Wikipedia; for a neural network, its chaotic pixel distribution makes it difficult to segment, and high recognition accuracy is difficult to obtain. The invention denoises the real verification code through the cycle-consistent generative adversarial network, so that its edges become clear and further recognition becomes easier. The whole verification code picture is then segmented into single characters using the corresponding image processing algorithm. Finally, the segmented characters are fed into the text recognition network to obtain the final output. Because the text recognition model designed in this patent has few convolutional layers, it is not prone to overfitting and does not need a large amount of training data. In actual use, high recognition accuracy can be obtained by training on only 500 samples; model training time is greatly reduced, processing during recognition is fast, and engineering requirements can be met.
In particular, for text verification code types with large distortion and rotation amplitude, the text recognition network first corrects the input using spatial transformation layers (Spatial Transformer Layers), so that the model has spatial invariance.
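The patent does not state which transformation the spatial transformation layers learn. In the commonly used formulation of spatial transformer layers, a small localization network predicts the six parameters of an affine matrix, and each output (target) pixel coordinate is mapped to the input (source) coordinate from which it is sampled. Under that assumption, with all symbols below introduced here for illustration:

```latex
\begin{pmatrix} x_i^{s} \\ y_i^{s} \end{pmatrix}
= A_\theta \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix}
= \begin{pmatrix}
    \theta_{11} & \theta_{12} & \theta_{13} \\
    \theta_{21} & \theta_{22} & \theta_{23}
  \end{pmatrix}
  \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix},
\qquad
A_\theta^{\mathrm{rot}}(\varphi)
= \begin{pmatrix}
    \cos\varphi & -\sin\varphi & 0 \\
    \sin\varphi & \cos\varphi & 0
  \end{pmatrix}
```

The second matrix is the special case of a pure rotation by angle φ about the origin, which is the kind of correction a rotated character needs; the general affine form also covers scaling, shear and translation.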

Claims (4)

1. A method for automatically recognizing text verification codes with complex backgrounds, comprising the following steps:
the verification code denoising module removes the complex security features of the real verification code through a cycle-consistent generative adversarial network;
the character segmentation module uses image processing algorithms to segment the whole verification code picture into single characters;
the segmented characters are fed into a text recognition network to obtain the final output;
in the process of denoising the real verification code through the cycle-consistent generative adversarial network, a verification code generator is first used to generate, in batches and through parameter adjustment, pictures whose format is similar to that of the real verification code characters, and these pictures and the real verification codes are combined in pairs into a training set that serves as the input of the denoising network; the cycle-consistent generative adversarial network consists of two generators and two discriminators, forming a dual structure as a whole;
the image processing algorithms are contour detection, an improved equidistant segmentation algorithm and a threshold segmentation algorithm, wherein in the improved equidistant segmentation algorithm the starting position of the segmentation is the upper-left pixel of the first character, the segmentation width is the approximate width of each character, and the segmentation height is the approximate height of each character;
the optimization objective of the denoising cycle-consistent generative adversarial network includes two different types of loss functions, namely an adversarial loss and a cycle consistency loss; the adversarial loss is used to match the pixel distribution of the generated picture to the pixel distribution of pictures in the target domain; the cycle consistency loss is used to keep the image converted back as similar as possible to the image in the source domain;
the real verification codes and the generated verification codes are taken as domain X and domain Y respectively, and two style converters are used to convert between domain X and domain Y in both directions; the optimization process is as follows: (1) first, a convolutional neural network extracts features from the input picture to obtain a feature vector; (2) then a ResNet module converts the feature vector of the picture in domain X into a feature vector in domain Y, preserving the features of the original image during conversion; (3) finally, the decoding process restores the converted image from the feature vector through deconvolution operations;
for text verification codes whose edges are clear after processing but whose characters are distorted, a contour detection algorithm is used to segment the characters;
the contour detection algorithm scans the pixels of the whole picture, finds the starting point of the outer boundary and the starting point of the hole boundary of each character, numbers the boundary points, and finally connects the outer boundaries through a contour drawing function to obtain the final segmentation result;
for cases where the sizes and spacing of the characters in the verification code picture are unequal, a threshold segmentation algorithm is used, with the following flow: first, the picture is binarized; then the accumulated pixel values along the vertical axis of the picture are calculated, and the threshold is determined by peak-valley analysis.
2. The method for automatically recognizing text verification codes with complex backgrounds according to claim 1, wherein the text recognition network is a convolutional neural network comprising a convolutional layer, a pooling layer, a dropout layer, and a fully connected layer.
3. The method for automatically recognizing text verification codes with complex backgrounds according to claim 2, wherein the convolutional neural network uses ReLU as the activation function and cross entropy as the loss function, and Adadelta is selected as the optimizer.
4. The method for automatically recognizing text verification codes with complex backgrounds according to claim 1, wherein, for text verification code types with large distortion and rotation amplitudes, the text recognition network first uses a spatial transformer layer to rectify the input so that the model has spatial invariance.
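For reference, the two loss types named in claim 1 can be written out in the form commonly used for cycle-consistent generative adversarial networks. The claims do not state the formulas, so the following is an assumed standard formulation rather than the patent's own: the symbols G (converter from domain X to domain Y), F (converter from Y to X), the discriminators D_X and D_Y, the L1 norm, and the weight λ are all assumptions introduced here for illustration.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y)
  &= \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\bigl[\log D_Y(y)\bigr]
   + \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log\bigl(1 - D_Y(G(x))\bigr)\bigr] \\
\mathcal{L}_{\mathrm{cyc}}(G, F)
  &= \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\lVert F(G(x)) - x \rVert_1\bigr]
   + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\bigl[\lVert G(F(y)) - y \rVert_1\bigr] \\
\mathcal{L}(G, F, D_X, D_Y)
  &= \mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y)
   + \mathcal{L}_{\mathrm{GAN}}(F, D_X, Y, X)
   + \lambda\, \mathcal{L}_{\mathrm{cyc}}(G, F)
\end{aligned}
```

Under this reading, the adversarial terms push converted pictures toward the pixel distribution of the target domain, while the cycle term penalizes a round trip X to Y to X (or Y to X to Y) that fails to return to the original picture, matching the roles the claim assigns to the two losses.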
CN202010495757.8A 2020-06-03 2020-06-03 Text verification code automatic identification method aiming at complex background Active CN111652233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010495757.8A CN111652233B (en) 2020-06-03 2020-06-03 Text verification code automatic identification method aiming at complex background

Publications (2)

Publication Number Publication Date
CN111652233A CN111652233A (en) 2020-09-11
CN111652233B true CN111652233B (en) 2023-04-25

Family

ID=72345001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010495757.8A Active CN111652233B (en) 2020-06-03 2020-06-03 Text verification code automatic identification method aiming at complex background

Country Status (1)

Country Link
CN (1) CN111652233B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112164126A (en) * 2020-09-15 2021-01-01 郑州金惠计算机系统工程有限公司 Method and device for generating composite picture, electronic equipment and storage medium
CN112380409A (en) * 2020-10-26 2021-02-19 武汉天宝莱信息技术有限公司 Verification code identification method based on automatic crawler
CN112905977A (en) * 2020-11-23 2021-06-04 重庆大学 Verification code generation method based on image style conversion
CN113065417A (en) * 2021-03-17 2021-07-02 国网河北省电力有限公司 Scene text recognition method based on generation countermeasure style migration
CN113554549B (en) * 2021-07-27 2024-03-29 深圳思谋信息科技有限公司 Text image generation method, device, computer equipment and storage medium
CN117573810B (en) * 2024-01-15 2024-04-09 腾讯烟台新工科研究院 Multi-language product package instruction text recognition query method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182657A (en) * 2018-01-26 2018-06-19 深圳市唯特视科技有限公司 A kind of face-image conversion method that confrontation network is generated based on cycle
CN109117848A (en) * 2018-09-07 2019-01-01 泰康保险集团股份有限公司 A kind of line of text character identifying method, device, medium and electronic equipment
CN110020692A (en) * 2019-04-13 2019-07-16 南京红松信息技术有限公司 A kind of handwritten form separation and localization method based on block letter template

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633192B (en) * 2017-08-22 2020-05-26 电子科技大学 Bar code segmentation and reading method based on machine vision under complex background
CN110097056A (en) * 2018-01-30 2019-08-06 江苏博智软件科技股份有限公司 A kind of method for recognizing verification code based on intelligent pattern algorithm
US20190286938A1 (en) * 2018-03-13 2019-09-19 Recogni Inc. Real-to-synthetic image domain transfer
CN108446704A (en) * 2018-03-29 2018-08-24 哈尔滨理工大学 A kind of segmentation of adhesion character identifying code and recognition methods
CN109508717A (en) * 2018-10-09 2019-03-22 苏州科达科技股份有限公司 A kind of licence plate recognition method, identification device, identification equipment and readable storage medium storing program for executing
CN110570445B (en) * 2019-09-09 2022-03-25 上海联影医疗科技股份有限公司 Image segmentation method, device, terminal and readable medium
CN110210204B (en) * 2019-05-30 2021-07-13 网易(杭州)网络有限公司 Verification code generation method and device, storage medium and electronic equipment
CN110276357A (en) * 2019-07-01 2019-09-24 浪潮卓数大数据产业发展有限公司 A kind of method for recognizing verification code based on convolutional neural networks
CN110348450A (en) * 2019-07-15 2019-10-18 中国工商银行股份有限公司 Safety evaluation method, device and computer system for image authentication code
CN110570363A (en) * 2019-08-05 2019-12-13 浙江工业大学 Image defogging method based on Cycle-GAN with pyramid pooling and multi-scale discriminator
CN110659586B (en) * 2019-08-31 2022-03-15 电子科技大学 Gait recognition method based on identity-preserving cyclic generation type confrontation network
CN110766017B (en) * 2019-10-22 2023-08-04 国网新疆电力有限公司信息通信公司 Mobile terminal text recognition method and system based on deep learning

Also Published As

Publication number Publication date
CN111652233A (en) 2020-09-11

Similar Documents

Publication Publication Date Title
CN111652233B (en) Text verification code automatic identification method aiming at complex background
CN108491836B (en) Method for integrally identifying Chinese text in natural scene image
CN113128442B (en) Chinese character handwriting style identification method and scoring method based on convolutional neural network
CN110598686B (en) Invoice identification method, system, electronic equipment and medium
CN108520215B (en) Single-sample face recognition method based on multi-scale joint feature encoder
CN101114340A (en) VLSI realizing system and method of histogram equalization image processing
CN105335760A (en) Image number character recognition method
CN116910752B (en) Malicious code detection method based on big data
CN110532825A (en) A kind of bar code identifying device and method based on artificial intelligence target detection
CN116030396A (en) Accurate segmentation method for video structured extraction
CN106503112B (en) Video retrieval method and device
CN1858773A (en) Image identifying method based on Gabor phase mode
CN110781898A (en) Unsupervised learning method for Chinese character OCR post-processing
Darma et al. Segmentation of balinese script on lontar manuscripts using projection profile
Devi et al. Brahmi script recognition system using deep learning techniques
CN111753842B (en) Method and device for detecting text region of bill
KR102576747B1 (en) System for local optimization of objects detector based on deep neural network and method for creating local database thereof
Zhang et al. An improved binarization algorithm of QR code image
CN114463734A (en) Character recognition method and device, electronic equipment and storage medium
CN111754459B (en) Dyeing fake image detection method based on statistical depth characteristics and electronic device
Subramani et al. A novel binarization method for degraded tamil palm leaf images
Sahu et al. A survey on handwritten character recognition
Mosannafat et al. Farsi text detection and localization in videos and images
CN116912845B (en) Intelligent content identification and analysis method and device based on NLP and AI
Shivani Techniques of Text Detection and Recognition: A Survey

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant