CN109086591A - Method for recognizing verification code, device, computer equipment and storage medium - Google Patents
Method for recognizing verification code, device, computer equipment and storage medium
- Publication number
- CN109086591A (application number CN201810595668.3A)
- Authority
- CN
- China
- Prior art keywords
- identifying code
- character
- identified
- code picture
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/36—User authentication by graphic or iconic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Artificial Intelligence (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a verification code recognition method, a device, computer equipment and a storage medium. The method includes: obtaining a verification code picture to be recognized on a target website; performing recognition processing on the verification code picture based on Tesseract to obtain a recognition result; and judging whether the recognition result is a formula. If it is, the calculated result is filled into the input box; if it is not, the recognition result is filled into the input box. Because the verification code picture is recognized based on Tesseract to obtain the verification code content information, and the result is calculated automatically when that content is a formula, the calculated result or the non-formula verification code content is filled into the input box corresponding to the verification code picture. As a result, when target resource information is obtained from the target website, no time needs to be spent manually entering the verification code content of the target website, which improves the efficiency of obtaining resource information.
Description
Technical field
The present invention relates to the financial field, and in particular to a verification code recognition method, a device, computer equipment and a storage medium.
Background art
In today's information age, resource information plays a crucial role for every company. Because the Internet is so convenient, company staff often obtain resource information from websites. To guarantee access quality, some websites make resource information viewable only after logging in with an account and password, and additionally require a verification code picture at login. As a result, when users log in to a website with an account and password to obtain this resource information, they must now spend time manually entering the verification code content shown in the verification code picture, which was not needed before, and the efficiency of obtaining resource information is therefore low.
Summary of the invention
In view of the above technical problem, it is necessary to provide a verification code recognition method, device, computer equipment and storage medium that can improve the efficiency of obtaining resource information.
A verification code recognition method, comprising:
obtaining a verification code picture to be recognized on a target website;
performing recognition processing on the verification code picture to be recognized based on Tesseract to obtain a recognition result corresponding to the verification code picture to be recognized, wherein Tesseract is an optical character recognition tool;
judging whether the recognition result is a formula;
if the recognition result is a formula, filling the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized;
if the recognition result is not a formula, filling the recognition result into the input box corresponding to the verification code picture to be recognized.
A verification code recognition device, comprising:
a recognition processing module, configured to perform recognition processing on the verification code picture to be recognized based on Tesseract to obtain a recognition result corresponding to the verification code picture to be recognized, wherein Tesseract is an optical character recognition tool;
a judgment module, configured to judge whether the recognition result is a formula;
a first filling module, configured to fill the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized if the recognition result is a formula;
a second filling module, configured to fill the recognition result into the input box corresponding to the verification code picture to be recognized if the recognition result is not a formula.
Computer equipment, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above verification code recognition method when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above verification code recognition method when executed by a processor.
With the above verification code recognition method, device, computer equipment and storage medium, a verification code picture to be recognized on a target website is first obtained and recognized based on Tesseract to obtain the corresponding recognition result. Whether the recognition result is a formula is then judged: if so, the calculated result corresponding to the recognition result is filled into the input box corresponding to the verification code picture; if not, the recognition result itself is filled into that input box. Because the verification code picture on the target website that holds the target resource information is recognized based on Tesseract to obtain the verification code content information, and the calculated result is computed automatically when that content is a formula, no time needs to be spent manually entering the verification code content when target resource information is obtained from the target website, which improves the efficiency of obtaining resource information.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application environment of the verification code recognition method in one embodiment of the invention;
Fig. 2 is a flowchart of the verification code recognition method in one embodiment of the invention;
Fig. 3 is a flowchart of step S20 of the verification code recognition method in one embodiment of the invention;
Fig. 4 is a flowchart of step S30 of the verification code recognition method in one embodiment of the invention;
Fig. 5 is a flowchart of preprocessing the verification code picture to be recognized in the verification code recognition method in one embodiment of the invention;
Fig. 6 is a picture to be recognized before interference line removal in the verification code recognition method in one embodiment of the invention;
Fig. 7 is a picture to be recognized after interference line removal in the verification code recognition method in one embodiment of the invention;
Fig. 8 is a binarized picture before denoising in the verification code recognition method in one embodiment of the invention;
Fig. 9 is a verification code picture to be recognized after denoising in the verification code recognition method in one embodiment of the invention;
Fig. 10 is a flowchart of recognizing each verification code picture to be recognized on each target website in the verification code recognition method in one embodiment of the invention;
Fig. 11 is a schematic diagram of the verification code recognition device in one embodiment of the invention;
Fig. 12 is a schematic diagram of the computer equipment in one embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The verification code recognition method provided by the present application can be applied in the application environment shown in Fig. 1, in which computer equipment communicates with a server through a network. The server obtains the verification code picture to be recognized on the target website of the client and performs recognition processing on it based on Tesseract to obtain the corresponding recognition result. The server then judges whether the recognition result is a formula: if it is, the server fills the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized; if it is not, the server fills the recognition result into that input box. The computer equipment may be, but is not limited to, a personal computer, laptop, smartphone, tablet computer or portable wearable device. The server may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a verification code recognition method is provided. The method is applied in the financial industry and is described by taking its application to the server in Fig. 1 as an example. It includes the following steps:
S10: obtaining the verification code picture to be recognized on the target website;
In this embodiment, the target website refers to the website that holds the resource information to be obtained.
Specifically, the target website is first logged in to successfully and the path information of the verification code picture to be recognized is obtained; the verification code picture to be recognized is then extracted from the target website according to that path information.
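As a rough illustration of the "path information" idea in step S10, the sketch below locates the picture's path in a page and resolves it to an absolute URL. It is not the patent's implementation: the element id `captcha`, the sample HTML and the `example.com` address are all hypothetical, and the actual download of the picture is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CaptchaImageFinder(HTMLParser):
    """Remembers the src of the first <img> whose id matches image_id."""

    def __init__(self, image_id):
        super().__init__()
        self.image_id = image_id  # hypothetical id; real sites differ
        self.src = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and self.src is None and attributes.get("id") == self.image_id:
            self.src = attributes.get("src")


def captcha_image_url(page_url, page_html, image_id="captcha"):
    """Return the absolute URL of the verification code picture, or None."""
    finder = CaptchaImageFinder(image_id)
    finder.feed(page_html)
    if finder.src is None:
        return None
    # Resolve the (possibly relative) path against the page address.
    return urljoin(page_url, finder.src)


SAMPLE_HTML = '<form><img id="captcha" src="/code/image?t=1"><input name="code"></form>'
print(captcha_image_url("https://example.com/login", SAMPLE_HTML))
```

The returned URL could then be fetched with any HTTP client within the logged-in session.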
S20: performing recognition processing on the verification code picture to be recognized based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized;
In this embodiment, Tesseract is a widely used open-source OCR tool. OCR stands for Optical Character Recognition and refers to the process in which an electronic device examines characters printed on paper, determines their shapes by detecting dark and light patterns, and then translates the shapes into computer text using a character recognition method.
Specifically, recognition processing is performed on the verification code picture to be recognized on the target website based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized.
It should be noted that verification code pictures to be recognized and recognition results are in one-to-one correspondence.
S30: judging whether the recognition result is a formula;
In this embodiment, a formula refers to an expression containing an equal sign "=".
Specifically, it is judged whether the recognition result corresponding to the verification code picture to be recognized on the target website is a formula.
S40: if the recognition result is a formula, filling the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized;
Specifically, if the recognition result corresponding to the verification code picture to be recognized on the target website is a formula, the calculated result corresponding to that recognition result is filled into the input box corresponding to the verification code picture to be recognized.
It should be noted that verification code pictures to be recognized and input boxes are in one-to-one correspondence.
To better understand step S40, an example is given below:
For example, suppose the target website is the official website of China Railway freight, the verification code picture to be recognized is a picture on the page for querying the wagon number and waybill number of goods, and the recognition result corresponding to that picture is "36+21=". It is then determined that the verification code picture to be recognized on the China Railway freight website corresponds to "36+21=", and "57", the result of "36+21=", is filled into the wagon number input box.
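The calculation in this example can be sketched as follows. The patent does not say how the calculated result is computed or which operators are supported, so the two-operand pattern and the four basic operators below are assumptions made for illustration only.

```python
import operator
import re

# Assumed operator set; the patent does not enumerate the supported operators.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

# Assumed shape of a formula: integer, operator, integer, trailing "=".
FORMULA_PATTERN = re.compile(r"^\s*(\d+)\s*([+\-*/])\s*(\d+)\s*=\s*$")


def evaluate_formula(recognition_result):
    """Return the text to type into the input box for a formula such as '36+21='."""
    match = FORMULA_PATTERN.match(recognition_result)
    if match is None:
        raise ValueError("not a supported formula: %r" % recognition_result)
    left, symbol, right = match.groups()
    value = OPERATIONS[symbol](int(left), int(right))
    # Show whole-number division results without a trailing ".0".
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    return str(value)


print(evaluate_formula("36+21="))
```

Parsing with a fixed pattern, rather than calling `eval`, keeps unexpected OCR output from being executed as code.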
S50: if the recognition result is not a formula, filling the recognition result into the input box corresponding to the verification code picture to be recognized.
Specifically, if the recognition result corresponding to the verification code picture to be recognized on the target website is not a formula, that recognition result is filled into the input box corresponding to the verification code picture to be recognized.
In the embodiment corresponding to Fig. 2, the verification code picture on the website that holds the resource information to be obtained is first obtained and recognized based on Tesseract to obtain a recognition result. Whether the recognition result is a formula is then judged: if so, the calculated result corresponding to the recognition result is filled into the input box; if not, the recognition result is filled into the input box. Because the verification code picture on the target website that holds the target resource information is recognized based on Tesseract to obtain the verification code content information, and the corresponding calculated result is computed automatically when that content is a formula, the calculated result or the non-formula verification code content is filled into the input box corresponding to the verification code picture. As a result, when target resource information is obtained from the target website, no time needs to be spent manually entering the verification code content of the target website, which improves the efficiency of obtaining resource information.
In one embodiment, the verification code recognition method is applied in the financial industry. As shown in Fig. 3, step S20, that is, performing recognition processing on the verification code picture to be recognized based on Tesseract to obtain the corresponding recognition result, specifically includes the following steps:
S201: segmenting the verification code picture to be recognized using the vertical projection method to obtain each sub verification code picture;
In this embodiment, the vertical projection method counts the stroke pixels of the binarized character image in the vertical direction and determines character boundaries by detecting the troughs in the resulting vertical projection. A sub verification code picture is the picture of a single character obtained after each character on the verification code picture to be recognized has been segmented.
Specifically, the verification code picture to be recognized on the target website is segmented using the vertical projection method to obtain each sub verification code picture.
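The vertical projection segmentation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the picture is already binarized into a list of rows with 1 for stroke pixels and 0 for background, and a trough is taken to be any column whose stroke count is at or below a `threshold` parameter introduced here for illustration.

```python
def vertical_projection(binary_image):
    """Count stroke pixels (value 1) in every column of a binary image."""
    return [sum(column) for column in zip(*binary_image)]


def split_columns(binary_image, threshold=0):
    """Return (start, end) column ranges, end exclusive, one per character."""
    projection = vertical_projection(binary_image)
    ranges = []
    start = None
    for x, count in enumerate(projection):
        if count > threshold and start is None:
            start = x  # left boundary: projection rises out of a trough
        elif count <= threshold and start is not None:
            ranges.append((start, x))  # right boundary: projection falls into a trough
            start = None
    if start is not None:
        ranges.append((start, len(projection)))
    return ranges


def cut_characters(binary_image, threshold=0):
    """Cut the picture into sub-pictures, ordered from left to right."""
    return [
        [row[start:end] for row in binary_image]
        for start, end in split_columns(binary_image, threshold)
    ]


IMAGE = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]
print(vertical_projection(IMAGE))
print(split_columns(IMAGE))
```

Characters that touch or overlap leave no empty column between them, so a sketch like this would merge them; raising `threshold` is one possible mitigation.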
To better understand step S201, an example is given below:
For example, suppose the target website is the official website of the Jiangsu online vehicle administration office and the verification code picture to be recognized is a picture of the license plate "苏E.UK722". The vertical projection method is then used to segment the "苏E.UK722" picture into the pictures "苏", "E", "U", "K", "7", "2" and "2".
S202: performing size normalization on each sub verification code picture according to a preset size specification to obtain each normalized sub verification code picture;
In this embodiment, size normalization refers to unifying the size of every single character on the verification code picture to the same size. The preset size specification may be 6cm*10cm, where cm (centimeter) is a unit of length. The specific preset size specification can be set according to the practical application and is not limited here.
Specifically, size normalization is performed on each sub verification code picture according to the preset size specification to obtain each normalized sub verification code picture.
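One simple way to bring every sub-picture to the same size is nearest-neighbor resampling, sketched below. The patent does not name a resampling method, and it states the preset size in centimeters; expressing the target size in pixels here is an assumption for illustration.

```python
def normalize_size(binary_image, target_width, target_height):
    """Resize a binary image (list of rows) with nearest-neighbor sampling."""
    source_height = len(binary_image)
    source_width = len(binary_image[0])
    return [
        [
            # Map each target pixel back to the source pixel it falls on.
            binary_image[y * source_height // target_height][x * source_width // target_width]
            for x in range(target_width)
        ]
        for y in range(target_height)
    ]


SMALL = [
    [1, 0],
    [0, 1],
]
for row in normalize_size(SMALL, 4, 4):
    print(row)
```

Nearest-neighbor sampling keeps the picture strictly binary, which suits pictures that have already been binarized.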
S203: recognizing each normalized sub verification code picture based on Tesseract to obtain the verification code content information corresponding to each normalized sub verification code picture;
In this embodiment, verification code content information refers to the specific character on a sub verification code picture.
Specifically, each normalized sub verification code picture is recognized based on Tesseract to obtain the verification code content information corresponding to each normalized sub verification code picture.
It should be noted that normalized sub verification code pictures and verification code content information are in one-to-one correspondence.
S204: splicing the verification code content information in the left-to-right order the characters had before the verification code picture was segmented, to obtain the recognition result corresponding to the verification code picture to be recognized, wherein the recognition result includes more than one character;
Specifically, the verification code content information is spliced in the left-to-right order the characters had before the verification code picture to be recognized on the target website was segmented, to obtain the recognition result corresponding to the verification code picture to be recognized.
It should be noted that the recognition result includes more than one character.
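Steps S203 and S204 can be sketched together as below. The sketch assumes each sub-picture is kept with the x-coordinate of its left edge so the original order can be restored; that pairing is an illustrative choice, not something the patent specifies. The default recognizer assumes the `pytesseract` wrapper and the Tesseract binary are installed and that each sub-picture is a PIL image; `--psm 10` asks Tesseract to treat the picture as a single character.

```python
def tesseract_recognize_one(sub_image):
    """Recognize one character with Tesseract (assumes pytesseract is installed)."""
    import pytesseract  # imported lazily so the splicing logic runs without it

    # "--psm 10" selects Tesseract's single-character page segmentation mode.
    return pytesseract.image_to_string(sub_image, config="--psm 10").strip()


def recognize_verification_code(sub_images, recognize_one=tesseract_recognize_one):
    """Recognize (left_x, image) pairs and splice the characters left to right."""
    ordered = sorted(sub_images, key=lambda item: item[0])
    return "".join(recognize_one(image) for _, image in ordered)


# Usage with a stand-in recognizer, so the example runs without Tesseract:
fake_sub_images = [(20, "+"), (0, "3"), (10, "6"), (30, "2"), (40, "1"), (50, "=")]
print(recognize_verification_code(fake_sub_images, recognize_one=lambda image: image))
```

Passing the recognizer in as a parameter also makes it easy to test the splicing order separately from the OCR engine.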
In the embodiment corresponding to Fig. 3, the verification code picture is first segmented using the vertical projection method to obtain each sub verification code picture; size normalization is then performed on each sub verification code picture to obtain each normalized sub verification code picture; next, each normalized sub verification code picture is recognized based on Tesseract to obtain each piece of verification code content information; finally, the pieces of verification code content information are spliced to obtain the recognition result. The vertical projection method locates character boundaries precisely, so the verification code picture is segmented accurately into sub verification code pictures. Normalizing sub verification code pictures of varying sizes yields pictures that are easier to recognize. Tesseract, whose recognition accuracy is continually optimized by many machine learning algorithms, then recognizes the verification code content information from the normalized sub verification code pictures, and splicing that information gives the required recognition result. Together these steps improve the accuracy of recognizing the verification code picture.
In one embodiment, the verification code recognition method is applied in the financial industry. As shown in Fig. 4, step S30, that is, judging whether the recognition result is a formula, specifically includes the following steps:
S301: arranging the characters in the recognition result in left-to-right order and pushing each character onto a stack in that order;
In this embodiment, a stack is a linear list in which insertion and deletion are restricted to the tail of the list. It is a data structure that stores data on a last-in, first-out basis: the data that enters first is pushed to the bottom of the stack, the last data is at the top, and when data needs to be read it is popped starting from the top of the stack.
Specifically, the characters in the recognition result are first arranged in left-to-right order, and each character is then pushed onto the stack in that order.
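The push order in S301 and the resulting pop order can be sketched with a Python list used as a stack. The function names are illustrative only.

```python
def push_left_to_right(recognition_result):
    """Push each character onto a stack in left-to-right order."""
    stack = []
    for character in recognition_result:
        stack.append(character)  # the end of the list is the top of the stack
    return stack


def pop_all(stack):
    """Pop every character; the last one pushed comes out first."""
    popped = []
    while stack:
        popped.append(stack.pop())
    return popped


print(pop_all(push_left_to_right("9+7=")))
```

Because the stack is last in, first out, the rightmost character of the recognition result is the first to be popped and the leftmost is the last.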
S302: obtaining from the stack, as the current character, the character that is popped first among the characters arranged in popping order;
Specifically, the character that is popped first among the characters arranged in popping order is obtained from the stack as the current character.
S303: querying the current character using the operator query method to obtain a query result;
In this embodiment, the operator database is dedicated to storing all operators. An operator may be a logical connective or a formula symbol.
Specifically, the operator database is queried for the current character. If the current character exists there, a query result indicating that the current character is an operator is obtained; if it does not exist, a query result indicating that the current character is a non-operator is obtained.
S304: determining the character type of the query result according to the query result, wherein the character types include an operator type and a non-operator type;
In this embodiment, the character types include the operator type and the non-operator type.
Specifically, if the current character is an operator, the character type of the current character is determined to be the operator type, and the current character is stored in the target operator database.
If the current character is a non-operator, the character type of the current character is determined to be the non-operator type.
S305: if the character type of the query result is the operator type, obtaining the preset operator initial value as the current quantity;
Specifically, if the character type of the query result is the operator type, the preset operator initial value is obtained as the current quantity.
If the character type of the query result is the non-operator type, the current character is stored in a backup database so that it can be retrieved when needed.
It should be noted that the preset operator initial value may be 0, 1 and so on. Its specific value can be set according to the practical application and is not limited here.
S306: performing arithmetic addition on the current quantity and the number 1 to obtain a quantity result;
Specifically, arithmetic addition is performed on the current quantity and the number 1 to obtain the quantity result.
S307: judge whether current character is according to one finally to pop in each character that sequencing arranges of popping
A character;
Specifically, judge whether current character is according to finally popping in each character that sequencing arranges of popping
One character.
S308: if current character is for according to finally pop the word in each character for sequencing arrangement of popping
Symbol, then judge quantity result whether be greater than or equal to preset oeprator original value and number 2 and, and character types be transport
It whether there is equal sign in all characters of operator type;
Specifically, if the current character is for according to one finally to pop in each character for sequencing arrangement of popping
A character, then firstly, then all characters extracted in target oeprator database judge that oeprator quantitative value is
It is no more than or equal to preset oeprator original value and number 2 and, and it is all in target oeprator database
It whether there is have equal sign characters in character.
S309: if quantity result be greater than or equal to the preset oeprator original value and number 2 and, and character type
Type is that there are equal signs in all characters of operator type, it is determined that recognition result is formula;
Specifically, if the quantity result being calculated is greater than or equal to the preset oeprator original value and number 2
Sum, and the character types of query result are that there are equal signs in all characters of operator type, it is determined that recognition result is
Formula.
S310: If the quantity result is not greater than or equal to the sum of the preset operator original value and the number 2, and no equal sign exists among all characters whose character type is the operator type, determine that the recognition result is a non-formula;
Specifically, if the calculated quantity result is not greater than or equal to the sum of the preset operator original value and the number 2, and no equal sign exists among all characters whose queried character type is the operator type, the recognition result is determined to be a non-formula.
S311: If the current character is not the last character popped among the characters arranged in pop order, obtain the next popped character among the characters arranged in pop order as the current character, take the quantity result as the current quantity, and return to the step of querying the current character using the operator query method to obtain a query result;
Specifically, if the current character is not the last character popped among the characters arranged in pop order, the next popped character among the characters arranged in pop order is obtained as the current character, and the process returns to step S303.
For a better understanding of steps S301 to S311, an example is given below:
For example, assume that the recognition result is "9+7=", the target operator database is a first MySQL database, the operator database is a second MySQL database, and the preset operator original value is 0. The characters "9", "+", "7" and "=" are pushed onto the stack in the left-to-right order of "9+7=", and "=" is obtained as the current character. Since "=" is found in the second MySQL database, "=" is determined to be an operator, is recorded as the operator type, and is stored in the first MySQL database. Meanwhile, 0 is obtained as the current quantity and added to the number 1, giving 1. Next, it is judged whether the current character is "9". If so, "+" and "=" are extracted from the first MySQL database, it is determined that 4 is greater than the sum of 0 and the number 2 and that an equal sign exists among "9", "+", "7" and "=", and "9+7=" is therefore determined to be a formula. If the current character "=" is not "9", "7" is obtained as the current character and the process returns to step S303.
In the embodiment corresponding to Fig. 4, the characters of the recognition result are first pushed onto a stack in left-to-right order, and the first character popped is taken as the current character. The current character is queried using the operator query method to obtain a query result, and the character type of the query result is determined from that result. If the character type of the query result is the operator type, the preset operator original value is obtained as the current quantity, and the current quantity is added to the number 1 to obtain a quantity result. Finally, it is judged whether the current character is the last character popped. If so, it is judged whether the quantity result is greater than or equal to the sum of the preset operator original value and the number 2, and whether an equal sign exists among all characters whose character type is the operator type; if both hold, the recognition result is determined to be a formula, and otherwise it is determined to be a non-formula. If the current character is not the last character popped, the next popped character is taken as the current character, the quantity result is taken as the current quantity, and the process returns to the step of querying the current character using the operator query method (step S303).
Because a stack is first in, last out, each character can be checked in an orderly manner for its presence in the operator database, and the quantity result is incremented by 1 whenever the character is present. It can therefore be accurately determined whether the current verification code content information is a formula or a non-formula, which improves the accuracy of verification code recognition.
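The stack-based judgment of steps S301 to S311 can be sketched as follows. This is a minimal illustration, not the patented implementation: the operator set `OPERATORS` stands in for the operator database, the list `found_operators` stands in for the target operator database, and both names and the set's contents are assumptions made for the example.

```python
# Assumed contents of the operator database (hypothetical).
OPERATORS = {"+", "-", "*", "/", "="}


def is_formula(recognition_result, initial_count=0):
    """Return True if the text looks like a formula such as "9+7="."""
    # Push characters in left-to-right order; popping yields them right to left.
    stack = list(recognition_result)
    count = initial_count          # preset operator original value
    found_operators = []           # stands in for the target operator database
    while stack:
        current = stack.pop()
        if current in OPERATORS:   # operator query
            found_operators.append(current)
            count += 1             # add the number 1 to the current quantity
    # After the last pop: at least two operators, one of which is "=".
    return count >= initial_count + 2 and "=" in found_operators
```

With these assumptions, `is_formula("9+7=")` is true because two operators are found and one of them is the equal sign, while a plain string such as `"a8Kd"` is treated as a non-formula.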
In one embodiment, the verification code recognition method is applied in the financial industry. As shown in Fig. 5, before step S20, the verification code recognition method further includes the following steps:
S61: Perform interference-line removal on the verification code picture to be recognized using a depth-first search algorithm, to obtain the verification code picture to be recognized after interference removal;
In this embodiment, the depth-first search algorithm (Depth First Search, DFS) means that, in an HTML file, once a hyperlink is selected, a depth-first search is performed on the linked HTML file; that is, a single chain must be searched completely before the remaining hyperlinks are searched. The depth-first search follows the hyperlinks on an HTML file until it can go no deeper, then returns to a certain HTML file and continues by selecting the other hyperlinks in that HTML file. An HTML file is a descriptive text composed of HTML commands, and an HTML command is an instruction written in HTML. HTML, in full HyperText Markup Language, refers to the hypertext markup language.
Specifically, interference-line removal is performed on the picture to be recognized on the target website using the DFS algorithm, to obtain the verification code picture to be recognized after interference removal.
For a better understanding of step S61, an example is given below:
For example, assume that Fig. 6 is the picture to be recognized before interference-line removal and Fig. 7 is the picture to be recognized after interference-line removal; interference-line removal is performed on Fig. 6 based on DFS to obtain Fig. 7.
S62: Convert the verification code picture to be recognized after interference removal according to a preset conversion mode, to obtain a grayscale picture;
In this embodiment, a grayscale picture refers to a picture in which the range between white and black is divided into several levels according to a logarithmic relationship. The RGB color mode is an industry color standard in which a wide variety of colors is obtained by varying the three color channels R, G and B and superimposing them on one another, where R represents red, G represents green and B represents blue. The preset conversion mode may use a calculation formula to convert an RGB color mode picture into a grayscale picture.
It should be noted that the calculation formula for converting an RGB color mode picture into a grayscale picture is GREY = (R+G+B)/3, where R, G and B are the values of the three color channels of a pixel and GREY is the gray value corresponding to that pixel; gray values range from 0 to 255. The specific content of the preset conversion mode can be set according to the practical application and is not limited here.
Specifically, using the formula GREY = (R+G+B)/3, each pixel of the RGB color mode verification code picture to be recognized after interference-line removal is read in turn, the R, G and B values of the pixel are obtained, and the average of the three is taken as the gray value of the corresponding pixel in the new picture, yielding the grayscale picture.
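The conversion formula GREY = (R+G+B)/3 can be illustrated with a short sketch. It assumes the picture is held as rows of (R, G, B) tuples and that the average is truncated to an integer; the function name and the rounding choice are assumptions, since the text does not specify them.

```python
def to_grayscale(rgb_pixels):
    """Convert rows of (R, G, B) tuples (0-255) to rows of gray values.

    Applies GREY = (R + G + B) / 3 per pixel; integer division is an
    assumption made here so that the result stays in 0-255.
    """
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_pixels]
```

For instance, a pixel (30, 60, 90) maps to the gray value 60.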
S63: Binarize the grayscale picture using a clustering algorithm, to obtain a binarized picture;
In this embodiment, the K-means algorithm is a clustering algorithm and one of the data mining algorithms. It clusters around k points in space as centers, assigns the objects closest to each center to that center, and iteratively updates the value of each cluster center until the best clustering result is obtained. Binarization means setting the gray value of each pixel of an image to 0 or 255, so that the whole image shows a distinct black-and-white effect.
Specifically, the best preset threshold is first calculated using the K-means algorithm so that the background color and the characters of the grayscale picture can be distinguished. Then, for each pixel of the grayscale picture, the gray value of the pixel is set to 255 if it is greater than the best preset threshold, and to 0 if it is less than or equal to the best preset threshold. Finally, the binarized picture is obtained.
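One way to picture this step is a one-dimensional K-means with k = 2 over the gray values, taking the midpoint of the two cluster centers as the threshold. The text does not say how the threshold is derived from the clusters, so the midpoint rule, the initialization at the minimum and maximum gray value, and the function names below are all assumptions.

```python
def kmeans_threshold(gray, max_iter=100):
    """Estimate a threshold with 1-D K-means (k = 2) over gray values.

    Assumption: the threshold is the midpoint of the two cluster centers.
    """
    values = [v for row in gray for v in row]
    lo, hi = float(min(values)), float(max(values))  # initial centers
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update each center to the mean of its cluster.
        new_lo = sum(dark) / len(dark) if dark else lo
        new_hi = sum(bright) / len(bright) if bright else hi
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def binarize(gray, threshold):
    """Set pixels above the threshold to 255 and all others to 0."""
    return [[255 if v > threshold else 0 for v in row] for row in gray]
```

Calling `binarize(gray, kmeans_threshold(gray))` then applies the rule described above: values greater than the threshold become 255, the rest become 0.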
S64: Denoise the binarized picture using a flood fill algorithm, to obtain the denoised verification code picture to be recognized;
In this embodiment, the FloodFill algorithm, also called the flood fill algorithm, colors all points adjacent to a given point with the color of that point and keeps filling until every point in the region has been filled. Denoising refers to the process of removing noise from the binarized picture.
Specifically, the binarized picture is denoised using the FloodFill algorithm, to obtain the denoised verification code picture to be recognized.
For a better understanding of step S64, an example is given below:
For example, assume that Fig. 8 is the binarized picture before denoising and Fig. 9 is the verification code picture to be recognized after denoising; Fig. 8 is denoised using the FloodFill algorithm to obtain Fig. 9.
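A common way to denoise with flood fill is to fill each connected region of foreground pixels and erase regions that are too small to be part of a character. The text does not state the removal criterion, so the sketch below is only one possible reading: it assumes black (0) is foreground, uses 4-connectivity, and treats regions smaller than a hypothetical `min_size` as noise.

```python
def remove_small_blobs(binary, min_size=3):
    """Erase connected black regions smaller than min_size (assumed rule)."""
    height, width = len(binary), len(binary[0])
    result = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if seen[y][x] or binary[y][x] != 0:
                continue
            # Flood fill from (y, x), collecting the whole connected region.
            region, stack = [], [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and binary[ny][nx] == 0):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Regions below the size limit are repainted as background.
            if len(region) < min_size:
                for ry, rx in region:
                    result[ry][rx] = 255
    return result
```

The explicit stack avoids recursion limits on larger pictures; the size limit would need tuning for real verification code pictures.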
In the embodiment corresponding to Fig. 5, the DFS algorithm, run recursively, first removes the interference lines from the verification code picture to be recognized comprehensively, yielding a picture with no residual interference lines. The color verification code picture to be recognized is then converted into a black-and-white grayscale picture using the conversion formula GREY = (R+G+B)/3. Next, the K-means clustering algorithm, a machine learning method, is used to calculate the best preset threshold; pixels whose gray value is greater than the best preset threshold are set to 255 and pixels whose gray value is less than or equal to it are set to 0, giving a binarized picture with clear contrast between light and dark. Finally, the FloodFill algorithm in full-fill mode removes all noise from the binarized picture, yielding a completely clean verification code picture to be recognized. This avoids misreading the verification code picture because of interference and thus improves the precision of verification code recognition.
In one embodiment, the verification code recognition method is applied in the financial industry. As shown in Fig. 10, step S10, i.e. obtaining the verification code picture to be recognized on the target website, specifically includes the following step: S101: Obtain each verification code picture to be recognized on each target website;
Specifically, the path information of each verification code picture to be recognized is first obtained, and each verification code picture to be recognized on each target website is then extracted according to its path information.
Before step S20, the verification code recognition method further includes the following step: S65: Determine one verification code picture to be recognized among the verification code pictures to be recognized as the current verification code picture;
Specifically, one verification code picture to be recognized among the verification code pictures to be recognized on the target websites is determined as the current verification code picture.
Step S20 is specifically: S201: Perform recognition processing on the current verification code picture based on Tesseract, to obtain the recognition result corresponding to the current verification code picture;
Specifically, recognition processing is performed on the current verification code picture based on Tesseract, to obtain the recognition result corresponding to the current verification code picture.
After step S40, i.e. filling the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized, or after step S50, i.e. filling the recognition result into the input box corresponding to the verification code picture to be recognized, the verification code recognition method further includes the following step: S66: Determine one unrecognized verification code picture among the verification code pictures to be recognized as the current verification code picture, and return to step S201, until every verification code picture to be recognized has been recognized and its corresponding recognition result obtained;
Specifically, one unrecognized verification code picture among the verification code pictures to be recognized is determined as the current verification code picture, and the process returns to step S201, until every verification code picture to be recognized has been recognized and its corresponding recognition result obtained.
In the embodiment corresponding to Fig. 10, each verification code picture to be recognized on each target website is first obtained, and one of them is determined as the current verification code picture. Recognition processing is then performed on the current verification code picture based on Tesseract to obtain the verification code content information. Next, it is judged whether the verification code content information is a formula: if it is a formula, the calculated result is filled into the input box, and if it is a non-formula, the verification code content information itself is filled into the input box. Finally, an unrecognized verification code picture is determined as the current verification code picture and the process returns to step S201, until every verification code picture to be recognized has been recognized and its corresponding recognition result obtained. In this way, the verification code pictures on the websites holding the required resource information can be recognized one after another, without spending a great deal of time manually entering the verification code content information of each website one by one, which improves the efficiency of obtaining resource information.
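The loop over the pictures (steps S65, S201, S40/S50 and S66) might be sketched as below. The recognizer and the formula test are passed in as callables so that the sketch stays independent of any OCR library; `recognize`, `is_formula`, `calculate` and `fill_all` are hypothetical names, and `calculate` assumes formulas of the form `a+b=`, `a-b=` or `a*b=` with non-negative integers, which the text does not specify.

```python
import re


def calculate(expression):
    """Evaluate a simple formula such as "9+7=" (assumed format)."""
    match = re.fullmatch(r"(\d+)([+\-*])(\d+)=", expression)
    if match is None:
        raise ValueError("unsupported formula: %r" % expression)
    left, op, right = int(match[1]), match[2], int(match[3])
    return {"+": left + right, "-": left - right, "*": left * right}[op]


def fill_all(pictures, recognize, is_formula):
    """Return the text to fill into the input box for each picture."""
    answers = {}
    pending = list(pictures)       # pictures not yet recognized
    while pending:
        current = pending.pop(0)   # current verification code picture
        text = recognize(current)  # e.g. a Tesseract-based recognizer
        if is_formula(text):
            answers[current] = str(calculate(text))  # fill calculated result
        else:
            answers[current] = text                  # fill recognition result
    return answers
```

In practice `recognize` would wrap the Tesseract-based recognition and the returned text would be written into each page's input box; both are left out here.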
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, a verification code recognition device is provided, which corresponds one-to-one to the verification code recognition method in the above embodiments. As shown in Fig. 11, the verification code recognition device includes an obtaining module 701, a recognition processing module 702, a judgment module 703, a first filling module 704, a second filling module 705, a removal module 706, a conversion module 707, a binarization processing module 708, a denoising module 709, a first determining module 710 and a second determining module 711. The functional modules are described in detail as follows:
The obtaining module 701 is configured to obtain the verification code picture to be recognized on the target website;
The recognition processing module 702 is configured to perform recognition processing on the verification code picture to be recognized based on Tesseract, to obtain the recognition result corresponding to the verification code picture to be recognized, where Tesseract is an optical character recognition tool;
The judgment module 703 is configured to judge whether the recognition result is a formula;
The first filling module 704 is configured to, if the recognition result is a formula, fill the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized;
The second filling module 705 is configured to, if the recognition result is a non-formula, fill the recognition result into the input box corresponding to the verification code picture to be recognized;
The removal module 706 is configured to perform interference-line removal on the verification code picture to be recognized using a depth-first search algorithm, to obtain the verification code picture to be recognized after interference removal;
The conversion module 707 is configured to convert the verification code picture to be recognized after interference removal according to a preset conversion mode, to obtain a grayscale picture;
The binarization processing module 708 is configured to binarize the grayscale picture using a clustering algorithm, to obtain a binarized picture;
The denoising module 709 is configured to denoise the binarized picture using a flood fill algorithm, to obtain the denoised verification code picture to be recognized;
The first determining module 710 is configured to determine one verification code picture to be recognized among the verification code pictures to be recognized as the current verification code picture;
The second determining module 711 is configured to determine one unrecognized verification code picture among the verification code pictures to be recognized as the current verification code picture, and to return to the step of performing recognition processing on the current verification code picture based on Tesseract to obtain the recognition result corresponding to the current verification code picture, until every verification code picture to be recognized has been recognized and its corresponding recognition result obtained.
Further, the obtaining module 701 is specifically: a third extraction submodule 7011, configured to obtain each verification code picture to be recognized on each target website;
Further, the recognition processing module 702 is specifically: a second recognition submodule 7025, configured to perform recognition processing on the current verification code picture based on Tesseract, to obtain the recognition result corresponding to the current verification code picture;
Further, the recognition processing module 702 includes:
A segmentation submodule 7021, configured to segment the verification code picture to be recognized using the vertical projection method, to obtain sub verification code pictures;
A normalization submodule 7022, configured to perform size normalization on each sub verification code picture according to a preset size, to obtain normalized sub verification code pictures;
A first recognition submodule 7023, configured to recognize each normalized sub verification code picture based on Tesseract, to obtain the verification code content information corresponding to each normalized sub verification code picture;
A splicing submodule 7024, configured to splice the pieces of verification code content information in the left-to-right order the sub verification code pictures had before segmentation, to obtain the recognition result corresponding to the verification code picture to be recognized, the recognition result including more than one character.
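The vertical projection segmentation performed by the segmentation submodule 7021 can be sketched as follows. The sketch assumes a binarized picture in which character pixels are black (0), and it cuts wherever a column contains no black pixel; touching characters would need a more elaborate rule. The function name is hypothetical.

```python
def split_by_vertical_projection(binary):
    """Split a binarized picture into sub-pictures at empty columns."""
    height, width = len(binary), len(binary[0])
    # Vertical projection: number of black pixels in each column.
    projection = [sum(1 for y in range(height) if binary[y][x] == 0)
                  for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(projection):
        if count > 0 and start is None:
            start = x                      # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # a character ends
            start = None
    if start is not None:
        segments.append((start, width))
    # Sub-pictures are returned in left-to-right order.
    return [[row[a:b] for row in binary] for a, b in segments]
```

Because the sub-pictures come back in left-to-right order, the per-character results can be spliced in the same order to rebuild the recognition result.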
Further, the judgment module 703 includes:
A push submodule 7031, configured to arrange the characters in the recognition result in left-to-right order and push them onto a stack in that order;
A first extraction submodule 7032, configured to obtain from the stack, among the characters arranged in pop order, the first character popped as the current character;
A query submodule 7033, configured to query the current character using the operator query method, to obtain a query result;
A classification submodule 7034, configured to determine the character type of the query result according to the query result, where the character types include the operator type and the non-operator type;
A second extraction submodule 7035, configured to obtain the preset operator original value as the current quantity if the character type of the query result is the operator type;
A calculation submodule 7036, configured to add the number 1 to the current quantity to obtain a quantity result;
A judging submodule 7037, configured to judge whether the current character is the last character popped among the characters arranged in pop order. If it is, the submodule judges whether the quantity result is greater than or equal to the sum of the preset operator original value and the number 2, and whether an equal sign exists among all characters whose character type is the operator type. If the quantity result is greater than or equal to that sum and an equal sign exists among those characters, the recognition result is determined to be a formula; if the quantity result is not greater than or equal to that sum and no equal sign exists among those characters, the recognition result is determined to be a non-formula. If the current character is not the last character popped among the characters arranged in pop order, the next popped character is obtained as the current character, the quantity result is taken as the current quantity, and the query submodule 7033 is triggered.
For the specific limitations of the verification code recognition device, reference may be made to the limitations of the verification code recognition method above, which are not repeated here. The modules in the above verification code recognition device can be implemented wholly or partly in software, hardware or a combination thereof. Each module may be embedded in or independent of a processor in the computer equipment in hardware form, or stored in a memory of the computer equipment in software form, so that the processor can call it and perform the operations corresponding to the module.
In one embodiment, a computer equipment is provided. The computer equipment may be a server, and its internal structure may be as shown in Fig. 12. The computer equipment includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer equipment provides computing and control capability. The memory of the computer equipment includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer equipment stores the pictures, data and other information involved in the verification code recognition method. The network interface of the computer equipment communicates with external terminals over a network connection. When executed by the processor, the computer program implements a verification code recognition method.
In one embodiment, a computer equipment is provided, including a memory, a processor and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the verification code recognition method of the above embodiments, for example steps S10 to S50 shown in Fig. 2. Alternatively, when executing the computer program, the processor implements the functions of the modules/units of the verification code recognition device of the above embodiments, for example the functions of modules 701 to 711 shown in Fig. 11. To avoid repetition, details are not repeated here.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the verification code recognition method of the above method embodiments, or implements the functions of the modules/units of the verification code recognition device of the above device embodiments. To avoid repetition, details are not repeated here. Those of ordinary skill in the art will understand that all or part of the processes of the above embodiment methods can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It will be clear to those skilled in the art that, for convenience and brevity of description, only the division into the above functional units and modules is used as an example. In practical applications, the above functions can be assigned to different functional units and modules as required; that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above.
The embodiments described above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.
Claims (10)
1. A verification code recognition method, characterized in that the verification code recognition method includes:
Obtaining a verification code picture to be recognized on a target website;
Performing recognition processing on the verification code picture to be recognized based on Tesseract, to obtain a recognition result corresponding to the verification code picture to be recognized, where Tesseract is an optical character recognition tool;
Judging whether the recognition result is a formula;
If the recognition result is a formula, filling the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized;
If the recognition result is a non-formula, filling the recognition result into the input box corresponding to the verification code picture to be recognized.
2. The verification code recognition method according to claim 1, characterized in that performing recognition processing on the verification code picture to be recognized based on Tesseract, to obtain the recognition result corresponding to the verification code picture to be recognized, includes:
Segmenting the verification code picture to be recognized using the vertical projection method, to obtain sub verification code pictures;
Performing size normalization on each sub verification code picture according to a preset size, to obtain normalized sub verification code pictures;
Recognizing each normalized sub verification code picture based on Tesseract, to obtain the verification code content information corresponding to each normalized sub verification code picture;
Splicing the pieces of verification code content information in the left-to-right order the sub verification code pictures had before segmentation, to obtain the recognition result corresponding to the verification code picture to be recognized, the recognition result including more than one character.
3. method for recognizing verification code as described in claim 1, which is characterized in that described to judge whether the recognition result is calculation
Formula includes:
It is from left to right sequentially arranged according to each character in the recognition result, and by each character in the recognition result
Stack is pressed into according to from left to right sequentially putting in order;
taking, as the current character, the character that is popped first from the stack, following the pop order of the characters;
querying the current character using an operator lookup method to obtain a query result;
determining, according to the query result, the character type of the query result, wherein the character types include an operator type and a non-operator type;
if the character type of the query result is the operator type, obtaining a preset operator initial value as the current quantity;
adding 1 to the current quantity to obtain a quantity result;
judging whether the current character is the last character to be popped in the pop order; if the current character is the last character to be popped, judging whether the quantity result is greater than or equal to the sum of the preset operator initial value and 2, and whether an equals sign exists among all characters whose character type is the operator type;
if the quantity result is greater than or equal to the sum of the preset operator initial value and 2, and an equals sign exists among all characters whose character type is the operator type, determining that the recognition result is a formula; if the quantity result is less than the sum of the preset operator initial value and 2, and no equals sign exists among all characters whose character type is the operator type, determining that the recognition result is a non-formula;
if the current character is not the last character to be popped in the pop order, taking the next character to be popped as the current character, taking the quantity result as the current quantity, and returning to the step of querying the current character using the operator lookup method to obtain a query result.
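The formula test in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `OPERATORS` set stands in for the unspecified operator lookup method, the names `is_formula` and `initial_count` are hypothetical, and a recognition result that satisfies only one of the two conditions is treated as a non-formula, which the claim does not spell out.

```python
# Assumed operator table; the claim only says an "operator lookup method" is used.
OPERATORS = {"+", "-", "*", "/", "="}


def is_formula(recognition_result: str, initial_count: int = 0) -> bool:
    """Return True if the recognized text looks like an arithmetic formula."""
    # Push the characters in left-to-right order.
    stack = list(recognition_result)
    count = initial_count      # running quantity, starts at the preset initial value
    has_equals = False
    # Pop one character at a time until the last character has been handled.
    while stack:
        current = stack.pop()
        if current in OPERATORS:          # operator type
            count += 1                    # quantity result = current quantity + 1
            if current == "=":
                has_equals = True
    # Formula: at least two operator characters, one of which is an equals sign.
    # Assumption: every other combination is reported as a non-formula.
    return count >= initial_count + 2 and has_equals
```

For example, `is_formula("3+5=?")` is true because the text holds two operator characters including an equals sign, while `is_formula("aB3d")` is false.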
4. The verification code recognition method according to claim 1, characterized in that, before the recognition processing is performed on the verification code picture to be recognized based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized, the verification code recognition method further comprises:
removing interference lines from the verification code picture to be recognized using a depth-first search algorithm, to obtain an interference-free verification code picture to be recognized;
converting the interference-free verification code picture to be recognized according to a preset conversion mode, to obtain a grayscale picture;
binarizing the grayscale picture using a clustering algorithm, to obtain a binarized picture;
denoising the binarized picture using a flood fill algorithm, to obtain the denoised verification code picture to be recognized.
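A rough Python sketch of the grayscale, binarization and denoising steps follows. Everything concrete in it is an assumption, since the claim names only the algorithm families: the luma weights, the two-cluster (2-means) threshold, the 4-connected flood fill, the `min_size` cutoff and all function names are illustrative. The interference-line removal step is omitted because the claim gives no detail on how the depth-first search is applied.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to gray levels (assumed BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def binarize_two_means(gray, iterations=10):
    """Binarize with a two-cluster split of the gray levels; 1 marks ink (dark)."""
    pixels = [p for row in gray for p in row]
    low, high = min(pixels), max(pixels)          # initial cluster centres
    for _ in range(iterations):
        threshold = (low + high) / 2
        dark = [p for p in pixels if p <= threshold]
        light = [p for p in pixels if p > threshold]
        if not dark or not light:
            break
        low, high = sum(dark) / len(dark), sum(light) / len(light)
    threshold = (low + high) / 2
    return [[1 if p <= threshold else 0 for p in row] for row in gray]


def remove_small_blobs(binary, min_size=3):
    """Flood-fill each ink region and erase regions smaller than min_size."""
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if out[y][x] != 1 or seen[y][x]:
                continue
            pending, blob = [(y, x)], []
            seen[y][x] = True
            while pending:                         # 4-connected flood fill
                cy, cx = pending.pop()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width \
                            and out[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        pending.append((ny, nx))
            if len(blob) < min_size:               # treat tiny regions as noise
                for by, bx in blob:
                    out[by][bx] = 0
    return out
```

Chaining the three calls, `remove_small_blobs(binarize_two_means(to_gray(image)))`, yields a 0/1 picture in which isolated specks have been cleared while larger strokes are kept.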
5. The verification code recognition method according to any one of claims 1 to 4, characterized in that the obtaining of the verification code picture to be recognized on the target website specifically comprises: obtaining each verification code picture to be recognized on each target website;
before the recognition processing is performed on the verification code picture to be recognized based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized, the verification code recognition method further comprises: determining one of the verification code pictures to be recognized as the current verification code picture;
the performing of recognition processing on the verification code picture to be recognized based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized specifically comprises: performing recognition processing on the current verification code picture based on Tesseract to obtain the recognition result corresponding to the current verification code picture;
after the calculated result corresponding to the recognition result is filled into the input box corresponding to the verification code picture, or after the recognition result is filled into the input box corresponding to the verification code picture, the verification code recognition method further comprises: determining one not-yet-recognized verification code picture among the verification code pictures to be recognized as the current verification code picture, and returning to the step of performing recognition processing on the current verification code picture based on Tesseract to obtain the recognition result corresponding to the current verification code picture, until every verification code picture to be recognized has been processed and its corresponding recognition result obtained.
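The loop described in this claim could be sketched as follows. This is only an outline under stated assumptions: `recognize` and `fill` are hypothetical callables standing in for the Tesseract-based recognition and the input-box filling, pictures are represented by hashable identifiers, and the formula branch is left out for brevity.

```python
from collections import deque


def solve_all(pictures, recognize, fill):
    """Process every pending picture once and collect the recognition results."""
    pending = deque(pictures)          # pictures not yet recognized
    results = {}
    while pending:
        current = pending.popleft()    # choose the current verification code picture
        text = recognize(current)      # hypothetical recognizer callable
        fill(current, text)            # hypothetical input-box filler callable
        results[current] = text
    return results
```

The loop ends when no unrecognized picture remains, which matches the termination condition stated in the claim.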
6. A verification code recognition device, characterized in that the verification code recognition device comprises:
an obtaining module, configured to obtain the verification code picture to be recognized on a target website;
a recognition processing module, configured to perform recognition processing on the verification code picture to be recognized based on Tesseract to obtain the recognition result corresponding to the verification code picture to be recognized, wherein Tesseract is an optical character recognition tool;
a judgment module, configured to judge whether the recognition result is a formula;
a first filling module, configured to fill the calculated result corresponding to the recognition result into the input box corresponding to the verification code picture to be recognized if the recognition result is a formula;
a second filling module, configured to fill the recognition result into the input box corresponding to the verification code picture to be recognized if the recognition result is a non-formula.
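How the modules of this device might cooperate is sketched below in Python. The sketch makes several assumptions that are not in the claim: the judgment module is reduced to a regular-expression match, only two-operand integer formulas are calculated, division is integer division, and the `acquire`, `recognize` and `fill` callables as well as the class and function names are hypothetical stand-ins.

```python
import re

# Assumed formula shape: "<int> <op> <int> =", e.g. "3+5=?".
_FORMULA = re.compile(r"^\s*(\d+)\s*([+\-*/])\s*(\d+)\s*=")


def calculate(formula: str) -> str:
    """Compute the value of a two-operand formula such as '3+5=?'."""
    match = _FORMULA.match(formula)
    if match is None:
        raise ValueError(f"not a supported formula: {formula!r}")
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        value = left + right
    elif op == "-":
        value = left - right
    elif op == "*":
        value = left * right
    else:
        value = left // right          # assumption: integer division
    return str(value)


class CaptchaSolver:
    """Wires together obtaining, recognition, judgment and filling."""

    def __init__(self, acquire, recognize, fill):
        self.acquire = acquire         # obtaining module (hypothetical callable)
        self.recognize = recognize     # recognition processing module (hypothetical)
        self.fill = fill               # filling modules (hypothetical)

    def solve(self, site):
        picture = self.acquire(site)
        text = self.recognize(picture)
        # Judgment module: formula results are calculated, others are used as-is.
        answer = calculate(text) if _FORMULA.match(text) else text
        self.fill(site, answer)
        return answer
```

Passing the three collaborators in as callables keeps the sketch independent of any particular browser automation or OCR binding.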
7. The verification code recognition device according to claim 6, characterized in that the recognition processing module comprises:
a segmentation submodule, configured to segment the verification code picture to be recognized using a vertical projection method to obtain sub verification code pictures;
a normalization submodule, configured to normalize the size of each sub verification code picture according to preset dimensions to obtain normalized sub verification code pictures;
a first recognition submodule, configured to recognize each normalized sub verification code picture based on Tesseract to obtain the verification code content information corresponding to each normalized sub verification code picture;
a splicing submodule, configured to splice the items of verification code content information in the left-to-right order they had before the verification code picture was segmented, to obtain the recognition result corresponding to the verification code picture to be recognized, the recognition result including one or more characters.
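The segmentation, normalization and splicing steps can be sketched in Python as follows. The picture is assumed to be an already binarized list of rows with 1 for ink, normalization uses nearest-neighbour resampling as one possible choice, and `recognize_char` is a hypothetical callable standing in for the Tesseract call on a single sub-picture; none of these details come from the claim.

```python
def split_by_vertical_projection(binary):
    """Cut the picture at columns that contain no ink, left to right."""
    width = len(binary[0])
    projection = [sum(row[x] for row in binary) for x in range(width)]
    pieces, start = [], None
    for x, ink in enumerate(projection):
        if ink and start is None:
            start = x                                  # a character begins
        elif not ink and start is not None:
            pieces.append([row[start:x] for row in binary])
            start = None                               # the character ended
    if start is not None:                              # character touching the right edge
        pieces.append([row[start:] for row in binary])
    return pieces


def resize_nearest(piece, height, width):
    """Normalize a sub-picture to height x width by nearest-neighbour sampling."""
    src_h, src_w = len(piece), len(piece[0])
    return [[piece[y * src_h // height][x * src_w // width] for x in range(width)]
            for y in range(height)]


def recognize_picture(binary, recognize_char, size=(8, 8)):
    """Segment, normalize, recognize each piece, then splice left to right."""
    pieces = split_by_vertical_projection(binary)
    normalized = [resize_nearest(piece, *size) for piece in pieces]
    return "".join(recognize_char(piece) for piece in normalized)
```

Because the pieces are collected while scanning columns from left to right, joining the per-piece results in list order preserves the original character order.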
8. The verification code recognition device according to any one of claims 6 to 7, characterized in that the judgment module comprises:
a push submodule, configured to arrange the characters in the recognition result in left-to-right order, and to push each character in the recognition result onto a stack in that left-to-right order;
a first extraction submodule, configured to take, as the current character, the character that is popped first from the stack, following the pop order of the characters;
a query submodule, configured to query the current character using an operator lookup method to obtain a query result;
a determination submodule, configured to determine the character type of the query result according to the query result, wherein the character types include an operator type and a non-operator type;
a second extraction submodule, configured to obtain a preset operator initial value as the current quantity if the character type of the query result is the operator type;
an operation submodule, configured to add 1 to the current quantity to obtain a quantity result;
a first discrimination submodule, configured to judge whether the current character is the last character to be popped in the pop order; if the current character is the last character to be popped, to judge whether the quantity result is greater than or equal to the sum of the preset operator initial value and 2, and whether an equals sign exists among all characters whose character type is the operator type; if the quantity result is greater than or equal to the sum of the preset operator initial value and 2, and an equals sign exists among all characters whose character type is the operator type, to determine that the recognition result is a formula; if the quantity result is less than the sum of the preset operator initial value and 2, and no equals sign exists among all characters whose character type is the operator type, to determine that the recognition result is a non-formula; and if the current character is not the last character to be popped in the pop order, to take the next character to be popped as the current character, take the quantity result as the current quantity, and trigger the query submodule.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the verification code recognition method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the verification code recognition method according to any one of claims 1 to 5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810595668.3A CN109086591A (en) | 2018-06-11 | 2018-06-11 | Method for recognizing verification code, device, computer equipment and storage medium |
PCT/CN2018/106400 WO2019237549A1 (en) | 2018-06-11 | 2018-09-19 | Verification code recognition method and apparatus, computer device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810595668.3A CN109086591A (en) | 2018-06-11 | 2018-06-11 | Method for recognizing verification code, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109086591A true CN109086591A (en) | 2018-12-25 |
Family
ID=64839903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810595668.3A Pending CN109086591A (en) | 2018-06-11 | 2018-06-11 | Method for recognizing verification code, device, computer equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109086591A (en) |
WO (1) | WO2019237549A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476528B (en) * | 2020-04-27 | 2023-07-11 | 上海东普信息科技有限公司 | Data processing method, device, equipment and storage medium based on express delivery library |
CN111611988A (en) * | 2020-05-22 | 2020-09-01 | 上海携程商务有限公司 | Picture verification code identification method and device, electronic equipment and computer readable medium |
CN111652130B (en) * | 2020-06-02 | 2023-09-15 | 上海语识信息技术有限公司 | Method for identifying number, symbol and letter group of non-specific font |
CN111666737B (en) * | 2020-06-04 | 2023-04-25 | 广州博高信息科技有限公司 | Multi-coding rule compatible processing method, device, equipment and medium for regional library |
CN111753845A (en) * | 2020-06-30 | 2020-10-09 | 北京来也网络科技有限公司 | AI-based verification code picture identification method, device, equipment and storage medium |
CN111881810B (en) * | 2020-07-23 | 2024-03-29 | 前海人寿保险股份有限公司 | Certificate identification method, device, terminal and storage medium based on OCR |
CN112270325B (en) * | 2020-11-09 | 2024-05-24 | 携程旅游网络技术(上海)有限公司 | Character verification code recognition model training method, recognition method, system, equipment and medium |
CN112487394A (en) * | 2020-11-30 | 2021-03-12 | 携程旅游网络技术(上海)有限公司 | Method, system, device and medium for identifying graph reasoning verification code |
CN112686266A (en) * | 2021-01-11 | 2021-04-20 | 安徽希施玛数据科技有限公司 | Verification code identification method and device |
CN114861644A (en) * | 2022-04-12 | 2022-08-05 | 深圳追一科技有限公司 | Time identification method, system, device and medium |
CN115150186A (en) * | 2022-07-27 | 2022-10-04 | 张瑜 | Verification code verification method, system, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103473531A (en) * | 2013-09-04 | 2013-12-25 | 上海索广电子有限公司 | Digit image recognition and error correction method based on name board digit recognition |
CN105681344A (en) * | 2016-03-11 | 2016-06-15 | 广东亿迅科技有限公司 | Verification code recognition system and method |
CN106650398A (en) * | 2017-01-03 | 2017-05-10 | 深圳博十强志科技有限公司 | Recognition system and recognition method for verification code of mobile platform |
CN106886996A (en) * | 2017-02-10 | 2017-06-23 | 九次方大数据信息集团有限公司 | Dividing method and device based on mathematical operation identifying code image |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104462930B (en) * | 2014-11-18 | 2017-11-17 | 百度在线网络技术(北京)有限公司 | Verification code generation method and device |
EA031834B1 (en) * | 2015-07-01 | 2019-02-28 | Дмитрий Маринкин | Method for identifying authenticity of an item having security marking on its surface |
CN105763319A (en) * | 2016-02-02 | 2016-07-13 | 南京云创大数据科技股份有限公司 | Random multi-state verification code generation method |
CN106570159A (en) * | 2016-11-07 | 2017-04-19 | 国家电网公司 | Supplier bidding document qualification information verification system and method |
-
2018
- 2018-06-11 CN CN201810595668.3A patent/CN109086591A/en active Pending
- 2018-09-19 WO PCT/CN2018/106400 patent/WO2019237549A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
李凯胜: "中文验证码识别技术研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111163082A (en) * | 2019-12-27 | 2020-05-15 | 咪咕文化科技有限公司 | Verification code generation method, verification method, electronic equipment and storage medium |
CN111163082B (en) * | 2019-12-27 | 2022-03-25 | 咪咕文化科技有限公司 | Verification code generation method, verification method, electronic equipment and storage medium |
CN112948801A (en) * | 2021-03-05 | 2021-06-11 | 上海臣星软件技术有限公司 | Method, device and equipment for inputting verification code and computer storage medium |
CN115208704A (en) * | 2022-09-16 | 2022-10-18 | 欣诚信息技术有限公司 | Identity authentication system and political service application system |
CN115712887A (en) * | 2023-01-09 | 2023-02-24 | 成方金融科技有限公司 | Picture verification code identification method and device, electronic equipment and storage medium |
CN116186674A (en) * | 2023-02-21 | 2023-05-30 | 宿迁乐享知途网络科技有限公司 | High-contrast man-machine interaction verification method |
Also Published As
Publication number | Publication date |
---|---|
WO2019237549A1 (en) | 2019-12-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109086591A (en) | Method for recognizing verification code, device, computer equipment and storage medium | |
CN108595583B (en) | Dynamic graph page data crawling method, device, terminal and storage medium | |
CN110334585A (en) | Table recognition method, apparatus, computer equipment and storage medium | |
CN109272043B (en) | Training data generation method and system for optical character recognition and electronic equipment | |
CN109634961B (en) | Test paper sample generation method and device, electronic equipment and storage medium | |
CN111652232B (en) | Bill identification method and device, electronic equipment and computer readable storage medium | |
CN110399291A (en) | User Page test method and relevant device based on image recognition | |
CN109255356A (en) | A kind of character recognition method, device and computer readable storage medium | |
US9633256B2 (en) | Methods and systems for efficient automated symbol recognition using multiple clusters of symbol patterns | |
US9892114B2 (en) | Methods and systems for efficient automated symbol recognition | |
CN111626124A (en) | OCR image sample generation method, OCR image sample generation device, OCR image sample printing body verification equipment and OCR image sample printing body verification medium | |
CN110147787A (en) | Bank's card number automatic identifying method and system based on deep learning | |
KR102442350B1 (en) | Information analyzing method for performing autamatic generating of document based on artificial intelligence and apparatus therefor | |
CN114663904A (en) | PDF document layout detection method, device, equipment and medium | |
CN108154191A (en) | The recognition methods of file and picture and system | |
CN105117723B (en) | A kind of image-recognizing method and device | |
CN111353689B (en) | Risk assessment method and device | |
CN112446259A (en) | Image processing method, device, terminal and computer readable storage medium | |
CN112988557A (en) | Search box positioning method, data acquisition device and medium | |
CN111462388A (en) | Bill inspection method and device, terminal equipment and storage medium | |
CN109147002A (en) | A kind of image processing method and device | |
CN112508000A (en) | Method and equipment for generating OCR image recognition model training data | |
CN114579796B (en) | Machine reading understanding method and device | |
CN116052195A (en) | Document parsing method, device, terminal equipment and computer readable storage medium | |
CN112541505B (en) | Text recognition method, text recognition device and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181225 |