CN110246197B - Verification codeword generation method and device, electronic equipment and storage medium - Google Patents

Publication number
CN110246197B
Authority
CN
China
Prior art keywords
font
character
vector
character image
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910425674.9A
Other languages
Chinese (zh)
Other versions
CN110246197A (en)
Inventor
张兴盟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910425674.9A priority Critical patent/CN110246197B/en
Publication of CN110246197A publication Critical patent/CN110246197A/en
Application granted granted Critical
Publication of CN110246197B publication Critical patent/CN110246197B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31User authentication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/001Texturing; Colouring; Generation of texture or colour
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/08Network architectures or network communication protocols for network security for authentication of entities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3228One-time or temporary data, i.e. information which is sent for every authentication or authorization, e.g. one-time-password, one-time-token or one-time-key

Abstract

The invention discloses a method, a device, electronic equipment and a storage medium for generating verification code characters. The method comprises the following steps: receiving a first character image, and encoding the first character image to obtain a character embedding vector of the first character image, wherein the first character image contains a target character in a first font, and the first font is a known font; receiving a font selection instruction, selecting at least one font from a preset font library, and generating a font embedding vector of the selected font, wherein the preset font library comprises N fonts, all of which are known fonts; splicing (concatenating) the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector; and generating a second character image according to the character body vector and a preset character image generation model, wherein the second character image contains the target character in a second font, the second font is an unknown font, and the character image generation model records the mapping relation between known fonts and unknown fonts.

Description

Verification codeword generation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of security verification technologies, and in particular to a method and apparatus for generating verification code characters, an electronic device, and a storage medium.
Background
With the development of internet technology, the types and functions of the applications provided by internet companies have become increasingly rich, bringing great convenience to people's life, work and study. Meanwhile, the phenomenon of black- and gray-market "wool pulling" (bonus abuse) has emerged: black- and gray-market groups use machines to maliciously register or log in to application accounts and steal the assets of internet companies, causing these companies huge losses.
To prevent black- and gray-market "wool pulling", internet companies verify the identity of the operator when an application account is registered or logged in to. Because a character-selection verification code incorporates knowledge of textual logic, it is much more secure than other verification codes, and many internet companies therefore use it to verify the identity of operators.
However, with the advent of OCR (Optical Character Recognition) technology, black- and gray-market groups have begun to crack character-selection verification codes with OCR, which lowers the security of these codes. How to ensure the security of character-selection verification codes when OCR technology is widely applied has therefore become a problem to be solved in the industry.
Disclosure of Invention
The embodiments of the invention provide a method, a device, electronic equipment and a storage medium for generating verification code characters, which are used to solve the technical problem in the prior art that character-selection verification codes have low security.
According to a first aspect of the present invention, a method for generating verification code characters is disclosed, the method comprising:
receiving a first character image, and encoding the first character image to obtain a character embedding vector of the first character image, wherein the first character image contains a target character in a first font, and the first font is a known font;
receiving a font selection instruction, selecting at least one font from a preset font library, and generating a font embedding vector of the selected font, wherein the preset font library comprises N fonts, all of which are known fonts;
splicing the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector;
generating a second character image according to the character body vector and a preset character image generation model, wherein the second character image contains the target character in a second font, the second font is an unknown font, the character image generation model records the mapping relation between known fonts and unknown fonts, and one character image contains one character.
Optionally, as an embodiment, before the step of generating the second character image according to the character body vector and the preset character image generation model, the method further includes:
training a character image generation model, wherein the training process of the character image generation model comprises the following steps:
acquiring a character set, wherein the character set comprises M different characters;
drawing each character in the character set into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M \times N}\}$, wherein $M \times N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$;
for each character image $P_i$, encoding $P_i$ to obtain the character embedding vector of $P_i$, generating the font embedding vector of $Q_i$, and splicing the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ to obtain the character body vector of $P_i$;
training on the character body vectors of all $P_i$ with a generative adversarial network (GAN) algorithm to obtain the character image generation model.
Optionally, as an embodiment, the font embedding vector is an N-dimensional column vector or an N-dimensional row vector, and each font embedding vector contains one 1 and N-1 zeros (that is, it is a one-hot vector).
Optionally, as an embodiment, the selecting at least one font from the preset font library, generating a font embedded vector of the selected font includes:
selecting a font from a preset font library to generate a font embedded vector;
the step of splicing the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector comprises the following steps:
and splicing the character embedding vector of the first character image and the generated one font embedding vector to obtain a character body vector.
Optionally, as an embodiment, the selecting at least one font from the preset font library, generating a font embedded vector of the selected font includes:
selecting a plurality of fonts from a preset font library, and generating a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector;
the step of splicing the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector comprises the following steps:
carrying out weighted summation on the plurality of font embedded vectors to obtain an interpolation font embedded vector;
and splicing the character embedding vector of the first character image and the interpolation font embedding vector to obtain a character body vector.
Optionally, as an embodiment, the first font and the font indicated by the font selection instruction are the same font or different fonts.
Optionally, as an embodiment, the character includes any one of the following: Chinese characters, numbers and letters.
According to a second aspect of the present invention, a verification code character generating apparatus is disclosed, the apparatus comprising:
the first receiving module is used for receiving a first character image, wherein the first character image comprises target characters with first fonts, and the first fonts are known fonts;
the encoding module is used for encoding the first character image to obtain a character embedding vector of the first character image;
the second receiving module is used for receiving the font selection instruction;
the first generation module is used for selecting at least one font from a preset font library and generating a font embedded vector of the selected font, wherein the preset font library comprises N fonts which are all known fonts;
the splicing module is used for splicing the character embedded vector of the first character image and the font embedded vector of the selected font to obtain a character body vector;
the second generation module is used for generating a second character image according to the character body vector and a preset character image generation model, wherein the second character image contains the target character in a second font, the second font is an unknown font, the character image generation model records the mapping relation between known fonts and unknown fonts, and one character image contains one character.
Optionally, as an embodiment, the apparatus further includes: training module, wherein, training module includes:
the character set acquisition sub-module is used for acquiring a character set, wherein the character set comprises M different characters;
a character image drawing sub-module for drawing each character in the character set into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M \times N}\}$, wherein $M \times N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$;
a character body vector generation sub-module for, for each character image $P_i$, encoding $P_i$ to obtain the character embedding vector of $P_i$, generating the font embedding vector of $Q_i$, and splicing the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ to obtain the character body vector of $P_i$;
a training sub-module for training on the character body vectors of all $P_i$ with a generative adversarial network (GAN) algorithm to obtain the character image generation model.
Optionally, as an embodiment, the font embedding vector is an N-dimensional column vector or an N-dimensional row vector, and each font embedding vector contains one 1 and N-1 zeros (that is, it is a one-hot vector).
Optionally, as an embodiment, the first generating module includes:
the first generation sub-module is used for selecting a font from a preset font library and generating a font embedded vector;
the splice module includes:
and the first splicing sub-module is used for splicing the character embedded vector of the first character image and the generated one font embedded vector to obtain a character body vector.
Optionally, as an embodiment, the first generating module includes:
the second generation sub-module is used for selecting a plurality of fonts from a preset font library and generating a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector;
the splice module includes:
the interpolation sub-module is used for carrying out weighted summation on the plurality of font embedding vectors to obtain an interpolated font embedding vector;
and the second splicing sub-module is used for splicing the character embedded vector of the first character image and the interpolation font embedded vector to obtain a character body vector.
Optionally, as an embodiment, the first font and the font indicated by the font selection instruction are the same font or different fonts.
Optionally, as an embodiment, the character includes any one of the following: Chinese characters, numbers and letters.
According to a third aspect of the present invention, an electronic device is disclosed, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, performs the steps of the captcha character generation method as described above.
According to a fourth aspect of the present invention, a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in a verification code character generation method as described above is disclosed.
In the embodiments of the invention, a character image in an unknown font can be generated automatically from a character image in a known font, and a character-selection verification code whose characters are in an unknown font can be constructed from the generated character images. Because the font of the characters in the character-selection verification code is unknown, existing OCR cannot crack it, which improves the security of the character-selection verification code.
Drawings
FIG. 1 is a flow chart of a method of captcha character generation in accordance with one embodiment of the present invention;
FIG. 2 is an exemplary diagram of a character image according to one embodiment of the invention;
FIG. 3 is a flow chart of a character image generation model training process of one embodiment of the present invention;
FIG. 4 is a block diagram of a verification code character generating apparatus according to an embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
In order to prevent black- and gray-market groups from using machines to maliciously and automatically register or log in to application accounts and steal the assets of internet companies, the security departments of internet companies place a checkpoint at the point where an operator registers or logs in to an application account, verifying the operator's identity so as to distinguish real users from black- and gray-market machines. Currently, some internet companies verify the identity of an operator by means of a character-selection verification code. However, with the advent of OCR technology, existing character-selection verification codes may be cracked, which reduces their security.
In order to solve the above technical problem, the embodiments of the invention provide a method, a device, electronic equipment and a storage medium for generating verification code characters.
The method for generating verification code characters provided by the embodiments of the invention is described first.
FIG. 1 is a flowchart of a verification code character generating method according to an embodiment of the present invention. The method is performed by an electronic device, which in practical applications may be a server. As shown in FIG. 1, the method may include the following steps: step 101, step 102, step 103 and step 104, wherein,
In step 101, a first character image is received, and the first character image is encoded to obtain a character embedding vector of the first character image, wherein the first character image includes a target character with a first font, and the first font is a known font.
In the embodiments of the present invention, the character may be any one of the following: Chinese characters, numbers and letters. A character image is an image containing only one character. For convenience of processing, the character image in the embodiments of the invention is preferably an image with a black character on a white background, that is, the background of the character image is white and the character is black.
Considering that Chinese characters are harder to crack than numbers and letters, the character image in the embodiments of the invention is preferably a Chinese character image, which contains only one Chinese character.
In the embodiments of the invention, both a known font (a font whose type is known) and an unknown font (a font whose type is unknown or does not exist) are defined relative to existing font libraries: a font that is recorded in an existing font library is a known font, and a font that is not recorded in any existing font library is an unknown font.
In one example, as shown in FIG. 2, the first character image is an image with a black character on a white background, the target character is "cut", and the first font is regular script.
In the embodiments of the invention, encoding the first character image converts it from image data into numerical data. In practical applications, an algorithm based on a deep neural network may be used to encode the first character image to obtain the embedding vector of the first character image.
In step 102, a font selection instruction is received, at least one font is selected from a preset font library, and a font embedding vector of the selected font is generated, wherein the preset font library comprises N fonts, and the N fonts are all known fonts.
In one example, the preset font library may include: Song Ti, bold, running script, regular script, hollowed-out, three-dimensional, and so on.
In the embodiment of the invention, the font selection instruction is used for indicating to select a font from a preset font library, specifically, may indicate to select one font from the preset font library, or may indicate to select a plurality of fonts from the preset font library. When a font is selected from a preset font library, generating a font embedded vector; when a plurality of fonts are selected from a preset font library, generating a corresponding font embedded vector for each selected font, and obtaining a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector.
In the embodiments of the invention, the font embedding vector may be an N-dimensional column vector or an N-dimensional row vector, and each font embedding vector contains one 1 and N-1 zeros (that is, it is a one-hot vector).
To facilitate understanding of the font embedding vector, consider a specific example. The preset font library includes Song Ti, bold, running script, regular script, hollowed-out and three-dimensional, 6 fonts in total. The font embedding vector of each font can be generated from the preset font library and the position assigned to each font within the font embedding vector. For example, if the position order is (Song Ti, bold, running script, regular script, hollowed-out, three-dimensional), the font embedding vectors of the fonts are:
Song Ti: (1,0,0,0,0,0);
bold: (0,1,0,0,0,0);
running script: (0,0,1,0,0,0);
regular script: (0,0,0,1,0,0);
hollowed-out: (0,0,0,0,1,0);
three-dimensional: (0,0,0,0,0,1).
When bold is selected from the preset font library according to the font selection instruction, the generated font embedding vector is (0,1,0,0,0,0); when bold and hollowed-out are selected from the preset font library according to the font selection instruction, the generated font embedding vectors are (0,1,0,0,0,0) and (0,0,0,0,1,0).
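The one-hot construction in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `FONT_LIBRARY`, `font_embedding` and `embeddings_for_selection` are hypothetical and do not appear in the patent, and the font order is the one used in the example above.

```python
# Hypothetical font library; the position of each font fixes the index of its 1.
FONT_LIBRARY = [
    "Song Ti", "bold", "running script",
    "regular script", "hollowed-out", "three-dimensional",
]


def font_embedding(font_name, library=FONT_LIBRARY):
    """Return the N-dimensional one-hot vector (one 1, N-1 zeros) for a font."""
    index = library.index(font_name)  # raises ValueError for a font not in the library
    return [1 if i == index else 0 for i in range(len(library))]


def embeddings_for_selection(selected_fonts, library=FONT_LIBRARY):
    """Return one font embedding vector per selected font, in selection order."""
    return [font_embedding(name, library) for name in selected_fonts]


print(font_embedding("bold"))
print(embeddings_for_selection(["bold", "hollowed-out"]))
```

Selecting a single font yields one vector, and selecting several fonts yields one vector per font, matching the two cases described for the font selection instruction.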
In the embodiments of the invention, the first font and the font indicated by the font selection instruction may be the same font or different fonts. For example, the font of the target character in the first character image is Song Ti, and the font selection instruction indicates Song Ti; or the font of the target character in the first character image is Song Ti, and the font selection instruction indicates bold; or the font of the target character in the first character image is Song Ti, and the font selection instruction indicates Song Ti and bold; or the font of the target character in the first character image is Song Ti, and the font selection instruction indicates bold and hollowed-out.
In step 103, the character embedding vector of the first character image and the font embedding vector of the selected font are spliced to obtain a character body vector.
In the embodiments of the invention, the first character image mainly provides the character features for generating the second character image (that is, which character is generated), and the font indicated by the font selection instruction mainly provides the font features for generating the second character image (that is, which known font the generated second font is closer or more similar to). Splicing the character embedding vector of the first character image and the font embedding vector of the selected font fuses the character features and the font features, and the fused features are used to generate the second character image. The font of the character in the second character image (that is, the second font) is a brand-new font that differs from the fonts in existing font libraries.
In one embodiment of the present invention, when a font is selected from a preset font library, the step 103 may specifically include the following steps:
and splicing the character embedded vector of the first character image and the generated one font embedded vector to obtain a character body vector.
In one example, the character embedding vector of the first character image is A and the font embedding vector of the selected font is B; splicing them gives the character body vector (A, B).
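The splicing step amounts to concatenating the two vectors. A minimal sketch follows, assuming plain Python lists as vectors; the function name `splice` and the numeric values of the character embedding are made up for illustration, since the patent does not specify the encoder output or its dimension.

```python
def splice(character_embedding, font_embedding):
    """Concatenate a character embedding vector and a font embedding vector: (A, B)."""
    return list(character_embedding) + list(font_embedding)


# Made-up 3-dimensional character embedding (a real encoder output would be larger).
A = [0.2, -1.3, 0.7]
# One-hot font embedding for the second font in a library of 6 fonts.
B = [0, 1, 0, 0, 0, 0]

character_body_vector = splice(A, B)
print(character_body_vector)
```

The resulting vector has length len(A) + len(B): the character information occupies the first part and the font information the last N entries.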
In the embodiments of the present invention, when the first font and the font indicated by the font selection instruction are the same font, the character in the second character image generated from the character body vector of this embodiment is the target character, and the font style of the character in the second character image (the second font) is close to the style of the first font. For example, if the character in the first character image is "wind", the first font is Song Ti, and the font selection instruction indicates Song Ti, then the character in the second character image is also "wind", and the style of the second font is similar to Song Ti but is not Song Ti.
When the first font and the font indicated by the font selection instruction are different fonts, the character in the second character image generated from the character body vector of this embodiment is the target character, and the font style of the character in the second character image lies between the style of the first font and the style of the font indicated by the font selection instruction. For example, if the character in the first character image is "wind", the first font is Song Ti, and the font selection instruction indicates bold, then the character in the second character image is also "wind", and the style of the second font is somewhat like Song Ti and somewhat like bold, but is neither Song Ti nor bold.
In another embodiment of the present invention, when a plurality of fonts are selected from a preset font library, the step 103 may specifically include the following steps:
carrying out weighted summation on the plurality of font embedded vectors to obtain an interpolation font embedded vector;
and splicing the character embedding vector of the first character image and the interpolation font embedding vector to obtain a character body vector.
In one example, the character embedding vector of the first character image is A, and the font embedding vectors of the selected fonts are B and C. B and C are weighted and summed to obtain the interpolated font embedding vector $bB + cC$, which is then spliced with A to obtain the character body vector $(A, bB + cC)$, wherein b and c are weight coefficients and $b + c = 1$.
In the embodiments of the invention, the weight coefficients used in the weighted summation determine how close the second font is to each font indicated by the font selection instruction: the larger a font's weight coefficient, the closer the second font is to that font.
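The weighted summation can be sketched as follows. This is an illustrative sketch only: `interpolate_fonts` is a hypothetical name, the weights 0.7 and 0.3 and the character embedding values are made up, and the check that the weights sum to 1 follows the constraint stated in the example above.

```python
def interpolate_fonts(font_embeddings, weights):
    """Weighted sum of font embedding vectors; the weights are assumed to sum to 1."""
    if len(font_embeddings) != len(weights):
        raise ValueError("one weight is required per font embedding vector")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weight coefficients must sum to 1")
    dim = len(font_embeddings[0])
    return [
        sum(w * vec[k] for w, vec in zip(weights, font_embeddings))
        for k in range(dim)
    ]


B = [0, 1, 0, 0, 0, 0]  # one-hot vector of the first selected font
C = [0, 0, 0, 0, 1, 0]  # one-hot vector of the second selected font
interpolated = interpolate_fonts([B, C], [0.7, 0.3])

A = [0.2, -1.3, 0.7]  # made-up character embedding vector
character_body_vector = A + interpolated  # splice: (A, bB + cC)
print(interpolated)
```

With one-hot inputs, each weight ends up at the position of its font, so a larger weight moves the interpolated vector closer to that font's one-hot vector.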
In this embodiment, the character in the second character image generated from the character body vector is the target character, and its font style lies between the style of the first font and the styles of the fonts indicated by the font selection instruction. For example, if the character in the first character image is "wind", the first font is Song Ti, and the font selection instruction indicates Song Ti and bold, then the character in the second character image is also "wind", and the style of the second font is somewhat like Song Ti and somewhat like bold, but is neither Song Ti nor bold. As another example, if the character in the first character image is "wind", the first font is Song Ti, and the font selection instruction indicates hollowed-out and bold, then the character in the second character image is also "wind", and the style of the second font is somewhat like Song Ti, somewhat like bold and somewhat hollowed-out, but is none of Song Ti, bold or hollowed-out.
In step 104, a second character image is generated according to the character body vector and a preset character image generation model, wherein the second character image contains target characters with a second font, the second font is an unknown font, the character image generation model records the mapping relation between the known font and the unknown font, and one character image contains one character.
In the embodiment of the invention, the character image generation model can be obtained by training on character body vectors with any of various machine learning algorithms. If the newly generated font (the second font) differs too much from the existing fonts, users will not be able to recognize the characters rendered in it; moreover, the GAN (Generative Adversarial Network) algorithm is well developed in the field of image style transfer. Therefore, the character image generation model is preferably trained with the GAN algorithm. As shown in fig. 3, the training process of the character image generation model may include the following steps: step 301, step 302, step 303 and step 304, wherein,
in step 301, a character set is obtained, wherein the character set comprises M different characters.
In the embodiment of the invention, the character set comprises M characters, wherein the M characters are different from each other and are all existing characters.
Considering that the more characters the character set contains, the more character images the trained character image generation model can generate, and also considering how familiar most users are with the characters, in the embodiment of the invention a character set containing 3500 commonly used characters may be obtained.
In step 302, each character in the character set is drawn into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M*N}\}$, where $M*N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$.
In the embodiment of the invention, when drawing the character image, any image drawing method (such as a row_font method of PIL) of the related art may be used for drawing.
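The way step 302 pairs every character with every font can be sketched as below. This is a minimal sketch, assuming placeholder characters "A" to "D" and the function name `build_sample_index`, both of which are hypothetical. The actual rendering of each pair into an image is omitted because it depends on font files and on the image drawing library chosen.

```python
from itertools import product
from typing import List, Sequence, Tuple


def build_sample_index(characters: Sequence[str],
                       fonts: Sequence[str]) -> List[Tuple[str, str]]:
    """Return the (character, font) pair behind every image P_i.

    The list has M*N entries for M distinct characters and N fonts;
    entry i-1 describes P_i, and its second element is the font Q_i.
    """
    if len(set(characters)) != len(characters):
        raise ValueError("characters in the character set must be distinct")
    return list(product(characters, fonts))


# Placeholder characters stand in for the characters of the character set.
characters = ["A", "B", "C", "D"]
fonts = ["Song Ti", "bold", "regular script", "hollowed-out"]

samples = build_sample_index(characters, fonts)
num_images = len(samples)  # M*N = 4*4
```

Each pair would then be passed to whatever drawing routine is used to produce the corresponding character image.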
To facilitate understanding of how the training sample set is generated, a specific example follows. In one example, the preset font library includes Song Ti, bold, regular script and hollowed-out, 4 fonts in total; the character set includes "wind", "rain", "lightning" and "electricity", 4 characters in total. Drawing each of the 4 characters in the character set in each of the 4 fonts in the preset font library yields 16 character images in total, as shown in the following table 1:
TABLE 1
In step 303, for each character image $P_i$, $P_i$ is encoded to obtain the character embedding vector of $P_i$, the font embedding vector of $Q_i$ is generated, and the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ are spliced to obtain the character shape vector of $P_i$.
Considering that the image type data cannot be directly used for model training based on the GAN algorithm, in the embodiment of the invention, each character image in the training sample set is encoded, converted into numerical type data, and then input into the GAN model for training.
In the embodiment of the present invention, each character image in the training sample set is encoded with the same encoding method as in step 101, the font embedding vector of font $Q_i$ is generated in the same way as in step 102, and the character shape vector of character image $P_i$ is obtained with the same splicing manner as in step 103; these are not described here again.
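Step 303 can be illustrated for a single training sample as follows. This is a sketch under explicit assumptions: the font embedding is taken to be a one-hot vector (one 1 and N-1 zeros), and `encode_character_image` is a hypothetical stand-in that simply flattens a small bitmap, since the encoding method of step 101 is not reproduced here.

```python
from typing import List, Sequence


def one_hot(index: int, size: int) -> List[float]:
    """Font embedding with a single 1 at the font's index (assumed one-hot)."""
    if not 0 <= index < size:
        raise ValueError("font index out of range")
    return [1.0 if k == index else 0.0 for k in range(size)]


def encode_character_image(image: Sequence[Sequence[int]]) -> List[float]:
    """Hypothetical placeholder encoder: flatten the bitmap row by row.

    A real system would use the encoding method of step 101 instead.
    """
    return [float(pixel) for row in image for pixel in row]


def character_shape_vector(image: Sequence[Sequence[int]],
                           font_index: int,
                           num_fonts: int) -> List[float]:
    """Splice the character embedding of P_i with the font embedding of Q_i."""
    return encode_character_image(image) + one_hot(font_index, num_fonts)


# A toy 2x2 bitmap drawn in the third font (index 2) of a 4-font library.
bitmap = [[0, 1], [1, 0]]
vec = character_shape_vector(bitmap, font_index=2, num_fonts=4)
```

The first part of `vec` describes the character image and the last N entries identify the font, so the same splicing can be reused at generation time with an interpolated font part.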
Following the example in step 302, the character shape vectors of each character image in the training sample set are shown in the following table 2:
TABLE 2
In step 304, the character body vectors of all $P_i$ are used for training by a generative adversarial network (GAN) algorithm to obtain the character image generation model.
For ease of understanding, GAN is first described briefly. GAN is a deep learning model comprising two modules: a generator (also called a generative model) and a discriminator (also called a discriminative model). The generator learns the distribution of real images so that the images it generates look more realistic and fool the discriminator, and the discriminator judges whether a received image is real or fake. Throughout training, the generator strives to make its generated images more realistic while the discriminator strives to tell real images from fake ones; the process is equivalent to a two-player game. As training proceeds, the generator and the discriminator continually compete against each other until the two networks finally reach a dynamic equilibrium: the images generated by the generator are close to the real image distribution, the discriminator can no longer tell real images from fake ones, and its predicted probability that a given image is real is close to 0.5 (equivalent to random guessing).
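The two-player game described above is commonly written as the minimax objective of the original GAN formulation. The patent text does not state an objective, so the symbols below ($G$, $D$, $p_{\mathrm{data}}$, $p_z$, $p_g$) are assumptions taken from that standard formulation rather than from this document.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% so when the generated distribution matches the real one, p_g = p_data,
D^{*}(x) = \tfrac{1}{2}
```

Under this formulation, a discriminator output close to 0.5 is what one would expect once the generated images are distributed like the real ones.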
In the embodiment of the present invention, all character body vectors generated in step 303 are input into a GAN model for training; when the training result meets a preset condition, training is stopped and the trained generator is used as the character image generation model, where the preset condition may include: the output of the discriminator is 0.5, or the generator converges.
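One possible way to check such a preset condition during training is sketched below. Everything in it is an assumption made for illustration: the function name `should_stop`, the tolerance around 0.5, and the plateau test on recent generator losses as a proxy for convergence are not specified by the patent.

```python
from typing import Sequence


def should_stop(disc_outputs: Sequence[float],
                gen_losses: Sequence[float],
                tol: float = 0.05,
                window: int = 3,
                eps: float = 1e-3) -> bool:
    """Return True when either assumed stopping condition holds.

    disc_outputs: discriminator probabilities on a batch of generated images.
    gen_losses:   history of generator loss values, oldest first.
    """
    if not disc_outputs:
        raise ValueError("at least one discriminator output is required")
    # Condition 1: the discriminator is reduced to guessing (mean near 0.5).
    mean_output = sum(disc_outputs) / len(disc_outputs)
    near_half = abs(mean_output - 0.5) <= tol
    # Condition 2: the generator loss has plateaued over the last `window` steps.
    recent = list(gen_losses[-window:])
    converged = len(recent) == window and max(recent) - min(recent) <= eps
    return near_half or converged


stop_guessing = should_stop([0.48, 0.52, 0.50], [1.0, 0.9])
stop_plateau = should_stop([0.90, 0.90], [0.7, 0.7, 0.7004])
keep_training = should_stop([0.90, 0.95], [2.0, 1.5, 1.0])
```

In practice the thresholds would be tuned, since the discriminator output rarely settles at exactly 0.5.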
It can be seen that, in the embodiment of the invention, the character body vectors of the character images in the training sample set fuse the glyph features and the font features of the characters, and the fonts of the characters in the character images are all existing fonts, so that the trained character image generation model can be used to generate new fonts (namely intermediate fonts) that lie between the existing fonts.
In the embodiment of the present invention, the character shape vector generated in step 103 is input into the character image generation model, so as to obtain the second character image.
When the model is trained, besides generating the training sample set as described above, a large number of ready-made character images may also be obtained directly as the training sample set.
As can be seen from the above embodiment, in this embodiment, a character image with an unknown font type may be automatically generated based on a character image with a known font type, and a character selection verification code may be constructed based on the generated character image with an unknown font type. Because the font type of the characters in the character selection verification code is unknown, existing OCR cannot crack it, so that the security of the character selection verification code is improved.
Fig. 4 is a block diagram of a construction of a verification code character generating apparatus according to an embodiment of the present invention, as shown in fig. 4, the verification code character generating apparatus 400 may include: a first receiving module 401, an encoding module 402, a second receiving module 403, a first generating module 404, a splicing module 405 and a second generating module 406, wherein,
A first receiving module 401, configured to receive a first character image, where the first character image includes a target character with a first font, and the first font is a known font;
an encoding module 402, configured to encode the first character image to obtain a character embedding vector of the first character image;
a second receiving module 403, configured to receive a font selection instruction;
a first generating module 404, configured to select at least one font from a preset font library, and generate a font embedded vector of the selected font, where the preset font library includes N fonts, and the N fonts are all known fonts;
a stitching module 405, configured to stitch the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character shape vector;
a second generating module 406, configured to generate a second character image according to the character feature vector and a preset character image generating model, where the second character image includes a target character whose font is a second font, the second font is an unknown font, a mapping relationship between a known font and the unknown font is recorded in the character image generating model, and one character image includes one character.
As can be seen from the above embodiment, in this embodiment, a character image of unknown font type may be automatically generated based on a character image of known font type, and a character selection verification code may be constructed based on the generated character image of unknown font type. Because the font type of the characters in the character selection verification code is unknown, existing OCR cannot crack it, so that the security of the character selection verification code is improved.
Optionally, as an embodiment, the verification code character generating apparatus 400 may further include: a training module, wherein the training module may include:
the character set acquisition sub-module is used for acquiring a character set, wherein the character set comprises M different characters;
a character image drawing sub-module, configured to draw each character in the character set into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M*N}\}$, where $M*N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$;
a character shape vector generation sub-module, configured to, for each character image $P_i$, encode $P_i$ to obtain the character embedding vector of $P_i$, generate the font embedding vector of $Q_i$, and splice the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ to obtain the character shape vector of $P_i$;
a training sub-module, configured to train on the character body vectors of all $P_i$ by a generative adversarial network (GAN) algorithm to obtain the character image generation model.
Optionally, as an embodiment, the font embedded vector is an N-dimensional column vector or an N-dimensional row vector, and one font embedded vector includes: one 1 and N-1 zeros.
Alternatively, as an embodiment, the first generating module 404 may include:
the first generation sub-module is used for selecting a font from a preset font library and generating a font embedded vector;
the splicing module 405 may include:
and the first splicing sub-module is used for splicing the character embedded vector of the first character image and the generated one font embedded vector to obtain a character body vector.
Alternatively, as an embodiment, the first generating module 404 may include:
the second generation sub-module is used for selecting a plurality of fonts from a preset font library and generating a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector;
The splicing module 405 may include:
the interpolation operation sub-module is used for carrying out weighted summation on the plurality of font embedded vectors to obtain an interpolation font embedded vector;
and the second splicing sub-module is used for splicing the character embedded vector of the first character image and the interpolation font embedded vector to obtain a character body vector.
Optionally, as an embodiment, the first font and the font that the font selection instruction indicates is selected are the same font or are different fonts.
Optionally, as an embodiment, the character includes at least one of: chinese characters, numbers and letters.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
According to still another embodiment of the present invention, there is provided an electronic apparatus including: a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor performs the steps of the method of generating an authentication codeword as in any one of the embodiments above.
According to still another embodiment of the present invention, there is further provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for generating an authentication codeword according to any of the embodiments above.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
The method and device for generating a verification codeword, the electronic equipment and the storage medium provided by the invention have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiments is only intended to help understand the method and core idea of the invention. Meanwhile, those skilled in the art may, in accordance with the idea of the invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the invention.

Claims (14)

1. A method of generating an authentication codeword, the method comprising:
receiving a first character image, and encoding the first character image to obtain a character embedding vector of the first character image, wherein the first character image comprises target characters with first fonts, and the first fonts are known fonts;
receiving a font selection instruction, selecting at least one font from a preset font library, and generating a font embedded vector of the selected font, wherein the preset font library comprises N fonts which are all known fonts;
splicing the character embedded vector of the first character image and the font embedded vector of the selected font to obtain a character body vector;
generating a second character image according to the character body vector and a preset character image generation model, wherein the second character image comprises target characters with a second font, the second font is an unknown font, the character image generation model records the mapping relation between the known font and the unknown font, and one character image comprises one character;
before the step of generating the second character image according to the character shape vector and the preset character image generation model, the method further comprises the following steps:
Training a character image generation model, wherein the training process of the character image generation model comprises the following steps:
acquiring a character set, wherein the character set comprises M different characters;
drawing each character in the character set into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M*N}\}$, where $M*N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$;
for each character image $P_i$, encoding $P_i$ to obtain the character embedding vector of $P_i$, generating the font embedding vector of $Q_i$, and splicing the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ to obtain the character shape vector of $P_i$;
training on the character body vectors of all $P_i$ by a generative adversarial network (GAN) algorithm to obtain the character image generation model.
2. The method of claim 1, wherein the font embedding vector is an N-dimensional column vector or an N-dimensional row vector, and wherein one font embedding vector comprises: one 1 and N-1 zeros.
3. The method of claim 2, wherein selecting at least one font from a library of preset fonts, generating a font embedding vector for the selected font, comprises:
Selecting a font from a preset font library to generate a font embedded vector;
the step of splicing the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector comprises the following steps:
and splicing the character embedding vector of the first character image and the generated one font embedding vector to obtain a character body vector.
4. The method of claim 2, wherein selecting at least one font from a library of preset fonts, generating a font embedding vector for the selected font, comprises:
selecting a plurality of fonts from a preset font library, and generating a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector;
the step of splicing the character embedding vector of the first character image and the font embedding vector of the selected font to obtain a character body vector comprises the following steps:
carrying out weighted summation on the plurality of font embedded vectors to obtain an interpolation font embedded vector;
and splicing the character embedding vector of the first character image and the interpolation font embedding vector to obtain a character body vector.
5. The method of claim 1, wherein the first font and the font that the font selection instruction indicates is selected are the same font or are different fonts.
6. The method of any one of claims 1 to 5, wherein the character comprises any one of: chinese characters, numbers and letters.
7. An apparatus for generating an authentication codeword, the apparatus comprising:
the first receiving module is used for receiving a first character image, wherein the first character image comprises target characters with first fonts, and the first fonts are known fonts;
the encoding module is used for encoding the first character image to obtain a character embedding vector of the first character image;
the second receiving module is used for receiving the font selection instruction;
the first generation module is used for selecting at least one font from a preset font library and generating a font embedded vector of the selected font, wherein the preset font library comprises N fonts which are all known fonts;
the splicing module is used for splicing the character embedded vector of the first character image and the font embedded vector of the selected font to obtain a character body vector;
the second generation module is used for generating a second character image according to the character body vector and a preset character image generation model, wherein the second character image comprises target characters with a second font, the second font is an unknown font, the character image generation model records the mapping relation between the known font and the unknown font, and one character image comprises one character;
The apparatus further comprises: training module, wherein, training module includes:
the character set acquisition sub-module is used for acquiring a character set, wherein the character set comprises M different characters;
a character image drawing sub-module for drawing each character in the character set into a corresponding character image according to each font in the preset font library to obtain a training sample set $\{P_1, P_2, \ldots, P_{M*N}\}$, where $M*N$ is the number of character images in the training sample set, $P_i$ is the $i$-th character image in the training sample set, and the font of the character in $P_i$ is $Q_i$;
a character shape vector generation sub-module for, for each character image $P_i$, encoding $P_i$ to obtain the character embedding vector of $P_i$, generating the font embedding vector of $Q_i$, and splicing the character embedding vector of $P_i$ and the font embedding vector of $Q_i$ to obtain the character shape vector of $P_i$;
a training sub-module for training on the character body vectors of all $P_i$ by a generative adversarial network (GAN) algorithm to obtain the character image generation model.
8. The apparatus of claim 7, wherein the font embedding vector is an N-dimensional column vector or an N-dimensional row vector, and wherein one font embedding vector comprises: one 1 and N-1 zeros.
9. The apparatus of claim 8, wherein the first generation module comprises:
the first generation sub-module is used for selecting a font from a preset font library and generating a font embedded vector;
the splice module includes:
and the first splicing sub-module is used for splicing the character embedded vector of the first character image and the generated one font embedded vector to obtain a character body vector.
10. The apparatus of claim 8, wherein the first generation module comprises:
the second generation sub-module is used for selecting a plurality of fonts from a preset font library and generating a plurality of font embedded vectors, wherein one font corresponds to one font embedded vector;
the splice module includes:
the interpolation operation sub-module is used for carrying out weighted summation on the plurality of font embedded vectors to obtain an interpolation font embedded vector;
and the second splicing sub-module is used for splicing the character embedded vector of the first character image and the interpolation font embedded vector to obtain a character body vector.
11. The apparatus of claim 7, wherein the first font and the font that the font selection instruction indicates is selected are the same font or are different fonts.
12. The apparatus of any one of claims 7 to 11, wherein the character comprises any one of: chinese characters, numbers and letters.
13. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method of generating an authentication codeword as claimed in any of claims 1 to 6.
14. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps in the method of generating an authentication codeword according to any of claims 1 to 6.
CN201910425674.9A 2019-05-21 2019-05-21 Verification codeword generation method and device, electronic equipment and storage medium Active CN110246197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910425674.9A CN110246197B (en) 2019-05-21 2019-05-21 Verification codeword generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910425674.9A CN110246197B (en) 2019-05-21 2019-05-21 Verification codeword generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110246197A CN110246197A (en) 2019-09-17
CN110246197B true CN110246197B (en) 2023-12-26

Family

ID=67884602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910425674.9A Active CN110246197B (en) 2019-05-21 2019-05-21 Verification codeword generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110246197B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476853A (en) * 2020-03-17 2020-07-31 西安万像电子科技有限公司 Method, equipment and system for encoding and decoding character image
CN111815108A (en) * 2020-05-30 2020-10-23 国网上海市电力公司 Evaluation method for power grid engineering design change and on-site visa approval sheet
CN112905977A (en) * 2020-11-23 2021-06-04 重庆大学 Verification code generation method based on image style conversion
CN113435163B (en) * 2021-08-25 2021-11-16 南京中孚信息技术有限公司 OCR data generation method for any character combination

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6252671B1 (en) * 1998-05-22 2001-06-26 Adobe Systems Incorporated System for downloading fonts
CN101853313A (en) * 2010-07-01 2010-10-06 无锡骏聿科技有限公司 Handwriting font object library generating method based on font categorization
CN108170649A (en) * 2018-01-26 2018-06-15 广东工业大学 A kind of Hanzi font library generation method and device based on DCGAN depth networks
CN108470366A (en) * 2018-03-28 2018-08-31 同方威视技术股份有限公司 Analog image generation method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN110246197A (en) 2019-09-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant