CN111242241A - Method for amplifying etched character recognition network training sample - Google Patents
Method for amplifying etched character recognition network training sample
- Publication number
- CN111242241A (application CN202010096003.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- network
- stylized
- content
- loss
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses an augmentation method for etched character recognition network training samples, belonging to the fields of image processing and deep learning. The method comprises the following steps: acquiring etched character images in a scene; generating content images and style images from the etched character images; constructing a bidirectional generative adversarial network; training the bidirectional generative adversarial network; and inputting the content images and the style images into the trained bidirectional generative adversarial network to generate etched character images. By generating a large number of etched character images with the generative adversarial network, the invention obtains sufficient training samples even when the original sample set is small; compared with collecting samples manually, the method is faster and more efficient, the generated etched character images are more realistic, and the accuracy of recognizing etched characters with deep learning methods is improved.
Description
Technical Field
The invention belongs to the fields of image processing and deep learning, and particularly relates to an augmentation method for etched character recognition network training samples.
Background
Etched character recognition is commonly required for reading the text on industrial equipment nameplates and is one of the difficult problems in scene text recognition. Such nameplates are usually made of metal and are often installed outdoors, so their images frequently show degradations such as reflections, dirt, blur and scratches, which make the etched characters difficult to recognize.
Recognizing etched characters with deep learning requires a large amount of data to train a character recognition model with adequate generalization ability; with only a small number of samples, the trained model easily overfits. In practice, only a small number of etched character images can be collected in a given scene, so the shortage of sample data is severe and cannot meet the needs of deep learning. Moreover, collecting and organizing samples manually consumes considerable manpower and material resources and is very inefficient. Therefore, recognizing etched characters with deep learning must first solve the small-sample problem. Common image augmentation methods include flipping, rotation, scaling, cropping, translation and adding noise, but these methods only apply a series of random changes to the existing samples and can only produce images similar to the originals.
Disclosure of Invention
The invention aims to provide an image sample augmentation method for the problem that etched character recognition network training samples are scarce, rapidly generating a large number of etched character images to meet the training requirements of a deep learning network.
The technical solution adopted to achieve this purpose is as follows: an augmentation method for etched character recognition network training samples, comprising the following steps:
step 1, acquiring an etched character image in a scene;
step 2, generating a content image and a style image from the etched character image;
step 3, constructing a bidirectional generative adversarial network;
step 4, training the bidirectional generative adversarial network;
step 5, inputting the content image and the style image into the trained bidirectional generative adversarial network to generate an etched character image.
Further, generating the content image from the etched character image in step 2 specifically includes:
step 2-1, annotating the text information of the etched character images;
step 2-2, counting the character information according to the annotated ground-truth labels of the etched character images;
step 2-3, generating content images in a plurality of fonts according to the character information.
Further, generating the style image from the etched character image in step 2 specifically includes: generating the style image according to the features of the etched character images.
Further, generating the style image from the etched character image in step 2 specifically includes:
selecting, from the acquired etched character images, an image whose resolution meets a first preset condition and/or whose sharpness meets a second preset condition and/or whose feature saliency meets a third preset condition as the style image.
Further, constructing the bidirectional generative adversarial network in step 3 specifically includes:
step 3-1, constructing a stylization generative adversarial network;
step 3-2, constructing a de-stylization generative adversarial network;
step 3-3, constructing a loss function.
Further, the stylization generative adversarial network in step 3-1 includes a stylization generation network and a stylization discriminator network.
The stylization generation network takes a content image and a style image as input and outputs a stylized character image. It includes: a content encoder Ex1, whose input is the content image and whose output is a content feature vector; a style encoder Ex2, whose input is the style image and whose output is a style feature vector; and a generator Gx, whose inputs are the content feature vector and the style feature vector and whose output is the stylized character image.
The input of the stylization discriminator network is the stylized character image or a real etched character image, and its output is a number between 0 and 1 representing the probability that the input image is a real image.
Further, the de-stylization generative adversarial network in step 3-2 includes a de-stylization generation network and a de-stylization discriminator network.
The de-stylization generation network takes the stylized character image as input and outputs a de-stylized character image. It includes: a first encoder Ey, whose input is the stylized character image and whose output is a feature vector; and a second generator Gy, whose input is the feature vector output by the first encoder Ey and whose output is the de-stylized character image.
The input of the de-stylization discriminator network is the de-stylized character image or a real content image, and its output is a number between 0 and 1 representing the probability that the input image is a real image.
Further, the loss function L in step 3-3 includes a content image reconstruction loss L1, a loss L2 of the stylization generative adversarial network and a loss L3 of the de-stylization generative adversarial network:
L = L1 + L2 + L3
The content image reconstruction loss L1 ensures that the content encoder Ex1 can extract the core information of the content image; here x denotes the input content image and λx denotes the weight of the content image reconstruction loss, with λx in the range 0 to 1.
The loss L2 of the stylization generative adversarial network includes a first pixel loss Lspix and a first adversarial loss Lsadv:
L2 = λx1·Lspix + λx2·Lsadv
where Lspix denotes the first pixel loss, Lsadv denotes the first adversarial loss, and λx1 and λx2 denote the weights of Lspix and Lsadv respectively, each in the range 0 to 1.
The first pixel loss Lspix measures the pixel-wise difference between the stylized image y and the image y' generated by the stylization generation network, where x and y denote the input content image and the stylized image respectively.
The first adversarial loss Lsadv is computed with the stylization discriminator Dx on samples drawn uniformly along the straight line between the stylized image y and the generated image y', where λsadv is a weight parameter in the range 0 to 1.
The loss L3 of the de-stylization generative adversarial network includes a second pixel loss Ldpix, a second adversarial loss Ldadv and a content feature loss Ldfeat:
L3 = λy1·Ldpix + λy2·Ldadv + λy3·Ldfeat
where λy1, λy2 and λy3 denote the weights of Ldpix, Ldadv and Ldfeat respectively, each in the range 0 to 1. The second pixel loss Ldpix, the second adversarial loss Ldadv and the content feature loss Ldfeat are computed analogously for the de-stylization branch, with Ldadv computed using the de-stylization discriminator network.
further, training the bidirectional generation countermeasure network in step 4 includes:
step 4-1, initializing parameters and iteration times of a bidirectional generation countermeasure network;
step 4-2, inputting the content image into a content encoder of a stylized generation countermeasure network, inputting the characteristics output by the content encoder into the stylized generation network, calculating a loss function, and updating the parameters of the content encoder by using a gradient descent method;
step 4-3, inputting the style image into a de-stylized generation network to generate a false content image;
step 4-4, inputting the real content image and the fake content image into the de-stylized discrimination network respectively, calculating a loss function and updating parameters of the de-stylized discrimination network by using a gradient descent method;
step 4-5, inputting the false content image into a de-stylized discrimination network, calculating a loss function, and updating network parameters of the de-stylized generation network by using a gradient descent method;
step 4-6, inputting the content image and the style image into a stylized generation network to generate a false style image;
step 4-7, inputting the real style image and the false style image into a stylized discrimination network respectively, calculating a loss function and updating parameters of the stylized discrimination network by using a gradient descent method;
step 4-8, inputting the false style image into a stylized discrimination network, calculating a loss function, and updating network parameters of the stylized generation network by using a gradient descent method;
4-9, judging whether the current iteration times are smaller than a set threshold value, if so, repeating the steps 4-2-4-8; otherwise, the training of the bidirectional generation countermeasure network is finished.
Further, inputting the content image and the style image into the trained bidirectional generative adversarial network to generate the etched character image in step 5 specifically includes:
step 5-1, inputting the content image and the style image into the trained stylization generation network to generate etched character images;
step 5-2, screening the generated etched character images and deleting those that do not meet preset requirements.
Compared with the prior art, the invention has the following significant advantages: 1) a large number of etched character images are generated by a generative adversarial network, so that sufficient training samples can be obtained even when the sample set is small; 2) generating a large number of etched character images with the network is faster and more efficient than collecting samples manually; 3) the bidirectional generative adversarial network generates realistic character images, improving the accuracy of recognizing etched characters with deep learning methods.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a flow chart of a method for augmenting an etched character recognition network training sample in one embodiment.
FIG. 2 is a diagram illustrating content images in one embodiment.
FIG. 3 is a schematic illustration of an etched style image in one embodiment.
FIG. 4 is a schematic diagram of a bidirectional generative adversarial network in one embodiment.
FIG. 5 is a flow diagram of training the bidirectional generative adversarial network in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, with reference to FIG. 1, a method for augmenting etched character recognition network training samples is provided, comprising the following steps:
step 1, acquiring an etched character image in a scene;
step 2, generating a content image and a style image from the etched character image;
step 3, constructing a bidirectional generative adversarial network;
step 4, training the bidirectional generative adversarial network;
step 5, inputting the content image and the style image into the trained bidirectional generative adversarial network to generate an etched character image.
Further, in one embodiment, generating the content image from the etched character image in step 2 specifically includes:
step 2-1, annotating the text information of the etched character images;
step 2-2, counting the character information according to the annotated ground-truth labels of the etched character images;
step 2-3, generating content images in a plurality of fonts according to the character information, as shown in FIG. 2.
Here, the plurality of fonts includes, for example, Song (SimSun), KaiTi (regular script), HeiTi and FangSong (imitation Song), among others.
Here, as a specific example, the text color of the content image is black and the background color is white (see the rendering sketch below).
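As an illustration of steps 2-1 to 2-3, the content images can be rendered programmatically from the annotated text. The following Python sketch (using Pillow) is only one possible realization; the font file names, the 256 x 64 canvas size and the font size are assumptions and not specified by the invention. It renders each annotated string as black text on a white background in several fonts.

```python
from PIL import Image, ImageDraw, ImageFont

# Assumed values: font files, canvas size and font size are illustrative only.
FONT_PATHS = ["simsun.ttc", "simkai.ttf", "simfang.ttf"]   # hypothetical font files
IMG_SIZE = (256, 64)                                       # assumed to match the style images
FONT_SIZE = 48

def render_content_images(label: str):
    """Render one white-background, black-text content image per font (step 2-3)."""
    images = []
    for path in FONT_PATHS:
        font = ImageFont.truetype(path, FONT_SIZE)
        img = Image.new("RGB", IMG_SIZE, color="white")
        draw = ImageDraw.Draw(img)
        # Roughly center the text inside the canvas.
        left, top, right, bottom = draw.textbbox((0, 0), label, font=font)
        x = (IMG_SIZE[0] - (right - left)) // 2
        y = (IMG_SIZE[1] - (bottom - top)) // 2
        draw.text((x, y), label, fill="black", font=font)
        images.append(img)
    return images

# Example: one annotated string taken from the labelled etched character images.
samples = render_content_images("AB-1234")
```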
Further, in one embodiment, generating the style image from the etched character image in step 2 specifically includes: generating the style image according to the features of the etched character images.
Further, in one embodiment, generating the style image from the etched character image in step 2 specifically includes:
selecting, from the collected etched character images, an image whose resolution meets a first preset condition and/or whose sharpness meets a second preset condition and/or whose feature saliency meets a third preset condition as the style image, as shown in FIG. 3.
Here, the generated content image has the same size specification as the style image.
Further, in one embodiment, the bidirectional generative adversarial network constructed in step 3 is shown in FIG. 4, and the specific process includes:
step 3-1, constructing a stylization generative adversarial network;
step 3-2, constructing a de-stylization generative adversarial network;
step 3-3, constructing a loss function.
Further, in one embodiment, the stylization generative adversarial network in step 3-1 includes a stylization generation network and a stylization discriminator network.
The inputs of the stylization generation network are a content image and a style image, and its output is a stylized character image.
Here, the input images and the output image are three-channel images of the same size.
The stylization generation network includes a content encoder Ex1, a style encoder Ex2 and a generator Gx.
The input of the content encoder Ex1 is the content image and its output is a content feature vector. The content encoder first extracts features of the content image with convolutional layers; the output features are then upsampled with deconvolution layers and fused with features output by earlier network layers. An activation layer is placed before each convolution and deconvolution layer, and a batch normalization layer is placed after it.
The input of the style encoder Ex2 is the style image and its output is a style feature vector. The style encoder first extracts features of the style image with convolutional layers; the output features are then upsampled with deconvolution layers and fused with features output by earlier network layers. An activation layer is placed before each convolution and deconvolution layer, and a batch normalization layer is placed after it.
The inputs of the generator Gx are the content feature vector and the style feature vector, which have the same size, and its output is a stylized character image. The generator first concatenates the content feature vector and the style feature vector, and then generates the stylized character image by upsampling with several deconvolution layers. An activation layer is placed before each deconvolution layer, and a batch normalization layer is placed after it.
The input of the stylization discriminator network is a stylized character image or a real etched character image, and its output is a number between 0 and 1 representing the probability that the input image is a real image. The stylization discriminator network consists of convolutional layers, with an activation layer before each convolutional layer and a batch normalization layer after it.
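The text above fixes the overall structure (activation before and batch normalization after each convolution or deconvolution, feature concatenation, deconvolution upsampling, a convolutional discriminator with a 0-to-1 output) but not the layer counts, kernel sizes or channel widths, so the following PyTorch sketch is one plausible instantiation rather than the exact network of the invention; the skip-connection fusion inside the encoders is omitted for brevity.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Activation before the (de)convolution, batch normalization after it, as described above.
    return nn.Sequential(nn.LeakyReLU(0.2),
                         nn.Conv2d(c_in, c_out, 4, stride=2, padding=1),
                         nn.BatchNorm2d(c_out))

def deconv_block(c_in, c_out):
    return nn.Sequential(nn.ReLU(),
                         nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1),
                         nn.BatchNorm2d(c_out))

class Encoder(nn.Module):
    """Shared structure for the content encoder Ex1 and the style encoder Ex2 (skip fusion omitted)."""
    def __init__(self, in_ch=3, feat_ch=256):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1),   # first layer takes the raw image directly
            conv_block(64, 128),
            conv_block(128, feat_ch),
        )

    def forward(self, img):
        return self.down(img)            # feature map playing the role of the feature vector

class StylizationGenerator(nn.Module):
    """Gx: concatenates the content and style features, then upsamples with deconvolution layers."""
    def __init__(self, feat_ch=256, out_ch=3):
        super().__init__()
        self.up = nn.Sequential(
            deconv_block(2 * feat_ch, 128),
            deconv_block(128, 64),
            nn.ReLU(),
            nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, f_content, f_style):
        return self.up(torch.cat([f_content, f_style], dim=1))

class Discriminator(nn.Module):
    """Stylization discriminator Dx (the de-stylization discriminator can share this structure)."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1),
            conv_block(64, 128),
            conv_block(128, 256),
            nn.Conv2d(256, 1, 4),        # patch-level scores
        )

    def forward(self, img):
        score = self.net(img).mean(dim=(1, 2, 3))   # one score per image
        return score   # torch.sigmoid(score) reads as the 0-to-1 probability described above

# Hypothetical usage with 3 x 64 x 256 content/style tensors:
Ex1, Ex2, Gx, Dx = Encoder(), Encoder(), StylizationGenerator(), Discriminator()
content, style = torch.randn(4, 3, 64, 256), torch.randn(4, 3, 64, 256)
stylized = Gx(Ex1(content), Ex2(style))   # fake style (stylized character) image, same size as the inputs
```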
Further, in one embodiment, the de-stylization generative adversarial network in step 3-2 includes a de-stylization generation network and a de-stylization discriminator network.
The input of the de-stylization generation network is a stylized character image, and its output is a de-stylized character image.
The de-stylization generation network includes a first encoder Ey and a second generator Gy.
The input of the first encoder Ey is the stylized character image and its output is a feature vector. The encoder first extracts features of the input image with convolutional layers; the output features are then upsampled with deconvolution layers and fused with features output by earlier network layers. An activation layer is placed before each convolution and deconvolution layer, and a batch normalization layer is placed after it.
The input of the second generator Gy is the feature vector output by the first encoder Ey, and its output is the de-stylized character image. The generator generates the de-stylized character image by upsampling with several deconvolution layers. An activation layer is placed before each deconvolution layer, and a batch normalization layer is placed after it.
The input of the de-stylization discriminator network is a de-stylized character image or a real content image, and its output is a number between 0 and 1 representing the probability that the input image is a real image. The de-stylization discriminator network consists of convolutional layers, with an activation layer before each convolutional layer and a batch normalization layer after it.
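Under the same assumptions, and reusing the Encoder, deconv_block and Discriminator helpers from the previous sketch, the de-stylization branch might look as follows; the layer sizes are again only assumed values.

```python
class DestylizationNetwork(nn.Module):
    """Ey followed by Gy: maps a stylized character image back to a de-stylized (content-like) image."""
    def __init__(self, feat_ch=256, out_ch=3):
        super().__init__()
        self.Ey = Encoder(feat_ch=feat_ch)     # first encoder Ey
        self.Gy = nn.Sequential(               # second generator Gy: deconvolution upsampling
            deconv_block(feat_ch, 128),
            deconv_block(128, 64),
            nn.ReLU(),
            nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, stylized_img):
        return self.Gy(self.Ey(stylized_img))

destyle = DestylizationNetwork()   # de-stylization generation network
Dy = Discriminator()               # de-stylization discriminator: de-stylized vs. real content images
```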
Further, in one embodiment, the loss function L in step 3-3 includes a content image reconstruction loss L1, a loss L2 of the stylization generative adversarial network and a loss L3 of the de-stylization generative adversarial network:
L = L1 + L2 + L3
The content image reconstruction loss L1 ensures that the content encoder Ex1 can extract the core information of the content image (including the character structure, stroke information and the like); here x denotes the input content image and λx denotes the weight of the content image reconstruction loss, with λx in the range 0 to 1.
The loss L2 of the stylization generative adversarial network includes a first pixel loss Lspix and a first adversarial loss Lsadv:
L2 = λx1·Lspix + λx2·Lsadv
where Lspix denotes the first pixel loss, Lsadv denotes the first adversarial loss, and λx1 and λx2 denote the weights of Lspix and Lsadv respectively, each in the range 0 to 1.
The first pixel loss Lspix measures the pixel-wise difference between the stylized image y and the image y' generated by the stylization generation network, where x and y denote the input content image and the stylized image respectively.
The first adversarial loss Lsadv is computed with the stylization discriminator Dx on samples drawn uniformly along the straight line between the stylized image y and the generated image y', where λsadv is a weight parameter in the range 0 to 1.
The loss L3 of the de-stylization generative adversarial network includes a second pixel loss Ldpix, a second adversarial loss Ldadv and a content feature loss Ldfeat:
L3 = λy1·Ldpix + λy2·Ldadv + λy3·Ldfeat
where λy1, λy2 and λy3 denote the weights of Ldpix, Ldadv and Ldfeat respectively, each in the range 0 to 1. The second pixel loss Ldpix, the second adversarial loss Ldadv and the content feature loss Ldfeat are computed analogously for the de-stylization branch, with Ldadv computed using the de-stylization discriminator network.
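The individual loss terms are given as formula images in the original publication; the sketch below therefore only shows commonly used forms that are consistent with the descriptions above (an L1 reconstruction and pixel loss, an adversarial loss with a gradient penalty on samples interpolated between y and y', and an L1 feature loss) and should be read as assumptions, not as the invention's exact formulas.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(x, x_rec, lam_x=1.0):
    """Content image reconstruction loss L1 (assumed L1 form), keeping Ex1 informative about x."""
    return lam_x * F.l1_loss(x_rec, x)

def pixel_loss(generated, target):
    """Pixel losses Lspix / Ldpix (assumed L1 form)."""
    return F.l1_loss(generated, target)

def adversarial_loss_with_gp(D, real, fake, lam_gp=1.0):
    """Discriminator-side adversarial loss with a gradient penalty evaluated on samples drawn
    uniformly on the straight line between the real image and the generated image (assumed form)."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(D(interp).sum(), interp, create_graph=True)[0]
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return D(fake).mean() - D(real).mean() + lam_gp * penalty

def feature_loss(feat_generated, feat_real):
    """Content feature loss Ldfeat (assumed L1 form) computed on encoder features."""
    return F.l1_loss(feat_generated, feat_real)

# Total loss: L = L1 + lx1*Lspix + lx2*Lsadv + ly1*Ldpix + ly2*Ldadv + ly3*Ldfeat,
# with every weight chosen in the range 0 to 1 as stated above.
```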
further, in one embodiment, the goal of training the bidirectional generative countermeasure network in step 4 is to minimize the value of the loss function L by using a gradient descent method. The gradient descent method is to utilize the characteristic of convex function and to shift the parameters of the convex function by one step length along the direction opposite to the gradient to realize the descent of the function value. By repeating the iteration continuously, the local minimum value of the convex function is finally found. The update formula of the parameter θ in this process is:
where θ represents the parameter, η represents the iteration step size, and J (θ) represents the loss function.
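Written out, the update rule amounts to the following one-line helper (framework-independent; torch.optim.SGD applies the same rule):

```python
def gradient_descent_step(theta, grad, eta):
    """One gradient descent update: theta <- theta - eta * gradient of J at theta."""
    return theta - eta * grad
```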
With reference to FIG. 5, the specific process of training the bidirectional generative adversarial network includes the following steps (an illustrative code sketch follows the list):
step 4-1, initializing the parameters and the iteration count of the bidirectional generative adversarial network;
step 4-2, inputting the content image into the content encoder of the stylization generative adversarial network, inputting the features output by the content encoder into the stylization generation network, calculating the loss function, and updating the parameters of the content encoder by gradient descent;
step 4-3, inputting the style image into the de-stylization generation network to generate a fake content image;
step 4-4, inputting the real content image and the fake content image into the de-stylization discriminator network respectively, calculating the loss function and updating the parameters of the de-stylization discriminator network by gradient descent;
step 4-5, inputting the fake content image into the de-stylization discriminator network, calculating the loss function, and updating the parameters of the de-stylization generation network by gradient descent;
step 4-6, inputting the content image and the style image into the stylization generation network to generate a fake style image;
step 4-7, inputting the real style image and the fake style image into the stylization discriminator network respectively, calculating the loss function and updating the parameters of the stylization discriminator network by gradient descent;
step 4-8, inputting the fake style image into the stylization discriminator network, calculating the loss function, and updating the parameters of the stylization generation network by gradient descent;
step 4-9, judging whether the current iteration count is smaller than a set threshold; if so, repeating steps 4-2 to 4-8; otherwise, ending the training of the bidirectional generative adversarial network.
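Combining steps 4-1 to 4-9 with the modules and loss helpers sketched earlier, one possible training loop is outlined below. The learning rate, the iteration threshold, the data iterator `loader`, the reconstruction path in step 4-2 and the way the generator losses combine adversarial and pixel terms are all assumptions, not details fixed by the invention.

```python
import itertools
import torch

eta, max_iters = 2e-4, 100_000                         # assumed learning rate and iteration threshold (step 4-1)
opt_enc  = torch.optim.SGD(Ex1.parameters(), lr=eta)
opt_Ddes = torch.optim.SGD(Dy.parameters(), lr=eta)
opt_Gdes = torch.optim.SGD(destyle.parameters(), lr=eta)
opt_Dsty = torch.optim.SGD(Dx.parameters(), lr=eta)
opt_Gsty = torch.optim.SGD(itertools.chain(Ex1.parameters(), Ex2.parameters(), Gx.parameters()), lr=eta)

for it in range(max_iters):                            # step 4-9: stop when the iteration threshold is reached
    content, style = next(loader)                      # assumed iterator yielding (content image, style image) batches

    # Step 4-2: pass the content image through Ex1 and the stylization generation network,
    # then update the content encoder (this reconstruction path is an assumption).
    f_c = Ex1(content)
    loss_rec = reconstruction_loss(content, Gx(f_c, f_c))
    opt_enc.zero_grad(); loss_rec.backward(); opt_enc.step()

    # Steps 4-3 to 4-5: de-stylization branch (fake content images).
    fake_content = destyle(style)                                                 # step 4-3
    loss_Dy = adversarial_loss_with_gp(Dy, content, fake_content.detach(), lam_gp=1.0)
    opt_Ddes.zero_grad(); loss_Dy.backward(); opt_Ddes.step()                     # step 4-4
    loss_Gy = -Dy(fake_content).mean() + pixel_loss(fake_content, content)
    opt_Gdes.zero_grad(); loss_Gy.backward(); opt_Gdes.step()                     # step 4-5

    # Steps 4-6 to 4-8: stylization branch (fake style images).
    fake_style = Gx(Ex1(content), Ex2(style))                                     # step 4-6
    loss_Dx = adversarial_loss_with_gp(Dx, style, fake_style.detach(), lam_gp=1.0)
    opt_Dsty.zero_grad(); loss_Dx.backward(); opt_Dsty.step()                     # step 4-7
    loss_Gx = -Dx(fake_style).mean() + pixel_loss(fake_style, style)
    opt_Gsty.zero_grad(); loss_Gx.backward(); opt_Gsty.step()                     # step 4-8
```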
Further, in one embodiment, inputting the content image and the style image into the trained bidirectional generative adversarial network to generate the etched character image in step 5 specifically includes the following steps (see the short sketch after the steps):
step 5-1, inputting the content image and the style image into the trained stylization generation network to generate etched character images;
step 5-2, screening the generated etched character images and deleting those that do not meet preset requirements.
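Steps 5-1 and 5-2 can be carried out, for example, as below, reusing the trained modules from the earlier sketches; the variance-based screening rule is only an assumed stand-in for the preset requirements mentioned above.

```python
import torch

@torch.no_grad()
def generate_etched_samples(content_batch, style_batch, min_std=0.05):
    """Step 5-1: stylize content images with the trained network; step 5-2: screen the outputs."""
    fake = Gx(Ex1(content_batch), Ex2(style_batch))    # trained stylization generation network
    keep = fake.flatten(1).std(dim=1) > min_std        # assumed rule: drop near-blank generations
    return fake[keep]
```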
In conclusion, the invention generates a large number of etched character images with a generative adversarial network, obtains sufficient training samples even when the sample set is small, is faster and more efficient than collecting samples manually, generates more realistic etched character images, and improves the accuracy of recognizing etched characters with deep learning methods.
Claims (10)
1. An augmentation method for etched character recognition network training samples, characterized by comprising the following steps:
step 1, acquiring an etched character image in a scene;
step 2, generating a content image and a style image from the etched character image;
step 3, constructing a bidirectional generative adversarial network;
step 4, training the bidirectional generative adversarial network;
step 5, inputting the content image and the style image into the trained bidirectional generative adversarial network to generate an etched character image.
2. The augmentation method for etched character recognition network training samples according to claim 1, wherein generating the content image from the etched character image in step 2 specifically comprises:
step 2-1, annotating the text information of the etched character images;
step 2-2, counting the character information according to the annotated ground-truth labels of the etched character images;
step 2-3, generating content images in a plurality of fonts according to the character information.
3. The augmentation method for etched character recognition network training samples according to claim 1 or 2, wherein generating the style image from the etched character image in step 2 specifically comprises: generating the style image according to the features of the etched character images.
4. The augmentation method for etched character recognition network training samples according to claim 3, wherein generating the style image from the etched character image in step 2 specifically comprises:
selecting, from the acquired etched character images, an image whose resolution meets a first preset condition and/or whose sharpness meets a second preset condition and/or whose feature saliency meets a third preset condition as the style image.
5. The augmentation method for etched character recognition network training samples according to claim 4, wherein constructing the bidirectional generative adversarial network in step 3 specifically comprises:
step 3-1, constructing a stylization generative adversarial network;
step 3-2, constructing a de-stylization generative adversarial network;
step 3-3, constructing a loss function.
6. The augmentation method for etched character recognition network training samples according to claim 5, wherein the stylization generative adversarial network in step 3-1 comprises a stylization generation network and a stylization discriminator network;
the stylization generation network takes a content image and a style image as input and outputs a stylized character image; the stylization generation network comprises: a content encoder Ex1, whose input is the content image and whose output is a content feature vector; a style encoder Ex2, whose input is the style image and whose output is a style feature vector; and a generator Gx, whose inputs are the content feature vector and the style feature vector and whose output is the stylized character image;
the input of the stylization discriminator network is the stylized character image or a real etched character image, and its output is a number between 0 and 1 representing the probability that the input image is a real image.
7. The augmentation method for etched character recognition network training samples according to claim 6, wherein the de-stylization generative adversarial network in step 3-2 comprises a de-stylization generation network and a de-stylization discriminator network;
the de-stylization generation network takes the stylized character image as input and outputs a de-stylized character image; the de-stylization generation network comprises: a first encoder Ey, whose input is the stylized character image and whose output is a feature vector; and a second generator Gy, whose input is the feature vector output by the first encoder Ey and whose output is the de-stylized character image;
the input of the de-stylization discriminator network is the de-stylized character image or a real content image, and its output is a number between 0 and 1 representing the probability that the input image is a real image.
8. The augmentation method for etched character recognition network training samples according to claim 7, wherein the loss function L in step 3-3 comprises a content image reconstruction loss L1, a loss L2 of the stylization generative adversarial network and a loss L3 of the de-stylization generative adversarial network:
L = L1 + L2 + L3
the content image reconstruction loss L1 ensures that the content encoder Ex1 can extract the core information of the content image; x denotes the input content image and λx denotes the weight of the content image reconstruction loss, with λx in the range 0 to 1;
the loss L2 of the stylization generative adversarial network comprises a first pixel loss Lspix and a first adversarial loss Lsadv:
L2 = λx1·Lspix + λx2·Lsadv
where Lspix denotes the first pixel loss, Lsadv denotes the first adversarial loss, and λx1 and λx2 denote the weights of Lspix and Lsadv respectively, each in the range 0 to 1;
the first pixel loss Lspix measures the pixel-wise difference between the stylized image y and the image y' generated by the stylization generation network, where x and y denote the input content image and the stylized image respectively;
the first adversarial loss Lsadv is computed with the stylization discriminator Dx on samples drawn uniformly along the straight line between the stylized image y and the generated image y', where λsadv is a weight parameter in the range 0 to 1;
the loss L3 of the de-stylization generative adversarial network comprises a second pixel loss Ldpix, a second adversarial loss Ldadv and a content feature loss Ldfeat:
L3 = λy1·Ldpix + λy2·Ldadv + λy3·Ldfeat
where λy1, λy2 and λy3 denote the weights of Ldpix, Ldadv and Ldfeat respectively, each in the range 0 to 1; the second pixel loss Ldpix, the second adversarial loss Ldadv and the content feature loss Ldfeat are computed analogously for the de-stylization branch, with Ldadv computed using the de-stylization discriminator network.
9. The augmentation method for etched character recognition network training samples according to claim 8, wherein the specific process of training the bidirectional generative adversarial network in step 4 comprises:
step 4-1, initializing the parameters and the iteration count of the bidirectional generative adversarial network;
step 4-2, inputting the content image into the content encoder of the stylization generative adversarial network, inputting the features output by the content encoder into the stylization generation network, calculating the loss function, and updating the parameters of the content encoder by gradient descent;
step 4-3, inputting the style image into the de-stylization generation network to generate a fake content image;
step 4-4, inputting the real content image and the fake content image into the de-stylization discriminator network respectively, calculating the loss function and updating the parameters of the de-stylization discriminator network by gradient descent;
step 4-5, inputting the fake content image into the de-stylization discriminator network, calculating the loss function, and updating the parameters of the de-stylization generation network by gradient descent;
step 4-6, inputting the content image and the style image into the stylization generation network to generate a fake style image;
step 4-7, inputting the real style image and the fake style image into the stylization discriminator network respectively, calculating the loss function and updating the parameters of the stylization discriminator network by gradient descent;
step 4-8, inputting the fake style image into the stylization discriminator network, calculating the loss function, and updating the parameters of the stylization generation network by gradient descent;
step 4-9, judging whether the current iteration count is smaller than a set threshold; if so, repeating steps 4-2 to 4-8; otherwise, ending the training of the bidirectional generative adversarial network.
10. The augmentation method for etched character recognition network training samples according to claim 9, wherein inputting the content image and the style image into the trained bidirectional generative adversarial network to generate the etched character image in step 5 specifically comprises:
step 5-1, inputting the content image and the style image into the trained stylization generation network to generate etched character images;
step 5-2, screening the generated etched character images and deleting those that do not meet preset requirements.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010096003.5A CN111242241A (en) | 2020-02-17 | 2020-02-17 | Method for amplifying etched character recognition network training sample |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010096003.5A CN111242241A (en) | 2020-02-17 | 2020-02-17 | Method for amplifying etched character recognition network training sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111242241A true CN111242241A (en) | 2020-06-05 |
Family
ID=70879992
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010096003.5A Pending CN111242241A (en) | 2020-02-17 | 2020-02-17 | Method for amplifying etched character recognition network training sample |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111242241A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112132916A (en) * | 2020-08-18 | 2020-12-25 | 浙江大学 | Seal cutting work customized design generation device utilizing generation countermeasure network |
CN112396577A (en) * | 2020-10-22 | 2021-02-23 | 国网浙江省电力有限公司杭州供电公司 | Defect detection method of transformer based on Poisson fusion sample expansion |
CN112489165A (en) * | 2020-11-06 | 2021-03-12 | 中科云谷科技有限公司 | Method, device and storage medium for synthesizing characters |
CN113761831A (en) * | 2020-11-13 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Method, device and equipment for generating style calligraphy and storage medium |
CN114781556A (en) * | 2022-06-22 | 2022-07-22 | 北京汉仪创新科技股份有限公司 | Font generation method, system, device and medium based on character part information |
CN114782961A (en) * | 2022-03-23 | 2022-07-22 | 华南理工大学 | Character image augmentation method based on shape transformation |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110443864A (en) * | 2019-07-24 | 2019-11-12 | 北京大学 | A kind of characters in a fancy style body automatic generation method based on single phase a small amount of sample learning |
- 2020-02-17: application CN202010096003.5A filed in China (CN); published as CN111242241A (en); status: Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110443864A (en) * | 2019-07-24 | 2019-11-12 | 北京大学 | A kind of characters in a fancy style body automatic generation method based on single phase a small amount of sample learning |
Non-Patent Citations (1)
Title |
---|
SHUAI YANG et al.: "TET-GAN: Text Effects Transfer via Stylization and Destylization", The Thirty-Third AAAI Conference on Artificial Intelligence *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112132916A (en) * | 2020-08-18 | 2020-12-25 | 浙江大学 | Seal cutting work customized design generation device utilizing generation countermeasure network |
CN112132916B (en) * | 2020-08-18 | 2023-11-14 | 浙江大学 | Seal cutting work customized design generating device for generating countermeasure network |
CN112396577A (en) * | 2020-10-22 | 2021-02-23 | 国网浙江省电力有限公司杭州供电公司 | Defect detection method of transformer based on Poisson fusion sample expansion |
CN112489165A (en) * | 2020-11-06 | 2021-03-12 | 中科云谷科技有限公司 | Method, device and storage medium for synthesizing characters |
CN112489165B (en) * | 2020-11-06 | 2024-02-06 | 中科云谷科技有限公司 | Method, device and storage medium for synthesizing characters |
CN113761831A (en) * | 2020-11-13 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Method, device and equipment for generating style calligraphy and storage medium |
CN113761831B (en) * | 2020-11-13 | 2024-05-21 | 北京沃东天骏信息技术有限公司 | Style handwriting generation method, device, equipment and storage medium |
CN114782961A (en) * | 2022-03-23 | 2022-07-22 | 华南理工大学 | Character image augmentation method based on shape transformation |
CN114781556A (en) * | 2022-06-22 | 2022-07-22 | 北京汉仪创新科技股份有限公司 | Font generation method, system, device and medium based on character part information |
CN114781556B (en) * | 2022-06-22 | 2022-09-02 | 北京汉仪创新科技股份有限公司 | Font generation method, system, device and medium based on character part information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111242241A (en) | Method for amplifying etched character recognition network training sample | |
CN108399419B (en) | Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network | |
CN108491836B (en) | Method for integrally identifying Chinese text in natural scene image | |
CN113674140B (en) | Physical countermeasure sample generation method and system | |
CN111986125B (en) | Method for multi-target task instance segmentation | |
CN111079847B (en) | Remote sensing image automatic labeling method based on deep learning | |
CN107273897A (en) | A kind of character recognition method based on deep learning | |
CN110853057B (en) | Aerial image segmentation method based on global and multi-scale full-convolution network | |
CN111738169A (en) | Handwriting formula recognition method based on end-to-end network model | |
CN107491729B (en) | Handwritten digit recognition method based on cosine similarity activated convolutional neural network | |
CN111738113A (en) | Road extraction method of high-resolution remote sensing image based on double-attention machine system and semantic constraint | |
CN113378949A (en) | Dual-generation confrontation learning method based on capsule network and mixed attention | |
CN104182771A (en) | Time series data graphics analysis method based on automatic coding technology with packet loss | |
CN110533068B (en) | Image object identification method based on classification convolutional neural network | |
CN112381082A (en) | Table structure reconstruction method based on deep learning | |
CN114998604A (en) | Point cloud feature extraction method based on local point cloud position relation | |
CN112508108A (en) | Zero-sample Chinese character recognition method based on etymons | |
CN103455816B (en) | Stroke width extraction method and device and character recognition method and system | |
CN116958827A (en) | Deep learning-based abandoned land area extraction method | |
CN112149526A (en) | Lane line detection method and system based on long-distance information fusion | |
CN116935043A (en) | Typical object remote sensing image generation method based on multitasking countermeasure network | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN107274425A (en) | A kind of color image segmentation method and device based on Pulse Coupled Neural Network | |
CN112597925B (en) | Handwriting recognition/extraction and erasure method, handwriting recognition/extraction and erasure system and electronic equipment | |
CN112215241B (en) | Image feature extraction device based on small sample learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200605 |
RJ01 | Rejection of invention patent application after publication |