CN114943783A - Artistic word generation system oriented to complex texture structure - Google Patents

Artistic word generation system oriented to complex texture structure

Info

Publication number
CN114943783A
CN114943783A (application CN202210651537.9A)
Authority
CN
China
Prior art keywords
black
style
preset
white
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210651537.9A
Other languages
Chinese (zh)
Inventor
王中风 (Wang Zhongfeng)
毛文东 (Mao Wendong)
石卉虹 (Shi Huihong)
林军 (Lin Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Nanjing University
Priority to CN202210651537.9A
Publication of CN114943783A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 - 2D [Two Dimensional] image generation
    • G06T 11/001 - Texturing; Colouring; Generation of texture or colour
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The application provides an artistic word generation system oriented to complex texture structures. An input processing module processes input source characters to generate a black-and-white text mask, and uses the black-and-white text mask to process an input style picture into style small blocks. A first generator of a generative adversarial network model processes the black-and-white text mask and the style small block to generate a style big block whose real edges are expanded by a preset multiple; a second generator of the model processes the style big block to generate the black-and-white style mask of the style big block. A detail refinement module comprises a structure refinement network, which refines the structure of the style big block to generate an intermediate artistic word, and a texture refinement network, which refines the texture of the intermediate artistic word according to the black-and-white style mask to generate the final artistic word. Thus, by first generating a rough artistic word and then refining its structure and details, artistic words with complex style effects are generated based on complex texture structures.

Description

Artistic word generation system oriented to complex texture structure
Technical Field
The application relates to the field of artistic word generation, and in particular to an artistic word generation system oriented to complex texture structures.
Background
With the development of visual arts, artistic words are used more and more in all kinds of promotional advertisements, so artistic words in a desired style must be generated quickly according to different requirements. Currently, artistic words can be generated with neural network models.
For artistic word generation, a commonly used neural network model is the Generative Adversarial Network (GAN). A GAN is an unsupervised deep learning model, and there are generally two methods for generating artistic words with a GAN. The first generates, from source characters and the corresponding source artistic word, the target artistic word for the target characters, in the same style as the source artistic word. This method requires a dedicated artistic-word style dataset for training and, once trained, can only generate artistic words in existing styles, which cannot satisfy the diverse needs of promotional advertising; generating complex artistic words with this method would require building a large-scale, diversified artistic-word dataset of high-resolution images, at excessive cost. The second method sets a simple stylized text and animates a static text image with reference to a style picture or style video, perceiving the shapes in the style video to generate artistic words with controllable fonts for the static text image.
In summary, conventional artistic word generation systems can only generate simple artistic words based on simple styles and cannot generate artistic words with complex style effects based on complex texture structures.
Disclosure of Invention
The application provides an artistic word generation system oriented to complex texture structures, which can solve the technical problem that existing artistic word generation systems can only generate simple artistic words based on simple styles and cannot generate artistic words with complex style effects based on complex texture structures.
To solve this technical problem, the application discloses the following technical solution:
an artistic word generation system oriented to complex texture structures, the generation system comprising: an input processing module, a generative adversarial network model and a detail refinement module connected in sequence;
the input processing module is configured to process input source characters to generate a black-and-white text mask with smooth edges, and to process an input style picture with the black-and-white text mask to generate style small blocks, the style picture being a picture with a complex texture;
the generative adversarial network model comprises a first generator and a second generator; the first generator is configured to process the black-and-white text mask and the style small block to generate a style big block whose real edges are expanded by a preset multiple; the second generator is configured to process the style big block to generate a black-and-white style mask of the style big block;
the generation training trains the generative adversarial network model to convergence by taking the preset black-and-white mask small blocks and the preset clipping-style small blocks in a pre-created generation training set as input and the preset style big blocks and the preset black-and-white style mask big blocks in the generation training set as output;
the detail refinement module comprises a structure refinement network and a texture refinement network; the structure refinement network is configured to perform structure refinement on the style big block to generate an intermediate artistic word; the texture refinement network is configured to perform texture refinement on the intermediate artistic word according to the black-and-white style mask to generate the final artistic word;
the structure refinement network is a Structure Net network trained through style training, which takes pictures in a preset conventional picture dataset as input content images, a preset source style picture as the reference style image and the stylized content images as output, and trains the structure refinement network to convergence.
In one implementation, the generation training set is created in advance through the following steps:
selecting an original style image Y_g, the original style image Y_g being an image with a complex texture;
acquiring an original black-and-white mask M_g of the original style image Y_g, where the style part of Y_g corresponds to the black area of M_g and the background part of Y_g corresponds to the white area of M_g;
selecting, according to a preset first size L×L, the local black-and-white mask M_l with the largest black area in the original black-and-white mask M_g, and the local style image Y_l in the original style image Y_g corresponding to M_l;
performing edge simplification on the original black-and-white mask M_g to generate a first black-and-white mask M̂_g with smooth edges;
performing edge simplification on the local black-and-white mask M_l to generate a second black-and-white mask M̂_l with smooth edges;
cropping, according to a preset big-block cropping method, a plurality of preset style big blocks y from the original style image Y_g and the local style image Y_l, obtaining the preset black-and-white mask big block m at the position corresponding to each preset style big block y in the original black-and-white mask M_g and the local black-and-white mask M_l, and obtaining the preset smoothed black-and-white mask big block m_s at the corresponding position in the first black-and-white mask M̂_g and the second black-and-white mask M̂_l;
randomly cropping each preset style big block y to obtain a preset style small block ŷ, the size of the preset style big block y being a preset multiple of the size of the preset style small block ŷ;
down-sampling each preset smoothed black-and-white mask big block m_s by the preset multiple to obtain a preset black-and-white mask small block m̂_s, the size of m_s being the preset multiple of the size of m̂_s;
cutting the preset style small block ŷ through the preset black-and-white mask small block m̂_s to obtain a preset clipping-style small block ŷ_s;
determining all the preset clipping-style small blocks ŷ_s and all the preset black-and-white mask small blocks m̂_s as the generation training set.
In one implementation, the preset big-block cropping method comprises:
setting the second size of the big blocks to xN×xN, where xN < L and x is the preset multiple;
cropping a plurality of big blocks from a first reference image with a first probability a, the first reference image comprising the original style image Y_g, the original black-and-white mask M_g and the first black-and-white mask M̂_g;
cropping a plurality of big blocks from a second reference image with a second probability 1-a, the second reference image comprising the local style image Y_l, the local black-and-white mask M_l and the second black-and-white mask M̂_l.
In one implementation, the size of the preset small blocks is N×N, the preset small blocks comprising the preset style small block ŷ, the preset black-and-white mask small block m̂_s and the preset clipping-style small block ŷ_s.
In one implementation, the first generator comprises convolutional layers, residual modules, a concatenation layer and transposed convolutional layers arranged according to the generation requirements; the second generator comprises convolutional layers, residual modules and a concatenation layer arranged according to the generation requirements.
In one implementation, the generation training includes a first generation training and a second generation training, wherein:
the first generation training sets the stride and dilation rate of the convolutional layers inside the first generator, the stride and dilation rate of the transposed convolutional layers and the sizes of the convolution kernels, and takes the preset black-and-white mask small block m̂_s and the preset clipping-style small block ŷ_s as input and the preset style big block y as output, to realize the training of processing the black-and-white text mask and the style small block into a style big block whose real edges are expanded by the preset multiple;
the second generation training sets the stride and dilation rate of the convolutional layers inside the second generator, and takes the preset style big block y as input and the preset black-and-white mask big block m as output, to realize the training of processing the style big block into the black-and-white style mask of the style big block.
In one implementation, the generative adversarial network model further comprises a discriminator, which cooperates with the first generator to complete the first generation training.
In one implementation, the generation system further comprises a deformable module disposed between the input processing module and the generative adversarial network model, the deformable module being configured to control the degree of deformation of the black-and-white text mask by adding noise to the black-and-white text mask and eroding its edges.
In one implementation, the deformable module adds noise and applies edge erosion to the black-and-white text mask through the following steps, thereby controlling the degree of deformation of the black-and-white text mask:
eroding the edge of the black-and-white text mask and adding noise to the eroded edge to obtain a noisy black-and-white mask;
setting a vector f to expand the noisy black-and-white mask, thereby controlling the degree of deformation of the black-and-white text mask; the vector f comprises f_0, f_1 and f_2, where f_0 is the kernel size for erosion and dilation, f_1 controls the degree of noise added at the edge, and f_2 controls the degree of internal noise.
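As an illustration, one plausible realization of the erosion-and-noise step can be sketched with NumPy; the min-filter kernel shape and the exact noise model here are assumptions, not the patent's specified procedure:

```python
import numpy as np

def deform_mask(mask, f0=3, f1=0.3, f2=0.05, rng=None):
    """Deformable-module sketch (assumed realization): erode the black glyph
    region with an f0-by-f0 kernel, randomly re-blacken the eroded edge ring
    at rate f1, and whiten interior pixels at rate f2.
    Convention: 0 = black glyph, 1 = white background."""
    rng = rng or np.random.default_rng(0)
    glyph = mask < 0.5
    pad = f0 // 2
    p = np.pad(glyph, pad, mode="constant")
    eroded = np.ones_like(glyph)
    for di in range(f0):                       # f0 x f0 min-filter (erosion)
        for dj in range(f0):
            eroded &= p[di:di + mask.shape[0], dj:dj + mask.shape[1]]
    edge = glyph & ~eroded                     # ring removed by erosion
    out = np.where(eroded, 0.0, 1.0)           # eroded glyph stays black
    out[edge] = (rng.random(edge.sum()) > f1).astype(float)  # edge noise, rate f1
    interior = eroded & (rng.random(mask.shape) < f2)
    out[interior] = 1.0                        # sprinkle interior noise, rate f2
    return out
```

Larger f0 removes more of the glyph boundary, while f1 and f2 trade off how noisy the recovered edge and interior are, matching the role of the vector f described above.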
In one implementation, the structure refinement network comprises an attention mechanism module and an image conversion network, the attention mechanism module being disposed before the image conversion network, wherein:
the attention mechanism module is configured to output element attention parameters according to the input tensor of the structure refinement network, the input tensor being the input content image, the element attention parameters being used to make the image conversion network attend to the element parts of the input content image;
the image conversion network is configured to generate an output tensor from the element-wise product of the input tensor and the element attention parameters, the output tensor being the stylized content image; the output tensor is combined with the reference style image to compute loss function values, which are used to train the structure refinement network.
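The element-wise gating described above can be sketched as a small function composition; the attention and conversion networks are passed in as hypothetical callables, since the patent does not fix their internals:

```python
import numpy as np

def structure_refine_forward(x, attention, convert):
    """Forward pass of the attention-gated structure refinement (sketch):
    the attention module yields one weight per element of the input tensor,
    which gates it element-wise before the image conversion network."""
    a = attention(x)                 # element attention parameters
    assert a.shape == x.shape        # one weight per input element
    return convert(x * a)            # stylized content image
```

During style training, the returned tensor would be compared against the reference style image by the loss function.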
The application provides an artistic word generation system oriented to complex texture structures, in which an input processing module processes input source characters to generate a black-and-white text mask and uses the black-and-white text mask to process an input style picture into style small blocks; a first generator of a generative adversarial network model processes the black-and-white text mask and the style small block to generate a style big block whose real edges are expanded by a preset multiple; a second generator of the generative adversarial network model processes the style big block to generate the black-and-white style mask of the style big block; a detail refinement module comprises a structure refinement network and a texture refinement network, the structure refinement network refining the structure of the style big block to generate an intermediate artistic word, and the texture refinement network refining the texture of the intermediate artistic word according to the black-and-white style mask to generate the final artistic word. Thus, by first generating a rough artistic word and then refining its structure and details, artistic words with complex style effects are generated based on complex texture structures.
Drawings
FIG. 1 is a schematic structural diagram of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 2 is a schematic diagram of the creation of the generation training set of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 3 is a schematic structural diagram of the generative adversarial network model of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 4 is a schematic diagram of the first generation training of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 5 is a schematic diagram of the style training of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 6 is a structural diagram of the structure refinement network of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 7 is a schematic structural diagram of the deformable module of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 8 is a schematic diagram of the testing stage of the artistic word generation system oriented to complex texture structures provided in the present application;
FIG. 9 is a schematic diagram of a final artistic word of the artistic word generation system oriented to complex texture structures provided in the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
The terminology used in the following examples is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of this application and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of the present application, "at least one" and "one or more" mean one, two or more, and "a plurality" means two or more. The term "and/or" describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may represent: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
In order to solve the technical problem that existing artistic word generation systems can only generate simple artistic words based on simple styles and cannot generate artistic words with complex style effects based on complex texture structures, the present application provides an artistic word generation system oriented to complex texture structures. Embodiments of the present application are described in detail below with reference to the drawings.
Referring to FIG. 1, a schematic structural diagram of the artistic word generation system oriented to complex texture structures provided by the present application;
as can be seen from FIG. 1, the generation system comprises an input processing module, a generative adversarial network model and a detail refinement module connected in sequence;
the input processing module is configured to process input source characters to generate a black-and-white text mask with smooth edges, and to process an input style picture with the black-and-white text mask to generate style small blocks, the style picture being a picture with a complex texture;
the generative adversarial network model comprises a first generator and a second generator; the first generator is configured to process the black-and-white text mask and the style small block to generate a style big block whose real edges are expanded by a preset multiple; the second generator is configured to process the style big block to generate a black-and-white style mask of the style big block;
the generation training trains the generative adversarial network model to convergence by taking the preset black-and-white mask small blocks and the preset clipping-style small blocks in a pre-created generation training set as input and the preset style big blocks and the preset black-and-white style mask big blocks in the generation training set as output;
the detail refinement module comprises a structure refinement network (Structure Net, Ns) and a texture refinement network (Texture Net, Nt); the structure refinement network is configured to perform structure refinement on the style big block to generate an intermediate artistic word; the texture refinement network is configured to perform texture refinement on the intermediate artistic word according to the black-and-white style mask to generate the final artistic word;
the structure refinement network is a Structure Net network trained through style training, which takes pictures in a preset conventional picture dataset as input content images, a preset source style picture as the reference style image and the stylized content images as output, and trains the structure refinement network to convergence.
Referring to FIG. 2, a schematic diagram of the creation of the generation training set of the artistic word generation system oriented to complex texture structures provided by the present application;
as can be seen from FIG. 2, the generation training set in the present application is created in advance through the following steps.
Step 101: selecting an original style image Y_g, the original style image Y_g being an image with a complex texture.
Step 102: acquiring an original black-and-white mask M_g of the original style image Y_g, where the style part of Y_g corresponds to the black area of M_g and the background part of Y_g corresponds to the white area of M_g.
Step 103: selecting, according to a preset first size L×L, the local black-and-white mask M_l with the largest black area in the original black-and-white mask M_g, and the local style image Y_l in the original style image Y_g corresponding to M_l.
Specifically, steps 101 to 103 acquire the contour features of the style elements of the style image.
Step 104: performing edge simplification on the original black-and-white mask M_g to generate a first black-and-white mask M̂_g with smooth edges.
Step 105: performing edge simplification on the local black-and-white mask M_l to generate a second black-and-white mask M̂_l with smooth edges.
Specifically, steps 104 and 105 simulate the smoothness of the edges of the black-and-white mask of the original text; the edge simplification is completed through Gaussian blur and a sigmoid(·) function.
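A minimal sketch of such edge simplification, assuming a separable Gaussian blur followed by a sigmoid re-sharpening; the kernel width `sigma` and `sharpness` values are illustrative, as the patent does not list its parameters:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(mask, sigma=2.0):
    """Separable Gaussian blur: filter rows, then columns."""
    r = int(3 * sigma)
    k = gaussian_kernel1d(sigma, r)
    padded = np.pad(mask, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def simplify_edges(mask, sigma=2.0, sharpness=10.0):
    """Smooth jagged mask edges: Gaussian blur, then sigmoid re-binarization."""
    blurred = gaussian_blur(mask.astype(np.float64), sigma)
    return 1.0 / (1.0 + np.exp(-sharpness * (blurred - 0.5)))
```

The blur rounds off fine serrations along the mask boundary, and the sigmoid pushes values back toward 0/1 so the result remains an almost-binary mask with a smooth contour.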
Step 106: cropping, according to the preset big-block cropping method, a plurality of preset style big blocks y from the original style image Y_g and the local style image Y_l, obtaining the preset black-and-white mask big block m at the position corresponding to each preset style big block y in the original black-and-white mask M_g and the local black-and-white mask M_l, and obtaining the preset smoothed black-and-white mask big block m_s at the corresponding position in the first black-and-white mask M̂_g and the second black-and-white mask M̂_l.
Thus, the six-channel real training samples of the generation training set can be created from one single style picture.
Specifically, the preset big-block cropping method is completed through the following steps.
Step 601: setting the second size of the big blocks to xN×xN, where xN < L and x is the preset multiple.
Step 602: cropping a plurality of big blocks from a first reference image with a first probability a, the first reference image comprising the original style image Y_g, the original black-and-white mask M_g and the first black-and-white mask M̂_g.
Step 603: cropping a plurality of big blocks from a second reference image with a second probability 1-a, the second reference image comprising the local style image Y_l, the local black-and-white mask M_l and the second black-and-white mask M̂_l.
Thus, the preset style big block y carries real style features, the preset black-and-white mask big block m describes the real contours of the style elements, and combining them yields the six-channel real training sample [y; m].
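The probabilistic big-block cropping of steps 601 to 603 can be sketched as follows, assuming each reference image is stored as one channel-aligned array so that image and masks are cropped at the same position:

```python
import numpy as np

def crop_big_blocks(global_stack, local_stack, n_blocks, big, a=0.5, rng=None):
    """Big-block cropping sketch: each aligned stack concatenates a style
    image with its masks on the channel axis (e.g. [Y_g; M_g; M_g-smoothed]).
    Crop from the global stack with probability a, else from the local one."""
    rng = rng or np.random.default_rng(0)
    blocks = []
    for _ in range(n_blocks):
        src = global_stack if rng.random() < a else local_stack
        H, W = src.shape[:2]
        i = int(rng.integers(0, H - big + 1))
        j = int(rng.integers(0, W - big + 1))
        blocks.append(src[i:i + big, j:j + big])
    return blocks
```

Biasing some crops toward the local image (the largest black area) keeps style elements well represented in the training samples even when the global image is mostly background.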
Step 107: randomly cropping each preset style big block y to obtain a preset style small block ŷ, the size of the preset style big block y being the preset multiple of the size of the preset style small block ŷ.
Step 108: down-sampling each preset smoothed black-and-white mask big block m_s by the preset multiple to obtain a preset black-and-white mask small block m̂_s, the size of m_s being the preset multiple of the size of m̂_s.
Step 109: cutting the preset style small block ŷ through the preset black-and-white mask small block m̂_s to obtain a preset clipping-style small block ŷ_s.
Step 110: determining all the preset clipping-style small blocks ŷ_s and all the preset black-and-white mask small blocks m̂_s as the generation training set.
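Steps 107 to 109 can be sketched as follows; the mask convention (0 = black style area) follows the description above, and applying the down-sampled mask directly to the randomly cropped tile is a simplifying assumption:

```python
import numpy as np

def make_small_inputs(y, m_s, x, rng=None):
    """Sketch of steps 107-109: random-crop an N-by-N style small block from
    the big block y (xN-by-xN), stride-x down-sample the smoothed mask m_s to
    N-by-N, and cut the style small block with it."""
    rng = rng or np.random.default_rng(0)
    big = y.shape[0]
    N = big // x
    i = int(rng.integers(0, big - N + 1))
    j = int(rng.integers(0, big - N + 1))
    y_small = y[i:i + N, j:j + N]                 # preset style small block
    m_small = m_s[::x, ::x]                       # preset B/W mask small block
    y_cut = y_small * (m_small < 0.5)[..., None]  # keep pixels under the black area
    return y_cut, m_small
```

The pair (y_cut, m_small) then mimics the test-time input of a text mask plus a mask-shaped style tile.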
Specifically, the size of the preset small blocks is N×N, the preset small blocks comprising the preset style small block ŷ, the preset black-and-white mask small block m̂_s and the preset clipping-style small block ŷ_s.
Thus, owing to the simplification of the black-and-white mask edges, the preset clipping-style small block ŷ_s and the preset black-and-white mask small block m̂_s have smooth contours, and their characteristics are similar to those of the black-and-white mask of the original text m̂_s′ and its clipping-style small block ŷ_s′, where the superscript ′ denotes input or output data of the testing phase. Concatenating the preset clipping-style small block ŷ_s and the preset black-and-white mask small block m̂_s yields the six-channel input data [ŷ_s; m̂_s] of the generation training set.
Referring to FIG. 3, a schematic structural diagram of the generative adversarial network model of the artistic word generation system oriented to complex texture structures provided by the present application;
as can be seen from FIG. 3, the first generator G_p1 comprises convolutional layers, residual modules, a concatenation layer and transposed convolutional layers arranged according to the generation requirements; the second generator G_p2 comprises convolutional layers, residual modules and a concatenation layer arranged according to the generation requirements.
Specifically, the internal structure of the first generator is, in sequence, a first convolutional layer, a second convolutional layer, a third convolutional layer, a plurality of residual modules, a first concatenation layer, a first transposed convolutional layer, a second transposed convolutional layer and a fourth convolutional layer;
the second generator comprises a fifth convolutional layer, residual modules, a sixth convolutional layer and a second concatenation layer connected in sequence.
the generation training includes a first generation training and a second generation training, wherein:
the first generation training is to set the stride and the expansion rate of the convolution layer inside the first generator, the stride and the expansion rate of the transposition convolution layer and the size of the convolution kernel, and to set the preset black-and-white mask small block
Figure BDA00036863085100000611
And the preset cutting style small block
Figure BDA00036863085100000612
Taking the preset style big block y as input, and realizing the training of processing the black-and-white text mask and the style small block to generate a style big block of a real edge with an expanded preset multiple;
specifically, Sidj represents the step size of the layer as i and the expansion rate as j. Where the convolutional layer with step 2 may downsample the feature and the transposed convolutional layer with step 2 may upsample the feature. kx denotes the convolution kernel size of the convolutional layer as x × x, and ty denotes the convolution kernel size of the transposed convolutional layer as y × y.
Referring to fig. 4, a schematic diagram of a first generation training of a complex texture structure-oriented artistic word generation system is provided in the present application;
As can be seen from fig. 4, the generation training phase takes as input the preset cropped style small blocks ŷ_s and the preset black-and-white mask small blocks m̂_s in the generation training set, and as output the preset style large block y (enlarged by the preset multiple) and the preset black-and-white mask large block m in the generation training set. Specifically, the output corresponding to the first generator G_p1 is the preset style large block y, and the output corresponding to the second generator G_p2 is the preset black-and-white mask large block m. The function of the first generator G_p1 is to obtain, from the smooth-edged mask and the small style picture, a large-block picture whose real edges and content are enlarged by the preset multiple; the function of the second generator G_p2 is to extract the real black-and-white mask of that large picture.
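As an illustration of how one such training pair could be assembled (the block size N = 64 and the preset multiple x = 2 are assumptions for this sketch; the convention that the style part is the black/zero area of the mask follows the training-set construction described in the claims):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(y, m, n=64, x=2):
    """From a style large block y (x*n by x*n) and its mask large block m,
    build the inputs of the first generator: a randomly cropped style small
    block with the downsampled mask applied, plus the mask small block."""
    top = int(rng.integers(0, y.shape[0] - n + 1))
    left = int(rng.integers(0, y.shape[1] - n + 1))
    y_small = y[top:top + n, left:left + n]          # preset style small block
    m_small = m[::x, ::x]                            # mask downsampled by x
    y_cropped = y_small * (m_small[..., None] == 0)  # keep the black (style) area
    return y_cropped, m_small

y = rng.random((128, 128, 3))                        # preset style large block
m = (rng.random((128, 128)) > 0.5).astype(np.uint8)  # black/white mask large block
y_in, m_in = make_training_pair(y, m)
print(y_in.shape, m_in.shape)
```

The pair (y_in, m_in) plays the role of (ŷ_s, m̂_s); the targets are the original large blocks y and m.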
The second generation training sets the stride and dilation rate of the convolutional layers inside the second generator, takes the preset style large block y as input and the preset black-and-white mask large block m as output, thereby training the second generator to process a style large block into the black-and-white style mask of that large block.
Specifically, the generative adversarial network model further comprises a discriminator D_p for completing the first generation training in cooperation with the first generator.
Referring to fig. 5, a schematic diagram of style training of a complex texture structure-oriented artistic word generation system provided by the present application is shown;
As can be seen from fig. 5, specifically, the structure refinement network is a Structure Net network trained by style training: the style training takes pictures in a conventional picture data set (for example, the COCO dataset) as input content graphs, the preset source style picture as the reference style graph, and the stylized content graph as output, and trains the structure refinement network to convergence.
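The application does not spell out the loss used to train Structure Net against the reference style graph; style-transfer networks of this kind conventionally use a Gram-matrix style term over feature maps, which can be sketched as an illustration (the feature shapes are arbitrary stand-ins):

```python
import numpy as np

def gram(features):
    """Gram matrix of a (C, H, W) feature map: channel-wise correlations,
    normalized by the number of elements."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(out_feats, ref_feats):
    """Mean squared distance between Gram matrices of the stylized output
    and the reference style features."""
    return float(np.mean((gram(out_feats) - gram(ref_feats)) ** 2))

rng = np.random.default_rng(0)
f_out = rng.random((8, 16, 16))
assert style_loss(f_out, f_out) == 0.0   # identical features -> zero style loss
print(style_loss(f_out, rng.random((8, 16, 16))))
```

Minimizing such a term (usually alongside a content term) is one standard way to drive the stylized content graph toward the reference style.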
Referring to fig. 6, a network architecture diagram is refined for the structure of the complex texture structure-oriented art word generating system provided by the present application;
As can be seen from fig. 6, specifically, the structure refinement network further comprises an attention mechanism module (attention module) and an image transformation network (ITN), the attention mechanism module being disposed before the image transformation network, where:
the attention mechanism module is used for outputting an element attention parameter according to an input tensor of the structure refinement network, wherein the input tensor is the input content graph, and the element attention parameter is used for enabling the image conversion network to pay attention to an element part of the input content graph;
specifically, the attention mechanism module comprises an averaging pooling layer (averaging capacitance), a seventh convolution layer, a relu activation function layer, an eighth convolution layer and a sigmoid layer which are connected in sequence, wherein k1 represents that the convolution kernel of the convolution layer is 1, Och1 represents that the output channel of the seventh convolution layer is 1, Och3 represents that the output channel of the eighth convolution layer is 3, and the relu activation function layer and the sigmoid layer represent corresponding nonlinear layers.
The image conversion network is used for generating an output tensor according to a result of element-by-element multiplication of the input tensor and the element attention parameters, wherein the output tensor is the stylized content graph, the output tensor is used for calculating loss function values by combining the reference style graph, and the loss function values are used for training the structure refinement network.
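A minimal NumPy sketch of this attention path (whether the average pooling is local or global is not stated; a stride-1 local pool is assumed here so that the attention map stays spatial, and the convolution weights are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_pool(x, k=3):
    """Stride-1 k-by-k average pooling with edge padding (spatial size kept)."""
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)), mode="edge")
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            out += xp[:, i:i + x.shape[1], j:j + x.shape[2]]
    return out / (k * k)

def conv1x1(x, w):
    """1x1 convolution: a per-pixel linear map across channels; w is (out, in)."""
    return np.einsum("oc,chw->ohw", w, x)

def attention(x, w7, w8):
    """avg pool -> conv k1 (Och1=1) -> relu -> conv k1 (Och3=3) -> sigmoid."""
    a = avg_pool(x)
    a = np.maximum(conv1x1(a, w7), 0.0)          # relu, 3 -> 1 channel
    return 1.0 / (1.0 + np.exp(-conv1x1(a, w8)))  # sigmoid, 1 -> 3 channels

x = rng.random((3, 32, 32))                       # input content tensor
a = attention(x, rng.standard_normal((1, 3)), rng.standard_normal((3, 1)))
itn_input = x * a                                 # element-by-element product
print(a.shape, itn_input.shape)
```

The result of the element-by-element product is what the image transformation network would consume.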
Referring to fig. 7, a schematic structural diagram of a deformable module of a complex texture structure-oriented artistic word generating system is provided in the present application;
As can be seen from fig. 7, the generating system further comprises a deformable module disposed between the input processing module and the generative adversarial network model; the deformable module adds noise to and erodes the edges of the black-and-white text mask, thereby controlling the degree of deformation of the black-and-white text mask.
Specifically, the deformable module adds noise to and erodes the edges of the black-and-white text mask, and thereby controls the deformation degree of the black-and-white text mask, through the following steps:
eroding the edge of the black-and-white text mask, and adding noise to the eroded edge to obtain a noisy black-and-white mask;
setting a vector f to expand the noisy black-and-white mask, realizing control of the deformation degree of the black-and-white text mask; the vector f comprises f0, f1 and f2, where f0 is the size of the erosion and expansion kernel, f1 controls the degree of noise added at the edge, and f2 controls the degree of internal noise added.
In particular, three factors in the deformable module relate to style control: two concern edge erosion and the third concerns the degree of internal deformation. Specifically, as shown in fig. 7(a), first, f0 controls the erosion of the edge of the black-and-white mask and f1 controls the noise added at the eroded edge. Second, the noisy black-and-white mask is expanded; the scale of the edge deformation is controlled by (f0, f1). For internal deformation, noise is added inside the text, where f2 controls the degree of internal noise. The combination (f0, f1, f2) is named the vector f, which determines the degree of deformation. Therefore, as shown in fig. 7(b), text black-and-white masks of various scales can be obtained by changing the vector f. Note that the deformable module is used only in the testing phase: it processes the text black-and-white mask, and the picture is then cropped with the style-changeable black-and-white mask to obtain a coarse style text. As shown in fig. 7(c), multi-scale artistic text is obtained through the changeable-style black-and-white mask and the cropped texture, without retraining the network.
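The erode / add-edge-noise / expand / add-internal-noise sequence can be sketched as follows (pure-NumPy stand-ins for the morphological operators; the convention that text pixels are True, and the default values of f, are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def _filter(mask, k, fn):
    """Sliding k-by-k min/max filter: binary erosion (np.min) or dilation (np.max)."""
    p = k // 2
    mp = np.pad(mask, p, mode="edge")
    windows = [mp[i:i + mask.shape[0], j:j + mask.shape[1]]
               for i in range(k) for j in range(k)]
    return fn(np.stack(windows), axis=0)

def deform(mask, f0=3, f1=0.3, f2=0.05):
    """Deform a boolean text mask. f0: erosion/expansion kernel size,
    f1: edge-noise level, f2: internal-noise level."""
    eroded = _filter(mask, f0, np.min)
    edge = mask & ~eroded                                    # ring removed by erosion
    noisy = eroded | ((rng.random(mask.shape) < f1) & edge)  # re-add random edge pixels
    expanded = _filter(noisy, f0, np.max)                    # expand the noisy mask
    inner = (rng.random(mask.shape) < f2) & eroded           # internal noise holes
    return expanded & ~inner

mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 8:24] = True          # a toy square "glyph"
out = deform(mask)
print(out.shape, int(out.sum()))
```

Varying (f0, f1, f2) then yields masks of different deformation scales without touching the trained networks, as the testing-phase description states.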
Referring to fig. 8, a schematic diagram of a testing stage of the complex texture structure-oriented artistic word generating system provided by the present application is shown;
referring to fig. 9, a schematic diagram of a final artistic word of the complex texture structure-oriented artistic word generating system is provided.
As shown in figs. 8 and 9, the present application sets a test stage to verify the results of the generation training and the style training. In the test stage, under the control of the vector f, the binary text mask and the style image enter the deformable module to obtain a coarse-grained style-text prototype, which is sent to G_p, N_s and N_t in sequence for forward inference; the output of N_t is the stylized final artistic word.
The present application provides an artistic word generation system oriented to complex texture structures. The input processing module processes input source characters to generate a black-and-white text mask, and uses the black-and-white text mask to process an input style picture to generate style small blocks. The first generator of the generative adversarial network model processes the black-and-white text mask and the style small blocks to generate a style large block with real edges, enlarged by a preset multiple. The second generator of the generative adversarial network model processes the style large block to generate the black-and-white style mask of the style large block. The detail refinement module comprises a structure refinement network and a texture refinement network: the structure refinement network refines the structure of the style large block to generate an intermediate artistic word, and the texture refinement network refines the texture of the intermediate artistic word according to the black-and-white style mask to generate the final artistic word. Thus, artistic words with complex style effects are generated on the basis of complex texture structures by first generating an artistic-word prototype and then refining its structure and details.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains; it is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof; the scope of the invention is limited only by the appended claims.

Claims (10)

1. A complex texture structure-oriented artistic word generation system, the generation system comprising: an input processing module, a generative adversarial network model and a detail refinement module connected in sequence;
the input processing module is used for processing input source characters to generate a black and white text mask with smooth edges, and processing input style pictures to generate style small blocks by using the black and white text mask, wherein the style pictures are pictures with complex textures;
the generative adversarial network model comprises a first generator and a second generator, wherein the first generator is used for processing the black-and-white text mask and the style small block to generate a style large block with real edges, enlarged by a preset multiple; the second generator is used for processing the style large block to generate a black-and-white style mask of the style large block;
the generation training trains the generative adversarial network model to convergence by taking preset black-and-white mask small blocks and preset cropped style small blocks in a pre-created generation training set as input and taking preset style large blocks and preset black-and-white style mask large blocks in the generation training set as output;
the detail refining module comprises a structure refining network and a texture refining network, and the structure refining network is used for carrying out structure refining processing on the style large blocks to generate middle artistic words; the texture refining network is used for carrying out texture refining processing on the intermediate artistic word according to the black-and-white style mask to generate a final artistic word;
the structure refinement network is a Structure Net network trained by style training, the style training taking pictures in a preset conventional picture data set as input content graphs, a preset source style picture as the reference style graph and the stylized content graph as output, and training the structure refinement network to convergence.
2. The complex texture structure-oriented artistic word generation system according to claim 1, wherein the generation training set is created in advance by:
selecting an original style image Y_g, the original style image Y_g being an image with complex texture;
acquiring an original black-and-white mask M_g of the original style image Y_g, wherein the style part of the original style image Y_g is a black area in the original black-and-white mask M_g, and the remaining part of the original style image Y_g is a white area in the original black-and-white mask M_g;
selecting, according to a preset first size L×L, the local black-and-white mask M_l with the largest black area in the original black-and-white mask M_g, and the local style image Y_l corresponding to the local black-and-white mask M_l in the original style image Y_g;
performing edge simplification on the original black-and-white mask M_g to generate a first black-and-white mask M̂_g with smooth edges;
performing edge simplification on the local black-and-white mask M_l to generate a second black-and-white mask M̂_l with smooth edges;
cropping, according to a preset large-block cropping method, a plurality of preset style large blocks y from the original style image Y_g and the local style image Y_l; obtaining the preset black-and-white mask large block m at the position corresponding to each preset style large block y in the original black-and-white mask M_g and the local black-and-white mask M_l; and obtaining the preset black-and-white mask block m_s at the position corresponding to each preset black-and-white mask large block m in the first black-and-white mask M̂_g and the second black-and-white mask M̂_l;
randomly cropping each preset style large block y to obtain preset style small blocks ŷ, the size of the preset style large block y being a preset multiple of the size of the preset style small block ŷ;
downsampling each preset black-and-white mask block m_s by the preset multiple to obtain a preset black-and-white mask small block m̂_s, the size of the black-and-white mask block m_s being the preset multiple of the size of the preset black-and-white mask small block m̂_s;
cropping the preset style small blocks ŷ through the preset black-and-white mask small blocks m̂_s to obtain preset cropped style small blocks ŷ_s;
determining all the preset cropped style small blocks ŷ_s and all the preset black-and-white mask small blocks m̂_s as the generation training set.
3. The complex texture structure-oriented artistic word generation system according to claim 2, wherein the preset large-block cropping method comprises:
setting a second size xN×xN for the large blocks, wherein xN is less than L and x is the preset multiple;
cropping a plurality of large blocks from a first reference image according to a first probability a, the first reference image comprising the original style image Y_g, the original black-and-white mask M_g and the first black-and-white mask M̂_g;
cropping a plurality of large blocks from a second reference image according to a second probability 1−a, the second reference image comprising the local style image Y_l, the local black-and-white mask M_l and the second black-and-white mask M̂_l.
4. The complex texture structure-oriented artistic word generation system according to claim 3, wherein the preset small blocks have a size of N×N, the preset small blocks comprising the preset style small block ŷ, the preset black-and-white mask small block m̂_s and the preset cropped style small block ŷ_s.
5. The complex texture structure-oriented artistic word generation system of claim 1, wherein the first generator comprises convolutional layers, residual modules, a splicing layer and transposed convolutional layers set according to generation requirements; the second generator comprises convolutional layers, residual modules and a splicing layer set according to generation requirements.
6. The complex texture structure-oriented artistic word generation system of claim 5, wherein the generation training comprises a first generation training and a second generation training, wherein:
the first generation training sets the stride and dilation rate of the convolutional layers inside the first generator, the stride and dilation rate of the transposed convolutional layers, and the convolution kernel sizes, takes the preset black-and-white mask small block m̂_s and the preset cropped style small block ŷ_s as input and the preset style large block y as output, and thereby realizes the training of processing the black-and-white text mask and the style small block to generate a style large block with real edges, enlarged by the preset multiple;
the second generation training sets the stride and dilation rate of the convolutional layers inside the second generator, takes the preset style large block y as input and the preset black-and-white mask large block m as output, and thereby realizes the training of processing the style large block to generate the black-and-white style mask of the style large block.
7. The complex texture structure-oriented artistic word generation system of claim 6, wherein the generative adversarial network model further comprises a discriminator, the discriminator being configured to cooperate with the first generator to perform the first generation training.
8. The complex texture structure-oriented artistic word generation system of claim 1, wherein the generation system further comprises a deformable module disposed between the input processing module and the generative adversarial network model, the deformable module being configured to control the degree of deformation of the black-and-white text mask by adding noise to and eroding the edges of the black-and-white text mask.
9. The complex texture structure-oriented artistic word generation system of claim 8, wherein the deformable module adds noise to and erodes the edges of the black-and-white text mask, and thereby controls the deformation degree of the black-and-white text mask, through the following steps:
eroding the edge of the black-and-white text mask, and adding noise to the eroded edge to obtain a noise black-and-white mask;
setting a vector f to expand the noisy black-and-white mask, realizing control of the deformation degree of the black-and-white text mask; the vector f comprises f0, f1 and f2, where f0 is the size of the erosion and expansion kernel, f1 controls the degree of noise added at the edge, and f2 controls the degree of internal noise added.
10. The complex texture structure-oriented artistic word generation system of claim 1, wherein the structure refinement network further comprises an attention mechanism module and an image conversion network, the attention mechanism module being disposed before the image conversion network, wherein:
the attention mechanism module is used for outputting an element attention parameter according to an input tensor of the structure refinement network, wherein the input tensor is the input content graph, and the element attention parameter is used for enabling the image conversion network to pay attention to an element part of the input content graph;
the image conversion network is used for generating an output tensor according to a result of element-by-element multiplication of the input tensor and the element attention parameters, wherein the output tensor is the stylized content graph, the output tensor is used for calculating a loss function value by combining the reference style graph, and the loss function value is used for training the structure refinement network.
CN202210651537.9A 2022-06-09 2022-06-09 Artistic word generation system oriented to complex texture structure Pending CN114943783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210651537.9A CN114943783A (en) 2022-06-09 2022-06-09 Artistic word generation system oriented to complex texture structure


Publications (1)

Publication Number Publication Date
CN114943783A true CN114943783A (en) 2022-08-26

Family

ID=82910004



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination