CN112069769B - Intelligent word effect migration method and system for special effect words
- Publication number: CN112069769B (application CN201910440039.8A)
- Authority: CN (China)
- Prior art keywords: special effect, network, picture, mask, migration
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Landscapes
- Image Processing (AREA)
Abstract
The invention provides an intelligent word effect migration method and system for special effect words. The method comprises the following steps: training, on a training data set, a mask extraction sub-network to extract decoration element masks and a basic special effect migration sub-network to perform basic word effect migration; inputting a special effect word D_y with decorative elements and its paired glyph picture C_y into the trained mask extraction sub-network to obtain a decorative element mask M_y; inputting D_y, its paired glyph picture C_y, and a target glyph picture C_x into the trained basic special effect migration sub-network to obtain a result S_x with the basic special effect migrated and the decorative elements removed; and performing element recombination with M_y, C_y, and C_x to fuse the decorative elements into S_x, obtaining the migrated special effect word D_x with decorative elements corresponding to the target glyph. The method migrates the decorative elements together with the word effect, without loss or distortion of the decorative elements.
Description
Technical Field
The invention belongs to the field of character special effect style migration, and particularly relates to an intelligent character effect migration method and system for special effect characters.
Background
Special effect word style migration aims to migrate the style of a given special effect word onto a given glyph, enabling batch generation of special effect words. In recent years, special effect word style migration has attracted increasing academic attention.
Special effect word style migration methods fall into three categories. Traditional block-fusion methods implement style migration by finding, on the given special effect word, texture blocks that match blocks of the current glyph. To better preserve glyph legibility, this type of method typically computes priors over the glyph structure and considers glyph similarity when searching for matching blocks. Global-statistics methods usually extract features with a deep neural network pre-trained on a classification task, compute global statistics of those features, and make the migration result match the special effect word to be migrated on those statistics, either by iterative optimization or by training a feed-forward network. Deep-neural-network methods first collect a paired training set, then build a deep network architecture, train the network on the training set, and finally apply the trained network to the special effect word to be migrated to realize word effect migration.
However, existing special effect word style migration methods cannot handle decorative elements on the word effect, which leads to loss and distortion of those elements.
Disclosure of Invention
To address these problems, the invention provides an intelligent word effect migration method and system for special effect words, which migrates a word's decorative elements together with its special effect, without loss or distortion of the decorative elements.
The technical scheme adopted by the invention is as follows:
An intelligent word effect migration method for special effect words comprises the following steps:
training, on a training data set, a mask extraction sub-network to extract decoration element masks, and a basic special effect migration sub-network to migrate basic word effects;
inputting a special effect word D_y with decorative elements and its paired glyph picture C_y into the trained mask extraction sub-network to obtain a decorative element mask M_y;
inputting D_y, its paired glyph picture C_y, and a target glyph picture C_x into the trained basic special effect migration sub-network to obtain a result S_x with the basic special effect migrated and the decorative elements removed;
performing element recombination with M_y, C_y, and C_x to fuse the decorative elements into S_x, obtaining the migrated special effect word D_x with decorative elements corresponding to the target glyph.
Further, the training data set comprises synthesized decorated special effect word pictures D and their paired glyph pictures C, and collected, manually glyph-annotated decorated special effect word pictures D_w and their paired glyph pictures C_w. The pictures D are obtained by randomly adding decoration element pictures to collected special effect word pictures, or to special effect word pictures synthesized by randomly combining collected texture pictures with glyph pictures containing no special effect.
Furthermore, the mask extraction sub-network adopts a U-net structure, and a discriminator network netSegD is built to judge which picture domain an input picture comes from; netSegD comprises four convolution modules, each consisting of a convolution layer and a linear rectification function (ReLU).
Further, the training method of the mask extraction sub-network comprises the following steps:
merging D and C in the channel dimension and inputting them into the mask extraction sub-network to obtain a decoration element mask M̂, while extracting the feature P of the penultimate layer;
merging D_w and C_w in the channel dimension and inputting them into the mask extraction sub-network to obtain a decoration element mask M̂_w, while extracting the feature P_w of the penultimate layer;
Further, the basic special effect migration sub-network adopts a multi-scale framework structure and is trained sequentially under three resolutions of 64 × 64, 128 × 128 and 256 × 256.
Further, the training method of the basic special effect migration sub-network comprises the following steps:
inputting D_y, C_y, and C_x into the basic special effect migration sub-network, merged in the channel dimension, to obtain the migration result S_x;
setting a loss function according to S_x to obtain the trained basic special effect migration sub-network.
Further, the element recombination method comprises the following steps:
optimizing M_y with DenseCRF;
scaling M_y to a certain resolution, clustering with DBSCAN, and obtaining the distinct decorative elements and their masks by finding connected shapes;
for each decorative element, which occupies a region E on C_y, searching for a suitable position E' on C_x, copying the decorative element from D_y into S_x, and adjusting its size to obtain D_x.
Further, for each decorative element occupying a region E on C_y, a suitable position E' on C_x is searched for according to a score in which M_guide(·), M̂_guide(·), and M_Exi(·) denote the sums of the values of M_guide, M̂_guide, and M_Exi over a region;
M_guide is obtained as follows: computing the horizontal prior M_Hor and the vertical prior M_Ver on C_y, normalizing them to the [0, 1] interval, and blurring them with a Gaussian blur kernel; computing the distribution prior M_Dis on C_y; and merging the blurred M_Hor, M_Ver, and M_Dis in the channel dimension to obtain M_guide;
An intelligent word effect migration system for special effect words, comprising:
the mask extraction sub-network module, used for inputting a special effect word with decorative elements and its paired glyph picture, and obtaining a decorative element mask from them;
the basic special effect migration sub-network module, used for inputting the special effect word with decorative elements, its paired glyph picture, and the target glyph picture, migrating the basic special effect of the special effect word onto the target glyph, removing the decorative elements, and outputting the result;
and the element recombination module, used for fusing the decorative elements onto the output of the basic special effect migration sub-network module, using the mask obtained by the mask extraction sub-network module, to obtain the migrated special effect word with decorative elements corresponding to the target glyph.
The method first obtains the decoration element mask through the mask extraction sub-network, separating the word effect from the decorative elements; then performs basic word effect migration through the basic special effect migration sub-network; and finally recombines the decorative elements with the migrated special effect word according to the distribution of the decorative elements, obtaining the special effect migration result. The method migrates the decorative elements of a word effect together with the effect itself, without loss or distortion of the decorative elements.
Drawings
Fig. 1 is a block diagram of a mask extraction sub-network used in the present invention.
Fig. 2 is a block diagram of a basic effect migration sub-network used in the present invention.
Fig. 3 is a block diagram of an intelligent word effect migration framework used in the present invention.
Detailed Description
In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.
The embodiment discloses an intelligent word effect migration method for special effect words, which is specifically described as follows:
step 1: a set of character special effect picture pair data sets is collected, and one group of picture pairs consists of a special effect character picture and a corresponding character pattern picture without special effect. The special effect words can be generated in batches through a Photoshop special effect word synthesis action. The data set contains 60 special effect character styles, each style containing 19 fonts of English capital and lowercase letters, and corresponding special effect character picture pairs of 988 fonts. The resolution of the special effect word picture and the font picture is 320 multiplied by 320. 85% of the picture pairs are divided as a training set, and 15% of the picture pairs are divided as a test set. 300 cartoon or geometric texture pictures are collected, and synthesized special effect words are obtained through random combination and used as supplement of training. Thereafter, 4000 sheets of vector graphics were collected as decorative elements. In the network training process, the decorative element pictures are randomly added to the collected or synthesized special effect word pictures to synthesize special effect words with decorations as a synthesized special effect word domain. 1000 special effect character pictures are collected from a network and other channels, and characters are manually marked to serve as a real special effect character field.
Step 2: a mask extraction subnetwork is constructed.
The network structure is shown in Fig. 1. The mask extraction sub-network adopts a U-net structure. In the synthetic special effect word domain, the synthesized decorated special effect word D and its paired glyph picture C are first merged in the channel dimension and input into the network; the mask extraction sub-network outputs the decoration element mask M̂, and the feature P of the penultimate layer is extracted at the same time. In the real special effect word domain, the decorated special effect word D_w and its paired glyph picture C_w are merged in the channel dimension and input into the network; the sub-network outputs the decoration element mask M̂_w, and the feature P_w of the penultimate layer is extracted at the same time. Meanwhile, a discriminator network netSegD is built. netSegD consists of four convolution modules, each comprising a convolution layer and a linear rectification function (ReLU); it outputs a judgment of the picture domain the input comes from.
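The channel-dimension merging that both sub-networks rely on can be sketched with NumPy; `merge_channels` is an illustrative helper name, and the zero/one arrays stand in for real images.

```python
import numpy as np

def merge_channels(effect_img, glyph_img):
    """Concatenate a special effect word image (H, W, 3) with its paired
    glyph image (H, W, 3) along the channel axis, giving the 6-channel
    tensor the mask extraction sub-network takes as input."""
    assert effect_img.shape[:2] == glyph_img.shape[:2]
    return np.concatenate([effect_img, glyph_img], axis=-1)

D = np.zeros((256, 256, 3), dtype=np.float32)  # decorated effect word (dummy)
C = np.ones((256, 256, 3), dtype=np.float32)   # paired glyph picture (dummy)
x = merge_channels(D, C)
print(x.shape)  # (256, 256, 6)
```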
Step 3: train the mask extraction sub-network.
The total loss function L_mask consists of two parts:

L_mask = λ_seg · L_seg + λ_adv · L_adv,

where λ_seg is set to 1 and λ_adv to 0.01. L_seg is the mask extraction loss:

L_seg = λ_L1 · ‖M̂ − M‖_1 + λ_Per · Σ_i ‖VGG_i(M̂) − VGG_i(M)‖_1,

where λ_L1 is set to 1 and λ_Per to 1, M is the annotated decoration element mask of the special effect word picture, and VGG_i(·) denotes the features extracted at the five layers ReLU1, ReLU2, ReLU3, ReLU4, ReLU5 of a pre-trained VGG classification network.
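A minimal numeric sketch of a mask loss of this shape, a pixel L1 term plus a perceptual term summed over feature layers; the toy `feats` function is a stand-in for the pre-trained VGG features, and all names here are illustrative.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return np.abs(a - b).mean()

def seg_loss(pred_mask, true_mask, feats, lambda_l1=1.0, lambda_per=1.0):
    """Sketch of an L1-plus-perceptual mask extraction loss. `feats`
    maps an image to a list of feature maps (here a toy stand-in for
    VGG ReLU1-5 features; the patent uses a pre-trained VGG)."""
    pixel = l1(pred_mask, true_mask)
    percep = sum(l1(fp, ft) for fp, ft in zip(feats(pred_mask), feats(true_mask)))
    return lambda_l1 * pixel + lambda_per * percep

# toy "feature extractor": progressively downsampled copies of the mask
feats = lambda m: [m[::s, ::s] for s in (1, 2, 4)]
pred = np.zeros((8, 8))
true = np.ones((8, 8))
print(seg_loss(pred, true, feats))  # 4.0
```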
L_adv is the mask extraction adversarial loss:

L_adv = −log(netSegD(P_w)),

where netSegD(P_w) is the output of the discriminator netSegD on input P_w. netSegD itself is trained to judge which picture domain the input features come from, with the loss function

L_netSegD = −log(netSegD(P)) − log(1 − netSegD(P_w)).
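The two adversarial terms can be evaluated numerically as below. The generator-side term −log(netSegD(P_w)) is stated in the text; the two-domain discriminator objective in `disc_loss` is the standard cross-entropy form and is an assumption, since the patent's exact netSegD formula is not reproduced here.

```python
import math

def gen_adv_loss(d_prob_real_domain):
    """Generator-side term from the text: -log(netSegD(P_w)), pushing
    real-domain features toward the synthetic-domain classification."""
    return -math.log(d_prob_real_domain)

def disc_loss(p_synth, p_real):
    """Assumed standard two-domain discriminator loss: classify the
    synthetic-domain feature P as synthetic and the real-domain
    feature P_w as real."""
    return -math.log(p_synth) - math.log(1.0 - p_real)

print(round(gen_adv_loss(0.5), 4))    # 0.6931
print(round(disc_loss(0.9, 0.1), 4))  # 0.2107
```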
and 4, step 4: and building a basic special effect migration network.
The network structure is shown in Fig. 2. A multi-scale framework is adopted. Because the network is difficult to train directly at high resolution, a training strategy of gradually increasing resolution is used, moving from easy to hard: the network is trained sequentially at the three resolutions 64×64, 128×128, and 256×256. The inputs are the synthesized decorated special effect word D_y, its paired glyph picture C_y, and the target glyph picture C_x. The three input pictures are merged in the channel dimension, and the basic special effect migration network outputs the migration result S_x.
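The coarse-to-fine schedule can be sketched as a generator over (resolution, epoch) stages; the epoch count per scale is illustrative, as the patent does not specify it.

```python
def resolution_schedule(scales=(64, 128, 256), epochs_per_scale=20):
    """Yield (resolution, epoch) stages: train fully at each resolution
    before moving to the next, per the easy-to-hard strategy.
    epochs_per_scale is an assumed hyper-parameter."""
    for res in scales:
        for epoch in range(epochs_per_scale):
            yield res, epoch

stages = list(resolution_schedule(epochs_per_scale=2))
print(stages)  # [(64, 0), (64, 1), (128, 0), (128, 1), (256, 0), (256, 1)]
```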
Step 5: train the basic special effect migration network. The total loss function L_transfer consists of two parts and uses WGAN-GP:

L_transfer = λ_L1 · L_L1 + λ_adv · L_adv,

where λ_L1 is set to 1 and λ_adv to 0.01, and L_adv is the adversarial loss. In the adversarial loss, E_{X∼Y}[·] denotes the mean computed over X drawn from distribution Y; D denotes a discriminator network, and D(a, b, c, d) denotes the discriminator's judgment of a given the inputs a, b, c, d. The discriminator adopts the PatchGAN network structure.
Step 6: build the intelligent word effect migration system framework. The trained mask extraction sub-network and basic special effect migration sub-network are combined as in Fig. 3. Within the system framework, the given special effect word to be migrated D_y with decorative elements and its paired glyph picture C_y are first input into the trained mask extraction sub-network to obtain the decorative element mask M_y. Then D_y, its paired glyph picture C_y, and the target glyph picture C_x are input into the basic special effect migration sub-network to obtain the result S_x with the basic special effect migrated and the decorative elements removed. Element recombination is then performed to obtain the final style-migrated special effect word D_x with decorative elements.
The element recombination part first computes the horizontal prior M_Hor from C_y. Define x_{y,min} as the pixel position of the leftmost foreground-glyph pixel in each row of C_y, and x_{y,max} as the rightmost. An initial estimate is generated with the window width

K_w = 0.06 · (x_{y,max} − x_{y,min}).

M_Hor is then generated by recursion, where x_{y,center} denotes the pixel position of the center of the foreground glyph in each row of C_y. Finally, M_Hor is normalized to the interval [0, 1] and blurred with a Gaussian blur kernel to obtain the final M_Hor.
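The per-row quantities that the horizontal prior starts from (x_{y,min}, x_{y,max}, and K_w = 0.06 · (x_{y,max} − x_{y,min})) can be computed as below; the helper name and the (-1, -1, 0.0) convention for empty rows are assumptions, and the recursion and Gaussian blur steps are not reproduced.

```python
import numpy as np

def row_extents(glyph_mask):
    """For each row of a binary glyph mask, return the leftmost and
    rightmost foreground pixel positions and K_w = 0.06 * (x_max - x_min).
    Rows with no foreground return (-1, -1, 0.0) (assumed convention)."""
    out = []
    for row in glyph_mask:
        cols = np.flatnonzero(row)
        if cols.size == 0:
            out.append((-1, -1, 0.0))
        else:
            x_min, x_max = int(cols[0]), int(cols[-1])
            out.append((x_min, x_max, 0.06 * (x_max - x_min)))
    return out

mask = np.zeros((3, 10), dtype=int)
mask[1, 2:8] = 1  # foreground on columns 2..7 of row 1
print(row_extents(mask)[1])  # (2, 7, 0.3)
```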
Similarly, the vertical prior M_Ver is computed from C_y. Define y_{x,min} as the pixel position of the topmost foreground-glyph pixel in each column of C_y, and y_{x,max} as the bottommost. An estimate of the vertical variation of the glyph is generated with the window width

K'_w = 0.06 · (y_{x,max} − y_{x,min}).

M_Ver is then generated by recursion, where y_{x,center} denotes the pixel position of the center of the foreground glyph in each column of C_y. Finally, M_Ver is normalized to the interval [0, 1] and blurred with a Gaussian blur kernel to obtain the final M_Ver.
The distribution prior M_Dis is computed on C_y:

M_Dis = (1 − Dis(x, y))^5,

where Dis(x, y) denotes the glyph distribution prior calculated by Yang et al. (cf. Shuai Yang, Jiaying Liu, Zhouhui Lian, and Zongming Guo, "Awesome Typography: Statistics-Based Text Effects Transfer," Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, Jul. 2017).
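The distribution prior is a direct pointwise formula and can be evaluated as follows; `distribution_prior` is an illustrative name, and `dis` stands for the Dis(x, y) value from Yang et al.

```python
def distribution_prior(dis):
    """M_Dis = (1 - Dis(x, y))^5: positions with a large distribution
    prior Dis are strongly suppressed by the fifth power."""
    return (1.0 - dis) ** 5

print(distribution_prior(0.0))  # 1.0
print(distribution_prior(0.5))  # 0.03125
```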
M_Hor, M_Ver, and M_Dis are merged in the channel dimension to obtain the total prior M_guide. Similarly, the total prior M̂_guide is computed on C_x. An overlap record matrix M_Exi is also maintained: M_Exi(x, y) = 0 where a decorative element already occupies the position, and M_Exi(x, y) = 1 otherwise.
Element recombination first optimizes M_y with DenseCRF. Then M_y is scaled to 5 × 5 resolution and clustered with DBSCAN; the distinct decorative elements and their masks are obtained by finding connected shapes. For each decorative element, let E denote the region it occupies on C_y; a suitable position E' on C_x is obtained by search, where M_guide(·), M̂_guide(·), and M_Exi(·) denote the sums of the values of M_guide, M̂_guide, and M_Exi over a region.
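The connected-shape step can be sketched with a plain 4-connected flood fill; this is a simple stand-in for the patent's DBSCAN clustering on the downscaled mask, and the helper name is illustrative.

```python
import numpy as np

def connected_components(mask):
    """Label 4-connected foreground regions of a binary mask, a
    stand-in for the clustering / connected-shape step that separates
    individual decorative elements."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        current += 1
        stack = [(sy, sx)]
        while stack:  # iterative flood fill from the seed pixel
            y, x = stack.pop()
            if not (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]):
                continue
            if mask[y, x] == 0 or labels[y, x]:
                continue
            labels[y, x] = current
            stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, current

m = np.array([[1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 0, 1]])
labels, n = connected_components(m)
print(n)  # 2
```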
The decorative element is copied from D_y into S_x at E'. If the overlap area between the decorative element and the glyph decreases after copying, the element is slightly shrunk and moved closer to the glyph; otherwise it is slightly enlarged and moved further from the glyph. This yields the final style-migrated special effect word D_x with decorative elements.
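The resize rule can be sketched as a small decision function; the step size and return convention are assumptions, since the patent only says "slightly" shrink or enlarge.

```python
def adjust_element(overlap_before, overlap_after, scale=1.0, step=0.05):
    """Resize rule from the text: if copying reduced the element/glyph
    overlap, shrink the element slightly and move it toward the glyph;
    otherwise enlarge it slightly and move it away. `step` is an
    assumed magnitude for 'slightly'."""
    if overlap_after < overlap_before:
        return scale - step, "toward_glyph"
    return scale + step, "away_from_glyph"

print(adjust_element(0.4, 0.2))  # (0.95, 'toward_glyph')
print(adjust_element(0.2, 0.4))  # (1.05, 'away_from_glyph')
```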
Step 7: input the special effect word to be migrated D_y with decorative elements, its paired glyph picture C_y, and the target glyph picture C_x into the system framework to obtain the final style-migrated special effect word D_x with decorative elements.
The above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and a person skilled in the art can modify the technical solution of the present invention or substitute the same without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.
Claims (10)
1. An intelligent word effect migration method for special effect words, comprising the following steps:
training, on a training data set, a mask extraction sub-network to extract decoration element masks, and a basic special effect migration sub-network to migrate basic word effects;
inputting a special effect word D_y with decorative elements and its paired glyph picture C_y into the trained mask extraction sub-network to obtain a decorative element mask M_y;
inputting D_y, its paired glyph picture C_y, and a target glyph picture C_x into the trained basic special effect migration sub-network to obtain a result S_x with the basic special effect migrated and the decorative elements removed;
performing element recombination with M_y, C_y, and C_x to fuse the decorative elements into S_x, obtaining the migrated special effect word D_x with decorative elements corresponding to the target glyph.
2. The method of claim 1, wherein the mask extraction sub-network is a U-net network structure, and a discriminator network netSegD is constructed for determining a picture domain in which the input picture is located, the netSegD comprising four convolution modules, each convolution module comprising a convolution layer and a linear rectification function.
3. The method of claim 2, wherein the training data set comprises synthesized decorated special effect word pictures D and their paired glyph pictures C, and collected, manually glyph-annotated decorated special effect word pictures D_w and their paired glyph pictures C_w; wherein the pictures D are obtained by randomly adding decoration element pictures to collected special effect word pictures, or to special effect word pictures synthesized by randomly combining collected texture pictures with glyph pictures containing no special effect.
4. The method of claim 3, wherein the training method of the mask extraction subnetwork comprises the steps of:
merging D and C in the channel dimension and inputting them into the mask extraction sub-network to obtain a decoration element mask M̂, while extracting the feature P of the penultimate layer;
merging D_w and C_w in the channel dimension and inputting them into the mask extraction sub-network to obtain a decoration element mask M̂_w, while extracting the feature P_w of the penultimate layer;
5. The method of claim 4, wherein the loss function is set according to M̂, M̂_w, P, and P_w as follows:
the loss function comprises a mask extraction loss L_seg and a mask extraction adversarial loss L_adv, which compose the total loss function L_mask:
L_mask = λ_seg · L_seg + λ_adv · L_adv,
where λ_seg is set to 1 and λ_adv is set to 0.01;
λ_L1 is set to 1 and λ_Per is set to 1, M is the annotated decoration element mask of the special effect word picture, and VGG(·) denotes the features extracted at the five layers ReLU1, ReLU2, ReLU3, ReLU4, and ReLU5 of a pre-trained VGG classification network;
L_adv = −log(netSegD(P_w)),
where netSegD(P_w) is the output of the discriminator netSegD on input P_w; netSegD is trained to judge which picture domain the input features come from.
6. the method of claim 1, wherein the base special effects migration subnetwork employs a multi-scale framework structure, and the network is sequentially trained with a resolution-escalating training strategy comprising sequential training at three resolutions of 64 x 64, 128 x 128, and 256 x 256.
7. The method according to claim 1 or 2, wherein the training of the basic special effect migration sub-network comprises the following steps:
inputting D_y, C_y, and C_x into the basic special effect migration sub-network, merged in the channel dimension, to obtain the migration result S_x;
setting a loss function according to S_x to obtain the trained basic special effect migration sub-network.
8. The method of claim 1, wherein the element recombination comprises the following steps:
optimizing M_y with DenseCRF;
scaling M_y to a certain resolution, clustering with DBSCAN, and obtaining the distinct decorative elements and their masks by finding connected shapes;
for each decorative element occupying a region E on C_y, searching for a suitable position on C_x, copying the decorative element from D_y into S_x, and adjusting its size to obtain D_x.
9. The method of claim 8, wherein for each decorative element occupying a region E on C_y, a suitable position E' on C_x is searched for according to a score in which M_guide(·), M̂_guide(·), and M_Exi(·) denote the sums of the values of M_guide, M̂_guide, and M_Exi over a region;
M_guide is obtained as follows: computing the horizontal prior M_Hor and the vertical prior M_Ver on C_y, normalizing them to the [0, 1] interval, and blurring them with a Gaussian blur kernel; computing the distribution prior M_Dis on C_y; and merging the blurred M_Hor, M_Ver, and M_Dis in the channel dimension to obtain M_guide.
10. An intelligent word effect migration system for special effect words, comprising:
a mask extraction sub-network module, used for inputting a special effect word with decorative elements and its paired glyph picture, and obtaining a decorative element mask from them;
a basic special effect migration sub-network module, used for inputting the special effect word with decorative elements, its paired glyph picture, and the target glyph picture, migrating the basic special effect onto the target glyph, removing the decorative elements, and outputting the result;
and an element recombination module, used for fusing the decorative elements onto the output of the basic special effect migration sub-network module, using the mask obtained by the mask extraction sub-network module, to obtain the migrated special effect word with decorative elements corresponding to the target glyph.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910440039.8A | 2019-05-24 | 2019-05-24 | Intelligent word effect migration method and system for special effect words
Publications (2)

Publication Number | Publication Date
---|---
CN112069769A | 2020-12-11
CN112069769B | 2022-07-26

Family Applications (1) — Family ID: 73658095

Application Number | Priority Date | Filing Date | Status
---|---|---|---
CN201910440039.8A | 2019-05-24 | 2019-05-24 | Active (granted as CN112069769B)
Patent Citations (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN107577651A | 2017-08-25 | 2018-01-12 | 上海交通大学 (Shanghai Jiao Tong University) | Chinese character style migration system based on adversarial networks
CN109146989A | 2018-07-10 | 2019-01-04 | 华南理工大学 (South China University of Technology) | A method of generating stylized bird-and-flower character images by building a neural network

Non-Patent Citations (2)

Title
---
Li, YH et al., "Adaptive Batch Normalization for practical domain adaptation," Pattern Recognition, 2018-12-28
蔡湧达 et al., "基于神经风格迁移的字体特效渲染技术" (Font special-effect rendering based on neural style transfer), 电脑知识与技术, 2019-02
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |