CN110738715B - Automatic migration method of dynamic text special effect based on sample - Google Patents


Publication number
CN110738715B
CN110738715B
Authority
CN
China
Prior art keywords
target
style
text
special effect
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810796815.3A
Other languages
Chinese (zh)
Other versions
CN110738715A (en)
Inventor
Lian Zhouhui (连宙辉)
Men Yifang (门怡芳)
Tang Yingmin (唐英敏)
Xiao Jianguo (肖建国)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN201810796815.3A priority Critical patent/CN110738715B/en
Publication of CN110738715A publication Critical patent/CN110738715A/en
Application granted granted Critical
Publication of CN110738715B publication Critical patent/CN110738715B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T3/04
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a sample-based automatic migration method for dynamic text special effects. A group of samples is input, comprising a material text image, a material dynamic special effect word (material style animation) and a target text image. According to the dynamic style in the material style animation and the text content in the target text image, the text special effect of the material style animation is fully transferred to the target text through key frame extraction, spatio-temporal coherence construction and texture synthesis based on an improved PatchMatch, so that the overall and detailed texture style is preserved and a target dynamic special effect word/target style animation with a vivid, natural and smooth dynamic effect is generated automatically; the target style animation is composed of multiple frames of target style maps. The method solves the temporal problems of texture flickering and jittering in the target style animation, solves the appearance problems of missing and incompletely migrated complex fluid effects, and ensures the texture continuity of each frame.

Description

Automatic migration method of dynamic text special effect based on sample
Technical Field
The invention belongs to the technical field of computer vision and graphics, and relates to a sample-based dynamic text special effect migration method that realizes automatic migration of dynamic special effects between different texts and generates vivid dynamic special effect words with stable timing and high style fidelity.
Background
Style transfer is a research hotspot in computer vision and graphics; it migrates style textures between different contents. Video style transfer extends image style transfer to the temporal domain: most methods predict the next frame from the current frame according to a target motion flow and use the prediction to guide the synthesis of the next frame, so that inter-frame temporal coherence is obtained and temporal texture errors such as flicker and jitter are suppressed. Yang et al. (Shuai Yang, Jiaying Liu, Zhouhui Lian, and Zongming Guo. 2017. Awesome typography: Statistics-based text effects transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 7464-7473.) proposed a static text special effect migration method which assumes that all image blocks at the same distance from the text skeleton line share the same texture style, and builds a distance model on this assumption to guide the spatial distribution of effect textures in the generated image. Dynamic text special effect migration extends the static method to the temporal domain and can produce more complex and vivid dynamic effects than static words. The most direct realization is to perform text special effect migration frame by frame and, following video style transfer, add a temporal constraint between frames to keep the generated result temporally stable. However, a dynamic special effect usually contains several sub-effects, and may even mix static and dynamic effects (for example, a burning rust word effect contains both a dynamic flame effect and a static rust effect), so this strategy is of limited use: even with a temporal guidance term it is still difficult to generate dynamic special effect words whose static parts are stable and whose dynamic parts are natural and smooth, the static effect textures flicker and jitter severely, and fluid effects that violate the distance-distribution assumption but are vivid and strong cannot be given correct spatial-distribution guidance, so the style migration is incomplete or even fails.
Animation stylization techniques take a material style map and a target animation as input and stylize the target animation, using traditional texture synthesis or deep learning, so that it carries the style characteristics of the material style map. However, such methods cannot turn a target text image into a target style animation: when the input contains no target animation but only a segment of material dynamic special effect word (together with its corresponding material text image) and one target text image, existing animation stylization methods cannot generate dynamic special effect words with a stable and smooth dynamic effect.
In summary, existing style transfer techniques suffer from temporal problems such as flicker and jitter of dynamic textures and from appearance problems such as missing or incompletely migrated complex fluid effects, and they cannot generate dynamic special effect words with a stable and smooth dynamic effect when no target animation is provided as input.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a sample-based dynamic text special effect migration method. It solves the temporal problems of texture flicker and jitter in the generated animation through key frame extraction and the construction of a spatio-temporal coherence term, and solves the appearance problems of missing and incompletely migrated complex fluid effects by introducing a distance weight map and a simulated-annealing variant of the PatchMatch algorithm (an image block matching algorithm).
The method addresses the following scenario: a segment of material dynamic special effect word (together with its corresponding material text image) and a target text image are input, and a target dynamic special effect word is generated so that the text of the target text image carries the dynamic effect of the material dynamic special effect word. Conventional animation stylization methods are difficult to apply directly to this scenario. The invention turns a target text image into a target style animation, i.e. converts a static image into a dynamic one, with only a single target text image provided; unlike existing animation stylization methods, which directly input a dynamic target animation, the target motion flow therefore cannot be estimated directly to provide temporal prediction and guidance. The method does not perform style migration frame by frame. Instead, it treats the whole video as one volume in three-dimensional space: by extracting key frames and constructing a spatio-temporal coherence term, the image blocks of a two-dimensional image are extended along the temporal dimension into spatio-temporal blocks; the spatio-temporal coherence term and a text guidance term are then combined into an objective function; finally, guided by the objective function, an improved PatchMatch algorithm performs nearest-neighbor matching of the spatio-temporal blocks of the target video within the material video, thereby synthesizing the target dynamic texture. The improved PatchMatch algorithm achieves a deep and direction-guided matching search by introducing a distance weight map and a simulated annealing algorithm, and transfers the material texture style to the target map more completely.
The technical scheme provided by the invention is as follows:
A group of samples comprising a material text image, a material dynamic special effect word and a target text image is input. Through key frame extraction, spatio-temporal coherence construction and texture synthesis based on an improved PatchMatch, the text special effect of the material style animation is fully transferred to the target text, the overall and detailed texture styles are preserved, and a target dynamic special effect word with a vivid, natural and smooth dynamic effect is generated automatically. The method specifically comprises the following steps:
A. Input a group of samples comprising a material text image, a material dynamic special effect word (material style animation) and a target text image; the output is the automatically generated target dynamic special effect word.
The material style animation is composed of multiple frames of material style maps; the position and size of the characters in the material style animation remain unchanged, while the dynamic style (dynamic effect) applied to the characters changes freely. The material text map is the semantic mask of the characters in the material style animation, and the two are content-aligned; the target text map contains the text content (characters) to be stylized. The target dynamic special effect word, i.e. the final output, is the target style animation generated automatically from the dynamic style of the material dynamic special effect word and the text content in the target text image; it is composed of multiple frames of target style maps.
B. Extract the key frames of the material style animation: count the numbers of newborn and dying special effect particles frame by frame, plot the number of particles whose survival state changes against the frame number, and extract the frame numbers corresponding to peaks that satisfy a minimum interval as the frames where the effect shape changes most, defining them as key frames. Each material style animation yields several key frames, and each key frame corresponds to one material style map.
Extracting the key frames of the material style animation specifically comprises the following steps:
B1. Extract the special effect particles from every frame image (material style map) of the material style animation: remove the background according to the RGB value of each pixel (the background is usually black or another solid color); the remaining pixels are the special effect particles.
B2. Starting from the second material style map, count the particles whose survival state changes: newborn particles, which are not special effect particles in the previous frame but are in the current frame, and dying particles, which are special effect particles in the previous frame but not in the current frame. Adding the two counts gives the number of changed particles.
B3. Plot a line graph of the number of changed particles of all frames, with the frame number on the horizontal axis and the number of changed particles on the vertical axis. Take the β highest peaks on the line graph, ensuring that the horizontal-axis interval between any two peaks is larger than θ, and take the frame numbers corresponding to these peaks as the key frame numbers. β is the number of key frames and may be set to one tenth of the total number of frames of the material style animation; θ is the minimum interval that the key frame numbers must satisfy and may be set to 5. If the background of the material style animation is not a solid color, the key frames are extracted by uniform sampling instead.
C. Construct the spatio-temporal coherence term: on each key frame, compute the distance (L2 distance) in RGB space between the image blocks of the corresponding material style map and target style map, and add the distances over all key frames to obtain the distance between spatio-temporal blocks, i.e. the spatio-temporal coherence term.
D. Build the objective function, which consists of the spatio-temporal coherence term and a text guidance term. The text guidance term is obtained by computing the L2 norm in RGB space between the image blocks of the material text image and the image blocks of the target text image.
E. Use the improved PatchMatch algorithm to perform nearest-neighbor matching between the material spatio-temporal blocks of the material style animation and the target spatio-temporal blocks of the target style animation (each formed by the image blocks on the key frames). This yields a matching relation that applies to the image blocks of the material and target style maps on every key frame; applying it to every frame of the material style maps produces the target style animation. A spatio-temporal block is the extension of an ordinary image block along the whole temporal dimension and is represented by the image blocks on the key frames, so a single PatchMatch nearest-neighbor matching suffices to obtain a matching result under which the corresponding material and target style maps of all frames satisfy the matching relation.
The improved PatchMatch algorithm for nearest-neighbor matching mainly comprises the following operations:
E1. The PatchMatch algorithm iteratively optimizes the objective function with an expectation-maximization scheme, alternating between nearest-neighbor search and target style map reconstruction until convergence.
E2. Nearest-neighbor search consists of two steps: random search and direction-guided deep propagation. Random search tries to obtain a better matching relation by random matching. Direction-guided deep propagation realizes a directionally guided and deep propagation process by introducing a weight map and the core idea of the simulated annealing algorithm. First, the shortest distance from each pixel of the target text image to the text edge contour is computed to generate a distance map; from the distance map, the weight of each pixel within each image block of the target text image is obtained and recorded as a weight map. The weight is negatively and exponentially correlated with the distance, so pixels closer to the text contour line receive higher weights. The weight map guides the texture to be synthesized outwards from the text contour edge by controlling the per-pixel weights both when computing image block similarity (every guidance term of the objective function measures the similarity between image blocks; the L2 distance between blocks is the weighted sum of per-pixel L2 distances, with the weights supplied by the weight map) and when reconstructing the target style map.
Deep propagation based on the simulated annealing algorithm first sets an initial temperature and a termination temperature and initializes the nearest-neighbor matching result. During PatchMatch neighborhood propagation, the objective value of the candidate solution is computed; if it is smaller than the objective value of the current solution, the candidate replaces the current solution and the current solution is updated; otherwise the candidate is accepted with a certain probability. The acceptance probability decreases as the temperature decreases, and the temperature decreases as the number of iterations increases. With this dynamic probabilistic acceptance, new solutions are accepted more easily in early iterations, letting the texture propagate deeply, while in later iterations only solutions that strictly decrease the energy are accepted.
E3. Target style map reconstruction: the target style map is rebuilt from the material style map and the matching results of the pixels in the target style map. The RGB value of each pixel in the target style map is the weighted average, at that position, of the best-matching blocks of all image blocks covering the pixel, with the weights supplied by the weight map.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a dynamic text special effect migration method. By extracting key frames, constructing a spatio-temporal coherence term, and using temporally extended spatio-temporal blocks to remove the temporal dimension from the neighborhood matching search, the target effect particles automatically imitate the dynamic changes of the material effect particles under the determined matching relation; the temporal problems of texture flicker and jitter in the target style animation are solved and the texture continuity of each frame is guaranteed.
Given a segment of material dynamic special effect word (together with its corresponding material text image) and a target text image, the target dynamic special effect word is generated; by introducing a distance weight map and a simulated-annealing variant of PatchMatch, the appearance problems of missing and incompletely migrated complex fluid effects are solved, and the text of the target text image acquires the dynamic effect of the material dynamic special effect word.
Drawings
FIG. 1 is a block flow diagram of a method provided by the present invention.
FIG. 2 is a diagram illustrating input and output effects of an embodiment of the present invention;
wherein (a) is the input material text map; (b) is the input material dynamic special effect word; (c) is the input target text map; (d) is the automatically generated target dynamic special effect word.
FIG. 3 is a diagram illustrating key frame extraction according to an embodiment of the present invention;
wherein (a) shows two consecutive frames of the material style map; (b) the special effect particle masks extracted from the two frames; (c) the changed-particle mask containing the newborn and dying particles; (d) the line graph of the number of changed particles against the frame number.
FIG. 4 is a schematic diagram of spatiotemporal coherence term construction according to an embodiment of the present invention;
wherein x and y form the planar spatial domain; t is the temporal domain and corresponds to the frame number of each frame of the material (target) style animation; $t_1$, $t_2$, $t_3$ are the frame numbers of three key frames; $P^{t_1}_{sty}$ is an image block of the material style map $S^{t_1}_{sty}$ at frame $t_1$; $Q^{t_1}_{sty}$ is an image block of the target style map $T^{t_1}_{sty}$ at frame $t_1$.
FIG. 5 is a schematic diagram of the directional propagation using the weight map in the embodiment of the present invention;
wherein (a) is the distance map corresponding to the target text map; (b) is the weight map corresponding to a certain target image block; (c) is a schematic diagram of the modules in which the weight map is applied.
FIG. 6 is an example of the result of performing a dynamic special effect migration on different glyphs for different special effect styles according to an embodiment of the present invention.
Detailed Description
The invention will be further described by way of examples, without in any way limiting the scope of the invention, with reference to the accompanying drawings.
The invention provides a sample-based dynamic text special effect migration method. By extracting key frames and constructing a spatio-temporal coherence term, the image blocks of a two-dimensional image are extended along the temporal dimension into spatio-temporal blocks; the spatio-temporal coherence term and a text guidance term are combined into an objective function; an improved PatchMatch algorithm, which introduces a distance weight map and simulated annealing, performs a deep and direction-guided nearest-neighbor matching search; and the obtained matching relation is applied to every frame of the material style maps to complete the synthesis of the target dynamic special effect word.
The flow of the method is shown in FIG. 1. A specific embodiment is as follows:
1) Take a material text image, a material dynamic special effect word and a target text image as input and the target style maps as output; the input and output are shown in FIG. 2.
2) Extract the key frames of the input material dynamic special effect word; the process is visualized in FIG. 3.
2-a) Segment the text particles carrying the special effect according to the RGB values of the pixels of each frame of material style map, i.e. remove all background pixels; the backgrounds used in the samples are mostly solid colors such as black, and if the background is not a solid color the key frames are extracted directly by uniform sampling. Denote the special effect particles extracted from frame t as $M_t$; likewise, the special effect particles of frame t-1 are $M_{t-1}$.
2-b) Starting from the second material style map, count the particles whose survival state changes: newborn particles, which are not special effect particles in the previous frame but are in the current frame, denoted $M^{new}_t = M_t \setminus M_{t-1}$, and dying particles, which are special effect particles in the previous frame but not in the current frame, denoted $M^{die}_t = M_{t-1} \setminus M_t$. The total number of changed particles is the sum of the numbers of newborn and dying particles:

$$n_t = \left|M^{new}_t\right| + \left|M^{die}_t\right|, \qquad t = 2, \ldots, m \qquad (1)$$

where $n_t$ is the number of changed particles of frame t; $\left|M^{new}_t\right|$ is the number of newborn particles; $\left|M^{die}_t\right|$ is the number of dying particles; m is the total number of frames of the material style animation.
2-c) Plot a line graph of the number of changed particles of the material style maps, with the frame number on the horizontal axis and the number of changed particles on the vertical axis; take the β highest peaks on the line graph, ensure a minimum interval between any two peaks, and take the frame numbers corresponding to these peaks as the key frame numbers:

$$kf = \left\{ k \;\middle|\; n_k > n_{k+\mu} \text{ and } n_k > n_{k-\mu},\ \forall \mu \in [1, \theta] \right\}, \qquad |kf| = \beta \qquad (2)$$

where kf is the set of extracted key frame numbers; k is a key frame number; θ is the minimum interval between key frames; μ is any value in [1, θ]; $n_k$ is the number of changed particles of frame k; β is the number of key frames and is set to one tenth of the total number of frames of the material style animation.
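To make steps 2-a) to 2-c) concrete, here is a minimal NumPy sketch of the key frame extraction; the background threshold bg_thresh and the order in which equally high peaks are broken are illustrative assumptions of the sketch, not values fixed by this description.

```python
import numpy as np

def extract_key_frames(frames, beta=None, theta=5, bg_thresh=20):
    """Sketch of steps 2-a) to 2-c): segment the effect particles against a
    dark/solid background, count per-frame survival-state changes n_t, then
    keep the beta highest peaks whose frame numbers are more than theta apart.
    bg_thresh is an assumed RGB threshold for a black background."""
    frames = np.asarray(frames)                     # (m, H, W, 3) material style animation
    masks = frames.max(axis=-1) > bg_thresh         # particle mask M_t of every frame (2-a)
    m = len(frames)
    beta = beta or max(1, m // 10)                  # one tenth of the total frame count

    # 2-b) changed-particle count: newborn plus dying particles, from frame 2 on
    n = np.zeros(m, dtype=int)
    for t in range(1, m):
        newborn = masks[t] & ~masks[t - 1]          # M_t^new: appears in frame t only
        dying = ~masks[t] & masks[t - 1]            # M_t^die: disappears after frame t-1
        n[t] = newborn.sum() + dying.sum()

    # 2-c) take the beta highest peaks with pairwise spacing larger than theta
    key_frames = []
    for t in np.argsort(n)[::-1]:                   # frame numbers by descending n_t
        if all(abs(int(t) - k) > theta for k in key_frames):
            key_frames.append(int(t))
        if len(key_frames) == beta:
            break
    return sorted(key_frames), n
```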
3) Construct the spatio-temporal coherence term. The spatio-temporal coherence term enforces the texture continuity of the key frames over the whole video, so that the final matching relation applies to all frames; at the same time, because all frames share one matching relation, the temporal dimension is removed from the search space and the three-dimensional problem is reduced to a matching problem on a two-dimensional plane. The spatio-temporal coherence term is the sum, over the key frames, of the L2 distances in RGB space between the image blocks of the corresponding material style map and target style map:
$$E_{st} = \sum_{t \in kf} D\!\left(P^{t}_{sty},\, Q^{t}_{sty}\right) \qquad (3)$$

where $E_{st}$ is the energy value of the spatio-temporal coherence term; $P^{t}_{sty}$ is an image block of the material style map of frame t; $Q^{t}_{sty}$ is an image block of the target style map of frame t; D denotes the L2 distance between image blocks in RGB space; t is a key frame number; kf is the key frame set. A visualization is shown in FIG. 4.
4) Build the objective equation. The objective function consists of the spatio-temporal coherence term and the text guidance term. The spatio-temporal coherence term is obtained in step 3); the text guidance term is obtained by computing the L2 norm in RGB space between the image blocks of the material text image and the image blocks of the target text image, that is:

$$E = \sum_{t \in kf} D\!\left(P^{t}_{sty},\, Q^{t}_{sty}\right) + D\!\left(P_{text},\, Q_{text}\right) \qquad (4)$$

where P denotes an image block of the material text map $S_{text}$ or of a material style map $S_{sty}$; Q denotes an image block of the target text map $T_{text}$ or of a target style map $T_{sty}$; $P_{text}$ is an image block of the material text image and $Q_{text}$ an image block of the target text image, and the distance between the two constitutes the text guidance term.
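A minimal sketch of equations 3 and 4 for a single pair of image blocks is given below; the patch size w and the balance weight lam are illustrative parameters of the sketch (with lam = 1 the sum reduces to equation 4 as written above).

```python
import numpy as np

def patch_objective(mat_sty, tgt_sty, mat_text, tgt_text, p, q, key_frames, w=5, lam=1.0):
    """Sketch of equations 3 and 4 for one pair of patches: the spatio-temporal
    coherence term sums the L2 patch distance over the key frames, and the text
    guidance term compares the text-map patches. p and q are the top-left
    corners of the material and target blocks; w and lam are assumed values."""
    def block(img, pos):
        y, x = pos
        return np.asarray(img[y:y + w, x:x + w], dtype=np.float64)

    # equation 3: spatio-temporal coherence term over the key frames
    e_st = sum(np.sum((block(mat_sty[t], p) - block(tgt_sty[t], q)) ** 2)
               for t in key_frames)

    # text guidance term of equation 4: L2 distance between the text-map blocks
    e_text = np.sum((block(mat_text, p) - block(tgt_text, q)) ** 2)

    return e_st + lam * e_text    # lam = 1 reproduces equation 4 as written
```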
5) Perform nearest-neighbor matching with the improved PatchMatch algorithm to obtain the matching relation between material image blocks and target image blocks, and apply the matching relation to every frame of the material style maps to obtain the target style animation.
The PatchMatch algorithm obtains the matching relation between the reconstructed target style map and the material style map by iteratively optimizing the objective equation over multiple scales. At each scale, the objective function is optimized iteratively with an expectation-maximization scheme, alternating between nearest-neighbor search and target style map reconstruction until convergence. Nearest-neighbor search mainly comprises propagation and random search; propagation exploits the continuity of textures in natural images to spread good matches to neighboring positions, and the method realizes a directionally guided and deep propagation process by introducing a weight map and the core idea of the simulated annealing algorithm.
Because the text edge of the target text image contains more semantic features and correct matches are easier to find there, directional propagation is adopted to guide the texture to be synthesized outwards from the text contour edge. As shown in FIG. 5, the shortest distance from each pixel q of the target text map to the text edge contour Ω is computed first, and the map colored according to the distance values is called the distance map. Any w × w image block is extracted from the distance map, with w set to 3, so the block contains 3 × 3 distance values; applying equation 5 to each distance value gives the weight map of the block:
$$\alpha_{q'} = a^{\,d(q,\,\Omega) \,-\, d(q',\,\Omega)} \qquad (5)$$

where $\alpha_{q'}$ is the weight of pixel q', and q' is any pixel of the image block $N_q$ centered at pixel q; the base of the exponent a is set to 2; $d(q', \Omega)$ is the perpendicular distance from pixel q' to the text edge contour Ω, i.e. the distance value at q'; likewise, $d(q, \Omega)$ is the distance value at pixel q.
The weight map controls the per-pixel weights when computing image block similarity (every guidance term of the objective function measures the similarity between image blocks; the L2 distance between two blocks is the weighted sum of the per-pixel L2 distances, with the weights supplied by the weight map) and when reconstructing the target style map, so that pixels close to the text contour line carry a larger proportion. Specifically, when computing the L2 distance between two image blocks, each pixel's distance is weighted and summed, the weights being given by the weight map of equation 5.
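The distance map and equation 5 can be illustrated with the following sketch, which uses SciPy's Euclidean distance transform; measuring the contour distance from both sides of the text mask and the helper name block_weight_map are assumptions of this sketch, not details fixed by this description.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def block_weight_map(target_text_mask, q, w=3, a=2.0):
    """Sketch of the distance map and of equation 5: compute the distance of
    every pixel of the target text map to the text edge contour, then derive
    the weight map of the w x w block centered at pixel q, where pixels nearer
    to the contour receive exponentially larger weights (base a = 2)."""
    mask = np.asarray(target_text_mask, dtype=bool)
    # distance map d(., Omega): for each pixel, distance to the nearest pixel
    # on the other side of the text boundary (an assumed way of measuring the
    # distance to the contour)
    dist = np.where(mask, distance_transform_edt(mask), distance_transform_edt(~mask))

    y, x = q
    r = w // 2
    d_block = dist[y - r:y + r + 1, x - r:x + r + 1]
    # equation 5 (as reconstructed): alpha_{q'} = a ** (d(q, Omega) - d(q', Omega))
    return a ** (dist[y, x] - d_block)
```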
To ensure that the effect texture is transferred sufficiently, deep propagation based on simulated annealing is introduced. The initial temperature and the termination temperature are set first, and the nearest-neighbor matching relation is initialized according to the positional correlation between the material and the target. During PatchMatch neighborhood propagation, the objective value E' of the candidate solution NN' is computed; if it is smaller than the objective value E of the current solution, i.e. ΔE = E' - E and ΔE < 0, the candidate replaces the current solution and the current solution is updated; otherwise the current temperature $T_k$ is computed:

$$T_k = T_0 \left(\frac{T_f}{T_0}\right)^{k_{cur}/k_{total}} \qquad (6)$$

where $T_0$ is the initial temperature; $T_f$ is the termination temperature; $k_{cur}$ is the current iteration number; $k_{total}$ is the total number of iterations; the temperature decreases as the number of iterations increases.
The acceptance probability prop, which decreases as the temperature decreases, is then computed:

$$prop = \exp\!\left(-\frac{\Delta E}{T_k}\right) \qquad (7)$$

A probability value ξ is generated uniformly at random between 0 and 1; if prop > ξ, the candidate solution is accepted and the current solution is updated. Since the acceptance probability decreases with the temperature, new solutions are accepted more easily in early iterations, letting the texture propagate deeply, while in later iterations only solutions that strictly decrease the energy are accepted. The candidate selection and update process is iterated until no pixel can be updated.
Target style map reconstruction: the target style map is rebuilt from the material style map and the matching results of the pixels in the target style map; the RGB value of each pixel in the target style map is the weighted average, at that position, of the best-matching blocks of all image blocks covering the pixel, with the weights given by the weight map.
After the nearest-neighbor matching is completed, the matching relation is applied to every frame of the material style maps with the target style map reconstruction method, yielding the target style animation.
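The reconstruction step and the frame-by-frame application of the single matching relation can be sketched as follows; the layout of the nearest-neighbor field nnf and the optional weight_fn callback are assumptions of this sketch.

```python
import numpy as np

def reconstruct_animation(nnf, material_frames, weight_fn=None, w=5):
    """Sketch of the reconstruction step and of applying one matching relation
    to every frame: each pixel of every target frame is the weighted average of
    the material pixels contributed by all w x w best-match blocks covering it.
    nnf[y, x] is assumed to hold the material (row, col) matched to the target
    block whose top-left corner is (y, x); weight_fn(y, x), if given, returns
    that block's w x w weight map (uniform weights otherwise)."""
    frames = np.asarray(material_frames, dtype=np.float64)    # (m, H, W, 3)
    m = frames.shape[0]
    out_h, out_w = nnf.shape[0] + w - 1, nnf.shape[1] + w - 1
    target = np.zeros((m, out_h, out_w, 3))
    acc = np.zeros((out_h, out_w, 1))

    for y in range(nnf.shape[0]):
        for x in range(nnf.shape[1]):
            my, mx = nnf[y, x]
            wmap = (np.ones((w, w)) if weight_fn is None else weight_fn(y, x))[..., None]
            acc[y:y + w, x:x + w] += wmap
            # the single matching relation is applied to every material frame,
            # so one nearest-neighbor field yields the whole target animation
            for t in range(m):
                target[t, y:y + w, x:x + w] += wmap * frames[t, my:my + w, mx:mx + w]
    return target / np.maximum(acc, 1e-8)
```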
FIG. 6 shows example results of dynamic special effect migration for different dynamic effect styles and different fonts according to the embodiment of the present invention; the synthesized result is a special effect word video, and the figure shows some representative frames. The results show that the method effectively preserves the style texture of the material special effect word, keeps a high consistency of effect style, produces a natural and smooth dynamic effect, and solves the temporal problems of flicker and jitter.
It is noted that the disclosed embodiments are intended to aid in further understanding of the invention, but those skilled in the art will appreciate that: various substitutions and modifications are possible without departing from the spirit and scope of the invention and appended claims. Therefore, the invention should not be limited to the embodiments disclosed, but the scope of the invention is defined by the appended claims.

Claims (9)

1. An automatic transfer method of dynamic text special effects is characterized in that:
inputting a group of samples comprising a material text image, a material dynamic special effect word or material style animation, and a target text image; the material style animation comprises multiple frames of material style maps, the position and size of the characters in the material style animation remain unchanged, and the dynamic style applied to the characters changes freely; the material text image is the semantic mask of the characters in the material style animation, and the two are content-aligned; the target text map contains the text content to be stylized;
according to the dynamic style in the material style animation and the text content in the target text image, fully transferring the text special effect of the material style animation to the target text through key frame extraction, spatio-temporal coherence construction and texture synthesis based on an improved PatchMatch, so that the overall and detailed texture style is preserved and a target dynamic special effect word/target style animation with a vivid, natural and smooth dynamic effect is generated automatically; the target style animation is composed of multiple frames of target style maps;
the method comprises the following steps:
A. inputting a group of samples, wherein the samples comprise a material text image, material dynamic special effect words and a target text image, and the material dynamic special effect words are material style animations;
B. extracting the key frames of the material style animation:
extracting the special effect particles from each frame of material style map of the material style animation, counting the numbers of newborn and dying special effect particles frame by frame, plotting a line graph of the number of particles whose survival state changes against the frame number, and extracting the frame numbers corresponding to peaks that satisfy a minimum interval, i.e. the frames where the effect shape changes most, as key frames; taking the β highest peaks on the line graph, ensuring that the horizontal-axis interval between any two peaks is larger than θ, and taking the frame numbers corresponding to the peaks as key frame numbers; β is the number of key frames; θ is the minimum interval between the key frame numbers;
obtaining several key frames from each material style animation, each key frame corresponding to one material style map;
C. constructing a spatio-temporal coherence term: on each key frame, computing the L2 distance in RGB space between the image blocks of the corresponding material style map and target style map, and adding the distances over all key frames to obtain the distance between spatio-temporal blocks, i.e. the spatio-temporal coherence term;
D. building an objective function consisting of the spatio-temporal coherence term obtained in step C and a text guidance term; the text guidance term is obtained by computing the L2 norm in RGB space between the image blocks of the material text image and the image blocks of the target text image;
E. using the improved PatchMatch algorithm to perform nearest-neighbor matching between the material spatio-temporal blocks of the material style animation and the target spatio-temporal blocks of the target style animation to obtain a matching relation, and applying the matching relation to the corresponding material style maps to obtain the target style animation; a spatio-temporal block is formed by the image blocks on the key frames;
the improvement comprises the following operations:
E1. iteratively optimizing a target function by adopting a PatchMatch algorithm and utilizing a maximum expectation value algorithm, and alternately executing nearest neighbor search operation and target style sheet reconstruction operation until convergence;
E2. the nearest-neighbor search comprises two steps: random search and direction-guided deep propagation; random search tries to obtain a better matching relation by random matching; direction-guided deep propagation realizes a directionally guided and deep propagation process by introducing a weight map and a simulated annealing algorithm;
E21. first computing the shortest distance from each pixel of the target text map to the text edge contour to generate a distance map; obtaining, from the distance map, the weight of each pixel within each image block of the target text map, recorded as a weight map; the weight map guides the texture to be synthesized outwards from the text contour edge by controlling the per-pixel weights when computing the similarity of image blocks and when reconstructing the target style map;
E22. the depth propagation method based on the simulated annealing algorithm executes the following operations:
firstly, setting an initial temperature and a termination temperature, and initializing a nearest neighbor matching result;
neighborhood propagation for PatchMatch: calculating a target equation value corresponding to the candidate solution, and if the value is smaller than the target equation value under the current solution, replacing the current solution with the candidate solution, and updating the current solution; otherwise, calculating the acceptance probability to accept the candidate solution according to the acceptance probability; the acceptance probability is reduced along with the reduction of the temperature, and the temperature is reduced along with the increase of the iteration times;
by the adoption of the dynamic probability acceptance mode, new solutions are more easily accepted in initial iteration, and the texture is deeply propagated;
E3. target style graph reconstruction: and rebuilding the target style chart according to the corresponding matching results of the pixel points in the material style chart and the target style chart, wherein the RGB value of each pixel point in the target style chart is the weighted average value of the optimal matching block corresponding to all the image blocks covering the pixel point at the point, and the weighted value is obtained from the weight map.
2. The method of claim 1, wherein the step B of extracting key frames from the material-style animation comprises the steps of:
B1. extracting special effect particles from each frame of material style image in the material style animation;
B2. starting from the second material style map, counting the particles whose survival state changes, namely newborn particles which are not special effect particles in the previous frame but are in the current frame and dying particles which are special effect particles in the previous frame but not in the current frame, and adding the two counts to obtain the number of changed particles;
B3. plotting a line graph of the number of changed particles of all frames, with the frame number on the horizontal axis and the number of changed particles on the vertical axis; taking the β highest peaks on the line graph, ensuring that the horizontal-axis interval between any two peaks is larger than θ, the key frame numbers corresponding to the peaks being expressed as equation 2:

$$kf = \left\{ k \;\middle|\; n_k > n_{k+\mu} \text{ and } n_k > n_{k-\mu},\ \forall \mu \in [1, \theta] \right\}, \qquad |kf| = \beta \qquad (2)$$

wherein kf is the set of extracted key frame numbers; k is a key frame number; θ is the minimum interval between key frames; μ is any value in [1, θ]; $n_k$ is the number of changed particles of frame k; β is the number of key frames.
3. The automatic migration method of dynamic text special effects as claimed in claim 2, wherein the number β of key frames is one tenth of the total number of frames of the material style animation, and the minimum interval θ between key frames may be 5.
4. The automatic migration method of dynamic text special effects as claimed in claim 1, wherein if the background of the material style animation is a solid color, the special effect particles are extracted by removing the solid-color background according to the RGB value of each pixel, the remaining pixels being the special effect particles.
5. The method of claim 1, wherein the key frames are extracted by uniform sampling if the background of the material-style animation is not a solid color.
6. The automatic migration method of dynamic text special effects as claimed in claim 1, wherein in step C the spatio-temporal coherence term is obtained by computing, as equation 3, the sum over the key frames of the L2 distances in RGB space between the image blocks of the corresponding material style map and target style map:

$$E_{st} = \sum_{t \in kf} D\!\left(P^{t}_{sty},\, Q^{t}_{sty}\right) \qquad (3)$$

wherein $E_{st}$ is the energy value of the spatio-temporal coherence term; $P^{t}_{sty}$ is an image block of the material style map of frame t; $Q^{t}_{sty}$ is an image block of the target style map of frame t; D denotes the L2 distance between image blocks in RGB space; t is a key frame number; kf is the key frame set.
7. The automatic migration method of dynamic text special effects as claimed in claim 6, wherein in step D the text guidance term is obtained by computing the L2 norm in RGB space between the image blocks of the material text image and the image blocks of the target text image, according to equation 4:

$$E = \sum_{t \in kf} D\!\left(P^{t}_{sty},\, Q^{t}_{sty}\right) + D\!\left(P_{text},\, Q_{text}\right) \qquad (4)$$

wherein P denotes an image block of the material text map $S_{text}$ or of a material style map $S_{sty}$; Q denotes an image block of the target text map $T_{text}$ or of a target style map $T_{sty}$; $P_{text}$ is an image block of the material text image and $Q_{text}$ an image block of the target text image, the distance between the two constituting the text guidance term.
8. The automatic migration method of dynamic text special effects as claimed in claim 1, wherein the weight map of step E1 is obtained by extracting any w × w image block from the distance map, the block containing w × w distance values, and computing a weight for each distance value with equation 5:

$$\alpha_{q'} = a^{\,d(q,\,\Omega) \,-\, d(q',\,\Omega)} \qquad (5)$$

wherein $\alpha_{q'}$ is the weight of pixel q', and q' is any pixel of the image block $N_q$ centered at pixel q; a is the base of the exponent, set to 2; $d(q', \Omega)$ is the perpendicular distance from pixel q' to the text edge contour Ω, i.e. the distance value at q'; likewise, $d(q, \Omega)$ is the distance value at pixel q.
9. The automatic migration method of dynamic text special effects as claimed in claim 1, wherein in step E2, during PatchMatch neighborhood propagation, the objective equation value E' corresponding to the candidate solution NN' is calculated, and if E' is smaller than the objective equation value E of the current solution, that is, ΔE = E' - E and ΔE < 0, the current solution is replaced with the candidate solution and the current solution is updated;
otherwise the current temperature $T_k$ is calculated with equation 6:

$$T_k = T_0 \left(\frac{T_f}{T_0}\right)^{k_{cur}/k_{total}} \qquad (6)$$

wherein $T_0$ is the initial temperature; $T_f$ is the termination temperature; $k_{cur}$ is the current iteration number; $k_{total}$ is the total number of iterations; the temperature decreases as the number of iterations increases;
the acceptance probability prop is calculated with equation 7:

$$prop = \exp\!\left(-\frac{\Delta E}{T_k}\right) \qquad (7)$$

the acceptance probability decreases as the temperature decreases;
and a probability value ξ is randomly generated between 0 and 1, and if prop > ξ, the candidate solution is accepted and the current solution is updated.
CN201810796815.3A 2018-07-19 2018-07-19 Automatic migration method of dynamic text special effect based on sample Active CN110738715B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810796815.3A CN110738715B (en) 2018-07-19 2018-07-19 Automatic migration method of dynamic text special effect based on sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810796815.3A CN110738715B (en) 2018-07-19 2018-07-19 Automatic migration method of dynamic text special effect based on sample

Publications (2)

Publication Number Publication Date
CN110738715A CN110738715A (en) 2020-01-31
CN110738715B true CN110738715B (en) 2021-07-09

Family

ID=69233921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810796815.3A Active CN110738715B (en) 2018-07-19 2018-07-19 Automatic migration method of dynamic text special effect based on sample

Country Status (1)

Country Link
CN (1) CN110738715B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883806B (en) * 2021-01-21 2024-03-22 杭州广电云网络科技有限公司 Video style migration method and device based on neural network, computer equipment and storage medium
CN113421214A (en) * 2021-07-15 2021-09-21 北京小米移动软件有限公司 Special effect character generation method and device, storage medium and electronic equipment
CN115082600A (en) * 2022-06-02 2022-09-20 网易(杭州)网络有限公司 Animation production method, animation production device, computer equipment and computer readable storage medium


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542593A (en) * 2011-09-30 2012-07-04 中山大学 Interactive video stylized rendering method based on video interpretation
KR20140129994A (en) * 2013-04-29 2014-11-07 중앙대학교 산학협력단 Apparatus and method for texture transfer for video animation
CN105138538A (en) * 2015-07-08 2015-12-09 清华大学 Cross-domain knowledge discovery-oriented topic mining method
CN105025201A (en) * 2015-07-29 2015-11-04 武汉大学 Space-time continuum video background repair method
WO2017075783A1 (en) * 2015-11-05 2017-05-11 Panasonic Intellectual Property Corporation Of America Wireless device and wireless communication method
US9576351B1 (en) * 2015-11-19 2017-02-21 Adobe Systems Incorporated Style transfer for headshot portraits
CN106202352A (en) * 2016-07-05 2016-12-07 华南理工大学 The method that indoor furniture style based on Bayesian network designs with colour match
CN107644006A (en) * 2017-09-29 2018-01-30 北京大学 A kind of Chinese script character library automatic generation method based on deep neural network
CN108198130A (en) * 2017-12-28 2018-06-22 广东欧珀移动通信有限公司 Image processing method, device, storage medium and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"A Common Framework for Interactive Texture Transfer";Yifang Men等;《2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition》;20180623;第6353-6362页 *
"ARBITRARY STYLE TRANSFER IN REAL-TIME WITH ADAPTIVE INSTANCE NORMALIZATION";Xun Huang等;《2017 IEEE International Conference on Computer Vision》;20171029;第1-5页 *
"结合深度自编码和时空特征约束的运动风格转移方法";胡东等;《计算机辅助设计与图形学学报》;20180531;第30卷(第5期);第946-956页 *

Also Published As

Publication number Publication date
CN110738715A (en) 2020-01-31

Similar Documents

Publication Publication Date Title
Jiang et al. Scfont: Structure-guided chinese font generation via deep stacked networks
CN108830913B (en) Semantic level line draft coloring method based on user color guidance
Natsume et al. Rsgan: face swapping and editing using face and hair representation in latent spaces
CN107578436B (en) Monocular image depth estimation method based on full convolution neural network FCN
CN110738715B (en) Automatic migration method of dynamic text special effect based on sample
Cao et al. Dreamavatar: Text-and-shape guided 3d human avatar generation via diffusion models
US9076258B2 (en) Stylizing animation by example
EP4293567A1 (en) Three-dimensional face reconstruction method and apparatus, device, and storage medium
CN105096326A (en) Laplace cutout matrix method by using moving least square method
Ren et al. Two-stage sketch colorization with color parsing
CN110211223A (en) A kind of increment type multiview three-dimensional method for reconstructing
CN110288667B (en) Image texture migration method based on structure guidance
CN113255813A (en) Multi-style image generation method based on feature fusion
CN114743027B (en) Weak supervision learning-guided cooperative significance detection method
He et al. Diff-font: Diffusion model for robust one-shot font generation
Wei et al. A three-stage GAN model based on edge and color prediction for image outpainting
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
Chen et al. Comboverse: Compositional 3d assets creation using spatially-aware diffusion guidance
CN116563443A (en) Shoe appearance design and user customization system based on 3D generation countermeasure network
CN111862253B (en) Sketch coloring method and system for generating countermeasure network based on deep convolution
Wu et al. HIGSA: Human image generation with self-attention
Liu et al. Palette-based recoloring of natural images under different illumination
Zhang et al. Animation Costume Style Migration Based on CycleGAN
Cao et al. AnimeDiffusion: Anime Diffusion Colorization
Li et al. Video vectorization via bipartite diffusion curves propagation and optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant