CN111899322B - Video processing method, animation rendering SDK, equipment and computer storage medium - Google Patents


Info

Publication number
CN111899322B
CN111899322B CN202010606426.7A CN202010606426A
Authority
CN
China
Prior art keywords
video
animation
frame
rendering
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010606426.7A
Other languages
Chinese (zh)
Other versions
CN111899322A (en)
Inventor
齐国鹏
吕鹏伟
陈仁健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010606426.7A priority Critical patent/CN111899322B/en
Publication of CN111899322A publication Critical patent/CN111899322A/en
Application granted granted Critical
Publication of CN111899322B publication Critical patent/CN111899322B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/80 2D [Two Dimensional] animation, e.g. using sprites

Abstract

The application discloses a video processing method, an animation rendering SDK, equipment and a computer storage medium, relates to the field of computer technology and in particular to image processing, and is intended to let the animation rendering SDK carry out the whole animation rendering process and thereby reduce the complexity of adding animation special effects. The method comprises the following steps: acquiring a video file and an animation file; determining, according to time information of a video layer included in the animation file and time information of a video clip, a mapping relationship between the animation segment corresponding to the video layer and the video clip, the video layer being used to carry the image content of the video frames included in the video clip; determining, according to the mapping relationship, the video frame corresponding to each animation frame from the video clip, and performing animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames; and performing video synthesis based on the plurality of target video frames to obtain a target video file with the animation special effect added.

Description

Video processing method, animation rendering SDK, equipment and computer storage medium
Technical Field
The application relates to the technical field of computers, in particular to the technical field of image processing, and provides a video processing method, an animation rendering SDK, equipment and a computer storage medium.
Background
In recent years, short videos have become a widely used and widely spread carrier of information, and when producing short videos, creators often like to add animation special effects to make the short video content more appealing.
At present there are various schemes for adding animation special effects to short videos, such as the Lottie scheme open-sourced by Airbnb and the PAG scheme. These schemes all open up a workflow in which an animation designed in AE (Adobe After Effects) is presented on a mobile terminal: a designer designs the animation in AE, exports an animation file through an export plug-in, and the file is loaded and rendered on the mobile terminal through a software development kit (Software Development Kit, SDK), so that the animation special effect is added. However, in all current schemes the sticker animation is usually used as a template; in actual use, whenever a frame of the special effect needs to be rendered, the corresponding video picture is passed into the sticker animation template, and the scheduling logic of the whole timeline in this implementation is realized by controlling the SDK externally, which increases the complexity of adding animation special effects and reduces flexibility.
Disclosure of Invention
The embodiment of the application provides a video processing method, an animation rendering SDK, equipment and a computer storage medium, which are used for realizing the whole process of animation rendering through the animation rendering SDK and reducing the complexity of adding animation special effects.
In one aspect, a video processing method is provided, the method comprising:
acquiring a video file and an animation file for adding a special effect to a video clip specified in the video file;
determining, according to time information of a video layer included in the animation file and time information of the video clip, a mapping relationship between the animation segment corresponding to the video layer and the video clip; the video layer is used to carry the image content of the video frames included in the video clip;
according to the mapping relation, determining video frames corresponding to each animation frame from the video clips, and performing animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames;
and carrying out video synthesis based on the target video frames to obtain the target video file added with the animation special effects.
In one aspect, there is provided an animation rendering SDK, the animation rendering SDK comprising:
an acquisition unit configured to acquire a video file and an animation file for adding special effects to video clips specified in the video file;
the determining unit is used for determining, according to time information of a video layer included in the animation file and time information of the video clip, the mapping relationship between the animation segment corresponding to the video layer and the video clip; the video layer is used to carry the image content of the video frames included in the video clip;
The animation rendering unit is used for determining video frames corresponding to each animation frame from the video clips according to the mapping relation, and performing animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames;
and the video synthesis unit is used for carrying out video synthesis based on the target video frames to obtain a target video file added with the animation special effects.
Optionally, the animation rendering unit is further configured to:
converting the data format of the frame data of the video frame corresponding to any animation frame to obtain the frame data in a renderable data format;
and performing animation rendering processing according to the frame data of any animation frame and the frame data of the renderable data format of the corresponding video frame to obtain a target video frame corresponding to the any animation frame.
Optionally, the animation rendering unit is configured to:
when the current system is determined to support hardware decoding, a system decoding interface is called to decode the video frame corresponding to any animation frame in a hardware decoding mode, so that frame data of the video frame are obtained; or,
and when the current system is determined not to support hardware decoding, calling a decoding interface built in the SDK to decode the video frame corresponding to any animation frame in a software decoding mode, so as to obtain frame data of the video frame.
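As an illustration of this hardware/software fallback, a minimal C++ sketch is given below; the class and function names are assumptions for illustration only and do not reflect the SDK's actual interface.

```cpp
#include <cstdint>
#include <memory>

// Placeholder for decoded frame data (pixel bytes, format, width, height, ...).
struct FrameData {};

class VideoDecoder {
public:
    virtual ~VideoDecoder() = default;
    virtual bool decodeFrame(int64_t videoTimeUs, FrameData* out) = 0;
};

// Would wrap a platform decoder such as MediaCodec (Android) or VideoToolbox (iOS/macOS).
class SystemHardwareDecoder : public VideoDecoder {
public:
    bool decodeFrame(int64_t, FrameData*) override { return true; }  // stub
};

// Would wrap the software decoder built into the SDK (for example an FFmpeg-based path).
class BuiltinSoftwareDecoder : public VideoDecoder {
public:
    bool decodeFrame(int64_t, FrameData*) override { return true; }  // stub
};

std::unique_ptr<VideoDecoder> createDecoder(bool hardwareDecodingSupported) {
    if (hardwareDecodingSupported) {
        // Prefer the system decoding interface so decoding runs on dedicated hardware.
        return std::make_unique<SystemHardwareDecoder>();
    }
    // Otherwise fall back to the decoding interface built into the SDK (software decoding).
    return std::make_unique<BuiltinSoftwareDecoder>();
}
```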
Optionally, the animation rendering unit is configured to:
invoking a video unpacker built in the SDK to unpack the data packet corresponding to the video clip in the video file to obtain image track data of the video clip;
and according to the image track data of the video clips, carrying out animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames.
In one aspect, a computer device is provided comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods described above when the computer program is executed.
In one aspect, there is provided a computer storage medium having stored thereon computer program instructions which, when executed by a processor, perform the steps of any of the methods described above.
In the embodiment of the application, when an animation special effect is added to a video clip, the mapping relationship between animation frames and video frames can be obtained from the time information of the video layer and the time information of the video clip. During animation rendering, the video frame corresponding to each animation frame is determined and rendered frame by frame, and the target video file with the animation special effect added is then obtained by video synthesis of the target video frames. The whole rendering process, from the original video file to the target video file with the special effect added, can therefore be carried out by the animation rendering SDK; the SDK does not need to communicate and interact with the outside or accept external scheduling control while rendering, which reduces the complexity of the process of adding animation special effects. In addition, during the adding process only the video frames that actually need animation rendering have to be obtained, and the other video frames do not need to be processed, which improves the processing speed of adding the animation special effect.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required by the embodiments or by the description of the prior art are briefly introduced below. It is apparent that the drawings in the following description show only embodiments of the present application, and other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic view of a scene provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of another scenario provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a structure of an animation rendering SDK according to an embodiment of the present application;
fig. 4 is a schematic flow chart of a video processing method according to an embodiment of the present application;
fig. 5 is a schematic diagram of a mapping relationship when the video layer and the video clip provided in the embodiment of the present application are equal in time length;
fig. 6 is a schematic diagram of another mapping relationship when the video layer and the video segment provided in the embodiment of the present application are equal in time length;
fig. 7 is a schematic diagram of a mapping relationship when video layers and video segments provided in an embodiment of the present application are not equal in duration;
fig. 8 is another schematic diagram of a mapping relationship when video layers and video segments provided in an embodiment of the present application are not equal in duration;
FIG. 9 is a schematic diagram of determining a video frame corresponding to an animation frame according to an embodiment of the present application;
FIG. 10 is another schematic diagram of determining the video frame corresponding to an animation frame according to an embodiment of the present application;
FIG. 11 is a schematic flowchart of determining a corresponding video frame and performing animation rendering according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a rendering result of adding special effects for a three-way event of a game video according to an embodiment of the present application;
FIG. 13 is a schematic diagram of another configuration of an animation rendering SDK according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application. Embodiments of the application and features of the embodiments may be combined with one another arbitrarily without conflict. Also, while a logical order is depicted in the flowchart, in some cases, the steps depicted or described may be performed in a different order than presented herein.
In order to facilitate understanding of the technical solution provided by the embodiments of the present application, some key terms used in the embodiments of the present application are explained here:
AE: adobe After Effect, a powerful graphic video processing software.
Animation file: after the designer designs the animation through the graphic video processing software, the file in the animation format is exported through the export plug-in. For example, after the designer designs the animation by AE, the AE engineering file is exported as a file in animation format.
PAG: the name of a sticker-animation implementation scheme. An animation file obtained with the PAG scheme is in the PAG format; a PAG animation file adopts a dynamic bit storage technique with an extremely high compression rate, and resources such as pictures, sound and video can be integrated directly into a single file. In the PAG-based sticker animation scheme, a designer designs the animation special effect required by a product with AE software; the PAGExporter export plug-in of the PAG scheme then reads the animation feature data from the AE project file, and one of three export modes, vector export, bitmap sequence frame export or video sequence frame export, can be selected according to the specific requirements to export a binary file in the PAG format. For display, the client decodes the PAG binary file through the PAG SDK, renders it through a rendering module, and displays it on the Android platform, the iOS platform or the web side. Compared with the Lottie animation scheme open-sourced by Airbnb, development cost is greatly reduced; for the same animation content the vector-exported file is much smaller, richer animation special effects are supported, multi-level buffered drawing, text editing and picture content replacement are supported during rendering, and the bitmap sequence frame and video sequence frame modes support all AE effects and are therefore more capable. The bitmap sequence frame mode is mainly used on the web side, while the video sequence frame mode is aimed at mobile terminals.
In the PAG data structure, one PAGFile contains a plurality of compositions (Compositions), and the compositions can be divided into three types: VectorComposition, BitmapComposition and VideoComposition, which correspond to the three export modes of a PAG file: vector export, bitmap sequence frame export, and video sequence frame export.
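A rough sketch of these three composition types is given below; the field names are illustrative approximations of the attributes described in the following paragraphs and are not the actual PAG format definitions.

```cpp
#include <cstdint>
#include <vector>

struct Layer;  // vector layers restored from the AE project (text, shape, image, ...)

// Approximation of a vector-exported composition: basic attributes plus the layer tree.
struct VectorComposition {
    uint32_t id = 0;
    int width = 0, height = 0;
    int64_t duration = 0;
    float frameRate = 0.f;
    uint32_t backgroundColor = 0;
    std::vector<Layer*> layers;            // one-to-one with the AE layer structure
};

// One frame of a bitmap sequence: key-frame flag, position, and encoded bytes.
struct BitmapFrame {
    bool isKeyFrame = false;
    int x = 0, y = 0;
    std::vector<uint8_t> byteData;
};

struct BitmapComposition {
    int width = 0, height = 0;
    float frameRate = 0.f;
    std::vector<BitmapFrame> sequence;     // one captured picture per animation frame
};

// Video sequence frames: the captured pictures compressed in a video format.
struct VideoComposition {
    bool hasAlpha = false;                 // whether an alpha channel is carried
    int width = 0, height = 0;
    float frameRate = 0.f;
    std::vector<std::vector<uint8_t>> sequence;  // video-compressed frame data
};
```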
Vector export is a restoration of the AE animation layer structure. In the concrete representation of a VectorComposition, composition attributes and layer attributes may be included: the composition attributes describe the basic properties of the composition, such as ID, width, height, duration, frame rate and background color, while the layer information included in the layer attributes corresponds one to one with the AE layer structure, for example virtual object layers (Null Object), solid color layers (SolidLayer), text layers (TextLayer), shape layers (ShapeLayer), picture layers (ImageLayer) and pre-compose layers (PreComposeLayer) are supported. A solid color layer, for example, contains the width, height and color of the layer as well as common layer attributes such as the duration, start time, stretch parameter, mask (Mask), layer effects (Effects) and transform (Transform) of the layer, where the transform describes properties such as the anchor point, position, scale, rotation and opacity of the layer.
Bitmap sequence frame export cuts each frame of the AE animation into a picture. In the concrete representation of a BitmapComposition, composition attributes and a bitmap sequence array are contained; the bitmap sequence contains the width, height and frame rate of the sequence and, for each bitmap, whether it is a key frame (isKeyFrame), the position information (x, y) of the bitmap, and the binary data (ByteData) of the bitmap.
Video sequence frame export builds on the bitmap sequence frame: the captured pictures are compressed in a video format. In the concrete representation of a VideoComposition, composition attributes, whether an alpha channel is carried (hasAlpha) and a video sequence array are contained; the video sequence contains the width and height of the bitmap, the frame rate, the key frames and the binary data of the bitmap.
Animation frame: used to represent a frame in an animation file. When the animation file is exported, the relevant animation data is compressed in a certain encoding mode to obtain the animation file; correspondingly, when the animation file needs to be used, a corresponding decoding operation has to be performed, and each animation frame can be understood as the description information included in that frame. Taking the PAG sticker animation as an example, the decoding function deserializes the PAG binary file into data objects that the client can operate on, and the decoded data structure is the PAG data structure.
Video frame: for representing a frame in a video file.
Video layer: a video layer is essentially a picture layer. The difference is that an ordinary picture layer only needs its picture replaced once, after which it takes effect for the whole rendering process, whereas a video layer has to replace its picture information frame by frame during rendering. Adding an animation special effect to a video can also be understood as adding video content into the sticker animation, so it must be known at which time points and at which positions the video content is to be added; the video layer serves exactly this purpose, i.e. the video layer is the layer in the animation file that carries the image content of the video file.
Visible time interval: a video layer usually has a duration, and within that duration there may be a further sub-interval in which the layer is actually displayed; this sub-interval is the visible time interval of the video layer. For example, if the total duration of the sticker animation is 10 s and the video layer exists in the interval from 3 s to 10 s, the visible time interval of the video layer may be 5 s to 10 s.
Video de-container: a video de-container is used to de-encapsulate video files, which is also called demultiplexing (demux). Each video file contains multiple tracks: the image track is the picture people see, the audio track is the sound people hear, and the subtitle track is the displayed subtitles. Because a video file has to be transmitted, the resources of the different tracks are gathered together and then delivered to the destination through a network or in other ways; combining the resources of multiple tracks into one container is the process of multiplexing (mux). Therefore, after a video file is acquired, it has to be demultiplexed first, i.e. the multiple tracks in the video file are separated again.
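Since FFmpeg (introduced below) is one common way to implement such a de-container, a minimal demultiplexing sketch using FFmpeg's libavformat follows; error handling is trimmed and the function name is illustrative.

```cpp
extern "C" {
#include <libavformat/avformat.h>
}

// Open a video file, locate its image (video) track, and read that track's packets.
int demuxVideoTrack(const char* path) {
    AVFormatContext* fmt = nullptr;
    if (avformat_open_input(&fmt, path, nullptr, nullptr) < 0) return -1;
    avformat_find_stream_info(fmt, nullptr);

    // Pick the best video stream; audio and subtitle tracks are ignored here.
    int videoStream = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);

    AVPacket* pkt = av_packet_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {
        if (pkt->stream_index == videoStream) {
            // pkt now holds one encoded packet of image-track data,
            // ready to be handed to the video decoder.
        }
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
    avformat_close_input(&fmt);
    return 0;
}
```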
Video decoder: for transmission, video data is encoded in a certain encoding mode to reduce the size of the data packets, so when the video is actually used, a video decoder is also required to decode the image track data.
FFmpeg (Fast Forward Mpeg): an open source computer program that can be used to record and convert digital audio and video and to turn them into streams. FFmpeg has a built-in video de-container and decoder, so both the video de-container and the video decoder can be implemented with FFmpeg.
YUV: a color encoding method in which "Y" represents luminance, i.e. the gray-scale value, and "U" and "V" represent chrominance, which describes the color and saturation of the image and is used to specify the color of a pixel.
RGBA: color spaces representing Red (Red), green (Green), blue (Blue) and Alpha, i.e., transparency/opacity.
In recent years, short videos have become a widely used and widely spread carrier of information, and when producing short videos, creators often like to add animation special effects to make the short video content more appealing.
In the current sticker animation schemes, the sticker animation is usually used as a template. In actual use, whenever a frame of the special effect needs to be rendered, the corresponding video picture is passed into the sticker animation template, and the scheduling logic of the whole timeline in this implementation is realized by an external driver controlling the SDK; because the external driver has to process the video content of every frame, the complexity of adding animation special effects is increased and flexibility is low.
In the existing implementations the function of the animation rendering SDK is limited to animation rendering itself, while all other control is implemented by an external driver, which leads to high complexity; if the whole process from video input to the final animation rendering could be carried out by the animation rendering SDK alone, the complexity could be reduced to a certain extent. The embodiment of the application therefore provides a video processing method in which the animation rendering SDK obtains the mapping relationship between animation frames and video frames from the time information of the video layer and the time information of the video clip; during animation rendering, the video frame corresponding to each animation frame is determined and rendered frame by frame to obtain the target video frames, and the target video file with the animation special effect added is obtained by video synthesis of the target video frames. The whole rendering process from the original video file to the target video file with the special effect added is thus realized by the animation rendering SDK; the SDK does not need to communicate and interact with the outside or accept external scheduling control while rendering, which reduces the complexity of the process of adding animation special effects. In addition, during the adding process only the video frames that actually need animation rendering have to be obtained, and the other video frames do not need to be processed, which improves the processing speed of adding the animation special effect.
In the embodiment of the application, since in practice the duration of the video layer and the duration of the video clip are not necessarily equal, the decoding progress of the video clip may have to undergo a certain speed change during rendering. When the duration of the video layer and the duration of the video clip are exactly equal, no speed change is needed and the rendering frame rate of the video layer can stay consistent with the decoding frame rate of the video clip; when the two durations are unequal, a speed change is needed, i.e. the mapping relationship is established from the ratio of the duration of the video layer to the duration of the video clip together with the start times of the video layer and the video clip, the corresponding video frames are decoded accordingly, and the decoding progress thereby adapts itself to the rendering progress.
In the embodiment of the application, the visible time interval of the video layer is defined in the animation file, and the content of a video frame is visible only when the video layer exists and the current time lies within the visible time interval of the video layer, so animation rendering only has to be performed within the visible time interval. Therefore, before each animation frame is rendered it can first be judged whether the frame needs animation rendering processing at all; only when it does are the corresponding video frames decoded and processed, otherwise the frame is skipped and the next frame is processed, which reduces the processing workload and improves the processing speed of adding the animation special effect.
In the embodiment of the application, when decoding is performed and the system platform supports hardware decoding, the system decoding interface can be called for decoding, so that the performance of the graphics processing unit (GPU, Graphics Processing Unit) is fully used and the efficiency and time consumption of decoding are improved.
The scheme provided by the embodiment of the application can be applied to most scenarios that require animation rendering. As shown in fig. 1, one scenario to which the scheme can be applied may include a terminal device 101, where the terminal device 101 includes, but is not limited to, electronic devices with the capability of adding sticker animations, such as personal computers (personal computer, PC), mobile phones, mobile computers, tablet computers, media players, intelligent wearable devices, smart televisions, vehicle-mounted devices and personal digital assistants (personal digital assistant, PDA). The terminal device 101 may be provided with the animation rendering SDK according to the embodiment of the present application, and the SDK may be embedded in any video editing software to which sticker animations need to be added, for example small-video production software.
Specifically, the user may select, through the video editing software in the terminal device 101, the video clip to which an animation special effect is to be added and the sticker animation to be added. After the addition is confirmed, relevant information about the video clip and the sticker animation is provided to the animation rendering SDK, for example the start time and duration of the video clip in the original video file and the number of the sticker animation, and the animation rendering SDK can then obtain the target video file with the animation special effect added through the video processing method provided by the embodiment of the present application.
As shown in fig. 2, another scenario to which the scheme provided by the embodiment of the present application can be applied may include terminal devices 201 (terminal device 201-1, terminal device 201-2, ..., terminal device 201-n) and a server 202.
The terminal device 201 and the server 202 may be connected through one or more networks 203, where the network 203 may be a wired network or a wireless network, for example a mobile cellular network or a Wireless Fidelity (WiFi) network, or other possible networks, which is not limited in the embodiments of the present application.
Terminal device 201 includes, but is not limited to, personal computers (personal computer, PCs), mobile phones, mobile computers, tablet computers, media players, smart wearable devices, smart televisions, in-vehicle devices, personal digital assistants (personal digital assistant, PDAs), and like electronic devices.
The server 202 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligence platforms, and the like.
In one possible scenario, the terminal device 201 may have video editing software installed therein or software including video editing functions, for example, a website providing video editing functions may be opened through a browser. Taking video editing software as an example, a user can select a video clip with an animation special effect to be added through the video editing software, select a sticker animation to be added, and upload relevant information of the video clip and the sticker animation to the server 202 after determining to add.
The server 202 may be, for example, a background server of video editing software, and the server 202 is provided with the animation rendering SDK provided by the embodiment of the present application, so that after obtaining relevant information of the video clip and the sticker animation, the animation rendering SDK may obtain the target video file added with the animation special effect through the video processing method provided by the embodiment of the present application.
In another possible scenario, the terminal device 201 is installed with software that can generate a video file while the software is running. For example, the terminal device 201 may be installed with video recording software through which video can be recorded and uploaded to the server 202; or the terminal device 201 may be installed with game software that records the user's game pictures while the game is being played and uploads the recorded game video to the server 202 for storage; or the server 202 records the user's game pictures in the background, so as to obtain a game video file.
After the server 202 obtains the video file in the above manner, it may perform video content analysis on the video file to determine the video clips in the video file to which animation special effects need to be added, so that sticker animation special effects can be added to these video clips by the video processing method provided by the embodiment of the present application. For example, a game video often includes highlight clips corresponding to one or more characteristic events, such as a click event clip; the event information, such as the start time and duration, may then be acquired and provided to the animation rendering SDK, and the animation rendering SDK may select a corresponding sticker animation for the highlight clip according to the event type, obtain the target video file with the animation special effect added by the video processing method provided by the embodiment of the application, and provide it to the terminal device 201 for display, so that the user can review the highlight moments of the game.
Of course, the method provided by the embodiment of the present application is not limited to the application scenario shown in fig. 1 and fig. 2, but may be used in other possible application scenarios, and the embodiment of the present application is not limited. The functions that can be implemented by each device in the application scenario shown in fig. 1 or fig. 2 will be described together in the following method embodiments, which are not described here again.
Before describing the method provided by the embodiment of the present application, the structure of the animation rendering SDK provided by the embodiment of the present application is described, as shown in fig. 3, which is a schematic structural diagram of the animation rendering SDK. Among them, the animation rendering SDK may include a video decoding module 30, a resource management module 31, and an animation rendering module 32.
In particular, the video decoding module may include a video de-container 301, a video decoder 302 and a format converter 303. The video de-container 301 is configured to de-encapsulate a video file to obtain the image track data in the video file. From a cross-platform perspective, the video de-container 301 may be implemented with FFmpeg so as to support de-containering of common video formats on the platforms that need to be supported, such as Android, iOS, macOS, Windows and Linux. In the embodiment of the present application, the video de-container 301 provides a number of data interfaces, such as an accurate seek interface for locating, in the video clip, the progress of the video frame corresponding to the currently rendered animation frame, and a video key frame list acquisition interface for obtaining the key frame list of the video clip.
Generally, video is encoded in a certain encoding format, such as H.264 or HEVC, and therefore needs to be decoded when it is used; the video decoder 302 is used to decode video frames to obtain the frame data of the video frames. The video decoder 302 may be implemented by the decoding module built into FFmpeg, and supports preferentially calling the decoding interface of the system platform; for example, the system decoding interface VideoToolbox may be called on iOS or macOS, and the system decoding interface MediaCodec may be called on Android, so that GPU decoding can be fully used and the efficiency and time consumption of decoding are improved.
In actual rendering, data in RGBA format is required, so the image format also has to be converted by the format converter 303. The format converter 303 may be implemented with the Open Graphics Library (OpenGL); OpenGL can make full use of the GPU to perform the format conversion and improve conversion efficiency. In addition, on a system platform without a GPU, such as a Linux platform, the central processing unit (central processing unit, CPU) may also be used to perform the conversion.
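As an illustration of what the format converter computes, the following sketch shows a standard BT.601 full-range YUV-to-RGBA conversion for a single pixel on the CPU; the coefficients are the usual ones rather than anything specified by the SDK, which would normally run the equivalent arithmetic in an OpenGL shader.

```cpp
#include <algorithm>
#include <cstdint>

// Convert one BT.601 full-range YUV pixel to RGBA, with alpha fixed at opaque.
inline void yuvToRgba(uint8_t y, uint8_t u, uint8_t v, uint8_t rgba[4]) {
    auto clamp = [](float x) { return static_cast<uint8_t>(std::min(255.f, std::max(0.f, x))); };
    rgba[0] = clamp(y + 1.402f * (v - 128));                      // R
    rgba[1] = clamp(y - 0.344f * (u - 128) - 0.714f * (v - 128)); // G
    rgba[2] = clamp(y + 1.772f * (u - 128));                      // B
    rgba[3] = 255;                                                // A (opaque)
}
```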
The resource management module 31 may include a storage path management sub-module 311, a time information management sub-module 312 and a transformation information management sub-module 313. The resource management module 31 may be used to set the relevant information of the input video file: the storage path management sub-module 311 sets the storage path of the input video file, the time information management sub-module 312 sets the time information of the video clip in the input video file, and the transformation information management sub-module 313 sets the size transformation information of the video, and so on.
The animation rendering module 32 is used to execute the animation rendering process, and it depends on the video decoding module 30 and the resource management module 31: for example, when the animation rendering module 32 performs animation rendering it needs the frame data of a specific time or a specific video frame, the resource management module 31 has to provide information such as the storage path, start time and duration of the video, and the video decoding module 30 has to de-containerize, decode and format-convert the video frame.
Referring to fig. 4, a schematic flowchart of a video processing method according to an embodiment of the present application is shown. The method may be executed by the terminal device 101 in fig. 1 or the server 202 in fig. 2, and the flow of the method is described below.
Step 401: and acquiring a video file and an animation file for adding special effects to the video clips appointed in the video file.
In the embodiment of the application, before special effects are added, a video file with special effects to be added needs to be acquired, and the special effects to be added for the video file need to be known, so that an animation file to be added needs to be acquired.
In actual application, if the user adds a special effect to a video through video editing software, the user may select the video to which the special effect is to be added and select the sticker animation to be added; after the addition is confirmed, the animation rendering SDK in the terminal device or in the server correspondingly acquires the video file to which the special effect is to be added and the animation file to be added. For example, in the PAG sticker animation scheme, the animation rendering SDK may be the PAG SDK used for presenting sticker animation effects, and the PAG SDK applicable to each platform may be adopted depending on the platform.
The animation rendering SDK may acquire storage path information and time information of the video file, and may acquire the video file to be added with the special effect based on the storage path information, and determine a video clip to be added with the special effect in the video file according to the time information. The storage path information and the time information of the video file can be specifically set and managed by the storage path management sub-module 311 and the time information management sub-module 312 in the resource management module 31 shown in fig. 3.
The time information may include a start time and a duration, where the start time refers to a start position of a video segment to which a special effect is to be added in the complete video, and the duration is a duration of the video segment to which the special effect is to be added. For example, after the user selects the video to be added with the special effect, the storage path and time information of the video can be provided to the animation rendering SDK, wherein the time information is 5s at the starting time and 10s in the duration, and then the animation rendering SDK can know that the special effect is added to the video segments with the duration of 10s from 5s in the video of the given storage path. Of course, the time information may also take other forms, for example, the time information may be a start time and an end time, that is, a start position and an end position of the video segment in the complete video, which is not limited in the embodiment of the present application.
The animation rendering SDK may acquire the animation file directly, or may acquire a storage path of the animation file and then acquire the animation file based on the storage path. The animation file in the storage path may be downloaded in advance, for example pre-stored when the video editing software is installed or downloaded while the video editing software is being used; alternatively, the animation file in the storage path may be downloaded after the user confirms the selection; alternatively, the animation file may be stored in a database, and the animation rendering SDK may then obtain the animation file based on the identifier of the animation file. The storage path information and time information of the animation file may also be set and managed by the storage path management sub-module 311 and the time information management sub-module 312 in the resource management module 31 shown in fig. 3.
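As an illustration of the inputs discussed in this step, the following sketch shows one possible shape of the information handed to the rendering SDK; the type, member and function names are assumptions and not the SDK's published interface.

```cpp
#include <cstdint>
#include <string>

// Description of the video clip to which the special effect is added.
struct VideoClipInfo {
    std::string filePath;     // storage path of the source video file
    int64_t startTimeUs = 0;  // start of the clip inside the full video (microseconds)
    int64_t durationUs = 0;   // length of the clip to which the effect is added
};

// Description of the animation (sticker) to be added.
struct AnimationInfo {
    std::string animationFilePath;  // path or identifier of the exported animation file
};

// The caller only supplies these descriptions; everything that follows
// (de-muxing, decoding, rendering, re-encoding) happens inside the SDK.
void addStickerEffect(const VideoClipInfo& clip, const AnimationInfo& animation) {
    // Drives steps 402 to 404 described below: build the time mapping, render the
    // animation onto the mapped video frames, and synthesize the target video file.
}
```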
Step 402: and determining the mapping relation between the animation fragments corresponding to the video layers and the video fragments according to the time information of the video layers and the time information of the video fragments included in the animation file.
In the embodiment of the application, the animation file is obtained after the designer performs the animation design with graphics and video processing software and then exports the animation project file through an export plug-in; it may be, for example, a PAG animation file.
In the animation file, layers are used to carry image content, and the layers may include picture layers and video layers. A picture layer only needs a single, global replacement of its picture during the whole animation rendering process and then takes effect for the whole process, whereas a video layer needs its picture replaced frame by frame during animation rendering, namely with the video frames of the video clip. An animation design may include one or more video layers and one or more picture layers, and identifiers may be added to distinguish different video layers and picture layers, and to distinguish video layers from picture layers. In practice, a piece of marker information may be added for the video layer to specially identify the video layer; for example, when the animation is designed in AE, the identification may be carried in the AE layer name or in a marker.
For each layer, the time information of the layer may include a start time, indicating from which moment the layer starts, and a duration, indicating how long the layer lasts. During the animation design the designer sets the time information of each layer, and after the animation file is exported the time information of each layer can be determined from the description information of the animation file. When the animation file is exported, the animation project file is usually encoded in a certain encoding mode to compress the data, so after the animation file is obtained it can be decoded to obtain the description information of the animation file, which includes the layer attributes of each layer; the time information of each layer, including the time information of the video layer, is then known from the layer attributes.
For example, after the designer finishes the animation design in AE, the PAG animation file may be exported through the PAG export plug-in. Since the PAG data structure includes the layers, the description information of the PAG animation file can be obtained after the PAG animation file is decoded; the description information includes the layer attribute information of each layer, such as the layer name and time information (for example start time and duration), and accordingly the layer attribute information of the specially identified video layer is also obtained during decoding, so the time information of the video layer is obtained.
In the embodiment of the application, the video layer has a start time and a duration, and the video clip also has a start time and a duration. When the duration of the video layer and the duration of the video clip are equal, the rendering frame rate of the video layer and the decoding frame rate of the video clip can be kept consistent during rendering; the rendering frame rate is the number of frames rendered per unit time, and the decoding frame rate is the number of frames decoded per unit time. In this case each first time within the duration range of the video layer corresponds one to one with a second time in the video clip, where a first time is a time at which an animation frame lies within the duration range of the video layer and a second time is a time within the video clip. As shown in fig. 5, the time information of the video layer is a start time a and a duration m, and the time information of the video clip is a start time b and a duration n; when m = n, time a within the duration range of the video layer corresponds to time b in the video clip, time a+0.5 corresponds to time b+0.5, time a+1 corresponds to time b+1, and so on.
Correspondingly, each animation frame included in the animation segment corresponding to the video layer corresponds one to one with a video frame included in the video clip. As shown in fig. 6, the animation frames and the video frames have a one-to-one correspondence: the animation frame at time a corresponds to the video frame at time b, the animation frame at time a+1 corresponds to the video frame at time b+1, and the animation frames in the period from a to a+1 correspond one to one with the video frames in the period from b to b+1.
When the duration of the video layer is not equal to the duration of the video clip, the duration of the finally obtained target video is kept consistent with that of the sticker animation, which is equivalent to stretching or compressing the video clip, so the video clip has to be decoded at a changed speed so that the decoding progress can meet the requirement of the rendering progress. For example, if the duration of the video layer is 2 s and the duration of the video clip is 4 s, the video frame at the 2nd second of the video clip is required when the animation frame at the 1st second of the video layer is rendered, so the decoding frame rate of the video frames has to be increased and the decoding progress accelerated to adapt to the rendering progress. Or, if the duration of the video layer is 2 s and the duration of the video clip is 1 s, the video frame at the 0.5th second of the video clip is required when the animation frame at the 1st second of the video layer is rendered, so the decoding frame rate of the video frames has to be reduced and the decoding progress slowed down to adapt to the rendering progress.
When the duration of the video layer is not equal to that of the video clip, the mapping relation can be established according to the time information of the video clip and the video layer.
Specifically, according to the ratio of the duration of the video clip to the duration of the video layer, together with the start time of the video clip and the start time of the video layer, the mapping relationship between each first time within the duration range of the video layer and each second time within the video clip can be determined. The mapping relationship can be characterized by the following formula:
f(t) = (n/m)(t - a) + b
where a is the start time of the video layer in the sticker animation, m is the duration of the video layer, b is the start time of the video clip in the complete video, n is the duration of the video clip, and t is a first time within the video layer; t can also be understood as the current rendering progress during rendering, i.e. the time of the currently processed animation frame within the duration range of the video layer or of the sticker animation. f(t) is the corresponding second time within the video clip, and can also be understood as the current decoding progress, i.e. the time, within the complete video or the video clip, of the video frame that currently needs to be decoded.
When the duration of the video layer is smaller than that of the video clip, the effect after rendering is to fast play the video clip, which is equivalent to the compression of the duration of the video clip. For example, when the duration of the video layer is 2s and the duration of the video clip is 4s, the time mapping relationship between the video layer and the video clip is as follows:
f(t)=2(t-a)+b
Fig. 7 is a schematic diagram of a mapping relationship between a video layer and a video clip, where a time a in a duration range of the video layer corresponds to a time b in the video clip, a time a+0.5 in the duration range of the video layer corresponds to a time b+1 in the video clip, a time a+1 in the duration range of the video layer corresponds to a time b+2 in the video clip, and so on.
When the duration of the video layer is longer than that of the video clip, the effect after rendering is that the video clip is played back slowly, which is equivalent to stretching the duration of the video clip. For example, when the duration of the video layer is 2 s and the duration of the video clip is 1 s, the time mapping relationship between the video layer and the video clip is as follows:
f(t)=1/2(t-a)+b
fig. 8 is another schematic diagram of a mapping relationship between a video layer and a video clip, where a time a in a duration range of the video layer corresponds to a time b in the video clip, a time a+0.5 in the video layer corresponds to a time b+0.25 in the video clip, a time a+1 in the video layer corresponds to a time b+0.5 in the video clip, and so on.
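A minimal sketch of the mapping formula above follows; the function name and the use of seconds as the time unit are illustrative, and the zero-duration case discussed below is handled by freezing on the clip's start time.

```cpp
// Maps the rendering progress t (a time inside the video layer's duration range)
// to the decoding progress f(t) (a time inside the video clip), per the formula above.
// a/m: start time and duration of the video layer; b/n: start time and duration of the clip.
double mapLayerTimeToClipTime(double t, double a, double m, double b, double n) {
    if (m <= 0.0 || n <= 0.0) {
        // Degenerate case: a zero-duration clip is frozen on its starting video frame,
        // so every layer time maps to the clip's start time.
        return b;
    }
    return (n / m) * (t - a) + b;
}

// Examples from the text:
//   m = 2, n = 4  ->  f(t) = 2(t - a) + b    (clip compressed, played back faster)
//   m = 2, n = 1  ->  f(t) = 0.5(t - a) + b  (clip stretched, played back slower)
```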
In the embodiment of the present application, the mapping relationship may refer to a mapping relationship in time: according to the above formula, the correspondence between the time of an animation frame and a time in the video clip can be known from this mapping in time. Alternatively, it may refer to a mapping relationship between specific frames: in a specific application, the mapping relationship in time may be further converted into a mapping relationship between specific frames, and based on the mapping relationship between frames it can be known which video frame an animation frame specifically corresponds to. This part is described in detail below and is not repeated here.
It should be stated that the first time refers to a time at which any animation frame is located in a duration range of the video layer, and is not specific to a certain time in the duration range of the video layer, and similarly, the second time refers to any time in the video segment, and is not specific to a certain time in the video segment.
In actual application, the duration of the video clip may also be zero, i.e. the video clip only includes the single frame of image at its start time. When the duration of the video clip is zero, the video layer is essentially an ordinary picture layer, i.e. its picture only needs to be replaced once during the whole rendering process; the video layer then needs freeze-frame processing, i.e. the images displayed by the video layer are all the same video frame, specifically the starting video frame of the video clip. In other words, all animation frames within the duration range of the video layer correspond only to the starting video frame of the video clip.
Step 403: and determining video frames corresponding to each animation frame from the video clips according to the mapping relation, and performing animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames.
In the embodiment of the application, based on the established mapping relation, the video frames corresponding to the animation frames can be determined from the video clips.
Specifically, for any animation frame, the second time in the video clip corresponding to the first time of that animation frame can be determined based on the mapping relationship in time, the associated video frame of that second time can be determined from the video clip, and the associated video frame can be used as the video frame corresponding to that animation frame. In practice, not every second time corresponds exactly to a video frame, so the associated video frame of a second time may be a video frame located exactly at the second time, or a video frame whose time point is close to the second time, i.e. the neighbouring video frame with the shortest time interval from the second time.
Fig. 9 and fig. 10 are schematic diagrams of determining the video frame corresponding to an animation frame. The first time at which animation frame A lies is time a+1, and the corresponding second time in the video clip is time b+2. As shown in fig. 9, if there happens to be a video frame A at time b+2, video frame A is the associated video frame of time b+2, and the video frame corresponding to animation frame A is video frame A. As shown in fig. 10, if there is no video frame exactly at time b+2, the video frame C with the shortest time interval from time b+2 may be used as the associated video frame, and the video frame corresponding to animation frame A is then video frame C.
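A minimal sketch of choosing the associated video frame is given below, assuming a sorted, non-empty list of video-frame timestamps is available (for example from the frame index information provided by the de-container); the names are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Given the second time f(t) from the mapping, pick the associated video frame:
// the frame exactly at that time if one exists, otherwise the neighbouring frame
// whose timestamp is closest. frameTimes must be sorted and non-empty.
std::size_t findAssociatedFrame(const std::vector<double>& frameTimes, double secondTime) {
    auto it = std::lower_bound(frameTimes.begin(), frameTimes.end(), secondTime);
    if (it == frameTimes.begin()) return 0;
    if (it == frameTimes.end()) return frameTimes.size() - 1;
    std::size_t after = static_cast<std::size_t>(it - frameTimes.begin());
    std::size_t before = after - 1;
    // Choose whichever neighbour has the shorter time interval to the second time.
    return (secondTime - frameTimes[before] <= frameTimes[after] - secondTime) ? before : after;
}
```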
In the embodiment of the application, the video frame corresponding to each animation frame can be obtained through the above process, so a mapping relationship between specific frames can be established; during animation rendering, the video frame corresponding to an animation frame can then be determined directly according to the mapping relationship between frames in order to perform the animation rendering. When the animation file is a PAG animation file, the PAG SDK introduces the 2D vector graphics processing library Skia for vector graphics processing; for font handling, coordinate transformation, bitmaps and similar processing, Skia allows a modification made in one place to take effect on multiple platforms, supports cross-platform use well, and achieves rendering results that are fully consistent across platforms.
In actual application, the video file needs to be de-encapsulated before the video frames are rendered, so as to obtain the image track data. Since only the video clip needs animation rendering, only the data packets corresponding to the video clip need to be de-encapsulated to obtain the image track data of the video clip. During rendering, the determined video frames can then be animation-rendered frame by frame according to the image track data of the video clip to obtain the plurality of target video frames.
In the embodiment of the application, the corresponding video frames of each animation frame can be decoded and converted into the format to obtain the renderable frame data, so that the animation rendering processing is performed according to the frame data of the animation frame and the renderable frame data of the video frame to obtain the target video frame corresponding to each animation frame.
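The per-frame flow of step 403 (decode the mapped video frame, convert it to renderable data, then render the animation frame onto it) can be sketched as follows; all helper and type names are assumptions standing in for the SDK internals described above.

```cpp
#include <vector>

struct AnimationFrame {};    // decoded description of one animation frame
struct RenderableFrame {};   // video frame data already converted to a renderable (RGBA) format
struct TargetFrame {};       // animation frame rendered on top of its corresponding video frame

// Stand-ins for the SDK's decoding/conversion and rendering internals.
RenderableFrame decodeAndConvertCorrespondingVideoFrame(const AnimationFrame&) { return {}; }
TargetFrame renderAnimationOntoFrame(const AnimationFrame&, const RenderableFrame&) { return {}; }

std::vector<TargetFrame> renderClip(const std::vector<AnimationFrame>& animationFrames) {
    std::vector<TargetFrame> targets;
    for (const AnimationFrame& af : animationFrames) {
        // Only the video frame mapped to this animation frame is decoded and converted.
        RenderableFrame rf = decodeAndConvertCorrespondingVideoFrame(af);
        targets.push_back(renderAnimationOntoFrame(af, rf));
    }
    return targets;
}
```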
Step 404: and performing video synthesis based on the plurality of target video frames to obtain the target video file added with the animation special effects.
In the embodiment of the application, video synthesis is performed with the plurality of target video frames, and the target video file with the animation special effect added can thus be obtained. Specifically, during video synthesis, the target video frames can be combined according to the original order of the animation frames and then encoded and encapsulated to obtain the target video file.
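By way of illustration, the synthesis step might take the following shape, where VideoWriter is an assumed encoder/muxer abstraction rather than the SDK's actual interface.

```cpp
#include <vector>

struct TargetFrame {};  // one rendered target video frame

// Assumed encoder/muxer abstraction: encodes frames and packages them into a container.
class VideoWriter {
public:
    void encodeAndWrite(const TargetFrame&) { /* encode the frame and append it to the container */ }
    void finish() { /* flush the encoder and close the target video file */ }
};

void composeTargetVideo(const std::vector<TargetFrame>& targets, VideoWriter& writer) {
    // Frames are written in the original animation-frame order so that timing is preserved.
    for (const TargetFrame& frame : targets) {
        writer.encodeAndWrite(frame);
    }
    writer.finish();
}
```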
The process of steps 402-404 may be implemented by video decoding module 30 and animation rendering module 32 shown in fig. 3.
Since the process of determining the corresponding video frame and performing animation rendering is similar for each animation frame, the specific process is described below by taking a single animation frame as an example; this animation frame may be any animation frame in the animation file. Fig. 11 shows a flow chart of determining the corresponding video frame and performing animation rendering for one animation frame.
Step 1101: it is determined whether the current animation frame requires an animation rendering process.
In the embodiment of the present application, the animation rendering processing of the animation rendering module 32 is performed frame by frame, but not every frame actually needs animation rendering processing. For example, when an animation head needs to be inserted at the beginning of the video, or an animation tail needs to be inserted at the end of the video, no content of the video clip appears when actually displayed, so such animation frames may skip the rendering processing. Therefore, before the animation frame at the current rendering progress is processed, it is necessary to determine whether that animation frame needs animation rendering processing.
Specifically, when the current animation frame has no video layer, no video image content needs to be added to it, so it does not need animation rendering processing. Whether the current animation frame needs animation rendering processing can therefore be determined by checking whether it includes a video layer: when the current animation frame is determined to include a video layer, it can be determined that the current animation frame needs animation rendering processing; conversely, when it is determined that the current animation frame does not include a video layer, it can be determined that animation rendering processing is not needed, and the current animation frame can be skipped to enter the processing of the next animation frame.
Specifically, when designing a sticker animation, a designer may hide the video layer within a certain time interval to achieve a particular visual effect. For example, the duration interval of the video layer may be 2 s to 5 s, but the layer may be set not to be displayed from 2 s to 3 s; in the finally presented visual effect the video content is actually invisible during that interval, so rendering processing can also be skipped for the animation frames in the invisible time interval. Thus, whether the current animation frame needs animation rendering processing can be determined by checking whether it lies within the visible time interval of the video layer: when the current animation frame is determined to lie within the visible time interval of the video layer, it is determined that animation rendering processing is needed; conversely, when the current animation frame lies within the invisible time interval of the video layer, it is determined that animation rendering processing is not needed, and the current animation frame can be skipped to enter the processing of the next animation frame.
For example, in the PAG-based sticker animation scheme, after the PAG animation file is decoded, the layer attribute information of each layer, such as the duration interval and the visible time interval, can be obtained; it can then be determined whether the current animation frame has a video layer and lies within the visible time interval of that layer, and hence whether the current frame needs animation rendering processing.
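The following C++ sketch illustrates such a pre-render check; the structures describing the layer and the animation frame are assumptions made for the example and do not reflect the PAG SDK's actual data model.

```cpp
#include <cstdint>
#include <optional>

// Assumed description of a video layer's visible time interval.
struct VideoLayerInfo {
    int64_t visibleStartUs;  // start of the visible time interval (microseconds)
    int64_t visibleEndUs;    // end of the visible time interval
};

// Assumed description of the animation frame at the current rendering progress.
struct AnimationFrameInfo {
    int64_t timeUs;                            // time of the animation frame
    std::optional<VideoLayerInfo> videoLayer;  // empty when the frame has no video layer
};

bool NeedsVideoRendering(const AnimationFrameInfo& frame) {
    // No video layer: no video image content has to be added, so rendering can be skipped.
    if (!frame.videoLayer) return false;
    // Only render when the frame falls inside the layer's visible time interval.
    const VideoLayerInfo& layer = *frame.videoLayer;
    return frame.timeUs >= layer.visibleStartUs && frame.timeUs < layer.visibleEndUs;
}
```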
Step 1102: if the result of the determination in step 1101 is negative, it is determined whether to end the rendering.
Step 1103: if the result of the determination in step 1102 is no, the next frame rendering process is entered.
If it is determined that the current animation frame does not need animation rendering processing, it is determined whether to end the rendering; if yes, the process ends, and if no, the rendering process of the next frame is entered. Determining whether to end the rendering may consist of determining whether all animation frames have been rendered, and if all animation frames have been rendered, the rendering ends.
Step 1104: if the determination result of step 1101 is yes, determining, according to the mapping relationship, a video frame corresponding to the current animation frame.
Specifically, when the mapping relationship is a temporal mapping relationship, the associated video frame at the second time can be determined from the video clip according to the second time corresponding to the first time at which the current animation frame is located, and that associated video frame is the video frame corresponding to the current animation frame. The specific determination process has already been described in detail in step 403 above and is not repeated here.
Specifically, when the mapping relationship is a frame-to-frame mapping relationship, the video frame corresponding to the current animation frame can be looked up directly according to that mapping relationship.
The current animation frame is the animation frame being processed at the current rendering progress, so the decoding progress of the video clip needs to be synchronized with it, making the decoding speed of the video clip match the rendering speed. The process of determining the video frame is therefore essentially also the process of synchronizing the current rendering progress with the current decoding progress: the determined video frame is the video frame that the video decoder currently needs to decode.
Step 1105: and decoding the video frame corresponding to the current animation frame to obtain frame data of the video frame.
In the embodiment of the application, the video file needs to be unpacked before the video frames can be decoded, so as to obtain the image track data. Since animation rendering is only required for the video clip, only the data packets corresponding to the video clip need to be unpacked to obtain the image track data of the video clip. The unpacking process may be implemented by the video decompression container 301 included in the video decoding module 30 shown in fig. 3, for example by the FFmpeg built into the animation rendering SDK.
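As one possible realization of this step (an assumption made for illustration, since the embodiment only states that FFmpeg is built in), the following sketch uses FFmpeg's demuxing API to read only the packets of the video clip's image track; error handling is omitted.

```cpp
extern "C" {
#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>
}
#include <cstdint>
#include <vector>

// Read only the image-track packets covering [clipStartUs, clipEndUs].
std::vector<AVPacket*> ReadClipPackets(const char* path,
                                       int64_t clipStartUs, int64_t clipEndUs) {
    AVFormatContext* fmt = nullptr;
    avformat_open_input(&fmt, path, nullptr, nullptr);    // open / unpack the container
    avformat_find_stream_info(fmt, nullptr);
    int video = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);

    AVRational us = {1, 1000000};                         // microsecond time base
    AVRational tb = fmt->streams[video]->time_base;
    int64_t startTs = av_rescale_q(clipStartUs, us, tb);
    int64_t endTs   = av_rescale_q(clipEndUs, us, tb);
    av_seek_frame(fmt, video, startTs, AVSEEK_FLAG_BACKWARD);  // land on a preceding key frame

    std::vector<AVPacket*> packets;
    AVPacket* pkt = av_packet_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {
        bool pastClip = pkt->stream_index == video && pkt->pts > endTs;
        if (pkt->stream_index == video && !pastClip) {
            packets.push_back(av_packet_clone(pkt));      // keep only the clip's packets
        }
        av_packet_unref(pkt);
        if (pastClip) break;
    }
    av_packet_free(&pkt);
    avformat_close_input(&fmt);
    return packets;
}
```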
Specifically, for the video frame corresponding to the determined current animation frame, the image track data corresponding to that video frame can be found in the unpacked image track data and decoded. In general, not every encoded video frame stores a complete image: a key frame stores a complete image, while the frames that follow it may only store difference data relative to the key frame. Therefore, when the video frame corresponding to the current animation frame is a key frame, only the single-frame data of that video frame needs to be decoded; when it is not a key frame, decoding must start from the last key frame in order to obtain the complete frame data of that video frame.
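Continuing with the FFmpeg assumption above, the following sketch shows the idea of decoding forward from the preceding key frame until the frame data of the target video frame is obtained; decoder setup and error handling are omitted.

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
}
#include <cstdint>
#include <vector>

// Feed packets starting at the preceding key frame and return the first decoded
// frame whose timestamp reaches the target video frame.
AVFrame* DecodeToTarget(AVCodecContext* decoder,
                        const std::vector<AVPacket*>& packetsFromKeyFrame,
                        int64_t targetPts) {
    AVFrame* frame = av_frame_alloc();
    for (AVPacket* pkt : packetsFromKeyFrame) {
        avcodec_send_packet(decoder, pkt);                 // packets must begin at a key frame
        while (avcodec_receive_frame(decoder, frame) == 0) {
            if (frame->pts >= targetPts) {
                return frame;                              // frame data of the required video frame
            }
        }
    }
    av_frame_free(&frame);
    return nullptr;                                        // target frame not found in this range
}
```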
Of course, in practice the whole video clip may also be decoded, and the frame data of the required video frame can then be taken from the decoded result.
The decoding process may be implemented by the video decoder 302 included in the video decoding module 30 shown in fig. 3. In practice, the animation rendering module 32 may send the time information of the video frames to be decoded to the video decoder 302, and the video decoder 302 decodes accordingly. For example, when the current rendering progress reaches time a+1, the frame data of the video clip from time b+2 to time b+3 may be needed, and the animation rendering module 32 may then invoke the video decoder 302 to decode the frame data from time b+2 to time b+3.
Some system platforms support hardware decoding and provide hardware decoding interfaces, such as VideoToolbox on iOS or macOS and MediaCodec on Android. Hardware decoding can make full use of the GPU, so decoding is efficient and the required time consumption is low; therefore, when the system platform supports hardware decoding, the system's decoding interface can be called preferentially. In practice, it can be determined whether the current system supports hardware decoding: when it does, the system decoding interface is called to decode the video frame in hardware decoding mode; when the current system does not support hardware decoding, for example a Linux system, the decoding interface built into the SDK can be called to decode the video frame in software decoding mode.
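A minimal sketch of this decoder selection is shown below; the factory functions are hypothetical placeholders for the platform decoding interfaces (VideoToolbox/MediaCodec) and the software decoder built into the SDK.

```cpp
#include <memory>

// Abstract decoder interface; concrete implementations are assumed to exist elsewhere.
struct VideoDecoder {
    virtual ~VideoDecoder() = default;
};

// Hypothetical factories: the first would wrap the system decoding interface, the
// second the software decoder built into the animation rendering SDK.
std::unique_ptr<VideoDecoder> CreateHardwareDecoder() { return nullptr; /* placeholder */ }
std::unique_ptr<VideoDecoder> CreateSoftwareDecoder() { return std::make_unique<VideoDecoder>(); }

std::unique_ptr<VideoDecoder> MakeDecoder(bool systemSupportsHardwareDecoding) {
    if (systemSupportsHardwareDecoding) {
        // Prefer the system decoding interface: decoding runs on dedicated hardware,
        // which keeps CPU cost and time consumption low.
        if (auto hw = CreateHardwareDecoder()) {
            return hw;
        }
    }
    // Fall back to software decoding, e.g. on platforms such as Linux that
    // provide no hardware decoding interface.
    return CreateSoftwareDecoder();
}
```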
Step 1106: converting the data format of the frame data of the video frame to obtain frame data in a renderable data format.
In the embodiment of the application, the frame data of the video frame corresponding to each animation frame can be converted in data format to obtain frame data in a renderable data format, so that the animation rendering processing is performed according to the frame data of the animation frame and the frame data of the video frame in the renderable data format to obtain the target video frame corresponding to each animation frame.
For example, decoded data is generally in YUV format, whereas rendering generally requires data in RGBA format, so the decoded YUV data needs to be converted into RGBA data. In practice, the format conversion may be implemented by calling the format converter 303 included in the video decoding module 30.
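One common way to implement such a format converter, given here only as an assumed example rather than the SDK's actual code, is FFmpeg's libswscale:

```cpp
extern "C" {
#include <libavutil/frame.h>
#include <libswscale/swscale.h>
}
#include <cstdint>

// Convert a decoded YUV frame into a packed RGBA buffer of the same size.
bool YuvToRgba(const AVFrame* yuv, uint8_t* rgbaOut, int rgbaStride) {
    SwsContext* sws = sws_getContext(yuv->width, yuv->height,
                                     static_cast<AVPixelFormat>(yuv->format),
                                     yuv->width, yuv->height, AV_PIX_FMT_RGBA,
                                     SWS_BILINEAR, nullptr, nullptr, nullptr);
    if (sws == nullptr) {
        return false;                          // unsupported source format
    }
    uint8_t* dstData[4] = {rgbaOut, nullptr, nullptr, nullptr};
    int dstStride[4] = {rgbaStride, 0, 0, 0};
    sws_scale(sws, yuv->data, yuv->linesize, 0, yuv->height, dstData, dstStride);
    sws_freeContext(sws);
    return true;
}
```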
After the format conversion is completed, if attributes such as cropping or transformation have been set for the video frame in the transformation information management sub-module, matrix transformation processing also needs to be performed on the image data.
Step 1107: performing animation rendering processing according to the frame data of the animation frame and the frame data of the video frame in the renderable data format to obtain the target video frame.
In the embodiment of the application, the frame data of the animation frames can be obtained after the animation file is decoded. Based on the frame data of each animation frame, it can be known what processing needs to be performed on the video frame during animation rendering. For example, when the animation file is a PAG animation file, the layer attributes of the PAG animation file may describe the processing required for the video frame in the layer: the layer attributes may include stretching parameters, transformation parameters and the like, and the transformation parameters can be further subdivided into anchor point, position in the x/y directions, scaling coefficient, rotation coefficient, transparency and so on, so that the video frame can be processed based on the layer attributes during rendering.
Specifically, during animation rendering, the frame data of the video frame can first be rendered onto the video layer, and the frame data of the animation frame is then rendered, so that the graphics corresponding to the animation frame cover the graphics corresponding to the video frame.
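As a simplified illustration of how such layer attributes might be applied before drawing, the sketch below builds a 2D affine matrix from anchor point, position, scale and rotation; the structure and the matrix layout are assumptions made for the example, not the PAG SDK's internal representation.

```cpp
#include <cmath>

// Assumed subset of the transformation attributes carried by the video layer.
struct LayerTransform {
    float anchorX, anchorY;   // anchor point inside the video frame
    float posX, posY;         // position on the canvas
    float scaleX, scaleY;     // scaling coefficients
    float rotationDeg;        // rotation coefficient in degrees
    float opacity;            // transparency, 0.0 (invisible) to 1.0 (opaque)
};

// Build a 3x3 row-major affine matrix: translate(-anchor), scale, rotate, translate(position).
void BuildLayerMatrix(const LayerTransform& t, float m[9]) {
    float rad = t.rotationDeg * 3.14159265f / 180.0f;
    float c = std::cos(rad), s = std::sin(rad);
    m[0] = c * t.scaleX;  m[1] = -s * t.scaleY;
    m[3] = s * t.scaleX;  m[4] =  c * t.scaleY;
    m[2] = t.posX - (m[0] * t.anchorX + m[1] * t.anchorY);
    m[5] = t.posY - (m[3] * t.anchorX + m[4] * t.anchorY);
    m[6] = 0.0f;  m[7] = 0.0f;  m[8] = 1.0f;
}
// The video frame is drawn first with this matrix and the layer's opacity, and the
// animation frame's own graphics are then drawn on top so that they cover the video content.
```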
The scheme of the embodiment of the application can be applied to any scene in which an animation effect needs to be added, such as adding an animation effect when making a short video or adding an animation effect to a game video. Taking a game video as an example, a number of game events usually occur while a user plays a game application. When game information such as a weekly battle report is pushed to the user, the game events can be provided to the user in video form: corresponding sticker animations can be designed for different game events, and animation special effects can be added for specific events in the game video provided to the user, improving the visual effect of the video. Fig. 12 shows a rendering result of adding a special effect for a triple-kill event in a game video. Adding the special effect for the triple-kill event essentially adds, on top of the frame pictures of the game video, effects that locate and highlight the killed targets: as shown in fig. 12, the current character is highlighted and the avatars of the targets killed by the current character are shown, so that when viewing the finally generated target video the user can intuitively see that characters A, B and C were killed by the current character.
According to the video processing scheme provided by the embodiment of the application, the video clip to which a special effect is to be added and its related information can be input from outside, so that the entire special-effect adding process is completed inside the animation rendering SDK, the whole time-axis processing is handled by the animation rendering SDK, and the development efficiency of video editing scenes and server-side special-effect adding scenes is improved.
Referring to fig. 13, based on the same inventive concept, an embodiment of the present application further provides an animation rendering SDK 130, including:
an acquisition unit 1301 configured to acquire a video file and an animation file for adding special effects to a video clip specified in the video file;
a determining unit 1302, configured to determine a mapping relationship between the animation clip corresponding to the video layer and the video clip according to the time information of the video layer included in the animation file and the time information of the video clip; the video layer is used for bearing image content of video frames included in the video clip;
the animation rendering unit 1303 is configured to determine, according to the mapping relationship, a video frame corresponding to each animation frame from the video clip, and perform animation rendering on the determined video frame by frame, so as to obtain a plurality of target video frames;
the video synthesis unit 1304 is configured to perform video synthesis based on a plurality of target video frames, and obtain a target video file to which an animation effect is added.
Optionally, the time information includes a start time and a duration, and the determining unit 1302 is configured to:
when the duration of the video clip is equal to the duration of the video layer, determining that each animation frame included in the animation clip corresponds to each video frame included in the video clip one by one; or,
when the duration of the video clip is not equal to the duration of the video layer, determining the mapping relation according to the ratio of the duration of the video clip to the duration of the video layer, the starting time of the video clip and the starting time of the video layer; or,
and when the duration of the video clip is zero, determining that all animation frames in the duration of the video layer correspond to the initial video frame of the video clip.
Optionally, for any animation frame within the duration range of the video layer, the determining unit 1302 is configured to:
determining a corresponding second moment of the first moment of any animation frame in the video clip according to the mapping relation;
determining an associated video frame from the video clip at a second time;
and determining the associated video frame as the video frame corresponding to any animation frame.
Optionally, for any animation frame, the animation rendering unit 1303 is configured to:
when any animation frame is determined to need to be subjected to animation rendering processing, determining a video frame corresponding to the any animation frame according to the mapping relation;
Decoding a video frame corresponding to any animation frame to obtain frame data of the video frame;
and performing animation rendering processing based on the frame data of the any animation frame and the frame data of the video frame to obtain the target video frame corresponding to the any animation frame.
Optionally, the animation rendering unit 1303 is configured to:
when any animation frame is determined to comprise a video layer, determining that any animation frame needs to be subjected to animation rendering processing; or,
and when it is determined that the any animation frame is within a visible time interval of the video layer, determining that the any animation frame needs to be subjected to animation rendering processing.
Optionally, the animation rendering unit 1303 is further configured to:
converting the data format of the frame data of the video frame corresponding to any animation frame to obtain the frame data in a renderable data format;
and performing animation rendering processing according to the frame data of any animation frame and the frame data of the renderable data format of the corresponding video frame to obtain a target video frame corresponding to any animation frame.
Optionally, the animation rendering unit 1303 is configured to:
when the current system is determined to support hardware decoding, a system decoding interface is called to decode a video frame corresponding to any animation frame in a hardware decoding mode, so that frame data of the video frame are obtained; or,
And when the current system is determined not to support hardware decoding, a decoding interface built in the SDK is called to decode the video frame corresponding to any animation frame in a software decoding mode, so that frame data of the video frame are obtained.
Optionally, the animation rendering unit 1303 is configured to:
calling a video unpacking container built in the SDK to unpack a data packet corresponding to the video clip in the video file to obtain image track data of the video clip;
and according to the image track data of the video clips, carrying out animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames.
The acquisition unit 1301 included in the animation rendering SDK 130 may, for example, correspond to the resource management module 31 in the animation rendering SDK shown in fig. 3, and the determining unit 1302 and the animation rendering unit 1303 together may, for example, correspond to the video decoding module 30 and the animation rendering module 32 in the animation rendering SDK shown in fig. 3. The animation rendering SDK may be used to perform the methods shown in the embodiments of fig. 4 to fig. 11; therefore, for the functions that each functional module of the animation rendering SDK can realize, reference may be made to the descriptions of the embodiments shown in fig. 4 to fig. 11, and details are not repeated here.
Referring to fig. 14, based on the same technical concept, the embodiment of the present invention further provides a computer device 140, which may include a memory 1401 and a processor 1402.
The memory 1401 is used for storing the computer program executed by the processor 1402. The memory 1401 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the computer device, and the like. The processor 1402 may be a central processing unit (CPU), a digital processing unit, or the like. The specific connection medium between the memory 1401 and the processor 1402 is not limited in the embodiments of the present invention. In the embodiment of the present invention, the memory 1401 and the processor 1402 are connected through a bus 1403, which is shown by a thick line in fig. 14; the connection manner between other components is merely illustrative and is not limiting. The bus 1403 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 14, but this does not mean that there is only one bus or only one type of bus.
The memory 1401 may be a volatile memory, such as a random-access memory (RAM); the memory 1401 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1401 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1401 may also be a combination of the above memories.
A processor 1402 for executing the method executed by the apparatus in the embodiment shown in fig. 4 to 11 when calling the computer program stored in the memory 1401.
In some possible embodiments, aspects of the method provided by the present invention may also be implemented in the form of a program product including program code; when the program product runs on a computer device, the program code causes the computer device to perform the steps of the methods according to the various exemplary embodiments of the invention described in this specification, for example the methods performed by the device in the embodiments shown in fig. 4 to fig. 11.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (15)

1. A method of video processing, the method comprising:
in response to a selection operation in video editing software, selecting a video clip to which an animation special effect is to be added and a sticker animation to be added;
inputting the selected video clip and the information of the sticker animation to an animation rendering SDK, and executing the following operations through the animation rendering SDK:
obtaining a video file and an animation file for adding a special effect to a video clip specified in the video file, wherein the animation file is a file in an animation format exported by an export plug-in after the animation is designed with graphics and video processing software, and includes all animation frames required for adding the animation special effect to the video file;
determining a mapping relation between the animation segments corresponding to the video layers and the video segments according to the time information of the video layers and the time information of the video segments included in the animation file; the video layer is used for bearing image content of video frames included in the video clips, and the time information includes the duration of the corresponding clips;
determining video frames corresponding to each animation frame from the video clips according to the mapping relation, and performing animation rendering on the determined video frames frame by frame based on the rendering frame rate of the video layer to obtain a plurality of target video frames; when animation rendering is carried out, if the duration of the video layer is equal to that of the video segment, the rendering frame rate is controlled to be consistent with the decoding frame rate of the video segment, or, if they are not equal, variable-speed processing is carried out on the decoding frame rate of the video segment, so that the decoding progress can meet the requirement of the rendering progress;
and carrying out video synthesis based on the target video frames to obtain the target video file added with the animation special effects.
2. The method of claim 1, wherein the time information includes a start time and a duration, and determining, according to the time information of the video layer included in the animation file and the time information of the video clip, a mapping relationship between the animation clip corresponding to the video layer and the video clip includes:
When the duration of the video clip is equal to the duration of the video layer, determining that each animation frame included in the animation clip corresponds to each video frame included in the video clip one by one; or,
when the duration of the video clip is not equal to the duration of the video layer, determining the mapping relation according to the ratio of the duration of the video clip to the duration of the video layer, the starting time of the video clip and the starting time of the video layer; or,
and when the duration of the video clip is zero, determining that all animation frames in the duration of the video layer correspond to the initial video frame of the video clip.
3. The method of claim 1, wherein determining, from the video segments, video frames corresponding to each animation frame according to the mapping relationship, comprises:
determining a corresponding second moment in the video clip at the first moment of any animation frame according to the mapping relation;
determining an associated video frame from the video clip at the second time instant;
and determining the associated video frame as the video frame corresponding to any animation frame.
4. The method of claim 1, wherein determining, according to the mapping relationship, a video frame corresponding to each animation frame from the video clip, and performing animation rendering on the determined video frame by frame based on a rendering frame rate of the video layer, to obtain a plurality of target video frames, includes:
When determining that any animation frame needs to be subjected to animation rendering processing, determining a video frame corresponding to the any animation frame according to the mapping relation;
decoding the video frame corresponding to any animation frame to obtain frame data of the video frame;
and based on the rendering frame rate of the video layer, performing animation rendering processing according to the frame data of any animation frame and the frame data of the video frame to obtain a target video frame corresponding to any animation frame.
5. The method of claim 4, wherein determining that any one of the animation frames requires an animation rendering process comprises:
when determining that any animation frame comprises a video layer, determining that any animation frame needs to be subjected to animation rendering processing; or,
and when it is determined that the any animation frame is within a visible time interval of the video layer, determining that the any animation frame needs to be subjected to animation rendering processing.
6. The method of claim 4, wherein after decoding the video frame corresponding to the any one of the animation frames to obtain frame data of the video frame, the method further comprises:
performing data format conversion on the frame data of the video frame corresponding to any animation frame to obtain frame data in a renderable data format;
then the performing animation rendering processing based on the frame data of the any one animation frame and the frame data of the video frame to obtain a target video frame corresponding to the any one animation frame includes:
and performing animation rendering processing according to the frame data of any animation frame and the frame data of the renderable data format of the corresponding video frame to obtain a target video frame corresponding to the any animation frame.
7. The method of claim 4, wherein decoding the video frame corresponding to any one of the animation frames to obtain frame data of the video frame, comprises:
when the current system is determined to support hardware decoding, a system decoding interface is called to decode the video frame corresponding to any animation frame in a hardware decoding mode, so that frame data of the video frame are obtained; or,
and when the current system is determined not to support hardware decoding, calling a decoding interface built in the SDK to decode the video frame corresponding to any animation frame in a software decoding mode, so as to obtain frame data of the video frame.
8. The method of any of claims 1-7, wherein prior to animating the determined video frames frame-by-frame to obtain the plurality of target video frames, the method further comprises:
Invoking a video unpacker built in the SDK to unpack the data packet corresponding to the video clip in the video file to obtain image track data of the video clip;
and performing animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames, wherein the method comprises the following steps:
and according to the image track data of the video clips, carrying out animation rendering on the determined video frames frame by frame to obtain a plurality of target video frames.
9. An apparatus comprising an animation rendering software development kit SDK, the animation rendering SDK comprising:
an acquisition unit, configured to acquire a video file and an animation file for adding a special effect to a video clip specified in the video file, wherein the animation file is a file in an animation format exported by an export plug-in after the animation is designed with graphics and video processing software, and includes all animation frames required for adding the animation special effect to the video file; the video clip to which the special effect is to be added and the sticker animation to be added are selected in response to a selection operation in video editing software and then input to the animation rendering SDK;
The determining unit is used for determining the mapping relation between the animation segments corresponding to the video layers and the video segments according to the time information of the video layers and the time information of the video segments included in the animation file; the video image layer is used for bearing image content of video frames included in the video clips, and the time information includes duration of corresponding clips;
the animation rendering unit is used for determining video frames corresponding to each animation frame from the video clips according to the mapping relation, and performing animation rendering on the determined video frames frame by frame based on the rendering frame rate of the video layer to obtain a plurality of target video frames; when animation rendering is carried out, if the duration of the video layer is equal to that of the video segment, the rendering frame rate is controlled to be consistent with the decoding frame rate of the video segment, or, if they are not equal, variable-speed processing is carried out on the decoding frame rate of the video segment, so that the decoding progress can meet the requirement of the rendering progress;
and the video synthesis unit is used for carrying out video synthesis based on the target video frames to obtain a target video file added with the animation special effects.
10. The apparatus of claim 9, wherein the time information includes a start time and a duration, and the determining unit is configured to:
when the duration of the video clip is equal to the duration of the video layer, determining that each animation frame included in the animation clip corresponds to each video frame included in the video clip one by one; or,
when the duration of the video clip is not equal to the duration of the video layer, determining the mapping relation according to the ratio of the duration of the video clip to the duration of the video layer, the starting time of the video clip and the starting time of the video layer; or,
and when the duration of the video clip is zero, determining that all animation frames in the duration of the video layer correspond to the initial video frame of the video clip.
11. The apparatus of claim 10, wherein the determining unit is configured to, for any animation frame within the range of video layer durations:
determining a corresponding second moment in the video clip at the first moment of any animation frame according to the mapping relation;
determining an associated video frame from the video clip at the second time instant;
And determining the associated video frame as the video frame corresponding to any animation frame.
12. The apparatus of claim 9, wherein the animation rendering unit is to, for any animation frame:
when determining that any animation frame needs to be subjected to animation rendering processing, determining a video frame corresponding to the any animation frame according to the mapping relation;
decoding the video frame corresponding to any animation frame to obtain frame data of the video frame;
and performing animation rendering processing on the basis of the frame data of any animation frame and the frame data of the video frame to obtain a target video frame corresponding to any animation frame.
13. The apparatus of claim 12, wherein the animation rendering unit is to:
when determining that any animation frame comprises a video layer, determining that any animation frame needs to be subjected to animation rendering processing; or,
and when it is determined that the any animation frame is within a visible time interval of the video layer, determining that the any animation frame needs to be subjected to animation rendering processing.
14. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that,
The processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 8.
15. A computer storage medium having stored thereon computer program instructions, characterized in that,
which computer program instructions, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 8.
CN202010606426.7A 2020-06-29 2020-06-29 Video processing method, animation rendering SDK, equipment and computer storage medium Active CN111899322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010606426.7A CN111899322B (en) 2020-06-29 2020-06-29 Video processing method, animation rendering SDK, equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010606426.7A CN111899322B (en) 2020-06-29 2020-06-29 Video processing method, animation rendering SDK, equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN111899322A CN111899322A (en) 2020-11-06
CN111899322B true CN111899322B (en) 2023-12-12

Family

ID=73207264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010606426.7A Active CN111899322B (en) 2020-06-29 2020-06-29 Video processing method, animation rendering SDK, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN111899322B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112637520B (en) * 2020-12-23 2022-06-21 新华智云科技有限公司 Dynamic video editing method and system
CN112995536A (en) * 2021-02-04 2021-06-18 上海哔哩哔哩科技有限公司 Video synthesis method and system
CN113409427B (en) * 2021-07-21 2024-04-19 北京达佳互联信息技术有限公司 Animation playing method and device, electronic equipment and computer readable storage medium
CN113556576B (en) * 2021-07-21 2024-03-19 北京达佳互联信息技术有限公司 Video generation method and device
CN113542855B (en) * 2021-07-21 2023-08-22 Oppo广东移动通信有限公司 Video processing method, device, electronic equipment and readable storage medium
CN113891113B (en) * 2021-09-29 2024-03-12 阿里巴巴(中国)有限公司 Video clip synthesis method and electronic equipment
CN114268796A (en) * 2021-12-22 2022-04-01 天翼云科技有限公司 Method and device for processing video stream
CN114782579A (en) * 2022-04-26 2022-07-22 北京沃东天骏信息技术有限公司 Image rendering method and device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110505522A (en) * 2019-09-16 2019-11-26 腾讯科技(深圳)有限公司 Processing method, device and the electronic equipment of video data
CN111193876A (en) * 2020-01-08 2020-05-22 腾讯科技(深圳)有限公司 Method and device for adding special effect in video
CN111225232A (en) * 2018-11-23 2020-06-02 北京字节跳动网络技术有限公司 Video-based sticker animation engine, realization method, server and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7990386B2 (en) * 2005-03-24 2011-08-02 Oracle America, Inc. Method for correlating animation and video in a computer system
US9648325B2 (en) * 2007-06-30 2017-05-09 Microsoft Technology Licensing, Llc Video decoding implementations for a graphics processing unit
US11218646B2 (en) * 2018-10-29 2022-01-04 Henry M. Pena Real time video special effects system and method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111225232A (en) * 2018-11-23 2020-06-02 北京字节跳动网络技术有限公司 Video-based sticker animation engine, realization method, server and medium
CN110505522A (en) * 2019-09-16 2019-11-26 腾讯科技(深圳)有限公司 Processing method, device and the electronic equipment of video data
CN111193876A (en) * 2020-01-08 2020-05-22 腾讯科技(深圳)有限公司 Method and device for adding special effect in video

Also Published As

Publication number Publication date
CN111899322A (en) 2020-11-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant