CN112822544B - Video material file generation method, video synthesis method, device and medium - Google Patents


Info

Publication number
CN112822544B
Authority
CN
China
Prior art keywords
image
sub
video
position information
images
Prior art date
Legal status
Active
Application number
CN202011633376.8A
Other languages
Chinese (zh)
Other versions
CN112822544A (en)
Inventor
刘春宇
Current Assignee
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Guangzhou Kugou Computer Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Kugou Computer Technology Co Ltd filed Critical Guangzhou Kugou Computer Technology Co Ltd
Priority to CN202011633376.8A priority Critical patent/CN112822544B/en
Publication of CN112822544A publication Critical patent/CN112822544A/en
Application granted granted Critical
Publication of CN112822544B publication Critical patent/CN112822544B/en


Classifications

    • H ELECTRICITY
        • H04 ELECTRIC COMMUNICATION TECHNIQUE
            • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
                    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
                        • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                            • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
                            • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
                        • H04N21/47 End-user applications
                            • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
                                • H04N21/47202 End-user interface for requesting content on demand, e.g. video on demand
                • H04N5/00 Details of television systems
                    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
                        • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of the present application disclose a video material file generation method, a video synthesis method, a device and a medium, belonging to the technical field of video processing. The method comprises the following steps: acquiring at least one video; for each image of the video, in response to a sub-image capturing operation on the image, acquiring the sub-image of the image and the position information of the sub-image in the corresponding image; and generating a video material file based on the sub-image and the position information, wherein the video material file is parsed by a terminal to be synthesized into a target video. In the process of adding material to the target video, there is no need to decode the video to obtain the sub-images to be added, which saves the time spent obtaining sub-images, improves the efficiency of adding sub-images to the target video, and saves decoding resources.

Description

Video material file generation method, video synthesis method, device and medium
Technical Field
The embodiments of the present application relate to the technical field of video processing, and in particular to a video material file generation method, a video synthesis method, a device and a medium.
Background
With the rapid development of video processing technology, users can add different materials into the target video, so that the target video with the added materials is more attractive.
The terminal can acquire at least one video, each of which comprises a plurality of sub-images, and each sub-image can serve as material to be added to the target video. The terminal decodes the at least one video simultaneously to obtain the sub-images included in each video, and synthesizes the obtained sub-images into the target video according to the playing order of the corresponding video.
However, since at least one video needs to be decoded simultaneously, synthesizing the sub-images included in the videos into the target video takes a long time and is inefficient.
Disclosure of Invention
The embodiments of the present application provide a video material file generation method, a video synthesis method, a device and a medium, which save the time spent acquiring sub-images, improve the efficiency of adding sub-images to a target video, and save decoding resources. The technical scheme is as follows:
in one aspect, a method for generating a video material file is provided, the method including:
acquiring at least one video;
for each image of the video, responding to a sub-image interception operation of the image, and acquiring the sub-image of the image and the position information of the sub-image in the corresponding image;
And generating a video material file based on the sub-image and the position information, wherein the video material file is used for being analyzed by a terminal to be synthesized to a target video.
Optionally, for each image of the video, in response to a sub-image capturing operation on the image, acquiring a sub-image of the image and position information of the sub-image in the corresponding image includes:
displaying an image of any one of the videos;
displaying a region selection frame on an image of the any one video in response to a clipping operation of the image of the any one video;
and responding to the dragging operation of the region selection frame, intercepting a sub-image in the region selection frame when the dragging operation is finished, and obtaining the sub-image of the image of any video and the position information of the sub-image in the corresponding image.
Optionally, the displaying the image of any one of the videos includes:
and displaying the images of the video with the playing sequence positioned behind the previous video after the previous video is intercepted according to the playing sequence of each video.
Optionally, for each image of the video, in response to a sub-image capturing operation on the image, acquiring a sub-image of the image and position information of the sub-image in the corresponding image includes:
And responding to the triggering operation of the image capturing options, carrying out pattern recognition on the image of each video, acquiring the recognized pattern as a sub-image of the image, and acquiring the position information of the recognized pattern in the corresponding image as the position information of the sub-image in the corresponding image.
Optionally, for each image of the video, in response to a sub-image capturing operation on the image, acquiring a sub-image of the image and position information of the sub-image in the corresponding image includes:
if the sub-images of at least two images have the same shape and different positions in the images, merging the sub-images of the at least two images and the corresponding position information.
Optionally, after the capturing, for each image of the video, a sub-image of the image and position information of the sub-image in the corresponding image in response to a sub-image capturing operation on the image, the method further includes:
determining a playing starting point of a corresponding image of the sub-image in the affiliated video as the playing starting point of the corresponding sub-image;
determining the playing end point of the corresponding image of the sub-image in the video as the playing end point of the corresponding sub-image;
The generating a video material file based on the sub-image and the position information includes:
and generating the video material file based on the sub-image, the position information and the playing start point and the playing end point of the sub-image.
Optionally, the playing start point of each sub-image is the playing start time point of the corresponding image of each sub-image in the affiliated video, or the playing start point of each sub-image is the playing start time stamp of the corresponding image of each sub-image in the affiliated video;
the play end point of each sub-image is the play end time point of the corresponding image of each sub-image in the video to which the corresponding image belongs, or the play end point of each sub-image is the play end time stamp of the corresponding image of each sub-image in the video to which the corresponding image belongs.
Optionally, the generating a video material file based on the sub-image and the position information includes:
storing the name and the position information of the sub-image in a position sub-file correspondingly;
and compressing the position subfiles and the sub-images to generate the video material files.
Optionally, after generating the video material file based on the sub-image and the position information, the method further includes:
And sending the video material files to a server, and storing the video material files by the server.
In another aspect, a video synthesis method is provided, the method including:
acquiring a video material file, wherein the video material file comprises at least one sub-image and position information of each sub-image;
analyzing the video material file to obtain the at least one sub-image and the position information of each sub-image;
and synthesizing the at least one sub-image to the target video based on the position information of each sub-image.
Optionally, the acquiring the video material file includes:
and acquiring the video material file from a server.
Optionally, the parsing the video material file to obtain the at least one sub-image and the position information of each sub-image includes:
analyzing the video material file to obtain the at least one sub-image, the position information of each sub-image and the playing start point and the playing end point of each sub-image;
the synthesizing the at least one sub-image to the target video based on the position information of each sub-image includes:
And based on the play start point and the play end point of each sub-image, the at least one sub-image is overlapped on the video picture of the corresponding point in the target video according to the position information of each sub-image.
Optionally, the playing start point of each sub-image is the playing start time point of the corresponding image of each sub-image in the affiliated video, or the playing start point of each sub-image is the playing start time stamp of the corresponding image of each sub-image in the affiliated video;
the play end point of each sub-image is the play end time point of the corresponding image of each sub-image in the video to which the corresponding image belongs, or the play end point of each sub-image is the play end time stamp of the corresponding image of each sub-image in the video to which the corresponding image belongs.
Optionally, after the parsing the video material file to obtain the at least one sub-image, the position information of each sub-image, and the play start point and the play end point of each sub-image, the method further includes:
displaying the at least one sub-image, the position information of each sub-image and the playing start point and the playing end point of each sub-image;
Modifying at least one of the play start point, the play end point and the position information of each sub-image in response to a modification operation;
the step of superposing the at least one sub-image on the video picture of the corresponding point in the target video according to the position information of each sub-image based on the play start point and the play end point of each sub-image comprises the following steps:
and based on the modified play start point and play end point of each sub-image and the position information of each sub-image, the at least one sub-image is superimposed, according to the modified position information, onto the video pictures corresponding to the start point and the end point in the target video.
In another aspect, there is provided a video material file generating apparatus, the apparatus including:
the video acquisition module is used for acquiring at least one video;
the intercepting module is used for responding to the intercepting operation of the sub-images of the images for the images of each video and acquiring the sub-images of the images and the position information of the sub-images in the corresponding images;
and the file generation module is used for generating a video material file based on the sub-image and the position information, wherein the video material file is used for being analyzed by a terminal to be synthesized to a target video.
Optionally, the intercepting module includes:
an image display unit configured to display an image of any one of the videos;
a selection frame display unit configured to display an area selection frame on an image of any one of the videos in response to a capturing operation of the image of the any one of the videos;
and the intercepting unit is used for intercepting the sub-image in the region selection frame when the dragging operation is finished in response to the dragging operation of the region selection frame, and obtaining the sub-image of the image of any video and the position information of the sub-image in the corresponding image.
Optionally, the image display unit is configured to display, according to the playing order of each video, an image of a video whose playing order is located after the previous video is intercepted.
Optionally, the capturing module is configured to perform pattern recognition on an image of each video in response to a triggering operation on an image capturing option, acquire the recognized pattern as a sub-image of the image, and acquire position information of the recognized pattern in a corresponding image as position information of the sub-image in the corresponding image.
Optionally, the capturing module is configured to combine the sub-images of the at least two images and the corresponding position information if the sub-images of the at least two images have the same shape and different positions in the images.
Optionally, the apparatus further comprises:
the time determining module is used for determining the playing starting point of the corresponding image of the sub-image in the affiliated video as the playing starting point of the corresponding sub-image;
the time determining module is used for determining the playing end point of the corresponding image of the sub-image in the affiliated video as the playing end point of the corresponding sub-image;
the file generation module is used for generating the video material file based on the sub-image, the position information and the playing start point and the playing end point of the sub-image.
Optionally, the playing start point of the corresponding image of the sub-image in the affiliated video is the playing start time point of the corresponding image of the sub-image in the affiliated video, or the playing start point is the playing start time stamp of the corresponding image of the sub-image in the affiliated video;
the playing end point of the corresponding image of the sub-image in the affiliated video is the playing end time point of the corresponding image of the sub-image in the affiliated video, or the playing end point is the playing end time stamp of the corresponding image of the sub-image in the affiliated video.
Optionally, the file generation module is configured to:
storing the name and the position information of the sub-image in a position sub-file correspondingly;
and compressing the position subfiles and the sub-images to generate the video material files.
Optionally, the apparatus further comprises:
and the file sending module is used for sending the video material files to a server, and storing the video material files by the server.
In another aspect, there is provided a video compositing apparatus, the apparatus comprising:
the file acquisition module is used for acquiring a video material file, wherein the video material file comprises at least one sub-image and position information of each sub-image;
the analysis module is used for analyzing the video material file and acquiring the at least one sub-image and the position information of each sub-image;
and the synthesis module is used for synthesizing the at least one sub-image to the target video based on the position information of each sub-image.
Optionally, the file obtaining module is configured to obtain the video material file from a server.
Optionally, the parsing module is configured to parse the video material file, and obtain the at least one sub-image, the position information of each sub-image, and the play start point and the play end point of each sub-image;
The synthesizing module is configured to superimpose the at least one sub-image onto a video frame corresponding to the start point and the end point in the target video according to the position information of each sub-image based on the play start point and the play end point of each sub-image.
Optionally, the playing start point of each sub-image is the playing start time point of the corresponding image of each sub-image in the affiliated video, or the playing start point of each sub-image is the playing start time stamp of the corresponding image of each sub-image in the affiliated video;
the play end point of each sub-image is the play end time point of the corresponding image of each sub-image in the video to which the corresponding image belongs, or the play end point of each sub-image is the play end time stamp of the corresponding image of each sub-image in the video to which the corresponding image belongs.
Optionally, the apparatus further comprises:
the display module is used for displaying the at least one sub-image, the position information of each sub-image, and the playing start point and the playing end point of each sub-image;
the modification module is used for responding to modification operation and modifying at least one of the play start point, the play end point and the position information of each sub-image;
The synthesizing module is configured to superimpose the at least one sub-image on a video frame corresponding to the start point and the end point in the target video according to the modified position information based on the modified play start point and the play end point of each sub-image and the position information of each sub-image.
In another aspect, there is provided a computer device including a processor and a memory, the memory storing at least one program code, the at least one program code being loaded and executed by the processor to implement the video material file generation method as described in the above aspect, or to implement the video composition method as described in the above aspect.
In another aspect, there is provided a computer readable storage medium having stored therein at least one program code loaded and executed by a processor to implement the video material file generation method as described in the above aspect, or to implement the video composition method as described in the above aspect.
In yet another aspect, a computer program product or a computer program is provided, the computer program product or the computer program comprising computer program code stored in a computer readable storage medium, the computer program code being read from the computer readable storage medium by a processor of a computer device, the computer program code being executed by the processor to cause the computer device to implement the video material file generation method as described in the above aspect, or to cause the computer device to implement the video composition method as described in the above aspect.
According to the video material file generation method, the video synthesis method, the device and the medium provided by the embodiments of the present application, a sub-image is captured from an image of any one of at least one video, so that the terminal can combine the captured sub-image with the position information of the sub-image in the corresponding image to generate a video material file. Other terminals can then parse the video material file and synthesize it into a target video, achieving the effect of adding material to the target video. In the process of adding material to the target video there is no need to decode the video to obtain the sub-images to be added, which saves the time spent acquiring sub-images, improves the efficiency of adding sub-images to the target video, and saves decoding resources.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of an implementation environment according to an embodiment of the present application.
Fig. 2 is a flowchart of a method for generating a video material file according to an embodiment of the present application.
Fig. 3 is a flowchart of a video synthesizing method according to an embodiment of the present application.
Fig. 4 is a flowchart of a method for generating a video material file according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a display area selection frame according to an embodiment of the present application.
Fig. 6 is a schematic diagram of a display interface according to an embodiment of the present application.
Fig. 7 is an interface schematic diagram of a video material file according to an embodiment of the present application.
Fig. 8 is a flowchart of a method for generating a video material file according to an embodiment of the present application.
Fig. 9 is a flowchart of a video synthesizing method according to an embodiment of the present application.
Fig. 10 is a schematic diagram of a video frame according to an embodiment of the present application.
Fig. 11 is a flowchart of a method for generating video material files and synthesizing video according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of a video material file generating apparatus according to an embodiment of the present application.
Fig. 13 is a schematic structural diagram of another video material file generating apparatus according to an embodiment of the present application.
Fig. 14 is a schematic structural diagram of a video synthesizer according to an embodiment of the present application.
Fig. 15 is a schematic structural diagram of another video synthesizer according to an embodiment of the present application.
Fig. 16 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Fig. 17 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
It is to be understood that the terms "first," "second," "third," "fourth," "fifth," "sixth," etc. as used herein may be used to describe various concepts, but are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, the first permutation may be referred to as a second permutation and the second permutation may be referred to as a first permutation without departing from the scope of the present application.
The terms "each," "plurality," "at least one," "any" and the like as used herein, at least one includes one, two or more, a plurality includes two or more, and each refers to each of a corresponding plurality, any of which refers to any of the plurality. For example, the plurality of elements includes 3 elements, and each refers to each of the 3 elements, and any one refers to any one of the 3 elements, which may be the first, the second, or the third.
Fig. 1 is a schematic structural diagram of an implementation environment according to an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102, which are connected by a wireless or wired network.
The terminal 101 intercepts a sub-image of an image of any one video from at least one video, acquires position information of the sub-image in a corresponding image, and can generate a video material file based on the acquired sub-image and the position information. The terminal 101 can also transmit the generated video material files to the server 102, the server 102 is configured to store the received video material files, and the server 102 can also transmit the video material files to other terminals, which parse the video material files to synthesize into a target video.
The terminal in the embodiments of the present application is a mobile phone, a tablet computer, a computer, or another type of terminal, and the server is a single server, a server cluster composed of a plurality of servers, or a cloud computing service center.
The method provided by the embodiments of the present application is applied to a video editing scenario. A user can watch any video through the terminal and edit the video to add any material to it; the user therefore controls the terminal to obtain the video material file provided by the embodiments of the present application, and the terminal can parse the video material file and synthesize it into the target video, thereby adding the material to the target video.
Fig. 2 is a flowchart of a method for generating a video material file according to an embodiment of the present application. Referring to fig. 2, the method includes:
201. the terminal acquires at least one video.
Wherein any one of the videos includes a plurality of images, which are sequentially ordered in order from front to back to construct the video. The video is a material video, and sub-images included in the video can be superimposed into video pictures of other videos to beautify the video pictures of the other videos.
Optionally, the video in the embodiment of the present application is stored in a server, and the terminal can obtain at least one video from the server; or, the video in the embodiment of the application is stored in the terminal, and the terminal can directly acquire the stored video.
202. For each image of the video, the terminal acquires the sub-image of the image and the position information of the sub-image in the corresponding image in response to the sub-image capturing operation on the image.
In the embodiment of the present application, the terminal can display the images of each video; if a sub-image capturing operation on any image is detected, the terminal determines the region corresponding to the capturing operation and acquires the sub-image located in that region together with the position information of the sub-image in the corresponding image.
For example, the sub-image capturing operation in the embodiments of the present application includes a combination of a long-press operation and a slide operation, an operation of displaying a region selection frame and dragging the region, or other types of operations. The capturing operation is not expanded upon here; a detailed description is given in the following embodiments.
The captured sub-image is part or all of the corresponding image. The position information is used to indicate the position of the sub-image in the corresponding image. For example, the position information is represented by the upper-left corner coordinates, width, and height of the sub-image, by the upper-right corner coordinates, width, and height of the sub-image, by the center coordinates, width, and height of the sub-image, or in other ways.
For example, if the position information is represented by the upper left corner coordinates, width, and height of the sub-image, the upper left corner coordinates may be (60, 100), the width may be 100, and the height may be 100. Alternatively, the upper left corner is (50, 60), the width is 60, and the height is 100.
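The following sketch (illustrative only, in Python; not part of the claimed method) shows one way the position information of a sub-image could be represented as a record, using the left/top/width/height fields that the position sub-file described later in this application records:

```python
from dataclasses import dataclass

@dataclass
class SubImagePosition:
    """Position of a captured sub-image inside its source image.

    Coordinates are in pixels, with the origin at the top-left corner
    of the source image (an assumption made for this sketch).
    """
    left: int    # x coordinate of the sub-image's top-left corner
    top: int     # y coordinate of the sub-image's top-left corner
    width: int
    height: int

# Example from the text: a sub-image whose top-left corner is (60, 100),
# 100 pixels wide and 100 pixels high.
pos = SubImagePosition(left=60, top=100, width=100, height=100)
```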
203. The terminal generates a video material file based on the sub-image and the position information.
The video material file is used for being analyzed by the terminal to be synthesized to the target video.
In the embodiment of the application, the terminal can analyze the video material file to acquire the sub-image and the position information synthesized to the target video, and the video is not required to be decoded any more, so that the energy consumption of the terminal can be saved.
Optionally, the video material file is a compressed file, after the terminal obtains the video material file, the terminal can analyze the video material file, so as to obtain a sub-image and position information included in the video material file, and further synthesize the sub-image to the target video according to the position information.
According to the method provided by this embodiment of the present application, a sub-image is captured from an image of any one of at least one video, so that the terminal can combine the captured sub-image with the position information of the sub-image in the corresponding image to generate a video material file. Other terminals can then parse the video material file and synthesize it into a target video, achieving the effect of adding material to the target video. In the process of adding material to the target video there is no need to decode the video to obtain the sub-images to be added, which saves the time spent acquiring sub-images, improves the efficiency of adding sub-images to the target video, and saves decoding resources.
Fig. 3 is a flowchart of a video synthesizing method according to an embodiment of the present application. Referring to fig. 3, the method includes:
301. and the terminal acquires the video material file.
Wherein the video material file includes at least one sub-image and position information of each sub-image.
The video material files in the embodiment of the present application are the same as those in step 203, and will not be described here again.
302. The terminal analyzes the video material file and obtains at least one sub-image and the position information of each sub-image.
In the embodiment of the application, the video material files are stored in the form of files, and after the terminal acquires the video material files, the video material files can be analyzed.
In addition, the sub-images and the position information of the sub-images in the embodiment of the present application are the same as those in the step 202, and are not described herein.
303. The terminal synthesizes at least one sub-image to the target video based on the position information of each sub-image.
The position information of the sub-images is used for indicating the positions of the sub-images in the corresponding images, so that the terminal can add the sub-images to the positions corresponding to the position information in the target video according to the position information of each sub-image, and the effect of adding the sub-images in the target video is achieved.
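As a hedged illustration of steps 302-303, the following Python sketch blends one sub-image onto a target video frame at the position given by its position information. It assumes the frame and sub-image are NumPy arrays (for example decoded with OpenCV), that the sub-image carries an alpha channel, and that it lies fully inside the frame; these assumptions and the function name are not taken from the original text.

```python
import numpy as np

def overlay_sub_image(frame: np.ndarray, sub_image: np.ndarray,
                      left: int, top: int) -> np.ndarray:
    """Alpha-blend a BGRA sub-image onto a BGR video frame at (left, top).

    Assumes the sub-image fits entirely within the frame.
    """
    h, w = sub_image.shape[:2]
    roi = frame[top:top + h, left:left + w]
    alpha = sub_image[:, :, 3:4].astype(np.float32) / 255.0
    blended = alpha * sub_image[:, :, :3] + (1.0 - alpha) * roi
    frame[top:top + h, left:left + w] = blended.astype(frame.dtype)
    return frame
```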
According to the method provided by this embodiment of the present application, the video material file is parsed to obtain the sub-images and the corresponding position information, so that the sub-images can be synthesized into the target video according to the position information. In the process of adding material to the target video there is no need to decode the video to obtain the sub-images to be added, which saves the time spent obtaining sub-images, improves the efficiency of adding sub-images to the target video, and saves decoding resources.
Fig. 4 is a flowchart of a method for generating a video material file according to an embodiment of the present application. Referring to fig. 4, the method includes:
401. the terminal acquires at least one video.
To reduce the resources consumed in decoding video, the present application captures sub-images from the images included in the video in advance, and compresses the captured sub-images together with their positions in the corresponding images to generate a video material file, thereby reducing the terminal's resource consumption for decoding video. At least one video is therefore acquired before the video material file is generated.
At least one video acquired by the terminal is a material video, the images included in the material video comprise sub-images, and the sub-images in the images are intercepted, so that the intercepted sub-images can be synthesized into other video pictures, and other video contents are richer.
Optionally, if the video is stored in the server, the terminal sends a video acquisition request to the server, and the server can send the video to the terminal based on the video acquisition request, and the terminal acquires the video sent by the server.
For example, the terminal displays at least one candidate video, and acquires at least one video in a selected state in response to a selection operation of any one of the candidate videos. Wherein the selection operation is a single click operation, a double click operation, a long press operation, or other type of operation.
Alternatively, if the video is stored in the terminal, the terminal directly displays the stored video. For example, the terminal displays the stored videos, and acquires at least one video in a selected state in response to a selection operation of any one video.
In the embodiment of the application, the terminal can acquire at least one video in any one of the following modes:
(1) Videos are classified by genre, e.g., the genre of the video includes cartoon genre, landscape genre, music title genre, artistic word genre, etc. The terminal can acquire according to the type of the video in the process of acquiring at least one video.
Optionally, the terminal displays at least one video in the interface corresponding to each type according to the type of the video, the terminal responds to the triggering operation on the target type, displays at least one video corresponding to the target type, and responds to the selection operation on any video, and determines to acquire the video in the selection state. In the embodiment of the application, the video can be acquired according to the type, the video material files comprising the sub-images of the same type can be generated later, and the uniformity of the sub-images included in the video material files can be ensured.
(2) The videos are arranged according to the time sequence, and then the terminal can acquire the videos according to the sequence of the release time of the videos in the process of acquiring at least one video.
For example, the terminal displays the videos in order of release time from first to last, and obtains the video in the selected state in response to a selection operation on any one of the videos. According to the embodiment of the application, the video can be acquired according to the release time of the video, and the video material file comprising the newly released sub-image can be generated subsequently, so that the timeliness of the included sub-image is ensured, and the timeliness of the generated video material file is further improved.
(3) The videos are arranged according to their heat values, so in the process of acquiring at least one video the terminal can acquire videos in descending order of heat value. The heat value represents the popularity of a video: the higher the heat value, the more popular the video; the lower the heat value, the less popular the video.
For example, the terminal displays the videos in order of high-to-low hotness values, and acquires the video in the selected state in response to a selection operation of any one of the videos. According to the embodiment of the application, the video can be acquired according to the popularity value of the video, the video material file comprising the sub-image with high popularity degree can be generated later, and the utilization rate of the video material file can be improved.
402. The terminal displays an image of any one video.
In the embodiment of the present application, any video includes a plurality of images, and the terminal can sequentially display any image in any video.
After the terminal acquires at least one video, the terminal can decode the video to acquire a plurality of images included in each video, and then can display the images in any video.
Since a video is composed of a plurality of images ordered from front to back, the images of any one video are displayed sequentially in that order while the images of the video are being displayed. Optionally, after the previous video has been captured, the images of the video whose playing order follows the previous video are displayed according to the playing order of each video.
The terminal in the embodiment of the present application acquires at least one video based on an application interface, and displays the images included in each video based on the at least one video. The application interface is provided by a target application with a video material generation function. Optionally, in the process of displaying the images of at least one video based on the application interface, the terminal displays the images of the videos sequentially in the order in which the videos were selected, in an order designated by the user, or in another order. The terminal first displays the plurality of images of one video, performs steps 403-404 to capture sub-images in those images, and then performs step 405 to generate the video material file.
It should be noted that steps 403-404 only illustrate the capture of any one image in one video; other images in the video, or images in other videos, are captured in the same way as in steps 403-404, which is not limited in this embodiment of the present application.
403. The terminal displays an area selection frame on an image of any one of the videos in response to a clipping operation of the image of any one of the videos.
In the embodiment of the application, after the terminal displays the image of any video, if the intercepting operation of the image is detected, the user is determined to need to intercept the image, and an area selection frame is displayed in the image of the video. For example, as shown in fig. 5, after the terminal responds to the intercepting operation of any one video, a rectangular region selection frame is displayed on the image.
Alternatively, the region selection frame may include a variety of shapes, such as square, rectangular, heart, or other shapes, and embodiments of the present application are not limited.
In the embodiment of the application, the terminal responds to the intercepting operation of the image of any video, displays the region selection frame of at least one shape, the user can select any shape from at least one shape, the selecting operation is carried out on the selected shape, and the terminal responds to the selecting operation, and displays the region selection frame corresponding to the shape in the image of any video.
When a user selects the shape of the region selection frame, the region selection frame with similar shape can be selected according to the shape of the sub-image to be intercepted in the image, so that the accuracy of intercepting the sub-image by the region selection frame with the shape is improved.
In some embodiments, the terminal further provides a function for customizing the region selection frame: instead of choosing a region selection frame with a predefined shape, the user can perform a freehand circle-selection operation through the terminal, and the terminal displays a region selection frame of the corresponding custom shape in response to the circle-selection operation.
For example, while an image is displayed, the user can slide along the edge of the sub-image to be captured so that the end of the sliding path coincides with its start, completing the circle-selection of the sub-image, and the terminal then displays the region selection frame formed by the sliding path.
Optionally, if the terminal detects the operation of the combination key of the input device during the process of displaying the image of any video, the terminal enters an image capturing state, and then the user can select the region in the interface for displaying the image, and further the terminal responds to the selecting operation to display the region selection frame. For example, the combination key operates as "ctrl+alt", "ctrl+alt+f", or as another combination key.
In the embodiment of the present application, the region selection frame is displayed according to the intercepting operation, and in another embodiment, the size of the displayed region selection frame can be adjusted in response to the adjusting operation after the terminal displays the region selection frame.
In the embodiment of the present application, the manners of adjusting the size of the area selection frame include, but are not limited to, the following manners:
(1) After detecting the dragging operation on the edge of the area selection frame, the terminal responds to the dragging operation to adjust the size of the area selection frame, and then adjusts the edge of the area selection frame to the stopping position of the dragging operation.
For example, if the terminal detects a rightward drag on the right edge of the region selection frame, it expands the region selection frame to the right; if it detects a downward drag on the upper edge of the region selection frame, it shrinks the region selection frame downward; and if it detects an outward drag on the upper-right corner of the region selection frame, it expands the region selection frame upward and to the right. The terminal can also expand or shrink the region selection frame according to other operations, which is not limited in this embodiment of the present application.
For example, as shown in fig. 6, when the two-way arrow shown in fig. 6 is displayed after the mouse pointer points to the edge of the region selection frame, the user can trigger a drag operation to adjust the size of the region selection frame.
(2) After detecting the adjustment operation to the edge of the area selection frame, the terminal suspends and displays a size adjustment window, responds to the input operation to the size adjustment window, and adjusts the size of the area selection frame according to the size input by the input operation.
Alternatively, the center coordinates of the region selection frame, the width and the height of the region selection frame are displayed in the size adjustment window with reference to the center of the region selection frame, and the terminal can adjust the size of the region selection frame based on the coordinates, the width or the height inputted in the size adjustment window.
404. And the terminal responds to the dragging operation of the region selection frame, intercepts the sub-image in the region selection frame when the dragging operation is finished, and obtains the sub-image of the image of any video and the position information of the sub-image in the corresponding image.
In the embodiment of the application, the terminal displays the area selection frame, and the user can also adjust the position of the area selection frame by triggering the dragging operation of the area selection frame, so that the sub-image included in the adjusted area selection frame is the sub-image to be intercepted.
If the user needs to adjust the position of the area selection frame, triggering a dragging operation on the area selection frame in the terminal, dragging the area selection frame to the end position of the dragging operation after the terminal detects the dragging operation, and then intercepting a sub-image in the area selection frame to obtain the sub-image and the position information of the sub-image in the corresponding image.
The sub-images in the application are all images cut out from the images of the video, so the sub-images have positions in the corresponding images, the terminal records the position information of the sub-images, and the sub-images can be added into other videos according to the position information. In addition, if the image of any video includes multiple sub-images, the terminal can execute steps 403-404 multiple times to intercept the multiple sub-images and determine the position of each sub-image in the corresponding image.
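A minimal sketch of the crop step described above, assuming the displayed image is available as a NumPy-style array indexed as image[row, column] and that the region selection frame is given by its final left/top/width/height after dragging (the function and field names are illustrative, not part of the original text):

```python
def crop_sub_image(image, box_left, box_top, box_width, box_height):
    """Cut the region inside the selection box out of the video image
    and return both the sub-image and its position information."""
    sub_image = image[box_top:box_top + box_height,
                      box_left:box_left + box_width].copy()
    position = {"left": box_left, "top": box_top,
                "width": box_width, "height": box_height}
    return sub_image, position
```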
It should be noted that, the embodiment of the present application is described only by taking the operation of capturing the sub-image as the operation of manually controlling the region selection frame to capture the sub-image as an example. In another embodiment, the terminal is capable of automatically intercepting sub-images in the image in response to the sub-image intercepting operation. Optionally, the terminal performs pattern recognition on the image of each video in response to a triggering operation on the image capturing option, acquires the recognized pattern as a sub-image of the image, and acquires the position information of the recognized pattern in the corresponding image as the position information of the sub-image in the corresponding image.
In the embodiment of the application, an image capturing option is included in an interface for displaying an image of a video, and the image capturing option is used for triggering a sub-image capturing operation for automatically capturing the image. If the terminal detects the triggering operation of the image interception option, determining that the image in the image needs to be intercepted, performing image recognition on the image at the moment, and performing image recognition on the image to obtain a sub-image. The terminal can identify the graph by adopting a straight line extraction method, or identify the graph by adopting an image segmentation method, or identify the graph by adopting other modes.
The terminal in the embodiment of the application can automatically carry out pattern recognition on the image based on the detected sub-image interception operation so as to automatically intercept the sub-image, thereby improving the accuracy of intercepting the sub-image, simplifying the operation flow of intercepting the sub-image and improving the efficiency of intercepting the sub-image.
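As one possible, hypothetical realization of the automatic pattern recognition mentioned above, the sketch below uses simple OpenCV contour detection; the original text only names straight-line extraction and image segmentation as examples, so this approach is an assumption rather than the claimed implementation. It assumes OpenCV 4 and a BGR input image.

```python
import cv2

def detect_sub_images(image):
    """Find candidate patterns in a video image by contour detection and
    return (sub_image, position) pairs."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    results = []
    for contour in contours:
        left, top, width, height = cv2.boundingRect(contour)
        if width < 10 or height < 10:   # skip tiny regions (arbitrary threshold)
            continue
        sub_image = image[top:top + height, left:left + width].copy()
        results.append((sub_image, {"left": left, "top": top,
                                    "width": width, "height": height}))
    return results
```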
In some embodiments, after the position information of the sub-image in the corresponding image is determined by the above steps, the truncated sub-image and the corresponding position information can be combined, so that the number of sub-images is reduced on the premise that the effect of storing the sub-image is not affected, the data size of the sub-images is reduced, and the utilization rate of the storage space is improved.
Optionally, if the sub-images of the at least two images have the same shape and different positions in the images, the sub-images of the at least two images and the corresponding position information are combined.
Combining the sub-images of at least two images and corresponding position information comprises:
merging the sub-images having the same shape into one sub-image, and storing the position information of each of the same-shaped sub-images in correspondence with the merged sub-image, so that one sub-image corresponds to a plurality of pieces of position information.
In one possible implementation, one sub-image is randomly selected from sub-images with the same shape as the combined sub-image, the other sub-images are deleted, the position information of the other sub-images is reserved, and the reserved position information is stored corresponding to the combined sub-image.
In another possible implementation manner, one sub-image with the highest image quality is selected from sub-images with the same shape as a combined sub-image, other sub-images are deleted, position information of the other sub-images is reserved, and the reserved position information is stored corresponding to the combined sub-image.
In another possible implementation manner, the sub-images with the same shape are averaged to obtain a combined sub-image, and the position information of each sub-image in the sub-images with the same shape is stored corresponding to the combined sub-image.
For example, in the embodiment of the present application, if there are three sub-images with the same shape and different positions in the image, the three sub-images are combined to obtain a combined sub-image, and the position information of the three sub-images is reserved, and the position information of the three sub-images is stored corresponding to the combined sub-image.
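A simplified sketch of the merging step follows. It treats "same shape" as "same pixel dimensions" (a simplification of the test described above) and keeps the first sub-image of each group together with the position information of every occurrence; the data layout is assumed for illustration.

```python
def merge_same_shape(sub_images):
    """Merge sub-images with identical dimensions, keeping one image and
    the position information of every occurrence.

    `sub_images` is a list of (image, position) pairs, where `position`
    is a dict with "left", "top", "width", and "height" keys.
    """
    merged = {}  # (width, height) -> {"image": ..., "positions": [...]}
    for image, position in sub_images:
        key = (position["width"], position["height"])
        if key not in merged:
            merged[key] = {"image": image, "positions": []}
        merged[key]["positions"].append(position)
    return list(merged.values())
```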
405. The terminal generates a video material file based on the sub-image and the position information.
In the embodiment of the application, the terminal can intercept sub-images in the images and also can acquire the position information of each sub-image in the corresponding image, and at the moment, each sub-image has the corresponding position information, so that the intercepted sub-images and the corresponding position information are stored in the video material file.
In some embodiments, fig. 7 is an interface schematic diagram of a video material file according to an embodiment of the present application. As shown in fig. 7, the video material file includes a square image file, a circular image file, a five-pointed-star image file, and a position sub-file. Correspondingly, generating the video material file is implemented as follows: the name and the position information of each image file are stored correspondingly in the position sub-file, and the square image file, the circular image file, the five-pointed-star image file, and the position sub-file are compressed to generate the video material file. For example, the position sub-file is template.json and the video material file is template.zip.
It should be noted that, the sub-image in the embodiment of the present application may be any image. The embodiments of the present application will be described by taking a circle, a five-pointed star, and a square as examples.
The sub-images are named after the shapes they contain, named according to the order in which they were captured, or named in other ways. For example, the header of the position sub-file includes the left coordinate, the top coordinate, the width, the height, and the name of the sub-image, and the position information of each sub-image is recorded sequentially in the body of the position sub-file in the order of left coordinate, top coordinate, width, height, and sub-image name. The left coordinate is represented by "left", the top coordinate by "top", the width by "width", and the height by "height".
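The following sketch assembles a video material file in the layout suggested by fig. 7: the sub-image position records are written to template.json and compressed together with the image files into template.zip. The exact JSON layout, the input data structure, and the function signature are assumptions made for illustration.

```python
import json
import zipfile

def generate_material_file(sub_images, output_path="template.zip"):
    """Write each sub-image and a position sub-file (template.json) into
    a compressed video material file.

    `sub_images` maps a sub-image name (e.g. "circle.png") to a dict with
    its encoded image bytes ("bytes") and its position information
    ("left", "top", "width", "height").
    """
    positions = [
        {"name": name, "left": info["left"], "top": info["top"],
         "width": info["width"], "height": info["height"]}
        for name, info in sub_images.items()
    ]
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("template.json", json.dumps(positions, indent=2))
        for name, info in sub_images.items():
            archive.writestr(name, info["bytes"])
    return output_path
```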
In the embodiment of the application, after the terminal generates the video material file, the video material file is sent to the server, the server stores the video material file, and other terminals can acquire the video material file from the server subsequently, so as to analyze the video material file to synthesize the video material file into the target video.
It should be noted that this embodiment of the present application only takes the terminal performing steps 401 to 405 as an example; the terminal may perform steps 401 to 405 through a template material editor to generate the video material file. In addition, after generating the video material file through the template material editor, the terminal can also parse the video material file with a template material player and play at least one sub-image included in the video material file.
According to the method provided by this embodiment of the present application, a sub-image is captured from an image of any one of at least one video, so that the terminal can combine the captured sub-image with the position information of the sub-image in the corresponding image to generate a video material file. Other terminals can then parse the video material file and synthesize it into a target video, achieving the effect of adding material to the target video. In the process of adding material to the target video there is no need to decode the video to obtain the sub-images to be added, which saves the time spent acquiring sub-images, improves the efficiency of adding sub-images to the target video, and saves decoding resources.
In addition, the embodiment of the application can also determine the sub-images with the same shape and different position information, combine the determined sub-images with the corresponding position information, and reduce the number of the sub-images on the premise of not influencing the effect of storing the sub-images, thereby reducing the data quantity of the stored sub-images and improving the utilization rate of the storage space.
In addition, the terminal in the embodiment of the application can automatically perform pattern recognition on an image based on a detected sub-image capture operation so as to capture the sub-image automatically, which improves the accuracy of sub-image capture, simplifies the capture workflow, and improves capture efficiency.
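As a rough illustration of such automatic capture, the sketch below uses OpenCV contour detection to locate shape-like regions in a frame and record their bounding boxes as position information. This is only one possible realization under assumed parameters (for example the fixed binarization threshold), not the specific recognition method of the embodiment.

```python
import cv2

def capture_subimages(frame):
    """Detect shape-like regions in a frame and return (sub_image, position) pairs."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Assumed fixed threshold; a real implementation would tune or learn this.
    _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    results = []
    for contour in contours:
        left, top, width, height = cv2.boundingRect(contour)
        sub_image = frame[top:top + height, left:left + width]
        results.append((sub_image, {"left": left, "top": top,
                                    "width": width, "height": height}))
    return results
```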
In the above embodiments, the terminal has been described only by taking the case of generating the video material file from the sub-image and the position information as an example. In the following embodiments, the terminal may further obtain the play start point and the play end point, in the video to which it belongs, of the image from which the sub-image was captured, and then generate the video material file based on the sub-image, the position information, and the obtained play start point and play end point, as detailed in the embodiment of fig. 8:
fig. 8 is a flowchart of a method for generating a video material file according to an embodiment of the present application. Referring to fig. 8, the method includes:
801. the terminal acquires at least one video.
802. For each image of the video, the terminal acquires the sub-image of the image and the position information of the sub-image in the corresponding image in response to the sub-image capturing operation on the image.
The process of steps 801 to 802 is similar to that of steps 401 to 404, and will not be repeated here.
803. The terminal determines the play start point of the corresponding image of the sub-image in the video to which it belongs as the play start point of the corresponding sub-image.
In the embodiment of the application, the terminal performs the capture operation on the image to obtain the sub-image, and the image corresponding to the sub-image has a play point in the video to which it belongs, so the terminal can determine the play start point of that image in the video and determine it as the play start point of the corresponding sub-image.
In some embodiments, the play start point is a play start time point of a corresponding image of the sub-image in the affiliated video, or the play start point is a play start time stamp of a corresponding image of the sub-image in the affiliated video.
In the process of capturing sub-images, the video is in a playing state; the terminal acquires the play time point at which the video is currently playing when the sub-image is first captured, and determines this play time point as the play start point of the sub-image.
Alternatively, each video includes a plurality of video frames, each video frame corresponds to a time stamp, and the time stamp represents the original play time point of the video. In the process of capturing sub-images, the terminal acquires the time stamp of the frame from which the sub-image was first captured, and determines this time stamp as the play start point of the sub-image.
For example, for any video, if the play start point in the video of the image from which the sub-image is captured is 0 seconds, the play start point of the sub-image captured from that image is also 0 seconds; if the play start point of the image is 3 seconds, the play start point of the sub-image is also 3 seconds.
804. The terminal determines the play end point of the corresponding image of the sub-image in the video to which it belongs as the play end point of the corresponding sub-image.
In the embodiment of the present application, the terminal can determine not only the play start point of the sub-image through step 803 but also its play end point: the terminal determines the play end point of the image in the video and then determines it as the play end point of the corresponding sub-image.
In some embodiments, the play end point is a play end time point of a corresponding image of the sub-image in the affiliated video, or the play end point is a play end time stamp of a corresponding image of the sub-image in the affiliated video.
In the process of capturing sub-images, the video is likewise in a playing state; the terminal acquires the play time point at which the video is currently playing when the sub-image is last captured, and determines this play time point as the play end point of the sub-image.
Alternatively, each video includes a plurality of video frames, each video frame corresponds to a time stamp, and the time stamp represents the original play time point of the video. In the process of capturing sub-images, the terminal acquires the time stamp of the frame from which the sub-image was last captured, and determines this time stamp as the play end point of the sub-image.
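A simple way to derive these two values, sketched below purely for illustration, is to take the time stamps of the first and last frames from which the sub-image was captured, assuming those time stamps (in seconds) are available in play order; the function and variable names are assumptions.

```python
def play_range(captured_frame_timestamps):
    """Return the play start point and play end point of a sub-image.

    `captured_frame_timestamps` holds the time stamps, in play order, of every
    frame from which this sub-image was captured (assumed to be in seconds).
    """
    start_point = captured_frame_timestamps[0]   # first frame the sub-image was captured from
    end_point = captured_frame_timestamps[-1]    # last frame the sub-image was captured from
    return start_point, end_point

# Example: a sub-image captured from frames played at 0.0 s through 2.0 s.
print(play_range([0.0, 0.5, 1.0, 1.5, 2.0]))  # (0.0, 2.0)
```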
Note that, in the present application, steps 803 and 804 may be executed sequentially, with 803 before 804 or 804 before 803, or they may be executed simultaneously.
805. The terminal generates a video material file based on the sub-image, the position information, and the play start point and the play end point of the sub-image.
In the embodiment of the present application, the process of generating the video material file by the terminal based on the sub-image, the position information and the play start point and play end point of the sub-image is similar to the process of generating the video material file in step 405, and will not be described herein.
It should be noted that, unlike step 405, in step 805, the name, the position information, the play start point, and the play end point of the sub-image are stored in the position sub-file, and the position sub-file and the sub-image are compressed to obtain the video material file.
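Continuing the packaging sketch given earlier, each record in the position sub-file would then also carry the play start point and play end point of the sub-image. The field names "start" and "end" below are assumptions for illustration only.

```python
# One possible record layout for step 805: position information plus timing.
record = {
    "left": 100, "top": 80, "width": 120, "height": 120,
    "name": "square",
    "start": 0.0,  # play start point of the sub-image, in seconds
    "end": 2.0,    # play end point of the sub-image, in seconds
}
```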
According to the method provided by the embodiment of the application, sub-images can be captured from the images of each video, the position information of the sub-images in the corresponding images can be determined, and the play start point and play end point of each sub-image can be determined, so that a video material file including the sub-images, the position information, and the play start point and play end point of each sub-image is generated. Other terminals can then add the sub-images to the target video according to their play start points and play end points, which improves the accuracy of adding sub-images. Since there is no need to decode a video to obtain the sub-image to be added during the process of adding material to the target video, the time consumed in obtaining the sub-image is saved, the efficiency of adding the sub-image to the target video is improved, and decoding resources are saved.
The terminal may further synthesize the video based on the already-produced video material file to obtain a video with an additional display effect, and the synthesis process will be described below with reference to the embodiment shown in fig. 9. Fig. 9 is a flowchart of a video synthesizing method according to an embodiment of the present application. Referring to fig. 9, the method includes:
901. The terminal acquires the video material file.
The video material file comprises at least one sub-image, position information of each sub-image, and a playing start point and a playing end point of each sub-image.
Optionally, at least one video material file is stored in the server, and after the terminal sends a file acquisition instruction to the server, the video material file is acquired from the server.
In the embodiment of the application, the terminal can acquire the video material files in any one of the following modes:
(1) When the server stores video material files, it can store them according to their types, and the terminal can then acquire video material files based on type.
The types of the video material files comprise cartoon types, landscape types, music subject types, artistic word types and the like.
(2) The server arranges the video material files in order of their generation time, and when acquiring video material files, the terminal acquires them in that order.
For example, the terminal acquires a preset number of video material files in order of generation time from earliest to latest. The preset number is set by the terminal, by the server, by an operator, or in another way.
(3) The server stores a heat value for each video material file, so that in the process of acquiring at least one video material file, the terminal can acquire the video material files in descending order of heat value. The heat value represents the popularity of a video material file: the higher the heat value, the more popular the video material file, and the lower the heat value, the less popular it is.
902. The terminal parses the video material file to obtain at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image.
In the embodiment of the application, the video material file includes at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image. The terminal therefore parses the acquired video material file to obtain the at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image included in the video material file.
Optionally, the video material file includes at least one sub-image and a position sub-file. When parsing the video material file, the terminal first obtains the at least one sub-image and the position sub-file included in the video material file, and then parses the position sub-file to obtain the position information of each sub-image and the play start point and play end point of each sub-image recorded in the position sub-file.
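A minimal parsing sketch is given below, assuming the packaging format illustrated earlier, namely a zip archive containing template.json plus one image file per sub-image named after its record; the archive layout and field names are assumptions, not a prescribed format.

```python
import json
import zipfile

def parse_material_file(path="template.zip"):
    """Extract sub-image bytes and their position/timing records from a material file."""
    with zipfile.ZipFile(path) as z:
        # The position sub-file holds one record per sub-image.
        records = json.loads(z.read("template.json"))
        # Assumed convention: each sub-image is stored as "<name>.png".
        sub_images = {rec["name"]: z.read(rec["name"] + ".png") for rec in records}
    return sub_images, records
```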
In some embodiments, the play start point of each sub-image is a play start time point of a corresponding image of each sub-image in the affiliated video, or the play start point of each sub-image is a play start time stamp of a corresponding image of each sub-image in the affiliated video.
The play start point of each sub-image is the same as that of step 803 in the above embodiment, and will not be described here again.
903. Based on the play start point and play end point of each sub-image, the terminal superimposes the at least one sub-image, according to the position information of each sub-image, on the video pictures of the target video corresponding to those start and end points.
In the embodiment of the application, the terminal has determined the position of each sub-image in its image as well as the play start point and play end point of each sub-image, so the terminal can superimpose each sub-image on the video pictures of the target video according to the position information, the play start point, and the play end point, thereby achieving the effect of adding sub-images to the target video.
For example, if the play start point of the sub-image is 0 seconds and the play end point is 2 seconds, the sub-image is superimposed on the video screen within the time period of 0 seconds to 2 seconds of the target video.
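The superposition can be pictured as follows: for each frame of the target video whose play time falls between a sub-image's play start point and play end point, the sub-image is pasted at its recorded position. The sketch below operates on NumPy-style image arrays, ignores transparency and scaling, and assumes the pasted region lies within the frame; it is illustrative only.

```python
def overlay(frame, frame_time, sub_image, rec):
    """Paste `sub_image` onto `frame` if `frame_time` lies inside its play range."""
    if rec["start"] <= frame_time <= rec["end"]:
        top, left = rec["top"], rec["left"]
        h, w = sub_image.shape[:2]
        # Place the sub-image at its recorded position (assumes it fits in the frame).
        frame[top:top + h, left:left + w] = sub_image
    return frame
```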
For example, when the video material file includes a square, a circle, and a five-pointed star, the sub-images are superimposed on the video picture in the above-described manner, and the video picture is displayed as the interface shown in fig. 10.
It should be noted that the embodiments of the present application have so far been described by taking as an example the case where the at least one sub-image in the video material file, the position information of each sub-image, and the play start point and play end point of each sub-image are used as parsed. In another embodiment, the terminal is further capable of modifying at least one of the play start point, the play end point, and the position information of each sub-image, and synthesizing the sub-images into the target video based on the modified information.
Optionally, the terminal displays the at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image; modifies at least one of the play start point, the play end point, and the position information of each sub-image in response to a modification operation; and, based on the modified play start point and play end point of each sub-image and the position information of each sub-image, superimposes the at least one sub-image, according to the modified position information, on the video pictures of the target video corresponding to those start and end points.
In some embodiments, the play end point of each sub-image is a play end time point of a corresponding image of each sub-image in the video to which the corresponding image belongs, or the play end point of each sub-image is a play end time stamp of a corresponding image of each sub-image in the video to which the corresponding image belongs.
The play end point of each sub-image is the same as the play end point in step 804 in the above embodiment, and will not be described herein.
In the embodiment of the application, after the terminal obtains the video material file and parses it to obtain the at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image, the terminal can also display the parsed information, and the user can modify the information as needed. If the terminal detects a modification operation on any of the information, it modifies at least one of the position information of each sub-image and the play start point and play end point of each sub-image based on the modification operation, and then, based on the modified position information and the modified play start point and play end point of each sub-image, superimposes the at least one sub-image on the video pictures of the target video corresponding to those start and end points.
For example, if the user needs to modify the position information of a sub-image, the user performs a modification operation on the position information and inputs the modified position information into the terminal, and the terminal then processes the sub-image based on the modified position information. For another example, if the user needs to modify the play start point of a sub-image, the user performs a modification operation on the play start point and inputs the modified play start point into the terminal, and the terminal then processes the sub-image based on the modified play start point.
According to the method provided by the embodiment of the application, since the acquired video material file includes at least one sub-image, the position information of each sub-image, and the play start point and play end point of each sub-image, the sub-images can be superimposed on the target video at the corresponding start and end points according to the position information of each sub-image, which improves the accuracy of superimposing sub-images in the target video. Since there is no need to decode a video to obtain the sub-image to be added during the process of adding material to the target video, the time consumed in acquiring the sub-image is saved, the efficiency of adding the sub-image to the target video is improved, and decoding resources are saved.
In addition, the terminal can display the position information of each sub-image and the playing start point and the playing end point of each sub-image, and the user can modify at least one of the position information of each sub-image and the playing start point and the playing end point of each sub-image based on the displayed information, so that the editing function of the video material file is expanded, the user can freely control the playing time of the video material file, and the flexibility of adjusting the video material file is improved.
The embodiments of fig. 4 and fig. 9 above respectively describe the process by which a terminal creates a video material file and the process by which a terminal synthesizes a video with an additional display effect based on an already-created video material file. The embodiment of fig. 11 below describes the overall process of creating a video material file and synthesizing a video based on it. Fig. 11 is a flowchart of a method for generating video material files and synthesizing video according to an embodiment of the present application. Referring to fig. 11, the method includes:
1101. the first terminal acquires at least one video.
1102. The first terminal responds to the image interception operation of each video image, and acquires the sub-image of the image and the position information of the sub-image in the corresponding image.
1103. The first terminal generates a video material file based on the sub-image and the position information.
1104. The first terminal transmits the video material file to the server.
Steps 1101-1104 in the embodiment of the present application are similar to step 405 described above, and are not repeated here.
1105. The server stores video material files.
1106. The second terminal acquires the video material file from the server.
1107. And the second terminal analyzes the video material file to obtain at least one sub-image and the position information of each sub-image.
1108. The second terminal synthesizes at least one sub-image to the target video based on the position information of each sub-image.
Steps 1106-1108 in the embodiment of the present application are similar to steps 901-903 described above, and are not described herein.
It should be noted that the first terminal in the embodiment of the present application is a terminal used by a developer, and the developer can perform the above steps to generate the video material file. The second terminal is a terminal used by any user, and the user controls the second terminal to acquire the video material file, so that the second terminal can parse the video material file and synthesize the at least one sub-image it includes into the target video.
The embodiment shown in fig. 11 is an example in which a first terminal generates a video material file from at least one video, and a second terminal uses the generated video material file. The source of at least one video acquired by the first terminal will be described below.
In the related art, if the second terminal needs to add video material to a target video, it selects at least one video and synthesizes the selected at least one video into the target video, thereby completing the effect of adding material to the target video.
Fig. 12 is a schematic structural diagram of a video material file generating apparatus according to an embodiment of the present application. Referring to fig. 12, the apparatus includes:
a video acquisition module 1201, configured to acquire at least one video;
a clipping module 1202, configured to, for each image of the video, obtain, in response to a clipping operation on a sub-image of the image, the sub-image of the image and position information of the sub-image in the corresponding image;
The file generating module 1203 is configured to generate a video material file based on the sub-image and the position information, where the video material file is used for being parsed by the terminal to be synthesized to the target video.
Optionally, referring to fig. 13, the interception module 1202 includes:
an image display unit 12021 for displaying an image of any one video;
a selection frame display unit 12022 for displaying an area selection frame on an image of any one video in response to a clipping operation of the image of any one video;
and a clipping unit 12023, configured to, in response to a drag operation on the region selection frame, clip a sub-image in the region selection frame when the drag operation ends, and obtain a sub-image of an image of any video and position information of the sub-image in the corresponding image.
Alternatively, the image display unit 12021 is configured to display, according to the play order of the videos, the image of the video whose play order follows the previous video after capture from the previous video is completed.
Optionally, the capturing module 1202 is configured to perform pattern recognition on an image of each video in response to a triggering operation on an image capturing option, acquire the recognized pattern as a sub-image of the image, and acquire position information of the recognized pattern in the corresponding image as position information of the sub-image in the corresponding image.
Optionally, the clipping module 1202 is configured to combine the sub-images of the at least two images and the corresponding position information if the sub-images of the at least two images have the same shape and different positions in the images.
Optionally, referring to fig. 13, the apparatus further includes:
the time determining module 1204 is configured to determine a play start point of a corresponding image of the sub-image in the affiliated video as a play start point of the corresponding sub-image;
the time determining module 1204 is configured to determine a play end point of the corresponding image of the sub-image in the video to which the sub-image belongs as a play end point of the corresponding sub-image;
the file generating module 1203 is configured to generate a video material file based on the sub-image, the position information, and the play start point and the play end point of the sub-image.
Optionally, the playing start point of the corresponding image of the sub-image in the affiliated video is the playing start time point of the corresponding image of the sub-image in the affiliated video, or the playing start point is the playing start time stamp of the corresponding image of the sub-image in the affiliated video;
the play end point of the corresponding image of the sub-image in the affiliated video is the play end time point of the corresponding image of the sub-image in the affiliated video, or the play end point is the play end time stamp of the corresponding image of the sub-image in the affiliated video.
Optionally, the file generating module 1203 is configured to:
storing the name and the position information of the sub-image in a position sub-file correspondingly;
and compressing the position subfiles and the sub-images to generate video material files.
Optionally, referring to fig. 13, the apparatus further includes:
the file transmitting module 1205 is configured to transmit the video material file to a server, and the server stores the video material file.
It should be noted that: the video material file generating apparatus provided in the above embodiment only illustrates the division of the above functional modules when generating the video material file, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the terminal is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the embodiments of the video material file generating apparatus provided in the foregoing embodiments and the embodiments of the video material file generating method belong to the same concept, and detailed implementation processes of the embodiments are referred to in the method embodiments, which are not repeated herein.
Fig. 14 is a schematic structural diagram of a video synthesizer according to an embodiment of the present application. Referring to fig. 14, the apparatus includes:
A file obtaining module 1401, configured to obtain a video material file, where the video material file includes at least one sub-image and location information of each sub-image;
the parsing module 1402 is configured to parse the video material file to obtain at least one sub-image and location information of each sub-image;
a synthesizing module 1403 is configured to synthesize at least one sub-image to the target video based on the position information of each sub-image.
Optionally, the file acquisition module 1401 is configured to acquire a video material file from a server.
Optionally, an parsing module 1402 is configured to parse the video material file to obtain at least one sub-image, position information of each sub-image, and a play start point and a play end point of each sub-image;
a synthesizing module 1403, configured to superimpose at least one sub-image on a video frame corresponding to the start point and the end point in the target video according to the position information of each sub-image based on the play start point and the play end point of each sub-image.
Optionally, the playing start point of each sub-image is the playing start time point of the corresponding image of each sub-image in the affiliated video, or the playing start point of each sub-image is the playing start time stamp of the corresponding image of each sub-image in the affiliated video;
The play end point of each sub-image is the play end time point of the corresponding image of each sub-image in the affiliated video, or the play end point of each sub-image is the play end time stamp of the corresponding image of each sub-image in the affiliated video.
Optionally, referring to fig. 15, the apparatus further includes:
a display module 1404, configured to display at least one sub-image, position information of each sub-image, and a play start point and a play end point of each sub-image;
a modifying module 1405, configured to modify at least one of the play start point, the play end point, and the position information of each sub-image in response to a modifying operation;
a synthesizing module 1403, configured to superimpose at least one sub-image on a video frame corresponding to the start point and the end point in the target video according to the modified position information based on the modified play start point and play end point of each sub-image and the position information of each sub-image.
It should be noted that: the video synthesizing device provided in the above embodiment only illustrates the division of the above functional modules when synthesizing the target video, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the terminal is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the embodiments of the video synthesizing apparatus provided in the foregoing embodiments and the embodiments of the video synthesizing method belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments, which are not repeated herein.
The disclosed embodiments also provide a computer device including a processor and a memory, where at least one program code is stored in the memory, where the at least one program code is loaded and executed by the processor to implement the video material file generating method as in the above embodiments, or to implement the video synthesizing method as in the above embodiments.
Optionally, the computer device is provided as a terminal. Fig. 16 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 1600 may be a portable mobile terminal such as: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 1600 may also be referred to by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
Terminal 1600 includes: a processor 1601, and a memory 1602.
The processor 1601 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 1601 may also include a main processor and a coprocessor: the main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1601 may be integrated with a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed by the display screen. In some embodiments, the processor 1601 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1602 may include one or more computer-readable storage media, which may be non-transitory. Memory 1602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1602 is used to store at least one program code for execution by processor 1601 to implement the video material file generation method, or video composition method, provided by a method embodiment of the present application.
In some embodiments, terminal 1600 may also optionally include: a peripheral interface 1603, and at least one peripheral. The processor 1601, memory 1602, and peripheral interface 1603 may be connected by bus or signal lines. The individual peripheral devices may be connected to the peripheral device interface 1603 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1604, a display screen 1605, a camera assembly 1606, audio circuitry 1607, a positioning assembly 1608, and a power supply 1609.
Peripheral interface 1603 may be used to connect I/O (Input/Output) related at least one peripheral to processor 1601 and memory 1602. In some embodiments, the processor 1601, memory 1602, and peripheral interface 1603 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 1601, memory 1602, and peripheral interface 1603 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 1604 is used for receiving and transmitting RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 1604 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 1604 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the world wide web, metropolitan area networks, intranets, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1604 may also include NFC (Near Field Communication) related circuits, which is not limited in the present application.
The display screen 1605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1605 is a touch display screen, the display screen 1605 also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 1601 as a control signal for processing. At this point, the display screen 1605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1605, disposed on the front panel of the terminal 1600; in other embodiments, there may be at least two display screens 1605, each disposed on a different surface of the terminal 1600 or in a folded design; in still other embodiments, the display screen 1605 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 1600. The display screen 1605 may even be arranged in a non-rectangular irregular pattern, that is, an irregularly shaped screen. The display screen 1605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 1606 is used to capture images or video. Optionally, the camera assembly 1606 includes a front camera and a rear camera. The front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and virtual reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 1606 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
Audio circuitry 1607 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 1601 for processing, or inputting the electric signals to the radio frequency circuit 1604 for voice communication. The microphone may be provided in a plurality of different locations of the terminal 1600 for stereo acquisition or noise reduction purposes. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 1601 or the radio frequency circuit 1604 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, audio circuitry 1607 may also include a headphone jack.
The positioning component 1608 is used to locate the current geographic location of the terminal 1600 to enable navigation or LBS (Location Based Service). The positioning component 1608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the Beidou system of China, or the Galileo system of the European Union.
A power supply 1609 is used to power the various components in the terminal 1600. The power supply 1609 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power supply 1609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1600 also includes one or more sensors 1610. The one or more sensors 1610 include, but are not limited to: acceleration sensor 1611, gyroscope sensor 1612, pressure sensor 1613, fingerprint sensor 1614, optical sensor 1615, and proximity sensor 1616.
The acceleration sensor 1611 may detect the magnitudes of acceleration on the three coordinate axes of a coordinate system established with the terminal 1600. For example, the acceleration sensor 1611 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1601 may control the display screen 1605 to display the user interface in a landscape view or a portrait view based on the gravitational acceleration signal acquired by the acceleration sensor 1611. The acceleration sensor 1611 may also be used to acquire motion data of a game or of the user.
The gyro sensor 1612 may detect a body direction and a rotation angle of the terminal 1600, and the gyro sensor 1612 may collect 3D actions of the user on the terminal 1600 in cooperation with the acceleration sensor 1611. The processor 1601 may implement the following functions based on the data collected by the gyro sensor 1612: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
Pressure sensor 1613 may be disposed on a side frame of terminal 1600 and/or on an underlying layer of display 1605. When the pressure sensor 1613 is disposed at a side frame of the terminal 1600, a grip signal of the terminal 1600 by a user may be detected, and the processor 1601 performs a left-right hand recognition or a quick operation according to the grip signal collected by the pressure sensor 1613. When the pressure sensor 1613 is disposed at the lower layer of the display screen 1605, the processor 1601 performs control on an operability control on the UI interface according to a pressure operation of the display screen 1605 by a user. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 1614 is used to collect a fingerprint of the user, and the processor 1601 identifies the identity of the user based on the fingerprint collected by the fingerprint sensor 1614, or the fingerprint sensor 1614 identifies the identity of the user based on the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 1601 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1614 may be disposed on the front, back, or side of the terminal 1600. When a physical key or vendor Logo is provided on the terminal 1600, the fingerprint sensor 1614 may be integrated with the physical key or vendor Logo.
The optical sensor 1615 is used to collect ambient light intensity. In one embodiment, the processor 1601 may control the display brightness of the display screen 1605 based on the ambient light intensity collected by the optical sensor 1615. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 1605 is turned up; when the ambient light intensity is low, the display brightness of the display screen 1605 is turned down. In another embodiment, the processor 1601 may also dynamically adjust the capture parameters of the camera module 1606 based on the ambient light intensity collected by the optical sensor 1615.
A proximity sensor 1616, also referred to as a distance sensor, is provided on the front panel of the terminal 1600. The proximity sensor 1616 is used to collect a distance between a user and the front surface of the terminal 1600. In one embodiment, when the proximity sensor 1616 detects that the distance between the user and the front face of the terminal 1600 is gradually decreasing, the processor 1601 controls the display 1605 to switch from the bright screen state to the off screen state; when the proximity sensor 1616 detects that the distance between the user and the front surface of the terminal 1600 gradually increases, the processor 1601 controls the display 1605 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in fig. 16 is not limiting and that more or fewer components than shown may be included or certain components may be combined or a different arrangement of components may be employed.
Optionally, the computer device is provided as a server. Fig. 17 is a schematic diagram of a server according to an exemplary embodiment, where the server 1700 may include one or more processors (Central Processing Units, CPU) 1701 and one or more memories 1702, where at least one program code is stored in the memories 1702, and where the at least one program code is loaded and executed by the processors 1701 to implement the methods provided by the various method embodiments described above. Of course, the server may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
The embodiment of the application also provides a computer readable storage medium, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor, so as to implement the video material file generating method of the above embodiment, or to implement the video synthesizing method of the above embodiment.
Embodiments of the present application also provide a computer program product or a computer program, where the computer program product or the computer program includes computer program code, where the computer program code is stored in a computer readable storage medium, and where a processor of a computer device reads the computer program code from the computer readable storage medium, and where the processor executes the computer program code, so that the computer device implements the video material file generating method as described in the foregoing embodiments, or causes the computer device to implement the video synthesizing method as described in the foregoing embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the above storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing is merely an alternative embodiment of the present application and is not intended to limit the embodiment of the present application, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the embodiment of the present application should be included in the protection scope of the present application.

Claims (16)

1. A method for generating a video material file, the method comprising:
acquiring at least one video, wherein the at least one video is at least one video acquired in descending order of heat value after the videos are arranged according to their heat values;
for each image of the video, responding to a sub-image interception operation of the image, and acquiring the sub-image of the image and the position information of the sub-image in the corresponding image, wherein the sub-image is used for being overlapped to video pictures of other videos to beautify the other videos;
storing the name and the position information of the sub-image in a position sub-file correspondingly;
and compressing the position sub-file and the sub-images to generate the video material file, wherein the video material file is used for a terminal to analyze and acquire at least one sub-image and the position information of each sub-image, and synthesizing the at least one sub-image to a target video based on the position information.
2. The method according to claim 1, wherein for each image of the video, in response to a sub-image capture operation on the image, acquiring the sub-image of the image and the position information of the sub-image in the corresponding image, comprises:
Displaying an image of any one of the videos;
displaying a region selection frame on an image of any one of the videos in response to an intercepting operation on the image of the any one of the videos;
and responding to the dragging operation of the region selection frame, intercepting a sub-image in the region selection frame when the dragging operation is finished, and obtaining the sub-image of the image of any video and the position information of the sub-image in the corresponding image.
3. The method of claim 2, wherein displaying an image of any one of the videos comprises:
and displaying the images of the video with the playing sequence positioned behind the previous video after the previous video is intercepted according to the playing sequence of each video.
4. The method according to claim 1, wherein for each image of the video, in response to a sub-image capture operation on the image, acquiring the sub-image of the image and the position information of the sub-image in the corresponding image, comprises:
and responding to the triggering operation of the image capturing options, carrying out pattern recognition on the image of each video, acquiring the recognized pattern as a sub-image of the image, and acquiring the position information of the recognized pattern in the corresponding image as the position information of the sub-image in the corresponding image.
5. The method according to claim 1, wherein for each image of the video, in response to a sub-image capture operation on the image, acquiring the sub-image of the image and the position information of the sub-image in the corresponding image, comprises:
if the sub-images of at least two images have the same shape and different positions in the images, merging the sub-images of the at least two images and the corresponding position information.
6. The method of claim 1, wherein for each image of the video, after obtaining a sub-image of the image and position information of the sub-image in the corresponding image in response to a sub-image capture operation on the image, the method further comprises:
determining a playing starting point of a corresponding image of the sub-image in the affiliated video as the playing starting point of the corresponding sub-image;
determining the playing end point of the corresponding image of the sub-image in the video as the playing end point of the corresponding sub-image;
and generating the video material file based on the sub-image, the position information and the playing start point and the playing end point of the sub-image.
7. The method according to claim 6, wherein the play start point of the corresponding image of the sub-image in the affiliated video is a play start time point of the corresponding image of the sub-image in the affiliated video, or the play start point is a play start time stamp of the corresponding image of the sub-image in the affiliated video;
the playing end point of the corresponding image of the sub-image in the affiliated video is the playing end time point of the corresponding image of the sub-image in the affiliated video, or the playing end point is the playing end time stamp of the corresponding image of the sub-image in the affiliated video.
8. The method of claim 1, wherein after compressing the location subfile and the sub-image to generate the video material file, the method further comprises:
and sending the video material files to a server, and storing the video material files by the server.
9. A method of video synthesis, the method comprising:
acquiring a video material file, wherein the video material file comprises at least one sub-image and position information of each sub-image, the video material file is a material file generated by correspondingly storing names and position information of the sub-images in a position sub-file and compressing the position sub-file and the sub-images, the sub-images are images obtained by intercepting each image of a video, the sub-images are used for being superimposed on video pictures of other videos to beautify the other videos, the at least one sub-image and the position information of each sub-image are obtained by intercepting the images of at least one video, and the at least one video is at least one video obtained in descending order of heat value after the videos are arranged according to their heat values;
Analyzing the video material file to obtain the at least one sub-image and the position information of each sub-image;
and synthesizing the at least one sub-image to the target video based on the position information of each sub-image.
10. The method of claim 9, wherein parsing the video material file to obtain the at least one sub-image and the location information of each sub-image comprises:
analyzing the video material file to obtain the at least one sub-image, the position information of each sub-image and the playing start point and the playing end point of each sub-image;
the synthesizing the at least one sub-image to the target video based on the position information of each sub-image includes:
and based on the play start point and the play end point of each sub-image, the at least one sub-image is overlapped on the video picture corresponding to the start point and the end point in the target video according to the position information of each sub-image.
11. The method according to claim 10, wherein the play start point of each sub-image is a play start time point of the corresponding image of each sub-image in the affiliated video, or the play start point of each sub-image is a play start time stamp of the corresponding image of each sub-image in the affiliated video;
The play end point of each sub-image is the play end time point of the corresponding image of each sub-image in the video to which the corresponding image belongs, or the play end point of each sub-image is the play end time stamp of the corresponding image of each sub-image in the video to which the corresponding image belongs.
12. The method according to claim 10, wherein after parsing the video material file to obtain the at least one sub-image, the position information of each sub-image, and the play start point and the play end point of each sub-image, the method further comprises:
displaying the at least one sub-image, the position information of each sub-image and the playing start point and the playing end point of each sub-image;
modifying at least one of the play start point, the play end point and the position information of each sub-image in response to a modification operation;
the step of superposing the at least one sub-image on the video picture of the corresponding point in the target video according to the position information of each sub-image based on the play start point and the play end point of each sub-image comprises the following steps:
and based on the modified play start point and play end point of each sub-image and the position information of each sub-image, the at least one sub-image is overlapped on the video picture corresponding to the start point and the end point in the target video according to the modified position information.
13. A video material file generation apparatus, the apparatus comprising:
the video acquisition module is used for acquiring at least one video, wherein the at least one video is acquired in descending order of heat value after the videos are arranged according to their heat values;
the intercepting module is used for, for each image of the video, responding to a sub-image interception operation on the image to acquire the sub-image of the image and the position information of the sub-image in the corresponding image, wherein the sub-image is used for being superimposed on video pictures of other videos to beautify the other videos;
the file generation module is used for correspondingly storing the name and the position information of the sub-image in a position sub-file;
and compressing the position sub-file and the sub-images to generate the video material file, wherein the video material file is used for a terminal to analyze and acquire at least one sub-image and the position information of each sub-image, and synthesizing the at least one sub-image to a target video based on the position information.
14. A video compositing apparatus, the apparatus comprising:
The file acquisition module is used for acquiring a video material file, wherein the video material file comprises at least one sub-image and position information of each sub-image, the video material file is a material file generated by correspondingly storing names and position information of the sub-images in the position sub-file and compressing the position sub-file and the sub-images, the sub-images are images obtained by intercepting each image of a video, the sub-images are used for being overlapped in video pictures of other videos to realize beautification of the other videos, the at least one sub-image and the position information of each sub-image are obtained by intercepting the images of at least one video, and the at least one video is obtained by arranging the videos according to a heat value of the videos from high to low;
the analysis module is used for analyzing the video material file and acquiring the at least one sub-image and the position information of each sub-image;
and the synthesis module is used for synthesizing the at least one sub-image to the target video based on the position information of each sub-image.
15. A computer device comprising a processor and a memory, wherein the memory has stored therein at least one program code that is loaded and executed by the processor to implement the video material file generation method of any of claims 1 to 8 or to implement the video composition method of any of claims 9 to 12.
16. A computer readable storage medium having stored therein at least one program code loaded and executed by a processor to implement the video material file generation method of any one of claims 1 to 8 or to implement the video composition method of any one of claims 9 to 12.
CN202011633376.8A 2020-12-31 2020-12-31 Video material file generation method, video synthesis method, device and medium Active CN112822544B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011633376.8A CN112822544B (en) 2020-12-31 2020-12-31 Video material file generation method, video synthesis method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011633376.8A CN112822544B (en) 2020-12-31 2020-12-31 Video material file generation method, video synthesis method, device and medium

Publications (2)

Publication Number Publication Date
CN112822544A CN112822544A (en) 2021-05-18
CN112822544B true CN112822544B (en) 2023-10-20

Family

ID=75856713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011633376.8A Active CN112822544B (en) 2020-12-31 2020-12-31 Video material file generation method, video synthesis method, device and medium

Country Status (1)

Country Link
CN (1) CN112822544B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113518256B (en) * 2021-07-23 2023-08-08 腾讯科技(深圳)有限公司 Video processing method, video processing device, electronic equipment and computer readable storage medium
CN114040250B (en) * 2022-01-10 2022-04-01 深圳市麦谷科技有限公司 Video frame capturing method and device, electronic equipment and medium

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10191248A (en) * 1996-10-22 1998-07-21 Hitachi Denshi Ltd Video editing method and recording medium recording procedure for the same
CN104023272A (en) * 2014-06-25 2014-09-03 北京奇艺世纪科技有限公司 Video screen editing method and device
CN104080005A (en) * 2014-07-10 2014-10-01 福州瑞芯微电子有限公司 Device and method for clipping dynamic pictures
CN104639994A (en) * 2013-11-08 2015-05-20 杭州海康威视数字技术股份有限公司 Video abstraction generating method, system and network storage equipment based on moving objects
JP2015171082A (en) * 2014-03-10 2015-09-28 ブラザー工業株式会社 Image information controller, program and image information control system
CN105307051A (en) * 2015-05-04 2016-02-03 维沃移动通信有限公司 Video processing method and device
CN105844256A (en) * 2016-04-07 2016-08-10 广州盈可视电子科技有限公司 Panorama video frame image processing method and device
CN106412702A (en) * 2015-07-27 2017-02-15 腾讯科技(深圳)有限公司 Video clip interception method and device
CN106412691A (en) * 2015-07-27 2017-02-15 腾讯科技(深圳)有限公司 Interception method and device of video images
CN106534618A (en) * 2016-11-24 2017-03-22 广州爱九游信息技术有限公司 Method, device and system for realizing pseudo field interpretation
CN108346171A (en) * 2017-01-25 2018-07-31 阿里巴巴集团控股有限公司 A kind of image processing method, device, equipment and computer storage media
CN109525884A (en) * 2018-11-08 2019-03-26 北京微播视界科技有限公司 Video paster adding method, device, equipment and storage medium based on split screen
CN109963166A (en) * 2017-12-22 2019-07-02 上海全土豆文化传播有限公司 Online Video edit methods and device
CN110796712A (en) * 2019-10-24 2020-02-14 北京达佳互联信息技术有限公司 Material processing method, device, electronic equipment and storage medium
CN110971840A (en) * 2019-12-06 2020-04-07 广州酷狗计算机科技有限公司 Video mapping method and device, computer equipment and storage medium
CN111145308A (en) * 2019-12-06 2020-05-12 北京达佳互联信息技术有限公司 Paster obtaining method and device
CN111565338A (en) * 2020-05-29 2020-08-21 广州酷狗计算机科技有限公司 Method, device, system, equipment and storage medium for playing video
CN111726676A (en) * 2020-07-03 2020-09-29 腾讯科技(深圳)有限公司 Image generation method, display method, device and equipment based on video
WO2020244474A1 (en) * 2019-06-03 2020-12-10 中兴通讯股份有限公司 Method, device and apparatus for adding and extracting video watermark

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1771002B1 (en) * 2005-09-30 2017-12-27 LG Electronics Inc. Mobile video communication terminal
US20130235223A1 (en) * 2012-03-09 2013-09-12 Minwoo Park Composite video sequence with inserted facial region
CN110062269A (en) * 2018-01-18 2019-07-26 腾讯科技(深圳)有限公司 Extra objects display methods, device and computer equipment
CN109618183B (en) * 2018-11-29 2019-10-25 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium

Also Published As

Publication number Publication date
CN112822544A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
US11557322B2 (en) Method and device for generating multimedia resource
WO2020253096A1 (en) Method and apparatus for video synthesis, terminal and storage medium
CN110545476B (en) Video synthesis method and device, computer equipment and storage medium
CN110841285B (en) Interface element display method and device, computer equipment and storage medium
EP3941037A1 (en) Editing template generating method and apparatus, electronic device, and storage medium
CN108965922B (en) Video cover generation method and device and storage medium
US20220164159A1 (en) Method for playing audio, terminal and computer-readable storage medium
CN108897597B (en) Method and device for guiding configuration of live broadcast template
CN111142838B (en) Audio playing method, device, computer equipment and storage medium
CN110868636B (en) Video material intercepting method and device, storage medium and terminal
WO2022048398A1 (en) Multimedia data photographing method and terminal
CN112363660B (en) Method and device for determining cover image, electronic equipment and storage medium
CN111880888B (en) Preview cover generation method and device, electronic equipment and storage medium
CN113409427B (en) Animation playing method and device, electronic equipment and computer readable storage medium
CN112822544B (en) Video material file generation method, video synthesis method, device and medium
CN111402844B (en) Song chorus method, device and system
CN110662105A (en) Animation file generation method and device and storage medium
CN109819314B (en) Audio and video processing method and device, terminal and storage medium
CN111711838A (en) Video switching method, device, terminal, server and storage medium
CN111437600A (en) Plot showing method, plot showing device, plot showing equipment and storage medium
CN113204672B (en) Resource display method, device, computer equipment and medium
CN113032590B (en) Special effect display method, device, computer equipment and computer readable storage medium
CN112118353A (en) Information display method, device, terminal and computer readable storage medium
CN110300275B (en) Video recording and playing method, device, terminal and storage medium
CN114554112B (en) Video recording method, device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant