WO2022037331A1 - Video processing method, video processing apparatus, storage medium, and electronic device - Google Patents

Video processing method, video processing apparatus, storage medium, and electronic device Download PDF

Info

Publication number
WO2022037331A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
image
video clip
spliced
Prior art date
Application number
PCT/CN2021/106309
Other languages
French (fr)
Chinese (zh)
Inventor
张弓
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd. (Oppo广东移动通信有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Publication of WO2022037331A1 publication Critical patent/WO2022037331A1/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/44016 - Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/44008 - Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440245 - Processing of video elementary streams, the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 - Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/845 - Structuring of content, e.g. decomposing content into time segments
    • H04N 21/8456 - Structuring of content by decomposing the content in the time domain, e.g. in time segments

Definitions

  • the present disclosure relates to the technical field of image and video processing, and in particular, to a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device.
  • Video stitching refers to stitching multiple videos into one video. In the related art, it is common to simply combine different videos, for example, splicing the next video after the last frame of the previous video, that is, end-to-end splicing, to form a spliced video that can be played continuously.
  • the present disclosure provides a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device.
  • a video processing method comprising: acquiring at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images; performing frame interpolation at at least one to-be-spliced position of the at least two video clips to generate an interpolated frame image; and splicing the video clips and the interpolated frame image to obtain a target video.
  • a video processing apparatus comprising: an acquisition module, for acquiring at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images; a frame insertion module, for performing frame insertion at at least one to-be-spliced position of the at least two video clips to generate an inserted frame image; and a splicing module, for splicing the video clips and the inserted frame image to obtain a target video.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the video processing method of the first aspect and possible implementations thereof.
  • an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute the executable instructions to The video processing method of the above-mentioned first aspect and possible implementations thereof are performed.
  • FIG. 1 shows a schematic diagram of a mobile terminal in this exemplary embodiment
  • FIG. 2 shows a flowchart of a video processing method in this exemplary embodiment
  • FIG. 3 shows a schematic diagram of a sequence of video clips in this exemplary embodiment
  • FIG. 4 shows a flowchart of frame insertion in this exemplary embodiment
  • FIG. 5 shows a schematic diagram of frame insertion by MEMC in this exemplary embodiment
  • FIG. 6 shows a flowchart of determining the number and time phase of the interpolated frames in this exemplary embodiment
  • FIG. 7 shows a schematic diagram of video processing in this exemplary embodiment
  • FIG. 8 shows a structural block diagram of a video processing apparatus in this exemplary embodiment
  • FIG. 9 shows a structural block diagram of a video processing apparatus in this exemplary embodiment.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure.
  • those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed.
  • well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
  • video is usually spliced end-to-end.
  • the videos obtained by splicing in this way have poor content continuity, and there may be sudden changes in the picture at the splicing position, which affects the viewing experience.
  • exemplary embodiments of the present disclosure provide a video processing method for performing splicing processing on different video segments.
  • The application scenarios of this method include but are not limited to: the user selects multiple video clips in an album, and this exemplary embodiment splices them into a target video and plays it; or, during video editing, the multiple videos to be spliced are subjected to the splicing process of this exemplary embodiment.
  • Exemplary embodiments of the present disclosure provide an electronic device to execute the above-described video processing method.
  • the electronic device generally includes a processor and a memory.
  • the memory is used to store executable instructions of the processor, and can also store application data such as images and videos.
  • the processor is used to execute the executable instructions to realize video processing.
  • the electronic device can be a terminal device such as a smart phone, a tablet computer, a smart wearable device, a drone, a desktop computer, a vehicle-mounted smart device, and a game console, or a server device, such as a platform server that provides video processing services.
  • The following takes the mobile terminal 100 in FIG. 1 as an example to illustrate the structure of the above electronic device. It will be understood by those skilled in the art that, in addition to components specifically for mobile purposes, the configuration in FIG. 1 can also be applied to stationary-type devices.
  • The mobile terminal 100 may specifically include: a processor 110, an internal memory 121, an external memory interface 122, a USB (Universal Serial Bus) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 171, a receiver 172, a microphone 173, a headphone jack 174, a sensor module 180, a display screen 190, a camera module 191, an indicator 192, a motor 193, a key 194, a Subscriber Identification Module (SIM) card interface 195, and the like.
  • the processor 110 may include one or more processing units, for example, the processor 110 may include an application processor (Application Processor, AP), a modem processor, a graphics processor (Graphics Processing Unit, GPU), an image signal processor (Image Signal Processor, ISP), controller, encoder, decoder, digital signal processor (Digital Signal Processor, DSP), baseband processor and/or neural network processor (Neural-Network Processing Unit, NPU), etc.
  • the AP, GPU, etc. can perform processing on images and video data, such as inserting frames at the positions of the video clips to be spliced, generating an inserted frame image, and splicing the video clip and the inserted frame image into a target video, etc.
  • The encoder can encode (i.e., compress) image or video data, for example, encode the target video data obtained after video processing to form corresponding code stream data, so as to reduce the bandwidth occupied by data transmission; the decoder can decode (i.e., decompress) the code stream data of an image or video to restore the image or video data, for example, decode an acquired original video segment to obtain the image data of each frame in the video segment.
  • The mobile terminal 100 may support one or more encoders and decoders. In this way, the mobile terminal 100 can process images or videos in various encoding formats, such as image formats like JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics) and BMP (Bitmap), and video formats like MPEG-1 (Moving Picture Experts Group), MPEG-2, H.263, H.264 and HEVC (High Efficiency Video Coding).
  • the processor 110 may include one or more interfaces through which connections are formed with other components of the mobile terminal 100 .
  • the external memory interface 122 may be used to connect an external memory card.
  • the internal memory 121 may be used to store computer executable program codes, and may also store data (such as images, videos) and the like created during the use of the mobile terminal 100 .
  • the USB interface 130 is an interface conforming to the USB standard specification, and can be used to connect a charger to charge the mobile terminal 100, and can also be connected to an earphone or other electronic devices.
  • the charging management module 140 is used to receive charging input from the charger. While charging the battery 142, the charging management module 140 can also supply power to the device through the power management module 141; the power management module 141 can also monitor the state of the battery.
  • the wireless communication function of the mobile terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • the mobile communication module 150 may provide a wireless communication solution including 2G/3G/4G/5G etc. applied on the mobile terminal 100.
  • the wireless communication module 160 can provide wireless local area networks (Wireless Local Area Networks, WLAN) (such as wireless fidelity (Wireless Fidelity, Wi-Fi) networks), Bluetooth (Bluetooth, BT), global navigation satellite Wireless communication solutions such as Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), and Infrared (IR).
  • The mobile terminal 100 can realize the display function through the GPU, the display screen 190 and the application processor, etc.; can realize the shooting function through the ISP, the camera module 191, the encoder, the decoder, the GPU, the display screen 190 and the application processor, etc.; and can realize the audio function through the audio module 170, the speaker 171, the receiver 172, the microphone 173, the headphone interface 174 and the application processor.
  • the sensor module 180 may include a depth sensor 1801, a pressure sensor 1802, a gyro sensor 1803, an air pressure sensor 1804, and the like.
  • the indicator 192 can be an indicator light, which can be used to indicate the charging state, the change of the power, and can also be used to indicate a message, a missed call, a notification, and the like.
  • the motor 193 can generate vibration prompts, and can also be used for touch vibration feedback and the like.
  • the keys 194 include a power-on key, a volume key, and the like.
  • the mobile terminal 100 may support one or more SIM card interfaces 195 for connecting SIM cards.
  • FIG. 2 shows a schematic flow of the video processing method, which may include the following steps S210 to S230, and each step will be described in detail below:
  • Step S210 Acquire at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images.
  • a video clip refers to the original video material, which is essentially an independent video. Since subsequent splicing is required, in order to distinguish it from the spliced video, the video material is referred to as a "segment" here.
  • Each video clip is generally composed of two or more frames of images.
  • a single-frame video clip can also be acquired, that is, a single image is regarded as a special type of video clip.
  • the initially acquired video segment may be encoded video data, and the video data may be processed by a video decoder or other parser to obtain image data of each frame in the video segment.
  • video clips with the same background part and similar foreground parts may be acquired to facilitate subsequent splicing.
  • In step S210, the following processing can also be performed on any one or more video clips according to actual requirements:
  • Split a video clip into multiple video clips.
  • A video clip can be split, for example, by detecting the similarity between every two adjacent frames of images in the video clip, determining the positions with lower similarity as change points, and splitting the video at the change points. Each video segment obtained in this way has relatively high content continuity, which is favorable for subsequent splicing.
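The change-point detection sketched above can be illustrated in a few lines. This is a minimal sketch and not the patent's implementation: it assumes frames are grayscale NumPy arrays, and the similarity metric (1 minus the normalized mean absolute difference) and threshold are hypothetical choices.

```python
import numpy as np

def split_at_change_points(frames, sim_threshold=0.8):
    """Split a frame sequence into clips wherever the similarity between
    two adjacent frames drops below sim_threshold (assumed metric:
    1 - normalized mean absolute difference)."""
    clips, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        diff = np.abs(cur.astype(np.float64) - prev.astype(np.float64)).mean()
        similarity = 1.0 - diff / 255.0
        if similarity < sim_threshold:  # low similarity: a change point
            clips.append(current)
            current = []
        current.append(cur)
    clips.append(current)
    return clips
```

Each resulting clip then has relatively high internal continuity, matching the goal stated above.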
  • A video clip can be selected as the main video and split, and the subsequent video clips can then be spliced with each split segment of the main video, which is equivalent to inserting them at different positions of the main video.
  • Synthesize multiple video clips into one video clip. For example, if two video clips were obtained by two cameras synchronously shooting the same scene from different angles, two frames with the same timestamp can be selected from the two video clips and registered, and the registration parameters can then be used to register and synthesize the two video clips into one with more comprehensive and richer scene content. Alternatively, if one video clip is a shot scene video and the other is a picture video, the picture video can be inserted at a certain position in the scene video to achieve a composite of the two video clips, and so on.
  • Adjust the resolution of the video clips. Each video clip can be adjusted to the same resolution. For example, set a standard resolution in advance, perform up-sampling on video clips below this resolution and down-sampling on video clips above it; or take the video clip with the lowest resolution as the standard and down-sample the other video clips, and so on.
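The lowest-resolution-as-standard strategy can be sketched as follows, assuming frames are NumPy arrays. The nearest-neighbour sampling here is a deliberate simplification (a real pipeline would use a proper resampling filter), and the function name is illustrative.

```python
import numpy as np

def normalize_resolution(clips):
    """Downsample every clip to the smallest frame size present,
    using naive nearest-neighbour index sampling."""
    th = min(clip[0].shape[0] for clip in clips)  # target height
    tw = min(clip[0].shape[1] for clip in clips)  # target width
    out = []
    for clip in clips:
        h, w = clip[0].shape[:2]
        ys = np.arange(th) * h // th  # source rows to keep
        xs = np.arange(tw) * w // tw  # source columns to keep
        out.append([frame[np.ix_(ys, xs)] for frame in clip])
    return out
```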
  • Step S220 performing frame insertion at at least one to-be-spliced position of the at least two video clips to generate an inserted frame image.
  • the position to be spliced refers to the position where different video clips are spliced, and may include a start position, an end position, and a position between any two adjacent video clips.
  • the start position is located before the first video segment of the at least two video segments
  • the end position is located after the last video segment of the at least two video segments.
  • Considering that the target video may be played in a loop, the last video clip and the first video clip may also be spliced to each other, so the above-mentioned start position and end position can also serve as positions to be spliced.
  • As shown in FIG. 3, video clip 1, video clip 2, ..., and video clip n to be spliced are obtained; the start position P1 is before video clip 1, the end position Pn+1 is after video clip n, and the positions between every two adjacent video clips in the middle include P2, P3, etc. P1 to Pn+1 are all positions to be spliced.
  • the number of video segments to be spliced is at least two, thus including at least three locations to be spliced.
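The count of splice positions (n clips yield n+1 positions) can be made concrete with a small helper; the representation is an illustrative choice, not from the patent.

```python
def splice_positions(num_clips):
    """Return the (left, right) clip indices around each splice position.
    None marks the start/end, where only one side has a clip."""
    positions = [(None, 0)]  # P1: the start position
    positions += [(i, i + 1) for i in range(num_clips - 1)]  # P2 .. Pn
    positions.append((num_clips - 1, None))  # Pn+1: the end position
    return positions
```

For two clips this yields three positions, matching the statement above.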
  • frame interpolation may be performed at at least one of the positions to be spliced, so as to increase the continuity between the two video segments before and after the position to be spliced.
  • the start position P1 and the end position Pn+1 are two relatively special positions, and only one side has a video clip.
  • The positions to be spliced are divided into two categories below. The first type of positions to be spliced are the intermediate positions, that is, the positions between two adjacent video clips, including P2, P3, etc. in FIG. 3; the second type are the start and end positions. The following describes how to insert frames for each of the two types of positions to be spliced:
  • Step S220 may include the following step S410:
  • Step S410 perform frame interpolation according to at least one frame of image in the previous video clip at the position to be spliced and at least one frame of image in the next video clip at the position to be spliced to generate an inserted frame image.
  • For example, for the position to be spliced P3, the previous video clip is video clip 2 and the next video clip is video clip 3. One or more frames of images are selected from video clip 2, one or more frames are also selected from video clip 3, and frame interpolation is performed at P3 according to the selected images.
  • Frame interpolation may be implemented by FRC (Frame Rate Conversion) techniques, such as MEMC (Motion Estimation and Motion Compensation), the optical flow method, neural network processing, etc.
  • The images selected for frame insertion from the previous video clip and the next video clip may be the end frame image of the previous video clip and the start frame image of the next video clip. These two frames are the boundary frame images at the position to be spliced, and performing frame interpolation based on them yields results of high accuracy.
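As a minimal stand-in for interpolating between the two boundary frames (the patent uses MEMC, optical flow, or neural networks rather than simple blending), transition frames can be generated by linear blending at equal time phases; the function and parameters are illustrative.

```python
import numpy as np

def crossfade_frames(end_frame, start_frame, num_inserted):
    """Generate num_inserted transition frames between the end frame of
    the previous clip and the start frame of the next clip by linear
    blending at equal time phases (a stand-in for MEMC interpolation)."""
    a = end_frame.astype(np.float64)
    b = start_frame.astype(np.float64)
    inserted = []
    for k in range(1, num_inserted + 1):
        phase = k / (num_inserted + 1)  # time phase in (0, 1)
        inserted.append(((1 - phase) * a + phase * b).astype(np.uint8))
    return inserted
```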
  • If a video clip is a single-frame video clip, such as video clip 4 in FIG. 3, its single frame serves as both the start frame and the end frame of video clip 4. For example, the single frame of video clip 4 and the end frame of video clip 3 are used for frame interpolation at P4. If the previous video clip and the next video clip are both single-frame video clips, it is sufficient to insert frames at the position to be spliced between the two single frames.
  • Step S220 may further include the following step S420:
  • Step S420 according to at least one frame of image in the first video clip and at least one frame of image in the last video clip, perform frame interpolation at the above-mentioned starting position and/or the above-mentioned ending position to generate an interpolated frame image.
  • one or more frames of images can be selected from video clip 1, and one or more frames of images can be selected from video clip n.
  • For the start position and the end position, the previous video clip can be regarded as the last video clip of the video clip sequence, and the following video clip as the first video clip of the sequence, and frames are interpolated between these two video clips. The generated interpolated frame images can all be placed at the start position, or all at the end position, or partly at the start position and partly at the end position.
  • the above-mentioned images selected for frame insertion in the first video segment and the last video segment may be: a start frame image in the first video segment and an end frame image in the last video segment.
  • In this way, the content of the inserted frames is a transition picture from the end frame image of the whole spliced video back to its start frame image.
  • To perform frame interpolation by MEMC, determine one image as the reference image and the other as the current image; for example, take F1 as the reference image and F2 as the current image.
  • Divide the images into image blocks, whose size can be set according to the actual situation. Traverse the reference image F1 block by block, find the matching block of each image block in the current image, and determine, according to the position change between each image block and its matching block, the MV (Motion Vector) of the current image relative to the reference image, recorded as the forward MV.
  • Estimate the motion state of the image blocks at different time phases, and correct the MVs accordingly to determine the interpolated blocks.
  • According to the mapped MVs, perform weighted interpolation between each interpolated block's corresponding image blocks in the current image F2 and the reference image F1, so as to obtain each pixel value and generate the interpolated frame image.
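The forward-MV step of the MEMC procedure above can be sketched as exhaustive block matching with a sum-of-absolute-differences (SAD) criterion. This is a simplified illustration (fixed block size, full search, no time-phase correction), assuming grayscale NumPy frames; names and defaults are hypothetical.

```python
import numpy as np

def forward_mvs(ref, cur, block=8, search=4):
    """For each block of the reference frame, find the displacement
    (dy, dx) within +/- search pixels that minimizes SAD against the
    current frame; this displacement is the block's forward MV."""
    h, w = ref.shape
    mvs = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            blk = ref[by:by + block, bx:bx + block].astype(np.int64)
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= h - block and 0 <= x <= w - block:
                        cand = cur[y:y + block, x:x + block].astype(np.int64)
                        sad = int(np.abs(cand - blk).sum())
                        if best_sad is None or sad < best_sad:
                            best_sad, best_mv = sad, (dy, dx)
            mvs[(by, bx)] = best_mv
    return mvs
```

A production MEMC pipeline would additionally correct these MVs per time phase and blend pixels from both frames, as the bullets above describe.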
  • the number and time phase of the inserted frames can be determined through the following steps S610 to S650:
  • Step S610 select at least two frames of images in the previous video segment of the position to be spliced, and obtain a first MV by performing motion estimation on them;
  • Step S620 select at least two frames of images in the next video segment of the position to be spliced, and obtain a second MV by performing motion estimation on them;
  • Step S630 according to the boundary motion state between the previous video clip and the next video clip, obtain a third MV between the two video clips;
  • Step S640 draw a time-motion curve according to the first MV, the second MV, the third MV, and the time stamps of the images selected in the previous video clip and the next video clip;
  • Step S650 in the part between the previous video segment and the next video segment in the time-motion curve, according to the actual motion state, determine the number and time phase of the interpolated frames.
  • A number of interpolation points can be determined at equal or unequal time phases, so as to determine the number and time phases of the interpolated frames.
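A drastically simplified version of this planning step could assume the motion magnitude is roughly constant across the splice and insert enough frames to keep per-frame motion under a budget. The threshold, averaging rule and names here are hypothetical, not the patent's time-motion-curve analysis.

```python
def interp_frame_plan(mv_prev, mv_next, max_motion_per_frame=4.0):
    """Choose the number of inserted frames and their equal time phases
    so that the per-frame motion (mean of the two boundary motion
    magnitudes) stays below max_motion_per_frame."""
    boundary_motion = (abs(mv_prev) + abs(mv_next)) / 2.0
    n = max(1, int(boundary_motion / max_motion_per_frame))
    phases = [k / (n + 1) for k in range(1, n + 1)]
    return n, phases
```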
  • Step S230 splicing the video clip and the interpolated frame image to obtain the target video.
  • Splice all video clips and interpolated frame images, and set the timestamp of each frame image, for example according to the required video frame rate (such as 24 fps, 30 fps or 60 fps), so as to obtain a complete target video.
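The splicing-and-timestamping step might look as follows, under the illustrative assumption that each splice position Pi contributes a (possibly empty) list of inserted frames, with inserted[i] going before clip i and the final list at the end position.

```python
def splice_and_timestamp(clips, inserted, fps=30):
    """Interleave clips with per-position inserted-frame lists, then
    assign each frame a timestamp (seconds) at the target frame rate.
    len(inserted) must be len(clips) + 1 (positions P1 .. Pn+1)."""
    frames = []
    for clip_frames, ins in zip(clips, inserted):
        frames.extend(ins)          # frames for P1, P2, ... (may be empty)
        frames.extend(clip_frames)
    frames.extend(inserted[len(clips)])  # frames for the end position
    return [(round(i / fps, 6), f) for i, f in enumerate(frames)]
```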
  • The target video can be set to loop playback, that is, after playing to the end of the video, it jumps back to the beginning and replays. Based on the frame images inserted at the start position or the end position, the picture remains continuous when jumping from the end of the target video back to its beginning, forming a self-looping look and feel.
  • the foregoing video clips may be arranged in a splicing sequence.
  • the splicing order between video clips can be any specified or determined order.
  • An exemplary manner for determining the splicing sequence is provided below, but the following content should not limit the scope of protection of the present disclosure:
  • the splicing sequence is set by the user. For example, after acquiring multiple video clips, they are displayed in a sorting interface, and the user is allowed to drag the video clips to change the order.
  • the splicing sequence is determined according to the shooting time of the video clips, and the video clips are usually arranged in the order of shooting time from early to late.
  • The splicing sequence may also be determined according to the moving path of the foreground part in the video clips. For example, identify the background part and the foreground part in each video clip, determine a static reference object in the background part, such as a tree or a building, determine the moving path of the foreground part relative to the reference object, for example moving from its left side to its right side, and determine the splicing sequence between different video clips according to the path.
  • a duplicate frame between two adjacent video clips may also be detected, and the duplicate frame may be deleted from any one of the video clips.
  • Duplicate frames refer to duplicate pictures that appear in two adjacent video clips; for example, there may be an intersection between two video clips. Deleting the duplicate frames from either video clip prevents the same picture from appearing repeatedly. It should be noted that a certain video clip may appear completely within another video clip; for example, if video clip A appears completely in video clip B, A can be regarded as a subset of B and deleted entirely. In that case, the duplicate frames between B and its other adjacent video clips can be further compared and deleted.
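A byte-exact duplicate-frame filter for adjacent clips could be sketched as below; a real detector would use a similarity measure rather than exact equality, and the function name is illustrative. Frames are assumed to be NumPy arrays.

```python
import numpy as np

def drop_duplicate_frames(prev_clip, next_clip):
    """Remove from the start of next_clip any frames that already appear
    in prev_clip, so the same picture is not played twice at a splice."""
    seen = {f.tobytes() for f in prev_clip}
    trimmed = list(next_clip)
    while trimmed and trimmed[0].tobytes() in seen:
        trimmed.pop(0)
    return trimmed
```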
  • Example 1 Referring to FIG. 7 , two video clips are acquired, namely video clip 1 and video clip 2 .
  • Video clip 1 includes N frames of images, and the Kth frame is detected as a change point, with the video content before and after it differing greatly. Taking the Kth frame as the split point, video clip 1 is divided into a first short video clip (frames 1 to K-1) and a second short video clip (frames K to N).
  • Video clip 2 includes M frames of images, and its background part is similar to that of video clip 1, only the position of the foreground part being different. A representative frame (denoted as the Z frame) is selected and retained, and the rest are deleted to form a single-frame video clip. The first short video clip, the Z frame, and the second short video clip then form a video clip sequence.
  • The video clip sequence includes 4 positions to be spliced. Insert frame T1 between the first short video clip and the Z frame, insert frame T2 between the Z frame and the second short video clip, and splice frames 1 to K-1, T1, Z, T2, and frames K to N to obtain the target video.
  • Example 2: acquire two video clips, namely video clip 1 and video clip 2. Select one change point in video clip 1, and split it into a first short video clip and a second short video clip; select two change points in video clip 2, and split it into a third, a fourth and a fifth short video clip. According to the set splicing sequence, in the order of the first, third, second, fifth and fourth short video clips, interpolate frames between every two adjacent short video clips, and finally splice them into the target video.
  • In the above solution, the correlation between video image frames is used to insert frames between different video clips, which adds transitional content between the video clips, improves the continuity of the picture, and solves the problem of sudden picture changes in video splicing.
  • the processing process of this solution is simple, and can be implemented based on the video clips to be spliced, without additional information, and has high practicability.
  • The video processing apparatus 800 may include a processor 810 and a memory 820, wherein the processor 810 is configured to execute the following program modules stored in the memory 820:
  • the acquiring module 821 is configured to acquire at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images;
  • the frame insertion module 822 is configured to perform frame insertion at at least one to-be-spliced position in the above-mentioned at least two video clips to generate an inserted frame image;
  • the splicing module 823 is configured to splice the video clips and the interpolated frame image to obtain the target video.
  • The position to be spliced includes a start position, an end position, and a position between any two adjacent video clips; the start position is located before the first video clip of the at least two video clips, and the end position is located after the last video clip of the at least two video clips.
  • the frame insertion module 822 is configured to:
  • Frame interpolation is performed according to at least one frame of image in the previous video clip of the position to be spliced and at least one frame of image in the next video clip of the position to be spliced to generate an inserted frame image.
  • At least one frame image in the previous video clip includes an end frame image in the previous video clip
  • at least one frame image in the subsequent video clip includes a start frame image in the subsequent video clip
  • the frame insertion module 822 is configured to:
  • frame interpolation is performed at the start position and/or the end position to generate an interpolated frame image.
  • At least one frame of image in the first video clip includes a start frame image in the first video clip
  • at least one frame of image in the last video clip includes an end frame image in the last video clip
  • the splicing module 823 is further configured to:
  • the frame insertion module 822 is further configured to:
  • the number and time phase of the interpolated frames are determined according to the actual motion state.
  • the frame insertion module 822 is further configured to:
  • the at least two video clips are arranged in a splicing sequence.
  • the frame insertion module 822 is further configured to:
  • the splicing sequence is determined by any one or more of the following methods:
  • the splicing sequence is determined according to the moving path of the foreground part in each video clip.
  • the frame insertion module 822 is further configured to:
  • duplicate frames between two adjacent video clips are detected, and duplicate frames are removed from either video clip.
  • the at least two video clips include single-frame video clips.
  • the obtaining module 821 is further configured to:
  • Exemplary embodiments of the present disclosure further provide another video processing apparatus.
  • the video processing apparatus 900 may include:
  • the acquiring module 910 is configured to acquire at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images;
  • the frame insertion module 920 is configured to perform frame insertion at at least one to-be-spliced position in the above at least two video clips to generate an inserted frame image;
  • the splicing module 930 is configured to splice the video clips and the interpolated frame image to obtain the target video.
  • The position to be spliced includes a start position, an end position, and a position between any two adjacent video clips; the start position is located before the first video clip of the at least two video clips, and the end position is located after the last video clip of the at least two video clips.
  • the frame insertion module 920 is configured to:
  • Frame interpolation is performed according to at least one frame of image in the previous video clip of the position to be spliced and at least one frame of image in the next video clip of the position to be spliced to generate an inserted frame image.
  • At least one frame of image in the previous video clip includes an end frame image in the previous video clip
  • at least one frame image in the next video clip includes a start frame image in the next video clip
  • the frame insertion module 920 is configured to:
  • frame interpolation is performed at the start position and/or the end position to generate the frame interpolation image.
  • At least one frame of image in the first video clip includes a start frame image in the first video clip
  • at least one frame of image in the last video clip includes an end frame image in the last video clip.
  • the splicing module 930 is further configured to:
  • the frame insertion module 920 is further configured to:
  • the number and time phase of the interpolated frames are determined according to the actual motion state.
  • the frame insertion module 920 is further configured to:
  • the at least two video clips are arranged in a splicing sequence.
  • the frame insertion module 920 is further configured to:
  • the splicing sequence is determined by any one or more of the following methods:
  • the splicing sequence is determined according to the moving path of the foreground part in each video clip.
  • the frame insertion module 920 is further configured to:
  • duplicate frames between two adjacent video clips are detected, and duplicate frames are removed from either video clip.
  • the at least two video clips include single-frame video clips.
  • the obtaining module 910 is further configured to:
  • Exemplary embodiments of the present disclosure also provide a computer-readable storage medium on which a program product capable of implementing the above-described method of the present specification is stored.
  • Various aspects of the present disclosure can also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps described in the "Exemplary Methods" section above according to various exemplary embodiments of the present disclosure, for example, any one or more of the steps in FIG. 2 or FIG. 4.
  • the program product may take the form of a portable compact disk read only memory (CD-ROM) and include program code, and may be executed on a terminal device, such as a personal computer.
  • a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • The remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • Aspects of the present disclosure may be implemented as a system, method, or program product. Therefore, various aspects of the present disclosure can be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
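The splicing-with-interpolation flow running through the embodiments above can be illustrated with a minimal sketch. This is not the claimed implementation: the disclosure's interpolation is motion-estimation/motion-compensation (MEMC) based, whereas here a simple linear blend of pixel values stands in for that step; frames are toy 2D grayscale lists, and all function names are illustrative.

```python
def blend(frame_a, frame_b, t):
    """Linearly blend two equal-sized grayscale frames at time phase t in (0, 1).

    A stand-in for true MEMC interpolation: a real system would estimate
    motion vectors and compensate, rather than cross-fade pixel values.
    """
    return [[(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def splice_with_interpolation(clips, num_inserted=1):
    """Concatenate clips, inserting interpolated frames at each junction.

    For each pair of adjacent clips, frames are interpolated between the end
    frame of the former and the start frame of the latter, at evenly spaced
    time phases, before the next clip's frames are appended.
    """
    target = []
    for i, clip in enumerate(clips):
        target.extend(clip)
        if i + 1 < len(clips):
            prev_end = clip[-1]
            next_start = clips[i + 1][0]
            for k in range(1, num_inserted + 1):
                t = k / (num_inserted + 1)
                target.append(blend(prev_end, next_start, t))
    return target
```

For two single-frame clips and one inserted frame, the resulting target video has three frames, with the middle frame halfway between the two originals.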

Abstract

A video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device. The video processing method comprises: acquiring at least two video clips to be spliced, at least one of which comprises two or more frames of images (S210); performing frame interpolation at at least one position to be spliced in the at least two video clips, so as to generate an interpolated frame image (S220); and splicing the video clips and the interpolated frame image, so as to obtain a target video (S230). As such, the problem of abrupt picture changes in video splicing is solved.

Description

Video Processing Method, Video Processing Apparatus, Storage Medium, and Electronic Device
This application claims priority to the Chinese patent application No. 202010827407.7, filed on August 17, 2020 and titled "Video Processing Method, Video Processing Apparatus, Storage Medium and Electronic Device", the entire content of which is incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the technical field of image and video processing, and in particular to a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device.
BACKGROUND
Video splicing refers to combining multiple videos into one video. In the related art, different videos are usually simply merged, for example by splicing the next video after the last frame of the previous video, that is, end-to-end splicing, to form a spliced video that can be played continuously.
SUMMARY OF THE INVENTION
The present disclosure provides a video processing method, a video processing apparatus, a computer-readable storage medium, and an electronic device.
According to a first aspect of the present disclosure, a video processing method is provided, including: acquiring at least two video clips to be spliced, at least one of which includes two or more frames of images; performing frame interpolation at at least one position to be spliced in the at least two video clips to generate an interpolated frame image; and splicing the video clips and the interpolated frame image to obtain a target video.
According to a second aspect of the present disclosure, a video processing apparatus is provided, including: an acquiring module configured to acquire at least two video clips to be spliced, at least one of which includes two or more frames of images; a frame interpolation module configured to perform frame interpolation at at least one position to be spliced in the at least two video clips to generate an interpolated frame image; and a splicing module configured to splice the video clips and the interpolated frame image to obtain a target video.
According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the video processing method of the first aspect and its possible implementations are realized.
According to a fourth aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor, where the processor is configured to perform, by executing the executable instructions, the video processing method of the first aspect and its possible implementations.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a schematic diagram of a mobile terminal in this exemplary embodiment;
FIG. 2 shows a flowchart of a video processing method in this exemplary embodiment;
FIG. 3 shows a schematic diagram of a video clip sequence in this exemplary embodiment;
FIG. 4 shows a flowchart of frame interpolation in this exemplary embodiment;
FIG. 5 shows a schematic diagram of frame interpolation by MEMC in this exemplary embodiment;
FIG. 6 shows a flowchart of determining the number and time phases of interpolated frames in this exemplary embodiment;
FIG. 7 shows a schematic diagram of video processing in this exemplary embodiment;
FIG. 8 shows a structural block diagram of a video processing apparatus in this exemplary embodiment;
FIG. 9 shows a structural block diagram of a video processing apparatus in this exemplary embodiment.
DETAILED DESCRIPTION
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments, however, can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and their repeated description will be omitted. Some of the block diagrams shown in the figures are functional entities that do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In the related art, videos are usually spliced end-to-end. However, the content of different videos usually differs, so a video obtained by such splicing has poor content continuity, and abrupt picture changes may occur at the splicing positions, affecting the viewing experience.
In view of the above problems, exemplary embodiments of the present disclosure provide a video processing method for splicing different video clips. Application scenarios of this method include but are not limited to: a user selects multiple video clips in an album, and this exemplary embodiment splices them into one target video for playback; in video editing, multiple videos to be spliced are processed by this exemplary embodiment.
Exemplary embodiments of the present disclosure provide an electronic device to execute the above video processing method. The electronic device generally includes a processor and a memory; the memory stores executable instructions of the processor and may also store application data such as images and videos, and the processor executes the executable instructions to implement the video processing. The electronic device may be a terminal device such as a smartphone, tablet computer, smart wearable device, drone, desktop computer, vehicle-mounted smart device, or game console, or a server device, such as a platform server providing video processing services.
The following takes the mobile terminal 100 in FIG. 1 as an example to describe the structure of the above electronic device. Those skilled in the art will understand that, apart from components specifically intended for mobile purposes, the configuration in FIG. 1 can also be applied to stationary devices.
As shown in FIG. 1, the mobile terminal 100 may specifically include: a processor 110, an internal memory 121, an external memory interface 122, a USB (Universal Serial Bus) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 171, a receiver 172, a microphone 173, a headphone jack 174, a sensor module 180, a display screen 190, a camera module 191, an indicator 192, a motor 193, keys 194, a Subscriber Identification Module (SIM) card interface 195, and the like.
The processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, an encoder, a decoder, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). The AP, GPU, and the like can process image and video data, for example performing frame interpolation at the positions of video clips to be spliced, generating interpolated frame images, and splicing the video clips and the interpolated frame images into a target video.
The encoder can encode (i.e., compress) image or video data, for example encoding the target video data obtained after video processing to form corresponding code stream data, so as to reduce the bandwidth occupied by data transmission; the decoder can decode (i.e., decompress) the code stream data of an image or video to restore the image or video data, for example decoding the acquired original video clips to obtain the image data of each frame in the clips. The mobile terminal 100 may support one or more encoders and decoders. In this way, the mobile terminal 100 can process images or videos in various encoding formats, for example image formats such as JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics), and BMP (Bitmap), and video formats such as MPEG (Moving Picture Experts Group)-1, MPEG-2, H.263, H.264, and HEVC (High Efficiency Video Coding).
In some embodiments, the processor 110 may include one or more interfaces through which it connects with other components of the mobile terminal 100.
The external memory interface 122 may be used to connect an external memory card. The internal memory 121 may be used to store computer-executable program code, and may also store data (such as images and videos) created during the use of the mobile terminal 100.
The USB interface 130 is an interface conforming to the USB standard specification, and can be used to connect a charger to charge the mobile terminal 100, or to connect earphones or other electronic devices.
The charging management module 140 is used to receive charging input from a charger. While charging the battery 142, the charging management module 140 can also supply power to the device through the power management module 141; the power management module 141 can also monitor the state of the battery.
The wireless communication function of the mobile terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like. The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. The mobile communication module 150 may provide wireless communication solutions applied to the mobile terminal 100, including 2G/3G/4G/5G. The wireless communication module 160 may provide wireless communication solutions applied to the mobile terminal 100, including Wireless Local Area Networks (WLAN) (such as Wireless Fidelity (Wi-Fi) networks), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), and Infrared (IR).
The mobile terminal 100 can implement the display function through the GPU, the display screen 190, the application processor, and the like; implement the shooting function through the ISP, the camera module 191, the encoder, the decoder, the GPU, the display screen 190, the application processor, and the like; and implement the audio function through the audio module 170, the speaker 171, the receiver 172, the microphone 173, the headphone jack 174, the application processor, and the like.
The sensor module 180 may include a depth sensor 1801, a pressure sensor 1802, a gyroscope sensor 1803, a barometric pressure sensor 1804, and the like.
The indicator 192 may be an indicator light, which can be used to indicate the charging state and battery level changes, or to indicate messages, missed calls, notifications, and the like. The motor 193 can generate vibration prompts and can also be used for touch vibration feedback. The keys 194 include a power key, volume keys, and the like.
The mobile terminal 100 may support one or more SIM card interfaces 195 for connecting SIM cards.
The video processing method of the exemplary embodiments of the present disclosure is described below with reference to FIG. 2, which shows a schematic flow of the method. The flow may include the following steps S210 to S230, each of which is described in detail below:
Step S210: acquire at least two video clips to be spliced, at least one of which includes two or more frames of images.
A video clip refers to original video material, which is essentially an independent video; since splicing follows, the video material is called a "clip" here to distinguish it from the spliced video. Each video clip generally consists of two or more frames of images. In this exemplary embodiment, a single-frame video clip may also be acquired, that is, a single image is treated as a special type of video clip.
In a video recording scenario, multiple short videos or images shot at different times, or shot by different devices (including different cameras), can be acquired, with each short video or image treated as a separate video clip. Different video clips may be input serially or in parallel.
It should be noted that the initially acquired video clips may be encoded video data, in which case the video data may be processed by a video decoder or another parser to obtain the image data of each frame in the clip.
In an optional implementation, video clips with the same background part and similar foreground parts may be acquired to facilitate subsequent splicing.
In an optional implementation, after step S210 is performed, any one or more of the video clips may also be processed as follows, according to actual requirements:
① Split one video clip into multiple video clips. Generally, if a video clip contains multiple pieces of content, it can be split: for example, the similarity between adjacent frames in the clip is detected, positions with low similarity are determined as change points, and the video is split at the change points. Each video clip obtained in this way has high content continuity, which benefits subsequent splicing. In an optional implementation, one video clip may be selected as the main video and split, and the other video clips are then spliced with the split pieces of the main video, which is equivalent to inserting them at different positions of the main video.
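As an illustrative sketch of this splitting step (not the claimed implementation), inter-frame similarity can be approximated by the mean absolute pixel difference, and a position is treated as a change point when the difference between adjacent frames exceeds an assumed threshold; frames here are toy 2D grayscale lists.

```python
def frame_difference(f1, f2):
    """Mean absolute pixel difference between two equal-sized grayscale frames.

    A large difference corresponds to low similarity between the frames.
    """
    total = sum(abs(a - b) for r1, r2 in zip(f1, f2) for a, b in zip(r1, r2))
    pixels = len(f1) * len(f1[0])
    return total / pixels


def split_at_change_points(clip, threshold):
    """Split a clip wherever adjacent frames differ by more than `threshold`.

    Low similarity between frames i and i+1 marks a change point; the clip
    is cut there, so each resulting piece stays internally continuous.
    """
    pieces, current = [], [clip[0]]
    for prev, frame in zip(clip, clip[1:]):
        if frame_difference(prev, frame) > threshold:
            pieces.append(current)
            current = []
        current.append(frame)
    pieces.append(current)
    return pieces
```

A clip whose content jumps once is split into two continuous pieces; the threshold value is application-dependent and assumed here.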
② Combine two or more video clips into one. For example, if two video clips were shot synchronously by two cameras capturing the same scene from different angles, two frames with the same timestamp can be selected from the two clips and registered, and the registration parameters are then used to register and synthesize the two clips into one clip with more comprehensive and richer scene content. Alternatively, one clip may be a scene video and the other an illustration video, and the illustration video can be inserted at a certain position in the scene video to synthesize the two clips, and so on.
③ Delete one or more frames of images from a video clip to filter its content. For example, blurred frames may be deleted, frames with low relevance to the video's subject may be deleted, or a representative frame of the clip may be extracted and the other frames deleted, and so on.
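The disclosure does not specify how blurred frames are identified; one common heuristic that could realize this filtering is the variance of the Laplacian response, which is low for blurry (edge-poor) frames. The sketch below is illustrative only: frames are toy 2D grayscale lists and the sharpness threshold is an assumption.

```python
def laplacian_variance(frame):
    """Variance of the 3x3 Laplacian response over interior pixels.

    A common sharpness proxy: blurry frames have weak edges and therefore a
    low-variance Laplacian response.
    """
    h, w = len(frame), len(frame[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                frame[y - 1][x] + frame[y + 1][x]
                + frame[y][x - 1] + frame[y][x + 1]
                - 4 * frame[y][x]
            )
    if not responses:  # frame too small for an interior
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def drop_blurry_frames(clip, min_sharpness):
    """Keep only frames whose Laplacian variance reaches `min_sharpness`."""
    return [f for f in clip if laplacian_variance(f) >= min_sharpness]
```

A flat (featureless) frame scores zero and is dropped, while a high-contrast frame is kept.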
④ Adjust the resolution of the video clips, so that every clip has the same resolution. For example, a standard resolution may be set in advance, with clips below it up-sampled and clips above it down-sampled; or the clip with the lowest resolution is taken as the standard and the other clips are down-sampled, and so on.
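A minimal sketch of the variant that takes the lowest-resolution clip as the standard; nearest-neighbour sampling is used purely for illustration, and a real implementation would typically use a proper resampling filter. Frames are toy 2D grayscale lists and the function names are illustrative.

```python
def resize_nearest(frame, new_h, new_w):
    """Nearest-neighbour resize of a 2D grayscale frame to new_h x new_w."""
    h, w = len(frame), len(frame[0])
    return [[frame[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def normalize_resolution(clips):
    """Resize every frame of every clip to the lowest resolution among clips."""
    min_h = min(len(clip[0]) for clip in clips)
    min_w = min(len(clip[0][0]) for clip in clips)
    return [[resize_nearest(f, min_h, min_w) for f in clip] for clip in clips]
```

After normalization, a 2x2 clip is left unchanged while a 4x4 clip is down-sampled to 2x2, so all clips share one resolution before splicing.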
It should be noted that, in practical applications, the above four processing methods may be combined arbitrarily as required, and other processing methods not listed above may of course also be used, which is not limited in the present disclosure.
步骤S220,在上述至少两个视频片段中的至少一个待拼接位置进行插帧,生成插帧图像。Step S220, performing frame insertion at at least one to-be-spliced position of the at least two video clips to generate an inserted frame image.
待拼接位置是指不同视频片段进行拼接的位置，可以包括起始位置、结尾位置以及任意相邻两个视频片段之间的位置。其中，起始位置位于上述至少两个视频片段中的第一个视频片段之前，结尾位置位于上述至少两个视频片段中的最后一个视频片段之后。需要说明的是，在一些情况下，可以将第一个视频片段与最后一个视频片段进行首尾拼接，因此上述起始位置和结尾位置可以作为待拼接位置。The position to be spliced refers to a position where different video clips are spliced, and may include a start position, an end position, and a position between any two adjacent video clips. The start position is located before the first of the at least two video clips, and the end position is located after the last of them. It should be noted that, in some cases, the first video clip and the last video clip may be spliced head-to-tail, so the above start position and end position can also serve as positions to be spliced.
参考图3所示，获取待拼接的视频片段1、视频片段2、…、和视频片段n，位于视频片段1之前的为起始位置P1，位于视频片段n之后的为结尾位置Pn+1，中间相邻两个视频片段之间的位置包括P2、P3等，P1到Pn+1均为待拼接位置。Referring to Fig. 3, video clip 1, video clip 2, ..., and video clip n to be spliced are acquired; the position before video clip 1 is the start position P1, the position after video clip n is the end position Pn+1, and the positions between adjacent clips in the middle include P2, P3, etc. P1 to Pn+1 are all positions to be spliced.
待拼接的视频片段数量为至少两个,因此包括至少三个待拼接位置。本示例性实施方式中,可以在其中至少一个待拼接位置处进行插帧,以增加待拼接位置前后两个视频片段之间的连续性。The number of video segments to be spliced is at least two, thus including at least three locations to be spliced. In this exemplary embodiment, frame interpolation may be performed at at least one of the positions to be spliced, so as to increase the continuity between the two video segments before and after the position to be spliced.
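The position count above follows from n clips yielding n+1 candidate positions P1..Pn+1, so at least two clips yield at least three positions. A trivial sketch (hypothetical helper name):

```python
def splice_positions(num_clips):
    # P1 precedes clip 1, P(n+1) follows clip n, and P2..Pn lie between
    # adjacent clips, giving n + 1 candidate positions in total.
    assert num_clips >= 2, "at least two clips are required for splicing"
    return [f"P{i}" for i in range(1, num_clips + 2)]
```
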
在待拼接位置中，起始位置P1、结尾位置Pn+1是两个比较特殊的位置，只有一侧有视频片段。下面将待拼接位置分为两类，第一类待拼接位置是中间待拼接位置，即相邻两个视频片段之间的位置，包括图3中的P2、P3等；第二类待拼接位置是起始位置和结尾位置。对两类待拼接位置分别说明如何插帧：Among the positions to be spliced, the start position P1 and the end position Pn+1 are two special positions with a video clip on only one side. The positions to be spliced are divided into two categories below: the first type is an intermediate position to be spliced, i.e., a position between two adjacent video clips, including P2, P3, etc. in Figure 3; the second type is the start position and the end position. How frames are inserted is described below for each type:
针对于第一类待拼接位置,参考图4所示,步骤S220可以包括以下步骤S410:For the first type of positions to be spliced, referring to FIG. 4 , step S220 may include the following steps S410:
步骤S410,根据待拼接位置的前一视频片段中的至少一帧图像,与待拼接位置的后一视频片段中的至少一帧图像,进行插帧,生成插帧图像。Step S410, perform frame interpolation according to at least one frame of image in the previous video clip at the position to be spliced and at least one frame of image in the next video clip at the position to be spliced to generate an inserted frame image.
以图3中的P3为例，前一视频片段即视频片段2，后一视频片段即视频片段3，从视频片段2中选取一帧或多帧图像，从视频片段3中也选取一帧或多帧图像，根据所选取的图像在P3处进行插帧。在插帧时可以采用FRC（Frame Rate Conversion，帧率转换）技术，如MEMC（Motion Estimation and Motion Compensation，运动估计与运动补偿）、光流法、神经网络处理等。Taking P3 in FIG. 3 as an example, the previous video clip is video clip 2 and the next video clip is video clip 3. One or more frames are selected from video clip 2 and one or more frames from video clip 3, and frame interpolation is performed at P3 according to the selected images. FRC (Frame Rate Conversion) techniques can be used for the interpolation, such as MEMC (Motion Estimation and Motion Compensation), optical flow methods, neural network processing, etc.
进一步的，上述在前一视频片段与后一视频片段中所选取的用于插帧的图像可以是：前一视频片段中的结尾帧图像和后一视频片段中的起始帧图像，这两帧也是待拼接位置处的边界帧图像，通过这两帧进行插帧，结果的准确度较高。Further, the images selected for frame interpolation from the previous and next video clips may be the end frame image of the previous clip and the start frame image of the next clip. These two frames are also the boundary frame images at the position to be spliced, and interpolating from them yields results with higher accuracy.
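The embodiment interpolates with FRC techniques such as MEMC; as a deliberately simplified stand-in, the sketch below generates transition frames between the two boundary frames by linear cross-fading at equally spaced time phases. All names are hypothetical, and grayscale frames are assumed to be 2D lists.

```python
def blend_frames(frame_a, frame_b, phase):
    # Pixel-wise linear blend; phase in (0, 1) moves from frame_a to frame_b.
    return [[(1 - phase) * a + phase * b
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

def interpolate_boundary(end_frame, start_frame, num_inserted):
    # Generate num_inserted transition frames between the end frame of the
    # previous clip and the start frame of the next clip, at equal phases.
    phases = [(k + 1) / (num_inserted + 1) for k in range(num_inserted)]
    return [blend_frames(end_frame, start_frame, p) for p in phases]
```

A cross-fade cannot model object motion the way MEMC does, but it illustrates where the inserted frames sit on the timeline between the two clips.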
需要说明的是，如果某一视频片段为单帧视频片段，如图3中的视频片段4，则单帧既是视频片段4的起始帧，也是视频片段4的结尾帧，例如可以通过视频片段4的单帧与视频片段3的结尾帧，在P4处进行插帧。当然，也可能存在前一视频片段与后一视频片段均为单帧视频片段的情况，则通过两个单帧，在其中间的待拼接位置进行插帧即可。It should be noted that if a video clip is a single-frame clip, such as video clip 4 in Figure 3, the single frame is both the start frame and the end frame of video clip 4. For example, frames can be interpolated at P4 from the single frame of video clip 4 and the end frame of video clip 3. Of course, the previous and next clips may both be single-frame clips, in which case it suffices to interpolate at the position to be spliced between the two single frames.
针对于第二类待拼接位置,参考图4所示,步骤S220还可以包括以下步骤S420:For the second type of positions to be spliced, referring to FIG. 4 , step S220 may further include the following steps S420:
步骤S420,根据上述第一个视频片段中的至少一帧图像,与上述最后一个视频片段中的至少一帧图像,在上述起始位置和/或上述结尾位置进行插帧,生成插帧图像。Step S420, according to at least one frame of image in the first video clip and at least one frame of image in the last video clip, perform frame interpolation at the above-mentioned starting position and/or the above-mentioned ending position to generate an interpolated frame image.
如图3所示，可以在视频片段1中选取一帧或多帧图像，在视频片段n中选取一帧或多帧图像，根据视频片段n中的图像到视频片段1中的图像的运动趋势，进行插帧。换而言之，对于起始位置和结尾位置，可以将其前一视频片段看作是视频片段序列的最后一个视频片段，将其后一视频片段看作是视频片段序列的第一个视频片段，然后在这两个视频片段之间进行插帧。对于生成的插帧图像，可以全部放至起始位置，也可以全部放至结尾位置，还可以将一部分插帧图像放至起始位置，另一部分放至结尾位置。As shown in FIG. 3, one or more frames can be selected from video clip 1 and one or more frames from video clip n, and frames are interpolated according to the motion trend from the images in video clip n to the images in video clip 1. In other words, for the start and end positions, the preceding clip can be regarded as the last clip of the sequence and the following clip as the first clip of the sequence, and frames are then interpolated between these two clips. The generated interpolated frames can all be placed at the start position, all at the end position, or split between the two positions.
进一步的,上述在第一个视频片段与最后一个视频片段中所选取的用于插帧的图像可以是:第一个视频片段中的起始帧图像和最后一个视频片段中的结尾帧图像。Further, the above-mentioned images selected for frame insertion in the first video segment and the last video segment may be: a start frame image in the first video segment and an end frame image in the last video segment.
需要说明的是,此处插帧的内容为从全部视频片段的结尾帧图像回到全部视频片段的起始帧图像的过渡画面。It should be noted that the content of the inserted frame here is a transition picture from the end frame image of all video clips back to the start frame image of all video clips.
下面以MEMC为例,对插帧的过程进行具体说明。参考图5所示,假设选取了前一视频片段的结尾帧图像F1与后一视频片段的起始帧图像F2。The following takes MEMC as an example to describe the frame insertion process in detail. Referring to FIG. 5 , it is assumed that the end frame image F1 of the previous video clip and the start frame image F2 of the subsequent video clip are selected.
首先将一张图像确定为参考图像,另一张为当前图像,例如以F1为参考图像,F2为当前图像;First, determine one image as the reference image and the other as the current image, for example, take F1 as the reference image and F2 as the current image;
对当前图像F2分块，图像块的大小可以根据实际情况设定，按照不同图像块在参考图像F1中进行遍历，寻找各个图像块的匹配块，根据图像块与其匹配块之间的位置变化确定当前图像相对于参考图像的MV（Motion Vector，运动矢量），记为前向MV；Divide the current image F2 into blocks (the block size can be set according to the actual situation), traverse the reference image F1 for each block to find its matching block, and determine the MV (Motion Vector) of the current image relative to the reference image from the position change between each block and its matching block; this is recorded as the forward MV;
同理,采用上述操作,也可以确定参考图像F1的各个图像块相对于当前图像F2的位置变化,得到后向MV;Similarly, by using the above operations, it is also possible to determine the position change of each image block of the reference image F1 relative to the current image F2 to obtain the backward MV;
将前向MV和后向MV进行一定的修正操作,如滤波、加权等;Perform certain correction operations on the forward MV and the backward MV, such as filtering, weighting, etc.;
根据插帧的时间相位，沿着前向MV和后向MV，估计图像块在不同时间相位的运动状态，并进行MV的重新校正，以确定插值块，按照插值块相对于当前图像F2和参考图像F1的映射MV，对插值块在当前图像F2和参考图像F1的对应图像块进行权重插值，从而得到每个像素值，生成插帧图像。According to the time phase of the interpolated frame, estimate the motion state of each image block at different time phases along the forward and backward MVs, and re-correct the MVs to determine the interpolation blocks; then, according to the mapping MVs of each interpolation block relative to the current image F2 and the reference image F1, perform weighted interpolation on the corresponding blocks of F2 and F1 to obtain each pixel value and generate the interpolated frame image.
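The block-matching core of the MEMC procedure above can be sketched as follows: exhaustive-search motion estimation for one block, followed by a motion-compensated weighted blend at a given time phase. This is a minimal illustrative sketch with hypothetical names; a real MEMC pipeline additionally performs bidirectional MV estimation, MV filtering/weighting, and per-phase re-correction as described in the steps above. Grayscale frames are assumed to be 2D lists.

```python
def get_block(img, y, x, size):
    # Extract a size x size block with its top-left corner at (y, x).
    return [row[x:x + size] for row in img[y:y + size]]

def mad(block_a, block_b):
    # Mean absolute difference between two equally sized blocks.
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb)) / (len(block_a) * len(block_a[0]))

def estimate_mv(cur, ref, y, x, size, search):
    # Exhaustive block matching: find the displacement (dy, dx) that moves
    # the block at (y, x) in the current image to its best match in the
    # reference image, minimising the MAD cost.
    h, w = len(ref), len(ref[0])
    block = get_block(cur, y, x, size)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= h - size and 0 <= xx <= w - size:
                cost = mad(block, get_block(ref, yy, xx, size))
                if cost < best_cost:
                    best_cost, best = cost, (dy, dx)
    return best

def interpolate_block(cur, ref, y, x, size, mv, phase):
    # Interpolated block for the frame at time `phase` (0 = reference F1,
    # 1 = current F2): a weighted blend of the matched block in F1 and the
    # block in F2. In a full implementation its position in the interpolated
    # frame would be (y, x) shifted by (1 - phase) * mv.
    dy, dx = mv
    b_ref = get_block(ref, y + dy, x + dx, size)
    b_cur = get_block(cur, y, x, size)
    return [[(1 - phase) * r + phase * c for r, c in zip(rr, rc)]
            for rr, rc in zip(b_ref, b_cur)]
```

For a 2x2 bright patch that shifts two pixels to the right between F1 and F2, the search recovers the displacement (0, -2) back toward the reference, and the mid-phase blend reproduces the patch intensity.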
在一种可选的实施方式中,参考图6所示,插帧的数量与时间相位可以通过以下步骤S610至S650来确定:In an optional implementation manner, referring to FIG. 6 , the number and time phase of the inserted frames can be determined through the following steps S610 to S650:
步骤S610,在待拼接位置的前一视频片段中选取至少两帧图像,通过对其进行运动估计,得到第一MV;Step S610, select at least two frames of images in the previous video segment of the position to be spliced, and obtain a first MV by performing motion estimation on them;
步骤S620，在待拼接位置的后一视频片段中选取至少两帧图像，通过对其进行运动估计，得到第二MV；Step S620, select at least two frames of images in the next video clip at the position to be spliced, and obtain a second MV by performing motion estimation on them;
步骤S630,根据上述前一视频片段与后一视频片段之间的边界运动状态,得到这两个视频片段之间的第三MV;Step S630, according to the boundary motion state between the previous video clip and the next video clip, obtain a third MV between the two video clips;
步骤S640,根据第一MV、第二MV、第三MV,以及在上述前一视频片段与后一视频片段中所选取的图像的时间戳,绘制时间-运动曲线;Step S640, draw a time-motion curve according to the first MV, the second MV, the third MV, and the time stamps of the images selected in the previous video clip and the next video clip;
步骤S650,在时间-运动曲线中上述前一视频片段与后一视频片段之间的部分,根据实际运动状态,确定插帧的数量与时间相位。例如可以通过等时间相位或非等时间相位确定若干插值点,从而确定插帧的数量与时间相位。Step S650, in the part between the previous video segment and the next video segment in the time-motion curve, according to the actual motion state, determine the number and time phase of the interpolated frames. For example, a number of interpolation points can be determined by equal time phase or unequal time phase, so as to determine the number and time phase of the interpolation frames.
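Step S650's selection of interpolation points can be sketched as follows: given a sampled time-motion curve (assumed monotonically non-decreasing, starting at zero), insertion times are chosen so that cumulative motion, rather than time, is divided evenly, so faster motion receives denser frames. Equal time phases are the special case of a linear curve. Names are hypothetical.

```python
def time_for_motion(ts, motion, target):
    # Linear inverse lookup on a piecewise-linear time-motion curve:
    # return the time at which cumulative motion reaches `target`.
    for (t0, m0), (t1, m1) in zip(zip(ts, motion), zip(ts[1:], motion[1:])):
        if m0 <= target <= m1:
            if m1 == m0:
                return t0
            return t0 + (t1 - t0) * (target - m0) / (m1 - m0)
    return ts[-1]

def motion_equal_phases(ts, motion, num_frames):
    # Pick num_frames insertion times that divide cumulative motion evenly.
    total = motion[-1]
    return [time_for_motion(ts, motion, total * (k + 1) / (num_frames + 1))
            for k in range(num_frames)]
```

On a linear curve this reduces to equal time phases; on a curve where most motion happens early, the insertion times shift earlier accordingly.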
步骤S230,将视频片段和插帧图像进行拼接,得到目标视频。Step S230, splicing the video clip and the interpolated frame image to obtain the target video.
将全部的视频片段和插帧图像进行拼接，并设置每一帧图像的时间戳，例如可以按照所需的视频帧率（如24fps、30fps、60fps等）设置，从而得到完整的目标视频。Splice all the video clips and interpolated frame images, and set the timestamp of each frame, for example according to the required video frame rate (such as 24fps, 30fps, 60fps, etc.), so as to obtain the complete target video.
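The splicing and timestamping of step S230 can be sketched as follows (an illustrative sketch with hypothetical names; segments are the clips and interpolated-frame lists already arranged in splicing order):

```python
def splice(segments, fps):
    # Concatenate clips and interpolated-frame lists in splicing order,
    # then stamp each frame with its presentation time in seconds at the
    # required target frame rate.
    frames = [f for seg in segments for f in seg]
    return [(i / fps, f) for i, f in enumerate(frames)]
```

For example, splicing two clips with one inserted transition frame at 25fps spaces the timestamps 0.04s apart.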
需要说明的是，如果在起始位置或结尾位置进行插帧，则可以将目标视频设置为循环播放，即播放到视频结尾后跳转到视频开头重新播放，基于起始位置或结尾位置的插帧图像，在从目标视频的结尾跳转到开头时，画面呈现出连续性，形成首尾相接的自循环观感。It should be noted that if frames are interpolated at the start or end position, the target video can be set to loop playback, i.e., after playing to the end it jumps back to the beginning and replays. Thanks to the interpolated frames at the start or end position, the picture remains continuous when jumping from the end of the target video back to the beginning, producing a seamless head-to-tail looping effect.
在一种可选的实施方式中,在步骤S220之前,可以按照拼接顺序对上述各视频片段进行排列。视频片段之间的拼接顺序可以是任意指定或确定的顺序。下面提供确定拼接顺序的示例性方式,但下述内容不应对本公开的保护范围造成限定:In an optional implementation manner, before step S220, the foregoing video clips may be arranged in a splicing sequence. The splicing order between video clips can be any specified or determined order. An exemplary manner for determining the splicing sequence is provided below, but the following content should not limit the scope of protection of the present disclosure:
(1)由用户设置拼接顺序,例如在获取多个视频片段后,将其显示在某一排序界面中,允许用户拖动其中的视频片段以调换顺序。(1) The splicing sequence is set by the user. For example, after acquiring multiple video clips, they are displayed in a sorting interface, and the user is allowed to drag the video clips to change the order.
(2)根据视频片段的拍摄时间确定拼接顺序,通常按照拍摄时间由早到晚的顺序排列各视频片段。(2) The splicing sequence is determined according to the shooting time of the video clips, and the video clips are usually arranged in the order of shooting time from early to late.
(3)按照视频片段中前景部分的移动路径确定拼接顺序，例如识别视频片段中的背景部分与前景部分，并在背景部分中确定静止的参照物体，如一棵树、一个建筑等，通过前景部分与该参照物体之间的位置关系，确定前景部分的移动路径，如从参照物体的左侧移动到右侧，按照该路径确定不同视频片段之间的拼接顺序。(3) Determine the splicing sequence according to the moving path of the foreground in the video clips. For example, identify the background and foreground in each clip, determine a static reference object in the background (such as a tree or a building), determine the moving path of the foreground from its positional relationship with the reference object (e.g., moving from the left side of the reference object to the right side), and determine the splicing order of the different clips according to that path.
在一种可选的实施方式中，在步骤S220之前，还可以检测相邻两个视频片段之间的重复帧，并从其中任一视频片段中删除重复帧。重复帧是指相邻两个视频片段中出现的重复画面，例如两个视频片段中存在交集，从任一视频片段中删除重复帧，可以避免同一画面重复出现。需要说明的是，可能存在某一视频片段在另一视频片段中完全重复出现的情况，例如视频片段A在视频片段B中完全重复出现，A可视为B的子集，则可以将A全部删除，此时还可以进一步对比B与其他相邻视频片段的重复帧，并进行删除处理。In an optional implementation manner, before step S220, duplicate frames between two adjacent video clips may also be detected and deleted from either of the clips. Duplicate frames are repeated pictures that appear in two adjacent video clips, for example when the two clips overlap; deleting the duplicate frames from either clip prevents the same picture from appearing twice. It should be noted that one clip may be entirely repeated within another: for example, if video clip A appears entirely within video clip B, A can be regarded as a subset of B and deleted entirely; in that case, duplicate frames between B and its other adjacent clips can be further compared and deleted as well.
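Duplicate-frame removal as described above can be sketched with a simple tolerance-based frame comparison (hypothetical names; frames assumed to be grayscale 2D lists, whereas a real implementation would likely compare perceptual hashes or compressed signatures):

```python
def frames_equal(f1, f2, tol=1e-6):
    # Two frames are duplicates if every pixel matches within a tolerance.
    return all(abs(a - b) <= tol
               for r1, r2 in zip(f1, f2) for a, b in zip(r1, r2))

def drop_duplicate_frames(clip_a, clip_b):
    # Remove from clip_b every frame that also appears in clip_a.
    # If clip_b becomes empty, it was entirely contained in clip_a.
    return [f for f in clip_b
            if not any(frames_equal(f, g) for g in clip_a)]
```
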
下面通过两个具体示例,对上述视频处理方法做进一步说明。The above video processing method will be further described below through two specific examples.
示例一、参考图7所示，获取两个视频片段，分别为视频片段1和视频片段2。其中视频片段1包括N帧图像，检测到其中第K帧为变化点，变化点前后的视频内容差别较大，以K帧为分割点，将视频片段1分割为第一短视频片段（第1帧到第K-1帧）和第二短视频片段（第K帧到第N帧）；视频片段2包括M帧图像，背景部分与视频片段1相似，仅前景部分的位置不同，选取其中具有代表性的一帧（记为Z帧）予以保留，其余部分删除，形成单帧视频片段；按照第一短视频片段、Z帧、第二短视频片段排列，形成视频片段序列，该视频片段序列包括4个待拼接位置；在第一短视频片段与Z帧之间插帧T1，在Z帧与第二短视频片段之间插帧T2，拼接1~K-1、T1、Z、T2、K~N，得到目标视频。Example 1: Referring to FIG. 7, two video clips are acquired, namely video clip 1 and video clip 2. Video clip 1 includes N frames; the Kth frame is detected as a change point, with the video content before and after it differing considerably. Taking frame K as the split point, video clip 1 is divided into a first short clip (frames 1 to K-1) and a second short clip (frames K to N). Video clip 2 includes M frames whose background is similar to video clip 1, only the position of the foreground differing; one representative frame (denoted frame Z) is retained and the rest deleted, forming a single-frame clip. The clips are arranged as first short clip, frame Z, second short clip, forming a video clip sequence with 4 positions to be spliced. Frames T1 are interpolated between the first short clip and frame Z, frames T2 between frame Z and the second short clip, and 1 to K-1, T1, Z, T2, K to N are spliced to obtain the target video.
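The assembly in Example 1 can be sketched as a small sequencing function. This is purely illustrative: frames are represented by plain numbers, the interpolation step is passed in as a callback (here a simple average standing in for FRC), and all names are hypothetical.

```python
def assemble_example_one(clip1, z_frame, k, interp):
    # Split clip1 at change point K into first (frames 1..K-1) and
    # second (frames K..N) short clips, then splice:
    # first + T1 + Z + T2 + second, where T1/T2 come from `interp`.
    first, second = clip1[:k - 1], clip1[k - 1:]
    t1 = interp(first[-1], z_frame)
    t2 = interp(z_frame, second[0])
    return first + t1 + [z_frame] + t2 + second
```
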
示例二、获取两个视频片段，分别为视频片段1和视频片段2。在视频片段1中选取一个变化点，将视频片段1拆分为第一短视频片段和第二短视频片段；在视频片段2中选取两个变化点，将视频片段2拆分为第三短视频片段、第四短视频片段和第五短视频片段；根据设定拼接顺序，按照第一、第三、第二、第五、第四短视频片段的顺序排列，分别在每两个短视频片段之间进行插帧，最终拼接为目标视频。Example 2: Two video clips are acquired, namely video clip 1 and video clip 2. One change point is selected in video clip 1, splitting it into a first and a second short clip; two change points are selected in video clip 2, splitting it into a third, a fourth, and a fifth short clip. According to the set splicing sequence, the clips are arranged in the order first, third, second, fifth, fourth; frames are interpolated between every two adjacent short clips, and the result is finally spliced into the target video.
基于上述内容，本示例性实施方式中，一方面，在视频拼接时，利用视频图像帧之间的相关性，在不同视频片段之间进行插帧，以增加视频片段之间的过渡性内容，改善画面的连续性，解决视频拼接中画面突变的问题。另一方面，本方案处理过程简单，基于待拼接的视频片段即可实现，无需额外的信息，实用性较高。Based on the above, in this exemplary embodiment, on the one hand, the correlation between video image frames is used during splicing to interpolate frames between different video clips, adding transitional content between them, improving the continuity of the picture, and solving the problem of abrupt picture changes in video splicing. On the other hand, the processing of this solution is simple and can be implemented from the clips to be spliced alone, without additional information, making it highly practical.
本公开的示例性实施方式还提供一种视频处理装置，如图8所示，该视频处理装置800可以包括处理器810与存储器820，其中，处理器810用于执行存储器820中存储的以下程序模块：Exemplary embodiments of the present disclosure also provide a video processing apparatus. As shown in FIG. 8, the video processing apparatus 800 may include a processor 810 and a memory 820, wherein the processor 810 is configured to execute the following program modules stored in the memory 820:
获取模块821,被配置为获取待拼接的至少两个视频片段,其中的至少一个视频片段包括两帧或两帧以上图像;The acquiring module 821 is configured to acquire at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images;
插帧模块822,被配置为在上述至少两个视频片段中的至少一个待拼接位置进行插帧,生成插帧图像;The frame insertion module 822 is configured to perform frame insertion at at least one to-be-spliced position in the above-mentioned at least two video clips to generate an inserted frame image;
拼接模块823,被配置为将视频片段和插帧图像进行拼接,得到目标视频。The splicing module 823 is configured to splicing the video segment and the interpolated frame image to obtain the target video.
在一种可选的实施方式中，待拼接位置包括起始位置、结尾位置以及任意相邻两个视频片段之间的位置，起始位置位于至少两个视频片段中的第一个视频片段之前，结尾位置位于至少两个视频片段中的最后一个视频片段之后。In an optional implementation manner, the position to be spliced includes a start position, an end position, and a position between any two adjacent video clips; the start position is located before the first of the at least two video clips, and the end position is located after the last of the at least two video clips.
在一种可选的实施方式中,插帧模块822,被配置为:In an optional implementation manner, the frame insertion module 822 is configured to:
根据待拼接位置的前一视频片段中的至少一帧图像,与待拼接位置的后一视频片段中的至少一帧图像,进行插帧,生成插帧图像。Frame interpolation is performed according to at least one frame of image in the previous video clip of the position to be spliced and at least one frame of image in the next video clip of the position to be spliced to generate an inserted frame image.
在一种可选的实施方式中，前一视频片段中的至少一帧图像包括前一视频片段中的结尾帧图像，后一视频片段中的至少一帧图像包括后一视频片段中的起始帧图像。In an optional implementation manner, the at least one frame of image in the previous video clip includes the end frame image of the previous video clip, and the at least one frame of image in the next video clip includes the start frame image of the next video clip.
在一种可选的实施方式中,插帧模块822,被配置为:In an optional implementation manner, the frame insertion module 822 is configured to:
根据第一个视频片段中的至少一帧图像,与最后一个视频片段中的至少一帧图像,在起始位置和/或结尾位置进行插帧,生成插帧图像。According to at least one frame of image in the first video clip and at least one frame of image in the last video clip, frame interpolation is performed at the start position and/or the end position to generate an interpolated frame image.
在一种可选的实施方式中，第一个视频片段中的至少一帧图像包括第一个视频片段中的起始帧图像，最后一个视频片段中的至少一帧图像包括最后一个视频片段中的结尾帧图像。In an optional implementation manner, at least one frame of image in the first video clip includes the start frame image of the first video clip, and at least one frame of image in the last video clip includes the end frame image of the last video clip.
在一种可选的实施方式中,拼接模块823,还被配置为:In an optional implementation manner, the splicing module 823 is further configured to:
在得到目标视频后,将目标视频设置为循环播放。After getting the target video, set the target video to play in a loop.
在一种可选的实施方式中,插帧模块822,还被配置为:In an optional implementation manner, the frame insertion module 822 is further configured to:
在待拼接位置的前一视频片段中选取至少两帧图像,通过对其进行运动估计,得到第一运动矢量;Select at least two frames of images in the previous video segment of the position to be spliced, and obtain a first motion vector by performing motion estimation on them;
在待拼接位置的后一视频片段中选取至少两帧图像,通过对其进行运动估计,得到第二运动矢量;Select at least two frames of images in the next video segment of the position to be spliced, and obtain a second motion vector by performing motion estimation on them;
根据前一视频片段与后一视频片段之间的边界运动状态,得到前一视频片段与后一视频片段之间的第三运动矢量;According to the boundary motion state between the previous video clip and the next video clip, obtain the third motion vector between the previous video clip and the next video clip;
根据第一运动矢量、第二运动矢量、第三运动矢量,以及在前一视频片段与后一视频片段中所选取的图像的时间戳,绘制时间-运动曲线;Draw a time-motion curve according to the first motion vector, the second motion vector, the third motion vector, and the time stamps of the images selected in the previous video segment and the next video segment;
在时间-运动曲线中前一视频片段与后一视频片段之间的部分,根据实际运动状态,确定插帧的数量与时间相位。In the part between the previous video segment and the next video segment in the time-motion curve, the number and time phase of the interpolated frames are determined according to the actual motion state.
在一种可选的实施方式中,插帧模块822,还被配置为:In an optional implementation manner, the frame insertion module 822 is further configured to:
在进行插帧之前,按照拼接顺序对上述至少两个视频片段进行排列。Before performing frame insertion, the at least two video clips are arranged in a splicing sequence.
在一种可选的实施方式中,插帧模块822,还被配置为:In an optional implementation manner, the frame insertion module 822 is further configured to:
在按照拼接顺序对上述至少两个视频片段进行排列之前,通过以下任意一种或多种方式确定拼接顺序:Before arranging the above at least two video clips according to the splicing sequence, the splicing sequence is determined by any one or more of the following methods:
获取用户设置的拼接顺序;Get the splicing order set by the user;
根据各视频片段的拍摄时间确定拼接顺序;Determine the splicing sequence according to the shooting time of each video clip;
按照各视频片段中前景部分的移动路径确定拼接顺序。The splicing sequence is determined according to the moving path of the foreground part in each video clip.
在一种可选的实施方式中,插帧模块822,还被配置为:In an optional implementation manner, the frame insertion module 822 is further configured to:
在进行插帧之前,检测相邻两个视频片段之间的重复帧,并从其中任一视频片段中删除重复帧。Before performing frame interpolation, duplicate frames between two adjacent video clips are detected, and duplicate frames are removed from either video clip.
在一种可选的实施方式中,上述至少两个视频片段包括单帧视频片段。In an optional implementation manner, the at least two video clips include single-frame video clips.
在一种可选的实施方式中,获取模块821,还被配置为:In an optional implementation manner, the obtaining module 821 is further configured to:
在获取待拼接的至少两个视频片段后,对其中的至少一个视频片段进行以下任意一项或多项处理:After acquiring at least two video clips to be spliced, perform any one or more of the following processing on at least one of the video clips:
将一个视频片段拆分为多个视频片段;Split a video clip into multiple video clips;
将两个或两个以上视频片段合成为一个视频片段;Combine two or more video clips into one video clip;
从视频片段中删除一帧或多帧图像;remove one or more frames from a video clip;
对视频片段进行分辨率调整。Make resolution adjustments to video clips.
本公开的示例性实施方式还提供另一种视频处理装置,如图9所示,该视频处理装置900可以包括:Exemplary embodiments of the present disclosure further provide another video processing apparatus. As shown in FIG. 9 , the video processing apparatus 900 may include:
获取模块910,被配置为获取待拼接的至少两个视频片段,其中的至少一个视频片段包括两帧或两帧以上图像;The acquiring module 910 is configured to acquire at least two video clips to be spliced, wherein at least one video clip includes two or more frames of images;
插帧模块920,被配置为在上述至少两个视频片段中的至少一个待拼接位置进行插帧,生成插帧图像;The frame insertion module 920 is configured to perform frame insertion at at least one to-be-spliced position in the above at least two video clips to generate an inserted frame image;
拼接模块930,被配置为将视频片段和插帧图像进行拼接,得到目标视频。The splicing module 930 is configured to splicing the video segment and the interpolated frame image to obtain the target video.
在一种可选的实施方式中，待拼接位置包括起始位置、结尾位置以及任意相邻两个视频片段之间的位置，起始位置位于上述至少两个视频片段中的第一个视频片段之前，结尾位置位于上述至少两个视频片段中的最后一个视频片段之后。In an optional implementation manner, the position to be spliced includes a start position, an end position, and a position between any two adjacent video clips; the start position is located before the first of the at least two video clips, and the end position is located after the last of the at least two video clips.
在一种可选的实施方式中,插帧模块920,被配置为:In an optional implementation manner, the frame insertion module 920 is configured to:
根据待拼接位置的前一视频片段中的至少一帧图像,与待拼接位置的后一视频片段中的至少一帧图像,进行插帧,生成插帧图像。Frame interpolation is performed according to at least one frame of image in the previous video clip of the position to be spliced and at least one frame of image in the next video clip of the position to be spliced to generate an inserted frame image.
进一步的,上述前一视频片段中的至少一帧图像包括前一视频片段中的结尾帧图像,上述后一视频片段中的至少一帧图像包括后一视频片段中的起始帧图像。Further, at least one frame of image in the previous video clip includes an end frame image in the previous video clip, and at least one frame image in the next video clip includes a start frame image in the next video clip.
在一种可选的实施方式中,插帧模块920,被配置为:In an optional implementation manner, the frame insertion module 920 is configured to:
根据上述第一个视频片段中的至少一帧图像,与上述最后一个视频片段中的至少一帧图像,在上述起始位置和/或上述结尾位置进行插帧,生成插帧图像。According to at least one frame of image in the first video clip and at least one frame of image in the last video clip, frame interpolation is performed at the start position and/or the end position to generate the frame interpolation image.
在一种可选的实施方式中，上述第一个视频片段中的至少一帧图像包括第一个视频片段中的起始帧图像，上述最后一个视频片段中的至少一帧图像包括最后一个视频片段中的结尾帧图像。In an optional implementation manner, at least one frame of image in the first video clip includes the start frame image of the first video clip, and at least one frame of image in the last video clip includes the end frame image of the last video clip.
在一种可选的实施方式中,拼接模块930,还被配置为:In an optional implementation manner, the splicing module 930 is further configured to:
在得到目标视频后,将目标视频设置为循环播放。After getting the target video, set the target video to play in a loop.
在一种可选的实施方式中,插帧模块920,还被配置为:In an optional implementation manner, the frame insertion module 920 is further configured to:
在待拼接位置的前一视频片段中选取至少两帧图像，通过对其进行运动估计，得到第一运动矢量；Select at least two frames of images in the previous video clip at the position to be spliced, and obtain a first motion vector by performing motion estimation on them;
在待拼接位置的后一视频片段中选取至少两帧图像,通过对其进行运动估计,得到第二运动矢量;Select at least two frames of images in the next video segment of the position to be spliced, and obtain a second motion vector by performing motion estimation on them;
根据前一视频片段与后一视频片段之间的边界运动状态,得到前一视频片段与后一视频片段之间的第三运动矢量;According to the boundary motion state between the previous video clip and the next video clip, obtain the third motion vector between the previous video clip and the next video clip;
根据第一运动矢量、第二运动矢量、第三运动矢量,以及在前一视频片段与后一视频片段中所选取的图像的时间戳,绘制时间-运动曲线;Draw a time-motion curve according to the first motion vector, the second motion vector, the third motion vector, and the time stamps of the images selected in the previous video segment and the next video segment;
在时间-运动曲线中前一视频片段与后一视频片段之间的部分,根据实际运动状态,确定插帧的数量与时间相位。In the part between the previous video segment and the next video segment in the time-motion curve, the number and time phase of the interpolated frames are determined according to the actual motion state.
在一种可选的实施方式中,插帧模块920,还被配置为:In an optional implementation manner, the frame insertion module 920 is further configured to:
在进行插帧之前,按照拼接顺序对上述至少两个视频片段进行排列。Before performing frame insertion, the at least two video clips are arranged in a splicing sequence.
在一种可选的实施方式中,插帧模块920,还被配置为:In an optional implementation manner, the frame insertion module 920 is further configured to:
在按照拼接顺序对上述至少两个视频片段进行排列之前,通过以下任意一种或多种方式确定拼接顺序:Before arranging the above at least two video clips according to the splicing sequence, the splicing sequence is determined by any one or more of the following methods:
获取用户设置的拼接顺序;Get the splicing order set by the user;
根据各视频片段的拍摄时间确定拼接顺序;Determine the splicing sequence according to the shooting time of each video clip;
按照各视频片段中前景部分的移动路径确定拼接顺序。The splicing sequence is determined according to the moving path of the foreground part in each video clip.
在一种可选的实施方式中,插帧模块920,还被配置为:In an optional implementation manner, the frame insertion module 920 is further configured to:
在进行插帧之前,检测相邻两个视频片段之间的重复帧,并从其中任一视频片段中删除重复帧。Before performing frame interpolation, duplicate frames between two adjacent video clips are detected, and duplicate frames are removed from either video clip.
在一种可选的实施方式中,上述至少两个视频片段包括单帧视频片段。In an optional implementation manner, the at least two video clips include single-frame video clips.
在一种可选的实施方式中,获取模块910,还被配置为:In an optional implementation manner, the obtaining module 910 is further configured to:
在获取待拼接的至少两个视频片段后,对其中的至少一个视频片段进行以下任意一项或多项处理:After acquiring at least two video clips to be spliced, perform any one or more of the following processing on at least one of the video clips:
将一个视频片段拆分为多个视频片段;Split a video clip into multiple video clips;
将两个或两个以上视频片段合成为一个视频片段;Combine two or more video clips into one video clip;
从视频片段中删除一帧或多帧图像;remove one or more frames from a video clip;
对视频片段进行分辨率调整。Make resolution adjustments to video clips.
上述装置中各部分的具体细节在方法部分实施方式中已经详细说明,因而不再赘述。The specific details of each part in the above-mentioned apparatus have been described in detail in the method part of the implementation, and thus will not be repeated.
本公开的示例性实施方式还提供了一种计算机可读存储介质，其上存储有能够实现本说明书上述方法的程序产品。在一些可能的实施方式中，本公开的各个方面还可以实现为一种程序产品的形式，其包括程序代码，当程序产品在终端设备上运行时，程序代码用于使终端设备执行本说明书上述"示例性方法"部分中描述的根据本公开各种示例性实施方式的步骤，例如可以执行图2或图4中任意一个或多个步骤。该程序产品可以采用便携式紧凑盘只读存储器（CD-ROM）并包括程序代码，并可以在终端设备，例如个人电脑上运行。然而，本公开的程序产品不限于此，在本文件中，可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。Exemplary embodiments of the present disclosure also provide a computer-readable storage medium on which is stored a program product capable of implementing the above-described method of this specification. In some possible implementations, aspects of the present disclosure can also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification, for example, any one or more of the steps in FIG. 2 or FIG. 4. The program product may take the form of a portable compact disc read-only memory (CD-ROM) including program code, and may run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited thereto; in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
程序产品可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以为但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了可读程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。可读信号介质还可以是可读存储介质以外的任何可读介质,该可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。A computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于无线、有线、光缆、RF等等,或者上述的任意合适的组合。Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
可以以一种或多种程序设计语言的任意组合来编写用于执行本公开操作的程序代码,程序设计语言包括面向对象的程序设计语言—诸如Java、C++等,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算设备上执行、部分地在用户设备上执行、作为一个独立的软件包执行、部分在用户计算设备上部分在远程计算设备上执行、或者完全在远程计算设备或服务器上执行。在涉及远程计算设备的情形中,远程计算设备可以通过任意种类的网络,包括局域网(LAN)或广域网(WAN),连接到用户计算设备,或者,可以连接到外部计算设备(例如利用因特网服务提供商来通过因特网连接)。Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages—such as Java, C++, etc., as well as conventional procedural programming Language - such as the "C" language or similar programming language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server execute on. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (eg, using an Internet service provider business via an Internet connection).
As will be appreciated by those skilled in the art, various aspects of the present disclosure may be implemented as a system, a method, or a program product. Accordingly, various aspects of the present disclosure may be embodied in the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common general knowledge or customary techniques in the technical field not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the claims.
It is to be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (20)

  1. A video processing method, comprising:
    acquiring at least two video clips to be spliced, wherein at least one of the video clips comprises two or more frames of images;
    performing frame interpolation at at least one to-be-spliced position in the at least two video clips to generate an interpolated frame image;
    splicing the video clips and the interpolated frame image to obtain a target video.
  2. The method according to claim 1, wherein the to-be-spliced position comprises a start position, an end position, and a position between any two adjacent video clips, the start position being located before the first video clip of the at least two video clips, and the end position being located after the last video clip of the at least two video clips.
  3. The method according to claim 2, wherein performing frame interpolation at at least one to-be-spliced position in the at least two video clips to generate an interpolated frame image comprises:
    performing frame interpolation according to at least one frame of image in the video clip preceding the to-be-spliced position and at least one frame of image in the video clip following the to-be-spliced position, to generate an interpolated frame image.
  4. The method according to claim 3, wherein the at least one frame of image in the preceding video clip comprises the end frame image of the preceding video clip, and the at least one frame of image in the following video clip comprises the start frame image of the following video clip.
  5. The method according to claim 2, wherein performing frame interpolation at at least one to-be-spliced position in the at least two video clips to generate an interpolated frame image comprises:
    performing frame interpolation at the start position and/or the end position according to at least one frame of image in the first video clip and at least one frame of image in the last video clip, to generate an interpolated frame image.
  6. The method according to claim 5, wherein the at least one frame of image in the first video clip comprises the start frame image of the first video clip, and the at least one frame of image in the last video clip comprises the end frame image of the last video clip.
  7. The method according to claim 5, wherein, after the target video is obtained, the method further comprises:
    setting the target video to play in a loop.
  8. The method according to claim 2, further comprising:
    selecting at least two frames of images from the video clip preceding the to-be-spliced position and performing motion estimation on them to obtain a first motion vector;
    selecting at least two frames of images from the video clip following the to-be-spliced position and performing motion estimation on them to obtain a second motion vector;
    obtaining a third motion vector between the preceding video clip and the following video clip according to the boundary motion state between the preceding video clip and the following video clip;
    plotting a time-motion curve according to the first motion vector, the second motion vector, the third motion vector, and the timestamps of the images selected from the preceding video clip and the following video clip;
    determining the number and time phases of the interpolated frames in the portion of the time-motion curve between the preceding video clip and the following video clip, according to the actual motion state.
  9. The method according to claim 1, wherein, before frame interpolation is performed at at least one to-be-spliced position in the at least two video clips, the method further comprises:
    arranging the at least two video clips in a splicing order.
  10. The method according to claim 9, wherein, before the at least two video clips are arranged in the splicing order, the method further comprises:
    determining the splicing order in any one or more of the following ways:
    obtaining a splicing order set by a user;
    determining the splicing order according to the shooting time of each video clip;
    determining the splicing order according to the moving path of the foreground part in each video clip.
  11. The method according to claim 1, wherein, before frame interpolation is performed at at least one to-be-spliced position in the at least two video clips, the method further comprises:
    detecting duplicate frames between two adjacent video clips, and deleting the duplicate frames from either of the two video clips.
  12. The method according to any one of claims 1 to 11, wherein the at least two video clips comprise a single-frame video clip.
  13. The method according to any one of claims 1 to 11, wherein, after the at least two video clips to be spliced are acquired, the method further comprises:
    performing any one or more of the following processes on at least one of the at least two video clips:
    splitting one video clip into multiple video clips;
    combining two or more video clips into one video clip;
    deleting one or more frames of images from the video clip;
    adjusting the resolution of the video clip.
  14. A video processing apparatus, comprising a processor and a memory, the processor being configured to execute the following program modules stored in the memory:
    an acquisition module, configured to acquire at least two video clips to be spliced, wherein at least one of the video clips comprises two or more frames of images;
    a frame interpolation module, configured to perform frame interpolation at at least one to-be-spliced position in the at least two video clips to generate an interpolated frame image;
    a splicing module, configured to splice the video clips and the interpolated frame image to obtain a target video.
  15. The apparatus according to claim 14, wherein the to-be-spliced position comprises a start position, an end position, and a position between any two adjacent video clips, the start position being located before the first video clip of the at least two video clips, and the end position being located after the last video clip of the at least two video clips.
  16. The apparatus according to claim 15, wherein the frame interpolation module is configured to:
    perform frame interpolation according to at least one frame of image in the video clip preceding the to-be-spliced position and at least one frame of image in the video clip following the to-be-spliced position, to generate an interpolated frame image.
  17. The apparatus according to claim 16, wherein the at least one frame of image in the preceding video clip comprises the end frame image of the preceding video clip, and the at least one frame of image in the following video clip comprises the start frame image of the following video clip.
  18. The apparatus according to claim 15, wherein the frame interpolation module is configured to:
    perform frame interpolation at the start position and/or the end position according to at least one frame of image in the first video clip and at least one frame of image in the last video clip, to generate the interpolated frame image.
  19. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the method according to any one of claims 1 to 13 is implemented.
  20. An electronic device, comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to perform the method according to any one of claims 1 to 13 by executing the executable instructions.
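For illustration only, the splicing-with-interpolation scheme of claims 1 to 4 can be sketched as follows. This is a minimal sketch under simplifying assumptions that are not part of the claims: frames are modeled as flat lists of pixel intensities, the interpolated frame is a plain linear blend of the end frame of the preceding clip and the start frame of the following clip (a stand-in for the motion-compensated interpolation described in claim 8), and the per-boundary frame count is fixed rather than derived from a time-motion curve. All function names are hypothetical.

```python
def interpolate_frames(prev_frame, next_frame, n):
    """Generate n intermediate frames between the end frame of the preceding
    clip and the start frame of the following clip by linear pixel blending
    (a simplified stand-in for motion-compensated interpolation)."""
    frames = []
    for k in range(1, n + 1):
        t = k / (n + 1)  # time phase of the k-th inserted frame
        frames.append([(1 - t) * a + t * b
                       for a, b in zip(prev_frame, next_frame)])
    return frames

def splice_clips(clips, n_interp=2):
    """Splice clips in order, generating interpolated frames at each
    to-be-spliced position between adjacent clips (claims 1-4)."""
    target = []
    for i, clip in enumerate(clips):
        target.extend(clip)
        if i < len(clips) - 1:  # boundary between clip i and clip i+1
            target.extend(
                interpolate_frames(clip[-1], clips[i + 1][0], n_interp))
    return target

# Frames here are flat lists of pixel intensities (an illustrative
# simplification); each clip has two frames.
clip_a = [[0.0, 0.0], [30.0, 30.0]]
clip_b = [[90.0, 90.0], [120.0, 120.0]]
video = splice_clips([clip_a, clip_b], n_interp=2)
print(len(video))    # 6 — two frames per clip plus two inserted frames
print(video[2][0])   # ≈ 50.0 — blend of 30 and 90 at time phase 1/3
```

In this toy run, the transition from a frame of intensity 30 to one of intensity 90 is smoothed by two inserted frames of roughly 50 and 70, which is the effect the claimed method aims for at clip boundaries.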
PCT/CN2021/106309 2020-08-17 2021-07-14 Video processing method, video processing apparatus, storage medium, and electronic device WO2022037331A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010827407.7 2020-08-17
CN202010827407.7A CN111970562A (en) 2020-08-17 2020-08-17 Video processing method, video processing device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2022037331A1 true WO2022037331A1 (en) 2022-02-24

Family

ID=73388138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106309 WO2022037331A1 (en) 2020-08-17 2021-07-14 Video processing method, video processing apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN111970562A (en)
WO (1) WO2022037331A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111970562A (en) * 2020-08-17 2020-11-20 Oppo广东移动通信有限公司 Video processing method, video processing device, storage medium and electronic equipment
CN114724055A (en) * 2021-01-05 2022-07-08 华为技术有限公司 Video switching method, device, storage medium and equipment
CN112887791A (en) * 2021-01-22 2021-06-01 深圳市优乐学科技有限公司 Method for controlling video fluency
CN113242465B (en) * 2021-04-27 2022-08-16 Oppo广东移动通信有限公司 Video processing method and device, electronic equipment and readable storage medium
CN113411668B (en) * 2021-06-16 2023-03-21 亿咖通(湖北)技术有限公司 Video playing system and method
CN113473224B (en) * 2021-06-29 2023-05-23 北京达佳互联信息技术有限公司 Video processing method, video processing device, electronic equipment and computer readable storage medium
CN114125324B (en) * 2021-11-08 2024-02-06 北京百度网讯科技有限公司 Video stitching method and device, electronic equipment and storage medium
CN114679605B (en) * 2022-03-25 2023-07-18 腾讯科技(深圳)有限公司 Video transition method, device, computer equipment and storage medium
CN116634226A (en) * 2023-06-15 2023-08-22 北京柏睿数据技术股份有限公司 Method and system for intelligent real-time processing of video stream data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6057859A (en) * 1997-03-31 2000-05-02 Katrix, Inc. Limb coordination system for interactive computer animation of articulated characters with blended motion data
CN102157009A (en) * 2011-05-24 2011-08-17 中国科学院自动化研究所 Method for compiling three-dimensional human skeleton motion based on motion capture data
CN111294644A (en) * 2018-12-07 2020-06-16 腾讯科技(深圳)有限公司 Video splicing method and device, electronic equipment and computer storage medium
CN111372087A (en) * 2020-05-26 2020-07-03 深圳看到科技有限公司 Panoramic video frame insertion method and device and corresponding storage medium
CN111970562A (en) * 2020-08-17 2020-11-20 Oppo广东移动通信有限公司 Video processing method, video processing device, storage medium and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6993081B1 (en) * 1999-11-23 2006-01-31 International Business Machines Corporation Seamless splicing/spot-insertion for MPEG-2 digital video/audio stream
CN110248115B (en) * 2019-06-21 2020-11-24 上海摩象网络科技有限公司 Image processing method, device and storage medium
CN110611840B (en) * 2019-09-03 2021-11-09 北京奇艺世纪科技有限公司 Video generation method and device, electronic equipment and storage medium
CN111526370B (en) * 2020-04-17 2023-06-02 Oppo广东移动通信有限公司 Video encoding and decoding methods and devices and electronic equipment


Also Published As

Publication number Publication date
CN111970562A (en) 2020-11-20

Similar Documents

Publication Publication Date Title
WO2022037331A1 (en) Video processing method, video processing apparatus, storage medium, and electronic device
CN111580765B (en) Screen projection method, screen projection device, storage medium, screen projection equipment and screen projection equipment
WO2019205872A1 (en) Video stream processing method and apparatus, computer device and storage medium
US9836267B2 (en) Image/audio playback device of mobile communication terminal
US20180192063A1 (en) Method and System for Virtual Reality (VR) Video Transcode By Extracting Residual From Different Resolutions
WO2022105597A1 (en) Method and apparatus for playing back video at speed multiples , electronic device, and storage medium
KR20150014722A (en) Device, system and method for providing screen shot
US11928152B2 (en) Search result display method, readable medium, and terminal device
CN111436005B (en) Method and apparatus for displaying image
CN111784614A (en) Image denoising method and device, storage medium and electronic equipment
CN102045578A (en) Image processing apparatus and image processing method
WO2021196994A1 (en) Encoding method and apparatus, terminal, and storage medium
US9509940B2 (en) Image output device, image output method, and recording medium
US11893770B2 (en) Method for converting a picture into a video, device, and storage medium
US20200213631A1 (en) Transmission system for multi-channel image, control method therefor, and multi-channel image playback method and apparatus
US20090244305A1 (en) Video Recording System and Imaging Apparatus
CN109862385B (en) Live broadcast method and device, computer readable storage medium and terminal equipment
CN111770332B (en) Frame insertion processing method, frame insertion processing device, storage medium and electronic equipment
CN113141480A (en) Screen recording method, device, equipment and storage medium
KR101299245B1 (en) Video reproduction device for decoding Region of Interest and method of thereof
CN117579843B (en) Video coding processing method and electronic equipment
CN116264640A (en) Viewing angle switching method, device and system for free viewing angle video
CN115988240A (en) Video generation method and device, readable medium and electronic equipment
CN113038270A (en) Video converter with protection function
CN116965042A (en) Video interpolation processing method, video interpolation processing device and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21857426

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21857426

Country of ref document: EP

Kind code of ref document: A1