WO2022222835A1

WO2022222835A1 - Video processing method, video processing apparatus and electronic device

Info

Publication number: WO2022222835A1
Application number: PCT/CN2022/086751
Authority: WO
Inventors: 韩桂敏
Original assignee: 维沃移动通信（杭州）有限公司
Priority date: 2021-04-21
Filing date: 2022-04-14
Publication date: 2022-10-27
Also published as: CN113207038B; CN113207038A

Abstract

The present application belongs to the technical field of electronic devices. Disclosed are a video processing method, a video processing apparatus and an electronic device. The video processing method comprises: acquiring N video frames; receiving a first input for a first video frame in the N video frames; in response to the first input, acquiring a first image in the first video frame; acquiring M second images corresponding to M second video frames in the N video frames, wherein the M second images are images that are obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image; respectively merging the first image with the M second images to obtain M third video frames; and according to the M third video frames, obtaining a first video.

Description

Video processing method, video processing device and electronic device

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 202110432361.3 filed in China on Apr. 21, 2021, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present application belongs to the technical field of electronic equipment, and specifically relates to a video processing method, a video processing apparatus and electronic equipment.

BACKGROUND

In the prior art, life can be recorded by shooting a video. Currently, a video cannot be processed while it is being shot. After the video is shot, it is generally only possible to perform editing such as dividing it into segments, clipping or merging. Such editing generally does not change the image content of the original video frames, so a user who wants a video whose frame content differs from the original video frames cannot obtain one.

It can be seen that there is a problem of poor video processing flexibility in the prior art.

SUMMARY OF THE INVENTION

The purpose of the embodiments of the present application is to provide a video processing method, a video processing apparatus, and an electronic device, so as to solve the problem of poor video processing flexibility in the prior art.

In a first aspect, an embodiment of the present application provides a video processing method, which includes:

acquiring N video frames;

receiving a first input for a first video frame of the N video frames;

in response to the first input, acquiring a first image in the first video frame;

acquiring M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

merging the first image with the M second images respectively to obtain M third video frames; and

obtaining a first video according to the M third video frames.

In a second aspect, an embodiment of the present application provides a video processing apparatus, including:

The first acquisition module is used to acquire N video frames;

a receiving module, configured to receive a first input for a first video frame in the N video frames;

a second acquiring module, configured to acquire the first image in the first video frame in response to the first input;

a third acquiring module, configured to acquire M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

a first processing module, configured to obtain M third video frames by merging the first image with the M second images respectively;

The second processing module is configured to obtain the first video according to the M third video frames.

In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a processor, a memory, and a program or instruction stored on the memory and executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.

In a fourth aspect, an embodiment of the present application provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method according to the first aspect are implemented.

In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the method according to the first aspect.

In the embodiments of the present application, during or after video shooting, when a first input is made for a specific video frame, the video processing apparatus can acquire a specific image in that video frame, remove from the other video frames the image that corresponds to the same object as the specific image, and then merge the specific image with those other video frames to obtain new video frames. This processing yields new video frames whose image content differs from the original video frames, so the user can obtain a video that differs from the originally captured frames. This improves the flexibility of video processing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic flowchart of a video processing method provided by an embodiment of the present application;

FIGS. 2 to 5 are schematic diagrams of a freeze-frame shooting process provided by an embodiment of the present application;

FIGS. 6 to 7 are schematic diagrams of performing background image restoration through image fusion technology provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of a video frame provided by an embodiment of the present application;

FIGS. 9 to 10 are schematic diagrams of selecting special effect materials provided by embodiments of the present application;

FIGS. 11 to 12 are schematic diagrams of adding graffiti material to a video frame provided by an embodiment of the present application;

FIG. 13 is a schematic structural diagram of a video processing apparatus provided by an embodiment of the present application;

FIG. 14 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;

FIG. 15 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.

The terms "first", "second" and the like in the description and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are usually of one type, and the number of objects is not limited; for example, there may be one first object or more than one. In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects are in an "or" relationship.

The video processing method, video processing apparatus, and electronic device provided by the embodiments of the present application will be described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.

FIG. 1 shows a schematic flowchart of a video processing method provided by an embodiment of the present application.

As shown in FIG. 1, the video processing method includes the following steps:

Step 101: Acquire N video frames;

Step 102: receiving a first input for a first video frame in the N video frames;

Step 103: Acquire a first image in the first video frame in response to the first input;

Step 104: Acquire M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

Step 105: Merging the first image and the M second images respectively to obtain M third video frames;

Step 106: Obtain a first video according to the M third video frames.
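Steps 101 to 106 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the applicant's implementation: frames are modeled as small nested lists of grayscale values, the object masks are assumed to be given (for example by a segmentation step), the M second video frames are taken to be the frames after the freeze frame, and the names `cut_out`, `remove_object`, `merge` and `freeze_video` are hypothetical.

```python
from typing import List, Optional

Pixel = Optional[int]      # None marks a pixel that has been removed
Frame = List[List[Pixel]]  # rows of grayscale values; a toy stand-in for real frames
Mask = List[List[bool]]    # True where the target object is (assumed to be given)


def cut_out(frame: Frame, mask: Mask) -> Frame:
    """Step 103: keep only the object's pixels (the 'first image')."""
    return [[p if m else None for p, m in zip(row, mrow)]
            for row, mrow in zip(frame, mask)]


def remove_object(frame: Frame, mask: Mask) -> Frame:
    """Step 104: drop the object's pixels (the 'third image'), leaving the 'second image'."""
    return [[None if m else p for p, m in zip(row, mrow)]
            for row, mrow in zip(frame, mask)]


def merge(first_image: Frame, second_image: Frame) -> Frame:
    """Step 105: overlay the frozen object on a second image."""
    return [[f if f is not None else s for f, s in zip(frow, srow)]
            for frow, srow in zip(first_image, second_image)]


def freeze_video(frames: List[Frame], masks: List[Mask], freeze_index: int) -> List[Frame]:
    """Steps 101 to 106, treating the frames after the freeze as the M second video frames."""
    first_image = cut_out(frames[freeze_index], masks[freeze_index])
    third_frames = []
    for frame, mask in zip(frames[freeze_index + 1:], masks[freeze_index + 1:]):
        third_frames.append(merge(first_image, remove_object(frame, mask)))
    return third_frames  # step 106: here the first video is simply these frames
```

In this sketch, pixels where the object was removed and that the frozen cutout does not cover stay `None`; a background repair step would be needed to fill them.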

The video processing methods provided in the embodiments of the present application may be executed by a video processing apparatus.

In step 101, the video processing apparatus may acquire N video frames during the video shooting process, or may acquire N video frames after the video shooting, which is not limited in this application. The N video frames may be all or part of video frames of a certain video, or may be all or part of video frames of multiple videos.

In step 102, the video processing apparatus receives a first input for the first video frame among the N video frames. The first input can be understood as an input that selects the first video frame. For example, if a click operation by the user is received at the moment the first video frame is played during playback of the N video frames, that click operation may be regarded as the first input for the first video frame. The first video frame can be understood as the video frame that the user wants to freeze, and the moment at which the user makes the first input can be understood as the freeze moment.

In step 103, the video processing apparatus may acquire the first image in the first video frame in response to the first input. The first image may be understood as an image of the target object that the user wants to freeze in the first video frame, and the first image may be referred to as a freeze image. The target object that the user wants to freeze can be either a moving object, such as a person, an animal or a vehicle, or a static object, such as a building or an article. The target object that the user wants to freeze can be either a foreground image or a background image.

To determine the first image in the first video frame, the first input may also be an input for the first image in the first video frame.

The video processing apparatus may extract the first image of the target object from the first video frame based on the semantic segmentation technology. Specifically, when the target object is a human being, the video processing apparatus may extract the human portrait from the first video frame based on the human portrait segmentation technology. The term "portrait segmentation" refers to the separation of the portrait and the background in the picture, into different areas, and using different labels to distinguish them. It is suitable for application scenarios based on image content understanding, such as background replacement, rendering, blurring, etc.
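The label-based separation described above can be illustrated with a short sketch. It assumes that some segmentation model (not specified in this application) has already produced a per-pixel label map; the label values `BACKGROUND` and `PERSON` and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

BACKGROUND = 0  # hypothetical label id for background pixels
PERSON = 1      # hypothetical label id for portrait pixels

Frame = List[List[Optional[int]]]


def mask_for_label(label_map: List[List[int]], label: int) -> List[List[bool]]:
    """Turn a per-pixel label map into a boolean mask for one label."""
    return [[cell == label for cell in row] for row in label_map]


def split_by_mask(frame: Frame, mask: List[List[bool]]) -> Tuple[Frame, Frame]:
    """Separate a frame into (object pixels, background pixels); None marks absent pixels."""
    obj = [[p if m else None for p, m in zip(row, mrow)]
           for row, mrow in zip(frame, mask)]
    background = [[None if m else p for p, m in zip(row, mrow)]
                  for row, mrow in zip(frame, mask)]
    return obj, background
```

For example, with `label_map = [[0, 1], [1, 0]]` and `frame = [[10, 20], [30, 40]]`, the portrait part is the two pixels labeled `PERSON` and the rest is background.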

In step 104, the video processing apparatus may acquire M second images corresponding to the M second video frames in the N video frames, where the M second images are images obtained by removing the third image from the M second video frames, and the third image corresponds to the same object as the first image.

The M second video frames can be understood as the video frames other than the first video frame among the N video frames, as the video frames located after the first video frame among the N video frames, or as the video frames located before the first video frame among the N video frames, which is not elaborated further in this embodiment of the present application.

Assuming that the first image is the image of person A in the first video frame, the third image is the image of person A in the M second video frames, and the M second images are the images obtained after the image of person A is removed from the M second video frames.

The video processing apparatus may separate the third image from the second image in each of the M second video frames based on the semantic segmentation technology, so as to obtain the M second images.

In step 105, the video processing apparatus may merge the first image with each of the M second images to obtain M third video frames. In this way, the freeze image is fused into the other video frames, yielding new video frames whose image content differs from the original video frames.

The position of the first image in the M third video frames may or may not correspond to the position of the third image in the M second video frames.
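Because the position of the first image need not match the position of the third image, the merge can be thought of as a paste with an optional offset. The following is a minimal sketch under that reading; the function name `paste` and the offset parameters `dx` and `dy` are illustrative assumptions, and pixels shifted outside the frame are simply clipped.

```python
from typing import List, Optional

Frame = List[List[Optional[int]]]


def paste(cutout: Frame, background: Frame, dx: int = 0, dy: int = 0) -> Frame:
    """Overlay the non-empty pixels of `cutout` on a copy of `background`, shifted by (dx, dy)."""
    out = [row[:] for row in background]  # copy so the input frame is left untouched
    height, width = len(out), len(out[0])
    for y, row in enumerate(cutout):
        for x, pixel in enumerate(row):
            ty, tx = y + dy, x + dx
            # Skip empty cutout pixels and anything shifted outside the frame.
            if pixel is not None and 0 <= ty < height and 0 <= tx < width:
                out[ty][tx] = pixel
    return out
```

With `dx = dy = 0` the frozen object appears where it was captured; a non-zero offset places it elsewhere in the third video frame.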

In step 106, the video processing apparatus may obtain the first video according to the M third video frames. For example, the M third video frames may be combined to form the first video, or the M third video frames may be combined with other video frames to form the first video.

In this embodiment, in addition to obtaining the first video, the video processing apparatus can also combine the first image and the second image to obtain the target photo.

In the embodiments of the present application, during or after video shooting, when a first input is made for a specific video frame, the video processing apparatus can acquire a specific image in that video frame, remove from the other video frames the image that corresponds to the same object as the specific image, and then merge the specific image with those other video frames to obtain new video frames. This processing yields new video frames whose image content differs from the original video frames, which enables the user to obtain a video that differs from the originally captured frames. This improves the flexibility of video processing.

Optionally, the first video frame includes a first sub-video frame and a second sub-video frame;

The acquiring the first image in the first video frame includes:

acquiring a first sub-image in the first sub-video frame and a second sub-image in the second sub-video frame, wherein the third image, the first sub-image and the second sub-image correspond to the same object;

The merging the first image with the M second images respectively to obtain M third video frames includes:

merging the first sub-image and the second sub-image with the M second images respectively to obtain M third video frames.

In this embodiment, the user can make the first input for multiple video frames as needed; in other words, the user can make the first input multiple times, and the video processing apparatus freezes the target object once for each first input, thereby obtaining multiple images of the target object as presented on the shooting interface. While the user makes these inputs, the target object may be moving, for example changing its position or posture. Through this process, the video processing apparatus can obtain a plurality of clear first images of the target object that change dynamically in sequence.

The video processing apparatus may also add a freeze mark to the first video frame, so that the user can readily identify the frozen video frames by browsing the freeze marks.

When the user makes the first input multiple times, the video processing apparatus may combine the multiple first images with the M second images to obtain the M third video frames, so that the M third video frames present a plurality of clear first images that change dynamically in sequence. The first video obtained from the M third video frames therefore also presents a plurality of clear first images that change dynamically in sequence.

Since the first video can present multiple clear, dynamically changing images of the target object, the first video has a streamer effect. When the target object is a person, a portrait streamer video can be shot through the above process. The portrait streamer video clearly retains each freeze-frame portrait captured during shooting, which expands the application scenarios of long-exposure photography and makes shooting more fun.

As an example, the video processing apparatus may photograph the target object in a long-exposure shooting mode, thereby acquiring the N video frames. For example, after starting the camera of the video processing apparatus, the user can select the long-exposure shooting mode and then click the "start shooting button" in the shooting preview interface. The video processing apparatus then shoots the target object and acquires the N video frames.

While the video processing apparatus is shooting the target object, the user can freeze the target object by making a first input. For example, the first input may be the user clicking the "photograph button" in the shooting interface, the user pressing a certain physical button of the video processing apparatus, the user issuing a voice command, and so on.
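Combining several frozen images with one background can be sketched as below. The sketch assumes that each frozen cutout has the same size as the background and that later freezes are drawn on top of earlier ones; the drawing order and the name `streamer_frame` are assumptions, not details stated in this application.

```python
from typing import List, Optional

Frame = List[List[Optional[int]]]


def streamer_frame(background: Frame, frozen_cutouts: List[Frame]) -> Frame:
    """Overlay every frozen cutout on a copy of the background, in freeze order."""
    out = [row[:] for row in background]
    for cutout in frozen_cutouts:  # assumption: later freezes cover earlier ones
        for y, row in enumerate(cutout):
            for x, pixel in enumerate(row):
                if pixel is not None:  # None means the cutout has no pixel here
                    out[y][x] = pixel
    return out
```

Applying this to each second image with the list of cutouts frozen so far gives frames in which every frozen pose of the target object remains visible at once.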

When the video processing apparatus receives the first input, the video processing apparatus may obtain the image (ie, the first image) of the target object presented on the shooting interface at the moment when the user inputs the first input. The video processing apparatus may acquire the first image of the target object from the video frame captured when the first input is received.

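When the shooting of the target object meets the user's expectation, the user can make a second input to end the shooting process. For example, the second input may be the user clicking the "end shooting button" in the shooting interface, the user pressing a certain physical button of the video processing apparatus, the user issuing a voice command, and so on.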

When the video processing device receives the second input, the video processing device can obtain the first video, so that the first video can present a plurality of clear and dynamic images of the target object.

The above process is described below with reference to FIGS. 2 to 5.

As shown in FIG. 2, the user clicks the start shooting button 21 in the long-exposure shooting mode, and the camera 20 starts shooting the target object 22.

As shown in FIG. 3, while the camera 20 is shooting the target object 22, the user clicks the camera button 23. At the moment of the click, the camera 20 freezes the target object 22 and captures the image of the target object as presented in the shooting interface at that moment.

As shown in FIG. 4, the user clicks the camera button 23 again. At the moment of the click, the camera 20 freezes the target object 22 again and again captures the image of the target object as presented in the shooting interface at that moment. At this point, the camera 20 has obtained two frozen images of the target object 22.

As shown in FIG. 5, when the user clicks the end shooting button 24, the camera 20 can generate a video that includes the two frozen images of the target object 22.

In this embodiment, while the target object is being photographed, the target object is frozen multiple times through multiple first inputs, and an image of the target object as presented on the shooting interface is acquired for each first input, so that video frames containing multiple images of the target object are generated. In this way, by clearly freezing the moving target object, a dynamic and clear video of the target object can be obtained.

In this embodiment, in addition to obtaining the first video, the video processing apparatus can also combine the first sub-image, the second sub-image and the second image to obtain a target photo. The target photo presents multiple clear images of the target object that change dynamically in sequence, and can be called a streamer photo.

Optionally, after the receiving the first input for the first video frame in the N video frames, the method further includes:

acquiring K fourth video frames before the first video frame in the N video frames;

removing K fourth images from the K fourth video frames, and repairing the removed areas of the K fourth video frames to obtain K fifth video frames, wherein the fourth images correspond to the same object as the first image;

The obtaining of the M second images corresponding to the M second video frames in the N video frames includes:

acquiring M second video frames after the first video frame in the N video frames;

acquiring M second images corresponding to the M second video frames;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the K fifth video frames and the M third video frames.

In this embodiment, the video processing apparatus may process differently the first video frame, the K fourth video frames located before the first video frame, and the M second video frames located after the first video frame among the N video frames, so as to further expand the application scenarios of long-exposure photography and make shooting more fun.
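The three-way split around the freeze frame can be sketched as follows. The per-frame operations are passed in as callables so that the sketch stays independent of how removal, repair and merging are actually done; the names `build_first_video`, `remove_and_repair` and `freeze_merge` are hypothetical, and keeping the freeze frame itself in the output is an assumption based on the statement that its object image and background are retained.

```python
from typing import Callable, List, TypeVar

F = TypeVar("F")  # any frame representation


def build_first_video(
    frames: List[F],
    freeze_index: int,
    remove_and_repair: Callable[[F], F],
    freeze_merge: Callable[[F], F],
) -> List[F]:
    """Process frames before, at and after the freeze frame in three different ways."""
    # K fourth video frames: object removed and background repaired (fifth video frames).
    before = [remove_and_repair(f) for f in frames[:freeze_index]]
    # The first video frame is kept unchanged (assumption: it is part of the output).
    frozen = [frames[freeze_index]]
    # M second video frames: object removed and frozen image merged in (third video frames).
    after = [freeze_merge(f) for f in frames[freeze_index + 1:]]
    return before + frozen + after
```

For instance, with four frames and `freeze_index = 1`, one frame goes through `remove_and_repair`, one is kept unchanged, and two go through `freeze_merge`.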

The video processing device can analyze and process each video frame captured in real time during the shooting process, separate the target object and the background in each video frame, and mark them with different labels.

For the first video frame, the image of the target object (ie, the first image) and the background image may be retained.

For the M second video frames, the image of the target object (ie the third image) can be eliminated, only the background image (ie the second image) is retained, and the first image and the second image are respectively combined to obtain the third video frame.

For the K fourth video frames, the image of the target object (that is, the fourth image) can be eliminated and only the background image retained, and the removed area can then be repaired using image fusion technology to obtain the fifth video frames. The term "image fusion" refers to processing image data of the same target collected through multiple source channels, by means of image processing and computer technology, so as to extract the useful information in each channel to the greatest extent and finally synthesize it into a high-quality image.

As an example, FIGS. 6 to 7 show schematic diagrams of background image restoration through image fusion technology. As shown in FIG. 6, assuming that frame N is a video frame that is not frozen, another frame is found, such as frame N-2, in which the background image 25 corresponding to the area occupied by the target object 22 in frame N is not occluded. As shown in FIG. 7, the background image 25 of the corresponding area is extracted from frame N-2 and used to cover the target object 22 in frame N, and the two are fused to generate a new video frame N' in which the target object has been eliminated and only the background image 25 is retained.

In this embodiment, through the above technical solution, the streamer video clearly retains not only each frozen image of the target object captured during shooting but also the image of the environment in which the target object is located, which further expands the application scenarios of long-exposure photography and makes shooting more fun.
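The idea of borrowing background pixels from a frame in which they are not occluded can be sketched as below. This simplified version copies pixels directly rather than blending them, and it assumes a static camera so that the same coordinates show the same background in both frames; the function name `repair_background` and the mask inputs are hypothetical.

```python
from typing import List, Optional

Frame = List[List[Optional[int]]]
Mask = List[List[bool]]  # True where the target object hides the background


def repair_background(frame: Frame, mask: Mask,
                      donor_frame: Frame, donor_mask: Mask) -> Frame:
    """Fill the area hidden by the object in `frame` with pixels from `donor_frame`."""
    repaired = []
    for row, mrow, drow, dmrow in zip(frame, mask, donor_frame, donor_mask):
        out_row = []
        for pixel, hidden, donor_pixel, donor_hidden in zip(row, mrow, drow, dmrow):
            if not hidden:
                out_row.append(pixel)        # background already visible in this frame
            elif not donor_hidden:
                out_row.append(donor_pixel)  # borrow the unoccluded donor pixel
            else:
                out_row.append(None)         # hidden in both frames; another donor is needed
        repaired.append(out_row)
    return repaired
```

Here `frame` plays the role of frame N and `donor_frame` the role of frame N-2; a production implementation would typically also blend the seam between copied and original pixels.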

Optionally, the method further includes:

acquiring S sixth video frames located between the first sub-video frame and the second sub-video frame in the N video frames;

acquiring S fifth images corresponding to the S sixth video frames, wherein the S fifth images are images obtained by removing a sixth image from the S sixth video frames, and the sixth image corresponds to the same object as the first sub-image;

merging the first sub-image with the S fifth images respectively to obtain S seventh video frames;

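The acquiring M second images corresponding to M second video frames in the N video frames includes:

acquiring M second video frames after the second sub-video frame in the N video frames;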

acquiring M second images corresponding to the M second video frames;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the S seventh video frames and the M third video frames.

During the playing process of the streamer video (ie, the first video) obtained by this embodiment, the entire process of the dynamic change of the target object can be displayed.

Specifically, when the first video is played to the node at the first freeze moment (i.e., the first sub-video frame), the image of the target object as presented on the shooting interface at that freeze moment is displayed. When the first video is played to the node at the second freeze moment (i.e., the second sub-video frame), the images of the target object as presented on the shooting interface at the first freeze moment and at the second freeze moment are both displayed, and so on, so that the first video shows the process of the dynamic change of the target object. When the first video is played to a node at a non-freeze moment, the images of the target object displayed at the node of the most recent freeze moment are displayed. That is to say, once the image of the target object at any freeze moment has been displayed, it remains present in the video segment after that freeze moment.

As shown in FIG. 8, assume that frame m+3 is the video frame at the first freeze moment, which displays the image of the target object 22 as presented on the shooting interface at the first freeze moment, and that frame m+11 is the video frame at the second freeze moment, which displays the images of the target object 22 as presented on the shooting interface at the first freeze moment and at the second freeze moment. Then frames m+4 to m+10 (frame m+7 is taken as an example in FIG. 8) all display the image of the target object 22 as presented on the shooting interface at the first freeze moment.
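The rule that a frozen image stays visible from its freeze moment onward can be expressed as a simple lookup. The sketch below assumes the freeze frame indices are kept in ascending order and takes m = 0 in its example; the function name `visible_freezes` is hypothetical.

```python
from bisect import bisect_right
from typing import List


def visible_freezes(freeze_indices: List[int], frame_index: int) -> List[int]:
    """Return the freeze moments whose frozen images are shown at `frame_index`.

    `freeze_indices` must be sorted in ascending order.
    """
    # Every freeze at or before the current frame remains on screen.
    return freeze_indices[:bisect_right(freeze_indices, frame_index)]
```

With freezes at frames 3 and 11 (that is, m+3 and m+11 with m = 0), frame 7 shows only the first frozen image, frame 11 shows both, and frame 2 shows none.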

Optionally, before the obtaining of the first video, the method further includes:

collecting the motion trajectory of the target object in the N video frames;

generating, according to the motion trajectory, dynamic graffiti for depicting the motion trajectory;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the M third video frames and the dynamic graffiti.

In this embodiment, when the target object is a person, the motion trajectory of the target object may be a hand motion trajectory. In addition, the motion trajectory of the target object may also be the footprint of the target object, which is not limited in this embodiment of the present application.

In this embodiment, a virtual graffiti material can be added to the first video based on the human gesture recognition technology, and the user can select a pre-provided special effect material to describe the hand movement trajectory of the target object when shooting. As shown in FIG. 9 , the user can click the material selection button 26 before clicking the start shooting button 21 to select the special effect for recording the hand movement track. As shown in FIG. 10 , the user can click the material selection button 26 to switch between different virtual materials at any time during the shooting process.

Human gesture recognition technology recognizes gestures by processing real video images using computer vision, and includes gesture segmentation, trajectory tracking, and classification and recognition. The hand motion trajectory can be tracked using an image tracking algorithm, such as the optical flow method, the Continuously Adaptive Mean-Shift (CamShift) algorithm, the Kernelized Correlation Filter (KCF) algorithm, or deep learning algorithms; alternatively, the hand position detected in each video frame may be used directly to track the hand motion trajectory, which is not limited in this embodiment of the present application.

During shooting, the video processing apparatus can analyze the hand movement trajectory of the target object in real time through an algorithm and add the selected graffiti material along that trajectory. For example, if the "love shape" material is selected, heart-shaped graffiti is displayed along the dynamic track traced by the hand movement. FIGS. 11 to 12 respectively show video frames to which graffiti material 27 has been added. The graffiti material can present effects such as light painting and fireworks and depicts the dynamic trajectory of the target object, so that the trajectory is presented during playback of the first video, which makes shooting more fun.
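The simplest option mentioned above, using the hand position detected in each video frame directly, can be sketched as follows. The sketch assumes that a hand detector (not specified here) has already produced one boolean hand mask per frame, and it takes the mask centroid as the hand position; `centroid` and `hand_trajectory` are hypothetical names.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]  # True where the hand was detected (assumed to be given)
Point = Tuple[float, float]


def centroid(mask: Mask) -> Optional[Point]:
    """Return the mean (x, y) of the True pixels, or None if the hand was not detected."""
    points = [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def hand_trajectory(hand_masks: List[Mask]) -> List[Point]:
    """Chain the per-frame hand positions, skipping frames without a detection."""
    positions = (centroid(mask) for mask in hand_masks)
    return [p for p in positions if p is not None]
```

A tracker such as optical flow, CamShift or KCF would replace the per-frame detection with frame-to-frame tracking, but the resulting list of positions would be used in the same way.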
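Drawing graffiti that follows the hand can be sketched as a growing trail of stamps. This is a toy illustration only: the graffiti material is reduced to a single pixel value `stamp`, hand positions are integer pixel coordinates assumed to be given per frame (or `None` when no hand is found), and `add_graffiti` is a hypothetical name.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
Point = Tuple[int, int]  # (x, y) pixel coordinates


def add_graffiti(frames: List[Frame], trajectory: List[Optional[Point]],
                 stamp: int) -> List[Frame]:
    """Stamp every hand position seen so far onto a copy of each frame."""
    trail: List[Point] = []
    decorated = []
    for frame, point in zip(frames, trajectory):
        if point is not None:
            trail.append(point)  # the trail only grows, so the graffiti persists
        out = [row[:] for row in frame]
        for x, y in trail:
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = stamp
        decorated.append(out)
    return decorated
```

Each output frame carries the whole trail up to that moment, which is what makes the graffiti appear to be drawn dynamically during playback.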

FIG. 13 shows a schematic structural diagram of a video processing apparatus provided by an embodiment of the present application.

As shown in FIG. 13 , the video processing apparatus 300 includes:

The first acquisition module 301 is used to acquire N video frames;

A receiving module 302, for receiving the first input for the first video frame in the N video frames;

A second acquiring module 303, configured to acquire the first image in the first video frame in response to the first input;

a third obtaining module 304, configured to obtain M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

a first processing module 305, configured to combine the first image with the M second images respectively to obtain M third video frames;

The second processing module 306 is configured to obtain a first video according to the M third video frames.

Optionally, the video processing apparatus 300 further includes:

a fourth acquisition module, configured to acquire K fourth video frames before the first video frame in the N video frames;

a third processing module, configured to remove K fourth images from the K fourth video frames and repair the removed areas of the K fourth video frames to obtain K fifth video frames, wherein the fourth images correspond to the same object as the first image;

The third obtaining module 304 is specifically used for:

acquiring M second images corresponding to the M second video frames;

The second processing module 306 is specifically used for:

The second obtaining module 303 is specifically used for:

The first processing module 305 is specifically used for:

Optionally, the video processing apparatus 300 further includes:

a fifth acquisition module, configured to acquire S sixth video frames located between the first sub-video frame and the second sub-video frame in the N video frames;

a sixth obtaining module, configured to obtain S fifth images corresponding to the S sixth video frames, wherein the S fifth images are images obtained by removing the sixth image from the S sixth video frames, the sixth image corresponds to the same object as the first sub-image;

a fourth processing module, configured to combine the first sub-image with the S fifth images to obtain S seventh video frames;

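The third obtaining module 304 is specifically configured to:

acquire M second video frames after the second sub-video frame in the N video frames;

acquire M second images corresponding to the M second video frames;

The third processing module is specifically configured to:

obtain the first video according to the S seventh video frames and the M third video frames.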
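When two sub-video frames are selected, the frames between them receive only the first sub-image, and the frames after the second sub-video frame receive both sub-images. The following sketch computes that assignment by frame index. It assumes that every frame after the second sub-video frame is treated as one of the M second video frames and that frames up to and including the first sub-video frame, as well as the second sub-video frame itself, are left out of the plan; the function name and the string labels are hypothetical.

```python
def plan_overlays(n_frames, first_sub_index, second_sub_index):
    """Map each affected frame index to the frozen sub-images pasted onto it."""
    if not 0 <= first_sub_index < second_sub_index < n_frames:
        raise ValueError("expected 0 <= first_sub_index < second_sub_index < n_frames")
    plan = {}
    # S sixth video frames: strictly between the two sub-video frames.
    for i in range(first_sub_index + 1, second_sub_index):
        plan[i] = ["first_sub_image"]
    # M second video frames: after the second sub-video frame.
    for i in range(second_sub_index + 1, n_frames):
        plan[i] = ["first_sub_image", "second_sub_image"]
    return plan
```

For example, `plan_overlays(6, 1, 3)` assigns the first sub-image to frame 2 and both sub-images to frames 4 and 5.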

Optionally, the video processing apparatus 300 further includes:

a collection module, used for collecting the motion trajectory of the target object in the N video frames;

a generating module for generating dynamic graffiti for depicting the motion trajectory according to the motion trajectory;

The third processing module is specifically configured to:

obtain the first video according to the M third video frames and the dynamic graffiti.
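One possible reading of the collection module and the generating module is sketched below. It assumes that the target object is given as a boolean mask per frame, that the trajectory is the sequence of mask centroids, and that the "dynamic graffiti" is a trace that grows by one point per frame; the source fixes none of these details, and the function names are hypothetical.

```python
import numpy as np

def collect_trajectory(masks):
    """Return one (row, col) centroid per frame, or None where the object is absent."""
    points = []
    for mask in masks:
        rows, cols = np.nonzero(mask)
        if rows.size == 0:
            points.append(None)
        else:
            points.append((int(round(rows.mean())), int(round(cols.mean()))))
    return points

def draw_dynamic_graffiti(frames, points, color):
    """Frame t shows every trajectory point collected up to and including frame t."""
    out = []
    for t, frame in enumerate(frames):
        canvas = frame.copy()
        for point in points[: t + 1]:
            if point is not None:
                row, col = point
                canvas[row, col] = color
        out.append(canvas)
    return out
```

This sketch marks single pixels only; a real implementation would probably connect consecutive points with strokes of a chosen width.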

The video processing apparatus in this embodiment of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal. The apparatus may be a mobile electronic device or a non-mobile electronic device. Exemplarily, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), etc. The non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, or a self-service machine, etc. The embodiments of the present application do not specifically limit this.

The video processing apparatus in this embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.

The video processing apparatus provided in the embodiments of the present application can implement the various processes implemented by the method embodiments in FIG. 1 to FIG. 12 , and can achieve the same beneficial effects. To avoid repetition, details are not repeated here.

Optionally, as shown in FIG. 14, an embodiment of the present application further provides an electronic device 400, including a processor 401, a memory 402, and a program or instruction stored in the memory 402 and executable on the processor 401. When the program or instruction is executed by the processor 401, each process of the above video processing method embodiment is implemented, and the same technical effect can be achieved. To avoid repetition, details are not described here.

It should be noted that the electronic devices in the embodiments of the present application include the aforementioned mobile electronic devices and non-mobile electronic devices.

FIG. 15 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.

The electronic device 500 includes, but is not limited to, components such as a radio frequency unit 501, a network module 502, an audio output unit 503, an input unit 504, a sensor 505, a display unit 506, a user input unit 507, an interface unit 508, a memory 509, and a processor 5010.

Those skilled in the art can understand that the electronic device 500 may also include a power source (such as a battery) for supplying power to the various components, and the power source may be logically connected to the processor 5010 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The structure of the electronic device shown in FIG. 15 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, combine some components, or arrange the components differently, which will not be repeated here.

The processor 5010 is used for: acquiring N video frames;

The user input unit 507 is configured to: receive a first input for a first video frame in the N video frames;

The processor 5010 is further configured to: in response to the first input, obtain a first image in the first video frame; obtain M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image; combine the first image with the M second images respectively to obtain M third video frames; and obtain a first video according to the M third video frames.

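Optionally, the processor 5010 is further configured to:

acquire K fourth video frames before the first video frame in the N video frames;

remove K fourth images from the K fourth video frames, and repair the removed areas of the K fourth video frames to obtain K fifth video frames, wherein the fourth image corresponds to the same object as the first image;

acquire M second video frames after the first video frame in the N video frames;

acquire M second images corresponding to the M second video frames;

obtain the first video according to the K fifth video frames and the M third video frames.

Optionally, the first video frame includes a first sub-video frame and a second sub-video frame, and the processor 5010 is further configured to:

acquire a first sub-image in the first sub-video frame and a second sub-image in the second sub-video frame, wherein the third image, the first sub-image and the second sub-image correspond to the same object;

combine the first sub-image and the second sub-image with the M second images respectively to obtain M third video frames.

Optionally, the processor 5010 is further configured to:

acquire S sixth video frames located between the first sub-video frame and the second sub-video frame in the N video frames;

obtain S fifth images corresponding to the S sixth video frames, wherein the S fifth images are images obtained by removing a sixth image from the S sixth video frames, and the sixth image corresponds to the same object as the first sub-image;

combine the first sub-image with the S fifth images respectively to obtain S seventh video frames;

acquire M second video frames after the second sub-video frame in the N video frames;

acquire M second images corresponding to the M second video frames;

obtain the first video according to the S seventh video frames and the M third video frames.

Optionally, the processor 5010 is further configured to:

collect the motion trajectory of the target object in the N video frames;

generate, according to the motion trajectory, dynamic graffiti for depicting the motion trajectory;

obtain the first video according to the M third video frames and the dynamic graffiti.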

It should be understood that, in this embodiment of the present application, the input unit 504 may include a graphics processor (Graphics Processing Unit, GPU) 5041 and a microphone 5042. The graphics processor 5041 processes image data of still pictures or video obtained by an image capture device (such as a camera). The display unit 506 may include a display panel 5061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 507 includes a touch panel 5071 and other input devices 5072. The touch panel 5071 is also called a touch screen and may include two parts, a touch detection device and a touch controller. The other input devices 5072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys and switch keys), trackballs, mice, and joysticks, which are not described herein again. The memory 509 may be used to store software programs as well as various data, including but not limited to application programs and operating systems. The processor 5010 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 5010.

Embodiments of the present application further provide a readable storage medium, where a program or an instruction is stored on the readable storage medium. When the program or instruction is executed by a processor, each process of the above video processing method embodiment is implemented, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.

The processor is the processor in the electronic device described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, and examples of the computer-readable storage medium include non-transitory machine-readable storage media, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

An embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above video processing method embodiments, with the same technical effect. To avoid repetition, details are not repeated here.

It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-a-chip, or the like.

It should be noted that, herein, the terms "comprising", "including" or any other variation thereof are intended to cover non-exclusive inclusion, such that a process, method, article or apparatus comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article or apparatus that includes the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to some examples may be combined in other examples.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the various embodiments of the present application.

The embodiments of the present application have been described above in conjunction with the accompanying drawings, but the present application is not limited to the above-mentioned specific embodiments, which are merely illustrative rather than restrictive. Under the inspiration of this application, and without departing from the purpose of this application or the scope protected by the claims, those of ordinary skill in the art may devise many other forms, all of which fall within the protection of this application.

Claims

A video processing method, comprising:

acquiring N video frames;

receiving a first input for a first video frame of the N video frames;

in response to the first input, acquiring a first image in the first video frame;

acquiring M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

merging the first image with the M second images respectively to obtain M third video frames; and

obtaining a first video according to the M third video frames.
The method of claim 1, wherein after the receiving the first input for the first video frame of the N video frames, the method further comprises:

Obtaining K fourth video frames before the first video frame in the N video frames;

removing K fourth images from the K fourth video frames, and repairing the removed areas of the K fourth video frames to obtain K fifth video frames, wherein the fourth image corresponds to the same object as the first image;

The obtaining of the M second images corresponding to the M second video frames in the N video frames includes:

acquiring M second video frames after the first video frame in the N video frames;

acquiring M second images corresponding to the M second video frames;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the K fifth video frames and the M third video frames.
The method of claim 1, wherein the first video frame comprises a first sub-video frame and a second sub-video frame;

The acquiring the first image in the first video frame includes:

acquiring a first sub-image in the first sub-video frame and a second sub-image in the second sub-video frame, wherein the third image, the first sub-image and the second sub-image correspond to the same object;

The obtaining M third video frames by merging the first image and the M second images respectively, including:

The first sub-image and the second sub-image are respectively combined with the M second images to obtain M third video frames.
The method of claim 3, wherein after the receiving the first input for the first video frame of the N video frames, the method further comprises:

acquiring S sixth video frames located between the first sub-video frame and the second sub-video frame in the N video frames;

acquiring S fifth images corresponding to the S sixth video frames, wherein the S fifth images are images obtained by removing a sixth image from the S sixth video frames, and the sixth image corresponds to the same object as the first sub-image;

Merging the first sub-image with the S fifth images respectively to obtain S seventh video frames;

The obtaining of the M second images corresponding to the M second video frames in the N video frames includes:

acquiring M second video frames after the second sub-video frame in the N video frames;

acquiring M second images corresponding to the M second video frames;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the S seventh video frames and the M third video frames.
The method according to claim 1, wherein, before said obtaining the first video, the method further comprises:

collecting the motion trajectory of the target object in the N video frames;

According to the motion track, generating a dynamic graffiti for depicting the motion track;

The obtaining the first video according to the M third video frames includes:

A first video is obtained according to the M third video frames and the dynamic graffiti.
A video processing device, comprising:

The first acquisition module is used to acquire N video frames;

a receiving module, configured to receive a first input for a first video frame in the N video frames;

a second acquisition module, configured to acquire the first image in the first video frame in response to the first input;

a third acquisition module, configured to acquire M second images corresponding to M second video frames in the N video frames, wherein the M second images are images obtained by removing a third image from the M second video frames, and the third image corresponds to the same object as the first image;

a first processing module, configured to obtain M third video frames by merging the first image with the M second images respectively;

The second processing module is configured to obtain the first video according to the M third video frames.
The apparatus of claim 6, further comprising:

a fourth acquisition module, configured to acquire K fourth video frames before the first video frame in the N video frames;

a third processing module, configured to remove K fourth images from the K fourth video frames, and repair the removed areas of the K fourth video frames to obtain K fifth video frames, wherein the fourth image corresponds to the same object as the first image;

The third acquisition module is specifically used for:

acquiring M second video frames after the first video frame in the N video frames;

acquiring M second images corresponding to the M second video frames;

The second processing module is specifically used for:

A first video is obtained according to the K fifth video frames and the M third video frames.
The apparatus of claim 6, wherein the first video frame comprises a first sub-video frame and a second sub-video frame;

The second acquisition module is specifically used for:

acquiring a first sub-image in the first sub-video frame and a second sub-image in the second sub-video frame, wherein the third image, the first sub-image and the second sub-image correspond to the same object;

The first processing module is specifically used for:

The first sub-image and the second sub-image are respectively combined with the M second images to obtain M third video frames.
The apparatus of claim 8, further comprising:

a fifth acquisition module, configured to acquire S sixth video frames located between the first sub-video frame and the second sub-video frame in the N video frames;

a sixth obtaining module, configured to obtain S fifth images corresponding to the S sixth video frames, wherein the S fifth images are images obtained by removing the sixth image from the S sixth video frames, the sixth image corresponds to the same object as the first sub-image;

a fourth processing module, configured to combine the first sub-image with the S fifth images to obtain S seventh video frames;

The third acquisition module is specifically used for:

acquiring M second video frames after the second sub-video frame in the N video frames;

acquiring M second images corresponding to the M second video frames;

The third processing module is specifically used for:

A first video is obtained according to the S seventh video frames and the M third video frames.
The apparatus of claim 6, further comprising:

a collection module, used for collecting the motion trajectory of the target object in the N video frames;

a generating module for generating dynamic graffiti for depicting the motion trajectory according to the motion trajectory;

The third processing module is specifically used for:

A first video is obtained according to the M third video frames and the dynamic graffiti.
An electronic device, comprising a processor, a memory, and a program or instruction stored on the memory and executable on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the video processing method according to any one of claims 1 to 5.
A readable storage medium on which programs or instructions are stored, and when the programs or instructions are executed by a processor, implement the steps of the video processing method according to any one of claims 1 to 5.
An electronic device configured to perform the steps of the video processing method as claimed in any one of claims 1 to 5.
A computer program product executed by at least one processor to implement the steps of the video processing method according to any one of claims 1 to 5.
A chip, comprising a processor and a communication interface, wherein the communication interface is coupled with the processor, and the processor is used for running a program or an instruction to implement the steps of the video processing method according to any one of claims 1 to 5.