CN114697723A - Video generation method, device and medium - Google Patents

Video generation method, device and medium

Info

Publication number: CN114697723A
Authority: CN (China)
Prior art keywords: image, control signal, determining, target, video
Legal status: Granted
Application number: CN202011580496.6A
Other languages: Chinese (zh)
Other versions: CN114697723B (en)
Inventors: 王倩 (Wang Qian), 赵煜 (Zhao Yu)
Current Assignee: Beijing Xiaomi Mobile Software Co Ltd
Original Assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date / Filing date: 2020-12-28
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202011580496.6A
Publication of CN114697723A: 2022-07-01
Application granted; publication of CN114697723B: 2024-01-16
Legal status: Active

Classifications

    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/431 Generation of visual interfaces for content selection or interaction; content or additional data rendering
    • H04N 21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/47202 End-user interface for requesting content, additional data or services; requesting content on demand, e.g. video on demand
    • H04N 21/8547 Content authoring involving timestamps for synchronizing content

Abstract

The present disclosure provides a video generation method, apparatus, and medium. The method includes: displaying an interface, the interface including a background image of a video to be processed and at least one clone image of a subject object superimposed on the background image according to clone position information; receiving a position-movement control signal for a first clone image among the at least one clone image, determining a target image frame corresponding to a target timestamp of the position-movement control signal, and replacing the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame; and generating a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames. In the present disclosure, by displaying an interface containing clone images, the user inputs a position-movement control signal for a clone image on that interface and selects a new clone key frame to replace the original clone key frame to which the clone image belongs, so the clone image is repositioned and a personalized freeze-frame clone-effect video is obtained.

Description

Video generation method, device and medium
Technical Field
The present disclosure relates to the field of mobile terminal data processing technologies, and in particular, to a video generation method, apparatus, and medium.
Background
With the development of video processing technology, video editing software can perform a wide range of editing functions and provide users with many forms of editing effects. How to edit effects conveniently on a mobile terminal, however, remains a technical problem to be solved.
Disclosure of Invention
In view of the above, the present disclosure provides a video generation method, apparatus, and medium.
According to a first aspect of the embodiments of the present disclosure, there is provided a video generation method applied to a mobile terminal, the method including:
displaying an interface, the interface including: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
receiving a position-movement control signal for a first clone image among the at least one clone image, determining a target image frame corresponding to a target timestamp of the position-movement control signal, and replacing the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and generating a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
In one embodiment, before the displaying of the interface, the method further includes:
synthesizing a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
In one embodiment, the position-movement control signal is a drag touch signal for the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal includes:
determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
In an embodiment, after the displaying of the interface, the method further includes:
determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp;
and, when the drag control signal reaches each node position, displaying a subject object image of the subject object in the image frame corresponding to the node position reached by the drag control signal.
In one embodiment, the displaying of the subject object image in the image frame corresponding to the node position reached by the drag control signal includes:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
In one embodiment, the position-movement control signal includes: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal includes: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
In an embodiment, the method further includes:
displaying a timeline of the video to be processed;
and displaying, on the timeline, a slider marker corresponding to the position-movement control signal.
According to a second aspect of the embodiments of the present disclosure, there is provided a video generating apparatus applied to a mobile terminal, the apparatus including:
a first display module configured to display an interface, the interface including: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
a receiving module configured to receive a position-movement control signal for a first clone image among the at least one clone image;
an updating module configured to determine a target image frame corresponding to a target timestamp of the position-movement control signal, and to replace the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and a generating module configured to generate a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
In one embodiment, the apparatus further includes:
a synthesis module configured to synthesize a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
In one embodiment, the position-movement control signal is a drag touch signal for the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal includes:
determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
In one embodiment, the updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp.
The apparatus further includes:
a second display module configured to display, when the drag control signal reaches each node position, the subject object image in the image frame corresponding to the node position reached by the drag control signal.
In an embodiment, the second display module is further configured to display the subject object image in the image frame corresponding to the node position reached by the drag control signal by:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
In one embodiment, the position-movement control signal includes: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image;
and the updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
In one embodiment, the apparatus further includes:
a third display module configured to display a timeline of the video to be processed, and to display, on the timeline, a slider marker corresponding to the position-movement control signal.
According to a third aspect of the embodiments of the present disclosure, there is provided a video generating apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute executable instructions in the memory to implement the steps of the video generation method.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon executable instructions that, when executed by a processor, implement the steps of the video generation method.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects: by displaying an interface containing clone images, the user inputs a position-movement control signal for a clone image on that interface and selects a new clone key frame to replace the original clone key frame to which the clone image belongs, so the clone image is repositioned and a personalized freeze-frame clone-effect video is obtained.
The embodiments of the present disclosure allow an ordinary user, in any scene, to shoot and adjust a video of a single person or animal subject so as to obtain a user-defined multi-clone freeze-frame effect for the subject, which makes the feature more engaging to use.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart of a video generation method according to an exemplary embodiment;
FIG. 2 is an interface display diagram of a method of generating a freeze-frame clone-effect video according to an exemplary embodiment;
FIG. 3 is an interface display diagram of a method of generating a freeze-frame clone-effect video according to an exemplary embodiment;
FIG. 4 is an interface display diagram of a method of generating a freeze-frame clone-effect video according to an exemplary embodiment;
FIG. 5 is a block diagram of an apparatus for generating a freeze-frame clone-effect video according to an exemplary embodiment;
FIG. 6 is a block diagram of an apparatus for generating a freeze-frame clone-effect video according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as recited in the appended claims.
The embodiment of the disclosure provides a video generation method. Referring to fig. 1, fig. 1 is a flow chart illustrating a video generation method according to an exemplary embodiment. As shown in fig. 1, the method may include:
step S11, displaying an interface, where the interface may include: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
step S12, receiving a position-movement control signal for a first clone image among the at least one clone image, determining a target image frame corresponding to a target timestamp of the position-movement control signal, and replacing the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and step S13, generating a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
In one embodiment, the subject object is an autonomously movable object such as a person or an animal.
In an embodiment, step S11 may be preceded by a step S10 of determining the clone key frames in the video to be processed and determining the subject object information of each image frame of the video to be processed, where the subject object information may include: the subject object image corresponding to the subject object, the clone position information of the subject object within the image frame, and the timestamp of the image frame in the original video. Because step S10 determines the subject object information of every image frame, when the user later selects, on the interface, a new clone image to replace an existing clone image, the clone image the user wants can be determined quickly.
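For illustration only, the following Kotlin sketch shows one way the per-frame subject object information described above might be represented. Kotlin and the Android graphics types are a plausible mobile-terminal setting assumed here, and every name (SubjectInfo, SubjectTrack, the fields) is hypothetical rather than taken from the patent:

```kotlin
import android.graphics.Bitmap
import android.graphics.RectF

// Hypothetical per-frame record of the subject object information from step S10.
data class SubjectInfo(
    val subjectBitmap: Bitmap, // subject object image cut out of the frame
    val position: RectF,       // clone position of the subject within the frame
    val timestampMs: Long      // timestamp of the frame in the original video
)

// One entry per image frame of the video to be processed.
typealias SubjectTrack = List<SubjectInfo>
```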
In step S10, when the clone key frames in the video to be processed are determined, subject recognition and segmentation are performed with machine-learning-based image detection and image segmentation techniques, for example a trained artificial neural network that performs subject recognition and that has been trained on a large number of images containing people or animals. This embodiment can support recognizing each of several subjects individually, and can also support recognizing a merged subject containing several people.
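Continuing the hypothetical types above, a sketch of passing each frame through such a model; SubjectSegmenter and segment() are placeholder names, not an API the patent identifies:

```kotlin
// Hypothetical wrapper around a trained segmentation network.
interface SubjectSegmenter {
    /** Returns the subject cut-out and its bounding box, or null if no subject is found. */
    fun segment(frame: Bitmap): Pair<Bitmap, RectF>?
}

fun extractSubject(segmenter: SubjectSegmenter, frame: Bitmap, tsMs: Long): SubjectInfo? =
    segmenter.segment(frame)?.let { (cutOut, box) -> SubjectInfo(cutOut, box, tsMs) }
```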
In an exemplary embodiment, step S11 may further include: receiving a shooting instruction, shooting the video to be processed, and determining the clone key frames of the video to be processed in real time during shooting. With this embodiment, the clone key frames are determined while shooting, so the positions of the clone images in the clone key frames can be displayed as soon as shooting ends.
In an exemplary embodiment, when the clone key frames in the video to be processed are determined in step S10, they are selected by a set selection algorithm. The set selection algorithm may select clone key frames at a set fixed time interval, so that they are reasonably spaced, or it may recognize the motion form of the subject object in each image frame and select clone key frames whose motion forms look good.
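A minimal sketch of the fixed-time-interval variant of the set selection algorithm, reusing the hypothetical SubjectTrack above; the 1000 ms default interval is an illustrative assumption:

```kotlin
// Keep the first frame, then the next frame at least intervalMs later, and so
// on, so the selected clone key frames are reasonably spaced in time.
fun selectKeyFrames(track: SubjectTrack, intervalMs: Long = 1000L): List<SubjectInfo> {
    val keyFrames = mutableListOf<SubjectInfo>()
    var nextTs = 0L
    for (info in track) {
        if (info.timestampMs >= nextTs) {
            keyFrames += info
            nextTs = info.timestampMs + intervalMs
        }
    }
    return keyFrames
}
```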
In an exemplary embodiment, displaying the interface in step S11 may include displaying a background image layer and clone image layers, where each clone image layer has the same size as the background image layer. Each clone image layer may contain the clone image of one image frame, and the clone position information of the clone image within its layer is the position of the subject object in the corresponding image frame. With this layering, after the user moves a clone image within its layer and a new clone image is determined, the clone image layer corresponding to the new clone image is superimposed on the background image layer and an animation matching the move operation is displayed, so the user can watch the effect of the adjustment intuitively and adjust it conveniently.
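As a sketch of how such layers might be composited with the Android Canvas API (the function and its parameters are assumptions, not the patent's implementation):

```kotlin
import android.graphics.Canvas

// Draw the background image layer first, then each clone image layer over it at
// its clone position; the layers are assumed to share the background's size.
fun composeInterface(canvas: Canvas, background: Bitmap, cloneKeyFrames: List<SubjectInfo>) {
    canvas.drawBitmap(background, 0f, 0f, null)
    for (clone in cloneKeyFrames) {
        canvas.drawBitmap(clone.subjectBitmap, null, clone.position, null)
    }
}
```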
In this embodiment, by displaying an interface containing clone images, the user inputs a position-movement control signal for a clone image on the interface and selects a new clone key frame to replace the original clone key frame to which the clone image belongs; the clone image is thereby repositioned and a personalized freeze-frame clone-effect video is obtained.
This embodiment allows an ordinary user, in any scene, to shoot and adjust a video of a single person or animal subject so as to obtain a multi-clone freeze-frame effect for the subject, which makes the feature more engaging to use.
The embodiments of the present disclosure provide a video generation method that may include the method shown in fig. 1 and may further include, before step S11: determining the background image of the video to be processed.
Determining the background image of the video to be processed may include: synthesizing a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
In this embodiment, the background image of the interface is synthesized from the background portions of the image frames with the subject object excluded, so the synthesized background contains no subject object. Using it as the interface background ensures that, while the user performs a position-movement operation on a clone image, the subject object remains clearly distinguishable from the background as its position changes.
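The patent does not prescribe how the background portions are combined; as one hedged possibility, the sketch below fills each pixel from the first frame in which a hypothetical subject mask leaves it uncovered:

```kotlin
import android.graphics.Color

// subjectMasks[i] is assumed non-transparent exactly where the subject covers
// frame i; each output pixel comes from the first frame not covering it.
fun synthesizeBackground(frames: List<Bitmap>, subjectMasks: List<Bitmap>): Bitmap {
    val w = frames[0].width
    val h = frames[0].height
    val out = Bitmap.createBitmap(w, h, Bitmap.Config.ARGB_8888)
    for (y in 0 until h) for (x in 0 until w) {
        val src = frames.indices.firstOrNull { Color.alpha(subjectMasks[it].getPixel(x, y)) == 0 }
        if (src != null) out.setPixel(x, y, frames[src].getPixel(x, y))
    }
    return out
}
```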
The embodiments of the present disclosure provide a video generation method that may include the method shown in fig. 1, where, for example:
the position-movement control signal is a drag touch signal for the first clone image.
Optionally, the drag touch signal may be a touch operation signal based on contact with the interface, such as a drag operation signal or a slide operation signal, or it may be a hover touch signal.
Determining the target image frame corresponding to the target timestamp of the position-movement control signal in step S12 may include: determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
In an exemplary embodiment, the method may further include displaying an icon representing the touch operation, for example an icon with a small-hand pattern.
In this embodiment, the adjustment of the clone image is completed by inputting a drag touch signal, so that a new clone key frame is determined to replace the old clone key frame. With a drag touch signal, the user can visually check on the interface the spacing and the pose differences between the clone images of different clone key frames, which helps the user obtain a video with the desired clone effect.
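A sketch of one plausible resolution of the end point: the end point is matched to the frame whose subject position lies nearest to it, and that frame's timestamp becomes the target timestamp. This nearest-position rule is an assumption; the patent only requires that the end point determine a target timestamp and a target image frame:

```kotlin
import android.graphics.PointF

// The track is assumed non-empty; the frame whose subject center is nearest to
// the drag end point supplies the target timestamp and target image frame.
fun frameForEndPoint(end: PointF, track: SubjectTrack): SubjectInfo =
    track.minByOrNull { info ->
        val dx = info.position.centerX() - end.x
        val dy = info.position.centerY() - end.y
        dx * dx + dy * dy
    }!!
```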
The following description is made with reference to fig. 2 and example one.
Example one:
As shown in the first panel of fig. 2, the mobile terminal displays a background image on the interface, with three clone images superimposed on the background image.
The user wants to change the 2nd clone image and drag it to the right. As shown in the second panel of fig. 2, the user first taps the 2nd clone image.
The user performs the drag touch operation, and the mobile terminal displays a hand icon on the interface; the third panel of fig. 2 shows the picture at the moment the drag touch operation reaches its end point.
After the user finishes the drag touch operation, as shown in the fourth panel of fig. 2, the mobile terminal displays the newly determined 2nd clone image on the interface.
In an exemplary embodiment, while the drag touch signal is being continuously input, the method may further include:
step 1, determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp;
and step 2, when the drag control signal reaches each node position, displaying the subject object image of the subject object in the image frame corresponding to the node position reached by the drag control signal.
In an exemplary embodiment, the node positions corresponding to the drag touch signal may include every image frame along the slide track of the drag touch signal, or a number of image frames sampled along the slide track at a set frame interval.
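A sketch of this per-node preview, taking the second variant above and sampling every third frame along the track (the sampling step is an illustrative assumption); it reuses the hypothetical frameForEndPoint from the earlier sketch:

```kotlin
// As the drag passes a node, the node frame nearest the current touch point
// supplies the subject object image that is displayed.
fun previewAtDragPoint(touch: PointF, track: SubjectTrack, everyNthFrame: Int = 3): SubjectInfo {
    val nodeFrames = track.filterIndexed { i, _ -> i % everyNthFrame == 0 }
    return frameForEndPoint(touch, nodeFrames)
}
```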
The following description is made with reference to fig. 3 and example two.
Example two:
As shown in the first panel of fig. 3, the mobile terminal displays a background image on the interface, with three clone images superimposed on the background image.
The user wants to change the 2nd clone image and drag it to the right. As shown in the second panel of fig. 3, the user first taps the 2nd clone image.
The user performs the drag touch operation, and the mobile terminal displays a hand icon on the interface. During the drag, the subject object image of the image frame corresponding to the current touch position of the drag operation is displayed, as shown in the third panel of fig. 3.
After the user finishes the drag touch operation, as shown in the fourth panel of fig. 3, the mobile terminal displays the newly determined 2nd clone image on the interface.
In this embodiment, while the user inputs the drag touch signal, the subject object images of the image frames corresponding to the node positions passed during the continuous input are displayed, so the user can watch the subject object's pose during the drag and decide the end position of the drag control signal from what is seen, thereby selecting an ideal new clone key frame.
In an exemplary embodiment, after the position-movement control signal for the clone image is received in step S12, the method may further include: displaying the background image, and at least one clone image superimposed on the background image according to clone position information, the at least one clone image being the clone images other than the first clone image. In this embodiment, once the drag touch signal begins to be executed, the first clone image at the start point of the drag touch signal is no longer displayed, which visually presents the effect of deleting the first clone image, so the user can concentrate on the position and pose of the new clone image during the drag and select a better new clone key frame.
In an exemplary embodiment, displaying the subject object image in the image frame corresponding to the node position reached by the drag control signal may include:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
The set rendering mode may include at least one of: filling with a set color, filling with a set pattern, adjusting the transparency to a set transparency, and drawing a set style of stroke.
In this embodiment, because the clone image in the image frame at the node position reached by the drag control signal is rendered distinctively, the user can intuitively distinguish the not-yet-selected clone image from the already determined clone images, which makes the operation easier.
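As one example of the listed rendering modes, a sketch that draws the in-drag subject image at a set transparency; the alpha value of 128 is an illustrative assumption:

```kotlin
import android.graphics.Paint

// Render the not-yet-selected subject image semi-transparent so it stands
// apart from the already determined clone images.
fun drawDragPreview(canvas: Canvas, preview: SubjectInfo, previewAlpha: Int = 128) {
    val paint = Paint()
    paint.alpha = previewAlpha // adjust transparency to the set value
    canvas.drawBitmap(preview.subjectBitmap, null, preview.position, paint)
}
```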
The embodiments of the present disclosure provide a video generation method that may include the method shown in fig. 1, where the position-movement control signal may include, for example: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image. Receiving the first tap signal determines that the first clone image is the clone image whose position is to be adjusted. Optionally, the first and second tap signals may be touch operation signals based on contact with the interface, such as single-tap, double-tap, or press operation signals, or they may be hover touch signals.
Determining the target image frame corresponding to the target timestamp of the position-movement control signal in step S12 may include: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
In an exemplary embodiment, the second tap signal is a tap on a region outside all the clone images; since the tapped point belongs to no clone image, the target image frame determined for the second tap signal is an image frame other than the clone key frames.
The user can thus complete the adjustment of the subject object image with just two taps, so that a new clone key frame is determined to replace the old clone key frame; the operation is convenient and simple.
In an exemplary embodiment, the second tap signal is a tap on a clone image other than the first clone image; because the tapped point belongs to another clone image, the target image frame determined for the second tap signal is a clone key frame other than the first clone key frame (i.e., the clone key frame to which the first clone image belongs). In this case, the processing performed is to delete the first clone key frame from the set of clone key frames and keep the other clone key frames.
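A sketch combining the two second-tap cases above, reusing the hypothetical helpers from the earlier sketches (the hit test and the nearest-frame mapping are assumptions):

```kotlin
// A tap inside another clone image only deletes the first clone key frame;
// a tap elsewhere also selects a replacement target frame.
fun applySecondTap(
    tap: PointF,
    cloneKeyFrames: MutableList<SubjectInfo>,
    first: SubjectInfo,
    track: SubjectTrack
) {
    val hitsOtherClone = cloneKeyFrames.any {
        it !== first && it.position.contains(tap.x, tap.y)
    }
    cloneKeyFrames.remove(first) // the first clone key frame is removed in both cases
    if (!hitsOtherClone) {
        cloneKeyFrames += frameForEndPoint(tap, track) // new clone key frame
    }
}
```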
The embodiments of the present disclosure provide a video generation method that includes the method shown in fig. 1 and may further include, for example: displaying a timeline of the video to be processed, and displaying, on the timeline, a slider marker corresponding to the position-movement control signal.
In an exemplary embodiment, the method may further include: displaying, at the slider marker, the time point on the timeline corresponding to the current position point.
In this embodiment, displaying the timeline of the video to be processed together with a slider marker corresponding to the position-movement control signal lets the user know the corresponding timestamp during the adjustment, and thereby select an ideal clone key frame more easily.
The video generation method provided in the embodiments of the present disclosure may include the method shown in fig. 1; for example, generating the freeze-frame clone-effect video from the video to be processed and the updated clone key frames in step S13 may include the following.
Each image frame of the video segment in the first period of the freeze-frame clone-effect video is the corresponding image frame of the video to be processed overlaid with the clone images of all the clone key frames; the first period runs from the start time point of the original video to the timestamp corresponding to the first clone key frame.
Each image frame of the video segment in the second period of the freeze-frame clone-effect video is the corresponding image frame of the video to be processed overlaid with the clone images of all the clone key frames except the first; the second period runs from the timestamp corresponding to the first clone key frame to the timestamp corresponding to the second clone key frame.
And so on.
Each image frame of the video segment in the last period of the freeze-frame clone-effect video is simply the corresponding image frame of the video to be processed; the last period runs from the timestamp corresponding to the last clone key frame to the end timestamp of the video to be processed.
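The periods above reduce to a single per-frame rule: at time t, overlay exactly those clone key frames whose timestamps still lie ahead of t, so the clones vanish one by one as playback passes them. A sketch under the same hypothetical types (frame decoding and encoding plumbing is omitted):

```kotlin
// Render one frame of the freeze-frame clone-effect video.
fun renderEffectFrame(
    canvas: Canvas,
    frame: Bitmap,                     // corresponding frame of the video to be processed
    frameTsMs: Long,
    cloneKeyFrames: List<SubjectInfo>  // updated clone key frames
) {
    canvas.drawBitmap(frame, 0f, 0f, null)
    for (clone in cloneKeyFrames) {
        if (clone.timestampMs > frameTsMs) { // not yet reached: keep the frozen clone
            canvas.drawBitmap(clone.subjectBitmap, null, clone.position, null)
        }
    }
}
```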
The following description is made with reference to fig. 4 and example three.
Example three:
A freeze-frame clone-effect video is generated from the clone images determined in example two; its playback effect is shown in fig. 4. The freeze-frame clone-effect video contains 3 clone key frames in total, and fig. 4 shows representative image frames from the different periods.
The embodiment of the disclosure provides a video generation device, which is applied to a mobile terminal. Referring to fig. 5, fig. 5 is a block diagram illustrating a video generating apparatus according to an exemplary embodiment. As shown in fig. 5, the apparatus may include:
a first display module 501 configured to display an interface, where the interface may include: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
a receiving module 502 configured to receive a position-movement control signal for a first clone image among the at least one clone image;
an updating module 503 configured to determine a target image frame corresponding to a target timestamp of the position-movement control signal, and to replace the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and a generating module 504 configured to generate a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
The embodiment of the present disclosure provides a video generating apparatus, which may include the apparatus shown in fig. 5, and may further include:
a synthesis module configured to synthesize a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
The embodiments of the present disclosure provide a video generating apparatus that may include the apparatus shown in fig. 5, where the position-movement control signal is, for example, a drag touch signal for the first clone image. Optionally, the drag touch signal may be a touch operation signal based on contact with the interface, such as a drag operation signal or a slide operation signal, or it may be a hover touch signal.
Determining the target image frame corresponding to the target timestamp of the position-movement control signal may include:
determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
In one embodiment, the updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp.
The apparatus may further include:
a second display module configured to display, when the drag control signal reaches each node position, the subject object image in the image frame corresponding to the node position reached by the drag control signal.
In an embodiment, the second display module is further configured to display the subject object image in the image frame corresponding to the node position reached by the drag control signal by:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
The embodiments of the present disclosure provide a video generating apparatus that may include the apparatus shown in fig. 5, where the position-movement control signal may include, for example: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image. Receiving the first tap signal determines that the first clone image is the clone image whose position is to be adjusted. Optionally, the first and second tap signals may be touch operation signals based on contact with the interface, such as single-tap, double-tap, or press operation signals, or they may be hover touch signals.
The updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
The embodiment of the present disclosure provides a video generating apparatus, which may include the apparatus shown in fig. 5, and may further include:
a third display module configured to display a timeline of the video to be processed, and to display, on the timeline, a slider marker corresponding to the position-movement control signal.
The video generation device provided by the embodiment of the disclosure may include:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute executable instructions in the memory to implement the steps of the video generation method.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon executable instructions that, when executed by a processor, implement the steps of a video generation method.
Fig. 6 is a block diagram illustrating an apparatus 600 for video generation according to an example embodiment. For example, the apparatus 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 6, apparatus 600 may include one or more of the following components: processing component 602, memory 604, power component 606, multimedia component 608, audio component 610, input/output (I/O) interface 612, sensor component 614, and communication component 616.
The processing component 602 generally controls overall operation of the device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 can include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation at the device 600. Examples of such data include instructions for any application or method operating on device 600, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 604 may be implemented by any type or combination of volatile and non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power supply component 606 provides power to the various components of device 600. The power components 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 600 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, audio component 610 includes a Microphone (MIC) configured to receive external audio signals when apparatus 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the apparatus 600. For example, the sensor component 614 may detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the apparatus 600; it may also detect a change in position of the apparatus 600 or of a component of the apparatus 600, the presence or absence of user contact with the apparatus 600, the orientation or acceleration/deceleration of the apparatus 600, and a change in the temperature of the apparatus 600. The sensor component 614 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate communications between the apparatus 600 and other devices in a wired or wireless manner. The apparatus 600 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 604 comprising instructions, executable by the processor 620 of the apparatus 600 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It should be noted that, in the present disclosure, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A video generation method, applied to a mobile terminal, the method comprising:
displaying an interface, the interface including: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
receiving a position-movement control signal for a first clone image among the at least one clone image, determining a target image frame corresponding to a target timestamp of the position-movement control signal, and replacing the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and generating a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
2. The method of claim 1, wherein, before the displaying of the interface, the method further comprises:
synthesizing a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
3. The method of claim 1, wherein
the position-movement control signal is a drag touch signal for the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal comprises:
determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
4. The method of claim 3, wherein,
after the displaying of the interface, the method further comprises:
determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp;
and, when the drag control signal reaches each node position, displaying the subject object image of the subject object in the image frame corresponding to the node position reached by the drag control signal.
5. The method of claim 4, wherein
the displaying of the subject object image in the image frame corresponding to the node position reached by the drag control signal comprises:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
6. The method of claim 1, wherein
the position-movement control signal comprises: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal comprises: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
7. The method of claim 1, wherein
the method further comprises:
displaying a timeline of the video to be processed;
and displaying, on the timeline, a slider marker corresponding to the position-movement control signal.
8. A video generation apparatus, applied to a mobile terminal, the apparatus comprising:
a first display module configured to display an interface, the interface including: a background image of a video to be processed, and at least one clone image of a subject object superimposed on the background image according to clone position information, wherein each clone image corresponds to a clone key frame;
a receiving module configured to receive a position-movement control signal for a first clone image among the at least one clone image;
an updating module configured to determine a target image frame corresponding to a target timestamp of the position-movement control signal, and to replace the clone key frame to which the first clone image belongs with the target image frame as a new clone key frame;
and a generating module configured to generate a freeze-frame clone-effect video according to the video to be processed and the updated clone key frames.
9. The apparatus of claim 8, wherein
the apparatus further comprises:
a synthesis module configured to synthesize a background image from the background portions, excluding the subject object, of the image frames of the video to be processed.
10. The apparatus of claim 8, wherein
the position-movement control signal is a drag touch signal for the first clone image;
and the determining of the target image frame corresponding to the target timestamp of the position-movement control signal comprises:
determining an end point of the drag touch signal, determining a target timestamp corresponding to the end point, and determining a target image frame corresponding to the target timestamp.
11. The apparatus of claim 10, wherein
the updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining at least one node position corresponding to the drag touch signal, determining a node timestamp corresponding to each node position, and determining an image frame corresponding to each node timestamp;
and the apparatus further comprises:
a second display module configured to display, when the drag control signal reaches each node position, the subject object image in the image frame corresponding to the node position reached by the drag control signal.
12. The apparatus of claim 11, wherein
the second display module is further configured to display the subject object image in the image frame corresponding to the node position reached by the drag control signal by:
rendering, in a set rendering mode, the subject object image in the image frame corresponding to the node position reached by the drag control signal;
and displaying the rendered subject object image.
13. The apparatus of claim 8, wherein
the position-movement control signal comprises: a first tap signal on the first clone image and a second tap signal on a region other than the first clone image;
and the updating module is further configured to determine the target image frame corresponding to the target timestamp of the position-movement control signal by: determining a position point of the second tap signal, determining a target timestamp corresponding to the position point, and determining a target image frame corresponding to the target timestamp.
14. The apparatus of claim 8,
the apparatus further comprises:
a third display module configured to display a time axis of the video to be processed, and to display a slider marker corresponding to the position movement control signal on the time axis.
15. A video generation apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the executable instructions in the memory to implement the steps of the video generation method of any one of claims 1 to 7.
16. A non-transitory computer-readable storage medium having stored thereon executable instructions which, when executed by a processor, implement the steps of the video generation method of any one of claims 1 to 7.
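For readers skimming the claim set, the following Python sketch (not part of the patent) illustrates the mechanism of claims 8 to 10: mapping the end point of a dragging touch signal on the time axis to a target timestamp and target image frame, replacing the avatar key frame to which the dragged avatar image belongs, and compositing the freeze-frame avatar effect video. All function and parameter names, the NumPy frame representation, and the boolean subject-object masks are illustrative assumptions; the claims do not prescribe any particular implementation.

```python
import numpy as np

def target_timestamp(end_x, axis_x0, axis_width, duration_s):
    # Map the end point of the dragging touch signal on the time axis
    # to a target timestamp (claim 10). Names are illustrative.
    frac = min(max((end_x - axis_x0) / axis_width, 0.0), 1.0)
    return frac * duration_s

def target_frame_index(timestamp_s, fps):
    # Target image frame corresponding to the target timestamp.
    return int(round(timestamp_s * fps))

def update_avatar_keyframe(keyframes, avatar_id, frame_index):
    # Replace the avatar key frame that the dragged (first) avatar image
    # belongs to with the target image frame (claim 8, updating module).
    updated = dict(keyframes)          # {avatar_id: key-frame index}
    updated[avatar_id] = frame_index
    return updated

def generate_freeze_frame_video(frames, subject_masks, keyframes):
    # Overlay the subject object of every avatar key frame onto each frame
    # of the video to be processed (claim 8, generating module).
    # frames: list of HxWx3 uint8 arrays; subject_masks: list of HxW bools.
    out = []
    for frame in frames:
        composite = frame.copy()
        for frame_index in keyframes.values():
            mask = subject_masks[frame_index]
            composite[mask] = frames[frame_index][mask]  # frozen subject pixels
        out.append(composite)
    return out
```

As a usage sketch under these assumptions: dragging the first avatar image to pixel 420 on an 840-pixel time axis of a 10-second, 30 fps clip gives a target timestamp of 5.0 s and target frame 150; replacing that avatar's key frame with frame 150 and re-compositing yields the updated freeze-frame avatar effect video.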
CN202011580496.6A 2020-12-28 2020-12-28 Video generation method, device and medium Active CN114697723B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011580496.6A CN114697723B (en) 2020-12-28 2020-12-28 Video generation method, device and medium


Publications (2)

Publication Number Publication Date
CN114697723A (en) 2022-07-01
CN114697723B CN114697723B (en) 2024-01-16

Family

ID=82130852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011580496.6A Active CN114697723B (en) 2020-12-28 2020-12-28 Video generation method, device and medium

Country Status (1)

Country Link
CN (1) CN114697723B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5420622A (en) * 1991-09-23 1995-05-30 Faroudja; Philippe Y. C. Stop frame animation system using reference drawings to position an object by superimposition of TV displays
US20190110077A1 (en) * 2017-10-10 2019-04-11 Samsung Electronics Co., Ltd. Method and electronic device for processing raw image acquired through camera by using external electronic device
CN110225241A (en) * 2019-04-29 2019-09-10 努比亚技术有限公司 A kind of video capture control method, terminal and computer readable storage medium
CN110177258A (en) * 2019-06-28 2019-08-27 Oppo广东移动通信有限公司 Image processing method, device, server and storage medium
CN110267009A (en) * 2019-06-28 2019-09-20 Oppo广东移动通信有限公司 Image processing method, device, server and storage medium
CN111601033A (en) * 2020-04-27 2020-08-28 北京小米松果电子有限公司 Video processing method, device and storage medium
CN111614902A (en) * 2020-05-27 2020-09-01 努比亚技术有限公司 Video shooting method and device and computer readable storage medium
CN111832539A (en) * 2020-07-28 2020-10-27 北京小米松果电子有限公司 Video processing method and device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Alexander Behrens, Martin Guski, Thomas Stehle, Sebastian Gross, Til Aach: "Image reconstructions from super-sampled data sets in PET imaging", 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro *
Liu Chengsuo: "Digital stop-motion animation production based on Stop Motion Pro" (基于Stop Motion Pro的数字定格动画制作), TV Subtitles (Special Effects and Animation) (电视字幕(特技与动画)), no. 06 *

Also Published As

Publication number Publication date
CN114697723B (en) 2024-01-16

Similar Documents

Publication Publication Date Title
CN110662083B (en) Data processing method and device, electronic equipment and storage medium
CN107977083B (en) Operation execution method and device based on VR system
CN109168062B (en) Video playing display method and device, terminal equipment and storage medium
CN110825912A (en) Video generation method and device, electronic equipment and storage medium
CN110636382A (en) Method and device for adding visual object in video, electronic equipment and storage medium
CN109660873B (en) Video-based interaction method, interaction device and computer-readable storage medium
US11770497B2 (en) Method and device for processing video, and storage medium
CN111479158B (en) Video display method and device, electronic equipment and storage medium
CN104035674B (en) Picture displaying method and device
CN114025105B (en) Video processing method, device, electronic equipment and storage medium
CN111918131A (en) Video generation method and device
CN111612876A (en) Expression generation method and device and storage medium
CN113206948A (en) Image effect previewing method and device, electronic equipment and storage medium
CN103885678A (en) Method and device for displaying object
CN108984098B (en) Information display control method and device based on social software
CN113613082A (en) Video playing method and device, electronic equipment and storage medium
CN112347911A (en) Method and device for adding special effects of fingernails, electronic equipment and storage medium
CN112764636A (en) Video processing method, video processing device, electronic equipment and computer-readable storage medium
CN112669233A (en) Image processing method, image processing apparatus, electronic device, storage medium, and program product
CN112199552A (en) Video image display method and device, electronic equipment and storage medium
WO2022262211A1 (en) Content processing method and apparatus
CN114697723B (en) Video generation method, device and medium
CN110769282A (en) Short video generation method, terminal and server
CN113989424A (en) Three-dimensional virtual image generation method and device and electronic equipment
CN109407942B (en) Model processing method and device, control client and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant