Video processing method and device

Info

Publication number
CN113794799A
CN113794799A (application CN202111092369.6A)
Authority
CN
China
Prior art keywords
video
input
target
sub
frame image
Prior art date
Legal status
Pending
Application number
CN202111092369.6A
Other languages
Chinese (zh)
Inventor
许晓琳
Current Assignee
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202111092369.6A
Publication of CN113794799A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M 1/00: Substation equipment, e.g. for use by subscribers
    • H04M 1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/72427: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting games or graphical animations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00: Animation


Abstract

The application discloses a video processing method and device, which belong to the field of electronic technology. The video processing method includes the following steps: determining a first object in a first image based on a first input; acquiring, based on a second input, a target frame image that includes a second object in a first video; adjusting a first pose of the first object to match a second pose of the second object in the target frame image; and replacing the second object in the target frame image with the first object.

Description

Video processing method and device
Technical Field
The application belongs to the technical field of electronics, and particularly relates to a video processing method and device.
Background
With the development of electronic technology, people's daily life, work, and entertainment have become inseparable from electronic devices, and even children have begun to use electronic devices frequently.
For example, early childhood education is no longer limited to books such as picture books, but increasingly takes the form of animated films. Animation presents objective phenomena in a way that matches how children think, so it is more easily accepted by them. As demand grows, there is an increasing desire to produce animations that blend animated scenes with a child's real world, so as to improve children's viewing interest and comprehension. However, producing an animated film involves many steps, such as script writing, art design, key-frame and storyboard creation, animation production, and dubbing and editing, so the production cost is high.
Disclosure of Invention
An object of the embodiments of the present application is to provide a video processing method that can solve the prior-art problem that producing an animated film is costly for the user.
In a first aspect, an embodiment of the present application provides a video processing method, where the method includes: determining a first object in the first image based on the first input; acquiring a target frame image including a second object in the first video based on the second input; adjusting a first pose of the first object to match a second pose of the second object in the target frame image; replacing the second object in the target frame image with the first object.
In a second aspect, an embodiment of the present application provides a video processing apparatus, including: a determination module to determine a first object in a first image based on a first input; the acquisition module is used for acquiring a target frame image including a second object in the first video based on the second input; a first adjusting module, configured to adjust, according to a second pose of the second object in the target frame image, a first pose of the first object to match the second pose; a replacing module for replacing the second object in the target frame image with the first object.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
Thus, in the embodiments of the application, when making an animated film, the user can select a first image so that the first object in the first image serves as a custom animated character; further, the user can select a first video as the animation to be produced, and then select from it a second object as the animated character to be replaced. Based on the user's selections, the target frame images that include the second object are extracted from the animation; for each such frame image, the second pose of the second object is identified, the first pose of the first object is adjusted according to that second pose so that the first pose matches the second pose, and finally the second object in that frame image is replaced with the first object. The same processing is repeated until the second object has been replaced in all target frame images. Therefore, in the embodiments of the application, on the basis of fully reusing existing animated films, a user can replace any animated character in any animated film with a custom animated character through simple operations, thereby quickly completing the simple production of an animated film at low production cost.
Drawings
Fig. 1 is a flowchart of a video processing method according to an embodiment of the present application;
fig. 2 to 16 are operation diagrams of the electronic device of the embodiment of the present application;
fig. 17 is a block diagram of a video processing apparatus according to an embodiment of the present application;
fig. 18 is a first schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application;
fig. 19 is a second schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Moreover, the terms "first", "second", and the like are used in a generic sense and do not limit the number of objects; for example, a first object may be one object or more than one. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.
The video processing method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Referring to fig. 1, a flowchart of a video processing method according to an embodiment of the present application is shown, and the method is applied to an electronic device, and includes:
step 110: based on the first input, a first object in the first image is determined.
The first input includes touch input performed by the user on the screen, such as, but not limited to, clicking, sliding, and dragging; the first input may also be an air (contactless) input, a gesture action, or a facial action of the user, and may further include input on a physical key of the device, such as, but not limited to, a press. Moreover, the first input may consist of one or more inputs, where multiple inputs may be continuous or intermittent.
The first input is for selecting a first image.
In case a plurality of objects are included in the first image, the first input is also used for selecting the first object.
Alternatively, the first image may be a dynamic image, such as a video; the first image may also be a still image, such as a picture.
Optionally, the first object includes any one of a person and an object.
Optionally, the present embodiment is applied to a piece of video processing software.
In one scenario, a user opens the video processing software and enters an animation interface.
Referring to fig. 2, a plurality of controls are displayed in the interface, each control indicating an import mode, and each control in the interface is used for importing a first image. Exemplary import modes include, but are not limited to: "camera scan" import, "album" import, "file library" import, and "material library" import.
Taking the "album" import mode as an example, the user clicks "album" (as shown in fig. 2), enters the album interface (as shown in fig. 3), displays all the pictures in the album, and clicks "picture one" as the first image.
Taking the "camera scan" import mode as an example, the user clicks "camera scan" (as shown in fig. 4), enters a scan interface, and the scan interface displays an image acquired by the camera as a first image. For example, a user may manually draw a character on another carrier and then scan using a camera.
Optionally, if the scanning is successful, an interactive interface for saving the scanned image may be provided, so as to save the shooting step. Referring to fig. 5, after the scanning is successful, the user enters a storage interface, and two storage modes of "picture" and "pdf" are displayed in the interface for the user to select.
In addition, the user can click "file library" to import the first image from other software, or click "material library" to import the first image from the material library of the current software.
Optionally, after a first image from other software is imported, an option to save it to the material library can be provided for the user to select, so that the image can be used directly next time.
Further, after importing the first image, the first object in the first image is automatically identified to determine the first object in step 110.
In one scenario, if the first image includes a plurality of objects, the plurality of objects are displayed, and the user selects, among them, the first object to be used for the replacement.
Wherein the first object is a user-defined animated character.
Step 120: based on the second input, a target frame image including the second object in the first video is acquired.
The second input includes touch input performed by the user on the screen, such as, but not limited to, clicking, sliding, and dragging; the second input may also be an air (contactless) input, a gesture action, or a facial action of the user, and may further include input on a physical key of the device, such as, but not limited to, a press. Moreover, the second input may consist of one or more inputs, where multiple inputs may be continuous or intermittent.
The second input is for selecting the first video and the second object.
Optionally, the first video comprises an animation.
In one scenario, a frame material library is prepared in advance by the software.
Illustratively, the software acquires a large number of animated films in advance, preprocesses them to obtain the animation framework corresponding to each film, and stores these frameworks in the frame material library. In each animation framework, every animated character in the film has been extracted, so that the user can select one of the extracted characters and replace it with a custom animated character.
Referring to fig. 6, after the user imports the first image, an animation framework selection interface is entered. In this interface, the user clicks "frame material library", the framework list shown in fig. 7 is displayed, and the user clicks "animation frame one" as the first video.
In yet another scenario, the animation framework is customized by the user.
Referring to fig. 8, after the user imports the first image and enters the animation framework selection interface, the user clicks "make other frames", and two ways of uploading an animated film, "album" and "file library", are provided. The user can click "album" to display the list of animated films in the album and upload a film from the album software to the video processing software as the first video; the software then processes the uploaded film to obtain the corresponding animation framework.
Further, if the animation framework is successfully made, an interactive interface for saving it to the material library can be provided, so that the user can conveniently select it from the material library next time. Referring to fig. 10, after the creation succeeds, the user enters a save interface in which a "save to material library" option is displayed for the user to select.
The above provides an application scenario for selecting the first video. In this scenario, the animation framework is used to extract the animated characters from the animated film for the user to select and replace. In other application scenarios, the animated film can be selected directly, and the animated characters in it are then identified for the user to select.
Further, after the selection of the first video is completed, a control for each animated character is displayed for the user to select, so that any animated character can be chosen as the second object.
The target frame image may be a single frame image or a multi-frame image.
Optionally, the target frame images in this step are all frame images in the first video that include the second object; alternatively, the target frame images are only some of the frame images in the first video that include the second object.
Step 130: and adjusting the first posture of the first object to be matched with the second posture according to the second posture of the second object in the target frame image.
Optionally, the second pose of the second object comprises a body pose and an expression pose, and correspondingly, the first pose of the first object determined from the first image also comprises a body pose and an expression pose.
For example, suppose the second object is jumping: its body pose includes features of various body parts, such as the feet being lifted upward, and its expression pose includes facial features, such as the mouth being open. Based on these poses of the second object, the first pose of the first object needs to be adjusted so that the first pose matches the second pose; in particular, the feet of the first object are also adjusted to lift upward, and so on.
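The embodiments do not prescribe a particular pose-matching algorithm. Purely as an illustrative sketch, assuming both objects are described by 2D keypoints from some off-the-shelf pose estimator (the keypoint inputs, the OpenCV calls, and the single-transform simplification are assumptions, not the claimed implementation), the custom character could be warped so that its keypoints roughly line up with those of the character being replaced:

```python
import cv2
import numpy as np

def match_pose(first_obj_rgba, first_keypoints, second_keypoints):
    """Warp the custom character so its 2D keypoints roughly align with
    the keypoints of the character it will replace in the current frame.

    first_obj_rgba   : H x W x 4 image of the custom character (with alpha)
    first_keypoints  : (N, 2) keypoints detected on the custom character
    second_keypoints : (N, 2) keypoints of the replaced character in the frame
    """
    src = np.asarray(first_keypoints, dtype=np.float32)
    dst = np.asarray(second_keypoints, dtype=np.float32)

    # A single similarity transform (rotation + scale + translation) is the
    # crudest possible "pose adjustment"; a real system would deform limbs
    # individually or drive a rigged character model.
    matrix, _ = cv2.estimateAffinePartial2D(src, dst)

    h, w = first_obj_rgba.shape[:2]
    return cv2.warpAffine(first_obj_rgba, matrix, (w, h),
                          flags=cv2.INTER_LINEAR,
                          borderMode=cv2.BORDER_CONSTANT,
                          borderValue=(0, 0, 0, 0))
```

The warped character is then positioned against the target frame in the replacement step described below.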
Step 140: the second object in the target frame image is replaced with the first object.
In this step, the target frame image in the first video is processed to replace the second object of the target frame image with the first object.
Optionally, for any frame image, the second object is first matted out, the first object is then placed at the same position, and finally the area around the outline of the first object is processed so that the outline blends with the background.
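As a rough illustration only (the OpenCV calls, the helper names, and the inpainting and alpha-blending choices below are assumptions rather than the claimed implementation), one frame could be processed along these lines: matte out the original character, fill the hole from the surrounding background, paste the pose-adjusted custom character at the same position, and soften the pasted outline so it sits on the background:

```python
import cv2
import numpy as np

def replace_object(frame, second_obj_mask, first_obj_rgba, top_left):
    """Replace the matted-out character with the custom character in one frame.

    frame           : H x W x 3 BGR target frame image
    second_obj_mask : H x W uint8 mask (255 where the replaced character was)
    first_obj_rgba  : h x w x 4 pose-adjusted custom character with alpha channel
    top_left        : (x, y) paste position; the pasted region is assumed to
                      lie fully inside the frame in this sketch
    """
    # 1. Matte out the original character and fill the hole from the
    #    surrounding background (simple diffusion-based inpainting here).
    cleaned = cv2.inpaint(frame, second_obj_mask, 3, cv2.INPAINT_TELEA)

    # 2. Alpha-blend the custom character at the same position.
    x, y = top_left
    h, w = first_obj_rgba.shape[:2]
    roi = cleaned[y:y + h, x:x + w].astype(np.float32)
    rgb = first_obj_rgba[:, :, :3].astype(np.float32)
    alpha = first_obj_rgba[:, :, 3:4].astype(np.float32) / 255.0
    cleaned[y:y + h, x:x + w] = (alpha * rgb + (1.0 - alpha) * roi).astype(np.uint8)

    # 3. Lightly blur a thin band around the pasted outline, standing in for
    #    "processing the periphery of the outline".
    mask_bin = (alpha[:, :, 0] > 0).astype(np.uint8)
    kernel = np.ones((5, 5), np.uint8)
    edge = cv2.dilate(mask_bin, kernel) - cv2.erode(mask_bin, kernel)
    band = np.zeros(frame.shape[:2], dtype=np.uint8)
    band[y:y + h, x:x + w] = edge
    blurred = cv2.GaussianBlur(cleaned, (5, 5), 0)
    cleaned[band > 0] = blurred[band > 0]
    return cleaned
```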
Thus, in the embodiments of the application, when making an animated film, the user can select a first image so that the first object in the first image serves as a custom animated character; further, the user can select a first video as the animation to be produced, and then select from it a second object as the animated character to be replaced. Based on the user's selections, the target frame images that include the second object are extracted from the animation; for each such frame image, the second pose of the second object is identified, the first pose of the first object is adjusted according to that second pose so that the first pose matches the second pose, and finally the second object in that frame image is replaced with the first object. The same processing is repeated until the second object has been replaced in all target frame images. Therefore, in the embodiments of the application, on the basis of fully reusing existing animated films, a user can replace any animated character in any animated film with a custom animated character through simple operations, thereby quickly completing the simple production of an animated film at low production cost.
In the flow of the video processing method according to another embodiment of the present application, the second input includes a first sub-input and a second sub-input, and step 120 includes:
substep A1: a first sub-input to a first video is received.
Substep A2: at least one object in the first video is identified in response to the first sub-input.
The first sub-input is for selecting a first video.
In this embodiment, each frame image of the first video is acquired, and the objects in each frame image are identified, including but not limited to humans, animals, articles, and plants. After each frame image has been recognized separately, detections sharing the same features across frames are treated as the same object, so that all objects included in the first video can be recognized.
Optionally, the at least one object in this step comprises all objects that are identifiable.
Substep A3: a second sub-input to a second object is received.
Wherein the at least one object comprises a second object.
Substep A4: and responding to the second sub-input, and acquiring a target frame image including a second object in the first video.
The second sub-input is for selecting a second object.
Optionally, the target frame image in this embodiment is: including all the frame images of the second object.
In this embodiment, based on the at least one identified object, all frame images that include a given object can be found among all frame images of the first video and associated together, for example, placed in one folder. Thus, when the user selects the first video, a plurality of controls are displayed, each indicating an object and each associated with all frame images in which the indicated object appears.
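A minimal sketch of this bookkeeping is shown below. The per-frame detector and the appearance-feature extractor are hypothetical placeholders (the embodiments do not name specific models); detections whose features are sufficiently similar are merged into one object identity, and every frame index in which that identity appears is recorded:

```python
import numpy as np

def index_objects(frames, detect_objects, extract_feature, similarity_threshold=0.8):
    """Build a mapping object_id -> list of frame indices containing that object.

    detect_objects(frame) : assumed detector returning a list of cropped object images
    extract_feature(crop) : assumed embedder returning a 1-D appearance feature vector
    """
    object_features = []   # one representative (normalized) feature per known object
    object_to_frames = {}  # object_id -> frame indices in playing order

    for frame_idx, frame in enumerate(frames):
        for crop in detect_objects(frame):
            feat = np.asarray(extract_feature(crop), dtype=np.float32)
            feat = feat / (np.linalg.norm(feat) + 1e-8)

            # Match against known objects by cosine similarity.
            best_id, best_sim = None, similarity_threshold
            for obj_id, ref in enumerate(object_features):
                sim = float(np.dot(feat, ref))
                if sim > best_sim:
                    best_id, best_sim = obj_id, sim

            if best_id is None:  # unseen object: register a new identity
                best_id = len(object_features)
                object_features.append(feat)
                object_to_frames[best_id] = []
            if not object_to_frames[best_id] or object_to_frames[best_id][-1] != frame_idx:
                object_to_frames[best_id].append(frame_idx)
    return object_to_frames
```

Each control in the interface would then point at one object identity and the frame indices recorded for it.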
Referring to fig. 11, for example, in the illustrated interface, the left area displays an animation element (i.e., the first object) and the right area displays a plurality of characters in the first video. When the user clicks any character (e.g., character one), all frame images associated with character one are processed as follows: character one is replaced with the animation element on the left side.
Referring to figs. 11 and 12, in one scenario, the user presses the animation element icon in the left area and drags it onto the character two icon in the right area; when the character two icon shows the expected change (for example, turning gray), the user releases the finger, so that all frame images associated with character two are processed as follows: character two is replaced with the animation element dragged by the user.
Alternatively, in figs. 11 and 12, the user can click "reselect" in the left area to reselect the custom animated character (i.e., the first object), or click "reselect" in the right area to reselect the animated film to be produced (i.e., the first video).
Optionally, in step 140, when the replacement is completed, the interface shown in fig. 13 pops up to prompt the user that a character has been replaced, and the user can continue replacing other characters.
In this embodiment, for the first video selected by the user, all the replaceable objects in the first video may be listed, and each object is associated with all the frame images in which it appears. In this way, the user can select any object; the video processing software then acquires, based on the selected object, all the associated frame images that include it, performs the object replacement on each acquired frame image, and finally presents the animated film made by the user.
In the flow of the video processing method according to another embodiment of the present application, the second input includes a third sub-input and a fourth sub-input, and step 120 includes:
substep B1: a third sub-input to the first video is received.
Substep B2: at least one set of images in the first video is acquired in response to the third sub-input.
The third sub-input is for selecting the first video.
In this embodiment, after the user selects the first video, all the frame images in the first video may be divided into at least one group, so that the user may select one group of images to replace the second object in the group of images.
Optionally, upon selection of the first video, a plurality of controls are displayed, each control for indicating a set of images.
Substep B3: a fourth sub-input to the target group of images is received.
Substep B4: in response to the fourth sub-input, a target frame image including the second object in the target group image is acquired.
The fourth sub-input is for selecting any one of the sets of images as the target set of images.
Illustratively, in this step, after the user selects the target group image, each frame image included in the target group image is identified, and all frame images including the second object are identified as the target frame images in the present embodiment.
For example, referring to the application scenario of the foregoing embodiment, after the user selects the target group of images, object recognition is performed on each frame image in the group, and detections sharing the same features are treated as one object. For each object recognized in the target group, a control is displayed; each control indicates one object and is associated with the images in the target group that include that object. The user selects the second object among them, and the software acquires the images in the target group that include that object, that is, the target frame images.
In the previous embodiment, the user replaces the second object with the first object uniformly in all frames of the animated film. In this embodiment, by contrast, the user may replace the second object with the first object only in part of the film, which meets the user's personalized needs when creating the animation.
In the flow of the video processing method according to another embodiment of the present application, step B2 includes any one of the following:
substep C1: at least one group of images in the first video are obtained based on scene information in the first video.
Wherein a group of images corresponds to a scene information.
Optionally, the scene information includes three parts of an object, a background, and a time period.
In this embodiment, the frame images in the first video are divided into at least one group according to the scene information corresponding to each frame image. Within each group, every frame image has the same scene information, that is, each group corresponds to one piece of scene information.
For example, the frame images in the first video are first obtained and sorted in playing order. The frame images are then divided into at least one group on the basis of this order, which ensures that the frame images in one group are consecutive in time. During grouping, the background and the objects in each frame image are identified; for example, if the background and the objects remain unchanged over a run of consecutive frame images, those consecutive frame images are placed in one group.
For example, suppose the first video contains two scenes: in the first 5 seconds, 5 children are playing football on a playground; in the last 5 seconds, the same 5 children are in class in a classroom. All frame images in the first video are then divided into two groups, one group per scene, and the user can operate on all frame images of either scene to replace a certain object in that scene.
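For illustration only (the scene descriptor below is an assumed placeholder for whatever background and object recognition the software uses), consecutive frames can be grouped for as long as their scene signature stays unchanged:

```python
def group_frames_by_scene(frames, scene_signature):
    """Split frame images (in playing order) into groups of consecutive
    frames that share the same scene information.

    scene_signature : assumed callable mapping a frame to a comparable
                      (background, objects) description of its scene
    """
    groups = []
    current, current_sig = [], None
    for frame in frames:
        sig = scene_signature(frame)
        if current and sig != current_sig:
            groups.append(current)  # scene changed: close the previous group
            current = []
        current.append(frame)
        current_sig = sig
    if current:
        groups.append(current)
    return groups
```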
Substep C2: at least one group of images in the first video are obtained based on the playing period information in the first video.
Wherein, a group of images corresponds to a playing period information.
In this embodiment, the frame images in the first video are divided into at least one group according to the playing period corresponding to each frame image. Within each group, every frame image has the same playing period, that is, each group corresponds to one playing period.
For example, the frame images in the first video are first obtained and sorted in playing order. The frame images are then divided into at least one group on the basis of this order, which ensures that the frame images in one group fall within one time interval. Optionally, one group may be formed for every 5 seconds of playing time; the 5-second interval can be adjusted to other durations according to the playing duration of the first video.
For example, if the playing time of the first video is 1 minute, all frame images in the first video are divided into 12 groups: starting from 0 minutes 0 seconds, the frame images within each consecutive 5-second interval form one group, so that the user can operate separately on all frame images within any 5-second interval to replace the second object with the first object in those frame images.
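A correspondingly simple sketch of the period-based grouping, assuming a known frame rate and the 5-second interval used in the example above (both are illustrative assumptions):

```python
def group_frames_by_period(frames, fps, period_seconds=5):
    """Split frame images (in playing order) into groups, each covering one
    consecutive playing period of `period_seconds`."""
    per_group = max(1, int(round(fps * period_seconds)))
    return [frames[i:i + per_group] for i in range(0, len(frames), per_group)]
```

For a 1-minute video at 25 frames per second, this yields 12 groups of 125 frames each.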
This embodiment provides two schemes for dividing all frame images of the first video into groups. In an application scenario, options corresponding to the two schemes can be offered for the user to choose from, so that the user can select a suitable scheme according to the production needs of the animated film, either dividing the film by scene or customizing the animated characters in different time periods.
In the flow of the video processing method according to another embodiment of the present application, before step 140, the method further includes:
step D1: a third input of target feature information for a target object is received, the target object comprising either the first object or any object in the first video.
Step D2: in response to a third input, target feature information of the target object is adjusted.
The third input includes touch input performed by the user on the screen, such as, but not limited to, clicking, sliding, and dragging; the third input may also be an air (contactless) input, a gesture action, or a facial action of the user, and may further include input on a physical key of the device, such as, but not limited to, a press. Moreover, the third input may consist of one or more inputs, where multiple inputs may be continuous or intermittent.
The third input is used to select target feature information for the first object that has replaced the second object in the first video, or for another object in the first video.
The target characteristic information comprises at least one of face characteristic information, shape characteristic information and color characteristic information.
In one scenario, after the selection of the first object and the second object for replacement is completed, the interface shown in fig. 14 is displayed, in which the first icon corresponds to the first object; the user clicks this icon to enter the interface shown in fig. 15. In that interface, the first object is displayed in the left area, and options such as "color", "shape drag tool", "emotion selection", and "character selection" are displayed in the right area. The user clicks any option to enter its setting interface, and can complete the related settings of the first object by using the setting options on the right together with the preview effect in the left area.
Optionally, the target feature information includes facial feature information.
For example, the user clicks on the "shape drag tool" so that within the left region, the face region of the first object may be edited, and the user may drag the face shape, eye shape, etc. of the first object to adjust to the desired facial effect.
As another example, the user clicks either "emotion selection" or "character selection" to display emotion-related or character-related options in the right area. The user may then select a certain emotion or character, and the facial features of the first object are adjusted to the corresponding emotion or character state. Taking the "happy" emotion as an example, after the user selects it, the corners of the mouth of the first object are adjusted upward.
Optionally, the target feature information includes shape feature information.
For example, the user clicks on the "shape drag tool" so that in the left area, the overall outline of the first object can be edited and the user can drag the body shape of the first object to adjust to the desired stature.
As another example, the user clicks on a "shape pull tool" so that, in the left region, the outline of various parts of the first object can be edited and the user can pull on the shape of various parts of the first object's body (legs, etc.) to adjust to the desired stature.
As another example, the user clicks on a "shape drag tool" so that within the left area, the overall outline of the first object may be edited and the user may drag the outline shape of the first object to adjust to the desired shape effect.
Optionally, the target characteristic information includes color characteristic information.
For example, the user clicks "color" so that in the left area, the color of each part of the first object can be edited, the right area displays various colors, and the user can select a corresponding color for a certain part, and finally the desired display effect is achieved. For example, the user can set the color of clothes, hair, and the like to be worn.
The adjustment scheme for the target feature information of any object provided by this embodiment may not only implement dynamic feature adjustment for a certain object, but also implement static feature adjustment for a certain object.
For example, as a static feature adjustment, the overall height of the first object after replacement may be changed. As a dynamic feature adjustment, the emotion of the first object after replacement can be changed; the adjusted emotion is reflected in the facial features, and, combined with the facial-feature changes of the pre-replacement second object in the video (such as the mouth opening and closing), the emotion of the first object is expressed through those changing facial features, which achieves the purpose of dynamic feature adjustment.
For another example, in the dynamic feature adjustment, the first object may be set to have different emotions in different scenes, so that when the scenes are switched in the video, the emotion change of the first object can be reflected, and the purpose of adjusting the dynamic feature is achieved.
In further embodiments, the target characteristic information may also include more content.
Referring to fig. 15, in this step, after completing the settings, the user clicks "save", so that the first object in the first video is adjusted and the target feature information is applied to it.
In yet another scenario, the user may also set target feature information for other objects in the first video besides the first object.
In fig. 14, the icons of all objects in the first video are displayed, and the user can click the icon of any object to set its target feature information. Further, after all settings are completed, the user clicks "composition" in fig. 14 and enters the interface shown in fig. 16, where the user can preview the final effect of the first video and save it after previewing, or save it directly.
This embodiment provides a scheme for user-defined setting of the target feature information of any object, which increases the fun of making an animation without requiring high production cost, and thus gives the user a richer production experience.
In further embodiments of the application, the user can also perform the replacement operation directly on a currently displayed object while watching the first video and set the feature information of the replacing object, which makes the operation even simpler.
It should be noted that the control in the present application serves as a carrier for displaying information, including but not limited to text marks, symbol marks, and image marks.
In summary, the present application provides a video processing method suitable for making an animated film. In the prior art, producing an animated film consumes great effort and time, requires strong expertise, and has a high production cost; by contrast, the method and device of the present application can complete simple animation production at low cost.
Based on this application, a user can turn an image he or she has drawn or photographed into part of an animated film, thereby completing the simple production of an animated film without taking part in the other tedious production steps; the operation is simple and the production cost is low.
In the video processing method of the present application, a user may draw a desired animation element on a piece of paper, scan it with an electronic device or another scanning tool, upload the drawing directly into a given animation framework, and, after selecting the animation element to be replaced, replace it directly with the hand-drawn element. Further, after the replacement, the new element intelligently changes its expressions, actions, and so on according to the element it replaced in the animation framework. In this way, the user can replace hand-drawn animation elements into the animated film one by one, and change the color theme, body shape, expressions, and mood of the elements in the film.
Therefore, based on the application, children and animation fans can easily make simple animated films of their own; the application can entertain adults and relieve stress, and serves as another fun way of making short videos; it can arouse children's interest in animation production and drawing; and it can be used to make animations for learning (such as foreign-language learning), improving children's interest in studying and combining animation with children's education to form a preschool education tool.
It should be noted that, in the video processing method provided in the embodiment of the present application, the execution subject may be a video processing apparatus, or a control module in the video processing apparatus for executing the video processing method. In the embodiment of the present application, a video processing apparatus executing a video processing method is taken as an example, and the video processing apparatus provided in the embodiment of the present application is described.
Fig. 17 shows a block diagram of a video processing apparatus according to another embodiment of the present application, the apparatus including:
a determining module 10 for determining a first object in the first image based on the first input;
an obtaining module 20, configured to obtain, based on a second input, a target frame image including a second object in the first video;
a first adjusting module 30, configured to adjust a first posture of the first object to match a second posture of the second object in the target frame image;
and a replacing module 40, configured to replace the second object in the target frame image with the first object.
Thus, in the embodiments of the application, when making an animated film, the user can select a first image so that the first object in the first image serves as a custom animated character; further, the user can select a first video as the animation to be produced, and then select from it a second object as the animated character to be replaced. Based on the user's selections, the target frame images that include the second object are extracted from the animation; for each such frame image, the second pose of the second object is identified, the first pose of the first object is adjusted according to that second pose so that the first pose matches the second pose, and finally the second object in that frame image is replaced with the first object. The same processing is repeated until the second object has been replaced in all target frame images. Therefore, in the embodiments of the application, on the basis of fully reusing existing animated films, a user can replace any animated character in any animated film with a custom animated character through simple operations, thereby quickly completing the simple production of an animated film at low production cost.
Optionally, the obtaining module 20 includes:
a first receiving unit for receiving a first sub-input to a first video;
a first response unit for identifying at least one object in the first video in response to the first sub-input;
a second receiving unit for receiving a second sub-input to a second object, at least one object including the second object;
and the second response unit is used for responding to the second sub-input and acquiring a target frame image including a second object in the first video.
Optionally, the obtaining module 20 includes:
a third receiving unit for receiving a third sub-input to the first video;
a third response unit, configured to, in response to the third sub-input, obtain at least one group of images in the first video;
a fourth receiving unit for receiving a fourth sub-input to the target group image;
and a fourth response unit for acquiring a target frame image including the second object in the target group image in response to the fourth sub-input.
Optionally, the third response unit comprises any one of:
the first acquiring subunit is used for acquiring at least one group of images in the first video based on the scene information in the first video, wherein one group of images corresponds to one piece of scene information;
and the second acquisition subunit is used for acquiring at least one group of images in the first video based on the playing period information in the first video, wherein one group of images corresponds to one piece of playing period information.
Optionally, the apparatus further comprises:
the receiving module is used for receiving a third input of target feature information of a target object, wherein the target object includes the first object or any object in the first video;
a second adjusting module, configured to adjust target feature information of the target object in response to a third input;
the target characteristic information comprises at least one of face characteristic information, shape characteristic information and color characteristic information.
The video processing apparatus in the embodiment of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The video processing apparatus in the embodiments of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The video processing apparatus provided in the embodiment of the present application can implement each process implemented by the foregoing method embodiment, and is not described here again to avoid repetition.
Optionally, as shown in fig. 18, an electronic device 100 is further provided in this embodiment of the present application, and includes a processor 101, a memory 102, and a program or an instruction stored in the memory 102 and executable on the processor 101, where the program or the instruction is executed by the processor 101 to implement each process of any one of the above embodiments of the video processing method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 19 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and a processor 1010.
Those skilled in the art will appreciate that the electronic device 1000 may further comprise a power source (e.g., a battery) for supplying power to various components, and the power source may be logically connected to the processor 1010 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 19 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than those shown, or combine some components, or arrange different components, and thus, the description thereof is omitted.
Wherein the processor 1010 is configured to determine a first object in the first image based on the first input; acquiring a target frame image including a second object in the first video based on the second input; adjusting a first pose of the first object to match a second pose of the second object in the target frame image; replacing the second object in the target frame image with the first object.
Thus, in the embodiments of the application, when making an animated film, the user can select a first image so that the first object in the first image serves as a custom animated character; further, the user can select a first video as the animation to be produced, and then select from it a second object as the animated character to be replaced. Based on the user's selections, the target frame images that include the second object are extracted from the animation; for each such frame image, the second pose of the second object is identified, the first pose of the first object is adjusted according to that second pose so that the first pose matches the second pose, and finally the second object in that frame image is replaced with the first object. The same processing is repeated until the second object has been replaced in all target frame images. Therefore, in the embodiments of the application, on the basis of fully reusing existing animated films, a user can replace any animated character in any animated film with a custom animated character through simple operations, thereby quickly completing the simple production of an animated film at low production cost.
Optionally, the processor 1010 is further configured to control the user input unit 1007 to receive a first sub-input for the first video; identify at least one object in the first video in response to the first sub-input; control the user input unit 1007 to receive a second sub-input for the second object, the at least one object including the second object; and, in response to the second sub-input, acquire a target frame image including the second object in the first video.
Optionally, the processor 1010 is further configured to control the user input unit 1007 to receive a third sub-input for the first video; acquire at least one group of images in the first video in response to the third sub-input; control the user input unit 1007 to receive a fourth sub-input for the target group of images; and, in response to the fourth sub-input, acquire a target frame image including the second object in the target group of images.
Optionally, the processor 1010 is further configured to acquire at least one group of images in the first video based on the scene information in the first video, where one group of images corresponds to one piece of scene information; or acquire at least one group of images in the first video based on the playing period information in the first video, where one group of images corresponds to one piece of playing period information.
Optionally, the processor 1010 is further configured to control the user input unit 1007 to receive a third input of target feature information of a target object, where the target object includes the first object or any object in the first video; and adjust the target feature information of the target object in response to the third input; wherein the target feature information includes at least one of face feature information, shape feature information, and color feature information.
In summary, the present application provides a video processing method suitable for making an animated film. In the prior art, producing an animated film consumes great effort and time, requires strong expertise, and has a high production cost; by contrast, the method and device of the present application can complete simple animation production at low cost.
Based on this application, a user can turn an image he or she has drawn or photographed into part of an animated film, thereby completing the simple production of an animated film without taking part in the other tedious production steps; the operation is simple and the production cost is low.
In the video processing method of the present application, a user may draw a desired animation element on a piece of paper, scan it with an electronic device or another scanning tool, upload the drawing directly into a given animation framework, and, after selecting the animation element to be replaced, replace it directly with the hand-drawn element. Further, after the replacement, the new element intelligently changes its expressions, actions, and so on according to the element it replaced in the animation framework. In this way, the user can replace hand-drawn animation elements into the animated film one by one, and change the color theme, body shape, expressions, and mood of the elements in the film.
Therefore, based on the application, children and animation fans can easily make simple animated films of their own; the application can entertain adults and relieve stress, and serves as another fun way of making short videos; it can arouse children's interest in animation production and drawing; and it can be used to make animations for learning (such as foreign-language learning), improving children's interest in studying and combining animation with children's education to form a preschool education tool.
It should be understood that, in the embodiments of the present application, the input unit 1004 may include a graphics processing unit (GPU) 10041 and a microphone 10042, and the graphics processing unit 10041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1006 may include a display panel 10061, and the display panel 10061 may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1007 includes a touch panel 10071 and other input devices 10072. The touch panel 10071 is also referred to as a touch screen. The touch panel 10071 may include two parts, a touch detection device and a touch controller. Other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein. The memory 1009 may be used to store software programs as well as various data, including but not limited to applications and an operating system. The processor 1010 may integrate an application processor, which primarily handles the operating system, user interfaces, applications, and the like, and a modem processor, which primarily handles wireless communication. It can be appreciated that the modem processor may not be integrated into the processor 1010.
The embodiments of the present application further provide a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the video processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above video processing method embodiment, and can achieve the same technical effect, and the details are not repeated here to avoid repetition.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method of video processing, the method comprising:
determining a first object in the first image based on the first input;
acquiring a target frame image including a second object in the first video based on the second input;
adjusting a first pose of the first object to match a second pose of the second object in the target frame image;
replacing the second object in the target frame image with the first object.
2. The method of claim 1, wherein obtaining the target frame image including the second object in the first video based on the second input comprises:
receiving a first sub-input to the first video;
identifying at least one object in the first video in response to the first sub-input;
receiving a second sub-input to the second object, the at least one object including the second object;
in response to the second sub-input, acquiring the target frame image including the second object in the first video.
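As a minimal, non-limiting sketch of this frame-acquisition flow, the helper below scans the first video and collects the frames in which a supplied detector reports the second object. The detector callable (for example, the hypothetical find_second_object helper sketched earlier) and the choice to sample every tenth frame are assumptions made only for illustration.

```python
# Illustrative sketch only; the detector is supplied by the caller.
import cv2

def acquire_target_frames(video_path, detect_second_object, step=10):
    """Scan every `step`-th frame of the video and collect (frame_index, frame)
    pairs in which `detect_second_object(frame)` returns a bounding box."""
    cap = cv2.VideoCapture(video_path)
    target_frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0 and detect_second_object(frame) is not None:
            target_frames.append((index, frame))
        index += 1
    cap.release()
    return target_frames
```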
3. The method of claim 1, wherein acquiring the target frame image including the second object in the first video based on the second input comprises:
receiving a third sub-input to the first video;
acquiring at least one group of images in the first video in response to the third sub-input;
receiving a fourth sub-input to a target group of images in the at least one group of images;
in response to the fourth sub-input, acquiring the target frame image including the second object in the target group of images.
4. The method of claim 3, wherein the acquiring at least one group of images in the first video comprises any one of the following:
acquiring at least one group of images in the first video based on scene information in the first video, wherein each group of images corresponds to one piece of scene information; or
acquiring at least one group of images in the first video based on playing period information of the first video, wherein each group of images corresponds to one piece of playing period information.
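The two grouping alternatives recited above can be sketched as follows. Using a colour-histogram correlation as a stand-in for "scene information" and a fixed five-second playback period are assumptions chosen only for illustration; the sketch collects frame indices rather than the frames themselves.

```python
# Illustrative sketch only; thresholds and the period length are arbitrary assumptions.
import cv2

def group_by_scene(video_path, threshold=0.5):
    """Start a new group of frame indices whenever the colour histogram changes
    sharply, so each group roughly corresponds to one piece of scene information."""
    cap = cv2.VideoCapture(video_path)
    groups, current, prev_hist, index = [], [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        if prev_hist is not None and \
           cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            groups.append(current)
            current = []
        current.append(index)
        prev_hist = hist
        index += 1
    if current:
        groups.append(current)
    cap.release()
    return groups

def group_by_period(video_path, seconds_per_group=5.0):
    """Split the video into groups of frame indices covering equal playback periods."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()
    per_group = max(1, int(fps * seconds_per_group))
    return [list(range(start, min(start + per_group, total)))
            for start in range(0, total, per_group)]
```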
5. The method of claim 1, wherein prior to replacing the second object in the target frame image with the first object, the method further comprises:
receiving a third input of target feature information of a target object, the target object comprising either the first object or any object in the first video;
adjusting the target feature information of the target object in response to the third input;
wherein the target feature information includes at least one of face feature information, shape feature information, and color feature information.
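For the colour branch of the feature adjustment above, one minimal sketch is to shift the hue of the target object's region; the bounding box and the hue offset are assumed to come from the third input, and an HSV hue shift is only one of many possible colour adjustments.

```python
# Illustrative sketch only; the region box and hue offset are assumed user inputs.
import cv2

def adjust_color_feature(image, box, hue_shift=20):
    """Shift the hue of the region box = (x, y, w, h) by `hue_shift`
    (OpenCV stores 8-bit hue in the range [0, 180))."""
    x, y, w, h = box
    region = image[y:y + h, x:x + w]
    hsv = cv2.cvtColor(region, cv2.COLOR_BGR2HSV)
    hsv[:, :, 0] = (hsv[:, :, 0].astype(int) + hue_shift) % 180
    image[y:y + h, x:x + w] = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return image
```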
6. A video processing apparatus, characterized in that the apparatus comprises:
a determination module, configured to determine a first object in a first image based on a first input;
an acquisition module, configured to acquire, based on a second input, a target frame image including a second object in a first video;
a first adjustment module, configured to adjust a first pose of the first object to match a second pose of the second object in the target frame image;
a replacement module, configured to replace the second object in the target frame image with the first object.
7. The apparatus of claim 6, wherein the acquisition module comprises:
a first receiving unit, configured to receive a first sub-input to the first video;
a first response unit, configured to identify, in response to the first sub-input, at least one object in the first video;
a second receiving unit, configured to receive a second sub-input to the second object, the at least one object including the second object;
a second response unit, configured to acquire, in response to the second sub-input, the target frame image including the second object in the first video.
8. The apparatus of claim 6, wherein the acquisition module comprises:
a third receiving unit, configured to receive a third sub-input to the first video;
a third response unit, configured to acquire, in response to the third sub-input, at least one group of images in the first video;
a fourth receiving unit, configured to receive a fourth sub-input to a target group of images in the at least one group of images;
a fourth response unit, configured to acquire, in response to the fourth sub-input, the target frame image including the second object in the target group of images.
9. The apparatus of claim 8, wherein the third response unit comprises any one of:
a first acquisition subunit, configured to acquire at least one group of images in the first video based on scene information in the first video, wherein each group of images corresponds to one piece of scene information; or
a second acquisition subunit, configured to acquire at least one group of images in the first video based on playing period information of the first video, wherein each group of images corresponds to one piece of playing period information.
10. The apparatus of claim 6, further comprising:
a receiving module, configured to receive a third input of target feature information of a target object, the target object being the first object or any object in the first video;
a second adjustment module for adjusting the target feature information of the target object in response to the third input;
wherein the target feature information includes at least one of face feature information, shape feature information, and color feature information.
CN202111092369.6A 2021-09-17 2021-09-17 Video processing method and device Pending CN113794799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111092369.6A CN113794799A (en) 2021-09-17 2021-09-17 Video processing method and device

Publications (1)

Publication Number Publication Date
CN113794799A true CN113794799A (en) 2021-12-14

Family

ID=78878942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111092369.6A Pending CN113794799A (en) 2021-09-17 2021-09-17 Video processing method and device

Country Status (1)

Country Link
CN (1) CN113794799A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114247133A (en) * 2021-12-27 2022-03-29 北京达佳互联信息技术有限公司 Game video synthesis method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020067363A1 (en) * 2000-09-04 2002-06-06 Yasunori Ohto Animation generating method and device, and medium for providing program
CN105118082A (en) * 2015-07-30 2015-12-02 科大讯飞股份有限公司 Personalized video generation method and system
CN108629821A (en) * 2018-04-20 2018-10-09 北京比特智学科技有限公司 Animation producing method and device
CN111298433A (en) * 2020-02-10 2020-06-19 腾讯科技(深圳)有限公司 Animation video processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106227439B Device and method for capturing digitally enhanced images and interacting with them
US20180330152A1 (en) Method for identifying, ordering, and presenting images according to expressions
JP6577044B2 (en) Sharing user-configurable graphic structures
CN111541950B (en) Expression generating method and device, electronic equipment and storage medium
CN106227441A Device and method for capturing digitally enhanced images and interacting with them
CN111986076A (en) Image processing method and device, interactive display device and electronic equipment
WO2022199470A1 (en) Dynamic wallpaper acquisition method and device
CN106844659A (en) A kind of multimedia data processing method and device
WO2023070021A1 (en) Mirror-based augmented reality experience
CN105103111A (en) User interface for computing device
US20140223474A1 (en) Interactive media systems
CN106034206A (en) Electronic equipment and image display method
CN112839190B (en) Method for synchronously recording or live broadcasting virtual image and real scene
CN112672061B (en) Video shooting method and device, electronic equipment and medium
CN103426194A (en) Manufacturing method for full animation expression
WO2023197780A1 (en) Image processing method and apparatus, electronic device, and storage medium
CN111722775A (en) Image processing method, device, equipment and readable storage medium
CN113986407A (en) Cover generation method and device and computer storage medium
CN113794799A (en) Video processing method and device
CN108965101A (en) Conversation message processing method, device, storage medium and computer equipment
CN114020394A (en) Image display method and device and electronic equipment
US11868676B2 (en) Augmenting image content with sound
CN112862558B (en) Method and system for generating product detail page and data processing method
CN107666572A (en) Shooting method, shooting device, electronic equipment and storage medium
CN112083863A (en) Image processing method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211214)