WO2019100757A1 - Video generation method and device, and electronic apparatus - Google Patents


Info

Publication number
WO2019100757A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
human body
action
motion
audio
Prior art date
Application number
PCT/CN2018/098602
Other languages
French (fr)
Chinese (zh)
Inventor
张晨曦
李震
杨鹏博
戴硕
李鹤
黄怡青
Original Assignee
乐蜜有限公司
张晨曦
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 乐蜜有限公司 and 张晨曦
Publication of WO2019100757A1

Classifications

    • H04N21/43074 — Synchronising the rendering of additional data with content streams on the same device, e.g. of EPG data or an interactive icon with a TV program
    • H04N21/4394 — Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N21/44004 — Processing of video elementary streams involving video buffer management, e.g. video decoder buffer or video display buffer
    • H04N21/44218 — Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H04N21/47217 — End-user interface for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H04N21/4781 — Games
    • H04N21/8456 — Structuring of content, e.g. by decomposing the content in the time domain into time segments

Definitions

  • the present application relates to the field of mobile terminal technologies, and in particular, to a video generation method, apparatus, and electronic device.
  • Somatosensory dance games achieve human-computer interaction through an Internet operating platform.
  • the user makes the corresponding body movements according to the prompts of the somatosensory dancing device, so that the user can achieve a fitness function while enjoying the somatosensory interaction experience of dancing.
  • the somatosensory dance game is mainly applied to fixed devices, such as a somatosensory dance machine or a computer, so its portability is poor.
  • the judgment of the user's body movement is made only by determining whether the user steps on the foot arrows correctly, so the manner of dancing is relatively simple.
  • the user's participation is low because the game process cannot be recorded.
  • the present application aims to solve at least one of the technical problems in the related art to some extent.
  • the first object of the present application is to provide a video generation method. Since the standard action is a human body action that the user needs to make, the dance actions can be effectively enriched compared with the prior-art foot-arrow dance mode, enhancing the user experience. In addition, action evaluation information is generated according to the degree of difference between the standard action and the human body action at the same time node, which enables the user to know in time whether the action made is standard, further enhancing the user experience.
  • the user can play back or share the video, thereby enhancing the user's sense of participation, and solving the prior-art problem that the somatosensory dance game is mainly applied to fixed devices, such as a somatosensory dance machine or a computer, with poor portability.
  • it also solves the problem that the user's body movement is judged only by whether the user steps on the foot arrows correctly, so that the manner of dancing is relatively simple.
  • the user's sense of participation is low due to the inability to record the game process.
  • a second object of the present application is to propose a video generating apparatus.
  • a third object of the present application is to propose an electronic device.
  • a fourth object of the present application is to propose a non-transitory computer readable storage medium.
  • a fifth object of the present application is to propose a computer program product.
  • the first aspect of the present application provides a video generating method, including:
  • a target video is generated based on the audio, each video frame, and the motion evaluation information of each human body motion.
  • before the playing of the audio and the synchronous acquisition of the video images, the method further includes:
  • the generating of the target video according to the audio, each video frame, and the motion evaluation information of each human motion includes:
  • the target video is generated based on the audio and the video frames after adding the motion evaluation information.
  • the total evaluation information is generated according to the action evaluation information of each human body motion
  • the total evaluation information is displayed.
  • the result display interface further includes: a review control, a shooting control, and a sharing control;
  • the shooting interface is displayed to regenerate the target video
  • the target video is shared when a triggering operation for the sharing control is detected.
  • the sharing of the target video includes:
  • the sharing interface includes a self-owned platform sharing control and a third-party platform sharing control
  • a video aggregation page is displayed; the video aggregation page includes the target video and/or a video that has been shared on the own platform.
  • before the acquiring of the selected audio, the method further includes:
  • the song selection interface is displayed when an operation for the shooting control is detected.
  • the recognizing of the human body motion in the video frame that is synchronously collected at the time node includes:
  • the human body motion is determined according to the actual angle between the line connecting each two adjacent joints and a preset reference direction.
  • the video generating method of the embodiment of the present application obtains the selected audio and the standard actions corresponding to each time node in the audio; plays the audio and collects each video frame during the playing; when the audio is played to each time node, displays the corresponding standard action and identifies the human body motion in the video frame synchronously acquired at that time node; generates action evaluation information according to the degree of difference between the standard action and the human body action at the same time node; and generates a target video according to the audio, each video frame, and the motion evaluation information of each human body motion.
  • the standard action is a human body action that the user needs to make
  • the dance action can be effectively enriched and the user experience can be improved.
  • the action evaluation information of the human body action is generated, which enables the user to know in time whether the human body action made is standard, further enhancing the user experience.
  • the user can play back or share the video, thereby enhancing the user's sense of participation, and solving the prior-art problem that the somatosensory dance game is mainly used on fixed devices, such as a somatosensory dance machine or a computer, with poor portability.
  • the judgment of the user's body movement is made only by determining whether the user steps on the foot arrows correctly, so the manner of dancing is relatively simple.
  • the user's sense of participation is low due to the inability to record the game process.
  • the second aspect of the present application provides a video generating apparatus, including:
  • a selection module for acquiring selected audio and standard actions corresponding to each time node in the audio
  • An acquisition module configured to play the audio, and collect each video frame during the playing of the audio
  • a display module configured to display a corresponding standard action when the audio is played to each time node, and identify the human body motion in the video frame that is synchronously acquired at the time node;
  • An evaluation module configured to generate action evaluation information of the human body action according to a degree of difference between the standard action and the human body action at the same time node;
  • a generating module configured to generate a target video according to the audio, each video frame, and the motion evaluation information of each human body motion.
  • the device further includes:
  • a display determining module configured to display a preparation action, collect a preparation image, and determine that the human body action in the preparation image matches the preparation action, before the playing of the audio and the synchronous acquisition of the video images.
  • the generating module is specifically configured to:
  • the target video is generated based on the audio and the video frames after adding the motion evaluation information.
  • the device further includes:
  • a display generation module configured to: after the action evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body action at the same time node, display the action evaluation information of each human body action on the shooting interface; when the audio play ends, generate total evaluation information according to the action evaluation information of each human body motion; and display the total evaluation information on the result display interface.
  • the result display interface further includes: a review control, a shooting control, and a sharing control; and the display generating module is further configured to:
  • the shooting interface is displayed to regenerate the target video
  • the target video is shared when a triggering operation for the sharing control is detected.
  • the display generating module is specifically configured to:
  • the sharing interface includes a self-owned platform sharing control and a third-party platform sharing control
  • a video aggregation page is displayed; the video aggregation page includes the target video and/or a video that has been shared on the own platform.
  • the device further includes:
  • an interface display module configured to display a song selection interface when an operation for the shooting control is detected, before the obtaining of the selected audio.
  • the display module is specifically configured to:
  • the human body motion is determined according to the actual angle between the line connecting each two adjacent joints and a preset reference direction.
  • the video generating apparatus of the embodiment of the present application acquires the selected audio and the standard actions corresponding to each time node in the audio; plays the audio and collects each video frame during the playing; when the audio is played to each time node, displays the corresponding standard action and identifies the human body motion in the video frame synchronously acquired at that time node; generates action evaluation information according to the degree of difference between the standard action and the human body action at the same time node; and generates a target video according to the audio, each video frame, and the motion evaluation information of each human body motion.
  • the standard action is a human body action that the user needs to make
  • the dance action can be effectively enriched and the user experience can be improved.
  • the action evaluation information of the human body action is generated, which enables the user to know in time whether the human body action made is standard, further enhancing the user experience.
  • the user can play back or share the video to enhance the user's sense of participation, solving the problem that the existing somatosensory dance game is mainly used on fixed devices, such as somatosensory dance machines or computers.
  • an embodiment of the third aspect of the present application provides an electronic device including: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside the space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is used for powering each circuit or device of the electronic device; the memory is used for storing executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to execute the video generating method described in the first aspect of the present application.
  • the fourth aspect of the present application provides a non-transitory computer readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the video generation method of the embodiment of the first aspect of the present application.
  • the fifth aspect of the present application provides a computer program product, where when the instructions in the computer program product are executed by a processor, the video generation method according to the embodiment of the first aspect of the present application is executed.
  • FIG. 1 is a schematic flowchart diagram of a first video generating method according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a second video generating method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart diagram of a third video generating method according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart diagram of a fourth video generating method according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a video generating apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of another video generating apparatus according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application.
  • the existing somatosensory dance game is mainly applied to fixed devices, such as a somatosensory dance machine, a computer, etc., and the portability is poor.
  • the judgment of the user's body movement is made only by determining whether the user steps on the foot arrows correctly, so the manner of dancing is relatively simple.
  • and the technical problem that the user's sense of participation is low due to the inability to record the game process.
  • in the present application, the selected audio and the standard action corresponding to each time node in the audio are acquired; the audio is played, and each video frame is captured during playback; the corresponding standard action is displayed when the audio is played to each time node, and the human motion in the video frame synchronously acquired at that time node is identified; motion evaluation information of the human motion is generated according to the degree of difference between the standard motion and the human motion; and a target video is generated based on the audio, each video frame, and the motion evaluation information of each human motion.
  • the standard action is a human body action that the user needs to make, compared with the dance mode of the user's foot arrow in the prior art, the dance action can be effectively enriched and the user experience can be improved.
  • the action evaluation information of the human body action is generated, which enables the user to know in time whether the human body action made is standard, further enhancing the user experience.
  • the user can play back or share the video, enhancing the user's sense of participation.
  • FIG. 1 is a schematic flowchart diagram of a first video generating method according to an embodiment of the present application.
  • the video generation method can be applied to an application on an electronic device, such as a personal computer (PC), a cloud device, or a mobile device such as a smart phone or a tablet computer.
  • the video generation method includes the following steps:
  • Step 101 Acquire selected audio, and standard actions corresponding to each time node in the audio.
  • for example, a trigger condition for audio selection may be set in the application of the electronic device.
  • the trigger condition may be an audio selection control, and the user may trigger the selection of audio through the audio selection control.
  • the song selection interface can be invoked, and then the user can select any audio from the song selection interface as the selected audio.
  • the application can get the audio selected by the user.
  • alternatively, a shooting control may be set in the application of the electronic device. When the application detects the user's operation for the shooting control, for example when the user clicks the shooting control, the song selection interface can be automatically displayed on the application interface, and then the user can select an audio from the song selection interface according to his own needs.
  • the application can get the audio selected by the user.
  • the audio in the song selection interface may be pre-associated with corresponding standard actions. Specifically, each time node in the audio has a corresponding standard action. Therefore, after the application obtains the selected audio, it can obtain the standard actions corresponding to each time node from the audio.
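As a concrete illustration, the pre-imported standard actions can be modelled as a lookup from time nodes to action identifiers. This is a minimal sketch; the node values and action names below are invented for demonstration and do not come from the application itself.

```python
# Hypothetical mapping of time nodes (seconds into the audio) to the
# standard actions pre-imported with the song; all values are illustrative.
standard_actions = {
    4.0: "raise_both_arms",
    8.0: "left_arm_horizontal",
    12.0: "squat",
}

def action_for_node(time_node):
    """Return the standard action registered for the given time node, if any."""
    return standard_actions.get(time_node)

print(action_for_node(8.0))  # left_arm_horizontal
```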
  • Step 102 Play the audio, and collect each video frame during the playing of the audio.
  • the electronic device can play the audio according to the user's operation. For example, when the electronic device detects that the user clicks the audio, it can play the audio and simultaneously turn on the camera to capture each video frame.
  • Step 103 When the audio is played to each time node, display the corresponding standard action.
  • a preset advance time can be set, so that the corresponding standard action is displayed before each time node of the audio is played.
  • the preset advance time can be set by the user according to his own needs, or the preset advance time can be preset by the built-in program of the electronic device, which is not limited. It should be understood that the preset advance time should not be set too long, for example, the preset advance time may be 0.2 s.
  • specifically, the advance time is subtracted from the time node to obtain a difference, the difference is used as the starting time, and the schematic diagram of the standard action is displayed from that starting time.
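The starting-time computation above amounts to a single subtraction; a minimal sketch, assuming the 0.2 s example advance time (clamping at zero for very early nodes is an added assumption):

```python
PRESET_ADVANCE_S = 0.2  # example value given in the description

def display_start_time(time_node_s, advance_s=PRESET_ADVANCE_S):
    """The schematic diagram of the standard action is displayed from
    (time node - preset advance time)."""
    return max(time_node_s - advance_s, 0.0)  # clamp at 0: assumed behaviour
```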
  • a schematic diagram of a standard motion may be displayed in any area of the shooting interface.
  • the position of the schematic diagram of the standard motion may be fixed, or the schematic diagram may move along a preset trajectory, which is not limited.
  • the preset track may be preset for the built-in program of the electronic device.
  • the user can watch the standard action.
  • a semi-transparent mask can be displayed on the shooting interface, wherein the mask has a hollowed-out area of interest, and an image showing the standard action is displayed in that area, i.e., a schematic diagram of the standard action is shown in the area of interest.
  • the corresponding standard action can be displayed in the form of a barrage on the shooting interface, which is not limited.
  • when the schematic diagram of the standard motion moves along the preset trajectory, the photographing interface displays the schematic diagram, and the schematic diagram can be controlled to move along the preset trajectory.
  • Step 104 Identify the human body motion in the video frame that is synchronously acquired at the time node.
  • the camera for collecting video frames may be a camera capable of collecting the user's depth information, and the human body motion in the video frame can be identified from the acquired depth information.
  • for example, the camera may be an RGB-Depth (RGBD) camera, which can acquire the depth information of the human body in the video frame while imaging, so that the human body motion in the video frame can be identified according to the depth information.
  • alternatively, the depth information of the body motion can be acquired by structured light or a time-of-flight (TOF) lens, so that the human body motion in the video frame can be identified according to the depth information, which is not limited.
  • specifically, the joints of the human body in the video frame can be identified.
  • the face information in the video frame and the position information of the face can be recognized by face recognition technology, and then the position information of each joint of the human body can be calculated according to the proportional relationships between the limbs and the height in human anatomy.
  • the position information of each joint of the human body in the video picture frame can also be determined by other algorithms, which is not limited.
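The proportional calculation can be sketched as follows. The application does not give concrete ratios, so the head-height multiples below are rough anthropometric assumptions purely for illustration:

```python
def estimate_joints(face_top_y, face_height):
    """Roughly place joint heights (image y grows downward) below a detected
    face box, using assumed limb/height proportions."""
    ratios = {  # multiples of face height below the top of the face (assumed)
        "shoulder": 1.3,
        "elbow": 2.5,
        "hip": 3.8,
        "knee": 5.5,
        "ankle": 7.3,
    }
    return {joint: face_top_y + r * face_height for joint, r in ratios.items()}
```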
  • after that, each two adjacent joints of the human body can be connected to obtain the line between them, and the human motion in the video frame is finally determined according to the actual angle between each such line and the preset reference direction.
  • the preset reference direction may be a horizontal direction or a vertical direction.
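One way to compute the actual angle between a joint-to-joint line and a horizontal reference direction is with `atan2`; this is a sketch, and folding the result into [0, 180) is an assumption, since the application does not fix an angle convention:

```python
import math

def segment_angle(joint_a, joint_b):
    """Angle in degrees between the line joining two adjacent joints and a
    horizontal reference direction, folded into [0, 180)."""
    dx = joint_b[0] - joint_a[0]
    dy = joint_b[1] - joint_a[1]
    return math.degrees(math.atan2(dy, dx)) % 180.0
```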
  • Step 105 Generate action evaluation information of the human body action according to the degree of difference between the standard action and the human body action at the same time node.
  • the action evaluation information of the human body action includes a human action score, which is used to indicate the degree of difference between the human body action and the corresponding standard action. Specifically, a higher score indicates a smaller difference between the human body action and the corresponding standard action, and a lower score indicates a greater difference.
  • whether the human body motion and the standard motion match is determined according to whether the degree of difference between them is greater than a difference threshold. Specifically, it is possible to determine the standard angle between the line connecting each two adjacent joints and the reference direction when the standard action is performed, and to compare, for each two adjacent joints, the difference between the corresponding standard angle and the actual angle.
  • if the human motion in the video frame does not match the standard motion, it indicates that the difference between the human body motion made by the user and the corresponding standard motion is large, and the score of the human motion is set to 0.
  • when the human body motion in the video frame matches the standard motion, the difference between the human body motion made by the user and the corresponding standard motion is small; at this time, for the line between each two adjacent joints, a scoring coefficient of the line can be determined according to the corresponding difference and error range.
  • the evaluation information of the line may then be generated according to its scoring coefficient and the score corresponding to the line; for example, the evaluation information of the line may be equal to the scoring coefficient multiplied by the score corresponding to the line.
  • finally, the motion evaluation information of the human body motion can be obtained by adding up the evaluation information of the lines between all adjacent joints.
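A hedged sketch of this scoring scheme with assumed numbers: the difference threshold, the per-line base score, and a linear fall-off of the scoring coefficient inside the error range are all illustrative choices, as the application does not specify them.

```python
DIFF_THRESHOLD_DEG = 30.0  # assumed: beyond this, the action is a mismatch
ERROR_RANGE_DEG = 30.0     # assumed error range for the scoring coefficient

def action_score(standard_angles, actual_angles, base_score_per_line=25.0):
    """Score one human action from the per-line angle differences."""
    diffs = [abs(s - a) for s, a in zip(standard_angles, actual_angles)]
    if any(d > DIFF_THRESHOLD_DEG for d in diffs):
        return 0.0                            # mismatch: score is set to 0
    total = 0.0
    for d in diffs:
        coeff = 1.0 - d / ERROR_RANGE_DEG     # scoring coefficient of the line
        total += coeff * base_score_per_line  # coefficient x line score
    return total                              # summed over adjacent-joint lines

# Four lines, all matching the standard angles exactly:
print(action_score([0, 90, 45, 135], [0, 90, 45, 135]))  # 100.0
```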
  • the motion evaluation information of the human body motion may further include an animation effect corresponding to the interval to which the human action score belongs. For example, if the human action score belongs to the interval [90, 100], the animation effect can be "perfect" matched with a diamond flash; for the interval [80, 90), the animation effect can be "good" matched with flowers.
  • for example, if the generated human action score is 94 points, the animation effect generated on the shooting interface is "perfect" matched with the diamond flash.
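The interval-to-effect mapping from the example can be encoded directly; the behaviour below 80 points is not specified in the text and is assumed here:

```python
def animation_effect(score):
    """Map a human action score to the animation effect of its interval."""
    if 90 <= score <= 100:
        return "perfect + diamond flash"   # interval [90, 100]
    if 80 <= score < 90:
        return "good + flowers"            # interval [80, 90)
    return "no effect"                     # below 80: assumed behaviour

print(animation_effect(94))  # perfect + diamond flash
```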
  • Step 106 Generate a target video according to audio, each video frame frame, and motion evaluation information of each human body motion.
  • specifically, the action evaluation information of the human body actions corresponding to different time nodes may be acquired, and then the target video is generated according to the audio, the acquired video frames, and the corresponding motion evaluation information.
  • for example, the motion evaluation information corresponding to the human body motion may be added to each video frame, and then the target video is generated according to the audio and the video frames to which the motion evaluation information has been added.
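Step 106 can be sketched with frames and audio modelled as plain data structures; a real implementation would render the overlays and mux the audio with a media framework, and all names below are assumptions:

```python
def generate_target_video(audio, frames, evaluations_by_node):
    """Attach each frame's action evaluation info as an overlay, then pair
    the annotated frames with the audio track."""
    annotated = []
    for frame in frames:
        info = evaluations_by_node.get(frame["time_node"])  # may be None
        annotated.append({**frame, "overlay": info})
    return {"audio": audio, "frames": annotated}

video = generate_target_video(
    "song.mp3",
    [{"time_node": 4.0}, {"time_node": 8.0}],
    {4.0: "perfect", 8.0: "good"},
)
print(video["frames"][0]["overlay"])  # perfect
```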
  • the video generating method of this embodiment obtains the selected audio and the standard actions corresponding to each time node in the audio; plays the audio and collects each video frame during the playing; when the audio is played to each time node, displays the corresponding standard action and identifies the human body motion in the video frame synchronously acquired at that time node; generates action evaluation information according to the degree of difference between the standard action and the human body action at the same time node; and generates a target video according to the audio, each video frame, and the motion evaluation information of each human body motion.
  • since the standard action is a human body action that the user needs to make, compared with the prior-art dance mode in which the user steps on arrows, the dance actions can be effectively enriched and the user experience improved.
  • in addition, generating the action evaluation information of the human body motion enables the user to know in time whether the human body actions made are standard, further improving the user experience.
  • finally, because a video is generated, the user can play back or share the video, enhancing the user's sense of participation.
  • to prevent the user from inadvertently triggering the shooting control of the electronic device, which would cause the camera to accidentally acquire images, or to prevent the camera from collecting images before the user is ready, which would result in invalid input, a preparation stage may be entered in advance. The above process will be described in detail below with reference to FIG. 2.
  • FIG. 2 is a schematic flowchart diagram of a second video generating method according to an embodiment of the present application.
  • the video generation method includes the following steps:
  • Step 201: The preparation action is displayed, and a preparation image is collected.
  • the preparation action may be displayed on a preparation interface and may be preset by a built-in program of the electronic device; the preparation action may be, for example, an action made with both hands, or another action, which is not limited herein.
  • the camera of the electronic device can capture the preparation image, wherein the preparation image includes the human body action made by the user.
  • the preparation action may be displayed in any area of the preparation interface, and the preparation action may be fixed in a preset time period, or the preparation action may be moved along a preset track, which is not limited.
  • the preset track may be preset by a built-in program of the electronic device.
  • the user can watch the preparation action.
  • a semi-transparent mask may be displayed on the preparation interface, wherein the mask has a hollowed-out region of interest, and an image indicating the preparation action is displayed in the region of interest; that is, a schematic diagram of the preparation action is shown in the region of interest.
  • the preparation action may be displayed in the form of a barrage in the preparation interface, which is not limited. Thereby, the user can view other content while watching the preparation action, thereby improving the user experience.
  • Step 202 Determine that the human body action in the preparation image matches the preparation action.
  • the human body motion in the preparation image can be identified, and then it is determined whether the human body motion in the preparation image matches the preparation motion, and when the human body motion in the preparation image is determined to match the preparation motion, the image acquisition can be started.
  • the camera for collecting the preparation image may be a camera capable of collecting depth information of the user, and the human body motion in the preparation image may be identified according to the acquired depth information.
  • for example, the camera may be a depth camera, and the depth information of the human body in the preparation image may be acquired while imaging, so that the human body motion in the preparation image can be identified according to the depth information.
  • alternatively, the depth information of the human body motion can be acquired through structured light or a time-of-flight (TOF) lens, so that the human body motion in the preparation image can be identified according to the depth information, which is not limited herein.
  • specifically, each joint of the human body in the preparation image can be identified; then, adjacent joints among the joints of the human body are connected to obtain a line between each adjacent two joints; finally, the human body motion in the image is determined according to the actual angle between the line connecting the adjacent two joints and a preset reference direction.
  • whether the human body motion matches the preparation motion can be determined according to whether the degree of difference between the human body motion and the preparation motion is greater than a difference threshold. Specifically, a standard angle between the line connecting each adjacent two joints and the reference direction when performing the preparation action may be determined, and, for each adjacent two joints, the difference between the corresponding standard angle and the actual angle is calculated. When the difference calculated for the line between each adjacent two joints is within an error range, it can be determined that the human body motion in the preparation image matches the preparation action; when the difference calculated for the line between at least one pair of adjacent joints is not within the error range, it can be determined that the human body motion in the preparation image does not match the preparation action.
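The angle-comparison matching just described can be sketched as follows. The skeleton, joint coordinates, standard angles, and tolerance below are hypothetical illustrations; only the technique (comparing the angle of each line between adjacent joints against a standard angle within an error range) is taken from the text:

```python
import math

def line_angle(joint_a, joint_b, reference=(1.0, 0.0)):
    """Angle in degrees between the line joining two adjacent joints
    and a preset reference direction (here the horizontal axis)."""
    dx, dy = joint_b[0] - joint_a[0], joint_b[1] - joint_a[1]
    ang = math.degrees(math.atan2(dy, dx) - math.atan2(reference[1], reference[0]))
    return abs(ang)

def matches(actual_joints, standard_angles, bones, tolerance=10.0):
    """True when, for every pair of adjacent joints (a 'bone'), the
    difference between the actual and the standard angle falls within
    the error range; otherwise the action does not match."""
    for bone in bones:
        a, b = bone
        actual = line_angle(actual_joints[a], actual_joints[b])
        if abs(actual - standard_angles[bone]) > tolerance:
            return False
    return True

# Hypothetical two-bone arm: shoulder-elbow and elbow-wrist.
joints = {"shoulder": (0, 0), "elbow": (1, 0.05), "wrist": (1.5, 1.0)}
bones = [("shoulder", "elbow"), ("elbow", "wrist")]
standard = {("shoulder", "elbow"): 0.0, ("elbow", "wrist"): 60.0}
print(matches(joints, standard, bones))
```

A single bone outside the tolerance is enough to reject the match, mirroring the "at least one pair of adjacent joints" condition above.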
  • the video generation method of this embodiment enters a preparation stage in advance, before image acquisition by the electronic device. Specifically, the preparation action is displayed and the preparation image is collected; image acquisition is started only after the human body action in the preparation image is determined to match the preparation action. This prevents the user from inadvertently triggering the shooting control of the electronic device, which would cause the camera to accidentally acquire images, and avoids the camera performing image acquisition before the user is ready, which would result in invalid images, thereby ensuring the validity and accuracy of subsequent image acquisition.
  • the video generation method may further include the following steps:
  • Step 301 Display motion evaluation information of each human body motion on a shooting interface for collecting each video frame.
  • there may be multiple synchronously collected video picture frames, and each video picture frame has corresponding motion evaluation information. The motion evaluation information of the human body action is added to the synchronously collected video picture frames; that is, the action evaluation information of each human body action is displayed on the shooting interface used to collect each video picture frame.
  • alternatively, the generated pieces of motion evaluation information may be filtered so that the highest-rated motion evaluation information is retained, and the highest-rated motion evaluation information is then added to at least one of the synchronously collected video picture frames, wherein the at least one video picture frame displays the human body motion corresponding to the highest-rated motion evaluation information.
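The alternative of retaining only the highest-rated evaluation reduces to a simple selection over the evaluations generated for the synchronously collected frames. The field names below are illustrative, not the apparatus's actual schema:

```python
def best_evaluation(evaluations):
    """From several pieces of motion evaluation information generated for
    synchronously collected frames, keep only the highest-rated one."""
    return max(evaluations, key=lambda e: e["score"])

evals = [{"frame": 3, "score": 72}, {"frame": 4, "score": 91}, {"frame": 5, "score": 88}]
print(best_evaluation(evals))  # the frame-4 evaluation is retained
```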
  • Step 302 When the audio playback ends, the total evaluation information is generated according to the motion evaluation information of each human body motion.
  • specifically, a total score may be generated from the human action scores included in the action evaluation information of each human body action, and the total evaluation information is generated together with the animation effect corresponding to the interval to which the total score belongs.
  • the weight corresponding to each standard action in the audio may be preset. After the motion evaluation information of each human motion is determined, the human action score of each human motion may be multiplied by the corresponding weight to obtain a product value; the total score is obtained by accumulating the product values, and the corresponding animation effect is then determined according to the interval to which the total score belongs.
  • for example, the weight corresponding to each standard action can be set to 0.01. After the motion evaluation information of each human body action is determined, the product value can be obtained by multiplying the human action score of each human body action by the corresponding weight, and the total score is obtained by accumulating the product values. If the total score obtained is 87, the interval to which it belongs is [80, 90), so the animation effect can be "good", matched with flowers.
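The weighted accumulation in this example can be sketched as follows. The helper names, the assumption of 100 standard actions (so that weights of 0.01 sum to 1), and the label for the lower intervals are all hypothetical:

```python
def total_score(scores, weights):
    """Multiply each human action score by the weight preset for its
    standard action, then accumulate the product values."""
    return sum(s * w for s, w in zip(scores, weights))

def total_effect(total):
    """Animation effect for the interval the total score falls in,
    following the example mapping in the text."""
    if 90 <= total <= 100:
        return "perfect"
    if 80 <= total < 90:
        return "good"
    return "keep practicing"  # hypothetical label for lower intervals

# With a weight of 0.01 per standard action, 100 scores averaging 87
# accumulate to a total of about 87, which falls in [80, 90).
scores = [87] * 100
weights = [0.01] * 100
print(total_effect(total_score(scores, weights)))
```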
  • Step 303: The total evaluation information is displayed on the result display interface.
  • the total evaluation information may be displayed on the result display interface, so that the user can understand whether the human body action made by the user is standard, and the user experience is improved.
  • in the video generation method of this embodiment, the action evaluation information of each human body motion is displayed on the shooting interface used to collect each video picture frame; when the audio playback ends, the total evaluation information is generated according to the action evaluation information of each human body motion; and the total evaluation information is displayed on the result display interface.
  • the result display interface further includes: a look-back control, a shooting control, and a sharing control.
  • when the electronic device detects a trigger operation of the user on the look-back control, the target video can be played, so that the user can review and correct the human body motions while watching the video, making the actions more standard the next time a video is recorded.
  • when a trigger operation on the shooting control is detected, the device may display the shooting interface and trigger steps 102-106 to regenerate the target video; that is, the user may shoot the video again by triggering the shooting control.
  • sharing the target video includes the following steps:
  • Step 401 showing a sharing interface.
  • the sharing interface includes a self-owned platform sharing control and a third-party platform sharing control.
  • the third party platform may be, for example, Instagram, Facebook, Twitter, or the like.
  • the sharing interface is displayed, so that the user can share the target video through the sharing control of the sharing interface.
  • Step 402: When a trigger operation on the own-platform sharing control is detected, the shooting control and a display control are displayed on the sharing interface.
  • that is, when the user triggers the own-platform sharing control, the sharing interface can display the shooting control and the display control.
  • the electronic device can acquire the audio in the target video and display the preparation interface, so that the user can regenerate a video based on the audio in the target video; alternatively, step 403 can be triggered.
  • Step 403 When the triggering operation for the display control is detected, the video aggregation page is displayed; the video aggregation page includes the target video and/or the video shared by the own platform.
  • the electronic device when the user clicks on the display control, the electronic device can display the video aggregation page, so that the user can share the target video or view the video shared by other users.
  • the video aggregation page can also include a shooting control so that the user can reselect the audio through the shooting control and record the video.
  • in the video generation method of this embodiment, the sharing interface is displayed; when a trigger operation on the own-platform sharing control is detected, the shooting control and the display control are displayed on the sharing interface; and when a trigger operation on the display control is detected, the video aggregation page is displayed, the video aggregation page containing the target video and/or videos that have been shared on the own platform. Thereby, the user can share the target video so that other users can watch it, enhancing the user's sense of participation.
  • the present application also proposes a video generating apparatus.
  • FIG. 5 is a schematic structural diagram of a video generating apparatus according to an embodiment of the present disclosure.
  • the video generating apparatus 500 includes a selection module 510, an acquisition module 520, a presentation module 530, an evaluation module 540, and a generation module 550, wherein:
  • the selection module 510 is configured to acquire selected audio and standard actions corresponding to each time node in the audio.
  • the acquisition module 520 is configured to play audio and collect each video frame during the playing of the audio.
  • the display module 530 is configured to display a corresponding standard action when the audio is played to each time node, and identify a human body motion of the video frame frame acquired by the time node synchronously.
  • the display module 530 is specifically configured to: identify each joint of the human body in the video picture frame; connect adjacent joints among the joints of the human body to obtain a line between the adjacent two joints; and determine the human body motion according to the actual angle between the line between the adjacent two joints and a preset reference direction.
  • the evaluation module 540 is configured to generate motion evaluation information of the human body motion according to the degree of difference between the standard motion and the human body motion at the same time node.
  • the generating module 550 is configured to generate a target video according to audio, each video frame frame, and motion evaluation information of each human body motion.
  • the generating module 550 is specifically configured to: add the motion evaluation information of the corresponding human body motion to each video picture frame according to the human body motion recognized in the frame; and generate the target video according to the audio and the video picture frames to which the motion evaluation information has been added.
  • the video generating apparatus 500 may further include:
  • the display determining module 560 is configured to display a preparation action before the audio is played and synchronously capture the video picture, and collect the preparation image to determine that the human body action in the preparation image matches the preparation action.
  • the display generation module 570 is configured to: after the action evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body motion at the same time node, display the action evaluation information of each human body motion on the shooting interface used to collect each video picture frame; generate the total evaluation information according to the action evaluation information of each human body motion when the audio playback ends; and display the total evaluation information on the result display interface.
  • the interface display module 580 is configured to display a song selection interface when detecting an operation for the shooting control before acquiring the selected audio.
  • the result display interface further includes: a look-back control, a shooting control, and a sharing control; the display generation module 570 is further configured to: play the target video when a trigger operation on the look-back control is detected; display the shooting interface to regenerate the target video when a trigger operation on the shooting control is detected; and share the target video when a trigger operation on the sharing control is detected.
  • the display generation module 570 is specifically configured to display a sharing interface; wherein the sharing interface includes a self-owned platform sharing control and a third-party platform sharing control; when detecting a trigger operation for sharing the control of the own platform Displaying the shooting control and display control in the sharing interface; displaying the video aggregation page when detecting the triggering operation for the display control; the video aggregation page includes the target video and/or the video shared by the own platform.
  • the video generating apparatus of this embodiment acquires the selected audio and the standard actions corresponding to each time node in the audio; plays the audio, and collects each video picture frame during the playing of the audio; displays the corresponding standard action when the audio is played to each time node, and identifies the human body motion in the video picture frame synchronously acquired at that time node; generates the action evaluation information of the human body motion according to the degree of difference between the standard action and the human body motion at the same time node; and generates the target video according to the audio, each video picture frame, and the action evaluation information of each human body motion.
  • since the standard action is a human body action that the user needs to make, compared with the prior-art dance mode in which the user steps on arrows, the dance actions can be effectively enriched and the user experience improved.
  • in addition, generating the action evaluation information of the human body motion enables the user to know in time whether the human body actions made are standard, further improving the user experience.
  • finally, because a video is generated, the user can play back or share the video, enhancing the user's sense of participation.
  • the embodiment of the present application further provides an electronic device, where the electronic device includes the device described in any of the foregoing embodiments.
  • FIG. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application, which may implement the process of the embodiment shown in FIG. 1-6 of the present application.
  • the electronic device may include: a housing 71, a processor 72, a memory 73, a circuit board 74, and a power supply circuit 75.
  • the circuit board 74 is disposed inside the space enclosed by the housing 71; the processor 72 and the memory 73 are disposed on the circuit board 74; and the power supply circuit 75 is used to supply power to each circuit or component of the electronic device.
  • the memory 73 is used to store executable program code; and the processor 72 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 73, so as to perform the video generation method of any of the foregoing embodiments.
  • the electronic device exists in a variety of forms including, but not limited to:
  • Mobile communication devices: these devices are characterized by mobile communication functions and are mainly aimed at providing voice and data communication.
  • Such terminals include: smart phones (such as the iPhone), multimedia phones, functional phones, and low-end phones.
  • Ultra-mobile personal computer devices: this type of device belongs to the category of personal computers, has computing and processing functions, and generally also has mobile Internet access capability.
  • Such terminals include: PDA, MID, and UMPC devices, such as the iPad.
  • Portable entertainment devices: these devices can display and play multimedia content. Such devices include: audio and video players (such as the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
  • Server: a device that provides computing services. The server consists of a processor, a hard disk, a memory, a system bus, and the like; it is similar to a general-purpose computer in architecture, but, because highly reliable services need to be provided, it has higher requirements in terms of processing capability, stability, reliability, security, scalability, and manageability.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
  • the present application further provides a non-transitory computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the video generation method described in the foregoing embodiments.
  • the present application also provides a computer program product that, when executed by a processor, executes a video generation method as described in the foregoing embodiments.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • thus, features defined with "first" or "second" may explicitly or implicitly include at least one of the features.
  • in the description of the present application, "a plurality" means at least two, such as two or three, unless specifically defined otherwise.
  • a "computer-readable medium” can be any apparatus that can contain, store, communicate, propagate, or transport a program for use in an instruction execution system, apparatus, or device, or in conjunction with the instruction execution system, apparatus, or device.
  • computer readable media include the following: electrical connections (electronic devices) having one or more wires, portable computer disk cartridges (magnetic devices), random access memory (RAM), Read only memory (ROM), erasable editable read only memory (EPROM or flash memory), fiber optic devices, and portable compact disk read only memory (CDROM).
  • the computer readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.
  • portions of the application can be implemented in hardware, software, firmware, or a combination thereof.
  • for example, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
  • if, as in another embodiment, they are implemented in hardware, they can be implemented by any one, or a combination, of the following techniques well known in the art: discrete logic circuits with logic gates for implementing logic functions on data signals, application specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and the like.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated modules, if implemented in the form of software functional modules and sold or used as separate products, may also be stored in a computer readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While the embodiments of the present application have been shown and described above, it should be understood that the above-described embodiments are illustrative and are not to be construed as limiting the scope of the present application; within the scope of the application, the embodiments are subject to variations, modifications, substitutions and alterations by those of ordinary skill in the art.

Abstract

Provided are a video generation method and device, and an electronic apparatus. The method comprises: acquiring a selected audio file and standard movements corresponding to respective time points in the audio file; playing the audio file, and collecting respective video picture frames in the process of playing the audio file; when the respective time points in the audio file are reached, displaying the corresponding standard movements, and recognizing body movements in the video picture frames synchronously collected at the time points; generating, according to a degree of difference between the standard movement and the body movement at the same time point, movement evaluation information associated with the body movement; and generating a target video according to the audio file, the respective video picture frames and the movement evaluation information associated with the respective body movements. Since the standard movements are body movements to be performed by a user, the invention can effectively enrich dancing movements compared to the prior art, which requires a user to step on corresponding arrows to dance. Moreover, generation of movement evaluation information enables a user to timely ascertain whether a body movement performed by the user meets the standard, thereby improving user experience.

Description

Video generation method, device and electronic device
Cross-reference to related applications
The present application claims priority to Chinese Patent Application No. 201711185439.6, filed by 乐蜜有限公司 on November 23, 2017, and entitled "Video Generation Method, Apparatus, and Electronic Apparatus".
Technical field
The present application relates to the field of mobile terminal technologies, and in particular, to a video generation method, apparatus, and electronic device.
Background art
In a somatosensory dance game, human-computer interaction is carried out through an Internet operating platform. The user makes corresponding body movements according to the prompts of the somatosensory dance device, so that the user can achieve a fitness effect while dancing and enjoy the experience of somatosensory interaction.
In the prior art, somatosensory dance games are mainly run on fixed devices, such as somatosensory dance machines and computers, which are poorly portable. In addition, the user's body movement is judged by determining whether the direction of the arrow the user steps on is correct, so the manner of dancing is relatively simple. Moreover, because the game process cannot be recorded while the user is playing, the user's sense of participation is low.
Summary of the invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, the first object of the present application is to provide a video generation method. Since the standard action is a human body action that the user needs to make, compared with the prior-art dance mode in which the user steps on arrows, the dance actions can be effectively enriched and the user experience improved. In addition, the action evaluation information of the human body motion, generated according to the degree of difference between the standard action and the human body motion at the same time node, enables the user to know in time whether the human body actions made are standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so that the user can play back or share the video, enhancing the user's sense of participation. This solves the prior-art technical problems that somatosensory dance games are mainly run on fixed devices such as somatosensory dance machines and computers and are poorly portable, that judging the user's body movement by the direction of the arrow stepped on makes the manner of dancing relatively simple, and that the user's sense of participation is low because the game process cannot be recorded.
A second object of the present application is to propose a video generating apparatus.
A third object of the present application is to propose an electronic device.
A fourth object of the present application is to propose a non-transitory computer readable storage medium.
A fifth object of the present application is to propose a computer program product.
为达上述目的,本申请第一方面实施例提出了一种视频生成方法,包括:To achieve the above objective, the first aspect of the present application provides a video generating method, including:
获取选定的音频,以及所述音频中各时间节点对应的标准动作;Obtaining selected audio, and standard actions corresponding to each time node in the audio;
播放所述音频,并在播放所述音频过程中采集各视频画面帧;Playing the audio, and collecting each video frame during the playing of the audio;
在所述音频播放至每一个时间节点时,展示对应的标准动作,并识别所述时间节点同步采集的视频画面帧中的人体动作;Displaying a corresponding standard action when the audio is played to each time node, and identifying a human body motion in the video frame frame acquired by the time node synchronously;
根据同一时间节点的所述标准动作与所述人体动作之间的差异程度,生成所述人体动作的动作评价信息;Generating action evaluation information of the human body motion according to a degree of difference between the standard action of the node at the same time and the human body motion;
根据所述音频、各视频画面帧和各人体动作的动作评价信息,生成目标视频。A target video is generated based on the audio, each video frame frame, and motion evaluation information of each human body motion.
可选地,作为第一方面的第一种可能的实现方式,所述播放所述音频,并同步采集视频画面之前,还包括:Optionally, as the first possible implementation manner of the first aspect, before the playing the audio and simultaneously acquiring the video image, the method further includes:
展示准备动作,并采集准备图像;Display preparation actions and collect preparation images;
确定所述准备图像中的人体动作与所述准备动作匹配。It is determined that the human body motion in the preparation image matches the preparation motion.
可选地,作为第一方面的第二种可能的实现方式,所述根据所述音频、各视频画面帧和各人体动作的动作评价信息,生成目标视频,包括:Optionally, as a second possible implementation manner of the first aspect, the generating the target video according to the motion evaluation information of the audio, each video frame, and each human motion includes:
根据各视频画面帧所识别出的人体动作,在各视频画面帧中,添加相应人体动作的动作评价信息;Adding action evaluation information corresponding to the human body motion in each video frame frame according to the human body motion recognized by each video frame frame;
根据所述音频和添加所述动作评价信息后的视频画面帧,生成所述目标视频。The target video is generated based on the audio and a video frame frame after adding the motion evaluation information.
可选地,作为第一方面的第三种可能的实现方式,所述根据同一时间节点的所述标准动作与所述人体动作之间的差异程度,生成所述人体动作的动作评价信息之后,还包括:Optionally, as a third possible implementation manner of the first aspect, after the action evaluation information of the human motion is generated according to the difference between the standard action and the human motion of the same time node, Also includes:
在用于采集各视频画面帧的拍摄界面上,展示每一个人体动作的动作评价信息;Displaying motion evaluation information of each human body motion on a shooting interface for collecting each video frame;
当所述音频播放结束时,根据每一个人体动作的动作评价信息,生成总评价信息;When the audio playback ends, the overall evaluation information is generated according to the motion evaluation information of each human body motion;
在结果展示界面,展示所述总评价信息。In the result display interface, the total evaluation information is displayed.
可选地,作为第一方面的第四种可能的实现方式,所述结果展示界面还包括:回看控件、拍摄控件和分享控件;Optionally, as a fourth possible implementation manner of the first aspect, the result display interface further includes: reviewing a control, a shooting control, and a sharing control;
当探测到针对所述回看控件的触发操作时,播放所述目标视频;Playing the target video when a triggering operation for the lookback control is detected;
当探测到针对所述拍摄控件的触发操作时,展示所述拍摄界面,以重新生成所述目标视频;When the triggering operation for the shooting control is detected, the shooting interface is displayed to regenerate the target video;
当探测到针对所述分享控件的触发操作时,对所述目标视频进行分享。The target video is shared when a triggering operation for the sharing control is detected.
可选地,作为第一方面的第五种可能的实现方式,所述对所述目标视频进行分享,包括:Optionally, as the fifth possible implementation manner of the first aspect, the sharing, by the target video, includes:
展示分享界面；其中，所述分享界面包括自有平台分享控件和第三方平台分享控件；Displaying a sharing interface, where the sharing interface includes an own-platform sharing control and a third-party platform sharing control;
当探测到针对所述自有平台分享控件的触发操作时,在所述分享界面展示所述拍摄控件和展示控件;Displaying the shooting control and the display control on the sharing interface when detecting a triggering operation for the own platform sharing control;
当探测到针对所述展示控件的触发操作时,展示视频聚合页面;所述视频聚合页面包含所述目标视频和/或在自有平台已分享的视频。When a triggering operation for the display control is detected, a video aggregation page is displayed; the video aggregation page includes the target video and/or a video that has been shared on the own platform.
可选地,作为第一方面的第六种可能的实现方式,所述获取选定的音频之前,还包括:Optionally, as the sixth possible implementation manner of the first aspect, before the acquiring the selected audio, the method further includes:
当探测到针对拍摄控件的操作时,展示歌曲选择界面。The song selection interface is displayed when an operation for the shooting control is detected.
可选地，作为第一方面的第七种可能的实现方式，所述识别所述时间节点同步采集的视频画面帧的人体动作，包括：Optionally, as a seventh possible implementation manner of the first aspect, the recognizing of the human body motion in the video picture frame synchronously collected at the time node includes:
识别所述视频画面帧中,人体的各关节;Identifying the joints of the human body in the frame of the video picture;
连接人体各关节中相邻的两关节,得到相邻两关节之间的连线;Connecting two adjacent joints in each joint of the human body to obtain a connection between two adjacent joints;
根据相邻两关节之间的连线与预设参考方向之间的实际夹角,确定人体动作。The human body motion is determined according to the actual angle between the connection between the adjacent two joints and the preset reference direction.
本申请实施例的视频生成方法，通过获取选定的音频，以及音频中各时间节点对应的标准动作；播放音频，并在播放音频过程中采集各视频画面帧；在音频播放至每一个时间节点时，展示对应的标准动作，并识别时间节点同步采集的视频画面帧中的人体动作；根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息；根据音频、各视频画面帧和各人体动作的动作评价信息，生成目标视频。本实施例中，由于标准动作为用户需要做出的人体动作，相比于现有技术中用户脚踩箭头的跳舞方式，能够有效丰富跳舞动作，提升用户体验。此外，根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息，能够使得用户及时了解自己做出的人体动作是否标准，进一步提升用户的使用体验。最后，通过在音频播放结束时，生成视频，由此，用户可以回放或者分享视频，提升用户的参与感，用于解决现有技术中体感跳舞游戏主要应用于固定设备上，例如体感跳舞机、电脑等，便携性较差。此外，对用户身体动作的判断，是通过确定用户脚踩的箭头方向正确与否，跳舞的方式较为单一。并且，用户在玩游戏时，由于无法记录游戏过程，导致用户的参与感较低的技术问题。In the video generation method of the embodiments of the present application, selected audio and the standard action corresponding to each time node in the audio are acquired; the audio is played, and video picture frames are collected during playback; when the audio plays to each time node, the corresponding standard action is displayed, and the human body motion in the video picture frame synchronously collected at that time node is recognized; motion evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body motion at the same time node; and a target video is generated according to the audio, the video picture frames, and the motion evaluation information of each human body motion. In this embodiment, since the standard action is a human body action that the user needs to make, the dance actions are effectively enriched and the user experience is improved compared with the prior-art dance mode in which the user steps on arrows. In addition, generating motion evaluation information according to the degree of difference between the standard action and the human body motion at the same time node lets the user know in time whether the action they made is standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so the user can play back or share the video, enhancing the user's sense of participation. This solves the technical problems that prior-art somatosensory dance games are mainly applied on fixed devices, such as somatosensory dance machines and computers, with poor portability; that judging the user's body motion only by whether the direction of the arrow stepped on is correct makes the dance mode rather monotonous; and that the inability to record the game process gives the user a low sense of participation.
为达上述目的,本申请第二方面实施例提出了一种视频生成装置,包括:In order to achieve the above objective, the second aspect of the present application provides a video generating apparatus, including:
选择模块,用于获取选定的音频,以及所述音频中各时间节点对应的标准动作;a selection module for acquiring selected audio and standard actions corresponding to each time node in the audio;
采集模块,用于播放所述音频,并在播放所述音频过程中采集各视频画面帧;An acquisition module, configured to play the audio, and collect each video frame during the playing of the audio;
展示模块，用于在所述音频播放至每一个时间节点时，展示对应的标准动作，并识别所述时间节点同步采集的视频画面帧的人体动作；A display module, configured to display the corresponding standard action when the audio plays to each time node, and to recognize the human body motion in the video picture frame synchronously collected at the time node;
评价模块,用于根据同一时间节点的所述标准动作与所述人体动作之间的差异程度,生成所述人体动作的动作评价信息;An evaluation module, configured to generate action evaluation information of the human body action according to a degree of difference between the standard action and the human body action at the same time node;
生成模块,用于根据所述音频、各视频画面帧和各人体动作的动作评价信息,生成目标视频。And a generating module, configured to generate a target video according to the audio, each video frame frame, and motion evaluation information of each human body motion.
可选地,作为第二方面的第一种可能的实现方式,所述装置还包括:Optionally, as a first possible implementation manner of the second aspect, the device further includes:
展示确定模块，用于在所述播放所述音频，并同步采集视频画面之前，展示准备动作，并采集准备图像，确定所述准备图像中的人体动作与所述准备动作匹配。A display determining module, configured to: before the playing of the audio and the synchronous collection of video pictures, display a preparation action, collect a preparation image, and determine that the human body motion in the preparation image matches the preparation action.
可选地,作为第二方面的第二种可能的实现方式,所述生成模块,具体用于:Optionally, as a second possible implementation manner of the second aspect, the generating module is specifically configured to:
根据各视频画面帧所识别出的人体动作，在各视频画面帧中，添加相应人体动作的动作评价信息；According to the human body motion recognized in each video picture frame, adding the motion evaluation information of the corresponding human body motion to that video picture frame;
根据所述音频和添加所述动作评价信息后的视频画面帧，生成所述目标视频。The target video is generated based on the audio and the video picture frames to which the motion evaluation information has been added.
可选地,作为第二方面的第三种可能的实现方式,所述装置还包括:Optionally, as a third possible implementation manner of the second aspect, the device further includes:
展示生成模块，用于在所述根据同一时间节点的所述标准动作与所述人体动作之间的差异程度，生成所述人体动作的动作评价信息之后，在用于采集各视频画面帧的拍摄界面上，展示每一个人体动作的动作评价信息；当所述音频播放结束时，根据每一个人体动作的动作评价信息，生成总评价信息；在结果展示界面，展示所述总评价信息。A display generation module, configured to: after the motion evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body motion at the same time node, display the motion evaluation information of each human body motion on the shooting interface used to collect the video picture frames; when the audio playback ends, generate overall evaluation information according to the motion evaluation information of each human body motion; and display the overall evaluation information on the result display interface.
可选地,作为第二方面的第四种可能的实现方式,所述结果展示界面还包括:回看控件、拍摄控件和分享控件;所述展示生成模块,还用于:Optionally, as a fourth possible implementation manner of the second aspect, the result display interface further includes: a look back control, a shooting control, and a sharing control; and the display generating module is further configured to:
当探测到针对所述回看控件的触发操作时,播放所述目标视频;Playing the target video when a triggering operation for the lookback control is detected;
当探测到针对所述拍摄控件的触发操作时,展示所述拍摄界面,以重新生成所述目标视频;When the triggering operation for the shooting control is detected, the shooting interface is displayed to regenerate the target video;
当探测到针对所述分享控件的触发操作时,对所述目标视频进行分享。The target video is shared when a triggering operation for the sharing control is detected.
可选地,作为第二方面的第五种可能的实现方式,所述展示生成模块,具体用于:Optionally, as a fifth possible implementation manner of the second aspect, the display generating module is specifically configured to:
展示分享界面；其中，所述分享界面包括自有平台分享控件和第三方平台分享控件；Displaying a sharing interface, where the sharing interface includes an own-platform sharing control and a third-party platform sharing control;
当探测到针对所述自有平台分享控件的触发操作时,在所述分享界面展示所述拍摄控件和展示控件;Displaying the shooting control and the display control on the sharing interface when detecting a triggering operation for the own platform sharing control;
当探测到针对所述展示控件的触发操作时,展示视频聚合页面;所述视频聚合页面包含所述目标视频和/或在自有平台已分享的视频。When a triggering operation for the display control is detected, a video aggregation page is displayed; the video aggregation page includes the target video and/or a video that has been shared on the own platform.
可选地,作为第二方面的第六种可能的实现方式,所述装置还包括:Optionally, as a sixth possible implementation manner of the second aspect, the device further includes:
界面展示模块,用于在所述获取选定的音频之前,当探测到针对拍摄控件的操作时,展示歌曲选择界面。The interface display module is configured to display a song selection interface when detecting an operation for the shooting control before the obtaining the selected audio.
可选地,作为第二方面的第七种可能的实现方式,所述展示模块,具体用于:Optionally, as a seventh possible implementation manner of the second aspect, the display module is specifically configured to:
识别所述视频画面帧中,人体的各关节;Identifying the joints of the human body in the frame of the video picture;
连接人体各关节中相邻的两关节,得到相邻两关节之间的连线;Connecting two adjacent joints in each joint of the human body to obtain a connection between two adjacent joints;
根据相邻两关节之间的连线与预设参考方向之间的实际夹角,确定人体动作。The human body motion is determined according to the actual angle between the connection between the adjacent two joints and the preset reference direction.
本申请实施例的视频生成装置，通过获取选定的音频，以及音频中各时间节点对应的标准动作；播放音频，并在播放音频过程中采集各视频画面帧；在音频播放至每一个时间节点时，展示对应的标准动作，并识别时间节点同步采集的视频画面帧中的人体动作；根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息；根据音频、各视频画面帧和各人体动作的动作评价信息，生成目标视频。本实施例中，由于标准动作为用户需要做出的人体动作，相比于现有技术中用户脚踩箭头的跳舞方式，能够有效丰富跳舞动作，提升用户体验。此外，根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息，能够使得用户及时了解自己做出的人体动作是否标准，进一步提升用户的使用体验。最后，通过在音频播放结束时，生成视频，由此，用户可以回放或者分享视频，提升用户的参与感，用于解决现有体感跳舞游戏主要应用于固定设备上，例如体感跳舞机、电脑等，便携性较差。此外，对用户身体动作的判断，是通过确定用户脚踩的箭头方向正确与否，跳舞的方式较为单一。并且，用户在玩游戏时，由于无法记录游戏过程，导致用户的参与感较低的技术问题。In the video generating apparatus of the embodiments of the present application, selected audio and the standard action corresponding to each time node in the audio are acquired; the audio is played, and video picture frames are collected during playback; when the audio plays to each time node, the corresponding standard action is displayed, and the human body motion in the video picture frame synchronously collected at that time node is recognized; motion evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body motion at the same time node; and a target video is generated according to the audio, the video picture frames, and the motion evaluation information of each human body motion. In this embodiment, since the standard action is a human body action that the user needs to make, the dance actions are effectively enriched and the user experience is improved compared with the prior-art dance mode in which the user steps on arrows. In addition, generating motion evaluation information according to the degree of difference between the standard action and the human body motion at the same time node lets the user know in time whether the action they made is standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so the user can play back or share the video, enhancing the user's sense of participation. This solves the technical problems that existing somatosensory dance games are mainly applied on fixed devices, such as somatosensory dance machines and computers, with poor portability; that judging the user's body motion only by whether the direction of the arrow stepped on is correct makes the dance mode rather monotonous; and that the inability to record the game process gives the user a low sense of participation.
为达上述目的，本申请第三方面实施例提出了一种电子设备，包括：壳体、处理器、存储器、电路板和电源电路，其中，电路板安置在壳体围成的空间内部，处理器和存储器设置在电路板上；电源电路，用于为上述电子设备的各个电路或器件供电；存储器用于存储可执行程序代码；处理器通过读取存储器中存储的可执行程序代码来运行与可执行程序代码对应的程序，用于执行本申请第一方面实施例所述的视频生成方法。To achieve the above objective, an embodiment of the third aspect of the present application provides an electronic device, including: a housing, a processor, a memory, a circuit board, and a power supply circuit, where the circuit board is disposed inside the space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the electronic device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the video generation method described in the embodiments of the first aspect of the present application.
为达上述目的，本申请第四方面实施例提出了一种非临时性计算机可读存储介质，其上存储有计算机程序，其特征在于，该程序被处理器执行时实现如本申请第一方面实施例所述的视频生成方法。To achieve the above objective, an embodiment of the fourth aspect of the present application provides a non-transitory computer-readable storage medium having a computer program stored thereon, where, when the program is executed by a processor, the video generation method described in the embodiments of the first aspect of the present application is implemented.
为达上述目的，本申请第五方面实施例提出了一种计算机程序产品，当所述计算机程序产品中的指令由处理器执行时，执行如本申请第一方面实施例所述的视频生成方法。To achieve the above objective, an embodiment of the fifth aspect of the present application provides a computer program product; when the instructions in the computer program product are executed by a processor, the video generation method described in the embodiments of the first aspect of the present application is performed.
本申请附加的方面和优点将在下面的描述中部分给出，部分将从下面的描述中变得明显，或通过本申请的实践了解到。Additional aspects and advantages of the present application will be set forth in part in the following description, will in part become apparent from the following description, or will be learned through practice of the present application.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通 技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the embodiments will be briefly described below. Obviously, the drawings in the following description are some embodiments of the present application. Those skilled in the art can also obtain other drawings based on these drawings without paying any creative work.
图1为本申请实施例所提供的第一种视频生成方法的流程示意图;FIG. 1 is a schematic flowchart diagram of a first video generating method according to an embodiment of the present application;
图2为本申请实施例所提供的第二种视频生成方法的流程示意图;2 is a schematic flowchart of a second video generating method according to an embodiment of the present application;
图3为本申请实施例所提供的第三种视频生成方法的流程示意图;FIG. 3 is a schematic flowchart diagram of a third video generating method according to an embodiment of the present application;
图4为本申请实施例所提供的第四种视频生成方法的流程示意图;FIG. 4 is a schematic flowchart diagram of a fourth video generating method according to an embodiment of the present application;
图5为本申请实施例提供的一种视频生成装置的结构示意图;FIG. 5 is a schematic structural diagram of a video generating apparatus according to an embodiment of the present disclosure;
图6为本申请实施例提供的另一种视频生成装置的结构示意图;FIG. 6 is a schematic structural diagram of another video generating apparatus according to an embodiment of the present disclosure;
图7为本申请电子设备一个实施例的结构示意图。FIG. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application.
具体实施方式DETAILED DESCRIPTION
下面详细描述本申请的实施例,所述实施例的示例在附图中示出,其中自始至终相同或类似的标号表示相同或类似的元件或具有相同或类似功能的元件。下面通过参考附图描述的实施例是示例性的,旨在用于解释本申请,而不能理解为对本申请的限制。The embodiments of the present application are described in detail below, and the examples of the embodiments are illustrated in the drawings, wherein the same or similar reference numerals are used to refer to the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are intended to be illustrative, and are not to be construed as limiting.
针对现有体感跳舞游戏主要应用于固定设备上，例如体感跳舞机、电脑等，便携性较差。此外，对用户身体动作的判断，是通过确定用户脚踩的箭头方向正确与否，跳舞的方式较为单一。并且，用户在玩游戏时，由于无法记录游戏过程，导致用户的参与感较低的技术问题，本申请实施例中，通过获取选定的音频，以及音频中各时间节点对应的标准动作；播放音频，并在播放音频过程中采集各视频画面帧；在音频播放至每一个时间节点时，展示对应的标准动作，并识别时间节点同步采集的视频画面帧中的人体动作；根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息；根据音频、各视频画面帧和各人体动作的动作评价信息，生成目标视频。本实施例中，由于标准动作为用户需要做出的人体动作，相比于现有技术中用户脚踩箭头的跳舞方式，能够有效丰富跳舞动作，提升用户体验。此外，根据同一时间节点的标准动作与人体动作之间的差异程度，生成人体动作的动作评价信息，能够使得用户及时了解自己做出的人体动作是否标准，进一步提升用户的使用体验。最后，通过在音频播放结束时，生成视频，由此，用户可以回放或者分享视频，提升用户的参与感。Existing somatosensory dance games are mainly applied on fixed devices, such as somatosensory dance machines and computers, and are poorly portable. In addition, the user's body motion is judged only by whether the direction of the arrow the user steps on is correct, so the dance mode is rather monotonous. Moreover, because the game process cannot be recorded while the user plays, the user's sense of participation is low. To address these technical problems, in the embodiments of the present application, selected audio and the standard action corresponding to each time node in the audio are acquired; the audio is played, and video picture frames are collected during playback; when the audio plays to each time node, the corresponding standard action is displayed, and the human body motion in the video picture frame synchronously collected at that time node is recognized; motion evaluation information of the human body motion is generated according to the degree of difference between the standard action and the human body motion at the same time node; and a target video is generated according to the audio, the video picture frames, and the motion evaluation information of each human body motion. In this embodiment, since the standard action is a human body action that the user needs to make, the dance actions are effectively enriched and the user experience is improved compared with the prior-art dance mode in which the user steps on arrows. In addition, generating motion evaluation information according to the degree of difference between the standard action and the human body motion at the same time node lets the user know in time whether the action they made is standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so the user can play back or share the video, enhancing the user's sense of participation.
下面参考附图描述本申请实施例的视频生成方法、装置和电子设备。The video generation method, apparatus, and electronic device of the embodiments of the present application are described below with reference to the accompanying drawings.
图1为本申请实施例所提供的第一种视频生成方法的流程示意图。该视频生成方法可以应用于电子设备的应用程序中,其中,电子设备例如为个人电脑(Personal Computer,PC),云端设备或者移动设备,移动设备例如智能手机,或者平板电脑等。FIG. 1 is a schematic flowchart diagram of a first video generating method according to an embodiment of the present application. The video generation method can be applied to an application of an electronic device, such as a personal computer (PC), a cloud device or a mobile device, a mobile device such as a smart phone, or a tablet computer.
如图1所示,该视频生成方法包括以下步骤:As shown in FIG. 1, the video generation method includes the following steps:
步骤101,获取选定的音频,以及音频中各时间节点对应的标准动作。Step 101: Acquire selected audio, and standard actions corresponding to each time node in the audio.
作为一种可能的实现方式，电子设备的应用程序上可以设置一个音频选取的触发条件，例如，触发条件可以为一个音频选取控件，用户可以通过该音频选取控件触发选取音频。例如，当用户触发该音频选取控件时，可以调用歌曲选择界面，而后用户可以从歌曲选择界面任意选取一个音频，作为自身选定的音频。当用户选定音频后，应用程序可以获取用户选定的音频。As a possible implementation manner, a trigger condition for audio selection may be set in the application of the electronic device. For example, the trigger condition may be an audio selection control, through which the user can trigger audio selection. For example, when the user triggers the audio selection control, a song selection interface may be invoked, and the user may then select any audio from the song selection interface as the selected audio. After the user selects the audio, the application can acquire the audio selected by the user.
作为另一种可能的实现方式，电子设备的应用程序上可以设置一个拍摄控件，当应用程序探测到用户针对该拍摄控件的操作时，例如，当用户点击该拍摄控件时，该应用程序的界面可以自动展示歌曲选择界面，而后用户可以根据自身需求，从歌曲选择界面选取一个音频，作为自身选定的音频。当用户选定音频后，应用程序可以获取用户选定的音频。As another possible implementation manner, a shooting control may be set in the application of the electronic device. When the application detects the user's operation on the shooting control (for example, when the user taps the shooting control), the application interface may automatically display the song selection interface, and the user may then select an audio from the song selection interface according to their own needs as the selected audio. After the user selects the audio, the application can acquire the audio selected by the user.
本实施例中，歌曲选择界面中的音频，可以预先导入对应的标准动作，具体地，音频中每个时间节点均具有对应的标准动作，因此，在应用程序获取选定的音频后，该应用程序可以从该音频中获取各时间节点对应的标准动作。In this embodiment, corresponding standard actions may be imported in advance for the audio in the song selection interface. Specifically, each time node in the audio has a corresponding standard action; therefore, after the application acquires the selected audio, it can obtain from that audio the standard action corresponding to each time node.
步骤102,播放音频,并在播放音频过程中采集各视频画面帧。In step 102, the audio is played, and each video frame is collected during the playing of the audio.
可选地，在拍摄界面，当用户选定音频后，电子设备可以根据用户的操作对该音频进行播放，例如，当电子设备监测到用户点击该音频时，电子设备可以播放该音频，同时打开摄像头，采集各视频画面帧。Optionally, on the shooting interface, after the user selects the audio, the electronic device may play the audio according to the user's operation. For example, when the electronic device detects that the user taps the audio, the electronic device may play the audio and at the same time turn on the camera to collect the video picture frames.
步骤103,在音频播放至每一个时间节点时,展示对应的标准动作。In step 103, when the audio is played to each time node, the corresponding standard action is displayed.
由于用户从看到标准动作到做出人体动作，大脑需要反应一段时间，因此，本申请实施例中，为了便于用户及时做出人体动作，可以在音频播放至每一个时间节点之前，提前预设的提前时长，展示对应的标准动作。其中，预设的提前时长可以由用户根据自身需求进行设置，或者，预设的提前时长可以由电子设备的内置程序预先设定，对此不作限制。应当理解的是，预设的提前时长不应设置的过长，例如预设的提前时长可以为0.2s。Because the user's brain needs some reaction time between seeing the standard action and making the corresponding human body motion, in the embodiments of the present application, to help the user act in time, the corresponding standard action may be displayed a preset advance duration before the audio plays to each time node. The preset advance duration may be set by the user according to their own needs, or may be preset by a built-in program of the electronic device, which is not limited herein. It should be understood that the preset advance duration should not be set too long; for example, it may be 0.2 s.
具体地，可以针对每一个时间节点，将时间节点与提前时长作差，得到差值，而后将差值作为起始时刻，进而可以从起始时刻开始，展示标准动作的示意图。Specifically, for each time node, the advance duration may be subtracted from the time node to obtain a difference, and the difference is taken as the start moment; the schematic diagram of the standard action can then be displayed starting from that moment.
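The subtraction above can be sketched as follows. This is an illustrative example, not code from the application; the 200 ms default advance duration and the millisecond units are assumptions.

```python
def cue_start_times_ms(time_nodes_ms, advance_ms=200):
    """Show each standard-action cue `advance_ms` earlier than its time node,
    clamping to the start of the audio for very early nodes."""
    return [max(node - advance_ms, 0) for node in time_nodes_ms]

# Time nodes at 1.0 s, 2.5 s and 4.0 s yield cue moments 0.2 s earlier.
print(cue_start_times_ms([1000, 2500, 4000]))  # [800, 2300, 3800]
```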
作为一种可能的实现方式，可以在拍摄界面的任意区域展示标准动作的示意图，该标准动作的示意图可以是固定不动的，或者，该标准动作的示意图可以沿预设轨迹移动，对此不作限制。其中，预设轨迹可以为电子设备的内置程序预先设置的。As a possible implementation manner, the schematic diagram of the standard action may be displayed in any area of the shooting interface; the schematic diagram may stay fixed, or it may move along a preset trajectory, which is not limited herein. The preset trajectory may be preset by a built-in program of the electronic device.
作为另一种可能的实现方式，为了不影响用户查看电子设备屏幕上内容的同时，用户又能观看标准动作，本实施例中，可以在拍摄界面，展示半透明蒙版，其中，蒙版具有镂空的关注区，关注区内展示有用于示意标准动作的图像，即在关注区内展示标准动作的示意图。或者，可以在拍摄界面以弹幕的形式展示对应的标准动作，对此不作限制。As another possible implementation manner, so that the user can watch the standard action without it interfering with the content on the screen of the electronic device, in this embodiment a semi-transparent mask may be displayed on the shooting interface, where the mask has a hollowed-out attention area in which an image illustrating the standard action is displayed; that is, the schematic diagram of the standard action is shown in the attention area. Alternatively, the corresponding standard action may be displayed on the shooting interface in the form of a bullet comment, which is not limited herein.
当标准动作的示意图沿预设轨迹移动时,在拍摄界面展示标准动作的示意图的同时, 可以控制该标准动作的示意图沿预设轨迹移动。When the schematic diagram of the standard motion moves along the preset trajectory, while the photographing interface displays the schematic diagram of the standard motion, the schematic diagram of the standard motion can be controlled to move along the preset trajectory.
步骤104，识别时间节点同步采集的视频画面帧中的人体动作。Step 104: Recognize the human body motion in the video picture frame synchronously collected at the time node.
作为一种可能的实现方式，用于采集视频画面帧的摄像头可以为能够采集用户深度信息的摄像头，通过获取的深度信息，可以识别出视频画面帧中的人体动作。例如，该摄像头可以为深度摄像头（Red-Green-Blue Depth，RGBD），成像的同时可以获取视频画面帧中人体的深度信息，从而根据深度信息可以识别视频画面帧中的人体动作。此外，还可以通过结构光或者TOF镜头进行人体动作深度信息的获取，从而根据深度信息可以识别视频画面帧中的人体动作，对此不作限制。As a possible implementation manner, the camera used to collect the video picture frames may be one capable of collecting the user's depth information, and the human body motion in the video picture frame can be recognized from the acquired depth information. For example, the camera may be a depth camera (Red-Green-Blue Depth, RGBD), which can acquire the depth information of the human body in the video picture frame while imaging, so that the human body motion can be recognized according to the depth information. In addition, the depth information of the human body motion may also be acquired through structured light or a TOF lens, so that the human body motion in the video picture frame can be recognized according to the depth information, which is not limited herein.
作为另一种可能的实现方式，可以识别视频画面帧中，人体的各关节。例如，可以根据人脸识别技术识别出视频画面帧中的人脸以及人脸的位置信息，而后根据人体解剖学中肢体与身高的比例关系，可计算得到人体各关节的位置信息。当然也可以通过其他算法确定视频画面帧中人体的各关节的位置信息，对此不作限制。As another possible implementation manner, the joints of the human body in the video picture frame may be recognized. For example, the face in the video picture frame and its position information may be recognized through face recognition technology, and the position information of each joint of the human body may then be calculated according to the proportional relationship between limbs and height in human anatomy. Of course, the position information of each joint in the video picture frame may also be determined by other algorithms, which is not limited herein.
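As a hypothetical illustration of the anatomy-ratio idea, joint positions can be placed below a detected face box using fixed head-unit proportions. The joint set and the ratio values below are assumptions for the sketch, not values given in the application.

```python
def rough_joints(face_top_y, face_height, center_x):
    """Estimate joint positions from the face position using head-unit ratios.
    Coordinates grow downward, as in image space."""
    head = face_height  # one "head unit", approximated by the face height
    ratios = {"head": 0.5, "shoulder": 1.5, "hip": 4.0, "knee": 6.0}
    return {name: (center_x, face_top_y + r * head) for name, r in ratios.items()}

print(rough_joints(0.0, 20.0, 50.0)["shoulder"])  # (50.0, 30.0)
```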
在识别各关节后，可以连接人体各关节相邻的两关节，得到相邻两关节之间的连线，最后根据相邻两关节之间的连线与预设参考方向之间的实际夹角，确定视频画面帧中的人体动作。其中，预设参考方向可以为水平方向或者垂直方向。After the joints are recognized, every two adjacent joints of the human body may be connected to obtain the line between them, and the human body motion in the video picture frame is finally determined according to the actual angle between each such line and a preset reference direction. The preset reference direction may be horizontal or vertical.
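A minimal sketch of this step, assuming 2D joint coordinates and a horizontal reference direction; the joint names and the two-bone skeleton below are illustrative assumptions, not from the application.

```python
import math

# Adjacent joint pairs whose connecting lines are reduced to angles.
SKELETON = [("shoulder", "elbow"), ("elbow", "wrist")]

def bone_angles(joints, skeleton=SKELETON):
    """Angle in degrees between each adjacent-joint line and the horizontal axis."""
    angles = {}
    for a, b in skeleton:
        (xa, ya), (xb, yb) = joints[a], joints[b]
        angles[(a, b)] = math.degrees(math.atan2(yb - ya, xb - xa))
    return angles

# A pose with a horizontal upper arm and a vertical forearm.
pose = {"shoulder": (0.0, 0.0), "elbow": (1.0, 0.0), "wrist": (1.0, 1.0)}
print(bone_angles(pose))
```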
步骤105,根据同一时间节点的标准动作与人体动作之间的差异程度,生成人体动作的动作评价信息。Step 105: Generate action evaluation information of the human body action according to the degree of difference between the standard action and the human body action at the same time node.
本申请实施例中，人体动作的动作评价信息包括人体动作分值，用于指示人体动作与对应的标准动作之间的差异程度，具体地，人体动作分值越高，表明人体动作与对应的标准动作之间的差异程度越小，而人体动作分值越低，表明人体动作与对应的标准动作之间的差异程度越大。In the embodiments of the present application, the motion evaluation information of a human body motion includes a score indicating the degree of difference between the human body motion and the corresponding standard action: the higher the score, the smaller the difference between the human body motion and the corresponding standard action; the lower the score, the greater the difference.
本申请实施例中，在生成人体动作的动作评价信息之前，可以预先根据人体动作与标准动作之间的差异程度是否大于差异阈值，判断人体动作与标准动作是否匹配。具体地，可以确定在执行标准动作时，各相邻两关节之间的连线与参考方向之间的标准角度，针对每一条相邻两关节之间的连线，比较对应的标准角度与实际角度之间的差值。当每一条相邻两关节之间的连线计算出的差值均在误差范围内时，可以确定视频画面帧中的人体动作与标准动作匹配，而当存在至少一条相邻两关节之间的连线计算出的差值未处于误差范围内时，可以确定视频画面帧中的人体动作与标准动作不匹配。In the embodiments of the present application, before the motion evaluation information is generated, whether a human body motion matches the standard action may first be judged according to whether their degree of difference exceeds a difference threshold. Specifically, the standard angle between each adjacent-joint line and the reference direction when the standard action is performed can be determined, and for each adjacent-joint line, the corresponding standard angle is compared with the actual angle to obtain their difference. When the difference calculated for every adjacent-joint line falls within the error range, it can be determined that the human body motion in the video picture frame matches the standard action; when the difference calculated for at least one adjacent-joint line falls outside the error range, it can be determined that the human body motion does not match the standard action.
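The per-connection comparison can be sketched like this. It is a simplified illustration: a single symmetric tolerance stands in for the error range, and the angle keys follow the earlier assumed skeleton.

```python
def pose_matches(standard_angles, actual_angles, tolerance=15.0):
    """Match only if every adjacent-joint connection stays within tolerance."""
    return all(abs(standard_angles[k] - actual_angles[k]) <= tolerance
               for k in standard_angles)

standard = {("shoulder", "elbow"): 0.0, ("elbow", "wrist"): 90.0}
print(pose_matches(standard, {("shoulder", "elbow"): 5.0, ("elbow", "wrist"): 80.0}))   # True
print(pose_matches(standard, {("shoulder", "elbow"): 40.0, ("elbow", "wrist"): 90.0}))  # False
```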
可选地，当视频画面帧中的人体动作与标准动作不匹配时，表明用户做出的人体动作与对应的标准动作之间的差异程度较大，此时，可以将用户做出的人体动作得到的评分置0，而当视频画面帧中的人体动作与标准动作匹配时，表明用户做出的人体动作与对应的标准动作之间的差异程度较小，此时，可以针对每一条相邻两关节之间的连线，根据对应的差值和误差范围，确定连线的评分系数，例如，标记误差范围为[a,b]，误差为Δ，可以根据公式p=1-[2Δ/(a-b)]，计算得到连线的评分系数p，或者可以根据其他算法计算连线的评分系数，对此不作限制。当得到连线的评分系数后，可以根据连线的评分系数和连线对应的分值，生成连线的评价信息，例如，连线的评价信息可以等于该连线的评分系数乘以连线对应的分值。最后，可以通过将各条相邻两关节之间的连线的评价信息相加，得到人体动作的动作评价信息。Optionally, when the human body motion in the video picture frame does not match the standard action, the difference between the user's motion and the corresponding standard action is large, and the score of the user's motion may be set to 0. When the human body motion matches the standard action, the difference is small; in this case, for each adjacent-joint line, a scoring coefficient of the line may be determined according to the corresponding difference and the error range. For example, with the error range denoted [a, b] and the error denoted Δ, the scoring coefficient p of the line may be calculated according to the formula p = 1 - [2Δ/(a-b)], or by other algorithms, which is not limited herein. After the scoring coefficient of a line is obtained, the evaluation information of the line may be generated according to the scoring coefficient and the point value corresponding to the line; for example, the evaluation information of a line may equal its scoring coefficient multiplied by its point value. Finally, the motion evaluation information of the human body motion can be obtained by adding up the evaluation information of all adjacent-joint lines.
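A sketch of the scoring step. Note that the application writes the coefficient as p = 1 - [2Δ/(a-b)] for an error range [a, b]; here that is read as using the range width (b - a) so the coefficient decreases with the error, and the clamping to [0, 1] and the example point values are assumptions.

```python
def connection_coefficient(delta, a, b):
    """Scoring coefficient p = 1 - 2*delta/(b - a), clamped to [0, 1]."""
    p = 1.0 - 2.0 * delta / (b - a)
    return max(0.0, min(1.0, p))

def action_score(connections):
    """Sum each connection's coefficient times its point value.
    `connections` holds (delta, a, b, points) tuples, one per adjacent-joint line."""
    return sum(connection_coefficient(d, a, b) * pts for d, a, b, pts in connections)

# Two connections worth 50 points each over an error range [0, 10]:
print(action_score([(0.0, 0.0, 10.0, 50.0), (2.5, 0.0, 10.0, 50.0)]))  # 75.0
```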
进一步地，人体动作的动作评价信息还可以包括人体动作分值所属区间对应的动画效果。例如，当人体动作分值满分为100时，若人体动作分值所属的区间[90,100]，动画效果可以为“完美或perfect”并搭配钻石闪烁，所属的区间[80,90)，动画效果可以为“很好或good”并搭配鲜花闪烁。Further, the motion evaluation information of a human body motion may also include an animation effect corresponding to the interval to which the score belongs. For example, with a full score of 100, if the score falls in the interval [90, 100], the animation effect may read "完美" or "perfect", accompanied by sparkling diamonds; if it falls in the interval [80, 90), the animation effect may read "很好" or "good", accompanied by sparkling flowers.
举例而言,根据时间节点A的标准动作与人体动作之间的差异程度,生成的人体动作分值为94分,在拍摄界面生成的动画效果为“perfect”并搭配钻石闪烁。由此,可以使得用户及时了解自己做出的人体动作是否标准,从而提升了用户的代入感。For example, according to the degree of difference between the standard action of the time node A and the human body motion, the generated human action score is 94 points, and the animation effect generated on the shooting interface is “perfect” and is matched with the diamond flashing. Thereby, the user can be made aware of whether the human body movements made by the user are in a timely manner, thereby improving the user's sense of substitution.
Step 106: generate the target video according to the audio, the video frames, and the motion evaluation information of each human body motion.
In this embodiment of the present application, when the audio playback ends, the motion evaluation information of the human body motions corresponding to the different time nodes may be acquired, and the target video may then be generated according to the audio, the acquired video frames, and the corresponding motion evaluation information.
As a possible implementation, according to the human body motion recognized in each video frame, the motion evaluation information of the corresponding motion may be added to that frame, and the target video may then be generated from the audio and the video frames to which the evaluation information has been added.
In the video generation method of this embodiment, the selected audio and the standard motions corresponding to the time nodes in the audio are acquired; the audio is played, and video frames are captured during playback; each time the audio reaches a time node, the corresponding standard motion is displayed, and the human body motion in the video frame captured synchronously at that node is recognized; motion evaluation information of the human body motion is generated according to the degree of difference between the standard motion and the human body motion at the same time node; and the target video is generated according to the audio, the video frames, and the motion evaluation information. In this embodiment, because the standard motions are full-body motions the user needs to perform, compared with the prior-art dance mode in which the user merely steps on arrows, the dance motions are effectively enriched and the user experience is improved. In addition, generating motion evaluation information according to the degree of difference between the standard motion and the human body motion at the same time node allows the user to learn in time whether the motions are standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so the user can replay or share the video, enhancing the sense of participation.
As a possible implementation, to prevent the user from unintentionally triggering the shooting control of the electronic device and thus causing the camera to capture images by mistake, or to prevent the camera from capturing images before it is aimed at the user and thus recording invalid images, in this embodiment of the present application a preparation stage may be entered before the electronic device begins image capture. This process is described in detail below with reference to FIG. 2.
FIG. 2 is a schematic flowchart of a second video generation method according to an embodiment of the present application.
As shown in FIG. 2, the video generation method includes the following steps:
Step 201: display a preparation motion and capture a preparation image.
In this embodiment of the present application, the preparation motion may be displayed on a preparation interface. The preparation motion may be preset by a built-in program of the electronic device; it may be, for example, raising both arms horizontally, or some other motion, which is not limited here. While the preparation motion is displayed, the camera of the electronic device may capture a preparation image, which contains the human body motion made by the user.
As a possible implementation, the preparation motion may be displayed in any region of the preparation interface; it may remain stationary for a preset period, or it may move along a preset trajectory, which is not limited here. The preset trajectory may be preset by a built-in program of the electronic device.
As another possible implementation, so that the user can watch the preparation motion without being kept from viewing the other content on the screen of the electronic device, in this embodiment a semi-transparent mask may be displayed on the preparation interface, where the mask has a hollowed-out region of interest in which an image illustrating the preparation motion is displayed, that is, a schematic diagram of the preparation motion is shown in the region of interest. Alternatively, the preparation motion may be displayed on the preparation interface in the form of a bullet-screen comment, which is not limited here. In this way the user can view other content while watching the preparation motion, improving the user experience.
Step 202: determine that the human body motion in the preparation image matches the preparation motion.
In this embodiment of the present application, the human body motion in the preparation image may be recognized, and it may then be judged whether that motion matches the preparation motion; when it is determined that the human body motion in the preparation image matches the preparation motion, image capture may begin.
As a possible implementation, the camera used to capture the preparation image may be one capable of collecting depth information of the user, and the human body motion in the preparation image may be recognized from the acquired depth information. For example, the camera may be a depth camera that obtains the depth information of the human body in the preparation image while imaging, so that the human body motion in the preparation image can be recognized according to the depth information. In addition, the depth information of the human body motion may be acquired with structured light or a TOF lens, so that the human body motion in the preparation image can be recognized according to the depth information; this is not limited here.
As another possible implementation, the joints of the human body in the preparation image may be recognized, adjacent joints may then be connected to obtain the line segments between adjacent joints, and the human body motion may finally be determined according to the actual angles between those segments and a preset reference direction.
After the human body motion in the preparation image is recognized, whether it matches the preparation motion may be judged according to whether the degree of difference between the human body motion and the preparation motion exceeds a difference threshold. Specifically, the standard angle between each segment connecting two adjacent joints and the reference direction when the preparation motion is performed may be determined, and for each such segment the corresponding standard angle may be compared with the actual angle to obtain a difference value. When the difference value computed for every segment is within the error range, it may be determined that the human body motion in the preparation image matches the preparation motion; when the difference value computed for at least one segment falls outside the error range, it may be determined that the human body motion in the preparation image does not match the preparation motion.
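The matching check just described can be sketched as follows. The segment names, angle values, and 5-degree tolerance are illustrative assumptions; the patent does not fix a particular joint set or error range.

```python
def matches_preparation(standard_angles, actual_angles, tolerance):
    """standard_angles / actual_angles: dicts mapping a segment name
    (e.g. 'shoulder-elbow') to its angle against the reference
    direction, in degrees.  The motion matches only when every
    segment's |standard - actual| difference is within tolerance."""
    return all(
        abs(standard_angles[seg] - actual_angles[seg]) <= tolerance
        for seg in standard_angles
    )

standard = {"shoulder-elbow": 90.0, "elbow-wrist": 90.0}  # arms raised
actual_ok = {"shoulder-elbow": 87.0, "elbow-wrist": 93.0}
actual_bad = {"shoulder-elbow": 60.0, "elbow-wrist": 92.0}

matches_preparation(standard, actual_ok, 5.0)   # both segments within 5°
matches_preparation(standard, actual_bad, 5.0)  # one segment too far off
```

A single out-of-range segment is enough to reject the match, which is exactly the "at least one segment outside the error range" condition above.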
In the video generation method of this embodiment, a preparation stage is entered before the electronic device begins image capture. Specifically, a preparation motion is displayed and a preparation image is captured; it is then determined that the human body motion in the preparation image matches the preparation motion. In this embodiment, image capture begins only when the human body motion matches the preparation motion, which prevents the user from unintentionally triggering the shooting control of the electronic device and causing the camera to capture images by mistake, and prevents the camera from capturing images before it is aimed at the user and thus recording invalid images, ensuring the validity and accuracy of subsequent image capture.
As a possible implementation, to enhance the sense of participation and the fun of the video generation process, the human body motions made by the user may be evaluated. Referring to FIG. 3, on the basis of the embodiment shown in FIG. 1, after step 105 the video generation method may further include the following steps:
Step 301: display the motion evaluation information of each human body motion on the shooting interface used to capture the video frames.
In this embodiment of the present application, while the shooting interface displays a standard motion, multiple video frames may be captured synchronously, each frame having its own piece of motion evaluation information. The motion evaluation information of the human body motion is added to the synchronously captured video frames, that is, the motion evaluation information of each human body motion is displayed on the shooting interface used to capture the frames. As a possible implementation, the generated pieces of motion evaluation information may be filtered so that only the highest-rated one is retained, and the highest-rated motion evaluation information may then be added to at least one of the synchronously captured video frames, where that frame shows the human body motion corresponding to the highest-rated evaluation.
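The filtering step above, keeping only the highest-rated evaluation among the frames captured while one standard motion was displayed, can be sketched as follows; the frame identifiers and scores are made up for illustration.

```python
def best_evaluation(frame_scores):
    """frame_scores: list of (frame_id, score) pairs captured while one
    standard motion was displayed.  Returns the pair with the highest
    score, whose evaluation is then shown on the shooting interface."""
    return max(frame_scores, key=lambda fs: fs[1])

frames = [("f1", 72.0), ("f2", 94.0), ("f3", 88.0)]
best = best_evaluation(frames)  # the 94-point frame is retained
```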
Step 302: when the audio playback ends, generate total evaluation information according to the motion evaluation information of each human body motion.
In this embodiment of the present application, when the audio playback ends, a total score may be generated from the human motion scores contained in the motion evaluation information of the individual motions, and the total evaluation information may be generated together with the animation effect corresponding to the interval in which the total score falls.
As a possible implementation, a weight may be preset for each standard motion in the audio. After the motion evaluation information of each human body motion is determined, the human motion score of each motion may be multiplied by the corresponding weight to obtain a product, the products may be accumulated to obtain the total score, and the corresponding animation effect may then be determined according to the interval in which the total score falls.
For example, when the audio has 100 time nodes, that is, 100 standard motions, a weight may be set for each standard motion, for example 0.01 for each. After the motion evaluation information of each human body motion is determined, the human motion score of each motion may be multiplied by the corresponding weight to obtain a product, and the products may be accumulated to obtain the total score. If the total score obtained is 87, it falls in the interval [80, 90), so the animation effect may be "good" with sparkling flowers.
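The weighted accumulation and interval lookup in this example can be sketched as follows. The [90, 100] → "perfect" and [80, 90) → "good" mappings follow the text; the label for lower intervals is an invented placeholder.

```python
def total_score(scores, weight=0.01):
    """Accumulate per-motion scores, each multiplied by its weight
    (here a uniform 0.01 for 100 standard motions)."""
    return sum(s * weight for s in scores)

def animation_for(score):
    """Map the total score to the animation effect of its interval."""
    if score >= 90:
        return "perfect"     # with sparkling diamonds
    if score >= 80:
        return "good"        # with sparkling flowers
    return "keep trying"     # assumed label for lower intervals

scores = [87.0] * 100           # every one of the 100 motions scored 87
total = total_score(scores)     # 100 * 87 * 0.01 = 87.0
effect = animation_for(total)   # falls in [80, 90), so "good"
```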
Step 303: display the total evaluation information on a result display interface.
In this embodiment, after the total evaluation information is determined, it may be displayed on the result display interface, so that the user can learn whether the motions he or she made are standard, improving the user experience.
In the video generation method of this embodiment, the motion evaluation information of each human body motion is displayed on the shooting interface used to capture the video frames; when the audio playback ends, total evaluation information is generated according to the motion evaluation information of each motion; and the total evaluation information is displayed on the result display interface. In this way the user can learn whether the motions he or she made are standard, improving the user experience.
In this embodiment of the present application, the result display interface further includes a replay control, a shooting control, and a sharing control. Specifically, when the electronic device detects a trigger operation by the user on the replay control, the target video may be played, so that the user can review and correct the motions during playback and perform them more accurately the next time a video is recorded; when the electronic device detects a trigger operation by the user on the shooting control, the shooting interface may be displayed and steps 102-106 triggered to regenerate the target video, that is, the user can shoot a video again by triggering the shooting control; and when the electronic device detects a trigger operation on the sharing control, the target video is shared.
As a possible implementation, referring to FIG. 4, sharing the target video specifically includes the following steps:
Step 401: display a sharing interface.
In this embodiment of the present application, the sharing interface includes a native-platform sharing control and third-party-platform sharing controls. The third-party platform may be, for example, Instagram, Facebook, or Twitter.
In this embodiment of the present application, the sharing interface is displayed so that the user can share the target video through the sharing controls of the sharing interface.
Step 402: when a trigger operation on the native-platform sharing control is detected, display a shooting control and a display control on the sharing interface.
In this embodiment of the present application, when the user triggers the native-platform sharing control, the sharing interface may display a shooting control and a display control. When the user taps the shooting control, the electronic device may acquire the audio of the target video and display the preparation interface, so that the user can regenerate a video based on that audio. When the user taps the display control, step 403 may be triggered.
Step 403: when a trigger operation on the display control is detected, display a video aggregation page; the video aggregation page contains the target video and/or videos already shared on the native platform.
In this embodiment of the present application, when the user taps the display control, the electronic device may display the video aggregation page, so that the user can share the target video or view videos already shared by other users.
Optionally, the video aggregation page may also contain a shooting control, so that the user can reselect audio through the shooting control and record a video.
In the video generation method of this embodiment, a sharing interface is displayed; when a trigger operation on the native-platform sharing control is detected, a shooting control and a display control are displayed on the sharing interface; and when a trigger operation on the display control is detected, a video aggregation page is displayed, which contains the target video and/or videos already shared on the native platform. The user can thus share the target video so that other users can watch it, enhancing the sense of participation.
To implement the above embodiments, the present application further proposes a video generation apparatus.
FIG. 5 is a schematic structural diagram of a video generation apparatus according to an embodiment of the present application.
As shown in FIG. 5, the video generation apparatus 500 includes a selection module 510, a capture module 520, a display module 530, an evaluation module 540, and a generation module 550, where:
The selection module 510 is configured to acquire the selected audio and the standard motions corresponding to the time nodes in the audio.
The capture module 520 is configured to play the audio and capture video frames during playback.
The display module 530 is configured to display the corresponding standard motion each time the audio reaches a time node, and to recognize the human body motion in the video frame captured synchronously at that node.
As a possible implementation, the display module 530 is specifically configured to: recognize the joints of the human body in a video frame; connect adjacent joints to obtain the line segments between adjacent joints; and determine the human body motion according to the actual angles between those segments and a preset reference direction.
The evaluation module 540 is configured to generate the motion evaluation information of a human body motion according to the degree of difference between the standard motion and the human body motion at the same time node.
The generation module 550 is configured to generate the target video according to the audio, the video frames, and the motion evaluation information of each human body motion.
As a possible implementation, the generation module 550 is specifically configured to: add, to each video frame, the motion evaluation information of the human body motion recognized in that frame; and generate the target video from the audio and the video frames to which the evaluation information has been added.
Further, as a possible implementation of an embodiment of the present application, referring to FIG. 6, on the basis of the embodiment shown in FIG. 5, the video generation apparatus 500 may further include:
a display-and-determination module 560, configured to display a preparation motion and capture a preparation image before the audio is played and video frames are captured synchronously, and to determine that the human body motion in the preparation image matches the preparation motion;
a display-and-generation module 570, configured to: display the motion evaluation information of each human body motion on the shooting interface used to capture the video frames, after the motion evaluation information is generated according to the degree of difference between the standard motion and the human body motion at the same time node; generate total evaluation information according to the motion evaluation information of each human body motion when the audio playback ends; and display the total evaluation information on a result display interface; and
an interface display module 580, configured to display a song selection interface when an operation on the shooting control is detected, before the selected audio is acquired.
In this embodiment of the present application, the result display interface further includes a replay control, a shooting control, and a sharing control. The display-and-generation module 570 is further configured to: play the target video when a trigger operation on the replay control is detected; display the shooting interface to regenerate the target video when a trigger operation on the shooting control is detected; and share the target video when a trigger operation on the sharing control is detected.
As a possible implementation, the display-and-generation module 570 is specifically configured to: display a sharing interface, where the sharing interface includes a native-platform sharing control and third-party-platform sharing controls; display a shooting control and a display control on the sharing interface when a trigger operation on the native-platform sharing control is detected; and display a video aggregation page when a trigger operation on the display control is detected, the video aggregation page containing the target video and/or videos already shared on the native platform.
It should be noted that the foregoing explanation of the video generation method embodiments also applies to the video generation apparatus 500 of this embodiment, and details are not repeated here.
In the video generation apparatus of this embodiment, the selected audio and the standard motions corresponding to the time nodes in the audio are acquired; the audio is played, and video frames are captured during playback; each time the audio reaches a time node, the corresponding standard motion is displayed, and the human body motion in the video frame captured synchronously at that node is recognized; motion evaluation information of the human body motion is generated according to the degree of difference between the standard motion and the human body motion at the same time node; and the target video is generated according to the audio, the video frames, and the motion evaluation information. In this embodiment, because the standard motions are full-body motions the user needs to perform, compared with the prior-art dance mode in which the user merely steps on arrows, the dance motions are effectively enriched and the user experience is improved. In addition, generating motion evaluation information according to the degree of difference between the standard motion and the human body motion at the same time node allows the user to learn in time whether the motions are standard, further improving the user experience. Finally, a video is generated when the audio playback ends, so the user can replay or share the video, enhancing the sense of participation.
An embodiment of the present application further provides an electronic device that includes the apparatus described in any of the foregoing embodiments.
FIG. 7 is a schematic structural diagram of an embodiment of an electronic device of the present application, which can implement the processes of the embodiments shown in FIGS. 1-6 of the present application. As shown in FIG. 7, the electronic device may include: a housing 71, a processor 72, a memory 73, a circuit board 74, and a power supply circuit 75, where the circuit board 74 is disposed inside the space enclosed by the housing 71, and the processor 72 and the memory 73 are disposed on the circuit board 74; the power supply circuit 75 is configured to supply power to the circuits or components of the electronic device; the memory 73 is configured to store executable program code; and the processor 72 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 73, so as to perform the video generation method described in any of the foregoing embodiments.
For the specific execution of the above steps by the processor 72 and the further steps performed by the processor 72 by running the executable program code, reference may be made to the description of the embodiments shown in FIGS. 1-6 of the present application, and details are not repeated here.
The electronic device exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication capability, with voice and data communication as their primary goal. Such terminals include smartphones (for example, the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing capabilities, and generally also have mobile Internet access. Such terminals include PDA, MID, and UMPC devices, for example the iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (for example, the iPod), handheld game consoles, e-book readers, as well as smart toys and portable in-vehicle navigation devices.
(4) Servers: devices that provide computing services. A server consists of a processor, a hard disk, memory, a system bus, and so on; its architecture is similar to that of a general-purpose computer, but because highly reliable services must be provided, the requirements on processing capability, stability, reliability, security, scalability, and manageability are high.
(5) Other electronic devices with data interaction capability.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that can readily occur to those skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
To implement the above embodiments, the present application further provides a non-transitory computer-readable storage medium having a computer program stored thereon, wherein when the program is executed by a processor, the video generation method described in the foregoing embodiments is implemented.
To implement the above embodiments, the present application further provides a computer program product, wherein when instructions in the computer program product are executed by a processor, the video generation method described in the foregoing embodiments is performed.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, those skilled in the art may combine different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, for example two or three, unless otherwise specifically and explicitly defined.
Any process or method description in the flowcharts, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process; and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved. This should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it if necessary, and then be stored in a computer memory.
It should be understood that parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any of the following technologies known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims (19)

  1. A video generation method, characterized by comprising the following steps:
    obtaining selected audio, and a standard action corresponding to each time node in the audio;
    playing the audio, and capturing video frames during playback of the audio;
    when the audio is played to each time node, displaying the corresponding standard action, and recognizing a human body action in the video frame captured synchronously at the time node;
    generating action evaluation information of the human body action according to a degree of difference between the standard action and the human body action at the same time node;
    generating a target video according to the audio, the video frames, and the action evaluation information of each human body action.
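As a non-limiting illustration of the evaluation step in claim 1 — mapping the degree of difference between the standard action and the recognized human body action at a time node to action evaluation information — a minimal sketch could look as follows. The angle-based pose representation, tolerance thresholds, and grade labels are assumptions for illustration, not details taken from the application:

```python
def evaluate_action(standard_angles, detected_angles, tolerance_deg=15.0):
    """Map the difference between a standard action and a detected human body
    action (both reduced to lists of joint-pair angles, in degrees) to an
    evaluation grade. Thresholds and grade names are illustrative."""
    # Mean absolute angular difference over all compared joint pairs.
    diffs = [abs(s - d) for s, d in zip(standard_angles, detected_angles)]
    mean_diff = sum(diffs) / len(diffs)
    # Smaller difference degree => better evaluation information.
    if mean_diff <= tolerance_deg:
        return "PERFECT"
    if mean_diff <= 2 * tolerance_deg:
        return "GOOD"
    return "MISS"
```

Here a pose is reduced to a list of joint-pair angles; in the application those angles would come from the joint recognition described in claim 8.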
  2. The video generation method according to claim 1, wherein before playing the audio and synchronously capturing video frames, the method further comprises:
    displaying a preparation action, and capturing a preparation image;
    determining that a human body action in the preparation image matches the preparation action.
  3. The video generation method according to claim 1 or 2, wherein generating the target video according to the audio, the video frames, and the action evaluation information of each human body action comprises:
    adding, to each video frame, action evaluation information of the corresponding human body action according to the human body action recognized in the video frame;
    generating the target video according to the audio and the video frames to which the action evaluation information has been added.
  4. The video generation method according to any one of claims 1-3, wherein after generating the action evaluation information of the human body action according to the degree of difference between the standard action and the human body action at the same time node, the method further comprises:
    displaying the action evaluation information of each human body action on a shooting interface used for capturing the video frames;
    when the audio playback ends, generating total evaluation information according to the action evaluation information of each human body action;
    displaying the total evaluation information on a result display interface.
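The total evaluation information of claim 4 could, as one hedged possibility, be an aggregate of the per-action grades into a score plus per-grade counts. The point values and grade names below are illustrative assumptions, not taken from the application:

```python
from collections import Counter

def total_evaluation(per_action_grades):
    """Aggregate per-action evaluation info into the total evaluation info
    shown on the result display interface (scoring scheme is assumed)."""
    points = {"PERFECT": 2, "GOOD": 1, "MISS": 0}
    score = sum(points[g] for g in per_action_grades)
    counts = Counter(per_action_grades)
    return {"score": score,
            "max_score": 2 * len(per_action_grades),
            "perfect": counts["PERFECT"],
            "good": counts["GOOD"],
            "miss": counts["MISS"]}
```

For example, grades of PERFECT, GOOD, MISS, PERFECT yield a score of 5 out of a possible 8.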
  5. The video generation method according to claim 4, wherein the result display interface further comprises a review control, a shooting control, and a sharing control; and the method further comprises:
    playing the target video when a trigger operation on the review control is detected;
    displaying the shooting interface to regenerate the target video when a trigger operation on the shooting control is detected;
    sharing the target video when a trigger operation on the sharing control is detected.
  6. The video generation method according to claim 5, wherein sharing the target video comprises:
    displaying a sharing interface, wherein the sharing interface comprises an own-platform sharing control and a third-party-platform sharing control;
    displaying the shooting control and a display control on the sharing interface when a trigger operation on the own-platform sharing control is detected;
    displaying a video aggregation page when a trigger operation on the display control is detected, wherein the video aggregation page comprises the target video and/or videos already shared on the own platform.
  7. The video generation method according to any one of claims 1-6, wherein before obtaining the selected audio, the method further comprises:
    displaying a song selection interface when an operation on a shooting control is detected.
  8. The video generation method according to any one of claims 1-7, wherein recognizing the human body action in the video frame captured synchronously at the time node comprises:
    identifying joints of a human body in the video frame;
    connecting adjacent joints of the human body to obtain a connection line between the adjacent joints;
    determining the human body action according to an actual angle between the connection line between the adjacent joints and a preset reference direction.
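The recognition step of claim 8 — connecting adjacent joints and comparing the angle of the connecting line against a preset reference direction — could be sketched in 2-D as follows. The reference direction, joint coordinates, the "raised arm" rule, and its threshold are illustrative assumptions only:

```python
import math

def joint_line_angle(joint_a, joint_b, reference=(1.0, 0.0)):
    """Angle in degrees (0-180) between the line connecting two adjacent
    joints and a preset reference direction (default: horizontal)."""
    vx, vy = joint_b[0] - joint_a[0], joint_b[1] - joint_a[1]
    rx, ry = reference
    cos_theta = (vx * rx + vy * ry) / (math.hypot(vx, vy) * math.hypot(rx, ry))
    # Clamp to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def classify_arm(shoulder, wrist, raised_threshold=60.0):
    """Toy rule: the arm counts as raised when the shoulder-to-wrist line is
    steep relative to the horizontal reference (threshold is assumed)."""
    angle = joint_line_angle(shoulder, wrist)
    return "arm_raised" if angle >= raised_threshold else "arm_lowered"
```

A full implementation would apply such angle tests to every pair of adjacent joints returned by the joint detector, and match the resulting angle vector against the standard action.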
  9. A video generation apparatus, characterized in that the apparatus comprises:
    a selection module, configured to obtain selected audio and a standard action corresponding to each time node in the audio;
    a capture module, configured to play the audio and capture video frames during playback of the audio;
    a display module, configured to display the corresponding standard action when the audio is played to each time node, and to recognize a human body action in the video frame captured synchronously at the time node;
    an evaluation module, configured to generate action evaluation information of the human body action according to a degree of difference between the standard action and the human body action at the same time node;
    a generation module, configured to generate a target video according to the audio, the video frames, and the action evaluation information of each human body action.
  10. The video generation apparatus according to claim 9, wherein the apparatus further comprises:
    a display determination module, configured to, before the audio is played and video frames are synchronously captured, display a preparation action, capture a preparation image, and determine that a human body action in the preparation image matches the preparation action.
  11. The video generation apparatus according to claim 9 or 10, wherein the generation module is specifically configured to:
    add, to each video frame, action evaluation information of the corresponding human body action according to the human body action recognized in the video frame;
    generate the target video according to the audio and the video frames to which the action evaluation information has been added.
  12. The video generation apparatus according to any one of claims 9-11, wherein the apparatus further comprises:
    a display generation module, configured to, after the action evaluation information of the human body action is generated according to the degree of difference between the standard action and the human body action at the same time node, display the action evaluation information of each human body action on a shooting interface used for capturing the video frames; generate total evaluation information according to the action evaluation information of each human body action when the audio playback ends; and display the total evaluation information on a result display interface.
  13. The video generation apparatus according to claim 12, wherein the result display interface further comprises a review control, a shooting control, and a sharing control; and the display generation module is further configured to:
    play the target video when a trigger operation on the review control is detected;
    display the shooting interface to regenerate the target video when a trigger operation on the shooting control is detected;
    share the target video when a trigger operation on the sharing control is detected.
  14. The video generation apparatus according to claim 13, wherein the display generation module is specifically configured to:
    display a sharing interface, wherein the sharing interface comprises an own-platform sharing control and a third-party-platform sharing control;
    display the shooting control and a display control on the sharing interface when a trigger operation on the own-platform sharing control is detected;
    display a video aggregation page when a trigger operation on the display control is detected, wherein the video aggregation page comprises the target video and/or videos already shared on the own platform.
  15. The video generation apparatus according to any one of claims 9-14, wherein the apparatus further comprises:
    an interface display module, configured to display a song selection interface when an operation on a shooting control is detected before the selected audio is obtained.
  16. The video generation apparatus according to any one of claims 9-15, wherein the display module is specifically configured to:
    identify joints of a human body in the video frame;
    connect adjacent joints of the human body to obtain a connection line between the adjacent joints;
    determine the human body action according to an actual angle between the connection line between the adjacent joints and a preset reference direction.
  17. An electronic device, characterized by comprising: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside a space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the electronic device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the video generation method according to any one of claims 1-8.
  18. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein when the program is executed by a processor, the video generation method according to any one of claims 1-8 is implemented.
  19. A computer program product, wherein when instructions in the computer program product are executed by a processor, the video generation method according to any one of claims 1-8 is performed.
PCT/CN2018/098602 2017-11-23 2018-08-03 Video generation method and device, and electronic apparatus WO2019100757A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711185439.6 2017-11-23
CN201711185439.6A CN107920269A (en) 2017-11-23 2017-11-23 Video generation method, device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2019100757A1 true WO2019100757A1 (en) 2019-05-31

Family

ID=61897675

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/098602 WO2019100757A1 (en) 2017-11-23 2018-08-03 Video generation method and device, and electronic apparatus

Country Status (2)

Country Link
CN (1) CN107920269A (en)
WO (1) WO2019100757A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110958386A (en) * 2019-11-12 2020-04-03 北京达佳互联信息技术有限公司 Video synthesis method and device, electronic equipment and computer-readable storage medium
CN112750184A (en) * 2019-10-30 2021-05-04 阿里巴巴集团控股有限公司 Data processing, action driving and man-machine interaction method and equipment
CN112752142A (en) * 2020-08-26 2021-05-04 腾讯科技(深圳)有限公司 Dubbing data processing method and device and electronic equipment
CN113132808A (en) * 2019-12-30 2021-07-16 腾讯科技(深圳)有限公司 Video generation method and device and computer readable storage medium
CN113283384A (en) * 2021-06-17 2021-08-20 贝塔智能科技(北京)有限公司 Taiji interaction system based on limb recognition technology
CN113365133A (en) * 2021-06-02 2021-09-07 北京字跳网络技术有限公司 Video sharing method, device, equipment and medium
CN113810536A (en) * 2021-08-02 2021-12-17 惠州Tcl移动通信有限公司 Method, device and terminal for displaying information based on motion trajectory of human body in video
CN113949891A (en) * 2021-10-13 2022-01-18 咪咕文化科技有限公司 Video processing method and device, server and client
CN114745576A (en) * 2022-03-25 2022-07-12 上海合志信息技术有限公司 Family fitness interaction method and device, electronic equipment and storage medium

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107920269A (en) * 2017-11-23 2018-04-17 乐蜜有限公司 Video generation method, device and electronic equipment
CN109068081A (en) * 2018-08-10 2018-12-21 北京微播视界科技有限公司 Video generation method, device, electronic equipment and storage medium
CN109525891B (en) * 2018-11-29 2020-01-21 北京字节跳动网络技术有限公司 Multi-user video special effect adding method and device, terminal equipment and storage medium
CN109621425B (en) * 2018-12-25 2023-08-18 广州方硅信息技术有限公司 Video generation method, device, equipment and storage medium
CN113678137B (en) * 2019-08-18 2024-03-12 聚好看科技股份有限公司 Display apparatus
CN110465074B (en) * 2019-08-20 2023-10-20 腾讯科技(深圳)有限公司 Information prompting method and device
CN112560605B (en) * 2020-12-02 2023-04-18 北京字节跳动网络技术有限公司 Interaction method, device, terminal, server and storage medium
CN113596353A (en) * 2021-08-10 2021-11-02 广州艾美网络科技有限公司 Somatosensory interaction data processing method and device and somatosensory interaction equipment
CN114513694A (en) * 2022-02-17 2022-05-17 平安国际智慧城市科技股份有限公司 Scoring determination method and device, electronic equipment and storage medium
CN114549706A (en) * 2022-02-21 2022-05-27 成都工业学院 Animation generation method and animation generation device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN201349264Y (en) * 2008-12-30 2009-11-18 深圳市同洲电子股份有限公司 Motion image processing device and system
CN102799191A (en) * 2012-08-07 2012-11-28 北京国铁华晨通信信息技术有限公司 Method and system for controlling pan/tilt/zoom based on motion recognition technology
US9805766B1 (en) * 2016-07-19 2017-10-31 Compal Electronics, Inc. Video processing and playing method and video processing apparatus thereof
CN107920269A (en) * 2017-11-23 2018-04-17 乐蜜有限公司 Video generation method, device and electronic equipment
CN107952238A (en) * 2017-11-23 2018-04-24 乐蜜有限公司 Video generation method, device and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5704863B2 (en) * 2010-08-26 2015-04-22 キヤノン株式会社 Image processing apparatus, image processing method, and storage medium
CN102724449A (en) * 2011-03-31 2012-10-10 青岛海信电器股份有限公司 Interactive TV and method for realizing interaction with user by utilizing display device
CN102622509A (en) * 2012-01-21 2012-08-01 天津大学 Three-dimensional game interaction system based on monocular video
CN103390174A (en) * 2012-05-07 2013-11-13 深圳泰山在线科技有限公司 Physical education assisting system and method based on human body posture recognition
US10803762B2 (en) * 2013-04-02 2020-10-13 Nec Solution Innovators, Ltd Body-motion assessment device, dance assessment device, karaoke device, and game device
CN104899912B (en) * 2014-03-07 2019-07-05 腾讯科技(深圳)有限公司 Animation method and back method and equipment


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112750184A (en) * 2019-10-30 2021-05-04 阿里巴巴集团控股有限公司 Data processing, action driving and man-machine interaction method and equipment
CN112750184B (en) * 2019-10-30 2023-11-10 阿里巴巴集团控股有限公司 Method and equipment for data processing, action driving and man-machine interaction
CN110958386A (en) * 2019-11-12 2020-04-03 北京达佳互联信息技术有限公司 Video synthesis method and device, electronic equipment and computer-readable storage medium
CN113132808B (en) * 2019-12-30 2022-07-29 腾讯科技(深圳)有限公司 Video generation method and device and computer readable storage medium
CN113132808A (en) * 2019-12-30 2021-07-16 腾讯科技(深圳)有限公司 Video generation method and device and computer readable storage medium
CN112752142A (en) * 2020-08-26 2021-05-04 腾讯科技(深圳)有限公司 Dubbing data processing method and device and electronic equipment
CN113365133A (en) * 2021-06-02 2021-09-07 北京字跳网络技术有限公司 Video sharing method, device, equipment and medium
CN113365133B (en) * 2021-06-02 2022-10-18 北京字跳网络技术有限公司 Video sharing method, device, equipment and medium
CN113283384A (en) * 2021-06-17 2021-08-20 贝塔智能科技(北京)有限公司 Taiji interaction system based on limb recognition technology
CN113810536A (en) * 2021-08-02 2021-12-17 惠州Tcl移动通信有限公司 Method, device and terminal for displaying information based on motion trajectory of human body in video
CN113810536B (en) * 2021-08-02 2023-12-12 惠州Tcl移动通信有限公司 Information display method, device and terminal based on human limb action track in video
CN113949891A (en) * 2021-10-13 2022-01-18 咪咕文化科技有限公司 Video processing method and device, server and client
CN113949891B (en) * 2021-10-13 2023-12-08 咪咕文化科技有限公司 Video processing method and device, server and client
CN114745576A (en) * 2022-03-25 2022-07-12 上海合志信息技术有限公司 Family fitness interaction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107920269A (en) 2018-04-17

Similar Documents

Publication Publication Date Title
WO2019100757A1 (en) Video generation method and device, and electronic apparatus
WO2019100755A1 (en) Video generation method and device, and electronic apparatus
WO2019100753A1 (en) Video generation method and apparatus, and electronic device
WO2019100756A1 (en) Image acquisition method and apparatus, and electronic device
WO2019100754A1 (en) Human body movement identification method and device, and electronic device
US9278288B2 (en) Automatic generation of a game replay video
CN107096221B (en) System and method for providing time-shifted intelligent synchronized gaming video
CN107029429B (en) System, method, and readable medium for implementing time-shifting tutoring for cloud gaming systems
KR102342933B1 (en) Information processing device, control method of information processing device, and storage medium
CN104620522B (en) User interest is determined by detected body marker
US10549203B2 (en) Systems and methods for providing time-shifted intelligently synchronized game video
KR20200135946A (en) Deinterleaving of gameplay data
CN113453034B (en) Data display method, device, electronic equipment and computer readable storage medium
US20160045828A1 (en) Apparatus and method of user interaction
US11334621B2 (en) Image search system, image search method and storage medium
JP7248437B2 (en) Programs, electronics and data recording methods
KR102365431B1 (en) Electronic device for providing target video in sports play video and operating method thereof
JP2014023745A (en) Dance teaching device
US20170193668A1 (en) Intelligent Equipment-Based Motion Sensing Control Method, Electronic Device and Intelligent Equipment
CN115237314B (en) Information recommendation method and device and electronic equipment
CN115442658B (en) Live broadcast method, live broadcast device, storage medium, electronic equipment and product
CN110102057A (en) A kind of cut scene marching method, device, equipment and medium
Chen Capturing fast motion with consumer grade unsynchronized rolling-shutter cameras
WO2018035832A1 (en) Advertisement video playback device
WO2018035829A1 (en) Advertisement playback device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18880767

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18880767

Country of ref document: EP

Kind code of ref document: A1