WO2022116751A1 - Interaction method and apparatus, and terminal, server and storage medium - Google Patents

Interaction method and apparatus, and terminal, server and storage medium

Info

Publication number
WO2022116751A1
Authority
WO
WIPO (PCT)
Prior art keywords
image frame
frame data
body part
information
action icon
Prior art date
Application number
PCT/CN2021/127010
Other languages
French (fr)
Chinese (zh)
Inventor
Cong Yandong (丛延东)
Original Assignee
Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd.
Publication of WO2022116751A1 publication Critical patent/WO2022116751A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an interaction method, device, terminal, server and storage medium.
  • body recognition technology, as a branch of computer vision processing technology, has an increasingly wide range of applications, such as video-based fitness training, video-based dance teaching, and video-based game experiences. How to apply the body recognition results of user images collected by a camera to the guidance and evaluation of the user's body movements, so as to improve the user's action experience, remains a problem to be solved.
  • the embodiments of the present disclosure provide an interaction method, apparatus, terminal, server, and storage medium, which can improve user interaction experience.
  • an embodiment of the present disclosure provides an interaction method, which is applied to a client, including:
  • the second image frame data is image frame data at a preset time point after the first image frame data
  • the evaluation result is determined according to the degree of matching between the state information of the target human body part and the action icon in the second image frame data.
  • an embodiment of the present disclosure further provides an interaction method, applied to a server, including:
  • the position data of human body parts of the same image frame in the plurality of candidate videos are fused to obtain a standard position data set;
  • the preset position information of the action icon corresponding to the target body part is determined by using the searched position data, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
  • an embodiment of the present disclosure further provides an interaction device, which is configured on a client and includes:
  • a first acquisition module configured to collect and display the first image frame data of the user
  • a first determining module configured to identify at least one human body part in the first image frame data, and determine the position information of the human body part
  • a display position determination module configured to determine the display position of the action icon based on the position information of the at least one human body part and the preset position information of the action icon corresponding to the human body part, and to display the action icon at the display position;
  • a second collection module configured to collect and display second image frame data of the user; wherein, the second image frame data is image frame data at a preset time point after the first image frame data;
  • a second determination module configured to determine the target human body part associated with the action icon in the second image frame data and the state information of the target human body part
  • An evaluation module configured to determine an evaluation result according to the degree of matching between the state information of the target human body part in the second image frame data and the action icon.
  • an embodiment of the present disclosure further provides an interaction device, which is configured on a server and includes:
  • a position data extraction module configured to obtain a plurality of candidate videos, and extract the position data of human body parts of each image frame in the plurality of candidate videos;
  • a standard position data set determination module configured to fuse the body part position data of the same image frame in the plurality of candidate videos based on preset rules to obtain a standard position data set
  • a position data search module configured to search for the position data of the target body part in the standard position data set in at least one image frame of the plurality of candidate videos
  • a preset position information determination module configured to determine the preset position information of the action icon corresponding to the target body part by using the searched position data, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
  • an embodiment of the present disclosure further provides a terminal, including a memory, a processor, and a camera, wherein:
  • the camera is used to collect the user's image frame data in real time
  • a computer program is stored in the memory, and when the computer program is executed by the processor, the processor executes any interaction method provided by the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides a server, including a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the processor executes any interaction method provided by the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the processor executes any interaction method provided by the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides a computer program product, wherein the computer program product includes computer program instructions, and when executed by a processor, the computer program instructions cause the processor to execute any interaction method provided by the embodiments of the present disclosure.
  • the client can call the camera to collect the first image frame data and the second image frame data of the user in real time and display them. The first image frame data is the image frame data collected earlier: the client first identifies, in real time, the human body part in the first image frame data and determines its position information, and then combines this with the preset position information of the action icon to determine the exact display position of the action icon on the first image frame data. That is, as the position of the human body part changes, the display position of the action icon on the first image frame data can be adjusted (or corrected) in real time. Finally, the evaluation result is determined according to the degree of matching between the state information of the target human body part associated with the action icon in the second image frame data and the action icon.
  • the embodiment of the present disclosure realizes the effective combination of the user image frame data collected by the camera and the action icon to be displayed in the image frame data, dynamically adjusts the display position of the action icon according to the position of the user's body part, and accurately evaluates the state of the user's body part information to improve the user's interactive experience.
  • FIG. 1 is a flowchart of an interaction method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of image frame data displaying action icons according to an embodiment of the present disclosure
  • FIG. 3 is a flowchart of another interaction method provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of image frame data showing an action icon and a guiding video animation provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of image frame data showing an animation of an evaluation result provided by an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of displaying a shared video on the same screen according to an embodiment of the present disclosure
  • FIG. 7 is a flowchart of another interaction method provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of an interaction apparatus according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of another interaction apparatus provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a server according to an embodiment of the present disclosure.
  • FIG. 1 is a flowchart of an interaction method provided by an embodiment of the present disclosure, which is applied to a client.
  • the method is applicable to scenarios in which the user image frame data collected by the camera in real time needs to be combined with the action icons to be displayed on that image frame data, and the state information of the human body part in the collected user image frame data needs to be evaluated.
  • the method can be executed by an interactive device configured on the client, and the device can be implemented by software and/or hardware.
  • the client mentioned in the embodiment of the present disclosure may include any client with a video interaction function, and the terminal device on which the client is installed may include, but not limited to, a smart phone, a tablet computer, a notebook, and the like.
  • the types of the state information of the user's body parts may include, but are not limited to, state information of body parts related to dance games, dance training, fitness movements, teaching actions, and the like; that is, the embodiments of the present disclosure are applicable to application scenarios such as games, fitness, and teaching.
  • the interaction method provided by the embodiment of the present disclosure may include S101-S106:
  • S101 Collect and display first image frame data of a user.
  • the user can pre-select the entire set of action video content to be completed and, before starting to execute the relevant actions, touch the image capture control (or video recording control) on the client interface to trigger an image capture request; in response, the client calls the camera to collect the user's image frame data in real time and display it on the interface.
  • the first image frame data may be any image frame data collected by the camera in real time, and the word "first" does not have any limited meaning in order.
  • the body parts identified in the first image frame data include at least one of a head, an arm, a hand, a foot, and a leg.
  • in S102, human body recognition technology can be used to identify the human body parts in the collected user image frame data in real time and simultaneously determine the position information of those parts; the position information may specifically be the position information of key points on the human body parts.
  • for the implementation principle of the human body recognition technology, reference may be made to the prior art, which is not specifically limited in the embodiments of the present disclosure.
  • the preset position information of the action icon is used to constrain the display position of the action icon in the user image frame data, and can be pre-determined by the server in the development stage, and then delivered to the client.
  • the preset position information of the action icon may include relative position information between the to-be-displayed position of the action icon and the corresponding body part.
  • the client may determine whether an action icon needs to be displayed in the currently collected first image frame data based on the collection time information (or video recording time information) of the user's first image frame data. For example, when recording a dance action video with a duration of 30 seconds, it may be preset that the action icon needs to be displayed in the user image frame data collected in real time at the 5th, 15th, and 25th seconds of the recording. Therefore, while the user completes the dance actions, the client can determine whether the action icon needs to be displayed in the current image frame data based on the collection time information of the user's current image frame data or the current recording time of the dance action video. In addition, the collection time information of the image frame data and the video recording time information can be determined from each other: if the client records the collection time of the first collected frame of user image data as 0 seconds, the collection time of the current image frame data is also the current recording time of the video.
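  • as an illustrative sketch of the time-based decision above (not the patent's actual implementation; the preset times, tolerance window, and function name are assumptions for illustration), the client could map the current recording time to a show/hide decision roughly as follows:

```python
# Hypothetical sketch: decide whether an action icon should be displayed in
# the frame currently being collected, based on the video recording time.
# The preset display times (5 s, 15 s, 25 s) come from the example above;
# the tolerance window is an assumption.
PRESET_DISPLAY_TIMES = (5.0, 15.0, 25.0)  # seconds into the recording
DISPLAY_WINDOW = 0.5                      # tolerance around each preset time

def should_display_icon(frame_time: float) -> bool:
    """frame_time: collection time of the current frame, in seconds,
    with the first collected frame recorded as 0 seconds."""
    return any(abs(frame_time - t) <= DISPLAY_WINDOW
               for t in PRESET_DISPLAY_TIMES)
```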
  • the client can also determine whether the action icon needs to be displayed in the current image frame data based on a predetermined display correspondence between specific image frame data and the action icon. For example, if the client collects user image frame data showing a specified body movement, it displays the action icon in that image frame data, where the specified body movement is the body movement that the display correspondence requires to be present in the image frame data.
  • the client dynamically determines the display position of the action icon in the user's image frame data, thereby ensuring that the action icon is displayed accurately in the user's image frame data.
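  • since the preset position information is described as relative position information between the icon and its body part, the display position could be computed, in a minimal hypothetical sketch (names and the 2D pixel-coordinate convention are assumptions), by offsetting the recognized body part position:

```python
def icon_display_position(part_pos, preset_offset):
    """Combine the recognized body part position with the icon's preset
    relative position (delivered by the server) to get the display position.

    part_pos:      (x, y) pixel position of the body part in the current frame
    preset_offset: (dx, dy) of the icon relative to that body part
    """
    return (part_pos[0] + preset_offset[0], part_pos[1] + preset_offset[1])
```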
  • the preset position information of the action icons may also be called chart information, which defines the relative position information of the human body parts and the action icons, and the action icons may also be called note points.
  • the preset position information of the action icon may be obtained based on the position data of the body part corresponding to the action icon in the standard data set.
  • the image frame data in which the action icon needs to be displayed may be predetermined according to the display requirement of the action icon (for example, the action icon is displayed at a specified moment of video recording).
  • for example, the developer can pre-determine that when the dance game progresses to the Nth second, an action icon is displayed at a preset position in the user's image frame data, such as the shoulder; the developer then determines the relative position information between the action icon and the preset position, based on the position data of the preset position in the standard data set for the image frame data of the Nth second, as the preset position information of the action icon.
  • the standard dataset is obtained by fusing the position data of human body parts of the same image frame in multiple candidate videos (referring to at least two) based on preset rules.
  • the same image frame in the multiple candidate videos presents the same state information of human body parts, for example, the same human body action information; for instance, dance videos of the same dance recorded by different people can all be used as candidate videos.
  • the position data of human body parts in each frame of image data in each candidate video can be obtained by using a motion capture system to perform motion capture.
  • the server may determine the weight value of each candidate video; and then, based on the weight value of each candidate video, perform a weighted average calculation on the position data of human body parts of the same image frame in the multiple candidate videos to obtain a standard position data set.
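  • the weighted-average fusion described above can be sketched as follows (a simplified illustration assuming 2D keypoint coordinates and one weight per candidate video; function and variable names are hypothetical):

```python
def fuse_positions(candidate_positions, weights):
    """Weighted average of the position data of the same body part in the
    same image frame across multiple candidate videos.

    candidate_positions: one (x, y) per candidate video
    weights:             one weight per candidate video
    """
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(candidate_positions, weights)) / total
    y = sum(w * p[1] for p, w in zip(candidate_positions, weights)) / total
    return (x, y)
```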
  • the weight value of each candidate video can be determined according to video interaction information and/or video publisher information. For example, the more video interaction information a video has, the higher its weight value; likewise, if the video publisher is a well-known person, the video's weight value is higher.
  • the plurality of candidate videos may be obtained based on preset video screening information, the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the likes and/or comments of the videos.
  • Internet data can be screened to select published videos whose number of likes exceeds the like threshold and whose number of comments exceeds the comment threshold as candidate videos.
  • Each threshold can be flexibly set.
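  • the screening step might look like the following sketch (the threshold values and field names are purely illustrative assumptions; as noted above, each threshold can be set flexibly):

```python
LIKE_THRESHOLD = 10_000   # illustrative value only
COMMENT_THRESHOLD = 1_000  # illustrative value only

def select_candidates(videos):
    """videos: iterable of dicts with 'likes' and 'comments' counts.
    Keep only published videos exceeding both thresholds."""
    return [v for v in videos
            if v["likes"] > LIKE_THRESHOLD and v["comments"] > COMMENT_THRESHOLD]
```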
  • a standard position data set is thus obtained that integrates the position characteristics of the human body parts of different people, reasonably optimizes the position information of human body parts displayed in the video, improves the video quality, and optimizes the placement of action icons. At the same time, it also helps to improve the public's recognition and acceptance of the optimized video effects.
  • the action icon can be displayed in the user image frame data in any available style, and the display style can include the shape, color, dynamic effect, and static effect of the action icon, which can be designed in advance and are not specifically limited in the embodiments of the present disclosure.
  • FIG. 2 is a schematic diagram of image frame data showing an action icon provided by an embodiment of the present disclosure, which is used to illustrate the embodiment of the present disclosure and should not be construed as a specific limitation to the embodiment of the present disclosure.
  • the current image frame data of the user displays a circular first action icon 21 and an arrow-shaped second action icon 22 .
  • the first action icon 21 may be used to guide the user to move the hand to the position of the first action icon 21, and the second action icon 22 may be used to guide the user to swipe the hand in the direction of the arrow.
  • the number of action icons that can be displayed in each image frame data is not specifically limited in this embodiment of the present disclosure.
  • S104 Collect and display second image frame data of the user; wherein, the second image frame data is image frame data at a preset time point after the first image frame data.
  • the collection interval between the second image frame data and the first image frame data is not specifically limited in this embodiment of the present disclosure, that is, the specific value of the preset time point can be set flexibly.
  • neither the second image frame data nor the first image frame data refers to one specific frame of image data; each can refer to multiple frames of image data, although there is an order of image acquisition between them.
  • the state information of the user's body part displayed in the first image frame data and the second image frame data may change continuously.
  • the action icon displayed in the first image frame data may continue to be displayed in the second image frame data, or may not be displayed based on the determined display position of the action icon.
  • the collection time interval between the first image frame data and the second image frame data is usually very small. Therefore, continuing to display the action icon in the second image frame data based on the determined display position will not cause a large change in the display position of the action icon; that is, the display positions of the action icon in the first image frame data and the second image frame data are consistent to a certain extent.
  • at least one human body part in the second image frame data can also be identified and its position information determined; then, based on the position information of the at least one human body part and the preset position information of the action icon corresponding to that human body part, the display position of the action icon in the second image frame data is determined, and the action icon is displayed there.
  • the action icon includes an emoticon icon, and in the process of displaying the first image frame data or displaying the second image frame data, it further includes:
  • the display position of the expression icon is determined, and the expression icon is displayed at the determined display position.
  • for example, if the user pouts, the expression icon matching the pouting expression is determined to be a "heart" or a "kiss"; then, based on facial expression recognition technology, the position of the user's mouth is identified, a preset area around the mouth (which can be set flexibly) is determined as the display position of the "heart" or "kiss", and the corresponding special-effect icon is displayed in that preset area, making the interaction more interesting.
  • S105 Determine the target human body part associated with the action icon in the second image frame data and the state information of the target human body part.
  • the target body part associated with the action icon in the second image frame data is related to the action video content to be completed pre-selected by the user.
  • the client may determine the target human body part associated with the action icon in the second image frame data based on the playback time information of the background music or the collection time information of the second image frame data; the target human body part may include at least one of a head, arms, hands, feet, and legs. For example, when the background music plays to the Nth second, or the collection time of the second image frame data is the Nth second, the target human body part associated with the action icon in the second image frame data is determined to be the user's hand.
  • the state information of the target body part includes position information of the target body part and/or action information formed by the target body part. For example, when the background music is played to the Nth second, or the collection time of the second image frame data is the Nth second, the user's hand is placed on the user's shoulder, or the user's hand presents an OK gesture, or the user's hand presents a clapping. action etc.
  • S106 Determine the evaluation result according to the matching degree between the state information of the target human body part and the action icon in the second image frame data.
  • the client may determine the user's evaluation result in the second image frame data based on the matching results of multiple dimensions. The higher the matching degree, the better the evaluation result.
  • the evaluation result can be displayed in the second image frame data.
  • the evaluation results can be presented in the form of numbers, text, and/or letters, and dynamic special effects can also be added during presentation to enhance the visual effect of the interface.
  • after the client determines the user's evaluation result in the current image frame data, it can also combine it with the user's evaluation results in previously collected image frame data to determine the user's cumulative evaluation result and display it.
  • the effective response area of the action icon may be determined according to the display position and/or display style of the action icon. For example, an area with a preset size and a preset shape may be determined based on the display position of the action icon as its effective response area; or the shape area corresponding to the display style of the action icon may be determined as its effective response area; or, based on the shape area of the action icon, an area of a preset shape smaller or larger than that shape area may be determined as its effective response area; or the effective response area may be determined based on both the display position and the display style of the action icon. This can be set flexibly, and how to determine the effective response area of the action icon may be predetermined by the server.
  • when the position of the human body part falls within the effective response area of the action icon, the position matching degree is high; otherwise, the position matching degree is poor. It can be seen that the higher the position matching degree, the better the evaluation result.
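  • one hypothetical realization of the position matching degree, assuming a circular effective response area centred on the icon's display position (the linear falloff is an assumption, not the patent's method):

```python
def position_match(part_pos, icon_pos, radius):
    """Matching degree in [0, 1]: 1.0 when the body part is at the centre of
    the icon's circular effective response area, falling off linearly to 0.0
    at (or beyond) the boundary of radius `radius`."""
    dx = part_pos[0] - icon_pos[0]
    dy = part_pos[1] - icon_pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    return max(0.0, 1.0 - dist / radius)
```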
  • other manners of determining the position matching degree between the target human body part and the action icon can also be flexibly adopted by those skilled in the art.
  • taking the case where the state information of the target human body part associated with the action icon includes the action information formed by the human body part as an example, determining the evaluation result according to the degree of matching between the state information of the target human body part in the second image frame data and the action icon includes:
  • the action information formed by the body parts includes but is not limited to dance game action information.
  • the action matching degree between the action information formed by the target human body part in the second image frame data and the standard action information can be determined. For example, for an OK gesture, the key point coordinates of the user's hand when presenting the gesture can be extracted and compared with the hand key point coordinates corresponding to the standard OK gesture to determine the action matching degree.
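  • the keypoint comparison could be sketched like this (a simplified illustration; a real system would normalize for translation and scale before comparing, and the tolerance is an assumed parameter):

```python
def action_match(user_kps, standard_kps, tolerance):
    """Matching degree in [0, 1] from the mean distance between corresponding
    key points (e.g., hand key points of the user's OK gesture vs. the
    standard OK gesture). Smaller mean distance -> higher matching degree."""
    dists = [((u[0] - s[0]) ** 2 + (u[1] - s[1]) ** 2) ** 0.5
             for u, s in zip(user_kps, standard_kps)]
    mean_dist = sum(dists) / len(dists)
    return max(0.0, 1.0 - mean_dist / tolerance)
```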
  • the client can call the camera to collect and display the first image frame data and the second image frame data of the user in real time, wherein the first image frame data is the image frame data collected earlier. The human body part in the first image frame data is identified in real time and its position information determined; combined with the preset position information of the action icon, the exact display position of the action icon on the first image frame data is then determined. That is, as the position of the human body part changes, the display position of the action icon on the first image frame data can be adjusted (or corrected) in real time. Finally, the evaluation result is determined according to the degree of matching between the state information of the target human body part associated with the action icon in the second image frame data and the action icon.
  • the embodiment of the present disclosure realizes the effective combination of the user image frame data collected by the camera and the action icon to be displayed in the image frame data, dynamically adjusts the display position of the action icon according to the position of the user's body part, and accurately evaluates the state of the user's body part information to improve the user's interactive experience.
  • FIG. 3 is a flowchart of another interaction method provided by an embodiment of the present disclosure, which is further optimized and expanded based on the foregoing technical solution, and may be combined with the foregoing optional implementation manners.
  • the interaction method provided by the embodiment of the present disclosure may include S201-S209:
  • S201 Collect and display the first image frame data of the user.
  • S203 Determine the display position of the action icon based on the position information of at least one body part and the preset position information of the action icon corresponding to the body part.
  • the client can determine the current display style of the action icon according to the playback time information of the current background music or the collection time information of the first image frame data.
  • the display styles of action icons at different time points can be the same or different.
  • the action icon is displayed in a display style at the display position.
  • S206 Collect and display second image frame data of the user; wherein, the second image frame data is image frame data at a preset time point after the first image frame data.
  • the guidance information includes at least one of a guidance video animation, a guidance picture and a guidance instruction.
  • the guiding instructions can also be played in the form of voice.
  • the guidance information may be obtained based on the standard data set in the foregoing embodiments. Taking a guide video animation or a guide picture as an example, it can be obtained by importing the standard data set into a human body model and performing image processing. Specifically, developers can use the server to import the standard data set into the human body model based on existing 3D animation production principles, generate the guide video animation through model rendering (or obtain guide pictures in the form of screenshots), and then deliver them from the server to the client.
  • the standard data set integrates the location characteristics of different people's body parts, and obtaining guidance information based on the standard data set can improve the reference value of the guidance information and improve the public's recognition and acceptance of the guidance information.
  • the guidance information may be directly superimposed and displayed in the second image frame data, or may be displayed in the second image frame data in the form of an independent play window or the like.
  • the specific display position of the guidance information in the second image frame data is not limited in the embodiment of the present disclosure, and may be, for example, the lower right, upper right, upper left, or lower left of the image.
  • the client can also dynamically adjust the display position of the guidance information based on the position of the user's body parts in the image frame data, so as to avoid overlapping display of the body parts and the guidance information. For example, if the client detects that the user's limbs are positioned toward the right in the second image frame data, the guidance information may be displayed toward the left in the second image frame data.
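The side-switching rule just described can be expressed very simply. This sketch assumes the body position is summarized by the horizontal center of the detected body parts; the function name and the left/right return values are illustrative, not from the disclosure.

```python
# Hypothetical sketch: pick the side for the guidance overlay opposite the
# user's body, to avoid overlapping display. Names are illustrative.

def guidance_side(body_center_x, frame_width):
    """Return which side of the frame the guidance information should occupy."""
    if body_center_x > frame_width / 2:  # user biased to the right
        return "left"
    return "right"

print(guidance_side(500, 640))  # user on the right -> guidance on the left
print(guidance_side(100, 640))  # user on the left  -> guidance on the right
```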
  • FIG. 4 is a schematic diagram of image frame data showing action icons and guiding video animation provided by an embodiment of the present disclosure, which is used to illustrate the embodiment of the present disclosure and should not be construed as a specific limitation to the embodiment of the present disclosure.
  • the current image frame data displays a first action icon 21 and a second action icon 22 ; at the same time, a guide video animation 23 is displayed at the lower left of the current image frame data to guide the user to complete correct body movements.
  • S208 Determine the target human body part associated with the action icon and the state information of the target human body part in the second image frame data.
  • S209 Determine the evaluation result according to the matching degree between the state information of the target human body part and the action icon in the second image frame data.
  • the method further includes:
  • the evaluation result animation is determined according to the evaluation result; the specific implementation of the evaluation result animation (or called action determination animation) can be flexibly set, and the embodiment of the present disclosure does not make specific limitations;
  • the animation display position of the evaluation result animation in the second image frame data is determined, and the evaluation result animation is displayed in the animation display position.
  • the display position of the evaluation result animation may or may not coincide with the display position of the action icon.
  • the evaluation result animation can be displayed on the display position of the action icon, and the action icon is hidden at the same time, so as to generate an interface effect of special switching and transformation.
  • FIG. 5 is a schematic diagram of image frame data showing an evaluation result animation provided by an embodiment of the present disclosure, which is used to illustrate the embodiment of the present disclosure and should not be construed as a specific limitation to the embodiment of the present disclosure.
  • the position of the user's hand matches the display position of the action icon at the shoulder to a high degree (that is, the degree of coincidence with the effective response area of the action icon is high), so the evaluation result of the user's hand movement is "perfect". Therefore, a circular evaluation result animation 51 is displayed in the image frame data, and the word "perfect" is displayed in the evaluation result animation.
  • the evaluation result animation 51 can dynamically change the size of the circle and change the display color during the presentation process.
  • Image frame data showing animation of evaluation results can be used as valid video frame data.
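One way to realize the coincidence-based evaluation described above is to score the distance between the target body part and the icon's effective response area, then map the score to a rating. This is a sketch under assumptions: the effective response area is modeled as a circle, and the score thresholds and rating labels ("perfect", "good", "average") are illustrative, not specified by the disclosure.

```python
# Hypothetical sketch: matching degree between a body part position and an
# action icon's effective response area (modeled as a circle), mapped to an
# evaluation result. Thresholds and labels are illustrative assumptions.

def matching_degree(part_pos, icon_center, radius):
    """Matching degree in [0, 1]: 1 at the icon center, 0 at/beyond the
    boundary of the effective response area."""
    dx = part_pos[0] - icon_center[0]
    dy = part_pos[1] - icon_center[1]
    dist = (dx * dx + dy * dy) ** 0.5
    return max(0.0, 1.0 - dist / radius)

def evaluate(degree):
    """Map a matching degree to an evaluation result label."""
    if degree >= 0.8:
        return "perfect"
    if degree >= 0.4:
        return "good"
    return "average"

# Hand exactly on the icon center -> perfect; hand on the boundary -> average.
print(evaluate(matching_degree((100, 100), (100, 100), 50)))
print(evaluate(matching_degree((100, 100), (130, 140), 50)))
```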
  • the method further includes:
  • the first shared video is generated; since the user's image frame data belongs to an image sequence collected in real time, a complete user video can be obtained based on the first image frame data and the second image frame data, and the action icons, guidance information, and evaluation result animations can be displayed in the corresponding image frames of the shared video;
  • a first video sharing request is sent to the server; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the user identifier of the sharing object is used by the server to determine the second shared video shared by the sharing object; the second shared video and the first shared video may be videos of the same action content recorded by different people; the number of sharing objects may be one or more, and correspondingly, the second shared video may refer to one video or multiple videos;
  • the composite video returned by the server is received; wherein, the composite video is obtained by the server synthesizing the first shared video and the second shared video for display on the same screen.
  • the same-screen display can be a left-right split-screen display, or a top-bottom split-screen display. Depending on the number of users participating in the video sharing, the same-screen display method is different.
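The left-right split-screen composition mentioned above can be sketched by concatenating corresponding frames of the two shared videos row by row. This is a minimal illustration using plain nested lists as frames; a real implementation would operate on decoded video frames (e.g., pixel arrays), and the function name is an assumption.

```python
# Hypothetical sketch: left-right same-screen composition of two video frames
# of equal height. Frames are lists of rows; each row is a list of pixels.

def compose_side_by_side(frame_a, frame_b):
    """Concatenate two equal-height frames into one left-right split frame."""
    assert len(frame_a) == len(frame_b), "frames must have equal height"
    return [row_a + row_b for row_a, row_b in zip(frame_a, frame_b)]

a = [[1, 1], [1, 1]]  # 2x2 frame from the sharing initiator's video
b = [[2, 2], [2, 2]]  # 2x2 frame from the sharing object's video
print(compose_side_by_side(a, b))  # [[1, 1, 2, 2], [1, 1, 2, 2]]
```

Applying this per frame pair across the two shared videos yields the composite video; a top-bottom split is the analogous column-wise concatenation (`frame_a + frame_b`).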
  • after the current client generates the first shared video, it can switch from the current interface to a sharing object selection interface according to a sharing object selection operation triggered by the current user, so that the current user can determine at least one sharing object; after obtaining the user identifier of the sharing object selected by the current user, the client switches back to the current interface, and generates and sends a first video sharing request to the server according to a video sharing operation triggered by the current user.
  • the client controlled by the sharing object can also perform the same operations as above, so as to share the second shared video to the server.
  • the client controlled by the current user (i.e., the sharing initiator) and the client controlled by the sharing object may simultaneously send video sharing requests to the server on the basis of communication between the users. After the server completes the video synthesis, it can send the composite video to the client controlled by the current user and the client controlled by the sharing object respectively.
  • FIG. 6 is a schematic diagram of displaying a shared video on the same screen provided by an embodiment of the present disclosure, specifically taking two people participating in video sharing as an example to illustrate the embodiment of the present disclosure, and should not be construed as a specific limitation of the embodiment of the present disclosure.
  • user A and user B are each other's sharing objects, and the client controlled by the sharing initiator and the client controlled by the sharing object can simultaneously display the shared videos of the two.
  • the display position of the action icon is above the shoulder. User A's hand is displayed above the shoulder, that is, the matching degree between the hand position and the display position of the action icon is high, and user A's evaluation result is "perfect"; user B's hand is displayed on the right side of the body, that is, the matching degree between the hand position and the display position of the action icon is low, and user B's evaluation result is "average".
  • different evaluation result animations are shown in Fig. 6.
  • for user A, the evaluation result animation is an animation formed by a star pattern, and the word "perfect" is displayed in the star pattern; for user B, the evaluation result animation is an animation formed by a circular pattern, and the word "average" is displayed in the circular pattern.
  • before displaying the first image frame data, the method further includes:
  • the current mode is switched to the image synchronization sharing mode; that is, in the image synchronization sharing mode, after the current user determines the sharing object, the image frame data obtained locally in real time and the image frame data shared by the sharing object can be displayed on the same screen.
  • the display effect on the same screen can refer to the display effect shown in Figure 6.
  • the method further includes:
  • the first shared image frame data and the second shared image frame data are shared in real time by the sharing object, and the sharing object is predetermined by the user.
  • the synchronous display of the first shared image frame data and the second shared image frame data between different clients can be realized directly through client-to-client interaction, or indirectly through data transfer via the server.
  • the sharing object may be determined before or after the current user triggers the video synchronization operation. After the client controlled by the current user switches from the current mode to the image synchronization sharing mode, it may send a mode switching notification carrying the user identifier of the sharing object to the server, so as to notify the server to forward, in real time, the first shared image frame data and the second shared image frame data shared by the sharing object to the client controlled by the current user.
  • while the client controlled by the current user displays the image frame data obtained in real time, it also shares the image frame data to the server in real time, so that the client controlled by the sharing object can also synchronously display the current user's image frame data after performing the aforementioned operations.
  • Contents such as action icons, guidance information, and evaluation result animations can also be displayed synchronously during the display of image frame data on the same screen.
  • the client currently controlled by the user and the client controlled by the shared object can be switched to the image synchronization sharing mode at the same time on the basis of mutual communication between users.
  • image frame data of different users can be displayed on the same screen in the same client through image sharing and synthesis, which improves the interest of image interaction or video interaction.
  • FIG. 7 is a flowchart of another interaction method provided by an embodiment of the present disclosure, applied to a server, and the method may be executed by an interaction apparatus configured on the server, and the apparatus may be implemented by software and/or hardware.
  • the interaction method applied to the server provided by the embodiment of the present disclosure may be executed in cooperation with the interaction method applied to the client provided by the embodiment of the present disclosure.
  • the interaction method provided by the embodiment of the present disclosure may include S301-S304:
  • S301 Acquire multiple candidate videos, and extract the position data of human body parts of each image frame in the multiple candidate videos.
  • S303 Search for the position data of the target human body part in at least one image frame in the multiple candidate videos in the standard position data set.
  • the standard action information corresponding to the action icon may also be determined based on the action information formed in the image frame by the target body part corresponding to the action icon.
  • acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos including:
  • the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the amount of likes and/or comments of the video;
  • the interaction method provided by the embodiment of the present disclosure further includes:
  • the guide information includes at least one of a guide video animation, a guide picture and a guide instruction.
  • the position data of human body parts of the same image frame in multiple candidate videos are fused to obtain a standard position data set, including:
  • a weighted average calculation is performed on the position data of the human body parts of the same image frame in the multiple candidate videos to obtain a standard position data set.
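The weighted-average fusion can be sketched as below, where each candidate video contributes its per-frame body part position with a weight (e.g., derived from the video's interaction information). The function name, the (x, y) tuple representation, and the assumption that all candidate videos are frame-aligned are illustrative.

```python
# Hypothetical sketch: fuse per-frame body part positions across candidate
# videos via a weighted average to obtain the standard position data set.
# Assumes the candidate videos are frame-aligned; names are illustrative.

def fuse_positions(videos, weights):
    """Weighted average of body part positions, frame by frame.

    videos:  videos[v][f] is the (x, y) position of the tracked body part
             in frame f of candidate video v.
    weights: one weight per candidate video.
    """
    total = sum(weights)
    n_frames = len(videos[0])
    fused = []
    for f in range(n_frames):
        x = sum(w * videos[v][f][0] for v, w in enumerate(weights)) / total
        y = sum(w * videos[v][f][1] for v, w in enumerate(weights)) / total
        fused.append((x, y))
    return fused

# Two candidate videos, one frame each; the second weighted twice as much.
print(fuse_positions([[(0.0, 0.0)], [(3.0, 6.0)]], [1.0, 2.0]))  # [(2.0, 4.0)]
```

Fusing across many candidate videos in this way averages out individual idiosyncrasies, which is why the standard data set reflects the location characteristics of different people's body parts.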
  • the interaction method provided by the embodiment of the present disclosure further includes:
  • the first video sharing request carries the first shared video and the user identifier of the sharing object, and the first shared video is generated by the client based on the collected first image frame data and second image frame data;
  • the second shared video and the first shared video may include image frames of human body parts showing the same state information
  • the interaction method provided by the embodiment of the present disclosure further includes:
  • the server may determine a standard position data set based on the position data of human body parts of each image frame in the multiple candidate videos, determine the preset position information of the action icon corresponding to the target human body part based on the position data of the target human body part in the standard position data set, and send it to the client, so that the client can dynamically determine the accurate display position of the action icon in the image frame data in combination with the position information of the human body part identified from the currently displayed image frame data, that is, achieve the effect of dynamically adjusting the display position of the action icon in the user's image frame data as the position of the user's body parts changes; at the same time, the client also determines the evaluation result based on the matching degree between the action icon and the state information of the target human body part associated with the action icon in the user's image frame data collected in real time.
  • the embodiment of the present disclosure realizes an effective combination of the user image frame data collected by the camera and the action icon to be displayed in the image frame data, dynamically adjusts the display position of the action icon according to the position of the user's body parts, and accurately evaluates the state information of the user's body parts, thereby improving the user's interactive experience.
  • the shared video of multiple people can be displayed on the same screen in the client, which improves the fun of video sharing.
  • FIG. 8 is a schematic structural diagram of an interaction apparatus according to an embodiment of the present disclosure.
  • the apparatus may be configured in a client, and may be implemented by software and/or hardware.
  • the client mentioned in the embodiment of the present disclosure may include any client with a video interaction function, and the terminal device on which the client is installed may include, but not limited to, a smart phone, a tablet computer, a notebook, and the like.
  • the interaction apparatus 400 may include a first collection module 401, a first determination module 402, a display position determination module 403, a second collection module 404, a second determination module 405, and an evaluation module 406, wherein:
  • the first collection module 401 is used to collect and display the first image frame data of the user
  • a first determining module 402 configured to identify at least one human body part in the first image frame data, and determine the position information of the human body part
  • the display position determination module 403 is configured to determine the display position of the action icon based on the position information of at least one human body part and the preset position information of the action icon corresponding to the human body part, and display the action icon at the display position;
  • the second collection module 404 is configured to collect and display the second image frame data of the user; wherein, the second image frame data is the image frame data at a preset time point after the first image frame data;
  • the second determination module 405 is configured to determine the target human body part associated with the action icon and the state information of the target human body part in the second image frame data;
  • the evaluation module 406 is configured to determine the evaluation result according to the matching degree between the state information of the target human body part and the action icon in the second image frame data.
  • the state information of the target body part includes position information of the target body part and/or action information formed by the target body part.
  • the state information of the target body part includes position information of the target body part
  • Evaluation module 406 includes:
  • an effective response area determination unit used to determine the effective response area of the action icon in the second image frame data
  • the first evaluation result determination unit is configured to determine the position matching degree between the position information of the target human body part and the effective response area of the action icon, and determine the evaluation result according to the position matching degree.
  • the state information of the target body part includes action information formed by the target body part
  • Evaluation module 406 includes:
  • a standard action information determining unit used for determining standard action information corresponding to the action icon
  • the second evaluation result determination unit is configured to determine the action matching degree between the action information formed by the target human body part in the second image frame data and the standard action information, and determine the evaluation result according to the action matching degree.
  • the preset position information of the action icon is obtained based on the position data of the body part corresponding to the action icon in the standard data set;
  • the standard dataset is obtained by fusing the position data of human body parts of the same image frame in multiple candidate videos based on preset rules.
  • the plurality of candidate videos are obtained based on preset video screening information
  • the preset video screening information includes video interaction information and/or video publisher information
  • the video interaction information includes the amount of likes and/or comments of the video.
  • the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
  • the guide information display module is used for displaying guide information on the second image frame data, so as to guide the user to change the state information of the target body part associated with the action icon.
  • the guide information includes at least one of a guide video animation, a guide picture and a guide instruction.
  • the display position determination module 403 includes:
  • a display position determination unit configured to determine the display position of the action icon based on the position information of at least one human body part and the preset position information of the action icon corresponding to the human body part;
  • Action icon display unit used to display the action icon in the display position
  • the action icon display unit includes:
  • a display style determination subunit used for determining the display style of the action icon based on the playback time information of the background music or based on the collection time information of the first image frame data
  • the action icon display subunit is used to display the action icon in the display style in the display position.
  • the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
  • the evaluation result animation determination module is used to determine the evaluation result animation according to the evaluation result
  • the animation display module is used to determine the animation display position of the evaluation result animation in the second image frame data by using the display position of the action icon, and display the evaluation result animation in the animation display position.
  • the second determining module 405 includes:
  • an associated human body part determination unit for determining the target human body part associated with the action icon in the second image frame data
  • a state information determination unit used to determine the state information of the target human body part
  • the associated human body part determining unit is specifically configured to: determine the target human body part associated with the action icon in the second image frame data based on the playing time information of the background music or the collection time information of the second image frame data.
  • the action icon includes an emoticon icon
  • the interaction apparatus 400 provided in this embodiment of the present disclosure further includes:
  • a user expression recognition module used to identify the user expression in the first image frame data or the second image frame data, and determine the expression icon matching the user expression
  • the expression icon display module is used to determine the display position of the expression icon based on the position information of the facial features forming the user's expression on the first image frame data or the second image frame data, and display the expression icon in the determined display position.
  • the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
  • a first shared video generation module configured to generate a first shared video based on the collected first image frame data and second image frame data
  • a sharing request sending module configured to send a first video sharing request to the server according to the user's video sharing operation; wherein, the first video sharing request carries the first sharing video and the user ID of the sharing object, and the user ID of the sharing object is used for The server determines the second shared video shared by the shared object;
  • the composite video receiving module is used to receive the composite video returned by the server; wherein, the composite video is obtained by the server after synthesizing the first shared video and the second shared video and displaying on the same screen.
  • the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
  • the mode switching module is used to switch from the current mode to the image synchronization sharing mode according to the user's image synchronization operation;
  • a first on-screen display module configured to receive the first shared image frame data in real time, and display the first shared image frame data and the first image frame data on the same screen;
  • the second on-screen display module is used to receive the second shared image frame data in real time, and display the second shared image frame data and the second image frame data on the same screen;
  • the first shared image frame data and the second shared image frame data are shared in real time by the sharing object, and the sharing object is predetermined by the user.
  • the action information formed by the body parts includes dance game-like action information.
  • the body part identified in the first image frame data or the second image frame data includes at least one of a head, an arm, a hand, a foot, and a leg.
  • the interaction device configured on the client provided by the embodiment of the present disclosure can execute any interaction method applied to the client provided by the embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • FIG. 9 is a schematic structural diagram of another interaction apparatus provided by an embodiment of the present disclosure.
  • the apparatus may be configured in a server, and may be implemented by software and/or hardware.
  • the interaction apparatus 500 may include a position data extraction module 501, a standard position data set determination module 502, a position data search module 503, and a preset position information determination module 504, wherein:
  • the position data extraction module 501 is used to obtain a plurality of candidate videos, and extract the position data of human body parts of each image frame in the plurality of candidate videos;
  • the standard position data set determination module 502 is configured to fuse the body part position data of the same image frame in the multiple candidate videos based on preset rules to obtain a standard position data set;
  • the position data search module 503 is used to search for the position data of the target human body part in the standard position data set in at least one image frame in the multiple candidate videos;
  • the preset position information determination module 504 is used to determine the preset position information of the action icon corresponding to the target body part by using the searched position data, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
  • the location data extraction module 501 includes:
  • a video screening unit configured to obtain multiple candidate videos based on preset video screening information; wherein the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes video likes and/or the amount of comments;
  • the position data extraction unit is used for extracting the position data of human body parts of each image frame in the multiple candidate videos.
  • the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
  • a guidance information generation module for generating guidance information based on a standard location data set
  • the guidance information sending module is used to send guidance information to the client, so that the client can display the guidance information on the collected user image frame data, and guide the user to change the state information of the target body part associated with the action icon in the image frame data.
  • the guide information includes at least one of a guide video animation, a guide picture and a guide instruction.
  • the standard location data set determination module 502 includes:
  • a video weight determination unit for determining the weight value of each candidate video
  • the standard position data set determination unit is configured to perform weighted average calculation on the position data of human body parts of the same image frame in the multiple candidate videos based on the weight value of each candidate video to obtain the standard position data set.
  • the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
  • the video sharing request receiving module is used for receiving the first video sharing request sent by the client; wherein, the first video sharing request carries the first sharing video and the user identifier of the sharing object, and the first sharing video is obtained by the client based on the collected first video image frame data and second image frame data are generated;
  • a shared video determination module configured to determine the second shared video shared by the shared object based on the user identification of the shared object; wherein the second shared video and the first shared video may include image frames of human body parts showing the same state information;
  • a video synthesis module used to synthesize the first shared video and the second shared video into a composite video displayed on the same screen
  • the composite video sending module is used to send the composite video to the client.
  • the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
  • a first shared image receiving module configured to receive the first shared image frame data shared by the shared object in real time; wherein, the shared object is predetermined by the user;
  • the first shared image sending module is used to send the first shared image frame data to the client in real time, so that the client displays the first shared image frame data and the locally collected first image frame data on the same screen; the first shared image frame data and the first image frame data locally collected by the client can display human body parts with the same state information;
  • the second shared image receiving module is configured to receive the second shared image frame data shared by the shared object in real time; wherein, the shared object is predetermined by the user;
  • the second shared image sending module is used to send the second shared image frame data to the client in real time, so that the client displays the second shared image frame data and the locally collected second image frame data on the same screen; the second shared image frame data and the second image frame data locally collected by the client can display human body parts with the same state information.
  • the interaction device configured on the server provided by the embodiment of the present disclosure can execute the interaction method applied to the server provided by the embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • FIG. 10 is a schematic structural diagram of a terminal provided by an embodiment of the present disclosure, which is used to exemplarily describe a terminal that implements the interaction method provided by the embodiment of the present disclosure.
  • the terminals in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
  • the terminal shown in FIG. 10 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the terminal 600 includes one or more processors 601 , a memory 602 and a camera 605 .
  • the camera 605 is used to collect image frame data of the user in real time.
  • Processor 601 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in terminal 600 to perform desired functions.
  • Memory 602 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others.
  • Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 601 may execute the program instructions to implement the interaction method applied to the client provided by the embodiments of the present disclosure, and may also implement other desired functions.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the interaction method applied to the client may include: collecting and displaying first image frame data of a user; identifying at least one human body part in the first image frame data, and determining the position information of the human body part; determining the display position of an action icon based on the position information of the at least one human body part and the preset position information of the action icon corresponding to the body part, and displaying the action icon at the display position; collecting and displaying second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data; determining the target body part associated with the action icon in the second image frame data and the state information of the target body part; and determining an evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
  • terminal 600 may also perform other optional implementations provided by the method embodiments of the present disclosure.
  • the terminal 600 may also include an input device 603 and an output device 604, these components being interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the input device 603 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 604 can output various information to the outside, including the determined distance information, direction information, and the like.
  • the output device 604 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • terminal 600 may also include any other appropriate components according to the specific application.
  • FIG. 11 is a schematic structural diagram of a server provided by an embodiment of the present disclosure, which is used to exemplarily describe a server that implements the interaction method provided by the embodiment of the present disclosure.
  • the server shown in FIG. 11 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • server 700 includes one or more processors 701 and memory 702 .
  • Processor 701 may be a central processing unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in server 700 to perform desired functions.
  • Memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others.
  • Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 701 may execute the program instructions to implement the interaction method applied to the server provided by the embodiments of the present disclosure, and may also implement other desired functions.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the interaction method applied to the server may include: acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos; fusing the body part position data of the same image frame across the multiple candidate videos based on preset rules to obtain a standard position data set; looking up, in the standard position data set, the position data of the target body part in at least one image frame of the multiple candidate videos; and determining, using the found position data, the preset position information of the action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
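The server-side fusion step can be sketched as follows. For illustration only, the "preset rule" is assumed here to be a per-keypoint average across the candidate videos, and each frame's position data is assumed to be a mapping from body-part names to (x, y) keypoints; neither representation is specified by the disclosure.

```python
# Hypothetical sketch of the server-side fusion into a standard position data set.
from statistics import mean

def build_standard_position_set(candidate_videos):
    """candidate_videos: list of videos; each video is a list of frames;
    each frame maps a body-part name to an (x, y) keypoint."""
    num_frames = min(len(video) for video in candidate_videos)
    standard = []
    for i in range(num_frames):
        frames = [video[i] for video in candidate_videos]
        parts = set().union(*(frame.keys() for frame in frames))
        fused = {}
        for part in parts:
            points = [frame[part] for frame in frames if part in frame]
            # Fuse the same frame's keypoint across candidate videos by averaging.
            fused[part] = (mean(p[0] for p in points), mean(p[1] for p in points))
        standard.append(fused)
    return standard

# Two candidate recordings of the same two-frame sequence.
videos = [
    [{"shoulder": (100, 200)}, {"shoulder": (110, 205)}],
    [{"shoulder": (104, 196)}, {"shoulder": (114, 203)}],
]
standard_set = build_standard_position_set(videos)
print(standard_set[0]["shoulder"])
```

The averaged keypoints then serve as the standard position data set from which the preset position information of each action icon can be derived.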
  • server 700 may also execute other optional implementations provided by the method embodiments of the present disclosure.
  • the server 700 may also include an input device 703 and an output device 704 interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the input device 703 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 704 can output various information to the outside, including the determined distance information, direction information, and the like.
  • the output devices 704 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • server 700 may also include any other appropriate components according to the specific application.
  • the embodiments of the present disclosure may also provide computer program products, which include computer program instructions that, when executed by a processor, cause the processor to execute any of the interaction methods applied to the client or applied to the server provided by the embodiments of the present disclosure.
  • the computer program product may carry program code for performing the operations of the embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on a user terminal or server, partly on a user terminal or server, as a stand-alone software package, partly on a user terminal or server and partly on a remote terminal or server, or entirely on a remote terminal or server.
  • an embodiment of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; when executed by a processor, the computer program instructions cause the processor to execute any interaction method applied to the client or applied to the server provided by the embodiments of the present disclosure.
  • the interaction method applied to the client may include: collecting and displaying first image frame data of a user; identifying at least one human body part in the first image frame data, and determining the position information of the human body part; determining the display position of an action icon based on the position information of the at least one human body part and the preset position information of the action icon corresponding to the body part, and displaying the action icon at the display position; collecting and displaying second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data; determining the target human body part associated with the action icon in the second image frame data and the state information of the target human body part; and determining an evaluation result according to the degree of matching between the state information of the target human body part in the second image frame data and the action icon.
  • the interaction method applied to the server may include: acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos; fusing the body part position data of the same image frame across the multiple candidate videos based on preset rules to obtain a standard position data set; looking up, in the standard position data set, the position data of the target human body part in at least one image frame of the multiple candidate videos; and determining, using the found position data, the preset position information of the action icon corresponding to the target human body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
  • the computer program instructions may also cause the processor to execute other optional implementations provided by the method embodiments of the present disclosure.
  • a computer-readable storage medium can employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the present disclosure relate to an interaction method and apparatus, and a terminal, a server and a storage medium. The method may comprise: acquiring first image frame data of a user and displaying the first image frame data; identifying at least one human body part in the first image frame data, and determining position information of the human body part; determining a display position of an action icon on the first image frame data, and displaying the action icon at the display position; collecting second image frame data of the user and displaying the second image frame data; determining a target human body part, associated with the action icon, and state information of the target human body part in the second image frame data; and determining an evaluation result according to the matching degree between the state information of the target human body part in the second image frame data and the action icon. According to the embodiments of the present disclosure, the display position of an action icon can be dynamically adjusted according to the position of a human body part of a user, and the state information of the human body part of the user is accurately evaluated, so that the interaction experience of the user is improved.

Description

Interaction method, apparatus, terminal, server and storage medium

This application claims priority to the Chinese patent application No. 202011399864.7, entitled "Interaction Method, Apparatus, Terminal, Server and Storage Medium", filed with the China Patent Office on December 02, 2020, the entire contents of which are incorporated herein by reference.

Technical Field

The present disclosure relates to the technical field of image processing, and in particular to an interaction method, apparatus, terminal, server and storage medium.

Background

At present, body recognition technology, as a branch of computer vision processing, is applied in an increasingly wide range of fields, for example, video-based fitness training, video-based dance teaching, and video-based game experiences. How to apply the body recognition results obtained from user images captured by a camera to the guidance and evaluation of the user's body movements, so as to improve the user's action experience, remains a problem to be solved.

Summary of the Invention

In order to solve the above technical problems, or at least partially solve them, the embodiments of the present disclosure provide an interaction method, apparatus, terminal, server and storage medium, which can improve the user's interaction experience.
In a first aspect, an embodiment of the present disclosure provides an interaction method, applied to a client, including:

collecting first image frame data of a user and displaying it;

identifying at least one human body part in the first image frame data, and determining position information of the human body part;

determining a display position of an action icon based on the position information of the at least one human body part and preset position information of the action icon corresponding to the human body part, and displaying the action icon at the display position;

collecting second image frame data of the user and displaying it, wherein the second image frame data is image frame data at a preset time point after the first image frame data;

determining a target human body part associated with the action icon in the second image frame data, and state information of the target human body part; and

determining an evaluation result according to the degree of matching between the state information of the target human body part in the second image frame data and the action icon.
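As an illustration of the last step, the matching degree between the state information of the target body part and the action icon can, in the simplest case, be reduced to the distance between the part's keypoint and the icon's display position at the moment the icon fires. The grades and thresholds below are hypothetical, not taken from the disclosure:

```python
import math

# Hypothetical scoring rule: the closer the target body part is to the action
# icon when the icon fires, the higher the evaluation grade.
def evaluate_match(part_pos, icon_pos, perfect_radius=30, good_radius=60):
    distance = math.dist(part_pos, icon_pos)  # Euclidean distance (Python 3.8+)
    if distance <= perfect_radius:
        return "perfect"
    if distance <= good_radius:
        return "good"
    return "miss"

print(evaluate_match((105, 200), (100, 200)))  # distance 5
print(evaluate_match((150, 200), (100, 200)))  # distance 50
print(evaluate_match((200, 200), (100, 200)))  # distance 100
```

Richer state information (e.g., limb orientation or velocity) could enter the same scoring function as additional terms.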
In a second aspect, an embodiment of the present disclosure further provides an interaction method, applied to a server, including:

acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos;

fusing the body part position data of the same image frame across the multiple candidate videos based on preset rules, to obtain a standard position data set;

looking up, in the standard position data set, the position data of a target body part in at least one image frame of the multiple candidate videos; and

determining, using the found position data, the preset position information of the action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
In a third aspect, an embodiment of the present disclosure further provides an interaction apparatus, configured on a client, including:

a first collection module, configured to collect and display first image frame data of a user;

a first determination module, configured to identify at least one human body part in the first image frame data and determine position information of the human body part;

a display position determination module, configured to determine the display position of an action icon based on the position information of the at least one human body part and preset position information of the action icon corresponding to the human body part, and to display the action icon at the display position;

a second collection module, configured to collect and display second image frame data of the user, wherein the second image frame data is image frame data at a preset time point after the first image frame data;

a second determination module, configured to determine a target human body part associated with the action icon in the second image frame data, and state information of the target human body part; and

an evaluation module, configured to determine an evaluation result according to the degree of matching between the state information of the target human body part in the second image frame data and the action icon.
In a fourth aspect, an embodiment of the present disclosure further provides an interaction apparatus, configured on a server, including:

a position data extraction module, configured to acquire multiple candidate videos and extract body part position data of each image frame in the multiple candidate videos;

a standard position data set determination module, configured to fuse the body part position data of the same image frame across the multiple candidate videos based on preset rules, to obtain a standard position data set;

a position data search module, configured to look up, in the standard position data set, the position data of a target body part in at least one image frame of the multiple candidate videos; and

a preset position information determination module, configured to determine, using the found position data, the preset position information of the action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
In a fifth aspect, an embodiment of the present disclosure further provides a terminal, including a memory, a processor and a camera, wherein: the camera is used to collect image frame data of a user in real time; and the memory stores a computer program which, when executed by the processor, causes the processor to execute any interaction method provided by the embodiments of the present disclosure.

In a sixth aspect, an embodiment of the present disclosure further provides a server, including a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to execute any interaction method provided by the embodiments of the present disclosure.

In a seventh aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute any interaction method provided by the embodiments of the present disclosure.

In an eighth aspect, an embodiment of the present disclosure further provides a computer program product, including computer program instructions which, when run by a processor, cause the processor to execute any interaction method provided by the embodiments of the present disclosure.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have at least the following advantages. In the embodiments of the present disclosure, the client can call a camera to collect the first image frame data and the second image frame data of the user in real time and display them, where the first image frame data is the earlier-collected image frame data. The client first identifies the human body part in the first image frame data in real time and determines its position information, and then, combining the preset position information of the action icon, determines the exact display position of the action icon on the first image frame data; that is, as the position of the body part changes, the display position of the action icon on the first image frame data can be adjusted (or corrected) in real time. Finally, the evaluation result is determined according to the degree of matching between the state information of the target body part associated with the action icon in the second image frame data and the action icon. The embodiments of the present disclosure effectively combine the user image frame data collected by the camera with the action icon to be displayed in that image frame data, dynamically adjust the display position of the action icon according to the position of the user's body part, and accurately evaluate the state information of the user's body part, thereby improving the user's interaction experience.
附图说明Description of drawings
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description serve to explain the principles of the disclosure.
为了更清楚地说明本公开实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,对于本领域普通技术人员而言,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the accompanying drawings that are required to be used in the description of the embodiments or the prior art will be briefly introduced below. In other words, on the premise of no creative labor, other drawings can also be obtained from these drawings.
FIG. 1 is a flowchart of an interaction method provided by an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of image frame data displaying an action icon provided by an embodiment of the present disclosure;

FIG. 3 is a flowchart of another interaction method provided by an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of image frame data displaying an action icon and a guiding video animation provided by an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of image frame data displaying an evaluation result animation provided by an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of displaying a shared video on the same screen provided by an embodiment of the present disclosure;

FIG. 7 is a flowchart of another interaction method provided by an embodiment of the present disclosure;

FIG. 8 is a schematic structural diagram of an interaction apparatus provided by an embodiment of the present disclosure;

FIG. 9 is a schematic structural diagram of another interaction apparatus provided by an embodiment of the present disclosure;

FIG. 10 is a schematic structural diagram of a terminal provided by an embodiment of the present disclosure;

FIG. 11 is a schematic structural diagram of a server provided by an embodiment of the present disclosure.
Detailed Description

In order to understand the above objects, features and advantages of the present disclosure more clearly, the solutions of the present disclosure are further described below. It should be noted that, in the case of no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other.

Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure can also be implemented in ways other than those described herein; obviously, the embodiments in the specification are only a part of the embodiments of the present disclosure, not all of them.
FIG. 1 is a flowchart of an interaction method provided by an embodiment of the present disclosure, applied to a client. The method is applicable to the situation of combining the user image frame data collected by a camera in real time with the action icons to be displayed on that image frame data, and evaluating the state information of the human body parts in the user image frame data collected in real time. The method can be executed by an interaction apparatus configured on the client, and the apparatus can be implemented in software and/or hardware. The client mentioned in the embodiments of the present disclosure may include any client with a video interaction function, and the terminal device on which the client is installed may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, and the like.

In the embodiments of the present disclosure, the types of state information of the user's body parts may include, but are not limited to, state information of body parts related to dance games, dance training, fitness movements, teaching movements, and the like; that is, the embodiments of the present disclosure are applicable to a variety of application scenarios such as games, fitness and teaching.
As shown in FIG. 1, the interaction method provided by the embodiment of the present disclosure may include S101-S106.

S101. Collect and display first image frame data of a user.

Exemplarily, the user may pre-select the entire set of action video content to be completed and, before starting to perform the relevant actions, trigger an image collection request by touching an image collection control (or a video recording control) on the client interface. In response to the image collection request, the client calls the camera to collect the user's image frame data in real time and displays it on the interface. The first image frame data may be any image frame data collected by the camera in real time, and the word "first" does not impose any ordering.
S102. Identify at least one human body part in the first image frame data, and determine the position information of the human body part.

The human body part identified in the first image frame data includes at least one of a head, an arm, a hand, a foot and a leg. In S102, human body recognition technology can be used to identify the human body parts in the collected user image frame data in real time, and at the same time determine the position information of the body parts; the position information may specifically be the position information of key points on the body parts. For the implementation principle of human body recognition technology, reference may be made to the prior art, which is not specifically limited in the embodiments of the present disclosure.
S103. Based on the position information of at least one human body part and the preset position information of the action icon corresponding to the human body part, determine the display position of the action icon, and display the action icon at the display position.

The preset position information of the action icon is used to constrain the display position of the action icon in the user image frame data; it can be predetermined by the server at the development stage and then delivered to the client. The preset position information of the action icon may include the relative position information between the to-be-displayed position of the action icon and the corresponding human body part.
客户端可以基于用户的第一图像帧数据的采集时间信息(或者视频录制时间信息),确定当前采集的第一图像帧数据中是否需要展示动作图标。例如,针对录制一段时长为30秒的舞蹈动作视频的情况,预先设定当视频录制到第5秒、第15秒和第25秒等时刻时,需要在实时采集的用户图像帧数据中展示动作图标,因此在用户完成该舞蹈动作过程中,客户端可以基于用户的当前图像帧数据的采集时间信息或者舞蹈动作视频的当前录制时间信息,确定当前图像帧数据中是否需要展示动作图标。并且,图像帧数据的采集时间信息与视频录制时间信息可以相互确定,假设客户端将采集的第一帧用户图像数据的采集时间信息记为0秒,则当前图像帧数据的采集时间即视频的录制时间。The client may determine whether an action icon needs to be displayed in the currently collected first image frame data based on the collection time information (or video recording time information) of the user's first image frame data. For example, in the case of recording a dance action video with a duration of 30 seconds, it is preset that when the video is recorded at the 5th, 15th, and 25th seconds, the action needs to be displayed in the user image frame data collected in real time. Therefore, when the user completes the dance action, the client can determine whether the action icon needs to be displayed in the current image frame data based on the acquisition time information of the user's current image frame data or the current recording time information of the dance action video. In addition, the collection time information of the image frame data and the video recording time information can be mutually determined. If the client records the collection time information of the first frame of user image data collected as 0 seconds, the collection time of the current image frame data is the time of the video. recording time.
The client may also determine whether an action icon needs to be displayed in the current image frame data based on a predetermined display correspondence between specific image frame data and action icons. For example, if the client collects user image frame data showing a specified body movement, it displays the action icon in that frame, where the specified body movement is the movement required to be present in the image frame data designated by the aforementioned display correspondence.
During the real-time collection of user image frame data, the client dynamically determines the display position of the action icon in the user image frame data based on the position information of the body part recognized in the first image frame data and the preset position information of the action icon corresponding to that part, thereby ensuring that the icon is displayed accurately. Taking a dance game scene as an example, the preset position information of the action icons may also be called chart information, which defines the relative positions of body parts and action icons; the action icons may also be called note points.
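The dynamic placement step can be sketched as follows; the chart structure, joint names, and pixel offsets are illustrative assumptions, not values given by the disclosure:

```python
# Hypothetical sketch of dynamically placing an action icon: the chart supplies
# a per-icon anchor body part and a relative offset, and the icon's screen
# position is recomputed from the part's detected position in each frame.
CHART = {
    # icon id -> (anchor body part, (dx, dy) offset in pixels)
    "note_1": ("left_shoulder", (0, -40)),
    "note_2": ("right_hand", (30, 0)),
}

def icon_display_position(icon_id, body_part_positions):
    """body_part_positions maps part name -> (x, y) detected in this frame."""
    part, (dx, dy) = CHART[icon_id]
    x, y = body_part_positions[part]
    return (x + dx, y + dy)
```

As the detected shoulder or hand coordinates change from frame to frame, the computed icon position moves with them, which is the real-time adjustment described above.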
In one possible implementation, the preset position information of an action icon may be derived from the position data, in a standard data set, of the body part corresponding to the icon. Specifically, for a complete action video, the image frames in which action icons need to be displayed can be predetermined according to the display requirements of the icons (for example, displaying an action icon at a specified moment of the video recording). Taking a 20-second dance game as an example, during the game development stage the developer may specify that, when the game reaches the Nth second, an action icon is displayed at a preset body part in the user's image frame data, such as the shoulder; the developer then determines, from the position data of that preset part in the standard data set for the Nth-second image frame, the relative position between the action icon and the part, which serves as the icon's preset position information.
The standard data set is obtained by fusing, according to preset rules, the body-part position data of the same image frame across multiple (at least two) candidate videos. The same image frame in the candidate videos (for example, the Nth frame of each candidate video) presents the same body-part state information, for example the same body action information; for instance, dance videos of the same dance recorded by different people can serve as candidate videos. The body-part position data in each frame of each candidate video can be obtained through motion capture using a motion capture system. Exemplarily, the server may determine a weight value for each candidate video, and then, based on those weights, compute a weighted average of the body-part position data of the same image frame across the candidate videos to obtain the standard position data set. The weight value of each candidate video may be determined from video interaction information and/or video publisher information: for example, the more interaction information a video has, the larger its weight; and if the publisher is a well-known figure, the video's weight is larger.
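The weighted-average fusion can be sketched as follows; the specific weighting rule (likes-based, doubled for well-known publishers) is an illustrative assumption, since the disclosure leaves the preset rules flexible:

```python
# Hypothetical sketch of building the standard position data set: for each
# frame index, the same body part's positions across candidate videos are
# combined by a weighted average; the weighting rule below is illustrative.
def video_weight(likes, is_famous_publisher):
    w = 1.0 + likes / 10000.0
    return w * 2.0 if is_famous_publisher else w

def fuse_positions(positions, weights):
    """positions: list of (x, y) for the same part in the same frame of each
    candidate video; weights: one weight per candidate video."""
    total = sum(weights)
    x = sum(w * p[0] for w, p in zip(weights, positions)) / total
    y = sum(w * p[1] for w, p in zip(weights, positions)) / total
    return (x, y)
```

A more heavily weighted video pulls the fused position toward its own body-part coordinates, which is how the position characteristics of different people are synthesized.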
Further, the multiple candidate videos may be obtained based on preset video screening information, which includes video interaction information and/or video publisher information; the video interaction information includes the number of likes and/or comments of a video. Exemplarily, in this embodiment of the present disclosure, for videos presenting the same body-part state information, videos whose likes exceed a like threshold, whose comments exceed a comment threshold, and which are published by people of relatively high renown can be screened out of Internet data as candidate videos. Each threshold can be set flexibly.
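A screening step of this kind might look as follows; the field names and threshold values are assumptions, and since the disclosure allows the criteria to be combined with "and/or", an OR of the two branches (interaction thresholds, or a well-known publisher) is assumed here:

```python
# Hypothetical sketch of candidate-video screening: keep videos whose likes
# and comments both exceed flexible thresholds, or whose publisher is well
# known. All field names and thresholds are illustrative.
LIKE_THRESHOLD = 50000
COMMENT_THRESHOLD = 2000

def select_candidates(videos):
    return [
        v for v in videos
        if (v["likes"] > LIKE_THRESHOLD and v["comments"] > COMMENT_THRESHOLD)
        or v["famous_publisher"]
    ]
```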
By fusing the body-part position data of the same image frame across multiple candidate videos into a standard position data set, the body-part position characteristics of different people can be synthesized, the body-part position information displayed in the video can be reasonably optimized, the reference value of the video improved, and the display positions of the action icons optimized; at the same time, this helps improve public recognition and acceptance of the optimized video effect.
In addition, provided the visual effect of the interface is preserved, the action icons may be displayed in the user image frame data in any available style. The display style may include information such as the icon's shape, color, dynamic effects, and static effects, and may be designed in advance according to actual needs; the embodiments of the present disclosure do not specifically limit it.
FIG. 2 is a schematic diagram of image frame data displaying action icons according to an embodiment of the present disclosure; it is intended to illustrate the embodiment and should not be construed as a specific limitation. As shown in FIG. 2, the user's current image frame data displays a circular first action icon 21 and an arrow-shaped second action icon 22. The first action icon 21 may be used to guide the user to move a hand to the position of the first action icon 21, and the second action icon 22 may be used to guide the user to swipe a hand in the direction of the arrow. The number of action icons displayable in each image frame is not specifically limited in the embodiments of the present disclosure.
S104. Collect and display second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data.
The collection interval between the second image frame data and the first image frame data is not specifically limited in the embodiments of the present disclosure; that is, the specific value of the preset time point can be set flexibly. Neither the second nor the first image frame data refers exclusively to one specific frame; each may refer to multiple frames of image data, with only their collection order fixed. As the user's image frames are collected in real time, the state information of the user's body parts displayed in the first and second image frame data may change continuously. In addition, an action icon displayed in the first image frame data may, based on its already determined display position, continue to be displayed in the second image frame data, or may not be displayed.
In the embodiments of the present disclosure, because the user's image frame data is collected in real time, the collection interval between the first and second image frame data is usually very small; therefore, continuing to display the action icon in the second image frame data at its already determined position does not cause a large change in the icon's display position. That is, the icon's display positions in the first and second image frame data are consistent to a certain degree. Of course, after collecting the user's second image frame data, the client may also identify at least one body part in it and determine that part's position information, and then, based on the position information of the at least one body part and the preset position information of the corresponding action icon, determine and display the icon's position in the second image frame data.
In one possible implementation, the action icons include expression icons, and the process of displaying the first image frame data or the second image frame data further includes:
identifying the user's expression in the first image frame data or the second image frame data, and determining an expression icon matching the user's expression;
determining the display position of the expression icon based on the position information of the facial features forming the user's expression in the first image frame data or the second image frame data, and displaying the expression icon at the determined display position.
For example, if, based on facial expression recognition technology, the user's expression in the first or second image frame data is identified as a pout, an expression icon matching the pout, such as a "heart" or a "kiss", is determined; then, based on the position of the user's mouth, a preset region around the mouth (which can be set flexibly) is determined as the display position of the "heart" or "kiss", and the corresponding effect icon is displayed in that preset region, thereby making the interaction more engaging.
S105. Determine the target human body part associated with the action icon in the second image frame data and the state information of that target body part.
The target body part associated with the action icon in the second image frame data is related to the action video content that the user has pre-selected to perform. In one possible implementation, the client may determine the target body part associated with the action icon in the second image frame data based on the playback time information of the background music or the collection time information of the second image frame data; the target body part may include at least one of the head, arms, hands, feet, and legs. For example, when the background music has played to the Nth second, or the collection time of the second image frame data is the Nth second, the target body part associated with the action icon in the second image frame data is determined to be the user's hand.
The state information of the target body part includes the position information of that part and/or the action information formed by it. For example, when the background music has played to the Nth second, or the collection time of the second image frame data is the Nth second, the user's hand may be placed on the user's shoulder, or may form an OK gesture, or may perform a clapping motion, and so on.
S106. Determine an evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
In the embodiments of the present disclosure, for each action icon, both its preset position information in the image frame data and the standard action information of the associated body part can be set in advance. That is, depending on the state information of the body part, the degree of matching between the state information and the action icon may include a position match degree and an action match degree; the client can therefore determine the user's evaluation result in the second image frame data from the matching results along multiple dimensions. The higher the match degree, the better the evaluation result. The evaluation result may be displayed in the second image frame data, and may be rendered in forms such as numbers, text, and/or English words; dynamic effects may also be added during display to enhance the visual effect of the interface.
During the real-time collection of user image frame data, after the client determines the user's evaluation result in the current image frame data, it may also combine that result with the user's evaluation results in previously collected frames to determine and display the user's cumulative evaluation result. Of course, if the user's evaluation result in the current image frame data is poor, the accumulated evaluation result may also be reset to zero.
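The accumulation-with-reset behavior can be sketched as follows; the grade names and point values are assumptions made for illustration, since the disclosure does not fix a scoring scheme:

```python
# Hypothetical sketch of cumulative scoring across frames: per-frame results
# accumulate, and a poor result clears the accumulated total. Grade names and
# point values are illustrative assumptions.
GRADE_POINTS = {"perfect": 100, "good": 50, "miss": 0}

class ScoreTracker:
    def __init__(self):
        self.total = 0

    def update(self, grade):
        if grade == "miss":          # poor result: clear the accumulated score
            self.total = 0
        else:
            self.total += GRADE_POINTS[grade]
        return self.total
```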
In one possible implementation, taking the case where the state information of the target body part associated with the action icon includes the part's position information as an example, determining the evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon includes:
determining the effective response area of the action icon in the second image frame data;
determining the position match degree between the position information of the target body part and the effective response area of the action icon, and determining the evaluation result according to the position match degree.
The effective response area of an action icon may be determined from its display position and/or display style. For example, a region of preset area and preset shape may be determined based on the icon's display position as its effective response area; or the shape region corresponding to the icon's display style may be determined as its effective response area; or, based on the icon's shape region, a region of a preset shape with an area smaller or larger than that shape region may be determined as its effective response area; or the effective response area may be determined from both the display position and the display style. The specific choice can be made flexibly, and how the effective response area is determined may be predetermined by the server.
If the target body part lies within the effective response area of the action icon, and the distance between the part's position and the center of that area is less than a first distance threshold (whose value can be set flexibly), then the position match degree between the part's position and the icon's effective response area is high; otherwise, if either condition is not satisfied, the position match degree is poor. Evidently, the higher the position match degree, the better the evaluation result.
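The two-condition position check can be sketched as follows; a circular effective response area is assumed here for simplicity (the disclosure allows other shapes), and the radius and threshold values are illustrative:

```python
# Hypothetical sketch of the position-match check: the body part must lie
# inside the icon's effective response area (assumed circular) AND be closer
# to its center than the first distance threshold. Values are illustrative.
import math

def position_match(part_pos, area_center, area_radius, first_distance_threshold):
    d = math.dist(part_pos, area_center)
    inside_area = d <= area_radius
    close_to_center = d < first_distance_threshold
    return "high" if (inside_area and close_to_center) else "poor"
```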
Of course, other ways of determining the position match degree between the target body part and the action icon can also be flexibly adopted by those skilled in the art. For example, the distance between the display position coordinates of the action icon in the first or second image frame data and the position coordinates of the target body part in the second image frame data may be computed directly: if the computed distance is less than a second distance threshold (whose value can be set flexibly), the position match degree between the associated body part and the action icon in the second image frame data is high, and the corresponding evaluation result is good; if the computed distance is greater than or equal to the second distance threshold, the position match degree is poor, and the corresponding evaluation result is poor.
In one possible implementation, taking the case where the state information of the target body part associated with the action icon includes the action information formed by the part as an example, determining the evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon includes:
determining the standard action information corresponding to the action icon, where the standard action information corresponding to different action icons can be determined in advance on the server;
determining the action match degree between the action information formed by the target body part in the second image frame data and the standard action information, and determining the evaluation result according to the action match degree.
The action information formed by body parts includes, but is not limited to, dance-game action information. Exemplarily, the action match degree between the action information formed by the target body part in the second image frame data and the standard action information may be determined based on key point matching technology. For example, for an OK gesture, the key point coordinates of the user's hand while forming the gesture may be extracted and compared against the hand key point coordinates corresponding to the standard OK gesture to determine the action match degree.
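A key-point comparison of this kind can be sketched as follows; the disclosure names the technique but not a scoring formula, so the mean-distance score below is an illustrative assumption, and both point lists are assumed to be pre-normalized to a common scale:

```python
# Hypothetical sketch of key-point matching for the action match degree: the
# detected hand key points are compared point by point against the standard
# gesture's key points, and the mean distance is turned into a match score.
import math

def action_match_degree(detected_keypoints, standard_keypoints):
    """Both arguments: equal-length lists of (x, y) key-point coordinates,
    assumed already normalized to a common scale."""
    dists = [math.dist(p, q) for p, q in zip(detected_keypoints, standard_keypoints)]
    mean_dist = sum(dists) / len(dists)
    return max(0.0, 1.0 - mean_dist)   # 1.0 = identical, 0.0 = far apart
```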
In the embodiments of the present disclosure, the client may call the camera to collect and display the user's first and second image frame data in real time, the first image frame data being the data collected earlier. The client first identifies the body parts in the first image frame data in real time and determines their position information, then combines this with the preset position information of the action icons to determine the icons' accurate display positions on the first image frame data; that is, as body-part positions change, the icons' display positions on the first image frame data can be adjusted (or corrected) in real time. Finally, the evaluation result is determined according to the degree of matching between the state information of the target body part associated with the action icon in the second image frame data and the action icon. The embodiments of the present disclosure thereby effectively combine the user image frame data collected by the camera with the action icons to be displayed in that data, dynamically adjust the icons' display positions according to the positions of the user's body parts, accurately evaluate the state information of those body parts, and improve the user's interactive experience.
FIG. 3 is a flowchart of another interaction method provided by an embodiment of the present disclosure, further optimized and expanded on the basis of the above technical solution; it can be combined with each of the above optional implementations.
As shown in FIG. 3, the interaction method provided by the embodiment of the present disclosure may include S201-S209:
S201. Collect and display first image frame data of the user.
S202. Identify at least one human body part in the first image frame data, and determine the position information of the body part.
S203. Determine the display position of the action icon based on the position information of at least one body part and the preset position information of the action icon corresponding to the body part.
It should be noted that, for the details of S201-S203, refer to S101-S103 above, respectively.
S204. Determine the display style of the action icon based on the playback time information of the background music or the collection time information of the first image frame data.
The display style used for the action icons at different playback times of the background music (for example, at the 3rd second or at the 7th second), or at different image-frame collection times (or video recording times), is predetermined during the action development stage; therefore, the client can determine the icon's current display style from the current playback time information of the background music or the collection time information of the first image frame data. The display styles of the action icons at different time points may be the same or different.
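A time-indexed style lookup of this kind can be sketched as follows; the style table, its fields, and the time points are illustrative assumptions made here, since the disclosure leaves the styles to be designed in advance:

```python
# Hypothetical sketch of style selection by playback time: display styles are
# authored per time point during development and looked up at run time. The
# table below picks the latest style whose start time has been reached.
STYLE_TIMELINE = [
    # (start_second, style)
    (0.0, {"shape": "circle", "color": "blue", "effect": "pulse"}),
    (3.0, {"shape": "circle", "color": "gold", "effect": "glow"}),
    (7.0, {"shape": "arrow", "color": "red", "effect": "slide"}),
]

def style_at(playback_second):
    current = STYLE_TIMELINE[0][1]
    for start, style in STYLE_TIMELINE:
        if playback_second >= start:
            current = style
    return current
```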
S205. Display the action icon at the display position using the display style.
S206. Collect and display second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data.
It should be noted that, for the details of S206, refer to S104 above.
S207. Display guidance information on the second image frame data to guide the user to change the state information of the target body part associated with the action icon.
The guidance information includes at least one of a guidance video animation, a guidance picture, and a guidance instruction; in addition, a guidance instruction may also be played as voice. The guidance information may be derived from the standard data set of the preceding embodiments. Taking a guidance video animation or a guidance picture as an example, it may be obtained by importing the standard data set of the preceding embodiments into a human body model and performing image processing. Specifically, based on existing three-dimensional animation production principles, developers can use the server to import the standard data set into the body model, generate the guidance video animation through model rendering, or obtain guidance pictures in the form of screenshots, and then deliver them from the server to the client.
The standard data set synthesizes the body-part position characteristics of different people; deriving the guidance information from it improves the information's reference value and the public's recognition and acceptance of it.
The guidance information may be directly overlaid on the second image frame data, or displayed in it in a form such as an independent playback window. The specific display position of the guidance information in the second image frame data is not limited by the embodiments of the present disclosure; it may, for example, be the lower right, upper right, upper left, or lower left of the image.
Further, during the real-time collection of user image frame data, the client may also dynamically adjust the display position of the guidance information based on the positions of the user's body parts in the image frame data, so as to avoid overlap between body parts and guidance information. For example, if the client detects that the user's limbs are positioned toward the right of the second image frame data, the guidance information can be displayed toward the left of the second image frame data.
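The overlap-avoidance rule can be sketched as follows; the bounding-box format and the left/right decision based on the frame midline are illustrative assumptions:

```python
# Hypothetical sketch of keeping the guidance window away from the user: if
# the detected body region sits in the right half of the frame, show the
# guidance on the left, and vice versa.
def guidance_side(body_bbox, frame_width):
    """body_bbox: (x_min, y_min, x_max, y_max) of the detected user region."""
    body_center_x = (body_bbox[0] + body_bbox[2]) / 2
    return "left" if body_center_x > frame_width / 2 else "right"
```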
FIG. 4 is a schematic diagram of image frame data displaying action icons and a guidance video animation according to an embodiment of the present disclosure; it is intended to illustrate the embodiment and should not be construed as a specific limitation. As shown in FIG. 4, the current image frame data displays the first action icon 21 and the second action icon 22; at the same time, a guidance video animation 23 is displayed at the lower left of the current image frame data to guide the user through the correct body movement.
S208. Determine the target human body part associated with the action icon in the second image frame data and the state information of the target body part.
S209. Determine the evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
It should be noted that, for the details of S208-S209, refer to S105-S106 above, respectively.
On the basis of the above technical solution, after determining the evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon, the method further includes:
determining an evaluation result animation according to the evaluation result, where the specific implementation of the evaluation result animation (or action judgment animation) can be set flexibly and is not specifically limited by the embodiments of the present disclosure;
using the display position of the action icon to determine the animation display position of the evaluation result animation in the second image frame data, and displaying the evaluation result animation at that animation display position.
Exemplarily, the display position of the evaluation result animation may or may not coincide with that of the action icon. For example, after the evaluation result animation is determined, it may be displayed at the icon's display position while the icon is simultaneously hidden, producing an interface effect of the icon switching into the animation.
Displaying the evaluation result animation improves the visual effect of the interface and makes the user's video recording more engaging. FIG. 5 is a schematic diagram of image frame data displaying an evaluation result animation according to an embodiment of the present disclosure; it is intended to illustrate the embodiment and should not be construed as a specific limitation. As shown in FIG. 5, the position of the user's hand movement matches the position of the action icon at the shoulder closely (that is, it overlaps well with the icon's effective response area), so the user's hand movement is evaluated as perfect; accordingly, a circular evaluation result animation 51 displaying the word "Perfect" is shown in the image frame data. During display, the evaluation result animation 51 can dynamically change the size of the circle, change its display color, and so on. Image frame data displaying the evaluation result animation can serve as valid video frame data.
On the basis of the above technical solution, in a possible implementation manner, after the evaluation result is determined according to the matching degree between the state information of the target body part in the second image frame data and the action icon, the method further includes:
generating a first shared video based on the collected first image frame data and second image frame data; since the user's image frame data belongs to an image sequence collected in real time, a complete user video can be obtained from the first image frame data and the second image frame data, and the action icons, guidance information, evaluation result animations and the like can be displayed in the corresponding image frame data of the shared video;
sending a first video sharing request to the server according to the user's video sharing operation; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the user identifier of the sharing object is used by the server to determine a second shared video shared by the sharing object; the second shared video and the first shared video may be videos of the same action content recorded by different people; there may be one or more sharing objects, and correspondingly the second shared video may refer to one video or to multiple videos;
receiving a composite video returned by the server; wherein the composite video is obtained by the server synthesizing the first shared video and the second shared video for same-screen display. For the specific implementation of video synthesis, reference may be made to the prior art. The same-screen display may be a left-right split screen or a top-bottom split screen; the layout differs with the number of users participating in the video sharing.
Exemplarily, after the current client generates the first shared video, it may switch from the current interface to a sharing-object selection interface according to a sharing-object selection operation triggered by the current user, so that the current user can select at least one sharing object. After obtaining the user identifiers of the sharing objects selected by the current user, the client switches back to the current interface and, according to the video sharing operation triggered by the current user, generates the first video sharing request and sends it to the server. The client controlled by a sharing object may perform the same operations to share the second shared video to the server. Moreover, on the basis of communication between the users, the client controlled by the current user (i.e., the sharing initiator) and the client controlled by the sharing object may send video sharing requests to the server at the same time. After the server completes the video synthesis, it may send the composite video to both the client controlled by the current user and the client controlled by the sharing object.
FIG. 6 is a schematic diagram of same-screen display of shared videos provided by an embodiment of the present disclosure; it takes two people participating in video sharing as an example to illustrate the embodiment and should not be construed as a specific limitation on it. As shown in FIG. 6, user A and user B are each other's sharing objects, and the client controlled by the sharing initiator and the client controlled by the sharing object can display the two users' shared videos at the same time. In FIG. 6, the display position of the action icon is above the shoulder. User A's hand is shown above the shoulder, i.e., the hand position closely matches the display position of the action icon, so user A's evaluation result is "perfect"; user B's hand is shown on the right side of the body, i.e., the hand position matches the display position of the action icon poorly, so user B's evaluation result is "average". Furthermore, FIG. 6 shows different evaluation result animations for the different results: for user A, the animation is formed by a star pattern with the word "perfect" displayed inside it; for user B, the animation is formed by a circular pattern with the word "average" displayed inside it.
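The left-right split-screen composition of FIG. 6 can be sketched per frame as a horizontal concatenation of the two users' frames. This is a minimal illustration using NumPy arrays as frames; the disclosure refers video synthesis to the prior art, so the layout rule shown here (two participants mapped to a left-right split) is an assumption.

```python
import numpy as np

def compose_side_by_side(frame_a, frame_b):
    """Compose two equal-height video frames (H x W x C arrays) left-right on one screen."""
    assert frame_a.shape[0] == frame_b.shape[0], "frame heights must match"
    # Concatenating along the width axis yields the split-screen frame.
    return np.concatenate([frame_a, frame_b], axis=1)
```

Applying this to every pair of time-aligned frames of the first and second shared videos yields the composite video returned to the clients.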
On the basis of the above technical solution, in a possible implementation manner, before displaying the first image frame data, the method further includes:
switching from the current mode to an image synchronization sharing mode according to the user's image synchronization operation; that is, in the image synchronization sharing mode, after the current user determines the sharing object, the client can display the image frame data acquired locally in real time while also displaying the image frame data acquired in real time by the client controlled by the sharing object; for the same-screen display effect, reference may be made to FIG. 6.
Correspondingly, in the process of displaying the first image frame data and the second image frame data, the method further includes:
receiving first shared image frame data in real time, and displaying the first shared image frame data and the first image frame data on the same screen;
receiving second shared image frame data in real time, and displaying the second shared image frame data and the second image frame data on the same screen;
wherein the first shared image frame data and the second shared image frame data are shared in real time by the sharing object, and the sharing object is predetermined by the user.
Exemplarily, the synchronous display of the first shared image frame data and the second shared image frame data across different clients may be realized directly through client-to-client interaction, or through data relay between the two clients via the server.
Exemplarily, the sharing object may be determined before or after the current user triggers the image synchronization operation. After the client controlled by the current user switches from the current mode to the image synchronization sharing mode, it may send a mode switching notification to the server. The notification may carry the user identifier of the sharing object, so as to notify the server to forward, in real time, the first shared image frame data and the second shared image frame data received from the sharing object to the client controlled by the current user. At the same time, while displaying the image frame data acquired in real time, the client controlled by the current user also shares that image frame data to the server in real time, so that after the client controlled by the sharing object performs the foregoing operations, it can likewise display the current user's image frame data synchronously. Content such as action icons, guidance information and evaluation result animations can also be displayed synchronously during the same-screen display of the image frame data. Moreover, on the basis of communication between the users, the client controlled by the current user and the client controlled by the sharing object may switch to the image synchronization sharing mode at the same time.
In the embodiments of the present disclosure, through image sharing and synthesis, the image frame data of different users can be displayed on the same screen in the same client, which makes image or video interaction more engaging.
FIG. 7 is a flowchart of another interaction method provided by an embodiment of the present disclosure, applied to a server. The method may be executed by an interaction apparatus configured on the server, and the apparatus may be implemented in software and/or hardware.
The interaction method applied to the server provided by the embodiments of the present disclosure may be executed in cooperation with the interaction method applied to the client provided by the embodiments of the present disclosure. For content not described in detail in the following embodiments, reference may be made to the explanations in the above embodiments.
As shown in FIG. 7, the interaction method provided by the embodiment of the present disclosure may include S301-S304:
S301: acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos.
S302: fusing the body part position data of the same image frame across the multiple candidate videos based on a preset rule, to obtain a standard position data set.
S303: looking up, in the standard position data set, the position data of a target body part in at least one image frame of the multiple candidate videos.
S304: determining, by using the looked-up position data, preset position information of the action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
While determining the preset position information of the action icon, the standard action information corresponding to the action icon may also be determined based on the action information formed in the image frames by the target body part corresponding to the action icon.
In a possible implementation manner, acquiring multiple candidate videos and extracting the body part position data of each image frame in the multiple candidate videos includes:
acquiring multiple candidate videos based on preset video screening information; wherein the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the number of likes and/or comments of a video;
extracting body part position data of each image frame in the multiple candidate videos.
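The screening step can be sketched as a simple filter over video metadata. The field names `likes` and `comments` and the thresholds below are hypothetical, chosen only to illustrate screening by video interaction information; the disclosure does not fix a metadata schema or concrete cutoffs.

```python
def select_candidates(videos, min_likes=10_000, min_comments=500):
    """Keep only videos whose interaction info clears illustrative thresholds.

    `videos` is a list of metadata dicts with hypothetical keys
    "likes" and "comments"; thresholds are example values.
    """
    return [
        v for v in videos
        if v["likes"] >= min_likes and v["comments"] >= min_comments
    ]
```

Publisher-based screening could be added analogously, e.g. by also checking a publisher-identifier field against an allowlist.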
In a possible implementation manner, the interaction method provided by the embodiment of the present disclosure further includes:
generating guidance information based on the standard position data set;
sending the guidance information to the client, so that the client displays the guidance information on the collected user image frame data and guides the user to change the state information of the target body part associated with the action icon in the image frame data.
In a possible implementation manner, the guidance information includes at least one of a guidance video animation, a guidance picture and a guidance instruction.
In a possible implementation manner, fusing the body part position data of the same image frame across the multiple candidate videos based on a preset rule to obtain a standard position data set includes:
determining a weight value for each candidate video;
performing a weighted average calculation on the body part position data of the same image frame across the multiple candidate videos, based on the weight value of each candidate video, to obtain the standard position data set.
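The weighted-average fusion of S302 can be sketched as follows for a single image frame. The per-part dictionary layout and the weight values are illustrative assumptions; the disclosure does not fix a data format.

```python
def fuse_positions(per_video_positions, weights):
    """Weighted average of one frame's body-part positions across candidate videos.

    per_video_positions: one {part_name: (x, y)} dict per candidate video,
    all for the same frame index. weights: one weight per video.
    Returns the fused {part_name: (x, y)} entry of the standard position data set.
    """
    total = sum(weights)
    fused = {}
    for part in per_video_positions[0]:
        x = sum(w * pos[part][0] for w, pos in zip(weights, per_video_positions)) / total
        y = sum(w * pos[part][1] for w, pos in zip(weights, per_video_positions)) / total
        fused[part] = (x, y)
    return fused
```

Running this over every frame index yields the full standard position data set used in S303 and S304.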
In a possible implementation manner, the interaction method provided by the embodiment of the present disclosure further includes:
receiving a first video sharing request sent by the client; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the first shared video is generated by the client based on the collected first image frame data and second image frame data;
determining, based on the user identifier of the sharing object, a second shared video shared by the sharing object; wherein the second shared video and the first shared video may include image frames of body parts showing the same state information;
synthesizing the first shared video and the second shared video into a composite video for same-screen display;
sending the composite video to the client.
In a possible implementation manner, the interaction method provided by the embodiment of the present disclosure further includes:
receiving first shared image frame data shared in real time by a sharing object; wherein the sharing object is predetermined by the user;
sending the first shared image frame data to the client in real time, so that the client displays the first shared image frame data and the locally collected first image frame data on the same screen; wherein the first shared image frame data and the first image frame data collected locally by the client may show body parts with the same state information;
receiving second shared image frame data shared in real time by the sharing object; wherein the sharing object is predetermined by the user;
sending the second shared image frame data to the client in real time, so that the client displays the second shared image frame data and the locally collected second image frame data on the same screen; wherein the second shared image frame data and the second image frame data collected locally by the client may show body parts with the same state information.
In the embodiments of the present disclosure, the server can determine a standard position data set based on the position data of body parts in each image frame of multiple candidate videos, then determine the preset position information of the action icon corresponding to the target body part based on the position data of the target body part in the standard position data set, and deliver it to the client. The client, combining this with the position information of the body parts identified in the currently displayed image frame data, dynamically determines an accurate display position of the action icon in the image frame data, achieving the effect that the display position of the action icon in the user's image frame data is adjusted dynamically as the position of the user's body parts changes. At the same time, the client determines the evaluation result based on the matching degree between the action icon and the state information of the associated target body part in the user image frame data collected in real time. The embodiments of the present disclosure thus effectively combine the user image frame data collected by the camera with the action icon to be displayed in that data, dynamically adjust the display position of the action icon according to the position of the user's body parts, accurately evaluate the state information of the user's body parts, and improve the user's interaction experience.
In addition, through the interaction between the server and the client, shared videos of multiple people can be displayed on the same screen in the client, which makes video sharing more engaging.
FIG. 8 is a schematic structural diagram of an interaction apparatus provided by an embodiment of the present disclosure. The apparatus may be configured in a client and may be implemented in software and/or hardware. The client mentioned in the embodiments of the present disclosure may be any client with a video interaction function, and the terminal device on which the client is installed may include, but is not limited to, a smartphone, a tablet computer, a notebook computer, and the like.
As shown in FIG. 8, the interaction apparatus 400 provided by the embodiment of the present disclosure may include a first collection module 401, a first determination module 402, a display position determination module 403, a second collection module 404, a second determination module 405 and an evaluation module 406, wherein:
the first collection module 401 is configured to collect and display first image frame data of a user;
the first determination module 402 is configured to identify at least one body part in the first image frame data and determine position information of the body part;
the display position determination module 403 is configured to determine the display position of an action icon based on the position information of the at least one body part and preset position information of the action icon corresponding to the body part, and display the action icon at the display position;
the second collection module 404 is configured to collect and display second image frame data of the user; wherein the second image frame data is image frame data at a preset time point after the first image frame data;
the second determination module 405 is configured to determine a target body part associated with the action icon in the second image frame data, as well as state information of the target body part;
the evaluation module 406 is configured to determine an evaluation result according to the matching degree between the state information of the target body part in the second image frame data and the action icon.
In a possible implementation manner, the state information of the target body part includes position information of the target body part and/or action information formed by the target body part.
In a possible implementation manner, the state information of the target body part includes position information of the target body part;
the evaluation module 406 includes:
an effective response area determination unit, configured to determine an effective response area of the action icon in the second image frame data;
a first evaluation result determination unit, configured to determine the position matching degree between the position information of the target body part and the effective response area of the action icon, and determine the evaluation result according to the position matching degree.
In a possible implementation manner, the state information of the target body part includes action information formed by the target body part;
the evaluation module 406 includes:
a standard action information determination unit, configured to determine standard action information corresponding to the action icon;
a second evaluation result determination unit, configured to determine the action matching degree between the action information formed by the target body part in the second image frame data and the standard action information, and determine the evaluation result according to the action matching degree.
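One way to realize an action matching degree, sketched here as an assumption since the disclosure does not specify the metric, is cosine similarity between the observed keypoint vector and the standard action vector, followed by thresholding into an evaluation result; the thresholds are illustrative.

```python
import math

def action_match(observed, standard):
    """Action matching degree as cosine similarity of flattened keypoint vectors."""
    dot = sum(a * b for a, b in zip(observed, standard))
    na = math.sqrt(sum(a * a for a in observed))
    nb = math.sqrt(sum(b * b for b in standard))
    return dot / (na * nb) if na and nb else 0.0

def grade(match, perfect=0.95, good=0.8):
    """Map a matching degree in [0, 1] to an evaluation result; thresholds illustrative."""
    if match >= perfect:
        return "perfect"
    if match >= good:
        return "good"
    return "average"
```

A perfectly reproduced standard action yields a matching degree of 1.0 and grades "perfect"; an orthogonal (unrelated) pose yields 0.0 and grades "average".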
In a possible implementation manner, the preset position information of the action icon is obtained based on the position data, in a standard data set, of the body part corresponding to the action icon;
the standard data set is obtained by fusing the body part position data of the same image frame across multiple candidate videos based on a preset rule.
In a possible implementation manner, the multiple candidate videos are obtained based on preset video screening information; the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the number of likes and/or comments of a video.
In a possible implementation manner, the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
a guidance information display module, configured to display guidance information on the second image frame data, so as to guide the user to change the state information of the target body part associated with the action icon.
In a possible implementation manner, the guidance information includes at least one of a guidance video animation, a guidance picture and a guidance instruction.
In a possible implementation manner, the display position determination module 403 includes:
a display position determination unit, configured to determine the display position of the action icon based on the position information of the at least one body part and the preset position information of the action icon corresponding to the body part;
an action icon display unit, configured to display the action icon at the display position;
the action icon display unit includes:
a display style determination subunit, configured to determine a display style of the action icon based on the playback time information of background music or on the collection time information of the first image frame data;
an action icon display subunit, configured to display the action icon at the display position in the display style.
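The display-style selection performed by the subunit above can be sketched as a mapping from background-music playback time to a style that cycles with the beat. The beat length and the style names are hypothetical placeholders, not values from the disclosure.

```python
def pick_style(playback_ms, beat_ms=500, styles=("pulse", "glow", "spin")):
    """Choose an action-icon display style from background-music playback time.

    The style cycles once per beat; beat length and style names are illustrative.
    """
    return styles[(playback_ms // beat_ms) % len(styles)]
```

The same function applies unchanged if collection time of the first image frame data is used in place of music playback time.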
In a possible implementation manner, the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
an evaluation result animation determination module, configured to determine an evaluation result animation according to the evaluation result;
an animation display module, configured to determine an animation display position of the evaluation result animation in the second image frame data by using the display position of the action icon, and display the evaluation result animation at the animation display position.
In a possible implementation manner, the second determination module 405 includes:
an associated body part determination unit, configured to determine the target body part associated with the action icon in the second image frame data;
a state information determination unit, configured to determine the state information of the target body part;
wherein the associated body part determination unit is specifically configured to determine the target body part associated with the action icon in the second image frame data based on the playback time information of background music or on the collection time information of the second image frame data.
In a possible implementation manner, the action icon includes an emoticon icon, and the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
a user expression recognition module, configured to recognize a user expression in the first image frame data or the second image frame data, and determine an emoticon icon matching the user expression;
an emoticon icon display module, configured to determine a display position of the emoticon icon based on the position information, in the first image frame data or the second image frame data, of the facial features forming the user expression, and display the emoticon icon at the determined display position.
In a possible implementation manner, the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
a first shared video generation module, configured to generate a first shared video based on the collected first image frame data and second image frame data;
a sharing request sending module, configured to send a first video sharing request to the server according to the user's video sharing operation; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the user identifier of the sharing object is used by the server to determine a second shared video shared by the sharing object;
a composite video receiving module, configured to receive a composite video returned by the server; wherein the composite video is obtained by the server synthesizing the first shared video and the second shared video for same-screen display.
In a possible implementation manner, the interaction apparatus 400 provided by the embodiment of the present disclosure further includes:
a mode switching module, configured to switch from the current mode to an image synchronization sharing mode according to the user's image synchronization operation;
a first same-screen display module, configured to receive first shared image frame data in real time, and display the first shared image frame data and the first image frame data on the same screen;
a second same-screen display module, configured to receive second shared image frame data in real time, and display the second shared image frame data and the second image frame data on the same screen;
wherein the first shared image frame data and the second shared image frame data are shared in real time by the sharing object, and the sharing object is predetermined by the user.
In a possible implementation manner, the action information formed by the body parts includes dance-game action information.
In a possible implementation manner, the body part identified in the first image frame data or the second image frame data includes at least one of a head, an arm, a hand, a foot and a leg.
The interaction apparatus configured on the client provided by the embodiments of the present disclosure can execute any interaction method applied to the client provided by the embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method. For content not described in detail in the apparatus embodiments of the present disclosure, reference may be made to the description in any method embodiment of the present disclosure.
图9为本公开实施例提供的另一种交互装置的结构示意图,该装置可以配置于服务器中,可以采用软件和/或硬件实现。FIG. 9 is a schematic structural diagram of another interaction apparatus provided by an embodiment of the present disclosure. The apparatus may be configured in a server, and may be implemented by software and/or hardware.
如图9所示,本公开实施例提供的交互装置500可以包括位置数据提取模块501、标准位置数据集确定模块502、位置数据查找模块503和预设位置信息确定模块504,其中:As shown in FIG. 9 , the interaction apparatus 500 provided by the embodiment of the present disclosure may include a position data extraction module 501, a standard position data set determination module 502, a position data search module 503, and a preset position information determination module 504, wherein:
位置数据提取模块501,用于获取多个候选视频,并提取多个候选视频中各图像帧的人体部位位置数据;The position data extraction module 501 is used to obtain a plurality of candidate videos, and extract the position data of human body parts of each image frame in the plurality of candidate videos;
标准位置数据集确定模块502,用于基于预设的规则对多个候选视频中同一图像帧的人体部位位置数据进行融合,得到标准位置数据集;The standard position data set determination module 502 is configured to fuse the body part position data of the same image frame in the multiple candidate videos based on preset rules to obtain a standard position data set;
位置数据查找模块503,用于查找多个候选视频中至少一个图像帧中的目标人体部位在标准位置数据集中的位置数据;The position data search module 503 is used to search for the position data of the target human body part in the standard position data set in at least one image frame in the multiple candidate videos;
预设位置信息确定模块504,用于利用查找的位置数据确定与目标人体部位对应的动作图标的预设位置信息,以参与确定动作图标在客户端展示的图像帧数据中的展示位置。The preset position information determination module 504 is used to determine the preset position information of the action icon corresponding to the target body part by using the searched position data, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
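As a concrete illustration of how modules 503 and 504 might cooperate, the following minimal Python sketch looks up the fused standard position of a target body part for a given frame and derives the action icon's preset position from it. The normalized [0, 1] coordinate convention, the dictionary layout, and the fixed offset are illustrative assumptions, not details specified by this disclosure:

```python
def preset_icon_position(standard_positions, frame_index, body_part, offset=(0.05, -0.05)):
    """Look up the fused position of `body_part` at `frame_index` in the
    standard position data set and derive the action icon's preset
    position by applying a small offset (hypothetical scheme).

    `standard_positions` maps (frame_index, body_part) -> (x, y),
    with coordinates normalized to [0, 1].
    """
    x, y = standard_positions[(frame_index, body_part)]
    # Clamp so the icon's preset position stays inside the frame.
    ix = min(max(x + offset[0], 0.0), 1.0)
    iy = min(max(y + offset[1], 0.0), 1.0)
    return (ix, iy)
```

The client can then combine this preset position information with the body part position it detects locally to decide where to render the icon.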
在一种可能的实施方式下,位置数据提取模块501包括:In a possible implementation manner, the location data extraction module 501 includes:
视频筛选单元,用于基于预设视频筛选信息,获取多个候选视频;其中,预设视频筛选信息包括视频交互信息和/或视频发布者信息,视频交互信息包括视频的点赞量和/或评论量;A video screening unit, configured to obtain multiple candidate videos based on preset video screening information; wherein the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the number of likes and/or comments of the video;
位置数据提取单元,用于提取多个候选视频中各图像帧的人体部位位置数据。The position data extraction unit is used for extracting the position data of human body parts of each image frame in the multiple candidate videos.
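The screening performed by the video screening unit can be sketched as follows. This is a minimal illustration in Python; the dict-based video metadata, the threshold values, and the publisher allowlist are all hypothetical choices, since the disclosure only states that screening uses interaction information and/or publisher information:

```python
def screen_candidate_videos(videos, min_likes=1000, min_comments=100, trusted_publishers=None):
    """Select candidate videos by interaction metrics and/or publisher.

    `videos` is a list of dicts with hypothetical keys: 'id', 'likes',
    'comments', 'publisher'. Threshold values are illustrative only.
    """
    selected = []
    for v in videos:
        by_interaction = v["likes"] >= min_likes or v["comments"] >= min_comments
        by_publisher = trusted_publishers is not None and v["publisher"] in trusted_publishers
        if by_interaction or by_publisher:
            selected.append(v)
    return selected
```

Highly-liked or heavily-commented videos pass on interaction alone, while videos from a designated publisher pass regardless of their metrics.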
在一种可能的实施方式下,本公开实施例提供的交互装置500还包括:In a possible implementation manner, the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
引导信息生成模块,用于基于标准位置数据集生成引导信息;a guidance information generation module for generating guidance information based on a standard location data set;
引导信息发送模块,用于向客户端发送引导信息,以使客户端在采集的用户图像帧数据上展示引导信息,并引导用户改变图像帧数据中与动作图标关联的目标人体部位的状态信息。The guidance information sending module is used to send guidance information to the client, so that the client can display the guidance information on the collected user image frame data, and guide the user to change the state information of the target body part associated with the action icon in the image frame data.
在一种可能的实施方式下,引导信息包括引导视频动画、引导图片和引导指令中的至少一种。In a possible implementation manner, the guide information includes at least one of a guide video animation, a guide picture and a guide instruction.
在一种可能的实施方式下,标准位置数据集确定模块502包括:In a possible implementation, the standard location data set determination module 502 includes:
视频权重确定单元,用于确定每个候选视频的权重值;a video weight determination unit for determining the weight value of each candidate video;
标准位置数据集确定单元,用于基于每个候选视频的权重值,对多个候选视频中同一图像帧的人体部位位置数据进行加权平均计算,得到标准位置数据集。The standard position data set determination unit is configured to perform weighted average calculation on the position data of human body parts of the same image frame in the multiple candidate videos based on the weight value of each candidate video to obtain the standard position data set.
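A minimal sketch of the weighted-average fusion performed by the standard position data set determination unit, assuming each candidate video contributes one (x, y) coordinate for the same body part in the same (time-aligned) frame, and that the per-video weight values are already known; how the weights are derived is left open here:

```python
def fuse_body_part_positions(positions, weights):
    """Weighted average of per-video (x, y) positions for one body part
    in one aligned frame. `positions` and `weights` are parallel lists.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    fx = sum(w * x for w, (x, y) in zip(weights, positions)) / total
    fy = sum(w * y for w, (x, y) in zip(weights, positions)) / total
    return (fx, fy)
```

Running this per frame and per body part over all candidate videos yields the standard position data set.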
在一种可能的实施方式下,本公开实施例提供的交互装置500还包括:In a possible implementation manner, the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
视频分享请求接收模块,用于接收客户端发送第一视频分享请求;其中,第一视频分享请求中携带第一分享视频和分享对象的用户标识,第一分享视频由客户端基于采集的第一图像帧数据和第二图像帧数据生成;The video sharing request receiving module is used to receive the first video sharing request sent by the client; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the first shared video is generated by the client based on the collected first image frame data and second image frame data;
分享视频确定模块,用于基于分享对象的用户标识,确定分享对象分享的第二分享视频;其中,第二分享视频和第一分享视频可以包括展示相同状态信息的人体部位的图像帧;a shared video determination module, configured to determine the second shared video shared by the shared object based on the user identification of the shared object; wherein the second shared video and the first shared video may include image frames of human body parts showing the same state information;
视频合成模块,用于将第一分享视频和第二分享视频合成同屏展示的合成视频;a video synthesis module, used to synthesize the first shared video and the second shared video into a composite video displayed on the same screen;
合成视频发送模块,用于将合成视频发送至客户端。The composite video sending module is used to send the composite video to the client.
在一种可能的实施方式下,本公开实施例提供的交互装置500还包括:In a possible implementation manner, the interaction apparatus 500 provided by the embodiment of the present disclosure further includes:
第一分享图像接收模块,用于接收分享对象实时分享的第一分享图像帧数据;其中,分享对象由用户预先确定;a first shared image receiving module, configured to receive the first shared image frame data shared by the shared object in real time; wherein, the shared object is predetermined by the user;
第一分享图像发送模块,用于将第一分享图像帧数据实时发送至客户端,以使客户端将第一分享图像帧数据和本地采集的第一图像帧数据进行同屏展示;其中,第一分享图像帧数据与客户端本地采集的第一图像帧数据可以展示具有相同状态信息的人体部位;The first shared image sending module is used to send the first shared image frame data to the client in real time, so that the client displays the first shared image frame data and the locally collected first image frame data on the same screen; wherein the first shared image frame data and the first image frame data locally collected by the client may show body parts with the same state information;
第二分享图像接收模块,用于接收分享对象实时分享的第二分享图像帧数据;其中,分享对象由用户预先确定;The second shared image receiving module is configured to receive the second shared image frame data shared by the shared object in real time; wherein, the shared object is predetermined by the user;
第二分享图像发送模块,用于将第二分享图像帧数据实时发送至客户端,以使客户端将第二分享图像帧数据和本地采集的第二图像帧数据进行同屏展示;其中,第二分享图像帧数据与客户端本地采集的第二图像帧数据可以展示具有相同状态信息的人体部位。The second shared image sending module is used to send the second shared image frame data to the client in real time, so that the client displays the second shared image frame data and the locally collected second image frame data on the same screen; wherein the second shared image frame data and the second image frame data locally collected by the client may show body parts with the same state information.
本公开实施例所提供的配置于服务器的交互装置可执行本公开实施例所提供的应用于服务器的交互方法,具备执行方法相应的功能模块和有益效果。本公开装置实施例中未详尽描述的内容可以参考本公开任意方法实施例中的描述。The interaction device configured on the server provided by the embodiment of the present disclosure can execute the interaction method applied to the server provided by the embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method. For the content that is not described in detail in the apparatus embodiment of the present disclosure, reference may be made to the description in any method embodiment of the present disclosure.
图10为本公开实施例提供的一种终端的结构示意图,用于对实现本公开实施例提供的交互方法的终端进行示例性说明。本公开实施例中的终端可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图10示出的终端仅仅是一个示例,不应对本公开实施例的功能和占用范围带来任何限制。FIG. 10 is a schematic structural diagram of a terminal provided by an embodiment of the present disclosure, which exemplarily illustrates a terminal implementing the interaction method provided by the embodiments of the present disclosure. The terminals in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (e.g., vehicle-mounted navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The terminal shown in FIG. 10 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
如图10所示,终端600包括一个或多个处理器601、存储器602和摄像头605。As shown in FIG. 10 , the terminal 600 includes one or more processors 601 , a memory 602 and a camera 605 .
摄像头605用于实时采集用户的图像帧数据。The camera 605 is used to collect image frame data of the user in real time.
处理器601可以是中央处理单元(CPU)或者具有数据处理能力和/或指令执行能力的其他形式的处理单元,并且可以控制终端600中的其他组件以执行期望的功能。 Processor 601 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in terminal 600 to perform desired functions.
存储器602可以包括一个或多个计算机程序产品,计算机程序产品可以包括各种形式的计算机可读存储介质,例如易失性存储器和/或非易失性存储器。易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。非易失性存储器例如可以包括只读存储器(ROM)、硬盘、闪存等。在计算机可读存储介质上可以存储一个或多个计算机程序指令,处理器601可以运行程序指令,以实现本公开实施例提供的应用于客户端的交互方法,还可以实现其他期望的功能。在计算机可读存储介质中还可以存储诸如输入信号、信号分量、噪声分量等各种内容。 Memory 602 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others. Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 601 may execute the program instructions to implement the interaction method applied to the client provided by the embodiments of the present disclosure, and may also implement other desired functions. Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
其中,应用于客户端的交互方法可以包括:采集用户的第一图像帧数据并展示;识别第一图像帧数据中的至少一个人体部位,并确定人体部位的位置信息;基于至少一个人体部位的位置信息,以及与人体部位对应的动作图标的预设位置信息,确定动作图标的展示位置,并在展示位置展示动作图标;采集用户的第二图像帧数据并展示;其中,第二图像帧数据是第一图像帧数据之后预设时间点的图像帧数据;确定第二图像帧数据中与动作图标关联的目标人体部位以及目标人体部位的状态信息;根据第二图像帧数据中目标人体部位的状态信息与动作图标的匹配度,确定评估结果。The interaction method applied to the client may include: collecting and displaying first image frame data of the user; identifying at least one human body part in the first image frame data, and determining position information of the body part; determining a display position of an action icon based on the position information of the at least one body part and preset position information of the action icon corresponding to the body part, and displaying the action icon at the display position; collecting and displaying second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data; determining a target body part associated with the action icon in the second image frame data and state information of the target body part; and determining an evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
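The last steps of the client-side method (matching the target body part against the action icon's effective response area and grading the result) can be sketched as follows. The circular response area, the linear fall-off of the matching degree, and the grade thresholds are illustrative assumptions; the disclosure leaves the exact matching rule open:

```python
import math

def evaluate_action(part_pos, icon_center, response_radius):
    """Return (match_degree, grade) for one action icon.

    match_degree is 1.0 when the target body part lies exactly on the
    icon's center and falls off linearly to 0.0 at the edge of the
    (assumed circular) effective response area.
    """
    d = math.dist(part_pos, icon_center)
    match = max(0.0, 1.0 - d / response_radius)
    # Hypothetical grade thresholds for the evaluation result.
    if match >= 0.8:
        grade = "PERFECT"
    elif match >= 0.5:
        grade = "GOOD"
    elif match > 0.0:
        grade = "OK"
    else:
        grade = "MISS"
    return match, grade
```

The resulting grade could then drive the evaluation result animation displayed at the icon's position.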
应当理解,终端600还可以执行本公开方法实施例提供的其他可选实施方案。It should be understood that the terminal 600 may also perform other optional implementations provided by the method embodiments of the present disclosure.
在一个示例中,终端600还可以包括:输入装置603和输出装置604,这些组件通过总线系统和/或其他形式的连接机构(未示出)互连。In one example, the terminal 600 may also include an input device 603 and an output device 604, these components being interconnected by a bus system and/or other form of connection mechanism (not shown).
此外,该输入装置603还可以包括例如键盘、鼠标等等。In addition, the input device 603 may also include, for example, a keyboard, a mouse, and the like.
该输出装置604可以向外部输出各种信息,包括确定出的距离信息、方向信息等。该输出装置604可以包括例如显示器、扬声器、打印机、以及通信网络及其所连接的远程输出设备等等。The output device 604 can output various information to the outside, including the determined distance information, direction information, and the like. The output device 604 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
当然,为了简化,图10中仅示出了该终端600中与本公开有关的组件中的一些,省略了诸如总线、输入/输出接口等等的组件。除此之外,根据具体应用情况,终端600还可以包括任何其他适当的组件。Of course, for simplicity, only some of the components in the terminal 600 related to the present disclosure are shown in FIG. 10 , and components such as a bus, an input/output interface, and the like are omitted. Besides, the terminal 600 may also include any other appropriate components according to the specific application.
图11为本公开实施例提供的一种服务器的结构示意图,用于对实现本公开实施例提供的交互方法的服务器进行示例性说明。图11示出的服务器仅仅是一个示例,不应对本公开实施例的功能和占用范围带来任何限制。FIG. 11 is a schematic structural diagram of a server provided by an embodiment of the present disclosure, which exemplarily illustrates a server implementing the interaction method provided by the embodiments of the present disclosure. The server shown in FIG. 11 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
如图11所示,服务器700包括一个或多个处理器701和存储器702。As shown in FIG. 11 , server 700 includes one or more processors 701 and memory 702 .
处理器701可以是中央处理单元(CPU)或者具有数据处理能力和/或指令执行能力的其他形式的处理单元,并且可以控制服务器700中的其他组件以执行期望的功能。 Processor 701 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in server 700 to perform desired functions.
存储器702可以包括一个或多个计算机程序产品,计算机程序产品可以包括各种形式的计算机可读存储介质,例如易失性存储器和/或非易失性存储器。易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。非易失性存储器例如可以包括只读存储器(ROM)、硬盘、闪存等。在计算机可读存储介质上可以存储一个或多个计算机程序指令,处理器701可以运行程序指令,以实现本公开实施例提供的应用于服务器的交互方法,还可以实现其他期望的功能。在计算机可读存储介质中还可以存储诸如输入信号、信号分量、噪声分量等各种内容。 Memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others. Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 701 may execute the program instructions to implement the interaction method applied to the server provided by the embodiments of the present disclosure, and may also implement other desired functions. Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
其中,应用于服务器的交互方法可以包括:获取多个候选视频,并提取多个候选视频中各图像帧的人体部位位置数据;基于预设的规则对多个候选视频中同一图像帧的人体部位位置数据进行融合,得到标准位置数据集;查找多个候选视频中至少一个图像帧中的目标人体部位在标准位置数据集中的位置数据;利用查找的位置数据确定与目标人体部位对应的动作图标的预设位置信息,以参与确定动作图标在客户端展示的图像帧数据中的展示位置。The interaction method applied to the server may include: acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos; fusing the body part position data of the same image frame in the multiple candidate videos based on preset rules to obtain a standard position data set; searching the standard position data set for position data of a target body part in at least one image frame of the multiple candidate videos; and using the found position data to determine preset position information of an action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
应当理解,服务器700还可以执行本公开方法实施例提供的其他可选实施方案。It should be understood that the server 700 may also execute other optional implementations provided by the method embodiments of the present disclosure.
在一个示例中,服务器700还可以包括:输入装置703和输出装置704,这些组件通过总线系统和/或其他形式的连接机构(未示出)互连。In one example, the server 700 may also include an input device 703 and an output device 704 interconnected by a bus system and/or other form of connection mechanism (not shown).
此外,该输入装置703还可以包括例如键盘、鼠标等等。In addition, the input device 703 may also include, for example, a keyboard, a mouse, and the like.
该输出装置704可以向外部输出各种信息,包括确定出的距离信息、方向信息等。该输出装置704可以包括例如显示器、扬声器、打印机、以及通信网络及其所连接的远程输出设备等等。The output device 704 can output various information to the outside, including the determined distance information, direction information, and the like. The output devices 704 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
当然,为了简化,图11中仅示出了该服务器700中与本公开有关的组件中的一些,省略了诸如总线、输入/输出接口等等的组件。除此之外,根据具体应用情况,服务器700还可以包括任何其他适当的组件。Of course, for simplicity, only some of the components in the server 700 related to the present disclosure are shown in FIG. 11 , and components such as buses, input/output interfaces, and the like are omitted. Besides, the server 700 may also include any other appropriate components according to the specific application.
除了上述方法和设备以外,本公开的实施例还可以是计算机程序产品,其包括计算机程序指令,计算机程序指令在被处理器运行时使得处理器执行本公开实施例所提供的应用于客户端或应用于服务器的任意交互方法。In addition to the above methods and devices, an embodiment of the present disclosure may also be a computer program product, which includes computer program instructions that, when executed by a processor, cause the processor to execute any interaction method applied to the client or applied to the server provided by the embodiments of the present disclosure.
计算机程序产品可以以一种或多种程序设计语言的任意组合来编写用于执行本公开实施例操作的程序代码,程序设计语言包括面向对象的程序设计语言,诸如Java、C++等,还包括常规的过程式程序设计语言,诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户终端或服务器上执行、部分地在用户终端或服务器上执行、作为一个独立的软件包执行、部分在用户终端或服务器上且部分在远程终端或服务器上执行、或者完全在远程终端或服务器上执行。The computer program product may write program code for performing operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages, such as Java, C++, etc., as well as conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may execute entirely on a user terminal or server, partly on a user terminal or server, as a stand-alone software package, partly on a user terminal or server and partly on a remote terminal or server, or entirely on a remote terminal or server.
此外,本公开的实施例还可以是计算机可读存储介质,其上存储有计算机程序指令,计算机程序指令在被处理器运行时使得处理器执行本公开实施例所提供的应用于客户端或应用于服务器的任意交互方法。In addition, an embodiment of the present disclosure may also be a computer-readable storage medium on which computer program instructions are stored; the computer program instructions, when executed by a processor, cause the processor to execute any interaction method applied to the client or applied to the server provided by the embodiments of the present disclosure.
一方面,应用于客户端的交互方法可以包括:采集用户的第一图像帧数据并展示;识别第一图像帧数据中的至少一个人体部位,并确定人体部位的位置信息;基于至少一个人体部位的位置信息,以及与人体部位对应的动作图标的预设位置信息,确定动作图标的展示位置,并在展示位置展示动作图标;采集用户的第二图像帧数据并展示;其中,第二图像帧数据是第一图像帧数据之后预设时间点的图像帧数据;确定第二图像帧数据中与动作图标关联的目标人体部位以及目标人体部位的状态信息;根据第二图像帧数据中目标人体部位的状态信息与动作图标的匹配度,确定评估结果。In one aspect, the interaction method applied to the client may include: collecting and displaying first image frame data of the user; identifying at least one human body part in the first image frame data, and determining position information of the body part; determining a display position of an action icon based on the position information of the at least one body part and preset position information of the action icon corresponding to the body part, and displaying the action icon at the display position; collecting and displaying second image frame data of the user, where the second image frame data is image frame data at a preset time point after the first image frame data; determining a target body part associated with the action icon in the second image frame data and state information of the target body part; and determining an evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
另一方面,应用于服务器的交互方法可以包括:获取多个候选视频,并提取多个候选视频中各图像帧的人体部位位置数据;基于预设的规则对多个候选视频中同一图像帧的人体部位位置数据进行融合,得到标准位置数据集;查找多个候选视频中至少一个图像帧中的目标人体部位在标准位置数据集中的位置数据;利用查找的位置数据确定与目标人体部位对应的动作图标的预设位置信息,以参与确定动作图标在客户端展示的图像帧数据中的展示位置。In another aspect, the interaction method applied to the server may include: acquiring multiple candidate videos, and extracting body part position data of each image frame in the multiple candidate videos; fusing the body part position data of the same image frame in the multiple candidate videos based on preset rules to obtain a standard position data set; searching the standard position data set for position data of a target body part in at least one image frame of the multiple candidate videos; and using the found position data to determine preset position information of an action icon corresponding to the target body part, so as to participate in determining the display position of the action icon in the image frame data displayed by the client.
应当理解,计算机程序指令在被处理器运行时,还可以使得处理器执行本公开方法实施例提供的其他可选实施方案。It should be understood that, when the computer program instructions are executed by the processor, the processor may also cause the processor to execute other optional implementations provided by the method embodiments of the present disclosure.
计算机可读存储介质可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以包括但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。A computer-readable storage medium can employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses or devices, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
需要说明的是,在本文中,诸如“第一”和“第二”等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括要素的过程、方法、物品或者设备中还存在另外的相同要素。It should be noted that, in this document, relational terms such as "first" and "second" are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprising", "including" or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
以上仅是本公开的具体实施方式,使本领域技术人员能够理解或实现本公开。对这些实施例的多种修改对本领域的技术人员来说将是显而易见的,本文中所定义的一般原理可以在不脱离本公开的精神或范围的情况下,在其它实施例中实现。因此,本公开将不会被限制于本文的这些实施例,而是要符合与本文所公开的原理和新颖特点相一致的最宽的范围。The above are only specific embodiments of the present disclosure, so that those skilled in the art can understand or implement the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not to be limited to the embodiments herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (21)

  1. 一种交互方法,其特征在于,应用于客户端,包括:An interaction method, characterized in that, applied to a client, comprising:
    采集用户的第一图像帧数据并展示;Collect the user's first image frame data and display it;
    识别所述第一图像帧数据中的至少一个人体部位,并确定所述人体部位的位置信息;Identifying at least one human body part in the first image frame data, and determining the position information of the human body part;
    基于所述至少一个人体部位的位置信息,以及与所述人体部位对应的动作图标的预设位置信息,确定所述动作图标的展示位置,并在所述展示位置展示所述动作图标;determining the display position of the action icon based on the position information of the at least one body part and the preset position information of the action icon corresponding to the body part, and displaying the action icon at the display position;
    采集所述用户的第二图像帧数据并展示;其中,所述第二图像帧数据是所述第一图像帧数据之后预设时间点的图像帧数据;collecting and displaying second image frame data of the user; wherein, the second image frame data is image frame data at a preset time point after the first image frame data;
    确定所述第二图像帧数据中与所述动作图标关联的目标人体部位以及所述目标人体部位的状态信息;determining the target body part associated with the action icon in the second image frame data and the state information of the target body part;
    根据所述第二图像帧数据中所述目标人体部位的状态信息与所述动作图标的匹配度,确定评估结果。The evaluation result is determined according to the degree of matching between the state information of the target human body part and the action icon in the second image frame data.
  2. 根据权利要求1所述的方法,其特征在于,所述目标人体部位的状态信息包括所述目标人体部位的位置信息和/或所述目标人体部位形成的动作信息。The method according to claim 1, wherein the state information of the target body part includes position information of the target body part and/or motion information formed by the target body part.
  3. 根据权利要求2所述的方法,其特征在于,所述目标人体部位的状态信息包括所述目标人体部位的位置信息;The method according to claim 2, wherein the state information of the target body part comprises position information of the target body part;
所述根据所述第二图像帧数据中所述目标人体部位的状态信息与所述动作图标的匹配度,确定评估结果,包括:The determining the evaluation result according to the matching degree between the state information of the target human body part in the second image frame data and the action icon includes:
    确定所述动作图标在所述第二图像帧数据中的有效响应区域;determining the effective response area of the action icon in the second image frame data;
    确定所述目标人体部位的位置信息和所述动作图标的有效响应区域的位置匹配度,并根据所述位置匹配度确定所述评估结果。The position matching degree between the position information of the target body part and the effective response area of the action icon is determined, and the evaluation result is determined according to the position matching degree.
  4. 根据权利要求2所述的方法,其特征在于,所述目标人体部位的状态信息包括所述目标人体部位形成的动作信息;The method according to claim 2, wherein the state information of the target body part comprises action information formed by the target body part;
    所述根据所述第二图像帧数据中所述目标人体部位的状态信息与所述动作图标的匹配度,确定评估结果,包括:The determining the evaluation result according to the matching degree between the state information of the target human body part and the action icon in the second image frame data includes:
    确定所述动作图标对应的标准动作信息;determining the standard action information corresponding to the action icon;
    确定所述第二图像帧数据中所述目标人体部位形成的动作信息和所述标准动作信息的动作匹配度,并根据所述动作匹配度确定所述评估结果。Determine the motion matching degree between the motion information formed by the target human body part and the standard motion information in the second image frame data, and determine the evaluation result according to the motion matching degree.
  5. 根据权利要求1所述的方法,其特征在于,所述动作图标的预设位置信息是基于与所述动作图标对应的人体部位在标准数据集中的位置数据得到;The method according to claim 1, wherein the preset position information of the action icon is obtained based on the position data of the body part corresponding to the action icon in a standard data set;
    所述标准数据集是基于预设的规则对多个候选视频中同一图像帧的人体部位位置数据进行融合得到。The standard data set is obtained by fusing human body position data of the same image frame in multiple candidate videos based on preset rules.
6. 根据权利要求5所述的方法,其特征在于,所述多个候选视频是基于预设视频筛选信息得到,所述预设视频筛选信息包括视频交互信息和/或视频发布者信息,所述视频交互信息包括视频的点赞量和/或评论量。The method according to claim 5, wherein the plurality of candidate videos are obtained based on preset video screening information, the preset video screening information includes video interaction information and/or video publisher information, and the video interaction information includes the number of likes and/or comments of the video.
  7. 根据权利要求1所述的方法,其特征在于,在展示所述用户的第二图像帧数据的过程中,还包括:The method according to claim 1, wherein in the process of displaying the second image frame data of the user, the method further comprises:
    在所述第二图像帧数据上展示引导信息,以引导所述用户改变所述目标人体部位的状态信息。Guiding information is displayed on the second image frame data to guide the user to change the state information of the target body part.
  8. 根据权利要求7所述的方法,其特征在于,所述引导信息包括引导视频动画、引导图片和引导指令中的至少一种。The method according to claim 7, wherein the guidance information comprises at least one of a guidance video animation, a guidance picture and a guidance instruction.
  9. 根据权利要求1所述的方法,其特征在于,所述在所述展示位置展示所述动作图标,包括:The method according to claim 1, wherein the displaying the action icon at the display position comprises:
    基于背景音乐的播放时间信息或者基于所述第一图像帧数据的采集时间信息,确定所述动作图标的展示样式;Determine the display style of the action icon based on the playback time information of the background music or based on the collection time information of the first image frame data;
    在所述展示位置采用所述展示样式展示所述动作图标;Display the action icon in the display position using the display style;
    所述确定所述第二图像帧数据中与所述动作图标关联的目标人体部位,包括:The determining of the target body part associated with the action icon in the second image frame data includes:
    基于所述背景音乐的播放时间信息或者基于所述第二图像帧数据的采集时间信息,确定所述第二图像帧数据中与所述动作图标关联的目标人体部位。Based on the playing time information of the background music or based on the collection time information of the second image frame data, the target human body part associated with the action icon in the second image frame data is determined.
  10. 根据权利要求1所述的方法,其特征在于,在所述根据所述第二图像帧数据中所述目标人体部位的状态信息与所述动作图标的匹配度,确定评估结果之后,还包括:The method according to claim 1, wherein after determining the evaluation result according to the matching degree between the state information of the target human body part in the second image frame data and the action icon, the method further comprises:
    根据所述评估结果确定评估结果动画;determining an evaluation result animation according to the evaluation result;
    利用所述动作图标的展示位置,确定所述评估结果动画在所述第二图像帧数据中的动画展示位置,并在所述动画展示位置展示所述评估结果动画。Using the display position of the action icon, the animation display position of the evaluation result animation in the second image frame data is determined, and the evaluation result animation is displayed at the animation display position.
  11. 根据权利要求1所述的方法,其特征在于,所述动作图标包括表情图标,在展示所述第一图像帧数据或者展示所述第二图像帧数据的过程中,还包括:The method according to claim 1, wherein the action icon comprises an emoticon icon, and in the process of displaying the first image frame data or displaying the second image frame data, further comprising:
    识别所述第一图像帧数据或者所述第二图像帧数据中的用户表情,并确定与所述用户表情匹配的表情图标;Identifying the user's expression in the first image frame data or the second image frame data, and determining an expression icon matching the user's expression;
    基于所述第一图像帧数据或者所述第二图像帧数据上形成所述用户表情的五官的位置信息,确定所述表情图标的展示位置,并将所述表情图标展示在确定的展示位置。Based on the position information of the facial features forming the user's expression on the first image frame data or the second image frame data, the display position of the expression icon is determined, and the expression icon is displayed at the determined display position.
  12. 根据权利要求1所述的方法,其特征在于,在所述根据所述第二图像帧数据中所述目标人体部位的状态信息与所述动作图标的匹配度,确定评估结果之后,还包括:The method according to claim 1, wherein after determining the evaluation result according to the matching degree between the state information of the target human body part in the second image frame data and the action icon, the method further comprises:
    基于采集的所述第一图像帧数据和所述第二图像帧数据,生成第一分享视频;generating a first shared video based on the collected first image frame data and the second image frame data;
根据所述用户的视频分享操作,向服务器发送第一视频分享请求;其中,所述第一视频分享请求中携带所述第一分享视频和分享对象的用户标识,所述分享对象的用户标识用于所述服务器确定所述分享对象分享的第二分享视频;Sending a first video sharing request to the server according to the video sharing operation of the user; wherein the first video sharing request carries the first shared video and the user identifier of the sharing object, and the user identifier of the sharing object is used by the server to determine the second shared video shared by the sharing object;
    接收所述服务器返回的合成视频;其中,所述合成视频由所述服务器将所述第一分享视频和所述第二分享视频合成同屏展示后得到。Receive a composite video returned by the server; wherein, the composite video is obtained by the server synthesizing the first shared video and the second shared video for display on the same screen.
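The server-side same-screen composition in claim 12 can be sketched as pairing up frames from the two shared videos and joining each pair side by side. This is an assumed, minimal model — frames are represented here as row-major 2-D lists of pixel values rather than real decoded video frames, and `compose_same_screen` is an illustrative name.

```python
def compose_same_screen(video_a, video_b):
    """Pair frames from two shared videos and join each pair side by side.

    Frames are row-major 2-D lists of pixels (assumed equal height);
    the shorter video bounds the length of the composite.
    """
    composite = []
    for frame_a, frame_b in zip(video_a, video_b):
        # Same-screen display: each output row is the row from A followed by the row from B.
        composite.append([ra + rb for ra, rb in zip(frame_a, frame_b)])
    return composite

# Two tiny one-frame "videos", each frame 2x2 pixels.
v1 = [[[1, 2], [3, 4]]]
v2 = [[[5, 6], [7, 8]]]
print(compose_same_screen(v1, v2))
```

A real implementation would resize both streams to a common height before concatenation (e.g. with OpenCV's `hconcat`) and re-encode the result.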
  13. The method according to claim 1, further comprising, before displaying the first image frame data:
    switching from a current mode to an image synchronization sharing mode according to an image synchronization operation of the user;
    correspondingly, the process of displaying the first image frame data and the second image frame data further comprises:
    receiving first shared image frame data in real time, and displaying the first shared image frame data and the first image frame data on the same screen;
    receiving second shared image frame data in real time, and displaying the second shared image frame data and the second image frame data on the same screen;
    wherein the first shared image frame data and the second shared image frame data are shared in real time by a sharing object, and the sharing object is predetermined by the user.
  14. The method according to claim 2, wherein the action information formed by the body part comprises dance-game action information.
  15. An interaction method applied to a server, comprising:
    obtaining a plurality of candidate videos, and extracting body part position data of each image frame in the plurality of candidate videos;
    fusing the body part position data of the same image frame across the plurality of candidate videos according to a preset rule to obtain a standard position data set;
    looking up, in the standard position data set, the position data of a target body part in at least one image frame of the plurality of candidate videos;
    determining preset position information of an action icon corresponding to the target body part using the retrieved position data, the preset position information being used in determining the display position of the action icon in the image frame data displayed by a client.
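The "preset rule" for fusion in claim 15 is left open; one plausible sketch averages each body part's coordinates across the candidate videos, frame by frame, to build the standard position data set, then looks up the fused position to preset the action icon. The function names and the averaging rule are assumptions for illustration.

```python
def fuse_standard_positions(candidate_videos):
    """Fuse per-frame body-part positions across candidate videos by averaging.

    candidate_videos: list of videos; each video is a list of frames; each frame
    maps a body-part name to an (x, y) position. Returns one fused frame list
    (the standard position data set), bounded by the shortest video.
    """
    n_frames = min(len(v) for v in candidate_videos)
    standard = []
    for i in range(n_frames):
        fused = {}
        for part in candidate_videos[0][i]:
            xs = [v[i][part][0] for v in candidate_videos]
            ys = [v[i][part][1] for v in candidate_videos]
            fused[part] = (sum(xs) / len(xs), sum(ys) / len(ys))
        standard.append(fused)
    return standard

def icon_preset_position(standard, frame_idx, target_part):
    """Look up the fused position of the target body part to preset an action icon."""
    return standard[frame_idx][target_part]

videos = [
    [{"left_hand": (100, 200)}],  # candidate video 1
    [{"left_hand": (110, 210)}],  # candidate video 2
]
std = fuse_standard_positions(videos)
print(icon_preset_position(std, 0, "left_hand"))
```

Other fusion rules (median, outlier-rejected mean) would fit the same interface.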
  16. An interaction apparatus configured on a client, comprising:
    a first acquisition module, configured to acquire and display first image frame data of a user;
    a first determination module, configured to identify at least one body part in the first image frame data and determine position information of the body part;
    a display position determination module, configured to determine a display position of an action icon based on the position information of the at least one body part and preset position information of the action icon corresponding to the body part, and to display the action icon at the display position;
    a second acquisition module, configured to acquire and display second image frame data of the user, wherein the second image frame data is image frame data at a preset time point after the first image frame data;
    a second determination module, configured to determine a target body part associated with the action icon in the second image frame data and state information of the target body part;
    an evaluation module, configured to determine an evaluation result according to the degree of matching between the state information of the target body part in the second image frame data and the action icon.
  17. An interaction apparatus configured on a server, comprising:
    a position data extraction module, configured to obtain a plurality of candidate videos and extract body part position data of each image frame in the plurality of candidate videos;
    a standard position data set determination module, configured to fuse the body part position data of the same image frame across the plurality of candidate videos according to a preset rule to obtain a standard position data set;
    a position data lookup module, configured to look up, in the standard position data set, the position data of a target body part in at least one image frame of the plurality of candidate videos;
    a preset position information determination module, configured to determine preset position information of an action icon corresponding to the target body part using the retrieved position data, the preset position information being used in determining the display position of the action icon in the image frame data displayed by a client.
  18. A terminal, comprising a memory, a processor and a camera, wherein:
    the camera is configured to capture image frame data of a user in real time;
    the memory stores a computer program which, when executed by the processor, causes the processor to perform the interaction method of any one of claims 1-14.
  19. A server, comprising a memory and a processor, wherein:
    the memory stores a computer program which, when executed by the processor, causes the processor to perform the interaction method of claim 15.
  20. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the interaction method of any one of claims 1-14, or the interaction method of claim 15.
  21. A computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the interaction method of any one of claims 1-14, or the interaction method of claim 15.
PCT/CN2021/127010 2020-12-02 2021-10-28 Interaction method and apparatus, and terminal, server and storage medium WO2022116751A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011399864.7 2020-12-02
CN202011399864.7A CN112560605B (en) 2020-12-02 2020-12-02 Interaction method, device, terminal, server and storage medium

Publications (1)

Publication Number Publication Date
WO2022116751A1

Family

ID=75048069

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127010 WO2022116751A1 (en) 2020-12-02 2021-10-28 Interaction method and apparatus, and terminal, server and storage medium

Country Status (2)

Country Link
CN (1) CN112560605B (en)
WO (1) WO2022116751A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560605B (en) * 2020-12-02 2023-04-18 北京字节跳动网络技术有限公司 Interaction method, device, terminal, server and storage medium
CN113727147A (en) * 2021-08-27 2021-11-30 上海哔哩哔哩科技有限公司 Gift presenting method and device for live broadcast room
CN113723307B (en) * 2021-08-31 2024-09-06 上海掌门科技有限公司 Social sharing method, equipment and computer readable medium based on push-up detection
CN113946210B (en) * 2021-09-16 2024-01-23 武汉灏存科技有限公司 Action interaction display system and method
CN113742630B (en) * 2021-09-16 2023-12-15 阿里巴巴新加坡控股有限公司 Image processing method, electronic device, and computer storage medium
CN113923361B (en) * 2021-10-19 2024-07-09 北京字节跳动网络技术有限公司 Data processing method, apparatus, device, and computer readable storage medium
CN116320583A (en) * 2023-03-20 2023-06-23 抖音视界有限公司 Video call method, device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107920269A (en) * 2017-11-23 2018-04-17 乐蜜有限公司 Video generation method, device and electronic equipment
CN108833818A (en) * 2018-06-28 2018-11-16 腾讯科技(深圳)有限公司 video recording method, device, terminal and storage medium
CN109068081A (en) * 2018-08-10 2018-12-21 北京微播视界科技有限公司 Video generation method, device, electronic equipment and storage medium
CN109600559A (en) * 2018-11-29 2019-04-09 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium
CN109618183A (en) * 2018-11-29 2019-04-12 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium
CN110888532A (en) * 2019-11-25 2020-03-17 深圳传音控股股份有限公司 Man-machine interaction method and device, mobile terminal and computer readable storage medium
CN112560605A (en) * 2020-12-02 2021-03-26 北京字节跳动网络技术有限公司 Interaction method, device, terminal, server and storage medium

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101564594B (en) * 2008-04-25 2012-11-07 财团法人工业技术研究院 Interactive type limb action recovery method and system
CN102622591B (en) * 2012-01-12 2013-09-25 北京理工大学 3D (three-dimensional) human posture capturing and simulating system
CN102622509A (en) * 2012-01-21 2012-08-01 天津大学 Three-dimensional game interaction system based on monocular video
US9082312B2 (en) * 2012-05-09 2015-07-14 Antennasys, Inc. Physical activity instructional apparatus
WO2014041032A1 (en) * 2012-09-11 2014-03-20 L.I.F.E. Corporation S.A. Wearable communication platform
CN104461012B (en) * 2014-12-25 2017-07-11 中国科学院合肥物质科学研究院 A kind of dance training assessment system based on digital field and wireless motion capture equipment
CN104866108B (en) * 2015-06-05 2018-03-23 中国科学院自动化研究所 Multifunctional dance experiencing system
CN105635669B (en) * 2015-12-25 2019-03-01 北京迪生数字娱乐科技股份有限公司 The movement comparison system and method for data and real scene shooting video are captured based on three-dimensional motion
CN108326878A (en) * 2017-01-18 2018-07-27 王怀亮 A kind of limb action electronic switching equipment, instruction identification method and recording/playback method
CN107240049B (en) * 2017-05-10 2020-04-03 中国科学技术大学先进技术研究院 Automatic evaluation method and system for remote action teaching quality in immersive environment
CN107349594B (en) * 2017-08-31 2019-03-19 华中师范大学 A kind of action evaluation method of virtual Dance System
CN109389054A (en) * 2018-09-21 2019-02-26 北京邮电大学 Intelligent mirror design method based on automated graphics identification and action model comparison
CN109589563B (en) * 2018-12-29 2021-06-22 南京华捷艾米软件科技有限公司 Dance posture teaching and assisting method and system based on 3D motion sensing camera
CN110141850B (en) * 2019-01-30 2023-10-20 腾讯科技(深圳)有限公司 Action control method, device, electronic equipment and storage medium


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150421A (en) * 2023-04-23 2023-05-23 深圳竹云科技股份有限公司 Image display method, device, computer equipment and storage medium
CN117455466A (en) * 2023-12-22 2024-01-26 南京三百云信息科技有限公司 Method and system for remote evaluation of automobile
CN117455466B (en) * 2023-12-22 2024-03-08 南京三百云信息科技有限公司 Method and system for remote evaluation of automobile

Also Published As

Publication number Publication date
CN112560605B (en) 2023-04-18
CN112560605A (en) 2021-03-26

Similar Documents

Publication Publication Date Title
WO2022116751A1 (en) Interaction method and apparatus, and terminal, server and storage medium
US11158102B2 (en) Method and apparatus for processing information
CN109462776B (en) Video special effect adding method and device, terminal equipment and storage medium
CN109313812B (en) Shared experience with contextual enhancements
WO2022121601A1 (en) Live streaming interaction method and apparatus, and device and medium
WO2022083383A1 (en) Image processing method and apparatus, electronic device and computer-readable storage medium
WO2020029523A1 (en) Video generation method and apparatus, electronic device, and storage medium
CN106575361B (en) Method for providing visual sound image and electronic equipment for implementing the method
CN109729372B (en) Live broadcast room switching method, device, terminal, server and storage medium
CN107533360A (en) A kind of method for showing, handling and relevant apparatus
WO2022007565A1 (en) Image processing method and apparatus for augmented reality, electronic device and storage medium
WO2023030010A1 (en) Interaction method, and electronic device and storage medium
US20150347461A1 (en) Display apparatus and method of providing information thereof
WO2021023047A1 (en) Facial image processing method and device, terminal, and storage medium
US20190174069A1 (en) System and Method for Autonomously Recording a Visual Media
CN109600559B (en) Video special effect adding method and device, terminal equipment and storage medium
CN112596694B (en) Method and device for processing house source information
CN111541951B (en) Video-based interactive processing method and device, terminal and readable storage medium
CN113923462A (en) Video generation method, live broadcast processing method, video generation device, live broadcast processing device and readable medium
WO2022206335A1 (en) Image display method and apparatus, device, and medium
WO2020155915A1 (en) Method and apparatus for playing back audio
WO2024007833A1 (en) Video playing method and apparatus, and device and storage medium
US11886484B2 (en) Music playing method and apparatus based on user interaction, and device and storage medium
US20230209125A1 (en) Method for displaying information and computer device
WO2021228200A1 (en) Method for realizing interaction in three-dimensional space scene, apparatus and device

Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21899785; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.09.2023))
122 Ep: pct application non-entry in european phase (Ref document number: 21899785; Country of ref document: EP; Kind code of ref document: A1)