WO2019128558A1 - Analysis method and system of user limb movement and mobile terminal - Google Patents

Analysis method and system of user limb movement and mobile terminal

Info

Publication number
WO2019128558A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
user
information
limb
module
Prior art date
Application number
PCT/CN2018/116700
Other languages
French (fr)
Chinese (zh)
Inventor
张文波
刘裕峰
刘锦龙
Original Assignee
北京达佳互联信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2019128558A1 publication Critical patent/WO2019128558A1/en

Classifications

    • G - PHYSICS
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06F - ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00 - Pattern recognition
            • G06F 18/20 - Analysing
              • G06F 18/22 - Matching criteria, e.g. proximity measures
              • G06F 18/24 - Classification techniques
                • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
        • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
            • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
              • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
                • G06V 40/174 - Facial expression recognition
            • G06V 40/20 - Movements or behaviour, e.g. gesture recognition

Definitions

  • Embodiments of the present application relate to the field of image processing, and in particular to a method, a system, and a mobile terminal for analyzing a user's limb motion.
  • Image understanding is the study of how a computer system interprets images, aiming to achieve an understanding of the outside world similar to that of the human visual system. The questions it addresses are: what information is needed from the image to accomplish a given task, and how that information is used to obtain the necessary interpretation.
  • the study of image understanding involves and includes methods, devices, and specific application implementations for obtaining images.
  • image understanding technology is used to recognize text information in a picture, and to recognize and convert the text in the bitmap form into editable text.
  • The inventor of the present application found in research that image understanding in the related art is limited to converting a fixed bitmap pattern into an editable form, and cannot perform deeper analysis and application according to the understanding result after the image information is understood.
  • Embodiments of the present application provide a method, a system, and a mobile terminal for analyzing a user's limb motion, which analyze the body language of a user in an image, match information that can be directly perceived by a human according to the body language, and display and apply that information.
  • A technical solution adopted by an embodiment of the present application is to provide a method for analyzing a user's limb motion, including the following steps:
  • the parsing method of the user's limb motion further includes the following steps:
  • An emoticon image having the same action meaning as the human facial motion information is matched.
  • the step of acquiring a limb image of the user includes:
  • the step of identifying the body language of the limb image representation includes:
  • the step of matching the visual information or the audio information having the same meaning as the body language includes:
  • An emoticon image having the same action meaning as the human facial motion information is matched.
  • Before the step of acquiring a face image of the user, the method further includes the following steps:
  • the emoticon image is placed in the display container according to a preset script to visually display the emoticon image.
  • the step of matching an emoticon image having the same action meaning as the human facial motion includes the following steps:
  • It is confirmed that the display container has an emoticon image having the same action meaning as the human facial action.
  • the method further includes the following steps:
  • the method further includes the following steps:
  • the bonus scores are summed to form a final score for the user within the first time threshold.
  • the parsing method of the user's limb motion further includes the following steps:
  • An emoticon image having the same emotional meaning as the facial image is matched, and a bonus score of the facial image is confirmed according to the matching degree.
  • the parsing method of the user's limb motion further includes the following steps:
  • the step of acquiring a face image of the user includes:
  • the step of identifying the human facial motion information represented by the facial image includes:
  • the step of matching an emoticon image having the same action meaning as the human facial action information includes:
  • An emoticon image having the same emotional meaning as the facial image is matched, and a bonus score of the facial image is confirmed according to the matching degree.
  • the step of collecting the face image of the user at timed intervals or in real time within the unit time and identifying the emotion information represented by the face image includes the following steps:
  • the step of identifying the emotion information represented by the face image and the matching degree between the face image and the emotion information includes:
  • the embodiment of the present application further provides an analysis system for a user's limb movement, including:
  • An acquisition module configured to acquire a limb image of the user
  • a processing module for identifying a body language of the limb image representation
  • An execution module configured to match visual information or audio information having the same meaning as the body language.
  • the analyzing system of the user's limb motion further includes:
  • a first obtaining submodule configured to acquire a face image of the user
  • a first processing submodule configured to identify human facial motion information represented by the facial image
  • the first execution sub-module is configured to match an emoticon image having the same action meaning as the human facial motion information.
  • the acquiring module includes: a first acquiring submodule, configured to acquire a face image of the user;
  • the processing module includes: a first processing submodule, configured to identify human facial motion information represented by the facial image;
  • the execution module includes: a first execution sub-module, configured to match an emoticon image having the same action meaning as the human facial action information.
  • the analyzing system of the user's limb motion further includes:
  • a first calling submodule configured to call at least one of the pre-stored emoticons
  • the first display sub-module is configured to place the emoticon image in a display container according to a preset script, so that the emoticon image is visually displayed.
  • the analyzing system of the user's limb motion further includes:
  • a first comparison sub-module configured to compare the human facial motion information with an expression image in a range of the display container
  • a first confirmation sub-module configured to confirm, when the action meaning represented by an emoticon image in the display container is the same as the human facial motion information, that the display container has an emoticon image having the same action meaning as the human facial motion.
  • the analyzing system of the user's limb motion further includes:
  • a second obtaining submodule configured to acquire matching degree information of the human facial action information and the emoticon image
  • the second execution sub-module is configured to calculate a bonus score corresponding to the matching degree information according to a preset matching rule.
  • the analyzing system of the user's limb motion further includes:
  • a first recording sub-module configured to record all the bonus points in the preset first time threshold
  • a third execution sub-module configured to accumulate the bonus scores to form a final score of the user within the first time threshold.
  • the analyzing system of the user's limb motion further includes:
  • a third obtaining sub-module configured to randomly extract a preset number of emoticons representing human emotions from the emoticon package in a preset unit time, and place the emoticon images in a display container;
  • a second processing submodule configured to collect a face image of the user at timed intervals or in real time within the unit time, and identify the emotion information represented by the face image and the matching degree between the face image and the emotion information;
  • a fourth execution sub-module configured to match an emoticon image having the same emotional meaning as the facial image, and confirm a bonus score of the facial image according to the matching degree.
  • the analyzing system of the user's limb motion further includes:
  • a third obtaining sub-module configured to randomly extract a preset number of emoticons representing human emotions from the emoticon package in a preset unit time, and place the emoticon images in a display container;
  • a first acquiring submodule configured to collect a face image of the user in a timed or real time in the unit time
  • a first processing sub-module configured to identify emotion information represented by the face image, and a matching degree between the face image and the emotion information
  • a first execution submodule configured to match an emoticon image having the same emotional meaning as the facial image, and confirm a bonus score of the facial image according to the matching degree.
  • the analyzing system of the user's limb motion further includes:
  • a first collection sub-module configured to collect a face image of the user
  • a third processing sub-module configured to input the facial image into a preset emotion recognition model, and obtain a classification result and classification data of the facial image;
  • a fifth execution sub-module configured to determine emotion information of the face image according to the classification result, and determine a matching degree between the face image and the emotion information according to the classification data.
  • the first processing submodule is configured to:
  • the embodiment of the present application further provides a mobile terminal, including:
  • one or more processors; a memory;
  • one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the above method for analyzing the user's limb motion.
  • An embodiment of the present application provides a computer readable storage medium, where the computer readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the foregoing method for analyzing the user's limb motion are implemented.
  • the embodiment of the present application provides a computer program product, when it is run on a computer, causing the computer to perform the step of analyzing the user's limb motion according to any of the above embodiments.
  • The beneficial effects of the embodiments of the present application are as follows: the body language represented by the user's limb image in a picture is identified, and visual or audio information having the same meaning as the body language is matched.
  • In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, thereby realizing a deep interpretation of human body movements and facilitating communication with users who are speech-impaired or unable to speak.
  • FIG. 1 is a schematic diagram of a basic flow of a method for analyzing a user's limb motion according to an embodiment of the present application
  • FIG. 2 is a schematic diagram showing a first embodiment of a method for analyzing a user's limb motion according to an embodiment of the present application
  • FIG. 3 is a schematic view showing a second embodiment of a method for analyzing a user's limb motion according to an embodiment of the present application
  • FIG. 4 is a schematic flowchart of analyzing an application of a user's facial expression according to an embodiment of the present application
  • FIG. 5 is a schematic flowchart diagram of an embodiment of displaying an emoticon image according to an embodiment of the present application
  • FIG. 6 is a schematic flowchart of confirming that an emoticon image in a display container is the same as a human facial action information according to an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of rewarding according to matching results according to an embodiment of the present application;
  • FIG. 8 is a schematic flowchart of totaling the user's score according to an embodiment of the present application;
  • FIG. 9 is a schematic flowchart of analyzing emotion information of a face image according to an embodiment of the present application;
  • FIG. 10 is a schematic flowchart diagram of emotion information classification and matching degree detection of a face image according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram showing a third embodiment of a method for analyzing a limb motion of a user according to an embodiment of the present application.
  • FIG. 12 is a basic structural block diagram of an analysis system for a user's limb movement according to an embodiment of the present application
  • FIG. 13 is a schematic diagram of a basic structure of a mobile terminal according to an embodiment of the present application.
  • FIG. 1 is a schematic flow chart of a method for analyzing a user's limb motion according to an embodiment of the present invention.
  • a method for analyzing a user's limb motion includes the following steps:
  • the limb images in this embodiment may include, but are not limited to, a face image, a gesture motion image, and/or a lip motion image.
  • the face image can also be referred to as a face image.
  • the terminal acquires a target image that includes a limb image of the user stored in the local storage space by accessing the local storage space designation area.
  • the user's limb image is directly acquired in real time by turning on the photographing device disposed on the terminal or connected to the terminal.
  • The terminal may also acquire a target image including the user's limb image stored in an external device having a storage function, by accessing a specific area of that connected external device.
  • the limb image may refer to an image corresponding to the region where the user's limb is located in the target image. In another case, the limb image may refer to a complete target image.
  • Body language refers to the specific meaning of the action representation of the limb image.
  • The body language includes, but is not limited to: the emotion information represented by the user's face image, the language information represented by the gesture actions in the gesture motion image, or the language information represented by the lip motion image.
  • Body language refers to the specific meaning represented by the actions in the limb image, including but not limited to: emotion information represented by the user's face image, language information represented by the gesture actions in the gesture motion image, and/or language information represented by the lip movements in the lip motion image.
  • The technical solution adopted for identifying the body language represented by the limb image may be recognition by a deep learning method. Specifically, a large number of pictures including human limb images are collected as training samples, subjective judgments of the body language expressed by the various limb images are obtained, the subjective meaning of the limb motion in each training sample is acquired, and that subjective meaning is set as the expected output of the training sample. Then, the training sample is input into the convolutional neural network model, the feature data of the training sample is extracted, and the classification data of the training sample is output, the classification data being the probability values of the training sample belonging to each classification result in the current round of training.
  • The classification results in this embodiment are the names of the different kinds of body language.
  • The classification result with the largest probability value that is greater than a preset measurement threshold is the excitation output of the training sample in the current round of training. Whether the expected output is consistent with the excitation output is compared: when they are consistent, the training ends; when they are inconsistent, the weights of the convolutional neural network are corrected by the back propagation algorithm to adjust the output result. After the weights are adjusted, the training samples are input again, and this cycle repeats until the expected output is consistent with the excitation output.
  • the classification result can be set according to the demand.
  • The number of classification results can be set according to the complexity of the output; the more classification results there are, the higher the complexity of the training.
  • The training samples can be repeatedly input into the convolutional neural network until, for each training sample, the probability that the excitation output of the network is the same as the expected output corresponding to that training sample exceeds a preset value, after which the training ends.
  • The expected output being consistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is not less than a preset number.
  • The expected output being inconsistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is less than the preset number.
  • The expected output being consistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is not less than a preset ratio.
  • The expected output being inconsistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is less than the preset ratio.
  • In this way, the body language of a limb image that was not involved in training can be quickly and accurately determined.
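  • As a non-authoritative illustration of the training procedure described above, the following sketch assumes a PyTorch setup; the network architecture, the body-language label names, and the accuracy-based stopping criterion are illustrative assumptions rather than the patent's exact implementation.

```python
# Hypothetical sketch of the training loop described above (PyTorch assumed).
# The architecture, label set, and thresholds are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models
from torch.utils.data import DataLoader

BODY_LANGUAGE_LABELS = ["hello", "thanks", "goodbye", "yes", "no"]  # assumed label names

model = models.resnet18(num_classes=len(BODY_LANGUAGE_LABELS))  # any CNN classifier would do
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train(loader: DataLoader, target_accuracy: float = 0.95, max_epochs: int = 100):
    """Train until the excitation output (argmax of the classification layer)
    agrees with the expected output for a preset ratio of the training samples."""
    for _ in range(max_epochs):
        correct, total = 0, 0
        for images, labels in loader:            # labels: subjective meaning of each sample
            logits = model(images)               # classification data: one score per label
            loss = criterion(logits, labels)
            optimizer.zero_grad()
            loss.backward()                      # back propagation corrects the weights
            optimizer.step()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.numel()
        if correct / total >= target_accuracy:   # expected output consistent with excitation output
            break
    return model
```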
  • S1300 Matching visual information or audio information having the same meaning as the body language.
  • Visual information refers to information that can be observed by human eyes, including but not limited to: text information, picture information, and/or video information.
  • the body language represented by the user's limb image is obtained by the convolutional neural network model, that is, the text information of the user's limb image representation is obtained.
  • the text information is used as a search key, and the visual information or the audio information having the same meaning as the text information is retrieved in the local database.
  • the visualization information or the audio information stored in the local database is set according to the meaning of its expression, so as to facilitate the body language to perform corresponding matching by retrieving the tags.
  • The body language represented by the user's limb image is obtained by the convolutional neural network model, that is, description information characterizing the limb motion in the limb image is obtained; the description information is a type of textual information and can be called a written language.
  • Then, the visual information or the audio information having the same meaning as the description information is retrieved.
  • For the visual information or audio information stored in the first preset database, one or more tags may be configured according to the meaning expressed by each item, so as to facilitate the subsequent retrieval process, that is, retrieval and matching through tags at the time of retrieval.
  • the first preset database may be stored locally in the terminal, or may be stored in an external device with a storage function connected to the terminal.
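  • The tag-based retrieval described above can be sketched as follows; the database contents, item paths, and helper names are assumptions made for illustration only.

```python
# Minimal sketch of matching by tags against the first preset database.
# The schema and the stored items are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class MediaItem:
    path: str                                       # visual (picture/video) or audio information
    tags: List[str] = field(default_factory=list)   # meanings this item expresses

PRESET_DATABASE = [
    MediaItem("media/hello.png", tags=["hello", "greeting"]),
    MediaItem("audio/hello.wav", tags=["hello"]),
    MediaItem("media/thanks.gif", tags=["thanks"]),
]

def match_by_body_language(body_language: str) -> List[MediaItem]:
    """Use the recognized body language (a text key) to retrieve items whose tags carry the same meaning."""
    key = body_language.lower()
    return [item for item in PRESET_DATABASE if key in item.tags]

# Example: the model recognized the body language "hello".
print([item.path for item in match_by_body_language("hello")])
```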
  • FIG. 2 is a schematic diagram showing the first embodiment of the method for analyzing the limb motion of the user.
  • In one application, the method for analyzing the user's limb motion is used to parse the user's limb motion and convert the motion into a text language: the user's limb image is acquired in real time, and the image is converted into a text language for output. For example, the sign language used by people who cannot speak, or sign language used in special operations, is identified and the body language is converted into a written language.
  • the "hello" expressed by the user's body language is converted into the text language "hello".
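  • A hedged end-to-end sketch of this embodiment (a real-time limb image converted into output text) might look as follows, assuming OpenCV for capture and a trained classifier such as the one sketched earlier; the preprocessing and function names are illustrative, not the patent's implementation.

```python
# Illustrative only: capture one frame, classify the body language, return it as text.
import cv2
import torch

def limb_motion_to_text(model, labels, camera_index: int = 0) -> str:
    cap = cv2.VideoCapture(camera_index)          # photographing device on or connected to the terminal
    ok, frame = cap.read()                        # acquire the user's limb image in real time
    cap.release()
    if not ok:
        raise RuntimeError("no frame captured")
    frame = cv2.resize(frame, (224, 224))
    tensor = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        probs = model(tensor).softmax(dim=1)
    return labels[probs.argmax(dim=1).item()]     # e.g. the body language "hello" -> the text "hello"
```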
  • FIG. 3 is a schematic diagram showing a second embodiment of a method for analyzing a user's limb motion in the present embodiment.
  • the emotional information represented by the facial expression action of the user is identified.
  • the expression having the same emotional meaning as the emotion information is retrieved for output, but is not limited thereto.
  • text, pictures, animations, or speech that have the same emotional meaning as the emotional information can be output.
  • For example, an emoticon image with a meaning of joy is sent to the other party.
  • the body language represented by the limb image of the user in the picture is recognized, and the visual information or the audio information having the same meaning as the body language is matched.
  • In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, thereby realizing a deep interpretation of human body movements and facilitating communication with users who are speech-impaired or unable to speak.
  • FIG. 4 is a schematic flowchart of analyzing an application of a user's facial expression according to an embodiment of the present invention.
  • the method for analyzing the user's limb motion further includes the following steps:
  • S2100 acquiring a face image of the user
  • the terminal acquires a target image including a face image of the user stored in the local storage space by accessing the local storage space designation area.
  • the user's face image is directly acquired in real time by turning on the photographing device disposed on the terminal or connected to the terminal.
  • The terminal may also acquire a target image including the user's face image stored in an external device having a storage function, by accessing a specific area of that connected external device.
  • The human facial motion information includes emotion information represented by the facial motion of the human body, which may also be referred to as facial expression information, such as joy, anger, sorrow, and happiness; it may also be an action made by the user without emotional meaning, such as sticking out the tongue or wrinkling the forehead.
  • The technical solution for recognizing the human facial motion information represented by the face image may be recognition by a deep learning method. Specifically, a large number of pictures including human face images are collected as training samples, subjective judgments of the human facial motion information expressed by the various face images are obtained, the subjective meaning of the facial motion in each training sample is acquired, and that subjective meaning is set as the expected output of the training sample. Then, the training sample is input into the convolutional neural network model, the feature data of the training sample is extracted, and the classification data of the training sample is output, the classification data being the probability values of the training sample belonging to each classification result in the current round of training.
  • The classification results in this embodiment are the names of the different kinds of human facial motion information.
  • The classification result with the largest probability value that is greater than a preset measurement threshold is the excitation output of the training sample in the current round of training. Whether the expected output is consistent with the excitation output is compared: when they are consistent, the training ends; when they are inconsistent, the weights of the convolutional neural network are corrected by the back propagation algorithm to adjust the output result. After the weights are adjusted, the training samples are input again, and this cycle repeats until the expected output is consistent with the excitation output.
  • the classification result can be set according to the demand.
  • The number of classification results can be set according to the complexity of the output; the more classification results there are, the higher the complexity of the training.
  • The training samples can be repeatedly input into the convolutional neural network until, for each training sample, the probability that the excitation output of the network is the same as the expected output corresponding to that training sample exceeds a preset value, after which the training ends.
  • The expected output being consistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is not less than a preset number.
  • The expected output being inconsistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is less than the preset number.
  • The expected output being consistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is not less than a preset ratio.
  • The expected output being inconsistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is less than the preset ratio.
  • S2300 Match an emoticon image having the same action meaning as the human facial motion information.
  • An emoticon image refers to a static or animated emoticon picture designed to simulate a user's expression, stored in the terminal or in an external device with a storage function connected to the terminal.
  • the human face motion information represented by the user's face image is obtained by the convolutional neural network model, that is, the text information of the user's face image representation is obtained.
  • The text information is used as a search key, and an emoticon image having the same meaning as the text information is retrieved in the local database.
  • the emoticon images stored in the local database are set with one or more tags according to the meaning of their expressions, so that the facial motion information of the human body can be matched by the retrieval tag.
  • the facial motion information represented by the facial image is obtained by the convolutional neural network model, that is, the description information for characterizing the facial expression in the facial image is obtained.
  • the description information is a type of text information, which may be referred to as a word language.
  • Then, an emoticon image having the same meaning as the description information is retrieved.
  • For the emoticon images stored in the second preset database, one or more tags may be configured according to the meaning expressed by each emoticon image, so as to facilitate the subsequent retrieval process, that is, searching and matching through tags at the time of retrieval.
  • the second preset database may be stored locally in the terminal, or may be stored in an external device with a storage function connected to the terminal.
  • In this way, the specific meaning represented by the expression information is obtained, and an emoticon image with the same meaning as the expression information is then matched, which is convenient for user input and also allows the analysis result to be combined with the user's expression for deeper interaction processing.
  • FIG. 5 is a schematic flow chart of an embodiment of displaying an emoticon image according to an embodiment of the present invention.
  • Before step S2100, the following steps are further included:
  • An emoticon pack including a plurality of emoticon images, or a plurality of scattered emoticon images, is stored in a specified storage area or folder of the terminal storage space.
  • Each emoticon image characterizes a human facial motion.
  • A designated area of the terminal, or a connected external device having a storage function, may store an emoticon pack including a plurality of emoticon images.
  • Each emoticon image characterizes a human facial motion.
  • one or more emoticons may be called for display according to a preset script.
  • S2012 The emoticon image is placed in a display container according to a preset script, so that the emoticon image is visually displayed.
  • The script is a preset program for controlling the display behavior of the emoticon picture. It sets a time control for how long the emoticon picture stays in the display area, a motion control for the motion track of the picture in the display area, and a rendering control applied to the picture when it is successfully matched in the display area.
  • By traversing the above controls, the display of the emoticon picture in the display container can be completed.
  • After layout and rendering are performed in the display container using the parameters set by the preset script, the emoticon picture placed in the display container is displayed in the display area of the terminal for the user to view.
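  • For illustration only, the preset script controls described above (display time, motion track, and the rendering applied when a picture is successfully matched) could be represented along the following lines; the field names and values are assumptions, not the patent's actual script format.

```python
# Hypothetical representation of the preset script and of placing pictures in the display container.
DISPLAY_SCRIPT = {
    "time_control":   {"display_seconds": 5},                         # how long a picture stays shown
    "motion_control": {"track": "top_to_bottom", "speed_px_s": 120},  # motion track in the display area
    "match_control":  {"on_match": "highlight"},                      # rendering on a successful match
}

def place_in_container(container: list, emoticons: list, script: dict) -> None:
    """Traverse the script controls and attach them to each emoticon picture
    placed in the display container, so the picture can be visually displayed."""
    for emoticon in emoticons:                    # e.g. {"path": "laugh.png", "meaning": "laugh"}
        entry = dict(emoticon)
        entry.update(script)                      # apply time, motion, and match rendering controls
        container.append(entry)
```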
  • In one application, the emoticon pictures are used in an expression-imitation application: with the terminal camera turned on, the user's face image is collected in real time and displayed on the terminal screen, and the user imitates the emoticon picture shown within the range of the display screen.
  • When the imitation is matched, the script applies the corresponding rendering to the emoticon picture.
  • For example, the emoticon pictures can be used in an expression-imitation application.
  • The terminal can use the opened camera to collect the user's face image in real time, and then display the face image in the display area of the terminal, such as the display screen of the terminal.
  • The display screen of the terminal can also display emoticon pictures.
  • The user can imitate the expressions in the emoticon pictures displayed within the range of the display screen of the terminal, making an expression action that is the same as the expression in one of the displayed emoticon pictures. The terminal then uses the opened camera to collect the expression action made by the user and classifies and recognizes the collected image, that is, the image of the user performing the imitation.
  • If the facial expression made by the user is the same as the expression in one or more emoticon pictures within the range of the display screen of the terminal, the successfully matched emoticon picture is scored, and the corresponding rendering is applied to that emoticon picture according to the preset script.
  • FIG. 6 is a schematic flowchart of confirming that the emoticon image in the range of the display container is the same as the facial motion information of the human body in the embodiment.
  • step S2300 specifically includes the following steps:
  • The user's face image is collected in real time and then displayed on the terminal screen; the user imitates the action of the pictures shown within the screen range; the image imitated by the user is classified and recognized; and the classification result is then compared with the action information represented by the emoticon pictures within the range of the display container.
  • In a specific implementation, the terminal places the emoticon image in the display container according to the preset script so that the emoticon image is visually displayed, that is, the terminal displays the emoticon image on its display screen, and can collect the user's face image in real time through the camera, wherein the face image may include an expression action performed by the user; the performed expression action may be an action the user makes to imitate an expression represented by an emoticon image displayed on the display screen of the terminal, or an action the user makes at will.
  • The terminal may classify and recognize the facial expression included in the user's face image, that is, recognize the human facial motion information in the face image, and further compare the human facial motion information with the emoticon images within the range of the display container, that is, within the range of the display screen of the terminal, to determine the comparison result.
  • the facial expression of the user is an expression represented by the facial motion information of the human body.
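  • The comparison between the recognized human facial motion information and the emoticon pictures within the display container can be sketched as follows; the container entry format reuses the illustrative script sketch above and is an assumption.

```python
# Illustrative comparison step: find an emoticon picture in the display container
# whose action meaning is the same as the recognized facial motion information.
from typing import Optional

def find_matching_emoticon(facial_motion: str, container: list) -> Optional[dict]:
    for entry in container:
        if entry.get("meaning") == facial_motion:   # e.g. both are "laugh"
            return entry                            # a picture with the same action meaning exists
    return None                                     # no matching picture in the container
```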
  • FIG. 7 is a schematic flowchart of performing a reward by matching results in the embodiment.
  • After step S2300, the following steps are further included:
  • the analysis of the human face motion information operation is performed by the classification result of the convolutional neural network model, and the classification result output by the convolutional neural network model classification layer is the probability value of the facial image belonging to each classification result.
  • the probability value may be in the range of 0-1.
  • the classification result corresponding to the face image may include a plurality of values generally between 0-1.
  • For example, the classification results are set as the four emotion results of joy, anger, sorrow, and happiness. After the face image is input, [0.75, 0.2, 0.4, 0.3] is obtained; since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "joy".
  • The matching degree information between the human facial motion information and the emoticon image expressing the emotion "joy" is 0.75, that is, the similarity between the expression represented by the face image and the emoticon image expressing "joy" is 75%.
  • the classification result is set as: four expressions of laugh, cry, frown and no expression.
  • After the face image is input, [0.79, 0.1, 0.3, 0.1] is obtained; since 0.79 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "laugh".
  • The matching degree information between the human facial motion information and the emoticon image with the meaning "laugh" is 0.79, that is, the similarity between the expression represented by the face image and the emoticon image meaning "laugh" is 79%.
  • the matching rule is a preset method of calculating a bonus score based on the matching degree information.
  • The matching degree information is divided into "perfect", "very good", "good", and "missed", where "perfect" corresponds to a matching degree in the interval of 0.9-1.0, "very good" to the interval of 0.7-0.9, "good" to the interval of 0.5-0.7, and "missed" to a matching degree of 0.5 or less.
  • the bonus score corresponding to the matching degree information is calculated according to a preset matching rule.
  • the matching quality of the matching result can be further refined, and a more accurate bonus score can be obtained.
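  • A minimal sketch of such a preset matching rule follows; the interval boundaries are those given above, while the concrete point values and the final summation are assumptions.

```python
# Map the matching degree to a bonus score using the intervals described above.
# The point values are assumptions for illustration.
def bonus_score(matching_degree: float) -> int:
    if matching_degree >= 0.9:      # "perfect": 0.9 - 1.0
        return 100
    if matching_degree >= 0.7:      # "very good": 0.7 - 0.9
        return 60
    if matching_degree >= 0.5:      # "good": 0.5 - 0.7
        return 30
    return 0                        # "missed": 0.5 or less

# e.g. a matching degree of 0.75 ("very good") yields 60 points here; all bonus scores
# recorded within the first time threshold are later summed into the user's final score.
print(bonus_score(0.75), sum(bonus_score(m) for m in [0.95, 0.75, 0.4]))
```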
  • the matching results are continuously recorded for a predetermined period of time, and the scores of the users within the duration are counted after the time is over.
  • FIG. 8 is a schematic flowchart of totaling the user's score according to the embodiment.
  • After step S2412, the following steps are further included:
  • the first time threshold is the length of time of the predetermined match game, for example, the length of time for setting a match game is 3 minutes.
  • the setting of the specific time length is not limited thereto, and in some alternative embodiments, the time length of the first time threshold can be shorter or longer.
  • the total score of the user's bonus score within the first time threshold is counted as the total score of the user participating in the match within the first time threshold.
  • FIG. 9 is a schematic flowchart of analyzing emotion information of a face image according to an embodiment of the present invention.
  • the method for analyzing the user's limb motion further includes the following steps:
  • S3100 randomly extract a preset number of expression images representing human emotions from the expression pack in a preset unit time, and place the expression image in the display container;
  • the unit time is the time to load a wave of emoticons into the display container.
  • For example, the time for loading one wave of emoticon pictures is 5 seconds, that is, a wave of emoticon pictures stays in the display container for 5 seconds, and after 5 seconds it is replaced by a new wave of emoticon pictures.
  • The number of emoticon pictures loaded per unit time can be preset, and the setting rule can be fixed, for example, 5 pictures are added in each wave within the unit time.
  • In some embodiments, the setting is random, where the better the network state, the larger the preset number; in other embodiments, the number of added emoticon pictures can increase progressively, with the increment set according to the actual situation, such as one, two, or more at a time.
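  • Loading one wave of emoticon pictures per unit time, as described above, can be sketched as follows; the emoticon pack contents and function names are assumptions, and the 5-second period follows the example above.

```python
# Illustrative wave loading: randomly extract a preset number of emotion pictures
# from the emoticon pack each unit time and place them in the display container.
import random
import time

EMOTICON_PACK = [
    {"path": "joy.png", "meaning": "joy"},
    {"path": "anger.png", "meaning": "anger"},
    {"path": "sorrow.png", "meaning": "sorrow"},
    {"path": "laugh.png", "meaning": "laugh"},
    {"path": "frown.png", "meaning": "frown"},
]

def run_waves(container: list, per_wave: int = 5, unit_seconds: int = 5, waves: int = 3) -> None:
    for _ in range(waves):
        container.clear()                                             # the previous wave is replaced
        wave = random.sample(EMOTICON_PACK, k=min(per_wave, len(EMOTICON_PACK)))
        container.extend(wave)                                        # place the pictures in the container
        time.sleep(unit_seconds)                                      # the wave stays for the unit time
```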
  • S3200 Collect a face image of the user at timed intervals or in real time within the unit time, and identify the emotion information represented by the face image and the matching degree between the face image and the emotion information;
  • The user's face image is collected in real time by a camera disposed on or connected to the terminal, but the collection is not limited thereto.
  • The face image can also be collected at timed intervals (for example, every 0.1 s).
  • the emotional information of the face image is parsed and confirmed by the classification result of the convolutional neural network model.
  • the classification result output by the convolutional neural network model classification layer is the probability value of the facial image belonging to each classification result, wherein the probability value
  • the value range may be 0-1.
  • the classification result corresponding to the face image may include multiple values between 0-1.
  • For example, the classification results are set as the four emotion results of joy, anger, sorrow, and happiness. After the face image is input, [0.75, 0.2, 0.4, 0.3] is obtained; since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "joy".
  • The emoticon picture in the display container having the same emotion "joy" is then determined, and the matching degree information between the emotion information and that emoticon picture is 0.75, that is, the similarity between the expression action represented by the face image and the emoticon picture is 75%.
  • S3300 Match an emoticon image having the same emotional meaning as the facial image, and confirm a bonus score of the facial image according to the matching degree.
  • the matching rule is a preset method of calculating a bonus score based on the matching degree information.
  • The matching degree information is divided into "perfect", "very good", "good", and "missed", where "perfect" corresponds to a matching degree in the interval of 0.9-1.0, "very good" to the interval of 0.7-0.9, "good" to the interval of 0.5-0.7, and "missed" to a matching degree of 0.5 or less.
  • the bonus score corresponding to the matching degree information is calculated according to a preset matching rule.
  • FIG. 10 is a schematic flowchart of the emotion information classification and the matching degree detection of the face image according to the embodiment.
  • step 3200 specifically includes the following steps:
  • The user's face image is collected in real time by turning on a photographing device disposed on or connected to the terminal, but the collection is not limited thereto.
  • The face image can also be collected at timed intervals (for example, every 0.1 s).
  • S3220 Input the face image into a preset emotion recognition model, and obtain a classification result and classification data of the face image;
  • the emotion recognition model is specifically a convolutional neural network model trained to a convergent state.
  • The technical solution for identifying the emotion information represented by the face image may be recognition by a deep learning method. Specifically, a large number of pictures including human face images are collected as training samples, subjective judgments of the emotion information expressed by the various face images are obtained, the subjective meaning of the facial expression in each training sample is acquired, and that subjective meaning is set as the expected output of the training sample. Then, the training sample is input into the convolutional neural network model, the feature data of the training sample is extracted, and the classification data of the training sample is output, the classification data being the probability values of the training sample belonging to each classification result in the current round of training.
  • The classification results in this embodiment are the names of the different kinds of emotion information.
  • The classification result with the largest probability value that is greater than a preset measurement threshold is the excitation output of the training sample in the current round of training. Whether the expected output is consistent with the excitation output is compared: when they are consistent, the training ends; when they are inconsistent, the weights of the convolutional neural network are corrected by the back propagation algorithm to adjust the output result. After the weights are adjusted, the training samples are input again, and this cycle repeats until the expected output is consistent with the excitation output.
  • the classification result can be set according to the demand.
  • The number of classification results can be set according to the complexity of the output; the more classification results there are, the higher the complexity of the training.
  • The training samples can be repeatedly input into the convolutional neural network until, for each training sample, the probability that the excitation output of the network is the same as the expected output corresponding to that training sample exceeds a preset value, after which the training ends.
  • The expected output being consistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is not less than a preset number.
  • The expected output being inconsistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is less than the preset number.
  • The expected output being consistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is not less than a preset ratio.
  • The expected output being inconsistent with the excitation output may also mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is less than the preset ratio.
  • The classification result output by the classification layer of the convolutional neural network model is the probability value of the face image belonging to each classification result, where the probability value may be in the range of 0-1.
  • the classification result corresponding to the face image may include multiple values between 0-1.
  • For example, the classification results are set as the four emotion results of joy, anger, sorrow, and happiness. After the face image is input, [0.75, 0.2, 0.4, 0.3] is obtained; since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "joy".
  • The matching degree information between the human facial motion information and the emoticon image expressing the emotion "joy" is 0.75, that is, the similarity between the expression represented by the face image and the emoticon image expressing "joy" is 75%.
  • the classification result is set as: four expressions of laugh, cry, frown and no expression.
  • After the face image is input, [0.79, 0.1, 0.3, 0.1] is obtained; since 0.79 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "laugh".
  • The matching degree information between the human facial motion information and the emoticon image with the meaning "laugh" is 0.79, that is, the similarity between the expression represented by the face image and the emoticon image meaning "laugh" is 79%.
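  • A hedged sketch of the classification and matching-degree determination described above follows, assuming the emotion recognition model outputs one score per class; the label order and the 0.5 threshold follow the examples above, while the function name and preprocessing are illustrative.

```python
# Illustrative inference step: the face image tensor is fed to the emotion recognition
# model (a CNN trained to convergence); the top class gives the emotion information and
# its probability is used as the matching degree.
import torch

EMOTION_LABELS = ["joy", "anger", "sorrow", "happiness"]

def classify_emotion(model, face_tensor: torch.Tensor, threshold: float = 0.5):
    with torch.no_grad():
        probs = model(face_tensor).softmax(dim=1).squeeze(0)  # classification data, one value per class
    best = int(probs.argmax())
    if float(probs[best]) < threshold:
        return None, float(probs[best])                       # below the preset threshold: no emotion confirmed
    return EMOTION_LABELS[best], float(probs[best])           # e.g. ("joy", 0.75) -> matching degree 75%
```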
  • FIG. 11 is a schematic diagram showing a third embodiment of a method for analyzing a user's limb motion in the present embodiment.
  • the self-portrait image of the user can be simultaneously displayed in the display area of the terminal, and the emoticon image is displayed in the screen.
  • The user imitates the same expression action according to the displayed emoticon pictures, and the terminal detects whether the imitated expression is the same as one of the emoticon pictures in the display area.
  • If it is, the matched emoticon picture is enlarged and displayed, and the corresponding bonus score is displayed according to the matching degree.
  • the above-mentioned imitation expression is: the user imitates the expression when performing the same expression action according to the displayed emoticon picture.
  • FIG. 12 is a basic structural block diagram of an analysis system for a user's limb motion according to the embodiment.
  • an analysis system for a user's limb motion includes an acquisition module 2100, a processing module 2200, and an execution module 2300.
  • the obtaining module 2100 is configured to acquire a limb image of the user;
  • the processing module 2200 is configured to identify the body language of the limb image representation;
  • the executing module 2300 is configured to match the visual information or the audio information having the same meaning as the body language.
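  • For illustration, the acquisition, processing, and execution modules could be organized as follows; the class and method names are assumptions, not the patent's implementation.

```python
# Structural sketch of the analysis system: three modules chained into a pipeline.
class AcquisitionModule:
    def acquire(self):
        """Acquire the user's limb image (from a camera, local storage, or an external device)."""
        raise NotImplementedError

class ProcessingModule:
    def identify(self, limb_image):
        """Identify the body language represented by the limb image (e.g. with a trained CNN)."""
        raise NotImplementedError

class ExecutionModule:
    def match(self, body_language):
        """Match visual or audio information having the same meaning as the body language."""
        raise NotImplementedError

class LimbMotionAnalysisSystem:
    def __init__(self, acquisition, processing, execution):
        self.acquisition, self.processing, self.execution = acquisition, processing, execution

    def run(self):
        limb_image = self.acquisition.acquire()
        body_language = self.processing.identify(limb_image)
        return self.execution.match(body_language)
```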
  • the above embodiment recognizes the body language represented by the limb image of the user in the picture and matches the visual information or audio information having the same meaning as the body language.
  • In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, thereby realizing a deep interpretation of human body movements and facilitating communication with users who are speech-impaired or unable to speak.
  • the analysis system of the user's limb motion further includes: a first acquisition sub-module, a first processing sub-module, and a first execution sub-module.
  • the first obtaining sub-module is configured to acquire a facial image of the user;
  • the first processing sub-module is configured to identify human facial motion information represented by the facial image;
  • The first execution sub-module is configured to match an emoticon image having the same action meaning as the human facial motion information.
  • the acquiring module includes: a first acquiring sub-module, configured to acquire a face image of the user; and the processing module includes: a first processing sub-module, configured to identify the facial image representation The human face motion information; the execution module includes: a first execution sub-module, configured to match an emoticon image having the same action meaning as the human facial motion information.
  • the analysis system of the user's limb motion further includes: a first calling sub-module and a first display sub-module.
  • the first calling sub-module is configured to call the pre-stored at least one emoticon image;
  • the first display sub-module is configured to place the emoticon image in the display container according to the preset script, so that the emoticon image is visually displayed.
  • The first calling sub-module is configured to call at least one pre-stored emoticon image before the user's face image is acquired; the first display sub-module is configured to place the emoticon image in the display container according to the preset script, so that the emoticon image is visually displayed.
  • the analysis system of the user's limb motion further includes: a first comparison sub-module and a first confirmation sub-module.
  • The first comparison sub-module is configured to compare the human facial motion information with the emoticon images within the range of the display container; the first confirmation sub-module is configured to confirm, when the action meaning represented by an emoticon image in the display container is the same as the human facial motion information, that the display container contains an emoticon image having the same action meaning as the human facial motion.
  • the analysis system of the user's limb motion further includes: a second acquisition sub-module and a second execution sub-module.
  • The second obtaining sub-module is configured to obtain the matching degree information between the human facial motion information and the emoticon image; the second execution sub-module is configured to calculate the bonus score corresponding to the matching degree information according to the preset matching rule.
  • The second obtaining sub-module acquires the matching degree information between the human facial motion information and the emoticon image, and the second execution sub-module calculates a bonus score corresponding to the matching degree information according to a preset matching rule.
  • the analysis system of the user's limb motion further includes: a first recording sub-module and a third execution sub-module.
  • the first recording sub-module is configured to record all the bonus scores in the preset first time threshold; the third execution sub-module is configured to accumulate the bonus scores to form a final score of the user within the first time threshold.
  • the analysis system of the user's limb motion further includes: a first recording sub-module and a third execution sub-module.
  • The first recording sub-module is configured to, after the bonus score corresponding to the matching degree information is calculated according to the preset matching rule, record all the bonus scores within the preset first time threshold; the third execution sub-module sums the bonus scores to form the final score of the user within the first time threshold.
  • the analysis system of the user's limb motion further includes: a third acquisition sub-module, a second processing sub-module, and a fourth execution sub-module.
  • the third obtaining sub-module is configured to randomly extract a preset number of emoticons representing human emotions from the emoticon package in a preset unit time, and place the emoticon images in the display container;
  • The second processing sub-module is configured to collect the user's face image at timed intervals or in real time within the unit time, and to identify the emotion information represented by the face image and the matching degree between the face image and the emotion information;
  • The fourth execution sub-module is configured to match an emoticon image having the same emotional meaning as the face image, and to confirm the bonus score of the face image according to the matching degree.
  • The analysis system of the user's limb motion further includes: a third obtaining sub-module, configured to randomly extract a preset number of emoticon images representing human emotions from the emoticon pack within a preset unit time, and place the emoticon images in the display container; a first obtaining sub-module, configured to collect the user's face image at timed intervals or in real time within the unit time; a first processing sub-module, configured to identify the emotion information represented by the face image and the matching degree between the face image and the emotion information; and a first execution sub-module, configured to match an emoticon image having the same emotional meaning as the face image and confirm the bonus score of the face image according to the matching degree.
  • the analysis system of the user's limb motion further includes: a first collection sub-module, a third processing sub-module, and a fifth execution sub-module, wherein the first collection sub-module is configured to collect a user's face image;
  • the third processing sub-module is configured to input the facial image into the preset emotion recognition model, and obtain the classification result and the classification data of the facial image;
  • The fifth execution sub-module is configured to determine the emotion information of the face image according to the classification result, and to determine the matching degree between the face image and the emotion information according to the classification data.
  • The first processing sub-module is configured to input the face image into a preset emotion recognition model and acquire the classification result and classification data of the face image; the emotion information of the face image is determined according to the classification result, and the matching degree between the face image and the emotion information is determined according to the classification data.
  • the terminal in this embodiment refers to a mobile terminal or a PC, and the mobile terminal is taken as an example for description.
  • FIG. 13 is a schematic structural diagram of a mobile terminal according to an embodiment of the present invention.
  • all the programs for implementing the analysis method of the user's limb motion in this embodiment are stored in the memory 1520 of the mobile terminal, and the processor 1580 can call the programs in the memory 1520 to execute all the functions listed in the analysis method of the user's limb motion. Since the functions implemented by the mobile terminal are described in detail in the analysis method of the user's limb motion in this embodiment, details are not described herein again.
  • the embodiment of the present application further provides a mobile terminal.
  • As shown in FIG. 13, for the convenience of description, only the parts related to the embodiment of the present application are shown. For details that are not disclosed, refer to the method part of the embodiment of the present application.
  • the terminal may be any terminal device including a mobile terminal, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, an in-vehicle computer, and the like; here the terminal is described taking a mobile terminal as an example:
  • FIG. 13 is a block diagram showing a partial structure of a mobile terminal related to a terminal provided by an embodiment of the present application.
  • the mobile terminal includes: a radio frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (Wi-Fi) module 1570, a processor 1580, a power supply 1590, and the like.
  • the mobile terminal structure shown in FIG. 13 does not constitute a limitation of the mobile terminal, which may include more or fewer components than those illustrated, combine some components, or have a different arrangement of components.
  • the RF circuit 1510 can be used for receiving and transmitting signals during the transmission or reception of information or during a call. Specifically, downlink information from the base station is received and handed to the processor 1580 for processing; in addition, uplink data is sent to the base station.
  • RF circuit 1510 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • RF circuitry 1510 can also communicate with the network and other devices via wireless communication.
  • the above wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
  • the memory 1520 can be used to store software programs and modules, and the processor 1580 executes various functional applications and data processing of the mobile terminal by running software programs and modules stored in the memory 1520.
  • the memory 1520 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the mobile terminal (such as audio data, a phone book, etc.).
  • memory 1520 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device.
  • the input unit 1530 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal.
  • the input unit 1530 may include a touch panel 1531 and other input devices 1532.
  • the touch panel 1531, also referred to as a touch screen, can collect touch operations by the user on or near it (such as operations performed by the user on or near the touch panel 1531 using a finger, a stylus, or the like), and drive the corresponding connecting device according to a preset program.
  • the touch panel 1531 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the touch orientation of the user, detects a signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 1580, and can receive commands from the processor 1580 and execute them.
  • the touch panel 1531 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 1530 may also include other input devices 1532.
  • other input devices 1532 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 1540 can be used to display information input by the user or information provided to the user as well as various menus of the mobile terminal.
  • the display unit 1540 can include a display panel 1541.
  • the display panel 1541 can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 1531 may cover the display panel 1541. After the touch panel 1531 detects a touch operation on or near it, the touch panel 1531 transmits the operation to the processor 1580 to determine the type of the touch event, and the processor 1580 then provides a corresponding visual output on the display panel 1541 according to the type of the touch event.
  • although the touch panel 1531 and the display panel 1541 are used as two independent components to implement the input and output functions of the mobile terminal in FIG. 13, in some embodiments the touch panel 1531 and the display panel 1541 may be integrated to realize the input and output functions of the mobile terminal.
  • the mobile terminal can also include at least one type of sensor 1550, such as a light sensor, motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1541 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 1541 and/or the backlight when the mobile terminal is moved to the ear.
  • the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, the magnitude and direction of gravity; it can be used for applications that recognize the attitude of the mobile terminal (such as switching between horizontal and vertical screens, related games, and magnetometer attitude calibration) and for vibration-recognition related functions (such as a pedometer or tapping); other sensors such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors may also be configured in the mobile terminal, and details are not described herein again.
  • An audio circuit 1560, a speaker 1561, and a microphone 1562 can provide an audio interface between the user and the mobile terminal.
  • the audio circuit 1560 can transmit the electrical signal converted from the received audio data to the speaker 1561, which converts it into a sound signal for output; on the other hand, the microphone 1562 converts the collected sound signal into an electrical signal, which the audio circuit 1560 receives and converts into audio data; the audio data is then processed by the processor 1580 and transmitted via the RF circuit 1510 to, for example, another mobile terminal, or output to the memory 1520 for further processing.
  • Wi-Fi is a short-range wireless transmission technology.
  • the mobile terminal can help users to send and receive e-mail, browse web pages and access streaming media through the Wi-Fi module 1570. It provides users with wireless broadband Internet access.
  • although FIG. 13 shows the Wi-Fi module 1570, it can be understood that it does not belong to the essential configuration of the mobile terminal, and may be omitted as needed within the scope of not changing the essence of the invention.
  • the processor 1580 is the control center of the mobile terminal; it connects the various portions of the entire mobile terminal using various interfaces and lines, and performs the various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 1520 and recalling data stored in the memory 1520, thereby performing overall monitoring of the mobile terminal.
  • the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It will be appreciated that the above described modem processor may also not be integrated into the processor 1580.
  • the mobile terminal also includes a power source 1590 (such as a battery) for powering various components.
  • the power source can be logically coupled to the processor 1580 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • the mobile terminal may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the embodiment of the present application provides a computer readable storage medium, where the computer readable storage medium stores a computer program, and when the computer program is executed by the processor, the steps of the analysis method of the user's limb motion according to any one of the foregoing embodiments of the present application are implemented.
  • the embodiment of the present application recognizes the body language represented by the limb image of the user in the picture, and matches visual information or audio information having the same meaning as the body language. In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, a deep interpretation of human limb movement is realized, and communication between language-impaired persons or users without a common language is facilitated.
  • the embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to perform the steps of the analysis method of the user's limb motion according to any one of the foregoing embodiments provided by the embodiment of the present application.
  • the embodiment of the present application recognizes the body language represented by the limb image of the user in the picture, and matches visual information or audio information having the same meaning as the body language. In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, a deep interpretation of human limb movement is realized, and communication between language-impaired persons or users without a common language is facilitated.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • User Interface Of Digital Computer (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Disclosed in embodiments of the invention are an analysis method and system of user limb movement and a mobile terminal. The method comprises the following steps: acquiring a limb image of a user; identifying a limb language characterized by the limb image; and matching visualized information or audio information with the same meaning as the limb language. The limb language characterized by the limb image of the user in a picture is identified, and the visualized information or the audio information with the same meaning as the limb language is matched, thus the information expressed by limb features in the image is presented in a manner which can be directly interpreted by human, deep-level interpretation for human limb movement is realized, and communication between language-impaired persons or users without a common language is facilitated.

Description

用户肢体动作的解析方法、系统及移动终端Method, system and mobile terminal for analyzing user's limb movement
本申请要求于2017年12月28日提交中国专利局、申请号为201711464338.2发明名称为“用户肢体动作的解析方法、系统及移动终端”的中国专利申请的优先权，其全部内容通过引用结合在本申请中。The present application claims priority to Chinese Patent Application No. 201711464338.2, entitled "Analysis method, system and mobile terminal for user's limb movement", filed with the China Patent Office on December 28, 2017, the entire contents of which are incorporated herein by reference.
技术领域Technical field
本申请实施例图像处理领域,尤其是一种用户肢体动作的解析方法、系统及移动终端。The image processing field in the embodiment of the present application is particularly a method, a system, and a mobile terminal for analyzing a user's limb motion.
背景技术Background technique
图像理解是研究用计算机系统解释图像,实现类似人类视觉系统理解外部世界的一门科学。理解外部世界的一门科学,所讨论的问题是:为了完成某一任务需要从图像中获取哪些信息,以及如何利用这些信息获得必要的解释。图像理解的研究涉及和包含了研究获取图像的方法、装置和具体的应用实现。Image comprehension is the study of a computer system to interpret images to achieve a science similar to the human visual system's understanding of the outside world. To understand a science in the outside world, the questions that are discussed are: what information is needed from the image to accomplish a task, and how to use it to obtain the necessary explanations. The study of image understanding involves and includes methods, devices, and specific application implementations for obtaining images.
相关技术中,图像理解技术被用于识别图片中的文字信息,将位图形态的文字进行识别并转换为可编辑文字。本申请创造的发明人在研究中发现,相关技术中的图像理解仅限于将固定位图图样,转化为可编辑形态,无法在对图像信息进行理解后,根据理解结果进行更深层次的解析和应用。In the related art, image understanding technology is used to recognize text information in a picture, and to recognize and convert the text in the bitmap form into editable text. The inventor created by the present application found in the research that the image understanding in the related art is limited to converting the fixed bitmap pattern into an editable form, and cannot perform deeper analysis and application according to the understanding result after understanding the image information. .
发明内容Summary of the invention
本申请实施例提供一种通过解析图像中用户的肢体语言,并根据肢体语言匹配可被人类直接感知的信息,并对可被人类直接感知的信息进行展示和应用的用户肢体动作的解析方法、系统及移动终端。Embodiments of the present application provide an analysis method for a user's body motion by analyzing a body language of a user in an image, and matching information that can be directly perceived by a human body according to a body language, and displaying and applying information that can be directly perceived by a human being. System and mobile terminal.
为解决上述技术问题,本申请创造的实施例采用的一个技术方案是:提供一种用户肢体动作的解析方法,包括下述步骤:In order to solve the above technical problem, a technical solution adopted by the embodiment created by the present application is to provide an analysis method for a user's limb motion, including the following steps:
获取用户的肢体图像;Obtaining a limb image of the user;
识别所述肢体图像表征的肢体语言;Identifying the body language of the limb image representation;
匹配与所述肢体语言具有相同含义的可视化信息或音频信息。Matching visual or audio information having the same meaning as the body language.
可选地,所述用户肢体动作的解析方法还包括下述步骤:Optionally, the parsing method of the user's limb motion further includes the following steps:
获取用户的人脸图像;Obtaining a face image of the user;
识别所述人脸图像表征的人体脸部动作信息;Identifying human facial motion information represented by the facial image;
匹配与所述人体脸部动作信息具有相同动作含义的表情图片。An emoticon image having the same action meaning as the human facial motion information is matched.
可选地,所述获取用户的肢体图像的步骤,包括:Optionally, the step of acquiring a limb image of the user includes:
获取用户的人脸图像;Obtaining a face image of the user;
所述识别所述肢体图像表征的肢体语言的步骤,包括:The step of identifying the body language of the limb image representation includes:
识别所述人脸图像表征的人体脸部动作信息;Identifying human facial motion information represented by the facial image;
所述匹配与所述肢体语言具有相同含义的可视化信息或音频信息的步骤,包括:The step of matching the visual information or the audio information having the same meaning as the body language includes:
匹配与所述人体脸部动作信息具有相同动作含义的表情图片。An emoticon image having the same action meaning as the human facial motion information is matched.
可选地,所述获取用户的人脸图像的步骤之前,还包括下述步骤:Optionally, before the step of acquiring a face image of the user, the method further includes the following steps:
调用预存储的至少一个所述表情图片;Retrieving at least one of the pre-stored emoticons;
将所述表情图片按预设脚本放置在显示容器内,以使所述表情图片可视化显示。The emoticon image is placed in the display container according to a preset script to visually display the emoticon image.
可选地,所述匹配与所述人体脸部动作具有相同动作含义的表情图片的步骤,具体包括下述步骤:Optionally, the step of matching an emoticon image having the same action meaning as the human facial motion includes the following steps:
将所述人体脸部动作信息与所述显示容器范围内的表情图片进行比对;Comparing the human facial motion information with an expression image within the display container;
当所述显示容器内的表情图片所表征的动作含义与所述人体脸部动作信息相同时,确认所述显示容器内存在与所述人体脸部动作具有相同动作含义的表情图片。When the action meaning represented by the expression picture in the display container is the same as the human face action information, it is confirmed that the display container has an expression picture having the same action meaning as the human face action.
可选地,所述匹配与所述人体脸部动作具有相同动作含义的表情图片的步骤之后,还包括下述步骤:Optionally, after the step of matching the emoticon image having the same action meaning as the human facial motion, the method further includes the following steps:
获取所述人体脸部动作信息与所述表情图片的匹配度信息;Obtaining matching degree information of the human facial action information and the emoticon image;
根据预设的匹配规则计算所述匹配度信息对应的奖励分值。Calculating a bonus score corresponding to the matching degree information according to a preset matching rule.
可选地,所述根据预设的匹配规则计算所述匹配度信息对应的奖励分值的步骤之后,还包括下述步骤:Optionally, after the step of calculating the bonus score corresponding to the matching degree information according to the preset matching rule, the method further includes the following steps:
记录预设第一时间阈值内所有的奖励分值;Recording all the bonus points within the preset first time threshold;
将所述奖励分值累加形成用户在所述第一时间阈值内的最终得分。The bonus scores are summed to form a final score for the user within the first time threshold.
可选地,所述用户肢体动作的解析方法还包括下述步骤:Optionally, the parsing method of the user's limb motion further includes the following steps:
在预设的单位时间内从表情包中随机抽取预设数量的表征人类情绪的表情图片,并将所述表情图片放置在显示容器内;Extracting a preset number of expression images representing human emotions from the expression packs in a preset unit time, and placing the expression images in the display container;
在所述单位时间内定时或实时采集用户的人脸图像,并识别所述人脸图像所表征的情绪信息,及所述人脸图像与所述情绪信息的匹配度;Collecting a face image of the user in a timed or real time in the unit time, and identifying the emotion information represented by the face image, and the matching degree between the face image and the emotion information;
匹配与所述人脸图像具有相同情绪含义的表情图片,并根据所述匹配度确认所述人脸图像的奖励分值。An emoticon image having the same emotional meaning as the facial image is matched, and a bonus score of the facial image is confirmed according to the matching degree.
可选地,所述用户肢体动作的解析方法还包括下述步骤:Optionally, the parsing method of the user's limb motion further includes the following steps:
在预设的单位时间内从表情包中随机抽取预设数量的表征人类情绪的表情图片,并将所述表情图片放置在显示容器内;Extracting a preset number of expression images representing human emotions from the expression packs in a preset unit time, and placing the expression images in the display container;
所述获取用户的人脸图像的步骤,包括:The step of acquiring a face image of the user includes:
在所述单位时间内定时或实时采集用户的人脸图像;Collecting a user's face image in a timed or real time in the unit time;
所述识别所述人脸图像表征的人体脸部动作信息的步骤,包括:And the step of identifying the facial motion information of the facial image representation includes:
识别所述人脸图像所表征的情绪信息,及所述人脸图像与所述情绪信息的匹配度;Identifying the emotion information represented by the face image, and matching the face image with the emotion information;
所述匹配与所述人体脸部动作信息具有相同动作含义的表情图片的步骤,包括:The step of matching an emoticon image having the same action meaning as the human facial action information includes:
匹配与所述人脸图像具有相同情绪含义的表情图片,并根据所述匹配度确认所述人脸图像的奖励分值。An emoticon image having the same emotional meaning as the facial image is matched, and a bonus score of the facial image is confirmed according to the matching degree.
可选地,所述在所述单位时间内定时或实时采集用户的人脸图像,并识别所述人脸图像所表征的情绪信息的步骤,具体包括下述步骤:Optionally, the step of collecting the face image of the user in the unit time or in real time and identifying the emotion information represented by the face image includes the following steps:
采集用户的人脸图像;Collecting a user's face image;
将所述人脸图像输入到预设的情绪识别模型中,并获取所述人脸图像的分类结果及分类数据;Inputting the face image into a preset emotion recognition model, and acquiring classification result and classification data of the face image;
根据所述分类结果确定所述人脸图像的情绪信息,并根据所述分类数据确定所述人脸图像与所述情绪信息的匹配度。Determining the emotion information of the face image according to the classification result, and determining a matching degree of the face image and the emotion information according to the classification data.
可选地,所述识别所述人脸图像所表征的情绪信息,及所述人脸图像与所述情绪信息的匹配度的步骤,包括:Optionally, the step of identifying the emotion information represented by the face image and the matching degree between the face image and the emotion information includes:
将所述人脸图像输入到预设的情绪识别模型中,并获取所述人脸图像的分类结果及分类数据;Inputting the face image into a preset emotion recognition model, and acquiring classification result and classification data of the face image;
根据所述分类结果确定所述人脸图像的情绪信息,并根据所述分类数据确定所述人脸图像与所述情绪信息的匹配度。Determining the emotion information of the face image according to the classification result, and determining a matching degree of the face image and the emotion information according to the classification data.
为解决上述技术问题,本申请实施例还提供一种用户肢体动作的解析系统,包括:To solve the above technical problem, the embodiment of the present application further provides an analysis system for a user's limb movement, including:
获取模块,用于获取用户的肢体图像;An acquisition module, configured to acquire a limb image of the user;
处理模块,用于识别所述肢体图像表征的肢体语言;a processing module for identifying a body language of the limb image representation;
执行模块,用于匹配与所述肢体语言具有相同含义的可视化信息或音频信息。An execution module for matching visual information or audio information having the same meaning as the body language.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第一获取子模块,用于获取用户的人脸图像;a first obtaining submodule, configured to acquire a face image of the user;
第一处理子模块,用于识别所述人脸图像表征的人体脸部动作信息;a first processing submodule, configured to identify human facial motion information represented by the facial image;
第一执行子模块,用于匹配与所述人体脸部动作信息具有相同动作含义的表情图片。The first execution sub-module is configured to match an emoticon image having the same action meaning as the human facial motion information.
可选地,所述获取模块,包括:第一获取子模块,用于获取用户的人脸图像;Optionally, the acquiring module includes: a first acquiring submodule, configured to acquire a face image of the user;
所述处理模块,包括:第一处理子模块,用于识别所述人脸图像表征的 人体脸部动作信息;The processing module includes: a first processing submodule, configured to identify human facial motion information represented by the facial image;
所述执行模块,包括:第一执行子模块,用于匹配与所述人体脸部动作信息具有相同动作含义的表情图片。The execution module includes: a first execution sub-module, configured to match an emoticon image having the same action meaning as the human facial action information.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第一调用子模块,用于调用预存储的至少一个所述表情图片;a first calling submodule, configured to call at least one of the pre-stored emoticons;
第一显示子模块,用于将所述表情图片按预设脚本放置在显示容器内,以使所述表情图片可视化显示。The first display sub-module is configured to place the emoticon image in a display container according to a preset script, so that the emoticon image is visually displayed.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第一比对子模块,用于将所述人体脸部动作信息与所述显示容器范围内的表情图片进行比对;a first comparison sub-module, configured to compare the human facial motion information with an expression image in a range of the display container;
第一确认子模块,用于当所述显示容器内的表情图片所表征的动作含义与所述人体脸部动作信息相同时,确认所述显示容器内存在与所述人体脸部动作具有相同动作含义的表情图片。a first confirmation sub-module, configured to confirm that the display container has the same action as the human face motion when the action meaning represented by the expression picture in the display container is the same as the human face motion information The expression of the meaning of the image.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第二获取子模块,用于获取所述人体脸部动作信息与所述表情图片的匹配度信息;a second obtaining submodule, configured to acquire matching degree information of the human facial action information and the emoticon image;
第二执行子模块,用于根据预设的匹配规则计算所述匹配度信息对应的奖励分值。The second execution sub-module is configured to calculate a bonus score corresponding to the matching degree information according to a preset matching rule.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第一记录子模块,用于记录预设第一时间阈值内所有的奖励分值;a first recording sub-module, configured to record all the bonus points in the preset first time threshold;
第三执行子模块,用于将所述奖励分值累加形成用户在所述第一时间阈值内的最终得分。And a third execution sub-module, configured to accumulate the bonus scores to form a final score of the user within the first time threshold.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第三获取子模块,用于在预设的单位时间内从表情包中随机抽取预设数量的表征人类情绪的表情图片,并将所述表情图片放置在显示容器内;a third obtaining sub-module, configured to randomly extract a preset number of emoticons representing human emotions from the emoticon package in a preset unit time, and place the emoticon images in a display container;
第二处理子模块,用于在所述单位时间内定时或实时采集用户的人脸图像,并识别所述人脸图像所表征的情绪信息,及所述人脸图像与所述情绪信息的匹配度;a second processing submodule, configured to collect a face image of the user in a timed or real time in the unit time, and identify the emotion information represented by the face image, and match the face image with the emotion information degree;
第四执行子模块,用于匹配与所述人脸图像具有相同情绪含义的表情图片,并根据所述匹配度确认所述人脸图像的奖励分值。And a fourth execution sub-module, configured to match an emoticon image having the same emotional meaning as the facial image, and confirm a bonus score of the facial image according to the matching degree.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第三获取子模块,用于在预设的单位时间内从表情包中随机抽取预设数量的表征人类情绪的表情图片,并将所述表情图片放置在显示容器内;a third obtaining sub-module, configured to randomly extract a preset number of emoticons representing human emotions from the emoticon package in a preset unit time, and place the emoticon images in a display container;
第一获取子模块,用于在所述单位时间内定时或实时采集用户的人脸图像;a first acquiring submodule, configured to collect a face image of the user in a timed or real time in the unit time;
第一处理子模块,用于识别所述人脸图像所表征的情绪信息,及所述人脸图像与所述情绪信息的匹配度;a first processing sub-module, configured to identify emotion information represented by the face image, and a matching degree between the face image and the emotion information;
第一执行子模块,用于匹配与所述人脸图像具有相同情绪含义的表情图片,并根据所述匹配度确认所述人脸图像的奖励分值。And a first execution submodule, configured to match an emoticon image having the same emotional meaning as the facial image, and confirm a bonus score of the facial image according to the matching degree.
可选地,所述用户肢体动作的解析系统还包括:Optionally, the analyzing system of the user's limb motion further includes:
第一采集子模块,用于采集用户的人脸图像;a first collection sub-module, configured to collect a face image of the user;
第三处理子模块,用于将所述人脸图像输入到预设的情绪识别模型中,并获取所述人脸图像的分类结果及分类数据;a third processing sub-module, configured to input the facial image into a preset emotion recognition model, and obtain a classification result and classification data of the facial image;
第五执行子模块,用于根据所述分类结果确定所述人脸图像的情绪信息,并根据所述分类数据确定所述人脸图像与所述情绪信息的匹配度。a fifth execution sub-module, configured to determine emotion information of the face image according to the classification result, and determine a matching degree between the face image and the emotion information according to the classification data.
可选地,所述第一处理子模块,用于Optionally, the first processing submodule is configured to:
将所述人脸图像输入到预设的情绪识别模型中,并获取所述人脸图像的分类结果及分类数据;Inputting the face image into a preset emotion recognition model, and acquiring classification result and classification data of the face image;
根据所述分类结果确定所述人脸图像的情绪信息,并根据所述分类数据确定所述人脸图像与所述情绪信息的匹配度。Determining the emotion information of the face image according to the classification result, and determining a matching degree of the face image and the emotion information according to the classification data.
为解决上述技术问题,本申请实施例还提供一种移动终端,包括:To solve the above technical problem, the embodiment of the present application further provides a mobile terminal, including:
一个或多个处理器;One or more processors;
存储器;Memory
一个或多个应用程序,其中所述一个或多个应用程序被存储在所述存储器中并被配置为由所述一个或多个处理器执行,所述一个或多个程序配置用于执行上述所述的用户肢体动作的解析方法。One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the The method for analyzing the user's limb movement.
另一方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质内存储有计算机程序,所述计算机程序被处理器执行时实现本申请实施例所提供的上述任一所述的用户肢体动作的解析方法步骤。On the other hand, the embodiment of the present application provides a computer readable storage medium, where the computer readable storage medium stores a computer program, and when the computer program is executed by the processor, the foregoing An analysis method step of the user's limb movement.
另一方面,本申请实施例提供一种计算机程序产品,当其在计算机上运行时,使得计算机执行上述实施例中任一所述的用户肢体动作的解析方法步骤。On the other hand, the embodiment of the present application provides a computer program product, when it is run on a computer, causing the computer to perform the step of analyzing the user's limb motion according to any of the above embodiments.
本申请实施例的有益效果是:通过识别图片中用户的肢体图像表征的肢体语言,并匹配与该肢体语言具有相同含义可视化信息或音频信息。以此,将图像中的肢体特征所表述的信息通过能够被人类直接解读的方式进行呈现,实现了对于人类肢体动作的深层次解读,方便语言障碍者或者语言不通用户之间的交流。The beneficial effects of the embodiments of the present application are: by identifying the body language of the user's limb image in the picture, and matching the visual or audio information having the same meaning as the body language. In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, thereby realizing a deep interpretation of the human body movements, facilitating communication between the language disabled or the languageless users.
当然,实施本申请的任一产品或方法必不一定需要同时达到以上所述的所有优点。Of course, implementing any of the products or methods of the present application necessarily does not necessarily require all of the advantages described above to be achieved at the same time.
附图说明DRAWINGS
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application. Other drawings can also be obtained from those skilled in the art based on these drawings without paying any creative effort.
图1为本申请实施例用户肢体动作的解析方法的基本流程示意图;1 is a schematic diagram of a basic flow of a method for analyzing a user's limb motion according to an embodiment of the present application;
图2为本申请实施例用户肢体动作的解析方法第一种实施方式的展示示意图;2 is a schematic diagram showing a first embodiment of a method for analyzing a user's limb motion according to an embodiment of the present application;
图3为本申请实施例用户肢体动作的解析方法第二种实施方式的展示示 意图;3 is a schematic view showing a second embodiment of a method for analyzing a user's limb motion according to an embodiment of the present application;
图4为本申请实施例解析用户面部表情进行应用的流程示意图;4 is a schematic flowchart of analyzing an application of a user's facial expression according to an embodiment of the present application;
图5为本申请实施例显示表情图片的一种实施方式流程示意图;FIG. 5 is a schematic flowchart diagram of an embodiment of displaying an emoticon image according to an embodiment of the present application;
图6为本申请实施例确认显示容器范围内的表情图片与人体脸部动作信息相同的流程示意图;FIG. 6 is a schematic flowchart of confirming that an emoticon image in a display container is the same as a human facial action information according to an embodiment of the present application;
图7为本申请实施例通过匹配结果进行奖励的流程示意图;FIG. 7 is a schematic flowchart of performing rewards by matching results according to an embodiment of the present application;
图8为本申请实施例统计等分的具体流程示意图;8 is a schematic flowchart of a statistical aliquot of an embodiment of the present application;
图9为本申请实施例解析人脸图像情绪信息的流程示意图;FIG. 9 is a schematic flowchart of analyzing emotion information of a face image according to an embodiment of the present application;
图10为本申请实施例人脸图像的情绪信息分类与匹配度检测的流程示意图;FIG. 10 is a schematic flowchart diagram of emotion information classification and matching degree detection of a face image according to an embodiment of the present application;
图11为本申请实施例用户肢体动作的解析方法第三种实施方式的展示示意图;11 is a schematic diagram showing a third embodiment of a method for analyzing a limb motion of a user according to an embodiment of the present application;
图12为本申请实施例用户肢体动作的解析系统基本结构框图;12 is a basic structural block diagram of an analysis system for a user's limb movement according to an embodiment of the present application;
图13为本申请实施例移动终端基本结构示意图。FIG. 13 is a schematic diagram of a basic structure of a mobile terminal according to an embodiment of the present application.
具体实施方式Detailed ways
为了使本技术领域的人员更好地理解本申请方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。The technical solutions in the embodiments of the present application are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present application.
在本申请的说明书和权利要求书及上述附图中的描述的一些流程中,包含了按照特定顺序出现的多个操作,但是应该清楚了解,这些操作可以不按照其在本文中出现的顺序来执行或并行执行,操作的序号如101、102等,仅仅是用于区分开各个不同的操作,序号本身不代表任何的执行顺序。另外,这些流程可以包括更多或更少的操作,并且这些操作可以按顺序执行或并行执行。需要说明的是,本文中的“第一”、“第二”等描述,是用于区分不同的消息、设备、模块等,不代表先后顺序,也不限定“第一”和“第二”是不同的类型。In the flow of the description in the specification and claims of the present application and the above-described figures, a plurality of operations in a specific order are included, but it should be clearly understood that these operations may not follow the order in which they appear in this document. Execution or parallel execution, the serial number of the operation such as 101, 102, etc., is only used to distinguish different operations, and the serial number itself does not represent any execution order. Additionally, these processes may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should be noted that the descriptions of “first” and “second” in this document are used to distinguish different messages, devices, modules, etc., and do not represent the order, nor the “first” and “second”. It is a different type.
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而 不是全部的实施例。基于本申请中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The technical solutions in the embodiments of the present application are clearly and completely described in the following with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application without creative efforts are within the scope of the present application.
实施例Example
请参阅图1,图1为本实施例用户肢体动作的解析方法的基本流程示意图。Please refer to FIG. 1. FIG. 1 is a schematic flow chart of a method for analyzing a user's limb motion according to an embodiment of the present invention.
如图1所示,一种用户肢体动作的解析方法,包括下述步骤:As shown in FIG. 1 , a method for analyzing a user's limb motion includes the following steps:
S1100、获取用户的肢体图像;S1100, acquiring a limb image of the user;
本实施例中的肢体图像可以包括但不限于:人脸图像、手势动作图像和/或嘴唇动作图像。The limb images in this embodiment may include, but are not limited to, a face image, a gesture motion image, and/or a lip motion image.
其中,人脸图像也可以称为脸部图像。Among them, the face image can also be referred to as a face image.
在一种实现方式中,终端通过访问本地存储空间指定区域,获取存储在本地存储空间内的包括有用户的肢体图像的目标图像。在另一种实现方式中,通过开启设置在终端上或者与终端连接的拍摄设备,直接实时获取用户的肢体图像。In an implementation manner, the terminal acquires a target image that includes a limb image of the user stored in the local storage space by accessing the local storage space designation area. In another implementation manner, the user's limb image is directly acquired in real time by turning on the photographing device disposed on the terminal or connected to the terminal.
在另一种实现方式中,终端可以通过访问所连接的具有存储功能的外部设备的特定区域,获取存储在具有存储功能的外部设备的包括有用户的肢体图像的目标图像。In another implementation, the terminal may acquire a target image including the user's limb image stored in the external device having the storage function by accessing the connected specific area of the external device having the storage function.
一种情况中,该肢体图像可以指目标图像中用户的肢体所在区域对应的图像。另一种情况中,该肢体图像可以指完整的目标图像。In one case, the limb image may refer to an image corresponding to the region where the user's limb is located in the target image. In another case, the limb image may refer to a complete target image.
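For illustration only and not as part of the described embodiment, the two acquisition paths above can be sketched as follows; the use of OpenCV, the function name acquire_limb_image and its parameters are assumptions made for this example.

```python
# Minimal sketch (illustrative assumptions): acquire a limb image either from
# local storage or in real time from a camera attached to the terminal.
import cv2

def acquire_limb_image(path=None, camera_index=0):
    """Return one BGR image: from local storage if `path` is given, otherwise from the camera."""
    if path is not None:
        image = cv2.imread(path)          # read a target image saved in the local storage space
        if image is None:
            raise FileNotFoundError(path)
        return image
    cap = cv2.VideoCapture(camera_index)  # open the shooting device on or connected to the terminal
    try:
        ok, frame = cap.read()            # grab one frame in real time
        if not ok:
            raise RuntimeError("camera frame could not be read")
        return frame
    finally:
        cap.release()
```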
S1200、识别所述肢体图像表征的肢体语言;S1200. Identify a body language of the limb image representation;
肢体语言是指肢体图像的动作表征的具体含义,肢体语言包括(不限于):用户脸部图像表征的情绪信息、手势动作图像动作表征的语言信息或嘴唇动作图像表征的语言信息。Body language refers to the specific meaning of the action representation of the limb image. The body language includes (not limited to): the emotion information represented by the user's face image, the language information of the gesture action image action representation or the language information represented by the lip motion image.
也就是说,肢体语言是指肢体图像的动作表征的具体含义,肢体语言包括但不限于:用户的脸部图像表征的情绪信息、手势动作图像中手势动作表征的语言信息和/或嘴唇动作图像嘴唇动作表征的语言信息。That is to say, body language refers to the specific meaning of the action representation of the limb image, including but not limited to: emotional information represented by the user's facial image, language information of the gesture action representation in the gesture motion image, and/or lip motion image Language information characterized by lip movements.
本实施方式中,识别肢体图像表征的肢体语言采用的技术方案可以为: 通过深度学习的方法进行识别。具体地,收集大量的包括人体的肢体图像的图片作为训练样本,根据人们对于各种肢体图像所表述的肢体语言的主观判断,获取每个训练样本肢体动作的主观含义,并将该主观含义设为人们对该训练样本的期望输出。然后,将训练样本输入到卷积神经网络模型中,通过对训练样本的特征的提取,并输出训练样本的分类数据,分类数据为训练样本在本轮训练中属于各分类结果的概率值,本实施例中分类结果为不同的肢体语言的名称。其中,概率值最大且大于预设的衡量阈值的分类结果,为本轮训练中该训练样本的激励输出。比较该期望输出与激励输出是否一致,当期望输出与激励输出一致时训练结束;当期望输出与激励输出不一致时,通过反向传播算法,校正卷积神经网络的权值,以调整输出的结果。调整卷积神经网络的权值后,将训练样本重新输入,循环往复直至期望输出与激励输出一致时训练结束。In the present embodiment, the technical solution adopted for identifying the body language of the limb image representation may be: performing recognition by a deep learning method. Specifically, a large number of pictures including body images of the human body are collected as training samples, and subjective judgments of body functions expressed by various body images are obtained, and subjective meanings of body movements of each training sample are obtained, and the subjective meaning is set. The expected output for this training sample. Then, the training sample is input into the convolutional neural network model, and the feature data of the training sample is extracted, and the classification data of the training sample is output, and the classification data is the probability value of the training sample belonging to each classification result in the current training. The classification results in the examples are the names of different body language. The classification result with the largest probability value and greater than the preset measurement threshold is the excitation output of the training sample in the current round of training. Comparing whether the desired output is consistent with the excitation output, and the training ends when the expected output is consistent with the excitation output; when the expected output is inconsistent with the excitation output, correcting the weight of the convolutional neural network by the back propagation algorithm to adjust the output result . After adjusting the weight of the convolutional neural network, the training samples are re-inputted and cycled until the desired output is consistent with the excitation output.
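The training loop described above can be illustrated with the following non-authoritative sketch; it assumes PyTorch, and the network, data loader, loss function, learning rate and measurement threshold are placeholders rather than the embodiment's fixed choices.

```python
# Sketch of the described training procedure under the stated assumptions.
import torch
import torch.nn as nn

def train_body_language_model(model, loader, threshold=0.5, epochs=10):
    """`loader` yields (limb_image_batch, expected_label_batch); each label indexes a body-language name."""
    criterion = nn.CrossEntropyLoss()                  # compares the output with the expected output
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        for images, expected in loader:
            logits = model(images)                     # convolutional network forward pass (feature extraction)
            probs = torch.softmax(logits, dim=1)       # classification data: probability of each classification result
            max_prob, excitation = probs.max(dim=1)    # excitation output: class with the largest probability
            if bool(((excitation == expected) & (max_prob > threshold)).all()):
                continue                               # expected output already matches the excitation output
            loss = criterion(logits, expected)
            optimizer.zero_grad()
            loss.backward()                            # back-propagation corrects the network weights
            optimizer.step()
    return model
```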
其中,该分类结果可以根据需求人为设定。分类结果根据输出的复杂程度能够为若干个,分类结果越多则训练的复杂程度越高。Among them, the classification result can be set according to the demand. The classification result can be several according to the complexity of the output, and the more the classification result, the higher the complexity of the training.
在一种情况中,有时需要反复输入以验证输出的稳定性,稳定性较好时结束训练。也就是说:为了保证卷积神经网络输出的结果的稳定性,可以反复向卷积神经网络输入训练样本,直至卷积神经网络针对每一训练样本输出的激励输出,与该训练样本对应的期望输出相同的概率超过预设值之后,训练结束。In one case, it is sometimes necessary to repeatedly input to verify the stability of the output, and when the stability is good, the training is ended. That is to say: in order to ensure the stability of the output of the convolutional neural network, the training samples can be repeatedly input to the convolutional neural network until the excitation output of the convolutional neural network for each training sample, the expectation corresponding to the training sample After the output of the same probability exceeds the preset value, the training ends.
一种情况中,上述期望输出与激励输出一致,可以指:所对应激励输出与所对应期望输出一致的训练样本的数量,不小于预设数量。相应的,上述期望输出与激励输出不一致,可以指:所对应激励输出与所对应期望输出一致的训练样本的数量,小于预设数量。In one case, the expected output is consistent with the excitation output, and may refer to: the number of training samples whose corresponding excitation output is consistent with the corresponding desired output, not less than the preset number. Correspondingly, the expected output is inconsistent with the excitation output, and may refer to: the number of training samples whose corresponding excitation output is consistent with the corresponding desired output is less than the preset number.
另一种情况中,上述期望输出与激励输出一致,可以指:所对应激励输出与所对应期望输出一致的训练样本数量,与训练样本的总数量的比值不小于预设比值。相应的,上述期望输出与激励输出不一致,可以指:所对应激励输出与所对应期望输出一致的训练样本数量,与训练样本的总数量的比值小于预设比值。In another case, the expected output is consistent with the excitation output, and may refer to: the ratio of the training samples whose corresponding excitation output is consistent with the corresponding expected output, and the ratio of the total number of training samples is not less than the preset ratio. Correspondingly, the expected output is inconsistent with the excitation output, and may refer to: the ratio of the training samples whose corresponding excitation output is consistent with the corresponding expected output, and the ratio of the total number of training samples is less than the preset ratio.
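The ratio-based stopping criterion just described can be sketched as follows; the function name and the preset ratio value are illustrative assumptions.

```python
# Sketch of the ratio-based convergence check (illustrative names and preset ratio).
def training_converged(excitation_outputs, expected_outputs, preset_ratio=0.95):
    """Training ends when the share of training samples whose excitation output equals
    the corresponding expected output is not smaller than the preset ratio."""
    matches = sum(1 for e, d in zip(excitation_outputs, expected_outputs) if e == d)
    return matches / len(expected_outputs) >= preset_ratio
```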
通过大量的能够表征不同肢体语言的肢体图像训练至收敛的卷积神经网 络模型,能够快速且准确的确定输入其中的肢体图像(未参与训练)的肢体语言。By training a large number of limb images that can characterize different body language to a convolved convolutional neural network model, the body language of the limb image (not involved in training) can be quickly and accurately determined.
S1300、匹配与所述肢体语言具有相同含义的可视化信息或音频信息。S1300: Matching visual information or audio information having the same meaning as the body language.
可视化信息是指能够被人类眼睛观测到的信息,包括但不限于:文字信息、图片信息和/或视频信息。Visual information refers to information that can be observed by human eyes, including but not limited to: text information, picture information, and/or video information.
通过卷积神经网络模型获取到用户的肢体图像所表征的肢体语言,即获取到用户的肢体图像表征的文字信息。以该文字信息为检索关键字,在本地数据库中检索与该文字信息具有相同含义的可视化信息或音频信息。在一些实施方式中,为方便匹配,将存储在本地数据库中的可视化信息或音频信息,根据其表达的含义设定一个或多个标签,以方便肢体语言通过检索标签进行对应匹配。The body language represented by the user's limb image is obtained by the convolutional neural network model, that is, the text information of the user's limb image representation is obtained. The text information is used as a search key, and the visual information or the audio information having the same meaning as the text information is retrieved in the local database. In some embodiments, in order to facilitate matching, the visualization information or the audio information stored in the local database is set according to the meaning of its expression, so as to facilitate the body language to perform corresponding matching by retrieving the tags.
也就是说,在一种实现方式中,通过卷积神经网络模型获取到用户的肢体图像所表征的肢体语言,即获取到用于表征该肢体图像中肢体含义的描述信息,该描述信息为一种文字类的信息,可以称为文字语言。进而,以该描述信息为检索关键字,在第一预设数据库中,检索与该描述信息基于相同含义的可视化信息或音频信息。在一些实现方式中,为了方便匹配,可以为存储于第一预设数据库中的可视化信息或音频信息,根据每一可视化信息或音频信息所表达的含义,配置一个或多个标签,以方便后续的检索过程,即在检索时通过标签进行检索匹配。That is, in one implementation, the body language represented by the user's limb image is obtained by the convolutional neural network model, that is, the description information for characterizing the limb in the limb image is obtained, and the description information is one. A type of textual information can be called a literal language. Further, using the description information as a search key, in the first preset database, the visualization information or the audio information based on the same meaning as the description information is retrieved. In some implementations, for the convenience of matching, one or more labels may be configured according to the meaning expressed by each of the visualization information or the audio information for the visualization information or the audio information stored in the first preset database, so as to facilitate subsequent The retrieval process, that is, retrieval matching through tags at the time of retrieval.
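A minimal sketch of the tag-based retrieval against the first preset database is given below; the in-memory list standing in for the database, the asset file names and the tag layout are hypothetical, and serve only to show how a recognized body-language keyword could be matched to stored visualization or audio information.

```python
# Hypothetical stand-in for the first preset database: each stored item of visualization
# or audio information carries one or more tags describing the meaning it expresses.
FIRST_PRESET_DATABASE = [
    {"asset": "hello_text.png",  "type": "visual", "tags": {"hello", "greeting"}},
    {"asset": "hello_voice.wav", "type": "audio",  "tags": {"hello"}},
    {"asset": "thanks_text.png", "type": "visual", "tags": {"thanks"}},
]

def match_information(body_language_keyword):
    """Return every stored item whose tags contain the recognized body-language keyword."""
    return [item for item in FIRST_PRESET_DATABASE
            if body_language_keyword in item["tags"]]
```

For example, match_information("hello") would return both the text picture and the audio clip tagged with that meaning.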
其中,上述第一预设数据库可以存储于终端本地,也可以存储于终端所连接的具有存储功能的外部设备中。The first preset database may be stored locally in the terminal, or may be stored in an external device with a storage function connected to the terminal.
请参阅图2,图2为本实施例用户肢体动作的解析方法第一种实施方式的展示示意图。Please refer to FIG. 2. FIG. 2 is a schematic diagram showing the first embodiment of the method for analyzing the limb motion of the user.
如图2所示,在一些实施方式中,用户肢体动作的解析方法被用于解析用户的肢体动作,并将该动作转化为文字语言,通过实时获取用户的肢体图像,将该图像转化为文字语言进行输出。例如,识别哑语或特种作战的手语,并将该肢体语言转化为文字语言。图2中,将用户的肢体语言所表述的“你好”转化为文字语言“你好”。As shown in FIG. 2, in some embodiments, the parsing method of the user's limb motion is used to parse the user's limb motion, and convert the motion into a text language, and obtain the user's limb image in real time, and convert the image into text. The language is output. For example, identify sign language in dummy or special operations and convert the body language into a written language. In Figure 2, the "hello" expressed by the user's body language is converted into the text language "hello".
请参阅图3,图3为本实施中用户肢体动作的解析方法第二种实施方式的展示示意图。Please refer to FIG. 3. FIG. 3 is a schematic diagram showing a second embodiment of a method for analyzing a user's limb motion in the present embodiment.
如图3所示,在一些实施方式中,对用户的脸部表情动作所表征的情绪信息进行识别。并检索与该情绪信息具有相同情绪含义的表情进行输出,但不限于于此。在一些实施方式中,能够输出与该情绪信息具有相同情绪含义的文字、图片、动画或语音。如图3所示,在进行聊天时,用户出现喜悦的表情时,向对方发送具有喜悦含义的表情。例如:具有喜悦含义的表情图片。As shown in FIG. 3, in some embodiments, the emotional information represented by the facial expression action of the user is identified. The expression having the same emotional meaning as the emotion information is retrieved for output, but is not limited thereto. In some embodiments, text, pictures, animations, or speech that have the same emotional meaning as the emotional information can be output. As shown in FIG. 3, when the user performs a chat, when the user has an expression of joy, an expression with a meaning of joy is sent to the other party. For example: an emoticon with a meaning of joy.
上述实施方式,通过识别图片中用户的肢体图像表征的肢体语言,并匹配与该肢体语言具有相同含义可视化信息或音频信息。以此,将图像中的肢体特征所表述的信息通过能够被人类直接解读的方式进行呈现,实现了对于人类肢体动作的深层次解读,方便语言障碍者或者语言不通用户之间的交流。In the above embodiment, the body language represented by the limb image of the user in the picture is recognized, and the visual information or the audio information having the same meaning as the body language is matched. In this way, the information expressed by the limb features in the image is presented in a way that can be directly interpreted by humans, thereby realizing a deep interpretation of the human body movements, facilitating communication between the language disabled or the languageless users.
请参阅图4,图4为本实施例解析用户面部表情进行应用的流程示意图。Please refer to FIG. 4. FIG. 4 is a schematic flowchart of analyzing an application of a user's facial expression according to an embodiment of the present invention.
如图4所示,用户肢体动作的解析方法还包括下述步骤:As shown in FIG. 4, the method for analyzing the user's limb motion further includes the following steps:
S2100、获取用户的人脸图像;S2100: acquiring a face image of the user;
在一种实现方式中,终端通过访问本地存储空间指定区域,获取存储在本地存储空间内的包括有用户的人脸图像的目标图像。在另一种实现方式中,通过开启设置在终端上或者与终端连接的拍摄设备,直接实时获取用户的人脸图像。In an implementation manner, the terminal acquires a target image including a face image of the user stored in the local storage space by accessing the local storage space designation area. In another implementation manner, the user's face image is directly acquired in real time by turning on the photographing device disposed on the terminal or connected to the terminal.
在另一种实现方式中,终端可以通过访问所连接的具有存储功能的外部设备的特定区域,获取存储在具有存储功能的外部设备的包括有用户的人脸图像的目标图像。In another implementation, the terminal may acquire a target image including a face image of the user stored in the external device having the storage function by accessing the connected specific area of the external device having the storage function.
S2200、识别所述人脸图像表征的人体脸部动作信息;S2200: Identify human face motion information represented by the facial image;
人体脸部动作信息包括人体脸部动作所表征的情绪信息,也可以称为表情信息,如喜怒哀乐等;同时也可以是表征用户无情绪表征的动作,例如撇嘴、吐舌或皱额头等。The facial motion information of the human body includes emotional information represented by the facial motion of the human body, and may also be referred to as facial expression information, such as emotions, sorrows, sorrows, and the like; and may also be an action that characterizes the user without emotional representation, such as licking, tongue, or wrinkle forehead. .
本实施方式中,识别人脸图像表征的人体脸部动作信息采用的技术方案可以为:通过深度学习的方法进行识别。具体地,收集大量的包括人体的人 脸图像的图片作为训练样本,根据人们对于各种人脸图像所表述的人体脸部动作信息的主观判断,获取每个训练样本肢体动作的主观含义,并将该主观含义设为人们对该训练样本的期望输出。然后,将训练样本输入到卷积神经网络模型中,通过对训练样本的特征的提取,并输出训练样本的分类数据,分类数据为训练样本在本轮训练中属于各分类结果的概率值,本实施例中分类结果为不同的人体脸部动作信息的名称。其中,概率值最大且大于预设的衡量阈值的分类结果,为本轮训练中该训练样本的激励输出。比较该期望输出与激励输出是否一致,当期望输出与激励输出一致时训练结束;当期望输出与激励输出不一致时,通过反向传播算法,校正卷积神经网络的权值,以调整输出的结果。调整卷积神经网络的权值后,将训练样本重新输入,循环往复直至期望输出与激励输出一致时训练结束。In the embodiment, the technical solution for recognizing the facial motion information of the human face image representation may be: performing the method of deep learning. Specifically, a large number of pictures including facial images of the human body are collected as training samples, and subjective judgments of human body facial motion information expressed by various facial images are obtained, and subjective meanings of limb movements of each training sample are obtained, and The subjective meaning is set to the desired output of the training sample. Then, the training sample is input into the convolutional neural network model, and the feature data of the training sample is extracted, and the classification data of the training sample is output, and the classification data is the probability value of the training sample belonging to each classification result in the current training. The classification result in the embodiment is the name of different human facial motion information. The classification result with the largest probability value and greater than the preset measurement threshold is the excitation output of the training sample in the current round of training. Comparing whether the desired output is consistent with the excitation output, and the training ends when the expected output is consistent with the excitation output; when the expected output is inconsistent with the excitation output, correcting the weight of the convolutional neural network by the back propagation algorithm to adjust the output result . After adjusting the weight of the convolutional neural network, the training samples are re-inputted and cycled until the desired output is consistent with the excitation output.
其中,该分类结果可以根据需求人为设定。分类结果根据输出的复杂程度能够为若干个,分类结果越多则训练的复杂程度越高。Among them, the classification result can be set according to the demand. The classification result can be several according to the complexity of the output, and the more the classification result, the higher the complexity of the training.
在一种情况中,有时需要反复输入以验证输出的稳定性,稳定性较好时结束训练。也就是说:为了保证卷积神经网络输出的结果的稳定性,可以反复向卷积神经网络输入训练样本,直至卷积神经网络针对每一训练样本输出的激励输出,与该训练样本对应的期望输出相同的概率超过预设值之后,训练结束。In one case, it is sometimes necessary to repeatedly input to verify the stability of the output, and when the stability is good, the training is ended. That is to say: in order to ensure the stability of the output of the convolutional neural network, the training samples can be repeatedly input to the convolutional neural network until the excitation output of the convolutional neural network for each training sample, the expectation corresponding to the training sample After the output of the same probability exceeds the preset value, the training ends.
一种情况中,上述期望输出与激励输出一致,可以指:所对应激励输出与所对应期望输出一致的训练样本的数量,不小于预设数量。相应的,上述期望输出与激励输出不一致,可以指:所对应激励输出与所对应期望输出一致的训练样本的数量,小于预设数量。In one case, the expected output is consistent with the excitation output, and may refer to: the number of training samples whose corresponding excitation output is consistent with the corresponding desired output, not less than the preset number. Correspondingly, the expected output is inconsistent with the excitation output, and may refer to: the number of training samples whose corresponding excitation output is consistent with the corresponding desired output is less than the preset number.
另一种情况中,上述期望输出与激励输出一致,可以指:所对应激励输出与所对应期望输出一致的训练样本数量,与训练样本的总数量的比值不小于预设比值。相应的,上述期望输出与激励输出不一致,可以指:所对应激励输出与所对应期望输出一致的训练样本数量,与训练样本的总数量的比值小于预设比值。In another case, the expected output is consistent with the excitation output, and may refer to: the ratio of the training samples whose corresponding excitation output is consistent with the corresponding expected output, and the ratio of the total number of training samples is not less than the preset ratio. Correspondingly, the expected output is inconsistent with the excitation output, and may refer to: the ratio of the training samples whose corresponding excitation output is consistent with the corresponding expected output, and the ratio of the total number of training samples is less than the preset ratio.
通过大量的能够表征不同人体脸部动作信息的人脸图像训练至收敛的卷积神经网络模型,能够快速且准确的确定输入其中的人脸图像(未参与训练) 的人体脸部动作信息。Through the training of a large number of face images capable of characterizing different facial motion information to the convolved convolutional neural network model, it is possible to quickly and accurately determine the facial motion information of the face image (not participating in the training).
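Using the converged model for recognition can be sketched as follows; this is an illustration under stated assumptions rather than the embodiment itself, and the listed class names are placeholders. The classification result with the largest probability gives the emotion information, and that probability is taken as the matching degree.

```python
# Sketch of inference with the trained model, assuming PyTorch and a preprocessed input tensor.
import torch

EMOTION_NAMES = ["joy", "anger", "sadness", "surprise"]   # illustrative classification results

def recognize_face_action(model, face_tensor):
    """`face_tensor` is a 1xCxHxW image tensor; returns (emotion information, matching degree)."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(face_tensor), dim=1)[0]   # classification data
    matching_degree, index = probs.max(dim=0)                 # largest probability and its class
    return EMOTION_NAMES[int(index)], float(matching_degree)
```

A matching degree close to 1 indicates that the classification data strongly supports the recognized emotion information.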
S2300、匹配与所述人体脸部动作信息具有相同动作含义的表情图片。The S2300 matches an emoticon image having the same action meaning as the human facial motion information.
表情图片指终端或终端所连接的具有存储功能的外部设备中,存储模拟用户表情设计而成的表情或动画表情。An emoticon image refers to an external device with a storage function connected to a terminal or a terminal, and stores an emoticon or animated emoticon designed to simulate a user's expression.
The human facial motion information represented by the user's face image, that is, the text information representing the user's face image, is obtained through the convolutional neural network model. Using this text information as a search key, an emoticon image having the same meaning as the text information is retrieved from the local database. In some embodiments, to facilitate matching, each emoticon image stored in the local database is assigned one or more tags according to the meaning it expresses, so that the facial motion information can be matched by searching the tags.
That is, in one implementation, the facial motion information represented by the face image is obtained through the convolutional neural network model, i.e., description information characterizing the meaning of the facial expression in the face image is obtained. The description information is textual information and may be referred to as text language. Then, using the description information as a search key, an emoticon image having the same meaning as the description information is retrieved from a second preset database. In some implementations, to facilitate matching, each emoticon image stored in the second preset database may be configured with one or more tags according to the meaning it expresses, so that subsequent retrieval can be performed by matching against the tags.
The second preset database may be stored locally in the terminal, or in an external device with a storage function connected to the terminal.
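As an illustration only, the tag-based retrieval described above can be sketched as follows. The in-memory list stands in for the second preset database (whether local or on a connected external device); the file names, tags, and the function name find_emoticons are assumptions, not part of the original disclosure.

EMOTICON_DB = [
    {"file": "smile_01.png", "tags": ["laugh", "happy"]},
    {"file": "cry_01.png",   "tags": ["cry", "sad"]},
    {"file": "frown_01.png", "tags": ["frown", "angry"]},
]

def find_emoticons(description):
    # Return emoticon images whose tags contain the recognized description keyword.
    return [e["file"] for e in EMOTICON_DB if description in e["tags"]]

# Example: facial motion information recognized as "laugh"
matched = find_emoticons("laugh")   # -> ["smile_01.png"]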
After the user's expression information, i.e., emotion information, is parsed, the specific meaning represented by the expression information is obtained, and an emoticon image having the same meaning as the expression information is then matched. This facilitates user input, and the parsing result can also be used for deeper interactive processing of the user's expression.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of one implementation of displaying an emoticon image according to this embodiment.
As shown in FIG. 5, the following steps are further included before step S2100:
S2011: calling at least one pre-stored emoticon image;
In one implementation, an emoticon pack including several emoticon images, or a plurality of emoticon packs scattered in that area or folder, is stored in a designated area or folder of the terminal's storage space. Each emoticon image represents one human facial motion.
That is, a designated area of the terminal, or a connected external device with a storage function, may store an emoticon pack including several emoticon images. Each emoticon image represents one human facial motion.
In one implementation, one or more emoticon images may be called for display according to a preset script.
S2012: placing the emoticon image in a display container according to the preset script, so that the emoticon image is displayed visually.
The script is a preset program for controlling the display behavior of the emoticon images. It defines a time control that sets how long an emoticon image stays in the display area, a motion control that sets the movement trajectory of the image in the display area, and a rendering control that makes an emoticon image glow when it is successfully matched in the display area. Traversing these controls completes the display of the emoticon images in the display container.
The emoticon images are placed into the display container. An emoticon image placed in the display container is laid out and rendered in the container according to the parameters set by the preset script, and is then presented in the display area of the terminal for the user to view.
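Purely for illustration, the three controls of the preset script might be represented as a small parameter set attached to each emoticon before rendering. The field names (dwell_seconds, trajectory, highlight_on_match) are assumptions chosen only to make the time, motion, and rendering controls concrete.

preset_script = {
    "dwell_seconds": 5,             # time control: how long an emoticon stays in the display area
    "trajectory": "top_to_bottom",  # motion control: movement path in the display area
    "highlight_on_match": True,     # rendering control: glow when the emoticon is matched
}

def layout_emoticons(emoticons, script):
    # Attach the script parameters to each emoticon before it is rendered in the container.
    return [{"image": image, **script} for image in emoticons]

display_container = layout_emoticons(["smile_01.png", "cry_01.png"], preset_script)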
In some embodiments, the emoticon images are used in a mimic application. With the terminal camera turned on, the user's face image is collected in real time and displayed on the terminal screen. The user imitates the actions of the emoticon images shown within the display screen, the captured images of the user are classified and recognized, and when the user's facial expression is the same as the action represented by one or more emoticon images within the display screen, the successfully matched emoticon image is scored and rendered with a glow effect according to the preset script.
That is, in some embodiments, the emoticon images may be used in a mimic application. With the terminal camera turned on, the terminal may use the camera to collect the user's face image in real time and display it in the display area of the terminal, such as its display screen. The display screen may show emoticon images, and the user may imitate the expressions in the emoticon images displayed within the screen, making the same expression actions. The terminal then uses the camera to capture the expression actions made by the user and classifies and recognizes the collected images, i.e., the images of the user imitating the expressions. When the expression action made by the user, for example a facial expression, is the same as the expression in one or more emoticon images within the display screen of the terminal, the successfully matched emoticon image is scored and rendered with a glow effect according to the preset script.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of confirming that an emoticon image within the display container is the same as the human facial motion information according to this embodiment.
As shown in FIG. 6, step S2300 specifically includes the following steps:
S2310: comparing the human facial motion information with the emoticon images within the display container;
The user's face image is collected in real time and displayed on the terminal screen. The user imitates the actions of the emoticon images shown within the display screen; the captured images of the user are classified and recognized, and the classification result is then compared with the action information represented by the emoticon images within the display container.
That is, in one implementation, after the terminal places the emoticon images in the display container according to the preset script so that they are displayed visually, i.e., after the terminal shows the emoticon images on its display screen, it may collect the user's face image in real time through the camera. The face image may include an expression action made by the user, which may be an action the user makes to imitate the expression represented by an emoticon image displayed within the display screen, or an action the user makes at will. After obtaining the user's face image, the terminal may classify and recognize the expression action contained in it, i.e., recognize the human facial motion information in the face image, and then compare the facial motion information with the emoticon images within the display container, i.e., within the display screen of the terminal, to determine the comparison result.
S2320: when the action meaning represented by an emoticon image in the display container is the same as the human facial motion information, confirming that the display container contains an emoticon image having the same action meaning as the human facial motion.
When the user's facial expression is the same as the action represented by one or more emoticon images within the display screen, it is confirmed that the display container contains an emoticon image having the same action meaning as the human facial motion. Here, the user's facial expression is the expression represented by the human facial motion information.
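A minimal sketch of steps S2310 and S2320, under assumptions: the recognized facial motion label is compared with the labels of the emoticon images currently in the display container, and a match is confirmed when the action meanings coincide. The container contents, labels, and function name are illustrative only.

display_container = [
    {"image": "smile_01.png", "label": "laugh"},
    {"image": "cry_01.png",   "label": "cry"},
]

def find_matching_emoticon(recognized_label, container):
    # Return the first displayed emoticon whose action meaning equals the recognized label.
    for emoticon in container:
        if emoticon["label"] == recognized_label:
            return emoticon      # a matching emoticon exists in the display container
    return None                  # no emoticon with the same action meaning

match = find_matching_emoticon("laugh", display_container)  # -> the smile_01.png entry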
In some embodiments, when the user's expression action, i.e., the user's facial expression, has the same action meaning as that represented by an emoticon image, a bonus score also needs to be calculated according to the degree of matching between the two. For details, referring to FIG. 7, FIG. 7 is a schematic flowchart of rewarding according to the matching result in this embodiment.
As shown in FIG. 7, the following steps are further included after step S2300:
S2411: obtaining matching degree information between the human facial motion information and the emoticon image;
In this implementation, the human facial motion information is parsed and confirmed through the classification result of the convolutional neural network model. The classification result output by the classification layer of the convolutional neural network model consists of the probability values that the face image belongs to each classification result, where each probability value may range from 0 to 1; correspondingly, the classification result corresponding to the face image may contain multiple values, generally between 0 and 1. For example, the classification results are set as four emotion results: happiness, anger, sorrow, and joy. After a face image is input, [0.75 0.2 0.4 0.3] is obtained. Since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "happiness". The matching degree information between the human facial motion information and the emoticon image representing the emotion "happiness" is 0.75, i.e., the similarity between the expression action represented by the face image and the emoticon image representing the emotion "happiness" is 75%.
For another example, the classification results are set as four expression results: laughing, crying, frowning, and expressionless. After the obtained face image is input, [0.79 0.1 0.3 0.1] is obtained. Since 0.79 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "laugh". The matching degree information between the human facial motion information and the emoticon image whose meaning is "laugh" is therefore 0.79, i.e., the similarity between the expression action represented by the face image and the emoticon image whose meaning is "laugh" is 79%.
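For illustration, the selection of the classification result from the probability values output by the classification layer can be sketched as follows; the class names and the 0.5 threshold follow the example above, while the function name and the use of NumPy are assumptions.

import numpy as np

CLASSES = ["laugh", "cry", "frown", "expressionless"]
THRESHOLD = 0.5

def classify_face(probabilities):
    # Return (label, matching_degree) from the per-class probability values.
    probs = np.asarray(probabilities)
    best = int(np.argmax(probs))
    if probs[best] > THRESHOLD:
        return CLASSES[best], float(probs[best])   # e.g. ("laugh", 0.79)
    return None, float(probs[best])                # no class exceeds the threshold

label, degree = classify_face([0.79, 0.1, 0.3, 0.1])   # -> ("laugh", 0.79)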
S2412: calculating, according to a preset matching rule, the bonus score corresponding to the matching degree information.
The matching rule is a preset method of calculating a bonus score from the matching degree information. For example, the matching results are divided, according to the matching degree information, into "perfect", "great", "good", and "miss", where "perfect" is a classification result with a matching degree in the interval 0.9-1.0; "great" is a classification result with a matching degree in the interval 0.7-0.9; "good" is a classification result with a matching degree in the interval 0.5-0.7; and "miss" is a classification result with a matching degree below 0.5. A "perfect" match result is set to score 30 points; a "great" match result scores 20 points; a "good" match result scores 10 points; and a "miss" match result scores 0 points.
The bonus score corresponding to the matching degree information is calculated according to the preset matching rule.
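The matching rule in the example above maps a matching degree to a grade and a bonus score, which can be sketched as follows; the interval boundaries and point values are taken from the example, and everything else is an illustrative assumption.

def bonus_score(matching_degree):
    # Map a matching degree in [0, 1] to a (grade, points) pair.
    if matching_degree >= 0.9:
        return "perfect", 30
    if matching_degree >= 0.7:
        return "great", 20
    if matching_degree >= 0.5:
        return "good", 10
    return "miss", 0

grade, points = bonus_score(0.79)   # -> ("great", 20)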
By scoring the matching result according to the matching degree information, the matching quality of the result can be further refined and a more accurate bonus score can be obtained.
In some embodiments, the matching results are continuously recorded within a preset period of time, and the user's scores within that period are counted after the period ends. For details, referring to FIG. 8, FIG. 8 is a schematic flowchart of counting the scores in this embodiment.
As shown in FIG. 8, the following steps are further included after step S2412:
S2421: recording all bonus scores within a preset first time threshold;
In one implementation, the first time threshold is the predetermined duration of one round of the matching game; for example, the duration of one round of the matching game is set to 3 minutes. The setting of the specific duration is not limited to this; in some alternative embodiments, the duration of the first time threshold can be shorter or longer.
S2422: accumulating the bonus scores to form the user's final score within the first time threshold.
The total of the user's bonus scores within the first time threshold is counted as the user's total score for participating in the matching within the first time threshold.
That is, all the bonus scores recorded within the first time threshold are added up, and the sum is taken as the user's final score for participating in the matching within the first time threshold.
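Steps S2421 and S2422 can be sketched as follows, under assumptions: only bonus scores awarded inside the first time threshold are recorded, and their sum is the final score for the round. The 3-minute duration follows the example above; the recorded values and function names are illustrative.

ROUND_SECONDS = 180                       # example duration of one round (3 minutes)

recorded_scores = []                      # S2421: scores recorded during the round

def record_bonus(points, elapsed_seconds):
    # Only scores awarded inside the first time threshold are kept.
    if elapsed_seconds <= ROUND_SECONDS:
        recorded_scores.append(points)

record_bonus(30, 12.5)
record_bonus(20, 47.0)
final_score = sum(recorded_scores)        # S2422: final score within the first time threshold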
Referring to FIG. 9, FIG. 9 is a schematic flowchart of parsing the emotion information of a face image according to this embodiment.
As shown in FIG. 9, the analysis method of the user's limb motion further includes the following steps:
S3100: randomly extracting, within a preset unit time, a preset number of emoticon images representing human emotions from an emoticon pack, and placing the emoticon images in the display container;
The unit time is the time for which one wave of emoticon images is loaded into the display container. For example, the time for loading one wave of emoticon images is 5 seconds, i.e., a wave of emoticon images appears in the display container for 5 seconds, after which it is replaced by a new wave. The number of emoticon images loaded per unit time can be preset. The setting rule may be fixed, for example five emoticon images are added per wave, or it may be variable. In some implementations, it may be set randomly according to the network state of the terminal, where the better the network state, the larger the preset number; in other implementations, the number of added emoticon images can increase over time, with the increment set according to the actual situation, for example by one, two, or more each time.
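For illustration, the wave loading described above might be sketched as follows; the pack contents, the 5-second unit time, and the wave size of five follow the examples in the text, while the function and variable names are assumptions.

import random

UNIT_TIME_SECONDS = 5     # how long one wave stays in the display container (example)
WAVE_SIZE = 5             # preset number of emoticons per wave (example)

emoticon_pack = ["laugh.png", "cry.png", "frown.png", "wink.png", "surprise.png", "neutral.png"]

def next_wave(pack, wave_size=WAVE_SIZE):
    # Randomly extract one wave of emoticons to place in the display container.
    return random.sample(pack, k=min(wave_size, len(pack)))

display_container = next_wave(emoticon_pack)   # replaced by a new wave after UNIT_TIME_SECONDS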
S3200: collecting the user's face image at fixed intervals or in real time within the unit time, and recognizing the emotion information represented by the face image and the matching degree between the face image and the emotion information;
Within the unit time, the user's face image is acquired in real time by turning on a capture device built into the terminal or connected to it; however, this is not limiting, and in some implementations the face image can be captured at fixed intervals (for example, every 0.1 s).
The emotion information of the face image is parsed and confirmed through the classification result of the convolutional neural network model. The classification result output by the classification layer of the convolutional neural network model consists of the probability values that the face image belongs to each classification result, where each probability value may range from 0 to 1; correspondingly, the classification result corresponding to the face image may contain multiple values between 0 and 1. For example, the classification results are set as four emotion results: happiness, anger, sorrow, and joy. After a face image is input, [0.75 0.2 0.4 0.3] is obtained. Since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "happiness". An emoticon image expressing the same "happiness" emotion is determined in the display container according to the classification result, and the matching degree information between the emotion information and that emoticon image is 0.75, i.e., the similarity between the expression action, namely the emotion action, represented by the face image and that emoticon image is 75%.
S3300: matching an emoticon image having the same emotional meaning as the face image, and confirming the bonus score of the face image according to the matching degree.
The matching rule is a preset method of calculating a bonus score from the matching degree information. For example, the matching results are divided, according to the matching degree information, into "perfect", "great", "good", and "miss", where "perfect" is a classification result with a matching degree in the interval 0.9-1.0; "great" is a classification result with a matching degree in the interval 0.7-0.9; "good" is a classification result with a matching degree in the interval 0.5-0.7; and "miss" is a classification result with a matching degree below 0.5. A "perfect" match result is set to score 30 points; a "great" match result scores 20 points; a "good" match result scores 10 points; and a "miss" match result scores 0 points.
The bonus score corresponding to the matching degree information is calculated according to the preset matching rule.
Referring to FIG. 10, FIG. 10 is a schematic flowchart of classifying the emotion information of a face image and detecting the matching degree according to this embodiment.
As shown in FIG. 10, step S3200 specifically includes the following steps:
S3210: collecting the user's face image;
Within the unit time, the user's face image is acquired in real time by turning on a capture device built into the terminal or connected to it; however, this is not limiting, and in some implementations the face image can be captured at fixed intervals (for example, every 0.1 s).
S3220: inputting the face image into a preset emotion recognition model, and obtaining the classification result and classification data of the face image;
The emotion recognition model is specifically a convolutional neural network model trained to a convergent state.
In this implementation, the technical solution used to recognize the emotion information represented by the face image may be recognition through deep learning. Specifically, a large number of pictures including human face images are collected as training samples; according to people's subjective judgment of the emotion information expressed by the various face images, the subjective meaning of each training sample is obtained and set as the expected output of that training sample. Then the training samples are input into the convolutional neural network model, which extracts the features of each training sample and outputs its classification data; the classification data are the probability values that the training sample belongs to each classification result in the current round of training, and in this embodiment the classification results are the names of the different emotion information. The classification result with the largest probability value that is also greater than a preset measurement threshold is the excitation output of that training sample in the current round of training. The expected output is compared with the excitation output; when they are consistent, the training ends. When they are inconsistent, the weights of the convolutional neural network are corrected through the back-propagation algorithm to adjust the output. After the weights are adjusted, the training samples are input again, and this cycle repeats until the expected output is consistent with the excitation output, at which point the training ends.
The classification results can be set manually as required. The number of classification results can vary according to the complexity of the output: the more classification results there are, the more complex the training.
In one case, it may be necessary to repeatedly input training samples to verify the stability of the output, and the training ends when the stability is satisfactory. That is, to ensure the stability of the results output by the convolutional neural network, the training samples may be repeatedly input into the convolutional neural network until the probability that the excitation output produced by the network for each training sample is identical to the expected output corresponding to that training sample exceeds a preset value, after which the training ends.
In one case, the expected output being consistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is not less than a preset number. Correspondingly, the expected output being inconsistent with the excitation output may mean that the number of training samples whose excitation output is consistent with the corresponding expected output is less than the preset number.
In another case, the expected output being consistent with the excitation output may mean that the ratio of the number of training samples whose excitation output is consistent with the corresponding expected output to the total number of training samples is not less than a preset ratio. Correspondingly, the expected output being inconsistent with the excitation output may mean that this ratio is less than the preset ratio.
S3230: determining the emotion information of the face image according to the classification result, and determining the matching degree between the face image and the emotion information according to the classification data.
The human facial motion information is parsed and confirmed through the classification result of the convolutional neural network model. The classification result output by the classification layer of the convolutional neural network model consists of the probability values that the face image belongs to each classification result, where each probability value may range from 0 to 1; correspondingly, the classification result corresponding to the face image may contain multiple values between 0 and 1. For example, the classification results are set as four emotion results: happiness, anger, sorrow, and joy. After a face image is input, [0.75 0.2 0.4 0.3] is obtained. Since 0.75 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "happiness". The matching degree information between the human facial motion information and the emoticon image representing the emotion "happiness" is 0.75, i.e., the similarity between the expression action represented by the face image and the emoticon image representing the emotion "happiness" is 75%.
For another example, the classification results are set as four expression results: laughing, crying, frowning, and expressionless. After the obtained face image is input, [0.79 0.1 0.3 0.1] is obtained. Since 0.79 is the maximum value and is greater than the preset threshold of 0.5, the classification result of the face image is "laugh". The matching degree information between the human facial motion information and the emoticon image whose meaning is "laugh" is therefore 0.79, i.e., the similarity between the expression action represented by the face image and the emoticon image whose meaning is "laugh" is 79%.
Referring to FIG. 11, FIG. 11 is a schematic diagram showing a third implementation of the analysis method of the user's limb motion in this embodiment.
As shown in FIG. 11, the user's self-portrait image can be displayed in the display area of the terminal while emoticon images are shown on the screen at the same time. The user imitates the displayed emoticon images by making the same expression actions; the terminal detects whether the imitated expression is the same as one of the emoticon images in the display area. When they match, the successfully matched emoticon image is displayed enlarged, and the corresponding bonus score is displayed according to the degree of matching.
Here, the imitated expression is the expression the user makes when imitating the displayed emoticon image with the same expression action.
To solve the above technical problem, an embodiment of the present application further provides an analysis system for a user's limb motion. For details, referring to FIG. 12, FIG. 12 is a block diagram of the basic structure of the analysis system for a user's limb motion in this embodiment.
As shown in FIG. 12, an analysis system for a user's limb motion includes an acquisition module 2100, a processing module 2200, and an execution module 2300. The acquisition module 2100 is configured to acquire a limb image of the user; the processing module 2200 is configured to recognize the body language represented by the limb image; and the execution module 2300 is configured to match visual information or audio information having the same meaning as the body language.
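Purely as an illustration of how the three modules of FIG. 12 fit together (not the disclosed API), a minimal sketch might look like the following; all class names, method names, and the placeholder recognition result are assumptions.

class AcquisitionModule:              # module 2100
    def acquire(self, camera):
        return camera.capture()       # obtain the user's limb image (camera interface assumed)

class ProcessingModule:               # module 2200
    def recognize(self, limb_image):
        return "wave_hello"           # body language label; stands in for the CNN output

class ExecutionModule:                # module 2300
    def match(self, body_language):
        # Match visual information with the same meaning as the body language.
        return {"wave_hello": "hello.png"}.get(body_language)

def analyze(camera):
    image = AcquisitionModule().acquire(camera)
    language = ProcessingModule().recognize(image)
    return ExecutionModule().match(language)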
The above implementation recognizes the body language represented by the user's limb image in a picture and matches visual information or audio information having the same meaning as that body language. In this way, the information expressed by the limb features in the image is presented in a form that can be directly interpreted by humans, realizing a deeper interpretation of human limb movements and facilitating communication for users with language impairments or users who do not share a common language.
In some implementations, the analysis system for the user's limb motion further includes a first acquisition sub-module, a first processing sub-module, and a first execution sub-module. The first acquisition sub-module is configured to acquire the user's face image; the first processing sub-module is configured to recognize the human facial motion information represented by the face image; and the first execution sub-module is configured to match an emoticon image having the same action meaning as the human facial motion information.
In some implementations, the acquisition module includes a first acquisition sub-module configured to acquire the user's face image; the processing module includes a first processing sub-module configured to recognize the human facial motion information represented by the face image; and the execution module includes a first execution sub-module configured to match an emoticon image having the same action meaning as the human facial motion information.
In some implementations, the analysis system for the user's limb motion further includes a first calling sub-module and a first display sub-module. The first calling sub-module is configured to call at least one pre-stored emoticon image; the first display sub-module is configured to place the emoticon image in the display container according to the preset script, so that the emoticon image is displayed visually.
In some implementations, the first calling sub-module is configured to call at least one pre-stored emoticon image before the user's face image is acquired; the first display sub-module is configured to place the emoticon image in the display container according to the preset script, so that the emoticon image is displayed visually.
In some implementations, the analysis system for the user's limb motion further includes a first comparison sub-module and a first confirmation sub-module. The first comparison sub-module is configured to compare the human facial motion information with the emoticon images within the display container; the first confirmation sub-module is configured to confirm, when the action meaning represented by an emoticon image in the display container is the same as the human facial motion information, that the display container contains an emoticon image having the same action meaning as the human facial motion.
In some implementations, the analysis system for the user's limb motion further includes a second acquisition sub-module and a second execution sub-module. The second acquisition sub-module is configured to obtain the matching degree information between the human facial motion information and the emoticon image; the second execution sub-module is configured to calculate the bonus score corresponding to the matching degree information according to the preset matching rule.
In some implementations, the second acquisition sub-module is configured to obtain the matching degree information between the human facial motion information and the emoticon image after the emoticon image having the same action meaning as the human facial motion is matched; the second execution sub-module is configured to calculate the bonus score corresponding to the matching degree information according to the preset matching rule.
In some implementations, the analysis system for the user's limb motion further includes a first recording sub-module and a third execution sub-module. The first recording sub-module is configured to record all bonus scores within the preset first time threshold; the third execution sub-module is configured to accumulate the bonus scores to form the user's final score within the first time threshold.
In some implementations, the first recording sub-module is configured to record all bonus scores within the preset first time threshold after the bonus score corresponding to the matching degree information is calculated according to the preset matching rule; the third execution sub-module is configured to accumulate the bonus scores to form the user's final score within the first time threshold.
In some implementations, the analysis system for the user's limb motion further includes a third acquisition sub-module, a second processing sub-module, and a fourth execution sub-module. The third acquisition sub-module is configured to randomly extract, within a preset unit time, a preset number of emoticon images representing human emotions from the emoticon pack and place them in the display container; the second processing sub-module is configured to collect the user's face image at fixed intervals or in real time within the unit time, and to recognize the emotion information represented by the face image and the matching degree between the face image and the emotion information; the fourth execution sub-module is configured to match an emoticon image having the same emotional meaning as the face image, and to confirm the bonus score of the face image according to the matching degree.
In some implementations, the analysis system for the user's limb motion further includes a third acquisition sub-module configured to randomly extract, within a preset unit time, a preset number of emoticon images representing human emotions from the emoticon pack and place them in the display container; the first acquisition sub-module is configured to collect the user's face image at fixed intervals or in real time within the unit time; the first processing sub-module is configured to recognize the emotion information represented by the face image and the matching degree between the face image and the emotion information; and the first execution sub-module is configured to match an emoticon image having the same emotional meaning as the face image and to confirm the bonus score of the face image according to the matching degree.
In some implementations, the analysis system for the user's limb motion further includes a first collection sub-module, a third processing sub-module, and a fifth execution sub-module. The first collection sub-module is configured to collect the user's face image; the third processing sub-module is configured to input the face image into the preset emotion recognition model and obtain the classification result and classification data of the face image; the fifth execution sub-module is configured to determine the emotion information of the face image according to the classification result, and to determine the matching degree between the face image and the emotion information according to the classification data.
In some implementations, the first processing sub-module is configured to input the face image into the preset emotion recognition model and obtain the classification result and classification data of the face image, to determine the emotion information of the face image according to the classification result, and to determine the matching degree between the face image and the emotion information according to the classification data.
The terminal in this embodiment refers to a mobile terminal or a PC, and a mobile terminal is taken as an example for description.
This embodiment further provides a mobile terminal. For details, referring to FIG. 13, FIG. 13 is a schematic diagram of the basic structure of the mobile terminal in this embodiment.
It should be noted that, in this embodiment, the memory 1520 of the mobile terminal stores all the programs used to implement the analysis method of the user's limb motion in this embodiment, and the processor 1580 can call the programs in the memory 1520 to perform all the functions listed for the above analysis method of the user's limb motion. Since the functions implemented by the mobile terminal have been described in detail in the analysis method of the user's limb motion in this embodiment, they are not repeated here.
The embodiment of the present application further provides a mobile terminal. As shown in FIG. 13, for convenience of description, only the parts related to the embodiment of the present application are shown; for specific technical details that are not disclosed, please refer to the method part of the embodiments of the present application. The terminal may be any terminal device including a mobile terminal, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, an in-vehicle computer, and the like; a mobile terminal is taken as an example:
FIG. 13 is a block diagram of a partial structure of a mobile terminal related to the terminal provided by the embodiment of the present application. Referring to FIG. 13, the mobile terminal includes components such as a radio frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (Wi-Fi) module 1570, a processor 1580, and a power supply 1590. Those skilled in the art can understand that the mobile terminal structure shown in FIG. 13 does not constitute a limitation on the mobile terminal, which may include more or fewer components than illustrated, combine certain components, or use a different component arrangement.
The components of the mobile terminal are described in detail below with reference to FIG. 13:
The RF circuit 1510 can be used to receive and send signals during the receiving and sending of information or during a call. In particular, after receiving downlink information from a base station, it passes the information to the processor 1580 for processing; in addition, it sends uplink data to the base station. Generally, the RF circuit 1510 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1510 can also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
The memory 1520 can be used to store software programs and modules, and the processor 1580 executes the various functional applications and data processing of the mobile terminal by running the software programs and modules stored in the memory 1520. The memory 1520 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required by at least one function (such as a voiceprint playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the mobile terminal (such as audio data, a phone book, etc.). In addition, the memory 1520 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
The input unit 1530 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, can collect touch operations by the user on or near it (such as operations by the user with a finger, a stylus, or any other suitable object or accessory on or near the touch panel 1531), and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1531 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 1580, and can receive and execute commands sent by the processor 1580. In addition, the touch panel 1531 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1531, the input unit 1530 may also include other input devices 1532. Specifically, the other input devices 1532 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 1540 can be used to display information input by the user or information provided to the user, as well as the various menus of the mobile terminal. The display unit 1540 may include a display panel 1541. Optionally, the display panel 1541 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 1531 may cover the display panel 1541. When the touch panel 1531 detects a touch operation on or near it, it transmits the operation to the processor 1580 to determine the type of the touch event, and the processor 1580 then provides a corresponding visual output on the display panel 1541 according to the type of the touch event. Although in FIG. 13 the touch panel 1531 and the display panel 1541 are two independent components implementing the input and output functions of the mobile terminal, in some embodiments the touch panel 1531 and the display panel 1541 may be integrated to implement the input and output functions of the mobile terminal.
The mobile terminal may also include at least one sensor 1550, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor can adjust the brightness of the display panel 1541 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1541 and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in all directions (generally three axes), and can detect the magnitude and direction of gravity when stationary; it can be used for applications that recognize the posture of the mobile terminal (such as switching between landscape and portrait orientation, related games, and magnetometer posture calibration) and for vibration-recognition-related functions (such as a pedometer or tapping). Other sensors that can also be configured in the mobile terminal, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here.
The audio circuit 1560, the speaker 1561, and the microphone 1562 can provide an audio interface between the user and the mobile terminal. The audio circuit 1560 can transmit the electrical signal converted from the received audio data to the speaker 1561, which converts it into a voiceprint signal for output; on the other hand, the microphone 1562 converts the collected voiceprint signal into an electrical signal, which is received by the audio circuit 1560 and converted into audio data. The audio data is then output to the processor 1580 for processing and sent via the RF circuit 1510 to, for example, another mobile terminal, or output to the memory 1520 for further processing.
Wi-Fi is a short-range wireless transmission technology. Through the Wi-Fi module 1570, the mobile terminal can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although FIG. 13 shows the Wi-Fi module 1570, it can be understood that it is not an essential part of the mobile terminal and may be omitted as needed without changing the essence of the invention.
The processor 1580 is the control center of the mobile terminal. It connects the various parts of the entire mobile terminal through various interfaces and lines, and performs overall monitoring of the mobile terminal by running or executing the software programs and/or modules stored in the memory 1520, calling the data stored in the memory 1520, and performing the various functions and data processing of the mobile terminal. Optionally, the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1580.
The mobile terminal also includes a power supply 1590 (such as a battery) that supplies power to the various components. Preferably, the power supply can be logically connected to the processor 1580 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
Although not shown, the mobile terminal may further include a camera, a Bluetooth module, and the like, which are not described here.
Corresponding to the above method embodiments, an embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the analysis method of the user's limb motion according to any one of the above embodiments provided by the embodiments of the present application.
The embodiment of the present application recognizes the body language represented by the user's limb image in a picture and matches visual information or audio information having the same meaning as that body language. In this way, the information expressed by the limb features in the image is presented in a form that can be directly interpreted by humans, realizing a deeper interpretation of human limb movements and facilitating communication for users with language impairments or users who do not share a common language.
Corresponding to the above method embodiments, an embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to perform the steps of the analysis method of the user's limb motion according to any one of the above embodiments provided by the embodiments of the present application.
The embodiment of the present application recognizes the body language represented by the user's limb image in a picture and matches visual information or audio information having the same meaning as that body language. In this way, the information expressed by the limb features in the image is presented in a form that can be directly interpreted by humans, realizing a deeper interpretation of human limb movements and facilitating communication for users with language impairments or users who do not share a common language.
It should be noted that the specification of the present application and its accompanying drawings present preferred embodiments of the present application. However, the present application can be implemented in many different forms and is not limited to the embodiments described in this specification. These embodiments are not intended as additional limitations on the content of the present application; they are provided so that the understanding of the disclosure of the present application is more thorough and comprehensive. Moreover, the above technical features may continue to be combined with one another to form various embodiments not enumerated above, all of which are regarded as within the scope described in the specification of the present application. Further, those of ordinary skill in the art can make improvements or changes according to the above description, and all such improvements and changes shall fall within the protection scope of the appended claims of the present application.

Claims (19)

1. A method for analyzing a user's limb movements, characterized by comprising the following steps:
    obtaining a limb image of the user;
    identifying the body language represented by the limb image; and
    matching visual information or audio information having the same meaning as the body language.
2. The method for analyzing a user's limb movements according to claim 1, characterized in that the step of obtaining a limb image of the user comprises:
    obtaining a face image of the user;
    the step of identifying the body language represented by the limb image comprises:
    identifying facial action information represented by the face image; and
    the step of matching visual information or audio information having the same meaning as the body language comprises:
    matching an emoticon image having the same action meaning as the facial action information.
3. The method for analyzing a user's limb movements according to claim 2, characterized in that, before the step of obtaining a face image of the user, the method further comprises the following steps:
    retrieving at least one pre-stored emoticon image; and
    placing the emoticon image in a display container according to a preset script, so that the emoticon image is displayed visually.
4. The method for analyzing a user's limb movements according to claim 3, characterized in that the step of matching an emoticon image having the same action meaning as the facial action specifically comprises the following steps:
    comparing the facial action information with the emoticon images within the display container; and
    when the action meaning represented by an emoticon image in the display container is the same as the facial action information, confirming that the display container contains an emoticon image having the same action meaning as the facial action.
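As an illustrative aside (not part of the claims): the comparison recited in claim 4 could be sketched as a simple lookup over the emoticon images currently in the display container; the Emoticon data class and its action_meaning field are assumptions introduced here for illustration.

```python
# Sketch of the claim-4 comparison under an assumed data model: each emoticon
# in the display container carries an action label, and a match is confirmed
# when some label equals the recognized facial action.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Emoticon:
    image_path: str
    action_meaning: str  # e.g. "smile", "wink", "pout"


def find_matching_emoticon(face_action: str,
                           container: List[Emoticon]) -> Optional[Emoticon]:
    for emoticon in container:
        if emoticon.action_meaning == face_action:
            return emoticon  # the container holds a same-meaning emoticon
    return None              # no match in the current display container


container = [Emoticon("emoji/smile.png", "smile"), Emoticon("emoji/wink.png", "wink")]
print(find_matching_emoticon("smile", container))
```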
5. The method for analyzing a user's limb movements according to claim 3, characterized in that, after the step of matching an emoticon image having the same action meaning as the facial action, the method further comprises the following steps:
    obtaining matching degree information between the facial action information and the emoticon image; and
    calculating a reward score corresponding to the matching degree information according to a preset matching rule.
6. The method for analyzing a user's limb movements according to claim 5, characterized in that, after the step of calculating the reward score corresponding to the matching degree information according to the preset matching rule, the method further comprises the following steps:
    recording all reward scores within a preset first time threshold; and
    accumulating the reward scores to form the user's final score within the first time threshold.
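As an illustrative aside (not part of the claims): the scoring recited in claims 5 and 6 could be sketched as follows, assuming a hypothetical preset matching rule that maps a matching degree in [0, 1] to a 0-100 point reward, and a timer that bounds the first time threshold.

```python
# Sketch under assumed rules: matching degrees are converted to reward scores,
# recorded while the round timer is running, and summed into a final score.
import time


def reward_for(match_degree: float) -> int:
    """Hypothetical preset matching rule: scale the degree to 0-100 points."""
    return round(max(0.0, min(1.0, match_degree)) * 100)


class ScoreKeeper:
    def __init__(self, first_time_threshold_s: float):
        self.deadline = time.monotonic() + first_time_threshold_s
        self.scores = []  # all reward scores recorded within the window

    def record(self, match_degree: float) -> None:
        if time.monotonic() <= self.deadline:
            self.scores.append(reward_for(match_degree))

    def final_score(self) -> int:
        return sum(self.scores)  # accumulate into the final score


keeper = ScoreKeeper(first_time_threshold_s=60.0)
keeper.record(0.92)
keeper.record(0.40)
print(keeper.final_score())
```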
7. The method for analyzing a user's limb movements according to claim 2, characterized in that the method further comprises the following step:
    randomly extracting, within a preset unit time, a preset number of emoticon images representing human emotions from an emoticon pack, and placing the emoticon images in a display container;
    the step of obtaining a face image of the user comprises:
    capturing the user's face image periodically or in real time within the unit time;
    the step of identifying the facial action information represented by the face image comprises:
    identifying emotion information represented by the face image and a matching degree between the face image and the emotion information; and
    the step of matching an emoticon image having the same action meaning as the facial action information comprises:
    matching an emoticon image having the same emotional meaning as the face image, and determining a reward score for the face image according to the matching degree.
8. The method for analyzing a user's limb movements according to claim 7, characterized in that the step of identifying the emotion information represented by the face image and the matching degree between the face image and the emotion information comprises:
    inputting the face image into a preset emotion recognition model, and obtaining a classification result and classification data for the face image; and
    determining the emotion information of the face image according to the classification result, and determining the matching degree between the face image and the emotion information according to the classification data.
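As an illustrative aside (not part of the claims): the model-based step recited in claim 8 could be sketched as follows, assuming the preset emotion recognition model exposes per-class probabilities; taking the top class as the emotion information and its probability as the matching degree is one plausible reading, not the only one.

```python
# Sketch under an assumed model interface: a classifier returns per-emotion
# probabilities; the top class is the emotion information and its probability
# serves as the matching degree.
from typing import Dict, Tuple


def emotion_model(face_image: bytes) -> Dict[str, float]:
    """Stand-in for the preset emotion recognition model (softmax-style output)."""
    return {"happy": 0.81, "neutral": 0.14, "sad": 0.05}  # placeholder scores


def classify_emotion(face_image: bytes) -> Tuple[str, float]:
    scores = emotion_model(face_image)     # classification result and data
    emotion = max(scores, key=scores.get)  # emotion information
    return emotion, scores[emotion]        # matching degree


print(classify_emotion(b"...face image bytes..."))
```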
9. A system for analyzing a user's limb movements, characterized by comprising:
    an acquisition module, configured to obtain a limb image of the user;
    a processing module, configured to identify the body language represented by the limb image; and
    an execution module, configured to match visual information or audio information having the same meaning as the body language.
10. The system for analyzing a user's limb movements according to claim 9, characterized in that the acquisition module comprises a first acquisition sub-module, configured to obtain a face image of the user;
    the processing module comprises a first processing sub-module, configured to identify facial action information represented by the face image; and
    the execution module comprises a first execution sub-module, configured to match an emoticon image having the same action meaning as the facial action information.
11. The system for analyzing a user's limb movements according to claim 10, characterized in that the system further comprises:
    a first calling sub-module, configured to retrieve at least one pre-stored emoticon image; and
    a first display sub-module, configured to place the emoticon image in a display container according to a preset script, so that the emoticon image is displayed visually.
12. The system for analyzing a user's limb movements according to claim 10, characterized in that the system further comprises:
    a first comparison sub-module, configured to compare the facial action information with the emoticon images within the display container; and
    a first confirmation sub-module, configured to confirm, when the action meaning represented by an emoticon image in the display container is the same as the facial action information, that the display container contains an emoticon image having the same action meaning as the facial action.
13. The system for analyzing a user's limb movements according to claim 10, characterized in that the system further comprises:
    a second acquisition sub-module, configured to obtain matching degree information between the facial action information and the emoticon image; and
    a second execution sub-module, configured to calculate a reward score corresponding to the matching degree information according to a preset matching rule.
14. The system for analyzing a user's limb movements according to claim 10, characterized in that the system further comprises:
    a first recording sub-module, configured to record all reward scores within a preset first time threshold; and
    a third execution sub-module, configured to accumulate the reward scores to form the user's final score within the first time threshold.
15. The system for analyzing a user's limb movements according to claim 9, characterized in that the system further comprises:
    a third acquisition sub-module, configured to randomly extract, within a preset unit time, a preset number of emoticon images representing human emotions from an emoticon pack, and to place the emoticon images in a display container;
    a first acquisition sub-module, configured to capture the user's face image periodically or in real time within the unit time;
    a first processing sub-module, configured to identify emotion information represented by the face image and a matching degree between the face image and the emotion information; and
    a first execution sub-module, configured to match an emoticon image having the same emotional meaning as the face image, and to determine a reward score for the face image according to the matching degree.
16. The system for analyzing a user's limb movements according to claim 15, characterized in that the first processing sub-module is configured to:
    input the face image into a preset emotion recognition model, and obtain a classification result and classification data for the face image; and
    determine the emotion information of the face image according to the classification result, and determine the matching degree between the face image and the emotion information according to the classification data.
17. A mobile terminal, characterized by comprising:
    one or more processors;
    a memory; and
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method for analyzing a user's limb movements according to any one of claims 1 to 8.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for analyzing a user's limb movements according to any one of claims 1 to 8.
19. A computer program product, characterized in that, when run on a computer, it causes the computer to perform the method for analyzing a user's limb movements according to any one of claims 1 to 8.
PCT/CN2018/116700 2017-12-28 2018-11-21 Analysis method and system of user limb movement and mobile terminal WO2019128558A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711464338.2A CN108062533A (en) 2017-12-28 2017-12-28 Analytic method, system and the mobile terminal of user's limb action
CN201711464338.2 2017-12-28

Publications (1)

Publication Number Publication Date
WO2019128558A1 true WO2019128558A1 (en) 2019-07-04

Family

ID=62140685

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/116700 WO2019128558A1 (en) 2017-12-28 2018-11-21 Analysis method and system of user limb movement and mobile terminal

Country Status (2)

Country Link
CN (1) CN108062533A (en)
WO (1) WO2019128558A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503457A (en) * 2019-07-11 2019-11-26 平安科技(深圳)有限公司 Analysis method and device, storage medium, the computer equipment of user satisfaction
CN110554782A (en) * 2019-07-25 2019-12-10 北京智慧章鱼科技有限公司 Expression input image synthesis method and system
CN112000828A (en) * 2020-07-20 2020-11-27 北京百度网讯科技有限公司 Method and device for searching emoticons, electronic equipment and readable storage medium
CN112541384A (en) * 2020-07-30 2021-03-23 深圳市商汤科技有限公司 Object searching method and device, electronic equipment and storage medium
CN112613457A (en) * 2020-12-29 2021-04-06 招联消费金融有限公司 Image acquisition mode detection method and device, computer equipment and storage medium
CN113766295A (en) * 2021-04-16 2021-12-07 腾讯科技(深圳)有限公司 Playing processing method, device, equipment and storage medium

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062533A (en) * 2017-12-28 2018-05-22 北京达佳互联信息技术有限公司 Analytic method, system and the mobile terminal of user's limb action
CN109063595B (en) * 2018-07-13 2020-11-03 苏州浪潮智能软件有限公司 Method and device for converting limb movement into computer language
CN109271542A (en) * 2018-09-28 2019-01-25 百度在线网络技术(北京)有限公司 Cover determines method, apparatus, equipment and readable storage medium storing program for executing
CN111079472A (en) * 2018-10-19 2020-04-28 北京微播视界科技有限公司 Image comparison method and device
CN109492602B (en) * 2018-11-21 2020-11-03 华侨大学 Process timing method and system based on human body language
CN113785539A (en) * 2019-04-10 2021-12-10 Oppo广东移动通信有限公司 System and method for dynamically recommending input based on recognition of user emotion
CN111144287B (en) * 2019-12-25 2023-06-09 Oppo广东移动通信有限公司 Audiovisual auxiliary communication method, device and readable storage medium
CN111339833B (en) * 2020-02-03 2022-10-28 重庆特斯联智慧科技股份有限公司 Identity verification method, system and equipment based on face edge calculation
CN112906650B (en) * 2021-03-24 2023-08-15 百度在线网络技术(北京)有限公司 Intelligent processing method, device, equipment and storage medium for teaching video

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101314081A (en) * 2008-07-11 2008-12-03 深圳华为通信技术有限公司 Lecture background matching method and apparatus
CN104333730A (en) * 2014-11-26 2015-02-04 北京奇艺世纪科技有限公司 Video communication method and video communication device
CN105976843A (en) * 2016-05-18 2016-09-28 乐视控股(北京)有限公司 In-vehicle music control method, device, and automobile
CN108062533A (en) * 2017-12-28 2018-05-22 北京达佳互联信息技术有限公司 Analytic method, system and the mobile terminal of user's limb action

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101442861B (en) * 2008-12-19 2013-01-02 上海广茂达光艺科技股份有限公司 Control system and control method for LED lamplight scene
US9037354B2 (en) * 2011-09-09 2015-05-19 Thales Avionics, Inc. Controlling vehicle entertainment systems responsive to sensed passenger gestures
CN104349214A (en) * 2013-08-02 2015-02-11 北京千橡网景科技发展有限公司 Video playing method and device
CN104345873A (en) * 2013-08-06 2015-02-11 北大方正集团有限公司 File operation method and file operation device for network video conference system
CN104464390A (en) * 2013-09-15 2015-03-25 南京大五教育科技有限公司 Body feeling education system
CN104598012B (en) * 2013-10-30 2017-12-05 中国艺术科技研究所 A kind of interactive advertising equipment and its method of work
CN106257489A (en) * 2016-07-12 2016-12-28 乐视控股(北京)有限公司 Expression recognition method and system
CN106502424A (en) * 2016-11-29 2017-03-15 上海小持智能科技有限公司 Based on the interactive augmented reality system of speech gestures and limb action
CN106997457B (en) * 2017-03-09 2020-09-11 Oppo广东移动通信有限公司 Figure limb identification method, figure limb identification device and electronic device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101314081A (en) * 2008-07-11 2008-12-03 深圳华为通信技术有限公司 Lecture background matching method and apparatus
CN104333730A (en) * 2014-11-26 2015-02-04 北京奇艺世纪科技有限公司 Video communication method and video communication device
CN105976843A (en) * 2016-05-18 2016-09-28 乐视控股(北京)有限公司 In-vehicle music control method, device, and automobile
CN108062533A (en) * 2017-12-28 2018-05-22 北京达佳互联信息技术有限公司 Analytic method, system and the mobile terminal of user's limb action

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503457A (en) * 2019-07-11 2019-11-26 平安科技(深圳)有限公司 Analysis method and device, storage medium, the computer equipment of user satisfaction
CN110554782A (en) * 2019-07-25 2019-12-10 北京智慧章鱼科技有限公司 Expression input image synthesis method and system
CN112000828A (en) * 2020-07-20 2020-11-27 北京百度网讯科技有限公司 Method and device for searching emoticons, electronic equipment and readable storage medium
CN112000828B (en) * 2020-07-20 2024-05-24 北京百度网讯科技有限公司 Method, device, electronic equipment and readable storage medium for searching expression picture
CN112541384A (en) * 2020-07-30 2021-03-23 深圳市商汤科技有限公司 Object searching method and device, electronic equipment and storage medium
CN112541384B (en) * 2020-07-30 2023-04-28 深圳市商汤科技有限公司 Suspicious object searching method and device, electronic equipment and storage medium
CN112613457A (en) * 2020-12-29 2021-04-06 招联消费金融有限公司 Image acquisition mode detection method and device, computer equipment and storage medium
CN112613457B (en) * 2020-12-29 2024-04-09 招联消费金融股份有限公司 Image acquisition mode detection method, device, computer equipment and storage medium
CN113766295A (en) * 2021-04-16 2021-12-07 腾讯科技(深圳)有限公司 Playing processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN108062533A (en) 2018-05-22

Similar Documents

Publication Publication Date Title
WO2019128558A1 (en) Analysis method and system of user limb movement and mobile terminal
CN110288077B (en) Method and related device for synthesizing speaking expression based on artificial intelligence
WO2021043053A1 (en) Animation image driving method based on artificial intelligence, and related device
WO2018171223A1 (en) Data processing method and nursing robot device
WO2021036644A1 (en) Voice-driven animation method and apparatus based on artificial intelligence
CN110598046B (en) Artificial intelligence-based identification method and related device for title party
CN108735216B (en) Voice question searching method based on semantic recognition and family education equipment
CN109063583A (en) A kind of learning method and electronic equipment based on read operation
CN110609620A (en) Human-computer interaction method and device based on virtual image and electronic equipment
CN108763552B (en) Family education machine and learning method based on same
CN110298212B (en) Model training method, emotion recognition method, expression display method and related equipment
CN110830362B (en) Content generation method and mobile terminal
CN111723855A (en) Learning knowledge point display method, terminal equipment and storage medium
CN107766403A (en) A kind of photograph album processing method, mobile terminal and computer-readable recording medium
CN109391842B (en) Dubbing method and mobile terminal
CN107870904A (en) A kind of interpretation method, device and the device for translation
CN109167884A (en) A kind of method of servicing and device based on user speech
CN108133708B (en) Voice assistant control method and device and mobile terminal
CN109584897A (en) Vedio noise reduction method, mobile terminal and computer readable storage medium
CN111738100A (en) Mouth shape-based voice recognition method and terminal equipment
CN109686359A (en) Speech output method, terminal and computer readable storage medium
CN114630135A (en) Live broadcast interaction method and device
CN111639209A (en) Book content searching method, terminal device and storage medium
CN116229311B (en) Video processing method, device and storage medium
CN109471664A (en) Intelligent assistant's management method, terminal and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18896894

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18896894

Country of ref document: EP

Kind code of ref document: A1