WO2019233216A1 - Method, apparatus, and device for recognizing a gesture action - Google Patents

Method, apparatus, and device for recognizing a gesture action

Info

Publication number
WO2019233216A1
WO2019233216A1
Authority
WO
WIPO (PCT)
Prior art keywords
finger
gesture
joint
designated
angle
Prior art date
Application number
PCT/CN2019/084630
Other languages
English (en)
French (fr)
Inventor
赵世杰
李峰
左小祥
程君
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Priority to EP19814802.5A priority Critical patent/EP3805982B1/en
Publication of WO2019233216A1 publication Critical patent/WO2019233216A1/zh
Priority to US17/004,735 priority patent/US11366528B2/en

Classifications

    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/12 Simultaneous equations, e.g. systems of linear equations
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V10/764 Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V10/82 Image or video recognition using pattern recognition or machine learning, using neural networks
    • G06V40/107 Static hand or arm
    • G06V40/117 Biometrics derived from hands
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/08 Neural network learning methods
    • H04M1/72454 User interfaces specially adapted for cordless or mobile telephones, adapting functionality according to context-related or environment-related conditions
    • H04M2250/52 Details of telephonic subscriber devices including functional features of a camera

Definitions

  • the present application relates to the field of artificial intelligence technology, and in particular to a method, an apparatus, a device, and a computer-readable storage medium for recognizing a gesture action.
  • Human gesture skeleton recognition is a research task that is widely studied in the field of human-computer interaction.
  • the more mature human gesture skeleton recognition method recognizes gesture pictures based on a convolutional neural network: a gesture picture is input into the convolutional neural network, which outputs the position coordinates of each joint point of the hand in the picture.
  • the existing human gesture skeleton recognition method can only recognize a single gesture picture and therefore can only recognize static gestures; however, in actual scenes in the field of human-computer interaction, gestures are often dynamic.
  • existing gesture recognition technologies cannot yet recognize dynamic, ordered gestures.
  • the embodiments of the present application provide a method, an apparatus, a device, and related products for recognizing a gesture action, so that dynamic gesture actions can be recognized with a high accuracy rate, giving the method broad application prospects.
  • the first aspect of the present application provides a method for recognizing a gesture, the method includes:
  • the server obtains a first gesture picture and a second gesture picture
  • the server recognizes the first gesture picture to obtain a first vector, the first vector being used to represent the angle of a finger joint in the first gesture picture, and recognizes the second gesture picture to obtain a second vector, the second vector being used to represent the angle of a finger joint in the second gesture picture;
  • the server calculates a total amount of a first angle change of a first designated joint in a first designated finger according to the first vector and the second vector, where the first designated finger refers to a finger whose joint angle needs to change when a designated gesture action is performed, and the first designated joint refers to a finger joint in the first designated finger whose angle needs to change when the designated gesture action is performed;
  • the server obtains a recognition result of a gesture action according to the total amount of the first angle change and a first preset threshold.
  • the method further includes:
  • the server calculates a total amount of a second angle change of a second designated joint in a second designated finger according to the first vector and the second vector, where the second designated finger refers to a finger whose joint angle does not need to change when the designated gesture action is performed, and the second designated joint refers to a finger joint in the second designated finger;
  • obtaining, by the server, the recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold includes:
  • the server obtains a recognition result of a gesture action according to the total amount of the first angle change and a first preset threshold, and the total amount of the second angle change and a second preset threshold.
  • the method further includes:
  • obtaining, by the server, the recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold includes:
  • the server obtains a recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold, and a change amount of a first determination coefficient and a third preset threshold.
  • the method further includes:
  • obtaining, by the server, the recognition result of the gesture action according to the total amount of change in the first angle and a first preset threshold includes:
  • the server obtains the recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, and a change amount of a second determination coefficient and a fourth preset threshold.
  • the third designated finger refers to a finger whose finger joint points have a linear relationship when performing the designated gesture action
  • the server obtaining the recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, and the change amount of the second determination coefficient and the fourth preset threshold includes:
  • the server obtains the recognition result of the gesture action according to the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, the change amount of the second determination coefficient and the fourth preset threshold, and a third linear regression determination coefficient corresponding to the third designated finger and a fifth preset threshold.
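The linear regression determination coefficient mentioned above can be read as the R² of a line fitted to a finger's joint points: roughly collinear joint points (a straightened finger) give an R² close to 1. A hedged sketch under that reading, with illustrative joint coordinates not taken from the application:

```python
# Hypothetical sketch: measuring how straight a finger is via the
# coefficient of determination (R^2) of a least-squares line fitted to
# its 2D joint points. The coordinates below are illustrative assumptions.

def determination_coefficient(points):
    """R^2 of a least-squares line y = a*x + b fitted to 2D joint points."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:  # all x equal: a vertical, perfectly collinear finger
        return 1.0
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        return 1.0
    return 1.0 - ss_res / ss_tot

# Collinear joint points give R^2 = 1 (a straightened finger);
# a bent finger's joint points give a smaller value.
straight = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
bent = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.2), (3.0, 0.5)]
print(determination_coefficient(straight))  # 1.0
print(determination_coefficient(bent))
```

Comparing how this coefficient changes between the two gesture pictures against a preset threshold is one way to implement the check the claim describes.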
  • the server calculates the total angle change corresponding to the designated joints in the designated finger in the following manner:
  • a difference vector is calculated from the first vector and the second vector; the angle change amounts corresponding to the designated joints of the designated finger are obtained from the difference vector, and the sum of these angle change amounts is calculated to obtain the total angle change corresponding to the designated joints of the designated finger.
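A minimal sketch of this total-angle-change computation. The per-joint angle vectors and the index set of the designated joints are illustrative assumptions (the application does not fix a joint ordering), and each change amount is taken as an absolute value:

```python
# Hypothetical sketch of the total-angle-change computation described above.
# Joint ordering, angle values, and the use of absolute differences are
# illustrative assumptions, not specifics from the application.

def total_angle_change(first_vector, second_vector, joint_indices):
    """Sum the per-joint angle changes (taken from the difference vector)
    over the designated joints of the designated finger."""
    diff = [b - a for a, b in zip(first_vector, second_vector)]
    return sum(abs(diff[i]) for i in joint_indices)

# Angles in degrees for 5 joints; assume joints 0-2 are the first
# designated joints (e.g. the three joints of the middle finger).
first = [170.0, 165.0, 160.0, 175.0, 178.0]
second = [120.0, 110.0, 100.0, 174.0, 178.0]
print(total_angle_change(first, second, [0, 1, 2]))  # 165.0
```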
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger.
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger
  • the second designated finger includes a thumb, an index finger, a ring finger, and a little finger.
  • the method further includes:
  • the server displays an animation effect corresponding to the designated gesture action on an interface.
  • the server recognizes a gesture picture to obtain a corresponding vector in the following manner:
  • a coordinate set is obtained according to the gesture picture and the convolutional neural network model, and the coordinate set includes position coordinates of each joint point of the hand in the recognized gesture picture;
  • a vector corresponding to the recognized gesture picture is generated according to the angle, and the vector is used to represent an angle of a finger joint in the recognized gesture picture.
  • the calculating, by the server, the angle corresponding to the finger joint according to the position coordinates of each joint point in the coordinate set includes:
  • the server uses an inverse cosine function and the two vectors to calculate an angle corresponding to a finger joint.
  • the second aspect of the present application provides a gesture recognition device, the device includes:
  • An acquisition module configured to acquire a first gesture picture and a second gesture picture
  • a recognition module configured to recognize the first gesture picture to obtain a first vector, the first vector used to represent the angle of a finger joint in the first gesture picture, and to recognize the second gesture picture to obtain a second vector, the second vector used to represent the angle of a finger joint in the second gesture picture;
  • a calculation module configured to calculate a total amount of a first angle change of a first designated joint in a first designated finger according to the first vector and the second vector, where the first designated finger refers to a finger whose joint angle needs to change when a designated gesture action is performed, and the first designated joint refers to a finger joint in the first designated finger whose angle needs to change when the designated gesture action is performed;
  • a determining module configured to obtain a recognition result of the gesture action according to the total amount of the first angle change and a first preset threshold.
  • the calculation module is further configured to:
  • the determining module is configured to:
  • the calculation module is further configured to:
  • the determining module is configured to:
  • the calculation module is further configured to:
  • the determining module is configured to:
  • the determining module is further configured to:
  • the third designated finger refers to a finger whose finger joint points have a linear relationship when the designated gesture action is performed; the recognition result of the gesture action is obtained according to the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, the change amount of the second determination coefficient and the fourth preset threshold, and the third linear regression determination coefficient corresponding to the third designated finger and the fifth preset threshold.
  • the calculation module is configured to:
  • calculate a difference vector from the first vector and the second vector, obtain from the difference vector the angle change amounts corresponding to the designated joints of the designated finger, and calculate the sum of these angle change amounts to obtain the total angle change corresponding to the designated joints of the designated finger.
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger.
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger
  • the second designated finger includes a thumb, an index finger, a ring finger, and a little finger.
  • the device further includes a display module configured to:
  • an animation effect corresponding to the designated gesture action is displayed on an interface.
  • the recognition module is configured to:
  • a coordinate set is obtained according to the gesture picture and the convolutional neural network model, and the coordinate set includes position coordinates of each joint point of the hand in the recognized gesture picture;
  • a vector corresponding to the recognized gesture picture is generated according to the angle, and the vector is used to represent an angle of a finger joint in the recognized gesture picture.
  • the recognition module is further configured to:
  • use the inverse cosine function and the two vectors to calculate the angle corresponding to the finger joint.
  • a third aspect of the present application provides a gesture recognition device, where the device includes a processor and a memory:
  • the memory is configured to store program code, and transmit the program code to the processor
  • the processor is configured to execute the steps of the method for recognizing a gesture action according to the first aspect according to the instructions in the program code.
  • a fourth aspect of the present application provides a computer-readable storage medium, the computer-readable storage medium being configured to store program code, the program code being configured to execute the method for recognizing a gesture action according to the first aspect.
  • a fifth aspect of the present application provides a computer program product including instructions, which when run on a computer, causes the computer to execute the method for recognizing a gesture action according to the first aspect.
  • a method for recognizing a gesture action obtains a first vector and a second vector representing the angles of finger joints by recognizing a first gesture picture and a second gesture picture, and then uses a mathematical model to calculate the total amount of the first angle change according to the first vector and the second vector; the total amount of the first angle change is the total angle change corresponding to the first designated joint in the first designated finger, where the first designated finger is the finger whose joint angle needs to change when the designated gesture action is performed.
  • the first designated joint is the finger joint in the first designated finger whose angle needs to change when the designated gesture action is performed.
  • the method can recognize dynamic gesture actions based on the angle changes of specific joints of specific fingers in the two gesture pictures.
  • the method for recognizing gesture actions provided in this application therefore has broad application prospects in the field of artificial intelligence.
  • FIG. 1 is a schematic diagram of a gesture recognition method according to an embodiment of the present application
  • FIG. 2 is a flowchart of a method for recognizing a gesture in an embodiment of the present application
  • FIG. 3 is a schematic diagram of identifying a joint point in a gesture picture according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of calculating an angle of a finger joint in an embodiment of the present application.
  • FIG. 5 is a flowchart of a method for recognizing a gesture in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a gesture recognition method applied to a live broadcast scenario according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a server in an embodiment of the present application.
  • the embodiment of the present application is based on the kinematic principle that, when a user performs a designated gesture action, the angle of a specific joint of a specific finger must change, and proposes a method for recognizing gesture actions based on the angle changes of finger joints, so as to realize recognition of dynamic, ordered gestures.
  • the method obtains a first vector and a second vector representing the angles of finger joints by recognizing a first gesture picture and a second gesture picture, and then uses a mathematical model to calculate, according to the first vector and the second vector, the total amount of the first angle change of the first designated joint in the first designated finger.
  • here, the first designated finger is the finger whose joint angle needs to change when the designated gesture action is performed, and the first designated joint is the finger joint in the first designated finger whose angle needs to change.
  • based on the total amount of the first angle change, it can be determined whether the user performed the designated gesture action, thereby obtaining a recognition result of the gesture action.
  • the method can recognize dynamic gesture actions based on the angle changes of specific joints of specific fingers in the two gesture pictures.
  • the method for recognizing gesture actions provided in this application therefore has broad application prospects in the field of artificial intelligence.
  • the processing device may be a terminal device such as a smart phone, a tablet computer, a personal computer (PC), a minicomputer, or a mainframe, or a server with graphics processing capabilities.
  • the processing device can be an independent processing device or a cluster formed by multiple processing devices. For example, when the amount of data to be processed is large, the above gesture recognition method may be performed by a cluster composed of multiple servers.
  • the foregoing gesture recognition method provided by the embodiments of the present application may be completed by the terminal and the server.
  • the terminal may collect a gesture picture
  • the server obtains the gesture picture from the terminal and recognizes the gesture picture to determine the gesture action performed by the user.
  • FIG. 1 is a schematic diagram of a gesture recognition method according to an embodiment of the present application.
  • the application scenario includes a terminal 110, a server 120, and a terminal 130.
  • the server 120 provides network data transmission services for the terminal.
  • the server 120 establishes network connections with the terminal 110 of user A and the terminal 130 of user B, respectively, to provide user A and user B with a video communication service.
  • the terminal device is illustrated using a laptop as an example, which does not constitute a limitation on the technical solution of the present application.
  • user B can use sign language instead of voice to communicate with user A, where sign language is a language way to exchange thoughts through dynamic gestures.
  • the terminal 130 of user B can capture the corresponding gesture picture in real time, which can be the first gesture picture and the second gesture picture.
  • the server 120 can acquire the first gesture picture and the second gesture picture from the terminal 130, recognize the first gesture picture in real time to obtain a first vector, and recognize the second gesture picture to obtain a second vector, where the first vector represents the angles of the finger joints in the first gesture picture and the second vector represents the angles of the finger joints in the second gesture picture. The server 120 then calculates the total amount of the first angle change of the first designated joint in the first designated finger according to the first vector and the second vector, and obtains the recognition result of the gesture action in real time according to the total amount of the first angle change and a first preset threshold. For example, if the total amount of the first angle change is greater than the first preset threshold, it is determined that the user performed the designated gesture action, and the designated gesture action is the recognition result.
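This threshold decision can be sketched in a few lines. The 90-degree threshold and the gesture name are illustrative assumptions; the application leaves the preset threshold unspecified:

```python
# Hedged sketch of the decision described above: if the total first-angle
# change exceeds the first preset threshold, the designated gesture is taken
# as the recognition result. The threshold value and gesture name below are
# illustrative assumptions.

def recognize(total_first_angle_change, first_threshold, gesture_name):
    """Return the recognized gesture, or None if the change is too small."""
    if total_first_angle_change > first_threshold:
        return gesture_name
    return None

print(recognize(165.0, 90.0, "finger-flick"))  # finger-flick
print(recognize(20.0, 90.0, "finger-flick"))   # None
```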
  • the server 120 determines that user B performed an "OK" gesture. In order for user A to understand the meaning of the gesture performed by user B, the server 120 sends the meaning corresponding to the recognized gesture to user A as the recognition result.
  • the terminal 110 displays the recognition result on a display interface of the terminal 110 so that the user A can view the result in real time.
  • in the above manner, the first vector representing the angles of the finger joints in the first gesture picture and the second vector representing the angles of the finger joints in the second gesture picture are obtained.
  • according to the first vector and the second vector, the total amount of the first angle change of the first designated joint in the first designated finger can be determined, and the movement trend of the first designated finger can be judged from the angle change, so as to determine whether the user performed the designated gesture action; in this way, real-time recognition of gesture actions is realized.
  • the gesture action recognition method provided by this application has broad application prospects, especially in scenarios that require high real-time performance, such as real-time video communication and live broadcast scenarios.
  • the above scenario is only an optional example of applying the gesture action recognition method of the present application; the method provided in the embodiments of the present application can also be applied to other application scenarios, and the above example does not constitute a limitation on the technical solution of the present application.
  • FIG. 2 is a flowchart of a gesture recognition method according to an embodiment of the present application. Referring to FIG. 2, the method includes:
  • S201 The server obtains a first gesture picture and a second gesture picture.
  • Gesture pictures are pictures that contain gestures. By recognizing the gesture of the hand in the gesture picture, gesture recognition can be realized.
  • a dynamic gesture that is, a gesture action
  • at least two gesture pictures having a temporal relationship need to be acquired, including a first gesture picture and a second gesture picture.
  • two frames containing a gesture may be obtained from the video as the first gesture picture and the second gesture picture.
  • the video may be identified frame by frame, the image containing the hand is marked, and two consecutive frames are selected from the marked image frames as the first gesture picture and the second gesture picture.
  • the first gesture picture and the second gesture picture may also be discontinuous.
  • two frame images with an interval of one frame may be selected from the marked image frames as the first gesture picture and the second gesture picture. It should be noted that it takes a certain time to implement a gesture action. In order to recognize the gesture action more accurately and faster, two gesture pictures with a certain interval can be selected as the first gesture picture and the second gesture picture, respectively. For example, if it takes 1 s to perform a gesture action, two time-related gesture pictures can be extracted from the video at a time interval of a preset time of 1 s as the first gesture picture and the second gesture picture, respectively.
  • the hand may be taken as a photographic object, the hand may be photographed at different times, and the photos containing the hand at different times may be taken as the first gesture picture and the second gesture picture.
  • the continuous shooting function can be used to shoot the hand in the same direction, orientation and shooting angle to generate multiple pictures, and two of the multiple pictures are selected as the first gesture picture and the second gesture picture.
  • two gesture pictures taken in a time period may be selected as the first gesture picture and the second gesture picture according to a preset time interval.
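The frame-selection strategy above (picking two time-related gesture pictures separated by a preset interval from the frames already marked as containing a hand) can be sketched as follows. Frames are stood in for by their indices; the frame rate, interval, and marked indices are illustrative assumptions:

```python
# Minimal sketch of selecting two gesture pictures at a preset interval,
# as described above. Frame indices, fps, and the 1 s interval are
# illustrative assumptions, not values fixed by the application.

def pick_frame_pair(marked_frames, fps, interval_s):
    """From frame indices already marked as containing a hand, return the
    first pair separated by at least `interval_s` seconds, or None."""
    gap = int(fps * interval_s)
    for i, first in enumerate(marked_frames):
        for second in marked_frames[i + 1:]:
            if second - first >= gap:
                return first, second
    return None

# At 30 fps with a 1 s interval, frames 3 and 40 are the first valid pair.
marked = [3, 10, 18, 40, 70]
print(pick_frame_pair(marked, fps=30, interval_s=1.0))  # (3, 40)
```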
  • S202 The server recognizes the first gesture picture to obtain a first vector, and the first vector is used to represent an angle of a finger joint in the first gesture picture, and recognizes the second gesture picture to obtain a second vector.
  • the second vector is used to represent the angle of a finger joint in the second gesture picture.
  • the processing device may first recognize the joint angles in the gesture pictures, so as to determine the angle change of the finger joints according to the angles of the finger joints in the two gesture pictures.
  • the first gesture picture may be recognized to obtain a first vector representing the angles of the finger joints in the first gesture picture, and the second gesture picture may be recognized to obtain a second vector representing the angles of the finger joints in the second gesture picture.
  • the angle of the finger joint can be identified through deep learning to identify joint points, and then calculated based on the position coordinates of each joint point.
  • a coordinate set is obtained by recognizing the gesture picture with the convolutional neural network model; the coordinate set includes the position coordinates of each joint point of the hand in the recognized gesture picture, and the angles of the finger joints are calculated according to the position coordinates of each joint point in the coordinate set.
  • Figure 3 is a schematic diagram of each joint point determined according to a convolutional neural network model.
  • the thumb has three joint points,
  • the index finger, middle finger, ring finger, and little finger each have four joint points,
  • and the palm has one joint point.
  • connecting the joint points in order forms each finger.
  • the angle corresponding to the finger joint can be calculated according to the position coordinates of each joint point in the first gesture picture, and the angle corresponding to each finger joint is represented in a vector form as the first vector.
  • the process of recognizing the second gesture picture to obtain the second vector is similar to the first gesture picture, and is not repeated here.
  • the convolutional neural network model in this embodiment is a neural network model that uses gesture pictures as input and the position coordinates of joint points as output.
  • the neural network model can be obtained by deep learning.
  • a large number of gesture pictures can be obtained, the position coordinates of the joint points in the gesture pictures can be labeled, and training samples can be obtained.
  • the training samples are used to train an initial convolutional neural network model to obtain the convolutional neural network model that recognizes gesture pictures.
  • Fig. 4 is a schematic diagram for calculating the angle of a finger joint. Referring to Fig. 4, each finger of the hand has joint points. For any joint point that connects two finger knuckles, the angle between the two vectors corresponding to the knuckles can be calculated, and this angle is used as the angle of the finger joint.
  • the vectors corresponding to the two knuckles connected at the joint point can be expressed by v1 and v2 respectively, and the angle θ of the finger joint can be calculated by the following formula:
  • θ = acos( (v1 · v2) / (|v1| |v2|) )    (1)
  • where v1 · v2 represents the inner product of the vectors,
  • |v1| and |v2| represent the modulus values of the vectors v1 and v2 respectively, and
  • acos is the inverse cosine function.
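Formula (1) can be sketched in Python as follows (a minimal illustration; the function name and the use of plain coordinate tuples are assumptions, not from the document):

```python
import math

def joint_angle(v1, v2):
    """Angle (radians) of a finger joint per formula (1):
    theta = acos(v1 . v2 / (|v1| |v2|))."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    # clamp to [-1, 1] so floating-point round-off cannot push
    # the argument outside the acos domain; result stays in [0, pi]
    c = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.acos(c)
```

Clamping also matches the later remark that the acos output is controlled between 0 and π for real human joints.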
  • the recognition order of the first gesture picture and the second gesture picture may be arbitrary: the two pictures may be recognized at the same time, or one after the other in a set sequence.
  • S203 The server calculates the total amount of first angle change of the first specified joint in the first specified finger according to the first vector and the second vector.
  • the first designated finger refers to a finger whose joint angle needs to be changed when a designated gesture is performed.
  • the first designated joint refers to a finger joint that needs to change an angle in the first designated finger when the designated gesture is performed. That is, if the user wants to perform a designated gesture, he must move the first designated finger, so that the first designated joint in the first designated finger undergoes a certain change.
  • One implementation manner is that, if the user's behavior habit is to start the "praise" gesture from the fist state, then when the gesture action is performed, the thumb changes from the bent state to the straight state.
  • In this case the thumb is the finger whose joint angle needs to change when the "praise" gesture is performed, so the thumb is determined as the first designated finger. Further, since the angles of the two joints on the thumb need to change, the two joints of the thumb are determined as the first designated joints.
  • Another implementation is that, if the user's behavior habit is to start the "praise" gesture from the extended state, the thumb does not need to change when the gesture is performed, while the index finger, middle finger, ring finger, and little finger change from the straight state to the bent state.
  • In this case the index finger, the middle finger, the ring finger, and the little finger are the fingers whose joint angles need to change when performing the "praise" gesture.
  • Accordingly, the index finger, the middle finger, the ring finger, and the little finger can be determined as the first designated fingers.
  • The joints in the index finger, middle finger, ring finger, and little finger that need to change when the "praise" gesture is performed may be determined as the first designated joints.
  • One implementation method is that, when the "Yeah" gesture is performed, the index finger and the middle finger change from the straight state to the bent state. Since the index finger and the middle finger are the fingers whose joint angles need to change when the "Yeah" gesture is performed, they are determined as the first designated fingers. Further, the angles of the three joints of the index finger and the three joints of the middle finger need to change, so the three joints of the index finger are determined as the first designated joints corresponding to the index finger, and the three joints of the middle finger are determined as the first designated joints corresponding to the middle finger.
  • Another implementation is that, when the user starts the "Yeah" gesture from the extended state, the index finger and the middle finger do not need to change, and the thumb, ring finger, and little finger change from the straight state to the bent state.
  • in this case, the thumb, ring finger, and little finger are determined as the first designated fingers.
  • the joints that need to change among the thumb, ring finger, and little finger can be determined as the corresponding first designated joints.
  • the user can also start the "Yeah" gesture from a state in which the index finger is already extended; in that case the middle finger is the finger whose joint angle needs to change when the "Yeah" gesture is performed, the middle finger can be determined as the first designated finger, and the three joints of the middle finger are determined as the first designated joints. Because the first vector includes the angle of each joint of each finger, and the second vector includes the angle of each joint of each finger of the same hand, the total amount of the first angle change of the first designated joints in the first designated finger can be calculated from the first vector and the second vector.
  • as a possible implementation, a difference vector may be calculated from the first vector and the second vector, and the angle change amounts corresponding to the first designated joints in the first designated finger may be taken from the difference vector;
  • the sum of these angle change amounts gives the total amount of the first angle change corresponding to the first designated joints in the first designated finger.
  • the number of the first designated fingers may be one or multiple. When the number of the first designated fingers is one, the total amount of the first angle change of the first designated joint in the finger may be calculated. When the number of first designated fingers is multiple, for each designated finger, the total amount of first angle change of the first designated joint in the first designated finger is calculated separately.
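The difference-vector computation described above can be sketched as follows (names are hypothetical, and the convention that designated joints are addressed by their index positions in the angle vector is an assumption):

```python
def total_angle_change(first_vector, second_vector, designated_joints):
    """Total angle change of the designated joints: take the difference
    vector of the two joint-angle vectors, pick out the components of
    the designated joints, and sum them."""
    difference = [b - a for a, b in zip(first_vector, second_vector)]
    return sum(difference[i] for i in designated_joints)
```

When there are several designated fingers, the same function is simply called once per finger with that finger's joint indices.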
  • the angles of the joints of other fingers may also change when the designated gesture is performed.
  • such changes bring some interference to the recognition of the designated gesture.
  • for example, suppose the first designated fingers of one gesture action A are the index finger and the middle finger,
  • while the first designated finger of another gesture action B is the middle finger only. When recognizing the designated gesture A,
  • the total angle change of the joints of every first designated finger is calculated, which can avoid misrecognizing gesture B as gesture A and improves the accuracy of gesture recognition.
  • the finger that does not need to change the angle of the joint when the specified gesture is performed may be recorded as the second specified finger, and the finger joint in the second specified finger may be recorded as the second specified joint.
  • the total amount of second angle change of the second designated joint in the second designated finger is calculated. Similar to the first designated finger, the number of the second designated finger may be one or multiple. When the number of the second designated fingers is plural, it is necessary to separately calculate the total amount of the second angle change of the second designated joint corresponding to the second designated finger for each second designated finger.
  • the calculation method of the total amount of the second angle change is similar to that of the total amount of the first angle change:
  • a difference vector is calculated, the angle change amounts corresponding to the designated joints in the designated finger are taken from the difference vector, and their sum gives the total angle change corresponding to the designated joints in that finger.
  • for the first designated finger, the calculated total angle change amount is the first angle change total amount;
  • for the second designated finger, the calculated total angle change is the second angle change total amount.
  • S204 The server obtains a recognition result of the gesture action according to the total amount of the first angle change and a first preset threshold.
  • the first preset threshold is a standard value used to measure the total amount of the first angle change of the first designated joints in the first designated finger. If the total amount of change in the first angle is greater than this standard value, it indicates that the total amount of change in the first angle is large and that the change in the first designated joints of the first designated finger has reached the degree to which those joints need to change. Based on this, it can be determined that the user has performed the specified gesture action, and the specified gesture action can be used as the recognition result of the gesture action.
  • a judgment condition for determining whether to implement a specified gesture action may be set according to actual business conditions.
  • the recognition result of the gesture action may be obtained according to the total amount of the first angle change and the first preset threshold together with the total amount of the second angle change and the second preset threshold.
  • that is, the total amount of change in the second angle may also be examined to determine whether the user has performed the specified gesture action. In some possible implementations of this embodiment, if the total amount of change in the first angle is greater than the first preset threshold and the total amount of change in the second angle is less than the second preset threshold, it is determined that the user performs the specified gesture action.
  • the second preset threshold is a standard value for measuring the total amount of the second angle change of the second designated joints in the second designated finger. If the total amount of the second angle change is less than this standard value, the total amount of the second angle change is small, and the second designated joints in the second designated finger may be regarded as not moving. Since the change of the first designated joints in the first designated finger matches the change expected when the designated gesture is performed, and the change of the second designated joints in the second designated finger is small enough to be regarded as no change, it can be determined that the user performs the specified gesture action.
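The dual-threshold judgment can be sketched as a simple predicate (function name and example values are hypothetical; the document leaves the thresholds to experience values):

```python
def specified_gesture_performed(first_total, second_total,
                                first_threshold, second_threshold):
    """The designated joints must change a lot (first condition) while the
    joints of the remaining fingers stay nearly still (second condition)."""
    return first_total > first_threshold and second_total < second_threshold
```

The second condition is what prevents a gesture that moves extra fingers from being mistaken for the designated gesture.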
  • an animation effect corresponding to the specified gesture action may also be displayed on the interface, thereby enhancing the interactive experience. For example, after the user performs a "call" gesture, a phone animation corresponding to the "call” can be displayed on the interface. Alternatively, a corresponding sound effect may be configured for the specified gesture action.
  • the embodiment of the present application provides a method for recognizing a gesture.
  • the method obtains a first vector and a second vector that represent the angles of the finger joints by recognizing a first gesture picture and a second gesture picture, and then, using a mathematical
  • model, calculates a total amount of first angle change from the first vector and the second vector; the total amount of first angle change is the total angle change corresponding to the first designated joints in the first designated finger, where the first designated finger
  • is the finger whose joint angles need to change when the specified gesture is performed,
  • and the first designated joints are the finger joints in the first designated finger whose angles need to change when the specified gesture is performed.
  • finally, comparing the total amount of the first angle change with the first preset threshold yields the recognition result of the gesture action. Because when a user performs a specified gesture there must be an angle change at specific joints of specific fingers, the method can recognize a dynamic gesture based on the angle changes of those joints across the two gesture pictures.
  • the method for recognizing gestures provided in this application therefore has broad application prospects in the field of artificial intelligence.
  • recognition of a specified gesture action is realized by changing the angle of a specified joint in a specified finger.
  • when a user performs a gesture, there is not only a change in the angles of the finger joints, but also a change in the linear relationship of the fingers.
  • for some gestures, the linear relationship of a finger changes greatly.
  • the linear relationship of the designated finger can be characterized by a determination coefficient obtained by linear regression of the designated joint point on the designated finger, and by calculating the amount of change of the determination coefficient corresponding to the designated finger, it can be further determined whether the user performs the designated gesture action.
  • FIG. 5 is a flowchart of a gesture recognition method according to this embodiment. Referring to FIG. 5, the method includes:
  • S501 The server obtains a first gesture picture and a second gesture picture.
  • S502 The server recognizes the first gesture picture to obtain a first vector, and recognizes the second gesture picture to obtain a second vector.
  • S503 The server calculates the total amount of first angle change of the first designated joint in the first designated finger according to the first vector and the second vector.
  • the server calculates a first linear regression determination coefficient corresponding to the first designated finger according to the first vector, and calculates a second linear regression determination coefficient corresponding to the first designated finger according to the second vector.
  • the joint points of each finger can be regressed to obtain a corresponding regression equation.
  • when the position distribution of a finger's joint points changes, the goodness of fit of the regression equation corresponding to that finger will also change greatly.
  • a decision coefficient can be used to determine the goodness of fit of the regression equation.
  • the residual sum of squares can also be used to reflect the goodness of fit; however, it depends on the absolute size of the observed values. The determination coefficient is a relative measure built on the residual sum of squares and is less affected by the magnitude of the absolute values, so it reflects the goodness of fit more accurately.
  • the average value ȳ of the joint-point coordinates in the y direction can be calculated as ȳ = (1/n) Σ y_i. According to the coordinates y_i in the y direction and the average value ȳ,
  • the total sum of squares SS_tot can be calculated: SS_tot = Σ_{i=1..n} (y_i − ȳ)²,
  • where n is the number of joint points included in the finger. Similarly, the residual sum of squares is SS_res = Σ_{i=1..n} (y_i − f_i)², where f_i is the value predicted by the regression for the i-th joint point.
  • the determination coefficient R² can then be calculated from the total sum of squares and the residual sum of squares: R² = 1 − SS_res / SS_tot.
  • according to the first vector, the coordinates of each joint point of the first designated finger in the y direction can be determined; based on these coordinates and the predicted coordinates obtained by the regression, a determination coefficient can be calculated, and this determination coefficient is the first linear regression determination coefficient. Similarly, according to the second vector, the coordinates of each joint point of the first designated finger in the y direction can be determined, and the determination coefficient calculated from these coordinates and the predicted coordinates obtained by the regression is the second linear regression determination coefficient. It should be noted that the order of calculating the first linear regression determination coefficient and the second linear regression determination coefficient does not affect the implementation of the embodiment of the present application, and the order may be set according to requirements.
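The determination-coefficient computation of S504 can be sketched as a plain least-squares line fit over one finger's joint points (names are hypothetical, and regressing y on x is an assumption consistent with the y-direction coordinates mentioned above):

```python
def determination_coefficient(points):
    """R^2 of a least-squares line fitted to a finger's joint points
    ((x, y) pairs): R^2 = 1 - SS_res / SS_tot."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # least-squares slope and intercept of y = slope * x + intercept
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # residual and total sums of squares
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

For a straightened finger the joint points are nearly collinear and R² approaches 1; for a bent finger the fit degrades and R² drops, which is exactly the change the method measures.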
  • S505 The server calculates a change amount of the first determination coefficient corresponding to the first designated finger according to the first linear regression determination coefficient and the second linear regression determination coefficient.
  • the first linear regression determination coefficient and the second linear regression determination coefficient can be regarded as the determination coefficients of the joint points of the first designated finger at different times. From these two coefficients, the first determination coefficient change amount corresponding to the first designated finger can be calculated. As a possible implementation, the difference between the first linear regression determination coefficient and the second linear regression determination coefficient can be obtained, and the first determination coefficient change amount can be derived from this difference; for example, the absolute value of the difference can be used as the first determination coefficient change amount. In other possible implementations of this embodiment, other methods, such as taking a quotient, may also be adopted to calculate the first determination coefficient change amount.
  • S505 is executed after S504, while the execution order of S503 relative to S504 and S505 can be arbitrary. In some possible implementations, S503 can be executed in parallel with S504 and S505; in other possible implementations, the steps may be executed sequentially in a set order, such as executing S504 and S505 first and then S503.
  • S506 The server obtains the recognition result of the gesture action according to the total amount of change in the first angle and the first preset threshold, and the amount of change in the first determination coefficient and the third preset threshold.
  • the third preset threshold is a standard value for measuring the magnitude of the change amount of the first determination coefficient. According to the relationship between the total amount of change in the first angle and the first preset threshold, and the relationship between the change amount of the first determination coefficient and the third preset threshold, the recognition result of the gesture action can be obtained.
  • if the change amount of the first determination coefficient is greater than the third preset threshold, the change amount of the first determination coefficient is large, and the position distribution of the joint points in the first designated finger has changed significantly. Based on this dual judgment of joint angle and joint-point position distribution, when the total amount of change in the first angle is greater than the first preset threshold and the change amount of the first determination coefficient is greater than the third preset threshold, it can be determined that the user performs the specified gesture action, and the specified gesture action can be used as the recognition result of the gesture action.
  • similarly, determination coefficients for the second designated finger may be calculated to characterize the position distribution of its joint points, and whether the specified gesture action is performed may be further determined according to the change in that position distribution.
  • that is, the recognition result of the gesture action is obtained based on the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, the first determination coefficient change amount and the third preset threshold, and the second
  • determination coefficient change amount and the fourth preset threshold.
  • a third linear regression determination coefficient corresponding to the second designated finger may also be calculated according to the first vector
  • a fourth linear regression determination coefficient corresponding to the second designated finger may be calculated according to the second vector.
  • if the total amount of change in the first angle is greater than the first preset threshold, the total amount of change in the second angle is less than the second preset threshold, the change amount of the first determination coefficient is greater than the third preset threshold,
  • and the change amount of the second determination coefficient is less than the fourth preset threshold, it is determined that the user performs the specified gesture action.
  • the fourth preset threshold is a standard value used to measure the change amount of the second determination coefficient. If the change amount of the second determination coefficient is less than the fourth preset threshold, the change amount of the second determination coefficient corresponding to the second designated finger is small, and the position distribution of the joint points of the second designated finger changes little, that is, the finger did not produce a large movement.
  • when the second determination coefficient change amount of every second designated finger is less than the fourth preset threshold, it is determined that the fingers other than the first designated finger did not produce a large movement; if, in addition, the total amount of the first angle change is greater than the first preset threshold, indicating that the first designated finger did produce a large movement, it can be determined that the user performs the specified gesture action.
  • the judgment on the second determination coefficient change amounts of the second designated fingers can be implemented in multiple ways.
  • one implementation is to compare the second determination coefficient change amount of each second designated finger with the fourth preset threshold separately.
  • another implementation is to compare the second determination coefficient change amounts of the second designated fingers with one another, determine their maximum value, and compare that maximum value with the fourth preset
  • threshold to determine whether every second determination coefficient change amount is less than the fourth preset threshold.
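The two strategies are equivalent, since the maximum change amount is below the threshold exactly when every change amount is. A sketch (names hypothetical):

```python
def second_fingers_unmoved(coefficient_changes, fourth_threshold):
    """True when every second designated finger's determination-coefficient
    change is below the fourth preset threshold; checking only the
    maximum of the changes is sufficient."""
    return max(coefficient_changes) < fourth_threshold
```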
  • a finger with specific characteristics in a specified gesture action such as a finger that always has a linear relationship, can be judged to determine whether to perform the specified gesture action. In this way, the accuracy of recognizing a specific gesture action can be further improved.
  • the third designated finger refers to a finger whose joint points have a linear relationship when the designated gesture is performed. Since the joint points of the third designated finger have a linear relationship, when they are regressed to obtain a corresponding regression equation, the regression equation should have a high goodness of fit.
  • the goodness of fit can be characterized by the determination coefficient. Therefore, on top of the judgments on the total angle changes and the determination coefficient changes, the third linear regression determination coefficient corresponding to the third designated finger can also be judged to
  • determine whether the specified gesture action is performed. That is, the recognition result of the gesture action is obtained based on the total amount of the first angle change and the first preset threshold, the total amount of the second angle change and the second preset threshold, the first determination coefficient change amount and the third preset threshold, the second determination coefficient change amount and the fourth preset threshold, and the third linear regression determination coefficient corresponding to the third designated finger and the fifth preset threshold.
  • if the first determination coefficient change amount is greater than the third preset threshold, the second determination coefficient change amount
  • is less than the fourth preset threshold, and the third linear regression determination coefficient corresponding to the third designated finger is greater than the fifth preset threshold, it is determined that the user performs the specified gesture action, and the specified gesture action may be used as the recognition result of the gesture action.
  • the fifth preset threshold is a standard value used to measure the size of the third linear regression determination coefficient corresponding to the third designated finger. If the third linear regression determination coefficient corresponding to the third designated finger is greater than the fifth preset threshold, the coefficient is large, the goodness of fit is good, and the joint points of the third designated finger are highly likely to lie on a straight line.
  • the number of the third designated fingers may be one or multiple.
  • when the number of third designated fingers is multiple, the average of the third linear regression determination coefficients corresponding to the third designated fingers may be determined and compared with the fifth preset threshold.
  • in this case, when the total amount of change in the first angle is greater than the first preset threshold,
  • the total amount of change in the second angle is less than the second preset threshold,
  • the first determination coefficient change amount is greater than the third preset threshold,
  • and the second determination coefficient change amount is less than the fourth preset threshold while the average third linear regression determination coefficient is greater than the fifth preset threshold, it is determined that the user performs the specified gesture action.
  • the first preset threshold, the second preset threshold, the third preset threshold, the fourth preset threshold, and the fifth preset threshold may be set according to experience values.
  • in different situations, the preset thresholds may be set differently, which is not limited in this embodiment.
  • the recognition result of the gesture action can be obtained according to the total amount of the first angle change and the first preset threshold alone.
  • it can also be obtained by combining this judgment with any combination of the total amount of the second angle change and the second preset threshold, the first determination coefficient change amount and the third preset threshold, the second determination coefficient change amount and the fourth preset threshold, and the third linear regression determination coefficient and the fifth preset threshold, so as to obtain a more accurate recognition result of the gesture action.
  • the embodiment of the present application provides a method for recognizing gestures.
  • on the basis of judging the change in the angles of the specified joints of the specified finger, a judgment on the change amount of the determination coefficient corresponding to the specified finger is added.
  • the change amount of the determination coefficient corresponding to the specified finger reflects the change in the position distribution of the joint points of that finger. Based on this change in position distribution, it can be further determined whether the user performs the specified gesture action, thereby improving the accuracy of gesture recognition.
  • FIG. 6 is a schematic diagram of an application scenario of a gesture recognition method according to an embodiment of the present application.
  • the application scenario is a real-time video live broadcast scenario
  • the application scenario includes a terminal 110, a server 120, and a terminal 130.
  • the terminal 110 is the terminal corresponding to the host
  • the terminal 130 is the terminal corresponding to the viewer
  • the server 120 is a live broadcast server
  • the live server 120 establishes a network connection with the terminal 110 corresponding to the anchor and the terminal 130 corresponding to the viewer, respectively, so as to provide corresponding live broadcast services.
  • Anchor A can apply to open a live broadcast room.
  • anchor A opens live broadcast room 100, and when anchor A broadcasts live through terminal 110, all terminals accessing the live broadcast room can receive the live broadcast content, as shown in Figure 6.
  • the viewer can enter the live broadcast room of the anchor A through the terminal 130 to watch the live content.
  • the number of the viewer's terminal may be one or multiple.
  • the viewer's terminal can be located in different areas, and the viewer's terminal can also access different local area networks.
  • the deployment of the viewer's terminal in the live broadcast room 100 shown in FIG. 6 is only an optional example of this application, and does not represent the actual terminal Positional relationship.
  • the anchor can implement various gestures during the live broadcast, recognize the gestures performed by the anchor, and display corresponding animation effects according to the recognized gestures to improve the interactivity.
  • the process of recognizing a gesture action can be independently performed by the terminal 110.
  • the terminal 110 corresponding to the anchor acquires gesture pictures, including a first gesture picture and a second gesture picture, and then recognizes the changes in the joint angles in the gesture pictures and other information to identify
  • the gesture performed by the anchor; when the specified gesture is recognized, the corresponding animation effect can be displayed on the interface.
  • the recognition of gesture actions can also be performed by the server.
  • the anchor implements a finger movement
  • the terminal 110 corresponding to the anchor acquires gesture pictures in real time, including the first gesture picture and the second gesture picture.
  • the server 120 obtains the first gesture picture and the second gesture picture from the terminal corresponding to the anchor, and recognizes them in real time to obtain a first vector and a second vector.
  • based on the first vector and the second vector, the following can be calculated:
  • the total amount of change in the first angle of the first designated joints in the first designated finger, the total amount of change in the second angle of the second designated joints in the second designated finger, the first determination coefficient change amount corresponding to the first designated finger, the second determination coefficient change amount corresponding to the second designated finger, and so on.
  • the anchor implements the finger movement.
  • the server 120 determines that the user performs a flick action
  • the corresponding animation effect may be returned to the terminal 110 corresponding to the anchor and the terminal 130 corresponding to the viewer, and both terminals may display the animation effect on the interface for users to watch, enhancing interactivity.
  • the first designated finger is the middle finger, and the first designated joint includes three finger joints on the middle finger.
  • the second designated fingers are thumb, index finger, ring finger and pinky finger.
  • the second designated joints are two finger joints of the thumb, three finger joints of the index finger, three finger joints of the ring finger, and three finger joints of the little finger.
  • the finger joint angle can be calculated based on the gesture picture.
  • Two consecutive frame images including the hand can be obtained from the video as the first gesture picture and the second gesture picture, and the joint points in the gesture pictures can be identified through a pre-trained convolutional neural network model. Optionally, the first gesture picture is recognized, and the joint angles are calculated to obtain a first vector.
  • since 14 joint angles can be calculated from the finger joints, the first vector representing the finger joint angles in the first gesture picture is a 14-dimensional vector.
  • the second gesture image is recognized, and the joint angle is calculated to obtain a second vector.
  • the second vector representing the finger joint angle in the second gesture image is a 14-dimensional vector.
  • the joint angle can be calculated according to formula (1), and according to the actual situation of the human joint, the output result of the acos function can be controlled between 0 and ⁇ .
  • the total amount of angular change of the joints of the first designated finger, that is, the middle finger, and the second designated finger, that is, thumb, index finger, ring finger, and little finger can be judged separately.
  • the first vector is denoted alpha(t) and the second vector alpha(t+1).
  • in the first vector, the component corresponding to the first designated finger is denoted beta(t) and the component corresponding to the second designated fingers is denoted gamma(t).
  • in the second vector, the component corresponding to the first designated finger is denoted beta(t+1) and the component corresponding to the second designated fingers is denoted gamma(t+1).
  • the total amount of the first angle change of the first designated joints in the first designated finger can be calculated: from the component beta(t) of the first vector alpha(t) and the component beta(t+1) of the second vector alpha(t+1), the total angle change of the first designated joints in the first designated finger, that is, the total change of the angles of the three joints of the middle finger, is obtained.
  • the total angle change can be expressed as the sum of all components of beta(t+1)-beta(t). This total is then compared with a first preset threshold, here 130. If the sum of all components of beta(t+1)-beta(t) is greater than 130, the middle finger is judged to have performed the flick movement.
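A minimal sketch of this check, assuming a 14-dimensional angle vector in degrees and a hypothetical component layout (the text does not fix one):

```python
# Hypothetical layout of the 14-dimensional angle vector (degrees):
# thumb -> indices 0-1, index finger -> 2-4, middle finger -> 5-7,
# ring finger -> 8-10, little finger -> 11-13.
MIDDLE = slice(5, 8)

def first_angle_change_total(alpha_t, alpha_t1):
    """Sum of per-joint angle changes for the middle finger between frames,
    i.e. the sum of all components of beta(t+1) - beta(t)."""
    return sum(a1 - a0 for a0, a1 in zip(alpha_t[MIDDLE], alpha_t1[MIDDLE]))

def middle_finger_moved(alpha_t, alpha_t1, threshold=130.0):
    """Compare the total against the first preset threshold (130 here)."""
    return first_angle_change_total(alpha_t, alpha_t1) > threshold
```

The same summation applies per finger for the second designated fingers, with the second preset threshold (30) as an upper bound instead of a lower bound.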
  • the judgment of the second designated fingers mainly serves to avoid misrecognition.
  • the total amount of the second angle change of the second designated joints in the second designated fingers may be calculated.
  • from the component gamma(t) of the first vector alpha(t) and the component gamma(t+1) of the second vector alpha(t+1), the total amount of the second angle change of the second designated joints in the second designated fingers is calculated, that is, the total angle change of the two joints of the thumb, of the three joints of the index finger, of the three joints of the ring finger, and of the three joints of the little finger.
  • gamma(t+1)-gamma(t) can be divided into four parts according to the components corresponding to the thumb, index finger, ring finger, and little finger, and each part is summed to obtain the total amount of the second angle change for the thumb, index finger, ring finger, and little finger, respectively.
  • each total second angle change is compared with a second preset threshold, here 30. If the total second angle change corresponding to each of the thumb, index finger, ring finger, and little finger is less than 30, those fingers are determined to be relatively stable, with no major movement.
  • alternatively, the norm norm(gamma(t+1)-gamma(t)) can be calculated; if norm(gamma(t+1)-gamma(t)) is less than 30, it indicates that the thumb, index finger, ring finger, and little finger are relatively stable with no major movement.
  • the determination coefficient of the finger joint point positions can also be calculated to decide whether the designated gesture action is performed. Similar to the judgment of the finger joint angle changes, this judgment can be divided into two parts: the first part judges the change in the determination coefficient of the first designated finger, that is, the middle finger; the second part judges the determination coefficients of the second designated fingers, that is, the thumb, index finger, ring finger, and little finger.
  • the calculation of the determination coefficient can refer to formulas (2) to (5).
  • the determination coefficient of the i-th finger at time t is denoted R2(i, t).
  • since the first designated finger is the middle finger, the first linear regression determination coefficient corresponding to it is R2(3, t) and the second linear regression determination coefficient is R2(3, t+1).
  • from the first linear regression determination coefficient and the second linear regression determination coefficient, a first determination coefficient change amount can be calculated, expressed as R2(3, t+1)-R2(3, t).
  • the first determination coefficient change amount is compared with a third preset threshold, here 0.4. If R2(3, t+1)-R2(3, t) is greater than 0.4, it may be determined that the joint points of the middle finger have changed from a bent state to a straight state.
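Formulas (2) to (5) referenced above are not reproduced in this excerpt; as a sketch, one common way to obtain a finger's linear regression determination coefficient is to fit a least-squares line to the finger's joint-point coordinates and compute R² from the residuals (this assumes the points are not on a vertical line):

```python
def linearity_r2(points):
    """Coefficient of determination R^2 of a least-squares line fitted to a
    finger's joint-point (x, y) coordinates; close to 1 when the finger is
    straight, smaller when it is bent."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    # A horizontal finger has ss_tot == 0 and is perfectly linear.
    return 1.0 - ss_res / ss_tot if ss_tot else 1.0
```

Evaluating this at times t and t+1 gives the R2(i, t) and R2(i, t+1) values whose differences are compared against the thresholds above.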
  • since the second designated fingers are the thumb, index finger, ring finger, and little finger, the third linear regression determination coefficients corresponding to them are R2(1, t), R2(2, t), R2(4, t), and R2(5, t), and the fourth linear regression determination coefficients are R2(1, t+1), R2(2, t+1), R2(4, t+1), and R2(5, t+1), respectively.
  • the second determination coefficient change amounts can then be calculated, expressed as R2(i, t+1)-R2(i, t), where i is 1, 2, 4, 5.
  • for the flicking action, the second designated fingers also include third designated fingers, namely the index finger, ring finger, and little finger.
  • the third linear regression determination coefficients R2(i, t) of the index finger, ring finger, and little finger can be judged respectively, where i is 2, 4, 5.
  • since the joint points of the index finger, ring finger, and little finger should be close to linear, the flick movement can be recognized with confidence when all of the following hold: the sum of all components of beta(t+1)-beta(t) > 130; norm(gamma(t+1)-gamma(t)) < 30; R2(3, t+1)-R2(3, t) > 0.4; max(R2(i, t+1)-R2(i, t)) < 0.2 for i = 2, 4, 5; and average(R2(i, t)) > 0.9 for i = 2, 4, 5.
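Putting the five conditions together, a hedged sketch of the combined decision rule (the argument shapes and the use of a Euclidean norm are illustrative assumptions; the thresholds follow the example values in the text):

```python
def is_finger_flick(beta_t, beta_t1, gamma_t, gamma_t1, r2_t, r2_t1,
                    th1=130.0, th2=30.0, th3=0.4, th4=0.2, th5=0.9):
    """Combined flick-recognition rule. beta_* are the middle finger's joint
    angles, gamma_* the other fingers' joint angles (degrees), and r2_* map
    a finger index (1=thumb ... 5=little) to its R^2 value at time t / t+1."""
    others = (2, 4, 5)  # index, ring, little finger
    # Middle-finger joints open by more than th1 degrees in total.
    middle_bends = sum(b1 - b0 for b0, b1 in zip(beta_t, beta_t1)) > th1
    # The remaining fingers barely move (Euclidean norm of gamma difference).
    others_stable = (sum((g1 - g0) ** 2
                         for g0, g1 in zip(gamma_t, gamma_t1)) ** 0.5) < th2
    # The middle finger goes from bent to straight.
    middle_straightens = r2_t1[3] - r2_t[3] > th3
    # The other fingers' linearity does not change much...
    others_r2_stable = max(r2_t1[i] - r2_t[i] for i in others) < th4
    # ...and the index, ring, and little fingers were already nearly linear.
    others_linear = sum(r2_t[i] for i in others) / len(others) > th5
    return (middle_bends and others_stable and middle_straightens
            and others_r2_stable and others_linear)
```

Requiring all five conditions simultaneously is what suppresses the misrecognitions that the angle check alone would admit.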
  • the recognition of the flick movement can be used in live-broadcast scenarios in conjunction with other overlay ("pendant") designs.
  • the screen can display the animation effect corresponding to the finger movement in real time.
  • the above application scenario uses the flick movement as an example to illustrate the gesture action recognition method provided in this application.
  • the method is equally applicable to other gesture actions such as "OK" and "scissors hands".
  • in that case, the first to fifth preset thresholds can be adaptively adjusted.
  • the present application further provides a gesture recognition device.
  • FIG. 7 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application. Referring to FIG. 7, the device includes:
  • the obtaining module 710 is configured to obtain a first gesture picture and a second gesture picture
  • the recognition module 720 is configured to recognize the first gesture picture to obtain a first vector, where the first vector is used to represent finger joint angles in the first gesture picture, and to recognize the second gesture picture to obtain a second vector, where the second vector is used to represent finger joint angles in the second gesture picture;
  • a calculation module 730 is configured to calculate a total amount of a first angle change of first designated joints in a first designated finger according to the first vector and the second vector, where the first designated finger refers to a finger whose joint angles need to change when performing a designated gesture action, and the first designated joints refer to the finger joints in the first designated finger whose angles need to change when performing the designated gesture action;
  • the determining module 740 is configured to obtain a recognition result of a gesture action according to the total amount of the first angle change and a first preset threshold.
  • calculation module 730 is further configured to:
  • the determining module 740 may be set as:
  • calculation module 730 is further configured to:
  • the determining module 740 may be set as:
  • calculation module 730 is further configured to:
  • the determining module 740 may be set as:
  • the determining module 740 is further configured to:
  • the third designated finger refers to a finger whose finger joint points are in a linear relationship when the designated gesture action is performed; the recognition result of the gesture action is obtained according to the first angle change total and the first preset threshold, the second angle change total and the second preset threshold, the second determination coefficient change amount and the fourth preset threshold, and the third linear regression determination coefficient corresponding to the third designated finger and the fifth preset threshold.
  • calculation module 730 may be configured as:
  • angle change amounts corresponding to the designated joints in the designated finger are obtained from the difference vector, and their sum is calculated to obtain the total angle change corresponding to the designated joints in the designated finger.
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger.
  • the designated gesture action is a finger-flick action
  • the first designated finger is a middle finger
  • the first designated joint includes three finger joints on the middle finger
  • the second designated finger includes a thumb, an index finger, a ring finger, and a pinky finger.
  • FIG. 8 is a schematic structural diagram of a gesture recognition device according to this embodiment.
  • the device further includes a display module 750 configured to:
  • an animation effect corresponding to the designated gesture action is displayed on an interface.
  • the identification module 720 may be configured as:
  • a coordinate set is obtained according to the gesture picture and the convolutional neural network model, and the coordinate set includes position coordinates of each joint point of the hand in the recognized gesture picture;
  • the angles corresponding to the finger joints are calculated according to the position coordinates of each joint point in the coordinate set;
  • a vector corresponding to the recognized gesture picture is generated according to the angles, and the vector is used to represent the finger joint angles in the recognized gesture picture.
  • the identification module 720 may be configured as:
  • according to the position coordinates of each joint point in the coordinate set, two vectors corresponding to the two knuckles connected by a finger joint are calculated, and the arc cosine function and the two vectors are then used to calculate the angle corresponding to the finger joint.
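The coordinate-set-to-vector pipeline above can be sketched end to end. The 21-point hand layout below is an illustrative convention, not taken from the text; the thumb chain omits the wrist so that it yields 2 angles and each other finger 3, giving the 14-dimensional vector described earlier:

```python
import math

# Assumed layout: 0 = wrist, thumb 1-4, index 5-8, middle 9-12,
# ring 13-16, little 17-20, each chain ordered from palm to fingertip.
FINGER_CHAINS = [
    [1, 2, 3, 4],          # thumb: 2 interior joints
    [0, 5, 6, 7, 8],       # index finger: 3 interior joints
    [0, 9, 10, 11, 12],    # middle finger
    [0, 13, 14, 15, 16],   # ring finger
    [0, 17, 18, 19, 20],   # little finger
]

def angles_from_coordinates(coords):
    """Convert a coordinate set (21 (x, y) joint points) into the
    14-dimensional vector of finger-joint angles in degrees."""
    angles = []
    for chain in FINGER_CHAINS:
        for a, b, c in zip(chain, chain[1:], chain[2:]):
            # Knuckle vectors from joint b to its two neighbors.
            v1 = (coords[a][0] - coords[b][0], coords[a][1] - coords[b][1])
            v2 = (coords[c][0] - coords[b][0], coords[c][1] - coords[b][1])
            cos_t = ((v1[0] * v2[0] + v1[1] * v2[1]) /
                     (math.hypot(*v1) * math.hypot(*v2)))
            angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    return angles
```

Run on two consecutive frames, this yields the first vector alpha(t) and the second vector alpha(t+1) used throughout the judgment steps.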
  • an embodiment of the present application provides a gesture action recognition device, which obtains a first gesture picture and a second gesture picture and uses deep learning to recognize the joint points in the gesture pictures to obtain the finger joint angles.
  • the first vector and the second vector are processed using a mathematical model to determine the total amount of the first angle change of the first designated joints of the first designated finger in the gesture pictures, where the first designated finger is the finger whose joint angles need to change when the designated gesture is performed, and the first designated joints are the finger joints in the first designated finger whose angles need to change.
  • when the total change exceeds the first preset threshold, it can be determined that the user has performed the designated gesture action.
  • on the basis of accurately recognizing finger joint points through deep learning, the present application calculates the total amount of angular change of the designated joints in the designated finger and compares it with the first preset threshold characterizing the joints whose angles must change when the designated gesture action is performed, thereby recognizing the gesture action.
  • whereas the related technology can only recognize static gestures, the gesture action recognition method provided by this application has wider application prospects.
  • FIG. 9 is a gesture recognition device according to an embodiment of the present application.
  • the device may be a terminal device. As shown in FIG. 9, for convenience of explanation, only the parts related to the embodiments of the present application are shown; for specific technical details that are not disclosed, refer to the method part of the embodiments of the present application.
  • the terminal device may be any terminal device, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, an in-vehicle computer, and the like. The following takes a mobile phone as an example:
  • FIG. 9 is a block diagram showing a partial structure of a mobile phone related to a terminal provided in an embodiment of the present application.
  • the mobile phone includes components such as a radio frequency (RF) circuit 910, a memory 920, an input unit 930, a display unit 940, a sensor 950, an audio circuit 960, a wireless fidelity (WiFi) module 970, a processor 980, and a power supply 990.
  • the structure shown in FIG. 9 does not constitute a limitation on the mobile phone, which may include more or fewer components than shown, combine some components, or arrange the components differently.
  • the RF circuit 910 may be used to receive and send signals during information transmission and reception or during a call. In particular, downlink information from the base station is received and handed to the processor 980 for processing, and uplink data is sent to the base station.
  • the RF circuit 910 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier (LNA), a duplexer, and the like.
  • the RF circuit 910 can also communicate with a network and other devices through wireless communication.
  • the above wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
  • the memory 920 may be configured to store software programs and modules, and the processor 980 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 920.
  • the memory 920 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required by at least one function (such as a sound playback function and an image playback function), and the data storage area may store data (such as audio data and a phone book) created according to the use of the mobile phone.
  • the memory 920 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the input unit 930 may be configured to receive inputted numeric or character information, and generate key signal inputs related to user settings and function control of the mobile phone.
  • the input unit 930 may include a touch panel 931 and other input devices 932.
  • the touch panel 931, also known as a touch screen, can collect touch operations performed by the user on or near it (for example, operations performed by the user with a finger, a stylus, or any suitable object or accessory on or near the touch panel 931), and drive the corresponding connection device according to a preset program.
  • the touch panel 931 may include a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 980; it can also receive commands from the processor 980 and execute them.
  • the touch panel 931 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 930 may include other input devices 932.
  • other input devices 932 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, an operation lever, and the like.
  • the display unit 940 may be configured to display information input by the user or information provided to the user and various menus of the mobile phone.
  • the display unit 940 may include a display panel 941.
  • the display panel 941 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
  • the touch panel 931 may cover the display panel 941. When the touch panel 931 detects a touch operation on or near it, it transmits the operation to the processor 980 to determine the type of the touch event, and the processor 980 then provides a corresponding visual output on the display panel 941 according to the type of the touch event.
  • although in FIG. 9 the touch panel 931 and the display panel 941 are implemented as two separate components to realize the input and output functions of the mobile phone, in some embodiments the touch panel 931 and the display panel 941 can be integrated to realize the input and output functions of the mobile phone.
  • the mobile phone may further include at least one sensor 950, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 941 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 941 and/or the backlight when the mobile phone is moved to the ear.
  • as one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (usually three axes) and can detect the magnitude and direction of gravity when stationary; it can be used in applications for identifying the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition-related functions (such as a pedometer and tapping).
  • the mobile phone can also be equipped with a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and other sensors, which are not described here again.
  • the audio circuit 960, the speaker 961, and the microphone 962 can provide an audio interface between the user and the mobile phone.
  • on one hand, the audio circuit 960 may convert received audio data into an electrical signal and transmit it to the speaker 961, which converts it into a sound signal for output; on the other hand, the microphone 962 converts a collected sound signal into an electrical signal, which the audio circuit 960 receives and converts into audio data. After being processed by the processor 980, the audio data is sent, for example, to another mobile phone via the RF circuit 910, or output to the memory 920 for further processing.
  • WiFi is a short-range wireless transmission technology.
  • through the WiFi module 970, the mobile phone can help users send and receive emails, browse web pages, and access streaming media, providing users with wireless broadband Internet access.
  • although FIG. 9 shows the WiFi module 970, it can be understood that it is not a necessary component of the mobile phone and can be omitted as needed without changing the essence of the invention.
  • the processor 980 is the control center of the mobile phone. It connects the various parts of the entire mobile phone through various interfaces and lines, and executes the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 920 and calling the data stored in the memory 920, thereby monitoring the mobile phone as a whole.
  • optionally, the processor 980 may include one or more processing units; preferably, the processor 980 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 980.
  • the mobile phone also includes a power supply 990 (such as a battery) for supplying power to the various components.
  • preferably, the power supply can be logically connected to the processor 980 through a power management system, so as to implement functions such as managing charging, discharging, and power consumption through the power management system.
  • the mobile phone may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the processor 980 included in the terminal also has the following functions:
  • the first designated finger refers to a finger whose joint angles need to change when a designated gesture is performed
  • the first designated joints refer to the finger joints in the first designated finger whose angles need to change when the designated gesture is performed
  • a recognition result of a gesture action is obtained according to the total amount of the first angle change and a first preset threshold.
  • the processor 980 may also perform the operation steps of any one of the implementation manners of the foregoing gesture motion recognition method.
  • FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
  • the server 1000 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPUs) 1022 (for example, one or more processors), a memory 1032, and one or more storage media 1030 (for example, one or more mass storage devices) storing application programs 1042 or data 1044.
  • the memory 1032 and the storage medium 1030 may be transitory storage or persistent storage.
  • the program stored in the storage medium 1030 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server.
  • the central processing unit 1022 may be configured to communicate with the storage medium 1030, and execute a series of instruction operations in the storage medium 1030 on the server 1000.
  • the server 1000 may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input/output interfaces 1058, and/or one or more operating systems 1041, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
  • the steps performed by the server in the above embodiment may be based on the server structure shown in FIG. 10.
  • the CPU 1022 is set to perform the following steps:
  • the first designated finger refers to a finger whose joint angles need to change when a designated gesture is performed
  • the first designated joints refer to the finger joints in the first designated finger whose angles need to change when the designated gesture is performed
  • a recognition result of a gesture action is obtained according to the total amount of the first angle change and a first preset threshold.
  • the CPU 1022 may be further configured to perform steps in any one of the implementation manners of the foregoing gesture recognition method.
  • the embodiment of the present application further provides a computer-readable storage medium configured to store program code, and the program code is configured to execute any one of the methods for recognizing a gesture action according to the foregoing embodiments.
  • the embodiment of the present application further provides a computer program product including instructions, which when executed on a computer, causes the computer to execute any one of the methods for recognizing a gesture action described in the foregoing embodiments.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application, in essence, or the part contributing to the related technology, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

This application discloses a gesture action recognition method. The method obtains a first vector and a second vector representing finger joint angles by recognizing a first gesture picture and a second gesture picture, and then uses a mathematical model to calculate, from the first vector and the second vector, a total amount of first angle change, which is the total angle change corresponding to the first designated joints in a first designated finger; a gesture action recognition result is obtained according to the total amount of first angle change and a first preset threshold. Since specific joints of specific fingers necessarily change in angle when a user performs a designated gesture action, the method can recognize dynamic gesture actions from the angle changes of those joints between the two gesture pictures. The gesture action recognition method provided in this application has broad application prospects in the field of artificial intelligence. This application also discloses a gesture action recognition apparatus.

Description

Gesture action recognition method, apparatus, and device
This application claims priority to Chinese Patent Application No. 201810582795X, filed with the Chinese Patent Office on June 7, 2018 and entitled "Gesture action recognition method, apparatus, and device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence technologies, and in particular, to a gesture action recognition method, apparatus, and device, and a computer-readable storage medium.
Background
Human hand-skeleton recognition is a research task widely studied in the field of human-computer interaction. A relatively mature hand-skeleton recognition method recognizes a gesture picture based on a convolutional neural network: the gesture picture is input into the convolutional neural network, and the network outputs the position coordinates of each joint point of the hand in the picture.
Existing hand-skeleton recognition methods can only recognize a single gesture picture, and therefore can only recognize static gestures. In real human-computer interaction scenarios, however, gestures are often dynamic, and existing gesture recognition technology cannot yet recognize dynamic, ordered gesture actions.
Summary
Embodiments of this application provide a gesture action recognition method, apparatus, device, and related products, which enable dynamic gesture actions to be recognized with high accuracy and therefore have broad application prospects.
In view of this, a first aspect of this application provides a gesture action recognition method, the method including:
obtaining, by a server, a first gesture picture and a second gesture picture;
recognizing, by the server, the first gesture picture to obtain a first vector, the first vector being used to represent finger joint angles in the first gesture picture, and recognizing the second gesture picture to obtain a second vector, the second vector being used to represent finger joint angles in the second gesture picture;
calculating, by the server according to the first vector and the second vector, a total amount of first angle change of first designated joints in a first designated finger, the first designated finger being a finger whose joint angles need to change when a designated gesture action is performed, and the first designated joints being the finger joints in the first designated finger whose angles need to change when the designated gesture action is performed;
obtaining, by the server, a gesture action recognition result according to the total amount of first angle change and a first preset threshold.
Optionally, the method further includes:
calculating, by the server according to the first vector and the second vector, a total amount of second angle change of second designated joints in second designated fingers, the second designated fingers being fingers whose joint angles do not need to change when the designated gesture action is performed, and the second designated joints being the finger joints in the second designated fingers;
the obtaining, by the server, a gesture action recognition result according to the total amount of first angle change and a first preset threshold then includes:
obtaining, by the server, the gesture action recognition result according to the total amount of first angle change and the first preset threshold, and the total amount of second angle change and a second preset threshold.
Optionally, the method further includes:
calculating, by the server, a first linear regression determination coefficient corresponding to the first designated finger according to the first vector, and a second linear regression determination coefficient corresponding to the first designated finger according to the second vector;
calculating, by the server, a first determination coefficient change amount corresponding to the first designated finger according to the first linear regression determination coefficient and the second linear regression determination coefficient;
the obtaining, by the server, a gesture action recognition result according to the total amount of first angle change and a first preset threshold then includes:
obtaining, by the server, the gesture action recognition result according to the total amount of first angle change and the first preset threshold, and the first determination coefficient change amount and a third preset threshold.
Optionally, the method further includes:
calculating, by the server, a third linear regression determination coefficient corresponding to the second designated fingers according to the first vector, and a fourth linear regression determination coefficient corresponding to the second designated fingers according to the second vector;
calculating, by the server, a second determination coefficient change amount corresponding to the second designated fingers according to the third linear regression determination coefficient and the fourth linear regression determination coefficient;
the obtaining, by the server, a gesture action recognition result according to the total amount of first angle change and a first preset threshold then includes:
obtaining, by the server, the gesture action recognition result according to the total amount of first angle change and the first preset threshold, the total amount of second angle change and the second preset threshold, and the second determination coefficient change amount and a fourth preset threshold.
Optionally, if the second designated fingers include a third designated finger, the third designated finger being a finger whose finger joint points are in a linear relationship when the designated gesture action is performed,
the obtaining, by the server, the gesture action recognition result according to the total amount of first angle change and the first preset threshold, the total amount of second angle change and the second preset threshold, and the second determination coefficient change amount and the fourth preset threshold then includes:
obtaining, by the server, the gesture action recognition result according to the total amount of first angle change and the first preset threshold, the total amount of second angle change and the second preset threshold, the second determination coefficient change amount and the fourth preset threshold, and a third linear regression determination coefficient corresponding to the third designated finger and a fifth preset threshold.
Optionally, the server calculates the total angle change corresponding to designated joints in a designated finger in the following manner:
calculating a difference vector according to the first vector and the second vector;
obtaining, from the difference vector, the angle change amounts corresponding to the designated joints in the designated finger, and calculating the sum of the angle change amounts to obtain the total angle change corresponding to the designated joints in the designated finger.
Optionally, the designated gesture action is a finger-flick action, the first designated finger is the middle finger, and the first designated joints include the three finger joints on the middle finger.
Optionally, the designated gesture action is a finger-flick action, the first designated finger is the middle finger, the first designated joints include the three finger joints on the middle finger, and the second designated fingers include the thumb, index finger, ring finger, and little finger.
Optionally, after the server determines that the user performs the designated gesture action, the method further includes:
displaying, by the server on an interface, an animation effect corresponding to the designated gesture action.
可选的,所述服务器通过以下方式识别手势图片得到对应的向量:
根据手势图片和卷积神经网络模型识别得到坐标集,所述坐标集中包括被识别的手势图片中手部各个关节点的位置坐标;
根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度;
根据所述角度生成与被识别的手势图片对应的向量,所述向量用于表征被识别的手势图片中手指关节的角度。
可选的,所述服务器所述根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度,包括:
所述服务器根据所述坐标集中各个关节点的位置坐标,计算得到手指关节所连接的两节指节对应的两个向量;
所述服务器利用反余弦函数和所述两个向量,计算手指关节对应的角度。
本申请第二方面提供一种手势动作的识别装置,所述装置包括:
获取模块,被设置为获取第一手势图片和第二手势图片;
识别模块,被设置为识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
计算模块,被设置为根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
确定模块,被设置为根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
可选的,计算模块还被设置为:
根据所述第一向量和所述第二向量,计算第二指定手指中第二指定关节的第二角度变化总量,所述第二指定手指是指实施指定手势动作时关节的角度不需要发生变化的手指,所述第二指定关节是指所述第二指定手指中的手指关节;
则确定模块可以被设置为:
根据所述第一角度变化总量和第一预设阈值,以及所述第二角度变化总量和第二预设阈值,获得手势动作的识别结果。
可选的,计算模块还被设置为:
根据所述第一向量计算所述第一指定手指对应的第一线性回归决定系数,以及根据所述第二向量计算所述第一指定手指对应的第二线性回归决定系数;
根据所述第一线性回归决定系数和所述第二线性回归决定系数,计算所述第一指定手指对应的第一决定系数变化量;
则确定模块可以被设置为:
根据所述第一角度变化总量和第一预设阈值,以及所述第一决定系数变化量和第三预设阈值,获得手势动作的识别结果。
可选的,计算模块还被设置为:
根据所述第一向量计算所述第二指定手指对应的第三线性回归决定系数,以及根据所述第二向量计算所述第二指定手指对应的第四线性回归决定系数;
根据所述第三线性回归决定系数和所述第四线性回归决定系数,计算所述第二指定手指对应的第二决定系数变化量;
则确定模块可以被设置为:
根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,以及所述第二决定系数变化量和第四预设阈值,获得手势动作的识别结果。
可选的,确定模块还被设置为:
若所述第二指定手指中包括第三指定手指,所述第三指定手指是指实施所述指定手势动作时手指关节点呈线性关系的手指;根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,所述第二决定系数变化量和第四预设阈值,以及所述第三指定手指对应的第三线性回归决定系数和第五预设阈值,获得手势动作的识别结果。
可选的,计算模块可以被设置为:
根据所述第一向量和所述第二向量,计算得到差向量;
从所述差向量中获取指定手指中指定关节各自对应的角度变化量,计算角度变化量的和值得到指定手指中指定关节对应的角度变化总量。
可选的,所述指定手势动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节。
可选的,所述指定手势动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节;
所述第二指定手指包括大拇指、食指、无名指和小指。
可选的,所述装置还包括显示模块,被设置为:
在所述确定用户实施所述指定手势动作之后,在界面上显示与所述指定手势动作对应的动画效果。
可选的,识别模块可以被设置为:
根据手势图片和卷积神经网络模型识别得到坐标集,所述坐标集中包括被识别的手势图片中手部各个关节点的位置坐标;
根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度;
根据所述角度生成与被识别的手势图片对应的向量,所述向量用于表征被识别的手势图片中手指关节的角度。
可选的,识别模块可以被设置为:
根据所述坐标集中各个关节点的位置坐标,计算得到手指关节所连接的两节指节对应的两个向量;
利用反余弦函数和所述两个向量,计算手指关节对应的角度。
本申请第三方面提供一种手势动作的识别设备,所述设备包括处理器以及存储器:
所述存储器被设置为存储程序代码,并将所述程序代码传输给所述处理器;
所述处理器被设置为根据所述程序代码中的指令,执行如上述第一方面所述的手势动作的识别方法的步骤。
本申请第四方面提供一种计算机可读存储介质,所述计算机可读存储介质被设置为存储程序代码,所述程序代码被设置为执行上述第一方面所述的手势动作的识别方法。
本申请第五方面提供一种包括指令的计算机程序产品,当其在计算机上运行时,使得所述计算机执行上述第一方面所述的手势动作的识别方法。
从以上技术方案可以看出,本申请实施例具有以下优点:
本申请实施例中,提供了一种手势动作的识别方法,该方法通过识别第一手势图片和第二手势图片得到表征手指关节的角度的第一向量和第二向量,然后利用数学模型方法,根据第一向量和第二向量计算得到第一角度变化总量,该第一角度变化总量是第一指定手指中第一指定关节对应的角度变化总量;其中,第一指定手指为实施指定手势动作时关节的角度需要发生变化的手指,第一指定关节为实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;根据第一角度变化总量和第一预设阈值,可以确定用户是否实施指定手势动作,进而获得手势动作的识别结果。由于用户在实施指定手势动作时,必然存在特定手指的特定关节发生角度变化,因此,该方法根据该特定手指的特定关节在两张手势图片中的角度变化情况,就能够识别动态的手势动作。本申请提供的手势动作的识别方法在人工智能领域中具有更为广泛的应用前景。
附图说明
图1为本申请实施例中手势动作的识别方法的场景示意图;
图2为本申请实施例中手势动作的识别方法的一个流程图;
图3为本申请实施例中识别手势图片中关节点的示意图;
图4为本申请实施例中计算手指关节的角度的原理图;
图5为本申请实施例中手势动作的识别方法的一个流程图;
图6为本申请实施例中手势动作的识别方法应用于直播场景的示意图;
图7为本申请实施例中手势动作的识别装置的一个结构示意图;
图8为本申请实施例中手势动作的识别装置的一个结构示意图;
图9为本申请实施例中终端设备的一个结构示意图;
图10为本申请实施例中服务器的一个结构示意图。
具体实施方式
为了使本技术领域的人员更好地理解本申请方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅是本申请一部分实施例,而不是全部的实施例。基于本申 请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本申请的实施例例如能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
针对现有手势识别方法仅能对静态的手势动作进行识别,而不能对动态有序的手势进行识别的问题,本申请实施例基于用户在实施指定手势动作时,必然存在特定手指的特定关节发生角度变化这一运动学原理,提出了一种根据手指关节的角度变化识别手势动作的方法,从而实现对动态有序的手势进行识别。该方法通过识别第一手势图片和第二手势图片得到表征手指关节角度的第一向量和第二向量,然后利用数学模型方法,根据第一向量和第二向量计算第一指定手指中第一指定关节的第一角度变化总量,其中,第一指定手指为实施指定手势动作时关节的角度需要发生变化的手指,第一指定关节为实施指定手势动作时第一指定手指中角度需要发生变化的手指关节,根据第一角度变化总量和第一预设阈值之间的关系,可以确定用户是否实施指定手势动作,从而获得手势动作的识别结果。
由于用户在实施指定手势动作时,必然存在特定手指的特定关节发生角度变化,因此,该方法根据该特定手指的特定关节在两张手势图片中的角度变化情况,就能够识别出动态的手势动作。本申请提供的手势动作的识别方法在人工智能领域中具有更为广泛的应用前景。
应理解,本申请实施例提供的上述手势动作的识别方法可以应用于具有图像处理能力的处理设备。该处理设备可以是智能手机、平板电脑、个人计算机(PC,Personal Computer)、小型机或者大型机等终端设备,也可以是具有图形处理能力的服务器。处理设备可以是独立的处理设备,也可以是多 个处理设备形成的集群。例如,当需要处理的数据量较大时,可以通过多服务器构成的集群执行上述手势动作的识别方法。在实际应用中,本申请实施例提供的上述手势动作的识别方法可以由终端和服务器共同完成,例如,可以由终端采集手势图片,服务器从终端获取手势图片,并对手势图片进行识别,从而确定用户实施的手势动作。
为了便于理解本申请的技术方案,接下来将结合具体应用场景对本申请实施例提供的手势动作的识别方法进行介绍。
图1为本申请实施例提供的手势动作的识别方法的场景示意图。如图1所示,该应用场景中包括终端110、服务器120、终端130,其中,服务器120为终端提供网络数据传输服务,如图1所示,服务器120分别与用户A的终端110、用户B的终端130建立网络连接,从而为用户A和用户B提供视频通信服务。在该应用场景中,终端设备是以笔记本电脑作为示例进行说明的,并不构成对本申请技术方案的限定。在用户A和B实时视频通信的过程中,用户B可以使用手语代替语音与用户A通信,其中,手语是通过动态的手势动作来交流思想的一种语言方式。当用户B实施了动态的手势动作,例如,“OK”手势动作,用户B的终端130可以实时采集到对应的手势图片,可以为第一手势图片和第二手势图片,服务器120可以从终端130获取该第一手势图片和第二手势图片,实时识别第一手势图片得到第一向量,识别第二手势图片得到第二向量,其中,第一向量表征第一手势图片中手指关节的角度,第二向量表征第二手势图片中手指关节的角度,服务器120根据第一向量和第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,根据第一角度变化总量与第一预设阈值,可以实时获得手势动作的识别结果。例如,第一角度变化总量大于第一预设阈值,则确定用户实施指定手势动作,该指定手势动作即为手势动作的识别结果。
在该应用场景中,服务器120确定用户B实施“OK”手势动作,为了方便用户A了解用户B实施的手势动作的含义,服务器120将识别的手势动作对应的含义作为识别结果发送给用户A的终端110,在终端110的显示界面上显示该识别结果,以便用户A实时查看。
在上述应用场景中,通过对第一手势图片和第二手势图片进行实时识别, 得到表征第一手势图片中手指关节的角度的第一向量和表征第二手势图片中手指关节的角度的第二向量。根据第一向量和第二向量,可以确定第一指定手指中第一指定关节的第一角度变化总量,根据角度变化情况可以判断第一指定手指中的运动趋势,从而可以确定用户是否实施指定手势动作,如此,实现了对手势动作的实时识别,相较于传统的手势识别方法,本申请提供的手势动作识别方法具有广泛的应用前景,特别是应用于对实时性要求较高的场景中,如实时视频通信和直播场景。
需要说明的是,上述应用场景仅为本申请的手势动作的识别方法的一个可选示例,本申请实施例提供的手势动作的识别方法还可以应用在其他应用场景中,上述示例并不构成对本申请技术方案的限定。
接下来,结合附图对本申请实施例提供的手势动作的识别方法进行详细的介绍。
图2为本申请实施例提供的一种手势动作的识别方法的流程图,参见图2,该方法包括:
S201:服务器获取第一手势图片和第二手势图片。
手势图片是指包含手势的图片。通过对手势图片中手部的姿势进行识别,可以实现手势识别。当需要对动态的手势也即手势动作进行识别时,需获取至少两张具有时序关系的手势图片,包括第一手势图片和第二手势图片。
在一些可能的实现方式中,可以从视频中获取具有时序关系的两帧包含手势的图片作为第一手势图片和第二手势图片。例如,可以对视频逐帧识别,标记包含手部的图像,从标记的图像帧中选择连续的两帧图像作为第一手势图片和第二手势图片。其中,第一手势图片和第二手势图片也可以是不连续的,例如,可以从标记的图像帧中选择间隔一帧的两张帧图像作为第一手势图片和第二手势图片。需要说明的是,实施手势动作需要一定的时间,为了更准确、更快速地识别出手势动作,可以选取间隔一定时间的两张手势图片分别作为第一手势图片和第二手势图片。例如,实施手势动作需要1s的时间,则可以按照预设时间为1s的时间间隔从视频中提取时序相关的两张手势图片分别作为第一手势图片和第二手势图片。
在另一些可能的实现方式中,可以将手部作为拍摄对象,在不同时刻对 手部分别拍照,将不同时刻包含手部的照片作为第一手势图片和第二手势图片。例如,可以采用连拍功能,在同一方向、方位和拍摄角度对手部进行拍摄,生成多张图片,从多张图片中选择两张作为第一手势图片和第二手势图片。类似的,为了更准确、更快速地识别出手势动作,可以按照预设时间间隔选取在一个时间段拍摄的两张手势图片作为第一手势图片和第二手势图片。
S202:服务器识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度。
基于上文所述的运动原理,用户在实施指定手势动作时,必然存在手指关节的角度变化,根据特定手指的特定关节在两张手势图片中的角度变化情况就能识别动态的手势动作。基于此,处理设备可以先对手势图片中关节角度进行识别,以便根据两张手势图片中手指关节的角度确定手指关节的角度变化情况。可选地,可以对第一手势图片进行识别,得到表征第一手势图片中手指关节的角度的第一向量,对第二手势图片进行识别,得到表征第二手势图片中手指关节的角度的第二向量。
其中,对手指关节的角度的识别,可以通过深度学习识别关节点,然后根据各关节点的位置坐标进行计算实现。在一些可能的实现方式中,根据手势图片和卷积神经网络模型识别得到坐标集,坐标集中包括被识别的手势图片中手部各个关节点的位置坐标;根据坐标集中各个关节点的位置坐标,计算手指关节对应的角度。如此,将第一手势图片输入到卷积神经网络模型,可以获得第一手势图片中各关节点的位置坐标。图3为根据卷积神经网络模型确定的各关节点的示意图,如图3所示,大拇指为3个关节点,食指、中指、无名指、小指各具有4个关节点,手掌心也具有一个关节点,分别与各手指连接。在确定出各个手指的关节点后,可以根据第一手势图片中各关节点的位置坐标,计算手指关节对应的角度,将各手指关节对应的角度以向量形式表示即为第一向量。对第二手势图片识别得到第二向量的过程与第一手势图片类似,在此不再赘述。
本实施例中的卷积神经网络模型是以手势图片为输入,关节点的位置坐标为输出的神经网络模型,该神经网络模型可以通过深度学习的方式训练得到。作为一种可能的实现方式,可以获取大量手势图片,对手势图片中关节点的位置坐标进行标注,得到训练样本,利用训练样本对初始卷积神经网络模型训练,得到识别手势图片的卷积神经网络模型。
在通过卷积神经网络模型识别得到包含各个关键点的位置坐标的坐标集后,可以根据坐标集中各个关节点的位置坐标,计算得到手指关节所连接的两节指节对应的两个向量,接着利用反余弦函数和两个向量,计算手指关节对应的角度。图4为计算手指关节的角度的原理图,参见图4,手部的每根手指均具有关节点,针对任意一个关节点,若该关节点连接有两节手指指节,则可以计算两个指节对应的向量的夹角,将向量夹角作为手指关节的角度。
为了方便理解,可以将关节点连接的两节指节对应的向量分别用v1和v2表示,则手指关节的角度θ可以通过如下公式计算:
θ = acos( (v1·v2) / (|v1||v2|) )        (1)
其中,v1·v2表示向量的内积,|v1|和|v2|分别表示向量v1和v2的模值,acos为反余弦函数。
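式(1)的角度计算可以用如下Python草案示意。函数名joint_angle为本文自拟,假设三个关节点的平面坐标已由卷积神经网络模型识别得到:

```python
import math

def joint_angle(a, joint, c):
    """按式(1)计算手指关节的角度:以关节点joint为顶点,
    a、c分别为该关节所连接的两节指节另一端的关节点坐标。"""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (c[0] - joint[0], c[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]        # 向量内积 v1·v2
    norm = math.hypot(*v1) * math.hypot(*v2)   # 模值乘积 |v1||v2|
    # 取反余弦;对比值做截断以避免浮点误差越界,结果落在0到π之间
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# 三点共线对应手指伸直,关节角度为π(即180度)
angle = joint_angle((0, 0), (1, 0), (2, 0))
```

按人体关节的实际情况,输出被限制在0到π之间,与正文对acos输出范围的控制一致。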
需要说明的是,在本步骤中,第一手势图片和第二手势图片的识别顺序可以是任意的,可以同时识别第一手势图片和第二手势图片,也可以按照设定的先后顺序对第一手势图片和第二手势图片进行识别。
S203:服务器根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量。
第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指。第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节。即,用户若想要实施指定手势动作则其必须运动第一指定手指,使得第一指定手指中的第一指定关节发生一定的变化。
在本实施例中,基于用户行为习惯的差异,在实施同一手势动作时,不同的用户可以有多种不同的实现方式,为了提高手势动作的识别率,针对同一手势动作的不同实施方式,设置有对应的用于判别是否实施手势动作的第一指定手指。
为了便于理解本申请的技术方案,结合可选示例对第一指定手指以及第 一指定关节进行说明。
以实施“赞赏”手势动作为例进行说明。针对“赞赏”手势动作的识别,针对不同的手势动作实现方式,可以设置不同的第一指定手指以及第一指定关节。
一种实现方式为,在实施“赞赏”手势时,若用户的行为习惯为由握拳状态开始实施“赞赏”手势,则在实施该手势动作时,大拇指由弯曲状态变为伸直状态,可见,大拇指为实施“赞赏”手势时关节角度需要发生变化的手指,因而,将大拇指确定为第一指定手指。进一步地,大拇指上的两个关节的角度均需发生变化,故将大拇指的两个关节确定为第一指定关节。
另一种实现方式为,若用户的行为习惯为由伸手状态开始实施“赞赏”手势,则在实施该手势动作时,大拇指不需要发生变化,而食指、中指、无名指以及小指由伸直状态变为弯曲状态,可见,食指、中指、无名指以及小指为实施“赞赏”手势时关节角度需要发生变化的手指,可以将食指、中指、无名指以及小指确定为第一指定手指。对应地,可以将食指、中指、无名指以及小指中在实施“赞赏”手势时,需要发生变化的关节确定为第一指定关节。
以实施“Yeah”手势动作为例进行说明。针对“Yeah”手势动作的识别,针对不同的手势动作实现方式,可以设置不同的第一指定手指以及第一指定关节。
一种实现方式为,在实施“Yeah”手势时,食指和中指由伸直状态变为弯曲状态,可见,食指和中指为实施“Yeah”手势时,关节角度需要发生变化的手指,因而将食指和中指确定为第一指定手指。进一步地,食指的三个关节与中指的三个关节的角度均需发生变化,故将食指的三个关节确定为与食指对应的第一指定关节,中指的三个关节确定为与中指对应的第一指定关节。
另一种实现方式为,当用户由伸手状态开始实施“Yeah”手势,则食指和中指不必发生变化,而大拇指、无名指和小指由伸直状态变化为弯曲状态,可以将大拇指、无名指和小指确定为第一指定手指。对应地,可以将大拇指、无名指和小指中需要发生变化的关节确定为对应的第一指定关节。
还有一种实现方式为,用户还可以由伸出食指的状态开始实施“Yeah”手势,则中指为实施“Yeah”手势时关节角度需要发生变化的手指,可以将中指确定为第一指定手指,中指的三个关节确定为第一指定关节。由于第一向量中包括各个手指对应的各个关节的角度,第二向量中包括同一手部各个手指对应的各个关节的角度,因此,可以根据第一向量和第二向量计算第一指定手指中第一指定关节的第一角度变化总量。在本实施例一些可能的实现方式中,可以根据第一向量和第二向量,计算得到差向量,然后从差向量中获取第一指定手指中第一指定关节各自对应的角度变化量,计算角度变化量的和值得到第一指定手指中第一指定关节对应的第一角度变化总量。
需要说明的是,第一指定手指的数量可以为一个,也可以为多个。当第一指定手指数量为一个时,计算该手指中第一指定关节的第一角度变化总量即可。当第一指定手指数量为多个时,则对每一个指定手指,分别计算该第一指定手指中第一指定关节的第一角度变化总量。
可以理解,实施指定手势动作时,除了第一指定手指的关节的角度会发生变化外,还可能有其他手指的关节的角度也会发生变化,上述变化给指定手势动作的识别带来一定干扰。例如,一种手势动作A的第一指定手指为食指、中指,另一种手势动作B的第一指定手指为中指,若指定手势动作为A,则在对A进行识别的时候,对食指的关节的角度变化总量也进行计算,可以避免将手势动作B误识别为手势动作A,提高手势动作识别准确率。
基于此,可以将实施指定手势动作时关节的角度不需要发生变化的手指记为第二指定手指,第二指定手指中的手指关节记为第二指定关节。
根据第一向量和第二向量,计算第二指定手指中第二指定关节的第二角度变化总量。与第一指定手指类似,第二指定手指的数量可以是一个,也可以是多个。当第二指定手指的数量为多个时,需要对每个第二指定手指,分别计算该第二指定手指对应的第二指定关节的第二角度变化总量。
第二角度变化总量的计算方式与第一角度变化总量计算方式类似。根据第一向量和第二向量,计算得到差向量,然后从差向量中获取指定手指中指定关节的各自对应的角度变化量,计算角度变化量的和值得到指定手指中指定关节对应的角度变化总量。若获取的为第一指定手指中第一指定关节的角 度变化量,则计算得到的角度变化总量为第一角度变化总量,若获取的为第二指定手指中第二指定关节的角度变化量,则计算得到的角度变化总量为第二角度变化总量。
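上述由差向量求角度变化总量的步骤,可以用如下Python草案示意。其中向量以角度列表表示,函数名angle_change_total与关节下标joint_indices的取值均为本文自拟的假设:

```python
def angle_change_total(vec_t, vec_t1, joint_indices):
    """根据第一向量与第二向量,计算指定手指中指定关节对应的角度变化总量。

    vec_t, vec_t1:  两张手势图片对应的关节角度向量;
    joint_indices: 指定手指中指定关节在向量中的下标(示例假设)。
    """
    diff = [b - a for a, b in zip(vec_t, vec_t1)]  # 差向量
    return sum(diff[i] for i in joint_indices)     # 指定关节的角度变化量之和


# 示例:假设中指的三个关节位于14维向量的下标5、6、7处
first_total = angle_change_total(
    [0.0] * 14,
    [0.0] * 5 + [1.0, 1.0, 1.0] + [0.0] * 6,
    [5, 6, 7],
)
```

第一角度变化总量与第二角度变化总量仅在传入的关节下标上不同,可复用同一函数。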
S204:服务器根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
第一预设阈值是用于衡量第一指定手指中第一指定关节的第一角度变化总量大小的标准值。若第一角度变化总量大于该标准值,则表明第一角度变化总量较大,第一指定手指中的第一指定关节发生的变化达到了实施指定手势动作时第一指定手指中的第一指定关节需要发生变化的程度,基于此,可以确定用户实施了指定手势动作,该指定手势动作可以作为手势动作的识别结果。
在上述实现方式中,若第一角度变化总量大于第一预设阈值,则确定用户实施指定手势动作,否则,确定用户未实施指定手势动作。
在本申请其他可能的实现方式中,也可以是当第一角度变化总量小于预设阈值时,确定用户实施指定手势动作,否则,确定用户未实施指定手势动作。在实现时,可以根据实际业务情况设置确定是否实施指定手势动作的判断条件。
进一步地,为了提高手势动作识别准确率,可以根据所述第一角度变化总量和第一预设阈值,以及所述第二角度变化总量和第二预设阈值,获得手势动作的识别结果。可选地,可以在对第一角度变化总量判断的基础上,对第二角度变化总量进行判断,以确定用户是否实施了指定手势动作。在本实施例一些可能的实现方式中,若第一角度变化总量大于第一预设阈值,且第二角度变化总量小于第二预设阈值,则确定用户实施指定手势动作。
其中,第二预设阈值是用于衡量第二指定手指中第二指定关节的第二角度变化总量大小的标准值。若第二角度变化总量小于该标准值,则表明第二角度变化总量较小,第二指定手指中的第二指定关节可以视为未运动。由于第一指定手指中的第一指定关节的变化趋势与实施指定手势动作时的变化趋势吻合,而第二指定手指中的第二指定关节的变化较小,可以视为未发生变化,基于此,可以确定用户实施指定手势动作。
在有些情况下,确定用户实施指定手势动作之后,还可以在界面上显示与指定手势动作对应的动画效果,从而增强互动体验。例如,用户实施“打电话”手势动作后,可以在界面上显示与“打电话”对应的电话机动画效果。或者,也可以为指定手势动作配置对应的音效效果。
由上可知,本申请实施例提供了一种手势动作的识别方法,该方法通过识别第一手势图片和第二手势图片得到表征手指关节的角度的第一向量和第二向量,然后利用数学模型方法,根据第一向量和第二向量计算得到第一角度变化总量,该第一角度变化总量是第一指定手指中第一指定关节对应的角度变化总量;其中,第一指定手指为实施指定手势动作时关节的角度需要发生变化的手指,第一指定关节为实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;根据第一角度变化总量和第一预设阈值,可以获得手势动作的识别结果。由于用户在实施指定手势动作时,必然存在特定手指的特定关节发生角度变化,因此,该方法根据该特定手指的特定关节在两张手势图片中的角度变化情况,就能够识别动态的手势动作。本申请提供的手势动作的识别方法在人工智能领域中具有更为广泛的应用前景。
上述实施例是通过指定手指中指定关节的角度变化实现指定手势动作的识别。用户在实施手势动作时,不仅存在手指关节角度的变化,还存在手指线性关系的变化,例如,手指由伸直变为弯曲状态时,该手指的线性关系发生了较大的变化。为了进一步提高手势动作的识别准确率,还可以在对关节角度的变化情况识别的基础上,根据指定手指的线性关系的变化情况,对用户实施的手势动作进行识别。其中,指定手指的线性关系可以通过对指定手指上指定关节点进行线性回归得到的决定系数表征,通过计算指定手指对应的决定系数变化量,可以进一步确定用户是否实施指定手势动作。接下来,将结合附图,对本申请实施例提供的手势动作的识别方法的另一种实现方式进行介绍,该方法是在图2所示实施例的基础上实现的,本实施例仅就与图2所示实施例的不同之处进行介绍,其他相同内容可以参见图2所示实施例,在此不再赘述。
图5为本实施例提供的一种手势动作的识别方法的流程图,参见图5,该方法包括:
S501:服务器获取第一手势图片和第二手势图片。
S502:服务器识别第一手势图片得到第一向量,以及识别第二手势图片得到第二向量。
S503:服务器根据第一向量和第二向量,计算第一指定手指中第一指定关节的第一角度变化总量。
S501至S503可以参见S201至S203相关内容描述。
S504:服务器根据第一向量计算第一指定手指对应的第一线性回归决定系数,以及根据第二向量计算第一指定手指对应的第二线性回归决定系数。
在确定出手势图片中关节点的位置坐标后,可以对每根手指的关节点进行回归,得到对应的回归方程。当手指的线性关系发生变化时,该手指对应的回归方程的拟合优度也将发生较大的变化。在统计学中,可以采用决定系数判断回归方程的拟合优度。在有些情况下,也可以采用残差平方和反映拟合优度,其中,残差平方和与观测值的绝对大小有关,针对绝对值相差较大的两组数据难以准确地反映拟合优度,而决定系数在残差平方和的基础上进行了相对化处理,受绝对值大小的影响较小,因而能够较为准确地反映出拟合优度。
下面对拟合优度的计算过程进行详细介绍。
若将手势图片中每个关节点的y方向的坐标用y_i表示,每个关节点通过线性回归得到的预测点坐标用f_i表示,则可以根据y_i以及f_i计算残差平方和SS_res,可以参见如下公式:
SS_res = Σ_i (y_i - f_i)²        (2)
在获得y方向的坐标y_i后,还可以计算y方向坐标的平均值ȳ,根据y方向的坐标y_i以及平均值ȳ,可以计算得到总体平方和SS_tot,可选计算过程可以参见式(3)及式(4):
ȳ = (1/n) Σ_i y_i        (3)
SS_tot = Σ_i (y_i - ȳ)²        (4)
其中,n为该手指所包含的关节点的个数。
在计算出总体平方和SS_tot和残差平方和SS_res后,可以根据总体平方和以及残差平方和计算出决定系数R²,可以参见如下公式:
R² = 1 - SS_res / SS_tot        (5)
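式(2)至式(5)所述的决定系数计算,可以用如下Python草案示意:对单根手指的各关节点坐标做一元线性回归,再按式(5)求R²。函数名r_squared与示例数据均为本文自拟的假设,且草案假设各关节点的x坐标不全相同、SS_tot不为零,未处理退化情形:

```python
def r_squared(points):
    """对一根手指的关节点坐标做一元线性回归,按式(2)至式(5)计算决定系数R²。

    points: [(x_i, y_i), ...],该手指各关节点的位置坐标。
    """
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n                          # 式(3):y方向坐标的平均值
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx                            # 最小二乘拟合(假设x坐标不全相同)
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(xs, ys))        # 式(2):残差平方和
    ss_tot = sum((y - y_bar) ** 2 for y in ys)   # 式(4):总体平方和
    return 1 - ss_res / ss_tot                   # 式(5)(假设SS_tot不为零)


# 关节点共线(手指伸直)时,残差平方和为0,R²为1
r_line = r_squared([(0, 0), (1, 1), (2, 2), (3, 3)])
```

关节点越接近一条直线,R²越接近1;手指弯曲时关节点偏离直线,R²随之下降。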
本实施例中,根据第一向量,可以确定第一指定手指中各关节点在y方向的坐标,根据该坐标以及回归得到的预测点坐标可以计算出一个决定系数,该决定系数即为第一线性回归决定系数。与第一线性回归决定系数类似,可以根据第二向量,确定第一指定手指中各关节点在y方向的坐标,根据该坐标以及回归得到的预测点坐标可以计算出一个决定系数,该决定系数即为第二线性回归决定系数。需要说明的是,计算第一线性回归决定系数和第二线性回归决定系数的先后顺序并不影响本申请实施例的实现,可以根据需求设定相应的顺序。
S505:服务器根据所述第一线性回归决定系数和所述第二线性回归决定系数,计算所述第一指定手指对应的第一决定系数变化量。
第一线性回归决定系数和第二线性回归决定系数可以视为第一指定手指中的关节点在不同时刻的决定系数,根据第一线性回归决定系数和第二线性回归决定系数,可以计算第一指定手指对应的第一决定系数变化量。作为一种可能的实现方式,可以将第一线性回归决定系数和第二线性回归决定系数作差,根据差值得到第一决定系数变化量,例如可以将差值的绝对值作为第一决定系数变化量。在本实施例其他可能的实现方式中,也可以采用其他方式,例如求商等方式,计算第一决定系数变化量。
还需要说明的是,S505在S504之后执行,而S503的执行顺序可以是任意的,在一些可能的实现方式中,可以将S503与S504、S505分为两条路径同时执行,在另一些可能的实现方式中,也可以按照设定的顺序先后执行,如先执行S504、S505,再执行S503。
S506:服务器根据第一角度变化总量和第一预设阈值,以及第一决定系数变化量和第三预设阈值,获得手势动作的识别结果。
第一预设阈值可以参见S204相关内容描述,在此不再赘述。
第三预设阈值是用于衡量第一决定系数变化量大小的标准值。根据第一角度变化总量与第一预设阈值之间的关系,以及第一决定系数变化量与第三预设阈值之间的关系,可以获得手势动作的识别结果。
在本实施例一些可能的实现方式中,若第一决定系数变化量大于第三预设阈值,则表明第一决定系数变化量较大,第一指定手指中关节点的位置分布发生了明显的变化。基于关节角度和关节点位置分布的双重判断,可以为,第一角度变化总量大于第一预设阈值,且第一决定系数变化量大于第三预设阈值时,可以确定用户实施指定手势动作,该指定手势动作可以作为手势动作的识别结果。
进一步地,为了提高识别准确率,还可以计算第二指定手指对应的线性回归决定系数,确定第二指定手指中关节点的位置分布,根据第二指定手指中关节点的位置分布的变化情况确定是否实施指定手势动作,即根据第一角度变化总量和第一预设阈值,第二角度变化总量和第二预设阈值,第一决定系数变化量与第三预设阈值,以及第二决定系数变化量和第四预设阈值,获得手势动作的识别结果。
可选地,在S504中,还可以根据第一向量计算第二指定手指对应的第三线性回归决定系数,以及根据第二向量计算第二指定手指对应的第四线性回归决定系数。对应地,在S505中,还需要根据第三线性回归决定系数和第四线性回归决定系数,计算第二指定手指对应的第二决定系数变化量。对应地,在S506中需要对第一角度变化总量、第二角度变化总量、第一决定系数变化量和第二决定系数变化量均进行判断。可选地,若第一角度变化总量大于第一预设阈值,第二角度变化总量小于第二预设阈值,第一决定系数变化量大于第三预设阈值且第二决定系数变化量小于第四预设阈值,则确定用户实施所述指定手势动作。
其中,第四预设阈值是用于衡量第二决定系数变化量大小的标准值。若第二决定系数变化量小于第四预设阈值,则表明该第二指定手指对应的第二决定系数变化量较小,该第二指定手指的关节点的位置分布变化较小,也即该手指并未产生较大的动作。
需要说明的是,当第二指定手指的数量为多个时,若各个第二指定手指的第二决定系数变化量均小于第四预设阈值,则判断第一指定手指外的其余手指并无较大动作,而第一角度变化总量大于第一预设阈值表征第一指定手指具有较大动作,基于此,可以确定用户实施指定手势动作。其中,对各个第二指定手指的第二决定系数变化量的判断可以通过多种方式实现,一种实现方式为,将各个第二指定手指的第二决定系数变化量分别与第四预设阈值进行比较;另一种实现方式为,将各个第二指定手指的第二决定系数变化量进行比较,确定出第二决定系数变化量的最大值,将最大值与第四预设阈值进行比较,以确定第二决定系数变化量是否小于第四预设阈值。
在有些情况下,还可以对指定手势动作中具有特定特征的手指,如始终呈线性关系的手指,进行判断,以确定是否实施指定手势动作。如此,可以进一步提高对特定手势动作识别的准确率。接下来对其可选的识别过程进行详细的说明。
在本实施例一些可能的实现方式中,若第二指定手指中包括第三指定手指,其中,第三指定手指是指实施指定手势动作时手指关节点呈线性关系的手指。由于第三指定手指的关节点呈线性关系,在对第三指定手指的关节点进行回归,得到对应的回归方程时,回归方程应当具有较好的拟合优度。而拟合优度可以通过决定系数表征,因此,在对角度变化总量、决定系数变化量进行判断的基础上,还可以对第三指定手指对应的第三线性回归决定系数自身进行判断,以确定是否实施指定手势动作。在一些可能的实现方式中,根据第一角度变化总量和第一预设阈值,第二角度变化总量和第二预设阈值,第一决定系数变化量和第三预设阈值,第二决定系数变化量和第四预设阈值,第三指定手指对应的第三线性回归决定系数和第五预设阈值,获得手势动作的识别结果。
可选地,若第一角度变化总量大于第一预设阈值,第二角度变化总量小于第二预设阈值,第一决定系数变化量大于第三预设阈值,第二决定系数变化量小于第四预设阈值,且第三指定手指对应的第三线性回归决定系数大于第五预设阈值,则确定用户实施指定手势动作,该指定手势动作可以作为手势动作的识别结果。
其中,第五预设阈值是用于衡量第三指定手指对应的第三线性回归系数大小的标准值。若第三指定手指对应的第三线性回归决定系数大于第五预设阈值,表明第三指定手指对应的第三线性回归决定系数较大,拟合优度较好,该第三指定手指有较大的可能呈线性关系。
在本实施例一些可能的实现方式中,第三指定手指的数量可以为一个,也可以为多个。当第三指定手指的数量为多个时,也可以对第三指定手指对应的第三线性回归决定系数的均值进行判断。可选地,在第一角度变化总量大于第一预设阈值,第二角度变化总量小于第二预设阈值,第一决定系数变化量大于第三预设阈值,第二决定系数变化量小于第四预设阈值的前提下,若第三指定手指对应的第三线性回归决定系数的均值大于第五预设阈值,则确定用户实施指定手势动作。
在上述实施例中,第一预设阈值、第二预设阈值、第三预设阈值、第四预设阈值以及第五预设阈值可以根据经验值进行设置,针对不同的手势动作,上述预设阈值可以是不同的,本实施例对此不作限定。在本实施例中,根据第一角度变化总量和第一预设阈值,可以获得手势动作的识别结果,在此基础上,结合第二角度变化总量和第二预设阈值,第一决定系数变化量与第三预设阈值,第二决定系数变化量与第四预设阈值,第三线性回归决定系数与第五预设阈值中的任一组合进行判断,可以获得更准确的手势动作的识别结果。
由上可知,本申请实施例提供了一种手势动作的识别方法,通过在对指定手指的指定关节的角度变化进行判定的基础上,增加对指定手指对应的决定系数的变化量的判定,根据指定手指对应的决定系数的变化量可以确定指定手指的关节点的位置分布的变化情况,基于位置分布的变化情况可以进一步确定用户是否实施指定手势动作,从而提高了手势动作识别的准确率。
为了使本申请的技术方案更容易理解,下面将结合可选应用场景对本申请实施例提供的手势动作的识别方法进行介绍。
图6为本申请实施例提供的手势动作的识别方法的应用场景示意图。如图6所示,该应用场景为实时视频直播场景,该应用场景中包括终端110、服务器120、终端130。其中,终端110为主播对应的终端,终端130为观众对应的终端,服务器120为直播服务器,直播服务器120分别与主播对应的终端110和观众对应的终端130建立网络连接,从而提供相应的直播服务。主播A可以申请开通一个直播间,在该场景中,若主播A开通直播间100,当主播A通过终端110进行直播时,则访问该直播间的终端都能够接收到直播内容,如图6中所示,观众可以通过终端130进入主播A的直播间观看直播内容,需要说明的是,观众的终端的数量可以为一个,也可以为多个。观众的终端可以位于不同区域,观众的终端也可以接入不同的局域网中,图6所示的直播间100内的观众的终端部署仅为本申请的一个可选示例,并不表示终端的实际位置关系。
其中,主播可以在直播过程中实施各种各样的手势动作,通过对主播实施的手势动作进行识别,并根据识别的手势动作显示对应的动画效果,以提高互动性。识别手势动作的过程可以由终端110独立完成,可选地,主播对应的终端110采集手势图片,包括第一手势图片和第二手势图片,然后通过手势图片中关节角度的变化等信息,识别出主播所实施的手势动作,当识别到指定手势动作时,可以在界面上显示对应的动画效果。
在实际应用中,手势动作的识别也可以是由服务器完成的,如图6所示,主播实施弹指动作,主播对应的终端110实时采集手势图片,包括第一手势图片和第二手势图片,服务器120从主播对应的终端获取第一手势图片和第二手势图片,并对其进行实时识别,得到第一向量和第二向量,根据第一向量和第二向量,可以计算第一指定手指的第一指定关节的第一角度变化总量,第二指定手指的第二指定关节的第二角度变化总量,第一指定手指对应的第一决定系数变化量,第二指定手指对应的第二决定系数变化量等,根据上述角度变化总量、决定系数变化量确定主播实施了弹指动作。当服务器120确定用户实施弹指动作时,可以向主播对应的终端110和观众对应的终端130返回对应的动画效果,主播对应的终端110和观众对应的终端130在界面上显示该动画效果,以便用户观看,增强互动性。
为了便于理解手势动作的识别过程,下面将结合弹指动作对手势动作的识别进行详细说明。
针对弹指动作这一指定手势动作,第一指定手指为中指,第一指定关节包括中指上的三个手指关节。而第二指定手指为大拇指、食指、无名指和小指。第二指定关节分别为大拇指的两个手指关节、食指的三个手指关节、无名指的三个手指关节和小指的三个手指关节。
首先,可以根据手势图片对手指关节角度进行计算。可以从视频中获取包括手部的两张连续的帧图像作为第一手势图像和第二手势图像,通过预先 训练的卷积神经网络模型,可以对手势图像中的关节点进行识别,可选地,对第一手势图像进行识别,计算关节角度,得到第一向量,其中,根据各手指关节,可以计算出14个关节角度,故表征第一手势图像中手指关节角度的第一向量为14维向量。类似的,对第二手势图像进行识别,计算关节角度,得到第二向量,表征第二手势图像中手指关节角度的第二向量为14维向量。需要说明的是,关节角度可以根据式(1)计算,根据人体关节的实际情况,可以控制acos函数的输出结果在0到π之间。
在该应用场景中,可以对第一指定手指也即中指,和第二指定手指也即大拇指、食指、无名指和小指的关节的角度变化总量分别进行判断。为了方便表述,将第一向量用alpha(t)表示,第二向量用alpha(t+1)表示,第一向量中第一指定手指对应的分量用beta(t)表示,第一向量中第二指定手指对应的分量用gamma(t)表示,而第二向量中第一指定手指对应的分量用beta(t+1)表示,第二向量中第二指定手指对应的分量用gamma(t+1)表示。
根据第一向量和第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,可以为根据第一向量alpha(t)中第一指定手指对应的分量beta(t)与第二向量alpha(t+1)中第一指定手指对应的分量beta(t+1),计算出第一指定手指中第一指定关节的角度变化总量,也即中指的三个关节的角度变化总量。该角度变化总量可以通过beta(t+1)-beta(t)所有分量的和表示。接着,将该角度变化总量与第一预设阈值进行比较,其中,第一预设阈值为130,若beta(t+1)-beta(t)所有分量的和大于130,则判断中指有弹指动作。
对于第二指定手指也即gamma分量的判断,主要是为了避免误识别的情况发生。根据第一向量和第二向量,计算第二指定手指中第二指定关节的第二角度变化总量,可以为,根据第一向量alpha(t)中第二指定手指对应的分量gamma(t)与第二向量alpha(t+1)中第二指定手指对应的分量gamma(t+1),计算出第二指定手指中第二指定关节的第二角度变化总量,也即大拇指的两个关节的角度变化总量、食指的三个关节的角度变化总量、无名指的三个关节的角度变化总量以及小指的三个关节的角度变化总量。
作为一种可能的实现方式,可以将gamma(t+1)-gamma(t)按照大拇指、食指、无名指和小指对应的分量分为四部分,对这四部分分别进行求和,得到 大拇指、食指、无名指和小指各自对应的第二角度变化总量。接着将第二角度变化总量与第二预设阈值比较,其中,第二预设阈值为30,若大拇指、食指、无名指和小指各自对应的第二角度变化总量均小于30,则确定大拇指、食指、无名指和小指相对稳定,无较大动作。在有些情况下,也可以对gamma(t+1)-gamma(t)计算范数norm(gamma(t+1)-gamma(t)),若范数norm(gamma(t+1)-gamma(t))小于30,则表明大拇指、食指、无名指和小指相对稳定,无较大动作。
其次,还可以计算手指关节点位置的决定系数,判断是否实施指定手势动作。与手指关节的角度变化判断类似,对决定系数的判断也可以分为两部分,第一部分为对第一指定手指,也即中指的决定系数变化量进行判断,第二部分为对第二指定手指,也即大拇指、食指、无名指和小指的决定系数变化量进行判断。
对决定系数的计算可以参照式(2)至式(5),为了方便表述,将t时刻第i根手指的决定系数记为R²(i,t),则针对弹指动作,第一指定手指为中指,第一指定手指对应的第一线性回归决定系数即为R²(3,t),第一指定手指对应的第二线性回归决定系数为R²(3,t+1),根据第一线性回归决定系数和第二线性回归决定系数可以计算出第一决定系数变化量,第一决定系数变化量可以用R²(3,t+1)-R²(3,t)表示。将第一决定系数变化量与第三预设阈值比较,其中,第三预设阈值为0.4,若R²(3,t+1)-R²(3,t)大于0.4,则可以确定中指的关节点由弯曲的状态转向伸直的状态。
针对弹指动作,第二指定手指为大拇指、食指、无名指和小指,大拇指对应的第三线性回归决定系数为R²(1,t),大拇指对应的第四线性回归决定系数为R²(1,t+1),食指对应的第三线性回归决定系数为R²(2,t),食指对应的第四线性回归决定系数为R²(2,t+1),无名指对应的第三线性回归决定系数为R²(4,t),无名指对应的第四线性回归决定系数为R²(4,t+1),小指对应的第三线性回归决定系数为R²(5,t),小指对应的第四线性回归决定系数为R²(5,t+1)。根据上述决定系数,可以计算得到第二决定系数变化量,第二决定系数变化量可以用R²(i,t+1)-R²(i,t)表示,其中i取1,2,4,5。
将第二决定系数变化量与第四预设阈值进行比较,其中,第四预设阈值为0.2,针对i=1,2,4,5,若max(R²(i,t+1)-R²(i,t))<0.2,则第二指定手指对应的第二决定系数变化量较小,判断大拇指、食指、无名指和小指无较大的动作。
若以上条件均成立,也即sum(beta(t+1)-beta(t))>130,norm(gamma(t+1)-gamma(t))<30,R²(3,t+1)-R²(3,t)>0.4,且max(R²(i,t+1)-R²(i,t))<0.2,则可以确定用户实施弹指动作。
进一步地,由于弹指动作的第二指定手指中还包括第三指定手指,即食指、无名指以及小指。可以分别对食指、无名指以及小指的第三线性回归决定系数R²(i,t)进行判断,其中,i为2,4,5。
若R²(i,t)大于第五预设阈值,其中,第五预设阈值为0.9,即针对i=2,4,5,R²(i,t)>0.9均成立,则表明食指、无名指以及小指的关节点均呈线性关系。因此,在sum(beta(t+1)-beta(t))>130,norm(gamma(t+1)-gamma(t))<30,R²(3,t+1)-R²(3,t)>0.4,且max(R²(i,t+1)-R²(i,t))<0.2的前提下,若R²(i,t)>0.9针对i=2,4,5均成立,可以确定识别到弹指动作。
在有些情况下,也可以不对各个R²(i,t)分别进行判断,而直接对i=2,4,5时的均值average(R²(i,t))进行判断:在sum(beta(t+1)-beta(t))>130,norm(gamma(t+1)-gamma(t))<30,R²(3,t+1)-R²(3,t)>0.4,且max(R²(i,t+1)-R²(i,t))<0.2的前提下,若average(R²(i,t))>0.9成立,可以确定识别到弹指动作。
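将上述针对弹指动作的各判定条件汇总,可以写成如下示意性的判定函数。其中阈值130、30、0.4、0.2、0.9取自上文应用场景示例;函数名is_finger_flick及入参的组织方式均为本文自拟的假设,并非限定性的实现:

```python
def is_finger_flick(beta_total, gamma_totals, dr2_middle, dr2_others, r2_linear):
    """判断是否识别到弹指动作。

    beta_total:   中指三个关节的角度变化总量(度);
    gamma_totals: 大拇指、食指、无名指、小指各自的第二角度变化总量;
    dr2_middle:   中指的第一决定系数变化量,即R²(3,t+1)-R²(3,t);
    dr2_others:   其余四指各自的第二决定系数变化量;
    r2_linear:    食指、无名指、小指在t时刻的线性回归决定系数。
    """
    return (beta_total > 130              # 第一角度变化总量 > 第一预设阈值
            and max(gamma_totals) < 30    # 各第二角度变化总量 < 第二预设阈值
            and dr2_middle > 0.4          # 第一决定系数变化量 > 第三预设阈值
            and max(dr2_others) < 0.2     # 各第二决定系数变化量 < 第四预设阈值
            and min(r2_linear) > 0.9)     # 第三线性回归决定系数 > 第五预设阈值


# 示例:各项条件均满足时,识别到弹指动作
flick = is_finger_flick(150, [5, 8, 4, 6], 0.6, [0.05, 0.1, 0.02, 0.08],
                        [0.95, 0.97, 0.93])
```

识别其他手势动作时,只需替换各指定手指、指定关节及相应阈值即可复用同样的判定结构。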
该弹指动作的识别可以配合一些其他的挂件设计,应用在直播场景中。当主播作出弹指动作时,屏幕可以实时展示与弹指动作对应的动画效果。
需要说明的是,上述应用场景是以弹指动作作为示例对本申请提供的手势动作的识别方法进行示例性说明,在本实施例其他可能的实现方式中,也可以对其他手势动作,如“OK”、“剪刀手”等进行识别。当对其他手势动作进行识别时,上述实施例中的第一指定手指、第二指定手指、第三指定手指以及第一预设阈值、第二预设阈值、第三预设阈值、第四预设阈值和第五预设阈值可以进行适应性调整。
以上为本申请实施例提供的手势动作的识别方法的一些可选实现方式,基于上述实现方式,本申请还提供了一种手势动作的识别装置。
接下来,结合附图,从功能模块化的角度对本申请实施例提供的手势动作的识别装置进行介绍。
图7为本申请实施例提供的一种手势动作的识别装置的结构示意图,参见图7,该装置包括:
获取模块710,被设置为获取第一手势图片和第二手势图片;
识别模块720,被设置为识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
计算模块730,被设置为根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
确定模块740,被设置为根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
可选的,计算模块730还被设置为:
根据所述第一向量和所述第二向量,计算第二指定手指中第二指定关节的第二角度变化总量,所述第二指定手指是指实施指定手势动作时关节的角度不需要发生变化的手指,所述第二指定关节是指所述第二指定手指中的手指关节;
则确定模块740可以被设置为:
根据所述第一角度变化总量和第一预设阈值,以及所述第二角度变化总量和第二预设阈值,获得手势动作的识别结果。
可选的,计算模块730还被设置为:
根据所述第一向量计算所述第一指定手指对应的第一线性回归决定系数,以及根据所述第二向量计算所述第一指定手指对应的第二线性回归决定系数;
根据所述第一线性回归决定系数和所述第二线性回归决定系数,计算所述第一指定手指对应的第一决定系数变化量;
则确定模块740可以被设置为:
根据所述第一角度变化总量和第一预设阈值,以及所述第一决定系数变化量和第三预设阈值,获得手势动作的识别结果。
可选的,计算模块730还被设置为:
根据所述第一向量计算所述第二指定手指对应的第三线性回归决定系数,以及根据所述第二向量计算所述第二指定手指对应的第四线性回归决定系数;
根据所述第三线性回归决定系数和所述第四线性回归决定系数,计算所述第二指定手指对应的第二决定系数变化量;
则确定模块740可以被设置为:
根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,以及所述第二决定系数变化量和第四预设阈值,获得手势动作的识别结果。
可选的,确定模块740还被设置为:
若所述第二指定手指中包括第三指定手指,所述第三指定手指是指实施所述指定手势动作时手指关节点呈线性关系的手指;根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,所述第二决定系数变化量和第四预设阈值,以及所述第三指定手指对应的第三线性回归决定系数和第五预设阈值,获得手势动作的识别结果。
可选的,计算模块730可以被设置为:
根据所述第一向量和所述第二向量,计算得到差向量;
从所述差向量中获取指定手指中指定关节各自对应的角度变化量,计算角度变化量的和值得到指定手指中指定关节对应的角度变化总量。
可选的,所述指定手势动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节。
可选的,所述指定手指动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节;
所述第二指定手指包括大拇指、食指、无名指和小指。
可选的,参见图8,图8为本实施例提供的手势动作的识别装置的一个结构示意图,所述装置还包括显示模块750,被设置为:
在所述确定用户实施所述指定手势动作之后,在界面上显示与所述指定手势动作对应的动画效果。
可选的,识别模块720可以被设置为:
根据手势图片和卷积神经网络模型识别得到坐标集,所述坐标集中包括被识别的手势图片中手部各个关节点的位置坐标;
根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度;
根据所述角度生成与被识别的手势图片对应的向量,所述向量用于表征被识别的手势图片中手指关节的角度。
可选的,识别模块720可以被设置为:
根据所述坐标集中各个关节点的位置坐标,计算得到手指关节所连接的两节指节对应的两个向量;
利用反余弦函数和所述两个向量,计算手指关节对应的角度。
由上可知,本申请实施例提供了一种手势动作的识别装置,包括获取第一手势图片和第二手势图片,利用深度学习对手势图片中的关节点进行识别,得到表征手指关节角度的第一向量和第二向量,利用数学模型方法对第一向量和第二向量进行处理,确定手势图片中第一指定手指的第一指定关节的第一角度变化总量,其中,第一指定手指为实施指定手势动作时关节的角度需要发生变化的手指,第一指定关节为实施指定手势动作时第一指定手指中角度需要发生变化的手指关节,如此,当第一角度变化总量大于第一预设阈值时,可以确定用户实施指定手势动作。
本申请在通过深度学习能够准确地识别手指关节点的基础上,计算指定手指中的指定关节的角度变化总量,将该变化总量与表征实施指定手势动作时角度需要发生变化的手指中需要发生变化的关节的角度变化总量的第一预设阈值进行比较,可以实现对手势动作的识别。相较于仅能对静态手势进行识别的相关技术,本申请提供的手势动作识别方法具有更为广泛的应用前景。
上述实施例是通过模块功能化的角度,对本申请实施例提供的手势动作的识别装置进行介绍,接下来将从硬件实体化的角度对本申请实施例提供的手势动作的识别装置进行介绍。
图9为本申请实施例提供的一种手势动作的识别设备,该设备可以为终端设备。如图9所示,为了便于说明,仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端设备可以为包括手机、平板电脑、个人数字助理(英文全称:Personal Digital Assistant,英文缩写:PDA)、销售终端(英文全称:Point of Sales,英文缩写:POS)、车载电脑等任意终端设备,以终端为手机为例:
图9示出的是与本申请实施例提供的终端相关的手机的部分结构的框图。参考图9,手机包括:射频(英文全称:Radio Frequency,英文缩写:RF)电路910、存储器920、输入单元930、显示单元940、传感器950、音频电路960、无线保真(英文全称:wireless fidelity,英文缩写:WiFi)模块970、处理器980、以及电源990等部件。本领域技术人员可以理解,图9中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图9对手机的各个构成部件进行介绍:
RF电路910可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器980处理;另外,将设计上行的数据发送给基站。通常,RF电路910包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(英文全称:Low Noise Amplifier,英文缩写:LNA)、双工器等。此外,RF电路910还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(英文全称:Global System of Mobile communication,英文缩写:GSM)、通用分组无线服务(英文全称:General Packet Radio Service,GPRS)、码分多址(英文全称:Code Division Multiple Access,英文缩写:CDMA)、宽带码分多址(英文全称:Wideband Code Division Multiple Access,英文缩写:WCDMA)、长期演进(英文全称:Long Term Evolution,英文缩写:LTE)、电子邮件、短消息服务(英文全称:Short Messaging Service,SMS)等。
存储器920可被设置为存储软件程序以及模块,处理器990通过运行存储在存储器920的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器920可主要包括存储程序区和存储数据区,其中,存储程序 区可存储实施系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器920可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元930可被设置为接收输入的数字或字符信息,以及产生与手机的用户设置以及功能控制有关的键信号输入。可选地,输入单元930可包括触控面板931以及其他输入设备932。触控面板931,也称为触摸屏,可收集用户在其上或附近的触摸实施(比如用户使用手指、触笔等任何适合的物体或附件在触控面板931上或在触控面板931附近的实施),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板931可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸实施带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器980,并能接收处理器980发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板931。除了触控面板931,输入单元930还可以包括其他输入设备932。可选地,其他输入设备932可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、实施杆等中的一种或多种。
显示单元940可被设置为显示由用户输入的信息或提供给用户的信息以及手机的各种菜单。显示单元940可包括显示面板941,可选的,可以采用液晶显示器(英文全称:Liquid Crystal Display,英文缩写:LCD)、有机发光二极管(英文全称:Organic Light-Emitting Diode,英文缩写:OLED)等形式来配置显示面板941。进一步的,触控面板931可覆盖显示面板941,当触控面板931检测到在其上或附近的触摸实施后,传送给处理器980以确定触摸事件的类型,随后处理器980根据触摸事件的类型在显示面板941上提供相应的视觉输出。虽然在图9中,触控面板931与显示面板941是作为两个独立的部件来实现手机的输入和输入功能,但是在某些实施例中,可以将触控面板931与显示面板941集成而实现手机的输入和输出功能。
手机还可包括至少一种传感器950,比如光传感器、运动传感器以及其他传感器。可选地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板941的亮度,接近传感器可在手机移动到耳边时,关闭显示面板941和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路960、扬声器961,传声器962可提供用户与手机之间的音频接口。音频电路960可将接收到的音频数据转换后的电信号,传输到扬声器961,由扬声器961转换为声音信号输出;另一方面,传声器962将收集的声音信号转换为电信号,由音频电路960接收后转换为音频数据,再将音频数据输出处理器980处理后,经RF电路910以发送给比如另一手机,或者将音频数据输出至存储器920以便进一步处理。
WiFi属于短距离无线传输技术,手机通过WiFi模块970可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图9示出了WiFi模块970,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器980是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器920内的软件程序和/或模块,以及调用存储在存储器920内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器980可包括一个或多个处理单元;优选的,处理器980可集成应用处理器和调制解调处理器,其中,应用处理器主要处理实施系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器980中。
手机还包括给各个部件供电的电源990(比如电池),优选的,电源可以通过电源管理系统与处理器980逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,手机还可以包括摄像头、蓝牙模块等,在此不再赘述。
在本申请实施例中,该终端所包括的处理器980还具有以下功能:
获取第一手势图片和第二手势图片;
识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
在一些可能的实现方式中,处理器980还可以执行上述手势动作的识别方法中的任意一种实现方式的操作步骤。
本申请实施例提供的上述方法还可以由另一种手势动作的识别设备实现,该设备可以为服务器,接下来,将结合附图,对本实施例提供的服务器的结构进行详细说明。
图10为本申请实施例提供的服务器的一个结构示意图。如图10所示,该服务器1000可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)1022(例如,一个或一个以上处理器)和存储器1032,一个或一个以上存储应用程序1042或数据1044的存储介质1030(例如一个或一个以上海量存储设备)。其中,存储器1032和存储介质1030可以是短暂存储或持久存储。存储在存储介质1030的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对服务器中的一系列指令操作。更进一步地,中央处理器1022可以设置为与存储介质1030通信,在服务器1000上执行存储介质1030中的一系列指令操作。
服务器1000还可以包括一个或一个以上电源1026,一个或一个以上有线或无线网络接口1050,一个或一个以上输入输出接口1058,和/或,一个或一个以上操作系统1041,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
上述实施例中由服务器所执行的步骤可以基于该图10所示的服务器结构。
其中,CPU 1022被设置为执行如下步骤:
获取第一手势图片和第二手势图片;
识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
在本实施例一些可能的实现方式中,CPU1022还可以被设置为执行上述手势动作的识别方法的任意一种实现方式的步骤。
本申请实施例还提供一种计算机可读存储介质,被设置为存储程序代码,该程序代码被设置为执行前述各个实施例所述的一种手势动作的识别方法中的任意一种实施方式。
本申请实施例还提供一种包括指令的计算机程序产品,当其在计算机上运行时,使得计算机执行前述各个实施例所述的一种手势动作的识别方法中的任意一种实施方式。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的可选工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合 或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对相关技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(英文全称:Read-Only Memory,英文缩写:ROM)、随机存取存储器(英文全称:Random Access Memory,英文缩写:RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (15)

  1. 一种手势动作的识别方法,包括:
    服务器获取第一手势图片和第二手势图片;
    所述服务器识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
    所述服务器根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
    所述服务器根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
  2. 根据权利要求1所述的方法,其中,所述方法还包括:
    所述服务器根据所述第一向量和所述第二向量,计算第二指定手指中第二指定关节的第二角度变化总量,所述第二指定手指是指实施指定手势动作时关节的角度不需要发生变化的手指,所述第二指定关节是指所述第二指定手指中的手指关节;
    则所述服务器所述根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果,包括:
    所述服务器根据所述第一角度变化总量和第一预设阈值,以及所述第二角度变化总量和第二预设阈值,获得手势动作的识别结果。
  3. 根据权利要求1所述的方法,其中,所述方法还包括:
    所述服务器根据所述第一向量计算所述第一指定手指对应的第一线性回归决定系数,以及根据所述第二向量计算所述第一指定手指对应的第二线性回归决定系数;
    所述服务器根据所述第一线性回归决定系数和所述第二线性回归决定系数,计算所述第一指定手指对应的第一决定系数变化量;
    则所述服务器所述根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果,包括:
    所述服务器根据所述第一角度变化总量和第一预设阈值,以及所述第一决定系数变化量和第三预设阈值,获得手势动作的识别结果。
  4. 根据权利要求2所述的方法,其中,所述方法还包括:
    所述服务器根据所述第一向量计算所述第二指定手指对应的第三线性回归决定系数,以及根据所述第二向量计算所述第二指定手指对应的第四线性回归决定系数;
    所述服务器根据所述第三线性回归决定系数和所述第四线性回归决定系数,计算所述第二指定手指对应的第二决定系数变化量;
    则所述服务器所述根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果,包括:
    所述服务器根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,以及所述第二决定系数变化量和第四预设阈值,获得手势动作的识别结果。
  5. 根据权利要求4所述的方法,其中,若所述第二指定手指中包括第三指定手指,所述第三指定手指是指实施所述指定手势动作时手指关节点呈线性关系的手指;
    则所述服务器所述根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,以及所述第二决定系数变化量和第四预设阈值,获得手势动作的识别结果,包括:
    所述服务器根据所述第一角度变化总量和第一预设阈值,所述第二角度变化总量和第二预设阈值,所述第二决定系数变化量和第四预设阈值,以及所述第三指定手指对应的第三线性回归决定系数和第五预设阈值,获得手势动作的识别结果。
  6. 根据权利要求1或者2所述的方法,其中,所述服务器通过以下方式计算指定手指中指定关节对应的角度变化总量:
    根据所述第一向量和所述第二向量,计算得到差向量;
    从所述差向量中获取指定手指中指定关节各自对应的角度变化量,计算角度变化量的和值得到指定手指中指定关节对应的角度变化总量。
  7. 根据权利要求1所述的方法,其中,所述指定手势动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节。
  8. 根据权利要求2所述的方法,其中,所述指定手势动作为弹指动作,所述第一指定手指是中指;第一指定关节包括中指上的三个手指关节;所述第二指定手指包括大拇指、食指、无名指和小指。
  9. 根据权利要求1所述的方法,其中,在所述服务器所述确定用户实施所述指定手势动作之后,所述方法还包括:
    所述服务器在界面上显示与所述指定手势动作对应的动画效果。
  10. 根据权利要求1所述的方法,其中,所述服务器通过以下方式识别手势图片得到对应的向量:
    根据手势图片和卷积神经网络模型识别得到坐标集,所述坐标集中包括被识别的手势图片中手部各个关节点的位置坐标;
    根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度;
    根据所述角度生成与被识别的手势图片对应的向量,所述向量用于表征被识别的手势图片中手指关节的角度。
  11. 根据权利要求10所述的方法,其中,所述服务器所述根据所述坐标集中各个关节点的位置坐标,计算手指关节对应的角度,包括:
    所述服务器根据所述坐标集中各个关节点的位置坐标,计算得到手指关节所连接的两节指节对应的两个向量;
    所述服务器利用反余弦函数和所述两个向量,计算手指关节对应的角度。
  12. 一种手势动作的识别装置,包括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,所述程序单元由所述处理器执行,所述程序单元包括:
    获取模块,被设置为获取第一手势图片和第二手势图片;
    识别模块,被设置为识别所述第一手势图片得到第一向量,所述第一向量用于表征所述第一手势图片中手指关节的角度,以及识别所述第二手势图片得到第二向量,所述第二向量用于表征所述第二手势图片中手指关节的角度;
    计算模块,被设置为根据所述第一向量和所述第二向量,计算第一指定手指中第一指定关节的第一角度变化总量,所述第一指定手指是指实施指定手势动作时关节的角度需要发生变化的手指,所述第一指定关节是指实施指定手势动作时第一指定手指中角度需要发生变化的手指关节;
    确定模块,被设置为根据所述第一角度变化总量和第一预设阈值,获得手势动作的识别结果。
  13. 一种终端设备,所述设备包括处理器以及存储器:
    所述存储器被设置为存储可执行计算机指令,并将所述可执行计算机指令传输给所述处理器;
    所述处理器被设置为根据所述可执行计算机指令中的指令执行权利要求1至11中任一项所述的手势动作的识别方法。
  14. 一种计算机可读存储介质,所述计算机可读存储介质被设置为存储可执行计算机指令,当其在计算机上运行时,使得所述计算机执行权利要求1至11中任一项所述的手势动作的识别方法。
  15. 一种包括指令的计算机程序产品,当其在计算机上运行时,使得所述计算机执行权利要求1至11中任一项所述的手势动作的识别方法。
PCT/CN2019/084630 2018-06-07 2019-04-26 一种手势动作的识别方法、装置以及设备 WO2019233216A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19814802.5A EP3805982B1 (en) 2018-06-07 2019-04-26 Gesture recognition method, apparatus and device
US17/004,735 US11366528B2 (en) 2018-06-07 2020-08-27 Gesture movement recognition method, apparatus, and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810582795.X 2018-06-07
CN201810582795.XA CN110163045A (zh) 2018-06-07 2018-06-07 一种手势动作的识别方法、装置以及设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/004,735 Continuation US11366528B2 (en) 2018-06-07 2020-08-27 Gesture movement recognition method, apparatus, and device

Publications (1)

Publication Number Publication Date
WO2019233216A1 true WO2019233216A1 (zh) 2019-12-12

Family

ID=67644892

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/084630 WO2019233216A1 (zh) 2018-06-07 2019-04-26 一种手势动作的识别方法、装置以及设备

Country Status (4)

Country Link
US (1) US11366528B2 (zh)
EP (1) EP3805982B1 (zh)
CN (1) CN110163045A (zh)
WO (1) WO2019233216A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113038149A (zh) * 2019-12-09 2021-06-25 上海幻电信息科技有限公司 直播视频互动方法、装置以及计算机设备
CN111158478B (zh) * 2019-12-26 2023-02-03 维沃移动通信有限公司 响应方法及电子设备
CN111481208B (zh) * 2020-04-01 2023-05-12 中南大学湘雅医院 一种应用于关节康复的辅助系统、方法及存储介质
CN113553884B (zh) * 2020-04-26 2023-04-18 武汉Tcl集团工业研究院有限公司 手势识别方法、终端设备及计算机可读存储介质
CN111539352A (zh) * 2020-04-27 2020-08-14 支付宝(杭州)信息技术有限公司 一种判断人体关节运动方向的方法及系统
CN113238650B (zh) * 2021-04-15 2023-04-07 青岛小鸟看看科技有限公司 手势识别和控制的方法、装置及虚拟现实设备
WO2023070933A1 (zh) * 2021-10-26 2023-05-04 深圳市鸿合创新信息技术有限责任公司 手势识别方法、装置、设备及介质
CN115100747B (zh) * 2022-08-26 2022-11-08 山东宝德龙健身器材有限公司 基于视觉检测的跑步机智能辅助系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980107A (zh) * 2010-10-20 2011-02-23 陆钰明 Method for implementing gesture codes based on basic straight-line gestures
CN102707799A (zh) * 2011-09-12 2012-10-03 北京盈胜泰科技术有限公司 Gesture recognition method and gesture recognition apparatus
CN105278699A (zh) * 2014-09-29 2016-01-27 北京至感传感器技术研究院有限公司 Easy-to-wear gesture recognition device
CN105787439A (zh) * 2016-02-04 2016-07-20 广州新节奏智能科技有限公司 Depth-image human joint localization method based on a convolutional neural network
CN106406518A (zh) * 2016-08-26 2017-02-15 清华大学 Gesture control apparatus and gesture recognition method
CN107194344A (zh) * 2017-05-16 2017-09-22 西安电子科技大学 Human action recognition method based on adaptive skeleton centers

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9477396B2 (en) * 2008-11-25 2016-10-25 Samsung Electronics Co., Ltd. Device and method for providing a user interface
JP5615583B2 (ja) * 2010-04-08 2014-10-29 京セラ株式会社 Character input device, character input method, and character input program
US8620024B2 (en) * 2010-09-17 2013-12-31 Sony Corporation System and method for dynamic gesture recognition using geometric classification
JP2012243007A (ja) * 2011-05-18 2012-12-10 Toshiba Corp Video display device and video region selection method using the same
KR101459445B1 (ko) * 2012-12-18 2014-11-07 현대자동차 주식회사 System and method for operating a user interface using the wrist angle in a vehicle
TWI536259B (zh) * 2012-12-28 2016-06-01 緯創資通股份有限公司 Gesture recognition module and gesture recognition method
DE102013017425A1 (de) * 2013-10-19 2015-05-07 Drägerwerk AG & Co. KGaA Method for recognizing gestures of a human body
US20150177842A1 (en) * 2013-12-23 2015-06-25 Yuliya Rudenko 3D Gesture Based User Authorization and Device Control Methods
KR102303115B1 (ko) * 2014-06-05 2021-09-16 삼성전자 주식회사 Wearable device and method for providing augmented reality information via the wearable device
US10310675B2 (en) * 2014-08-25 2019-06-04 Canon Kabushiki Kaisha User interface apparatus and control method
JP6452369B2 (ja) * 2014-09-29 2019-01-16 キヤノン株式会社 Information processing apparatus, control method therefor, program, and storage medium
DE102014116292A1 (de) * 2014-11-07 2016-05-12 Visteon Global Technologies, Inc. System for transmitting information in a motor vehicle
TWI553509B (zh) * 2015-10-30 2016-10-11 鴻海精密工業股份有限公司 Gesture control system and method
EP3488324A1 (en) * 2016-07-20 2019-05-29 Usens, Inc. Method and system for 3d hand skeleton tracking
US20200167553A1 (en) * 2017-07-21 2020-05-28 Sage Senses Inc. Method, system and apparatus for gesture recognition
CN108052202B (zh) * 2017-12-11 2021-06-11 深圳市星野信息技术有限公司 3D interaction method and apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
EP3805982B1 (en) 2023-10-25
EP3805982A4 (en) 2021-07-21
US20200393911A1 (en) 2020-12-17
CN110163045A (zh) 2019-08-23
US11366528B2 (en) 2022-06-21
EP3805982A1 (en) 2021-04-14

Similar Documents

Publication Publication Date Title
WO2019233216A1 (zh) Gesture movement recognition method, apparatus, and device
EP3965003A1 (en) Image processing method and device
WO2018103525A1 (zh) Facial key point tracking method and apparatus, and storage medium
WO2020042727A1 (zh) Interaction method for an application scenario, mobile terminal, and storage medium
CN108279823A (zh) Flexible screen display method, terminal, and computer-readable storage medium
WO2019120192A1 (zh) Text editing method and mobile terminal
WO2019154360A1 (zh) Interface switching method and mobile terminal
WO2019024237A1 (zh) Information recommendation method, mobile terminal, and computer-readable storage medium
CN108170350A (zh) Method for implementing digital zoom, terminal, and computer-readable storage medium
CN108881635A (zh) Screen brightness adjustment method, mobile terminal, and computer-readable storage medium
WO2020221121A1 (zh) Video query method, apparatus, device, and storage medium
CN109726179A (zh) Screenshot image processing method, storage medium, and mobile terminal
CN109857321A (zh) Operation method based on screen projection, mobile terminal, and readable storage medium
CN108845711A (zh) Screen touch method, terminal, and computer-readable storage medium
CN111738100A (zh) Mouth-shape-based speech recognition method and terminal device
CN108600325A (zh) Method for determining push content, server, and computer-readable storage medium
CN109361864B (zh) Shooting parameter setting method and terminal device
CN108833791B (zh) Shooting method and apparatus
CN110278481A (zh) Picture-in-picture implementation method, terminal, and computer-readable storage medium
CN110347284A (zh) Flexible display screen control method, terminal, and computer-readable storage medium
CN105513098B (zh) Image processing method and apparatus
CN107179830B (zh) Information processing method for motion-sensing applications, mobile terminal, and storage medium
CN110060617B (zh) Screen light spot compensation method, apparatus, terminal, and readable storage medium
CN109547696B (zh) Shooting method and terminal device
CN109701279A (зh) Game control method, mobile terminal, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19814802

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019814802

Country of ref document: EP

Effective date: 20210111