CN110858277A - Method and device for obtaining attitude classification model - Google Patents
Method and device for obtaining attitude classification model
- Publication number
- CN110858277A CN110858277A CN201810958437.4A CN201810958437A CN110858277A CN 110858277 A CN110858277 A CN 110858277A CN 201810958437 A CN201810958437 A CN 201810958437A CN 110858277 A CN110858277 A CN 110858277A
- Authority
- CN
- China
- Prior art keywords
- image
- posture
- gesture
- preset
- classification model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/215—Motion-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/254—Analysis of motion involving subtraction of images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The application discloses a method and a device for obtaining a posture classification model. The method comprises the following steps: performing posture recognition, through a keypoint detection model, on an image containing a predetermined posture, to obtain a posture label for that image; and performing model training on the image containing the predetermined posture and the posture label, to obtain a posture classification model. The posture classification model obtained by the method has a simpler network structure and lower computing-resource requirements, and can run on mobile devices in real time. Because the model recognizes human posture through a single recognition-and-matching step, without detecting human skeletal keypoints, the recognition process is simple, recognition efficiency is high, and the processing time for a single frame is reduced.
Description
Technical Field
The application relates to the field of computer vision, and in particular to a method for obtaining a posture classification model, together with a corresponding device and electronic device. The application further relates to two gesture recognition methods, each with a corresponding device and electronic device.
Background
With the progress of technology and the development of market, intelligent devices based on computer vision are widely used, for example, various monitoring devices and intelligent game devices, which need to accurately analyze and recognize the posture information of a target object, so as to achieve the purpose of monitoring the target object or interacting with the target object.
At the present stage, human body posture recognition is an important research direction in the field of computer vision. It is widely applied in scenes such as motion-sensing game input, fall detection, identity recognition, and control of intelligent devices, and essentially amounts to locating keypoints of the human body.
At present, mainstream gesture recognition models are based on skeletal keypoints, for example the AlphaPose and OpenPose models. These models are complex and demand substantial computing resources, making it difficult to run them in real time at higher frame rates (for example, above 24 FPS) on mobile devices. Taking OpenPose as an example, the algorithm model is more than 200 MB in size; even with the support of a powerful graphics processor, it can only run at a low frame rate on mobile devices, so real-time gesture recognition cannot be guaranteed. In addition, recognizing human posture with a skeletal-keypoint model requires two stages, skeletal keypoint detection and posture recognition matching, so the recognition process is complex and recognition efficiency is low.
Disclosure of Invention
The application provides a method for obtaining a posture classification model, aiming to solve two problems with existing approaches: mobile devices cannot guarantee real-time posture recognition, and posture recognition models based on skeletal keypoints have a complex recognition process and low recognition efficiency. The application further provides a device for obtaining the posture classification model and an electronic device, as well as two gesture recognition methods, each with a corresponding device and electronic device.
The application provides a method for obtaining a posture classification model, which comprises the following steps:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
Optionally, the performing model training according to the image containing the predetermined pose and the pose tag includes:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
Optionally, the predetermined classification model is a trained image classification model, and performing model training on the predetermined classification model by using the posture features and the posture labels as training samples includes:
and carrying out transfer learning on the trained image classification model according to the training sample to obtain a posture classification model.
Optionally, the characterizing the image containing the predetermined pose includes:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, the extracting a moving target based on the Y component data of the YUV image includes:
extracting a moving target through an inter-frame difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through an optical flow method based on the Y component data of the YUV image.
Optionally, the performing pose recognition on the image including the predetermined pose by using the keypoint detection model includes:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain the gesture tag of the image containing the preset gesture.
The application also provides a gesture recognition method, which comprises the following steps:
acquiring an image to be recognized, which needs gesture recognition;
carrying out attitude classification on the image to be recognized through an attitude classification model to obtain an attitude classification result of the image to be recognized;
wherein the posture classification model is obtained by:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
Optionally, the performing posture classification on the image to be recognized through the posture classification model includes:
performing characterization processing on the image to be recognized to obtain attitude characteristics contained in the image to be recognized;
and inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification.
Optionally, the characterizing the image to be recognized includes:
converting the image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, the extracting a moving target based on the Y component data of the YUV image includes:
extracting a moving target through an inter-frame difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through an optical flow method based on the Y component data of the YUV image.
The application also provides a gesture recognition method, which comprises the following steps:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model;
and carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
Optionally, the performing model training according to the image containing the predetermined pose and the pose tag includes:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
Optionally, the predetermined classification model is a trained image classification model, and performing model training on the predetermined classification model by using the posture features and the posture labels as training samples includes:
and carrying out transfer learning on the trained image classification model according to the training sample to obtain a posture classification model.
Optionally, the characterizing the image containing the predetermined pose includes:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, the extracting a moving target based on the Y component data of the YUV image includes:
extracting a moving target through an inter-frame difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through an optical flow method based on the Y component data of the YUV image.
Optionally, the performing, by the gesture classification model, gesture classification on the image to be recognized includes:
performing characterization processing on the image to be recognized to obtain attitude characteristics contained in the image to be recognized;
inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification;
the gesture features contained in the image to be recognized and the gesture features of the image containing the preset gesture are the same type of feature set.
Optionally, the characterizing the image to be recognized includes:
converting the image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, the extracting a moving target based on the Y component data of the YUV image includes:
extracting a moving target through an inter-frame difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target through an optical flow method based on the Y component data of the YUV image.
Optionally, the performing pose recognition on the image including the predetermined pose by using the keypoint detection model includes:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain the gesture tag of the image containing the preset gesture.
The present application further provides a device for obtaining a posture classification model, including:
the gesture tag obtaining unit is used for carrying out gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
and the posture classification model obtaining unit is used for carrying out model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
The present application further provides an electronic device, comprising:
a processor;
a memory for storing a program for obtaining a pose classification model, which when read and executed by the processor, performs the following operations:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
The present application further provides a gesture recognition apparatus, including:
the device comprises a to-be-recognized image obtaining unit, a to-be-recognized image acquiring unit and a gesture recognizing unit, wherein the to-be-recognized image obtaining unit is used for obtaining an image to be recognized which needs gesture recognition;
and the gesture classification result obtaining unit is used for carrying out gesture classification on the image to be recognized through a gesture classification model to obtain a gesture classification result of the image to be recognized.
The present application further provides an electronic device, comprising:
a processor;
a memory for storing a gesture recognition program that, when read and executed by the processor, performs the following operations:
acquiring an image to be recognized, which needs gesture recognition;
and carrying out posture classification on the image to be recognized through a posture classification model to obtain a posture classification result of the image to be recognized.
The present application further provides a gesture recognition apparatus, including:
the gesture tag obtaining unit is used for carrying out gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
the attitude classification model obtaining unit is used for carrying out model training according to the image containing the preset attitude and the attitude label to obtain an attitude classification model;
and the attitude classification result obtaining unit is used for carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
The present application further provides an electronic device, comprising:
a processor;
a memory for storing a gesture recognition program that, when read and executed by the processor, performs the following operations:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model;
and carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
Compared with the prior art, the method has the following advantages:
according to the method for obtaining the posture classification model, the posture of the image containing the preset posture is identified through the key point detection model, and the posture label of the image containing the preset posture is obtained; and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model. The posture classification model obtained by the method has a simpler network structure and lower requirements on computing resources, and can run on mobile equipment in real time; in addition, the process of recognizing the human body posture by the posture classification model is only a process of posture recognition matching, human skeleton key point detection is not needed, the recognition process is simple, the recognition efficiency is high, and the processing time of a single-frame image is reduced.
Drawings
FIG. 1 is a flow chart of a method provided in a first embodiment of the present application;
FIG. 2 is a flow chart of a model training method provided in a first embodiment of the present application;
FIG. 3 is a flow chart of a method provided by a second embodiment of the present application;
FIG. 4 is a flow chart of gesture classification provided by a second embodiment of the present application;
FIG. 5 is a flow chart of a method provided by a third embodiment of the present application;
FIG. 6 is a flow chart of a model training method provided in a third embodiment of the present application;
FIG. 7 is a block diagram of the apparatus unit provided in the fourth embodiment of the present application;
fig. 8 is a schematic diagram of an electronic device provided in a fifth embodiment of the present application;
FIG. 9 is a block diagram of the apparatus provided in the sixth embodiment of the present application;
FIG. 10 is a schematic diagram of an electronic device provided by a seventh embodiment of the present application;
FIG. 11 is a block diagram of the apparatus unit provided in the eighth embodiment of the present application;
fig. 12 is a schematic diagram of an electronic device according to a ninth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
Aiming at existing gesture recognition scenes such as motion-sensing game input and limb action detection, and in order to improve the recognition efficiency of a gesture recognition model for a target posture and broaden its range of application, the application provides a method for obtaining a posture classification model, together with a corresponding device and electronic device. The application further provides two gesture recognition methods, each with a corresponding device and electronic device. The following embodiments explain the methods, devices, and electronic devices in detail.
The first embodiment of the present application provides a method for obtaining a posture classification model, which can be applied to a mobile device terminal for recognizing a human posture. Fig. 1 is a flowchart of a method for obtaining a pose classification model according to a first embodiment of the present application, and the method according to this embodiment is described in detail below with reference to fig. 1. The following description refers to embodiments for the purpose of illustrating the principles of the methods, and is not intended to be limiting in actual use.
As shown in fig. 1, the method for obtaining a pose classification model provided in this embodiment includes the following steps:
s101, performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture.
Keypoint detection is one of the basic algorithms in the field of computer vision: preset keypoints of a recognized object are detected through a predetermined detection algorithm, and the posture information of the object can then be obtained by performing posture recognition on the detected keypoints. For example, widely applied human skeletal keypoint detection (Pose Estimation) covers both multi-person and single-person cases; the positions of important points of the human body, such as the wrist, elbow, shoulder and head, are set as keypoints in advance. In this application, the detected keypoints describe the skeletal information of the human body, and the technique can be applied in many scenes, such as intelligent video monitoring, patient monitoring systems, human-computer interaction, virtual reality, human animation, smart home, intelligent security, and athlete training assistance.
In this embodiment, the keypoint detection model may be the skeletal keypoint detection model OpenPose, a real-time multi-person keypoint detection model.
The predetermined posture refers to a posture type preset for a specific application scene. For example, in a dance application within a motion-sensing game, the common flow is as follows: the user is instructed to stand at a preset position for detection and matching of human skeletal keypoints; the detected keypoints are then used to track and recognize the user's dance motions, which are compared against the dance motions preset by the machine and scored. In this process, the preset dance motion is the predetermined posture.
The source of the image containing the predetermined gesture may be a public data set or a private data set, for example, an image of a dance action pre-recorded by a camera, or an image in the public data set. The image type containing the predetermined pose may be an RGB image or a YUV image. In this embodiment, the image containing the predetermined gesture is an RGB image containing a predetermined dance motion.
In this embodiment, the above gesture recognition of the image including the predetermined gesture by the keypoint detection model may be performed as follows:
the key point detection is carried out on the image containing the preset human body posture through the skeleton key point detection model OpenPose, the key points in the image containing the preset human body posture are obtained, the process is substantially the position of each skeleton key point of the human body, and a foundation is provided for practical scenes such as follow-up further action recognition, action abnormity detection, intelligent monitoring, automatic driving and the like, and the method specifically can be as follows: the image containing the preset human body posture is used as the input of a skeleton key point detection model OpenPose, and the horizontal and vertical coordinates of each skeleton key point of the human body in the image are output; and performing gesture recognition on the obtained key points through a motion matching algorithm to obtain a recognition result, wherein the recognition result is the gesture label of the image containing the preset gesture.
And S102, performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
After the pose tag of the image containing the preset pose is obtained through the above steps, the step is used for performing model training on the image containing the preset pose and the pose tag of the image, and obtaining a pose classification model capable of classifying images of the same category of the image containing the preset pose.
The images of the same category including the images of the predetermined gesture mean that the gestures included in the images have the same category as the predetermined gesture, for example, the predetermined gesture is a dance motion preset in the motion sensing game, and the images of the same category can be the acquired dance motion of the user.
In this embodiment, the process of performing model training according to the image containing the predetermined pose and the pose tag as shown in fig. 2 includes the following steps:
and S1021, performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture.
In this embodiment, the characterizing process of the image including the predetermined pose includes the following steps:
first, an image containing a predetermined pose is converted into a YUV image. The YUV image is divided into three components, the "Y" component representing brightness, i.e., a gray value, and the "U" and "V" components representing color and saturation, for specifying the color of a pixel. In this embodiment, the image with the predetermined pose is an RGB image, and compared with the RGB image, the data storage space and the data transmission bandwidth occupied by the YUV image are much smaller, and in this embodiment, in the subsequent processing of the image, it is the texture of the image rather than the color of the image, so the RGB image with the predetermined pose needs to be converted into the YUV image, and specifically, the conversion can be performed by the following formula:
Y=0.299R+0.587G+0.114B;
U=-0.147R-0.289G+0.436B;
V=0.615R-0.515G-0.100B。
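The conversion formulas above can be applied per pixel; a direct transcription:

```python
def rgb_to_yuv(r, g, b):
    """Per-pixel RGB -> YUV conversion using the formulas given above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v
```

For a pure white pixel (255, 255, 255) this yields Y = 255 with U and V at (numerically) zero, as expected for an achromatic color; only the Y plane is used in the subsequent moving-target extraction.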
and secondly, extracting a moving target based on the Y component data of the YUV image to obtain the contour data contained in the YUV image. The method for extracting the moving target based on the Y component data of the YUV image is more, for example, the moving target is extracted by a background difference algorithm, which specifically includes: selecting a proper background image, carrying out differential operation on the current frame and the background image to obtain a differential image, selecting a proper threshold value, and carrying out binarization on the differential image; the method can also extract the moving object by an optical flow method, and the method mainly aims to calculate an optical flow field, namely, under the condition of proper smoothness constraint, a motion field is estimated according to the space-time gradient of an image sequence, and the moving object and a scene are detected and segmented by analyzing the change of the motion field.
In this embodiment, the moving object is extracted through an inter-frame difference algorithm. Its basic principle is to take the pixel-wise temporal difference between two or three adjacent frames in the image sequence and extract the motion region by thresholding. The specific method is as follows: subtract the pixel values of adjacent frames to obtain a difference image, then binarize it. Assuming the ambient brightness does not change much, a pixel whose value changes by less than a preset threshold is treated as a background pixel; if the pixel values of an image region change significantly, the change is attributed to a moving object, that region is marked as foreground, and the marked pixel region determines the position of the moving object in the image.
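The two-frame difference just described can be sketched in a few lines; frames are represented here as plain nested lists of Y values, and the threshold value 25 is illustrative, not taken from the source.

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Binarize the per-pixel absolute difference between two grayscale
    (Y-component) frames: 1 marks foreground (moving) pixels, 0 background."""
    return [
        [1 if abs(c - p) >= threshold else 0
         for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]

# A 3x3 toy example: only the centre pixel changes between frames,
# so only that pixel is marked as foreground.
prev = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
curr = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
mask = frame_difference(prev, curr)
```

The foreground pixels of `mask` delimit the moving region from which the contour data is taken.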
And finally, carrying out normalization processing on the contour data contained in the YUV image to obtain the posture features. The normalization processing aims to convert the YUV image into a corresponding unique standard form that is invariant to affine transformations such as translation, rotation and scaling.
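A minimal sketch of such a normalization, covering the translation and scale parts of the invariance only (a full affine-invariant form would also fix rotation, e.g. via principal axes); the contour representation as an N x 2 point array is an assumption of this sketch.

```python
import numpy as np

def normalize_contour(points):
    """Map contour points to a canonical form that is invariant to
    translation and scale: subtract the centroid, then divide by the
    RMS distance to the centroid."""
    pts = points - points.mean(axis=0)              # remove translation
    scale = np.sqrt((pts ** 2).sum(axis=1).mean())  # RMS radius
    return pts / scale                              # remove scale

square = np.array([[0, 0], [0, 2], [2, 0], [2, 2]], dtype=float)
canon = normalize_contour(square)
shifted = normalize_contour(square + 5.0)   # translated copy
scaled = normalize_contour(square * 3.0)    # scaled copy
```

Translated and scaled copies of the same contour map to the same standard form, which is the property the posture features rely on.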
And S1022, performing model training on a preset classification model by taking the posture features and the posture labels as training samples.
After the posture features and the posture labels of the image containing the predetermined posture are obtained through the above steps, this step performs model training by taking the obtained posture features and posture labels as training samples.
The predetermined classification model refers to a preselected image classification model that has already been trained and has a complete image classification function, for example, an image classification model trained on ImageNet images with a Caffe model. In this embodiment, the method for performing model training on the predetermined classification model by taking the posture features and the posture labels as training samples includes: performing transfer learning on the preselected, trained image classification model according to the posture features and the posture labels of the images containing the predetermined postures, so as to obtain the new posture classification model required by this embodiment, which can classify images of the same category as the images containing the predetermined postures.
The process of the transfer learning includes: training, on the basis of the preselected, trained image classification model, the data set formed by the posture features and the posture labels of the images containing the predetermined postures, and correspondingly adjusting the network architecture and other aspects of the image classification model according to the output requirements, so as to obtain a posture classification model that can classify images of the same category as the images containing the predetermined postures.
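The transfer-learning step can be illustrated with a toy sketch: a frozen "backbone" stands in for the pretrained image classification model (in practice, e.g., an ImageNet-trained Caffe model with its final layer removed), and only a new classification head is trained on the new posture data set. Everything below — the data, the backbone, and the logistic-regression head — is an illustrative assumption, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_backbone(x):
    """Stand-in for the preselected, already-trained model with its final
    layer removed; its weights stay frozen during transfer learning
    (here it is just a fixed ReLU feature map)."""
    return np.maximum(np.concatenate([x, -x], axis=1), 0.0)

# Toy "posture feature / posture label" data set standing in for the
# images containing predetermined postures and their labels.
x = rng.normal(size=(200, 2))
y = (x[:, 0] > 0).astype(float)

# Transfer learning: keep the backbone fixed and train only a new
# classification head on the new data set by logistic regression.
feats = frozen_backbone(x)
head = np.zeros(feats.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-feats @ head))      # sigmoid prediction
    head -= 0.1 * feats.T @ (p - y) / len(y)     # gradient step on log loss

accuracy = float(np.mean((feats @ head > 0) == (y > 0)))
```

The design choice mirrors the text: the expensive, general-purpose feature extractor is reused unchanged, and only the small task-specific output layer is adjusted to the new posture categories.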
In the method for obtaining a posture classification model provided in this embodiment, a key point detection model (e.g., the skeleton key point detection model OpenPose) is used to perform posture recognition on an image containing a predetermined posture, so as to obtain a posture label of that image; characterization processing is performed on the image to obtain its posture features; and, taking the posture features and the posture labels as training samples, transfer learning is performed on a preselected, trained image classification model with a complete image classification function, to obtain the posture classification model required by this embodiment, which can classify images of the same category as the images containing the predetermined posture. Compared with existing gesture recognition models based on skeleton key points, the posture classification model has a simpler network structure and lower requirements on computing resources, and can run on mobile devices in real time; in addition, recognizing a human body posture with the posture classification model is only a matter of posture recognition matching, and human skeleton key point detection is not needed, so the recognition process is simple, the recognition efficiency is high, and the processing time of a single-frame image is reduced.
The second embodiment of the present application provides a gesture recognition method, which is applicable to a gesture recognition scenario of a mobile device. Fig. 3 is a flowchart of a method provided in a second embodiment of the present application, and the method provided in this embodiment is described in detail below with reference to fig. 3.
As shown in fig. 3, the gesture recognition method provided in this embodiment includes the following steps:
S201, acquiring an image to be recognized, which needs gesture recognition.
This step acquires the image to be recognized, on which gesture recognition needs to be performed. The image to be recognized can be an image in any format containing posture information, such as a user limb action image captured in a motion sensing game.
S202, carrying out posture classification on the image to be recognized through a posture classification model to obtain a posture classification result of the image to be recognized.
This step performs gesture classification on the image to be recognized, obtained in the previous step, through a pre-trained gesture classification model, and obtains the recognition result of the image to be recognized.
The posture classification model is obtained by the following steps: performing gesture recognition on the image containing the predetermined gesture through the key point detection model to obtain a gesture tag of that image; and performing model training according to the image containing the predetermined posture and its posture label to obtain the posture classification model. This is the posture classification model obtained in the first embodiment; for the details of this part, refer to the related description provided in the first embodiment, which is not repeated herein.
In this embodiment, the process of performing pose classification on the image to be recognized through the pose classification model is shown in fig. 4, and includes the following processes:
S2021, performing characterization processing on the image to be recognized to obtain the posture features contained in the image to be recognized.
In this embodiment, the characterizing process of the image to be recognized includes the following steps:
firstly, the image to be recognized is converted into a YUV image. A YUV image has three components: the "Y" component represents brightness, i.e., a gray value, and the "U" and "V" components represent color and saturation, and specify the color of a pixel. The image to be recognized is an image in any format that can be converted into a YUV image; in this embodiment, it is an RGB image. Compared with an RGB image, a YUV image occupies much less data storage space and data transmission bandwidth; moreover, the subsequent processing in this embodiment mainly uses the texture of the image rather than its color, so the RGB image needs to be converted into a YUV image. Specifically, the conversion can be performed by the following formulas:
Y=0.299R+0.587G+0.114B;
U=-0.147R-0.289G+0.436B;
V=0.615R-0.515G-0.100B.
and secondly, extracting a moving target based on the Y component data of the YUV image to obtain the contour data contained in the YUV image. For example, the moving target may be extracted by a background difference algorithm, specifically: selecting a proper background image, performing a difference operation between the current frame and the background image to obtain a difference image, selecting a proper threshold value, and binarizing the difference image. The moving target can also be extracted by an optical flow method, whose main aim is to calculate an optical flow field: under a proper smoothness constraint, a motion field is estimated according to the space-time gradient of the image sequence, and the moving target and the scene are detected and segmented by analyzing changes in the motion field.
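The background difference variant mentioned above can be sketched the same way as frame differencing, only against a fixed background image; the threshold and array values below are illustrative.

```python
import numpy as np

def background_subtraction(background, frame, threshold=30):
    """Difference the current frame against a chosen background image and
    binarize the result with a chosen threshold (both values illustrative)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

background = np.full((3, 3), 10, dtype=np.uint8)  # the selected background image
frame = background.copy()
frame[0, 0] = 120                                 # a moving object covers one pixel
mask = background_subtraction(background, frame)
```

Unlike frame differencing, this variant detects slow-moving or momentarily stationary objects too, at the cost of needing a background image that stays valid over time.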
In this embodiment, the moving target is extracted by an interframe difference algorithm, whose basic principle is as follows: a pixel-based temporal difference between two or three adjacent frames in the image sequence is computed, and the motion area in the image is extracted by threshold conversion. The specific method is: subtracting the pixel values of corresponding pixels in adjacent frames to obtain a difference image, and binarizing the difference image; under the condition that the environmental brightness does not change much, if the change of a pixel value is less than a preset threshold value, the pixel is considered a background pixel; if the pixel values of an image area change greatly, the change is considered to be caused by a moving object in the image, and the area is marked as foreground pixels; the position of the moving object in the image can then be determined from the marked pixel areas.
And finally, carrying out normalization processing on the contour data contained in the YUV image to obtain the posture features. The normalization processing aims to convert the YUV image into a corresponding unique standard form that is invariant to affine transformations such as translation, rotation and scaling.
And S2022, inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification, and obtaining a classification result of the image to be recognized.
The third embodiment of the present application provides a gesture recognition method. Fig. 5 is a flowchart of a method provided in a third embodiment of the present application, and the method provided in this embodiment is described in detail below with reference to fig. 5.
As shown in fig. 5, the gesture recognition method provided in this embodiment includes the following steps:
S301, performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture.
Key point detection means that preset key points of the identified object are detected through a preset detection algorithm; the posture information of the identified object can then be obtained by performing posture recognition on the detected key points.
The predetermined gesture refers to a gesture type preset in a specific application scene. For example, for a dance-type application in a motion sensing game, a common situation is as follows: the user is instructed to stand at a preset position for human skeleton key point detection and matching; the detected human skeleton key points are then used to track and identify the user's dance actions; and the identified dance actions are compared and scored against dance actions preset by the machine. The preset dance actions are the predetermined gestures in this process.
The source of the image containing the predetermined gesture may be a public data set or a private data set, for example, a dance motion image pre-recorded through a camera, or a dance motion image in the public data set. The image type containing the predetermined pose may be an RGB image or a YUV image. In this embodiment, the image containing the predetermined gesture is an RGB image containing a predetermined dance motion.
The above gesture recognition of the image containing the predetermined gesture by the key point detection model may be as follows: key point detection is performed on the image containing the predetermined human body posture through the skeleton key point detection model OpenPose, so as to obtain the key points in the image; this process essentially locates each skeleton key point of the human body, and provides a foundation for practical scenes such as subsequent further action recognition, action abnormality detection, intelligent monitoring and automatic driving. A specific implementation may be as follows: the image containing the predetermined human body posture is used as the input of the skeleton key point detection model OpenPose, and the horizontal and vertical coordinates of each skeleton key point of the human body in the image are output; gesture recognition is then performed on the obtained key points through an action matching algorithm to obtain a recognition result, which is the gesture tag of the image containing the predetermined gesture.
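The action-matching step that turns detected key points into a gesture tag can be sketched as nearest-template matching. The template names, coordinates, and three-point skeletons below are hypothetical (real OpenPose output has 18+ body key points); the patent does not specify the matching algorithm, so mean Euclidean distance is used here only as a plausible stand-in.

```python
import numpy as np

# Hypothetical pose templates: each predetermined pose is a small set of
# (x, y) skeleton key points in normalized image coordinates.
TEMPLATES = {
    "arms_up":   np.array([[0.5, 0.9], [0.3, 0.2], [0.7, 0.2]]),
    "arms_down": np.array([[0.5, 0.9], [0.3, 0.8], [0.7, 0.8]]),
}

def match_pose(keypoints):
    """Action-matching sketch: label the detected key points with the
    template pose at the smallest mean Euclidean distance."""
    distances = {name: float(np.linalg.norm(keypoints - tpl, axis=1).mean())
                 for name, tpl in TEMPLATES.items()}
    return min(distances, key=distances.get)

# Key points close to the "arms_up" template yield that label, which
# becomes the gesture tag attached to the training image.
label = match_pose(np.array([[0.5, 0.88], [0.32, 0.22], [0.68, 0.19]]))
```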
S302, performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
After the posture label of the image containing the predetermined posture is obtained through the above step, this step performs model training according to the image containing the predetermined posture and its posture label, and obtains a posture classification model capable of classifying images of the same category as the image containing the predetermined posture.
In this embodiment, the process of performing model training according to the image containing the predetermined posture and the posture label, as shown in fig. 6, includes the following steps:
And S3021, performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture.
Characterizing the image containing the preset gesture includes the following steps:
first, the image containing the predetermined pose is converted into a YUV image. A YUV image has three components: the "Y" component represents brightness, i.e., a gray value, and the "U" and "V" components represent color and saturation, and specify the color of a pixel. In this embodiment, the image containing the predetermined pose is an RGB image. Compared with an RGB image, a YUV image occupies much less data storage space and data transmission bandwidth; moreover, the subsequent processing in this embodiment mainly uses the texture of the image rather than its color, so the RGB image containing the predetermined pose needs to be converted into a YUV image. Specifically, the conversion can be performed by the following formulas:
Y=0.299R+0.587G+0.114B;
U=-0.147R-0.289G+0.436B;
V=0.615R-0.515G-0.100B.
and secondly, extracting a moving target based on the Y component data of the YUV image to obtain the contour data contained in the YUV image. There are many methods for extracting the moving target based on the Y component data of the YUV image. For example, the moving target can be extracted by a background difference algorithm, which specifically includes: selecting a proper background image, performing a difference operation between the current frame and the background image to obtain a difference image, selecting a proper threshold value, and binarizing the difference image. The moving target can also be extracted by an optical flow method, whose main aim is to calculate an optical flow field: under a proper smoothness constraint, a motion field is estimated according to the space-time gradient of the image sequence, and the moving target and the scene are detected and segmented by analyzing changes in the motion field.
In this embodiment, the moving target is extracted by an interframe difference algorithm, whose basic principle is as follows: a pixel-based temporal difference between two or three adjacent frames in the image sequence is computed, and the motion area in the image is extracted by threshold conversion. The specific method is: subtracting the pixel values of corresponding pixels in adjacent frames to obtain a difference image, and binarizing the difference image; under the condition that the environmental brightness does not change much, if the change of a pixel value is less than a preset threshold value, the pixel is considered a background pixel; if the pixel values of an image area change greatly, the change is considered to be caused by a moving object in the image, and the area is marked as foreground pixels; the position of the moving object in the image can then be determined from the marked pixel areas.
And finally, carrying out normalization processing on the contour data contained in the YUV image to obtain the posture features. The normalization processing aims to convert the YUV image into a corresponding unique standard form that is invariant to affine transformations such as translation, rotation and scaling.
And S3022, performing model training on the preset classification model by taking the posture features and the posture labels as training samples.
After the posture features and the posture labels of the image containing the predetermined posture are obtained through the above steps, this step performs model training by taking the obtained posture features and posture labels as training samples.
The predetermined classification model refers to a preselected image classification model that has already been trained and has a complete image classification function, for example, an image classification model trained on ImageNet images with a Caffe model. In this embodiment, the method for performing model training on the predetermined classification model by taking the posture features and the posture labels as training samples includes: performing transfer learning on the preselected, trained image classification model according to the posture features and the posture labels of the images containing the predetermined postures, so as to obtain the posture classification model required by this embodiment, which can classify images of the same category as the images containing the predetermined postures.
The process of the transfer learning includes: training, on the basis of the preselected image classification model that has already been trained and has a complete image classification function, the new data set formed by the posture features and the posture labels of the images containing the predetermined postures, and correspondingly adjusting the network architecture and other aspects of the image classification model according to the output requirements, so as to obtain a new posture classification model capable of classifying images of the same category as the images containing the predetermined postures.
S303, carrying out posture classification on the image to be recognized through the posture classification model to obtain a posture classification result of the image to be recognized.
The image to be recognized may be an image containing posture information, such as a user limb motion image captured by a camera in a motion sensing game.
In this embodiment, the process of performing pose classification on the image to be recognized through the pose classification model may be as follows:
the image to be recognized is characterized in the same way as the image including the predetermined pose in the step S3021, and the pose feature included in the image to be recognized is obtained, where the pose feature included in the image to be recognized is the same type of feature set as the pose feature of the image including the predetermined pose. The process of obtaining the posture features contained in the image to be recognized is as follows: converting an image to be identified into a YUV image; extracting a moving target based on the Y component data of the YUV image to obtain contour data included in the YUV image, for example, extracting the moving target by a background difference algorithm or extracting the moving target by an optical flow method, in this embodiment, extracting the moving target by an inter-frame difference algorithm; and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics contained in the image to be recognized.
And inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification to obtain a classification result of the image to be recognized.
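The recognition path — frame differencing, normalization, then classification — can be sketched end to end. The feature construction and the callable classifier below are illustrative stand-ins (the real model is the trained posture classification model, and the real posture features are richer than one statistic per axis).

```python
import numpy as np

def classify_pose(prev_frame, curr_frame, classifier, threshold=25):
    """End-to-end sketch of the recognition path: extract the moving-object
    mask by interframe differencing, normalize the foreground coordinates,
    and hand the resulting feature to `classifier`, any callable standing
    in for the trained posture classification model."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    ys, xs = np.nonzero(diff > threshold)          # foreground pixel positions
    if len(xs) == 0:
        return "no_motion"
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)                        # normalize: center on centroid
    scale = np.sqrt((pts ** 2).sum(axis=1).mean())
    if scale > 0:
        pts /= scale                               # normalize: remove scale
    feature = pts.std(axis=0)                      # crude 2-D shape feature
    return classifier(feature)

prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 3:6] = 255                               # a 3 x 3 moving blob
label = classify_pose(prev, curr,
                      lambda f: "pose_a" if f[0] > 0 else "pose_b")
```

The point of the sketch is the structure: every stage before the final call is pure image processing, so the only learned component at recognition time is the classifier itself — no skeleton key point detection is involved.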
The fourth embodiment of the present application also provides a device for obtaining a posture classification model. Since the device embodiment is substantially similar to the method embodiment, the description is relatively simple; for the details of the related technical features, refer to the corresponding description of the method embodiment provided above. The following description of the device embodiment is only illustrative.
For an understanding of this embodiment, refer to fig. 7, which is a block diagram of the units of the device provided in this embodiment. As shown in fig. 7, the device provided in this embodiment includes:
a pose tag obtaining unit 401, configured to perform pose recognition on an image including a predetermined pose through the key point detection model, and obtain a pose tag of the image including the predetermined pose;
a pose classification model obtaining unit 402, configured to perform model training according to the image and the pose tag that include the predetermined pose, and obtain a pose classification model.
The posture classification model obtaining unit 402 includes:
the image posture characteristic obtaining subunit is used for carrying out characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and the model training subunit is used for performing model training on the preset classification model by taking the posture characteristics and the posture labels as training samples.
The predetermined classification model is a trained image classification model, and the model training subunit is specifically configured to: and carrying out transfer learning on the trained image classification model according to the training samples to obtain a posture classification model.
The posture feature obtaining subunit of the image is specifically configured to:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
The extracting of the moving target based on the Y component data of the YUV image comprises the following steps:
extracting a moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
The posture tag obtaining unit 401 includes:
the key point obtaining subunit is used for performing key point detection on the image containing the preset posture through the key point detection model to obtain key points in the image containing the preset posture;
and the label obtaining subunit is used for carrying out gesture recognition on the key points through an action matching algorithm to obtain a gesture label of the image containing the preset gesture.
In the foregoing embodiment, a method and a device for obtaining a pose classification model are provided, and in addition, a fifth embodiment of the present application further provides an electronic device, where the embodiment of the electronic device is as follows:
please refer to fig. 8 for understanding the present embodiment, fig. 8 is a schematic view of an electronic device provided in the present embodiment.
As shown in fig. 8, the electronic apparatus includes: a processor 501; a memory 502;
a memory 502 for storing a program for obtaining a posture classification model, which, when read and executed by the processor 501, performs the following operations:
performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
For example, the electronic device is a computer, and the computer can perform pose recognition on an image containing a predetermined pose through a key point detection model to obtain a pose tag of the image containing the predetermined pose; and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
Optionally, performing model training according to the image containing the predetermined pose and the pose tag, including:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
Optionally, the predetermined classification model is an image classification model after training, and the model training is performed on the predetermined classification model by using the posture feature and the posture label as training samples, including:
and carrying out transfer learning on the trained image classification model according to the training samples to obtain a posture classification model.
Optionally, the characterizing the image including the predetermined pose includes:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, extracting a moving target based on the Y component data of the YUV image includes:
extracting a moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
Optionally, performing pose recognition on the image including the predetermined pose through the key point detection model, including:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain a gesture label of the image containing the preset gesture.
The sixth embodiment of the present application also provides a gesture recognition apparatus. Since the apparatus embodiment is substantially similar to the method embodiment, the description is relatively simple; for the details of the related technical features, refer to the corresponding description of the method embodiment provided above. The following description of the apparatus embodiment is only illustrative.
Please refer to fig. 9 for an understanding of this embodiment; fig. 9 is a block diagram of the units of the apparatus provided in this embodiment. As shown in fig. 9, the apparatus provided in this embodiment includes:
an image to be recognized obtaining unit 601, configured to obtain an image to be recognized that needs gesture recognition;
a pose classification result obtaining unit 602, configured to perform pose classification on the image to be recognized through the pose classification model, and obtain a pose classification result of the image to be recognized.
The posture classification model is obtained by the following steps:
performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
The posture classification result obtaining unit 602 includes:
the gesture feature obtaining subunit is used for performing characterization processing on the image to be recognized to obtain the gesture features contained in the image to be recognized;
and the gesture classification subunit is used for inputting the gesture features contained in the image to be recognized into the gesture classification model for gesture classification.
The posture feature obtaining subunit is specifically configured to:
converting an image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
The extracting of the moving target based on the Y component data of the YUV image comprises the following steps:
extracting a moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting a moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
In the foregoing embodiment, a gesture recognition method and a gesture recognition apparatus are provided, and in addition, a seventh embodiment of the present application further provides an electronic device, where the embodiment of the electronic device is as follows:
please refer to fig. 10 for understanding the present embodiment, fig. 10 is a schematic view of an electronic device provided in the present embodiment.
As shown in fig. 10, the electronic apparatus includes: a processor 701; a memory 702;
the memory 702 is used for storing a program for gesture recognition, which, when read and executed by the processor 701, performs the following operations:
acquiring an image to be recognized, which needs gesture recognition;
carrying out attitude classification on the image to be recognized through an attitude classification model to obtain an attitude classification result of the image to be recognized;
wherein the posture classification model is obtained by the following method:
performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
Optionally, the gesture classification of the image to be recognized through the gesture classification model includes:
performing characterization processing on an image to be recognized to obtain attitude characteristics contained in the image to be recognized;
and inputting the posture characteristics contained in the image to be recognized into a posture classification model for posture classification.
Optionally, the characterizing the image to be recognized includes:
converting an image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, extracting a moving target based on the Y component data of the YUV image includes:
extracting the moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
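To make the three-step characterization concrete, the following is a minimal NumPy sketch of the interframe difference variant. The BT.601 luma weights, the difference threshold, and the 32×32 feature grid are illustrative assumptions, not values fixed by this application:

```python
import numpy as np

def rgb_to_y(rgb):
    """Y (luma) component of an RGB image, using BT.601 weights (an
    assumption; the text only says the image is converted to YUV)."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def frame_difference_mask(y_prev, y_curr, threshold=15.0):
    """Interframe difference: pixels whose Y value changed beyond a
    threshold are treated as belonging to the moving target."""
    return (np.abs(y_curr - y_prev) > threshold).astype(np.uint8)

def normalized_pose_feature(mask, size=(32, 32)):
    """Normalize the extracted contour data: crop the moving region to
    its bounding box and resample it to a fixed grid, so the feature is
    independent of the target's position and scale in the frame."""
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return np.zeros(size, dtype=np.float32)
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1].astype(np.float32)
    rows = np.linspace(0, crop.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, crop.shape[1] - 1, size[1]).astype(int)
    return crop[np.ix_(rows, cols)]  # nearest-neighbour resample
```

The background difference and optical flow variants would replace only `frame_difference_mask`; the conversion and normalization steps stay the same.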
The eighth embodiment of the present application further provides a gesture recognition apparatus. Since the apparatus embodiment is substantially similar to the method embodiments, its description is relatively brief; details of the related technical features can be found in the corresponding descriptions of the method embodiments provided above. The following description of the apparatus embodiment is illustrative only.
Please refer to fig. 11, which is a unit block diagram of the apparatus provided in this embodiment. As shown in fig. 11, the apparatus includes:
a pose tag obtaining unit 801, configured to perform pose recognition on an image including a predetermined pose through the key point detection model, and obtain a pose tag of the image including the predetermined pose;
a pose classification model obtaining unit 802, configured to perform model training according to an image including a predetermined pose and a pose tag, to obtain a pose classification model;
a pose classification result obtaining unit 803, configured to perform pose classification on the image to be recognized through the pose classification model, and obtain a pose classification result of the image to be recognized.
The pose classification model obtaining unit 802 includes:
the gesture feature obtaining subunit is used for performing characterization processing on the image containing the preset gesture to obtain a gesture feature of the image containing the preset gesture;
and the model training subunit is used for performing model training on the preset classification model by taking the posture characteristics and the posture labels as training samples.
The predetermined classification model is a trained image classification model, and the model training subunit is specifically configured to:
and carrying out transfer learning on the trained image classification model according to the training samples to obtain a posture classification model.
The gesture feature obtaining subunit is specifically configured to:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Extracting the moving target based on the Y component data of the YUV image includes:
extracting the moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
The posture classification result obtaining unit 803 includes:
the gesture feature obtaining subunit is used for performing characterization processing on the image to be recognized to obtain the gesture features contained in the image to be recognized;
the gesture classification subunit is used for inputting gesture features contained in the image to be recognized into a gesture classification model for gesture classification;
the gesture features contained in the image to be recognized and the gesture features of the image containing the preset gesture are the same type of feature set.
The gesture feature obtaining subunit is specifically configured to:
converting an image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Extracting the moving target based on the Y component data of the YUV image includes:
extracting the moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
The posture tag obtaining unit 801 includes:
the key point obtaining subunit is used for performing key point detection on the image containing the preset posture through the key point detection model to obtain key points in the image containing the preset posture;
and the gesture tag obtaining subunit is used for performing gesture recognition on the key points through a motion matching algorithm to obtain a gesture tag of the image containing the preset gesture.
The foregoing embodiments provide a gesture recognition method and a gesture recognition apparatus. In addition, a ninth embodiment of the present application provides an electronic device, which is described as follows:
please refer to fig. 12, which is a schematic diagram of the electronic device provided in this embodiment.
As shown in fig. 12, the electronic apparatus includes: a processor 901; a memory 902;
the memory 902 is used for storing a program for obtaining a pose classification model, which when read and executed by the processor 901 performs the following operations:
performing gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture; performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model; and carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
Optionally, performing model training according to the image containing the predetermined pose and the pose tag, including:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
Optionally, the predetermined classification model is a trained image classification model, and performing model training on the predetermined classification model by taking the posture features and the posture labels as training samples includes:
and carrying out transfer learning on the trained image classification model according to the training samples to obtain a posture classification model.
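As a rough illustration of this transfer learning step, the sketch below freezes a stand-in feature extractor and retrains only a new classification head. The random backbone, toy data, and hyperparameters are all invented for the example; in the scheme described above, the frozen part would be the trained image classification model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained backbone: a frozen feature extractor whose
# weights are not updated during transfer learning.
W_frozen = rng.normal(size=(64, 16))

def features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen ReLU projection

def train_head(X, y, n_classes, epochs=200, lr=0.1):
    """Transfer learning reduced to its core: fit only a new softmax
    classification head on top of the frozen features."""
    F = features(X)
    W = np.zeros((F.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        logits = F @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * F.T @ (p - onehot) / len(X)  # cross-entropy gradient step
    return W

# Toy training set standing in for (pose feature, pose label) samples.
X = np.vstack([rng.normal(-1.0, 0.1, size=(50, 64)),
               rng.normal(+1.0, 0.1, size=(50, 64))])
y = np.array([0] * 50 + [1] * 50)
W_head = train_head(X, y, n_classes=2)
pred = (features(X) @ W_head).argmax(axis=1)
```

Because only the small head is trained, far fewer labeled pose samples are needed than for training a classifier from scratch, which is the usual motivation for transfer learning here.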
Optionally, the characterizing the image including the predetermined pose includes:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, extracting a moving target based on the Y component data of the YUV image includes:
extracting the moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
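Of the three listed extraction options, the background difference variant can be sketched as follows. The running-average background model, its update rate, and the threshold are assumptions for illustration, since the text does not fix a particular background model:

```python
import numpy as np

def update_background(background, y_frame, alpha=0.05):
    """Exponential running average of the Y channel as a simple
    background model; static pixels converge to the background."""
    return (1.0 - alpha) * background + alpha * y_frame

def background_difference_mask(y_frame, background, threshold=20.0):
    """Foreground mask: pixels whose Y value differs from the background
    model by more than the threshold belong to the moving target."""
    return (np.abs(y_frame - background) > threshold).astype(np.uint8)
```

Unlike interframe differencing, this variant also detects a target that pauses briefly, at the cost of maintaining the background model between frames.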
Optionally, the gesture classification of the image to be recognized through the gesture classification model includes:
performing characterization processing on an image to be recognized to obtain attitude characteristics contained in the image to be recognized;
inputting the posture characteristics contained in the image to be recognized into a posture classification model for posture classification;
the gesture features contained in the image to be recognized and the gesture features of the image containing the preset gesture are the same type of feature set.
Optionally, the characterizing the image to be recognized includes:
converting an image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
Optionally, extracting a moving target based on the Y component data of the YUV image includes:
extracting the moving target by an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
Optionally, performing pose recognition on the image including the predetermined pose through the key point detection model, including:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain a gesture label of the image containing the preset gesture.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
Although the present application has been described with reference to preferred embodiments, they are not intended to limit it. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; the scope of protection of the present application should therefore be determined by the claims that follow.
Claims (22)
1. A method of obtaining a pose classification model, comprising:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
2. The method of claim 1, wherein the model training from the image containing the predetermined pose and the pose tag comprises:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
3. The method of claim 2, wherein the predetermined classification model is a trained image classification model, and the model training of the predetermined classification model using the pose features and the pose labels as training samples comprises:
and carrying out transfer learning on the trained image classification model according to the training sample to obtain a posture classification model.
4. The method of claim 2, wherein the characterizing the image containing the predetermined pose comprises:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
5. The method of claim 4, wherein extracting a moving target based on the Y component data of the YUV image comprises:
extracting the moving target through an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
6. The method of claim 1, wherein the performing pose recognition on the image containing the predetermined pose by the keypoint detection model comprises:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain the gesture tag of the image containing the preset gesture.
7. A gesture recognition method, comprising:
acquiring an image to be recognized, which needs gesture recognition;
carrying out attitude classification on the image to be recognized through an attitude classification model to obtain an attitude classification result of the image to be recognized;
wherein the posture classification model is obtained by:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
and performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
8. The method of claim 7, wherein the pose classification of the image to be recognized by the pose classification model comprises:
performing characterization processing on the image to be recognized to obtain attitude characteristics contained in the image to be recognized;
and inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification.
9. The method according to claim 8, wherein the characterizing the image to be recognized comprises:
converting the image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
10. The method of claim 9, wherein extracting a moving target based on the Y component data of the YUV image comprises:
extracting the moving target through an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
11. A gesture recognition method, comprising:
performing gesture recognition on an image containing a preset gesture through a key point detection model to obtain a gesture tag of the image containing the preset gesture;
performing model training according to the image containing the preset posture and the posture label to obtain a posture classification model;
and carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
12. The method of claim 11, wherein the model training from the image containing the predetermined pose and the pose tag comprises:
performing characterization processing on the image containing the preset posture to obtain the posture characteristic of the image containing the preset posture;
and taking the posture features and the posture labels as training samples, and carrying out model training on a preset classification model.
13. The method of claim 12, wherein the predetermined classification model is a trained image classification model, and the model training of the predetermined classification model using the pose features and the pose labels as training samples comprises:
and carrying out transfer learning on the trained image classification model according to the training sample to obtain a posture classification model.
14. The method of claim 12, wherein the characterizing the image containing the predetermined pose comprises:
converting the image containing the preset gesture into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
15. The method of claim 14, wherein extracting a moving target based on the Y component data of the YUV image comprises:
extracting the moving target through an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
16. The method according to any one of claims 12-15, wherein the pose classification of the image to be recognized by the pose classification model comprises:
performing characterization processing on the image to be recognized to obtain attitude characteristics contained in the image to be recognized;
inputting the posture characteristics contained in the image to be recognized into the posture classification model for posture classification;
the gesture features contained in the image to be recognized and the gesture features of the image containing the preset gesture are the same type of feature set.
17. The method according to claim 16, wherein the characterizing the image to be recognized comprises:
converting the image to be identified into a YUV image;
extracting a moving target based on Y component data of the YUV image to obtain contour data contained in the YUV image;
and carrying out normalization processing on the contour data contained in the YUV image to obtain the attitude characteristics.
18. The method of claim 17, wherein extracting a moving target based on the Y component data of the YUV image comprises:
extracting the moving target through an interframe difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target through a background difference algorithm based on the Y component data of the YUV image; or,
extracting the moving target by an optical flow method based on the Y component data of the YUV image.
19. The method of claim 11, wherein the performing pose recognition on the image containing the predetermined pose by the keypoint detection model comprises:
performing key point detection on the image containing the preset posture through a key point detection model to obtain key points in the image containing the preset posture;
and performing gesture recognition on the key points through a motion matching algorithm to obtain the gesture tag of the image containing the preset gesture.
20. An apparatus for obtaining a pose classification model, comprising:
the gesture tag obtaining unit is used for carrying out gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
and the posture classification model obtaining unit is used for carrying out model training according to the image containing the preset posture and the posture label to obtain a posture classification model.
21. An attitude recognition apparatus characterized by comprising:
the device comprises a to-be-recognized image obtaining unit, a to-be-recognized image acquiring unit and a gesture recognizing unit, wherein the to-be-recognized image obtaining unit is used for obtaining an image to be recognized which needs gesture recognition;
and the gesture classification result obtaining unit is used for carrying out gesture classification on the image to be recognized through a gesture classification model to obtain a gesture classification result of the image to be recognized.
22. An attitude recognition apparatus characterized by comprising:
the gesture tag obtaining unit is used for carrying out gesture recognition on the image containing the preset gesture through the key point detection model to obtain a gesture tag of the image containing the preset gesture;
the attitude classification model obtaining unit is used for carrying out model training according to the image containing the preset attitude and the attitude label to obtain an attitude classification model;
and the attitude classification result obtaining unit is used for carrying out attitude classification on the image to be recognized through the attitude classification model to obtain an attitude classification result of the image to be recognized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810958437.4A CN110858277A (en) | 2018-08-22 | 2018-08-22 | Method and device for obtaining attitude classification model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110858277A true CN110858277A (en) | 2020-03-03 |
Family
ID=69635782
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810958437.4A Pending CN110858277A (en) | 2018-08-22 | 2018-08-22 | Method and device for obtaining attitude classification model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110858277A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646425A (en) * | 2013-11-20 | 2014-03-19 | 深圳先进技术研究院 | A method and a system for body feeling interaction |
CN104616028A (en) * | 2014-10-14 | 2015-05-13 | 北京中科盘古科技发展有限公司 | Method for recognizing posture and action of human limbs based on space division study |
CN107609479A (en) * | 2017-08-09 | 2018-01-19 | 上海交通大学 | Attitude estimation method and system based on the sparse Gaussian process with noise inputs |
CN108062526A (en) * | 2017-12-15 | 2018-05-22 | 厦门美图之家科技有限公司 | A kind of estimation method of human posture and mobile terminal |
CN108304819A (en) * | 2018-02-12 | 2018-07-20 | 北京易真学思教育科技有限公司 | Gesture recognition system and method, storage medium |
Non-Patent Citations (1)
Title |
---|
Guo Jun et al.: "Moving human body pose recognition based on multi-neural-network fusion" *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112102947A (en) * | 2020-04-13 | 2020-12-18 | 国家体育总局体育科学研究所 | Apparatus and method for body posture assessment |
CN112102947B (en) * | 2020-04-13 | 2024-02-13 | 国家体育总局体育科学研究所 | Apparatus and method for body posture assessment |
CN111899192A (en) * | 2020-07-23 | 2020-11-06 | 北京字节跳动网络技术有限公司 | Interaction method, interaction device, electronic equipment and computer-readable storage medium |
CN111899192B (en) * | 2020-07-23 | 2022-02-01 | 北京字节跳动网络技术有限公司 | Interaction method, interaction device, electronic equipment and computer-readable storage medium |
US11842425B2 (en) | 2020-07-23 | 2023-12-12 | Beijing Bytedance Network Technology Co., Ltd. | Interaction method and apparatus, and electronic device and computer-readable storage medium |
CN111931725A (en) * | 2020-09-23 | 2020-11-13 | 北京无垠创新科技有限责任公司 | Human body action recognition method, device and storage medium |
CN111931725B (en) * | 2020-09-23 | 2023-10-13 | 北京无垠创新科技有限责任公司 | Human motion recognition method, device and storage medium |
CN113190104A (en) * | 2021-01-18 | 2021-07-30 | 郭奕忠 | Method for realizing man-machine interaction by recognizing human actions through visual analysis by intelligent equipment |
CN114393575A (en) * | 2021-12-17 | 2022-04-26 | 重庆特斯联智慧科技股份有限公司 | Robot control method and system based on high-efficiency recognition of user posture |
CN114393575B (en) * | 2021-12-17 | 2024-04-02 | 重庆特斯联智慧科技股份有限公司 | Robot control method and system based on high-efficiency recognition of user gestures |
CN115270997A (en) * | 2022-09-20 | 2022-11-01 | 中国人民解放军32035部队 | Rocket target attitude stability discrimination method based on transfer learning and related device |
CN115270997B (en) * | 2022-09-20 | 2022-12-27 | 中国人民解放军32035部队 | Rocket target attitude stability discrimination method based on transfer learning and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||