WO2021190046A1 - Training method for gesture recognition model, gesture recognition method and device - Google Patents
Training method for gesture recognition model, gesture recognition method and device
- Publication number
- WO2021190046A1 (PCT/CN2020/141233)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- gesture
- gesture recognition
- model
- fusion
- category
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
- G06V40/113—Recognition of static hand signs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
Definitions
- the embodiments of the present disclosure relate to the technical field of gesture recognition, and in particular, to a method for training a gesture recognition model, a method and device for gesture recognition.
- the embodiments of the present disclosure provide a gesture recognition method, a gesture recognition module, and a display device, which are used to improve the recognition accuracy of a gesture recognition model.
- embodiments of the present disclosure provide a method for training a gesture recognition model, including:
- the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
- the fusion model includes a plurality of gesture recognition models
- the target weight combination of the fusion model is determined from a plurality of weight combinations to be trained, wherein each of the plurality of weight combinations to be trained includes a weight respectively corresponding to each of the gesture recognition models.
- the determining the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction score includes:
- the weight combination to be trained whose recognition accuracy exceeds a preset threshold is determined as the target weight combination.
- the calculation of the recognition accuracy of the fusion model corresponding to the multiple weight combinations to be trained according to the prediction score includes:
- the obtaining the prediction scores of the multiple gesture recognition models for each of the multiple gesture sample images in each of the categories includes:
- the predicted score is normalized to obtain a normalized predicted score.
- embodiments of the present disclosure provide a gesture recognition method, including:
- the target weight combination includes a weight corresponding to each gesture recognition model
- the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
- embodiments of the present disclosure provide a training module for a gesture recognition model, including:
- the first acquisition module is configured to acquire a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
- the second acquisition module is configured to acquire a fusion model, where the fusion model includes multiple gesture recognition models;
- the third acquiring module is configured to acquire the prediction scores of each of the gesture recognition models in each of the categories for the plurality of gesture sample images;
- the training module is configured to determine the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction scores, wherein each weight combination to be trained includes the weight corresponding to each of the gesture recognition models.
- the training module includes:
- the calculation sub-module is configured to calculate the recognition accuracy rates of the fusion models corresponding to the multiple weight combinations to be trained according to the prediction scores;
- the determining sub-module is configured to determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
- the calculation sub-module includes:
- the first execution unit is configured to perform the following operations for each gesture sample image:
- the second execution unit is configured to determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
- the third acquisition module includes:
- the normalization processing sub-module is used to normalize the predicted scores to obtain the normalized predicted scores.
- a gesture recognition module including:
- the first acquisition module is configured to acquire, from each gesture recognition model in the fusion model, the prediction scores of the gesture image to be recognized in each category, the fusion model including a plurality of gesture recognition models;
- the second acquiring module is configured to acquire a target weight combination of the fusion model, and the target weight combination includes a weight corresponding to each gesture recognition model;
- the processing module is configured to, for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain a weighted prediction score, and add the weighted prediction scores of all the gesture recognition models to obtain a fusion prediction score for each category;
- the third acquiring module is configured to acquire the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
- the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
- the embodiments of the present disclosure provide a training module for a gesture recognition model, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor; when the computer program is executed by the processor, the steps of the above training method of the gesture recognition model are implemented.
- embodiments of the present disclosure provide a gesture recognition module, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor; when the computer program is executed by the processor, the steps of the above gesture recognition method are implemented.
- the embodiments of the present disclosure provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above training method of the gesture recognition model are implemented, or the steps of the above gesture recognition method are implemented.
- the weights of multiple gesture recognition models in the fusion model are trained through the training set, and the weights of each gesture recognition model can be learned accurately and adaptively.
- FIG. 1 is a schematic flowchart of a training method of a gesture recognition model according to an embodiment of the present disclosure
- FIG. 2 is a schematic flowchart of a method for training a gesture recognition model according to another embodiment of the present disclosure
- FIG. 3 is a schematic flowchart of a method for training a gesture recognition model according to another embodiment of the present disclosure
- FIG. 4 is a schematic diagram of a gesture recognition method according to an embodiment of the present disclosure.
- FIG. 5 is a schematic structural diagram of a training module of a gesture recognition model according to an embodiment of the disclosure.
- FIG. 6 is a schematic structural diagram of a gesture recognition module according to an embodiment of the disclosure.
- FIG. 7 is a schematic structural diagram of a training module of a gesture recognition model according to another embodiment of the present disclosure.
- FIG. 8 is a schematic structural diagram of a gesture recognition module according to another embodiment of the present disclosure.
- Multi-model fusion refers to fusing the prediction results of multiple models in some way and using the fused result as the final recognition and classification result.
- Commonly used fusion methods are voting, averaging, and taking the maximum. When the classification performance of the different models differs little — that is, most samples are classified correctly by every model and the models disagree only on a small number of samples — the voting, averaging, and maximum methods do little to further improve the final recognition accuracy.
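The three conventional fusion strategies named above can be sketched as follows; the score values and the three-model setup are illustrative stand-ins, not data from the disclosure:

```python
import numpy as np

# Hypothetical per-category scores from 3 models for one sample, 4 categories.
scores = np.array([
    [0.1, 0.6, 0.2, 0.1],   # model 1
    [0.2, 0.5, 0.2, 0.1],   # model 2
    [0.4, 0.3, 0.2, 0.1],   # model 3
])

def vote_fusion(scores):
    # Each model votes for its top category; the majority wins.
    votes = np.argmax(scores, axis=1)
    return int(np.bincount(votes, minlength=scores.shape[1]).argmax())

def average_fusion(scores):
    # Average the per-category scores across models, then take the best category.
    return int(scores.mean(axis=0).argmax())

def max_fusion(scores):
    # Take the maximum score per category across models, then the best category.
    return int(scores.max(axis=0).argmax())

print(vote_fusion(scores), average_fusion(scores), max_fusion(scores))  # -> 1 1 1
```

Note that all three strategies agree here because the models largely agree — exactly the regime in which, as the text observes, these fixed rules cannot improve accuracy further.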
- An embodiment of the present disclosure provides a method for training a gesture recognition model, including:
- Step 11 Obtain a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
- the categories of gestures include, for example, fist, OK, finger heart, waving, farewell, prayer, and thumbs-up.
- the multiple gesture sample images include all types of gestures that are desired to be recognized.
- Each of the gesture sample images is pre-marked with the information of the type of the gesture contained in it, so as to verify the recognition accuracy corresponding to the weight combination of the subsequent training.
- each of the gesture sample images contains a gesture of a single category, which makes the training process easier.
- the number of gesture sample images included in the training set exceeds a preset threshold, for example, more than 100, and a large amount of sample data makes the training result more accurate.
- Step 12 Obtain a fusion model, where the fusion model includes multiple gesture recognition models;
- the multiple gesture recognition models are different models, for example models using different gesture recognition algorithms.
- the plurality of gesture recognition models may include at least one of the following: a gesture recognition model based on CPnet (Common Deep Network), a gesture recognition model based on an LSTM (Long Short-Term Memory) network, a gesture recognition model based on template matching, and so on.
- Step 13 Obtain the prediction scores of each of the gesture recognition models for the plurality of gesture sample images in each of the categories;
- the multiple gesture sample images are part or all of the multiple gesture sample images in the training set.
- the prediction score of each gesture recognition model for each gesture sample image in each of the categories is obtained.
- the predicted score is a numerical value, which can be a decimal, positive or negative number, and so on.
- suppose the training set includes N gesture sample images, and the N gesture sample images contain c categories of gestures.
- for each gesture sample image, a gesture recognition model outputs c prediction scores [S_1, S_2, ..., S_c]: one gesture sample image and one category correspond to one prediction score.
- the prediction scores of gesture recognition model M_i over the training set thus form an N×c matrix: the first row of the matrix M_i holds the prediction scores of the first gesture sample image of the training set in each category, the N-th row holds the prediction scores of the N-th gesture sample image in each category, and so on.
- S_11 is the prediction score of the first gesture sample image in the first category, S_1c is the prediction score of the first gesture sample image in the c-th category, and so on.
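The N×c score matrix M_i described above can be assembled as below; `model_predict` is a random placeholder standing in for a real recognizer, and the shapes are illustrative:

```python
import numpy as np

N, c = 5, 3                    # illustrative: 5 sample images, 3 categories
rng = np.random.default_rng(0)

def model_predict(image):
    # Placeholder for a real gesture recognition model: returns c scores.
    return rng.random(c)

images = [f"img_{k}" for k in range(N)]    # hypothetical image identifiers
M_i = np.stack([model_predict(img) for img in images])   # shape (N, c)
# M_i[0] holds the scores of the first sample image in each category (S_11 ... S_1c),
# M_i[N-1] those of the N-th sample image.
print(M_i.shape)
```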
- Step 14 Determine the target weight combination of the fusion model from multiple weight combinations to be trained according to the prediction scores, wherein each of the multiple weight combinations to be trained includes the weight corresponding to each of the gesture recognition models.
- the multiple weight combinations to be trained refer to multiple weight combinations participating in training.
- Each weight combination to be trained includes: the weight corresponding to each gesture recognition model.
- the generated weight combination to be trained is (w_1, w_2, w_3, w_4), where w_1 is the weight corresponding to M_1, w_2 the weight corresponding to M_2, w_3 the weight corresponding to M_3, and w_4 the weight corresponding to M_4. The weight combination to be trained is trained, and one or more weights in it are adjusted according to the training result to obtain a new weight combination to be trained, for example (w_1', w_2, w_3', w_4); the adjusted weight combination continues to be trained, and so on, until a target weight combination (w_s1, w_s2, w_s3, w_s4) that meets the training requirements is obtained.
- the weights of multiple gesture recognition models in the fusion model are trained through the training set, and the weights of each gesture recognition model can be learned accurately and adaptively.
- the obtaining of the prediction scores of the multiple gesture recognition models for each of the multiple gesture sample images in each category includes: normalizing the prediction scores to obtain normalized prediction scores. That is, the prediction results of the multiple gesture recognition models for all gesture sample images are normalized to a uniform range, such as [0,1], to facilitate calculation.
- a variety of normalization methods can be used to normalize the prediction scores, such as min-max (0,1) normalization, Z-score standardization, the Sigmoid function, and so on.
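Min-max normalization, one of the options mentioned above, can be sketched as follows; the raw score values are illustrative:

```python
import numpy as np

def min_max_normalize(scores):
    """Scale raw scores (possibly negative, possibly decimals) into [0, 1]."""
    lo, hi = scores.min(), scores.max()
    if hi == lo:                       # degenerate case: all scores identical
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)

raw = np.array([-2.0, 0.0, 3.0])       # raw model outputs, any range
print(min_max_normalize(raw))          # -> [0.  0.4 1. ]
```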
- the determining the target weight combination of the fusion model from the weight combination to be trained according to the prediction score includes:
- Step 21 Calculate, according to the prediction scores, the recognition accuracy rate of the fusion model corresponding to each of the multiple weight combinations to be trained;
- Step 22 Determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
- the preset threshold is 98% or 99%, etc., which can be set as required.
- a weight combination to be trained is generated in advance, and the recognition accuracy rate of the fusion model corresponding to it is calculated according to the prediction scores. If the recognition accuracy rate is lower than the preset threshold, the weight combination is adjusted to obtain a new weight combination to be trained, and the recognition accuracy rate of the fusion model corresponding to the new combination is again calculated according to the prediction scores, and so on, until a weight combination to be trained whose recognition accuracy rate exceeds the preset threshold is found and determined as the target weight combination.
- the weight combination whose recognition accuracy exceeds a preset threshold is used as the target weight combination, which can effectively improve the accuracy of gesture recognition of the fusion model.
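The disclosure describes an adjust-and-retry loop without fixing a particular search procedure; a minimal random-search sketch under that assumption, with scores arranged as a (p, N, c) array of p models' normalized scores on N samples:

```python
import numpy as np

rng = np.random.default_rng(1)

def fusion_accuracy(weights, scores, labels):
    # Weighted sum of per-model scores, then argmax per sample vs. labels.
    fused = np.tensordot(weights, scores, axes=1)        # shape (N, c)
    return float((fused.argmax(axis=1) == labels).mean())

def search_target_weights(scores, labels, threshold=0.99, max_iters=500):
    # Random search is a stand-in for the patent's unspecified adjustment rule.
    p = scores.shape[0]
    best_w, best_acc = None, -1.0
    for _ in range(max_iters):
        w = rng.random(p)
        w /= w.sum()                                     # keep weights normalized
        acc = fusion_accuracy(w, scores, labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
        if acc > threshold:                              # accuracy exceeds threshold:
            break                                        # target combination found
    return best_w, best_acc

# Toy check: one perfect model, one uninformative model (illustrative data).
labels = np.array([0, 1, 2, 1])
scores = np.stack([np.eye(3)[labels], np.full((4, 3), 1 / 3)])
w, acc = search_target_weights(scores, labels, threshold=0.9)
```

Any positive weight on the perfect model already yields 100% accuracy here, so the search terminates immediately; in practice the loop matters when no single model is perfect.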
- the calculation of the recognition accuracy of the fusion model corresponding to the plurality of weight combinations to be trained according to the prediction score includes:
- Step 31 For each of the gesture sample images, perform the following operations:
- Step 311 For each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain a weighted prediction score;
- the training set includes 100 gesture sample images
- the 100 gesture sample images include 24 types of gestures
- the fusion model includes 4 gesture recognition models.
- a weight combination to be trained is (w 1 , w 2 , w 3 , w 4 ).
- the prediction score S_{1,24} predicted by gesture recognition model 1 in category 24 is multiplied by the weight w_1 corresponding to gesture recognition model 1 to obtain the weighted prediction score, that is, S_{1,24} × w_1.
- Step 312 For each of the categories, add the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
- Step 313 Use the category with the largest fusion prediction score as the recognized gesture category
- Step 314 Compare the recognized gesture category with the pre-marked gesture category to determine whether the recognition is correct
- assuming that the recognized gesture category is category 6: if the pre-labeled gesture category is also category 6, the recognition is correct; if the pre-labeled gesture category is category 8, the recognition is considered wrong.
- Step 32 Determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
- the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained is determined according to whether the recognition of all the gesture sample images is correct.
- if, with the weight combination (w_1, w_2, w_3, w_4), 85 of the 100 gesture sample images are correctly recognized, the recognition accuracy rate of the weight combination (w_1, w_2, w_3, w_4) is 85%.
- the weight combination to be trained whose recognition accuracy exceeds a preset threshold is determined as the target weight combination.
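Steps 31 and 32 can be sketched directly as below; the scores, labels, and candidate weights are random stand-ins for the worked example's 100 images, 24 categories, and 4 models:

```python
import numpy as np

rng = np.random.default_rng(2)

N, c, p = 100, 24, 4                        # per the worked example above
scores = rng.random((p, N, c))              # stand-in model scores, not real models
labels = rng.integers(0, c, size=N)         # stand-in pre-labeled categories
weights = np.array([0.4, 0.3, 0.2, 0.1])    # one candidate weight combination

correct = 0
for n in range(N):                          # Step 31: per gesture sample image
    fused = np.zeros(c)
    for i in range(p):                      # Steps 311-312: weight and sum scores
        fused += weights[i] * scores[i, n]
    recognized = fused.argmax()             # Step 313: largest fused score wins
    correct += int(recognized == labels[n]) # Step 314: compare with the label
accuracy = correct / N                      # Step 32: accuracy of this combination
print(accuracy)
```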
- a neural network algorithm may be used to train the weight combination to be trained according to the predicted score.
- the weight of each gesture recognition model can be learned accurately and adaptively, and the optimal solution of the weight of each gesture recognition model can be determined.
- the initial weight combination can also be set based on experience, so as to reach the optimal solution faster.
- the present disclosure does not exclude the use of other algorithms for training.
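The text only says a neural-network algorithm "may be used" to learn the weights; one gradient-based sketch under assumptions of our own (softmax-parameterized weights, cross-entropy loss) is:

```python
import numpy as np

rng = np.random.default_rng(3)

def learn_weights(scores, labels, lr=0.5, steps=200):
    # scores: (p, N, c) normalized scores; returns learned per-model weights.
    p, N, c = scores.shape
    theta = np.zeros(p)                     # unconstrained parameters
    onehot = np.eye(c)[labels]
    for _ in range(steps):
        w = np.exp(theta) / np.exp(theta).sum()          # softmax keeps w positive
        fused = np.tensordot(w, scores, axes=1)          # (N, c) fused scores
        probs = np.exp(fused) / np.exp(fused).sum(axis=1, keepdims=True)
        g_fused = (probs - onehot) / N                   # cross-entropy gradient
        g_w = np.einsum('nc,pnc->p', g_fused, scores)    # chain rule to weights
        g_theta = w * (g_w - (w * g_w).sum())            # softmax Jacobian
        theta -= lr * g_theta
    return np.exp(theta) / np.exp(theta).sum()

# Toy check: model 0 scores strongly at the true class, model 1 is noise.
labels = rng.integers(0, 3, size=60)
scores = np.stack([4.0 * np.eye(3)[labels], rng.random((60, 3))])
w = learn_weights(scores, labels)
```

Under this setup the informative model should end up with the larger learned weight, which is the "accurate and adaptive" behavior the text describes.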
- an embodiment of the present disclosure also provides a gesture recognition method, including:
- Step 41 Obtain the prediction scores in each category of the gesture image to be recognized by each gesture recognition model in the fusion model, where the fusion model includes multiple gesture recognition models;
- Step 42 Obtain a target weight combination of the fusion model, where the target weight combination includes a weight corresponding to each gesture recognition model;
- Step 43 For each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain a weighted prediction score, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
- Step 44 Obtain the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
- the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
- for example, suppose that there are p gesture recognition models in the fusion model, namely M_1, M_2, ..., M_p, and that the fusion model can recognize c categories of gestures.
- the target weight combination corresponding to the fusion model is (w_1, w_2, ..., w_p).
- gesture recognition model M_i outputs, for the gesture image to be recognized, a prediction score in each category: [S_{i,1}, S_{i,2}, ..., S_{i,c}].
- the fusion prediction scores corresponding to all categories are compared, and the category with the largest fusion prediction score is obtained as the category of the gesture in the gesture image.
- the prediction results of the gesture recognition models in different categories are fused, and the weight corresponding to each gesture recognition model is obtained by accurate and adaptive learning, so that there is no need to set the weights manually, and the accuracy of gesture recognition of the fusion model can be effectively improved.
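Steps 41-44 at inference time reduce to a weighted sum and an argmax; the weight values and per-model score vectors below are illustrative placeholders for one gesture image:

```python
import numpy as np

target_weights = np.array([0.5, 0.3, 0.2])   # assumed result of prior training
model_scores = np.array([
    [0.7, 0.2, 0.1],    # model M1: scores per category (Step 41)
    [0.1, 0.8, 0.1],    # model M2
    [0.6, 0.3, 0.1],    # model M3
])

# Step 43: weight each model's scores and sum them per category.
fused = np.tensordot(target_weights, model_scores, axes=1)
# Step 44: the category with the largest fusion prediction score.
category = int(fused.argmax())
print(category)   # -> 0
```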
- the present disclosure also provides a training module 50 for a gesture recognition model, including:
- the first acquisition module 51 is configured to acquire a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
- the second acquisition module 52 is configured to acquire a fusion model, and the fusion model includes a plurality of gesture recognition models;
- the third obtaining module 53 is configured to obtain the prediction scores of each of the gesture recognition models for each of the plurality of gesture sample images in each of the categories;
- the training module 54 is configured to determine the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction score, wherein each of the plurality of weight combinations to be trained includes each Weights corresponding to each of the gesture recognition models.
- the training module 54 includes:
- the calculation sub-module is configured to calculate the recognition accuracy rates of the fusion models corresponding to the multiple weight combinations to be trained according to the prediction scores;
- the determining sub-module is configured to determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
- the calculation sub-module includes:
- the first execution unit is configured to perform the following operations for each gesture sample image:
- the second execution unit is configured to determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
- the third obtaining module 53 includes:
- the normalization processing sub-module is used to perform normalization processing on the predicted score to obtain a normalized predicted score.
- the present disclosure also provides a gesture recognition module, including:
- the first acquiring module 61 is configured to acquire, from each gesture recognition model in the fusion model, the prediction scores of the gesture image to be recognized in each category, the fusion model including a plurality of gesture recognition models;
- the second obtaining module 62 is configured to obtain a target weight combination of the fusion model, and the target weight combination includes a weight corresponding to each gesture recognition model;
- the processing module 63 is configured to, for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain a weighted prediction score, and recognize all the gestures Add the weighted prediction scores of the model to obtain the fusion prediction score of each category;
- the third acquiring module 64 is configured to acquire the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
- the target weight combination is obtained by training the gesture recognition model training method in the foregoing embodiment.
- the present disclosure also provides a display device, which includes the above-mentioned gesture recognition module.
- an embodiment of the present disclosure also provides a training module 70 for a gesture recognition model, including a processor 71, a memory 72, and a computer program stored in the memory 72 and capable of running on the processor 71. When the computer program is executed by the processor 71, the various processes of the above training method embodiment of the gesture recognition model are realized, with the same technical effect; to avoid repetition, details are not repeated here.
- an embodiment of the present disclosure also provides a gesture recognition module 80, including a processor 81, a memory 82, and a computer program stored in the memory 82 and capable of running on the processor 81. When the computer program is executed by the processor 81, the various processes of the above gesture recognition method embodiment are realized, with the same technical effect; to avoid repetition, details are not repeated here.
- the embodiments of the present disclosure also provide a computer-readable storage medium, and a computer program is stored on the computer-readable storage medium.
- the computer-readable storage medium such as read-only memory (Read-Only Memory, ROM for short), random access memory (Random Access Memory, RAM for short), magnetic disk, or optical disk, etc.
Claims (13)
- 1. A method for training a gesture recognition model, comprising: acquiring a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories; acquiring a fusion model, the fusion model including multiple gesture recognition models; acquiring the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories; and determining, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained includes a weight respectively corresponding to each of the gesture recognition models.
- 2. The method according to claim 1, wherein determining the target weight combination of the fusion model from the multiple weight combinations to be trained according to the prediction scores comprises: calculating, according to the prediction scores, the recognition accuracy rates of the fusion model respectively corresponding to the multiple weight combinations to be trained; and determining the weight combination to be trained whose recognition accuracy rate exceeds a preset threshold as the target weight combination.
- 3. The method according to claim 2, wherein calculating, according to the prediction scores, the recognition accuracy rates of the fusion model respectively corresponding to the multiple weight combinations to be trained comprises: for each gesture sample image, performing the following operations: for each category, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each category; taking the category with the largest fusion prediction score as the category of the recognized gesture; comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct; and determining, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained.
- 4. The method according to claim 1, wherein acquiring the prediction scores of the multiple gesture recognition models for the multiple gesture sample images in each of the categories comprises: normalizing the prediction scores to obtain normalized prediction scores.
- 5. A gesture recognition method, comprising: acquiring the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models; acquiring a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models; for each category, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category; and taking the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized; wherein the target weight combination is obtained by training with the method according to any one of claims 1 to 4.
- 6. A training module for a gesture recognition model, comprising: a first acquisition module configured to acquire a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories; a second acquisition module configured to acquire a fusion model, the fusion model including multiple gesture recognition models; a third acquisition module configured to acquire the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories; and a training module configured to determine, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained includes a weight respectively corresponding to each of the gesture recognition models.
- 7. The module according to claim 6, wherein the training module comprises: a calculation sub-module configured to calculate, according to the prediction scores, the recognition accuracy rates of the fusion model respectively corresponding to the multiple weight combinations to be trained; and a determination sub-module configured to determine the weight combination to be trained whose recognition accuracy rate exceeds a preset threshold as the target weight combination.
- 8. The module according to claim 7, wherein the calculation sub-module comprises: a first execution unit configured to perform, for each gesture sample image, the following operations: for each category, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each category; taking the category with the largest fusion prediction score as the category of the recognized gesture; and comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct; and a second execution unit configured to determine, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained.
- 9. The module according to claim 6, wherein the third acquisition module comprises: a normalization sub-module configured to normalize the prediction scores to obtain normalized prediction scores.
- 10. A gesture recognition module, comprising: a first acquisition module configured to acquire the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models; a second acquisition module configured to acquire a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models; a processing module configured to, for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain weighted prediction scores, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category; and a third acquisition module configured to acquire the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized; wherein the target weight combination is obtained by training with the method according to any one of claims 1 to 4.
- 11. A training module for a gesture recognition model, comprising a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein when the computer program is executed by the processor, the steps of the training method of a gesture recognition model according to any one of claims 1 to 4 are implemented.
- 12. A gesture recognition module, comprising a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein when the computer program is executed by the processor, the steps of the gesture recognition method according to claim 5 are implemented.
- 13. A computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, the steps of the training method of a gesture recognition model according to any one of claims 1 to 4 are implemented, or the steps of the gesture recognition method according to claim 5 are implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010211278.9A CN111428639A (zh) | 2020-03-24 | 2020-03-24 | Training method for gesture recognition model, gesture recognition method and device |
CN202010211278.9 | 2020-03-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021190046A1 true WO2021190046A1 (zh) | 2021-09-30 |
Family
ID=71548659
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/141233 WO2021190046A1 (zh) | 2020-03-24 | 2020-12-30 | Training method for gesture recognition model, gesture recognition method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111428639A (zh) |
WO (1) | WO2021190046A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023071121A1 (zh) * | 2021-10-26 | 2023-05-04 | Suzhou Inspur Intelligent Technology Co., Ltd. | Multi-model-fusion-based target detection method, apparatus, device and medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111428639A (zh) | 2020-03-24 | 2020-07-17 | BOE Technology Group Co., Ltd. | Training method for gesture recognition model, gesture recognition method and device |
CN113139463B (zh) * | 2021-04-23 | 2022-05-13 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method, apparatus, device, medium and program product for training a model |
CN113837025A (zh) * | 2021-09-03 | 2021-12-24 | Shenzhen Skyworth-RGB Electronic Co., Ltd. | Gesture recognition method, system, terminal and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351910A1 (en) * | 2016-06-04 | 2017-12-07 | KinTrans, Inc. | Automatic body movement recognition and association system |
CN107679491A (zh) * | 2017-09-29 | 2018-02-09 | Central China Normal University | 3D convolutional neural network sign language recognition method fusing multimodal data |
CN107729854A (zh) * | 2017-10-25 | 2018-02-23 | Nanjing Avatar Robot Technology Co., Ltd. | Gesture recognition method and system for a robot, and robot |
CN109976526A (zh) * | 2019-03-27 | 2019-07-05 | Guangdong Polytechnic Normal University | Sign language recognition method based on surface EMG sensors and nine-axis sensors |
CN110755073A (zh) * | 2019-10-09 | 2020-02-07 | Huazhong University of Science and Technology | Intelligent bone and joint information processing system and method based on impedance spectrum signals |
CN111428639A (zh) * | 2020-03-24 | 2020-07-17 | BOE Technology Group Co., Ltd. | Training method for gesture recognition model, gesture recognition method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107742102B (zh) * | 2017-10-13 | 2020-03-24 | Beijing Huajie IMI Technology Co., Ltd. | Gesture recognition method based on a depth sensor |
CN109145793A (zh) * | 2018-08-09 | 2019-01-04 | Neusoft Corporation | Method, apparatus, storage medium and electronic device for building a gesture recognition model |
-
2020
- 2020-03-24 CN CN202010211278.9A patent/CN111428639A/zh active Pending
- 2020-12-30 WO PCT/CN2020/141233 patent/WO2021190046A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN111428639A (zh) | 2020-07-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20927587 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20927587 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.04.2023) |
|