WO2021190046A1 - Training method for gesture recognition model, gesture recognition method and apparatus - Google Patents

Training method for gesture recognition model, gesture recognition method and apparatus

Info

Publication number
WO2021190046A1
WO2021190046A1 (PCT/CN2020/141233)
Authority
WO
WIPO (PCT)
Prior art keywords
gesture
gesture recognition
model
fusion
category
Prior art date
Application number
PCT/CN2020/141233
Other languages
English (en)
French (fr)
Inventor
贾红红
王镜茹
Original Assignee
京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Publication of WO2021190046A1 publication Critical patent/WO2021190046A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • G06V40/113Recognition of static hand signs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques

Definitions

  • the embodiments of the present disclosure relate to the technical field of gesture recognition, and in particular, to a method for training a gesture recognition model, a method and device for gesture recognition.
  • the embodiments of the present disclosure provide a gesture recognition method, a gesture recognition module, and a display device, which are used to improve the recognition accuracy of a gesture recognition model.
  • embodiments of the present disclosure provide a method for training a gesture recognition model, including:
  • the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
  • the fusion model includes a plurality of gesture recognition models
  • the target weight combination of the fusion model is determined from a plurality of weight combinations to be trained, wherein each of the plurality of weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
  • the determining the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction score includes:
  • the weight combination to be trained whose recognition accuracy exceeds a preset threshold is determined as the target weight combination.
  • the calculation of the recognition accuracy of the fusion model corresponding to the multiple weight combinations to be trained according to the prediction score includes:
  • the obtaining the prediction scores of the multiple gesture recognition models for each of the multiple gesture sample images in each of the categories includes:
  • the predicted score is normalized to obtain a normalized predicted score.
  • embodiments of the present disclosure provide a gesture recognition method, including:
  • the target weight combination includes a weight corresponding to each gesture recognition model
  • the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
  • embodiments of the present disclosure provide a training module for a gesture recognition model, including:
  • the first acquisition module is configured to acquire a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
  • the second acquisition module is configured to acquire a fusion model, where the fusion model includes multiple gesture recognition models;
  • the third acquiring module is configured to acquire the prediction scores of each of the gesture recognition models in each of the categories for the plurality of gesture sample images;
  • the training module is configured to determine the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction scores, wherein each of the weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
  • the training module includes:
  • the calculation sub-module is configured to calculate the recognition accuracy rates of the fusion models corresponding to the multiple weight combinations to be trained according to the prediction scores;
  • the determining sub-module is configured to determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
  • the calculation sub-module includes:
  • the first execution unit is configured to perform the following operations for each gesture sample image:
  • the second execution unit is configured to determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
  • the third acquisition module includes:
  • the normalization processing sub-module is used to normalize the predicted scores to obtain the normalized predicted scores.
  • a gesture recognition module including:
  • the first acquisition module is configured to acquire the prediction scores of each type of gesture image to be recognized by each gesture recognition model in the fusion model, and the fusion model includes a plurality of gesture recognition models;
  • the second acquiring module is configured to acquire a target weight combination of the fusion model, and the target weight combination includes a weight corresponding to each gesture recognition model;
  • the processing module is configured to, for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain a weighted prediction score, and to add up the weighted prediction scores of all the gesture recognition models to obtain a fusion prediction score for each of the categories;
  • the third acquiring module is configured to acquire the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
  • the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
  • the embodiments of the present disclosure provide a training module for a gesture recognition model, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements the steps of the above training method of the gesture recognition model.
  • embodiments of the present disclosure provide a gesture recognition module, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor, where the computer program, when executed by the processor, implements the steps of the above gesture recognition method.
  • the embodiments of the present disclosure provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the training method of the gesture recognition model described above are implemented, or the steps of the aforementioned gesture recognition method are implemented.
  • the weights of the multiple gesture recognition models in the fusion model are trained on the training set, so that the weight of each gesture recognition model can be learned accurately and adaptively; when performing gesture recognition, there is no need to set the weights manually, and the gesture recognition accuracy of the fusion model can be effectively improved.
  • FIG. 1 is a schematic flowchart of a training method of a gesture recognition model according to an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of a method for training a gesture recognition model according to another embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of a method for training a gesture recognition model according to another embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of a gesture recognition method according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a training module of a gesture recognition model according to an embodiment of the disclosure.
  • FIG. 6 is a schematic structural diagram of a gesture recognition module according to an embodiment of the disclosure.
  • FIG. 7 is a schematic structural diagram of a training module of a gesture recognition model according to another embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a gesture recognition module according to another embodiment of the present disclosure.
  • Multi-model fusion refers to fusing the prediction results of multiple models in some way and using the fused result as the final recognition and classification result.
  • Commonly used fusion methods are voting, averaging, and taking the maximum. When the classification performance of the different models differs only slightly, that is, when most samples are classified correctly by every model and the models disagree only on a small number of samples, voting, averaging, or taking the maximum does little to further improve the final recognition accuracy. A comparison of these baselines with the weighted fusion described below is sketched in the code after this paragraph.
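  • To make the contrast concrete, the following minimal Python sketch compares these baseline fusion rules with the weighted fusion used throughout this disclosure. It is an illustration only, not code from the patent; the score values and the fusion weights are hypothetical.

```python
import numpy as np

def vote_fusion(scores):
    """Majority vote over each model's argmax category.
    scores: (p, c) array of p models' scores over c categories."""
    votes = np.argmax(scores, axis=1)
    return int(np.bincount(votes, minlength=scores.shape[1]).argmax())

def average_fusion(scores):
    """Average the per-category scores over all models, then take the argmax."""
    return int(scores.mean(axis=0).argmax())

def max_fusion(scores):
    """Take the per-category maximum over all models, then the argmax."""
    return int(scores.max(axis=0).argmax())

def weighted_fusion(scores, weights):
    """Weighted sum of per-category scores, one weight per model."""
    return int((weights[:, None] * scores).sum(axis=0).argmax())

# 3 models, 4 categories; models 1-2 slightly favor category 0,
# model 3 strongly favors category 2 (all values hypothetical).
scores = np.array([[0.30, 0.25, 0.25, 0.20],
                   [0.28, 0.27, 0.25, 0.20],
                   [0.05, 0.05, 0.85, 0.05]])
weights = np.array([0.2, 0.2, 0.6])  # hypothetical learned weights
print(vote_fusion(scores), average_fusion(scores),
      max_fusion(scores), weighted_fusion(scores, weights))
```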
  • An embodiment of the present disclosure provides a method for training a gesture recognition model, including:
  • Step 11 Obtain a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
  • the categories of gestures include, for example, fist, OK, finger heart, bowing with folded hands, waving goodbye, prayer, and thumbs-up.
  • the multiple gesture sample images include all types of gestures that are desired to be recognized.
  • each of the gesture sample images is pre-labeled with the category of the gesture it contains, so that the recognition accuracy corresponding to the weight combinations trained later can be verified.
  • each of the gesture sample images contains a gesture of a single category, which makes the training process simpler.
  • the number of gesture sample images included in the training set exceeds a preset threshold, for example, more than 100, and a large amount of sample data makes the training result more accurate.
  • Step 12 Obtain a fusion model, where the fusion model includes multiple gesture recognition models;
  • the multiple gesture recognition models are different models, such as models using different gesture recognition algorithms.
  • the plurality of gesture recognition models may include at least one of the following: a gesture recognition model based on CPnet (a general-purpose deep network), a gesture recognition model based on LSTM (a long short-term memory network), a gesture recognition model based on template matching, and so on.
  • Step 13 Obtain the prediction scores of each of the gesture recognition models for the plurality of gesture sample images in each of the categories;
  • the multiple gesture sample images are part or all of the multiple gesture sample images in the training set.
  • the prediction score of each gesture recognition model for each gesture sample image in each of the categories is obtained.
  • the predicted score is a numerical value, which can be a decimal, positive or negative number, and so on.
  • For example, suppose the fusion model contains p gesture recognition models M_1, M_2, …, M_p, the training set includes N gesture sample images, and the N gesture sample images contain c categories of gestures. After a gesture recognition model performs recognition on one gesture sample image, c prediction scores [S_1, S_2, …, S_c] are obtained; for one gesture sample image, each category corresponds to one prediction score.
  • For a gesture recognition model M_i, its prediction result X_i over all the gesture sample images of the training set is an N×c matrix: the first row holds the prediction scores of M_i for the first gesture sample image of the training set in each category, the N-th row holds the prediction scores of M_i for the N-th gesture sample image in each category, and so on. S_11 is the prediction score of the first gesture sample image in the first category, S_1c is the prediction score of the first gesture sample image in the c-th category, and so on. A sketch of assembling these matrices follows.
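  • As an illustration of assembling the matrices X_i, a short sketch follows; it assumes each model object exposes a predict_scores(image) method returning c scores, which is a hypothetical interface rather than one specified by the disclosure.

```python
import numpy as np

def build_score_matrix(model, images, num_categories):
    """Build the N x c prediction-score matrix X_i for one gesture
    recognition model: one row of c category scores per sample image."""
    rows = [model.predict_scores(img) for img in images]
    X = np.asarray(rows, dtype=float)
    assert X.shape == (len(images), num_categories)
    return X

# One matrix per model in the fusion model, stacked for later use:
# score_matrices = np.stack([build_score_matrix(m, train_images, c)
#                            for m in models])   # shape (p, N, c)
```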
  • Step 14: Determine the target weight combination of the fusion model from multiple weight combinations to be trained according to the prediction scores, wherein each of the multiple weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
  • the multiple weight combinations to be trained refer to multiple weight combinations participating in training.
  • Each weight combination to be trained includes: the weight corresponding to each gesture recognition model.
  • For example, the fusion model contains four gesture recognition models M_1, M_2, M_3 and M_4. The initially generated weight combination to be trained is (w_1, w_2, w_3, w_4), where w_1 is the weight corresponding to M_1, w_2 the weight corresponding to M_2, w_3 the weight corresponding to M_3, and w_4 the weight corresponding to M_4. The weight combination to be trained is trained, and one or more of its weights are adjusted according to the training result to obtain a new weight combination to be trained, for example (w_1', w_2, w_3', w_4); the adjusted weight combination is then trained further, and so on, until a target weight combination (w_s1, w_s2, w_s3, w_s4) that meets the training requirement is obtained, where w_s1 is the target weight corresponding to M_1, w_s2 the target weight corresponding to M_2, w_s3 the target weight corresponding to M_3, and w_s4 the target weight corresponding to M_4.
  • the weights of the multiple gesture recognition models in the fusion model are trained on the training set, so that the weight of each gesture recognition model can be learned accurately and adaptively; during gesture recognition there is no need to set the weights manually, and the gesture recognition accuracy of the fusion model can be effectively improved.
  • the obtaining of the prediction scores of the multiple gesture recognition models for each of the multiple gesture sample images in each of the categories includes: normalizing the prediction scores to obtain normalized prediction scores. That is, the prediction results of the multiple gesture recognition models for all gesture sample images are normalized to a uniform range, such as [0,1], to facilitate calculation.
  • a variety of normalization methods can be used to normalize the prediction scores, such as min-max normalization to (0,1), Z-score standardization, the Sigmoid function, and so on; minimal sketches of these three options follow.
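  • The three options named above can be sketched as follows; this is a minimal illustration, and the disclosure does not fix which method is used.

```python
import numpy as np

def minmax_normalize(X):
    """Min-max normalization of a score matrix into [0, 1]."""
    lo, hi = X.min(), X.max()
    return (X - lo) / (hi - lo) if hi > lo else np.zeros_like(X)

def zscore_normalize(X):
    """Z-score standardization: zero mean, unit standard deviation."""
    std = X.std()
    return (X - X.mean()) / std if std > 0 else np.zeros_like(X)

def sigmoid_normalize(X):
    """Squash raw scores (which may be negative) into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(X, dtype=float)))
```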
  • the determining the target weight combination of the fusion model from the weight combination to be trained according to the prediction score includes:
  • Step 21: Calculate, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
  • Step 22 Determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
  • the preset threshold is, for example, 98% or 99%, and can be set as required.
  • a weight combination to be trained is generated in advance, and the recognition accuracy of the fusion model corresponding to this weight combination is calculated according to the prediction scores; if the recognition accuracy is below the preset threshold, the weight combination to be trained is adjusted to obtain a new weight combination to be trained, whose corresponding recognition accuracy is again calculated according to the prediction scores, and so on, until a weight combination to be trained whose recognition accuracy exceeds the preset threshold is determined as the target weight combination.
  • the weight combination whose recognition accuracy exceeds a preset threshold is used as the target weight combination, which can effectively improve the accuracy of gesture recognition of the fusion model.
  • the calculation of the recognition accuracy of the fusion model corresponding to the plurality of weight combinations to be trained according to the prediction score includes:
  • Step 31 For each of the gesture sample images, perform the following operations:
  • Step 311 For each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to the gesture recognition model to obtain a weighted prediction score;
  • the training set includes 100 gesture sample images
  • the 100 gesture sample images include 24 types of gestures
  • the fusion model includes 4 gesture recognition models.
  • a weight combination to be trained is (w 1 , w 2 , w 3 , w 4 ).
  • the prediction score S_{1,1,24} predicted by gesture recognition model 1 in category 24 is multiplied by the weight w_1 corresponding to gesture recognition model 1 to obtain the weighted prediction score, that is, S_{1,1,24} × w_1; the prediction scores of the other categories and models are weighted in the same way.
  • Step 312 For each of the categories, add the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
  • Step 313 Use the category with the largest fusion prediction score as the recognized gesture category
  • Step 314 Compare the recognized gesture category with the pre-marked gesture category to determine whether the recognition is correct
  • assuming that, for gesture sample image 1, category 6 has the largest fusion prediction score, category 6 is the recognized category; if the pre-labeled gesture category is category 6, the recognition is correct, and if the pre-labeled gesture category is category 8, the recognition is considered wrong.
  • Step 32 Determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
  • the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained is determined according to whether the recognition of all the gesture sample images is correct.
  • if the weight combination (w_1, w_2, w_3, w_4) is used and 85 of the 100 gesture sample images are recognized correctly, the recognition accuracy of the weight combination (w_1, w_2, w_3, w_4) is 85%.
  • the weight combination to be trained whose recognition accuracy exceeds a preset threshold is determined as the target weight combination.
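  • Steps 31 and 32 can be condensed into a few lines; the sketch below assumes the normalized scores are stacked into a (p, N, c) array as in the earlier snippet, and is not the patent's own implementation.

```python
import numpy as np

def fusion_accuracy(score_matrices, weights, labels):
    """Recognition accuracy of the fusion model for one weight combination.

    score_matrices: (p, N, c) array of scores from p models for N gesture
                    sample images over c categories.
    weights:        (p,) array, one weight per gesture recognition model.
    labels:         (N,) array, the pre-labeled category of each image.
    """
    # Weighted sum over the model axis -> fusion prediction scores, (N, c)
    fused = np.tensordot(weights, score_matrices, axes=1)
    # Category with the largest fusion prediction score for each image
    predicted = fused.argmax(axis=1)
    # Fraction recognized correctly, e.g. 0.85 when 85 of 100 are correct
    return float((predicted == labels).mean())
```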
  • a neural network algorithm may be used to train the weight combination to be trained according to the predicted score.
  • the weight of each gesture recognition model can be learned accurately and adaptively, and the optimal solution of the weight of each gesture recognition model can be determined.
  • when training a weight combination with a neural network algorithm, an initial weight combination may first be generated at random and then adjusted through training to learn the optimal weight combination; the initial weight combination can also be set based on experience, so that the optimal solution is reached faster.
  • the present disclosure does not exclude the use of other algorithms for training.
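  • One simple possibility, shown purely as an assumption-laden sketch that reuses the fusion_accuracy helper above, is a random search that stops once a weight combination exceeds the preset accuracy threshold:

```python
import numpy as np

def search_target_weights(score_matrices, labels, threshold=0.98,
                          max_rounds=10000, seed=0):
    """Propose random weight combinations until one exceeds the preset
    recognition-accuracy threshold; that combination is the target."""
    rng = np.random.default_rng(seed)
    p = score_matrices.shape[0]
    best_w, best_acc = None, -1.0
    for _ in range(max_rounds):
        w = rng.random(p)
        w /= w.sum()                     # keep weights on a comparable scale
        acc = fusion_accuracy(score_matrices, w, labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
        if acc > threshold:              # training requirement satisfied
            return w, acc
    return best_w, best_acc              # fall back to the best combination seen
```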
  • an embodiment of the present disclosure also provides a gesture recognition method, including:
  • Step 41 Obtain the prediction scores in each category of the gesture image to be recognized by each gesture recognition model in the fusion model, where the fusion model includes multiple gesture recognition models;
  • Step 42 Obtain a target weight combination of the fusion model, where the target weight combination includes a weight corresponding to each gesture recognition model;
  • Step 43: For each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain a weighted prediction score, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
  • Step 44 Obtain the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
  • the target weight combination is obtained by training the above-mentioned gesture recognition model training method.
  • For example, suppose the fusion model contains p gesture recognition models, namely M_1, M_2, …, M_p, and the fusion model can recognize c categories of gestures.
  • the target weight combination corresponding to the fusion model is (w 1 , w 2 ,..., w p ).
  • for each gesture recognition model M_i (i = 1, 2, …, p), the prediction scores of the gesture image to be recognized in each category are [S_{i,1}, S_{i,2}, …, S_{i,c}]; for each category j (j = 1, 2, …, c), the prediction score S_{i,j} predicted by M_i is multiplied by the weight w_i corresponding to M_i to obtain the weighted prediction score S_{i,j} × w_i, and the weighted prediction scores of all the gesture recognition models are added to obtain the fusion prediction score S_{1,j} × w_1 + S_{2,j} × w_2 + … + S_{p,j} × w_p.
  • the fusion prediction scores corresponding to all categories are compared, and the category with the largest fusion prediction score is obtained as the category of the gesture in the gesture image.
  • the prediction results of the gesture recognition models in the fusion model over the different categories are fused, and the weight corresponding to each gesture recognition model is obtained by accurate and adaptive learning, so that the weights need not be set manually and the gesture recognition accuracy of the fusion model can be effectively improved; an end-to-end inference sketch follows.
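  • Putting steps 41 to 44 together, a minimal end-to-end inference sketch could look like the following, where predict_scores is again a hypothetical per-model interface:

```python
import numpy as np

def recognize_gesture(models, target_weights, image):
    """Fuse the per-category scores of every gesture recognition model for
    one image to be recognized, then return the category whose fusion
    prediction score is largest."""
    scores = np.stack([m.predict_scores(image) for m in models])      # (p, c)
    fused = np.tensordot(np.asarray(target_weights), scores, axes=1)  # (c,)
    return int(fused.argmax())
```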
  • the present disclosure also provides a training module 50 for a gesture recognition model, including:
  • the first acquisition module 51 is configured to acquire a training set, where the training set includes multiple gesture sample images, and the multiple gesture sample images contain gestures of various types;
  • the second acquisition module 52 is configured to acquire a fusion model, and the fusion model includes a plurality of gesture recognition models;
  • the third obtaining module 53 is configured to obtain the prediction scores of each of the gesture recognition models for each of the plurality of gesture sample images in each of the categories;
  • the training module 54 is configured to determine the target weight combination of the fusion model from a plurality of weight combinations to be trained according to the prediction scores, wherein each of the plurality of weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
  • the training module 54 includes:
  • the calculation sub-module is configured to calculate the recognition accuracy rates of the fusion models corresponding to the multiple weight combinations to be trained according to the prediction scores;
  • the determining sub-module is configured to determine the weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
  • the calculation sub-module includes:
  • the first execution unit is configured to perform the following operations for each gesture sample image:
  • the second execution unit is configured to determine the recognition accuracy rate of the fusion model corresponding to the weight combination to be trained according to whether the recognition of the multiple gesture sample images is correct.
  • the third obtaining module 53 includes:
  • the normalization processing sub-module is used to perform normalization processing on the predicted score to obtain a normalized predicted score.
  • the present disclosure also provides a gesture recognition module, including:
  • the first acquiring module 61 is configured to acquire the prediction scores of each category of the gesture image to be recognized by each gesture recognition model in the fusion model, and the fusion model includes a plurality of gesture recognition models;
  • the second obtaining module 62 is configured to obtain a target weight combination of the fusion model, and the target weight combination includes a weight corresponding to each gesture recognition model;
  • the processing module 63 is configured to, for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain a weighted prediction score, and to add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
  • the third acquiring module 64 is configured to acquire the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
  • the target weight combination is obtained by training the gesture recognition model training method in the foregoing embodiment.
  • the present disclosure also provides a display device, which includes the above-mentioned gesture recognition module.
  • an embodiment of the present disclosure also provides a training module 70 for a gesture recognition model, including a processor 71, a memory 72, and a computer program stored in the memory 72 and capable of running on the processor 71; when the computer program is executed by the processor 71, the processes of the above embodiment of the gesture recognition model training method are realized, with the same technical effects; to avoid repetition, they are not described again here.
  • an embodiment of the present disclosure also provides a gesture recognition module 80, including a processor 81, a memory 82, and a computer program stored on the memory 82 and capable of running on the processor 81; when the computer program is executed by the processor 81, the processes of the foregoing gesture recognition method embodiment are realized, with the same technical effects; to avoid repetition, details are not described here again.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the processes of the above embodiment of the training method of the gesture recognition model are realized, with the same technical effects.
  • the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the processes of the above gesture recognition method embodiment are realized, with the same technical effects.
  • the computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A training method for a gesture recognition model, a gesture recognition method, and an apparatus. The training method for a gesture recognition model comprises: obtaining a training set, the training set comprising multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories (11); obtaining a fusion model, the fusion model comprising multiple gesture recognition models (12); obtaining prediction scores of each gesture recognition model for the multiple gesture sample images in each category (13); and determining, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained comprises a weight corresponding to each gesture recognition model (14). The method improves the recognition accuracy of the gesture recognition model.

Description

Training method for gesture recognition model, gesture recognition method and apparatus
Cross-Reference to Related Application
This application claims priority to Chinese Patent Application No. 202010211278.9, filed in China on March 24, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of gesture recognition, and in particular to a training method for a gesture recognition model, a gesture recognition method, and an apparatus.
Background
When classifying gestures during gesture recognition, a single gesture recognition model offers only a limited improvement in recognition accuracy; multi-model fusion has become an effective means of improving gesture recognition accuracy.
Summary
Embodiments of the present disclosure provide a gesture recognition method, a gesture recognition module, and a display device, which are used to improve the recognition accuracy of a gesture recognition model.
To solve the above technical problem, the present disclosure is implemented as follows:
In a first aspect, an embodiment of the present disclosure provides a training method for a gesture recognition model, including:
obtaining a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
obtaining a fusion model, the fusion model including multiple gesture recognition models;
obtaining prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
determining, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
Optionally, determining, according to the prediction scores, the target weight combination of the fusion model from the multiple weight combinations to be trained includes:
calculating, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
determining a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
Optionally, calculating, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained includes:
performing the following operations for each of the gesture sample images:
for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
taking the category with the largest fusion prediction score as the category of the recognized gesture;
comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
determining, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
Optionally, obtaining the prediction scores of the multiple gesture recognition models for the multiple gesture sample images in each of the categories includes:
normalizing the prediction scores to obtain normalized prediction scores.
In a second aspect, an embodiment of the present disclosure provides a gesture recognition method, including:
obtaining the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models;
obtaining a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models;
for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
taking the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
wherein the target weight combination is obtained by training with the above training method for a gesture recognition model.
In a third aspect, an embodiment of the present disclosure provides a training module for a gesture recognition model, including:
a first acquisition module, configured to obtain a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
a second acquisition module, configured to obtain a fusion model, the fusion model including multiple gesture recognition models;
a third acquisition module, configured to obtain the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
a training module, configured to determine, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
Optionally, the training module includes:
a calculation sub-module, configured to calculate, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
a determination sub-module, configured to determine a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
Optionally, the calculation sub-module includes:
a first execution unit, configured to perform the following operations for each of the gesture sample images:
for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
taking the category with the largest fusion prediction score as the category of the recognized gesture;
comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
a second execution unit, configured to determine, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
Optionally, the third acquisition module includes:
a normalization sub-module, configured to normalize the prediction scores to obtain normalized prediction scores.
In a fourth aspect, an embodiment of the present disclosure provides a gesture recognition module, including:
a first acquisition module, configured to obtain the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models;
a second acquisition module, configured to obtain a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models;
a processing module, configured to, for each of the categories, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
a third acquisition module, configured to take the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
wherein the target weight combination is obtained by training with the above training method for a gesture recognition model.
In a fifth aspect, an embodiment of the present disclosure provides a training module for a gesture recognition model, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor; when the computer program is executed by the processor, the steps of the above training method for a gesture recognition model are implemented.
In a sixth aspect, an embodiment of the present disclosure provides a gesture recognition module, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor; when the computer program is executed by the processor, the steps of the above gesture recognition method are implemented.
In a seventh aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above training method for a gesture recognition model are implemented, or the steps of the above gesture recognition method are implemented.
In the embodiments of the present disclosure, the weights of the multiple gesture recognition models in the fusion model are trained on the training set, so that the weight of each gesture recognition model can be learned accurately and adaptively; when performing gesture recognition, there is no need to set the weights manually, and the gesture recognition accuracy of the fusion model can be effectively improved.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be regarded as limiting the present disclosure. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a schematic flowchart of a training method for a gesture recognition model according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a training method for a gesture recognition model according to another embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a training method for a gesture recognition model according to yet another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a gesture recognition method according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a training module for a gesture recognition model according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a gesture recognition module according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a training module for a gesture recognition model according to another embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a gesture recognition module according to another embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings of the embodiments of the present disclosure. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
Multi-model fusion means fusing the prediction results of multiple models in some way and using the fused result as the final recognition and classification result. Commonly used fusion methods are voting, averaging, and taking the maximum. When the classification performance of the different models differs only slightly, that is, when most samples are classified correctly by every model and the models disagree only on a small number of samples, voting, averaging, or taking the maximum does little to further improve the final recognition accuracy.
To solve the problem that existing gesture recognition models have limited recognition accuracy, referring to FIG. 1, an embodiment of the present disclosure provides a training method for a gesture recognition model, including:
Step 11: obtain a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
The categories of gestures include, for example, fist, OK, finger heart, bowing with folded hands, waving goodbye, prayer, and thumbs-up. Optionally, the multiple gesture sample images contain all the categories of gestures that are desired to be recognized.
Each gesture sample image is pre-labeled with the category of the gesture it contains, so that the recognition accuracy corresponding to the weight combinations trained later can be verified.
Optionally, each gesture sample image contains a gesture of a single category, which makes the training process simpler.
Optionally, the number of gesture sample images in the training set exceeds a preset threshold, for example more than 100; a large amount of sample data makes the training result more accurate.
Step 12: obtain a fusion model, the fusion model including multiple gesture recognition models;
The multiple gesture recognition models are different models, for example models using different gesture recognition algorithms. For example, the multiple gesture recognition models may include at least one of the following: a gesture recognition model based on CPnet (a general-purpose deep network), a gesture recognition model based on LSTM (a long short-term memory network), a gesture recognition model based on template matching, and so on.
Step 13: obtain the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
The multiple gesture sample images are some or all of the gesture sample images in the training set.
Optionally, the prediction score of each gesture recognition model for each gesture sample image in each category is obtained. A prediction score is a numerical value, and may be a decimal, a positive number, a negative number, and so on.
For example, suppose the fusion model contains p gesture recognition models M_1, M_2, …, M_p, the training set includes N gesture sample images, and the N gesture sample images contain c categories of gestures. After each gesture recognition model performs gesture recognition on one gesture sample image, c prediction scores [S_1, S_2, …, S_c] are obtained, where, for one gesture sample image, each category corresponds to one prediction score.
For a gesture recognition model M_i, the prediction result X_i for all gesture sample images in the training set is:

X_i =
| S_11  S_12  …  S_1c |
| S_21  S_22  …  S_2c |
|  ⋮     ⋮        ⋮  |
| S_N1  S_N2  …  S_Nc |

Here the first row of the matrix holds the prediction scores of gesture recognition model M_i for the first gesture sample image of the training set in each category, the N-th row holds the prediction scores of M_i for the N-th gesture sample image in each category, and so on. S_11 is the prediction score of the first gesture sample image in the first category, S_1c is the prediction score of the first gesture sample image in the c-th category, and so on.
Step 14: determine, according to the prediction scores, the target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
In the embodiments of the present disclosure, the multiple weight combinations to be trained are the multiple weight combinations that participate in the training.
Each weight combination to be trained includes a weight corresponding to each of the gesture recognition models. For example, the fusion model contains four gesture recognition models M_1, M_2, M_3 and M_4; the initially generated weight combination to be trained is (w_1, w_2, w_3, w_4), where w_1 is the weight corresponding to M_1, w_2 the weight corresponding to M_2, w_3 the weight corresponding to M_3, and w_4 the weight corresponding to M_4. The weight combination to be trained is trained, and one or more of its weights are adjusted according to the training result to obtain a new weight combination to be trained, for example (w_1', w_2, w_3', w_4); the adjusted weight combination is then trained further, and so on, until a target weight combination (w_s1, w_s2, w_s3, w_s4) that meets the training requirement is obtained, where w_s1 is the target weight corresponding to M_1, w_s2 the target weight corresponding to M_2, w_s3 the target weight corresponding to M_3, and w_s4 the target weight corresponding to M_4.
In the embodiments of the present disclosure, the weights of the multiple gesture recognition models in the fusion model are trained on the training set, so that the weight of each gesture recognition model can be learned accurately and adaptively; when performing gesture recognition, there is no need to set the weights manually, and the gesture recognition accuracy of the fusion model can be effectively improved.
In the embodiments of the present disclosure, optionally, obtaining the prediction scores of the multiple gesture recognition models for the multiple gesture sample images in each of the categories includes: normalizing the prediction scores to obtain normalized prediction scores. That is, the prediction results of the multiple gesture recognition models for all gesture sample images are normalized to a uniform range, such as [0, 1], to facilitate calculation. In the embodiments of the present disclosure, various normalization methods may be used to normalize the prediction scores, such as min-max normalization to (0, 1), Z-score standardization, the Sigmoid function, and so on.
Referring to FIG. 2, in an embodiment of the present disclosure, optionally, in step 14 above, determining the target weight combination of the fusion model from the weight combinations to be trained according to the prediction scores includes:
Step 21: calculate, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
Step 22: determine a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
The preset threshold is, for example, 98% or 99%, and is set as required.
In the embodiments of the present disclosure, optionally, a weight combination to be trained is generated in advance, and the recognition accuracy of the fusion model corresponding to this weight combination is calculated according to the prediction scores; if the recognition accuracy is below the preset threshold, the weight combination to be trained is adjusted to obtain a new weight combination to be trained, whose corresponding recognition accuracy is again calculated according to the prediction scores, and so on, until a weight combination to be trained whose recognition accuracy exceeds the preset threshold is determined as the target weight combination.
In the embodiments of the present disclosure, taking a weight combination whose recognition accuracy exceeds the preset threshold as the target weight combination can effectively improve the gesture recognition accuracy of the fusion model.
Referring to FIG. 3, in an embodiment of the present disclosure, optionally, in step 21 above, calculating, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained includes:
Step 31: for each gesture sample image, perform the following operations:
Step 311: for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores;
For example, suppose the training set includes 100 gesture sample images, the 100 gesture sample images contain 24 categories of gestures, and the fusion model includes 4 gesture recognition models.
Suppose a weight combination to be trained is (w_1, w_2, w_3, w_4).
Then, for gesture sample image 1:
the prediction score S_{1,1,1} predicted by gesture recognition model 1 in category 1 is multiplied by the weight w_1 corresponding to gesture recognition model 1, i.e., S_{1,1,1} × w_1;
the prediction score S_{1,1,2} predicted by gesture recognition model 1 in category 2 is multiplied by the weight w_1 corresponding to gesture recognition model 1 to obtain the weighted prediction score, i.e., S_{1,1,2} × w_1;
…
the prediction score S_{1,1,24} predicted by gesture recognition model 1 in category 24 is multiplied by the weight w_1 corresponding to gesture recognition model 1 to obtain the weighted prediction score, i.e., S_{1,1,24} × w_1.
Step 312: for each category, add the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each category;
Suppose, for gesture sample image 1:
for category 1, the weighted prediction scores corresponding to gesture recognition models 1-4 are added to obtain the fusion prediction score, i.e., S_{1,1,1} × w_1 + S_{2,1,1} × w_2 + S_{3,1,1} × w_3 + S_{4,1,1} × w_4;
by analogy, for categories 2-24, the weighted prediction scores corresponding to gesture recognition models 1-4 are added to obtain the corresponding fusion prediction scores.
Step 313: take the category with the largest fusion prediction score as the category of the recognized gesture;
Suppose that, for gesture sample image 1, category 6 has the largest fusion prediction score; category 6 is then taken as the recognized category.
Step 314: compare the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
Suppose that, for gesture sample image 1, the pre-labeled gesture category is category 6; the recognition is then correct. If the pre-labeled gesture category is category 8, the recognition is considered wrong.
Step 32: determine, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
Optionally, the recognition accuracy of the fusion model corresponding to the weight combination to be trained is determined according to whether the recognition of all the gesture sample images is correct.
Suppose that with the weight combination (w_1, w_2, w_3, w_4), 85 of the 100 gesture sample images are recognized correctly; the recognition accuracy of the weight combination (w_1, w_2, w_3, w_4) is then 85%.
Finally, according to step 22 above, a weight combination to be trained whose recognition accuracy exceeds the preset threshold is determined as the target weight combination.
In the above embodiments of the present disclosure, a neural network algorithm may be used to train the weight combinations to be trained according to the prediction scores. A neural network algorithm can learn the weight of each gesture recognition model accurately and adaptively and determine the optimal solution for the weights of the gesture recognition models. When training a weight combination with a neural network algorithm, an initial weight combination may first be generated at random and then adjusted through training to learn the optimal weight combination; the initial weight combination may also be set based on experience, so that the optimal solution is reached faster. Of course, the present disclosure does not exclude training with other algorithms.
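The passage above names a neural network algorithm without fixing its form. One possible realization, offered only as an illustrative assumption and not as the method specified by the disclosure, treats the p model weights as the sole trainable parameters and fits them by gradient descent on a cross-entropy loss over the training-set score matrices:

```python
import numpy as np

def train_weights_gd(score_matrices, labels, lr=0.1, epochs=500):
    """Learn one weight per gesture recognition model by gradient descent.

    score_matrices: (p, N, c) array of normalized prediction scores.
    labels:         (N,) array of pre-labeled categories in 0..c-1.
    The weights are parameterized through a softmax so that they stay
    positive and sum to 1; this parameterization is an illustrative choice.
    """
    p, N, c = score_matrices.shape
    theta = np.zeros(p)                            # unconstrained parameters
    onehot = np.eye(c)[labels]                     # (N, c) one-hot labels
    for _ in range(epochs):
        w = np.exp(theta) / np.exp(theta).sum()    # softmax -> model weights
        fused = np.tensordot(w, score_matrices, axes=1)            # (N, c)
        z = fused - fused.max(axis=1, keepdims=True)
        prob = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)    # (N, c)
        g_fused = (prob - onehot) / N              # dLoss/dFused, mean CE
        g_w = np.tensordot(score_matrices, g_fused,
                           axes=([1, 2], [0, 1]))  # chain rule to weights, (p,)
        g_theta = w * (g_w - (w * g_w).sum())      # softmax Jacobian
        theta -= lr * g_theta
    return np.exp(theta) / np.exp(theta).sum()
```

The learned weights can then be checked against the preset accuracy threshold with the same fusion-accuracy computation used in steps 31 and 32.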
Referring to FIG. 4, an embodiment of the present disclosure further provides a gesture recognition method, including:
Step 41: obtain the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models;
Step 42: obtain a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models;
Step 43: for each category, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
Step 44: take the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
wherein the target weight combination is obtained by training with the above training method for a gesture recognition model.
For example, suppose the fusion model contains p gesture recognition models, namely M_1, M_2, …, M_p, and the fusion model can recognize c categories of gestures.
The target weight combination corresponding to the fusion model is (w_1, w_2, …, w_p).
For each gesture recognition model M_i (i = 1, 2, …, p), the prediction scores of the gesture image to be recognized in each category are [S_{i,1}, S_{i,2}, …, S_{i,c}].
For each category j (j = 1, 2, …, c), the prediction score S_{i,j} predicted by gesture recognition model M_i is multiplied by the weight w_i corresponding to M_i to obtain the weighted prediction score S_{i,j} × w_i, and the weighted prediction scores of all the gesture recognition models are added to obtain the fusion prediction score S_{1,j} × w_1 + S_{2,j} × w_2 + … + S_{p,j} × w_p.
Finally, the fusion prediction scores corresponding to all the categories are compared, and the category with the largest fusion prediction score is taken as the category of the gesture in the gesture image.
In the embodiments of the present disclosure, the prediction results of the gesture recognition models in the fusion model over the different categories are fused, and the weight corresponding to each gesture recognition model is obtained by accurate and adaptive learning, so that the weights need not be set manually and the gesture recognition accuracy of the fusion model can be effectively improved.
Referring to FIG. 5, the present disclosure further provides a training module 50 for a gesture recognition model, including:
a first acquisition module 51, configured to obtain a training set, the training set including multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
a second acquisition module 52, configured to obtain a fusion model, the fusion model including multiple gesture recognition models;
a third acquisition module 53, configured to obtain the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
a training module 54, configured to determine, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained includes a weight corresponding to each of the gesture recognition models.
Optionally, the training module 54 includes:
a calculation sub-module, configured to calculate, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
a determination sub-module, configured to determine a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
Optionally, the calculation sub-module includes:
a first execution unit, configured to perform the following operations for each of the gesture sample images:
for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
taking the category with the largest fusion prediction score as the category of the recognized gesture;
comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
a second execution unit, configured to determine, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
Optionally, the third acquisition module 53 includes:
a normalization sub-module, configured to normalize the prediction scores to obtain normalized prediction scores.
Referring to FIG. 6, the present disclosure further provides a gesture recognition module, including:
a first acquisition module 61, configured to obtain the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model including multiple gesture recognition models;
a second acquisition module 62, configured to obtain a target weight combination of the fusion model, the target weight combination including a weight corresponding to each of the gesture recognition models;
a processing module 63, configured to, for each of the categories, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
a third acquisition module 64, configured to take the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
wherein the target weight combination is obtained by training with the training method for a gesture recognition model in the above embodiments.
The present disclosure further provides a display device including the above gesture recognition module.
Referring to FIG. 7, an embodiment of the present disclosure further provides a training module 70 for a gesture recognition model, including a processor 71, a memory 72, and a computer program stored on the memory 72 and capable of running on the processor 71; when the computer program is executed by the processor 71, the processes of the above embodiment of the training method for a gesture recognition model are realized, with the same technical effects; to avoid repetition, they are not described again here.
Referring to FIG. 8, an embodiment of the present disclosure further provides a gesture recognition module 80, including a processor 81, a memory 82, and a computer program stored on the memory 82 and capable of running on the processor 81; when the computer program is executed by the processor 81, the processes of the above gesture recognition method embodiment are realized, with the same technical effects; to avoid repetition, they are not described again here.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the processes of the above embodiment of the training method for a gesture recognition model are realized, with the same technical effects; to avoid repetition, they are not described again here.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the processes of the above gesture recognition method embodiment are realized, with the same technical effects; to avoid repetition, they are not described again here.
The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments of the present disclosure have been described above with reference to the drawings, but the present disclosure is not limited to the specific embodiments described above; the specific embodiments described above are merely illustrative rather than restrictive. Inspired by the present disclosure, those of ordinary skill in the art can devise many other forms without departing from the spirit of the present disclosure and the scope protected by the claims, all of which fall within the protection of the present disclosure.

Claims (13)

  1. A training method for a gesture recognition model, comprising:
    obtaining a training set, the training set comprising multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
    obtaining a fusion model, the fusion model comprising multiple gesture recognition models;
    obtaining prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
    determining, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained comprises a weight corresponding to each of the gesture recognition models.
  2. The method according to claim 1, wherein determining, according to the prediction scores, the target weight combination of the fusion model from the multiple weight combinations to be trained comprises:
    calculating, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
    determining a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
  3. The method according to claim 2, wherein calculating, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained comprises:
    performing the following operations for each of the gesture sample images:
    for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
    taking the category with the largest fusion prediction score as the category of the recognized gesture;
    comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
    determining, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
  4. The method according to claim 1, wherein obtaining the prediction scores of the multiple gesture recognition models for the multiple gesture sample images in each of the categories comprises:
    normalizing the prediction scores to obtain normalized prediction scores.
  5. A gesture recognition method, comprising:
    obtaining the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model comprising multiple gesture recognition models;
    obtaining a target weight combination of the fusion model, the target weight combination comprising a weight corresponding to each of the gesture recognition models;
    for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
    taking the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
    wherein the target weight combination is obtained by training with the method according to any one of claims 1-4.
  6. A training module for a gesture recognition model, comprising:
    a first acquisition module, configured to obtain a training set, the training set comprising multiple gesture sample images, the multiple gesture sample images containing gestures of multiple categories;
    a second acquisition module, configured to obtain a fusion model, the fusion model comprising multiple gesture recognition models;
    a third acquisition module, configured to obtain the prediction scores of each of the gesture recognition models for the multiple gesture sample images in each of the categories;
    a training module, configured to determine, according to the prediction scores, a target weight combination of the fusion model from multiple weight combinations to be trained, wherein each of the multiple weight combinations to be trained comprises a weight corresponding to each of the gesture recognition models.
  7. The module according to claim 6, wherein the training module comprises:
    a calculation sub-module, configured to calculate, according to the prediction scores, the recognition accuracy of the fusion model corresponding to each of the multiple weight combinations to be trained;
    a determination sub-module, configured to determine a weight combination to be trained whose recognition accuracy exceeds a preset threshold as the target weight combination.
  8. The module according to claim 7, wherein the calculation sub-module comprises:
    a first execution unit, configured to perform the following operations for each of the gesture sample images:
    for each of the categories, multiplying the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and adding the weighted prediction scores corresponding to all the gesture recognition models to obtain the fusion prediction score of the gesture sample image in each of the categories;
    taking the category with the largest fusion prediction score as the category of the recognized gesture;
    comparing the category of the recognized gesture with the pre-labeled gesture category to determine whether the recognition is correct;
    a second execution unit, configured to determine, according to whether the recognition of the multiple gesture sample images is correct, the recognition accuracy of the fusion model corresponding to the weight combination to be trained.
  9. The module according to claim 6, wherein the third acquisition module comprises:
    a normalization sub-module, configured to normalize the prediction scores to obtain normalized prediction scores.
  10. A gesture recognition module, comprising:
    a first acquisition module, configured to obtain the prediction scores, in each category, of each gesture recognition model in a fusion model for a gesture image to be recognized, the fusion model comprising multiple gesture recognition models;
    a second acquisition module, configured to obtain a target weight combination of the fusion model, the target weight combination comprising a weight corresponding to each of the gesture recognition models;
    a processing module, configured to, for each of the categories, multiply the prediction score predicted by each gesture recognition model by the weight corresponding to that gesture recognition model to obtain weighted prediction scores, and add the weighted prediction scores of all the gesture recognition models to obtain the fusion prediction score of each category;
    a third acquisition module, configured to take the category with the largest fusion prediction score as the category of the gesture in the gesture image to be recognized;
    wherein the target weight combination is obtained by training with the method according to any one of claims 1-4.
  11. A training module for a gesture recognition model, comprising a processor, a memory, and a computer program stored on the memory and capable of running on the processor, wherein, when the computer program is executed by the processor, the steps of the training method for a gesture recognition model according to any one of claims 1 to 4 are implemented.
  12. A gesture recognition module, comprising a processor, a memory, and a computer program stored on the memory and capable of running on the processor, wherein, when the computer program is executed by the processor, the steps of the gesture recognition method according to claim 5 are implemented.
  13. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the steps of the training method for a gesture recognition model according to any one of claims 1 to 4 are implemented, or, when the computer program is executed by a processor, the steps of the gesture recognition method according to claim 5 are implemented.
PCT/CN2020/141233 2020-03-24 2020-12-30 Training method for gesture recognition model, gesture recognition method and apparatus WO2021190046A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010211278.9A CN111428639A (zh) 2020-03-24 2020-03-24 Training method for gesture recognition model, gesture recognition method and apparatus
CN202010211278.9 2020-03-24

Publications (1)

Publication Number Publication Date
WO2021190046A1 (zh)

Family

ID=71548659

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141233 WO2021190046A1 (zh) Training method for gesture recognition model, gesture recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN111428639A (zh)
WO (1) WO2021190046A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023071121A1 (zh) * 2021-10-26 2023-05-04 苏州浪潮智能科技有限公司 Multi-model-fusion-based target detection method, apparatus, device, and medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428639A (zh) 2020-03-24 2020-07-17 京东方科技集团股份有限公司 Training method for gesture recognition model, gesture recognition method and apparatus
CN113139463B (zh) * 2021-04-23 2022-05-13 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for training a model
CN113837025A (zh) * 2021-09-03 2021-12-24 深圳创维-Rgb电子有限公司 Gesture recognition method, system, terminal, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351910A1 (en) * 2016-06-04 2017-12-07 KinTrans, Inc. Automatic body movement recognition and association system
CN107679491A (zh) * 2017-09-29 2018-02-09 华中师范大学 3D convolutional neural network sign language recognition method fusing multimodal data
CN107729854A (zh) * 2017-10-25 2018-02-23 南京阿凡达机器人科技有限公司 Gesture recognition method and system for a robot, and robot
CN109976526A (zh) * 2019-03-27 2019-07-05 广东技术师范大学 Sign language recognition method based on surface electromyography sensors and nine-axis sensors
CN110755073A (zh) * 2019-10-09 2020-02-07 华中科技大学 Intelligent bone and joint information processing system and method based on impedance spectroscopy signals
CN111428639A (zh) * 2020-03-24 2020-07-17 京东方科技集团股份有限公司 Training method for gesture recognition model, gesture recognition method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107742102B (zh) * 2017-10-13 2020-03-24 北京华捷艾米科技有限公司 Gesture recognition method based on a depth sensor
CN109145793A (zh) * 2018-08-09 2019-01-04 东软集团股份有限公司 Method and apparatus for establishing a gesture recognition model, storage medium, and electronic device


Also Published As

Publication number Publication date
CN111428639A (zh) 2020-07-17

Similar Documents

Publication Publication Date Title
WO2021190046A1 (zh) Training method for gesture recognition model, gesture recognition method and apparatus
WO2021155706A1 (zh) Method and apparatus for training a service prediction model using imbalanced positive and negative samples
CN110020592B (zh) Object detection model training method and apparatus, computer device, and storage medium
WO2019169688A1 (zh) Vehicle damage assessment method and apparatus, electronic device, and storage medium
WO2019228317A1 (zh) Face recognition method and apparatus, and computer-readable medium
WO2018166114A1 (zh) Picture recognition method and system, electronic device, and medium
WO2017220032A1 (zh) Deep-learning-based license plate classification method and system, electronic device, and storage medium
JP7266674B2 (ja) Image classification model training method, and image processing method and apparatus
JP7414901B2 (ja) Method and apparatus for training a liveness detection model, liveness detection method and apparatus, electronic device, storage medium, and computer program
CN110909784B (zh) Training method and apparatus for an image recognition model, and electronic device
WO2021238586A1 (zh) Training method, apparatus, and device, and computer-readable storage medium
WO2019232861A1 (zh) Handwriting model training method, text recognition method and apparatus, device, and medium
CN113361396B (zh) Multimodal knowledge distillation method and system
CN114596497B (zh) Training method for a target detection model, target detection method, apparatus, and device
KR20220024990A (ko) Framework for L2TL (Learning to Transfer Learn)
WO2023088174A1 (zh) Target detection method and apparatus
CN113011532A (zh) Classification model training method and apparatus, computing device, and storage medium
CN116258861A (zh) Semi-supervised semantic segmentation method and segmentation apparatus based on multi-label learning
CN109214616B (zh) Information processing apparatus, system, and method
CN116912568A (zh) Noisy-label image recognition method based on adaptive class balancing
WO2021042544A1 (zh) Descreening-model-based face verification method and apparatus, computer device, and storage medium
US20220092429A1 (en) Training neural networks using learned optimizers
CN114742319A (zh) Method, system, and storage medium for predicting scores on objective questions of the National Judicial Examination
CN109409231B (zh) Multi-feature fusion sign language recognition method based on adaptive hidden Markov models
US20210365719A1 (en) System and method for few-shot learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927587

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20927587

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.04.2023)
