WO2017177903A1 - Online verification method and system for real-time gesture detection - Google Patents

Online verification method and system for real-time gesture detection

Info

Publication number
WO2017177903A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
real
module
new model
recognition result
Prior art date
Application number
PCT/CN2017/080117
Other languages
English (en)
French (fr)
Inventor
张宏鑫
陈鼎熠
池立盈
Original Assignee
芋头科技(杭州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 芋头科技(杭州)有限公司
Publication of WO2017177903A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • The invention belongs to the field of electronic technologies, and in particular relates to an online verification method and system for gesture detection.
  • In prior depth-camera systems, a depth camera is additionally provided which acquires spatial information through infrared reflection, enriching the features the camera captures.
  • The invention provides an online verification method and system for real-time gesture detection to solve the problems of the prior art.
  • An online verification method for real-time gesture detection comprises the following steps:
  • Step 1: the image acquisition module captures images within the visual range in real time;
  • Step 2: an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
  • Step 3: the recognition result is recorded and responded to;
  • Step 4: the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified;
  • Step 5: the previously trained model is updated with the new model (a sketch of the terminal-side loop follows).
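To make steps 1 to 3 concrete, the following is a minimal Python sketch of the terminal-side loop. It is not taken from the patent: the model interface, the helper names (`load_model`, `respond`, `upload_results`) and the upload period are illustrative assumptions, and `detect_motion` is sketched after step 22 below.

```python
import time

import cv2  # OpenCV, used here only for camera capture


def load_model(path):
    """Placeholder for loading the terminal's trained gesture classifier."""
    raise NotImplementedError


def respond(gesture):
    """Placeholder for step 3's response (icon, music, ...)."""


def upload_results(results):
    """Placeholder for the periodic upload to the background server (step 41)."""


def run_terminal_loop(model_path, upload_period_s=3600):
    model = load_model(model_path)      # step 2: load the pre-trained model
    cap = cv2.VideoCapture(0)           # step 1: capture images in the visual range
    results, last_upload, prev = [], time.time(), None
    while True:
        ok, frame = cap.read()
        if not ok:
            continue
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # step 21: recognition runs only when a moving object appears
        if prev is not None and detect_motion(prev, gray):
            gesture = model.predict(frame)   # step 2: gesture recognition
            if gesture is not None:
                results.append(gesture)      # step 3: record the recognition result
                respond(gesture)             # step 3: respond to it
        prev = gray
        if time.time() - last_upload > upload_period_s:
            upload_results(results)          # step 4 then happens on the server
            results, last_upload = [], time.time()
```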
  • Step 4 is as follows:
  • Step 41: the recognition result is periodically uploaded to a background server, and the background server uses a deep learning method to verify the correctness of the recognition result;
  • Step 42: error cases in which the recognition result is incorrect are recorded; once the error cases reach a set number or have been collected for a set time, the data of the error cases are added to the training data of the previous model, and retraining is performed to obtain a new model;
  • Step 43: a standard validation set is used to analyze the quality of the new model (a sketch of this server-side logic follows).
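A minimal sketch of the server side of steps 41 to 43 follows. It assumes generic `train`/`evaluate` callables; the threshold and window values stand in for the patent's unspecified "set number" and "set time".

```python
from datetime import datetime, timedelta

ERROR_CASE_THRESHOLD = 500               # assumed "set number" of error cases
COLLECTION_WINDOW = timedelta(days=7)    # assumed "set time" for collection


class RetrainScheduler:
    def __init__(self, base_training_data):
        self.training_data = list(base_training_data)
        self.error_cases = []
        self.window_start = datetime.now()

    def record_backtest(self, sample, terminal_label, server_label):
        # step 41: the server re-checks each uploaded result with a stronger model
        if terminal_label != server_label:
            self.error_cases.append((sample, server_label))

    def maybe_retrain(self, train, evaluate, validation_set):
        # step 42: retrain once enough error cases or enough time has accumulated
        due = (len(self.error_cases) >= ERROR_CASE_THRESHOLD
               or datetime.now() - self.window_start >= COLLECTION_WINDOW)
        if not due:
            return None
        self.training_data.extend(self.error_cases)
        new_model = train(self.training_data)
        self.error_cases, self.window_start = [], datetime.now()
        # step 43: analyze the new model's quality on a standard validation set
        score = evaluate(new_model, validation_set)
        return new_model, score
```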
  • Step 5 is specifically as follows:
  • Step 51: when the new model is determined to be better than the previous model, the background server sends a request to the embedded terminal to upgrade the model;
  • Step 52: the embedded terminal responds to the request, and the background server automatically downloads the new model to the embedded terminal (a protocol sketch follows).
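The upgrade handshake of steps 51 and 52 could look like the sketch below. The HTTP endpoints, payloads and file paths are invented for illustration; the patent does not specify a transport.

```python
import requests  # any HTTP client would do; the endpoints below are illustrative only

SERVER = "http://server.example/api"   # hypothetical background-server address


# step 51 (server side): after the new model wins the comparison,
# ask the terminal to upgrade
def request_upgrade(terminal_id, model_version):
    requests.post(f"{SERVER}/terminals/{terminal_id}/upgrade",
                  json={"model_version": model_version}, timeout=10)


# step 52 (terminal side): on accepting the request, fetch and install the new model
def handle_upgrade_request(model_version, model_dir="/opt/gesture/models"):
    resp = requests.get(f"{SERVER}/models/{model_version}", timeout=60)
    resp.raise_for_status()
    path = f"{model_dir}/{model_version}.bin"
    with open(path, "wb") as f:
        f.write(resp.content)
    return path  # the terminal then reloads its classifier from this file
```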
  • Step 2 is specifically as follows:
  • Step 21: when there is a moving object in the visual range, the embedded terminal initiates gesture recognition;
  • Step 22: the pre-trained model is loaded, the target gesture is filtered out from the image, and subsequent images are tracked and detected (a motion-gate sketch follows).
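One common way to implement step 21's motion gate, and the `detect_motion` helper used in the loop above, is frame differencing with OpenCV; the threshold and minimum contour area below are arbitrary illustrative values.

```python
import cv2


def detect_motion(prev_gray, curr_gray, min_area=500):
    """Return True when the difference between consecutive frames
    contains a contour large enough to count as a moving object."""
    diff = cv2.absdiff(prev_gray, curr_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)      # close small gaps in the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return any(cv2.contourArea(c) >= min_area for c in contours)
```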
  • An online verification system for real-time gesture detection includes:
  • an image acquisition module for capturing images within the visual range in real time;
  • a gesture recognition tracking module, located at an embedded terminal and connected to the image acquisition module, which performs gesture recognition and tracking monitoring on the collected images by loading the trained model;
  • a record response module, coupled to the gesture recognition tracking module, for recording the recognition result and responding to it;
  • a verification operation module, connected to the record response module, configured to analyze the correctness of the recognition result, retrain a new model according to the set rules, and verify the accuracy of the new model;
  • a model update module, coupled to the verification operation module, for updating the model according to the new model.
  • The verification operation module is located at a background server and includes:
  • a detection backtest sub-module, connected to the record response module, for backtesting the recognition results and recording erroneous recognition information and noise information;
  • a model training sub-module, connected to the detection backtest sub-module, which adds a set quantity (or a set time's worth) of the erroneous recognition information and noise information to the training data and retrains to obtain a new model;
  • a verification sub-module, connected to the model training sub-module, which quantitatively evaluates the new model against a regularly updated validation set and, when the new model is better than the previous model, issues a message to update the model.
  • The record response module includes a visual feedback unit, which responds to the recognition result by displaying a corresponding icon on the display interface of the embedded terminal;
  • the record response module includes a sound feedback unit, which responds to the recognition result by playing music or adding music to a favorites list.
  • The record response module is located at the embedded terminal.
  • The image acquisition module adopts a two-dimensional imaging camera.
  • An embedded smart device includes the above-described online verification system for real-time gesture detection.
  • The above technical solution realizes a set of real-time gesture recognition methods and systems and, compared with traditional gesture recognition based on non-depth cameras, provides a more accurate online model optimization system, which helps improve recognition accuracy.
  • FIG. 1 is a schematic flow chart of the method of the present invention;
  • FIG. 2 is a schematic flow chart of step 4 of the present invention;
  • FIG. 3 is a schematic flow chart of step 5 of the present invention;
  • FIG. 4 is a schematic flow chart of step 2 of the present invention;
  • FIG. 5 is a schematic structural view of the system of the present invention;
  • FIG. 6 is a schematic structural view of a specific embodiment of the present invention.
  • Referring to FIG. 1, an online verification method for real-time gesture detection includes the following steps:
  • Step 1: the image acquisition module captures images within the visual range in real time;
  • Step 2: an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
  • Step 3: the recognition result is recorded and responded to;
  • Step 4: the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified;
  • Step 5: the previously trained model is updated with the new model.
  • In the prior art, a large amount of data is collected in advance; partial images containing a gesture are cropped as positive samples, and a larger number of negative sample images containing no gesture are cropped, forming a training set. A suitable algorithm is then trained on this training set to produce a model; the pre-trained model is loaded before recognition and used to compute whether each image contains a gesture, so recognition is limited by the pre-trained model.
  • Because different scenes and environments affect the recognition effect under such a fixed model, the invention records and analyzes the recognition results and periodically retrains on specific data to obtain a new model for updating. This provides a more accurate online model optimization system, helps improve recognition accuracy, and allows target gestures appearing anywhere within the field of view to be effectively recognized (a sketch of the conventional training procedure follows).
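The conventional training procedure described above can be sketched as follows, assuming cropped positive (gesture) and negative (background) patches on disk. The HOG-plus-linear-SVM pipeline and all file paths are illustrative choices; the patent does not prescribe a particular algorithm.

```python
import glob

import cv2
import numpy as np
from sklearn.svm import LinearSVC

hog = cv2.HOGDescriptor()  # default geometry: 64x128 detection windows


def load_patches(pattern, label):
    X, y = [], []
    for path in glob.glob(pattern):
        img = cv2.imread(path)
        gray = cv2.cvtColor(cv2.resize(img, (64, 128)), cv2.COLOR_BGR2GRAY)
        X.append(hog.compute(gray).ravel())  # one HOG feature vector per patch
        y.append(label)
    return X, y


# Positive samples: cropped images containing the gesture;
# negative samples: a larger number of images without a gesture.
pos_X, pos_y = load_patches("data/gesture/*.png", 1)
neg_X, neg_y = load_patches("data/background/*.png", 0)

clf = LinearSVC()  # the "suitable algorithm" of the text, here a linear SVM
clf.fit(np.array(pos_X + neg_X), np.array(pos_y + neg_y))
```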
  • Step 4 is as follows:
  • Step 41: the recognition result is periodically uploaded to a background server, and the background server verifies the correctness of the recognition result using a deep learning method;
  • Step 42: error cases in which the recognition result is incorrect are recorded; once the error cases reach a set number or have been collected for a set time, the data of the error cases are added to the training data of the previous model and retraining is performed to obtain the new model;
  • Step 43: a standard validation set is used to analyze the quality of the new model.
  • The background server accepts the recognition results sent by the front end and records them, then backtests the results using a more accurate detection method, recording erroneous recognition information and some noise information. At regular intervals, some of the detected data are added to the training data and the model is retrained. After the trained model is obtained, the new model is quantitatively evaluated against a regularly updated validation set (an evaluation sketch follows).
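The quantitative evaluation against the regularly updated validation set can be as simple as an accuracy comparison between the old and new models; the sketch below assumes classifiers with a scikit-learn-style `predict` method, and the optional margin is an added safety assumption.

```python
import numpy as np


def accuracy(model, X_val, y_val):
    return float(np.mean(model.predict(X_val) == np.asarray(y_val)))


def is_better(new_model, old_model, X_val, y_val, margin=0.0):
    """Issue an update only when the new model beats the previous one
    on the current validation set (optionally by a safety margin)."""
    return accuracy(new_model, X_val, y_val) > accuracy(old_model, X_val, y_val) + margin
```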
  • Most existing detection and recognition algorithms cannot balance speed against accuracy: accurate models generally require heavy computation and struggle to meet the demands of a real-time interactive system, while fast algorithms are prone to misrecognition and low recall. The present invention therefore builds the corresponding functions on the embedded terminal and on the background server respectively.
  • The embedded terminal provides real-time, fast recognition.
  • For performance reasons, the detection operation is performed only when object motion is detected, which greatly reduces the system's resource occupancy; the detected area is simultaneously tracked, which both speeds up detection and further reduces resource consumption, so the system's timeliness and stability are greatly improved.
  • The background server provides the more precise functions.
  • Because the background server's operations are periodic updates, the timeliness requirement is very low.
  • The data transmitted by the embedded terminal to the background server are periodically used by the server to verify the recognition accuracy, and the collected data are periodically used to retrain the client's detection model.
  • In actual use the client will be deployed in different environments, and in initial use different degrees of false detection may occur; but after several rounds of server-side verification and retraining, the new model becomes fully adapted to the deployed environment, satisfying the dual guarantee of timeliness and accuracy.
  • Step 5 is as follows:
  • Step 51: when the new model is determined to be better than the previous model, the background server sends a request to upgrade the model to the embedded terminal;
  • Step 52: the embedded terminal responds to the request, and the background server automatically downloads the new model to the embedded terminal.
  • After obtaining the trained new model, the background server uses a standard test set to analyze the model's quality.
  • At fixed intervals, if a new model better than the one on the embedded terminal appears, the background server sends an update request.
  • After the embedded terminal responds to the request, the background server automatically downloads the new model to the embedded terminal. Each user can thus obtain a customized model, so the embedded terminal's gesture recognition system can adapt to different environments.
  • Step 2 is as follows:
  • Step 21: when there is a moving object in the visual range, the embedded terminal starts gesture recognition;
  • Step 22: the pre-trained model is loaded, the target gesture is filtered out from the image, and subsequent images are tracked and detected.
  • Specifically, the collected images are examined by the trained classifier; if the target gesture appears, this is recorded and corresponding feedback is given, the position of the gesture is recorded, and subsequent images are tracked and detected (a tracking sketch follows).
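Tracking the region where a gesture was found, so that subsequent frames need not be scanned in full, can be sketched with one of OpenCV's built-in trackers; the KCF tracker below requires the opencv-contrib build and is only one possible choice.

```python
import cv2


def track_gesture(cap, first_frame, bbox):
    """bbox = (x, y, w, h) of the detected gesture; follow it in later frames."""
    tracker = cv2.TrackerKCF_create()        # lightweight tracker (opencv-contrib)
    tracker.init(first_frame, bbox)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, bbox = tracker.update(frame)  # cheap update on the tracked region only
        if not found:
            break                            # target lost: fall back to full detection
        yield bbox
```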
  • The image acquisition module 11 is configured to capture images within the visual range in real time.
  • The gesture recognition tracking module 12 is located in an embedded terminal 1 and connected to the image acquisition module 11; it performs gesture recognition and tracking monitoring on the captured images by loading the trained model.
  • A record response module 13 is coupled to the gesture recognition tracking module 12 for recording the recognition result and responding to it.
  • The verification operation module 20 is connected to the record response module 13 for analyzing the correctness of the recognition result, retraining to obtain a new model according to the set rules, and verifying the accuracy of the new model.
  • The model update module 21 is coupled to the verification operation module 20 for updating the previous model according to the new model.
  • The gesture recognition tracking module 12 can also be a gesture recognition program running on the embedded terminal 1, equipped with a real-time monitoring function, which provides monitoring results and data to a background server 2 for detection backtesting.
  • The embedded terminal 1 can run independently of the background server 2 when no network environment is available.
  • The verification operation module 20 is located at a background server 2 and includes:
  • a detection backtest sub-module, connected to the record response module, for backtesting the recognition results and recording erroneous recognition information and noise information;
  • a model training sub-module, connected to the detection backtest sub-module, which adds a set quantity (or a set time's worth) of erroneous recognition information and noise information to the training data and retrains to obtain a new model;
  • a verification sub-module, connected to the model training sub-module, which quantitatively evaluates the new model against a regularly updated validation set and, when the new model is better than the previous model, issues a message to update the model.
  • The background server 2 can provide data collection and detection functions: it performs more accurate deep learning, backtests and analyzes the detection results of the embedded terminal 1, records any false detections that occur, periodically trains on the data, and updates the new model to the embedded terminal 1.
  • The above deep learning can be performed by constructing a multi-layer neural network: the bottom convolution layers extract basic image information, such as edge or point information; more abstract features are then extracted layer by layer. In the gesture example, the middle layers extract information such as skin color and skin folds, the higher network layers extract local features of the gesture, and finally the fully connected layers fit a reasonable classification function (a network sketch follows).
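The layer-by-layer feature extraction described here corresponds to an ordinary convolutional network. As a hedged illustration (the layer counts, sizes and input resolution are arbitrary, and the patent does not fix an architecture), a minimal PyTorch version might be:

```python
import torch.nn as nn


class GestureNet(nn.Module):
    def __init__(self, num_gestures=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # edges, points
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # skin color, folds
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # local gesture parts
        )
        self.classifier = nn.Sequential(      # fully connected layers fit the classifier
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
            nn.Linear(128, num_gestures),
        )

    def forward(self, x):                     # expects 64x64 RGB crops
        return self.classifier(self.features(x))
```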
  • The whole process is automatic training; although it takes some time, it is a background optimization and update service, so timeliness is not a concern.
  • Meanwhile, the background server can collect training data and retrain the deep network model at specific intervals, ensuring that the model accuracy at the background server 2 is higher than at the embedded terminal 1, which achieves the purpose of verification and optimized updating.
  • The record response module 13 includes a visual feedback unit, which responds to the recognition result by displaying a corresponding icon on the display interface of the embedded terminal;
  • the record response module includes a sound feedback unit, which responds to the recognition result by playing music or adding music to a favorites list.
  • The record response module 13 is located at the embedded terminal.
  • The image acquisition module 11 can adopt a two-dimensional imaging camera for collecting real-time images, with still-image capture and video capture at 30 frames per second or above.
  • An embedded smart device includes the above-described online verification system for real-time gesture detection.
  • The embedded smart device can be a robot running an embedded system.
  • In a specific embodiment, an HD camera is connected to the embedded smart device through a MIPI (Mobile Industry Processor Interface) or a USB interface; the entire gesture control example is shown in FIG. 6.
  • On the embedded terminal, the high-definition camera captures image data appearing in the visual range in real time.
  • The gesture recognition system is activated only when there is a moving object in the camera's range.
  • When a target gesture is detected, the local graphics area in which the target gesture appears is recorded in real time.
  • The corresponding command is then executed according to which target gesture appeared. For example, when a gesture for playing music appears, the local music interface is called to start playing music; and if the recognized target gesture is a command to favorite the music, a favorites icon appears on the screen and the music-favorites interface is called to add the currently playing music to the favorites list (a dispatch sketch follows).
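The per-gesture responses of this embodiment amount to a dispatch table mapping recognized gestures to local interfaces; in the sketch below every interface (`MusicService`, `Screen`) is a stand-in invented for illustration.

```python
class MusicService:
    """Stand-in for the terminal's local music interface."""
    def play(self): ...
    def current_track(self): ...
    def add_to_favorites(self, track): ...


class Screen:
    """Stand-in for the terminal's display interface."""
    def show_icon(self, name): ...


music, screen = MusicService(), Screen()


def on_play():                       # gesture: play music
    music.play()


def on_favorite():                   # gesture: favorite the current music
    screen.show_icon("favorite")     # visual feedback unit: show the favorites icon
    music.add_to_favorites(music.current_track())


GESTURE_COMMANDS = {"play": on_play, "favorite": on_favorite}


def respond(gesture):
    handler = GESTURE_COMMANDS.get(gesture)
    if handler:
        handler()                    # execute the command for the recognized gesture
```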
  • On the background server side, the recognition results recorded at the embedded terminal are uploaded to the background server periodically; the system verifies the correctness of the recognition results using deep learning methods and records the erroneous cases.
  • When the erroneous cases reach a certain number, or have been collected for a certain amount of time, the background program adds these error cases to the original training data and retrains the model.
  • After the new model is obtained, a standard validation set is used to analyze its quality.
  • When the new model is better than the original model, the server sends a request to upgrade the model to the embedded terminal. After the embedded terminal responds, the server automatically downloads the new model to the client. After multiple iterations, the recognition accuracy is greatly improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An online verification method for real-time gesture detection, comprising the following steps: step 1, an image acquisition module captures images within the visual range in real time; step 2, an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model; step 3, the recognition result is recorded and responded to; step 4, the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified; step 5, the previously trained model is updated with the new model. Compared with traditional gesture recognition based on non-depth cameras, the method provides a more accurate online model optimization system and helps improve recognition accuracy.

Description

Online verification method and system for real-time gesture detection
Technical Field
The present invention belongs to the field of electronic technologies, and in particular relates to an online verification method and system for gesture detection.
Background Art
With the maturation of embedded technology, all kinds of smart products have sprung up, and within smart devices machine vision has long been a focus of attention. Existing gesture technologies fall into two main categories. One is three-dimensional visual recognition based on a depth camera: besides the ordinary camera, a depth camera is provided which can acquire spatial information through infrared reflection, enriching the features the camera captures and greatly increasing recognition accuracy. This has already been applied in leading-edge products, for example Microsoft's Xbox series, whose companion Kinect device is a relatively mature depth camera in the industry that allows users to interact with games through hand gestures or body postures. However, although depth-camera-based three-dimensional visual recognition enhances the perception of images, it is limited by the timeliness and stability of the depth camera and by some architectural compatibility issues, and so cannot yet be promoted and used on a large scale. The other category of technology is based on a traditional two-dimensional imaging camera, for example Haier's smart TV, which performs television control operations through gesture images captured by the camera; the implementation principle is mainly to load a pre-trained model and filter out windows that match a specified gesture. However, such traditional solutions have some problems: 1) for different scenes and environments they are limited by the pre-trained model; 2) accuracy and real-time performance cannot be satisfied at the same time.
Summary of the Invention
The present invention provides an online verification method and system for real-time gesture detection, to solve the problems of the prior art.
The specific technical solution is as follows:
An online verification method for real-time gesture detection, comprising the following steps:
Step 1: an image acquisition module captures images within the visual range in real time;
Step 2: an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
Step 3: the recognition result is recorded and responded to;
Step 4: the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified;
Step 5: the previously trained model is updated with the new model.
In the above online verification method for real-time gesture detection, step 4 is specifically as follows:
Step 41: the recognition result is periodically uploaded to a background server, and the background server verifies the correctness of the recognition result using a deep learning method;
Step 42: error cases in which the recognition result is incorrect are recorded; when it is determined that the error cases have reached a set number or have been collected for a set time, the data of the error cases are added to the training data of the previous model, and retraining is performed to obtain a new model;
Step 43: a standard validation set is used to analyze the quality of the new model.
In the above online verification method for real-time gesture detection, step 5 is specifically as follows:
Step 51: when the new model is determined to be better than the previous model, the background server sends a request to the embedded terminal to upgrade the model;
Step 52: the embedded terminal responds to the request, and the background server automatically downloads the new model to the embedded terminal.
In the above online verification method for real-time gesture detection, step 2 is specifically as follows:
Step 21: when there is a moving object within the visual range, the embedded terminal starts gesture recognition;
Step 22: the pre-trained model is loaded, the target gesture is filtered out from the images, and subsequent images are tracked and detected.
Also included is an online verification system for real-time gesture detection, comprising:
an image acquisition module for capturing images within the visual range in real time;
a gesture recognition tracking module, located at an embedded terminal and connected to the image acquisition module, which performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
a record response module, connected to the gesture recognition tracking module, for recording the recognition result and responding to it;
a verification operation module, connected to the record response module, for analyzing the correctness of the recognition result, retraining according to set rules to obtain a new model, and verifying the accuracy of the new model;
a model update module, connected to the verification operation module, for updating the model according to the new model.
In the above online verification system for real-time gesture detection, the verification operation module is located at a background server and includes:
a detection backtest sub-module, connected to the record response module, for backtesting the recognition results and recording erroneous recognition information and noise information;
a model training sub-module, connected to the detection backtest sub-module, which adds a set quantity (or a set time's worth) of the erroneous recognition information and noise information to the training data, and retrains to obtain a new model;
a verification sub-module, connected to the model training sub-module, which quantitatively evaluates the new model against a regularly updated validation set and, when the new model is better than the previous model, issues a message to update the model.
In the above online verification system for real-time gesture detection, the record response module includes a visual feedback unit, which responds to the recognition result by displaying a corresponding icon on the display interface of the embedded terminal;
and/or,
the record response module includes a sound feedback unit, which responds to the recognition result by playing music or adding music to a favorites list.
In the above online verification system for real-time gesture detection, the record response module is located at the embedded terminal.
In the above online verification system for real-time gesture detection, the image acquisition module adopts a two-dimensional imaging camera.
Also included is an embedded smart device comprising the above online verification system for real-time gesture detection.
Beneficial effects: the above technical solution realizes a set of real-time gesture recognition methods and systems and, compared with traditional gesture recognition based on non-depth cameras, provides a more accurate online model optimization system, which helps improve recognition accuracy.
Brief Description of the Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic flow chart of step 4 of the present invention;
FIG. 3 is a schematic flow chart of step 5 of the present invention;
FIG. 4 is a schematic flow chart of step 2 of the present invention;
FIG. 5 is a schematic structural view of the system of the present invention;
FIG. 6 is a schematic structural view of a specific embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with each other.
The present invention is further described below with reference to the drawings and specific embodiments, which are not to be taken as limiting the present invention.
Referring to FIG. 1, an online verification method for real-time gesture detection comprises the following steps:
Step 1: an image acquisition module captures images within the visual range in real time;
Step 2: an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
Step 3: the recognition result is recorded and responded to;
Step 4: the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified;
Step 5: the previously trained model is updated with the new model.
In the prior art, a large amount of data is collected in advance; partial images containing a gesture are cropped as positive samples, and a larger number of negative sample images containing no gesture are also cropped, forming a training set. The training set is then used with a suitable algorithm to train a model; the pre-trained model is loaded before recognition and used to compute whether each image contains a gesture. Being limited by the pre-trained model, recognition performance suffers under different scenes and environments. The present invention records and analyzes the recognition results and periodically retrains on specific data to obtain a new model for updating, providing a more accurate online model optimization system, which helps improve recognition accuracy and allows target gestures appearing anywhere within the field of view to be effectively recognized.
In the above online verification method for real-time gesture detection, step 4 is specifically as follows:
Step 41: the recognition result is periodically uploaded to a background server, which verifies the correctness of the recognition result using a deep learning method;
Step 42: error cases in which the recognition result is incorrect are recorded; when the error cases reach a set number or have been collected for a set time, the data of the error cases are added to the training data of the previous model, and retraining is performed to obtain a new model;
Step 43: a standard validation set is used to analyze the quality of the new model.
The background server accepts the recognition results sent by the front end and records them, then backtests the results using a more accurate detection method, recording erroneous recognition information as well as some noise information. At regular intervals, some of the detected data are added to the training data and the model is trained. After the trained model is obtained, the new model is quantitatively evaluated against a regularly updated validation set.
Most existing detection and recognition algorithms suffer from the problem that algorithm speed and algorithm accuracy cannot both be achieved. Accurate models generally require a large amount of computation and struggle to meet the requirements of a real-time interactive system, while fast algorithms tend to face misrecognition and low recall. The present invention builds the corresponding functions on the embedded terminal and on the background server respectively. The embedded terminal provides real-time, fast recognition; for performance reasons, the detection operation is performed only when object motion is detected, which greatly reduces the system's resource occupancy, and the detected area is simultaneously tracked, which both speeds up detection and reduces the system's resource occupancy, greatly improving the system's timeliness and stability. The background server provides the more precise functions; since the background server's operations are periodic updates, the timeliness requirement is very low. The data transmitted by the embedded terminal to the background server are periodically used by the server's algorithms to check the recognition accuracy, and the collected data are periodically used to retrain the client's detection model. In actual use the client will be deployed in different environments, and in initial use different degrees of false detection may occur; but after several rounds of server-side verification and retraining, the brand-new model will be fully suited to the deployed environment, satisfying the dual guarantee of timeliness and accuracy.
In the above online verification method for real-time gesture detection, step 5 is specifically as follows:
Step 51: when the new model is determined to be better than the previous model, the background server sends a request to the embedded terminal to upgrade the model;
Step 52: the embedded terminal responds to the request, and the background server automatically downloads the new model to the embedded terminal.
After the trained new model is obtained, a standard test set is used to analyze the quality of the model. At fixed intervals, if a new model better than the one on the embedded terminal appears, the background server issues an update request; after the embedded terminal responds to the request, the background server automatically downloads the new model to the embedded terminal. Each user can obtain a customized model, so the embedded terminal's gesture recognition system can adapt to different environments.
In the above online verification method for real-time gesture detection, step 2 is specifically as follows:
Step 21: when there is a moving object within the visual range, the embedded terminal starts gesture recognition;
Step 22: the pre-trained model is loaded, the target gesture is filtered out from the images, and subsequent images are tracked and detected.
Specifically, the collected images are examined with the trained classifier; if the target gesture appears, this is recorded and corresponding feedback is given, the position at which the gesture appears is recorded at the same time, and subsequent images are tracked and detected.
Also included is an online verification system for real-time gesture detection which, referring to FIG. 5, comprises:
an image acquisition module 11 for capturing images within the visual range in real time;
a gesture recognition tracking module 12, located at an embedded terminal 1 and connected to the image acquisition module 11, which performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
a record response module 13, connected to the gesture recognition tracking module 12, for recording the recognition result and responding to it;
a verification operation module 20, connected to the record response module 13, for analyzing the correctness of the recognition result, retraining according to set rules to obtain a new model, and verifying the accuracy of the new model;
a model update module 21, connected to the verification operation module 20, for updating the previous model according to the new model.
The gesture recognition tracking module 12 can also be a gesture recognition program running on the embedded terminal 1, equipped with a real-time monitoring function, which provides the monitoring results and data to a background server 2 for detection backtesting; the embedded terminal 1 can run independently of the background server 2 when no network environment is available.
In the above online verification system for real-time gesture detection, the verification operation module 20 is located at a background server 2 and includes:
a detection backtest sub-module, connected to the record response module, for backtesting the recognition results and recording erroneous recognition information and noise information;
a model training sub-module, connected to the detection backtest sub-module, which adds a set quantity (or a set time's worth) of erroneous recognition information and noise information to the training data and retrains to obtain a new model;
a verification sub-module, connected to the model training sub-module, which quantitatively evaluates the new model against a regularly updated validation set and, when the new model is better than the previous model, issues a message to update the model.
The background server 2 can be provided with data collection and detection functions: more accurate deep learning is performed through the background server 2, the detection results of the embedded terminal 1 are backtested and analyzed, any false detections that occur are recorded, the data are trained on periodically, and the new model is updated to the embedded terminal 1.
The above deep learning can be carried out by constructing a multi-layer neural network: the bottom convolution layers extract basic image information, such as edge or point information; thereafter more abstract features are extracted layer by layer. In the gesture example, the middle layers extract information such as skin color and skin folds, while the higher network layers extract local features of the gesture, and finally the fully connected layers fit a reasonable classification function. The whole process is automatic training; although it is relatively time-consuming, it is a background optimization and update service, so timeliness is not a concern. Meanwhile, the background server can collect training data and retrain the deep network model at specific intervals, so as to ensure that the model accuracy at the background server 2 is higher than at the embedded terminal 1, achieving the purpose of verification and optimized updating.
In the above online verification system for real-time gesture detection, the record response module 13 includes a visual feedback unit, which responds to the recognition result by displaying a corresponding icon on the display interface of the embedded terminal;
and/or,
the record response module includes a sound feedback unit, which responds to the recognition result by playing music or adding music to a favorites list.
These belong to the other services of the embedded terminal 1: after receiving the command of a target gesture, it makes an interactive response such as playing or stopping music, while a corresponding icon and visual effect are shown on the external display module of the embedded terminal.
In the above online verification system for real-time gesture detection, the record response module 13 is located at the embedded terminal.
In the above online verification system for real-time gesture detection, the image acquisition module 11 can adopt a two-dimensional imaging camera for collecting real-time images, with still-image capture and video capture at 30 frames per second or above.
Also included is an embedded smart device comprising the above online verification system for real-time gesture detection. The embedded smart device can be a robot running an embedded system.
In a specific embodiment, referring to FIG. 6, a high-definition camera is connected to the embedded smart device through a MIPI (Mobile Industry Processor Interface) or a USB interface; the entire gesture control example is shown in FIG. 6:
At the embedded terminal: the high-definition camera captures the image data appearing within the visual range in real time, and the gesture recognition system is activated only when there is a moving object within the camera's range. When a target gesture is detected, the local graphics area in which the target gesture appears is recorded in real time, and the corresponding command is then executed according to which target gesture appears. For example, when a gesture for playing music appears, the local music interface is called to start playing music; and if the recognized target gesture is a command to favorite the music, a favorites icon appears on the screen, and the music-favorites interface is called to add the currently playing music to the favorites list.
At the background server: the recognition results recorded at the embedded terminal are uploaded to the background server periodically; the system verifies the correctness of the recognition results using deep learning methods and records the erroneous cases. When the erroneous cases reach a certain number, or have been collected for a certain amount of time, the background program adds these error cases to the original training data and retrains the model. After the new model is obtained, a standard validation set is used to analyze its quality. When the resulting new model is better than the original model, the server sends a request to the embedded terminal to upgrade the model; after the embedded terminal responds, the server automatically downloads the new model to the client. After multiple iterations, the recognition accuracy is greatly improved.
The above are merely preferred embodiments of the present invention and do not thereby limit the embodiments or the scope of protection of the present invention. Those skilled in the art should appreciate that all solutions obtained by equivalent substitutions and obvious variations made using the contents of the description and drawings of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. An online verification method for real-time gesture detection, characterized by comprising the following steps:
    Step 1: an image acquisition module captures images within the visual range in real time;
    Step 2: an embedded terminal performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
    Step 3: the recognition result is recorded and responded to;
    Step 4: the correctness of the recognition result is analyzed, retraining is performed according to set rules to obtain a new model, and the accuracy of the new model is verified;
    Step 5: the previously trained model is updated with the new model.
  2. The online verification method for real-time gesture detection according to claim 1, characterized in that step 4 is specifically as follows:
    Step 41: the recognition result is periodically uploaded to a background server, and the background server verifies the correctness of the recognition result using a deep learning method;
    Step 42: error cases in which the recognition result is incorrect are recorded; when it is determined that the error cases have reached a set number or have been collected for a set time, the data of the error cases are added to the training data of the previous model, and retraining is performed to obtain a new model;
    Step 43: a standard validation set is used to analyze the quality of the new model.
  3. The online verification method for real-time gesture detection according to claim 2, characterized in that step 5 is specifically as follows:
    Step 51: when the new model is determined to be better than the previous model, the background server sends a request to the embedded terminal to upgrade the model;
    Step 52: the embedded terminal responds to the request, and the background server automatically downloads the new model to the embedded terminal.
  4. The online verification method for real-time gesture detection according to claim 1, characterized in that step 2 is specifically as follows:
    Step 21: when there is a moving object within the visual range, the embedded terminal starts gesture recognition;
    Step 22: the pre-trained model is loaded, the target gesture is filtered out from the images, and subsequent images are tracked and detected.
  5. An online verification system for real-time gesture detection, characterized by comprising:
    an image acquisition module for capturing images within the visual range in real time;
    a gesture recognition tracking module, located at an embedded terminal and connected to the image acquisition module, which performs gesture recognition and tracking monitoring on the collected images by loading a trained model;
    a record response module, connected to the gesture recognition tracking module, for recording the recognition result and responding to it;
    a verification operation module, connected to the record response module, for analyzing the correctness of the recognition result, retraining according to set rules to obtain a new model, and verifying the accuracy of the new model;
    a model update module, connected to the verification operation module, for updating the model according to the new model.
  6. The online verification system for real-time gesture detection according to claim 5, characterized in that the verification operation module is located at a background server and includes:
    a detection backtest sub-module, connected to the record response module, for backtesting the recognition results and recording erroneous recognition information and noise information;
    a model training sub-module, connected to the detection backtest sub-module, which adds a set quantity (or a set time's worth) of the erroneous recognition information and noise information to the training data and retrains to obtain a new model;
    a verification sub-module, connected to the model training sub-module, which quantitatively evaluates the new model against a regularly updated validation set and, when the new model is better than the previous model, issues a message to update the model.
  7. The online verification system for real-time gesture detection according to claim 5, characterized in that the record response module includes a visual feedback unit, which responds to the recognition result by displaying a corresponding icon on the display interface of the embedded terminal;
    and/or,
    the record response module includes a sound feedback unit, which responds to the recognition result by playing music or adding music to a favorites list.
  8. The online verification system for real-time gesture detection according to claim 5, characterized in that the record response module is located at the embedded terminal.
  9. The online verification system for real-time gesture detection according to claim 5, characterized in that the image acquisition module adopts a two-dimensional imaging camera.
  10. An embedded smart device, characterized by comprising the online verification system for real-time gesture detection according to any one of claims 5 to 9.
PCT/CN2017/080117 2016-04-13 2017-04-11 Online verification method and system for real-time gesture detection WO2017177903A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610231456.8 2016-04-13
CN201610231456.8A CN107292223A (zh) 2016-04-13 2016-04-13 Online verification method and system for real-time gesture detection

Publications (1)

Publication Number Publication Date
WO2017177903A1 true WO2017177903A1 (zh) 2017-10-19

Family

ID=60041441

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/080117 WO2017177903A1 (zh) 2016-04-13 2017-04-11 Online verification method and system for real-time gesture detection

Country Status (3)

Country Link
CN (1) CN107292223A (zh)
TW (1) TWI638278B (zh)
WO (1) WO2017177903A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109683938A * 2018-12-26 2019-04-26 苏州思必驰信息科技有限公司 Voiceprint model upgrading method and apparatus for a mobile terminal
CN112347947A * 2020-11-10 2021-02-09 厦门长江电子科技有限公司 Image data processing system and method integrating intelligent detection and automated testing
CN112378916A * 2020-11-10 2021-02-19 厦门长江电子科技有限公司 Machine-vision-based automated image grading and inspection system and method
CN112684887A * 2020-12-28 2021-04-20 展讯通信(上海)有限公司 Application device and air gesture recognition method therefor

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549881A * 2018-05-02 2018-09-18 杭州创匠信息科技有限公司 Method and apparatus for recognizing certificate text
CN109492675B * 2018-10-22 2021-02-05 深圳前海达闼云端智能科技有限公司 Medical image recognition method and apparatus, storage medium and electronic device
CN109858380A * 2019-01-04 2019-06-07 广州大学 Extensible gesture recognition method, apparatus and system, gesture recognition terminal and medium
JP7262232B2 (ja) * 2019-01-29 2023-04-21 東京エレクトロン株式会社 Image recognition system and image recognition method
CN112396042A * 2021-01-20 2021-02-23 鹏城实验室 Real-time-updated object detection method and system, and computer-readable storage medium
CN112861934A * 2021-01-25 2021-05-28 深圳市优必选科技股份有限公司 Image classification method and apparatus for an embedded terminal, and embedded terminal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632143A * 2013-12-05 2014-03-12 冠捷显示科技(厦门)有限公司 Image-based object recognition system combined with cloud computing
US20150309579A1 (en) * 2014-04-28 2015-10-29 Microsoft Corporation Low-latency gesture detection
CN105205436A * 2014-06-03 2015-12-30 北京创思博德科技有限公司 Gesture recognition system based on forearm bioelectric multi-sensors

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8396582B2 (en) * 2008-03-08 2013-03-12 Tokyo Electron Limited Method and apparatus for self-learning and self-improving a semiconductor manufacturing tool
US8693724B2 (en) * 2009-05-29 2014-04-08 Microsoft Corporation Method and system implementing user-centric gesture control
US8933884B2 (en) * 2010-01-15 2015-01-13 Microsoft Corporation Tracking groups of users in motion capture system
CN102831439B * 2012-08-15 2015-09-23 深圳先进技术研究院 Gesture tracking method and system
US20140354540A1 (en) * 2013-06-03 2014-12-04 Khaled Barazi Systems and methods for gesture recognition
TWM514600U * 2015-08-04 2015-12-21 Univ Feng Chia Somatosensory exploration and interaction system for a virtual park

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632143A * 2013-12-05 2014-03-12 冠捷显示科技(厦门)有限公司 Image-based object recognition system combined with cloud computing
US20150309579A1 (en) * 2014-04-28 2015-10-29 Microsoft Corporation Low-latency gesture detection
CN105205436A * 2014-06-03 2015-12-30 北京创思博德科技有限公司 Gesture recognition system based on forearm bioelectric multi-sensors

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109683938A * 2018-12-26 2019-04-26 苏州思必驰信息科技有限公司 Voiceprint model upgrading method and apparatus for a mobile terminal
CN109683938B * 2018-12-26 2022-08-02 思必驰科技股份有限公司 Voiceprint model upgrading method and apparatus for a mobile terminal
CN112347947A * 2020-11-10 2021-02-09 厦门长江电子科技有限公司 Image data processing system and method integrating intelligent detection and automated testing
CN112378916A * 2020-11-10 2021-02-19 厦门长江电子科技有限公司 Machine-vision-based automated image grading and inspection system and method
CN112378916B * 2020-11-10 2024-03-29 厦门长江电子科技有限公司 Machine-vision-based automated image grading and inspection system and method
CN112684887A * 2020-12-28 2021-04-20 展讯通信(上海)有限公司 Application device and air gesture recognition method therefor

Also Published As

Publication number Publication date
CN107292223A (zh) 2017-10-24
TWI638278B (zh) 2018-10-11
TW201737139A (zh) 2017-10-16

Similar Documents

Publication Publication Date Title
TWI638278B (zh) Online verification method and system for real-time gesture detection
US10438077B2 (en) Face liveness detection method, terminal, server and storage medium
CN102831439B (zh) Gesture tracking method and system
CN108846365B (zh) Method, apparatus, storage medium and processor for detecting fighting behavior in video
CN107251096B (zh) Image capturing apparatus and method
EP3860133A1 (en) Audio and video quality enhancement method and system employing scene recognition, and display device
CN108269333A (zh) Face recognition method, application server and computer-readable storage medium
KR102092931B1 (ko) Gaze tracking method and user terminal for performing the same
US9805256B2 (en) Method for setting a tridimensional shape detection classifier and method for tridimensional shape detection using said shape detection classifier
CN110209273A (zh) Gesture recognition method, interaction control method, apparatus, medium and electronic device
CN111652087B (zh) Vehicle inspection method and apparatus, electronic device and storage medium
CN111553274A (zh) High-altitude falling-object detection method and apparatus based on trajectory analysis
CN106663196A (zh) Computer identification of salient persons in video
CN107452015A (zh) Target tracking system with a re-detection mechanism
CN112312215B (zh) Start-up content recommendation method based on user identification, smart TV and storage medium
CN112527113A (zh) Gesture recognition and gesture recognition network training methods and apparatus, medium and device
CN105979366A (zh) Smart TV and content recommendation method and apparatus therefor
US20220415038A1 (en) Image detection method and apparatus, computer device, and computer-readable storage medium
CN109887331A (zh) Parking space monitoring terminal with license plate recognition function and monitoring method thereof
WO2017196617A1 (en) Obtaining calibration data of a camera
CN110314361B (zh) Convolutional-neural-network-based method and system for judging basketball goal scoring
WO2020007191A1 (zh) Liveness recognition and detection method, apparatus, medium and electronic device
CN111241926A (zh) Attendance and learning-status analysis method, system, device and readable storage medium
WO2022041182A1 (zh) Music recommendation method and apparatus
CN110598644B (zh) Multimedia playback control method and apparatus, and electronic device

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17781879

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17781879

Country of ref document: EP

Kind code of ref document: A1