WO2023077659A1 - Tai Chi recognition method based on fusion information, terminal device and storage medium - Google Patents

Tai Chi recognition method based on fusion information, terminal device and storage medium Download PDF

Info

Publication number
WO2023077659A1
WO2023077659A1 (PCT/CN2021/143893)
Authority
WO
WIPO (PCT)
Prior art keywords
information
skeleton
joint information
depth image
recognition method
Prior art date
Application number
PCT/CN2021/143893
Other languages
English (en)
French (fr)
Inventor
王浩宇
杨珊莉
吴剑煌
Original Assignee
中国科学院深圳先进技术研究院
福建中医药大学附属康复医院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 and 福建中医药大学附属康复医院
Publication of WO2023077659A1 publication Critical patent/WO2023077659A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Definitions

  • the present application relates to the technical field of action recognition, and in particular to a Tai Chi recognition method based on fusion information, a terminal device and a storage medium.
  • the present application provides a Tai Chi recognition method based on fusion information, a terminal device and a storage medium.
  • the application provides a Tai Chi recognition method based on fusion information, and the Tai Chi recognition method includes:
  • the Tai Chi recognition method also includes:
  • the third skeleton joint information includes information on a plurality of skeleton joints
  • the acquisition of second skeleton joint information corresponding to the depth image based on the acquisition time of the first depth image and sensor data includes:
  • the cross-validation of the first skeleton joint information by using the second skeleton joint information includes:
  • Cross-validation is performed on the joint information of the first skeleton according to the degree of matching.
  • the extracting the first human body orientation information from the first skeleton joint information includes:
  • the extracting the second human body orientation information from the second skeleton joint information includes:
  • the second human body orientation information is determined based on the third joint point position and the fourth joint point position.
  • the Tai Chi recognition method also includes:
  • when the verification fails, the second skeleton joint information is output.
  • the Tai Chi recognition method also includes:
  • when the verification fails, the first skeleton joint information is discarded, and the preset deep learning model is trained using the second skeleton joint information.
  • the Tai Chi recognition method also includes:
  • the present application also provides a terminal device, which includes an acquisition module, an image module, a sensor module, and an action recognition module; wherein,
  • the collection module is used to collect the first depth image and sensor data
  • the image module is configured to input the first depth image into a preset deep learning model, and obtain output first skeleton joint information;
  • the sensor module is configured to acquire second skeleton joint information corresponding to the first depth image based on the acquisition time of the first depth image and sensor data;
  • the action recognition module is configured to perform cross-validation on the first skeleton joint information by using the second skeleton joint information, and output the first skeleton joint information after the verification is successful.
  • the present application also provides another terminal device, where the terminal device includes a memory and a processor, wherein the memory is coupled to the processor;
  • the memory is used to store program data
  • the processor is used to execute the program data to realize the above Tai Chi recognition method.
  • the present application also provides a computer storage medium, which is used to store program data, and when the program data is executed by a processor, it is used to realize the above-mentioned method for recognizing Taijiquan.
  • the beneficial effects of the present application are: the terminal device collects the first depth image and sensor data; inputs the first depth image into the preset deep learning model to obtain the output first skeleton joint information; obtains, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image; cross-validates the first skeleton joint information with the second skeleton joint information; and outputs the first skeleton joint information after successful verification.
  • the Taijiquan recognition method of the present application uses sensor data to perform cross-validation on the output results of the deep learning model, effectively improving the accuracy of Taijiquan movement recognition.
  • Fig. 1 is a schematic flow chart of an embodiment of the Tai Chi recognition method provided by the present application
  • Fig. 2 is a schematic diagram of the Tai Chi recognition process provided by the present application.
  • Fig. 3 is a schematic diagram of computing the rightward orientation of the body from the inertial sensors, as provided by the present application;
  • FIG. 4 is a schematic structural diagram of an embodiment of a terminal device provided by the present application.
  • FIG. 5 is a schematic structural diagram of another embodiment of a terminal device provided by the present application.
  • Fig. 6 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
  • there are two main categories of existing human motion posture recognition methods: the first extracts the human skeleton with visual acquisition equipment, using traditional computer vision algorithms or deep learning algorithms, and then recognizes human motion postures.
  • representative methods of this type include Microsoft Kinect, Intel Realsense, and OpenPose; they recognize the human skeleton from color images and depth images, and on this basis compute the position, velocity, acceleration, and other information of the joint nodes on the skeleton.
  • Another type of algorithm is to directly obtain information such as the position, velocity, and acceleration of joint nodes on the human skeleton by wearing an inertial sensor.
  • however, current human motion posture recognition methods have a low accuracy rate for certain actions.
  • for example, for sideways postures, and especially while the person is turning around, recognition of the front and back of the body is unreliable; in addition, existing vision-based methods first extract the human skeleton and then calculate the joint information from the motion of the skeleton, so once the skeleton is extracted incorrectly, the computed joint motion data also deviates from the real data, and the calculation accuracy of the joint motion data is low.
  • this application proposes a Tai Chi recognition method and device based on fusion information.
  • the terminal device uses the color depth image collected by the depth camera as one input; at the same time, the subject wears inertial sensors on key parts of the body, and the system collects the inertial sensor data as a second input.
  • the terminal device inputs the color depth image into the trained deep learning model to obtain a preliminary extraction of the human skeleton information, then cross-validates this result against the joint position information collected by the inertial sensors, and finally fuses the two kinds of information to obtain accurate human skeleton information and precise joint motion data.
  • the data results are used as training data to continue to strengthen the training of the deep learning model.
  • FIG. 1 is a schematic flowchart of an embodiment of a Taijiquan recognition method provided in this application
  • FIG. 2 is a schematic diagram of a Taijiquan recognition process provided in this application.
  • the Taijiquan recognition method of the present application is applied to a terminal device, wherein the terminal device of the present application may be a server, or may be a system in which the server and the terminal device cooperate with each other.
  • various parts included in the terminal device such as various units, subunits, modules, and submodules, may all be set in the server, or may be set in the server and the terminal device separately.
  • the above server may be hardware or software.
  • when the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • when the server is software, it can be implemented as multiple pieces of software or software modules, such as software or software modules used to provide a distributed server, or as a single piece of software or software module, which is not specifically limited here.
  • the Taijiquan recognition method in the embodiment of the present application may be implemented in a manner in which a processor invokes computer-readable instructions stored in a memory.
  • actions involved in the embodiments of the present application include actions in multiple fields and directions, for example, Tai Chi actions, dance actions, fitness actions, life actions, and so on.
  • applicable action types are not listed one by one.
  • the Tai Chi recognition method in the embodiment of the present application specifically includes the following steps:
  • Step S11 collecting a first depth image and sensor data.
  • the terminal device uses a depth camera to collect a color depth image including the user's motion, that is, the first depth image; in addition, the terminal device uses inertial sensors to collect sensor data of each of the user's skeleton joints during movement, where the sensor data specifically includes the velocity data and acceleration data of each skeleton joint.
  • the user needs to build a hardware environment to initialize the depth camera and the inertial sensor.
  • initializing the inertial sensor also includes the first calibration of the position of the inertial sensor.
  • the user needs to face the depth camera with his feet apart and his hands hanging naturally at his sides.
  • when the depth camera detects the user's image for the first time, it inputs the acquired color depth image into the pre-established preset deep learning model, which performs its computation, recognizes the user's skeleton information, and can identify the human joints A_1 to A_N in that skeleton information.
  • the terminal device establishes a mapping between the index or number of each human skeleton joint and the index or number of the corresponding inertial sensor, and records the position of each inertial sensor in the world coordinate system as its initial position p_i.
  • the world coordinate system of the embodiment of the present application takes the midpoint between the two feet of the human skeleton model as the origin; the direction straight ahead of the body is the positive z-axis, the right side of the body is the positive x-axis, and the direction of the head is the positive y-axis.
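The calibration step above can be sketched as follows: each joint index detected by the model is paired with the sensor worn at that joint, and the sensor's current world-coordinate position is stored as its initial position p_i. This is a minimal illustration; the dictionary layout, joint indices, and coordinates are assumptions for the example, not taken from the patent.

```python
# Sketch of the first calibration: joints A_1..A_N detected by the deep
# learning model are paired with the inertial sensors worn at the same
# positions, and each sensor's world-coordinate position is recorded as
# its initial position p_i.  All names and numbers are hypothetical.

def calibrate(joint_positions, sensor_ids):
    """joint_positions: {joint_index: (x, y, z)} from the skeleton model.
    sensor_ids: sensor indices, ordered to match the sorted joint indices."""
    joint_to_sensor = {}
    initial_positions = {}
    for joint_index, sensor_id in zip(sorted(joint_positions), sensor_ids):
        joint_to_sensor[joint_index] = sensor_id
        # World frame: origin at the midpoint of the feet, z straight
        # ahead, x to the body's right, y toward the head.
        initial_positions[sensor_id] = joint_positions[joint_index]
    return joint_to_sensor, initial_positions

joints = {1: (0.2, 0.1, 0.0), 2: (-0.2, 0.1, 0.0)}   # hypothetical feet
mapping, p0 = calibrate(joints, [7, 9])
```

A later integration or validation step can then look up a sensor by its joint index through `mapping` and measure drift against `p0`.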
  • the preset deep learning model in the embodiment of the present application can use a traditional supervised deep learning method to learn a large amount of labeled video data to establish a deep learning model.
  • the deep learning model can use the color depth image collected by the depth camera as input, and calculate the motion skeleton information of the human body in the output image.
  • Step S12 Input the first depth image into the preset deep learning model, and obtain the output first skeleton joint information.
  • the terminal device inputs the first depth image into the preset deep learning model, thereby identifying the skeleton and nodes of the human body, and obtaining the joint information of the first skeleton.
  • the preset deep learning model is obtained by training in advance according to the offline training process introduced in step S11 above.
  • Step S13 Based on the acquisition time of the first depth image and the sensor data, obtain the second skeleton joint information corresponding to the first depth image.
  • the terminal device obtains the time interval Δt from the difference between the acquisition times of the first depth image and the previously collected depth image.
  • the terminal device can configure the depth camera in advance to collect at equal time intervals; therefore, the time interval Δt is generally a fixed value.
  • the terminal device calculates the new position of each inertial sensor in the world coordinate system through an integration method, based on the sensor data captured by the inertial sensors, including acceleration data and velocity data, combined with the time interval Δt.
  • time integration methods such as the Runge-Kutta method, etc. may also be used.
  • the positions of all inertial sensors in the world coordinate system at the next moment of the currently collected depth image can be calculated, so as to obtain the second skeleton joint information corresponding to the first depth image.
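As a concrete illustration of this integration step, the sketch below advances each sensor's world-coordinate position over the interval Δt from its last velocity and acceleration. The exact update rule appears in the source only as an image placeholder, so the p + vΔt + ½aΔt² form, the function name, and the data layout are assumptions consistent with the velocity and acceleration terms the description names.

```python
def integrate_positions(positions, velocities, accelerations, dt):
    """Advance every inertial sensor one frame in the world frame.
    positions, velocities, accelerations: {sensor_id: (x, y, z)}."""
    new_positions = {}
    for sid, p in positions.items():
        v = velocities[sid]
        a = accelerations[sid]
        # Assumed update: p_{n+1} = p_n + v_n*dt + 0.5*a_n*dt^2.
        # A Runge-Kutta scheme could replace this step, as the text notes.
        new_positions[sid] = tuple(
            p[i] + v[i] * dt + 0.5 * a[i] * dt * dt for i in range(3)
        )
    return new_positions

p1 = integrate_positions({0: (0.0, 0.0, 0.0)},
                         {0: (1.0, 0.0, 0.0)},
                         {0: (0.0, 2.0, 0.0)},
                         dt=0.1)
```

Such an explicit step accumulates error over time, which is exactly why the method recalibrates the sensor positions whenever the vision result is trusted.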
  • Step S14 Perform cross-validation on the first skeleton joint information by using the second skeleton joint information.
  • the terminal device may perform cross-validation on the above-mentioned second skeleton joint information and the first skeleton joint information by using the orientation of the human body.
  • the terminal device can obtain the positions of the user's left foot and right foot from the first skeleton joint information calculated by the preset deep learning model, denoted p'_l and p'_r respectively (the original formulas appear only as image placeholders; this notation is a reconstruction). From this, the current rightward orientation of the human body can be calculated as d' = (p'_r - p'_l) / ||p'_r - p'_l||.
  • the terminal device can obtain the positions of all inertial sensors in the world coordinate system from the second skeleton joint information. Assume that the index of the left foot sensor is m, and the index of the right foot sensor is n. Please refer to Figure 3 for details.
  • the current rightward orientation of the human body, computed from the sensors, is d = (p_n - p_m) / ||p_n - p_m|| (reconstructed; the original formula appears only as an image placeholder), and the matching degree is defined as the dot product of this sensor-based orientation with the orientation calculated by the deep learning model.
  • when the matching degree is greater than 0, the human body orientation calculated by the deep learning model is the same as the orientation detected by the inertial sensors, and the method proceeds to step S15.
  • when the matching degree is less than 0, the human body orientation calculated by the deep learning model is opposite to the orientation detected by the inertial sensors. In this case, the orientation calculated from the inertial sensors prevails; that is, the second skeleton joint information is output to represent the human motion posture and joint motion data.
  • the terminal device can discard the erroneous result of the deep learning model and save the correct result calculated from the inertial sensors as training data for the deep learning model, continuing to train the model.
  • Step S15 After the verification is successful, output the joint information of the first skeleton.
  • when the matching degree is greater than 0, the terminal device outputs the first skeleton joint information representing the motion posture of the human body and the joint motion data.
  • the matching degree is greater than 0, it can be considered that the human body orientation output by the deep learning model is correct.
  • the positions of the inertial sensors can be recalibrated to avoid the position drift caused by the error that accumulates as the time integration algorithm computes the sensor positions.
  • when the deep learning model of the embodiment of the present application calculates and outputs the human skeleton model and joint positions, it also generates a value from 0 to 1 indicating the probability that the output result is correct, represented by η.
  • when the user faces the visual sensor, the value of η is higher; when the user turns around, the value of η is lower.
  • when η > 0.8, the terminal device can recalibrate, that is, reposition, the inertial sensors so that they are consistent with the joint positions calculated by the deep learning model. It should be noted that this calibration process is the same as the calibration process during initialization, and is not repeated here.
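The cross-validation and output-selection logic of steps S14 and S15 can be summarized in a short sketch: each pipeline yields a rightward body direction from its left- and right-foot positions, the dot product of the two unit vectors gives the matching degree μ, the sign of μ selects which skeleton information is output, and a high model confidence η (above 0.8, per the description) triggers sensor recalibration. Function names and the sample coordinates are illustrative assumptions.

```python
import math

def unit_right_direction(left_foot, right_foot):
    """Rightward body direction: unit vector from left foot to right foot."""
    diff = [r - l for l, r in zip(left_foot, right_foot)]
    norm = math.sqrt(sum(c * c for c in diff))
    return [c / norm for c in diff]

def cross_validate(dl_left, dl_right, imu_left, imu_right, eta):
    """Return which skeleton to trust and whether to recalibrate sensors.
    dl_*: foot positions from the deep learning model; imu_*: from the
    inertial sensors; eta: the model's 0..1 confidence in its output."""
    d_prime = unit_right_direction(dl_left, dl_right)   # vision-based
    d = unit_right_direction(imu_left, imu_right)       # sensor-based
    mu = sum(a * b for a, b in zip(d, d_prime))         # matching degree
    if mu > 0:
        # Orientations agree: output the vision result; if the model is
        # confident enough, recalibrate sensors to curb integration drift.
        return "first_skeleton", eta > 0.8
    # Orientations disagree: the inertial-sensor result prevails.
    return "second_skeleton", False

out, recal = cross_validate((-0.2, 0, 0), (0.2, 0, 0),
                            (-0.21, 0, 0.01), (0.19, 0, 0.0), eta=0.9)
```

Because only the sign of μ matters, small positional noise between the two pipelines does not flip the decision; a flip occurs only when the two estimated orientations point into opposite half-spaces, which is the front/back confusion the method targets.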
  • the terminal device collects the first depth image and sensor data; inputs the first depth image into the preset deep learning model to obtain the output first skeleton joint information; obtains, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image; cross-validates the first skeleton joint information with the second skeleton joint information; and outputs the first skeleton joint information after successful verification.
  • the Tai Chi recognition method of the present application uses sensor data to cross-validate the output results of the deep learning model, effectively improving the accuracy of action recognition.
  • the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • FIG. 4 is a schematic structural diagram of an embodiment of the terminal device provided in the present application.
  • the terminal device 400 in the embodiment of the present application includes an acquisition module 41, an image module 42, a sensor module 43, and an action recognition module 44; wherein,
  • the acquisition module 41 is configured to acquire the first depth image and sensor data.
  • the image module 42 is configured to input the first depth image into a preset deep learning model, and obtain output first skeleton joint information.
  • the sensor module 43 is configured to acquire second skeleton joint information corresponding to the first depth image based on the acquisition time of the first depth image and sensor data.
  • the action recognition module 44 is configured to perform cross-validation on the first skeleton joint information by using the second skeleton joint information, and output the first skeleton joint information after the verification is successful.
  • FIG. 5 is a schematic structural diagram of another embodiment of the terminal device provided in the present application.
  • the terminal device 500 in this embodiment of the present application includes a memory 51 and a processor 52, where the memory 51 and the processor 52 are coupled.
  • the memory 51 is used for storing program data
  • the processor 52 is used for executing the program data to realize the Tai Chi recognition method described in the above-mentioned embodiments.
  • the processor 52 may also be referred to as a CPU (Central Processing Unit, central processing unit).
  • the processor 52 may be an integrated circuit chip with signal processing capability.
  • the processor 52 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • the general purpose processor may be a microprocessor or the processor 52 may be any conventional processor or the like.
  • the present application also provides a computer storage medium.
  • the computer storage medium 600 is used to store program data 61.
  • when the program data 61 is executed by the processor, it is used to realize the Tai Chi recognition method described in the above embodiments.
  • the present application also provides a computer program product, wherein the above-mentioned computer program product includes a computer program, and the above-mentioned computer program is operable to cause a computer to execute the Tai Chi recognition method as described in the embodiment of the present application.
  • the computer program product may be a software installation package.
  • the Taijiquan recognition method described in the above embodiments of the present application may be stored in a device, such as a computer-readable storage medium, when implemented in the form of a software function unit and sold or used as an independent product.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) or a processor execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application provides a Tai Chi recognition method, a terminal device, and a computer storage medium. The Tai Chi recognition method includes: collecting a first depth image and sensor data; inputting the first depth image into a preset deep learning model to obtain output first skeleton joint information; obtaining, based on the acquisition time of the first depth image and the sensor data, second skeleton joint information corresponding to the first depth image; cross-validating the first skeleton joint information with the second skeleton joint information; and outputting the first skeleton joint information after successful validation. In this way, the Tai Chi recognition method of the present application uses sensor data to cross-validate the output of the deep learning model, effectively improving the accuracy of action recognition, especially of Tai Chi movements.

Description

Tai Chi recognition method based on fusion information, terminal device and storage medium

Technical Field

The present application relates to the technical field of action recognition, and in particular to a Tai Chi recognition method based on fusion information, a terminal device, and a storage medium.

Background

Human motion posture recognition and joint motion data are widely used in many aspects of daily life, including the sports industry, the rehabilitation industry, and the security field. For example, Tai Chi is an important activity for evaluating and exercising human motor ability and cardiopulmonary capacity. High-precision, real-time recognition of human motion postures is crucial to raising the level of development of these industries.

With socio-economic development, people attach more and more importance to physical health. At the same time, as population aging intensifies, more and more elderly people experience degradation of bodily functions and declining motor ability, and need external assistance for rehabilitation training. Assessment is the first step of rehabilitation; however, owing to the lag in rehabilitation education and the scarcity of rehabilitation medical resources in China, rehabilitation of the elderly has long been a pain point. In particular, assessment of the motor ability of the elderly has traditionally relied on subjective evaluation by rehabilitation therapists, which is inefficient. To free up medical resources and improve assessment efficiency, automatic recognition of human motion postures and accurate acquisition of joint motion data are essential.

One of the main technical difficulties in recognizing Tai Chi movements and other human motion postures is recognizing complex human movements accurately and in real time. Especially in fields such as motion assessment, which place high precision requirements on joint motion data, accurate and fast recognition results and motion data are the foundation of all subsequent work.

Summary of the Invention

The present application provides a Tai Chi recognition method based on fusion information, a terminal device, and a storage medium.

The present application provides a Tai Chi recognition method based on fusion information, the Tai Chi recognition method including:

collecting a first depth image and sensor data;

inputting the first depth image into a preset deep learning model to obtain output first skeleton joint information;

obtaining, based on the acquisition time of the first depth image and the sensor data, second skeleton joint information corresponding to the first depth image;

cross-validating the first skeleton joint information with the second skeleton joint information;

outputting the first skeleton joint information after successful validation.
The Tai Chi recognition method further includes:

initializing a depth camera and inertial sensors;

collecting a second depth image with the depth camera;

inputting the second depth image into the preset deep learning model to obtain output third skeleton joint information, wherein the third skeleton joint information includes information on a plurality of skeleton joints;

establishing a mapping between each skeleton joint and the inertial sensor at the corresponding position, and recording the initial position of each inertial sensor in the world coordinate system.

Obtaining the second skeleton joint information corresponding to the depth image based on the acquisition time of the first depth image and the sensor data includes:

obtaining the time interval between the first depth image and the adjacent depth image;

obtaining, from the sensor data, the velocity information and acceleration information of the inertial sensors at the acquisition time corresponding to the first depth image;

obtaining the second skeleton joint information using the velocity information and acceleration information of the inertial sensors and the time interval.

Cross-validating the first skeleton joint information with the second skeleton joint information includes:

extracting first human body orientation information from the first skeleton joint information;

extracting second human body orientation information from the second skeleton joint information;

obtaining an orientation matching degree based on the first human body orientation information and the second human body orientation information;

cross-validating the first skeleton joint information according to the matching degree.

Extracting the first human body orientation information from the first skeleton joint information includes:

obtaining a first joint point position and a second joint point position from the first skeleton joint information;

determining the first human body orientation information based on the first joint point position and the second joint point position.

Extracting the second human body orientation information from the second skeleton joint information includes:

obtaining a third joint point position and a fourth joint point position from the second skeleton joint information;

determining the second human body orientation information based on the third joint point position and the fourth joint point position.

The Tai Chi recognition method further includes:

when the matching degree is greater than 0, the validation succeeds and the first skeleton joint information is output;

when the matching degree is less than 0, the validation fails and the second skeleton joint information is output.

The Tai Chi recognition method further includes:

when the matching degree is less than 0, the validation fails, the first skeleton joint information is discarded, and the preset deep learning model is trained with the second skeleton joint information.

The Tai Chi recognition method further includes:

when the matching degree is greater than 0, the validation succeeds and the correctness probability value output by the preset deep learning model is obtained;

determining whether the correctness probability value is greater than a preset probability value;

if so, repositioning the inertial sensors based on the first skeleton joint information.
The present application further provides a terminal device, the terminal device including an acquisition module, an image module, a sensor module, and an action recognition module, wherein:

the acquisition module is configured to collect a first depth image and sensor data;

the image module is configured to input the first depth image into a preset deep learning model and obtain output first skeleton joint information;

the sensor module is configured to obtain, based on the acquisition time of the first depth image and the sensor data, second skeleton joint information corresponding to the first depth image;

the action recognition module is configured to cross-validate the first skeleton joint information with the second skeleton joint information, and to output the first skeleton joint information after successful validation.

The present application further provides another terminal device, the terminal device including a memory and a processor, wherein the memory is coupled to the processor;

the memory is configured to store program data, and the processor is configured to execute the program data to implement the above Tai Chi recognition method.

The present application further provides a computer storage medium for storing program data which, when executed by a processor, implements the above Tai Chi recognition method.

The beneficial effects of the present application are as follows: the terminal device collects a first depth image and sensor data; inputs the first depth image into a preset deep learning model to obtain output first skeleton joint information; obtains, based on the acquisition time of the first depth image and the sensor data, second skeleton joint information corresponding to the first depth image; cross-validates the first skeleton joint information with the second skeleton joint information; and outputs the first skeleton joint information after successful validation. In this way, the Tai Chi recognition method of the present application uses sensor data to cross-validate the output of the deep learning model, effectively improving the accuracy of Tai Chi movement recognition.

Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:

Fig. 1 is a schematic flowchart of an embodiment of the Tai Chi recognition method provided by the present application;

Fig. 2 is a schematic diagram of the Tai Chi recognition process provided by the present application;

Fig. 3 is a schematic diagram of computing the rightward orientation of the body from the inertial sensors, as provided by the present application;

Fig. 4 is a schematic structural diagram of an embodiment of the terminal device provided by the present application;

Fig. 5 is a schematic structural diagram of another embodiment of the terminal device provided by the present application;

Fig. 6 is a schematic structural diagram of an embodiment of the computer storage medium provided by the present application.
Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.

There are two main categories of existing human motion posture recognition methods. The first extracts the human skeleton with visual acquisition equipment, using traditional computer vision algorithms or deep learning algorithms, and then recognizes the motion posture. Representative methods of this type include Microsoft Kinect, Intel Realsense, and OpenPose; they recognize the human skeleton from color and depth images, and on this basis compute the position, velocity, acceleration, and other information of the joint nodes on the skeleton. The second category directly obtains the position, velocity, acceleration, and other information of the joint nodes by wearing inertial sensors.

However, current human motion posture recognition methods have low accuracy for certain movements; for example, for sideways postures, and especially for distinguishing the front and back of the body while the person is turning around. In addition, existing vision-based methods first extract the skeleton and then compute the joint information from the skeleton's motion; once the skeleton is extracted incorrectly, the computed joint motion data also deviates substantially from the real data, so the calculation accuracy of the joint motion data is low.

To address these problems, the present application proposes a Tai Chi recognition method and device based on fusion information. A terminal device takes the color depth image collected by a depth camera as one input; at the same time, the subject wears inertial sensors on key parts of the body, and the system collects the inertial sensor data as a second input. The terminal device feeds the color depth image into a trained deep learning model to obtain a preliminary extraction of the human skeleton information, cross-validates this result against the joint position information collected by the inertial sensors, and finally fuses the two kinds of information to obtain accurate human skeleton information and precise joint motion data. The resulting data is also used as training data to further strengthen the training of the deep learning model.

Please refer to Fig. 1 and Fig. 2. Fig. 1 is a schematic flowchart of an embodiment of the Tai Chi recognition method provided by the present application, and Fig. 2 is a schematic diagram of the Tai Chi recognition process provided by the present application.

The Tai Chi recognition method of the present application is applied to a terminal device, which may be a server, or a system in which a server and a terminal device cooperate. Accordingly, the parts of the terminal device, such as its units, sub-units, modules, and sub-modules, may all be set in the server, or may be set in the server and the terminal device separately.

Further, the above server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules, for example software for providing a distributed server, or as a single piece of software or software module; no specific limitation is made here. In some possible implementations, the Tai Chi recognition method of the embodiments of the present application may be implemented by a processor invoking computer-readable instructions stored in a memory.

It should be noted that the actions involved in the embodiments of the present application include actions in multiple fields and directions, for example, Tai Chi movements, dance movements, fitness movements, everyday movements, and so on. The applicable action types are not listed one by one in the description of the following embodiments.

Specifically, as shown in Fig. 1, the Tai Chi recognition method of the embodiment of the present application includes the following steps:
Step S11: Collect a first depth image and sensor data.

In the embodiment of the present application, as shown in Fig. 2, the terminal device uses a depth camera to collect a color depth image including the user's movement, i.e., the first depth image. In addition, the terminal device uses inertial sensors to collect sensor data of each of the user's skeleton joints during movement, where the sensor data specifically includes the velocity data and acceleration data of each skeleton joint.

Specifically, before the Tai Chi recognition method of the embodiment of the present application is carried out, the user needs to set up the hardware environment to initialize the depth camera and the inertial sensors.

On the one hand, a depth camera must be set up to capture the user's movements; what is collected is mainly a color RGB image stream and a depth image stream. On the other hand, N lightweight inertial sensors must be worn at the human joints A_1 to A_N. In the order of the joints, S_1 to S_N denote the inertial sensors at the corresponding joints. All inertial sensors are connected with each other via wireless transmission such as Bluetooth or WIFI, and likewise connected to the computing host.

Further, initializing the inertial sensors also includes a first calibration of their positions.

When the Tai Chi recognition method formally starts, the user must face the depth camera with feet apart and hands hanging naturally. When the depth camera detects the user's image for the first time, it feeds the acquired color depth image into the pre-established preset deep learning model, which performs its computation, recognizes the user's skeleton information, and can identify the human joints A_1 to A_N in that skeleton information.

At this point, the terminal device establishes a mapping between the indices or numbers of the human skeleton joints and the indices or numbers of the inertial sensors, and records the position of each inertial sensor in the world coordinate system as its initial position p_i.

It should be noted that the world coordinate system of the embodiment of the present application takes the midpoint between the two feet of the human skeleton model as the origin; the direction straight ahead of the body is the positive z-axis, the right side of the body is the positive x-axis, and the direction of the head is the positive y-axis.

In addition, the preset deep learning model of the embodiment of the present application may be built by applying a traditional supervised deep learning method to a large amount of labeled video data. The deep learning model may take the color depth image collected by the depth camera as input and compute, as output, the motion skeleton information of the human body in the image.
Step S12: Input the first depth image into the preset deep learning model and obtain the output first skeleton joint information.

In the embodiment of the present application, as shown in Fig. 2, the terminal device inputs the first depth image into the preset deep learning model, thereby recognizing the skeleton and joint nodes of the human body and obtaining the first skeleton joint information. The preset deep learning model is trained in advance according to the offline training process introduced in step S11 above.

Step S13: Obtain, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image.

In the embodiment of the present application, the terminal device obtains the time interval Δt from the difference between the acquisition times of the first depth image and the previously collected depth image. The terminal device can configure the depth camera in advance to collect at equal time intervals; therefore, the time interval Δt is generally a fixed value.

The terminal device then calculates the new position of each inertial sensor in the world coordinate system through an integration method, based on the sensor data captured by the inertial sensors, including acceleration data and velocity data, combined with the time interval Δt.

Specifically, taking the forward Euler method as an example, the position p_i^{n+1} of inertial sensor i at step n+1 is calculated as follows (the original formula is rendered as an image in the source document; the expression below is a reconstruction consistent with the terms defined next):

p_i^{n+1} = p_i^n + v_i^n * Δt + (1/2) * a_i^n * Δt²

where v_i^n is the velocity of inertial sensor i at step n, and a_i^n is the acceleration of inertial sensor i at step n.

In other embodiments, other time integration methods, such as the Runge-Kutta method, may also be used.

With the above method, the positions of all inertial sensors in the world coordinate system at the next moment after the currently collected depth image can be calculated, so as to obtain the second skeleton joint information corresponding to the first depth image.
Step S14: Cross-validate the first skeleton joint information with the second skeleton joint information.

In the embodiment of the present application, the terminal device can use the orientation of the human body to cross-validate the second skeleton joint information against the first skeleton joint information.

Specifically, from the first skeleton joint information calculated by the preset deep learning model, the terminal device can obtain the positions of the user's left foot and right foot, denoted p'_l and p'_r respectively (the original formulas are rendered as images in the source document; the notation here is a reconstruction). From these, the current rightward orientation of the human body can be calculated as:

d' = (p'_r - p'_l) / ||p'_r - p'_l||

From the second skeleton joint information, the terminal device can obtain the positions of all inertial sensors in the world coordinate system. Let the index of the left-foot sensor be m and the index of the right-foot sensor be n (see Fig. 3 for details); then the current rightward orientation of the human body is:

d = (p_n - p_m) / ||p_n - p_m||

Further, the embodiment of the present application defines μ = d·d' as the matching degree, in terms of body orientation, between the second skeleton joint information detected by the inertial sensors and the first skeleton joint information calculated by the deep learning model.

When the matching degree is greater than 0, the body orientation calculated by the deep learning model is the same as the orientation detected by the inertial sensors, and the method proceeds to step S15.

When the matching degree is less than 0, the body orientation calculated by the deep learning model is opposite to the orientation detected by the inertial sensors. In this case, the orientation calculated from the inertial sensors prevails; that is, the second skeleton joint information is output to represent the human motion posture and joint motion data.

Further, when the matching degree is less than 0, the body orientation output by the deep learning model can be considered erroneous, and the orientation calculated by the deep learning model can be corrected. Specifically, the terminal device can discard the erroneous result of the deep learning model and save the correct result calculated from the inertial sensors as training data to continue training the deep learning model.
Step S15: After successful validation, output the first skeleton joint information.

In the embodiment of the present application, when the matching degree is greater than 0, the terminal device outputs the first skeleton joint information to represent the human motion posture and joint motion data.

Further, when the matching degree is greater than 0, the body orientation output by the deep learning model can be considered correct. In this case, the positions of the inertial sensors can be recalibrated to avoid the sensor position drift caused by the error that accumulates as the time integration algorithm computes the sensor positions.

Specifically, when the deep learning model of the embodiment of the present application calculates and outputs the human skeleton model and joint positions, it also generates a value from 0 to 1 indicating the probability that the output result is correct, denoted by η. When the user faces the visual sensor, η is high; when the user turns around, η is low.

When η > 0.8, the terminal device can recalibrate, i.e., reposition, the inertial sensors so that they are consistent with the joint positions calculated by the deep learning model. It should be noted that this calibration process is the same as the calibration during initialization and is not repeated here.

In the embodiment of the present application, the terminal device collects a first depth image and sensor data; inputs the first depth image into a preset deep learning model to obtain output first skeleton joint information; obtains, based on the acquisition time of the first depth image and the sensor data, second skeleton joint information corresponding to the first depth image; cross-validates the first skeleton joint information with the second skeleton joint information; and outputs the first skeleton joint information after successful validation. In this way, the Tai Chi recognition method of the present application uses sensor data to cross-validate the output of the deep learning model, effectively improving the accuracy of action recognition.

Those skilled in the art will understand that, in the above method of the specific implementation, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
To implement the Tai Chi recognition method of the above embodiments, the present application further provides a terminal device; see Fig. 4, which is a schematic structural diagram of an embodiment of the terminal device provided by the present application.
The terminal device 400 of the embodiments of the present application includes an acquisition module 41, an image module 42, a sensor module 43, and an action recognition module 44, wherein:
the acquisition module 41 is configured to acquire a first depth image and sensor data;
the image module 42 is configured to input the first depth image into a preset deep learning model and obtain the output first skeleton joint information;
the sensor module 43 is configured to obtain, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image;
the action recognition module 44 is configured to cross-validate the first skeleton joint information using the second skeleton joint information and, after successful validation, output the first skeleton joint information.
To implement the Tai Chi recognition method of the above embodiments, the present application further provides another terminal device; see Fig. 5, which is a schematic structural diagram of another embodiment of the terminal device provided by the present application.
The terminal device 500 of the embodiments of the present application includes a memory 51 and a processor 52, wherein the memory 51 and the processor 52 are coupled.
The memory 51 is configured to store program data, and the processor 52 is configured to execute the program data to implement the Tai Chi recognition method described in the above embodiments.
In this embodiment, the processor 52 may also be referred to as a CPU (Central Processing Unit). The processor 52 may be an integrated circuit chip with signal processing capability. The processor 52 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor 52 may be any conventional processor.
The present application further provides a computer storage medium. As shown in Fig. 6, the computer storage medium 600 is configured to store program data 61 which, when executed by a processor, implements the Tai Chi recognition method described in the above embodiments.
The present application further provides a computer program product, wherein the computer program product includes a computer program operable to cause a computer to execute the Tai Chi recognition method described in the embodiments of the present application. The computer program product may be a software installation package.
When the Tai Chi recognition method described in the above embodiments of the present application is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a device, for example a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is merely an embodiment of the present application and does not thereby limit the patent scope of the present application. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (11)

  1. A Tai Chi recognition method based on fusion information, characterized in that the Tai Chi recognition method comprises:
    acquiring a first depth image and sensor data;
    inputting the first depth image into a preset deep learning model and obtaining the output first skeleton joint information;
    obtaining, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image;
    cross-validating the first skeleton joint information using the second skeleton joint information;
    after successful validation, outputting the first skeleton joint information.
  2. The Tai Chi recognition method according to claim 1, characterized in that
    the Tai Chi recognition method further comprises:
    initializing a depth camera and inertial sensors;
    acquiring a second depth image with the depth camera;
    inputting the second depth image into the preset deep learning model and obtaining the output third skeleton joint information, wherein the third skeleton joint information comprises a plurality of pieces of bone joint information;
    establishing a mapping between each piece of bone joint information and the inertial sensor at the corresponding position, and recording the starting position of each inertial sensor in the world coordinate system.
  3. The Tai Chi recognition method according to claim 2, characterized in that
    the obtaining, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image comprises:
    obtaining the time interval between the first depth image and an adjacent depth image;
    obtaining, based on the sensor data, the velocity information and acceleration information of the inertial sensors at the acquisition time corresponding to the first depth image;
    obtaining the second skeleton joint information using the velocity information and acceleration information of the inertial sensors and the time interval.
  4. The Tai Chi recognition method according to claim 1 or 3, characterized in that
    the cross-validating the first skeleton joint information using the second skeleton joint information comprises:
    extracting first body orientation information from the first skeleton joint information;
    extracting second body orientation information from the second skeleton joint information;
    obtaining an orientation matching degree based on the first body orientation information and the second body orientation information;
    cross-validating the first skeleton joint information according to the magnitude of the matching degree.
  5. The Tai Chi recognition method according to claim 4, characterized in that
    the extracting first body orientation information from the first skeleton joint information comprises:
    obtaining a first joint point position and a second joint point position from the first skeleton joint information;
    determining the first body orientation information based on the first joint point position and the second joint point position;
    and the extracting second body orientation information from the second skeleton joint information comprises:
    obtaining a third joint point position and a fourth joint point position from the second skeleton joint information;
    determining the second body orientation information based on the third joint point position and the fourth joint point position.
  6. The Tai Chi recognition method according to claim 4, characterized in that
    the Tai Chi recognition method further comprises:
    when the matching degree is greater than 0, validation succeeds and the first skeleton joint information is output;
    when the matching degree is less than 0, validation fails and the second skeleton joint information is output.
  7. The Tai Chi recognition method according to claim 6, characterized in that
    the Tai Chi recognition method further comprises:
    when the matching degree is less than 0 and validation fails, discarding the first skeleton joint information and training the preset deep learning model using the second skeleton joint information.
  8. The Tai Chi recognition method according to claim 6, characterized in that
    the Tai Chi recognition method further comprises:
    when the matching degree is greater than 0 and validation succeeds, obtaining the correctness probability value output by the preset deep learning model;
    determining whether the correctness probability value is greater than a preset probability value;
    if so, repositioning the inertial sensors based on the first skeleton joint information.
  9. A terminal device, characterized in that the terminal device comprises an acquisition module, an image module, a sensor module, and an action recognition module, wherein:
    the acquisition module is configured to acquire a first depth image and sensor data;
    the image module is configured to input the first depth image into a preset deep learning model and obtain the output first skeleton joint information;
    the sensor module is configured to obtain, based on the acquisition time of the first depth image and the sensor data, the second skeleton joint information corresponding to the first depth image;
    the action recognition module is configured to cross-validate the first skeleton joint information using the second skeleton joint information and, after successful validation, output the first skeleton joint information.
  10. A terminal device, characterized in that the terminal device comprises a memory and a processor, wherein the memory is coupled to the processor;
    wherein the memory is configured to store program data, and the processor is configured to execute the program data to implement the Tai Chi recognition method according to any one of claims 1-8.
  11. A storage medium, characterized in that the storage medium is configured to store program data which, when executed by a processor, implements the Tai Chi recognition method according to any one of claims 1-8.
PCT/CN2021/143893 2021-11-04 2021-12-31 Tai Chi recognition method based on fusion information, terminal device and storage medium WO2023077659A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111301208.3A CN115775347A (zh) 2021-11-04 2021-11-04 Tai Chi recognition method based on fusion information, terminal device and storage medium
CN202111301208.3 2021-11-04

Publications (1)

Publication Number Publication Date
WO2023077659A1 true WO2023077659A1 (zh) 2023-05-11

Family

ID=85388398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/143893 WO2023077659A1 (zh) 2021-11-04 2021-12-31 Tai Chi recognition method based on fusion information, terminal device and storage medium

Country Status (2)

Country Link
CN (1) CN115775347A (zh)
WO (1) WO2023077659A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130077820A1 (en) * 2011-09-26 2013-03-28 Microsoft Corporation Machine learning gesture detection
CN104298358A (zh) * 2014-10-29 2015-01-21 指挥家(厦门)智能科技有限公司 Dynamic 3D gesture recognition method based on joint spatial position data
JP2017091377A (ja) * 2015-11-13 2017-05-25 日本電信電話株式会社 Posture estimation device, posture estimation method, and posture estimation program
CN109086659A (zh) * 2018-06-13 2018-12-25 深圳市感动智能科技有限公司 Human behavior recognition method and device based on multimodal feature fusion
CN113591726A (zh) * 2021-08-03 2021-11-02 电子科技大学 Cross-modal evaluation method for Tai Chi training actions


Also Published As

Publication number Publication date
CN115775347A (zh) 2023-03-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21963165

Country of ref document: EP

Kind code of ref document: A1