CN116343325A - Intelligent auxiliary system for household body building - Google Patents

Intelligent auxiliary system for household body building

Info

Publication number
CN116343325A
Authority
CN
China
Prior art keywords
time
real
motion
standard
moving object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310116843.7A
Other languages
Chinese (zh)
Inventor
初佃辉
胡鑫
金宇哲
方海男
倪丽
田思润
郝乐川
贾泽西
钟世融
张华
涂志莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2023-02-14
Filing date: 2023-02-14
Publication date: 2023-06-27
Application filed by Harbin Institute of Technology Weihai
Priority to CN202310116843.7A
Publication of CN116343325A
Legal status: Pending

Classifications

    • G06V 40/23: Recognition of whole body movements, e.g. for sport training
    • G06N 3/08: Learning methods (computing arrangements based on biological models; neural networks)
    • G06T 7/70: Image analysis; determining position or orientation of objects or cameras
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G16H 20/30: ICT specially adapted for therapies or health-improving plans relating to physical therapies or activities, e.g. physiotherapy, acupressure or exercising
    • G06T 2207/10016: Image acquisition modality: video; image sequence
    • G06T 2207/10028: Image acquisition modality: range image; depth image; 3D point clouds
    • G06T 2207/20081: Special algorithmic details: training; learning
    • G06T 2207/20084: Special algorithmic details: artificial neural networks [ANN]
    • Y02P 90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Physical Education & Sports Medicine (AREA)
  • Psychiatry (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a home fitness intelligent assistance system comprising an acquisition unit, a real-time motion feature extraction unit, a standard motion feature construction unit, an evaluation unit, a database and a display unit. The acquisition unit acquires real-time three-dimensional position information of at least one moving object; the real-time motion feature extraction unit extracts a real-time motion feature sequence of the at least one moving object based on the real-time three-dimensional position information; the standard motion feature construction unit constructs a standard motion feature sequence of at least one motion type and stores it in the database; the evaluation unit identifies the motion type and exercise effect of the at least one moving object based on the real-time motion feature sequence and the standard motion feature sequence; and the display unit displays the real-time three-dimensional position information, the motion type and the exercise effect. With this technical scheme, the motion type of a moving object can be accurately identified and the corresponding exercise effect evaluated.

Description

Intelligent auxiliary system for household body building
Technical Field
The application belongs to the technical field of motion detection, and specifically provides a household body-building intelligent auxiliary system.
Background
With the continuous improvement of living standards and health awareness, fitness exercise has become an indispensable activity in more and more people's lives. At the same time, against the backdrop of epidemic prevention and control, exercising at home and in indoor environments has shown its necessity and advantages. Regular home exercise improves physical fitness, strengthens resistance to disease, and relaxes the mind during exercise, effectively relieving psychological stress; home fitness is therefore becoming an increasingly popular form of exercise.
However, because a wide variety of exercises may be performed at home, and because most users have no professional training and no instructor to provide guidance, movements are often non-standard or incomplete. As a result, not only may the intended exercise effect fail to be achieved, but incorrect movements may also injure the body.
Therefore, it is necessary to provide a system capable of accurately and effectively detecting and recognizing the motions of the household fitness exercises and evaluating the exercise effects based on the recognition results.
Disclosure of Invention
In order to solve the problems in the prior art, the application provides a household body-building intelligent auxiliary system which comprises an acquisition unit, a real-time action feature extraction unit, a standard action feature construction unit, an evaluation unit, a database and a display unit;
the acquisition unit is used for acquiring real-time three-dimensional position information of at least one moving object;
the real-time action feature extraction unit extracts a real-time action feature sequence of at least one moving object based on the real-time three-dimensional position information;
the standard action feature construction unit is used for constructing a standard action feature sequence of at least one motion type and storing the standard action feature sequence in a database;
the evaluation unit identifies the motion type and the motion effect of at least one moving object based on the real-time motion feature sequence and the standard motion feature sequence;
the display unit is used for displaying the real-time three-dimensional position information, the movement type and the movement effect.
Further, the real-time action feature extraction unit comprises a coordinate extraction module, a people number identification module, a multi-person segmentation module and a processing module; the real-time motion feature extraction unit extracts a real-time motion feature sequence of at least one moving object based on the steps of:
s100: the coordinate extraction module determines real-time three-dimensional coordinates and real-time visibility of a plurality of joint points based on the real-time three-dimensional position information;
s200: the people number identification module identifies the number of people of the moving object based on the real-time three-dimensional coordinates of the plurality of nodes;
s300: if the number of the moving objects is greater than 1 person, executing step S400, otherwise executing step S500;
s400: the multi-person segmentation module segments the plurality of joint points based on the number of people of the moving objects, and executes step S500 for each segmented moving object;
s500: the processing module processes and generates a real-time action feature sequence of each moving object, wherein the real-time action feature sequence comprises a real-time action feature set of the moving object at a plurality of time points, and the real-time action feature set of each time point comprises real-time three-dimensional coordinates and real-time visibility of each joint point of the moving object at the time point.
Preferably, the plurality of joints includes a plurality of face joints, a plurality of torso joints, and a plurality of extremity joints.
Preferably, the people number identification module determines the number of moving objects using at least one of the following criteria: the number of faces determined from the number of facial joint points, the total number of joint points, and the three-dimensional distance between any two joint points.
Preferably, the real-time motion feature extraction unit further includes an angle extraction module that determines a real-time deflection angle of each of the plurality of nodes based on real-time three-dimensional coordinates of the plurality of nodes, and the real-time motion feature set of each time point further includes a real-time deflection angle of each of the nodes of the moving object at the time point.
Preferably, the real-time motion feature extraction unit further includes a posture correction module that performs posture correction on the real-time motion feature sequence based on real-time three-dimensional coordinates of the plurality of torso joint points.
Further, the standard motion feature construction unit constructs a standard motion feature sequence for each motion type based on the steps of:
a100: obtaining a standard action video of the motion type;
a200: calibrating a plurality of reference articulation points from the standard action video of the motion type;
a300: selecting a real-time action characteristic sequence with the same shooting angle as the standard action video from the real-time action characteristic sequences as an alternative action characteristic sequence;
a400: adjusting the positions of corresponding nodes in the alternative action feature sequence based on the positions of the plurality of reference nodes;
a500: adjusting the positions of non-corresponding nodes in the alternative action feature sequence based on the positions of the corresponding nodes in the alternative action feature sequence;
a600: generating a standard motion feature sequence of the motion type based on the positions of the corresponding articulation points and the positions of the non-corresponding articulation points, wherein the standard motion feature sequence comprises standard motion feature sets of the motion type at a plurality of time points, and the standard motion feature set of each time point comprises standard three-dimensional coordinates and standard visibility of each articulation point at the time point.
Preferably, the standard visibility is determined based on a plurality of reference joint points calibrated from the standard motion video.
Preferably, the evaluation unit uses a trained deep learning network to identify the type of motion of the at least one moving object.
Further, the evaluation unit identifies a motion type and a motion effect of the at least one moving object based on:
b100: extracting a real-time action feature sequence of at least one moving object;
b200: identifying the motion type of the moving object based on the trained deep learning network;
b300: extracting a standard action feature sequence corresponding to the identified motion type;
b400: correcting the standard visibility of each joint point in the standard action characteristic sequence corresponding to the identified motion type based on the gesture of the real-time action characteristic sequence of the moving object;
b500: counting the number of different real-time visibility of each node in the real-time action feature sequence of the moving object from the standard visibility of each node in the corrected standard action feature sequence, if the counting result is larger than a preset first threshold value, evaluating the moving effect as bad and returning to the step B100, otherwise executing the step B600;
b600: calculating the matching degree of the real-time action feature sequence of the moving object and the corrected standard action feature sequence, if the matching degree is smaller than a preset second threshold value, evaluating the moving effect as normal and returning to the step B100, otherwise, executing the step B700;
b700: the athletic performance is evaluated as good and returned to step B100.
The home fitness intelligent assistance system provided by the embodiments of the application extracts the real-time three-dimensional positions and visibility of a plurality of joint points from the real-time motion information of moving objects, can determine the number of moving objects in real time, and identifies and evaluates the motion type and exercise effect of each moving object. At the same time, the real-time motion feature sequence of a moving object is corrected using 2D image material of standard motions to obtain the standard motion feature sequence, so that standard motion feature sequences of various types can be generated conveniently and accurately without increasing system cost, effectively improving the accuracy of motion-type identification and exercise-effect evaluation.
Drawings
FIG. 1 is a schematic architecture diagram of a home fitness intelligent assistance system according to an embodiment of the present application;
FIG. 2 is a real-time 2D image of a particular moving object moving in accordance with an embodiment of the present application;
FIG. 3 is a schematic diagram of a real-time motion feature extraction unit according to an embodiment of the present application;
FIG. 4 is a flowchart of a coordinate extraction module extracting a real-time motion feature sequence of a moving object according to an embodiment of the present application;
FIG. 5 is a schematic diagram of real-time positions of a plurality of nodes of a moving object extracted by a coordinate extraction module according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the standard three-dimensional coordinates of each node of interest constructed by the standard motion feature construction module according to an embodiment of the present application;
fig. 7a and 7b are schematic views of a moving object according to an embodiment of the present application having different poses when performing the same type of movement.
Detailed Description
The present application will be further described below based on preferred embodiments with reference to the accompanying drawings.
The terminology used in this description is for the purpose of describing the embodiments of the present application and is not intended to be limiting of the present application. It should also be noted that unless explicitly stated or limited otherwise, the terms "disposed," "connected," and "connected" should be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; the two components can be connected mechanically, directly or indirectly through an intermediate medium, and can be communicated internally. The specific meaning of the terms in this application will be specifically understood by those skilled in the art.
The present application provides, by way of example, a home fitness intelligent assistance system, and fig. 1 shows a schematic architecture of the home fitness intelligent assistance system in some embodiments. As shown in FIG. 1, the intelligent auxiliary system for household fitness comprises an acquisition unit, a real-time motion feature extraction unit, a standard motion feature construction unit, an evaluation unit, a database and a display unit. The following describes specific embodiments of the respective units in detail with reference to the drawings.
In an embodiment of the present application, the acquisition unit is configured to acquire real-time three-dimensional position information of at least one moving object. At present, various devices can acquire three-dimensional positions of a moving object in real time, for example, a camera with a depth sensor can acquire real-time 2D images of the moving object and depth information of different parts of the body of the moving object relative to the camera along a shooting angle; for another example, the binocular camera imitating the interpupillary distance of the human eyes can shoot the moving object at different angles, so that an image pair with parallax is obtained, and the depth information of different parts of the body of the moving object relative to the binocular camera can also be obtained by calculating the parallax at different positions in the image pair. The various ways of acquiring real-time three-dimensional position information of a moving object are well known to those skilled in the art. Fig. 2 schematically shows real-time 2D images of a moving object acquired by an acquisition unit in some specific embodiments of the present application.
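For readers unfamiliar with stereo depth recovery, the sketch below illustrates the disparity-to-depth relation underlying the binocular approach mentioned above. It is a minimal illustration rather than part of the patented system; the focal length, baseline and function names are assumed values chosen for the example.

```python
# Minimal sketch: recovering depth from binocular disparity (Z = f * B / d).
# Assumes a rectified, calibrated stereo pair; all parameter values are illustrative.
import numpy as np

def depth_from_disparity(disparity_px: np.ndarray, focal_px: float, baseline_m: float) -> np.ndarray:
    """Convert a disparity map (pixels) to a depth map (meters)."""
    disparity = np.where(disparity_px > 0, disparity_px, np.nan)  # invalid pixels become NaN
    return focal_px * baseline_m / disparity

# Example: a body part seen with 40 px disparity by a camera with f = 800 px and a 65 mm baseline
depth = depth_from_disparity(np.array([[40.0]]), focal_px=800.0, baseline_m=0.065)
print(depth)  # about 1.3 m from the camera
```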
In an embodiment of the present application, the real-time motion feature extraction unit extracts a real-time motion feature sequence of at least one moving object based on the real-time three-dimensional position information. Specifically, as shown in fig. 3, the real-time motion feature extraction unit includes a coordinate extraction module, a person number identification module, a multi-person segmentation module, and a processing module. Those skilled in the art will appreciate that there may be various implementations of the real-time motion feature extraction unit, for example, in some specific embodiments, each module of the real-time motion feature extraction unit may be a program module stored in a hard disk, a flash memory, or the like, and invoked and executed by a central processing unit; in other specific embodiments, each of the program modules may be embedded in an integrated development board specifically oriented to AI, such as Jetson Nano.
Fig. 4 illustrates a flow chart of a real-time motion feature extraction unit extracting a real-time motion feature sequence of at least one moving object in some specific embodiments. As shown in fig. 4, the real-time motion feature extraction unit extracts a real-time motion feature sequence of at least one moving object based on the steps of:
s100: the coordinate extraction module determines real-time three-dimensional coordinates and real-time visibility of a plurality of joint points based on the real-time three-dimensional position information;
s200: the people number identification module identifies the number of people of the moving object based on the real-time three-dimensional coordinates of the plurality of nodes;
s300: if the number of the moving objects is greater than 1 person, executing step S400, otherwise executing step S500;
s400: the multi-person segmentation module segments the plurality of joint points based on the number of people of the moving objects, and executes step S500 for each segmented moving object;
s500: the processing module processes and generates a real-time action feature sequence of each moving object, wherein the real-time action feature sequence comprises a real-time action feature set of the moving object at a plurality of time points, and the real-time action feature set of each time point comprises real-time three-dimensional coordinates and real-time visibility of each joint point of the moving object at the time point.
Specifically, in step S100, the coordinate extraction module extracts real-time three-dimensional coordinates of a plurality of nodes of the moving object from the three-dimensional position information acquired by the acquisition unit, so as to characterize pose information of the moving object when performing various actions. Because muscles, skin tissues and the like of the human body are supported by bones, and all bones are connected by joints, the spatial positions among a plurality of joints of the human body can accurately describe the posture characteristics of the human body.
The coordinate extraction module may extract a plurality of nodes of the moving object and track real-time three-dimensional coordinates of each node using a machine learning framework known to those skilled in the art such as MediaPipe. FIG. 5 illustrates a schematic diagram of a plurality of nodes extracted by the coordinate extraction module for characterizing the pose of a moving object in one particular embodiment. In this embodiment, the coordinate extraction module has extracted 33 total nodes, including 11 face nodes, 4 torso nodes, and 18 limb nodes. Table 1 below lists the specific locations of the 33 nodes in the graph, respectively.
TABLE 1
[Table 1 is reproduced as an image in the original document; it lists the body position corresponding to each of the 33 joint points (11 facial, 4 torso and 18 limb joint points) shown in Fig. 5.]
The positional relationships among the 33 joint points change differently for different types of motion. For example, the relative positions of the facial joint points remain essentially unchanged in all types of motion; the relative positions of the torso joint points show different trends during twisting and bending movements; and the relative positions of the limb joint points are the most flexible and variable. The changes in the relative positions of different joint points can therefore be used to characterize the body posture of a moving object performing various movements. Furthermore, it should be appreciated that those skilled in the art may flexibly add or remove joint points to meet the needs of motion recognition and evaluation. Fig. 5 schematically shows the real-time positions of the joint points of a moving object extracted by the coordinate extraction module in a specific embodiment.
In addition, when extracting and tracking joint points with a machine learning framework such as MediaPipe, some joint points may be occluded by a specific movement of the moving object. Therefore, in the embodiment of the application, the coordinate extraction module first extracts and tracks the real-time three-dimensional coordinates of the visible joint points, then estimates the real-time three-dimensional coordinates of the occluded joint points from the visible ones, so that real-time three-dimensional coordinates are obtained for all joint points. At the same time, an additional flag bit is added to characterize the real-time visibility of each joint point (for example, when a joint point is visible in the real-time 2D image obtained by the acquisition unit, its real-time visibility is set to 1, and otherwise to 0).
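As a concrete illustration of step S100, the sketch below extracts the 33 joint points and a binary per-joint visibility flag using the MediaPipe Pose solution mentioned in the text. It is a hedged example, not the patented coordinate extraction module; in particular, the 0.5 visibility threshold is an assumption.

```python
# Illustrative sketch: 33 pose joint points with (x, y, z, visibility) per joint via MediaPipe Pose.
import cv2
import mediapipe as mp

_pose = mp.solutions.pose.Pose(static_image_mode=False, model_complexity=1)

def extract_joint_features(bgr_frame, visibility_threshold: float = 0.5):
    """Return a list of (x, y, z, v) tuples for the 33 pose joint points of one frame."""
    results = _pose.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if results.pose_world_landmarks is None:
        return []
    features = []
    for lm in results.pose_world_landmarks.landmark:
        v = 1 if lm.visibility >= visibility_threshold else 0  # binary visibility flag as in S100
        features.append((lm.x, lm.y, lm.z, v))                  # metric 3D coordinates + visibility
    return features
```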
Specifically, step S200 is used to identify the number of people exercising simultaneously. During home fitness, several moving objects may exercise at the same time; in that case, the number and spatial distribution of the joint points extracted and tracked by the coordinate extraction module differ significantly from those of a single moving object. It is therefore necessary to first identify the number of people exercising simultaneously and to segment the different moving objects before extracting their real-time motion feature sequences, so as to ensure the accuracy of recognition.
In some specific embodiments, the people number identification module determines the number of moving objects using at least one of the following criteria: the number of faces determined from the number of facial joint points, the total number of joint points, and the three-dimensional distance between any two joint points. Counting the extracted and tracked facial joint points accurately identifies the number of moving objects when every moving object faces the acquisition unit; using the total number of visible joint points and the three-dimensional distance between any two joint points as additional criteria corrects the count when several moving objects overlap or occlude each other in the depth direction. For example, when the face of a moving object is occluded, counting the total number of visible joint points can still determine whether more than one person is present; likewise, when the three-dimensional distance between two joint points is significantly greater than a normal human height (for example, exceeds 2 meters), it can be concluded that at least two moving objects are present.
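A hedged sketch of the three people-counting criteria described above follows. The 33-joint and 11-facial-joint layout and the 2-meter distance check follow the embodiment in the text, while the rule for combining the criteria is an illustrative assumption.

```python
# Sketch of the people-counting heuristics: face count, total joint count, maximum pairwise distance.
import itertools
import numpy as np

JOINTS_PER_PERSON = 33
FACE_JOINTS_PER_PERSON = 11
MAX_HUMAN_EXTENT_M = 2.0  # a pairwise joint distance beyond this implies more than one person

def estimate_person_count(joints_xyz: np.ndarray, is_face_joint: np.ndarray) -> int:
    """joints_xyz: (N, 3) array of all tracked joints; is_face_joint: (N,) boolean mask."""
    by_faces = int(np.ceil(is_face_joint.sum() / FACE_JOINTS_PER_PERSON))
    by_total = int(np.ceil(len(joints_xyz) / JOINTS_PER_PERSON))
    max_pair = max(
        (np.linalg.norm(a - b) for a, b in itertools.combinations(joints_xyz, 2)),
        default=0.0,
    )
    by_extent = 2 if max_pair > MAX_HUMAN_EXTENT_M else 1
    return max(by_faces, by_total, by_extent, 1)
```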
Specifically, if more than one moving object is recognized, the multi-person segmentation module segments the extracted joint points according to the recognized number of moving objects in step S400. Segmentation may use a machine learning framework known to those skilled in the art; for example, in one particular embodiment, the multi-person segmentation module uses a trained YOLO object recognition framework to segment the moving objects, with the following specific steps:
Step 1: dividing the real-time 2D image acquired by the acquisition unit into grid cells of side length S;
Step 2: predicting bounding boxes (Bbox) and confidence scores based on the number of moving objects through a fully convolutional neural network (CNN), while computing the posterior probability of each predicted human bounding box;
Step 3: selecting bounding boxes in descending order of posterior probability to segment the real-time 2D image and the corresponding depth image, so that each bounding box contains the joint points of one moving object.
The YOLO object recognition framework must be trained before it can be applied to segmenting multiple moving objects. During training, the situation in which too many joint points move out of view, deforming the skeleton pose and degrading the training effect, must be avoided. Therefore, when training the YOLO framework, the confidence of each joint point in the training set is additionally considered: when a joint point is within the field of view (whether visible or occluded), its confidence is set to 1; when a joint point is out of the field of view, its confidence is 0. The mean confidence over all joint points is then computed, and samples whose mean confidence is less than 0.5 are removed from the training set.
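The training-set filtering rule just described (discarding samples whose mean in-view confidence is below 0.5) can be sketched as follows; the data layout is an assumption made for illustration.

```python
# Hedged sketch: keep a training sample only if the mean in-view confidence of its joints is >= 0.5.
from typing import Dict, List

def filter_training_samples(samples: List[Dict]) -> List[Dict]:
    """Each sample is assumed to carry 'in_view': a list of 0/1 flags, one per joint point."""
    kept = []
    for sample in samples:
        flags = sample["in_view"]
        mean_conf = sum(flags) / len(flags) if flags else 0.0
        if mean_conf >= 0.5:  # samples with too many out-of-view joints are discarded
            kept.append(sample)
    return kept
```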
Finally, after the number of moving objects has been identified and the joint points of each moving object have been segmented, the processing module extracts the real-time motion feature sequence of each moving object in step S500. Specifically, the real-time motion feature sequence comprises real-time motion feature sets of the moving object at a plurality of time points, where the real-time motion feature set at each time point comprises the real-time three-dimensional coordinates and real-time visibility of each joint point of the moving object at that time point. For example, for a particular moving object Object1 observed over the time series $t_1, t_2, \dots, t_i, \dots, t_n$, the real-time motion feature set at each time point $t_i$ has $4 \times 33$ dimensions:

$$F_{t_i}^{\mathrm{Object1}} = \left\{ \left(x_{t_i}^{j},\, y_{t_i}^{j},\, z_{t_i}^{j},\, v_{t_i}^{j}\right) \;\middle|\; j = 1, \dots, 33 \right\}$$

where the superscript $j$ indexes the 33 joint points and $(x, y, z, v)$ denote the real-time three-dimensional coordinates and the real-time visibility of each joint point. Extracting the real-time motion feature set at every time point yields the real-time motion feature sequence of the moving object over the time series $t_1, t_2, \dots, t_i, \dots, t_n$:

$$S^{\mathrm{Object1}} = \left(F_{t_1}^{\mathrm{Object1}}, F_{t_2}^{\mathrm{Object1}}, \dots, F_{t_i}^{\mathrm{Object1}}, \dots, F_{t_n}^{\mathrm{Object1}}\right)$$
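A possible in-memory representation of this feature sequence is sketched below. The class and field names are illustrative assumptions rather than identifiers from the patent; the layout simply mirrors the 4 x 33 feature set per time point defined above.

```python
# Illustrative data structure for a real-time motion feature sequence: one (x, y, z, v) per joint per frame.
from dataclasses import dataclass
from typing import List, Tuple

JointFeature = Tuple[float, float, float, int]  # (x, y, z, visibility)

@dataclass
class MotionFeatureSet:
    timestamp: float
    joints: List[JointFeature]  # length 33 in the described embodiment

@dataclass
class MotionFeatureSequence:
    object_id: str
    frames: List[MotionFeatureSet]

    def append_frame(self, timestamp: float, joints: List[JointFeature]) -> None:
        self.frames.append(MotionFeatureSet(timestamp, joints))
```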
In some preferred embodiments, the real-time motion feature extraction unit further comprises an angle extraction module that determines a real-time yaw angle of each of the plurality of nodes based on real-time three-dimensional coordinates of the plurality of nodes, and the real-time motion feature set for each point in time further comprises the real-time yaw angle of each of the nodes of the moving object at the point in time. If Euler rotation information is added to any joint point on the basis of the space three-dimensional coordinates, the information for representing the gesture of the moving object is greatly expanded, and the accuracy and generalization capability of motion recognition can be effectively improved.
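The patent does not fix a particular angle definition, so the sketch below shows one common way such a deflection angle could be computed from the three-dimensional coordinates of a joint and its two neighbours (for example, the elbow angle from the shoulder, elbow and wrist joints); the vector-angle formulation is an assumption made for illustration.

```python
# Hedged sketch: per-joint deflection angle from the 3D coordinates of a joint and its two neighbours.
import numpy as np

def joint_deflection_angle(parent: np.ndarray, joint: np.ndarray, child: np.ndarray) -> float:
    """Return the angle (degrees) at `joint` formed by the segments toward `parent` and `child`."""
    u = parent - joint
    v = child - joint
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# Example: a right angle at the elbow
print(joint_deflection_angle(np.array([0, 1, 0]), np.array([0, 0, 0]), np.array([1, 0, 0])))  # ~90.0
```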
In some preferred embodiments, the real-time motion feature extraction unit further comprises a pose correction module that performs pose correction on the real-time motion feature sequence based on real-time three-dimensional coordinates of the plurality of torso joint points.
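The exact posture correction is not disclosed, so the following sketch shows one plausible interpretation: aligning the skeleton to a canonical frame using the torso joint points, with the hip midpoint at the origin and the shoulder line along the x-axis. The joint indices and the alignment choice are assumptions.

```python
# Hedged sketch of posture correction based on torso joint points (canonical-frame alignment).
import numpy as np

def correct_posture(joints: np.ndarray, l_shoulder: int, r_shoulder: int,
                    l_hip: int, r_hip: int) -> np.ndarray:
    """joints: (33, 3) array of real-time 3D coordinates; the index arguments select torso joints."""
    centered = joints - (joints[l_hip] + joints[r_hip]) / 2.0      # move hip midpoint to the origin
    shoulder_dir = centered[r_shoulder] - centered[l_shoulder]
    yaw = np.arctan2(shoulder_dir[2], shoulder_dir[0])             # shoulder-line angle in the x-z plane
    c, s = np.cos(yaw), np.sin(yaw)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])  # rotates that angle back to zero
    return centered @ rot_y.T
```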
In an embodiment of the application, the standard motion feature construction unit is configured to construct a standard motion feature sequence of at least one motion type and store it in the database. As is readily apparent to those skilled in the art, the standard motion feature sequence has the same data format as the real-time motion feature sequence, so that it can subsequently serve as the standard against which real-time motion feature sequences are identified and evaluated. That is, for a given motion type, the standard motion feature sequence comprises standard motion feature sets of that motion type at a plurality of time points, where the standard motion feature set at each time point comprises the standard three-dimensional coordinates and standard visibility of each joint point at that time point. For example, for the time series $t_1, t_2, \dots, t_i, \dots, t_n$, the corresponding standard motion feature sequence is

$$\bar{S} = \left(\bar{F}_{t_1}, \bar{F}_{t_2}, \dots, \bar{F}_{t_i}, \dots, \bar{F}_{t_n}\right), \qquad \bar{F}_{t_i} = \left\{ \left(X_{t_i}^{j},\, Y_{t_i}^{j},\, Z_{t_i}^{j},\, V_{t_i}^{j}\right) \;\middle|\; j = 1, \dots, 33 \right\}$$

where the superscripts and subscripts have the same meaning as in the real-time motion feature sequence, and $(X, Y, Z, V)$ denote, respectively, the three-dimensional coordinates that each joint point should occupy when performing this type of motion and the standard visibility that each joint point should have at the shooting angle of the acquisition unit.
An ideal way to construct the standard motion feature sequence would be to use professional practitioners of each motion type (for example yoga, which places high demands on movement quality) as moving objects, acquiring and extracting the real-time motion feature sequence of that motion type through the acquisition unit and the real-time motion feature extraction unit and using it as the standard motion feature sequence of the motion type. In practice, however, this approach greatly increases the cost of constructing standard motion feature sequences; building them from standard motion material obtained through channels such as the Internet is undoubtedly more convenient and more economical.
Because standard motion material generally consists of two-dimensional videos shot at a specific angle, the standard three-dimensional positions of the joint points cannot be recovered from the two-dimensional information of the standard motion video alone. Therefore, in the embodiment of the application, two-dimensional videos showing the standard motions of various motion types are fused with the real-time three-dimensional position information, acquired by the acquisition unit, of a moving object performing the same motion type, so as to construct the standard motion feature sequences of the various motion types.
Specifically, the standard motion feature construction unit constructs a standard motion feature sequence for each motion type based on the steps of:
a100: obtaining a standard action video of the motion type;
a200: calibrating a plurality of reference articulation points from the standard action video of the motion type;
a300: selecting a real-time action characteristic sequence with the same shooting angle as the standard action video from the real-time action characteristic sequences as an alternative action characteristic sequence;
a400: adjusting the positions of corresponding nodes in the alternative action feature sequence based on the positions of the plurality of reference nodes;
a500: adjusting the positions of non-corresponding nodes in the alternative action feature sequence based on the positions of the corresponding nodes in the alternative action feature sequence;
a600: generating a standard motion feature sequence of the motion type based on the positions of the corresponding articulation points and the positions of the non-corresponding articulation points, wherein the standard motion feature sequence comprises standard motion feature sets of the motion type at a plurality of time points, and the standard motion feature set of each time point comprises standard three-dimensional coordinates and standard visibility of each articulation point at the time point.
In some preferred embodiments, the standard visibility is determined based on a plurality of reference joint points calibrated from the standard motion video.
In the above steps, a standard motion video of a specific motion type is first obtained and a plurality of reference joint points are calibrated on it; obviously, for a standard motion video in two-dimensional format, all calibrated joint points are visible. Next, a sequence whose shooting angle matches that of the standard motion video is selected from the real-time motion feature sequences extracted by the real-time motion feature extraction unit as the alternative motion feature sequence. The positions of the corresponding joint points in the alternative motion feature sequence are then adjusted using the calibrated reference joint points so that they match the reference joint points, and the visibility of those corresponding joint points is set to visible. Finally, the positions of the non-corresponding joint points are adjusted based on constraints on the spatial positions and angles between joint points (for example, the three-dimensional distance between two adjacent joint points remains unchanged), their visibility is set to invisible, and the three-dimensional positions and visibility of all joint points are thereby determined. Performing these operations at every time point yields the standard motion feature sequence.
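The adjustment idea in steps A400 and A500 could be sketched as follows: joints that have calibrated 2D references are snapped to those references, and each remaining joint is re-placed along its original direction from its parent so that adjacent bone lengths are preserved. The actual patented procedure is not fully specified, so the data layout and the parent-before-child ordering of the skeleton tree below are illustrative assumptions.

```python
# Hedged sketch of reference-based joint adjustment with bone-length preservation (A400-A500).
import numpy as np

def adjust_to_reference(joints: np.ndarray, ref_2d: dict, parents: dict) -> np.ndarray:
    """joints: (33, 3) alternative-sequence coords; ref_2d: {joint_idx: (X, Y)} calibrated references;
    parents: {joint_idx: parent_idx}, assumed ordered so each parent appears before its children."""
    adjusted = joints.copy()
    for j, (rx, ry) in ref_2d.items():            # corresponding joints: take the reference x, y,
        adjusted[j, 0], adjusted[j, 1] = rx, ry   # keep the alternative sequence's depth coordinate z
    for j, p in parents.items():
        if j in ref_2d:
            continue                              # already fixed by a reference joint
        bone = joints[j] - joints[p]              # original offset to the parent joint
        length = np.linalg.norm(bone)
        direction = bone / (length + 1e-9)
        adjusted[j] = adjusted[p] + direction * length  # preserve bone length and direction
    return adjusted
```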
By using the steps A100 to A600, the motion characteristic sequence which does not reach the standard degree can be corrected by using the two-dimensional standard motion video which is easy to obtain, so that the standard motion characteristic sequence in the three-dimensional format can be obtained. FIG. 6 is a schematic diagram illustrating the standard three-dimensional coordinates of each of the nodes constructed by the standard motion feature construction module in one particular embodiment.
In an embodiment of the application, the evaluation unit identifies a motion type and a motion effect of the at least one moving object based on the real-time motion feature sequence and the standard motion feature sequence.
In some preferred embodiments, the evaluation unit may identify the motion type of the at least one moving object using a trained deep learning network, in particular the evaluation unit identifies the motion type and the motion effect of the at least one moving object based on the following steps:
b100: extracting a real-time action feature sequence of at least one moving object;
b200: identifying the motion type of the moving object based on the trained deep learning network;
b300: extracting a standard action feature sequence corresponding to the identified motion type;
b400: correcting the standard visibility of each joint point in the standard action characteristic sequence corresponding to the identified motion type based on the gesture of the real-time action characteristic sequence of the moving object;
b500: counting the number of different real-time visibility of each node in the real-time action feature sequence of the moving object from the standard visibility of each node in the corrected standard action feature sequence, if the counting result is larger than a preset first threshold value, evaluating the moving effect as bad and returning to the step B100, otherwise executing the step B600;
b600: calculating the matching degree of the real-time action feature sequence of the moving object and the corrected standard action feature sequence, if the matching degree is smaller than a preset second threshold value, evaluating the moving effect as normal and returning to the step B100, otherwise, executing the step B700;
b700: the athletic performance is evaluated as good and returned to step B100.
In the above steps, the motion type of the moving object is first identified in steps B100 to B200, and the exercise effect is then evaluated in steps B300 to B700. Because the angle at which the acquisition unit captures the moving object may differ from the shooting angle of the standard motion, step B400 applies an angle transformation to the standard motion feature sequence so that it matches the shooting angle of the real-time motion feature sequence, and on that basis revises the standard visibility of each joint point according to the occlusion relationships of human tissue. For example, in a specific embodiment, the standard motion feature sequence for sit-ups is captured from the side, while the real-time motion feature sequence is acquired with the moving object facing the acquisition unit; the shooting angle of the standard motion feature sequence should then be adjusted to the acquisition direction of the real-time motion feature sequence, and the standard visibility of the joint points occluded by human tissue at that angle should be corrected to invisible.
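One plausible reading of the step B400 angle transformation is sketched below: the standard joint coordinates are rotated about the vertical axis to the real-time capture direction, and joints that end up hidden behind the body are marked invisible. The depth-based occlusion test and its tolerance are assumptions; the patent only states that occluded joint points are corrected to invisible.

```python
# Hedged sketch of B400: rotate the standard frame to the capture angle, then revise visibility.
import numpy as np

def transform_standard_frame(std_xyz: np.ndarray, std_vis: np.ndarray,
                             angle_deg: float, depth_tol: float = 0.35):
    """std_xyz: (33, 3) standard coords; std_vis: (33,) 0/1 flags; angle_deg: change of viewpoint."""
    a = np.radians(angle_deg)
    c, s = np.cos(a), np.sin(a)
    rot_y = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    xyz = std_xyz @ rot_y.T
    # Joints lying much deeper than the frontmost joint are treated as occluded by the body.
    vis = np.where(xyz[:, 2] > xyz[:, 2].min() + depth_tol, 0, std_vis)
    return xyz, vis
```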
After these operations, the exercise effect of the moving object can be evaluated against the corrected standard motion feature sequence. In evaluating the exercise effect, step B500 first counts the number of joint points whose real-time visibility in the real-time motion feature sequence differs from the standard visibility of the corresponding joint point in the corrected standard motion feature sequence. A visibility mismatch indicates a large deviation between the real-time motion and the standard motion: for example, when the moving object performs motions of this type (such as yoga, sit-ups, push-ups or rope skipping) with joint positions and rotation angles that deviate from the standard motion, the number of joint points whose real-time visibility differs from the standard visibility may exceed the preset first threshold, and the exercise effect can then be directly evaluated as poor.
If the number of joint points whose real-time visibility differs from the standard visibility is less than or equal to the preset first threshold, the degree of matching between the real-time motion feature sequence and the standard motion feature sequence is further evaluated in step B600. For example, the real-time and standard three-dimensional coordinates of each joint point may be normalized using the line connecting the midpoint of the left and right shoulders with the midpoint of the left and right hips, after which the mean Euclidean distance between the real-time and standard three-dimensional coordinates of the joint points is taken as the matching degree (the mean difference between the real-time deflection angle of each joint point and the standard deflection angle determined from the standard three-dimensional coordinates may also be used). If the matching degree is larger than the preset second threshold, the exercise effect is evaluated as normal; otherwise the real-time motion of the moving object can be considered relatively standard and the exercise effect is evaluated as good.
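Steps B500 and B600, in the distance-based formulation just described, could be sketched as follows. The threshold values and the normalisation by the shoulder-hip midline length are illustrative assumptions consistent with the description, not figures taken from the patent.

```python
# Hedged sketch of exercise-effect evaluation: visibility mismatches (B500), then matching degree (B600).
import numpy as np

def evaluate_effect(rt_xyz, rt_vis, std_xyz, std_vis,
                    l_sh, r_sh, l_hip, r_hip,
                    first_threshold=5, second_threshold=0.15):
    """rt_*/std_*: (33, 3) coordinates and (33,) visibility flags; *_sh/*_hip: joint indices."""
    if int(np.sum(rt_vis != std_vis)) > first_threshold:
        return "poor"                      # step B500: too many visibility mismatches

    def normalise(xyz):
        torso = np.linalg.norm((xyz[l_sh] + xyz[r_sh]) / 2 - (xyz[l_hip] + xyz[r_hip]) / 2)
        return xyz / (torso + 1e-9)        # scale by the shoulder-hip midline length

    distance = np.mean(np.linalg.norm(normalise(rt_xyz) - normalise(std_xyz), axis=1))
    return "normal" if distance > second_threshold else "good"   # step B600
```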
In some preferred embodiments, the joint points at different positions, such as the torso joint points and the limb joint points, can be further grouped for statistics; from the statistical results, the deviation of each body part's motion from the standard motion can be determined, and corresponding motion accuracy scores, motion guidance or advice can be given on that basis. Fig. 7a and 7b show schematic views of a specific moving object with different body postures while performing the same type of motion. As shown in Fig. 7a and 7b, the motions of the limbs and torso are standard and only the angle of the facial joint points deviates from the standard motion; in this case guidance or advice can be given for that specific body part.
In the embodiment of the application, the display unit may be a liquid crystal display, a mobile phone, a tablet computer, or other devices, and is configured to display the real-time three-dimensional position information, the motion type, and the motion effect.
While the foregoing is directed to embodiments of the present application, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (10)

1. A home fitness intelligent assistance system, comprising an acquisition unit, a real-time action feature extraction unit, a standard action feature construction unit, an evaluation unit, a database and a display unit, characterized in that:
the acquisition unit is used for acquiring real-time three-dimensional position information of at least one moving object;
the real-time action feature extraction unit extracts a real-time action feature sequence of at least one moving object based on the real-time three-dimensional position information;
the standard action feature construction unit is used for constructing a standard action feature sequence of at least one motion type and storing the standard action feature sequence in a database;
the evaluation unit identifies the motion type and the motion effect of at least one moving object based on the real-time motion feature sequence and the standard motion feature sequence;
the display unit is used for displaying the real-time three-dimensional position information, the movement type and the movement effect.
2. The home fitness intelligent assistance system of claim 1, wherein:
the real-time action feature extraction unit comprises a coordinate extraction module, a people number identification module, a multi-person segmentation module and a processing module;
the real-time motion feature extraction unit extracts a real-time motion feature sequence of at least one moving object based on the steps of:
s100: the coordinate extraction module determines real-time three-dimensional coordinates and real-time visibility of a plurality of joint points based on the real-time three-dimensional position information;
s200: the people number identification module identifies the number of people of the moving object based on the real-time three-dimensional coordinates of the plurality of nodes;
s300: if the number of the moving objects is greater than 1 person, executing step S400, otherwise executing step S500;
s400: the multi-person segmentation module segments the plurality of joint points based on the number of people of the moving objects, and executes step S500 for each segmented moving object;
s500: the processing module processes and generates a real-time action feature sequence of each moving object, wherein the real-time action feature sequence comprises a real-time action feature set of the moving object at a plurality of time points, and the real-time action feature set of each time point comprises real-time three-dimensional coordinates and real-time visibility of each joint point of the moving object at the time point.
3. The home fitness intelligent assistance system of claim 2, wherein:
the plurality of joints includes a plurality of facial joints, a plurality of torso joints, and a plurality of extremity joints.
4. The home fitness intelligent assistance system of claim 3, wherein the people number identification module determines the number of moving objects using at least one of the following criteria:
the number of faces determined from the number of facial joint points, the total number of joint points, and the three-dimensional distance between any two joint points.
5. The home fitness intelligent assistance system of claim 2, wherein:
the real-time motion feature extraction unit further includes an angle extraction module that determines a real-time deflection angle of each of the plurality of nodes based on real-time three-dimensional coordinates of the plurality of nodes, and,
the real-time motion feature set for each point in time further comprises real-time yaw angles of respective nodes of the moving object at the point in time.
6. A home fitness intelligent assistance system according to claim 3, wherein:
the real-time motion feature extraction unit further comprises a gesture correction module, and the gesture correction module corrects the gesture of the real-time motion feature sequence based on real-time three-dimensional coordinates of the trunk joints.
7. The home-based fitness intelligent assistance system of claim 2, wherein the standard motion feature construction unit constructs a standard motion feature sequence for each type of motion based on the steps of:
a100: obtaining a standard action video of the motion type;
a200: calibrating a plurality of reference articulation points from the standard action video of the motion type;
a300: selecting a real-time action characteristic sequence with the same shooting angle as the standard action video from the real-time action characteristic sequences as an alternative action characteristic sequence;
a400: adjusting the positions of corresponding nodes in the alternative action feature sequence based on the positions of the plurality of reference nodes;
a500: adjusting the positions of non-corresponding nodes in the alternative action feature sequence based on the positions of the corresponding nodes in the alternative action feature sequence;
a600: generating a standard motion feature sequence of the motion type based on the positions of the corresponding articulation points and the positions of the non-corresponding articulation points, wherein the standard motion feature sequence comprises standard motion feature sets of the motion type at a plurality of time points, and the standard motion feature set of each time point comprises standard three-dimensional coordinates and standard visibility of each articulation point at the time point.
8. The home fitness intelligent assistance system of claim 7, wherein:
the standard visibility is determined based on a plurality of reference joint points calibrated from the standard motion video.
9. The home fitness intelligent assistance system of claim 7, wherein:
the evaluation unit uses a trained deep learning network to identify a type of motion of at least one moving object.
10. The home fitness intelligent assistance system of claim 9, wherein the evaluation unit identifies a type of movement and a movement effect of at least one moving object based on:
b100: extracting a real-time action feature sequence of at least one moving object;
b200: identifying the motion type of the moving object based on the trained deep learning network;
b300: extracting a standard action feature sequence corresponding to the identified motion type;
b400: correcting the standard visibility of each joint point in the standard action characteristic sequence corresponding to the identified motion type based on the gesture of the real-time action characteristic sequence of the moving object;
b500: counting the number of different real-time visibility of each node in the real-time action feature sequence of the moving object from the standard visibility of each node in the corrected standard action feature sequence, if the counting result is larger than a preset first threshold value, evaluating the moving effect as bad and returning to the step B100, otherwise executing the step B600;
b600: calculating the matching degree of the real-time action feature sequence of the moving object and the corrected standard action feature sequence, if the matching degree is smaller than a preset second threshold value, evaluating the moving effect as normal and returning to the step B100, otherwise, executing the step B700;
b700: the athletic performance is evaluated as good and returned to step B100.
CN202310116843.7A (priority date 2023-02-14, filing date 2023-02-14) Intelligent auxiliary system for household body building, Pending, CN116343325A (en)

Priority Applications (1)

Application Number: CN202310116843.7A; Priority Date: 2023-02-14; Filing Date: 2023-02-14; Title: Intelligent auxiliary system for household body building (CN116343325A)

Applications Claiming Priority (1)

Application Number: CN202310116843.7A; Priority Date: 2023-02-14; Filing Date: 2023-02-14; Title: Intelligent auxiliary system for household body building (CN116343325A)

Publications (1)

Publication Number: CN116343325A; Publication Date: 2023-06-27

Family

ID=86879771

Family Applications (1)

Application Number: CN202310116843.7A; Priority Date: 2023-02-14; Filing Date: 2023-02-14; Title: Intelligent auxiliary system for household body building; Status: Pending (CN116343325A)

Country Status (1)

Country Link
CN (1) CN116343325A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118015710A (en) * 2024-04-09 2024-05-10 浙江深象智能科技有限公司 Intelligent sports identification method and device


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination