CN111652076A - Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test - Google Patents

Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test

Info

Publication number
CN111652076A
Authority
CN
China
Prior art keywords
human body
image
scale
dimensional coordinates
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010390863.XA
Other languages
Chinese (zh)
Inventor
余娟
孔航
葛学人
王兵凯
王子石
李文沅
杨知方
余维华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Zhiyixing Technology Development Co ltd
Original Assignee
Chongqing University
Chongqing Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University, Chongqing Medical University filed Critical Chongqing University
Priority to CN202010390863.XA priority Critical patent/CN111652076A/en
Publication of CN111652076A publication Critical patent/CN111652076A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Abstract

The invention discloses an automatic gesture recognition system for an AD (Alzheimer's disease) scale comprehension capability test, which mainly comprises a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module, a gesture recognition module, and a database. A mathematical evaluation model is established for the completion of the actions specified in the AD scale: human skeleton coordinate points are extracted with OpenPose, and a paper-localization algorithm based on image morphological processing completes the assessment of how well the subject performs the actions.

Description

Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test
Technical Field
The invention relates to the field of human body posture recognition, and in particular to an automatic posture recognition system for AD (Alzheimer's disease) scale comprehension capability tests.
Background
Dementia is an age-related syndrome whose prevalence rises rapidly with age. In China, the prevalence of dementia is 5.14% among people over 65 and rises to 23.66% among those over 85; there are currently more than 10 million elderly dementia patients (about 25% of the world total). Dementia has become the fourth greatest threat to the health of the elderly, after cardiovascular disease, stroke, and malignant tumors, and is also one of the main causes of disability in old age.
Dementia has diverse etiologies. Alzheimer's disease (AD), the main type of dementia, is a degenerative disease of the nervous system characterized chiefly by progressive cognitive dysfunction, and it accounts for 50-75% of dementia cases. Cognitive function spans memory, language, calculation, attention, visuospatial ability, judgment, and executive function.
In clinical practice, the severity of cognitive impairment, follow-up of therapeutic efficacy and prognosis judgment of AD patients depend greatly on the assessment of intellectual state examination scales.
Comprehension tests are an important component of neuropsychological scale testing; in current clinical practice they rely mainly on manual observation of the subject's behavioral characteristics and manual assessment of cognitive function. A comprehension test generally takes 30 minutes, consumes considerable human resources, is inefficient to score, and is costly to run, which makes it difficult to apply in large-scale cognitive screening.
In recent years, artificial intelligence has made great progress in gesture recognition and analysis and is expected to be applied to the comprehension test of mental state examination scales: intelligent, automated analysis of motion video can assist the traditional manual AD diagnosis process and improve the efficiency of scale assessment and AD diagnosis.
The AD scale imposes strict requirements on the detection of actions: the whole action set follows a fixed temporal sequence, and whether an action is completed is judged against specific medical standards.
In addition, the AD scale involves not only motion detection but also detection of other objects (such as paper); the combined use of object detection algorithms and motion detection algorithms is an important aspect not addressed by existing pose estimation techniques.
Disclosure of Invention
The invention aims to provide an automatic gesture recognition system for an AD (Alzheimer's disease) scale comprehension capability test, which mainly comprises a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module, a gesture recognition module, and a database;
the video stream acquisition module acquires a video stream containing human body posture information and target paper information and decomposes it into frames of color images; the human posture is the action the subject is instructed to complete according to the AD scale.
The human body key point two-dimensional coordinate extraction module receives the color images and extracts the human body key point two-dimensional coordinates of each frame of color image by utilizing an OpenPose program;
the human body key points comprise a plurality of body skeleton joint points, left and right hand skeleton joint points and face characteristic points.
The main steps for extracting the two-dimensional coordinates of the key points of the human body are as follows:
a) the sizes of all color images are unified to W × H. W is the width and H is the height.
b) And inputting the color images with uniform sizes into a VGG19 network, and extracting human body posture features to obtain a feature image F. The VGG19 network includes an input layer, a pooling layer, and a convolutional layer.
c) The feature image F is input into a dual-branch convolutional neural network to obtain a confidence map S and a key point affinity vector field L. The confidence map S represents the reliability of the key point coordinates; the key point affinity vector field L represents the degree of association between the parts of the human body.
the dual-branch convolutional neural network comprises a first branch convolutional neural network and a second branch convolutional neural network which are parallel. The input of the first branch model is a characteristic image F, and the output is a key point confidence map S. The input of the second branch model is a characteristic image F, and the output is a key point affinity vector field L.
d) And acquiring the coordinates of the key points in the characteristic image by using a clustering method.
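In practice, steps a)-d) need not be reimplemented; they can be driven through OpenPose's Python bindings. The sketch below is a minimal example under that assumption — the model folder, the test-video path, and the exact emplaceAndPop signature (which varies between OpenPose versions) are assumptions, not part of the original disclosure.

```python
# Minimal sketch: per-frame 2D key point extraction with OpenPose's Python
# bindings (pyopenpose). OpenPose internally performs steps a)-d): resizing,
# VGG19 feature extraction, the dual-branch network (confidence maps S and
# affinity fields L), and the grouping of candidate key points.
import cv2
import pyopenpose as op  # requires an OpenPose build with Python enabled

params = {
    "model_folder": "openpose/models/",  # assumed install location
    "model_pose": "COCO",                # 18 body key points, as used here
    "hand": True,                        # 21 key points per hand
    "face": True,                        # 70 facial key points
}
wrapper = op.WrapperPython()
wrapper.configure(params)
wrapper.start()

def extract_keypoints(frame_bgr):
    """Return (body, left_hand, right_hand, face) arrays of (x, y, confidence)."""
    datum = op.Datum()
    datum.cvInputData = frame_bgr
    # Older OpenPose builds take a plain list instead of op.VectorDatum.
    wrapper.emplaceAndPop(op.VectorDatum([datum]))
    return (datum.poseKeypoints, datum.handKeypoints[0],
            datum.handKeypoints[1], datum.faceKeypoints)

cap = cv2.VideoCapture("subject_test.mp4")  # hypothetical test video
ok, frame = cap.read()
if ok:
    body, lhand, rhand, face = extract_keypoints(frame)
```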
The target object vertex two-dimensional coordinate extraction module extracts the target object vertex two-dimensional coordinates of each frame of color image by using an image morphology processing method;
the main steps for extracting the two-dimensional coordinates of the vertex of the target object are as follows:
I) and carrying out binarization processing on all the color images to obtain a binarized image.
II) mapping pixel information of the binarized image from an RGB color space to a YCrCb color space. Where Y represents luminance, Cr represents a red component in the light source, and Cb represents a blue component in the light source.
III) eliminating noise points of the binary image by utilizing an open operation. And filling the concave angle of the binary image by using closed operation.
The opening operation comprises the following steps: firstly, carrying out corrosion operation on the binary image, and then carrying out expansion operation on the binary image. The closed operation comprises the following steps: firstly, carrying out expansion operation on the binary image, and then carrying out corrosion operation on the binary image.
IV) The target paper region in the binarized image processed in step III) is selected with the minimum rectangular frame, thereby extracting the coordinate information of the four corner vertices of the target object. The target object is rectangular paper.
The preprocessing module receives the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the top points of the target object and carries out preprocessing;
the method comprises the following steps of preprocessing two-dimensional coordinates of key points of a human body and two-dimensional coordinates of vertexes of a target object, and mainly comprises the following steps:
1) and judging whether the confidence z of the two-dimensional coordinates (x, y, z) of the human key points is less than the threshold value, and if so, deleting the corresponding two-dimensional coordinates of the human key points.
2) And carrying out median filtering on the two-dimensional coordinates of the key points of the human body.
3) And filling the vacant coordinates of the key points of the human body by utilizing a piecewise linear interpolation algorithm.
4) And unifying the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the four vertexes of the target paper into the same coordinate system by a coordinate transformation method. The coordinate transformation method includes translation and flipping.
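As a concrete illustration of steps 1)-4), the following NumPy/SciPy sketch preprocesses one key point track. The confidence threshold and median-filter window are assumed values, and the gaps are filled before filtering so the median filter operates on a complete series.

```python
# Sketch of the preprocessing chain: confidence gating (step 1), gap filling
# by piecewise linear interpolation (step 3), median filtering (step 2), and
# a vertical flip as a simple coordinate transformation (step 4).
import numpy as np
from scipy.signal import medfilt

CONF_THRESHOLD = 0.4  # assumed cut-off for step 1)

def preprocess_track(xyz, frame_height):
    """xyz: (T, 3) array of (x, y, confidence) for one key point over T frames."""
    x, y, z = xyz[:, 0].copy(), xyz[:, 1].copy(), xyz[:, 2]
    bad = z < CONF_THRESHOLD            # step 1): drop low-confidence coordinates
    x[bad], y[bad] = np.nan, np.nan

    t = np.arange(len(x))
    good = ~np.isnan(x)
    x = np.interp(t, t[good], x[good])  # step 3): piecewise linear fill
    y = np.interp(t, t[good], y[good])

    x = medfilt(x, kernel_size=5)       # step 2): suppress jitter (window assumed)
    y = medfilt(y, kernel_size=5)

    y = frame_height - y                # step 4): flip image y (down) to y-up
    return np.stack([x, y], axis=1)
```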
The gesture recognition module stores a gesture recognition mathematical model;
the attitude identification mathematical model mainly comprises an Euclidean distance calculation model, a cosine angle calculation model and a connection line slope calculation model.
The euclidean distance calculation model is as follows:
$dis_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
where dis_ij is the Euclidean distance between points i and j; i, j ∈ {a, b, c} with i ≠ j; a, b, and c respectively denote any three human body key points and/or target object vertices; x denotes the abscissa and y the ordinate.
The cosine angle calculation model is as follows:
$\cos\theta_b = \dfrac{(x_a - x_b)(x_c - x_b) + (y_a - y_b)(y_c - y_b)}{\sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}\,\sqrt{(x_c - x_b)^2 + (y_c - y_b)^2}}$
where cos θ_b is the cosine of the angle at point b; a, b, and c respectively denote any three human body key points and/or target object vertices; x denotes the abscissa and y the ordinate.
The connection slope calculation model is as follows:
$k_{ac} = \dfrac{y_c - y_a}{x_c - x_a}$
where k_ac is the slope of the connecting line. The slope of the connecting line characterizes the relative position of the two key points.
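These three models translate directly into short functions; the sketch below assumes points given as (x, y) pairs.

```python
# The three geometric models of the gesture recognition module.
import math

def euclidean_distance(p_i, p_j):
    """dis_ij = sqrt((x_i - x_j)^2 + (y_i - y_j)^2)."""
    return math.hypot(p_i[0] - p_j[0], p_i[1] - p_j[1])

def cosine_angle(p_a, p_b, p_c):
    """cos(theta_b): cosine of the angle at b between rays b->a and b->c."""
    v1 = (p_a[0] - p_b[0], p_a[1] - p_b[1])
    v2 = (p_c[0] - p_b[0], p_c[1] - p_b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return dot / (math.hypot(*v1) * math.hypot(*v2))

def line_slope(p_a, p_c):
    """k_ac = (y_c - y_a) / (x_c - x_a); encodes relative key point position."""
    return (p_c[1] - p_a[1]) / (p_c[0] - p_a[0])
```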
The gesture recognition module inputs the preprocessed two-dimensional coordinates of the key points of the human body and the preprocessed two-dimensional coordinates of the top points of the target object into a gesture recognition mathematical model to recognize the gesture of the human body, and the main steps are as follows:
1) Extract m human skeleton joint points and 1 target object vertex, and calculate the cosine of the left elbow angle cos θ_L(t) and the cosine of the right elbow angle cos θ_r(t) in each frame of the color images; here t is the color image index, and θ_L(t) and θ_r(t) are the left and right elbow angles, each determined by the first m-1 human skeleton joint points.
2) Judge whether cos θ_L(t) increases over color images 1 to t1 and decreases over color images t1 to tn, and whether cos θ_r(t) increases over color images 1 to t2 and decreases over color images t2 to tn. If both hold, go to step 3); otherwise, judge that the subject has not completed the action specified in the AD scale. tn is the total number of color images; t, t1, t2, and tn are positive integers with t1 < tn and t2 < tn.
3) Calculate the Euclidean distance dis between the m-th human skeleton key point and the target object vertex in each frame of the image. If dis is smaller than a set distance threshold dis_max, the subject is judged to have completed the action specified in the AD scale; if dis > dis_max, the subject is judged not to have completed it.
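A compact sketch of this decision logic follows. It treats "increases to t1 then decreases" as a single-peak (rise-then-fall) test on each cosine series, with a small tolerance for key point jitter; both the tolerance and the single-peak interpretation are assumptions.

```python
# Decision logic for steps 1)-3): each elbow-angle cosine must rise to a
# single interior peak and then fall, and the key-point/vertex distance must
# drop below dis_max in at least one frame.
import numpy as np

def rises_then_falls(cos_series, tol=1e-3):
    """True if the series increases to one interior peak, then decreases."""
    s = np.asarray(cos_series, dtype=float)
    peak = int(np.argmax(s))
    return (0 < peak < len(s) - 1
            and np.all(np.diff(s[:peak + 1]) >= -tol)
            and np.all(np.diff(s[peak:]) <= tol))

def action_completed(cos_left, cos_right, dist_to_vertex, dis_max):
    """cos_left/cos_right: per-frame elbow-angle cosines; dist_to_vertex:
    per-frame Euclidean distance from key point m to the target vertex."""
    if not (rises_then_falls(cos_left) and rises_then_falls(cos_right)):
        return False                                  # step 2) fails
    return bool(np.min(dist_to_vertex) < dis_max)     # step 3)
```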
The database stores data of a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module and a posture identification module.
It is worth noting that the invention combines OpenPose and image morphological processing techniques to judge whether the subject has completed the actions specified in the Mini-Mental State Examination (MMSE), thereby greatly reducing the workload of manual assessment. The invention particularly involves extracting the two-dimensional coordinates of human body joint points, extracting the two-dimensional coordinates of the four paper vertices, preprocessing the data, and establishing a posture recognition model.
The invention has the advantage that a mathematical evaluation model is established for the completion of the actions specified in the AD scale: human skeleton coordinate points are extracted with OpenPose, and a paper-localization algorithm based on image morphological processing completes the assessment of how well the subject performs the actions.
Drawings
FIG. 1 is a flow chart of pose discrimination based on OpenPose and image morphology processing techniques;
FIG. 2(a) is a schematic diagram I of the coordinates of key points obtainable by OpenPose;
FIG. 2(b) is a schematic diagram II of the coordinates of key points obtainable by OpenPose;
FIG. 2(c) is a schematic diagram III of the coordinates of key points obtainable by OpenPose;
FIG. 3(a) is a drawing I of the effect of extracting the coordinates of the paper;
FIG. 3(b) is a drawing II of the effect of extracting the coordinates of the paper;
FIG. 3(c) is a drawing of the effect of extracting the coordinates of the paper III;
FIG. 3(d) is a drawing IV of the effect of extracting the coordinates of the paper;
FIG. 4 is a diagram of a mathematical model for gesture recognition;
FIG. 5 is a diagram of a mathematical model for determining whether a subject has finished holding paper with his right hand;
FIG. 6 is a diagram of a mathematical model for determining whether a subject has completed folding;
FIG. 7 is a diagram of a mathematical model for determining whether a subject has placed paper on the thigh;
FIG. 8 is a schematic representation of the Euclidean distance from the wrist of the right hand to the paper;
FIG. 9 is a schematic view of the change in paper area;
FIG. 10 is a schematic diagram showing the Euclidean distance from the paper to the root of the thigh and the cosine change of the included angle between the left elbow and the right elbow;
FIG. 11 is a schematic diagram of a two-branch convolutional neural network.
Detailed Description
The present invention is further illustrated by the following examples, but the scope of the claimed subject matter is not limited to them. Various substitutions and alterations made on the basis of common technical knowledge and conventional means in the field, without departing from the technical idea of the invention, remain within the scope of the invention.
Example 1:
referring to fig. 1 to 4, an automatic gesture recognition system for an AD scale comprehension capability test mainly includes a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module, a gesture recognition module, and a database;
the video stream acquisition module acquires a video stream containing human body posture information and target paper information and decomposes it into frames of color images; the human posture is the action the subject is instructed to complete according to the AD (Alzheimer's disease) rating scale. AD scales include the MMSE (Mini-Mental State Examination), CDT (Clock Drawing Test), CASI (Cognitive Abilities Screening Instrument), HDS (Hasegawa Dementia Scale), and the like. Frames are extracted at 30 frames per second.
The human body key point two-dimensional coordinate extraction module receives the color images and extracts the human body key point two-dimensional coordinates of each frame of color image by utilizing an OpenPose program;
the human body key points comprise two-dimensional coordinates of 18 human body key points, 21 key points of the left hand and the right hand and 70 key points of the face, which are extracted by OpenPose (the university of Carnastomulen (CMU) in America, based on convolutional neural network and supervised learning and an open source library developed by taking cafe as a framework).
The main steps for extracting the two-dimensional coordinates of the key points of the human body are as follows:
a) the sizes of all color images are unified to W × H. W is the width and H is the height.
b) The color images with uniform size are input into a VGG19 network (deep convolutional neural network), and human body posture features are extracted, so that a feature image F is obtained. The VGG19 network includes an input layer, a pooling layer, and a convolutional layer.
c) Referring to fig. 11, the feature image F is input into a dual-branch convolutional neural network to obtain a confidence map S and a key point affinity vector field L. The confidence map S represents the reliability of the key point coordinates; the key point affinity vector field L represents the degree of association between the parts of the human body.
the dual-branch convolutional neural network comprises a first branch convolutional neural network and a second branch convolutional neural network which are parallel. The input of the first branch model is a characteristic image F, and the output is a key point confidence map S. The input of the second branch model is a characteristic image F, and the output is a key point affinity vector field L.
d) And acquiring the coordinates of the key points in the characteristic image by using a clustering method.
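For reference, a minimal PyTorch sketch of the dual-branch topology of fig. 11 is given below. The channel counts follow the COCO OpenPose configuration (19 confidence maps, 38 affinity-field channels) and are assumptions; the real network additionally stacks refinement stages.

```python
# Two parallel branches over the shared VGG19 feature image F: one predicts
# the confidence maps S, the other the part-affinity vector fields L.
import torch
import torch.nn as nn

class Branch(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, out_ch, 1),  # 1x1 conv to the output maps
        )

    def forward(self, f):
        return self.net(f)

class DualBranch(nn.Module):
    def __init__(self, feat_ch=128):
        super().__init__()
        self.conf_branch = Branch(feat_ch, 19)  # confidence maps S
        self.paf_branch = Branch(feat_ch, 38)   # affinity fields L

    def forward(self, f):
        return self.conf_branch(f), self.paf_branch(f)

# F would come from the truncated VGG19 backbone of step b).
s_maps, l_fields = DualBranch()(torch.randn(1, 128, 46, 46))
```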
The target object vertex two-dimensional coordinate extraction module extracts the target object vertex two-dimensional coordinates of each frame of color image by using an image morphology processing method;
the target object is rectangular paper. The rectangular frame is characterized in that the position of the rectangular frame is adaptive to the change of the connected domain and can rotate.
The main steps for extracting the two-dimensional coordinates of the vertex of the target object are as follows:
I) Binarization processing is performed on all color images to obtain binarized images. A pixel threshold of [210, 255] is set according to the pixel-value characteristics of the paper region, eliminating most background interference.
II) The pixel information of the binarized image is mapped from the RGB color space to the YCrCb color space, where Y represents luminance, Cr the red chrominance component, and Cb the blue chrominance component. Whether a pixel belongs to skin can be determined by checking whether its (Cr, Cb) value falls inside the elliptical skin-color distribution region, thereby filtering out the influence of skin.
III) Noise in the binarized image is eliminated with an opening operation, and concave corners are filled with a closing operation, which repairs paper regions left incomplete by hand occlusion.
The opening operation comprises the following steps: firstly, carrying out corrosion operation on the binary image, and then carrying out expansion operation on the binary image. The closed operation comprises the following steps: firstly, carrying out expansion operation on the binary image, and then carrying out corrosion operation on the binary image.
IV) The target paper region in the binarized image processed in step III) is selected with the minimum-area bounding rectangle, and the coordinates of the four corner vertices of the target object (the rectangular paper) are extracted.
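An OpenCV sketch of steps I)-IV) follows. The [210, 255] threshold comes from step I); the rectangular Cr/Cb bounds are a common approximation standing in for the elliptical skin model of step II), and the kernel size is an assumption.

```python
# Paper vertex extraction: bright-region thresholding, skin suppression in
# YCrCb, open/close morphology, and a rotatable minimum-area rectangle.
import cv2
import numpy as np

def paper_corners(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 210, 255, cv2.THRESH_BINARY)  # step I)

    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)        # step II)
    skin = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))    # rough skin box
    mask = cv2.bitwise_and(mask, cv2.bitwise_not(skin))

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # step III): denoise
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill concave gaps

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    rect = cv2.minAreaRect(max(contours, key=cv2.contourArea))  # step IV)
    return cv2.boxPoints(rect)  # four corner vertices (x, y)
```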
The preprocessing module receives the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the top points of the target object and carries out preprocessing;
the method comprises the following steps of preprocessing two-dimensional coordinates of key points of a human body and two-dimensional coordinates of vertexes of a target object, and mainly comprises the following steps:
1) and judging whether the confidence z of the two-dimensional coordinates (x, y, z) of the human key points is less than the threshold value, and if so, deleting the corresponding two-dimensional coordinates of the human key points.
2) And carrying out median filtering on the two-dimensional coordinates of the key points of the human body.
3) And filling the vacant coordinates of the key points of the human body by utilizing a piecewise linear interpolation algorithm.
4) And unifying the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the four vertexes of the target paper into the same coordinate system by a coordinate transformation method. The coordinate transformation method includes translation and flipping.
The gesture recognition module stores a gesture recognition mathematical model;
the attitude identification mathematical model mainly comprises an Euclidean distance calculation model, a cosine angle calculation model and a connection line slope calculation model.
The euclidean distance calculation model is as follows:
$dis_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
where dis_ij is the Euclidean distance between points i and j; i, j ∈ {a, b, c} with i ≠ j; a, b, and c respectively denote any three human body key points and/or target object vertices; x_i and y_i are the abscissa and ordinate of point i, and x_j and y_j those of point j.
The cosine angle calculation model is as follows:
$\cos\theta_b = \dfrac{(x_a - x_b)(x_c - x_b) + (y_a - y_b)(y_c - y_b)}{\sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}\,\sqrt{(x_c - x_b)^2 + (y_c - y_b)^2}}$
where cos θ_b is the cosine of the angle at point b; a, b, and c respectively denote any three human body key points and/or target object vertices; x denotes the abscissa and y the ordinate.
The connection slope calculation model is as follows:
$k_{ac} = \dfrac{y_c - y_a}{x_c - x_a}$
where k_ac is the slope of the connecting line. The slope of the connecting line characterizes the relative position of the two key points.
The gesture recognition module inputs the preprocessed two-dimensional coordinates of the key points of the human body and the preprocessed two-dimensional coordinates of the top points of the target object into a gesture recognition mathematical model to recognize the gesture of the human body, and the main steps are as follows:
1) According to the AD scale instructions, human skeleton joint points 2, 3, 4, 5, 6, 7, and 8 and one target key point (paper vertex 1) are extracted, and the cosines of the left and right elbow angles, cos∠567(t) and cos∠234(t), are calculated in each frame of the color images, where t is the color image index. 2) It is judged whether the left elbow cosine cos θ_L(t) increases over color images 1 to t1 and decreases over color images t1 to tn, and whether the right elbow cosine cos θ_r(t) increases over color images 1 to t2 and decreases over color images t2 to tn. If both hold, go to step 3); otherwise, the subject is judged not to have completed the action specified in the AD scale. tn is the total number of color images;
3) The Euclidean distance dis between the m-th human skeleton key point and the target object vertex is calculated in each frame of the image; if dis is smaller than the set distance threshold dis_max, the subject is judged to have completed the action specified in the AD scale, and if dis > dis_max, the subject is judged not to have completed it.
The database stores data of a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module and a posture identification module. The database is stored in a computer readable storage medium.
Example 2:
referring to fig. 5 to 10, the automatic gesture recognition system for the AD scale comprehension capability test is used to recognize the complete action chain of holding the paper with the right hand, folding it with both hands, and placing it on the thigh. The main steps are as follows:
1) coordinate extraction step and result
A video of a subject completing the action specified in the AD scale is decomposed into color images at 30 frames per second, and the images are processed with OpenPose and image morphology techniques to obtain the two-dimensional coordinates of the human body key points and the paper vertices. The two-dimensional coordinates of human skeleton key points 2, 3, 4, 5, 6, 7, and 8 and of the four paper vertices are extracted from each color image and then preprocessed.
2) Data preprocessing and results
Data with low confidence are removed first; jitter is then eliminated by median filtering; next, vacant coordinates are filled by a piecewise linear interpolation algorithm to keep the action sequence complete; finally, the two-dimensional coordinates of the human body key points and of the four paper vertices are unified into the same coordinate system by coordinate translation, flipping, and similar methods.
3) Establishing mathematical model and posture recognition result
According to the formula

$distance_1 = \sqrt{(x_4 - x_{paper1})^2 + (y_4 - y_{paper1})^2}$

the Euclidean distance from the right wrist skeleton key point (x_4, y_4) to paper vertex 1 (x_{paper1}, y_{paper1}) is calculated in each frame and compared with a set threshold distance.
When this distance remains below the threshold distance, the subject is judged to have completed holding the paper with the right hand. The paper area is calculated in each frame of the image; when the maximum paper area exceeds twice the minimum paper area, i.e. S_max > 2 S_min, the subject is judged to have completed the paper-folding action.
The cosines of the right and left elbow angles are calculated with the cosine formula. As the subject moves the paper from the desk to the thigh after folding it with both hands, the right elbow angle first decreases and then increases, so its cosine first increases and then decreases, while the cosine of the left elbow angle first increases and then decreases. After this process is detected, the formula

$distance_3 = \sqrt{(x_{paper1} - x_8)^2 + (y_{paper1} - y_8)^2}$

is used to calculate the Euclidean distance from paper vertex 1 (x_{paper1}, y_{paper1}) to the right thigh-root skeleton key point (x_8, y_8); when this distance is less than the set threshold distance, the subject is judged to have completed placing the paper on the thigh.
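Under the assumption that per-frame series for the distances, the paper area, and the elbow-angle cosines have already been computed (for example with the functions sketched earlier), the three decisions of this example reduce to a few lines; the minimum run length used for "continuously less than" is an assumed parameter.

```python
# The three checks of the action chain: hold paper (right hand), fold paper,
# place paper on the thigh.
import numpy as np

def holds_paper(dist_wrist_paper, dis_max, min_frames=10):
    """Distance stays below the threshold for min_frames consecutive frames."""
    run = 0
    for d in dist_wrist_paper:
        run = run + 1 if d < dis_max else 0
        if run >= min_frames:
            return True
    return False

def folded_paper(paper_area):
    """S_max > 2 * S_min marks a completed fold."""
    return max(paper_area) > 2 * min(paper_area)

def placed_on_thigh(cos_elbow, dist_paper_thigh, dis_max):
    """Elbow-angle cosine rises then falls, and the paper reaches the thigh."""
    s = np.asarray(cos_elbow, dtype=float)
    peak = int(np.argmax(s))
    rose_then_fell = 0 < peak < len(s) - 1
    return rose_then_fell and float(np.min(dist_paper_thigh)) < dis_max
```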
The experimental results are shown in FIGS. 8 to 10. As can be seen from FIG. 8, after the 15th frame, distance_1 stays below the set threshold, so the subject can be judged to have completed the action of holding the paper with the right hand; in FIG. 8, the line in section I indicates that the right hand has not touched the paper, and the line in section II indicates that the right hand is holding the paper.
Analysis of the paper-area change in FIG. 9 shows that an area of about 50,000 mm² corresponds to the subject picking up the paper, during which the area remains essentially unchanged, and an area of about 15,000 mm² corresponds to the paper after folding. Because of hand occlusion and the increased distance between the paper and the camera, the paper area after folding is less than half the area before folding. In FIG. 9, the line in section III indicates before folding, the line in section IV during folding, and the line in section V after folding.
FIG. 10 shows that the subject placed the paper on the thigh around the 120th frame. The experimental results show that when distance_3 falls below the set distance threshold before the elbow-angle cosine has risen and fallen, the paper is merely directly above the thigh after folding and has not yet been placed on it. If the elbow-angle cosine rises and then falls but distance_3 remains above the set distance threshold, the subject may have placed the paper beside the thigh or elsewhere after folding. Only when the elbow-angle cosine rises and then falls while distance_3 is below the set distance threshold is the subject judged to have placed the paper on the thigh.
Tables 1, 2, and 3 present key points and criterion information required to identify three simple gestures.
TABLE 1 Action-one recognition model (presented as an image in the original publication)
TABLE 2 Action-two recognition model (presented as an image in the original publication)
TABLE 3 Action-three recognition model (presented as an image in the original publication)
In summary, the invention provides an intelligent AD-scale detection method based on human body pose estimation, combining OpenPose with image morphological processing. OpenPose yields the human body key point coordinates, and image morphological processing yields the key point coordinates of other targets; a pose description model is then established for the specific actions in the scale and action scoring criteria are set; finally, the accuracy and reliability of the model are verified against experimental results. Experiments show that the system is convenient to use, fast, and can effectively improve the efficiency of scale-based diagnosis.

Claims (9)

1. An automatic gesture recognition system for an AD (Alzheimer's disease) scale comprehension capability test, characterized by mainly comprising a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module, a gesture recognition module, and a database.
The video stream acquisition module acquires a video stream containing human body posture information and target paper information and decomposes the video stream into a plurality of frames of color images;
the human body key point two-dimensional coordinate extraction module receives the color images and extracts the human body key point two-dimensional coordinates of each frame of color image by utilizing an OpenPose program;
the target object vertex two-dimensional coordinate extraction module extracts the target object vertex two-dimensional coordinates of each frame of color image by using an image morphology processing method;
the preprocessing module receives the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the top points of the target object and carries out preprocessing;
the gesture recognition module stores a gesture recognition mathematical model;
the gesture recognition module inputs the preprocessed two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the top points of the target object into a gesture recognition mathematical model to recognize the gesture of the human body;
the database stores data of a video stream acquisition module, a human body key point two-dimensional coordinate extraction module, a target object vertex two-dimensional coordinate extraction module, a preprocessing module and a posture identification module.
2. The automatic posture recognition system for the AD scale comprehension capability test according to claim 1, wherein the human posture is an action the subject is instructed to complete according to an AD scale.
3. The system of claim 1, wherein the human body key points comprise a plurality of body skeletal joint points, left and right hand skeletal joint points, and facial feature points.
4. The automatic posture recognition system for the AD scale comprehension ability test according to claim 1, wherein the main steps of extracting the two-dimensional coordinates of the key points of the human body are as follows:
1) unifying the sizes of all color images into W multiplied by H; w is the width and H is the height;
2) inputting the color images with uniform sizes into a VGG19 network, and extracting human body posture features to obtain a feature image F; the VGG19 network includes an input layer, a pooling layer, and a convolutional layer;
3) inputting the characteristic image F into a double-branch convolution neural network to obtain a confidence map S and a key point affinity vector field L; the confidence coefficient S shows the accuracy of the coordinates of the representing key points; the key point affinity vector field L represents the degree of association among all parts of the human body;
4) and acquiring the coordinates of the key points in the characteristic image by using a clustering method.
5. The automatic posture recognition system for the AD scale comprehension capability test according to claim 4, wherein the two-branch convolutional neural network comprises a first branch convolutional neural network and a second branch convolutional neural network which are parallel; the input of the first branch model is a characteristic image F, and the output is a key point confidence map S; the input of the second branch model is a characteristic image F, and the output is a key point affinity vector field L.
6. The automatic posture recognition system for the AD scale comprehension capability test according to claim 1, wherein the main steps of extracting the two-dimensional coordinates of the target object vertex are as follows:
1) carrying out binarization processing on all the color images to obtain a binarized image;
2) mapping pixel information of the binary image from an RGB color space to a YCrCb color space; where Y represents luminance, Cr represents a red component in the light source, and Cb represents a blue component in the light source;
3) eliminating noise points of the binary image by utilizing an opening operation; filling concave angles of the binary image by using closed operation; the opening operation comprises the following steps: firstly, carrying out corrosion operation on the binary image, and then carrying out expansion operation on the binary image; the closed operation comprises the following steps: firstly, performing expansion operation on the binary image, and then performing corrosion operation on the binary image;
4) selecting a target paper area in the binary image processed in the step 3) by using a minimum rectangular frame so as to extract coordinate information of four corner vertexes of the target object; the target object is rectangular paper.
7. The automatic posture recognition system for the AD scale comprehension capability test according to claim 1, wherein the main steps of preprocessing the two-dimensional coordinates of the key points of the human body and the two-dimensional coordinates of the vertexes of the target object are as follows:
1) judging whether the confidence z < threshold of the two-dimensional coordinates (x, y, z) of the human key points is established or not, and if so, deleting the corresponding two-dimensional coordinates of the human key points;
2) carrying out median filtering on the two-dimensional coordinates of the key points of the human body;
3) filling the vacant coordinates of the key points of the human body by utilizing a piecewise linear interpolation algorithm;
4) unifying two-dimensional coordinates of key points of the human body and two-dimensional coordinates of four vertexes of the target paper to the same coordinate system by a coordinate transformation method; the coordinate transformation method includes translation and flipping.
8. The automatic posture recognition system for the AD scale comprehension capability test according to claim 1, wherein the posture recognition mathematical model mainly comprises a Euclidean distance calculation model, a cosine angle calculation model, and a line slope calculation model;
the euclidean distance calculation model is as follows:
$dis_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
where dis_ij is the Euclidean distance between points i and j; i, j ∈ {a, b, c} with i ≠ j; a, b, and c respectively denote any three human body key points and/or target object vertices; x denotes the abscissa and y the ordinate;
the cosine angle calculation model is as follows:
$\cos\theta_b = \dfrac{(x_a - x_b)(x_c - x_b) + (y_a - y_b)(y_c - y_b)}{\sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}\,\sqrt{(x_c - x_b)^2 + (y_c - y_b)^2}}$
where cos θ_b is the cosine of the angle at point b;
the connection slope calculation model is as follows:
$k_{ac} = \dfrac{y_c - y_a}{x_c - x_a}$
where k_ac is the slope of the connecting line; the slope of the connecting line characterizes the relative position of the two key points.
9. The automatic posture recognition system for the AD scale comprehension capability test according to claim 1 or 2, wherein the main steps of recognizing the human posture are as follows:
1) extracting m human skeleton joint points and 1 target object vertex, and calculating the cosine of the left elbow angle cos θ_L(t) and the cosine of the right elbow angle cos θ_r(t) in each frame of the color images, where t is the color image index and θ_L(t) and θ_r(t) are the left and right elbow angles, each determined by the first m-1 human skeleton joint points;
2) judging whether cos θ_L(t) increases over color images 1 to t1 and decreases over color images t1 to tn, and whether cos θ_r(t) increases over color images 1 to t2 and decreases over color images t2 to tn; if both hold, proceeding to step 3), and otherwise judging that the subject has not completed the action specified in the AD scale; tn is the total number of color images;
3) calculating the Euclidean distance dis between the m-th human skeleton key point and the target object vertex in each frame of the image; if dis is smaller than a set distance threshold dis_max, judging that the subject has completed the action specified in the AD scale, and if dis > dis_max, judging that the subject has not completed it.
CN202010390863.XA 2020-05-11 2020-05-11 Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test Pending CN111652076A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010390863.XA CN111652076A (en) 2020-05-11 2020-05-11 Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010390863.XA CN111652076A (en) 2020-05-11 2020-05-11 Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test

Publications (1)

Publication Number Publication Date
CN111652076A true CN111652076A (en) 2020-09-11

Family

ID=72346844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010390863.XA CN111652076A (en) 2020-05-11 2020-05-11 Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test

Country Status (1)

Country Link
CN (1) CN111652076A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065474A (en) * 2021-04-07 2021-07-02 泰豪软件股份有限公司 Behavior recognition method and device and computer equipment
CN115299934A (en) * 2022-08-30 2022-11-08 北京中科睿医信息科技有限公司 Method, device, equipment and medium for determining test action
CN115331153A (en) * 2022-10-12 2022-11-11 山东省第二人民医院(山东省耳鼻喉医院、山东省耳鼻喉研究所) Posture monitoring method for assisting vestibule rehabilitation training

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017084204A1 (en) * 2015-11-19 2017-05-26 广州新节奏智能科技有限公司 Method and system for tracking human body skeleton point in two-dimensional video stream
WO2018120964A1 (en) * 2016-12-30 2018-07-05 山东大学 Posture correction method based on depth information and skeleton information
CN109165552A (en) * 2018-07-14 2019-01-08 深圳神目信息技术有限公司 A kind of gesture recognition method based on human body key point, system and memory
CN109460702A (en) * 2018-09-14 2019-03-12 华南理工大学 Passenger's abnormal behaviour recognition methods based on human skeleton sequence
CN110008913A (en) * 2019-04-08 2019-07-12 南京工业大学 The pedestrian's recognition methods again merged based on Attitude estimation with viewpoint mechanism
JP2019185752A (en) * 2018-03-30 2019-10-24 株式会社日立製作所 Image extracting device
CN110471526A (en) * 2019-06-28 2019-11-19 广东工业大学 A kind of human body attitude estimates the unmanned aerial vehicle (UAV) control method in conjunction with gesture identification

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017084204A1 (en) * 2015-11-19 2017-05-26 广州新节奏智能科技有限公司 Method and system for tracking human body skeleton point in two-dimensional video stream
WO2018120964A1 (en) * 2016-12-30 2018-07-05 山东大学 Posture correction method based on depth information and skeleton information
JP2019185752A (en) * 2018-03-30 2019-10-24 株式会社日立製作所 Image extracting device
CN109165552A (en) * 2018-07-14 2019-01-08 深圳神目信息技术有限公司 A kind of gesture recognition method based on human body key point, system and memory
CN109460702A (en) * 2018-09-14 2019-03-12 华南理工大学 Passenger's abnormal behaviour recognition methods based on human skeleton sequence
CN110008913A (en) * 2019-04-08 2019-07-12 南京工业大学 The pedestrian's recognition methods again merged based on Attitude estimation with viewpoint mechanism
CN110471526A (en) * 2019-06-28 2019-11-19 广东工业大学 A kind of human body attitude estimates the unmanned aerial vehicle (UAV) control method in conjunction with gesture identification

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
AIHGF: "GitHub project - OpenPose key point output format" [in Chinese], retrieved from the Internet <URL:https://cloud.tencent.com/developer/article/1396323> *
ALEXANDER JOSEPH BRATCH: "Representation of Human Body Stimuli within the Human Visual System", 31 March 2021 (2021-03-31) *
ZHANG Teng: "Character action extraction based on OpenPose" [in Chinese], Wireless Internet Technology, no. 03, 10 February 2020 (2020-02-10), pages 84-85 *
FANG Xinxin et al.: "Human posture feature recognition method for comprehension testing with neuropsychological scales" [in Chinese], Journal of Chongqing University, vol. 46, no. 4, 1 July 2021 (2021-07-01), pages 108-119 *
Lunzhi: "A summary of six deep learning models and code for human pose estimation" [in Chinese], retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/38597956> *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065474A (en) * 2021-04-07 2021-07-02 泰豪软件股份有限公司 Behavior recognition method and device and computer equipment
CN113065474B (en) * 2021-04-07 2023-06-27 泰豪软件股份有限公司 Behavior recognition method and device and computer equipment
CN115299934A (en) * 2022-08-30 2022-11-08 北京中科睿医信息科技有限公司 Method, device, equipment and medium for determining test action
CN115331153A (en) * 2022-10-12 2022-11-11 山东省第二人民医院(山东省耳鼻喉医院、山东省耳鼻喉研究所) Posture monitoring method for assisting vestibule rehabilitation training
CN115331153B (en) * 2022-10-12 2022-12-23 山东省第二人民医院(山东省耳鼻喉医院、山东省耳鼻喉研究所) Posture monitoring method for assisting vestibule rehabilitation training

Similar Documents

Publication Publication Date Title
CN111652076A (en) Automatic gesture recognition system for AD (Alzheimer's disease) scale comprehension capability test
CN109858540B (en) Medical image recognition system and method based on multi-mode fusion
CN108492272A (en) Cardiovascular vulnerable plaque recognition methods based on attention model and multitask neural network and system
Feng et al. Robust and efficient algorithms for separating latent overlapped fingerprints
CN111407245A (en) Non-contact heart rate and body temperature measuring method based on camera
CN110930374A (en) Acupoint positioning method based on double-depth camera
CN107274399A (en) A kind of Lung neoplasm dividing method based on Hession matrixes and 3D shape index
CN106503651B (en) A kind of extracting method and system of images of gestures
CN111126240B (en) Three-channel feature fusion face recognition method
CN108875586B (en) Functional limb rehabilitation training detection method based on depth image and skeleton data multi-feature fusion
CN111462049A (en) Automatic lesion area form labeling method in mammary gland ultrasonic radiography video
CN111062936B (en) Quantitative index evaluation method for facial deformation diagnosis and treatment effect
CN110264460A (en) A kind of discrimination method of object detection results, device, equipment and storage medium
Umirzakova et al. Fully Automatic Stroke Symptom Detection Method Based on Facial Features and Moving Hand Differences
CN115527065A (en) Hip joint typing method, device and storage medium
CN115462783A (en) Infant crawling posture analysis system based on skeleton key point detection
Groza et al. Pneumothorax segmentation with effective conditioned post-processing in chest X-ray
Zhao et al. Attention residual convolution neural network based on U-net (AttentionResU-Net) for retina vessel segmentation
CN109523484B (en) Fractal feature-based finger vein network repair method
CN111639562A (en) Intelligent positioning method for palm region of interest
CN112215878B (en) X-ray image registration method based on SURF feature points
Hu et al. An end-to-end efficient framework for remote physiological signal sensing
CN109165551B (en) Expression recognition method for adaptively weighting and fusing significance structure tensor and LBP characteristics
CN108108648A (en) A kind of new gesture recognition system device and method
CN110633666A (en) Gesture track recognition method based on finger color patches

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231225

Address after: Floor 16, No. 192 Yuzhou Road, Yuzhong District, Chongqing, 400042

Applicant after: Chongqing Zhiyixing Technology Development Co.,Ltd.

Address before: 400044 No. 174 Sha Jie street, Shapingba District, Chongqing

Applicant before: Chongqing University

Applicant before: Chongqing Medical University

TA01 Transfer of patent application right