CN114550299A - System and method for evaluating daily life activity ability of old people based on video - Google Patents


Info

Publication number
CN114550299A
CN114550299A (application CN202210175737.1A)
Authority
CN
China
Prior art keywords
action
joint
user
evaluation
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210175737.1A
Other languages
Chinese (zh)
Inventor
肖文栋
屈莹
刘璐瑶
王换元
王易坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB filed Critical University of Science and Technology Beijing USTB
Priority: CN202210175737.1A
Publication: CN114550299A (legal status: pending)

Classifications

    • A61B 5/0077 — Devices for viewing the surface of the body, e.g. camera, magnifying lens (A61B 5/00 Measuring for diagnostic purposes; identification of persons)
    • A61B 5/1128 — Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb, using image analysis
    • G06F 18/22 — Pattern recognition; matching criteria, e.g. proximity measures
    • G06F 18/24 — Pattern recognition; classification techniques
    • G06N 3/044 — Neural networks; recurrent networks, e.g. Hopfield networks
    • G06N 3/08 — Neural networks; learning methods

Abstract

The invention discloses a video-based system and method for evaluating the daily living activity ability of the elderly, which complete the evaluation process automatically, objectively and quickly by applying computer vision technology. The positions of the human joints and skeleton are computed from video data, eliminating the influence of body type and clothing; action recognition is performed; a joint activity ability evaluation method based on an attention mechanism is provided; and daily living activity ability is evaluated by combining the action recognition results. The advantages of the invention are: labor and resource costs are effectively saved, the evaluation process is completed through machine processing and computation, the influence of human subjective factors is effectively eliminated, and the evaluation result is more objective and accurate.

Description

System and method for evaluating daily life activity ability of old people based on video
Technical Field
The invention relates to the technical field of evaluating the daily living activity of the elderly, and in particular to a video-based system and method for evaluating the daily living activity ability of the elderly.
Background
In promoting the healthy development of the care service industry, ability evaluation of the elderly is the basis for providing care services accurately, quantitatively and in a standardized manner. Through ability evaluation, the needs of the elderly in all respects can be truly grasped, limited resources allocated reasonably, market supply planned scientifically, care plans and services formulated, care institutions built, and professional care talent cultivated; through ability evaluation, the supply and demand of the elderly care market can be determined and the rights and interests of the elderly safeguarded. Besides promoting the overall healthy development of elderly care services, ability assessment is also of great significance to government agencies, industry associations, care institutions, and the elderly and their families: it makes it possible to truly understand the living conditions and service demands of the elderly, scientifically judge their health and self-care capacity, reasonably allocate limited care resources, build a basic information platform for the elderly, and dynamically observe their health; it also has scientific research value.
At present, to make full use of human resources, an ADL evaluation is first performed when an elderly person enters a nursing home, and the person is placed in a ward for the corresponding grade of care according to the result. ADL (Activities of Daily Living) refers to the necessary activities a person carries out each day to meet the needs of daily life. ADLs are divided into basic activities of daily living (BADL) and instrumental activities of daily living (IADL). BADL refers to activities repeated every day that are necessary for maintaining the most basic survival and life, comprising two major categories, self-care and functional mobility: self-care activities include eating, dressing, washing, toileting and changing clothes; functional mobility includes turning over, sitting up from bed, transferring, walking, driving a wheelchair, and going up and down stairs. IADL refers to activities necessary for maintaining an independent life, including using a telephone, shopping, cooking, housework, using transport, handling emergencies, and leisure activities within the community.
The purpose of ADL evaluation is to collect the physical function, family conditions and social environment of the elderly person (patient), analyze the gap between their degree of impairment and the normal standard, provide a basis for specifying a care (rehabilitation) regimen, and provide objective data for judging its effect. At present, ADL evaluation is still completed manually: the ADL evaluation of one elderly person often requires two evaluators to record together, evaluator talent is currently scarce in China, nursing homes in small and medium-sized towns rarely have professional evaluators, the two evaluators must score on site according to scales, and finally the care grade is analyzed from the scores. This work consumes a large amount of manpower and time, and the result carries subjective assumptions and easily diverges; having a computer complete the evaluation instead can save resources and improve efficiency.
Most ADL assessments are currently performed by means of scales, i.e., manual marking according to the degree to which the elderly person completes the items in the scale. There are five specific assessment approaches: the traditional evaluation method, the comprehensive index evaluation method, the action capability determination method, the actual condition investigation method, and the action decomposition evaluation method. The traditional assessment method roughly divides ability into "able" and "unable". The comprehensive index evaluation method includes the Barthel Index and the Kenny Index. The action capability determination method (ASADLE) measures the average time taken by the elderly person (patient) to perform an action twice, such as putting on and taking off clothes, and records the result. The actual condition investigation approach includes the Katz Index and the Functional Independence Measure (FIM). Action decomposition assessment (Klein-Bell) decomposes an action and scores each part separately; for example, for putting on an upper garment: passing the left hand into the cuff, 2 points; passing the left elbow through the cuff, 2 points; pulling the garment onto the left shoulder, 2 points.
Currently, medical ADL evaluation mainly adopts the Barthel Index and the Katz Index, which are scales in nature. The Barthel Index evaluation comprises 10 items, including feeding, bathing, dressing, bowel control, toilet use, bed-chair transfer, walking 45 meters on level ground, and going up and down stairs; each item scores 10 points if performed independently, 5 points if slight help is needed, and 0 points if great help is needed, for a full score of 100 points, where above 60 points is rated good, 60 to 40 points moderate, and below 40 points poor. The Barthel Index evaluation is simple, highly reliable, highly sensitive and widely applied. The Katz Index includes 6 items: bathing, dressing, toileting, transferring, continence and feeding; its rating criteria are divided into seven levels, A to G, where level A indicates complete independence in all 6 activities; level B, dependence in only 1 item; level C, dependence in bathing and one of the remaining 5 items; level D, dependence in bathing, dressing and one of the remaining 4 items; level E, dependence in bathing, dressing, toileting and one of the remaining 3 items; level F, dependence in the first 4 items and one of the remaining 2; and level G, complete dependence in all 6 activities. However, it can be seen that judging whether and to what degree an item is dependent involves considerable subjective human emotion and lacks the objectivity of machine judgment. The requirements on evaluators are high, professional training is needed, and substantial labor and facility costs are incurred.
The prior art can realize assessment of the daily living activity ability of the elderly, and the assessment is relatively comprehensive and fine-grained, but it is strongly influenced by subjective human factors: the degree of dependence in daily living activities is not clearly defined, the assessment result rests entirely on the evaluators' visual observation, human subjective judgment carries considerable emotional bias, and the result lacks quantification, so no right-or-wrong determination can be made. For example, when the results of the two evaluators diverge during an assessment, it cannot be determined which evaluator is right; when both insist on their respective opinions, the assessment process cannot even continue.
Moreover, this process requires significant human cost and facility resources: first, evaluators must be trained, and second, a site must be arranged for the evaluation. In the national elderly care service system, ability evaluators are relatively scarce and far fewer than care institutions; that is, not all care institutions are equipped with professional evaluators, and the industry standard cannot be met.
At present, daily living ability evaluation systems merely store paper evaluation reports in a computer system and mainly realize information management: they change the working mode of manual evaluation with paper archiving, facilitate the statistics, querying and archiving of evaluation data, provide a graphical user interface, and summarize evaluation information, but they do not change the essentially resource-consuming and subjective nature of the evaluation process.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a video-based system and a video-based method for evaluating the daily living activity capacity of the old.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a video-based system for evaluating the daily living activity ability of the elderly, comprising: a physical layer, a data processing layer, a model extraction layer, an action recognition layer and a capability evaluation layer;
the physical layer collects action videos through a camera;
the data processing layer performs video data preprocessing, including marking, denoising, compressing and normalizing, to improve data quality, extract key frames, and shorten computation time without affecting the result;
the model extraction layer analyzes video clips, extracts key points, positions different parts of a body in each frame and draws a two-dimensional human body key point skeleton image;
the action recognition layer analyzes the movement of the body part along with time, and predicts and classifies actions executed in the video; and the capability evaluation layer extracts the action sequence, carries out evaluation pretreatment and judges whether the action sequence is finished.
The capability evaluation layer compares the user action sequence with the standard posture and calculates the Euclidean distance between them. If the minimum Euclidean distance is greater than a specified threshold, an "action missing" or "action order error" has occurred and the action is judged not completed; otherwise, the action sequence is judged to be completed correctly. The correct action sequence is then segmented to separate static actions from dynamic actions. Meanwhile, the key joints are located based on an attention mechanism and evaluated in detail through four indexes (joint angle similarity, action center time similarity, action duration similarity, and dynamic-action joint average angular velocity similarity), which respectively judge whether the actions are accurate, whether the action rhythm is correct, and whether the applied force is appropriate. The four indexes are integrated to evaluate daily living activity ability and to judge independence and the degree of dependence.
Further, the ADL identification method of the action identification layer comprises the following steps:
s1, constructing a data set;
firstly, an atomic action data set is constructed, consisting of representative necessary actions for human daily living activities, including: feeding, sit-to-stand balance, transferring, and going up and down stairs. 12 subjects are recruited, and videos are shot from 4 angles while each subject performs the four actions, each repeated 5 times.
S2, estimating the 2D posture of the human body;
firstly, positioning each part of a body in each frame, analyzing the motion of different parts of the body along with time, and estimating the human body joint points in the video.
The input video is first converted into images by extracting all frames. The images are pooled, and the rough positions of the human joints are detected from the pooled images; feature maps of different sizes are quantized to a fixed size, and the maximum value in each feature map is selected as the fixed input feature of the fully connected layers. The network then splits into two fully connected layers: one outputs the bounding box of a joint point, represented by 4 variables (the horizontal and vertical coordinates of the center point, and the width and height); the other outputs the confidence score of the joint point, which is used to filter out wrong joint points. To address the problem of the same joint point being detected multiple times, joint points with lower scores are filtered out using the confidence score, and only high-confidence human joint points are drawn.
Joints are then connected to form a limb skeleton structure, and the joint points in all frames of the video are connected to draw the skeleton. To improve the accuracy of subsequent recognition and evaluation, a bone confidence is set: a bone score is assigned based on the joint scores, taking the lower confidence score of the two endpoint joints as the bone score. A good bone is defined as one whose score exceeds the maximum confidence of any joint point.
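As an illustrative sketch of the bone-scoring rule above (the helper names and the exact "good bone" threshold are assumptions; the text only fixes that a bone takes the lower of its two endpoint-joint confidences):

```python
# Hedged sketch of bone confidence scoring; names and the threshold
# ratio are illustrative, not taken from the patent text.
def bone_score(joint_score_a: float, joint_score_b: float) -> float:
    """A bone's score is the lower confidence of its two endpoint joints."""
    return min(joint_score_a, joint_score_b)

def good_bones(bones, joint_scores, ratio=0.8):
    """Keep bones whose score exceeds a fraction of the best joint confidence.

    `bones` is a list of (joint_index_a, joint_index_b) pairs into
    `joint_scores`. The `ratio` cut-off is an assumed stand-in for the
    confidence comparison the text describes.
    """
    threshold = ratio * max(joint_scores)
    return [(a, b) for a, b in bones
            if bone_score(joint_scores[a], joint_scores[b]) > threshold]
```

For example, with joint confidences `[0.9, 0.8, 0.3]`, the bone between joints 0 and 1 survives while the bone touching the low-confidence joint 2 is discarded.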
S3, identifying the atomic action;
Joint point action classification uses an LSTM model trained with the PyTorch framework. The training input data contain the 18 joints and associated action label per frame; a continuous 32-frame sequence is used to identify an atomic action, and a 32-frame sample sequence is a multidimensional array of size 32 × 36, as shown below:
$$X = \begin{bmatrix} x_{1,1} & y_{1,1} & \cdots & x_{1,18} & y_{1,18} \\ x_{2,1} & y_{2,1} & \cdots & x_{2,18} & y_{2,18} \\ \vdots & \vdots & & \vdots & \vdots \\ x_{32,1} & y_{32,1} & \cdots & x_{32,18} & y_{32,18} \end{bmatrix}$$
each row contains the x, y values for 18 joints, for a total of 32 rows.
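The 32 × 36 sample layout described above can be sketched in plain Python (function and variable names are illustrative, not from the patent):

```python
# Flatten 32 frames of 18 (x, y) joint coordinates into the
# 32 x 36 sample array fed to the classifier.
def to_sample(frames):
    """frames: list of 32 frames, each a list of 18 (x, y) joint tuples."""
    assert len(frames) == 32
    sample = []
    for joints in frames:
        assert len(joints) == 18
        row = []
        for x, y in joints:
            row.extend([x, y])  # row becomes [x1, y1, ..., x18, y18]
        sample.append(row)
    return sample  # 32 rows of 36 values each
```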
Further, the initial hidden dimension of the LSTM model is set to 50 and the model is trained using PyTorch Lightning, with an Adam optimizer and a ReduceLROnPlateau scheduler configured to reduce the learning rate according to the value of the loss function.
Further, the ADL evaluation method of the ability evaluation layer comprises the following steps:
s1, evaluating and preprocessing;
extract the action recognition sequence obtained in the earlier stage and judge whether the action sequence is completed correctly: compare the user action sequence with the standard posture, calculate the Euclidean distance between them, and judge whether the minimum Euclidean distance d_min is greater than a specified threshold. If so, an action is missing or the action order is wrong, and the action is judged not completed; otherwise, the action sequence is judged to be completed correctly. This prepares for subsequent segmentation and detail evaluation: if any action is not completed, the user is directly evaluated as dependent and no subsequent work is performed.
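A minimal sketch of this preprocessing check, assuming poses are flat coordinate vectors and the threshold is supplied by the caller (both assumptions; the patent does not fix a representation or threshold value):

```python
import math

def min_euclidean_distance(user_seq, standard_pose):
    """Smallest Euclidean distance from any user pose to the standard pose."""
    return min(math.dist(pose, standard_pose) for pose in user_seq)

def sequence_completed(user_seq, standard_pose, threshold):
    """The sequence counts as completed only if some user pose comes within
    `threshold` of the standard pose; otherwise an action is missing or out
    of order and the user is directly evaluated as dependent."""
    return min_euclidean_distance(user_seq, standard_pose) <= threshold
```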
S2, evaluating action segmentation;
compare the user action sequence with the standard posture and calculate the Euclidean distance between them; find the minimum Euclidean distance d_min and add a positive offset Δl to obtain d_p; based on the abscissas t_1 and t_2 corresponding to d_p, the static action at that time is obtained. If it is the same as the action sequence, the segment is determined to be static; otherwise, it is determined to be dynamic.
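One plausible reading of this segmentation rule, sketched under the assumption that frames whose distance to the standard posture stays within d_p = d_min + Δl form the static segment (the patent only specifies d_p and the abscissas it cuts out):

```python
import math

def segment_actions(user_seq, standard_pose, delta_l):
    """Split a pose sequence into static and dynamic frame indices.

    `delta_l` is the positive offset added to the minimum Euclidean
    distance to obtain the cut-off d_p; the within/beyond-d_p split
    is an assumption consistent with the described procedure.
    """
    dists = [math.dist(pose, standard_pose) for pose in user_seq]
    d_p = min(dists) + delta_l
    static = [i for i, d in enumerate(dists) if d <= d_p]
    dynamic = [i for i, d in enumerate(dists) if d > d_p]
    return static, dynamic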
S3, evaluating action details based on the attention mechanism;
An attention mechanism is introduced and a global attention model is set, so that the force-exerting joint points mainly involved in the current action are identified during evaluation, saving computation cost in the subsequent joint detail evaluation.
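As a hedged illustration of attention over joints (the salience score used here, per-joint motion energy, is a hypothetical stand-in; the patent does not specify the attention model's inputs):

```python
import math

def joint_attention_weights(joint_salience):
    """Softmax over a per-joint salience score, concentrating weight on
    the key force-exerting joints so only they receive detailed
    evaluation. The choice of salience score is an assumption."""
    m = max(joint_salience)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in joint_salience]
    total = sum(exps)
    return [e / total for e in exps]
```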
The action detail evaluation reflects the spatial angle error and the time delay error of the user action, and four evaluation indexes are set from the two aspects of static action and dynamic action, respectively: the joint angle similarity, the center time similarity, the duration similarity, and the average angular velocity similarity.
Joint angle similarity: calculate the difference between each joint angle of the user and the corresponding joint angle of the standard action to measure the similarity between the user action and the standard action. For static actions, the average of the joint angles over all postures constituting the user's static action is computed and compared with the average over the postures constituting the standard action to obtain the similarity between the two, i.e.
$$d_s = \left\| \frac{1}{n}\sum_{i=1}^{n} c_i - b \right\|$$
d_s: joint angle similarity of the user's static action;
n: total number of frames of the user's static action;
c_i: joint angle vector of the i-th posture of the user's static action;
b: standard posture vector.
For dynamic actions, a dynamic time warping algorithm is adopted because the user action and the standard action sequence differ in length.
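The dynamic time warping alignment mentioned above can be sketched with the classic DP recurrence (a textbook DTW, not the patent's specific implementation; sequences are lists of joint-angle vectors of possibly different lengths):

```python
import math

def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping between two sequences of joint-angle
    vectors, giving a distance that tolerates the length mismatch between
    a user action and the standard action sequence."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```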
Center time similarity: calculate the difference between the center time t_c of the user action and that of the standard action, i.e. e_c = t_c - t′_c, reflecting the accumulated timing error of the user action; where
$$t_c = \frac{f_{start} + f_{end}}{2\tau}$$
e_c: action center time similarity;
t_c: user action center time;
t′_c: standard action center time;
f_start: starting frame number of the user action;
f_end: ending frame number of the user action;
τ: sampling frequency of the motion capture device.
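A small sketch of the center-time index, assuming the center time is the midpoint frame converted to seconds by dividing by the sampling frequency (an assumption consistent with the listed variables):

```python
def action_center_time(f_start, f_end, sampling_freq):
    """t_c: midpoint of the action's frame span, in seconds."""
    return (f_start + f_end) / (2.0 * sampling_freq)

def center_time_error(f_start, f_end, standard_center, sampling_freq):
    """e_c = t_c - t'_c, the accumulated timing error of the user action."""
    return action_center_time(f_start, f_end, sampling_freq) - standard_center
```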
Duration similarity: calculate the difference between the duration t_s of the user action and that of the standard action, i.e. e_s = t_s - t′_s, reflecting the duration error of a single user action; where
$$t_s = \frac{f_{end} - f_{start}}{\tau}$$
t′_s: standard action duration;
t_s: user action duration.
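The duration index follows the same pattern (again assuming frame span divided by sampling frequency gives seconds):

```python
def action_duration(f_start, f_end, sampling_freq):
    """t_s: duration of the user action in seconds."""
    return (f_end - f_start) / sampling_freq

def duration_error(f_start, f_end, standard_duration, sampling_freq):
    """e_s = t_s - t'_s, the duration error of a single action."""
    return action_duration(f_start, f_end, sampling_freq) - standard_duration
```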
Average angular velocity similarity of dynamic action joints: calculate the Euclidean distance between the average angular velocity of the user's dynamic action joints and that of the standard dynamic action, reflecting the angular velocity error of a single joint during dynamic action; i.e.
$$e_w = \left\| \frac{\tau}{n-1}\sum_{i=1}^{n-1} \left(e_{i+1} - e_i\right) - w' \right\|$$
e_w: average angular velocity similarity of dynamic action joints;
n: total number of frames of the user's dynamic action sequence;
e_i: joint angle vector of the i-th posture in the dynamic action;
w′: standard joint average angular velocity.
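A hedged sketch of this index, assuming the mean angular velocity is the average frame-to-frame angle difference scaled by the sampling frequency (an assumption consistent with the listed variables):

```python
import math

def mean_angular_velocity(angle_vectors, sampling_freq):
    """Per-joint mean angular velocity across a dynamic action, taken as
    the average frame-to-frame angle difference times the sampling
    frequency. `angle_vectors` is a list of joint-angle vectors e_i."""
    n = len(angle_vectors)
    k = len(angle_vectors[0])
    return [sampling_freq * sum(angle_vectors[i + 1][j] - angle_vectors[i][j]
                                for i in range(n - 1)) / (n - 1)
            for j in range(k)]

def angular_velocity_error(angle_vectors, standard_w, sampling_freq):
    """e_w: Euclidean distance between user and standard mean velocities."""
    w = mean_angular_velocity(angle_vectors, sampling_freq)
    return math.dist(w, standard_w)
```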
The daily living activity ability of the elderly is then evaluated through the combination of atomic actions and the joint point detail evaluation indexes.
Compared with the prior art, the invention has the advantages that:
Considering the labor and resource costs incurred in real life when elderly people enter a care institution and undergo daily living activity ability evaluation, as well as the subjective emotional judgment of evaluators during the process, the ADL evaluation of the elderly is completed automatically through computer vision technology. Labor and resource costs are effectively saved, the evaluation process is completed through machine processing and computation, the influence of human subjective factors is effectively eliminated, and the evaluation result is objective and accurate.
The skeleton representation of the human body in the video is found through a posture estimation algorithm, the influence of different individuals on an evaluation result is eliminated, the representation and the recognition of complex actions in daily life activities are realized through the combination of atomic actions, the detail evaluation is carried out on corresponding joint points through an attention mechanism, and the accuracy of the evaluation of action capacity is improved.
Drawings
Fig. 1 is a diagram of an architecture of a system for evaluating activities of daily living of an elderly person according to an embodiment of the present invention.
Fig. 2 is a structural diagram of a technical scheme of a method for evaluating the daily living activity ability of the elderly according to an embodiment of the present invention.
FIG. 3 is a diagram of a recurrent neural network in accordance with an embodiment of the present invention;
FIG. 4 is a flow chart of ADL evaluation preprocessing according to an embodiment of the present invention;
FIG. 5 is a flow chart of ADL evaluation division according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.
The overall system architecture of the invention comprises five parts: a physical layer (sensor layer), a data processing layer, a model extraction layer, an action recognition layer and a capability evaluation layer, wherein the data processing layer comprises four steps of marking, denoising, compressing and normalizing, as shown in fig. 1. The technical scheme mainly comprises two parts, ADL recognition and ADL evaluation: the recognition part comprises a two-dimensional pose estimation module and an atomic action recognition module, and the evaluation part comprises an evaluation preprocessing module, an evaluation action segmentation module, a key joint positioning module and an action detail evaluation module, as shown in fig. 2.
At the physical layer, action videos are acquired through a camera. At the data processing layer, video data preprocessing improves data quality, extracts key frames, and shortens computation time without affecting the result. At the model extraction layer, video clips are analyzed, key points are extracted, different body parts are located in each frame, and a two-dimensional human key point skeleton image is drawn; the skeleton is a high-level representation of human posture, and skeleton-based action recognition avoids extracting complicated information from video, can explicitly model action dynamics, and reduces the influence of different body types and clothing. At the action recognition layer, the motion of body parts over time is analyzed, and the actions in the video are predicted and classified. At the capability evaluation layer, the action sequence is extracted and preprocessed to judge whether it is completed; if an action is missing or the order is wrong, the action is judged not completed. The correct action sequence is segmented to separate static and dynamic actions. Meanwhile, key joint points are located based on an attention mechanism and the key joints are evaluated in detail through four indexes (joint angle similarity, action center time similarity, action duration similarity, and dynamic-action joint average angular velocity similarity), which respectively judge whether the actions are accurate, whether the action rhythm is correct, and whether the applied force is appropriate; the four indexes are integrated to evaluate daily living activity ability and to judge independence and the degree of dependence.
1. ADL recognition by action recognition layer
1.1 construction of data sets
Firstly, the atomic action data set required by the invention is constructed, consisting mainly of representative essential actions for completing human daily living activities, including: feeding, sit-to-stand balance, transferring, and going up and down stairs. 12 subjects are recruited, and videos are shot from 4 angles while each subject performs the four actions, each repeated 5 times.
1.2 human 2D pose estimation
To perform motion classification, various parts of the body are first located in each frame and the motion of different parts of the body over time is analyzed. The human skeleton data is a high-level representation method of activities, can accurately represent human postures, and effectively eliminates the influence of individuals on atomic motion recognition. So the human joint points in the video are first estimated.
Human joint points are detected based on the Keypoint RCNN framework and a feature pyramid network. The input video is first converted into images by extracting all frames. The images are pooled, and the rough positions of the human joints are detected from the pooled images; feature maps of different sizes are quantized to a fixed size, and the maximum value in each feature map is selected as the fixed input feature of the subsequent fully connected layers. The network then splits into two fully connected layers: one outputs the bounding box of a joint point, represented by 4 variables (the horizontal and vertical coordinates of the center point, and the width and height); the other outputs the confidence score of the joint point, and wrong joint points are filtered out using the confidence scores. To address the problem of the same joint point being detected multiple times, joint points with lower scores are filtered out according to the confidence scores, and only high-confidence human joint points are drawn.
Joints are connected to form a limb skeleton structure; because the human skeleton structure is the same throughout the video, the connections are set globally, and the joints in all frames of the video are connected to draw the skeleton. For accuracy of subsequent recognition and evaluation, a bone confidence is set: bone scores are assigned based on the joint scores, taking the lower confidence score of the two endpoint joints as the bone score. A good bone is defined as one whose score exceeds the maximum confidence of any joint point.
1.3 atomic motion recognition
Action classification on the joint points is based on an LSTM model trained with the PyTorch framework. The training input data contain the 18 joints and associated action label per frame; a continuous 32-frame sequence is used to identify a specific atomic action, and a 32-frame sample sequence is a multidimensional array of size 32 × 36, as shown below:
$$X = \begin{bmatrix} x_{1,1} & y_{1,1} & \cdots & x_{1,18} & y_{1,18} \\ x_{2,1} & y_{2,1} & \cdots & x_{2,18} & y_{2,18} \\ \vdots & \vdots & & \vdots & \vdots \\ x_{32,1} & y_{32,1} & \cdots & x_{32,18} & y_{32,18} \end{bmatrix}$$
each row contains the x, y values for 18 joints, for a total of 32 rows.
The initial hidden dimension of the LSTM model is set to 50 and the model is trained using PyTorch Lightning, with an Adam optimizer and a ReduceLROnPlateau scheduler configured to reduce the learning rate according to the value of the loss function.
The LSTM network is a variant of the recurrent neural network (RNN) that is able to learn order dependencies in sequence prediction problems. As shown in fig. 3, the RNN has a repeating chain of neural network modules, and the LSTM repeats a similar chain of modules.
In the network, x_0, x_1, …, x_t are the inputs and h_0, h_1, …, h_t are the predictions; the prediction at time t (h_t) depends on the previous predictions and the current input x_t. The RNN remembers the previous information and uses it to process the current input. However, RNNs have the disadvantage of being unable to remember long-term dependencies. The LSTM has a similar chain structure, but its neural network modules can handle long-term dependencies.
2. ADL assessment of competency assessment layer
2.1 evaluation of pretreatment
The action recognition sequence from the preceding stage is extracted, and whether the action sequence is completed correctly is judged: the user's action sequence is compared with the standard posture, the Euclidean distance between them is calculated, and it is judged whether the minimum Euclidean distance d_min is larger than a specified threshold. If so, the action is considered missing or out of order, and the action is judged not completed. This prepares for the subsequent segmentation and detail evaluation; if the action is not completed, the ability is directly evaluated as dependent and no subsequent work is carried out. The specific flow is shown in fig. 4.
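The completion check can be sketched as follows; the frame layout and function name are assumptions:

```python
import numpy as np

def action_completed(user_seq, standard_pose, threshold):
    """Compare every frame of the user's action sequence with the standard
    posture; if even the closest frame is farther than `threshold`
    (i.e. d_min > threshold), the action is missing or out of order."""
    d = np.linalg.norm(np.asarray(user_seq, float) - np.asarray(standard_pose, float), axis=1)
    return bool(d.min() <= threshold)
```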
2.2 assessment of motion segmentation
The necessary actions in human daily life activities are all composed of static and dynamic actions. For example, sitting down can be divided into standing (static action) and sitting down (dynamic action); transfer can be divided into standing (static action) and walking (dynamic action). Dividing actions into static and dynamic actions is therefore both necessary and scientifically sound. The user's action sequence is compared with the standard posture, the Euclidean distance between them is calculated, and the minimum Euclidean distance d_min is found; a positive offset Δl is added to obtain d_p. Based on the abscissas t_1 and t_2 corresponding to d_p, the static action at that time is obtained: frames the same as the static action are judged static, and frames different from it are judged dynamic. The specific flow is shown in fig. 5.
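The d_min + Δl rule for separating static from dynamic frames can be sketched as follows (array layout assumed):

```python
import numpy as np

def static_frame_mask(user_seq, standard_pose, delta_l):
    """Mark as static every frame whose distance to the standard static
    posture is at most d_p = d_min + delta_l (the positive offset above);
    all remaining frames are treated as dynamic action."""
    d = np.linalg.norm(np.asarray(user_seq, float) - np.asarray(standard_pose, float), axis=1)
    d_p = d.min() + delta_l
    return d <= d_p
```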
2.3 attention-based action detail evaluation
An attention mechanism is introduced and a global attention model is set, so that the force-exerting joint points mainly involved in the current action are obtained during the evaluation, saving computation cost for the subsequent joint detail evaluation.
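The patent does not give the attention model's internals; as an explicitly hypothetical sketch, one simple global attention over joints weights each joint by its total motion, so that detail evaluation can be restricted to the top-weighted (force-exerting) joints:

```python
import numpy as np

def joint_attention_weights(joint_angles):
    """Hypothetical global attention: softmax over each joint's total
    angle variation across the action, so joints that move most (the
    assumed force-exerting joints) receive the largest weights.

    joint_angles: (frames, joints) array of joint angles."""
    motion = np.abs(np.diff(np.asarray(joint_angles, float), axis=0)).sum(axis=0)
    w = np.exp(motion - motion.max())  # numerically stable softmax
    return w / w.sum()
```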
The action detail evaluation reflects the spatial angle error and the time delay error of the user's action; four evaluation indexes are set from the two aspects of static and dynamic actions: joint angle similarity, center time similarity, duration similarity, and average angular velocity similarity.
Joint angle similarity: the difference between each joint angle of the user and the corresponding joint angle of the standard action is calculated to measure the similarity between the user's action and the standard action, reflecting the accuracy of the user's action. For a static action, the average of all posture joint angles composing the user's static action is calculated and compared with the average of the joint angles composing the standard action to obtain their similarity, namely
c̄ = (1/n) Σ_{i=1}^{n} c_i
d_s = ‖ c̄ − b ‖
where d_s is the joint angle similarity of the user's static action, n is the total number of frames of the user's static action, c_i is the joint angle vector of the i-th posture of the user's static action, and b is the standard posture vector.
For dynamic actions, a dynamic time warping algorithm is adopted, because the length of the user's action sequence differs from that of the standard action sequence.
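A textbook dynamic time warping distance, applicable here because the two sequences differ in length (scalar angle sequences are assumed for brevity):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D angle sequences of
    possibly different lengths, via the standard cumulative-cost table."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of: insertion, deletion, match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```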
Center time similarity: the difference between the center time t_c of the user action and that of the standard action is calculated, namely e_c = t_c − t′_c, reflecting the accumulated time error of the user's action, where
t_c = (f_start + f_end) / (2τ)
where f_start and f_end are the starting and ending frame numbers of the user action, and τ is the sampling frequency of the motion capture device.
Duration similarity: the difference between the duration t_s of the user action and that of the standard action is calculated, namely e_s = t_s − t′_s, reflecting the duration error of the user's single action, where
t_s = (f_end − f_start) / τ
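Both the center time t_c and the duration t_s convert frame numbers to time via the sampling frequency τ of the capture device; a minimal sketch, where the exact frame-to-time convention is an assumption:

```python
def center_time(f_start, f_end, freq):
    """Assumed t_c = (f_start + f_end) / (2 * freq): the midpoint frame
    number converted to seconds with the device's sampling frequency."""
    return (f_start + f_end) / (2.0 * freq)

def action_duration(f_start, f_end, freq):
    """Assumed t_s = (f_end - f_start) / freq."""
    return (f_end - f_start) / freq
```

The differences e_c = t_c − t′_c and e_s = t_s − t′_s then follow directly from these quantities.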
Mean angular velocity similarity of dynamic-action joints: the Euclidean distance between the average angular velocity of the user's dynamic-action joints and that of the standard dynamic action is calculated, reflecting the angular velocity error of a single joint in the dynamic action, namely
e_w = ‖ w − w′ ‖,  with  w = (τ / (n − 1)) Σ_{i=1}^{n−1} (e_{i+1} − e_i)
where e_w is the mean angular velocity similarity of the dynamic-action joints, n is the total number of frames of the user's dynamic action sequence, e_i is the joint angle vector of the i-th posture in the dynamic action, and w′ is the standard mean joint angular velocity.
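A sketch of this fourth index; the per-joint mean angular velocity formula is an assumption consistent with the variables e_i, n, τ, and w′ defined in the claims:

```python
import numpy as np

def mean_angular_velocity_similarity(e_user, w_std, freq):
    """Mean angular velocity of the user's dynamic-action joints (assumed:
    total angle change divided by elapsed time (n - 1) / freq), compared
    with the standard mean angular velocity w_std by Euclidean distance.

    e_user: (n, joints) array of joint angle vectors over the dynamic action.
    """
    e = np.asarray(e_user, float)
    w = (e[-1] - e[0]) * freq / (len(e) - 1)  # assumed mean angular velocity
    return float(np.linalg.norm(w - np.asarray(w_std, float)))
```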
The daily living activity ability of the elderly is evaluated through the combination of the atomic actions and the joint-point detail evaluation indexes.
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (4)

1. A video-based system for evaluating the daily living activity ability of an elderly person, comprising: a physical layer, a data processing layer, a model extraction layer, an action recognition layer and a capability evaluation layer;
the physical layer collects action videos through a camera;
the data processing layer carries out video data preprocessing, including marking, denoising, compressing and normalizing, to improve data quality; key frames are extracted to shorten the computation time without affecting the operation result;
the model extraction layer analyzes video clips, extracts key points, positions different parts of a body in each frame and draws a two-dimensional human body key point skeleton image;
the motion recognition layer analyzes the motion of the body part along with the time, and predicts or classifies the motion in the video; the capability evaluation layer extracts the action sequence, carries out evaluation pretreatment and judges whether the action sequence is finished or not;
the capability evaluation layer compares the user action sequence with the standard posture, calculates Euclidean distance between the standard posture and the user action sequence, judges whether the minimum Euclidean distance is larger than a specified threshold value, if so, judges that the action is absent or wrong in action sequence, judges that the action is not finished, and if not, judges that the action sequence is correct; dividing the correct action sequence to separate static action and dynamic action; meanwhile, key joint points are positioned based on an attention mechanism, detail evaluation is carried out on the key joints, whether actions are accurate or not, whether action rhythms are correct or not and whether dynamics are proper or not are respectively judged through four indexes, namely joint angle similarity, action center time similarity, action duration time similarity and dynamic action joint average angular velocity similarity, the activities of daily life are evaluated by integrating the four indexes, and the independence and the dependence degree of the old are judged.
2. The system for evaluating the daily living activity ability of an elderly person according to claim 1, wherein the ADL identification method of the action recognition layer comprises the following steps:
s1, constructing a data set;
constructing the required atomic action data set of representative necessary actions in human daily life activities, including: feeding, sitting and standing balance, transferring, and going up and down stairs; collecting 12 subjects and shooting videos from 4 angles, each subject performing the four actions and repeating each action 5 times;
s2, estimating the 2D posture of the human body;
for motion classification, various parts of the body are located in each frame, the motion of different parts of the body over time is analyzed, and human body joint points in the video are estimated.
Detecting human body joint points based on the Keypoint RCNN architecture and a feature pyramid network: firstly converting the input video into images by extracting all frames in the video; pooling the images; detecting the rough positions of the human joints from the pooled images; quantizing feature maps of different sizes to a fixed size, and selecting the maximum value in each feature map as the fixed input feature of the fully connected layer; dividing the fully connected layer into two branches, one outputting the bounding box of a joint point, the boundary being represented by 4 variables, namely the horizontal and vertical coordinates of the center point, the width, and the height, and the other outputting the confidence score of the joint point; filtering erroneous joint points based on the confidence scores; and, to address the problem of the same joint point being detected multiple times, filtering out joint points with lower confidence scores and drawing only the human body joint points with high confidence;
connecting joints to form a limb skeleton structure, and connecting the joint points of all frames in the video to draw the skeleton; setting a bone confidence for the accuracy of subsequent recognition and evaluation, and assigning bone scores from the joint point scores, with the lower confidence score of a bone's two joint points taken as the bone score; a good bone being defined as one whose bone score is greater than the maximum confidence threshold of its joint points;
s3, identifying the atomic action;
the motion classification for the joint points is an LSTM model trained based on the PyTorch framework; the training input data contains the 18 joints and the associated motion label of each frame; a continuous sequence of 32 frames is used to identify a specific atomic motion, and a 32-frame sample sequence is a multidimensional array of size 32 × 36, as shown below:
[ x_{1,1}  y_{1,1}  x_{1,2}  y_{1,2}  …  x_{1,18}  y_{1,18} ]
[    ⋮        ⋮        ⋮        ⋮            ⋮         ⋮     ]
[ x_{32,1} y_{32,1} x_{32,2} y_{32,2} … x_{32,18} y_{32,18} ]
each row contains the x, y values for 18 joints, for a total of 32 rows.
3. The system for evaluating the daily living activity ability of an elderly person according to claim 2, wherein: the initial hidden dimension of the LSTM model is set to 50, and PyTorch Lightning is used for training; an Adam optimizer is used, and a ReduceLROnPlateau scheduler is configured to reduce the learning rate based on the value of the loss function.
4. The system for evaluating the daily living activity ability of an elderly person according to claim 1, wherein the ADL evaluation method of the capability evaluation layer comprises the following steps:
s1, evaluating and preprocessing;
extracting the action recognition sequence of the preceding stage and judging whether the action sequence is completed correctly: comparing the user's action sequence with the standard posture, calculating the Euclidean distance between them, and judging whether the minimum Euclidean distance d_min is larger than a specified threshold; if so, judging that the action is missing or its order is wrong and that the action is not completed; if not, judging that the action sequence is completed correctly; preparing for subsequent segmentation and detail evaluation: if the action is not completed, directly evaluating the ability as dependent, without performing the subsequent work;
s2, evaluating action segmentation;
comparing the user's action sequence with the standard posture, calculating the Euclidean distance between them, finding the minimum Euclidean distance d_min and adding a positive offset Δl to obtain d_p; obtaining, based on the abscissas t_1 and t_2 corresponding to d_p, the static action at that time; judging frames the same as the static action as static actions, and frames different from it as dynamic actions;
s3, evaluating action details based on the attention mechanism;
an attention mechanism is introduced and a global attention model is set, so that the force-exerting joint points mainly involved in the current action are calculated during the evaluation, saving computation cost for the subsequent joint detail evaluation;
the action detail evaluation reflects the spatial angle error and the time delay error of the user action, and four evaluation indexes are set from the two aspects of static action and dynamic action: the joint point angle similarity, the center time similarity, the duration similarity and the average angular velocity similarity;
joint angle similarity: calculating the difference between each joint angle of the user and the corresponding joint angle of the standard action to measure the similarity between the user's action and the standard action, wherein, for a static action, the average of all posture joint angles composing the user's static action is calculated and compared with the average of the joint angles composing the standard action to obtain their similarity, namely
d_s = ‖ (1/n) Σ_{i=1}^{n} c_i − b ‖
d_s - joint angle similarity of the user's static action;
n - total number of frames of the user's static action;
c_i - joint angle vector of the i-th posture of the user's static action;
b - standard posture vector; for dynamic actions, a dynamic time warping algorithm is adopted because the user's action sequence differs in length from the standard action sequence;
center time similarity: calculating the difference between the center time t_c of the user action and that of the standard action, namely e_c = t_c − t′_c, reflecting the accumulated time error of the user's action; wherein
t_c = (f_start + f_end) / (2τ)
e_c - action center time similarity;
t_c - center time of the user action;
t′_c - center time of the standard action;
f_start - starting frame number of the user action;
f_end - ending frame number of the user action;
τ - sampling frequency of the motion capture device;
duration similarity: calculating the difference between the duration t_s of the user action and that of the standard action, namely e_s = t_s − t′_s, reflecting the duration error of the user's single action; wherein
t_s = (f_end − f_start) / τ
t′_s - duration of the standard action;
t_s - duration of the user action;
mean angular velocity similarity of dynamically moving joints: calculating the Euclidean distance between the average angular velocity of the dynamic action joint of the user and the average angular velocity of the standard dynamic action joint, and reflecting the error of the angular velocity of a single joint in dynamic action; namely, it is
e_w = ‖ (τ / (n − 1)) Σ_{i=1}^{n−1} (e_{i+1} − e_i) − w′ ‖
e_w - mean angular velocity similarity of the dynamic-action joints;
n - total number of frames of the user's dynamic action sequence;
e_i - joint angle vector of the i-th posture in the dynamic action;
w′ - standard mean joint angular velocity;
and evaluating the daily living activity ability of the old through the combination of atomic actions and the detail evaluation indexes of the joint points.
CN202210175737.1A 2022-02-25 2022-02-25 System and method for evaluating daily life activity ability of old people based on video Pending CN114550299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210175737.1A CN114550299A (en) 2022-02-25 2022-02-25 System and method for evaluating daily life activity ability of old people based on video


Publications (1)

Publication Number Publication Date
CN114550299A true CN114550299A (en) 2022-05-27

Family

ID=81679531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210175737.1A Pending CN114550299A (en) 2022-02-25 2022-02-25 System and method for evaluating daily life activity ability of old people based on video

Country Status (1)

Country Link
CN (1) CN114550299A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114931743A (en) * 2022-06-15 2022-08-23 康键信息技术(深圳)有限公司 Exercise evaluation method, exercise evaluation device, electronic apparatus, and readable storage medium
CN115223240A (en) * 2022-07-05 2022-10-21 北京甲板智慧科技有限公司 Motion real-time counting method and system based on dynamic time warping algorithm



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination