CN111274954B - Embedded platform real-time fall detection method based on an improved pose estimation algorithm


Info

Publication number
CN111274954B
Authority
CN
China
Prior art keywords
joint
human body
acceleration
joint point
network
Prior art date
Legal status
Expired - Fee Related
Application number
CN202010062574.7A
Other languages
Chinese (zh)
Other versions
CN111274954A (en)
Inventor
郭欣
王红豆
孙连浩
Current Assignee
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date
Filing date
Publication date
Application filed by Hebei University of Technology filed Critical Hebei University of Technology
Priority to CN202010062574.7A
Publication of CN111274954A
Application granted
Publication of CN111274954B
Expired - Fee Related

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/23 - Recognition of whole body movements, e.g. for sport training
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/103 - Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11 - Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
    • A61B5/1116 - Determining posture transitions
    • A61B5/1117 - Fall detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Abstract

The invention relates to an embedded-platform real-time fall detection method based on an improved pose estimation algorithm. A pose estimation network is built from depthwise separable convolutions, an attention mechanism and inverted residual structures, and is applied to fall detection; this further improves the accuracy of the pose estimation network while greatly reducing its parameter count and computation. The distance of each human joint between different video frames is calculated to track each person, the acceleration of the joints is computed from consecutive frames, and whether a fall has occurred is judged from the acceleration, the relative positions of the joints and related cues. The network is therefore better suited to deployment on an embedded platform and achieves real-time performance when deployed on a TX2 embedded platform. The method tracks people using the multi-person joint coordinates and skeleton information obtained from consecutive frames; multi-person tracking makes the pose estimation more stable and better handles fall detection in multi-person scenes.

Description

Embedded platform real-time fall detection method based on an improved pose estimation algorithm
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an embedded platform real-time fall detection method based on an improved pose estimation algorithm.
Background
With improved medical conditions and rising living standards, average life expectancy has increased markedly and population aging has accelerated. An aging population brings many social problems, and reducing harm to the elderly from accidents is an urgent issue. According to a WHO report, an estimated 646,000 people die each year from falls that are not treated in time, with the middle-aged and elderly accounting for the largest share. Among current fall detection methods for the elderly, those based on wearable devices are highly constrained and costly. Computer-vision fall detection based on pose estimation does not require the elderly to carry equipment, but the network of a deep-learning pose estimation algorithm has some forty layers and a model size reaching 200 MB, demands a large amount of computation during forward inference, and places high requirements on deployment hardware. Patent CN201910631143.5 discloses a "fall detection method based on human posture estimation", which extracts human joints from video frames and judges whether a person has fallen from the angle formed by the hip joints, the center point of the knee joints and the neck joint; it neither tracks multiple people nor addresses the oversized pose estimation model. Patent CN201910477816.6 discloses a method for detecting falls of escalator passengers based on pose estimation, which extracts seven feature points of the human body, computes their variance in the vertical direction, and judges falls from that variance; in practice the number of people on an escalator is not limited to two, and since per-frame detection is unreliable when estimating the poses of several people at once and the method does not track multiple people, it cannot judge the fall behavior of multiple people well. The "elderly fall detection algorithm based on joint point extraction" from Nanjing University of Aeronautics and Astronautics uses the YOLO algorithm to detect the position of the human body, the OpenPose algorithm to obtain joint information, and an SVM classifier to classify human behavior; its fall detection accuracy is high in a fixed single-person scene but drops greatly for multiple people.
At present, fall detection algorithms based on pose estimation have not been optimized: they run slowly when deployed on an embedded platform and do not track people in multi-person scenes, so existing algorithms have a low fall recognition rate in such scenes and leave considerable room for improvement.
Disclosure of Invention
Aimed at the defects of current pose estimation algorithms in actual deployment, namely oversized models, too many parameters and the absence of human tracking, the invention provides an embedded-platform real-time fall detection method based on an improved pose estimation algorithm. The method mainly comprises: building a pose estimation network from depthwise separable convolutions, an attention mechanism and inverted residual structures and applying it to fall detection, which further improves the network's accuracy while greatly reducing its parameter count and computation; tracking each person by calculating the distance of each human joint between different video frames; computing joint accelerations from consecutive frames; and judging whether a fall has occurred from the acceleration, the relative positions of the joints and related cues.
The technical solution for achieving the purpose of the invention is as follows:
An embedded platform real-time fall detection method based on an improved pose estimation algorithm comprises the following steps:
Step one: building a pose estimation network using a lightweight structure
1-1. Building the feature extraction network: the feature extraction part of the OpenPose algorithm is improved; the feature extraction network is built using depthwise separable convolutions and inverted residual structures, and an attention mechanism is introduced:
(1) Structure of the basic module: the module comprises one depthwise separable convolution and two 1x1 convolutions and uses an inverted residual structure. The input of the basic module is split into two branches: the first branch expands the number of channels with a 1x1 convolution, applies a 3x3 depthwise separable convolution, and then reduces the number of channels with a 1x1 convolution; the second branch adds the input feature map of the basic module to the output feature map of the first branch, the sum forming the output of the basic module;
(2) Building the feature extraction network from the basic module of step (1), with the following structure: a picture of size 432 × 368 is used as input; a plain 3 × 3 convolution is applied first, and 9 basic modules of step (1) are then connected in sequence to form the feature extraction part of the pose estimation network; the output of the last basic module is superposed with the output of the sixth basic module along the channel dimension to form the output of the feature extraction network. Some of the 9 basic modules contain a channel attention module placed after the depthwise separable convolution, which assigns weights to the channels of the feature map at that point, i.e. judges the importance of the different channel feature maps;
1-2. Building the pose estimation network: the feature extraction network of step 1-1 produces feature maps of dimensions 54x46x120, which are fed into the first stage. Each stage contains two branches; each branch first passes through five 3x3 depth-convolution structures, each comprising a 3x3 depthwise separable convolution and a 1x1 convolution, and then through two 1x1 convolutions; the final output channel counts of the two branches of each stage are 19 and 38 respectively. The input of the next stage is the channel superposition of this stage's output with the feature map output by the feature extraction network, and there are five stages in total. In the output, the 19-channel feature maps each predict one part of the human body, 18 parts in total plus one background feature map, while the 38-channel output represents the vector maps of the connections between human joints. Except for the final stage, the 19-channel and 38-channel outputs of each stage are fused with the output feature map of the feature extraction network and then used as the input of the next stage;
1-3. Loss function and joint matching: the networks built in steps 1-1 and 1-2 are trained as a whole. The human joint loss function is the difference between the joint output feature map of the pose estimation network and the labeled positions of the data set, and the joint connection loss function is the difference between the joint-connection output feature map of the pose estimation network and the labeled positions of the data set; an L2 loss is used for each stage of step 1-2, and the overall loss is the sum of the losses of all parts. The detected joints of multiple people are assigned with the Hungarian matching algorithm to obtain the joint coordinates and confidence information of each person;
Step two: human pose tracking
Train the improved pose estimation network to obtain a pose estimation model, use the model to estimate the human pose in video frame pictures to obtain the joint coordinates of each person, and calculate the distance of each joint of the same person between different frames. The coordinate matrix of the j-th joint of the m-th person is L_{j,m} = (x_{j,m}, y_{j,m}, c_{j,m}), where x_{j,m} and y_{j,m} are the coordinates of the human joint and c_{j,m} is the confidence that it is a joint. The coordinate matrix of the m-th person is P_m = (L_{1,m}, L_{2,m}, ..., L_{18,m}). The average of the sums of the Euclidean distances of the corresponding joints of different people in adjacent frames is calculated; the pair with the minimum distance, provided it is below a threshold, is the same person;
step three: fall behavior detection
Track the people in different frames using the method of step two, and detect human falls from the coordinate changes of the joints of the same person between consecutive frames, the angle between the joint connecting line and the horizontal line, and the width-to-height ratio;
Step four: deploy the pose estimation model on a TX2 embedded platform, perform pose estimation on video frames, track the poses of different people, and perform fall detection in real time.
When the feature extraction network is built, channel attention modules are added only to the fourth, fifth and sixth of the 9 basic modules; the activation function used in the seventh, eighth and ninth basic modules is h-swish, and the other basic modules use the relu6 activation function.
The specific process of fall behavior detection in the third step is as follows:
3-1. Calculating joint acceleration: the accelerations of the joints close to the center point of the human body (the hip, neck, shoulder and knee joints) are calculated from the coordinate changes of the human joints between consecutive frames, and a fall is detected from the motion direction and acceleration of the hip, shoulder, neck or knee joints, which reflect how violent the body's motion is. From the joint coordinates of step two, the joint acceleration is

a = √((x_t − x_{t−1})² + (y_t − y_{t−1})²) / Δt²

where a is the acceleration of the joint motion; (x_{t−1}, y_{t−1}) is the position of the joint at the previous moment; (x_t, y_t) is the position of the joint at the current moment; and Δt is the interval between the two frames;
3-2. Calculating the relative positions of different joints of the same person, the angle between the joint connecting line and the horizontal line, and the aspect ratio: after the acceleration of the joints near the body's center point has been judged, whether the person has fallen is further determined from the relative positions of different joints, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the aspect ratio, namely:
3-2-1. Judge the intensity of the human motion from the joint acceleration: the larger the acceleration, the more violent the current motion. When the acceleration is below the intensity threshold, the person is motionless or moving slowly, and the acceleration of the next frame continues to be calculated; when the acceleration exceeds the intensity threshold, the person is in a state of violent motion: detection continues for 80 frames while the acceleration is calculated and counted, and the method proceeds to step 3-2-2;
3-2-2. After counting for a period in the violent-motion state, judge whether the acceleration is below the periodic threshold. If so, continue calculating the acceleration at the next moment and accumulate the count; once the accumulated count exceeds 8, proceed to step 3-2-3. If the acceleration is above the periodic threshold, a large periodic or sustained acceleration has been detected, the behavior is judged to be violent motion other than a fall, and the method returns to the initial acceleration calculation;
3-2-3. Squatting and sitting behaviors are then excluded according to the relative positions of the joints after the body comes to rest, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the change in the width-to-height ratio; the posture of the body is finally determined and it is judged whether the person is in a fallen state.
The detected neck joint of the person is taken as the origin; the direction parallel to the upper and lower edges of the video frame is set as the X direction, and the direction parallel to the left and right edges as the Y direction. The X-direction and Y-direction differences between the neck joint and the hip joints are calculated, together with the angle between the line from the neck joint to the center of the two hip joints and the X direction, and a human body detection box is calibrated from the detected coordinates of all joints. Whether the person has fallen is then judged by testing four conditions simultaneously: the X-direction difference between the neck joint and the hip center is greater than 2/3 of the length of the line joining them; the Y-direction difference is less than 1/3 of that line; the angle between the joint line and the X direction is not within [45°, 135°]; and the width-to-height ratio of the body is greater than 1:1.
Compared with the prior art, the invention has the beneficial effects that:
The method builds the network from depthwise separable convolutions and inverted residual structures, reducing the parameter count and computation of the pose estimation network several-fold, while a channel attention mechanism compensates for the accuracy lost to the reduced parameter count. Used together, these make the pose estimation network well suited to deployment on an embedded platform, where it achieves real-time performance on a TX2 embedded platform. In addition, among the several basic modules, an intermediate feature map of the feature extraction network is superposed channel-wise with its final feature map during feature extraction, so that applying different receptive fields extracts the features better.
The method uses the coordinates of the human body joint points of a plurality of people and skeleton information obtained from the front frame and the back frame to track the human body, the posture estimation is more stable due to the tracking of the plurality of people, and the falling detection problem under the scene of the plurality of people can be better solved; in the aspect of falling detection, video frame images are continuously read, the accelerated speeds of different joint points of the same person in front and back frames are calculated, the accelerated speeds of the joint points close to the central point of the human body are used for judging the intensity of the motion of the human body, the motion state and the static state (slow motion state) are distinguished, and according to the motion characteristic that the falling behavior is accelerated greatly and then is static, the relative position, the included angle and the human body width-height ratio information of the joint points are used for finally determining whether the human body falls or not.
Drawings
FIG. 1 is a schematic diagram of the structure of a basic module containing a channel attention module in the detection method of the present invention.
FIG. 2 is a schematic diagram of the structure of the channel attention module employed in the embodiments of the present invention.
FIG. 3 is a schematic structural diagram of the feature extraction network in the detection method of the present invention.
FIG. 4 is a schematic diagram of the depth convolution structure.
FIG. 5 is a schematic diagram of the branch network structure.
FIG. 6 shows the improved overall network structure in the detection method of the present invention.
FIG. 7 is a flow chart of fall detection.
FIG. 8 shows the effect of single-person and multi-person pose estimation.
FIG. 9 shows the detection of the falling process in the present invention.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
The invention provides an embedded-platform real-time fall detection method based on an improved pose estimation algorithm, comprising the following steps:
Step one: building a pose estimation network using a lightweight structure
1-1. Building the feature extraction network: the use of VGG19 to extract features in the OpenPose algorithm makes the parameter count of the feature extraction part huge, so the feature extraction network is built using depthwise separable convolutions and inverted residual structures, and an attention mechanism is introduced:
(1) Structure of the basic module: replacing plain convolutions with a depthwise separable convolution plus 1x1 convolutions reduces the parameter count by a factor of 7 to 8. The network is rebuilt with depthwise separable convolutions, completely changing the network structure of the original OpenPose algorithm, and the inverted residual structure prevents vanishing gradients. The basic module is structured as follows: it comprises one depthwise separable convolution and two 1x1 convolutions; the input of the basic module is split into two branches; the first branch increases the number of channels with a 1x1 convolution, applies a 3x3 depthwise separable convolution, and then decreases the number of channels with a 1x1 convolution; the second branch adds the input feature map of the basic module to the output feature map of the first branch, the sum forming the output of the basic module. In this application a basic module has two 1 × 1 convolutions: the first expands the number of channels of the feature map in order to extract more features, while the second reduces the number of channels in order to cut the parameter count.
(2) Building the feature extraction network from the basic module of step (1), with the following structure: a picture of size 432 × 368 is used as input; a plain 3 × 3 convolution is applied first, and 9 basic modules of step (1) are then connected in sequence to form the feature extraction part of the pose estimation network. The feature extraction part and the subsequent pose estimation part form one large network for training, the feature extraction part being the backbone; the overall structure is shown in Fig. 6. The output of the last basic module is superposed with the output of the sixth basic module along the channel dimension (the feature map channels between the basic modules are concatenated) to form the feature extraction network, so that receptive fields of different sizes extract the features better. Some of the 9 basic modules contain a channel attention module placed after the depthwise separable convolution, which assigns weights to the channels of the feature map at that point, i.e. judges the importance of the different channel feature maps.
1-2. Building the pose estimation network: the feature extraction network of step 1-1 produces feature maps of dimensions 54x46x120, which are fed into the first stage. Each stage contains two branches; each branch first passes through five 3x3 depth-convolution structures, each comprising a 3x3 depthwise separable convolution and a 1x1 convolution, and then through two 1x1 convolutions; the final output channel counts of the two branches of each stage are 19 and 38 respectively. The input of the next stage is the channel fusion of this stage's output feature maps with the feature map output by the feature extraction network, and there are five stages in total. In the output, the 19-channel feature maps each predict one part of the human body, 18 parts in total plus one background feature map, i.e. they predict the positions of the human joints, while the 38-channel output represents the vector maps of the joint connections of the body parts, including magnitude and direction; the numbers 19 and 38 are determined by the number of joints in the data set. Except for the final stage, the 19-channel and 38-channel outputs of each stage are channel-superposed with the output feature map of the feature extraction network and then serve as the input of the next stage.
1-3. Loss function and joint matching: the networks built in steps 1-1 and 1-2 are trained; the human joint loss function is the difference between the joint output feature map of the pose estimation network and the labeled positions of the data set, and the joint connection loss function is the difference between the joint-connection output feature map of the pose estimation network and the labeled positions of the data set; an L2 loss is used for each stage of step 1-2, and the overall loss is the sum of the losses of all parts. The detected joints of multiple people are assigned with the Hungarian matching algorithm to obtain the joint coordinates and confidence information of each person;
Step two: human pose tracking
Train the improved pose estimation network on the PC side to obtain a pose estimation model, use the model to estimate the human pose in video frame pictures to obtain the joint coordinates of each person, and calculate the distance of each joint of the same person between different frames. The coordinate matrix of the j-th joint of the m-th person is L_{j,m} = (x_{j,m}, y_{j,m}, c_{j,m}), where x_{j,m} and y_{j,m} are the coordinates of the human joint and c_{j,m} is the confidence of the joint. The coordinate matrix of the m-th person is P_m = (L_{1,m}, L_{2,m}, ..., L_{18,m}). The average of the sums of the Euclidean distances of the corresponding joints of different people in adjacent frames is calculated, and the pair with the minimum distance, below the threshold, is the same person.
Step three: fall behavior detection
Track the people in different frames using the method of step two, so that human falls are detected from the coordinate changes of the joints of the same person between consecutive frames, the angle between the joint connecting line and the horizontal line, the aspect ratio, and so on.
3-1. Calculating joint acceleration: the accelerations of the joints close to the center point of the human body (the hip, neck, shoulder and knee joints) are calculated from the coordinate changes of the human joints between consecutive frames; because the body's center point moves downward rapidly during a fall, the fall is detected from the motion direction and acceleration of the hip, shoulder, neck or knee joints (chiefly the hip joints). The accelerations of these joints reflect how violent the body's motion is. From the joint coordinates of step two, the acceleration is

a = √((x_t − x_{t−1})² + (y_t − y_{t−1})²) / Δt²

where a is the acceleration of the joint motion; (x_{t−1}, y_{t−1}) is the position of the joint at the previous moment; (x_t, y_t) is the position of the joint at the current moment; and Δt is the interval between the two frames.
3-2. Calculating the relative positions of different joints of the same person, the angle between the joint connecting line and the horizontal line, and the aspect ratio: after the acceleration of the joints near the body's center point has been judged, whether the person has fallen is further determined from the relative positions of different joints, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the aspect ratio.
3-3. The fall detection process comprises:
(1) Judging the intensity of the human motion from the joint accelerations: the larger the acceleration, the more violent the current motion; when the acceleration is below the threshold, states of no motion or slow motion, such as standing and walking, can be excluded.
(2) For states of strenuous exercise (running, jumping, falling, etc.): a fall differs from other movements in that the other processes repeat periodically, whereas falling is a one-off behavior; thus, when a large periodic or sustained acceleration is detected, the behavior can be judged to be something else, such as running.
(3) Squatting, sitting and similar behaviors are excluded according to the relative positions of the joints after the body comes to rest, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the change in the width-to-height ratio; the posture is finally determined and it is judged whether the person is in a fallen state.
Step four: the method comprises the steps of deploying an attitude estimation model on a TX2 embedded platform, carrying out attitude estimation on video frames, carrying out attitude tracking on different people, calculating the acceleration of joint points, judging the falling behavior of a human body in an auxiliary mode by using the relative positions, included angles and aspect ratios of the joint points, and carrying out real-time falling detection.
Examples
The embedded-platform real-time fall detection method based on the improved pose estimation algorithm comprises the following steps:
Step one: building a pose estimation network using a lightweight structure
1-1. Building the feature extraction network: the feature extraction network is built using depthwise separable convolutions and inverted residual structures, and an attention mechanism is introduced:
(1) Structure of the basic module: as shown in Fig. 1, the basic module is structured as follows: the input of the basic module is split into two branches; the first branch first expands the number of channels with a 1 × 1 convolution, then applies a 3 × 3 depthwise separable convolution, and then reduces the number of channels with a 1 × 1 convolution; the second branch directly adds the input of the basic module to the output feature map of the first branch, the sum forming the output of the basic module.
A channel attention module may be added to the basic module; placed after the 3 × 3 depthwise separable convolution, it is used to assign weights to the channels of the feature map. The structure of the channel attention module is shown in Fig. 2 and is as follows: the F_sq(·) function performs global pooling to reduce the feature map over the spatial dimensions:

z_c = F_sq(u_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)

where u_c is the c-th channel of the input feature map, H is the height of the feature map and W is its width.
Then the F_ex(·, w) function is used to learn the associations between the channels:

s = F_ex(z, w) = γ(g(z, w)) = γ(w₂ δ(w₁ z))

where δ denotes the ReLU function and γ the Sigmoid function, so that the resulting channel weights lie in [0, 1]; z is the input; w₁ ∈ R^{(C/r)×C} and w₂ ∈ R^{C×(C/r)} are the weights of the two fully connected layers, with C the number of channels of the feature map and r the channel reduction ratio. F_scale(·) is then used to re-weight the feature map:

F_scale(u_c, s_c) = s_c · u_c
where s_c is a scalar value of the importance of the channel. The h-swish activation function is used in part of the basic modules: the swish activation function performs better than relu6 but is more expensive to compute, and the h-swish function, which approximates swish, improves performance while reducing memory overhead, making it suitable for mobile devices. The h-swish activation function is:

h-swish(x) = x · ReLU6(x + 3) / 6
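For concreteness, the following is a minimal PyTorch sketch of one basic module as described above: a 1x1 expansion, a 3x3 depthwise convolution with an optional channel attention module, a 1x1 projection, and the inverted residual shortcut. The expansion factor of 4, the reduction ratio of 4 and all names are illustrative assumptions; the patent does not state them.

```python
import torch
import torch.nn as nn

class HSwish(nn.Module):
    """h-swish(x) = x * ReLU6(x + 3) / 6, a cheap approximation of swish."""
    def forward(self, x):
        return x * nn.functional.relu6(x + 3.0) / 6.0

class ChannelAttention(nn.Module):
    """Channel attention: global average pool (F_sq), two fully connected
    layers (F_ex), then per-channel re-weighting (F_scale)."""
    def __init__(self, channels, reduction=4):  # reduction ratio r is assumed
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # w1
            nn.ReLU(inplace=True),                       # delta
            nn.Linear(channels // reduction, channels),  # w2
            nn.Sigmoid(),                                # gamma
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # F_sq: squeeze the spatial dimensions
        s = self.fc(s).view(b, c, 1, 1)  # F_ex: per-channel weights in [0, 1]
        return x * s                     # F_scale: re-weight the feature map

class BasicModule(nn.Module):
    """Inverted-residual basic module: 1x1 expand -> 3x3 depthwise
    (-> optional channel attention) -> 1x1 project, plus identity shortcut."""
    def __init__(self, channels, expansion=4, attention=False, hswish=False):
        super().__init__()
        hidden = channels * expansion  # expansion factor is assumed
        act = HSwish() if hswish else nn.ReLU6(inplace=True)
        layers = [
            nn.Conv2d(channels, hidden, 1, bias=False),  # expand channels
            nn.BatchNorm2d(hidden), act,
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden,
                      bias=False),                       # 3x3 depthwise conv
            nn.BatchNorm2d(hidden),
        ]
        if attention:
            layers.append(ChannelAttention(hidden))      # after depthwise conv
        layers += [
            act,
            nn.Conv2d(hidden, channels, 1, bias=False),  # project back down
            nn.BatchNorm2d(channels),
        ]
        self.branch = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.branch(x)  # second branch: the residual shortcut
```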
(2) Building the feature extraction network from the basic module of step (1), with the following structure: a picture of size 432 × 368 is used as input; a plain 3 × 3 convolution is applied first, and the 9 basic modules of step (1) then extract the features of the picture; the output of the last basic module is superposed with the output of the sixth basic module along the channel dimension, so that receptive fields of different sizes extract the features better. In the feature extraction network, the channel attention module is added to the fourth, fifth and sixth basic modules; the activation function used in the seventh, eighth and ninth basic modules is h-swish, and the other modules use the relu6 activation function. The structure of the feature extraction network is shown in Fig. 3. Channel attention modules are added only at these positions in this embodiment because the channel attention module uses fully connected layers: adding too many attention modules does not improve accuracy much but instead increases the parameter count, which is unfavorable for embedded deployment.
1-2. Building the pose estimation network: the feature extraction network of step 1-1 produces feature maps of dimensions 54x46x120, which are fed into the first stage; each stage contains two branches, and each branch first passes through five 3x3 depth-convolution structures and then through two 1x1 convolutions. The depth convolution structure is shown in Fig. 4: it comprises a 3 × 3 depthwise separable convolution and a 1 × 1 convolution; the 3 × 3 depthwise separable convolution is followed by a batch normalization (BN) operation and a ReLU function, after which the 1 × 1 convolution is applied, again followed by batch normalization and a ReLU function. The branch network structure is shown in Fig. 5; the final output channel counts of the two branches of each stage are 19 and 38 respectively, and the input of the next stage is the channel superposition of this stage's output feature maps with the feature map output by the feature extraction network. There are five stages in total; in the output, the 19-channel feature maps each predict one part of the human body, 18 parts in total plus one background feature map, while the 38-channel output represents the vector maps of the joint connections of the body parts. Except for the final stage, the 19-channel and 38-channel outputs of each stage are fused with the output feature map of the feature extraction network and then serve as the input of the next stage. The overall structure of the first stage of the pose estimation network is shown in Fig. 6.
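A hedged PyTorch sketch of one such stage follows. The five depth-convolution structures, the two 1x1 convolutions and the 19/38 output channels follow the description above; the internal channel width of 120 and the layer names are assumptions.

```python
import torch
import torch.nn as nn

def depth_conv(cin, cout):
    """Depth convolution structure of Fig. 4: 3x3 depthwise conv and 1x1
    conv, each followed by batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, padding=1, groups=cin, bias=False),
        nn.BatchNorm2d(cin), nn.ReLU(inplace=True),
        nn.Conv2d(cin, cout, 1, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

class Stage(nn.Module):
    """One stage: two branches, each with five depth-convolution structures
    and two 1x1 convolutions, outputting 19 heatmap and 38 vector-map
    channels respectively."""
    def __init__(self, cin, mid=120):  # internal width is an assumption
        super().__init__()
        def branch(cout):
            layers = [depth_conv(cin, mid)]
            layers += [depth_conv(mid, mid) for _ in range(4)]
            layers += [nn.Conv2d(mid, mid, 1), nn.ReLU(inplace=True),
                       nn.Conv2d(mid, cout, 1)]
            return nn.Sequential(*layers)
        self.joints = branch(19)       # 18 body parts + 1 background map
        self.connections = branch(38)  # joint-connection vector maps

    def forward(self, x):
        return self.joints(x), self.connections(x)

# The first stage consumes the 120-channel backbone feature map F; each
# later stage consumes F channel-superposed with the previous outputs:
# x = torch.cat([F, S_prev, L_prev], dim=1)  # 120 + 19 + 38 = 177 channels
```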
1-3. Loss function and joint matching: the network built in steps 1-1 and 1-2 is trained on the PC side. The input of each branch network in the pose estimation network is:

S^1 = ρ^1(F), L^1 = φ^1(F), for t = 1
S^t = ρ^t(F, S^{t−1}, L^{t−1}), L^t = φ^t(F, S^{t−1}, L^{t−1}), for t ≥ 2

where F is the output feature map of the feature extraction network; ρ and φ denote the successive 3x3 and 1x1 convolution operations; S is the human joint heatmap; and L is the vector map of the connection relationships between the human joints. The human joint loss function is the difference between the joint output feature map of the pose estimation network and the labeled positions of the data set:
f_S^t = Σ_{j=1}^{J} Σ_p W(p) · ||S_j^t(p) − S_j*(p)||₂²

where S_j*(p) is the ground-truth value of the human joint position and S_j^t(p) is the output of the pose estimation network; in the actual annotation, W(p) = 1 if the joint is labeled and W(p) = 0 if it is not; p is a pixel in the feature map; J is the number of feature maps, namely 19; and t indexes the stages of the pose estimation network, taking values 1 to 5.
The loss function of the joint-connection vector maps is the difference between the joint-connection output feature map of the pose estimation network and the connections of the labeled positions of the data set:

f_L^t = Σ_{c=1}^{C} Σ_p W(p) · ||L_c^t(p) − L_c*(p)||₂²

where L_c*(p) is the ground-truth value of the vector map of the human joint connecting lines and L_c^t(p) is the output of the different stages of the network; p is a pixel in the feature map; C is the number of feature maps, namely 38; and t indexes the stages of the pose estimation network, taking values 1 to 5. The overall loss is the sum of the losses of all parts:
f = Σ_{t=1}^{5} (f_S^t + f_L^t)
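A compact sketch of these masked L2 losses; representing the stage outputs as lists of tensors is an assumption about the surrounding training code.

```python
import torch

def stage_loss(pred, target, mask):
    """Masked L2 loss of one stage's output against the labeled maps;
    `mask` is W(p): 1 where the annotation exists, 0 elsewhere."""
    return ((pred - target) ** 2 * mask).sum()

def total_loss(joint_preds, conn_preds, joint_gt, conn_gt, mask):
    """Overall loss f: sum of the heatmap and vector-map L2 losses over
    all five stages (one list entry per stage)."""
    f = 0.0
    for s_t, l_t in zip(joint_preds, conn_preds):
        f = f + stage_loss(s_t, joint_gt, mask)  # f_S^t
        f = f + stage_loss(l_t, conn_gt, mask)   # f_L^t
    return f
```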
for each joint point of each person in the actual image, the representation form is an extreme point on the thermodynamic diagram, the confidence coefficient of the joint point is determined by a Gaussian function, and the value of a certain position available for the jth joint point of the mth person is as follows:
Figure BDA0002374959890000089
wherein x isj,mRefers to the position corresponding to the jth joint point of the mth person in the image; σ means: because each pixel point in the output characteristic diagram has a value and is represented as a peak value at the joint point, only one or a plurality of pixel points in the actual label are labeled as the joint points, and the sigma represents the propagation range, namely the variance, of each peak value.
Then
Figure BDA00023749598900000810
The values are:
Figure BDA00023749598900000811
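A short NumPy sketch of building this ground truth; the σ value and the assumption that joint positions are pixel coordinates inside the map are illustrative.

```python
import numpy as np

def joint_heatmap(positions, height, width, sigma=7.0):
    """Ground-truth confidence map for one joint type: a Gaussian peak at
    each person's annotated position, merged by a pixel-wise maximum."""
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for (jx, jy) in positions:  # x_{j,m} for every person m
        g = np.exp(-((xs - jx) ** 2 + (ys - jy) ** 2) / sigma ** 2)
        heatmap = np.maximum(heatmap, g)  # S_j*(p) = max_m S*_{j,m}(p)
    return heatmap
```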
The direction of the skeleton information between the joints in the actual picture is obtained in the same way:

L_c*(p) = (1 / n_c(p)) Σ_m L*_{c,m}(p)

where n_c(p) is the number of non-zero vectors at pixel p over all labeled people; that is, if the same position p is shared by several people, the vectors are averaged;
Once the joint vector map L_c(p) output by the pose estimation network is known, it is used to evaluate the correlation between two joints: the integral of the dot product between the vector connecting the two joints and the vector at each pixel along the connecting line serves as the correlation between the two joints:

E = ∫₀¹ L_c(p(u)) · (d_{j2} − d_{j1}) / ||d_{j2} − d_{j1}||₂ du

where d_{j1} and d_{j2} are the two joints and L_c(p(u)) is the vector on the connecting line; p(u) is the position between the two joints, i.e. p(u) = (1 − u)·d_{j1} + u·d_{j2}, so p(u) denotes the joint d_{j2} when u = 1 and the joint d_{j1} when u = 0; u selects the position along the line between the two joints and ranges from 0 to 1. With the correlations between the joints known, the joints are used as the vertices of a graph and the correlations as its edge weights, and the Hungarian matching algorithm performs the assignment to obtain the joint coordinates and skeleton information of each person;
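In practice the integral E is approximated by sampling along the connecting line. A sketch follows, assuming the vector map is stored as two 2D arrays indexed [y, x] and that both joints lie inside the map; the sample count and the helper name are assumptions.

```python
import numpy as np

def connection_score(d1, d2, vec_x, vec_y, samples=10):
    """Approximate E by sampling points p(u) along the candidate connection
    between joints d1 and d2 and averaging the dot products."""
    d1, d2 = np.asarray(d1, float), np.asarray(d2, float)
    v = d2 - d1
    norm = np.linalg.norm(v)
    if norm < 1e-6:
        return 0.0
    v = v / norm  # unit vector (d_j2 - d_j1) / ||d_j2 - d_j1||
    score = 0.0
    for u in np.linspace(0.0, 1.0, samples):
        px, py = (1 - u) * d1 + u * d2          # p(u)
        x, y = int(round(px)), int(round(py))
        score += vec_x[y, x] * v[0] + vec_y[y, x] * v[1]
    return score / samples  # edge weight for the Hungarian assignment
```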
Step two: human pose tracking
Train the improved pose estimation network to obtain a pose estimation model, use the model to estimate the human pose in video frame pictures to obtain the joint coordinates of each person, and calculate the distance of each joint of the same person between different frames. The coordinate matrix of the j-th joint of the m-th person is L_{j,m} = (x_{j,m}, y_{j,m}, c_{j,m}), where x_{j,m} and y_{j,m} are the coordinates of the human joint, j = 1, 2, ..., 18, and c_{j,m} is the confidence of the joint. The coordinate matrix of the m-th person is P_m = (L_{1,m}, L_{2,m}, ..., L_{18,m}). The average of the sums of the Euclidean distances of the corresponding joints of different people in the adjacent frames is calculated, and the person with the minimum distance, below a threshold, is the same person. (The threshold is set as follows: when the camera is fixed, so that the size of the human body changes little, a fixed value can be specified; 90 pixels is used in this embodiment. In a changing scene where the size of the human body varies considerably, half of the horizontal distance between a person's leftmost and rightmost joints can be chosen as the threshold instead: after the distances between each person in the previous frame and the person in the current frame are calculated, 1/2 of that person's width is selected as the threshold.)
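A minimal sketch of this frame-to-frame matching, assuming each person is an 18 x 3 NumPy array of (x, y, c) rows and using the fixed 90-pixel threshold of this embodiment; the helper name is hypothetical, and a full implementation would also enforce a one-to-one assignment.

```python
import numpy as np

def match_people(prev_people, curr_people, threshold=90.0):
    """Match people across adjacent frames by the mean Euclidean distance
    over their 18 joints; pairs above the threshold stay unmatched."""
    matches = {}  # index in current frame -> index in previous frame
    for i, curr in enumerate(curr_people):
        best_j, best_d = None, float("inf")
        for j, prev in enumerate(prev_people):
            d = np.linalg.norm(curr[:, :2] - prev[:, :2], axis=1).mean()
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d < threshold:
            matches[i] = best_j  # minimum distance below threshold: same person
    return matches
```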
Step three: fall behavior detection
Track the people in different frames using the method of step two, so that a human fall is detected from the coordinate changes and the aspect ratio of the joints of the same person in consecutive frames. The fall detection flow chart is shown in Fig. 7, where a denotes acceleration. Threshold 1 is the threshold for judging whether the body is in a state of violent motion; 200 pixels/s² is used in this example (pixel refers to pixel distance, s to seconds). Threshold 2, the periodic threshold, is used to judge whether the body is in a static state or in a periodic motion state; 80 pixels/s² is used in this example. Number is used for the counting judgment: repeated comparisons against threshold 2 determine whether the body is in the still state after a fall or in a periodic motion state (in this embodiment, when the accumulated Number exceeds 8, the post-fall still state is preliminarily judged), and other states are excluded according to the characteristics of the falling motion.
3-1. Calculating joint acceleration: the accelerations of the joints close to the center point of the human body (the hip, neck, shoulder and knee joints) are calculated from the coordinate changes of the human joints between consecutive frames; because the body's center point moves downward rapidly during a fall, the fall is detected from the motion direction and acceleration of the hip, shoulder, neck or knee joints. The accelerations of these joints reflect how violent the body's motion is. From the joint coordinate form of step two, the acceleration is:

a = √((x_t − x_{t−1})² + (y_t − y_{t−1})²) / Δt²

where a is the acceleration of the joint motion; (x_{t−1}, y_{t−1}) is the position of the joint at the previous moment; (x_t, y_t) is the position of the joint at the current moment; and Δt is the interval between the two frames.
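A direct transcription of this formula; the helper name is hypothetical, and treating the frame interval dt as known is an assumption.

```python
def joint_acceleration(prev_pos, curr_pos, dt):
    """Acceleration proxy of one joint from its pixel displacement between
    the previous frame (x_{t-1}, y_{t-1}) and the current frame (x_t, y_t)."""
    dx = curr_pos[0] - prev_pos[0]
    dy = curr_pos[1] - prev_pos[1]
    return (dx ** 2 + dy ** 2) ** 0.5 / dt ** 2  # pixels per second squared
```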
3-2. Calculating the relative positions of different joints of the same person, the angle between the joint connecting line and the horizontal line, and the aspect ratio: after the acceleration of the joints near the body's center point has been judged, whether the person has fallen is further determined from the relative positions of different joints, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the aspect ratio:
3-2-1. Judge the intensity of the human motion from the joint acceleration: the larger the acceleration, the more violent the current motion. When the acceleration is below the intensity threshold, the person is motionless or moving slowly (such as walking or standing), and the acceleration of the next frame continues to be calculated; when the acceleration exceeds the intensity threshold and the person is in a state of violent motion (such as running, jumping or falling), detection continues for 80 frames while the acceleration is calculated and counted, and the method proceeds to step 3-2-2;
3-2-2. After counting for a period in the violent-motion state, judge whether the acceleration is below the periodic threshold. If so, continue calculating the acceleration at the next moment and accumulate the count; once the accumulated count exceeds 8, proceed to step 3-2-3. If the acceleration is above the periodic threshold, a large periodic or sustained acceleration has been detected, the behavior is judged to be violent motion other than a fall, and the method returns to the initial acceleration calculation;
3-2-3. Squatting, sitting and similar behaviors are then excluded according to the relative positions of the joints after the body comes to rest, the angle between the line from the neck joint to the center of the two hip joints and the horizontal line, and the change in the width-to-height ratio; the posture of the body is finally determined and it is judged whether the person is in a fallen state. The method is as follows:
and calculating the relative positions of different joint points of the same person, the included angle between the connecting line of the neck joint and the centers of the two hip joints and the horizontal line and the aspect ratio. The detected human neck joint is used as an original point, the direction parallel to the upper edge line and the lower edge line of a video frame is set as an X direction, the direction parallel to the left edge line and the right edge line is set as a Y direction, the X direction difference and the Y direction difference of the neck joint and the hip joint are calculated, the included angle between the connecting line of the centers of the neck joint and the two hip joints and the X direction is calculated at the same time, a human body detection frame is calibrated according to the detected coordinates of all joint points, and the human body is determined to fall if four conditions that the difference between the X direction of the centers of the neck joint and the hip joint is greater than 2/3 of the connecting line of the two points, the difference between the Y direction is smaller than 1/3 of the connecting line, the included angle between the connecting line of the joint points and the X direction is not between [45 degrees ], 135 degrees ] and the human body width-height ratio is greater than 1:1 are met at the same time.
The fall process differs from other sports in that the other processes repeat periodically, whereas the falling behavior is a one-off event; therefore, when a large periodic or sustained acceleration is detected, the behavior can be judged to be something else, such as running. The complete decision flow can be summarized as a small state machine, as sketched below.
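This is a hedged sketch of the flow of Fig. 7; the constants are the example values of this embodiment (threshold 1 = 200 pixels/s², threshold 2 = 80 pixels/s², 80 watch frames, count of 8), and the exact reset behavior is an assumption.

```python
class FallDetector:
    """Threshold 1 detects violent motion, threshold 2 separates the
    post-impact still state from periodic motion, and the geometric check
    of the previous listing confirms the fall."""
    VIOLENT_THRESH = 200.0   # pixels/s^2 (threshold 1)
    PERIODIC_THRESH = 80.0   # pixels/s^2 (threshold 2)
    WATCH_FRAMES = 80        # frames observed after a violent spike
    STILL_COUNT = 8          # low-acceleration frames needed to confirm stillness

    def __init__(self):
        self.watching = 0    # frames left in the violent-motion state
        self.still = 0       # accumulated low-acceleration count

    def update(self, accel, geometry_ok):
        """Feed one frame's joint acceleration and the geometric check result;
        returns True when a fall is confirmed."""
        if self.watching == 0:
            if accel > self.VIOLENT_THRESH:       # enter violent-motion state
                self.watching = self.WATCH_FRAMES
                self.still = 0
            return False
        self.watching -= 1
        if accel < self.PERIODIC_THRESH:
            self.still += 1
            if self.still > self.STILL_COUNT and geometry_ok:
                self.watching = 0
                return True                       # fall: spike then stillness
        else:
            # periodic or sustained large acceleration: running/jumping etc.,
            # so return to the initial acceleration calculation
            self.watching = 0
            self.still = 0
        return False
```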
Step four: the method comprises the steps of deploying a posture estimation model on a TX2 embedded platform, carrying out posture estimation on a video frame, enabling a posture estimation effect graph to be shown in fig. 8 (posture estimation effect graphs of a single person, two persons and multiple persons are respectively shown in fig. 8), then carrying out posture tracking on different persons, calculating acceleration of joint points, using relative positions of the joint points, included angles of the joint points and an X direction and an aspect ratio to assist in judging the falling behavior of a human body, carrying out real-time falling detection, and enabling the falling detection process to be shown in fig. 9.
The pose estimation model reaches 81.7% accuracy on the MPII data set and 62.3% on the COCO data set, slightly higher than the existing OpenPose algorithm, while its speed is 54% higher; the average FPS (frames per second) on the TX2 platform is 16.7, achieving real-time performance. To rule out any dependence of the algorithm on the testers' body type and age, 4 testers of different builds were selected and performed standing, walking, running, squatting and falling actions in a laboratory and a hall; of 80 tests in total, 76 were detected successfully.
What is not described in detail in the present invention is the known art.

Claims (4)

1. An embedded platform real-time fall detection method based on an improved pose estimation algorithm, comprising the following steps:
Step one: building a pose estimation network using a lightweight structure
1-1. Building the feature extraction network: the feature extraction part of the OpenPose algorithm is improved; the feature extraction network is built using depthwise separable convolutions and inverted residual structures, and an attention mechanism is introduced:
(1) Structure of the basic module: the module comprises one depthwise separable convolution and two 1x1 convolutions and uses an inverted residual structure; the input of the basic module is split into two branches; the first branch first expands the number of channels with a 1 × 1 convolution, then applies a 3 × 3 depthwise separable convolution, and then reduces the number of channels with a 1 × 1 convolution; the second branch adds the input feature map of the basic module to the output feature map of the first branch, the sum forming the output of the basic module;
(2) Building the feature extraction network from the basic module of step (1), with the following structure: a picture of size 432 × 368 is used as input; a plain 3 × 3 convolution is applied first, and 9 basic modules of step (1) are then connected in sequence to form the feature extraction part of the pose estimation network; the output of the last basic module is superposed with the output of the sixth basic module along the channel dimension to form the output of the feature extraction network. Some of the 9 basic modules contain a channel attention module placed after the depthwise separable convolution, which assigns weights to the channels of the feature map at that point, i.e. judges the importance of the different channel feature maps;
1-2. Building the pose estimation network: the feature extraction network of step 1-1 produces feature maps of dimensions 54x46x120, which are fed into the first stage. Each stage contains two branches; each branch first passes through five 3x3 depth-convolution structures, each comprising a 3x3 depthwise separable convolution and a 1x1 convolution, and then through two 1x1 convolutions; the final output channel counts of the two branches of each stage are 19 and 38 respectively. The input of the next stage is the channel superposition of this stage's output with the feature map output by the feature extraction network, and there are five stages in total. In the output, the 19-channel feature maps each predict one part of the human body, 18 parts in total plus one background feature map, while the 38-channel output represents the vector maps of the connections between human joints. Except for the final stage, the 19-channel and 38-channel outputs of each stage are fused with the output feature map of the feature extraction network and then used as the input of the next stage;
1-3. Loss function and joint matching: the networks built in steps 1-1 and 1-2 are trained as a whole. The human joint loss function is the difference between the joint output feature map of the pose estimation network and the labeled positions of the data set, and the joint connection loss function is the difference between the joint-connection output feature map of the pose estimation network and the labeled positions of the data set; an L2 loss is used for each stage of step 1-2, and the overall loss is the sum of the losses of all parts. The detected joints of multiple people are assigned with the Hungarian matching algorithm to obtain the joint coordinates and confidence information of each person;
Step two: human pose tracking
Train the improved pose estimation network to obtain a pose estimation model, use the model to estimate the human pose in video frame pictures to obtain the joint coordinates of each person, and calculate the distance of each joint of the same person between different frames. The coordinate matrix of the j-th joint of the m-th person is L_{j,m} = (x_{j,m}, y_{j,m}, c_{j,m}), where x_{j,m} and y_{j,m} are the coordinates of the human joint and c_{j,m} is the confidence that it is a joint. The coordinate matrix of the m-th person is P_m = (L_{1,m}, L_{2,m}, ..., L_{18,m}). The average of the sums of the Euclidean distances of the corresponding joints of different people in adjacent frames is calculated; the pair with the minimum distance, provided it is below a threshold, is the same person;
step three: fall behavior detection
Tracking the human bodies in different frames by the method of step two, and detecting human falls according to the change of the joint point coordinates of the same person between consecutive frames, the included angle between the joint point connecting line and the horizontal line, and the width-to-height ratio;
step four: and deploying the attitude estimation model on the embedded platform, performing attitude estimation on the video frame, performing attitude tracking on different people, and performing real-time falling detection.
2. The detection method according to claim 1, wherein, when the feature extraction network is built, channel attention modules are added only to the fourth, fifth and sixth of the 9 basic modules used; the activation function used in the seventh, eighth and ninth basic modules is h-swish, and the remaining basic modules use the relu6 activation function.
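h-swish is the piecewise-linear approximation of swish popularized by MobileNetV3, h-swish(x) = x · relu6(x + 3) / 6; it avoids the exponential of swish, which matters on embedded CPUs. A one-function PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def h_swish(x: torch.Tensor) -> torch.Tensor:
    """h-swish(x) = x * relu6(x + 3) / 6; maps -5 -> 0 and 5 -> 5, smooth between."""
    return x * F.relu6(x + 3.0) / 6.0

print(h_swish(torch.linspace(-5.0, 5.0, 5)))
```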
3. The detection method according to claim 1, wherein the specific process of fall behavior detection in step three is as follows:
3-1, calculating the joint acceleration: the acceleration of the joint points close to the center of the human body is calculated from the change of the human joint point coordinates between consecutive frames; the joint points close to the body center include the hip, neck, shoulder and knee joints, and a human fall is detected from the movement direction and acceleration of the hip, shoulder, neck or knee joint, which reflect how violent the motion of the human body is; from the joint point coordinates of step two, the acceleration of a joint point is obtained as
a = √((x_t − x_{t−1})² + (y_t − y_{t−1})²) / Δt²
where a is the acceleration of the joint movement, (x_{t−1}, y_{t−1}) is the position of the joint point at the previous moment, (x_t, y_t) is the position of the joint point at the current moment, and Δt is the interval between the two frames;
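The formula above is reconstructed from the surrounding variable definitions, since the original equation survives only as an image placeholder in the source; under that assumption, a direct sketch, with the frame interval Δt as a parameter (30 fps assumed by default):

```python
import math

def joint_acceleration(prev_xy, cur_xy, dt: float = 1 / 30) -> float:
    """Motion measure of one joint between consecutive frames: the magnitude
    of the displacement divided by the squared inter-frame interval dt."""
    dx = cur_xy[0] - prev_xy[0]
    dy = cur_xy[1] - prev_xy[1]
    return math.hypot(dx, dy) / dt ** 2
```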
3-2, calculating the relative positions of the different joint points of the same person, the included angle between the joint point connecting line and the horizontal line of the video frame, and the width-to-height ratio: after the acceleration of the joint points close to the body center has been evaluated, whether the human body has fallen is further determined from the relative positions of the different joint points, the included angle between the line connecting the neck joint and the center of the two hip joints and the horizontal line, and the width-to-height ratio, namely:
3-2-1, judging the intensity of the human motion from the joint acceleration, a larger acceleration meaning more violent current motion: when the acceleration is below an intensity threshold, the person is motionless or moving slowly, and the acceleration of the next frame is calculated; when the acceleration exceeds the intensity threshold, the person is in a state of violent motion, detection continues for 80 frames while the acceleration keeps being calculated and counted, and the process enters step 3-2-2;
3-2-2, after counting for a period in the violent-motion state, judging whether the acceleration is below a periodic threshold: if so, the acceleration at the next moment keeps being calculated and the count accumulated until it exceeds 8, whereupon the process enters step 3-2-3; if the acceleration is above the periodic threshold, periodic or continuous large accelerations have been detected, the behavior is judged to be violent motion other than a fall, and the process returns to the initial acceleration calculation;
and 3-2-3, then excluding squatting and sitting behaviors according to the relative positions of the joint points after the body has come down, the included angle between the line connecting the neck joint and the center of the two hip joints and the horizontal line, and the change of the width-to-height ratio, and finally determining the posture of the human body and judging whether it is in a fallen state; a sketch of the counting logic of steps 3-2-1 and 3-2-2 follows.
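A sketch of the counting logic of steps 3-2-1 and 3-2-2. The 80-frame window and the count of 8 come from the claim; the numeric intensity and periodic thresholds are placeholders, since the patent does not state their values:

```python
def fall_candidate(accels, intensity_thresh=3.0, periodic_thresh=3.0,
                   window=80, calm_needed=8):
    """`accels` holds per-frame accelerations of a joint near the body center.
    Returns the index of the motion burst if a candidate fall (burst followed
    by sustained stillness) is found, else None; the posture test of step
    3-2-3 would then run on the candidate."""
    t = 0
    while t < len(accels):
        if accels[t] <= intensity_thresh:      # 3-2-1: still or slow motion
            t += 1
            continue
        calm = 0                               # burst: watch the next `window` frames
        for a in accels[t + 1 : t + 1 + window]:
            if a < periodic_thresh:            # 3-2-2: quiet after the burst
                calm += 1
                if calm > calm_needed:
                    return t                   # candidate fall found
            else:
                calm = 0                       # repeated bursts: vigorous
                                               # exercise, not a fall
        t += window                            # back to the initial scan
    return None
```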
4. The detection method according to claim 3, wherein the detected neck joint of the human body is taken as the origin, the direction parallel to the upper and lower edge lines of the video frame is set as the X direction, and the direction parallel to the left and right edge lines as the Y direction; the X-direction and Y-direction differences between the neck joint and the hip joints are calculated, together with the included angle between the line connecting the neck joint and the center of the two hip joints and the X direction; a human body detection frame is calibrated from the coordinates of all detected joint points; and it is judged whether the X-direction difference between the neck joint and the hip-joint center is larger than 2/3 of the length of the line connecting the two points, whether the Y-direction difference is smaller than 1/3 of that length, whether the included angle between the connecting line and the X direction lies outside [45°, 135°], and whether the width-to-height ratio of the human body is larger than 1:1; when these conditions are met, the human body is determined to have fallen.
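A direct sketch of the geometric test in claim 4; the joint coordinates are assumed to be in image pixels, and the function and parameter names are hypothetical:

```python
import math

def fallen_posture(neck, left_hip, right_hip, box_w, box_h) -> bool:
    """Claim-4 test: neck-to-hip-center offsets, trunk angle to the horizontal
    X direction, and detection-frame aspect ratio. neck/left_hip/right_hip are
    (x, y) pixel coordinates; box_w, box_h are the detection-frame sizes."""
    hip = ((left_hip[0] + right_hip[0]) / 2, (left_hip[1] + right_hip[1]) / 2)
    dx, dy = hip[0] - neck[0], hip[1] - neck[1]
    length = math.hypot(dx, dy)
    if length == 0 or box_h == 0:
        return False
    angle = math.degrees(math.atan2(dy, dx)) % 180.0   # trunk angle in [0, 180)
    return (abs(dx) > 2 / 3 * length                   # large horizontal offset
            and abs(dy) < 1 / 3 * length               # small vertical offset
            and not (45.0 <= angle <= 135.0)           # trunk far from upright
            and box_w / box_h > 1.0)                   # frame wider than tall
```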
CN202010062574.7A 2020-01-20 2020-01-20 Embedded platform real-time falling detection method based on improved attitude estimation algorithm Expired - Fee Related CN111274954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010062574.7A CN111274954B (en) 2020-01-20 2020-01-20 Embedded platform real-time falling detection method based on improved attitude estimation algorithm


Publications (2)

Publication Number Publication Date
CN111274954A CN111274954A (en) 2020-06-12
CN111274954B true CN111274954B (en) 2022-03-15

Family

ID=71003333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010062574.7A Expired - Fee Related CN111274954B (en) 2020-01-20 2020-01-20 Embedded platform real-time falling detection method based on improved attitude estimation algorithm

Country Status (1)

Country Link
CN (1) CN111274954B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220315