CN108961675A - Fall detection method based on convolutional neural networks - Google Patents
- Publication number
- CN108961675A CN108961675A CN201810614024.4A CN201810614024A CN108961675A CN 108961675 A CN108961675 A CN 108961675A CN 201810614024 A CN201810614024 A CN 201810614024A CN 108961675 A CN108961675 A CN 108961675A
- Authority
- CN
- China
- Prior art keywords
- model
- convolutional neural
- neural networks
- training
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B21/00—Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
- G08B21/02—Alarms for ensuring the safety of persons
- G08B21/04—Alarms for ensuring the safety of persons responsive to non-activity, e.g. of elderly persons
- G08B21/0407—Alarms for ensuring the safety of persons responsive to non-activity, e.g. of elderly persons based on behaviour analysis
- G08B21/043—Alarms for ensuring the safety of persons responsive to non-activity, e.g. of elderly persons based on behaviour analysis detecting an emergency event, e.g. a fall
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B21/00—Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
- G08B21/02—Alarms for ensuring the safety of persons
- G08B21/04—Alarms for ensuring the safety of persons responsive to non-activity, e.g. of elderly persons
- G08B21/0438—Sensor means for detecting
- G08B21/0476—Cameras to detect unsafe condition, e.g. video cameras
Abstract
The present invention relates to a fall detection method based on convolutional neural networks, comprising: training a convolutional neural network, which specifically includes pre-processing each acquired frame image, the pre-processing consisting, in order, of foreground extraction, normalization, and whitening; and first pre-training a ResNet network on the ImageNet data set to obtain a pre-trained model. The classification capability of convolutional neural networks is applied to fall detection. At the same time, to improve the precision of the system and reduce the complexity of the computation, an improved foreground detection method is used to extract the person from a complex background, and the processed images are then put into the convolutional neural network for model training.
Description
Technical field
The present invention relates to fall detection methods, and more particularly to a fall detection method based on convolutional neural networks.
Background technique
The trend of social aging is increasingly aggravated, and the decline of the elderly's physical function, together with the more and more common phenomenon of living alone, makes falls one of the main causes of injury to the elderly. Detecting fall behaviour is therefore highly significant.
The traditional fall detection method based on computer vision usually extracts features manually; the engineering effort is huge, the generalization ability is poor, and the precision is not high. Unlike traditional feature extraction, convolutional neural networks can extract features automatically, and the trained model has geometric invariance, which overcomes the problems generated by changes in illumination and shooting angle.
The traditional technology has the following technical problems:
Current fall detection systems are broadly divided into two classes: the first class is sensor-based wearable detection systems; the other class is video-based detection systems. At present, research on wearable recognition systems based on three-axis acceleration or trunk angular velocity is already relatively mature. However, a wearable device usually has to be worn on the neck or waist, and wearing it for a long time makes the user uncomfortable. A vision-based detection system, by contrast, captures the movement of the target through one or several cameras and determines the image characteristics of a fall through specific image-processing algorithms, so that falls can be distinguished from daily activities. The vision-based fall detection algorithms currently in use are mainly threshold methods and intelligent algorithms. Threshold methods usually detect the head position or the centre of gravity of the human body. Diraco judges a fall when the body centre drops below a specified height and stays there for more than 4 s. Rougier et al. locate the head position, estimate the head position in the next frame by particle filtering, calculate the velocities in the horizontal and vertical directions, and compare them with thresholds to determine whether a fall has occurred. These methods are simple to implement, but their precision is easily affected by external factors such as the environment. Methods based on machine learning mainly first extract the person from the image, then manually extract features, and input the acquired features into a model to detect and recognize fall behaviour. Such methods require manual feature extraction, the engineering effort is huge, and most of them stop at a two-class problem; considering that the requirements of future smart homes will rise, recognizing the various postures of the human body will also become an indispensable part.
Summary of the invention
Based on this, in view of the above technical problems, it is necessary to provide a fall detection method based on convolutional neural networks, which applies the classification capability of convolutional neural networks to fall detection. At the same time, to improve the precision of the system and reduce the complexity of the computation, an improved foreground detection method is used to extract the person from a complex background, and the processed images are then put into the convolutional neural network for model training.
A fall detection method based on convolutional neural networks, comprising:
training a convolutional neural network, which specifically includes:
pre-processing each acquired frame image, the pre-processing consisting, in order, of foreground extraction, normalization, and whitening;
first pre-training a ResNet network on the ImageNet data set to obtain a pre-trained model;
putting the pictures produced by the pre-processing step into the pre-trained model for model training, to obtain the model parameters; and
inputting a test set into the trained model to test the precision of the model;
detecting pictures using the trained convolutional neural network;
wherein the foreground extraction method specifically includes:
processing the image using background subtraction;
processing the image using a mixture-of-Gaussians model; and
summing the result of processing the image with background subtraction and the result of processing the image with the mixture-of-Gaussians model.
In the above fall detection method based on convolutional neural networks, the classification capability of convolutional neural networks is applied to fall detection; at the same time, to improve the precision of the system and reduce the complexity of the computation, an improved foreground detection method is used to extract the person from a complex background, and the processed images are then put into the convolutional neural network for model training.
In another embodiment, in the pre-processing step, each acquired frame image is obtained by reading a video file.
In another embodiment, after the step of detecting pictures with the trained convolutional neural network, training the convolutional neural network further includes: displaying the detection result for each frame picture, and realizing the visualization of the model's convolution kernels.
In another embodiment, the detection result for each frame picture is displayed on the Matlab platform, and the visualization of the model's convolution kernels is realized there.
In another embodiment, the step of processing the image using background subtraction specifically includes:
taking the average of the first several frames as the initial background image B_t;
subtracting the grey level of the current image frame from that of the background image and taking the absolute value N_t(x, y), with the formula
N_t(x, y) = |I_t(x, y) − B_t(x, y)|;
for a pixel (x, y) of the current frame, if |I_t(x, y) − B_t(x, y)| ≥ T, the pixel is a foreground point, i.e., it is marked in the binary foreground mask; and
updating the background image with the current frame image.
In another embodiment, the step of processing the image using a mixture-of-Gaussians model specifically includes:
When a Gaussian mixture model is used to build the background model, the value of each pixel of the image sequence can be simulated by K Gaussian models, so at time t the probability density of a pixel value X_t can be expressed as:
P(X_t) = Σ_{i=1..K} w_{i,t} · η(X_t, μ_{i,t}, Σ_{i,t})
where w_{i,t} denotes the weight of the i-th Gaussian model, and the probability density function of a single Gaussian model is:
η(X_t, μ, Σ) = (2π)^{−n/2} |Σ|^{−1/2} exp(−(1/2)(X_t − μ)^T Σ^{−1} (X_t − μ))
Then the K Gaussian models are sorted by the size of the quotient of weight divided by standard deviation, and the first B Gaussian models are selected as background distributions, where the value of B is given by:
B = argmin_b ( Σ_{i=1..b} w_{i,t} > T )
Each pixel of a new image frame is checked against the sorted K Gaussian models in turn; the matching condition is:
||X_t − μ_{i,t}|| ≤ β Σ_{i,t}^{1/2}
If the pixel matches one of the first B Gaussian models, the pixel is judged to be background; if it matches none of the B Gaussian models, the pixel belongs to the foreground.
For each Gaussian model, if the above condition does not hold, the weight of that Gaussian model is reduced; if it holds, the Gaussian model is updated. The concrete operations are given by the following formulas:
w_{i,t} = (1 − λ) w_{i,t−1} + λ BM_t
μ_{i,t} = (1 − α) μ_{i,t−1} + α X_{i,t}
Σ_{i,t} = (1 − α) Σ_{i,t−1} + α (X_{i,t} − μ_{i,t})(X_{i,t} − μ_{i,t})^T
α = λ / w_{i,t}
where BM_t = 0 if the pixel is foreground and BM_t = 1 otherwise. Finally, the Gaussian model with the smallest weight is replaced with a freshly initialized Gaussian model. The threshold T, the learning rate λ, and the parameter β are all constants specified in advance.
In another embodiment, in the step of inputting the test set into the trained model to test the precision of the model, the test set comes from the UR Fall Detection Dataset.
A computer device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, the processor implementing the steps of any of the above methods when executing the program.
A computer-readable storage medium on which a computer program is stored, the program implementing the steps of any of the above methods when executed by a processor.
A processor for running a program, wherein the program, when running, executes any of the above methods.
Description of the drawings
Fig. 1 is a schematic flow chart of a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 2 is a schematic diagram of the residual learning building block in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 3 is a schematic diagram of the loss function curve in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 4 is a flow chart of the model test in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 5 is a schematic diagram of the effect of the background subtraction method in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 6 is a schematic diagram of the effect of the Gaussian mixture background model in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 7 is a schematic diagram of the effect of the improved foreground detection method in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 8 is the RGB image of the foreground extraction in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 9 shows the convolution kernel visualization in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 10 shows the feature maps of the first layer in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Fig. 11 shows the feature maps of the second layer in a fall detection method based on convolutional neural networks provided by an embodiment of the present application.
Specific embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described here merely illustrate the present invention and are not intended to limit it.
Referring to Fig. 1, a fall detection method based on convolutional neural networks comprises:
training a convolutional neural network, which specifically includes:
pre-processing each acquired frame image, the pre-processing consisting, in order, of foreground extraction, normalization, and whitening;
first pre-training a ResNet network on the ImageNet data set to obtain a pre-trained model;
putting the pictures produced by the pre-processing step into the pre-trained model for model training, to obtain the model parameters; and
inputting a test set into the trained model to test the precision of the model;
detecting pictures using the trained convolutional neural network;
wherein the foreground extraction method specifically includes:
processing the image using background subtraction;
processing the image using a mixture-of-Gaussians model; and
summing the result of processing the image with background subtraction and the result of processing the image with the mixture-of-Gaussians model.
A concrete application scenario of the invention is given below:
1. Image preprocessing
In this specific embodiment, the person foreground is extracted with an improved foreground detection method. Current foreground extraction methods mainly include the frame-difference method, background subtraction, optical flow, and the mixture-of-Gaussians model.
1) Background subtraction
Let I_t and B_t be the current image frame and the background frame respectively, and let T be the grey threshold for foreground detection. The steps of the background subtraction algorithm are as follows:
1) take the average of the first several frames as the initial background image B_t;
2) subtract the grey level of the current image frame from that of the background image and take the absolute value N_t(x, y), with the formula
N_t(x, y) = |I_t(x, y) − B_t(x, y)|;
3) for a pixel (x, y) of the current frame, if |I_t(x, y) − B_t(x, y)| ≥ T, the pixel is a foreground point, i.e., it is marked in the binary foreground mask;
4) update the background image with the current frame image.
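The four steps above can be sketched in Python with NumPy; the initialisation length (5 frames) and the exponential background-update rate `alpha` are assumptions, since the patent only says the background is updated with the current frame:

```python
import numpy as np

def background_subtraction(frames, T=30, alpha=0.05):
    """Sketch of the background-subtraction steps above.

    frames: list of 2-D grey images; T: grey threshold;
    alpha: assumed background update rate (the patent leaves the
    update rule unspecified beyond "update with the current frame").
    Returns the binary foreground mask D_t for each frame.
    """
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    n_init = min(5, len(frames))                 # frames averaged for B_t
    B = np.mean(frames[:n_init], axis=0)         # 1) initial background
    masks = []
    for I in frames:
        N = np.abs(I - B)                        # 2) N_t = |I_t - B_t|
        D = (N >= T).astype(np.uint8)            # 3) threshold -> mask
        B = (1 - alpha) * B + alpha * I          # 4) update background
        masks.append(D)
    return masks
```

In a real deployment the first few frames should contain only background, otherwise the moving person is baked into B_t; that is the "ghost" phenomenon the patent later attributes to this method.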
2) Mixture-of-Gaussians model
The mixture-of-Gaussians model is an adaptive Gaussian background extraction method based on background modelling, proposed by Stauffer et al. When a Gaussian mixture model is used to build the background model, the value of each pixel of the image sequence can be simulated by K Gaussian models, so at time t the probability density of a pixel value X_t can be expressed as:
P(X_t) = Σ_{i=1..K} w_{i,t} · η(X_t, μ_{i,t}, Σ_{i,t})
where w_{i,t} denotes the weight of the i-th Gaussian model, and the probability density function of a single Gaussian model is:
η(X_t, μ, Σ) = (2π)^{−n/2} |Σ|^{−1/2} exp(−(1/2)(X_t − μ)^T Σ^{−1} (X_t − μ))
Then the K Gaussian models are sorted by the size of the quotient of weight divided by standard deviation, and the first B Gaussian models are selected as background distributions, where the value of B is given by:
B = argmin_b ( Σ_{i=1..b} w_{i,t} > T )
Each pixel of a new image frame is checked against the sorted K Gaussian models in turn; the matching condition is:
||X_t − μ_{i,t}|| ≤ β Σ_{i,t}^{1/2}
If the pixel matches one of the first B Gaussian models, the pixel is judged to be background; if it matches none of the B Gaussian models, the pixel belongs to the foreground.
For each Gaussian model, if the above condition does not hold, the weight of that Gaussian model is reduced; if it holds, the Gaussian model is updated. The concrete operations are given by the following formulas:
w_{i,t} = (1 − λ) w_{i,t−1} + λ BM_t
μ_{i,t} = (1 − α) μ_{i,t−1} + α X_{i,t}
Σ_{i,t} = (1 − α) Σ_{i,t−1} + α (X_{i,t} − μ_{i,t})(X_{i,t} − μ_{i,t})^T
α = λ / w_{i,t}
where BM_t = 0 if the pixel is foreground and BM_t = 1 otherwise. Finally, the Gaussian model with the smallest weight is replaced with a freshly initialized Gaussian model. The threshold T, the learning rate λ, and the parameter β are all constants specified in advance.
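A minimal per-pixel sketch of these update rules, assuming scalar grey values (so the covariance Σ reduces to a scalar variance) and illustrative values for K, λ, β and the initial variance:

```python
import numpy as np

class PixelMOG:
    """Per-pixel mixture of K Gaussians implementing the rules above.
    K, lam (λ), beta (β), T_B and init_var are assumed values."""

    def __init__(self, K=3, lam=0.05, beta=2.5, init_var=36.0):
        self.K, self.lam, self.beta = K, lam, beta
        self.init_var = init_var
        self.w = np.full(K, 1.0 / K)        # weights w_{i,t}
        self.mu = np.linspace(0, 255, K)    # means mu_{i,t}
        self.var = np.full(K, init_var)     # variances (scalar Sigma)

    def update(self, x, T_B=0.7):
        """Return True if grey value x is background; update the model."""
        order = np.argsort(-self.w / np.sqrt(self.var))   # sort by w/sigma
        cum = np.cumsum(self.w[order])
        B = np.searchsorted(cum, T_B) + 1                 # first B models
        match = None
        for i in order:                      # matching condition
            if abs(x - self.mu[i]) <= self.beta * np.sqrt(self.var[i]):
                match = i
                break
        is_bg = match is not None and match in order[:B]
        BM = np.zeros(self.K)
        if match is None:                    # replace weakest model
            k = np.argmin(self.w)
            self.mu[k], self.var[k] = x, self.init_var
        else:
            BM[match] = 1.0                  # BM = 1 for background match
            a = self.lam / max(self.w[match], 1e-6)       # alpha = lam/w
            self.mu[match] += a * (x - self.mu[match])
            self.var[match] = (1 - a) * self.var[match] \
                + a * (x - self.mu[match]) ** 2
        self.w = (1 - self.lam) * self.w + self.lam * BM  # weight update
        self.w /= self.w.sum()               # keep sum(w) = 1
        return bool(is_bg)
```

Feeding a constant grey value quickly promotes one Gaussian to a high-weight background model, after which a very different value is flagged as foreground.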
3) Improved foreground detection method
Although the background subtraction method is simple and its computation is small, it produces the "ghost" phenomenon. The mixture-of-Gaussians model, for its part, models not only the background but also the foreground, and is therefore very sensitive to sudden changes in global brightness. To overcome the limitations of using a single method, the present invention proposes an improved foreground detection method in which the outputs of the two methods are combined with an AND operation, which solves both the "ghost" problem and the sensitivity to light. For a pixel (x, y) of the picture matrix, let D(x, y) be the output of background subtraction and G(x, y) the output of the Gaussian mixture method; the output of the improved foreground detection method is then
R(x, y) = D(x, y) ∧ G(x, y)
With a further binarization of the picture, a clear binary image of the target person is obtained.
After foreground extraction, opening and closing operations and largest-connected-component selection are applied to the image to roughly determine the position of the person foreground, and the corresponding region is cropped to obtain the RGB image of the person foreground.
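A sketch of the combination step, assuming D and G are the binary masks defined above; the opening/closing and largest-connected-component steps are omitted here for brevity, and the crop is a simple bounding box of the combined mask:

```python
import numpy as np

def combine_foregrounds(D, G, rgb):
    """Improved foreground detection: R(x,y) = D(x,y) AND G(x,y),
    then crop the bounding box of the foreground from the RGB frame.
    (The patent additionally applies morphological opening/closing and
    keeps only the largest connected component before cropping.)"""
    R = np.logical_and(D > 0, G > 0).astype(np.uint8)   # R = D AND G
    ys, xs = np.nonzero(R)
    if ys.size == 0:
        return R, None                                   # no foreground
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return R, rgb[y0:y1, x0:x1]                          # person crop
```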
2. Model selection
The 2015 champion model ResNet is used as the network model of the invention. The ResNet network effectively solves the problem that, as network depth increases, the accuracy of the algorithm saturates and then degrades rapidly. At the same time, its parameter count is lower than that of VGGNet, and the effect is highly significant: while greatly improving the accuracy of the model, the training speed is also much improved. This is mainly attributed to the residual-module construction it adopts, as shown in Fig. 2.
The following table compares ResNet-34 and VGG-16 on ImageNet 2012:
| Network | ResNet-34 | VGG-16 |
| Computation | 3.6×10⁹ FLOPs | 15.3×10⁹ FLOPs |
| Top-1 precision | 0.733 | 0.715 |
The network architecture of ResNet is as shown in the table:
Pre-training the model
Before model training is formally carried out with the pre-processed images, the parameters of the convolutional neural network are first pre-trained with ImageNet. Professor Bengio et al. point out that with random initialization a model very often falls into a local minimum, whereas pre-training makes the model perform better. In actual operation, the role of the pre-trained model is to initialize the network's parameters with the model parameters trained on ImageNet. However, because ImageNet is divided into 1000 classes and the present invention only needs to divide images into two classes, the fully connected layer must be modified: the value of num_output is changed from the original 1000 to 2, and the name of the fully connected layer is changed.
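As a sketch of the modification described above, the final classifier layer in train_val.prototxt might look as follows; the layer name `fc_fall` and the bottom blob name `pool5` are assumptions, not taken from the patent:

```protobuf
# Hypothetical final classifier layer: num_output changed from 1000 to 2,
# and the layer renamed so Caffe re-initializes it instead of copying the
# 1000-way ImageNet weights from the pre-trained snapshot.
layer {
  name: "fc_fall"          # new name => weights not copied from snapshot
  type: "InnerProduct"
  bottom: "pool5"          # assumed name of the last pooling blob
  top: "fc_fall"
  inner_product_param {
    num_output: 2          # fall / not-fall
  }
}
```

Renaming the layer matters because Caffe copies pre-trained weights by layer name: a layer whose name is not found in the snapshot is initialized fresh.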
Model training
The training of the network is completed on the Caffe platform. Caffe is built on one assumption about neural networks: all computation is expressed in the form of layers, and the role of a layer is to compute output results from its input data. For convolution, if the input is an image, the layer performs the convolution operation with this layer's parameters and outputs the result of the convolution. Each layer implements two operations: 1) the forward pass, from input data to output data; 2) the backward pass, computing the gradient with respect to the input from the gradient values above. Once each layer implements these two functions, many layers can be connected into a network whose function is to compute the desired output from input data (images, speech, or other message forms). During training, the model's loss function and gradients are computed from the output and the known label results, and the network parameters are then further updated according to the gradient values.
Before the model is trained, the images and their corresponding labels must first be packaged into a database, i.e., the image set is encapsulated in a database format: LMDB or LevelDB. Concretely, the convert_imageset command is called to convert the data format. The ResNet-34 model is built by defining train_val.prototxt, which defines the specific network structure of ResNet. The file is organized as structure-like blocks, and each layer block includes many parameters. For example, the bottom parameter indicates the layer's input, the top parameter indicates the result output to the next layer, and the param entries give this layer's parameters, including num_output for the number of filters, kernel_size for the size of the filters, and stride for the step size. The training schedule of ResNet is written in solver.prototxt; every line of this file specifies a training parameter. Among the common training parameters, the net parameter specifies the model to use, the max_iter parameter sets the maximum number of iterations, and the snapshot_prefix parameter gives the prefix of the name under which the model is saved.
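A hypothetical solver.prototxt illustrating the parameters named above; all values here are illustrative assumptions, not the patent's:

```protobuf
# Illustrative solver.prototxt for the fall-detection fine-tuning run.
net: "train_val.prototxt"         # model definition to train
base_lr: 0.001                    # initial learning rate
lr_policy: "step"
stepsize: 2000
gamma: 0.1
max_iter: 10000                   # maximum number of iterations
test_iter: 100
test_interval: 100                # matches "precision every 100 iterations"
snapshot: 2000
snapshot_prefix: "resnet34_fall"  # prefix of the saved model files
solver_mode: GPU
```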
The loss function curve can also be drawn during model training on the Matlab platform: a loss value is plotted every iteration, and a precision value every 100 iterations, as shown in Fig. 3.
Model testing
The trained model is used to classify single pictures and output the classification result; this step is likewise implemented on Matlab. The detailed process is shown in Fig. 4.
To obtain the forward-propagation result, it suffices to call Matlab's net.forward function, which yields the probability of the image under each label. In each convolutional layer the data exist in three dimensions and can be regarded as a stack of many two-dimensional pictures, each of which is called a feature map. If the input layer is a greyscale picture, there is only one feature map; if the input layer is a colour image, there are generally 3 feature maps (red, green, blue). Between layers there are many convolution kernels; convolving each feature map of the previous layer with a convolution kernel produces a feature map of the next layer.
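The feature-map computation described above can be sketched as a minimal 2-D "valid" convolution in Python (as in most deep-learning frameworks, the kernel is not flipped, so strictly this is cross-correlation):

```python
import numpy as np

def conv2d_valid(fmap, kernel):
    """Slide a kernel over one input feature map and produce one output
    feature map. Real convolutional layers sum this over all input
    feature maps and add a bias; stride is fixed at 1 here."""
    kh, kw = kernel.shape
    H, W = fmap.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # elementwise product of the window with the kernel, summed
            out[i, j] = np.sum(fmap[i:i+kh, j:j+kw] * kernel)
    return out
```

A 3×3 kernel applied to a 4×4 feature map therefore produces a 2×2 output feature map, which is why spatial size shrinks layer by layer unless padding is used.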
1. The person is first extracted with the improved foreground-detection method. The effect of extracting the foreground with background subtraction and with the mixture-of-Gaussians model is shown in Figure 5 and Figure 6, respectively.
With the improved foreground-detection method, i.e. taking the logical AND of the results of background subtraction and the mixture-of-Gaussians model, the effect is as shown in Figure 7.
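The "improved" foreground detection described here is simply the element-wise logical AND of the two binary masks, keeping only pixels that both methods agree are foreground; a toy example:

```python
import numpy as np

# Two hypothetical binary foreground masks for the same frame.
bg_sub_mask = np.array([[1, 1, 0],
                        [0, 1, 0],
                        [0, 1, 1]], dtype=bool)   # from background subtraction
gmm_mask = np.array([[1, 0, 0],
                     [0, 1, 1],
                     [0, 1, 0]], dtype=bool)      # from the Gaussian mixture model

combined = bg_sub_mask & gmm_mask  # logical AND: both methods must agree
print(combined.sum())              # 3
```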
Then, simple binarization and extraction of the largest connected component are carried out to determine the extent of the person in the foreground, and the corresponding RGB region of the person is cropped from the original image.
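The binarization, largest-connected-component extraction and cropping just described can be sketched as follows (pure NumPy with a BFS labelling for self-containedness; a real pipeline would more likely use OpenCV or SciPy):

```python
import numpy as np
from collections import deque

def largest_component_bbox(mask):
    """Bounding box (r0, r1, c0, c1) of the largest 4-connected
    component of a binary mask, found by BFS labelling."""
    visited = np.zeros_like(mask, dtype=bool)
    best, best_size = None, 0
    h, w = mask.shape
    for sr in range(h):
        for sc in range(w):
            if mask[sr, sc] and not visited[sr, sc]:
                q = deque([(sr, sc)])
                visited[sr, sc] = True
                pix = []
                while q:
                    r, c = q.popleft()
                    pix.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < h and 0 <= nc < w and mask[nr, nc] and not visited[nr, nc]:
                            visited[nr, nc] = True
                            q.append((nr, nc))
                if len(pix) > best_size:
                    best_size, best = len(pix), pix
    rows = [p[0] for p in best]
    cols = [p[1] for p in best]
    return min(rows), max(rows), min(cols), max(cols)

# Binarize a foreground map, then crop the person region from the RGB frame.
fg = np.zeros((10, 10))
fg[2:7, 3:6] = 1.0          # the person blob
fg[9, 9] = 1.0              # an isolated noise pixel
mask = fg > 0.5             # simple binarization
r0, r1, c0, c1 = largest_component_bbox(mask)
frame = np.random.rand(10, 10, 3)
person = frame[r0:r1 + 1, c0:c1 + 1]
print(person.shape)         # (5, 3, 3)
```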
2. Model training and performance testing are carried out on the public data set UR Fall Detection Dataset, in which the training set contains 7381 pictures and the test set contains 1326 pictures. Foreground extraction is first applied to every picture in the data set; the resulting pictures, shown in Figure 8, are then put into the network for model training.
3. The ResNet network used by the present invention is first pre-trained on the ImageNet data set to obtain a pre-trained model. The training set, after pre-processing (foreground detection, whitening, normalization), is input into the network for training.
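The normalization and whitening steps of the pre-processing can be sketched in Python; the exact operations are not specified in the text, so the min-max scaling and per-image whitening below are assumptions:

```python
import numpy as np

def normalize(img):
    """Min-max normalization: scale pixel values to [0, 1]."""
    img = img.astype(np.float64)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def whiten(img):
    """Per-image whitening: subtract the mean and divide by the
    standard deviation, giving zero mean and unit variance."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)

frame = np.random.randint(0, 256, (64, 64, 3)).astype(np.uint8)
x = whiten(normalize(frame))
print(abs(round(x.mean(), 6)), round(x.std(), 6))  # 0.0 1.0
```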
4. Accuracy testing is then carried out on the trained model with the test set: a loss value is recorded at every iteration and an accuracy value every 100 iterations; at convergence, the final accuracy is 96.7%.
5. The output for an individual picture is tested on the Matlab platform, and visualization of the convolution kernels and feature maps is implemented. The visualization of the convolution kernels is shown in Figure 9.
Feature-map visualization is implemented by calling the function feature_map on Matlab, as shown in Figure 10 and Figure 11.
From Figures 10 and 11 it can be seen that the bottom-level convolution kernels mainly extract basic features such as the contour of the person.
The training set and test set of the invention both come from the UR Fall Detection Dataset. Model training and accuracy testing are carried out on the Caffe platform, accelerated with cuDNN; the final accuracy reaches 96.7%, with a processing time of 49 ms.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
The embodiments described above express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the inventive concept, and these all fall within the protection scope of the invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A fall detection method based on convolutional neural networks, characterized by comprising:
training the convolutional neural networks, wherein the training specifically includes:
pre-processing each acquired frame image, the pre-processing comprising, in order, foreground extraction, normalization and a whitening operation;
first pre-training a ResNet network on the ImageNet data set to obtain a pre-trained model;
putting the pictures processed by the step "pre-processing each acquired frame image, the pre-processing comprising, in order, foreground extraction, normalization and a whitening operation" into the pre-trained model for model training, to obtain the parameters of the model; and
inputting a test set into the trained model, and testing the precision of the model with the test set;
detecting pictures using the trained convolutional neural networks;
wherein the method of foreground extraction specifically includes:
processing the image using background subtraction;
processing the image using a mixture-of-Gaussians model; and
taking the logical AND of the result of processing the image with background subtraction and the result of processing the image with the mixture-of-Gaussians model.
2. The fall detection method based on convolutional neural networks according to claim 1, characterized in that, in the step "pre-processing each acquired frame image, the pre-processing comprising, in order, foreground extraction, normalization and a whitening operation", each acquired frame image is obtained by reading a video file.
3. The fall detection method based on convolutional neural networks according to claim 1, characterized in that, after the step "detecting pictures using the trained convolutional neural networks", training the convolutional neural networks further specifically includes: displaying a detection-effect figure for each frame picture, and implementing visualization of the convolution kernels of the model.
4. The fall detection method based on convolutional neural networks according to claim 3, characterized in that, in the step "displaying a detection-effect figure for each frame picture, and implementing visualization of the convolution kernels of the model", the detection-effect figure of each frame picture is displayed, and the convolution-kernel visualization of the model is implemented, on the Matlab platform.
5. The fall detection method based on convolutional neural networks according to claim 1, characterized in that the step "processing the image using background subtraction" specifically includes:
taking the average of the first several frame images as the initial background image B_t;
subtracting the gray level of the background image from that of the current image frame and taking the absolute value as N_t(x, y), the formula being
N_t(x, y) = |I_t(x, y) − B_t(x, y)|
for each pixel (x, y) of the current frame, if |I_t(x, y) − B_t(x, y)| ≥ T, the pixel is a foreground point, and the current image frame is updated accordingly; and
updating the background image with the current frame image.
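A minimal Python sketch of the background-subtraction step of claim 5 (the threshold T and the exponential background-update rate alpha are illustrative assumptions; the claim only states that the background is updated with the current frame):

```python
import numpy as np

T = 30  # foreground threshold, a constant specified in advance (illustrative value)

def background_subtraction(frames, alpha=0.05):
    """Average the first frames as the initial background B_t, mark
    pixels with |I_t - B_t| >= T as foreground, then update the
    background with the current frame (running average, assumed rule)."""
    frames = [f.astype(np.float64) for f in frames]
    bg = np.mean(frames[:3], axis=0)           # initial background image B_t
    masks = []
    for frame in frames[3:]:
        diff = np.abs(frame - bg)              # N_t(x, y) = |I_t(x, y) - B_t(x, y)|
        masks.append(diff >= T)                # foreground points
        bg = (1 - alpha) * bg + alpha * frame  # update background with current frame
    return masks

still = np.full((4, 4), 100.0)
moving = still.copy()
moving[1:3, 1:3] = 200.0                       # a bright object enters the scene
masks = background_subtraction([still, still, still, moving])
print(masks[0].sum())                          # 4
```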
6. The fall detection method based on convolutional neural networks according to claim 1, characterized in that the step "processing the image using a mixture-of-Gaussians model" specifically includes:
when building the background model with the Gaussian mixture model, modelling the value of each pixel of the image sequence with K Gaussians, so that at time t the probability density of a pixel value X_t is expressed as:
P(X_t) = Σ_{i=1}^{K} w_{i,t} · η(X_t, μ_{i,t}, Σ_{i,t})
where w_{i,t} denotes the weight of the i-th Gaussian model, and the probability density function of a Gaussian model is expressed as:
η(X_t, μ, Σ) = (1 / ((2π)^{n/2} |Σ|^{1/2})) · exp(−(1/2)(X_t − μ)^T Σ^{−1} (X_t − μ));
then sorting the K Gaussian models in descending order of the quotient of weight divided by standard deviation, and selecting the first B Gaussian models for the decision, the value of B being expressed as:
B = argmin_b (Σ_{i=1}^{b} w_{i,t} > Th);
putting each pixel of the new image frame into the sorted K Gaussian models in turn and judging it, the judgment condition being:
||X_t − μ_t|| ≤ β Σ^{1/2}
if the condition is satisfied in any one of the first B Gaussian models, the pixel is judged to be background; if the condition is not satisfied in any of the B Gaussian models, the pixel belongs to the foreground;
for each Gaussian model, if the above condition does not hold, the weight of that Gaussian model is reduced; if the condition holds, the Gaussian mixture model is updated, the specific operations being given by the following formulas:
w_{i,t} = (1 − λ) w_{i,t−1} + λ BM_t
μ_{i,t} = (1 − α) μ_{i,t−1} + α X_{i,t}
Σ_{i,t} = (1 − α) Σ_{i,t−1} + α (X_{i,t} − μ_{i,t})(X_{i,t} − μ_{i,t})^T
α = λ / w_{i,t}
where BM = 0 if the pixel is foreground and BM = 1 otherwise; finally, the Gaussian model with the smallest weight is replaced by a newly initialized Gaussian model; the threshold T, learning rate λ, and parameter β are all constants specified in advance.
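The per-pixel mixture-of-Gaussians decision of claim 6 can be sketched for a single scalar pixel value as follows (K, lambda, beta and Th are illustrative constants, and the initialization values for the replaced weakest Gaussian are likewise assumptions):

```python
import numpy as np

K = 3          # Gaussians per pixel
LAMBDA = 0.05  # learning rate lambda
BETA = 2.5     # match threshold beta, in standard deviations
TH = 0.7       # cumulative-weight fraction used to pick the first B models

def classify_and_update(x, w, mu, sigma):
    """Judge one scalar pixel value x against K Gaussians, per claim 6.
    Returns True if x is foreground; updates (w, mu, sigma) in place."""
    order = np.argsort(-(w / sigma))                    # sort by w / sigma, descending
    B = 1 + np.searchsorted(np.cumsum(w[order]), TH)    # first B background models
    matched = None
    for i in order[:B]:
        if abs(x - mu[i]) <= BETA * sigma[i]:           # ||X_t - mu|| <= beta * sigma
            matched = i
            break
    fg = matched is None
    bm = np.zeros(K)
    if not fg:
        bm[matched] = 1.0
    w[:] = (1 - LAMBDA) * w + LAMBDA * bm               # weight update (decay if no match)
    if not fg:
        alpha = LAMBDA / w[matched]
        mu[matched] = (1 - alpha) * mu[matched] + alpha * x
        sigma2 = (1 - alpha) * sigma[matched] ** 2 + alpha * (x - mu[matched]) ** 2
        sigma[matched] = np.sqrt(sigma2)
    else:
        j = np.argmin(w)                                # replace the weakest Gaussian
        mu[j], sigma[j], w[j] = x, 10.0, 0.05           # assumed init values
    w /= w.sum()
    return fg

w = np.array([0.6, 0.3, 0.1])
mu = np.array([100.0, 150.0, 200.0])
sigma = np.array([5.0, 5.0, 5.0])
r1 = classify_and_update(102.0, w, mu, sigma)  # close to the first Gaussian
r2 = classify_and_update(400.0, w, mu, sigma)  # matches none of the first B
print(r1, r2)                                  # False True
```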
7. The fall detection method based on convolutional neural networks according to claim 1, characterized in that, in the step "inputting a test set into the trained model, and testing the precision of the model with the test set", the test set comes from the UR Fall Detection Dataset.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the steps of the method of any one of claims 1 to 7.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the method of any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810614024.4A CN108961675A (en) | 2018-06-14 | 2018-06-14 | Fall detection method based on convolutional neural networks |
PCT/CN2018/107975 WO2019237567A1 (en) | 2018-06-14 | 2018-09-27 | Convolutional neural network based tumble detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810614024.4A CN108961675A (en) | 2018-06-14 | 2018-06-14 | Fall detection method based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108961675A true CN108961675A (en) | 2018-12-07 |
Family
ID=64488772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810614024.4A Pending CN108961675A (en) | 2018-06-14 | 2018-06-14 | Fall detection method based on convolutional neural networks |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108961675A (en) |
WO (1) | WO2019237567A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109871788A (en) * | 2019-01-30 | 2019-06-11 | 云南电网有限责任公司电力科学研究院 | A kind of transmission of electricity corridor natural calamity image recognition method |
CN112489368A (en) * | 2020-11-30 | 2021-03-12 | 安徽国广数字科技有限公司 | Intelligent falling identification and detection alarm method and system |
CN113269105A (en) * | 2021-05-28 | 2021-08-17 | 西安交通大学 | Real-time faint detection method, device, equipment and medium in elevator scene |
CN113435306A (en) * | 2021-06-24 | 2021-09-24 | 三峡大学 | Fall detection method and device based on hybrid cascade convolution |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111209848B (en) * | 2020-01-03 | 2023-07-21 | 北京工业大学 | Real-time falling detection method based on deep learning |
CN111353394B (en) * | 2020-02-20 | 2023-05-23 | 中山大学 | Video behavior recognition method based on three-dimensional alternate update network |
CN111523492B (en) * | 2020-04-26 | 2023-04-18 | 安徽皖仪科技股份有限公司 | Detection method of black smoke vehicle |
CN111598042B (en) * | 2020-05-25 | 2023-04-07 | 西安科技大学 | Visual statistical method for underground drill rod counting |
CN111680614B (en) * | 2020-06-03 | 2023-04-14 | 安徽大学 | Abnormal behavior detection method based on video monitoring |
CN111782857B (en) * | 2020-07-22 | 2023-11-03 | 安徽大学 | Footprint image retrieval method based on mixed attention-dense network |
CN112541403B (en) * | 2020-11-20 | 2023-09-22 | 中科芯集成电路有限公司 | Indoor personnel falling detection method by utilizing infrared camera |
CN112528775A (en) * | 2020-11-28 | 2021-03-19 | 西北工业大学 | Underwater target classification method |
CN113947612B (en) * | 2021-09-28 | 2024-03-29 | 西安电子科技大学广州研究院 | Video anomaly detection method based on foreground and background separation |
CN114049585B (en) * | 2021-10-12 | 2024-04-02 | 北京控制与电子技术研究所 | Mobile phone operation detection method based on motion prospect extraction |
CN116469132B (en) * | 2023-06-20 | 2023-09-05 | 济南瑞泉电子有限公司 | Fall detection method, system, equipment and medium based on double-flow feature extraction |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104134068A (en) * | 2014-08-12 | 2014-11-05 | 江苏理工学院 | Monitored vehicle characteristic representation and classification method based on sparse coding |
CN107220604A (en) * | 2017-05-18 | 2017-09-29 | 清华大学深圳研究生院 | A kind of fall detection method based on video |
CN108090458A (en) * | 2017-12-29 | 2018-05-29 | 南京阿凡达机器人科技有限公司 | Tumble detection method for human body and device |
CN108124119A (en) * | 2016-11-28 | 2018-06-05 | 天津市军联科技有限公司 | Intelligent video monitoring system based on built-in Linux |
CN108154113A (en) * | 2017-12-22 | 2018-06-12 | 重庆邮电大学 | Tumble event detecting method based on full convolutional network temperature figure |
- 2018-06-14 CN CN201810614024.4A patent/CN108961675A/en active Pending
- 2018-09-27 WO PCT/CN2018/107975 patent/WO2019237567A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104134068A (en) * | 2014-08-12 | 2014-11-05 | 江苏理工学院 | Monitored vehicle characteristic representation and classification method based on sparse coding |
CN108124119A (en) * | 2016-11-28 | 2018-06-05 | 天津市军联科技有限公司 | Intelligent video monitoring system based on built-in Linux |
CN107220604A (en) * | 2017-05-18 | 2017-09-29 | 清华大学深圳研究生院 | A kind of fall detection method based on video |
CN108154113A (en) * | 2017-12-22 | 2018-06-12 | 重庆邮电大学 | Tumble event detecting method based on full convolutional network temperature figure |
CN108090458A (en) * | 2017-12-29 | 2018-05-29 | 南京阿凡达机器人科技有限公司 | Tumble detection method for human body and device |
Non-Patent Citations (1)
Title |
---|
Lei Bangjun et al., "Video Object Tracking System Explained Step by Step", 31 December 2015 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109871788A (en) * | 2019-01-30 | 2019-06-11 | 云南电网有限责任公司电力科学研究院 | A kind of transmission of electricity corridor natural calamity image recognition method |
CN112489368A (en) * | 2020-11-30 | 2021-03-12 | 安徽国广数字科技有限公司 | Intelligent falling identification and detection alarm method and system |
CN113269105A (en) * | 2021-05-28 | 2021-08-17 | 西安交通大学 | Real-time faint detection method, device, equipment and medium in elevator scene |
CN113435306A (en) * | 2021-06-24 | 2021-09-24 | 三峡大学 | Fall detection method and device based on hybrid cascade convolution |
CN113435306B (en) * | 2021-06-24 | 2022-07-19 | 三峡大学 | Fall detection method and device based on hybrid cascade convolution |
Also Published As
Publication number | Publication date |
---|---|
WO2019237567A1 (en) | 2019-12-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108961675A (en) | Fall detection method based on convolutional neural networks | |
CN111126472B (en) | SSD (solid State disk) -based improved target detection method | |
CN108648191B (en) | Pest image recognition method based on Bayesian width residual error neural network | |
CN111444828B (en) | Model training method, target detection method, device and storage medium | |
CN103810490B (en) | A kind of method and apparatus for the attribute for determining facial image | |
CN106504064A (en) | Clothes classification based on depth convolutional neural networks recommends method and system with collocation | |
CN111754396B (en) | Face image processing method, device, computer equipment and storage medium | |
CN109299716A (en) | Training method, image partition method, device, equipment and the medium of neural network | |
CN108229330A (en) | Face fusion recognition methods and device, electronic equipment and storage medium | |
CN109214298B (en) | Asian female color value scoring model method based on deep convolutional network | |
CN109598234A (en) | Critical point detection method and apparatus | |
CN106778852A (en) | A kind of picture material recognition methods for correcting erroneous judgement | |
Qu et al. | A pedestrian detection method based on yolov3 model and image enhanced by retinex | |
CN108647625A (en) | A kind of expression recognition method and device | |
CN108363997A (en) | It is a kind of in video to the method for real time tracking of particular person | |
CN112464865A (en) | Facial expression recognition method based on pixel and geometric mixed features | |
CN109711283A (en) | A kind of joint doubledictionary and error matrix block Expression Recognition algorithm | |
CN107194937A (en) | Tongue image partition method under a kind of open environment | |
CN109359515A (en) | A kind of method and device that the attributive character for target object is identified | |
CN104298974A (en) | Human body behavior recognition method based on depth video sequence | |
CN109886153A (en) | A kind of real-time face detection method based on depth convolutional neural networks | |
CN109753864A (en) | A kind of face identification method based on caffe deep learning frame | |
CN109902613A (en) | A kind of human body feature extraction method based on transfer learning and image enhancement | |
CN110008853A (en) | Pedestrian detection network and model training method, detection method, medium, equipment | |
CN107818299A (en) | Face recognition algorithms based on fusion HOG features and depth belief network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181207