CN108805002A - Surveillance video anomalous event detection method based on deep learning and dynamic clustering - Google Patents
Surveillance video anomalous event detection method based on deep learning and dynamic clustering
- Publication number: CN108805002A
- Application number: CN201810320572.6A
- Authority: CN (China)
- Prior art keywords: vector, image, video, sampling, block
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V20/47—Detecting features for summarising video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
Abstract
The present invention relates to a surveillance video anomalous event detection method based on deep learning and dynamic clustering. In the feature extraction stage, the deep learning network PCANet learns the corresponding network filters from training video, converting low-level pixel optical-flow features into high-level semantic motion features through the deep network; at the same time, by screening the moving regions in the video, spatio-temporal sampling blocks that contain only background information are discarded. In the feature modeling stage, the feature vector space is modeled with a nonparametric model based on two-layer clustering, and a method of merging vectors toward each other is used in the vector merging stage. Finally, the vectors in the dictionary set are clustered into a series of event clusters with the K-means algorithm, and anomalous events are judged according to the Euclidean distance between a test vector and the event cluster center vectors. The present invention effectively avoids the feature vector drift caused by additive updates and improves the anomalous event detection rate.
Description
Technical field
The present invention relates to a surveillance video anomalous event detection method, and more particularly to a surveillance video anomalous event detection method based on deep learning and dynamic clustering.
Background technology
With the development of computer science and technology, technologies such as image processing, computer vision and machine learning can break through the limitations of traditional video surveillance systems, enabling intelligent video analysis, active detection of anomalous events and real-time early warning in video surveillance systems. This is of great value for video surveillance applications in the field of public safety.
Anomalous event detection in surveillance video is generally divided into four basic steps: image preprocessing, elementary event representation, construction of the anomaly detection model, and anomalous event judgment. Elementary event representation is mainly divided into representations based on low-level visual features and representations based on high-level semantic features. The usual approach based on low-level visual features is to divide the video volume into small video blocks in an overlapping, non-overlapping or spatio-temporal interest point manner, regard each video block as an elementary event, and extract low-level visual features from the block to represent it. The low-level visual features in common use include optical flow, gradient and texture. Event representation based on high-level semantic features mainly requires complex pattern processing of the data, such as spatio-temporal target trajectories and social force methods. Common anomalous event detection models include: classification-based models, nearest-neighbor-based models, clustering-based models, statistics-based models, and information-theory-based models.
Although anomalous event detection methods for surveillance video are varied, most of them model motion features with parametric models, which require many model parameters to be set manually, and the empirical parameter values usually have to be reset whenever the video scene changes. In the paper 《Online anomaly detection in videos by clustering dynamic exemplars》 [J Feng, C Zhang, P Hao], for anomalous events that are newly emerging or have a very low probability of occurrence in video, the authors propose a clustering-based nonparametric model of the feature vectors: MHOF features are first extracted from the input video stream, these features are then merged sequentially into a fixed-size dictionary set, and the merged dictionary set is clustered with the K-means algorithm; in the anomalous event judgment stage, the algorithm judges anomalies by the distance between a feature vector and the cluster codebooks.

The above algorithm performs well in detecting anomalous events, but the following problems remain:

1. The algorithm describes the motion in video with MHOF features. Although hand-crafted features such as HOF and HOG describe motion fairly well, their applicability differs across video scenes, and changing the scene often requires changing the feature at the same time, so the algorithm is poorly suited to multi-scene anomalous event detection.

2. In the merging of dictionary set vectors, the algorithm uses simple weighted addition. After a large number of vector updates, the values of the feature vectors in the dictionary set drift away from their original values, which affects the final detection.

3. Low-frequency anomalous events are detected by counting the occurrence frequency of each vector in the dictionary set and computing the frequency ratio of the corresponding codebook. However, the feature extraction stage densely samples the entire image, so when the video scene is sparse, most of the sampled feature vectors are background information. The frequency counts of the vectors representing background information in the dictionary set then become very large, the corresponding codebook frequency ratios become too high, and the frequencies of the other motion events fall below the judgment threshold, causing false detections.
Invention content
In view of the above problems, the invention discloses a surveillance video anomalous event detection method based on deep learning and dynamic clustering. The method automatically extracts deep features from video sampling blocks with PCANet, screens the sampling blocks for moving regions, and performs cluster modeling of the feature set with a two-layer clustering model based on vector merging.
The technical scheme adopted by the present invention to solve this technical problem comprises the following steps:
Step S101: Image preprocessing. Read the surveillance video stream as input, convert it to grayscale, and perform noise reduction with Gaussian filtering.

Step S102: Overlap sampling. For the input video stream, first compute the optical flow value of every pixel in every frame and replace each gray value with the pixel's optical flow value; then perform fixed-size overlap sampling on I and output a series of N × N video sampling image blocks.

Step S103: Motion region screening. For all the sampled video image blocks, first use the bimodal histogram method to obtain the division threshold that separates moving pixels from background pixels in the image, then judge each sampled image block according to this threshold, keep the blocks that contain motion events, and discard the blocks that contain only background information.

Step S104: Deep feature extraction. After obtaining the sampled image blocks that contain only motion information, input these blocks into a 3-layer PCANet for parameter training; once the deep network is trained, input the image blocks into the trained network, which outputs a corresponding deep feature for every sampled image block.

Step S105: Dynamic clustering modeling. For the set of deep feature vectors, input the feature vectors one by one into a fixed-size dictionary set; if the number of elements exceeds the upper bound, merge the two closest feature vectors so that the total stays constant. After this maintenance, cluster the dictionary set with the K-means algorithm and output the corresponding event cluster codebooks.

Step S106: After the model is constructed, input the test video, sample every frame of the test video and perform motion region judgment, then input the sampled images into the trained PCANet to obtain the corresponding deep features, and finally compare each feature vector with the event cluster codebooks; if its distance to every codebook exceeds the respective threshold, it is judged to be an anomalous event.
Beneficial effects of the present invention:

1. The present invention extracts deep features from the sampling blocks with a deep learning network. Compared with traditional hand-crafted features, deep features are more robust to the video scene, and there is no need to spend time on feature selection experiments for a particular scene to decide which feature should describe its motion.

2. In the model construction stage, when maintaining the fixed-size dictionary set, the present invention replaces simple weighted addition with a method that merges two vectors toward each other, which effectively avoids the feature vector drift caused by addition and improves the anomalous event detection rate.

3. The present invention adds a motion region screening step that discards useless background information before feature extraction, so that only sampling blocks with significant motion are processed subsequently. This not only increases the detection speed of the algorithm but also improves the anomalous event detection rate in sparse scenes.
Description of the drawings
Fig. 1 is the flow chart of anomalous event detection under surveillance video according to the present invention;
Fig. 2 is the schematic diagram of anomalous event detection under surveillance video according to the present invention;
Fig. 3 is the flow chart of overlap sampling;
Fig. 4 is the flow chart of motion region screening;
Fig. 5 is the flow chart of deep feature extraction;
Fig. 6 is the flow chart of dynamic clustering modeling;
Fig. 7 is the flow chart of anomalous event detection;
Fig. 8 is the position diagram of adjacent sampling blocks;
Fig. 9 shows the final results of the present invention.
Specific implementation mode
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings. As shown in Figs. 1-9, the specific steps are as follows:
Step S101: Image preprocessing.

Input the video stream Iin, convert Iin to grayscale and perform noise reduction with Gaussian filtering. The specific Gaussian filtering operation is as follows: convolve each pixel in a video frame with a 3 × 3 Gaussian convolution kernel, replace the value of the pixel at the center of the convolution with the weighted average gray value of the pixels in the neighborhood determined by the kernel, and output the processed video stream I.
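The preprocessing step above can be sketched as follows. This is a minimal numpy illustration of a 3 × 3 Gaussian smoothing pass, not the patent's implementation; the function names and the choice of sigma and reflection padding are assumptions for the sketch.

```python
import numpy as np

def gaussian_kernel_3x3(sigma=1.0):
    """3x3 Gaussian kernel, normalized so the weights sum to 1."""
    ax = np.array([-1.0, 0.0, 1.0])
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def denoise(gray, sigma=1.0):
    """Replace each pixel by the Gaussian-weighted mean of its 3x3
    neighbourhood (edges handled by reflection padding, an assumption)."""
    k = gaussian_kernel_3x3(sigma)
    padded = np.pad(gray.astype(np.float64), 1, mode="reflect")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    return out
```

A constant image passes through unchanged, which is a quick sanity check that the kernel is properly normalized.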
Step S102: Overlap sampling.

Input the processed video stream I. First compute the optical flow value of every pixel of every frame in I and replace each gray value with the pixel's optical flow value, then perform fixed-size overlap sampling on I and output the set Cell of video sampling image blocks of identical, fixed size. Referring to Fig. 3, the detailed process is as follows:
Step S301: Fit the previous video frame. Input the former of two adjacent image frames in I. For the former of the two adjacent video frames, the neighborhood of every pixel in the frame is approximated with a quadratic polynomial

f1(x) = x^T A1 x + b1^T x + c1

where A1 is a symmetric matrix, b1 is a vector and c1 is a scalar; their values are obtained by weighted least-squares fitting. Output the fitted polynomial f1(x) of the frame.
Step S302: Fit the next video frame. Input the latter of the two adjacent image frames in I and approximate the latter frame in the same way:

f2(x) = x^T A2 x + b2^T x + c2

The polynomial parameters are again obtained by weighted least squares; output the fitted polynomial f2(x) of the frame.
Step S303: Joint solution of the two expressions. Input the fitted polynomials f1(x) and f2(x) of the two adjacent frames. Since the two polynomials represent two adjacent, consecutive frames of the video, they are related by the motion between the frames. Let the displacement of a pixel between the two frames be d; then f2(x) = f1(x - d), which gives

A2 = A1
b2 = b1 - 2 A1 d
c2 = d^T A1 d - b1^T d + c1

Defining the displacement d as a function of x, the corresponding A and Δb are defined as

A(x) = (A1(x) + A2(x)) / 2
Δb(x) = -(1/2)(b2(x) - b1(x))

and the displacement of pixel x is obtained as

d(x) = A^-1(x) Δb(x)

Output the displacement d(x) of every pixel of the previous frame.
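The displacement solve in step S303 can be checked with a small numeric example. The sketch below assumes a pure translation between the two polynomial fits (so A2 = A1 and b2 = b1 - 2·A1·d hold exactly); the concrete matrix values are illustrative only.

```python
import numpy as np

A1 = np.array([[2.0, 0.5], [0.5, 1.5]])   # symmetric matrix from the frame-1 fit
b1 = np.array([1.0, -2.0])
d_true = np.array([0.3, -0.7])            # ground-truth displacement (illustrative)

# Under a pure translation the frame-2 fit satisfies A2 = A1, b2 = b1 - 2*A1*d.
A2 = A1.copy()
b2 = b1 - 2.0 * A1 @ d_true

A = 0.5 * (A1 + A2)            # pointwise average of the two fits
db = -0.5 * (b2 - b1)          # Δb
d = np.linalg.solve(A, db)     # d(x) = A^-1(x) Δb(x)
```

Solving the linear system recovers the displacement d exactly in this idealized case; in practice A and Δb are averaged over a neighborhood before the solve.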
Step S304: Replace pixel gray values. Input the video stream I and the displacement d(x) corresponding to every frame. After obtaining the optical flow value of every pixel of every frame in I, replace the original gray value of each pixel with its optical flow value, and output the resulting video stream Iout.
Step S305: Overlap sampling. Input the replaced video stream Iout. Starting from the first pixel of the first frame, sample repeatedly with window size N × N and overlap ratio θ, and output the set Cell of video sampling image blocks of identical, fixed size. N is the sample size in the spatial dimension and depends on the image size; normally N = 24 and θ = 0.5, i.e. with these parameters a sample is taken every 12 pixels in the spatial dimension.
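The overlap sampling of step S305 amounts to a sliding window with stride N·(1 - θ). A minimal numpy sketch, with function names chosen for illustration:

```python
import numpy as np

def overlap_sample(frame, n=24, overlap=0.5):
    """Slide an n-by-n window with stride n*(1-overlap) over one frame
    and collect the blocks (stride 12 for n=24, overlap=0.5)."""
    stride = int(n * (1.0 - overlap))
    h, w = frame.shape
    blocks = []
    for y in range(0, h - n + 1, stride):
        for x in range(0, w - n + 1, stride):
            blocks.append(frame[y:y + n, x:x + n])
    return blocks
```

For a 48 × 48 frame with the default parameters, the window starts at offsets 0, 12 and 24 in each dimension, yielding 9 blocks.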
Step S103: Motion region screening.

After step S102, this stage takes as input the set Cell of video sampling image blocks of identical, fixed size. Because the sampling is a global overlapping sampling, some blocks contain only background information and no motion information at all, so at this stage the sampling blocks are screened: blocks that contain only background information are discarded, and the set Cellout of blocks containing motion information is output. Referring to Fig. 4, the detailed process is as follows:
Step S401: Set the division threshold. Input the sampled image block set Cell. Perform bimodal histogram statistics on the optical flow vector values of all pixels in all blocks of the set: starting from 0, take one bin every δ, count the optical flow values of all pixels into the corresponding bins according to their magnitude, and obtain the statistical histogram; normally δ = 0.025.

After the counting is finished, first scan the histogram from small to large to find the position of the first crest, then scan from large to small to find the position of the second crest, and finally find the position of the trough between the two crests. The median of the bin corresponding to the trough is taken as the division threshold ξ, which is output.
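The bimodal threshold selection above can be sketched as follows. This is an illustrative numpy implementation that assumes the histogram is cleanly bimodal (one crest at each end); the peak test and tie-breaking rules are assumptions, not the patent's exact procedure.

```python
import numpy as np

def bimodal_threshold(flow_mags, delta=0.025):
    """Histogram the flow magnitudes in bins of width delta, locate the two
    crests (scanning up, then down), and return the centre of the trough bin
    between them as the division threshold."""
    nbins = int(np.ceil(flow_mags.max() / delta)) + 1
    hist, edges = np.histogram(flow_mags, bins=nbins, range=(0.0, nbins * delta))

    def is_peak(i):
        left = hist[i - 1] if i > 0 else -1
        right = hist[i + 1] if i < len(hist) - 1 else -1
        return hist[i] > left and hist[i] > right

    p1 = next(i for i in range(len(hist)) if is_peak(i))            # first crest
    p2 = next(i for i in range(len(hist) - 1, -1, -1) if is_peak(i))  # second crest
    trough = p1 + int(np.argmin(hist[p1:p2 + 1]))                   # lowest bin between them
    return 0.5 * (edges[trough] + edges[trough + 1])
```

On synthetic data with a large background mode near 0.02 and a small motion mode near 0.5, the returned threshold falls between the two modes.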
Step S402: Judge the moving region of each sampling block. Input the division threshold ξ and the sampled image block set Cell. With the division threshold available, screen each sampling block: if the optical flow magnitude of a pixel in the block is greater than ξ, the pixel is considered to represent a moving region and is defined as an active pixel. If the proportion of active pixels in the whole block exceeds P, the block is considered to represent a moving region; otherwise it is regarded as a background block and discarded. Normally P = 20%. Finally output the set Cellout of sampling blocks containing motion information.
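The per-block screening rule of step S402 reduces to a simple active-pixel ratio test. A minimal sketch under the parameters stated above (the function name is illustrative):

```python
import numpy as np

def is_motion_block(block_mags, xi, p=0.20):
    """A block is kept when more than a fraction p of its pixels are 'active',
    i.e. have optical flow magnitude above the division threshold xi."""
    active = np.count_nonzero(block_mags > xi)
    return active / block_mags.size > p
```

For a 24 × 24 block, the block is kept when more than about 115 of its 576 pixels exceed ξ.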
Step S104: Deep feature extraction.

After the processing of step S103, the remaining sampling block images all contain motion events. This stage takes as input the set Cellout of sampling blocks containing motion information. A 3-layer deep learning network PCANet is first trained with these sampled images; the trained deep network is then used to extract the deep features of the corresponding sampled images. The trained network model Net and the feature set v of the sampling block set are output. Referring to Fig. 5, the detailed process is as follows:
Step S501: First-layer network learning. Input the sampled image block set Cellout. The first layer of the deep network has L1 filters that filter the input images. For a sampled image of size N × N, first perform dense sampling with patch size k1 × k2 (normally k1 = k2 = 5) and rearrange each patch into a column vector xi; for all video sampling blocks this yields a sample vector matrix X.

Then perform principal component analysis on the matrix X, take the eigenvectors corresponding to the L1 largest eigenvalues as filters, and rearrange them into matrices of size k1 × k2. Each input image is filtered with each filter, so every input sampled image is converted into L1 filtered images; normally L1 = 4. Output the filtered images Il corresponding to the sampled images.
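One PCANet layer's filter learning, as described in step S501, can be sketched with numpy. This is an illustrative simplification: per-patch mean removal and the eigendecomposition of X·X^T follow the standard PCANet recipe, but details such as boundary handling are assumptions.

```python
import numpy as np

def learn_pca_filters(images, k=5, n_filters=4):
    """Learn one PCANet layer: densely sample k-by-k patches from every image,
    remove each patch's mean, and keep the leading principal components
    (eigenvectors of X @ X.T) as k-by-k filters."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())          # per-patch mean removal
    X = np.array(patches).T                           # (k*k) x num_patches
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)        # ascending eigenvalues
    top = eigvecs[:, -n_filters:][:, ::-1]            # largest first
    return [top[:, f].reshape(k, k) for f in range(n_filters)]
```

Because they are eigenvectors of a symmetric matrix, the learned filters are orthonormal when flattened, which is a convenient property to verify.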
Step S502: Second-layer network learning. Input the first-layer filtered images Il. The second layer of the network has L2 filters, normally L2 = 4. As in step S501, first perform dense sampling with patch size k1 × k2 on all images in the second layer and arrange the patches into a sample vector matrix X; then perform principal component analysis on this matrix, take the eigenvectors corresponding to the L2 largest eigenvalues as filters, and use them to filter the images.

Since each input optical flow image produces L1 filtered images after the first layer, an image that has passed through the first two layers of the deep network is output as L1 × L2 filtered images Ol, together with the trained deep network Net, where each Ol contains L2 filtered images.
Step S503: Deep feature output. Input the second-layer filtered images Ol. The third layer is the output layer of the network. First binarize the filtered images output by the second layer so that the result contains only ones and zeros. Each image group Ol can then be converted into an integer matrix Tl:

Tl = Σ (l2 = 1 … L2) 2^(l2-1) H(I_l2)

where H(*) is a unit step function: H(x) = 1 if x > 0, and 0 otherwise.

By this processing, every pixel is encoded into an integer in [0, 16). After obtaining the integer matrix Tl, perform histogram statistics on the matrix to obtain a 16-dimensional statistical vector.

For all L1 image groups Ol, L1 statistical vectors are obtained; cascading these statistical vectors gives the output deep feature vector of dimension 2^L2 × L1 = 64.
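The binary hashing and histogram stage of step S503 can be sketched directly from the formula for Tl. The function name and input layout (a list of L1 groups of L2 response maps) are assumptions for the sketch.

```python
import numpy as np

def pcanet_output_features(filter_maps):
    """filter_maps: list of L1 groups, each a list of L2 response maps of one
    image. Binarise each map with a unit step H(*), pack the L2 bits into an
    integer image with weights 2^(l2-1), histogram the integers, and
    concatenate the histograms into one feature vector."""
    features = []
    for group in filter_maps:
        l2 = len(group)
        t = np.zeros_like(group[0], dtype=np.int64)
        for bit, fmap in enumerate(group):
            t += (fmap > 0).astype(np.int64) << bit   # H(*) then weight 2^bit
        hist = np.bincount(t.ravel(), minlength=2 ** l2)
        features.append(hist)
    return np.concatenate(features)                   # length L1 * 2^L2
```

With L1 = L2 = 4 the feature has 4 × 16 = 64 dimensions, and each 16-bin segment sums to the number of pixels in one map.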
Step S105: Dynamic clustering modeling.

The deep features of all sampled images have been obtained from step S104. This stage takes as input the set v of deep feature vectors of the sampled image blocks, models it with a two-layer clustering model, and outputs the event cluster codebooks c of the deep feature set together with the maximum intra-class distance d of each codebook. Referring to Fig. 6, the detailed process is as follows:
Step S601: Dictionary set initialization. First define an empty dictionary set of fixed size N, then add the deep feature vectors of all sampling blocks to this dictionary set one by one, keeping a count ω(v) for each vector v in the set; normally N = 200.
Step S602: Add feature vectors one by one. Input the deep feature vector set v and add the feature vectors in v to the dictionary set in turn. During addition, for each newly added feature vector: if the number of vectors in the dictionary set after the addition is ≤ N, it is added directly and its count is initialized to ω(v) = 1; if the number is N + 1, the vectors in the dictionary set must be merged so that the total stays at N.
Step S603: Vector merging. Input the dictionary set to be merged. When a merge is needed, the two vectors va = [x1a, x2a, …, xna] and vb = [x1b, x2b, …, xnb] with the smallest Euclidean distance in the dictionary set are chosen and merged. In the merging process, the vector with the smaller ω(*) value is merged into the vector with the larger ω(*) value; assume here that ω(va) ≥ ω(vb), so vb is merged into va.

For every dimension of the vectors to be merged, the values of the two vectors in that dimension are compared and the merge is carried out according to their magnitudes. If the new vector is v = [x1, x2, …, xn], then

xi = (1 - α) xia + α × sign(xia, xib) × xib

and the count value ω(v) of the new merged vector is

ω(v) = ω(va) + ω(vb)

After the merge the total remains N, and the dictionary set is output.
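The dictionary maintenance of step S603 can be sketched as follows. Note the hedge: the patent's per-dimension rule uses a sign(·,·) term and a coefficient α that the translation leaves underspecified, so the sketch substitutes a count-weighted average of the two closest vectors, a plain simplification that still folds the lower-count vector into the higher-count one and adds the counts.

```python
import numpy as np

def merge_closest(dictionary, counts):
    """Merge the two closest vectors in the dictionary set: the lower-count
    vector is absorbed into the higher-count one (count-weighted blend, a
    simplification of the patent's per-dimension rule), and the counts add."""
    n = len(dictionary)
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            dist = np.linalg.norm(dictionary[i] - dictionary[j])
            if best is None or dist < best[0]:
                best = (dist, i, j)
    _, i, j = best
    if counts[i] < counts[j]:
        i, j = j, i                                   # i keeps, j is absorbed
    alpha = counts[j] / (counts[i] + counts[j])
    dictionary[i] = (1 - alpha) * dictionary[i] + alpha * dictionary[j]
    counts[i] += counts[j]
    del dictionary[j]
    del counts[j]
    return dictionary, counts
```

After the merge the set is one vector smaller, keeping the total at the bound N.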
Step S604: Codebook clustering. Input the dictionary set after maintenance is complete. After all deep feature vectors have been added to the dictionary set in turn, only N vectors remain after merging. These N vectors are then clustered with the K-means algorithm into k event cluster codebooks, each class representing one kind of motion event in the video. The cluster center of each event class and the maximum distance d between vectors within the class are recorded and output; here k = 16.
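The codebook construction of step S604 can be sketched with a bare-bones K-means. This is an illustrative numpy implementation (deterministic first-k initialization, fixed iteration count are assumptions), returning each cluster centre together with the maximum within-cluster distance d used later as the anomaly threshold.

```python
import numpy as np

def build_codebook(vectors, k=2, iters=20):
    """Cluster the dictionary vectors with K-means; record each event cluster's
    centre and the maximum within-cluster distance d."""
    centers = vectors[:k].astype(float).copy()        # simple deterministic init
    labels = np.zeros(len(vectors), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = vectors[labels == c].mean(axis=0)
    d_max = np.array([
        np.linalg.norm(vectors[labels == c] - centers[c], axis=1).max()
        if np.any(labels == c) else 0.0
        for c in range(k)
    ])
    return centers, d_max, labels
```

On two well-separated blobs, the centres converge to the blob means and d_max stays small, matching the role of d as a tight per-cluster radius.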
Step S106: Anomalous event detection.

The training data set input to the algorithm has been converted into the corresponding model by step S105, generating the event cluster codebooks, each of which represents one kind of motion event in the training video. At this stage the algorithm performs anomalous event detection on the input test video and outputs the video stream with detection marks. Referring to Fig. 7, the detailed process is as follows:
Step S701: Compute motion event occurrence probabilities. In step S105, K-means clustering yields the center vector of each event cluster codebook and the maximum intra-class distance of the cluster. For each center vector ci, the ω(*) value of the event cluster is defined as the sum of the ω(*) values of all vectors belonging to that class.

After obtaining the count value ω(*) of each event cluster, the count value is converted into the corresponding occurrence probability p(ci), which indicates how often the motion event corresponding to that event cluster codebook occurs in the training video.
Step S702: Test video feature extraction. After the probabilities have been computed, the input test video is first preprocessed according to step S101; it is then sampled according to step S102 to obtain a series of sampling blocks; motion region screening is then carried out according to the method of step S103, discarding the blocks that contain only background information so that only the blocks containing motion events undergo anomaly judgment. After screening, the sampling block images containing motion information are input into the trained PCANet network, which generates the corresponding deep feature vectors; the test feature vectors are output.
Step S703: Anomalous event detection. Input the test feature vectors. Once the deep feature vector of a test sampling block has been obtained, it is judged for anomaly. Any test feature vector v is compared one by one with the center vectors ci of all event clusters: if the Euclidean distance between v and some center vector ci is smaller than the corresponding maximum intra-class distance di, the motion corresponding to the sampling block is considered normal and the process goes to step S705; if the Euclidean distances between v and all ci exceed the respective di, the block is judged anomalous and the process goes to step S704.
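The anomaly rule of step S703 is a one-line distance test against the codebook. A minimal sketch (names illustrative):

```python
import numpy as np

def is_anomalous(v, centers, d_max):
    """A test vector is anomalous when it is farther than d_i from every
    event cluster centre c_i."""
    dists = np.linalg.norm(centers - v, axis=1)
    return bool(np.all(dists > d_max))
```

A vector inside any cluster's radius is normal; only a vector outside all radii is flagged.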
Step S704: Secondary detection. Input the sampling blocks judged anomalous. For those video image sampling blocks determined to be anomalous, a secondary detection is carried out to eliminate the interference of noise with the detection. For each anomalous sampling block, the sampling blocks adjacent to it in the spatial and temporal dimensions are examined (see Fig. 8); only if there are at least M anomalous sampling blocks around it is it confirmed as anomalous, otherwise the block is reclassified as normal. Normally M = 2.
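The secondary check of step S704 is a neighbor vote over the spatio-temporal neighborhood of Fig. 8. A minimal sketch, with the neighbor flags assumed to be gathered elsewhere:

```python
def confirm_anomaly(neighbor_flags, m=2):
    """Keep the anomaly only if at least m spatio-temporal neighbours of the
    block were also flagged anomalous; otherwise reclassify it as normal."""
    return sum(neighbor_flags) >= m
```

An isolated anomalous block with fewer than M anomalous neighbors is treated as noise and reverted to normal.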
Step S705: Online update. Input the test feature vectors. After the anomaly judgment is finished, the deep feature vectors of the test sampling blocks must be used to update the event cluster codebooks, so that the codebooks can gradually learn the newly emerging motion events in the detected video. For this purpose, the event cluster codebooks are updated again with the test vectors by the method of step S105.
Claims (6)
1. A surveillance video anomalous event detection method based on deep learning and dynamic clustering, which automatically extracts deep features from video sampling blocks with PCANet, screens the sampling blocks for moving regions, and performs cluster modeling of the feature set with a two-layer clustering model based on vector merging, characterized by comprising the following steps:

Step 1: Image preprocessing; read the surveillance video stream as input, convert it to grayscale and perform noise reduction with Gaussian filtering;

Step 2: Overlap sampling; for the input video stream, first compute the optical flow value of every pixel in every frame and replace each gray value with the pixel's optical flow value; then perform fixed-size overlap sampling on I and output a series of N × N video sampling image blocks;

Step 3: Motion region screening; for all the sampled video image blocks, first use the bimodal histogram method to obtain the division threshold separating moving pixels from background pixels in the image, then judge each sampled image block according to this threshold, keep the blocks containing motion events, and discard the blocks containing only background information;

Step 4: Deep feature extraction; after obtaining the sampled image blocks containing only motion information, input these video sampling image blocks into a 3-layer PCANet for parameter training; once the deep network is trained, input the image blocks into the trained network, which outputs a corresponding deep feature for every sampled image block;

Step 5: Dynamic clustering modeling; for the set of deep feature vectors, input the feature vectors one by one into a fixed-size dictionary set; if the number of elements exceeds the upper bound, merge the two closest feature vectors so that the total stays constant; after this maintenance, cluster the dictionary set with the K-means algorithm and output the corresponding event cluster codebooks;

Step 6: After the model is constructed, input the test video, sample every frame of the test video and perform motion region judgment, then input the sampled images into the trained PCANet to obtain the corresponding deep features, and finally compare each feature vector with the event cluster codebooks; if its distance to every codebook exceeds the respective threshold, it is judged to be an anomalous event.
2. the monitor video accident detection method according to claim 1 based on deep learning and dynamic clustering,
It is characterized in that the overlap sampling described in step 2, it is specific as follows:
Step 2-1:It is fitted previous frame video image;The former frame in two adjacent images frame in I is inputted, for adjacent continuous
Former frame in two video frame carrys out approximate express to each pixel neighborhood of a point in frame using a multinomial
Wherein A is symmetrical matrix, and b is vector, and c is scalar, and value can be fitted by weighted least-squares method and be acquired, defeated
Go out the polynomial fitting f to the frame image1(x);
Step 2-2:It is fitted latter frame video image;The a later frame in two adjacent images frame in I is inputted, in consecutive frame
A later frame carries out approximate expression with same method
And polynomial parameters are acquired by weighted least-squares method, export the polynomial fitting f of the frame image2(x);
Step 2-3:Front and back expression formula association solves;Input the polynomial fitting f of adjacent two field pictures1(x) and f2(x), due to two
A polynomial repressentation is two continuous frames image adjacent in video image, so there is motion relevances between them, if
The displacement of pixel is d between two frames, then has
Wherein
A2=A1
b2=b1-2A1d
c2=dTA1d-b1 Td+c1
Displacement d is defined as the function about x again, corresponding A and b are defined as
The displacement that pixel x can be obtained is
D (x)=A-1(x)Δb(x)
Export the displacement d (x) of each pixel in previous frame image;
Step 2-4:Pixel gray value replacement. Input the video stream I and the displacement d(x) corresponding to each frame image. After the optical flow value of every pixel of every frame in the video stream I is obtained, for each pixel the original gray value is replaced with the optical flow value of that pixel; output the corresponding replaced video stream Iout;
Step 2-5:Overlap sampling. Input the replaced video stream Iout. Starting from the first pixel of the first frame image, repeated sampling of size N × N with overlap rate θ is carried out in turn, and the set Cell of video sampling image blocks of identical and fixed size is output; N is the sample size in the spatial dimension, whose value depends on the image size and normally takes N = 24, and the overlap rate θ = 0.5, i.e. with the above parameters the spatial dimension is sampled once every 12 pixels during the sampling process.
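The overlap sampling of step 2-5 reduces, per frame, to cutting the flow image into N × N blocks with stride N(1 - θ) = 12. A minimal numpy sketch; the function name and the assumption that the per-frame optical-flow magnitudes are already available as a 2-D array are mine, not the patent's:

```python
import numpy as np

def overlap_sample(flow_mag, n=24, theta=0.5):
    """Cut a per-pixel optical-flow image into n x n blocks with overlap
    rate theta, i.e. stride n * (1 - theta) (12 pixels for n=24, theta=0.5)."""
    stride = int(n * (1 - theta))
    h, w = flow_mag.shape
    blocks = []
    for y in range(0, h - n + 1, stride):
        for x in range(0, w - n + 1, stride):
            blocks.append(flow_mag[y:y + n, x:x + n])
    return blocks
```

With N = 24 and θ = 0.5, a 48 × 48 frame yields a 3 × 3 grid of nine blocks.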
3. The monitoring video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the moving region screening described in step 3 is specifically as follows:
Step 3-1:Set the division threshold. Input the sampled image block set Cell. A bimodal histogram statistic is made over the optical flow vector values of all pixel points in all sampling blocks of the set: starting from 0, with every δ forming one interval, the optical flow values of all pixel points are counted into the corresponding interval according to their size, giving the corresponding statistical histogram, with δ = 0.025.
After the counting statistics are finished and the corresponding statistical histogram is obtained, the histogram is first scanned from small to large to find the position of the first wave crest, then scanned from large to small to find the position of the second wave crest, and finally the position of the wave trough between the two crests is found; taking the median of the statistical interval corresponding to the trough as the division threshold ξ, output the division threshold ξ;
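The crest/trough scan of step 3-1 can be sketched as follows; padding the histogram with zero bins so that boundary bins can count as crests is my own device, not part of the claim:

```python
import numpy as np

def bimodal_threshold(flow_values, delta=0.025):
    """Step 3-1 sketch: bin the flow magnitudes in intervals of width delta,
    find the first crest scanning upward and the second crest scanning
    downward, then return the midpoint of the trough bin between them."""
    edges = np.arange(0.0, np.max(flow_values) + 2 * delta, delta)
    hist, edges = np.histogram(flow_values, bins=edges)
    padded = np.concatenate([[0], hist, [0]])      # let edge bins be crests
    crests = [i - 1 for i in range(1, len(padded) - 1)
              if padded[i] >= padded[i - 1] and padded[i] > padded[i + 1]]
    p1, p2 = crests[0], crests[-1]                 # first and second wave crest
    trough = p1 + int(np.argmin(hist[p1:p2 + 1]))  # valley between the crests
    return 0.5 * (edges[trough] + edges[trough + 1])
```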
Step 3-2:Sampling block moving region judgment. Input the division threshold ξ and the sampled image block set Cell. After the division threshold is obtained, each sampling block is screened next: if the optical flow vector magnitude of a pixel in a block is greater than the threshold ξ, that pixel is considered to represent a moving region and is defined as an active pixel; if the proportion of active pixels in the entire sampling block exceeds P, the sampling block is considered to represent a moving region, otherwise it is regarded as a background sampling block and rejected, taking P = 20%; finally output the sampling block set Cellout containing motion information.
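Step 3-2 then keeps only the blocks whose active-pixel ratio exceeds P. A sketch under the same assumptions as above (each block is an array of per-pixel flow magnitudes; names are illustrative):

```python
import numpy as np

def screen_motion_blocks(blocks, xi, p=0.20):
    """Keep only blocks whose fraction of active pixels (flow magnitude > xi)
    exceeds p; the rest are treated as background and discarded (step 3-2)."""
    kept = []
    for block in blocks:
        active_ratio = np.mean(block > xi)
        if active_ratio > p:
            kept.append(block)
    return kept
```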
4. The monitoring video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the depth feature extraction described in step 4 is specifically as follows:
Step 4-1:First-layer network learning. Input the sampled image block set Cellout. The first layer of the depth network is equipped with L1 filters to filter the input images. For a sampled image of size N × N, dense sampling of size k1 × k2 is first carried out on it, taking k1 = k2 = 5, and each sample is rearranged into a column vector xi; then, over all video sampling blocks, a sample vector matrix X is obtained.
Principal component analysis is then carried out on the matrix X, and the eigenvectors corresponding to the first L1 largest eigenvalues are taken as filters, each rearranged into a matrix of size k1 × k2. For each filter, the input image is filtered with it, so each input sampled image is converted into L1 filtered images Il = I * Wl^1, l = 1, 2, ..., L1, with normally L1 = 4; output the filtered images Il corresponding to the sampled images;
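The PCA filter learning of step 4-1 can be sketched as follows. Whether the patch mean is removed before the eigen-decomposition is not specified in the claim, so the centering step here is an assumption:

```python
import numpy as np

def learn_pca_filters(patches, l1=4, k=5):
    """Step 4-1 sketch: `patches` is an (M, k*k) matrix of vectorized dense
    samples; the eigenvectors belonging to the L1 largest eigenvalues of the
    patch covariance become the k x k filters."""
    x = patches - patches.mean(axis=0)        # center the patch vectors
    cov = x.T @ x
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = eigvecs[:, -l1:][:, ::-1]           # L1 leading eigenvectors
    return [top[:, i].reshape(k, k) for i in range(l1)]
```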
Step 4-2:Second-layer network learning. Input the first-layer filtered images Il. The second layer of the network is equipped with L2 filters, taking L2 = 4. In the second layer, dense sampling of size k1 × k2 together with vectorization is first carried out on all images to obtain the sample vector matrix X; then principal component analysis is carried out on this matrix, the eigenvectors corresponding to the first L2 largest eigenvalues are chosen as filters, and the images are filtered with them.
Since each input optical flow image yields L1 filtered images after the first layer, one image, after passing through the first two layers of the depth network, is output as L1 × L2 filtered images Ol = {Il * Wj^2, j = 1, 2, ..., L2}, l = 1, 2, ..., L1, together with the trained depth network Net, where each Ol contains the corresponding L2 filtered images;
Step 4-3:Depth feature output. Input the second-layer filtered images Ol, l = 1, 2, ..., L1. The third layer is the output layer of the network. For the filtered images output by the second layer, binarization is first carried out so that the result contains only positive integers and zero; each image collection Ol can be converted into an integer matrix Tl:

Tl = Σ_{j=1..L2} 2^(j-1) H(Il * Wj^2)

where H(*) is a unit-step-like function, H(x) = 1 for x > 0 and H(x) = 0 otherwise.
Through the above processing, each pixel is encoded into an integer in [0, 16). After the integer matrix Tl is obtained, histogram statistics are carried out on the matrix, giving a 16-dimensional histogram statistical vector.
For all L1 image collections Ol together, L1 statistical vectors are obtained; cascading these statistical vectors, a depth feature vector of dimension 2^L2 × L1 = 64 is output.
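The third-layer hashing and histogram stage of step 4-3 follows directly from the formula for Tl. A sketch; the list-of-arrays input layout is my assumption:

```python
import numpy as np

def pcanet_hash_features(responses, l2=4):
    """Step 4-3 sketch: `responses` is a list of L1 groups, each an array of
    shape (L2, H, W) of second-layer filter responses. Each group is
    binarized, packed into integers in [0, 2**l2) with weights 2**(j-1), and
    summarized by a 2**l2-bin histogram; the histograms are concatenated."""
    feats = []
    for group in responses:           # one group per first-layer filter
        weights = 2 ** np.arange(l2)  # 1, 2, 4, 8 for l2 = 4
        t = np.tensordot(weights, (group > 0).astype(int), axes=1)  # matrix Tl
        hist, _ = np.histogram(t, bins=2 ** l2, range=(0, 2 ** l2))
        feats.append(hist)
    return np.concatenate(feats)      # dimension L1 * 2**l2 (= 64 here)
```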
5. The monitoring video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the dynamic clustering modeling described in step 5 is specifically as follows:
Step 5-1:Dictionary set initialization. An empty dictionary set whose size is fixed at N is first defined; the depth feature vectors of all sampling blocks are then added to this dictionary set one by one, and a count value ω(v) is maintained for each vector v in the dictionary set, with normally N = 200;
Step 5-2:Add feature vectors one by one. Input the depth feature vector set v; the feature vectors in v are added to the dictionary set in sequence. During addition, for each newly joined feature vector: if the number of vectors in the dictionary set after the addition is ≤ N, it is added directly, with the count value of the newly added vector being ω(v) = 1; if the number equals N + 1, the vectors in the dictionary set need to be merged so that the number of vectors in the dictionary set remains N;
Step 5-3:Vector merging. Input the dictionary set to be merged. When vector merging is required, the two vectors with the smallest Euclidean distance in the dictionary set, va = [x1a, x2a, ..., xna] and vb = [x1b, x2b, ..., xnb], are chosen to be merged. In the merging process, the vector with the smaller ω(*) value is merged into the vector with the larger ω(*) value; assuming here that ω(va) ≥ ω(vb), vector vb is merged into va.
For each dimension of the vectors to be merged, the values of the two vectors in that dimension are compared and merged according to their relative magnitudes; if the new vector is v = [x1, x2, ..., xn], then

xi = (1 - α)xia + α × sign(xia, xib) × xib

and in the merging process the count value ω(v) of the new merged vector is

ω(v) = ω(va) + ω(vb)

The dictionary set, whose total remains N after merging, is output;
Step 5-4:Code book clustering. Input the maintained dictionary set. After all depth feature vectors have been added to the dictionary set in sequence, only N vectors finally remain after merging. These N vectors are then clustered with the K-means algorithm into k event cluster code books, each class representing one kind of motion event in the video; the cluster center of each event class and the maximum distance d of the vectors within the class are recorded and output, taking k = 16.
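Steps 5-2 and 5-3 amount to maintaining a fixed-size dictionary with counts. The sketch below uses a plain weighted average as the merge rule, since the claim's sign(xia, xib) term and the value of α are not fully specified; the function name and α = 0.5 are my assumptions:

```python
import numpy as np

def add_to_dictionary(vectors, counts, new_vec, n_max=200, alpha=0.5):
    """Steps 5-2/5-3 sketch: append `new_vec`; if the dictionary would exceed
    `n_max`, merge the two closest vectors (Euclidean distance), folding the
    smaller-count vector into the larger-count one and summing their counts."""
    vectors.append(np.asarray(new_vec, dtype=float))
    counts.append(1)
    if len(vectors) <= n_max:
        return vectors, counts
    # find the closest pair of vectors
    best, best_d = (0, 1), np.inf
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            d = np.linalg.norm(vectors[i] - vectors[j])
            if d < best_d:
                best_d, best = d, (i, j)
    i, j = best
    if counts[i] < counts[j]:       # merge the smaller count into the larger
        i, j = j, i
    vectors[i] = (1 - alpha) * vectors[i] + alpha * vectors[j]  # simplified rule
    counts[i] = counts[i] + counts[j]
    del vectors[j], counts[j]
    return vectors, counts
```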
6. The monitoring video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the abnormal event detection described in step 6 is specifically as follows:
Step 6-1:Calculate the motion event occurrence probability. In step 5, through K-means clustering, the center vector of each event cluster code book and the maximum intra-class distance of that event cluster can be obtained. For each center vector ci, the ω(*) value of the event cluster is defined as the sum of the ω(*) values of all vectors belonging to that class.
After the count value ω(*) of each event cluster is obtained, the count values are converted into the corresponding occurrence probabilities

p(ci) = ω(ci) / Σ_j ω(cj)

indicating with what probability the motion event corresponding to the event cluster code book occurs in the training video;
Step 6-2:Test video feature extraction. After the probabilities have been calculated, for the input test video, image preprocessing is first carried out according to step 1; sampling is then carried out according to step 2, obtaining a series of sampling blocks; moving region screening is further carried out according to step 3, weeding out those sampling blocks that contain only background information, so that abnormality judgment is made only on the sampling blocks containing motion events. After the screening is finished, the images of those sampling blocks containing motion information are input into the trained PCANet network, the corresponding depth feature vectors are generated with the trained PCANet network, and the corresponding test feature vectors are output;
Step 6-3:Abnormal event detection. Input the test feature vectors. After the depth feature vector of a test sampling block is obtained, abnormality judgment is then carried out on it: any test feature vector v is compared one by one with the center vectors ci of all event clusters. If the Euclidean distance between vector v and some center vector ci is smaller than the corresponding maximum intra-class distance di, the motion corresponding to the sampling block is considered normal, and go to step 6-5; if the Euclidean distances between vector v and all ci are greater than the respective di, it is determined to be abnormal, and go to step 6-4;
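The per-vector decision rule of step 6-3 is a nearest-codebook test; a minimal sketch (names are illustrative):

```python
import numpy as np

def is_abnormal(v, centers, max_dists):
    """Step 6-3 sketch: a test feature vector is abnormal only if its
    Euclidean distance to every event-cluster center exceeds that cluster's
    maximum intra-class distance d_i."""
    v = np.asarray(v, dtype=float)
    for c, d in zip(centers, max_dists):
        if np.linalg.norm(v - np.asarray(c, dtype=float)) < d:
            return False      # close enough to one learned event: normal
    return True
```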
Step 6-4:Secondary detection. Input the sampling blocks judged to be abnormal. Secondary detection is carried out on those video image sampling blocks that have been determined to be abnormal, in order to eliminate the interference of noise on the detection: for each abnormal sampling block, its adjacent sampling blocks in the spatial and temporal dimensions are examined, and only if it is simultaneously surrounded by M or more abnormal sampling blocks is it confirmed to be abnormal; otherwise the sampling block is re-classified as normal, with normally M = 2;
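The spatio-temporal vote of step 6-4 can be sketched on a grid of per-block abnormal flags. The (time, row, column) layout and the 3 × 3 × 3 neighbourhood are my assumptions about what "adjacent in space and time" means:

```python
import numpy as np

def secondary_detection(abnormal, m=2):
    """Step 6-4 sketch: `abnormal` is a boolean 3-D array (time, rows, cols)
    of first-pass abnormal flags; a block stays abnormal only if at least m
    of its spatio-temporal neighbours are also abnormal."""
    out = np.zeros_like(abnormal)
    t_n, r_n, c_n = abnormal.shape
    for t in range(t_n):
        for r in range(r_n):
            for c in range(c_n):
                if not abnormal[t, r, c]:
                    continue
                neigh = abnormal[max(t - 1, 0):t + 2,
                                 max(r - 1, 0):r + 2,
                                 max(c - 1, 0):c + 2]
                if neigh.sum() - 1 >= m:   # exclude the block itself
                    out[t, r, c] = True
    return out
```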
Step 6-5:Online updating. Input the test feature vectors. After the judgment ends, the depth feature vector of the test sampling block needs to be used to update the event cluster code books, so that along with detection the code books can gradually learn the newly appearing motion events in the video; for this purpose the test vectors are again used to update the event cluster code books with the method of step 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810320572.6A CN108805002B (en) | 2018-04-11 | 2018-04-11 | Monitoring video abnormal event detection method based on deep learning and dynamic clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108805002A true CN108805002A (en) | 2018-11-13 |
CN108805002B CN108805002B (en) | 2022-03-01 |
Family
ID=64094844
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810320572.6A Active CN108805002B (en) | 2018-04-11 | 2018-04-11 | Monitoring video abnormal event detection method based on deep learning and dynamic clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108805002B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109460744A (en) * | 2018-11-26 | 2019-03-12 | 南京邮电大学 | A kind of video monitoring system based on deep learning |
CN110210530A (en) * | 2019-05-15 | 2019-09-06 | 杭州智尚云科信息技术有限公司 | Intelligent control method, device, equipment, system and storage medium based on machine vision |
CN110362713A (en) * | 2019-07-12 | 2019-10-22 | 四川长虹电子系统有限公司 | Video monitoring method for early warning and system based on Spark Streaming |
CN111614627A (en) * | 2020-04-27 | 2020-09-01 | 中国舰船研究设计中心 | SDN-oriented cross-plane cooperation DDOS detection and defense method and system |
CN111814644A (en) * | 2020-07-01 | 2020-10-23 | 重庆邮电大学 | Video abnormal event detection method based on disturbance visual interpretation |
CN112367292A (en) * | 2020-10-10 | 2021-02-12 | 浙江大学 | Encrypted flow anomaly detection method based on deep dictionary learning |
CN112866654A (en) * | 2021-03-11 | 2021-05-28 | 福建环宇通信息科技股份公司 | Intelligent video monitoring system |
CN113270200A (en) * | 2021-05-24 | 2021-08-17 | 平安科技(深圳)有限公司 | Abnormal patient identification method based on artificial intelligence and related equipment |
CN113706837A (en) * | 2021-07-09 | 2021-11-26 | 上海汽车集团股份有限公司 | Engine abnormal state detection method and device |
CN113836976A (en) * | 2020-06-23 | 2021-12-24 | 江苏翼视智能科技有限公司 | Method for detecting global abnormal event in surveillance video |
CN114092851A (en) * | 2021-10-12 | 2022-02-25 | 甘肃欧美亚信息科技有限公司 | Monitoring video abnormal event detection method based on time sequence action detection |
CN114205726A (en) * | 2021-09-01 | 2022-03-18 | 珠海市杰理科技股份有限公司 | Testing method and device of finished product earphone and earphone manufacturing system |
CN115345527A (en) * | 2022-10-18 | 2022-11-15 | 成都西交智汇大数据科技有限公司 | Chemical experiment abnormal operation detection method, device, equipment and readable storage medium |
CN115492493A (en) * | 2022-07-28 | 2022-12-20 | 重庆长安汽车股份有限公司 | Tail gate control method, device, equipment and medium |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006012174A (en) * | 2004-06-28 | 2006-01-12 | Mitsubishi Electric Research Laboratories Inc | Method for detecting abnormal event in video |
US20090016610A1 (en) * | 2007-07-09 | 2009-01-15 | Honeywell International Inc. | Methods of Using Motion-Texture Analysis to Perform Activity Recognition and Detect Abnormal Patterns of Activities |
CN101872418A (en) * | 2010-05-28 | 2010-10-27 | 电子科技大学 | Detection method based on group environment abnormal behavior |
US20120314064A1 (en) * | 2011-06-13 | 2012-12-13 | Sony Corporation | Abnormal behavior detecting apparatus and method thereof, and video monitoring system |
CN103390278A (en) * | 2013-07-23 | 2013-11-13 | 中国科学技术大学 | Detecting system for video aberrant behavior |
CN104123544A (en) * | 2014-07-23 | 2014-10-29 | 通号通信信息集团有限公司 | Video analysis based abnormal behavior detection method and system |
CN105354542A (en) * | 2015-10-27 | 2016-02-24 | 杭州电子科技大学 | Method for detecting abnormal video event in crowded scene |
CN105608446A (en) * | 2016-02-02 | 2016-05-25 | 北京大学深圳研究生院 | Video stream abnormal event detection method and apparatus |
CN105787472A (en) * | 2016-03-28 | 2016-07-20 | 电子科技大学 | Abnormal behavior detection method based on time-space Laplacian Eigenmaps learning |
CN105913002A (en) * | 2016-04-07 | 2016-08-31 | 杭州电子科技大学 | On-line adaptive abnormal event detection method under video scene |
CN106228149A (en) * | 2016-08-04 | 2016-12-14 | 杭州电子科技大学 | A kind of video anomaly detection method |
CN106384092A (en) * | 2016-09-11 | 2017-02-08 | 杭州电子科技大学 | Online low-rank abnormal video event detection method for monitoring scene |
CN106980829A (en) * | 2017-03-17 | 2017-07-25 | 苏州大学 | Abnormal behaviour automatic testing method of fighting based on video analysis |
CN107590427A (en) * | 2017-05-25 | 2018-01-16 | 杭州电子科技大学 | Monitor video accident detection method based on space-time interest points noise reduction |
CN107729799A (en) * | 2017-06-13 | 2018-02-23 | 银江股份有限公司 | Crowd's abnormal behaviour vision-based detection and analyzing and alarming system based on depth convolutional neural networks |
Non-Patent Citations (5)
Title |
---|
GNANAVEL, VK et al.: "Abnormal Event Detection in Crowded Video Scenes", 《ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING》 *
NAJLA BOUARADA et al.: "Abnormal Events Detection Based on Trajectory Clustering", 《IEEE》 *
WANG JUN et al.: "Abnormal behavior detection based on deep learning features", 《Journal of Hunan University (Natural Sciences)》 *
GAI JIE et al.: "Global abnormal event detection method combining multiple attributes in video", 《Journal of Hangzhou Dianzi University (Natural Sciences)》 *
CHENG YANYUN et al.: "Local abnormal behavior detection based on video image block models", 《Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition)》 *
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109460744A (en) * | 2018-11-26 | 2019-03-12 | 南京邮电大学 | A kind of video monitoring system based on deep learning |
CN109460744B (en) * | 2018-11-26 | 2021-08-27 | 南京邮电大学 | Video monitoring system based on deep learning |
CN110210530A (en) * | 2019-05-15 | 2019-09-06 | 杭州智尚云科信息技术有限公司 | Intelligent control method, device, equipment, system and storage medium based on machine vision |
CN110362713A (en) * | 2019-07-12 | 2019-10-22 | 四川长虹电子系统有限公司 | Video monitoring method for early warning and system based on Spark Streaming |
CN110362713B (en) * | 2019-07-12 | 2023-06-06 | 四川长虹云数信息技术有限公司 | Video monitoring and early warning method and system based on Spark Streaming |
CN111614627A (en) * | 2020-04-27 | 2020-09-01 | 中国舰船研究设计中心 | SDN-oriented cross-plane cooperation DDOS detection and defense method and system |
CN111614627B (en) * | 2020-04-27 | 2022-03-25 | 中国舰船研究设计中心 | SDN-oriented cross-plane cooperation DDOS detection and defense method and system |
CN113836976A (en) * | 2020-06-23 | 2021-12-24 | 江苏翼视智能科技有限公司 | Method for detecting global abnormal event in surveillance video |
CN111814644A (en) * | 2020-07-01 | 2020-10-23 | 重庆邮电大学 | Video abnormal event detection method based on disturbance visual interpretation |
CN111814644B (en) * | 2020-07-01 | 2022-05-03 | 重庆邮电大学 | Video abnormal event detection method based on disturbance visual interpretation |
CN112367292A (en) * | 2020-10-10 | 2021-02-12 | 浙江大学 | Encrypted flow anomaly detection method based on deep dictionary learning |
CN112866654B (en) * | 2021-03-11 | 2023-02-28 | 福建环宇通信息科技股份公司 | Intelligent video monitoring system |
CN112866654A (en) * | 2021-03-11 | 2021-05-28 | 福建环宇通信息科技股份公司 | Intelligent video monitoring system |
CN113270200A (en) * | 2021-05-24 | 2021-08-17 | 平安科技(深圳)有限公司 | Abnormal patient identification method based on artificial intelligence and related equipment |
CN113270200B (en) * | 2021-05-24 | 2022-12-27 | 平安科技(深圳)有限公司 | Abnormal patient identification method based on artificial intelligence and related equipment |
CN113706837A (en) * | 2021-07-09 | 2021-11-26 | 上海汽车集团股份有限公司 | Engine abnormal state detection method and device |
CN113706837B (en) * | 2021-07-09 | 2022-12-06 | 上海汽车集团股份有限公司 | Engine abnormal state detection method and device |
CN114205726A (en) * | 2021-09-01 | 2022-03-18 | 珠海市杰理科技股份有限公司 | Testing method and device of finished product earphone and earphone manufacturing system |
CN114205726B (en) * | 2021-09-01 | 2024-04-12 | 珠海市杰理科技股份有限公司 | Method and device for testing finished earphone and earphone manufacturing system |
CN114092851A (en) * | 2021-10-12 | 2022-02-25 | 甘肃欧美亚信息科技有限公司 | Monitoring video abnormal event detection method based on time sequence action detection |
CN115492493A (en) * | 2022-07-28 | 2022-12-20 | 重庆长安汽车股份有限公司 | Tail gate control method, device, equipment and medium |
CN115345527A (en) * | 2022-10-18 | 2022-11-15 | 成都西交智汇大数据科技有限公司 | Chemical experiment abnormal operation detection method, device, equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108805002B (en) | 2022-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108805002A (en) | Monitor video accident detection method based on deep learning and dynamic clustering | |
Liznerski et al. | Explainable deep one-class classification | |
CN108764085B (en) | Crowd counting method based on generation of confrontation network | |
Medel et al. | Anomaly detection in video using predictive convolutional long short-term memory networks | |
CN110363131B (en) | Abnormal behavior detection method, system and medium based on human skeleton | |
CN109670528B (en) | Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy | |
CN105139004B (en) | Facial expression recognizing method based on video sequence | |
CN107506692A (en) | A kind of dense population based on deep learning counts and personnel's distribution estimation method | |
CN110188637A (en) | A kind of Activity recognition technical method based on deep learning | |
CN110213222A (en) | Network inbreak detection method based on machine learning | |
CN109543727B (en) | Semi-supervised anomaly detection method based on competitive reconstruction learning | |
Wang et al. | STNet: Scale tree network with multi-level auxiliator for crowd counting | |
Li et al. | A supervised clustering and classification algorithm for mining data with mixed variables | |
CN110826702A (en) | Abnormal event detection method for multitask deep network | |
CN111145145B (en) | Image surface defect detection method based on MobileNet | |
CN111079539A (en) | Video abnormal behavior detection method based on abnormal tracking | |
CN113850284B (en) | Multi-operation detection method based on multi-scale feature fusion and multi-branch prediction | |
CN116434069A (en) | Remote sensing image change detection method based on local-global transducer network | |
Hongmeng et al. | A detection method for deepfake hard compressed videos based on super-resolution reconstruction using CNN | |
Tambe et al. | Towards designing an automated classification of lymphoma subtypes using deep neural networks | |
CN113344110A (en) | Fuzzy image classification method based on super-resolution reconstruction | |
CN113761359A (en) | Data packet recommendation method and device, electronic equipment and storage medium | |
CN117237994B (en) | Method, device and system for counting personnel and detecting behaviors in oil and gas operation area | |
Chexia et al. | A Generalized Model for Crowd Violence Detection Focusing on Human Contour and Dynamic Features | |
Zhu et al. | Permutation-invariant tabular data synthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |