CN108805002A - Surveillance video abnormal event detection method based on deep learning and dynamic clustering - Google Patents

Surveillance video abnormal event detection method based on deep learning and dynamic clustering

Info

Publication number
CN108805002A
CN108805002A (application CN201810320572.6A)
Authority
CN
China
Prior art keywords
vector
image
video
sampling
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810320572.6A
Other languages
Chinese (zh)
Other versions
CN108805002B (en)
Inventor
徐向华
刘李启明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201810320572.6A priority Critical patent/CN108805002B/en
Publication of CN108805002A publication Critical patent/CN108805002A/en
Application granted granted Critical
Publication of CN108805002B publication Critical patent/CN108805002B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47Detecting features for summarising video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a surveillance-video abnormal event detection method based on deep learning and dynamic clustering. In the feature extraction stage, the deep learning network PCANet learns the corresponding network filters from the training video and converts the low-level pixel optical-flow features into high-level semantic motion features; at the same time, the moving regions in the video are screened so that spatio-temporal sampling blocks containing only background information are discarded. In the feature modeling stage, the feature vector space is modeled with a nonparametric model based on two-layer clustering; in the vector-merging stage, vectors are merged towards each other, and the vectors in the dictionary set are finally clustered with the K-means algorithm into a series of event clusters. Abnormal events are then judged according to the Euclidean distance between a test vector and the event-cluster center vectors. The invention effectively avoids the feature-vector drift caused by weighted addition and improves the abnormal event detection rate.

Description

Surveillance video abnormal event detection method based on deep learning and dynamic clustering
Technical field
The present invention relates to a surveillance-video abnormal event detection method, and more particularly to a surveillance-video abnormal event detection method based on deep learning and dynamic clustering.
Background technology
With the development of computer science and technology, techniques such as image processing, computer vision and machine learning can break through the limitations of traditional video surveillance systems, enabling intelligent video analysis, active detection of abnormal events and real-time early warning in video surveillance systems, which is of great value for video surveillance applications in the field of public safety.
Abnormal event detection in surveillance video is generally divided into four basic steps: image preprocessing, elementary event representation, construction of the anomaly detection model, and judgment of abnormal events. Elementary event representation is mainly divided into representations based on low-level visual features and representations based on high-level semantic features. Representations based on low-level visual features usually divide the video volume into small video blocks in an overlapping, non-overlapping or spatio-temporal interest-point manner, regard each video block as an elementary event, and extract low-level visual features from the blocks to represent the elementary events. Commonly used low-level visual features include optical flow, gradient and texture. Representations based on high-level semantic features mainly require complex processing of the data, for example spatio-temporal target trajectories or social-force methods. Common abnormal event detection models include: classification-based models, nearest-neighbor-based models, clustering-based models, statistics-based models and information-theoretic models.
Although abnormal event detection methods for surveillance video are varied, most of them model the motion features with parametric models, which requires many model parameters to be set manually, and these empirical parameter values usually have to be re-tuned whenever the video scene changes. In the paper "Online anomaly detection in videos by clustering dynamic exemplars" (J. Feng, C. Zhang, P. Hao), the authors propose, for events that are new or occur with very low probability in the video, a clustering-based nonparametric model of the feature vectors: MHOF features are first extracted from the input video stream, these features are then sequentially inserted and merged into a fixed-size dictionary set, and the merged dictionary set is clustered with the K-means algorithm; in the abnormal event judgment stage, the method decides abnormality from the distance between a feature vector and the cluster codebooks.
The above algorithm performs well at detecting abnormal events, but the following problems remain:
1. The algorithm describes the motion in the video with MHOF features. Although hand-crafted features such as HOF and HOG describe motion fairly well, their applicability differs across video scenes, and changing the scene often requires changing the feature at the same time, so the method is poorly suited to multi-scene abnormal event detection.
2. When merging vectors in the dictionary set, the algorithm uses simple weighted addition. After a large number of vector updates, the values of the feature vectors in the dictionary set drift away from their original values, which affects the final detection.
3. The detection of low-frequency abnormal events is carried out by counting the occurrences of the vectors in the dictionary set and computing the frequency share of the corresponding codebook. However, the feature extraction stage densely samples the entire image, so when the video scene is sparse, most of the sampled feature vectors carry only background information. The occurrence counts of the vectors representing background information in the dictionary set become very large, the frequency share of the corresponding codebook becomes excessively high, the frequencies of all other motion events fall below the judgment threshold, and false detections result.
Summary of the invention
In view of the above problems, the invention discloses a surveillance-video abnormal event detection method based on deep learning and dynamic clustering. The method extracts deep features from the video sampling blocks automatically with PCANet, screens the sampling blocks for moving regions, and models the feature set with a two-layer clustering model based on vector merging.
The technical scheme adopted by the invention to solve its technical problem comprises the following steps:
Step S101: Image preprocessing. Read the surveillance video stream as input, convert it to grayscale and denoise it with Gaussian filtering.
Step S102: Overlapped sampling. For the input video stream, first compute the optical-flow value of every pixel in every frame and replace each pixel's gray value with its optical-flow value; then perform fixed-size overlapped sampling on I, outputting a series of N × N video sampling image blocks.
Step S103: Moving-region screening. For all sampled image blocks, first use the bimodal histogram method to obtain the threshold that separates moving pixels from background pixels in the image, then judge each sampled block against this threshold, keep the blocks that contain motion events, and discard the blocks that contain only background information.
Step S104: Deep feature extraction. Input the sampled image blocks that contain only motion information into a 3-layer PCANet for parameter training; once the deep network is trained, input the image blocks into the trained network again, and the network outputs a deep feature for each sampled image block.
Step S105: Dynamic clustering modeling. Sequentially insert the deep feature vectors into a fixed-size dictionary set; whenever the set exceeds its upper bound, merge the two closest feature vectors so that the total stays constant. After this maintenance, cluster the dictionary set with the K-means algorithm and output the corresponding event-cluster codebooks.
Step S106: After the model is built, input the test video, sample every frame of the test video and perform moving-region judgment, then input the sampled images into the trained PCANet to obtain the corresponding deep features, and finally compare each feature vector with the event-cluster codebooks: if its distance to every codebook exceeds the respective threshold, it is judged to be an abnormal event.
Beneficial effects of the invention:
1. The invention extracts deep features from the sampling blocks with a deep learning network. Compared with traditional hand-crafted features, deep features are more robust to the video scene, and there is no need to spend time on feature-selection experiments for a particular scene to decide which feature should describe its motion.
2. In the model-building stage the invention maintains a fixed-size dictionary set and replaces simple weighted addition with a method that merges two vectors towards each other, which effectively avoids the feature-vector drift caused by addition and improves the abnormal event detection rate.
3. The invention adds a moving-region screening step that discards useless background information before feature extraction, so that only the sampling blocks with significant motion enter the subsequent computation; this not only increases the detection speed of the algorithm but also improves the abnormal event detection rate in sparse scenes.
Description of the drawings
Fig. 1 is the flow chart of abnormal event detection in surveillance video according to the invention;
Fig. 2 is the schematic diagram of abnormal event detection in surveillance video according to the invention;
Fig. 3 is the flow chart of overlapped sampling;
Fig. 4 is the flow chart of moving-region screening;
Fig. 5 is the flow chart of deep feature extraction;
Fig. 6 is the flow chart of dynamic clustering modeling;
Fig. 7 is the flow chart of abnormal event detection;
Fig. 8 is the schematic diagram of neighboring sampling-block positions;
Fig. 9 shows the final results of the invention.
Specific embodiments
Specific embodiments of the invention are described in further detail below with reference to the accompanying drawings. As shown in Figs. 1-9, the concrete steps are as follows:
Step S101: Image preprocessing.
Input the video stream Iin, convert Iin to grayscale and denoise it with Gaussian filtering. The concrete operation of the Gaussian denoising is as follows: convolve every pixel of each video frame with a 3 × 3 Gaussian kernel, and replace the value of the pixel at the center of the convolution with the weighted-average gray value of the pixels in the neighborhood determined by the convolution; output the processed video stream I.
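The 3 × 3 Gaussian smoothing described above can be sketched as follows. The patent does not give the kernel weights, so the common 1-2-1 binomial approximation of a Gaussian is assumed here:

```python
import numpy as np

# 3x3 Gaussian kernel; 1-2-1 binomial weights are an assumption,
# the patent only specifies a 3x3 Gaussian convolution
GAUSS_3X3 = np.array([[1, 2, 1],
                      [2, 4, 2],
                      [1, 2, 1]], dtype=np.float64) / 16.0

def gaussian_denoise(frame: np.ndarray) -> np.ndarray:
    """Replace each pixel by the Gaussian-weighted average of its 3x3 neighborhood."""
    padded = np.pad(frame.astype(np.float64), 1, mode="edge")
    out = np.zeros(frame.shape, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += GAUSS_3X3[dy, dx] * padded[dy:dy + frame.shape[0],
                                              dx:dx + frame.shape[1]]
    return out
```

Because the kernel weights sum to one, a constant frame passes through unchanged, which is a quick sanity check on the implementation.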
Step S102: Overlapped sampling.
Input the processed video stream I. First compute the optical-flow value of every pixel of every frame in I and replace each pixel's gray value with its optical-flow value, then perform fixed-size overlapped sampling on I and output the set Cell of identically sized video sampling image blocks. Referring to Fig. 3, the detailed process is as follows:
Step S301: Fit the previous frame. Input the earlier of two adjacent image frames in I. For the earlier of the two consecutive video frames, the neighborhood of each pixel is approximated with a quadratic polynomial
f1(x) = x^T A1 x + b1^T x + c1
where A1 is a symmetric matrix, b1 is a vector and c1 is a scalar, whose values are obtained by weighted least-squares fitting; output the fitting polynomial f1(x) for this frame.
Step S302: Fit the next frame. Input the later of the two adjacent frames in I and approximate it in the same way,
f2(x) = x^T A2 x + b2^T x + c2
obtaining the polynomial parameters by weighted least squares; output the fitting polynomial f2(x) for this frame.
Step S303: Relate the two expressions. Input the fitting polynomials f1(x) and f2(x) of the two adjacent frames. Since the two polynomials represent two consecutive frames of the video, they are related by the motion between the frames: if the displacement of a pixel between the two frames is d, then
f2(x) = f1(x - d)
which gives
A2 = A1
b2 = b1 - 2A1d
c2 = d^T A1 d - b1^T d + c1
The displacement d is then treated as a function of x, and the corresponding A and Δb are defined as
A(x) = (A1(x) + A2(x)) / 2
Δb(x) = -(b2(x) - b1(x)) / 2
so that the displacement of pixel x is obtained as
d(x) = A^{-1}(x) Δb(x)
Output the displacement d(x) of every pixel of the previous frame.
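The per-pixel displacement solve of step S303 can be checked numerically. Given the coefficient relations b2 = b1 - 2A1d (and A2 = A1), the displacement is recovered by solving a small linear system; the function name below is illustrative:

```python
import numpy as np

def farneback_displacement(A1, b1, A2, b2):
    """Solve A(x) d(x) = delta_b(x) for the displacement at one pixel,
    with A = (A1 + A2) / 2 and delta_b = -(b2 - b1) / 2."""
    A = (A1 + A2) / 2.0
    delta_b = -(b2 - b1) / 2.0
    return np.linalg.solve(A, delta_b)
```

As a sanity check, constructing b2 from a known displacement d via b2 = b1 - 2 A1 d and then calling the solver should recover d exactly.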
Step S304: Replace pixel gray values. Input the video stream I and the displacement d(x) of each frame. After the optical-flow value of every pixel of every frame in I has been obtained, replace the original gray value of each pixel with its optical-flow value, and output the replaced video stream Iout.
Step S305: Overlapped sampling. Input the replaced video stream Iout. Starting from the first pixel of the first frame, repeatedly sample windows of size N × N with overlap ratio θ, and output the set Cell of identically sized video sampling image blocks. N is the sampling size in the spatial dimension and its value depends on the image size; normally N = 24 and θ = 0.5, i.e. with these parameters the spatial dimension is sampled once every 12 pixels.
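A minimal sketch of the overlapped sampling with the stated defaults (N = 24, θ = 0.5, hence a stride of 12 pixels); the function name is illustrative:

```python
import numpy as np

def overlap_sample(frame: np.ndarray, n: int = 24, overlap: float = 0.5):
    """Slide an n x n window over the frame with the given overlap ratio."""
    stride = int(n * (1.0 - overlap))  # n=24, overlap=0.5 -> stride of 12 pixels
    blocks = []
    for y in range(0, frame.shape[0] - n + 1, stride):
        for x in range(0, frame.shape[1] - n + 1, stride):
            blocks.append(frame[y:y + n, x:x + n])
    return blocks
```

On a 48 × 48 frame this produces a 3 × 3 grid of nine 24 × 24 blocks, matching the once-every-12-pixels sampling described above.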
Step S103: Moving-region screening.
After step S102 this stage receives the set Cell of identically sized video sampling image blocks. Because the sampling is a global overlapped sampling, some blocks contain only background information and no motion information at all, so the sampling blocks are screened in this stage: those that contain only background information are discarded, and the set Cellout of blocks containing motion information is output. Referring to Fig. 4, the detailed process is as follows:
Step S401: Set the division threshold. Input the sampled image-block set Cell. Build a bimodal histogram of the optical-flow vector magnitudes of all pixels in all blocks of the set: starting from 0, with one bin every δ, count every pixel's optical-flow value into its corresponding bin to obtain the statistics histogram; normally δ = 0.025.
After counting, first scan the resulting histogram from small to large to find the position of the first peak, then scan it from large to small to find the position of the second peak, and finally find the position of the valley between the two peaks; take the midpoint of the bin corresponding to the valley as the division threshold ξ, and output ξ.
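The peak-peak-valley scan just described can be sketched as follows, assuming "peak" means a bin strictly higher than both neighbors (the patent does not define peaks more precisely):

```python
import numpy as np

def bimodal_threshold(hist: np.ndarray, bin_width: float = 0.025) -> float:
    """Return the flow value at the valley between the two histogram peaks."""
    def is_peak(h, i):
        return h[i] > h[i - 1] and h[i] > h[i + 1]

    # first peak scanning left-to-right, second peak scanning right-to-left
    left = next(i for i in range(1, len(hist) - 1) if is_peak(hist, i))
    right = next(i for i in range(len(hist) - 2, 0, -1) if is_peak(hist, i))
    # lowest bin between the two peaks
    valley = left + int(np.argmin(hist[left:right + 1]))
    return (valley + 0.5) * bin_width  # midpoint of the valley bin
```

The returned value is the division threshold ξ in the units of the optical-flow magnitude, given the default bin width δ = 0.025.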
Step S402: Moving-region judgment for sampling blocks. Input the division threshold ξ and the sampled image-block set Cell. After the threshold has been obtained, each sampling block is screened: a pixel whose optical-flow magnitude exceeds ξ is considered to represent a moving region and is defined as an active pixel; if the proportion of active pixels in the whole block exceeds P, the block is considered to represent a moving region, otherwise it is regarded as a background block and discarded. Normally P = 20%. Finally, output the set Cellout of blocks containing motion information.
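The active-pixel test for one block reduces to a couple of lines; a minimal sketch with the default P = 20%:

```python
import numpy as np

def is_motion_block(flow_block: np.ndarray, xi: float, p: float = 0.20) -> bool:
    """Keep a sampled block only if enough of its pixels are 'active',
    i.e. their optical-flow magnitude exceeds the division threshold xi."""
    active_ratio = float(np.mean(flow_block > xi))
    return active_ratio > p
```

A block in which a quarter of the pixels move passes the filter, while an all-background block is discarded.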
Step S104: Deep feature extraction.
After the processing of step S103, the remaining block images all contain motion events. This stage inputs the set Cellout of blocks containing motion information, first trains a 3-layer deep learning network PCANet with these sampled images, and then uses the trained network to extract the deep features of the corresponding images, outputting the trained network model Net and the feature set v of the sampling-block set. Referring to Fig. 5, the detailed process is as follows:
Step S501: First-layer learning. Input the sampled image-block set Cellout. The first layer of the deep network has L1 filters that filter the input images. For a sampled image of size N × N, first densely sample patches of size k1 × k2 (generally k1 = k2 = 5), and rearrange each patch into a column vector xi; over all video sampling blocks this yields a patch matrix X.
Then perform principal component analysis on X and take the eigenvectors corresponding to the first L1 largest eigenvalues as the filters, each rearranged into a k1 × k2 matrix. Each input image is filtered with each filter, so every input image is converted into L1 filtered images; normally L1 = 4. Output the filtered images Il corresponding to the sampled images.
Step S502: Second-layer learning. Input the first-layer filtered images Il. The second layer of the network has L2 filters, generally L2 = 4. As in step S501, first densely sample all images with patches of size k1 × k2 and arrange them side by side into a patch matrix X; then perform principal component analysis on this matrix, take the eigenvectors corresponding to the first L2 largest eigenvalues as filters, and filter the images with them.
Since each input optical-flow image produces L1 filtered images after the first layer, an image that has passed through the first two layers of the deep network yields L1 × L2 filtered images, and the trained deep network Net is output; each image set Ol contains the corresponding L2 filtered images.
Step S503: Deep feature output. Input the second-layer filtered images Ol. The third layer is the output layer of the network. The filtered images of the second layer are first binarized so that the result contains only ones and zeros. Each image set Ol = {Ol,1, ..., Ol,L2} is then converted into an integer matrix
Tl = Σ_{i=1}^{L2} 2^{i-1} H(Ol,i)
where H(*) is the unit step function, H(x) = 1 for x > 0 and H(x) = 0 otherwise.
By this processing each pixel is encoded as an integer in [0, 2^{L2}) = [0, 16). After the integer matrix Tl has been obtained, its histogram is computed, giving a 16-dimensional histogram statistics vector.
For the L1 image sets Ol altogether, L1 such statistics vectors are obtained; these vectors are concatenated, and a deep feature vector of dimension 2^{L2} · L1 = 64 is output.
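The third-layer binary hashing and histogram step for one image set Ol can be sketched as follows, with H(*) taken as the unit step on positive values as defined above:

```python
import numpy as np

def hash_and_histogram(filter_outputs):
    """filter_outputs: list of L2 same-sized filter response maps for one O^l.
    Binarize each map with the unit step H(*), pack the bits into an integer
    image T^l in [0, 2^L2), and return its 2^L2-bin histogram."""
    l2 = len(filter_outputs)
    t = np.zeros(filter_outputs[0].shape, dtype=np.int64)
    for i, out in enumerate(filter_outputs):
        t += (2 ** i) * (out > 0).astype(np.int64)  # weight bit i by 2^(i-1)
    hist, _ = np.histogram(t, bins=np.arange(2 ** l2 + 1))
    return hist
```

With L2 = 4 this yields the 16-bin vector per image set described above; concatenating the L1 = 4 histograms gives the final 64-dimensional deep feature.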
Step S105: Dynamic clustering modeling.
Step S104 yields the deep features of all sampled images. This stage inputs the deep feature vector set v of the sampled image blocks, models it with the two-layer clustering model, and outputs the event-cluster codebooks c of the deep feature set and the maximum intra-class distance d of each codebook. Referring to Fig. 6, the detailed process is as follows:
Step S601: Dictionary-set initialization. First define an empty dictionary set of fixed size N, then add the deep feature vectors of all sampling blocks to this dictionary set one by one, maintaining a count ω(v) for each vector v in the set; normally N = 200.
Step S602: Add feature vectors one by one. Input the deep feature vector set v and add its feature vectors to the dictionary set in sequence. For each newly added feature vector, if the number of vectors in the dictionary set after the addition is ≤ N, the vector is added directly and its count is set to ω(v) = 1; if the number equals N + 1, the vectors in the dictionary set must be merged so that their total stays at N.
Step S603: Vector merging. Input the dictionary set to be merged. When a merge is needed, the two vectors with the smallest Euclidean distance in the dictionary set, va = [x1a, x2a, ..., xna] and vb = [x1b, x2b, ..., xnb], are chosen and merged. During merging, the vector with the smaller ω(*) value is merged into the vector with the larger ω(*) value; assume here that ω(va) ≥ ω(vb), so vb is merged into va.
For each dimension of the vectors to be merged, the values of the two vectors in that dimension are compared and the merge is carried out according to their relative magnitudes; if the new vector is v = [x1, x2, ..., xn], then
xi = (1 - α)xia + α × sign(xia, xib) × xib
and the count of the merged vector is
ω(v) = ω(va) + ω(vb)
After the merge, the dictionary set, whose total remains N, is output.
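The dictionary-maintenance loop of steps S602-S603 can be sketched as follows. Note the assumptions: the patent's per-dimension rule leaves the weight α and the sign(*) function unspecified, so this sketch substitutes a count-weighted average for the merge itself, keeping the stated behavior of merging the rarer vector into the more frequent one and summing the counts:

```python
import numpy as np

def add_to_dictionary(dictionary, counts, v, max_size=200):
    """Add feature vector v; if the dictionary overflows, merge the two closest
    entries. The count-weighted average below is a stand-in for the patent's
    per-dimension merge rule, whose weight alpha is not specified."""
    dictionary.append(np.asarray(v, dtype=float))
    counts.append(1)
    if len(dictionary) <= max_size:
        return
    # find the pair with the smallest Euclidean distance
    best, best_d = (0, 1), np.inf
    for i in range(len(dictionary)):
        for j in range(i + 1, len(dictionary)):
            dist = np.linalg.norm(dictionary[i] - dictionary[j])
            if dist < best_d:
                best_d, best = dist, (i, j)
    i, j = best
    if counts[i] < counts[j]:
        i, j = j, i  # merge the rarer vector into the more frequent one
    w = counts[j] / (counts[i] + counts[j])
    dictionary[i] = (1 - w) * dictionary[i] + w * dictionary[j]
    counts[i] += counts[j]
    del dictionary[j], counts[j]
```

With a toy maximum size of 2, inserting three vectors forces one merge: the two closest vectors collapse into one entry whose count is the sum of theirs, so the set size stays fixed while no occurrence count is lost.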
Step S604: Codebook clustering. Input the dictionary set after maintenance. After all deep feature vectors have been added to the dictionary set, N vectors remain after merging. These N vectors are then clustered with the K-means algorithm into k event-cluster codebooks, each class representing one kind of motion event in the video; the cluster center of each event class and the maximum distance d between vectors within the class are recorded and output. Here k = 16.
Step S106: Abnormal event detection.
The training data set input to the algorithm has been converted into the corresponding model by step S105, producing the event-cluster codebooks, each of which represents one kind of motion event in the training video. In this stage the algorithm performs abnormal event detection on the input test video and outputs the video stream with detection marks. Referring to Fig. 7, the detailed process is as follows:
Step S701: Compute motion-event occurrence probabilities. In step S105, K-means clustering yields the center vector of each event-cluster codebook and the maximum intra-class distance of the cluster. For each center vector ci, the ω(*) value of the event cluster is defined as the sum of the ω(*) values of all vectors belonging to that class.
After the count value ω(*) of each event cluster has been obtained, it is converted into the corresponding occurrence probability
p(ci) = ω(ci) / Σj ω(cj)
which indicates how often the motion event corresponding to that event-cluster codebook occurs in the training video.
Step S702: Test-video feature extraction. After the probabilities have been computed, the input test video is first preprocessed according to step S101; it is then sampled according to step S102, yielding a series of sampling blocks; moving-region screening is performed with the method of step S103, discarding the blocks that contain only background information so that only the blocks containing motion events undergo abnormality judgment. After screening, the block images containing motion information are input into the trained PCANet to generate the corresponding deep feature vectors, and the test feature vectors are output.
Step S703: Abnormal event detection. Input the test feature vectors. Once the deep feature vector of a test sampling block has been obtained, it is judged for abnormality. Any test feature vector v is compared one by one with the center vectors ci of all event clusters: if the Euclidean distance between v and some center vector ci is smaller than the corresponding maximum intra-class distance di, the motion corresponding to the sampling block is considered normal and the procedure goes to step S705; if the Euclidean distances between v and all ci exceed the respective di, the block is judged abnormal and the procedure goes to step S704.
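The abnormality test of step S703 is a simple nearest-codebook check against each cluster's maximum intra-class distance; a minimal sketch:

```python
import numpy as np

def is_abnormal(v, centers, max_intra_dists):
    """A test vector is abnormal iff it lies farther from every event-cluster
    center than that cluster's maximum intra-class distance."""
    v = np.asarray(v, dtype=float)
    for c, d_max in zip(centers, max_intra_dists):
        if np.linalg.norm(v - np.asarray(c, dtype=float)) <= d_max:
            return False  # close enough to a known motion event -> normal
    return True
```

A vector inside any cluster's radius is labeled normal; one outside all radii goes on to the secondary detection of step S704.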
Step S704: Secondary detection. Input the sampling blocks judged abnormal. The video image sampling blocks that were judged abnormal undergo a secondary detection to eliminate the interference of noise. For each abnormal sampling block, the blocks adjacent to it in the spatial and temporal dimensions are examined (see Fig. 8): if at least M abnormal sampling blocks surround it at the same time, it is confirmed as abnormal; otherwise the block is relabeled as normal. Normally M = 2.
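A sketch of this neighbor-confirmation step, indexing blocks on a (time, row, column) grid. The exact neighborhood is shown only in Fig. 8, so a full 3 × 3 × 3 spatio-temporal neighborhood (26 neighbors) is assumed here:

```python
def confirm_abnormal(flagged, pos, m=2):
    """flagged: set of (t, y, x) grid positions already flagged abnormal.
    Confirm pos only if at least m of its spatio-temporal neighbors are
    also flagged; otherwise treat the flag as noise and relabel as normal."""
    t, y, x = pos
    neighbours = 0
    for dt in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if (dt, dy, dx) == (0, 0, 0):
                    continue  # skip the block itself
                if (t + dt, y + dy, x + dx) in flagged:
                    neighbours += 1
    return neighbours >= m
```

With the default M = 2, an isolated flagged block is suppressed, while a block with two flagged neighbors is confirmed as abnormal.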
Step S705: Online update. Input the test feature vectors. After the abnormality judgment has finished, the deep feature vectors of the test sampling blocks must be merged into the event-cluster codebooks, so that the codebooks can gradually learn the motion events that newly appear in the video under detection. To this end the event-cluster codebooks are updated with the test vectors by the method of step S105.

Claims (6)

1. A surveillance-video abnormal event detection method based on deep learning and dynamic clustering, which extracts deep features from video sampling blocks automatically with PCANet, screens the sampling blocks for moving regions, and models the feature set with a two-layer clustering model based on vector merging, characterized by comprising the following steps:
Step 1: Image preprocessing: read the surveillance video stream as input, convert it to grayscale and denoise it with Gaussian filtering;
Step 2: Overlapped sampling: for the input video stream, first compute the optical-flow value of every pixel in every frame and replace each pixel's gray value with its optical-flow value; then perform fixed-size overlapped sampling on I, outputting a series of N × N video sampling image blocks;
Step 3: Moving-region screening: for all sampled image blocks, first use the bimodal histogram method to obtain the threshold that separates moving pixels from background pixels in the image, then judge each sampled block against this threshold, keep the blocks that contain motion events, and discard the blocks that contain only background information;
Step 4: Deep feature extraction: input the sampled image blocks containing only motion information into a 3-layer PCANet for parameter training; once the deep network is trained, input the image blocks into the trained network again, and the network outputs a deep feature for each sampled image block;
Step 5: Dynamic clustering modeling: sequentially insert the deep feature vectors into a fixed-size dictionary set; whenever the set exceeds its upper bound, merge the two closest feature vectors so that the total stays constant; after this maintenance, cluster the dictionary set with the K-means algorithm and output the corresponding event-cluster codebooks;
Step 6: After the model is built, input the test video, sample every frame of the test video and perform moving-region judgment, then input the sampled images into the trained PCANet to obtain the corresponding deep features, and finally compare each feature vector with the event-cluster codebooks; if its distance to every codebook exceeds the respective threshold, it is judged to be an abnormal event.
2. the monitor video accident detection method according to claim 1 based on deep learning and dynamic clustering, It is characterized in that the overlap sampling described in step 2, it is specific as follows:
Step 2-1: Fit the previous frame. Input the former of two adjacent image frames in I. For the former frame of the two adjacent consecutive video frames, the neighbourhood of each pixel in the frame is approximated by a polynomial

f1(x) = x^T A x + b^T x + c

where A is a symmetric matrix, b is a vector and c is a scalar, whose values are obtained by weighted least-squares fitting; the fitting polynomial f1(x) of the frame image is output;
Step 2-2: Fit the following frame. Input the latter of the two adjacent image frames in I. The latter of the two adjacent frames is approximated in the same way as

f2(x) = x^T A x + b^T x + c

and the polynomial parameters are again obtained by weighted least squares; the fitting polynomial f2(x) of the frame image is output;
Step 2-3: Associate and solve the two expressions. Input the fitting polynomials f1(x) and f2(x) of the two adjacent frames. Since the two polynomials represent two adjacent consecutive frames of the video image, there is a motion relation between them; if the displacement of a pixel between the two frames is d, then f2(x) = f1(x − d), which yields

A2 = A1
b2 = b1 − 2 A1 d
c2 = d^T A1 d − b1^T d + c1
The displacement d is then treated as a function of x, and the corresponding A(x) and Δb(x) are defined as

A(x) = (A1(x) + A2(x)) / 2
Δb(x) = −(b2(x) − b1(x)) / 2

so the displacement of pixel x is obtained as

d(x) = A^(-1)(x) Δb(x)

The displacement d(x) of every pixel in the previous frame image is output;
Step 2-4: Replace pixel gray values. Input the video stream I and the displacement d(x) of each frame image. After the optical-flow value of every pixel of every frame in the video stream I has been obtained, the original gray value of each pixel is replaced with its optical-flow value, and the corresponding replaced video stream Iout is output;
Step 2-5: Overlap sampling. Input the replaced video stream Iout. Starting from the first pixel of the first frame image, repeated sampling of size N×N with overlap ratio θ is carried out in sequence, and a set Cell of video sampled image blocks of identical, fixed size is output. Here N is the sampling size in the spatial dimension and depends on the image size; normally N = 24 and the repetition rate θ = 0.5, i.e. with these parameters the spatial dimension is sampled every 12 pixels during the sampling process.
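A minimal sketch of the overlap sampling of step 2-5, assuming the per-pixel optical-flow magnitudes of steps 2-1 to 2-4 (the Farnebäck polynomial-expansion flow, available e.g. as `cv2.calcOpticalFlowFarneback` in OpenCV) are already stored in a 2-D array:

```python
import numpy as np

def overlap_sample(flow_mag, n=24, overlap=0.5):
    """Cut a per-pixel flow-magnitude image into overlapping n x n blocks.
    With n = 24 and overlap = 0.5 the stride is 12 pixels, as in step 2-5."""
    stride = max(1, int(round(n * (1 - overlap))))
    h, w = flow_mag.shape
    return [flow_mag[y:y + n, x:x + n]
            for y in range(0, h - n + 1, stride)
            for x in range(0, w - n + 1, stride)]
```

For a 48x48 flow field this yields a 3x3 grid of 24x24 blocks, each sharing half its area with its neighbours.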
3. The surveillance video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the moving-region screening described in step 3 is specifically as follows:
Step 3-1: Set the division threshold. Input the sampled image block set Cell. A bimodal histogram statistic is computed over the optical-flow vector values of all pixels in all blocks of the set: starting from 0, the flow values of all pixels are binned into intervals of width δ (δ = 0.025) according to their magnitude, yielding the corresponding statistical histogram.
After the statistics are complete, the histogram is first scanned from small to large to locate the position of the first peak, then scanned from large to small to locate the position of the second peak, and finally the position of the valley between the two peaks is found; the median of the statistical interval corresponding to the valley is taken as the division threshold ξ, which is output;
Step 3-2: Judge the moving region of each sampled block. Input the division threshold ξ and the sampled block set Cell. After the division threshold has been obtained, each sampled block is screened: if the optical-flow vector magnitude of a pixel in the block exceeds the threshold ξ, the pixel is deemed to represent a moving region and is defined as an active pixel. If the proportion of active pixels in the whole block exceeds P (P = 20%), the block is regarded as representing a moving region; otherwise it is regarded as a background block and rejected. Finally the set Cellout of sampled blocks containing motion information is output.
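The two-peak thresholding of step 3-1 and the active-pixel screening of step 3-2 can be sketched as follows. The exact peak/valley scan is one plausible implementation of the claim's description, not the patent's verbatim procedure:

```python
import numpy as np

def bimodal_threshold(flow_values, delta=0.025):
    """Step 3-1: bin all flow magnitudes with width delta, find the first
    peak left-to-right and the second peak right-to-left, and return the
    midpoint of the valley bin between them as the division threshold."""
    edges = np.arange(0.0, flow_values.max() + 2 * delta, delta)
    hist, _ = np.histogram(flow_values, bins=edges)
    last = len(hist) - 1
    p1 = next((i for i in range(len(hist))
               if (i == 0 or hist[i] >= hist[i - 1])
               and (i == last or hist[i] > hist[i + 1])), 0)
    p2 = next((i for i in range(last, -1, -1)
               if (i == last or hist[i] >= hist[i + 1])
               and (i == 0 or hist[i] > hist[i - 1])), last)
    valley = p1 + int(np.argmin(hist[p1:p2 + 1])) if p2 > p1 else p1
    return (edges[valley] + edges[valley + 1]) / 2.0

def is_motion_block(block, xi, p=0.20):
    """Step 3-2: a block is kept when more than fraction p of its pixels
    are 'active', i.e. their flow magnitude exceeds the threshold xi."""
    return bool((block > xi).mean() > p)
```

On a clearly bimodal flow distribution the returned threshold falls in the valley between the background mode and the motion mode.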
4. The surveillance video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the depth feature extraction described in step 4 is specifically as follows:
Step 4-1: First-layer learning. Input the sampled block set Cellout. The first layer of the deep network contains L1 filters with which the input images are filtered. For a sampled image of size N×N, dense sampling with patches of size k1×k2 (k1 = k2 = 5) is first carried out, each patch is rearranged into a column vector xi, and the patches of all video sampling blocks together form a sample matrix X.
Principal component analysis is then applied to the matrix X, and the eigenvectors corresponding to the L1 largest eigenvalues, rearranged into matrices of size k1×k2, serve as filters. The input image is filtered with each filter, so every input sampled image is converted into L1 filtered images Il = I * Wl^1, l = 1, 2, ..., L1 (normally L1 = 4); the filtered images Il corresponding to each sampled image are output;
Step 4-2: Second-layer learning. Input the first-layer filtered images Il. The second layer of the network contains L2 filters (L2 = 4). In the second layer, all images are again densely sampled with patches of size k1×k2, which are vectorized side by side into a sample matrix X; principal component analysis is applied to this matrix, the eigenvectors corresponding to the L2 largest eigenvalues are taken as filters, and the images are filtered with them.
Since each input optical-flow image produces L1 filtered images after the first layer, after the first two layers of the deep network each image is output as L1×L2 filtered images Ol = {Il * Wk^2 | k = 1, 2, ..., L2}, together with the trained deep network Net, where each Ol contains L2 filtered images;
Step 4-3: Depth feature output. Input the second-layer filtered images Ol, l = 1, 2, ..., L1. The third layer is the output layer of the network. The filtered images output by the second layer are first binarized so that the result contains only ones and zeros; each image set Ol can then be converted into an integer matrix

Tl = Σ_{k=1}^{L2} 2^(k−1) H(Il * Wk^2)

where H(·) is the unit-step (Heaviside) function, equal to 1 for positive arguments and 0 otherwise;
Through this processing, every pixel is encoded as an integer in [0, 16). After the integer matrix Tl is obtained, histogram statistics over the matrix yield a 16-dimensional statistics vector.
For all L1 image sets Ol, L1 statistical vectors are obtained in total; these statistical vectors are concatenated, and a depth feature vector of dimension 2^L2 · L1 (here 16 × 4 = 64) is output.
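Under the assumption that the dense k1×k2 patches have already been collected into the rows of a matrix, the PCA filter learning of steps 4-1/4-2 and the binary hashing plus histogram of step 4-3 can be sketched as:

```python
import numpy as np

def learn_pca_filters(patches, n_filters=4):
    """Steps 4-1/4-2: rows of `patches` are flattened k1*k2 patches; the
    leading principal directions act as the layer's convolution filters."""
    x = patches - patches.mean(axis=0)           # remove the patch mean
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:n_filters]                        # one flattened filter per row

def binary_hash_histogram(responses):
    """Step 4-3: `responses` holds the L2 second-layer filter outputs for
    one block; Heaviside-binarize, weight by powers of two (2^0..2^(L2-1)),
    and histogram the resulting integer codes."""
    l2 = responses.shape[0]
    codes = sum(((responses[k] > 0).astype(int) << k) for k in range(l2))
    hist, _ = np.histogram(codes, bins=np.arange(2 ** l2 + 1))
    return hist                                  # 2**L2 bins (16 for L2 = 4)
```

Concatenating the per-set histograms over all L1 image sets gives the 2^L2 · L1 dimensional depth feature described in the claim.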
5. The surveillance video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the dynamic clustering modelling described in step 5 is specifically as follows:
Step 5-1: Initialize the dictionary set. An empty dictionary set of fixed size N is first defined, the depth feature vectors of all sampled blocks are then added to it one by one, and a count ω(v) is maintained for every vector v in the dictionary set; normally N = 200;
Step 5-2: Add feature vectors one by one. Input the depth feature vector set V, whose feature vectors are added to the dictionary set in sequence. For each newly added feature vector: if the number of vectors in the dictionary set after insertion is ≤ N, it is added directly and the newly added vector receives the count ω(v) = 1; if the number equals N + 1, vectors in the dictionary set must be merged so that the total number of vectors is kept at N;
Step 5-3: Merge vectors. Input the dictionary set to be merged. When a merge is required, the two vectors of the dictionary set with the smallest Euclidean distance, va = [x1a, x2a, ..., xna] and vb = [x1b, x2b, ..., xnb], are chosen for merging. During merging, the vector with the smaller ω(·) value is merged into the vector with the larger ω(·) value; assume here that ω(va) ≥ ω(vb), so that vb is merged into va.
For every dimension of the vectors to be merged, the values of the two vectors in that dimension are compared and merged according to their relative magnitude; if the new vector is v = [x1, x2, ..., xn], then

xi = (1 − α) xia + α × sign(xia, xib) × xib
During merging, the count of the new merged vector is

ω(v) = ω(va) + ω(vb)

and the dictionary set, whose total remains N after merging, is output;
Step 5-4: Codebook clustering. Input the dictionary set after maintenance. After all depth feature vectors have been added to the dictionary set in sequence, only N vectors remain at the end after merging. These N vectors are then clustered with the K-means algorithm into k event-cluster codebooks (k = 16), each class representing one kind of motion event in the video; the cluster centre of each event class and the maximum distance d between vectors within the class are recorded and output.
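The fixed-size dictionary maintenance of steps 5-2/5-3 can be sketched as below. Note the merge here uses a plain convex blend of the two closest vectors; the patent's per-dimension rule xi = (1 − α)xia + α·sign(xia, xib)·xib is simplified for clarity:

```python
import numpy as np

def add_to_dictionary(vectors, counts, v, cap=200, alpha=0.5):
    """Steps 5-2/5-3: append v to the dictionary; when the size exceeds
    `cap`, merge the two vectors with the smallest Euclidean distance,
    folding the lower-count vector into the higher-count one so the
    dictionary size stays at `cap`."""
    vectors.append(np.asarray(v, dtype=float))
    counts.append(1)
    if len(vectors) <= cap:
        return
    best, bi, bj = np.inf, 0, 1
    for i in range(len(vectors)):               # closest pair (O(N^2) scan)
        for j in range(i + 1, len(vectors)):
            d = np.linalg.norm(vectors[i] - vectors[j])
            if d < best:
                best, bi, bj = d, i, j
    if counts[bi] < counts[bj]:                 # keep the higher-count vector
        bi, bj = bj, bi
    vectors[bi] = (1 - alpha) * vectors[bi] + alpha * vectors[bj]
    counts[bi] += counts[bj]                    # omega(v) = omega(a) + omega(b)
    del vectors[bj]
    del counts[bj]
```

Once all features have been streamed through, the surviving `cap` vectors are handed to K-means (e.g. `sklearn.cluster.KMeans`) to form the k event-cluster codebooks of step 5-4.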
6. The surveillance video abnormal event detection method based on deep learning and dynamic clustering according to claim 1, characterized in that the abnormal event detection described in step 6 is specifically as follows:
Step 6-1: Compute motion-event occurrence probabilities. The K-means clustering of step 5 yields the centre vector of each event-cluster codebook and the maximum intra-class distance of that cluster. For each centre vector ci, the ω(·) value of the event cluster is defined as the sum of the ω(·) values of all vectors belonging to that class;
After the count value ω(·) of each event cluster has been obtained, it is converted into the corresponding occurrence probability

p(ci) = ω(ci) / Σj ω(cj)

which indicates how frequently the motion event corresponding to that event-cluster codebook occurs in the training video;
Step 6-2: Test-video feature extraction. After the probabilities have been computed, the input test video is first preprocessed according to step 1, then sampled according to step 2 to obtain a series of sampled blocks, and screened for moving regions according to step 3, so that blocks containing only background information are rejected and only blocks containing motion events are subjected to abnormality judgement. After screening, the sampled block images containing motion information are input into the trained PCANet, which generates the corresponding depth feature vectors; the corresponding test feature vectors are output;
Step 6-3: Abnormal event detection. Input the test feature vectors. After the depth feature vector of a test sample block has been obtained, it is judged for abnormality: any test feature vector v is compared one by one with the centre vectors ci of all event clusters. If the Euclidean distance between v and some centre vector ci is smaller than the corresponding maximum intra-class distance di, the motion corresponding to the sampled block is considered normal and step 6-5 follows; if the Euclidean distances between v and all ci exceed the respective di, the block is determined to be abnormal and step 6-4 follows;
Step 6-4: Secondary detection. Input the sampled blocks judged abnormal. To eliminate the influence of noise on detection, the video image blocks determined to be abnormal undergo a secondary check: for each abnormal sample block, the blocks adjacent to it in the spatial and temporal dimensions are examined, and only if at least M (normally M = 2) abnormal sample blocks surround it is it confirmed as abnormal; otherwise the block is re-classified as normal;
Step 6-5: Online update. Input the test feature vectors. After the abnormality judgement ends, the depth feature vector of the test sample block is used to update the event-cluster codebook, so that while detecting, the codebook gradually learns motion events newly appearing in the video; to this end the test vector is used to update the event-cluster codebook again by the method of step 5.
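The codebook comparison of step 6-3 and the neighbour vote of step 6-4 reduce to a few lines. This is a sketch; `neighbors` is assumed to index the spatio-temporally adjacent blocks of the block under test:

```python
import numpy as np

def is_abnormal(v, centers, max_dists):
    """Step 6-3: v is abnormal iff its Euclidean distance to every cluster
    centre c_i exceeds that cluster's maximum intra-class distance d_i."""
    return all(np.linalg.norm(v - c) > d for c, d in zip(centers, max_dists))

def secondary_check(flags, idx, neighbors, m=2):
    """Step 6-4: keep the anomaly only if at least m spatio-temporal
    neighbours are also flagged; otherwise it is relabelled normal."""
    return bool(flags[idx]) and sum(bool(flags[j]) for j in neighbors) >= m
```

A vector inside any cluster's radius is accepted as a known motion event; only vectors outside all radii reach the secondary noise check.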
CN201810320572.6A 2018-04-11 2018-04-11 Monitoring video abnormal event detection method based on deep learning and dynamic clustering Active CN108805002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810320572.6A CN108805002B (en) 2018-04-11 2018-04-11 Monitoring video abnormal event detection method based on deep learning and dynamic clustering


Publications (2)

Publication Number Publication Date
CN108805002A true CN108805002A (en) 2018-11-13
CN108805002B CN108805002B (en) 2022-03-01

Family

ID=64094844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810320572.6A Active CN108805002B (en) 2018-04-11 2018-04-11 Monitoring video abnormal event detection method based on deep learning and dynamic clustering

Country Status (1)

Country Link
CN (1) CN108805002B (en)


Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006012174A (en) * 2004-06-28 2006-01-12 Mitsubishi Electric Research Laboratories Inc Method for detecting abnormal event in video
US20090016610A1 (en) * 2007-07-09 2009-01-15 Honeywell International Inc. Methods of Using Motion-Texture Analysis to Perform Activity Recognition and Detect Abnormal Patterns of Activities
CN101872418A (en) * 2010-05-28 2010-10-27 电子科技大学 Detection method based on group environment abnormal behavior
US20120314064A1 (en) * 2011-06-13 2012-12-13 Sony Corporation Abnormal behavior detecting apparatus and method thereof, and video monitoring system
CN103390278A (en) * 2013-07-23 2013-11-13 中国科学技术大学 Detecting system for video aberrant behavior
CN104123544A (en) * 2014-07-23 2014-10-29 通号通信信息集团有限公司 Video analysis based abnormal behavior detection method and system
CN105354542A (en) * 2015-10-27 2016-02-24 杭州电子科技大学 Method for detecting abnormal video event in crowded scene
CN105608446A (en) * 2016-02-02 2016-05-25 北京大学深圳研究生院 Video stream abnormal event detection method and apparatus
CN105787472A (en) * 2016-03-28 2016-07-20 电子科技大学 Abnormal behavior detection method based on time-space Laplacian Eigenmaps learning
CN105913002A (en) * 2016-04-07 2016-08-31 杭州电子科技大学 On-line adaptive abnormal event detection method under video scene
CN106228149A (en) * 2016-08-04 2016-12-14 杭州电子科技大学 A kind of video anomaly detection method
CN106384092A (en) * 2016-09-11 2017-02-08 杭州电子科技大学 Online low-rank abnormal video event detection method for monitoring scene
CN106980829A (en) * 2017-03-17 2017-07-25 苏州大学 Abnormal behaviour automatic testing method of fighting based on video analysis
CN107590427A (en) * 2017-05-25 2018-01-16 杭州电子科技大学 Monitor video accident detection method based on space-time interest points noise reduction
CN107729799A (en) * 2017-06-13 2018-02-23 银江股份有限公司 Crowd's abnormal behaviour vision-based detection and analyzing and alarming system based on depth convolutional neural networks


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GNANAVEL, V. K. et al.: "Abnormal Event Detection in Crowded Video Scenes", Advances in Intelligent Systems and Computing *
NAJLA BOUARADA et al.: "Abnormal Events Detection Based on Trajectory Clustering", IEEE *
WANG Jun et al.: "Abnormal behavior detection based on deep learning features", Journal of Hunan University (Natural Sciences) *
GAI Jie et al.: "Global abnormal event detection method combining multiple attributes in video", Journal of Hangzhou Dianzi University (Natural Sciences) *
CHENG Yanyun et al.: "Local abnormal behavior detection based on a video image block model", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition) *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109460744A (en) * 2018-11-26 2019-03-12 南京邮电大学 A kind of video monitoring system based on deep learning
CN109460744B (en) * 2018-11-26 2021-08-27 南京邮电大学 Video monitoring system based on deep learning
CN110210530A (en) * 2019-05-15 2019-09-06 杭州智尚云科信息技术有限公司 Intelligent control method, device, equipment, system and storage medium based on machine vision
CN110362713A (en) * 2019-07-12 2019-10-22 四川长虹电子系统有限公司 Video monitoring method for early warning and system based on Spark Streaming
CN110362713B (en) * 2019-07-12 2023-06-06 四川长虹云数信息技术有限公司 Video monitoring and early warning method and system based on Spark Streaming
CN111614627A (en) * 2020-04-27 2020-09-01 中国舰船研究设计中心 SDN-oriented cross-plane cooperation DDOS detection and defense method and system
CN111614627B (en) * 2020-04-27 2022-03-25 中国舰船研究设计中心 SDN-oriented cross-plane cooperation DDOS detection and defense method and system
CN113836976A (en) * 2020-06-23 2021-12-24 江苏翼视智能科技有限公司 Method for detecting global abnormal event in surveillance video
CN111814644A (en) * 2020-07-01 2020-10-23 重庆邮电大学 Video abnormal event detection method based on disturbance visual interpretation
CN111814644B (en) * 2020-07-01 2022-05-03 重庆邮电大学 Video abnormal event detection method based on disturbance visual interpretation
CN112367292A (en) * 2020-10-10 2021-02-12 浙江大学 Encrypted flow anomaly detection method based on deep dictionary learning
CN112866654B (en) * 2021-03-11 2023-02-28 福建环宇通信息科技股份公司 Intelligent video monitoring system
CN112866654A (en) * 2021-03-11 2021-05-28 福建环宇通信息科技股份公司 Intelligent video monitoring system
CN113270200A (en) * 2021-05-24 2021-08-17 平安科技(深圳)有限公司 Abnormal patient identification method based on artificial intelligence and related equipment
CN113270200B (en) * 2021-05-24 2022-12-27 平安科技(深圳)有限公司 Abnormal patient identification method based on artificial intelligence and related equipment
CN113706837A (en) * 2021-07-09 2021-11-26 上海汽车集团股份有限公司 Engine abnormal state detection method and device
CN113706837B (en) * 2021-07-09 2022-12-06 上海汽车集团股份有限公司 Engine abnormal state detection method and device
CN114205726A (en) * 2021-09-01 2022-03-18 珠海市杰理科技股份有限公司 Testing method and device of finished product earphone and earphone manufacturing system
CN114205726B (en) * 2021-09-01 2024-04-12 珠海市杰理科技股份有限公司 Method and device for testing finished earphone and earphone manufacturing system
CN114092851A (en) * 2021-10-12 2022-02-25 甘肃欧美亚信息科技有限公司 Monitoring video abnormal event detection method based on time sequence action detection
CN115492493A (en) * 2022-07-28 2022-12-20 重庆长安汽车股份有限公司 Tail gate control method, device, equipment and medium
CN115345527A (en) * 2022-10-18 2022-11-15 成都西交智汇大数据科技有限公司 Chemical experiment abnormal operation detection method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN108805002B (en) 2022-03-01

Similar Documents

Publication Publication Date Title
CN108805002A (en) Monitor video accident detection method based on deep learning and dynamic clustering
Liznerski et al. Explainable deep one-class classification
CN108764085B (en) Crowd counting method based on generation of confrontation network
Medel et al. Anomaly detection in video using predictive convolutional long short-term memory networks
CN110363131B (en) Abnormal behavior detection method, system and medium based on human skeleton
CN109670528B (en) Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy
CN105139004B (en) Facial expression recognizing method based on video sequence
CN107506692A (en) A kind of dense population based on deep learning counts and personnel's distribution estimation method
CN110188637A (en) A kind of Activity recognition technical method based on deep learning
CN110213222A (en) Network inbreak detection method based on machine learning
CN109543727B (en) Semi-supervised anomaly detection method based on competitive reconstruction learning
Wang et al. STNet: Scale tree network with multi-level auxiliator for crowd counting
Li et al. A supervised clustering and classification algorithm for mining data with mixed variables
CN110826702A (en) Abnormal event detection method for multitask deep network
CN111145145B (en) Image surface defect detection method based on MobileNet
CN111079539A (en) Video abnormal behavior detection method based on abnormal tracking
CN113850284B (en) Multi-operation detection method based on multi-scale feature fusion and multi-branch prediction
CN116434069A (en) Remote sensing image change detection method based on local-global transducer network
Hongmeng et al. A detection method for deepfake hard compressed videos based on super-resolution reconstruction using CNN
Tambe et al. Towards designing an automated classification of lymphoma subtypes using deep neural networks
CN113344110A (en) Fuzzy image classification method based on super-resolution reconstruction
CN113761359A (en) Data packet recommendation method and device, electronic equipment and storage medium
CN117237994B (en) Method, device and system for counting personnel and detecting behaviors in oil and gas operation area
Chexia et al. A Generalized Model for Crowd Violence Detection Focusing on Human Contour and Dynamic Features
Zhu et al. Permutation-invariant tabular data synthesis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant