CN106503647A - Abnormal event detection method based on low-rank approximation structured sparse representation - Google Patents
- Publication number: CN106503647A (application number CN201610915766.1A)
- Authority
- CN
- China
- Prior art keywords
- feature
- dictionary
- training
- features
- test
- Prior art date
- Legal status (an assumption, not a legal conclusion): Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/513—Sparse representations
Abstract
The invention discloses an abnormal event detection method based on low-rank approximation structured sparse representation, comprising three processes: feature extraction, training, and testing. 1) Extract multi-scale 3D gradient features from the video sequence; 2) reduce the dimensionality of the multi-scale 3D gradient features to form a training feature set and a test feature set; 3) initialize the remaining training features and related parameters; 4) iteratively learn group sparse dictionaries on the remaining training features to obtain a normal-pattern dictionary set; 5) sparsely reconstruct the test features using the group sparse dictionary set obtained in training; 6) judge whether a test feature is abnormal according to its reconstruction error. The invention addresses two shortcomings of existing anomaly detection techniques: the low-rank property of video data is not fully exploited, and the detection rate is slow.
Description
Technical Field

The invention relates to the fields of pattern recognition and video analysis, and more specifically to an abnormal event detection method based on low-rank approximation structured sparse representation.

Background

Abnormal event detection in video sequences is an active research topic in computer vision and is widely used in applications such as crowd monitoring, public-place surveillance, traffic safety, and detection of abnormal individual behavior. Faced with massive volumes of video data, the traditional manual labeling of abnormal events is time-consuming and inefficient, so automated, fast anomaly detection methods for video sequences are urgently needed.

Although research on abnormal event detection has made great progress in feature extraction, behavior modeling, and anomaly measurement, detecting abnormal events in video sequences remains very challenging. First, there is no precise definition of an abnormal event in video. One common approach clusters abnormal behavior patterns; another treats detection samples with a low occurrence rate as anomalies. The difficulty with the first approach is the lack of sufficient prior knowledge to describe abnormal behavior patterns; the second requires building a probabilistic model, and its detection depends on the definition of normal patterns and on the multi-scale variation of features. Second, anomaly detection in dense scenes requires a behavior model that can handle a high density of moving targets, which means accounting for occlusion and interaction among multiple targets.

From the perspective of feature extraction, abnormal event detection methods can be divided into trajectory-based methods and low-level-feature-based methods. Trajectory-based methods first track the moving targets and then use the resulting trajectories to detect abnormal events. They can clearly represent the spatial state of each target at every moment, but they are sensitive to noise, occlusion, and tracking errors, and cannot perform anomaly detection in dense scenes. Methods based on low-level features overcome these shortcomings by extracting pixel-level motion and appearance features from the video sequence.

Currently, the mainstream methods for abnormal event detection include dynamic Bayesian networks (DBNs), probabilistic topic models (PTMs), and sparse representation models. Among DBNs, the modeling cost of hidden Markov models (HMMs) and Markov random fields (MRFs) grows geometrically with the number of detected targets, so these models are insufficient for dense scenes. Compared with DBNs, PTMs such as PLSA and LDA focus only on spatially co-occurring visual words and ignore the temporal information of features, so probabilistic topic models cannot localize abnormal events in space and time. In recent years, sparse representation models for anomaly detection have attracted attention. Most sparse representation models train an over-complete dictionary but do not fully exploit the low-rank property and inherent structural redundancy of video data.
Summary of the Invention

The object of the present invention is to propose an abnormal event detection method based on low-rank approximation structured sparse representation, addressing two shortcomings of the above anomaly detection techniques: the low-rank property of video data is not fully exploited, and the detection rate is slow.

The technical solution that achieves this object is an abnormal event detection method based on low-rank approximation structured sparse representation, comprising three processes of feature extraction, training, and testing.

The feature extraction process includes the following steps:

1) Extract the multi-scale 3D gradient features of the video sequence;

2) Reduce the dimensionality of the multi-scale 3D gradient features to form a training feature set and a test feature set.

The training process includes the following steps:

3) Initialize the remaining training features and related parameters;

4) Iteratively learn group sparse dictionaries on the remaining training features to obtain the normal-pattern dictionary set.

The testing process includes the following steps:

5) Sparsely reconstruct the test features using the group sparse dictionary set obtained by the training process;

6) Judge whether a test feature is abnormal according to its reconstruction error.
In the above method, step 1) includes the following specific steps:

1.1) Scale each frame of the video sequence to different resolutions, forming a three-level image pyramid.

1.2) Sample spatio-temporal cubes from each pyramid level and extract 3D gradient features from spatially non-overlapping regions.

1.3) For each pyramid level, stack the 3D gradient features of 5 consecutive frames over the same spatial region to form one spatio-temporal feature.
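The three steps above can be sketched in code. The following is a minimal numpy sketch under stated assumptions (the helper name `gradient_features` and the use of `np.gradient` are illustrative, not the patent's implementation); it stacks per-pixel 3D gradients over 10×10×5 cubes, giving the 10×10×5×3 = 1500-dimensional features described in the detailed description:

```python
import numpy as np

def gradient_features(frames, region=10, depth=5):
    """Stack per-pixel 3D gradients (Gx, Gy, Gt) over `depth` consecutive
    frames for each non-overlapping `region`x`region` spatial block.
    `frames` is a (T, H, W) grayscale array; H and W are assumed to be
    multiples of `region` and T a multiple of `depth`."""
    t, h, w = frames.shape
    gt, gy, gx = np.gradient(frames.astype(np.float64))  # gradients along t, y, x
    grad = np.stack([gx, gy, gt], axis=-1)               # (T, H, W, 3)
    feats = []
    for t0 in range(0, t - depth + 1, depth):
        for y0 in range(0, h, region):
            for x0 in range(0, w, region):
                cube = grad[t0:t0 + depth, y0:y0 + region, x0:x0 + region]
                feats.append(cube.ravel())               # region*region*depth*3 dims
    return np.array(feats)
```

For a 20×20 pyramid level and 5 frames, this yields four 1500-dimensional features, one per 10×10 region.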
In the above method, step 2) includes the following specific steps:

2.1) Use principal component analysis (PCA) to reduce the dimensionality of each extracted spatio-temporal feature.

2.2) Using the above method, convert the training video sequence and the test video sequence into a training feature set and a test feature set.

In the above method, step 3) includes the following specific steps:

3.1) Initialize the remaining training feature set to the training feature set obtained in step 2.2);

3.2) Initialize the regularization parameter, error threshold, iteration counter, and normal-pattern dictionary set.
In the above method, step 4) includes the following specific steps:

4.1) If the remaining feature set is empty, the training process ends; otherwise, determine the number of clusters and perform K-means clustering on the remaining feature set.

4.2) Perform dictionary learning on each feature cluster to obtain group sparse dictionaries.

4.3) Select suitable dictionaries to represent the remaining features. If a dictionary can represent some remaining features, add it to the normal-pattern dictionary set and remove the features it can represent from the remaining training feature set; if a dictionary cannot represent any remaining feature, discard it.

4.4) Increment the iteration counter by 1 and return to step 4.1).
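Steps 4.1)–4.4) can be sketched as an iterative loop. This is a hedged toy sketch: `learn_dictionary` is a truncated-SVD stand-in for the patent's low-rank dictionary learning, the K-means is a minimal implementation, and all names and thresholds are illustrative:

```python
import numpy as np

def kmeans_labels(x, k, iters=20, seed=0):
    """Minimal K-means on the columns of x (each column is one feature)."""
    rng = np.random.default_rng(seed)
    centroids = x[:, rng.choice(x.shape[1], k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((x[:, None, :] - centroids[:, :, None]) ** 2).sum(axis=0)
        labels = d2.argmin(axis=0)
        for c in range(k):
            if (labels == c).any():
                centroids[:, c] = x[:, labels == c].mean(axis=1)
    return labels

def learn_dictionary(cluster, energy=0.9):
    """Stand-in for low-rank dictionary learning: keep the leading left
    singular vectors covering `energy` of the singular value mass."""
    u, s, _ = np.linalg.svd(cluster, full_matrices=False)
    r = int(np.searchsorted(np.cumsum(s) / s.sum(), energy)) + 1
    return u[:, :r]

def reconstruction_error(x, d):
    beta, *_ = np.linalg.lstsq(d, x, rcond=None)
    return float(((x - d @ beta) ** 2).sum())

def train(features, n_clusters=2, err_thresh=1e-2, max_iter=10):
    """Steps 4.1)-4.4): cluster the remaining features, learn one dictionary
    per cluster, keep dictionaries that represent at least one feature,
    and drop the represented features from the remaining set."""
    remaining, dictionaries = features, []
    for _ in range(max_iter):
        if remaining.shape[1] == 0:          # 4.1) empty -> training ends
            break
        k = min(n_clusters, remaining.shape[1])
        labels = kmeans_labels(remaining, k)
        keep = np.ones(remaining.shape[1], dtype=bool)
        for c in range(k):                   # 4.2) per-cluster dictionary
            cluster = remaining[:, labels == c]
            if cluster.shape[1] == 0:
                continue
            d = learn_dictionary(cluster)
            errs = np.array([reconstruction_error(remaining[:, i], d)
                             for i in range(remaining.shape[1])])
            if (errs < err_thresh).any():    # 4.3) keep useful dictionaries
                dictionaries.append(d)
                keep &= errs >= err_thresh
        remaining = remaining[:, keep]       # 4.4) next iteration
    return dictionaries
```

The loop terminates either when every feature is represented by some dictionary or after a fixed iteration budget.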
In the above method, step 5) includes the following specific steps:

5.1) For each test feature, traverse the normal-pattern dictionary set obtained in the training process and estimate the sparse coefficients corresponding to each dictionary;

5.2) From a given dictionary and its corresponding sparse coefficients, compute the reconstruction error of the test feature with respect to that group sparse dictionary.

In the above method, step 6) includes the following specific steps:

6.1) For a test feature, if a dictionary is found whose reconstruction error is smaller than the reconstruction threshold, judge the test feature to be normal;

6.2) For a test feature, if its reconstruction error with respect to every dictionary in the set exceeds the reconstruction threshold, judge the test feature to be abnormal.
Compared with the prior art, the present invention has notable advantages. First, because each video feature cluster has a low-rank structure, the method can learn group sparse dictionaries of normal behavior patterns using an efficient low-rank solver such as the SVD thresholding algorithm. Second, the method adaptively determines the number of dictionary atoms for each normal behavior pattern, giving a more accurate semantic understanding of dynamic scenes. Third, unlike traditional sparse representation methods, the method represents each test sample by selecting one suitable dictionary from a dictionary set, which makes sparse reconstruction of video events more precise, significantly improves detection speed, and ensures real-time performance.
Brief Description of the Drawings

Figure 1 is an overview of the adaptive abnormal event detection method.

Figure 2 is a flow chart of 3D spatio-temporal gradient feature extraction.

Figure 3 shows multi-scale video frames.

Figure 4 illustrates the 3D gradient feature.

Figure 5 shows temporally overlapping spatio-temporal cubes over one spatial region.

Figure 6 is a flow chart of group sparse dictionary learning.

Figure 7 is a flow chart of abnormal event detection.
Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings.

The abnormal event detection method of the present invention comprises three main processes, namely feature extraction, training, and testing, as shown in Figure 1.

The feature extraction process, shown in Figure 2, includes the following specific steps:

Process 21: convert the video frames into a three-level image pyramid. Each frame of the video sequence is converted to grayscale and rescaled to three different sizes, 20×20, 30×40, and 120×160, forming a three-level image pyramid. At each scale of the pyramid, every frame is divided into non-overlapping regions of the same spatial size (10×10), as shown in Figure 3.
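As a concrete illustration of process 21, the pyramid construction might be sketched as follows (the nearest-neighbour resampling and the function names are assumptions; any standard image resize would serve):

```python
import numpy as np

def resize_nn(img, h, w):
    """Nearest-neighbour resize of a 2D grayscale image to h x w."""
    ys = np.arange(h) * img.shape[0] // h
    xs = np.arange(w) * img.shape[1] // w
    return img[np.ix_(ys, xs)]

def pyramid(frame, scales=((20, 20), (30, 40), (120, 160))):
    """Three-level image pyramid at the scales given in the text."""
    return [resize_nn(frame, h, w) for h, w in scales]
```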
Process 22: extract the 3D gradient features of the video sequence. To capture both the appearance and the motion of targets in dense scenes, 3D spatio-temporal gradients are chosen as the extracted features, as shown in Figure 4. Let x, y, and t denote the horizontal, vertical, and temporal directions of the video sequence; G is the 3D gradient of a pixel, and its projections onto the three directions are G_x, G_y, and G_t.

Process 23: form spatio-temporal features by spatio-temporal cube sampling. The same spatial region over 5 consecutive frames forms a spatio-temporal cube; each cube consists of all pixels in a 10×10×5 volume, where l_x×l_y = 10×10 is the spatial size and l_t = 5 is the temporal length. The 3D gradients of all pixels in one cube constitute one 3D spatio-temporal gradient feature. To preserve the temporal ordering of scene events, temporally overlapping cubes are sampled over each spatial region, as shown in Figure 5. If the center of a cube is (S_x, S_y, S_t), its temporally adjacent cubes are centered at (S_x, S_y, S_t − l_t/2) and (S_x, S_y, S_t + l_t/2).

Process 24: form the training and test feature sets by dimensionality reduction of the spatio-temporal features. Because the dimensionality of the 3D gradient feature above (10×10×5×3 = 1500) is so high that training and testing would be too expensive for real-time detection, PCA is used to reduce each spatio-temporal feature to 100 dimensions. This yields the training feature set and the test feature set.
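Process 24 might be sketched with a plain SVD-based PCA. This is an illustrative sketch, not the patent's code; in particular, test features must be projected with the mean and basis estimated on the training set:

```python
import numpy as np

def pca_reduce(features, n_components=100):
    """Project row-vector features onto their top principal components.
    `features` is (n_samples, d), e.g. d = 1500 here; returns the reduced
    (n_samples, n_components) array plus the mean and basis needed to
    project test features identically."""
    mean = features.mean(axis=0)
    centered = features - mean
    # right singular vectors = principal axes of the centered data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T          # (d, n_components)
    return centered @ basis, mean, basis
```

A test feature set is then reduced with `(test_feats - mean) @ basis`, using the training mean and basis.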
Note that the training video sequences contain only normal events, while the test video sequences contain both normal and abnormal events.

The training process learns normal behavior patterns from a known set of normal behavior samples. It includes the following specific steps:

The invention trains iteratively on all features at each spatial location, obtaining for each location a group sparse dictionary set that represents all training features at that location. For the features at a location, K-means produces clusters of similar features, and a low-rank approximation algorithm then learns a dictionary for each cluster. Each iteration removes the features that can be represented by some dictionary and continues dictionary learning on the remaining features, until every feature can be represented by a learned dictionary.
Process 61: initialize the training parameters. The remaining training feature set is initialized to the training feature set produced by feature extraction, X = [x_1, x_2, ..., x_n], where n is the number of features and m is the dimensionality of each feature; the regularization parameter is τ = 0.015, the error threshold is T = 0.01, the iteration counter is j = 1, and the normal-pattern dictionary set starts empty.

Process 62: cluster the remaining training feature set with K-means. The number of clusters is first chosen adaptively according to the number of remaining training features, and K-means clustering is then applied to the remaining feature set. Let the features of the c-th cluster in the j-th iteration be denoted X_c^j, where j = 1, 2, ..., N; the number of features in the c-th cluster is written n_c, and the function f(·) maps a feature's index within the cluster to its index in the initial feature set.
Process 63 (Figure 6): learn group sparse dictionaries with the low-rank approximation algorithm. Unlike traditional sparse representation models, the invention exploits the low-rank structure of video feature clusters and uses a low-rank approximation algorithm to learn group sparse dictionaries whose dictionary atoms are highly correlated, thereby discarding the redundant information in video data. The objective function of group sparse dictionary learning is

D_c^j = argmin_D (1/2)·‖X_c^j − D‖_F^2 + τ·Σ_i ω_i·λ_i(D)  (1)

where X_c^j = UΛV^T is its singular value decomposition, τ is the regularization parameter, and ω_i is the weight of the i-th singular value λ_i. The above weighted low-rank optimization problem is solved with the singular value thresholding algorithm, and the closed-form solution of formula (1) is

D_c^j = U_r Λ_r V_r^T  (2)

where r is the rank estimated by the thresholding operation. The operation acts on each singular value: singular values smaller than τ·ω_i are set to 0, while singular values larger than τ·ω_i keep their original value.
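The thresholding rule just described — singular values below τ·ω_i zeroed, the rest kept at their original value — can be written directly in numpy. An illustrative sketch (names are assumptions, not the patent's code):

```python
import numpy as np

def svd_threshold(x, tau, weights=None):
    """Keep only singular values lambda_i with lambda_i > tau * w_i,
    zeroing the rest; return the low-rank approximation and its rank r."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    w = np.ones_like(s) if weights is None else np.asarray(weights, float)
    s_kept = np.where(s > tau * w, s, 0.0)   # small singular values -> 0
    r = int((s_kept > 0).sum())              # estimated rank r
    return (u[:, :r] * s_kept[:r]) @ vt[:r], r
```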
Process 64: select group sparse dictionaries. Once D_c^j is determined, the objective function for any feature x_{f(i)}^j is

min_{γ,α} Σ_i γ_{i,j}·‖x_{f(i)}^j − D_c^j·α_{f(i)}^j‖_2^2,  s.t. Σ_j γ_{i,j} = 1, γ_{i,j} ∈ {0,1}  (3)

where α_{f(i)}^j is the sparse coefficient vector of x_{f(i)}^j, γ_{i,j} indicates whether feature x_{f(i)}^j can be represented by dictionary D_c^j, and the constraints Σ_j γ_{i,j} = 1 and γ_{i,j} ∈ {0,1} guarantee that only one dictionary is selected to represent x_{f(i)}^j; β denotes the coefficient vector that represents the feature cluster X_c^j with dictionary D_c^j, and T is the error threshold. The closed-form solution for γ is

γ_{i,j} = 1 if ‖x_{f(i)}^j − D_c^j·α_{f(i)}^j‖_2^2 ≤ T, and γ_{i,j} = 0 otherwise.  (4)

If D_c^j can represent x_{f(i)}^j, the dictionary D_c^j is added to the dictionary set D and the singular value information of the dictionary is retained; the iteration counter becomes j = j + 1, and the features with valid sparse coefficients are removed from the remaining feature set X_j. If D_c^j cannot represent any remaining feature, the dictionary D_c^j is discarded.
Processes 63 and 64 are iterated until the remaining feature set is empty.

Finally, a group sparse dictionary set of normal features is obtained, with each dictionary representing one normal behavior pattern. Note that because the video sequences behind the training feature set contain only normal events, every learned dictionary represents a normal behavior pattern.

As shown in Figure 7, the testing process uses the normal-pattern dictionary set learned during training to detect whether a test sample is abnormal. It includes the following specific steps:

Process 71: initialize the test parameters. The normal behavior pattern dictionary set is initialized to the dictionary set obtained in training, D = [D_1, D_2, ..., D_N], and the reconstruction error threshold is initialized.
Process 72: compute the reconstruction error of a test feature. For any test feature x, the reconstruction error with respect to each dictionary D_k is computed as follows:

min_{β_k} ‖x − D_k·β_k‖_2^2  (5)

The optimal solution of β_k in formula (5) is

β_k = (D_k^T·D_k)^{-1}·D_k^T·x  (6)

and the reconstruction error is then

e_k = ‖x − D_k·β_k‖_2^2  (7)
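The per-dictionary reconstruction error of formula (5) and the normal/abnormal decision can be sketched as follows (function names are illustrative; `lstsq` computes the same β_k as the pseudo-inverse form):

```python
import numpy as np

def reconstruction_error(x, d_k):
    """beta_k = argmin ||x - D_k beta||^2 via least squares, then the
    squared residual ||x - D_k beta_k||^2 as the reconstruction error."""
    beta, *_ = np.linalg.lstsq(d_k, x, rcond=None)
    res = x - d_k @ beta
    return float(res @ res)

def is_abnormal(x, dictionaries, threshold):
    """Normal iff some dictionary reconstructs x below the threshold;
    abnormal iff every dictionary's error is at or above it."""
    return all(reconstruction_error(x, d) >= threshold for d in dictionaries)
```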
Process 73: judge whether the test feature is abnormal. A suitable dictionary is searched for to represent the test sample, and the reconstruction error then decides whether the sample is an abnormal event. If the reconstruction error is smaller than the reconstruction error threshold, the test feature x can be represented by dictionary D_k, so x belongs to a normal event; otherwise x cannot be represented by D_k. If no dictionary in the normal-pattern dictionary set can represent x, the test feature x belongs to an abnormal event.

It should be emphasized that, compared with current state-of-the-art algorithms, the iterative low-rank approximation method of the invention improves detection accuracy by at least 2%. By retrieving from the normal-pattern dictionary set, its detection speed is more than 20 times that of traditional anomaly detection methods. In addition, compared with traditional sparse representation methods, the SVD thresholding algorithm reduces training time by at least a factor of 10.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610915766.1A CN106503647A (en) | 2016-10-21 | 2016-10-21 | The accident detection method that structural sparse is represented is approached based on low-rank |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106503647A true CN106503647A (en) | 2017-03-15 |
Family
ID=58318284
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610915766.1A Pending CN106503647A (en) | 2016-10-21 | 2016-10-21 | The accident detection method that structural sparse is represented is approached based on low-rank |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106503647A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460412A (en) * | 2018-02-11 | 2018-08-28 | 北京盛安同力科技开发有限公司 | A kind of image classification method based on subspace joint sparse low-rank Structure learning |
CN109117774A (en) * | 2018-08-01 | 2019-01-01 | 广东工业大学 | A kind of multi-angle video method for detecting abnormality based on sparse coding |
CN110580504A (en) * | 2019-08-27 | 2019-12-17 | 天津大学 | A Video Anomaly Event Detection Method Based on Self-Feedback Mutually Exclusive Subclass Mining |
CN111931682A (en) * | 2020-08-24 | 2020-11-13 | 珠海大横琴科技发展有限公司 | Abnormal behavior detection method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104318261A (en) * | 2014-11-03 | 2015-01-28 | 河南大学 | Graph embedding low-rank sparse representation recovery sparse representation face recognition method |
CN105046717A (en) * | 2015-05-25 | 2015-11-11 | 浙江师范大学 | Robust video object tracking method |
CN105469359A (en) * | 2015-12-09 | 2016-04-06 | 武汉工程大学 | Locality-constrained and low-rank representation based human face super-resolution reconstruction method |
CN105513093A (en) * | 2015-12-10 | 2016-04-20 | 电子科技大学 | Object tracking method based on low-rank matrix representation |
CN105825477A (en) * | 2015-01-06 | 2016-08-03 | 南京理工大学 | Remote sensing image super-resolution reconstruction method based on multi-dictionary learning and non-local information fusion |
- 2016-10-21: CN application CN201610915766.1A filed in China; publication CN106503647A; status Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104318261A (en) * | 2014-11-03 | 2015-01-28 | 河南大学 | Graph embedding low-rank sparse representation recovery sparse representation face recognition method |
CN105825477A (en) * | 2015-01-06 | 2016-08-03 | 南京理工大学 | Remote sensing image super-resolution reconstruction method based on multi-dictionary learning and non-local information fusion |
CN105046717A (en) * | 2015-05-25 | 2015-11-11 | 浙江师范大学 | Robust video object tracking method |
CN105469359A (en) * | 2015-12-09 | 2016-04-06 | 武汉工程大学 | Locality-constrained and low-rank representation based human face super-resolution reconstruction method |
CN105513093A (en) * | 2015-12-10 | 2016-04-20 | 电子科技大学 | Object tracking method based on low-rank matrix representation |
Non-Patent Citations (1)
Title |
---|
BOSI YU: "Low-rank Approximation based Abnormal Detection in The Video Sequence", 2016 IEEE International Conference on Digital Signal Processing |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460412A (en) * | 2018-02-11 | 2018-08-28 | 北京盛安同力科技开发有限公司 | Image classification method based on subspace joint sparse low-rank structure learning |
CN108460412B (en) * | 2018-02-11 | 2020-09-04 | 北京盛安同力科技开发有限公司 | Image classification method based on subspace joint sparse low-rank structure learning |
CN109117774A (en) * | 2018-08-01 | 2019-01-01 | 广东工业大学 | Multi-view video anomaly detection method based on sparse coding |
CN109117774B (en) * | 2018-08-01 | 2021-09-28 | 广东工业大学 | Multi-view video anomaly detection method based on sparse coding |
CN110580504A (en) * | 2019-08-27 | 2019-12-17 | 天津大学 | A Video Anomaly Event Detection Method Based on Self-Feedback Mutually Exclusive Subclass Mining |
CN110580504B (en) * | 2019-08-27 | 2023-07-25 | 天津大学 | Video abnormal event detection method based on self-feedback mutual exclusion subclass mining |
CN111931682A (en) * | 2020-08-24 | 2020-11-13 | 珠海大横琴科技发展有限公司 | Abnormal behavior detection method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106503652A (en) | Abnormal event detection method based on low-rank adaptive sparse reconstruction | |
CN108229338B (en) | Video behavior recognition method based on deep convolutional features | |
CN108596958B (en) | A Target Tracking Method Based on Difficult Positive Sample Generation | |
CN104281853B (en) | Activity recognition method based on 3D convolutional neural networks | |
CN110210320A (en) | Markerless multi-target pose estimation method based on deep convolutional neural networks | |
CN101339655B (en) | Visual Tracking Method Based on Object Features and Bayesian Filter | |
CN109961034A (en) | Video object detection method based on convolutional gated recurrent neural unit | |
CN108182388A (en) | Image-based moving target tracking method | |
CN114241422B (en) | Student classroom behavior detection method based on ESRGAN and improved YOLOv s | |
CN111191667B (en) | Crowd counting method based on multiscale generation countermeasure network | |
CN110120064B (en) | Depth-related target tracking algorithm based on mutual reinforcement and multi-attention mechanism learning | |
CN105160310A (en) | 3D (three-dimensional) convolutional neural network based human body behavior recognition method | |
CN102663436B (en) | Adaptive feature extraction method for optical texture images and synthetic aperture radar (SAR) images | |
CN107862275A (en) | Human behavior recognition model, its construction method, and human behavior recognition method | |
CN109743642B (en) | Video abstract generation method based on hierarchical recurrent neural network | |
CN112819853B (en) | A Visual Odometry Method Based on Semantic Prior | |
Savner et al. | CrowdFormer: Weakly-supervised crowd counting with improved generalizability | |
CN108734210A (en) | Object detection method based on cross-modal multi-scale feature fusion | |
CN108764019A (en) | Video event detection method based on multi-source deep learning | |
CN106503647A (en) | Abnormal event detection method based on low-rank approximation of structural sparse representation | |
CN113743505A (en) | An improved SSD object detection method based on self-attention and feature fusion | |
CN103886585A (en) | Video tracking method based on rank learning | |
CN112347930A (en) | High-resolution image scene classification method based on self-learning semi-supervised deep neural network | |
CN109753897A (en) | Behavior recognition method based on memory unit reinforcement-temporal dynamic learning | |
CN104200203A (en) | Human movement detection method based on movement dictionary learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-03-15 |