CN113762007A - A method for abnormal behavior detection based on bi-prediction of appearance and action features - Google Patents


Info

Publication number
CN113762007A
CN113762007A (application CN202011263894.5A; granted as CN113762007B)
Authority
CN
China
Prior art keywords
appearance
action
frame
network
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011263894.5A
Other languages
Chinese (zh)
Other versions
CN113762007B (en)
Inventor
陈洪刚
李自强
王正勇
何小海
刘强
吴晓红
熊书琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202011263894.5A priority Critical patent/CN113762007B/en
Publication of CN113762007A publication Critical patent/CN113762007A/en
Application granted granted Critical
Publication of CN113762007B publication Critical patent/CN113762007B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an abnormal behavior detection method based on dual prediction of appearance and motion features, relating to the fields of computer vision and artificial intelligence. The method includes: (1) sequentially reading the video frame sequence, computing the inter-frame difference between adjacent images, and obtaining a fixed-length video frame sequence with its corresponding frame-difference map sequence; (2) using a two-stream network model incorporating a memory enhancement module to extract, through appearance and motion sub-networks respectively, the appearance and motion features characteristic of normal behavior, and to predict the video frame and the frame-difference map; (3) adding and fusing the predicted video frame and frame-difference map to obtain the final predicted video frame; (4) obtaining the anomaly score of the frame by evaluating the motion and appearance features extracted by the memory enhancement module and the quality of the final predicted image. The invention adopts a deep learning method based on a prediction model, which can effectively detect video frames containing abnormal behavior and improves the accuracy of anomaly detection.

Description

Abnormal behavior detection method based on dual prediction of appearance and motion features
Technical Field
The invention relates to an abnormal behavior detection method based on dual prediction of appearance and motion features, and belongs to the fields of computer vision and security monitoring.
Background
Abnormal behavior detection is a computer vision technique whose purpose is to detect the presence of abnormal behavior in video. In recent years, public safety has received growing attention and large numbers of surveillance devices have been deployed everywhere, generating an enormous volume of video. Watching every monitoring feed in real time by hand is extremely difficult and consumes substantial human resources. An abnormal behavior detection algorithm can detect abnormal behavior in surveillance video and raise a timely alarm, greatly reducing labor cost and improving efficiency. Abnormal behavior detection therefore has broad application prospects in video surveillance, intelligent security, transportation, and related fields.
Because abnormal behaviors occur rarely and their data are difficult to collect, most current methods for video abnormal behavior detection adopt semi-supervised learning, training only on normal videos; among these, reconstruction- and prediction-based methods have become the dominant approach owing to their good detection performance. Such methods feed several consecutive video frames into an autoencoder network or a generative adversarial network to reconstruct the input frames or predict the next frame, and judge abnormality from the quality of the reconstruction or prediction. Although this class of methods achieves good results, two problems remain: (1) abnormal behaviors may be abnormal in appearance, in motion, or in both, yet current reconstruction and prediction methods fail to make full use of appearance and motion information; (2) normal behaviors are diverse, and complex backgrounds and similar factors can prevent the network from correctly learning the features characteristic of normal samples; moreover, the strong generative capacity of convolutional neural networks may allow abnormal samples to be reconstructed or predicted well, which degrades the final detection accuracy.
Disclosure of Invention
To remedy the defects of the prior art, the invention provides an abnormal behavior detection method based on dual prediction of appearance and motion features. The aim is to design a two-stream network structure containing a memory enhancement module for predicting appearance and motion features, so that abnormal video frames incur larger prediction errors and the accuracy of abnormal behavior detection is improved.
The invention adopts the following technical scheme: an abnormal behavior detection method based on dual prediction of appearance and motion features comprises the following steps:
(1) sequentially reading video frames, computing the inter-frame difference of adjacent images, and obtaining a fixed-length video frame sequence and the corresponding frame-difference map sequence;
(2) using a two-stream network model incorporating a memory enhancement module, extracting the appearance and motion features characteristic of normal behavior through an appearance sub-network and a motion sub-network respectively, and predicting the video frame and the frame-difference map;
(3) adding and fusing the predicted video frame and frame-difference map to obtain the final predicted video frame;
(4) obtaining the anomaly score of the frame by measuring the motion and appearance features extracted by the memory enhancement module and the quality of the final predicted image.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention feeds both the video frame sequence and the RGB frame-difference map sequence into a two-stream convolutional autoencoder network for prediction; compared with existing methods that extract motion features from optical flow maps, using frame-difference maps reduces network complexity and computation.
2. The invention improves the encoder and decoder structures of the autoencoder network, so that features are extracted more effectively and the quality of the predicted images is improved.
3. The added memory enhancement module learns the features of normal samples better and strengthens the robustness of the network, so that abnormal videos obtain higher anomaly scores.
4. The method considers both the quality of the predicted image and the similarity between the extracted sample features and the stored normal-behavior features as evaluation criteria, effectively improving detection performance and reducing the false detection rate.
Drawings
FIG. 1 is a flow chart of the abnormal behavior detection method of the present invention;
FIG. 2 is a network architecture diagram of the present invention for abnormal behavior detection based on dual prediction of appearance and motion features;
FIG. 3 is a block diagram of the up-sampling and down-sampling modules in the encoder and decoder of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in figs. 1-2, an abnormal behavior detection method based on dual prediction of appearance and motion features includes the following steps:
(1) sequentially reading video frames, computing the inter-frame difference of adjacent images, and obtaining a fixed-length video frame sequence and the corresponding frame-difference map sequence;
(2) using a two-stream network model incorporating a memory enhancement module, extracting the appearance and motion features characteristic of normal behavior through an appearance sub-network and a motion sub-network respectively, and predicting the video frame and the frame-difference map;
(3) adding and fusing the predicted video frame and frame-difference map to obtain the final predicted video frame;
(4) obtaining the anomaly score of the frame by measuring the motion and appearance features extracted by the memory enhancement module and the quality of the final predicted image.
The detailed steps are as follows:
Step 1: acquire fixed-length video frames and frame-difference maps. A video stream is obtained from a fixed camera and divided into frames, and a consecutive frame sequence of fixed length t is selected, of which the first t-1 frames are fed directly into the appearance sub-network. For the video stream of a fixed camera, a background image I_B of the video can be obtained with an OpenCV method; subtracting I_B from the t RGB video frames yields foreground images I'_1, I'_2, ..., I'_t free of background noise. Finally, each frame of the foreground sequence is subtracted from its successor to obtain the sequence of t-1 consecutive frame-difference maps X_1, X_2, ..., X_{t-1} required by the motion sub-network.
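Step 1 can be sketched as follows. This is a minimal NumPy illustration, assuming the frames are already decoded into arrays; the per-pixel temporal median used here as the background image is one common choice and is an assumption, since the patent does not fix the exact OpenCV background extraction method:

```python
import numpy as np

def frames_and_diffs(frames):
    """Build the foreground sequence and frame-difference maps of Step 1.

    frames: array of shape (t, H, W, 3), a fixed-length RGB frame sequence.
    Returns (foreground, diffs) where diffs has length t - 1.
    """
    frames = frames.astype(np.float32)
    # Background image I_B: per-pixel temporal median over the sequence
    # (assumed background model; any static-background estimate works here).
    background = np.median(frames, axis=0)
    # Foreground images I'_1 .. I'_t: subtract the background from each frame.
    foreground = frames - background
    # Frame-difference maps X_1 .. X_{t-1}: successor minus predecessor.
    diffs = foreground[1:] - foreground[:-1]
    return foreground, diffs

# Example: t = 5 random frames of size 8x8.
fg, dx = frames_and_diffs(np.random.rand(5, 8, 8, 3))
assert fg.shape == (5, 8, 8, 3) and dx.shape == (4, 8, 8, 3)
```

The t frames feed the appearance sub-network and the t-1 difference maps feed the motion sub-network described below.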
Step 2: feed the fixed-length video frames and frame-difference maps into the two-stream network incorporating the memory enhancement module for prediction, generating the predicted video frame and the predicted RGB frame-difference map.
Regarding the network architecture shown in fig. 2, the network consists of two structurally identical autoencoder sub-networks; autoencoders are widely used for feature extraction and for image reconstruction and prediction tasks. Taking the appearance sub-network as an example, it is formed by cascading an encoder E_a, a memory enhancement module M_a, and a decoder D_a. The encoder and decoder are connected by skip connections at feature layers of the same resolution, and the memory enhancement module enhances the features extracted by the encoder with normal-sample features before sending them to the decoder for reconstruction. For the encoder and decoder, the invention improves the up-sampling and down-sampling layers: as shown in fig. 3, both improved modules adopt a residual-like structure. The two branches of the down-sampling module apply, respectively, convolutions with different kernels and a max-pooling operation; the up-sampling module uses deconvolutions with kernels of different sizes. The improved kernels gather richer information and extract more effective semantic features. Let the input to the appearance sub-network be I_1, I_2, ..., I_t. The encoder E_a extracts, by down-sampling, deep features Z_a describing the image scene, target appearance, and similar information; the memory enhancement module M_a performs normal-sample memory enhancement on Z_a to obtain the enhanced feature Z'_a; and the decoder D_a takes Z'_a as input to predict the (t+1)-th frame Î_{t+1}. The computation is given by equation (1):

$$\hat{I}_{t+1} = D_a\big(M_a(E_a(I_1, I_2, \ldots, I_t;\ \theta_{E_a});\ \theta_{M_a});\ \theta_{D_a}\big) \tag{1}$$

where θ_{E_a}, θ_{M_a}, and θ_{D_a} denote the parameters of the encoder E_a, the memory enhancement module M_a, and the decoder D_a, respectively.
The memory enhancement module is described in detail as follows:
The module contains a memory bank that stores M normal-sample feature vectors locally. In the training phase, the encoder feeds all features extracted from normal samples into the module, which learns the M features that best characterize normal samples and stores them locally. The module operates through two operations: reading and updating.
The read operation generates the enhanced features used by the decoder for reconstruction; it is present in both the training and testing phases of the network. The read operation proceeds as follows: for each output feature z of the encoder, compute the cosine similarity between z and the stored features p in the memory bank, as in equation (2):

$$s(z_k, p_m) = \frac{z_k^{\top} p_m}{\lVert z_k \rVert \, \lVert p_m \rVert} \tag{2}$$

where k and m are the indices of features z and p, respectively. Applying the softmax function to s(z_k, p_m) yields the read weight ω_{k,m}, as in equation (3):

$$\omega_{k,m} = \frac{\exp\big(s(z_k, p_m)\big)}{\sum_{m'=1}^{M} \exp\big(s(z_k, p_{m'})\big)} \tag{3}$$

Applying the computed weights ω_{k,m} to the memory item features p gives the memory-enhanced feature ẑ_k, computed as in equation (4):

$$\hat{z}_k = \sum_{m=1}^{M} \omega_{k,m}\, p_m \tag{4}$$
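The read operation can be sketched in NumPy. This is a minimal illustration that treats the encoder features and memory items as row vectors; the feature dimension is an assumption, as the patent does not fix it:

```python
import numpy as np

def memory_read(z, p):
    """Read operation of the memory enhancement module.

    z: (K, C) encoder output features; p: (M, C) stored memory items.
    Returns the memory-enhanced features of shape (K, C).
    """
    # Cosine similarity s(z_k, p_m) for every pair (k, m).
    zn = z / np.linalg.norm(z, axis=1, keepdims=True)
    pn = p / np.linalg.norm(p, axis=1, keepdims=True)
    s = zn @ pn.T                                  # (K, M)
    # Softmax over the memory items m gives the read weights w_{k,m}.
    e = np.exp(s - s.max(axis=1, keepdims=True))   # numerically stable softmax
    w = e / e.sum(axis=1, keepdims=True)           # (K, M)
    # Enhanced feature: weighted sum of memory items.
    return w @ p                                   # (K, C)

z_hat = memory_read(np.random.rand(4, 16), np.random.rand(10, 16))
assert z_hat.shape == (4, 16)
```

With a single memory item the read weights collapse to 1 and the output is simply that item, which is a quick sanity check of the weighting.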
the updating operation only exists in the training stage and is used for learning the characteristic features of the normal sample, firstly, the cosine similarity is calculated by using the formula (1), and then, the updating weight v is calculatedm,kThe calculation method is as the formula (5):
Figure BDA0002775501630000046
the calculation method of the updated local memory is as the formula (6):
Figure BDA0002775501630000047
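A matching NumPy sketch of the update operation, under the same feature-shape assumptions as above; the L2 re-normalization of each updated item follows the memory-module literature this design builds on and is an assumption, since the patent reproduces the update formula only as an image:

```python
import numpy as np

def memory_update(z, p):
    """Update operation of the memory enhancement module (training only).

    Each memory item p_m absorbs the normal-sample features z_k weighted by
    a softmax over k, then is re-normalized to unit length (assumed).
    z: (K, C) normal-sample features; p: (M, C) memory items.
    """
    zn = z / np.linalg.norm(z, axis=1, keepdims=True)
    pn = p / np.linalg.norm(p, axis=1, keepdims=True)
    s = zn @ pn.T                                  # cosine similarities (K, M)
    # Update weights v_{m,k}: softmax over the feature index k.
    e = np.exp(s - s.max(axis=0, keepdims=True))
    v = (e / e.sum(axis=0, keepdims=True)).T       # (M, K)
    # Add the weighted features, then L2-normalize each updated item.
    p_new = p + v @ z
    return p_new / np.linalg.norm(p_new, axis=1, keepdims=True)

p_updated = memory_update(np.random.rand(4, 16), np.random.rand(10, 16))
assert p_updated.shape == (10, 16)
```

The updated items are stored locally and reused by the read operation in both training and testing.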
in order for the memory items to really remember the characteristics of the normal samples, the module introduces a characteristic compression loss LcAnd characteristic separation loss LsTwo loss functions. Characteristic compression loss LcAs shown in equation (7):
Figure BDA0002775501630000051
in the above formula pτRepresenting all memory items with zkThe one with the highest similarity.
Characteristic separation loss LsThe calculation method of (2) is shown in equation (9):
Figure BDA0002775501630000052
In the above formula, τ and γ denote the values of the index m at which ω_{k,m} in equation (3) attains its largest and second-largest values, respectively.

Step 3: the predicted video frame produced by the appearance sub-network in step 2 and the predicted RGB frame-difference map produced by the motion sub-network are added and fused to obtain the network's final predicted video frame Î_{t+1} for frame t+1.
Step 4: the anomaly score is computed as follows.
First, the peak signal-to-noise ratio (PSNR) between the predicted (t+1)-th frame Î_{t+1} and the real frame I_{t+1} is computed as in equation (10):

$$P\big(I_{t+1}, \hat{I}_{t+1}\big) = 10 \log_{10} \frac{\big[\max(\hat{I}_{t+1})\big]^2}{\tfrac{1}{N} \sum_{i=1}^{N} \big(I_{t+1,i} - \hat{I}_{t+1,i}\big)^2} \tag{10}$$

where N denotes the number of pixels in the (t+1)-th frame I_{t+1} and i indexes the pixels.
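The PSNR computation can be checked with a short NumPy sketch; the peak value is taken here as the maximum of the predicted frame, and well-predicted frames score higher than poorly predicted ones:

```python
import numpy as np

def psnr(real, pred):
    """Peak signal-to-noise ratio between a real and a predicted frame."""
    real = real.astype(np.float64)
    pred = pred.astype(np.float64)
    mse = np.mean((real - pred) ** 2)   # mean squared error over all N pixels
    return 10.0 * np.log10(pred.max() ** 2 / mse)

# A prediction with smaller error yields a higher PSNR.
a = np.random.rand(8, 8, 3)
assert psnr(a, a + 0.01) > psnr(a, a + 0.1)
```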
Second, for each output feature z_k of the appearance and motion sub-network encoders, the L2 distance to the memory item feature p_τ of the memory enhancement module is computed as the feature similarity score of the two sub-networks, as in equation (11):

$$D(z_k, p_{\tau}) = \lVert z_k - p_{\tau} \rVert_2 \tag{11}$$

where τ is the index of the memory item feature with the highest similarity to z_k.

Finally, after the three scores are normalized to [0, 1], a hyper-parameter β balances the weight of each score, as in equation (12):

$$S_{t+1} = \beta \big(1 - \bar{P}(I_{t+1}, \hat{I}_{t+1})\big) + (1 - \beta)\big(D'_a(z_a, p_a) + D'_m(z_m, p_m)\big) \tag{12}$$

where P̄(I_{t+1}, Î_{t+1}), D′_a(z_a, p_a), and D′_m(z_m, p_m) denote the normalized PSNR, the appearance feature similarity score, and the motion feature similarity score, respectively.
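The scoring of step 4 can be sketched end to end in NumPy. The fusion form below (one minus the normalized PSNR plus the two normalized feature distances, balanced by β) is an assumption consistent with the text, since the patent reproduces the score formula only as an image:

```python
import numpy as np

def normalize(x):
    """Min-max normalize a sequence of per-frame scores to [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

def anomaly_scores(psnrs, d_app, d_mot, beta=0.5):
    """Fuse normalized PSNR and feature-distance scores per frame.

    High PSNR means a well-predicted (normal) frame, so it enters as
    1 - normalized PSNR; large feature distances indicate abnormality.
    """
    p = normalize(psnrs)
    da, dm = normalize(d_app), normalize(d_mot)
    return beta * (1.0 - p) + (1.0 - beta) * (da + dm)

# Three frames: the last is badly predicted and far from the memory items.
s = anomaly_scores([38.0, 35.0, 22.0], [0.1, 0.2, 0.9], [0.1, 0.1, 0.8])
assert np.argmax(s) == 2
```

The value of β (0.5 here) is a placeholder; the patent treats it as a tunable hyper-parameter.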
To verify the effectiveness of the method, it was trained and tested on three data sets commonly used in video abnormal behavior detection: Avenue, UCSD-Ped2, and ShanghaiTech. Four deep-learning-based abnormal behavior detection methods were selected for comparison, specifically:
the method comprises the following steps: the methods proposed by Abati et al, references "D.Abati, A.Porrello, S.Calderara, and R.Cucchiaara," tension space autogiration for novel detection, "in Proceedings of the IEEE Conference on Computer Vision and Pattern registration 2019, pp.481-490".
The method 2 comprises the following steps: nguyen et al, references "T. -. N.Nguyen and J.Meuner," analysis detection in video sequence with application-correlation, "in Proceedings of the IEEE International Conference on Computer Vision,2019, pp.1273-1283"
The method 3 comprises the following steps: liu et al, references "W.Liu, W.Luo, D.Lian, and S.Gao," Future frame prediction for analog detection-a new base, "in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, pp.6536-6545"
The method 4 comprises the following steps: the methods proposed by Gong et al, references "D.Gong et al", "Memory-aided depth auto encoder for unsupervised analysis," in Proceedings of the IEEE International Conference on Computer Vision,2019, pp.1705-1714 "
As shown in Table 1, using AUC as the evaluation index on the three data sets, the detection accuracy of the proposed method is clearly superior to that of the other four methods.
TABLE 1. Comparison with the other methods on the evaluation index (AUC)
Finally, it should be noted that the above examples are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention, and such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An abnormal behavior detection method based on dual prediction of appearance and motion features, characterized by comprising the following steps:
(1) sequentially reading video frames, computing the inter-frame difference of adjacent images, and obtaining a fixed-length video frame sequence and the corresponding frame-difference map sequence;
(2) using a two-stream network model incorporating a memory enhancement module, extracting the appearance and motion features characteristic of normal behavior through an appearance sub-network and a motion sub-network respectively, and predicting the video frame and the frame-difference map;
(3) adding and fusing the predicted video frame and frame-difference map to obtain the final predicted video frame;
(4) obtaining the anomaly score of the frame by measuring the motion and appearance features extracted by the memory enhancement module and the quality of the final predicted image.

2. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 1, characterized in that the frame-difference map in step (1) is computed as follows: for each video segment, first extract the background image of the segment; next, for a video frame sequence of fixed length t, subtract the background image to obtain foreground target images with the background removed; finally, subtract each frame from its successor to obtain a frame-difference map sequence of length t-1.
3. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 1, characterized in that the two-stream network structure incorporating the memory enhancement module in step (2) comprises two convolutional neural networks, an appearance sub-network and a motion sub-network, which are composed of autoencoder networks of identical structure.

4. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claims 1 and 3, characterized in that the autoencoder network consists of an encoder, a decoder, and a memory enhancement module, the memory enhancement module being cascaded between the encoder and the decoder.

5. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 4, characterized by the network structure of the encoder and the decoder: the encoder and the decoder contain three down-sampling layers and three up-sampling layers, respectively; the down-sampling layer adopts a residual structure whose two branches use max pooling and convolution, respectively, to reduce the resolution and increase the number of channels; the two branches of the up-sampling layer use deconvolutions with kernels of different sizes to raise the resolution and reduce the number of channels; the encoder and the decoder use skip connections at feature layers of the same resolution.
6. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 1, characterized in that the memory enhancement module in step (2) contains a memory bank storing M normal-sample feature vectors locally, and is divided into two operations, reading and updating;
the read operation is present in both the training and testing phases of the network and proceeds as follows: for each encoder output feature z, compute the cosine similarity between z and the stored features p in the memory bank, as in equation (1):

$$s(z_k, p_m) = \frac{z_k^{\top} p_m}{\lVert z_k \rVert \, \lVert p_m \rVert} \tag{1}$$

where k and m are the indices of features z and p, respectively; applying the softmax function to s(z_k, p_m) yields the read weight ω_{k,m}, as in equation (2):

$$\omega_{k,m} = \frac{\exp\big(s(z_k, p_m)\big)}{\sum_{m'=1}^{M} \exp\big(s(z_k, p_{m'})\big)} \tag{2}$$

applying the computed weights ω_{k,m} to the memory item features p gives the memory-enhanced feature ẑ_k, computed as:

$$\hat{z}_k = \sum_{m=1}^{M} \omega_{k,m}\, p_m \tag{3}$$

the update operation exists only in the training phase; the cosine similarity is first computed with equation (1), and the update weight v_{m,k} is then computed as in equation (4):

$$v_{m,k} = \frac{\exp\big(s(z_k, p_m)\big)}{\sum_{k'} \exp\big(s(z_{k'}, p_m)\big)} \tag{4}$$

the updated memory is computed as in equation (5):

$$p_m \leftarrow \frac{p_m + \sum_{k} v_{m,k}\, z_k}{\Big\lVert p_m + \sum_{k} v_{m,k}\, z_k \Big\rVert_2} \tag{5}$$

the updated memory items are stored locally and are used in the read operations of both training and testing.
7. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 1, characterized in that the final predicted video frame in step (3) is obtained as follows: t-1 consecutive video frame images are input to the appearance sub-network to predict the t-th frame Î_t; simultaneously, t-1 consecutive frame-difference maps are input to the motion sub-network to predict the t-th frame-difference map X̂_t; finally, Î_t and X̂_t are added and fused to obtain the (t+1)-th frame Î_{t+1}.
8. The abnormal behavior detection method based on dual prediction of appearance and motion features according to claim 1, characterized in that the anomaly score in step (4) is computed by the following method:
(4.1) compute the peak signal-to-noise ratio (PSNR) between the (t+1)-th predicted frame Î_{t+1} and the real frame I_{t+1};
(4.2) compute the L2 distance between each encoder output feature z_k of the appearance and motion sub-networks and the memory item feature p_τ of the memory enhancement module as the feature similarity score of the two sub-networks, as in equation (6):

$$D(z_k, p_{\tau}) = \lVert z_k - p_{\tau} \rVert_2 \tag{6}$$

where τ is the index of the memory item feature with the highest similarity to z_k;
(4.3) normalize the three scores of steps (4.1) and (4.2) to [0, 1] and fuse them by addition to obtain the final anomaly score; the higher the score, the more likely the video frame is abnormal. The score is computed as in equation (7):

$$S_{t+1} = \beta \big(1 - \bar{P}(I_{t+1}, \hat{I}_{t+1})\big) + (1 - \beta)\big(D'_a(z_a, p_a) + D'_m(z_m, p_m)\big) \tag{7}$$

where P̄(I_{t+1}, Î_{t+1}), D′_a(z_a, p_a), and D′_m(z_m, p_m) denote the normalized PSNR, the appearance feature similarity score, and the motion feature similarity score, respectively.
CN202011263894.5A 2020-11-12 2020-11-12 Abnormal behavior detection method based on appearance and action feature double prediction Active CN113762007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011263894.5A CN113762007B (en) 2020-11-12 2020-11-12 Abnormal behavior detection method based on appearance and action feature double prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011263894.5A CN113762007B (en) 2020-11-12 2020-11-12 Abnormal behavior detection method based on appearance and action feature double prediction

Publications (2)

Publication Number Publication Date
CN113762007A true CN113762007A (en) 2021-12-07
CN113762007B CN113762007B (en) 2023-08-01

Family

ID=78785994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011263894.5A Active CN113762007B (en) 2020-11-12 2020-11-12 Abnormal behavior detection method based on appearance and action feature double prediction

Country Status (1)

Country Link
CN (1) CN113762007B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100599A (en) * 2022-07-01 2022-09-23 湖南工商大学 Mask transform-based semi-supervised crowd scene abnormality detection method
CN117640988A (en) * 2023-12-04 2024-03-01 书行科技(北京)有限公司 Video processing method and device, electronic equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN107358195A (en) * 2017-07-11 2017-11-17 成都考拉悠然科技有限公司 Nonspecific accident detection and localization method, computer based on reconstruction error
CN110415236A (en) * 2019-07-30 2019-11-05 深圳市博铭维智能科技有限公司 A kind of method for detecting abnormality of the complicated underground piping based on double-current neural network
CN111414876A (en) * 2020-03-26 2020-07-14 西安交通大学 Violent behavior identification method based on time sequence guide space attention
CN111860229A (en) * 2020-07-01 2020-10-30 上海嘉沃光电科技有限公司 Intelligent abnormal behavior identification method and device and storage medium
CN111897913A (en) * 2020-07-16 2020-11-06 浙江工商大学 A Cross-modal Retrieval Method for Complex Text Query to Video Based on Semantic Tree Enhancement

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HYUNJONG PARK et al.: "Learning Memory-Guided Normality for Anomaly Detection" *
LIMIN WANG et al.: "Temporal Segment Networks for Action Recognition in Videos" *
TRONG-NGUYEN et al.: "Anomaly Detection in Video Sequence With Appearance-Motion Correspondence" *
XU CHENYANG et al.: "Research on nerve segmentation methods for ultrasound images" *
LI ZIQIANG et al.: "Video abnormal behavior detection based on a dual-prediction model of appearance and motion features" *
FAN YAXIANG: "Research on deep-learning-based video abnormal event detection methods" *

Also Published As

Publication number Publication date
CN113762007B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN109961019B (en) Space-time behavior detection method
CN110135366B Occluded pedestrian re-identification method based on multi-scale generative adversarial network
CN111626245B (en) Human behavior identification method based on video key frame
CN116342596B Substation equipment nut defect detection method based on improved YOLOv5
CN106529419B Automatic object detection method based on video saliency stacked aggregation
CN109558811B (en) Motion recognition method based on motion foreground attention and unsupervised key frame extraction
CN114565594B (en) Image anomaly detection method based on soft mask contrast loss
CN110223234A Deep residual network image super-resolution reconstruction method based on cascaded shrinkage and expansion
CN111402237B Video image anomaly detection method and system based on spatiotemporal cascaded autoencoder
CN113569756B Abnormal behavior detection and localization method, system, terminal device, and readable storage medium
CN110826429A Method and system for automatic monitoring of tourism emergencies based on scenic-area video
CN113505640B (en) A small-scale pedestrian detection method based on multi-scale feature fusion
CN114913599B Video abnormal behavior detection method and system based on autoencoder
CN113762007A (en) A method for abnormal behavior detection based on bi-prediction of appearance and action features
CN113537110A (en) False video detection method fusing intra-frame and inter-frame differences
Zhou et al. Ristra: Recursive image super-resolution transformer with relativistic assessment
Ren et al. A lightweight object detection network in low-light conditions based on depthwise separable pyramid network and attention mechanism on embedded platforms
Duan et al. Multi-scale convolutional neural network for SAR image semantic segmentation
CN112532999B (en) Digital video frame deletion tampering detection method based on deep neural network
CN110263638A Video classification method based on saliency information
CN114359167A (en) A lightweight YOLOv4-based insulator defect detection method in complex scenarios
CN117876959A Abnormal behavior detection model based on reconstruction and generative adversarial network
CN116597503A Classroom behavior detection method based on spatiotemporal features
CN114882007A (en) Image anomaly detection method based on memory network
Ma et al. Decomposition-based unsupervised domain adaptation for remote sensing image semantic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant