CN115527151B - Video anomaly detection method, system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115527151B
CN115527151B (application CN202211374647.1A)
Authority
CN
China
Prior art keywords
video
characteristic
feature
prediction
anomaly detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211374647.1A
Other languages
Chinese (zh)
Other versions
CN115527151A (en)
Inventor
崔振
朱小涵
曾志勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202211374647.1A priority Critical patent/CN115527151B/en
Publication of CN115527151A publication Critical patent/CN115527151A/en
Application granted granted Critical
Publication of CN115527151B publication Critical patent/CN115527151B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video anomaly detection method, system, electronic device and storage medium, relating to the technical field of computer vision. The method comprises: acquiring target input data, the target input data being continuous frame images of a target video; and determining whether an abnormality exists in the target video according to the target input data and a video anomaly detection model, and outputting prediction data. The video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor connected in sequence, with a prediction module further connected between the feature encoder and the feature decoder. The invention can improve the accuracy of video anomaly detection.

Description

Video anomaly detection method, system, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer vision, and in particular, to a method and system for detecting video anomalies, an electronic device, and a storage medium.
Background
Video anomaly detection is a hotspot in the monitoring and security fields and is particularly important for safeguarding public safety. Its main purpose is to detect events beyond conventional cognition, i.e. to distinguish events that do not conform to expected behavior. An anomaly detection task typically trains a normality model on a training data set containing only normal samples; in the test phase, samples that do not conform to the normality model are judged abnormal. If all expected behaviors are collected into a closed set, behaviors outside the closed set are considered abnormal events. The main difficulties of the video anomaly detection task are: 1) the boundary between normal and abnormal behavior is fuzzy; 2) video target behaviors are diverse; 3) scene diversity brings different behavior definitions, which reduces the generalization of the system, i.e. behaviors are highly scene-dependent; 4) abnormal samples are usually far fewer than normal samples, and the imbalance between positive and negative samples easily leaves the model's learning of abnormal features insufficient. Video anomaly detection in real scenes therefore remains a great challenge. According to the training mode of the neural network, existing video anomaly detection methods mainly fall into supervised learning, weakly supervised learning and unsupervised learning. Detection and localization of abnormal video behavior are generally distinguished by the differences represented by different behavior features, and proceed through three steps: moving-target detection, feature extraction, and abnormal-behavior classification.
By development stage, typical methods include traditional machine learning and deep learning. In the traditional stage, algorithms for video anomaly detection were mainly based on feature spaces built from hand-crafted features, using traditional machine learning to detect abnormal behavior; but heavy manual involvement brings low objectivity and high scene dependence, and the generalization ability of such methods is weak. In the deep learning stage, anomaly detection algorithms generally use end-to-end neural networks for adaptive feature learning and anomaly detection; however, supervised algorithms suffer from heavy manual labeling and predefined anomalies, while unsupervised or weakly supervised methods have higher false-detection and missed-detection probabilities and generalization that still needs improvement. The accuracy of video anomaly detection in the current prior art is therefore not high.
Disclosure of Invention
The invention aims to provide a video anomaly detection method, a system, electronic equipment and a storage medium, which can improve the video anomaly detection precision.
In order to achieve the above object, the present invention provides a method for detecting video anomalies, comprising:
acquiring target input data; the target input data are continuous frame images of a target video;
determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model, and outputting predicted data; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; and a prediction module is also connected between the feature encoder and the feature decoder.
Optionally, the training process of the video anomaly detection model specifically includes:
acquiring training data; the training data comprises a sample video and a corresponding detection result; the detection result comprises video normality and video abnormality;
constructing a deep learning network model based on U-Net connection;
and inputting the training data into the deep learning network model, training the deep learning network model by adopting a batch random gradient descent method, and determining the trained deep learning network model as the video anomaly detection model.
Optionally, the determining whether the target video has an abnormality according to the target input data and the video abnormality detection model specifically includes:
inputting the target input data into the feature extractor to extract pixel point feature information;
inputting the pixel point characteristic information into the characteristic encoder, obtaining characteristic reconstruction coding information according to characteristic abstract operation, and obtaining characteristic diffusion coding information according to dynamic diffusion equation operation;
inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the prediction module for prediction to obtain characteristic prediction coding information;
inputting the characteristic reconstruction coding information and the characteristic prediction coding information into the characteristic decoder for decoding operation to obtain a characteristic reconstruction sample and a characteristic prediction sample;
and determining whether an abnormality exists in the target video according to the characteristic reconstruction sample, the characteristic prediction sample and the abnormality score processor.
Optionally, inputting the feature reconstruction coding information and the feature diffusion coding information into the prediction module for prediction to obtain feature prediction coding information, which specifically includes:
constructing a dynamic diffusion equation according to the state parameters and diffusion parameters of the target input data; the state parameters comprise the spatial position, time and characteristic information of the pixel points;
and inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the dynamic diffusion equation to obtain characteristic prediction coding information.
Optionally, the determining whether an abnormality exists in the target video according to the feature reconstruction sample, the feature prediction sample and the abnormality score processor specifically includes:
performing loss operation on the characteristic reconstruction sample to obtain a reconstruction loss value;
carrying out loss operation on the characteristic prediction sample to obtain a predicted loss value;
and inputting the reconstruction loss value and the prediction loss value into the anomaly score processor to perform video anomaly detection operation, and determining whether anomalies exist in the target video.
The invention also provides a video anomaly detection system, which comprises:
the data acquisition unit is used for acquiring target input data; the target input data are continuous frame images of a target video;
the abnormality detection unit is used for determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; and a prediction module is also connected between the feature encoder and the feature decoder.
The invention also provides an electronic device, comprising a memory and a processor, wherein the memory is used for storing a computer program and the processor runs the computer program to enable the electronic device to execute the video anomaly detection method described above.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video anomaly detection method described above.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention discloses a video anomaly detection method, a system, electronic equipment and a storage medium, wherein the method comprises the steps of obtaining target input data; the target input data are continuous frame images of a target video; determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model, and outputting predicted data; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; and a prediction module is also connected between the feature encoder and the feature decoder. The video anomaly detection method and the video anomaly detection system can improve the accuracy of video anomaly detection by constructing the video anomaly detection model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of detecting video anomalies according to the present invention;
FIG. 2 is a schematic diagram of a training network of a video sequence by the video anomaly detection model in the present embodiment;
FIG. 3 is a schematic diagram of a test network of a video sequence by a video anomaly detection model in the present embodiment;
FIG. 4 is a schematic diagram of a video anomaly detection system according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a video anomaly detection method, a system, electronic equipment and a storage medium, which can improve the video anomaly detection precision.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 1, a method for detecting video anomalies provided by an embodiment of the present invention includes:
step 100: acquiring target input data; the target input data is a succession of frame images of a target video.
Step 200: determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model, and outputting predicted data; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; and a prediction module is also connected between the feature encoder and the feature decoder.
The specific process comprises the following steps:
and the first step is to input the target input data into the input feature extractor to extract the feature information of the pixel points.
And secondly, inputting the pixel point characteristic information into the characteristic encoder, obtaining characteristic reconstruction coding information according to characteristic abstract operation, and obtaining characteristic diffusion coding information according to dynamic diffusion equation operation.
And thirdly, inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the prediction module for prediction to obtain characteristic prediction coding information.
The further operation mode of the step comprises the following steps: constructing a dynamic diffusion equation according to the state parameters and diffusion parameters of the target input data; the state parameters comprise the spatial position, time and characteristic information of the pixel points; and inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the dynamic diffusion equation to obtain characteristic prediction coding information.
And fourthly, inputting the characteristic reconstruction coding information and the characteristic prediction coding information into the characteristic decoder for decoding operation to obtain a characteristic reconstruction sample and a characteristic prediction sample.
And fifthly, determining whether an abnormality exists in the target video according to the characteristic reconstruction sample, the characteristic prediction sample and the abnormality score processor.
The further operation mode of the step comprises the following steps: performing loss operation on the characteristic reconstruction sample to obtain a reconstruction loss value; carrying out loss operation on the characteristic prediction sample to obtain a predicted loss value; and inputting the reconstruction loss value and the prediction loss value into the anomaly score processor to perform video anomaly detection operation, and determining whether anomalies exist in the target video.
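The five steps above can be wired together as a toy pipeline. Everything here — the stub models, shapes and the averaging stand-ins — is an illustrative assumption, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(42)

def feature_extractor(frames):            # stand-in for a conv backbone
    return frames / 255.0

def feature_encoder(feats):
    recon_code = feats.mean(axis=0)                   # "feature reconstruction coding"
    diff_code = np.diff(feats, axis=0).mean(axis=0)   # "feature diffusion coding"
    return recon_code, diff_code

def predictor(recon_code, diff_code):     # diffusion-style prediction module
    return recon_code + diff_code         # forward-Euler-like step

def feature_decoder(recon_code, pred_code):
    return recon_code, pred_code          # stand-in for a U-Net decoder

def anomaly_score(recon_sample, pred_sample, target, lam=0.5):
    rec_loss = np.mean((recon_sample - target[:-1].mean(axis=0)) ** 2)
    pred_loss = np.mean((pred_sample - target[-1]) ** 2)
    return pred_loss + lam * rec_loss

frames = rng.integers(0, 256, size=(5, 8, 8)).astype(float)  # 4 history + 1 future
feats = feature_extractor(frames)
recon_code, diff_code = feature_encoder(feats[:-1])
pred_code = predictor(recon_code, diff_code)
recon_sample, pred_sample = feature_decoder(recon_code, pred_code)
score = anomaly_score(recon_sample, pred_sample, feats)
```

A larger score indicates a clip that the reconstruction-plus-prediction model explains poorly.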
The training process of the video anomaly detection model specifically comprises the following steps:
firstly, acquiring training data; the training data comprises a sample video and a corresponding detection result; the detection result comprises video normality and video abnormality.
And secondly, constructing a deep learning network model based on U-Net connection.
And thirdly, inputting the training data into the deep learning network model, training the deep learning network model by adopting a batch random gradient descent method, and determining the trained deep learning network model as the video anomaly detection model.
As shown in fig. 2 to 3, as a specific embodiment, the method is mainly divided into the following parts:
data preparation stage:
for video anomaly detection tasks, a large number of relevant videos are collected, continuous frame images in the videos are selected as a data set A, and the A is divided into a training set T containing category differences r And test set T e Wherein T is r Includes only normal mode samples and T e Including normal and abnormal pattern samples.
Model modeling stage:
In this embodiment, the complete video anomaly detection model is denoted M and includes a feature extractor F, a feature encoder E, a feature decoder D and an anomaly score processor C. M uses a reconstruction module and a prediction module to realize feature extraction, encoding, decoding and end-to-end network training. The input video sequence of the model is denoted I = (I_1, I_2, I_3, ..., I_L), where the subscript L denotes the sequence length. Each frame image I_i yields the mapped spatial feature F_i = F(I_i), and the feature sequence after mapping I is denoted
F = (F_1, F_2, ..., F_L), F_i ∈ R^D
where D represents the feature dimension. In the actual implementation, a 4-frames-predict-1-frame strategy is adopted: the input is a sequence of continuous video frames in which the first four frames are history frames and the following frame is the future frame.
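A sketch of the 4-frames-predict-1 input construction under a sliding window; `make_clips` and the dummy video are assumptions for illustration:

```python
import numpy as np

def make_clips(frames, history=4):
    """Slide a (history+1)-frame window over a video: the first `history`
    frames are the input history, the last frame is the prediction target."""
    return [(frames[i:i + history], frames[i + history])
            for i in range(len(frames) - history)]

# 10 dummy 4x4 frames; frame i has constant pixel value i
video = np.arange(10)[:, None, None] * np.ones((1, 4, 4))
clips = make_clips(video)
```

Each element of `clips` is a (history frames, future frame) pair ready for the model.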
The model first abstracts the features F layer by layer with an encoder (Encoder, E), then maps the abstracted features back to the original feature space with a decoder (Decoder, D) to obtain a reconstructed sample Î. Under the constraint of the loss function, the encoder E extracts more accurate features to facilitate the subsequent prediction process.
Prediction derivation based on reconstruction assistance: modeling the video sequence I in continuous space, a diffusion equation is established over variables such as spatial position, time and characteristic information (or energy), and the final prediction form is obtained by discretization. Specifically, the characteristic information (or energy) of the pixel at time t and spatial position (x, y) is denoted u(x, y, t). For a local square region S, where S is small enough to be treated as a pixel-level particle, the characteristic information H(t) of region S can be expressed by the integral
H(t) = ρ ∬_S u(x, y, t) dx dy
where ρ is a coefficient analogous to the specific heat capacity in physics. The change of information in region S can be represented by the gradient
dH/dt = ρ ∬_S ∂u(x, y, t)/∂t dx dy
An infinitesimal calculation is then performed on region S, approximating the spatial variation by the value at the center point. Based on conservation of heat in the heat-conduction process, the diffusion information is computed following the well-known Fourier law — the diffusion heat flowing through unit area in unit time is proportional to the temperature gradient of the diffusing material — giving the energy changes
ΔQ_x = k (∂u/∂x) Δy Δt,  ΔQ_y = k (∂u/∂y) Δx Δt
where ΔQ_x is the change of heat in the x direction, ΔQ_y is the change of heat in the y direction, H is the characteristic information and t is time.
The following dynamic equation of pixel-level particle motion can then be constructed:
∂u/∂t = α (∂²u/∂x² + ∂²u/∂y²)
where the diffusion parameter α is expressed through the relevant physical parameter ρ. Discretizing the continuous derivatives, with the four-frame time sequence defined as the history frame u_t and the frame after the fourth frame as the future frame u_{t+1}, gives the prediction form of the feature-level future frame:
u_{t+1} = u_t + α · u_t · diagM · Δt + α · diagM · u_t · Δt
where diagM is a tridiagonal matrix obtained in the derivation, Δt is the discretized time-interval change, and α is the diffusion parameter. This yields a prediction module based on a partial differential dynamic diffusion equation (abbreviated PDE, defined as predictor) that predicts the future frame u_{t+1} from the history frame u_t at the feature level. Since u_t comes from the history-frame sequence features of the feature extractor F, the predictor is a dynamic diffusion predictor, i.e. the derived prediction samples are
F_p = predictor(F)
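A minimal numpy sketch of this discretized update, assuming diagM is the standard (1, −2, 1) second-difference stencil (the patent does not give its entries explicitly) and reading diagM·u_t and u_t·diagM as the differences along the two spatial axes:

```python
import numpy as np

def tridiag_laplacian(n):
    # tridiagonal matrix diagM: 1D second-difference stencil (1, -2, 1)
    return -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

def diffusion_step(u, alpha, dt):
    # u_{t+1} = u_t + alpha * (M @ u_t + u_t @ M) * dt
    # M @ u_t differences along one axis, u_t @ M along the other
    M = tridiag_laplacian(u.shape[0])
    return u + alpha * dt * (M @ u + u @ M)

u_t = np.zeros((8, 8))
u_t[4, 4] = 1.0                      # a single "hot" pixel-level particle
u_next = diffusion_step(u_t, alpha=0.1, dt=0.5)
```

One step spreads the particle's energy into its four neighbors while shrinking the center value, mimicking heat diffusion at the feature level.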
the diffusion parameter α results from the encoding process, with each layer downsampling outputting the features u and α accordingly. For each video segment, the current input I corresponds to the downsampled feature F i And diffusion parameter alpha i The definition is as follows:
F i =f u (I)
α i =f α (I)
f u and f α Respectively, the convolution layers for feature u learning and diffusion parameter alpha learning, i representing the downsampling layer index. The encoding process based on encoder E can be summarized as f=e (F '), where F' is the current input, i.e. the upper layer output.
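A sketch of the per-layer branches f_u and f_α, modeled here as two stride-2 single-channel convolutions (the kernel size, stride and naive convolution routine are assumptions for illustration):

```python
import numpy as np

def conv2d(x, w, stride=1):
    # naive single-channel "valid" convolution (illustrative, not optimized)
    k = w.shape[0]
    H, W = x.shape
    return np.array([[np.sum(x[i:i + k, j:j + k] * w)
                      for j in range(0, W - k + 1, stride)]
                     for i in range(0, H - k + 1, stride)])

rng = np.random.default_rng(0)
w_u = rng.normal(size=(3, 3))           # weights of the f_u branch
w_alpha = rng.normal(size=(3, 3))       # weights of the f_alpha branch

I = rng.normal(size=(16, 16))           # current input of a downsampling layer
F_i = conv2d(I, w_u, stride=2)          # downsampled feature u
alpha_i = conv2d(I, w_alpha, stride=2)  # diffusion parameter map
```

Both branches share the input and downsample it in lockstep, so each feature location has a matching diffusion parameter.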
Correspondingly, the decoding process based on decoder D is summarized as
(Î_r, Î_p) = D(F, F_p)
where Î_r is the feature reconstruction sample and Î_p is the feature prediction sample. By combining low-level detail features with high-level semantic features, the utilization of feature information in every layer of the network is greatly improved.
The overall loss of the model is defined as
L = L_r + λ L_p
where the weight parameter λ is used to balance the prediction loss L_p and the reconstruction loss L_r. Both losses are computed as mean squared error, i.e. the average squared error between the output result and the true value:
L_mse = (1/N) Σ_{i=1}^{N} || x̂_i − x_i ||²
The encoder E and decoder D are optimized simultaneously by minimizing the overall loss. In the decoder, the loss is computed from the obtained frame-level outputs; since a 4-frames-predict-1-frame strategy is adopted, for each input video segment of consecutive frames I = (I_i, I_{i+1}, I_{i+2}, I_{i+3}), the resulting reconstructed history frames Î_r and predicted future frame Î_p have the same size as the corresponding ground-truth values.
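The loss combination can be sketched as follows; the λ value and array shapes are illustrative assumptions:

```python
import numpy as np

def mse(pred, target):
    # mean squared error between output and ground truth
    return float(np.mean((pred - target) ** 2))

def overall_loss(recon, history, pred, future, lam=0.6):
    # L = L_r + lambda * L_p
    return mse(recon, history) + lam * mse(pred, future)

rng = np.random.default_rng(1)
history = rng.normal(size=(4, 8, 8))   # four ground-truth history frames
future = rng.normal(size=(8, 8))       # one ground-truth future frame

perfect = overall_loss(history, history, future, future)       # ideal outputs
off_by_one = overall_loss(history + 1.0, history, future, future)
```

A perfect reconstruction and prediction gives zero loss; any deviation raises the corresponding term.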
The above steps are unified into an integral end-to-end deep learning network framework, and the model can realize end-to-end training and effect testing.
Model training stage:
first, a normal event data set T acquired by a data preparation stage r For training of the subsequent model M. Second, randomly sampling training set T r The multi-frame continuous images in the model are used as input samples, input into the current model M, optimize the model through a batch random gradient descent method, aim at minimizing the total loss, stop training when the total loss does not obviously change along with the increase of training rounds, otherwise continue to execute the present step process. Finally, the model M' is obtained after training is completed.
Model test stage:
first, a given video test data set T containing normal and abnormal events is input e Based on the trained model M', a complete test process of reconstruction prediction is realized by sequentially passing through the feature extractor F, the feature encoder E, the feature decoder D and the anomaly score processor C.
Next, a fixed-size continuous frame video clip sequence I obtained by a sliding window method is tested, and an anomaly score processor C is used to calculate a video anomaly detection result (anomaly or abnormal)Normal), anomaly score employs predictive score
Figure BDA0003926088430000091
And reconstruction score->
Figure BDA0003926088430000092
Form of combination:
Figure BDA0003926088430000093
wherein the prediction score
Figure BDA0003926088430000094
And reconstruction score->
Figure BDA0003926088430000095
From->
Figure BDA0003926088430000096
And->
Figure BDA0003926088430000097
Assignment of losses, setting of the balance weight parameter lambda depends on the properties of different data sets.
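The score combination at test time can be sketched as below; the λ, threshold and error values are illustrative assumptions:

```python
def anomaly_score(pred_err, recon_err, lam=0.4):
    # S = S_p + lambda * S_r, scores assigned from the two losses
    return pred_err + lam * recon_err

def detect(pred_err, recon_err, threshold=0.5, lam=0.4):
    # a clip whose combined score exceeds the threshold is flagged abnormal
    if anomaly_score(pred_err, recon_err, lam) > threshold:
        return "abnormal"
    return "normal"

normal_clip = detect(pred_err=0.05, recon_err=0.10)    # small errors
abnormal_clip = detect(pred_err=0.80, recon_err=0.60)  # large errors
```

In practice λ and the threshold would be tuned per data set, as the text notes.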
Through the above operation process, it can be seen that:
modeling the relative changes of the moving object at different moments by using a thermal conduction equation based on dynamic diffusion. In practice, anomaly detection aims to distinguish between all events that do not meet the expected behavior. If all expected behavior is collected into a closed set, behavior outside the closed set will be considered an exception event. From the angle of particle motion, the motion track of the abnormal pixel particles is in a state inconsistent with the normal pixel particles due to breaking the natural law of object motion. In order to build a model of the movement of the pixel-level particles, the invention researches the energy diffusion phenomenon of the pixel particle flow in video anomaly detection, and constructs a heat conduction equation of the pixel-level particles to describe the dynamic characteristics of the video by referring to the heat diffusion process in physics. The present invention describes the variation of a time frame by defining and calculating a diffusion equation, and describes the spatial variation using diffusion parameters. For different events, the diffusion parameters encode corresponding distribution characterization, for example, rapid changes of speed in running, throwing and other events cause strong comparison of the corresponding diffusion parameters with the diffusion parameters in a normal mode, fluctuation of the diffusion process is formed, the diffusion parameters obtained through a neural network deviate from the diffusion parameters obtained in the normal mode training, and the modeling process accords with the change rule generated by abnormality so as to better detect the abnormality.
Diffusion-equation-based discrete prediction is jointly trained with reconstruction-assisted anomaly detection, solving the dynamic partial differential equation end to end so that future-frame prediction and history-frame reconstruction are realized together. A model based only on reconstruction or only on prediction is prone to missing abnormal behaviors; combining reconstruction and prediction enhances the model's ability to capture and detect anomalies and motion changes. The reconstruction process helps improve the accuracy of the neural network's feature learning, thereby strengthening the model's ability to predict future frames. The ability to capture and detect motion changes can be established by a model trained on normal samples, achieving a better anomaly detection effect in the test stage.
A U-Net based on an automatic encoder-decoder is adopted as the backbone network to perform feature extraction and reconstruction-prediction on the input multi-frame continuous images, and the network loss is computed at the image level. Through cross-layer connections, the U-Net realizes cross-layer integration of features at different scales, reduces the parameter count and enhances the model's learning and generalization ability for motion information.
Therefore, the video anomaly detection method provided by the invention draws inspiration from the dynamic heat-conduction equation of the diffusion mechanism and realizes video anomaly detection end to end by simulating the feature-learning process in video anomaly detection, so that the model achieves better performance through explicit modeling. On the one hand, the neural network adaptively learns feature changes in space and time; on the other hand, the dynamic diffusion module adaptively encodes the normal mode of multiple frames. Unlike previous methods that use self-supervised discriminative cues to distinguish abnormal from normal modes, or that rely on scene-reconstruction strategies to regress normal texture modes, the invention uses a reconstruction-aided prediction strategy to predict normal texture modes by capturing the most basic components of video-stream change, in which pixel-level particle motion rules are mined and encoded for better generalization. The diffusion parameters corresponding to different events encode the corresponding distribution characterizations, and rapid speed changes lead to strong contrast with habitual patterns, causing fluctuations in the diffusion process. The invention defines a dynamic diffusion equation and obtains the final prediction form by modeling the diffusion equation over the spatial position, characteristic information (or, in physics, energy) and time of the video sequence in continuous space. The invention achieves considerable and comparable performance on real video data sets.
As shown in fig. 4, the present invention further provides a video anomaly detection system, including:
the data acquisition unit is used for acquiring target input data; the target input data is a succession of frame images of a target video.
The abnormality detection unit is used for determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; and a prediction module is also connected between the feature encoder and the feature decoder.
The invention also provides electronic equipment, comprising a memory and a processor, wherein the memory is used for storing a computer program and the processor runs the computer program to cause the electronic equipment to execute the video anomaly detection method described above.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video anomaly detection method described above.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the identical or similar parts between embodiments reference may be made to one another.
The principles and embodiments of the present invention have been described herein with reference to specific examples; the above description is intended only to assist in understanding the method of the invention and its core concept. Meanwhile, a person of ordinary skill in the art may make modifications to the specific implementation and application scope in light of the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (5)

1. A video anomaly detection method, comprising:
acquiring target input data; the target input data are continuous frame images of a target video;
determining whether an abnormality exists in the target video according to the target input data and the video abnormality detection model, and outputting predicted data; the video anomaly detection model comprises a feature extractor, a feature encoder, a feature decoder and an anomaly score processor which are connected in sequence; a prediction module is also connected between the feature encoder and the feature decoder;
the determining whether the target video has an abnormality according to the target input data and the video abnormality detection model specifically includes:
inputting the target input data into the feature extractor to extract pixel point feature information;
inputting the pixel point characteristic information into the characteristic encoder, obtaining characteristic reconstruction coding information according to characteristic abstract operation, and obtaining characteristic diffusion coding information according to dynamic diffusion equation operation;
inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the prediction module for prediction to obtain characteristic prediction coding information;
inputting the characteristic reconstruction coding information and the characteristic prediction coding information into the characteristic decoder for decoding operation to obtain a characteristic reconstruction sample and a characteristic prediction sample;
determining whether an abnormality exists in the target video according to the characteristic reconstruction sample, the characteristic prediction sample and the abnormality score processor;
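The five steps above can be sketched as a single data flow. The component bodies below are trivial hypothetical stand-ins (the scaling factors and tensor shapes are invented for illustration, not the patented implementation), showing only how the feature extractor, feature encoder, prediction module, and feature decoder chain together:

```python
import numpy as np

# hypothetical stand-ins for the four model components
def feature_extractor(frames):            # pixel-point feature information
    return frames.astype(float)

def feature_encoder(feats):               # reconstruction code + diffusion code
    return feats * 0.9, feats * 1.1

def prediction_module(rec_code, diff_code):
    return (rec_code + diff_code) / 2     # feature prediction coding information

def feature_decoder(rec_code, pred_code): # decode both coding streams
    return rec_code, pred_code            # reconstruction / prediction samples

frames = np.ones((4, 8, 8))               # four consecutive input frames
feats = feature_extractor(frames)
rec_code, diff_code = feature_encoder(feats)
pred_code = prediction_module(rec_code, diff_code)
rec_sample, pred_sample = feature_decoder(rec_code, pred_code)
```

The two decoder outputs are exactly the pair that the anomaly score processor consumes in the final step.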
inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the prediction module for prediction to obtain characteristic prediction coding information, wherein the method specifically comprises the following steps:
constructing a dynamic diffusion equation according to the state parameters and diffusion parameters of the target input data; the state parameters comprise the spatial position, time and characteristic information of the pixel points;
inputting the characteristic reconstruction coding information and the characteristic diffusion coding information into the dynamic diffusion equation to obtain characteristic prediction coding information;
defining the four-frame time-series history frames at the feature level as u_t and the future frame as u_{t+1}, the prediction form of the feature-level future frame can be obtained as:

u_{t+1} = u_t + α·u_t·diagM·Δt + α·diagM·u_t·Δt

wherein diagM is a tri-diagonal matrix obtained in the derivation process, Δt is the discretized time-interval change, and α is the diffusion parameter, thereby obtaining, at the feature level, the partial-differential-based dynamic diffusion equation from the history frame u_t to the predicted future frame u_{t+1}.
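A minimal numerical sketch of this discretized update: the tri-diagonal matrix is taken here to be a discrete second-difference operator, and the feature vector, α and Δt are illustrative values, not the patent's learned parameters:

```python
import numpy as np

def tridiag(n, lower, diag, upper):
    # build an n x n tri-diagonal matrix
    return (np.diag(np.full(n - 1, lower), -1)
            + np.diag(np.full(n, diag))
            + np.diag(np.full(n - 1, upper), 1))

def diffuse_step(u_t, M, alpha, dt):
    # u_{t+1} = u_t + a*(u_t @ M)*dt + a*(M @ u_t)*dt
    return u_t + alpha * (u_t @ M) * dt + alpha * (M @ u_t) * dt

n = 5
M = tridiag(n, 1.0, -2.0, 1.0)   # discrete second-difference operator
u = np.zeros(n)
u[2] = 1.0                       # a single "hot" feature in the middle
u_next = diffuse_step(u, M, alpha=0.1, dt=0.5)
```

With this symmetric choice of diagM the two terms coincide, and one step spreads the central feature value to its neighbours while (away from the boundary) conserving the total, which is the diffusion behaviour the equation is meant to capture.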
2. The video anomaly detection method of claim 1, wherein the training process of the video anomaly detection model specifically comprises:
acquiring training data; the training data comprises a sample video and a corresponding detection result; the detection result comprises video normality and video abnormality;
constructing a deep learning network model based on U-Net connection;
and inputting the training data into the deep learning network model, training the deep learning network model by adopting a batch random gradient descent method, and determining the trained deep learning network model as the video anomaly detection model.
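Batch random (stochastic) gradient descent itself can be illustrated on a toy stand-in for the deep network, here plain linear regression in numpy; the learning rate, batch size, and epoch count are arbitrary choices for the sketch, not the patent's training settings:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy stand-in for the network: a linear model y = X @ w
X = rng.normal(size=(256, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w

w = np.zeros(4)
lr, batch = 0.1, 32
for epoch in range(200):
    idx = rng.permutation(len(X))       # shuffle: the "random" in SGD
    for s in range(0, len(X), batch):   # iterate over mini-batches
        b = idx[s:s + batch]
        # gradient of mean squared error on this batch
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad
```

Each epoch reshuffles the data and updates the parameters once per mini-batch, which is the same optimization loop a deep network would use with the gradients supplied by backpropagation.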
3. The method for detecting video anomalies according to claim 1, wherein the determining whether anomalies exist in the target video according to the feature reconstruction samples, the feature prediction samples and the anomaly score processor specifically comprises:
performing loss operation on the characteristic reconstruction sample to obtain a reconstruction loss value;
carrying out loss operation on the characteristic prediction sample to obtain a predicted loss value;
and inputting the reconstruction loss value and the prediction loss value into the anomaly score processor to perform video anomaly detection operation, and determining whether anomalies exist in the target video.
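As a hedged sketch of this scoring step, the snippet below fuses per-frame reconstruction and prediction errors with an assumed equal weight and min-max normalises the result over the video; the error values and the 0.5 threshold are invented for illustration:

```python
import numpy as np

def frame_error(pred, target):
    # mean squared error between a generated frame and the ground truth
    return float(np.mean((pred - target) ** 2))

def anomaly_scores(rec_errors, pred_errors, w=0.5):
    # fuse the two loss streams, then min-max normalise over the video
    s = w * np.asarray(rec_errors) + (1 - w) * np.asarray(pred_errors)
    return (s - s.min()) / (s.max() - s.min() + 1e-8)

rec = [0.10, 0.11, 0.50, 0.12]    # reconstruction losses per frame
pred = [0.08, 0.09, 0.60, 0.10]   # prediction losses per frame
scores = anomaly_scores(rec, pred)
abnormal = scores > 0.5           # threshold the fused anomaly score
```

Frames whose fused, normalised score exceeds the threshold are flagged as abnormal; here the third frame, whose reconstruction and prediction both fail, stands out.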
4. An electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the video anomaly detection method of any one of claims 1 to 3.
5. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the video abnormality detection method according to any one of claims 1 to 3.
CN202211374647.1A 2022-11-04 2022-11-04 Video anomaly detection method, system, electronic equipment and storage medium Active CN115527151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211374647.1A CN115527151B (en) 2022-11-04 2022-11-04 Video anomaly detection method, system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115527151A CN115527151A (en) 2022-12-27
CN115527151B true CN115527151B (en) 2023-07-11

Family

ID=84705094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211374647.1A Active CN115527151B (en) 2022-11-04 2022-11-04 Video anomaly detection method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115527151B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115797349B (en) * 2023-02-07 2023-07-07 广东奥普特科技股份有限公司 Defect detection method, device and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112908465A (en) * 2021-01-04 2021-06-04 西北工业大学 Ultrasonic key frame automatic identification method based on anomaly detection and semi-supervision
CN113132737A (en) * 2021-04-21 2021-07-16 北京邮电大学 Video prediction method based on Taylor decoupling and memory unit correction
CN113569756A (en) * 2021-07-29 2021-10-29 西安交通大学 Abnormal behavior detection and positioning method, system, terminal equipment and readable storage medium
CN113705490A (en) * 2021-08-31 2021-11-26 重庆大学 Anomaly detection method based on reconstruction and prediction
CN114820708A (en) * 2022-04-28 2022-07-29 江苏大学 Peripheral multi-target trajectory prediction method based on monocular visual motion estimation, model training method and device
CN114926767A (en) * 2022-05-27 2022-08-19 湖南工商大学 Prediction reconstruction video anomaly detection method fused with implicit space autoregression


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Advanced Image Analysis for Learning Underlying Partial Differential Equations for Anomaly Identification; Andrew Miller et al.; Journal of Imaging Science and Technology, Vol. 64, No. 2 *
Research on Several Key Issues in Intelligent Video Surveillance Systems; Zou Yibo; China Doctoral Dissertations Full-text Database, Information Science and Technology Series, No. 02 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant