WO2021174771A1 - A human-machine collaborative video anomaly detection method - Google Patents
A human-machine collaborative video anomaly detection method Download PDF Info
- Publication number
- WO2021174771A1 WO2021174771A1 PCT/CN2020/110579 CN2020110579W WO2021174771A1 WO 2021174771 A1 WO2021174771 A1 WO 2021174771A1 CN 2020110579 W CN2020110579 W CN 2020110579W WO 2021174771 A1 WO2021174771 A1 WO 2021174771A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- abnormal
- video frame
- normal
- frame
- Prior art date
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/732—Query formulation
- G06F16/7328—Query by example, e.g. a complete video frame or video sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/778—Active pattern-learning, e.g. online learning of image or video features
- G06V10/7784—Active pattern-learning, e.g. online learning of image or video features based on feedback from supervisors
- G06V10/7788—Active pattern-learning, e.g. online learning of image or video features based on feedback from supervisors the supervisor being a human, e.g. interactive learning with a human teacher
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
- G06F18/2178—Validation; Performance evaluation; Active pattern learning techniques based on feedback of a supervisor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/72—Data preparation, e.g. statistical preprocessing of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V20/47—Detecting features for summarising video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- The invention belongs to the technical field of video anomaly detection, and particularly relates to a video anomaly detection method based on human-machine collaboration.
- There are mainly two types of anomaly detection methods. The first is based on early hand-crafted feature descriptors, which detect anomalies in specific scenes according to specific target requirements; their detection performance is closely tied to the quality of the manual feature extraction. The second comprises the deep-learning methods that emerged after 2012, which use neural network models to learn richer representations of video frames, including hidden features that people cannot estimate, thereby greatly improving the accuracy and speed of anomaly detection.
- Although detection accuracy keeps improving, training a detection model requires a large number of samples, and the test results of the various models still contain considerable false alarms.
- The trained model therefore has to be adjusted continuously, which is time-consuming and labor-intensive, so practical demands cannot be met well.
- Existing video anomaly detection methods are designed around data distribution, model parameters, sample selection, and other aspects; even for objects that people can easily identify, the designed model still needs to be iterated and optimized before the detection (recognition) effect improves.
- the present invention proposes a human-machine collaboration video anomaly detection method.
- Step 1 For the video sequence to be detected, analyze its video parameters: the length of the video, the scene of the video, and the start and end range of abnormal segments, and agree on what counts as abnormal video; divide the video into frames and split it into video sequences of a certain length;
- Step 2 Divide the video sequence segmented in step 1 into a training set and a test set, where the training set does not include any abnormal video sequences, and the test set includes normal and abnormal video sequences;
- Step 3 Train an autoencoder model on the training-set data, adjusting the model parameters within a certain time window. The video frames and optical flow data input to the network are divided into blocks, which then pass through the convolution and pooling operations of the encoder and the deconvolution and pooling operations of the decoder.
- The Euclidean loss with L2 regularization shown in formula (1) is used as the objective function over the cuboid formed by multiple video frames in the time dimension: L(W) = Σ_{i=1}^{N} ||X_i − f_rec(X_i)||_2^2 + γ||W||_2^2 (1). It represents the Euclidean distance between the reconstructed video blocks f_rec(X_i) of the N input blocks in the video sequence and the input video blocks X_i, where γ is the adjustment factor between the two summed terms and W is the weight learned by the autoencoder neural network. The objective function is optimized to obtain the trained model.
- Step 4 Calculate the total error of each pixel value I at position (x, y) in frame t.
- The reconstruction error of each pixel at position (x, y) is expressed by formula (2): e(x, y, t) = ||I(x, y, t) − f_W(I(x, y, t))||_2 (2),
- where I(x, y, t) represents the value of pixel I at position (x, y) in frame t,
- and f_W(I(x, y, t)) represents the reconstructed pixel value.
- The anomaly score of each frame is then computed as s(t) = (e(t) − min_t e(t)) / (max_t e(t) − min_t e(t)) (3), where min_t e(t) and max_t e(t) represent the total error values corresponding to the video frames with the smallest and largest scores in the video. According to the overall detection results and the ratio of normal to abnormal frames, a threshold is set: frames scoring below
- the threshold are normal video frames, and frames above it are abnormal video frames. For the detection results, feedback is initiated with a certain probability so that a person can judge whether they are truly normal or truly abnormal. Correctly detected normal video frames are output directly; incorrectly detected
- video frames are labeled by the person.
- Step 5 Collect the incorrectly detected video frames from step 4 and store them in a buffer. Once the collected video frames reach a certain number, feed them into the autoencoder model and adjust the model parameters moderately, thereby improving the detection accuracy for similar video frames in subsequent tests.
- the ratio of the training set and the test set in the step 2 is 4:6.
- the blocks in step 3 come in three sizes: 15*15 pixels, 18*18 pixels, or 20*20 pixels.
- the certain probability in step 4 is 0.1.
- The present invention adds human feedback to conventional video anomaly detection: video frames for which feedback is initiated are confirmed by an expert, especially frames whose scores exceed the set threshold.
- This also covers cases in which the abnormal target object is heavily occluded.
- Experts use human cognitive advantages to correct and label the algorithm's detection results; both false alarms (frames that were originally normal but judged abnormal by the algorithm) and missed detections (frames that were originally abnormal but not detected by the algorithm) can be corrected. The final experimental results show improved detection accuracy without the need to update the detection model, which has practical application value.
- The invention provides a video anomaly detection method fused with human feedback.
- The method combines, to a certain extent, people's natural cognition of abnormality (with domain expertise) and the processing results of a machine learning model.
- A threshold is set on the test results and feedback requests are sent for a certain proportion of them; correct detections are confirmed and their results output directly, while detection errors are marked and returned to the input of the model so that the marked data can be processed.
- Compared with previous abnormal video detection algorithms, this provides a novel approach that combines the advantages of human cognitive analysis with the rapid processing of neural networks and improves detection accuracy.
- Figure 1 is a flow chart of a video anomaly detection method based on human-machine collaboration of the present invention
- FIG. 2 shows the detection result indicating whether there are abnormalities in the video
- The present invention proposes a video anomaly detection method based on human-machine collaboration.
- Video frames and traditional image optical-flow descriptors are used as input data; an autoencoder neural network encodes them into a hidden-layer representation, and the hidden-layer representation is then decoded and reconstructed as output.
- If the input is a normal sample, the final reconstruction maintains a high similarity with the input sample; conversely, if the input is an abnormal sample, the final reconstruction deviates from the input to a larger degree.
- an appropriate threshold value is set for the test result, the value smaller than the threshold value is regarded as normal, and the value larger than the threshold value is regarded as abnormal.
- The person judges the video frames for which feedback was initiated. Correct detections are output directly; detection errors are marked (normal is marked 1, abnormal is marked 0), and the mislabeled samples are returned to the model input. By collecting a certain number of erroneously detected video frames and feeding them to the neural network, the model is updated, so that in subsequent tests similar anomalies can be detected as real anomalies. At the same time, for abnormal videos, detection can be better targeted according to the start and end ranges of the abnormalities and the detection speed can be accelerated, which has strong practical significance in application scenarios such as public safety and social security management.
- Step 1 Analyze the video parameters of the video sequence to be detected, preparing for its processing so that the video can be handled with a basic understanding and in a more targeted way.
- The observation record includes the length of the video, the scene of the video, and the start and end range of abnormal segments, and determines what counts as an abnormality (in our experimental data set: a car, a skateboarder, a bicycle rider, a wheelchair, a running person, or a person throwing things), so as to have a clearer perception of the video to be detected.
- Perform some preprocessing: divide the video into frames and split it into video sequences of a certain length (e.g., 200 frames per sequence).
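As an illustration of this preprocessing step, a minimal sketch in Python (the helper name `split_into_sequences` and the use of plain lists are our assumptions; the 200-frame sequence length follows the example above):

```python
def split_into_sequences(frames, seq_len=200):
    """Split a list of decoded video frames into fixed-length sequences.

    Frames left over at the end that do not fill a whole sequence are
    kept as a shorter final sequence rather than discarded.
    """
    return [frames[i:i + seq_len] for i in range(0, len(frames), seq_len)]

# Example: a 450-frame video yields sequences of 200, 200, and 50 frames.
sequences = split_into_sequences(list(range(450)), seq_len=200)
```

In practice the frames themselves would come from a video decoder; only the sequence-splitting logic is shown here.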
- Step 2 Divide the video sequences obtained in step 1 into a training set and a test set, usually at a ratio of 4:6, where the training set contains no abnormal video sequences and the test set contains normal and abnormal video sequences.
- Step 3 Train the autoencoder on the training-set data with the objective function of formula (1), which measures the Euclidean distance between the reconstructed video blocks f_rec(X_i) and the input video blocks X_i, where γ is the adjustment factor between the two summed terms and W is the weight learned by the autoencoder neural network. Optimize the objective function to obtain the trained model.
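A minimal numerical sketch of this objective, assuming formula (1) is the summed squared Euclidean distance plus a γ-weighted L2 penalty on the weights (the function and variable names are ours; a real implementation would compute f_rec(X_i) with the autoencoder rather than take it as given):

```python
import numpy as np

def objective(blocks, reconstructed, weights, gamma=0.5):
    """Euclidean loss with L2 regularization over N video blocks:
    sum_i ||X_i - f_rec(X_i)||^2 + gamma * ||W||^2."""
    data_term = np.sum((np.asarray(blocks) - np.asarray(reconstructed)) ** 2)
    reg_term = gamma * np.sum(np.asarray(weights) ** 2)
    return data_term + reg_term

# Toy check: a perfect reconstruction leaves only the weight penalty.
x = np.ones((4, 8, 8))   # four 8x8 "video blocks"
w = np.full(10, 0.1)     # stand-in autoencoder weights
loss = objective(x, x, w, gamma=0.5)
```

Any normalization constant in front of the data term is absorbed into γ here; the patent text specifies only the two-term structure.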
- Step 4 After the model is trained, we calculate the total error value of each pixel value I at position (x, y) in frame t. The reconstruction error of each pixel at position (x, y) is expressed by formula (2), and the anomaly score of each frame is then calculated as the basis for judging abnormality:
- e(x, y, t) = ||I(x, y, t) − f_W(I(x, y, t))||_2 (2),
- where I(x, y, t) represents the value of pixel I at position (x, y) in frame t, and f_W(I(x, y, t)) represents the reconstructed pixel value.
- The anomaly score of each frame is obtained from formula (3): s(t) = (e(t) − min_t e(t)) / (max_t e(t) − min_t e(t)) (3),
- where min_t e(t) and max_t e(t) represent the total error values corresponding to the video frames with the smallest and largest scores in the video sequence.
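Reading formula (3) as a min-max normalization of the per-frame total error e(t) (our interpretation, consistent with the min_t/max_t definitions above), the score can be computed as:

```python
import numpy as np

def anomaly_scores(frame_errors):
    """Normalize per-frame total reconstruction errors e(t) into anomaly
    scores s(t) = (e(t) - min_t e(t)) / (max_t e(t) - min_t e(t))."""
    e = np.asarray(frame_errors, dtype=float)
    return (e - e.min()) / (e.max() - e.min())

# The frame with the smallest error scores 0, the largest scores 1.
scores = anomaly_scores([2.0, 3.0, 6.0])
```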
- Set a threshold: frames below the threshold are normal video frames and frames above it are abnormal video frames. Feedback is initiated with a certain probability (0.1) on the detection results, and a person (an expert) judges whether each is truly normal or truly abnormal; correctly detected normal video frames are output directly, and incorrectly detected video frames are labeled by the person. Video sequences composed of regular events obtain better regularity (normality) scores because they are closer, in feature space, to the normal training data; conversely, abnormal sequences have low normality scores, which can therefore be used to localize abnormalities.
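The thresholding and probabilistic feedback loop can be sketched as follows (a simplified illustration; the function names and the string labels are our assumptions, and `ask_expert` stands in for the human judgment described above):

```python
import random

def detect_with_feedback(scores, threshold, ask_expert, feedback_prob=0.1, rng=None):
    """Threshold each frame score, then request human feedback on a random
    fraction of the decisions; detection errors are corrected and recorded.

    ask_expert(i, predicted) returns the true label for frame i.
    Returns (labels, corrections), where corrections maps frame index to
    the expert-provided label for detection errors.
    """
    rng = rng or random.Random(0)
    labels, corrections = [], {}
    for i, s in enumerate(scores):
        predicted = "abnormal" if s > threshold else "normal"
        if rng.random() < feedback_prob:       # initiate feedback
            truth = ask_expert(i, predicted)
            if truth != predicted:             # detection error: person labels it
                corrections[i] = truth
                predicted = truth
        labels.append(predicted)
    return labels, corrections

# With feedback_prob=1.0 every decision is reviewed; the expert below says
# frame 1 is actually normal despite its high score (a false alarm).
labels, fixes = detect_with_feedback(
    [0.2, 0.9, 0.8], threshold=0.5,
    ask_expert=lambda i, p: "normal" if i == 1 else p,
    feedback_prob=1.0)
```

In the method above the feedback probability is 0.1 rather than 1.0; it is raised here only to make the toy run deterministic.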
- Step 5 Collect the incorrectly detected video frames from step 4 and store them in a buffer. Once the collected video frames reach a certain number, feed them into the autoencoder model and adjust the model parameters moderately, thereby improving the detection accuracy for similar video frames in subsequent tests.
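The buffering step can be sketched as a small accumulator (the class name and the callback interface are our assumptions; in the method above the callback would perform the moderate fine-tuning of the autoencoder):

```python
class FeedbackBuffer:
    """Collect mislabeled video frames; once capacity is reached, hand the
    batch to a model-update callback and reset the buffer."""

    def __init__(self, capacity, update_model):
        self.capacity = capacity
        self.update_model = update_model
        self._frames = []

    def add(self, frame, label):
        self._frames.append((frame, label))
        if len(self._frames) >= self.capacity:
            batch, self._frames = self._frames, []
            self.update_model(batch)   # moderate parameter adjustment happens here

# Example: with capacity 3, seven corrected frames trigger two updates,
# leaving one frame waiting in the buffer.
updates = []
buf = FeedbackBuffer(3, updates.append)
for i in range(7):
    buf.add(f"frame-{i}", "normal")
```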
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Library & Information Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
Description
Claims (4)
- A human-machine collaborative video anomaly detection method, characterized by the following steps: Step 1: for the video sequence to be detected, analyze its video parameters: the length of the video, the scene of the video, and the start and end range of abnormal segments, and agree on what counts as abnormal video; divide the video into frames and split it into video sequences of a certain length. Step 2: divide the video sequences obtained in step 1 into a training set and a test set, where the training set contains no abnormal video sequences and the test set contains normal and abnormal video sequences. Step 3: train an autoencoder model on the training-set data, adjusting the model parameters within a certain time window; divide the video frames and optical flow data input to the network into blocks and pass them through the convolution and pooling of the encoder and the deconvolution and pooling of the decoder; use the Euclidean loss with L2 regularization shown in formula (1) as the objective function over the cuboid formed by multiple video frames in the time dimension, representing the Euclidean distance between the N reconstructed video blocks f_rec(X_i) of the video sequence and the input video blocks X_i, where γ is the adjustment factor between the two summed terms and W is the weight learned by the autoencoder neural network; optimize the objective function to obtain the trained model; here I(x, y, t) denotes the value of pixel I at position (x, y) in frame t and f_W(I(x, y, t)) denotes the reconstructed pixel value; compute the anomaly score of each frame as the basis for judging abnormality, where min_t e(t) and max_t e(t) denote the total error values of the video frames with the smallest and largest scores in the video; according to the overall detection results and the ratio of normal to abnormal, set a threshold: frames below the threshold are normal video frames and frames above it are abnormal video frames; initiate feedback on the detection results with a certain probability and let a person judge whether they are truly normal or truly abnormal; correctly detected normal video frames are output directly, and incorrectly detected video frames are labeled by the person. Step 5: collect the incorrectly detected video frames from step 4 and store them in a buffer; once the collected video frames reach a certain number, feed them into the autoencoder model and adjust the model parameters moderately, thereby improving the detection accuracy for similar video frames in subsequent tests.
- The human-machine collaborative video anomaly detection method according to claim 1, characterized in that the ratio of the training set to the test set in step 2 is 4:6.
- The human-machine collaborative video anomaly detection method according to claim 1, characterized in that the blocks in step 3 come in three sizes: 15*15 pixels, 18*18 pixels, or 20*20 pixels.
- The human-machine collaborative video anomaly detection method according to claim 1, characterized in that the certain probability in step 4 is 0.1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/727,728 US11983919B2 (en) | 2020-03-05 | 2022-04-23 | Video anomaly detection method based on human-machine cooperation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010148420.XA CN111400547B (zh) | 2020-03-05 | 2020-03-05 | A human-machine collaborative video anomaly detection method
CN202010148420.X | 2020-03-05 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/727,728 Continuation US11983919B2 (en) | 2020-03-05 | 2022-04-23 | Video anomaly detection method based on human-machine cooperation |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021174771A1 true WO2021174771A1 (zh) | 2021-09-10 |
Family
ID=71428571
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/110579 WO2021174771A1 (zh) | 2020-08-21 | A human-machine collaborative video anomaly detection method
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111400547B (zh) |
WO (1) | WO2021174771A1 (zh) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114067251A (zh) * | 2021-11-18 | 2022-02-18 | Xi'an Jiaotong University | Unsupervised anomaly detection method for predicted frames in surveillance video
CN114092478A (zh) * | 2022-01-21 | 2022-02-25 | Hefei Zhongke Leinao Intelligent Technology Co., Ltd. | An anomaly detection method
CN114743153A (zh) * | 2022-06-10 | 2022-07-12 | Hangzhou Innovation Institute of Beihang University | Contactless dish-taking model building and dish-taking method and device based on video understanding
CN114842371A (zh) * | 2022-03-30 | 2022-08-02 | Northwestern Polytechnical University | Unsupervised video anomaly detection method
CN115484456A (zh) * | 2022-09-15 | 2022-12-16 | Chongqing University of Posts and Telecommunications | Video anomaly prediction method and device based on semantic clustering
CN117474925A (zh) * | 2023-12-28 | 2024-01-30 | Shandong Runtong Gear Group Co., Ltd. | Machine-vision-based gear pitting detection method and system
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111400547B (zh) | 2020-03-05 | 2023-03-24 | Northwestern Polytechnical University | A human-machine collaborative video anomaly detection method
CN113033424B (zh) * | 2021-03-29 | 2021-09-28 | Guangdong Zhongju Artificial Intelligence Technology Co., Ltd. | Multi-branch-based video anomaly detection method and system
CN113240022A (zh) * | 2021-05-19 | 2021-08-10 | Yanshan University | Wind turbine gearbox fault detection method using a multi-scale one-class convolutional network
CN113473124B (zh) * | 2021-05-28 | 2024-02-06 | Beijing Dajia Internet Information Technology Co., Ltd. | Information acquisition method and apparatus, electronic device, and storage medium
CN115082870A (zh) * | 2022-07-18 | 2022-09-20 | Songli Holdings Group Co., Ltd. | Parking lot abnormal event detection method
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559420A (zh) * | 2013-11-20 | 2014-02-05 | Soochow University | Method and device for constructing an anomaly detection training set
CN108509827A (zh) * | 2017-02-27 | 2018-09-07 | Alibaba Group Holding Limited | Method for identifying abnormal content in a video stream, and video stream processing system and method
CN109615019A (zh) * | 2018-12-25 | 2019-04-12 | Jilin University | Abnormal behavior detection method based on a spatio-temporal autoencoder
CN110177108A (zh) * | 2019-06-02 | 2019-08-27 | Sichuan Hongwei Technology Co., Ltd. | Abnormal behavior detection method, device, and verification system
US20190392230A1 (en) * | 2016-06-13 | 2019-12-26 | Xevo Inc. | Method and system for providing behavior of vehicle operator using virtuous cycle
CN111400547A (zh) * | 2020-03-05 | 2020-07-10 | Northwestern Polytechnical University | A human-machine collaborative video anomaly detection method
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830882B (zh) * | 2018-05-25 | 2022-05-17 | University of Science and Technology of China | Real-time video abnormal behavior detection method
US10970823B2 (en) * | 2018-07-06 | 2021-04-06 | Mitsubishi Electric Research Laboratories, Inc. | System and method for detecting motion anomalies in video
CN109359519B (zh) * | 2018-09-04 | 2021-12-07 | Hangzhou Dianzi University | Video abnormal behavior detection method based on deep learning
-
2020
- 2020-03-05 CN CN202010148420.XA patent/CN111400547B/zh active Active
- 2020-08-21 WO PCT/CN2020/110579 patent/WO2021174771A1/zh active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559420A (zh) * | 2013-11-20 | 2014-02-05 | Soochow University | Method and device for constructing an anomaly detection training set
US20190392230A1 (en) * | 2016-06-13 | 2019-12-26 | Xevo Inc. | Method and system for providing behavior of vehicle operator using virtuous cycle
CN108509827A (zh) * | 2017-02-27 | 2018-09-07 | Alibaba Group Holding Limited | Method for identifying abnormal content in a video stream, and video stream processing system and method
CN109615019A (zh) * | 2018-12-25 | 2019-04-12 | Jilin University | Abnormal behavior detection method based on a spatio-temporal autoencoder
CN110177108A (zh) * | 2019-06-02 | 2019-08-27 | Sichuan Hongwei Technology Co., Ltd. | Abnormal behavior detection method, device, and verification system
CN111400547A (zh) * | 2020-03-05 | 2020-07-10 | Northwestern Polytechnical University | A human-machine collaborative video anomaly detection method
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114067251A (zh) * | 2021-11-18 | 2022-02-18 | Xi'an Jiaotong University | Unsupervised anomaly detection method for predicted frames in surveillance video
CN114067251B (zh) * | 2021-11-18 | 2023-09-15 | Xi'an Jiaotong University | Unsupervised anomaly detection method for predicted frames in surveillance video
CN114092478A (zh) * | 2022-01-21 | 2022-02-25 | Hefei Zhongke Leinao Intelligent Technology Co., Ltd. | An anomaly detection method
CN114842371A (zh) * | 2022-03-30 | 2022-08-02 | Northwestern Polytechnical University | Unsupervised video anomaly detection method
CN114842371B (zh) * | 2022-03-30 | 2024-02-27 | Northwestern Polytechnical University | Unsupervised video anomaly detection method
CN114743153A (zh) * | 2022-06-10 | 2022-07-12 | Hangzhou Innovation Institute of Beihang University | Contactless dish-taking model building and dish-taking method and device based on video understanding
CN115484456A (zh) * | 2022-09-15 | 2022-12-16 | Chongqing University of Posts and Telecommunications | Video anomaly prediction method and device based on semantic clustering
CN115484456B (zh) * | 2022-09-15 | 2024-05-07 | Chongqing University of Posts and Telecommunications | Video anomaly prediction method and device based on semantic clustering
CN117474925A (zh) * | 2023-12-28 | 2024-01-30 | Shandong Runtong Gear Group Co., Ltd. | Machine-vision-based gear pitting detection method and system
CN117474925B (zh) * | 2023-12-28 | 2024-03-15 | Shandong Runtong Gear Group Co., Ltd. | Machine-vision-based gear pitting detection method and system
Also Published As
Publication number | Publication date |
---|---|
CN111400547B (zh) | 2023-03-24 |
US20220245945A1 (en) | 2022-08-04 |
CN111400547A (zh) | 2020-07-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20923530 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20923530 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27.03.2023) |