CN116580330A - Machine test abnormal behavior detection method based on double-flow network - Google Patents
Machine test abnormal behavior detection method based on double-flow network
- Publication number
- CN116580330A (application CN202310278231.8A)
- Authority
- CN
- China
- Prior art keywords
- resnet
- network
- flow
- abnormal behavior
- double
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Abstract
A machine test abnormal behavior detection method based on a double-flow network comprises the following steps. First, machine examination videos are collected and labeled to generate a machine examination abnormal behavior data set. Video frame images are extracted from the videos, and optical flow images are extracted from the video frame images to obtain double-flow information. A randomly cropped group of 16 consecutive video frames and the corresponding optical flow images serves as one group of inputs, from which the spatial-stream and temporal-stream features of the video are extracted by a Resnet3D network; an attention module added to the Resnet3D network strengthens action feature extraction and improves network performance. Finally, the double-flow features are fused and the abnormal machine examination behaviors of examinees are classified. By combining a double-flow network with an attention mechanism, the invention improves the accuracy of machine test abnormal behavior recognition and can be applied to machine test abnormal behavior detection, reducing the workload of the relevant staff and making machine tests fairer and more just.
Description
Technical Field
The invention relates to the field of video behavior recognition, in particular to a machine test abnormal behavior detection method based on a double-flow network.
Background
Examinations have long been a means of measuring knowledge, and their fairness and importance are self-evident. In recent years, with the continuous development of technology, computer-based examination (machine test) has also become popular. However, since an examination usually bears on the examinee's interests, there is a risk that examinees will try to obtain good results by cheating in the examination room. With the popularization of signal-shielding devices in examination rooms, examinees are now largely prevented from communicating with the outside, so the currently prevalent cheating means is passing information between examinees inside the room.
To detect cheating in machine examinations, a monitoring device is installed on every examinee's computer to record the examinee's behavior in real time; after the examination, the monitoring videos are uploaded to a server for staff review. However, because these examination videos are long, staff must watch the playback attentively for extended periods, and fatigue easily causes cheating to be missed.
Disclosure of Invention
In order to overcome the defects and shortcomings of the prior art, the invention provides a machine test abnormal behavior detection method based on a double-flow network, which specifically comprises the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
step three: randomly cropping 16 consecutive video frames and the corresponding optical flow images as one group of inputs, and obtaining the double-flow features through a Resnet3D network with a spatial attention module;
step four: performing feature fusion on the double-flow features obtained in step three, obtaining the category probability of each machine test behavior through a softmax layer, and predicting the examinee's machine test behavior.
Further, in the Resnet3D network designed in step three, the initial convolution layer is conv1 with a 7×7×7 convolution kernel and a sliding stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer downsamples the high-dimensional features, which also greatly reduces the parameter count of the final fully-connected layer FC. Each building block is composed of two residual component units and a spatial attention module; the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
Further, the spatial attention module performs channel-dimension maximum pooling and average pooling on the input feature matrix to obtain two feature matrices whose channel dimension is 1, splices them into a feature matrix whose channel dimension is 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix. The calculation formula is as follows:

F_out = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])) ⊗ F

where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a three-dimensional convolution layer with a 7×7×7 kernel, [ ; ] denotes channel-wise concatenation, ⊗ denotes element-wise multiplication, and MaxPool(F) and AvgPool(F) denote the channel-dimension maximum pooling and average pooling operations on the feature matrix F.
Further, the specific process in step three of taking 16 randomly cropped consecutive video frames and the corresponding optical flow images as one group of inputs and passing them through the Resnet3D convolutional neural network to obtain the double-flow features comprises:
step 201: taking a sequence of 16 consecutive original video frames as the input of the first channel of the Resnet3D convolutional neural network, wherein the output feature vector is F1;
step 202: taking the corresponding sequence of 16 consecutive optical flow images as the input of the second channel of the Resnet3D convolutional neural network, wherein the output feature vector is F2.
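Steps 201 and 202 can be sketched with numpy standing in for the trained network; the `backbone` function below is a hypothetical placeholder (a random projection followed by global average pooling), not the patent's Resnet3D, and the 512-dimensional output width is an assumption borrowed from the last residual block's channel count:

```python
import numpy as np

rng = np.random.default_rng(1)

def backbone(clip, out_channels=512):
    """Hypothetical stand-in for one Resnet3D channel: a random 1x1x1
    projection to out_channels feature maps, then global average pooling."""
    proj = rng.standard_normal((out_channels, clip.shape[0])) * 0.1
    feats = np.tensordot(proj, clip, axes=([1], [0]))  # out_channels x D x H x W
    return feats.mean(axis=(1, 2, 3))                  # global average pooling

rgb_clip = rng.standard_normal((3, 16, 32, 32))   # spatial size reduced for speed
flow_clip = rng.standard_normal((2, 16, 32, 32))  # assumed 2-channel optical flow
F1 = backbone(rgb_clip)   # first-channel (spatial stream) feature vector
F2 = backbone(flow_clip)  # second-channel (temporal stream) feature vector
```

Both streams yield fixed-length feature vectors regardless of the input's spatial size, which is what makes the later concatenation step well-defined.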
Further, the input original video image sequence and optical flow image sequence are four-dimensional tensors that can be expressed as C×D×H×W, where C, D, H and W denote the number of channels, the image depth (number of frames), the image height and the image width respectively.
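The C×D×H×W layout can be illustrated directly; numpy stands in for a deep-learning framework here, and the 2-channel optical-flow layout (horizontal and vertical displacement components) is an assumption, since the patent does not state the flow image's channel count:

```python
import numpy as np

# RGB clip: C x D x H x W = 3 channels x 16 frames x 224 x 224 pixels
rgb_clip = np.zeros((3, 16, 224, 224), dtype=np.float32)

# Optical-flow clip: assumed 2 channels (horizontal/vertical displacement)
flow_clip = np.zeros((2, 16, 224, 224), dtype=np.float32)

C, D, H, W = rgb_clip.shape  # number of channels, image depth, height, width
```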
Further, the specific process of the fourth step comprises:
step 301: splicing the obtained feature vector F1 and the feature vector F2 to obtain a fusion feature vector F;
step 302: passing the feature vector F through the fully-connected layer FC to obtain an output result R, and finally using a softmax layer to calculate the probability distribution over the different behavior categories.
the invention has the following beneficial effects:
1. By using the original video frames and the optical flow images as input simultaneously, the invention enhances the network's capture and extraction of human motion information, which effectively improves the accuracy of behavior recognition.
2. By adding the attention mechanism module, the network's weights can be redistributed toward the regions of interest, so that the regions where human actions occur receive far more attention than other regions, suppressing irrelevant redundant information and improving network performance.
3. The invention adopts a method combining a double-flow network with an attention mechanism, which improves the accuracy of machine test abnormal behavior recognition and can be applied to machine test abnormal behavior detection, reducing the workload of the relevant staff and making machine tests fairer and more just.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a model of the present invention;
FIG. 3 is a schematic diagram of a Resnet3D convolutional neural network of the present invention;
FIG. 4 is a schematic diagram of the residual building block structure of the Resnet3D network of the present invention;
FIG. 5 is an output diagram of the Resnet3D convolutional neural network structure of the present invention.
Detailed Description
The invention will be further described with reference to the following specific examples, but the scope of the invention is not limited thereto:
Example: as shown in FIG. 1, a machine test abnormal behavior detection method based on a double-flow network specifically comprises the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
the method comprises the steps of collecting machine-check video data, marking abnormal behaviors in the machine-check video data, obtaining the data set by adopting modes of video monitoring, handheld equipment shooting and the like, wherein the data set is a self-collected data set, the obtaining way is reasonable and reliable, and the task requirement is met.
Step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
Step three: randomly cropping 16 consecutive video frames and the corresponding optical flow images as one group of inputs, and obtaining the double-flow features through a Resnet3D network with a spatial attention module;
The sequences of 16 consecutive original video frames and optical flow images are fed respectively into the first and second channels of the Resnet3D convolutional neural network, yielding the output feature vectors F1 and F2.
The structure of the Resnet3D convolutional neural network used is shown in FIG. 3. The network input is 3×16×224×224, where 3 is the number of RGB channels, 16 is the number of consecutive frames acquired, and 224×224 is the image resolution in pixels. The initial convolution layer is conv1, with a 7×7×7 convolution kernel and a sliding stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer downsamples the high-dimensional features, which also greatly reduces the parameter count of the final fully-connected layer FC. Each building block is composed of two residual component units and a spatial attention module; the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
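A small helper makes the conv1 downsampling bookkeeping concrete; the padding of 3 per dimension is an assumption (common in ResNet3D-style implementations) and is not stated in the patent:

```python
def conv3d_out_size(in_size, kernel, stride, padding):
    """Per-dimension convolution output size: (i + 2p - k) // s + 1."""
    return tuple((i + 2 * p - k) // s + 1
                 for i, k, s, p in zip(in_size, kernel, stride, padding))

# conv1: 7x7x7 kernel, stride 1x2x2, assumed padding of 3 per dimension
out = conv3d_out_size((16, 224, 224), (7, 7, 7), (1, 2, 2), (3, 3, 3))
# the temporal depth of 16 is preserved while the spatial dimensions are halved
```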
Each residual building block is composed of residual component units and a spatial attention module; its structure is shown in FIG. 4. The spatial attention module performs channel-dimension maximum pooling and average pooling on the input feature matrix to obtain two feature matrices whose channel dimension is 1, splices them into a feature matrix whose channel dimension is 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix. The calculation formula is as follows:

F_out = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])) ⊗ F

where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a three-dimensional convolution layer with a 7×7×7 kernel, [ ; ] denotes channel-wise concatenation, ⊗ denotes element-wise multiplication, and MaxPool(F) and AvgPool(F) denote the channel-dimension maximum pooling and average pooling operations on the feature matrix F.
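A minimal numpy sketch of the spatial attention computation described above; the learned 7×7×7 three-dimensional convolution is replaced here by a fixed average of the two pooled maps, purely as a placeholder for the trained layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    """feat: C x D x H x W feature matrix; returns a matrix of the same shape."""
    # Channel-dimension max pooling and average pooling -> two 1-channel maps
    max_map = feat.max(axis=0, keepdims=True)
    avg_map = feat.mean(axis=0, keepdims=True)
    pooled = np.concatenate([max_map, avg_map], axis=0)  # channel dimension = 2
    # Placeholder for the learned 7x7x7 3D convolution: average the two maps
    conv_out = pooled.mean(axis=0, keepdims=True)
    coeff = sigmoid(conv_out)   # spatial attention coefficient in (0, 1)
    return feat * coeff         # re-weight the input feature matrix

feat = np.random.rand(64, 16, 56, 56).astype(np.float32)
out = spatial_attention(feat)
```

Because the Sigmoid output lies strictly between 0 and 1, the module can only re-weight (attenuate) features per spatial location; it never changes the feature matrix's shape.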
Step four: performing feature fusion on the double-flow features obtained in step three, obtaining the category probability of each machine test behavior through a softmax layer, and predicting the examinee's machine test behavior.
The output feature vectors F1 and F2 obtained in step three are spliced to obtain the fusion feature vector F; the feature vector F passes through the fully-connected layer FC to obtain the output result R, and finally a softmax layer calculates the probability distribution over the different behavior categories.
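The fusion-and-classification step can be sketched as follows; the 512-dimensional stream features, the five behavior classes, and the random weight matrix are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())    # subtract the max for numerical stability
    return e / e.sum()

F1 = rng.standard_normal(512)  # spatial-stream feature vector
F2 = rng.standard_normal(512)  # temporal-stream feature vector
F = np.concatenate([F1, F2])   # fused feature vector, length 1024

num_classes = 5                # assumed number of machine-test behavior classes
W = rng.standard_normal((num_classes, F.size)) * 0.01  # mock FC weights
b = np.zeros(num_classes)

R = W @ F + b                  # fully-connected layer output
probs = softmax(R)             # per-class probability distribution
pred = int(np.argmax(probs))   # predicted behavior category
```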
the foregoing is considered as illustrative of the principles of the present invention, and has been described herein before with reference to the accompanying drawings, in which the invention is not limited to the specific embodiments shown.
Claims (6)
1. A machine test abnormal behavior detection method based on a double-flow network, characterized by comprising the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
step three: randomly cropping 16 consecutive video frames and the corresponding optical flow images as one group of inputs, and obtaining the double-flow features through a Resnet3D network with a spatial attention module;
step four: performing feature fusion on the double-flow features obtained in step three, obtaining the category probability of each machine test behavior through a softmax layer, and predicting the examinee's machine test behavior.
2. The method for detecting abnormal behavior of a machine test based on a dual-flow network as claimed in claim 1, wherein in the Resnet3D network designed in step three, the initial convolution layer is conv1 with a 7×7×7 convolution kernel and a sliding stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer downsamples the high-dimensional features, which also greatly reduces the parameter count of the final fully-connected layer FC; each building block is composed of two residual component units and a spatial attention module, the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
3. The method for detecting abnormal behavior of a machine test based on a dual-flow network as claimed in claim 2, wherein the spatial attention module performs channel-dimension maximum pooling and average pooling on the input feature matrix to obtain two feature matrices whose channel dimension is 1, splices them into a feature matrix whose channel dimension is 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix; the calculation formula is as follows:

F_out = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])) ⊗ F

where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a three-dimensional convolution layer with a 7×7×7 kernel, [ ; ] denotes channel-wise concatenation, ⊗ denotes element-wise multiplication, and MaxPool(F) and AvgPool(F) denote the channel-dimension maximum pooling and average pooling operations on the feature matrix F.
4. The machine test abnormal behavior detection method based on the double-flow network according to claim 1, wherein the specific process of taking 16 randomly cropped consecutive video frames and the corresponding optical flow images as one group of inputs and passing them through the Resnet3D convolutional neural network to obtain the double-flow features comprises:
step 201: taking a sequence of 16 consecutive original video frames as the input of the first channel of the Resnet3D convolutional neural network, wherein the output feature vector is F1;
step 202: taking the corresponding sequence of 16 consecutive optical flow images as the input of the second channel of the Resnet3D convolutional neural network, wherein the output feature vector is F2.
5. The method for detecting abnormal behavior of a machine test based on a dual-flow network as claimed in claim 4, wherein the input original video image sequence and optical flow image sequence are four-dimensional tensors that can be expressed as C×D×H×W, where C, D, H and W denote the number of channels, the image depth, the image height and the image width respectively.
6. The method for detecting abnormal behavior of a machine test based on a dual-flow network as claimed in claim 1, wherein the specific process of step four comprises:
step 301: splicing the obtained feature vector F1 and the feature vector F2 to obtain a fusion feature vector F;
step 302: and the feature vector F passes through the full connection layer FC to obtain an output result R, and finally, a softmax layer is used for calculating probability distribution corresponding to different types of behaviors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310278231.8A CN116580330A (en) | 2023-03-21 | 2023-03-21 | Machine test abnormal behavior detection method based on double-flow network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116580330A true CN116580330A (en) | 2023-08-11 |
Family
ID=87538403
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310278231.8A Pending CN116580330A (en) | 2023-03-21 | 2023-03-21 | Machine test abnormal behavior detection method based on double-flow network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116580330A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117649630A (en) * | 2024-01-29 | 2024-03-05 | 武汉纺织大学 | Examination room cheating behavior identification method based on monitoring video stream |
CN117649630B (en) * | 2024-01-29 | 2024-04-26 | 武汉纺织大学 | Examination room cheating behavior identification method based on monitoring video stream |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||