CN116580330A - Machine test abnormal behavior detection method based on double-flow network - Google Patents

Machine test abnormal behavior detection method based on double-flow network

Info

Publication number
CN116580330A
CN116580330A (application number CN202310278231.8A)
Authority
CN
China
Prior art keywords
resnet
network
flow
abnormal behavior
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310278231.8A
Other languages
Chinese (zh)
Inventor
赵小敏 (Zhao Xiaomin)
杨文嘉 (Yang Wenjia)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202310278231.8A priority Critical patent/CN116580330A/en
Publication of CN116580330A publication Critical patent/CN116580330A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

A machine test abnormal behavior detection method based on a double-flow network comprises the following steps: first, machine examination videos are collected and annotated to generate a machine examination abnormal behavior data set; video frame images are extracted from the videos, and optical flow images are extracted from the frame images to obtain the double-flow information; 16 consecutive video frames and the corresponding optical flow images are randomly cropped as one group of inputs, and the spatial-stream and temporal-stream features of the video are extracted through a Resnet3D network, with an attention module added to the Resnet3D network to enhance action feature extraction and improve network performance; finally, the double-flow features are fused and the abnormal behaviors of the examinees are classified. The invention combines a double-flow network with an attention mechanism, which improves the accuracy of machine test abnormal behavior recognition. Applied to machine test abnormal behavior detection, it reduces the workload of the staff concerned and makes the machine test fairer and more just.

Description

Machine test abnormal behavior detection method based on double-flow network
Technical Field
The invention relates to the field of video behavior recognition, in particular to a machine test abnormal behavior detection method based on a double-flow network.
Background
Examinations have long been a means of measuring knowledge, and their fairness and importance are self-evident. In recent years, with the continuous development of technology, computer-based examinations (machine tests) have also become popular. However, since an examination usually affects the interests of the examinees, there is a risk that some examinees will try to obtain good results by cheating in the examination room. With the popularization of signal-shielding devices in examination rooms, examinees are now largely prevented from communicating with the outside, so the currently prevalent cheating method in examination rooms is information exchange between examinees.
In order to detect cheating in machine examinations, a monitoring device is installed on every examinee's computer to record the examinee's behavior in real time; after the examination, the monitoring videos are uploaded to a server for verification by staff. However, because examination monitoring videos are often very long, the staff must watch the playback attentively for extended periods, and fatigue easily leads to cheating being missed.
Disclosure of Invention
In order to overcome the defects and shortcomings of the prior art, the invention provides a machine test abnormal behavior detection method based on a double-flow network, which specifically comprises the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
step three: randomly cutting continuous 16 frames of video images and corresponding optical flow images as a group of input, and respectively obtaining double-flow characteristics through a Resnet3D network and a spatial attention module;
step four: and D, carrying out feature fusion on the double-flow features obtained in the step three, finally obtaining the category probability of each machine test behavior through a softmax layer, and predicting the machine test behavior of the examinee.
Further, the initial convolution layer of the Resnet3D network designed in the step three is conv1, with a convolution kernel of size 7×7×7 and a convolution stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer is used to downsample the high-dimensional features, which also greatly reduces the parameter count of the final fully connected layer FC. All building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are composed of two residual component units and a spatial attention module; the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
Further, the spatial attention module performs maximum pooling and average pooling on the input feature matrix along the channel dimension to obtain two feature matrices with channel dimension 1, splices the two feature matrices to obtain a feature matrix with channel dimension 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix. The calculation formula is as follows:
Ms(F) = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])), F' = Ms(F) ⊗ F
where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a convolution layer with a 7×7×7 kernel, [·;·] denotes channel-wise concatenation, MaxPool(F) and AvgPool(F) denote the maximum pooling and average pooling operations on the feature matrix F, and ⊗ denotes element-wise multiplication.
Further, the specific process in the third step of obtaining the double-flow features by taking the randomly cropped 16 consecutive video frames and the corresponding optical flow images as one group of inputs and passing them through the Resnet3D convolutional neural network respectively comprises:
step 201: taking a continuous 16-frame original video image sequence as the input of the first channel of the Resnet3D convolutional neural network, the output feature vector being F1;
step 202: taking the continuous 16-frame optical flow image sequence as the input of the second channel of the Resnet3D convolutional neural network, the output feature vector being F2.
Further, the input original video image sequence and optical flow image sequence are four-dimensional tensors, which can be expressed as C×D×H×W, wherein C, D, H and W represent the number of channels, the image depth, the image height and the image width, respectively.
Further, the specific process of the fourth step comprises:
step 301: splicing the obtained feature vector F1 and the feature vector F2 to obtain the fusion feature vector F;
step 302: passing the feature vector F through the fully connected layer FC to obtain the output result R, and finally using a softmax layer to calculate the probability distribution corresponding to the different behavior categories.
the invention has the following beneficial effects:
1. the invention uses the original video frame image and the optical flow image as input at the same time, enhances the capture and extraction of the human motion information by the network, and can effectively improve the accuracy of behavior recognition.
2. By adding the attention mechanism module, the weight of the network to the concerned region can be redistributed, so that the concerned degree of the human body action generating region in the video is far higher than that of other regions, and the purposes of inhibiting irrelevant redundant information and improving the network performance are achieved.
3. The invention adopts a method combining a double-flow network and an attention mechanism, which can improve the accuracy of machine test abnormal behavior recognition and can be applied to machine test abnormal behavior detection, thereby reducing the workload of the staff concerned and making the machine test fairer and more just.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a model of the present invention;
FIG. 3 is a schematic diagram of a Resnet3D convolutional neural network of the present invention;
FIG. 4 is a schematic diagram of a Resnet3D network residual error building block structure according to the present invention;
FIG. 5 is an output diagram of the network structure of the Resnet3D convolutional neural network of the present invention.
Detailed Description
The invention will be further described with reference to the following specific examples, but the scope of the invention is not limited thereto:
Examples: as shown in FIG. 1, a machine test abnormal behavior detection method based on a double-flow network specifically comprises the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
the method comprises the steps of collecting machine-check video data, marking abnormal behaviors in the machine-check video data, obtaining the data set by adopting modes of video monitoring, handheld equipment shooting and the like, wherein the data set is a self-collected data set, the obtaining way is reasonable and reliable, and the task requirement is met.
Step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
step three: randomly cutting continuous 16 frames of video images and corresponding optical flow images as a group of input, and respectively obtaining double-flow characteristics through a Resnet3D network and a spatial attention module;
The continuous 16-frame original video image sequence and the optical flow image sequence are respectively taken as the inputs of the first and second channels of the Resnet3D convolutional neural network, yielding the output feature vectors F1 and F2.
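The shared random crop of step three, 16 consecutive frames plus one spatial window applied identically to both streams, can be sketched in NumPy as follows; the source resolution used here is a hypothetical example:

```python
import numpy as np

def random_clip(video: np.ndarray, flow: np.ndarray, t: int = 16, size: int = 224):
    """Pick 16 consecutive frames and a random spatial crop, applied identically
    to the RGB stream (3, N, H, W) and the flow stream (2, N, H, W)."""
    _, n, h, w = video.shape
    t0 = np.random.randint(0, n - t + 1)     # temporal start index
    y0 = np.random.randint(0, h - size + 1)  # crop top
    x0 = np.random.randint(0, w - size + 1)  # crop left
    sl = np.s_[:, t0:t0 + t, y0:y0 + size, x0:x0 + size]
    return video[sl], flow[sl]

rgb = np.zeros((3, 40, 256, 320), np.float32)   # toy 40-frame RGB video
flo = np.zeros((2, 40, 256, 320), np.float32)   # matching x/y flow fields
clip_rgb, clip_flow = random_clip(rgb, flo)     # each clip covers the same frames
```

Using one crop window for both streams keeps the RGB frames and their flow fields spatially and temporally aligned.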
The structure of the Resnet3D convolutional neural network used is shown in FIG. 3. The input of the network is 3×16×224×224, where 3 is the number of RGB channels of the image, 16 is the number of consecutive frames acquired, and 224 is the image width and height in pixels. The initial convolution layer is conv1, with a convolution kernel of size 7×7×7 and a convolution stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer is used to downsample the high-dimensional features, which also greatly reduces the parameter count of the final fully connected layer FC. All building blocks are composed of two residual component units and a spatial attention module; the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
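A minimal PyTorch sketch of such a backbone, assuming PyTorch is available. The 7×7×7 conv1 kernel with 1×2×2 stride, the channel progression 64/128/256/512, two residual units per block and the global average pooling follow the description above; the per-stage downsampling strides and batch-norm layers are our assumptions, and the spatial attention module is omitted here (it is shown separately):

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Two 3x3x3 convolutions with a skip connection; a 1x1x1 projection is
    used when the shape changes. Strides here are assumptions, since the
    patent only fixes kernel sizes and output channel counts."""
    def __init__(self, cin: int, cout: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv3d(cin, cout, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm3d(cout)
        self.conv2 = nn.Conv3d(cout, cout, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm3d(cout)
        self.proj = (nn.Conv3d(cin, cout, 1, stride=stride, bias=False)
                     if stride != 1 or cin != cout else nn.Identity())

    def forward(self, x):
        y = torch.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return torch.relu(y + self.proj(x))

class ResNet3D(nn.Module):
    """conv1 (7x7x7, stride 1x2x2) -> Resnet_block1..4 (channels 64/128/256/512,
    two residual units each) -> global average pooling -> FC."""
    def __init__(self, in_ch: int = 3, num_classes: int = 5):
        super().__init__()
        self.conv1 = nn.Conv3d(in_ch, 64, 7, stride=(1, 2, 2), padding=3, bias=False)
        stages, c = [], 64
        for i, cout in enumerate([64, 128, 256, 512]):
            stages += [ResidualUnit(c, cout, stride=1 if i == 0 else 2),
                       ResidualUnit(cout, cout)]
            c = cout
        self.blocks = nn.Sequential(*stages)
        self.pool = nn.AdaptiveAvgPool3d(1)   # global average pooling
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):
        f = self.pool(self.blocks(self.conv1(x))).flatten(1)
        return self.fc(f), f                  # logits and the 512-d stream feature

# A reduced 16x64x64 input keeps the demo cheap; the patent uses 3x16x224x224.
with torch.no_grad():
    logits, feat = ResNet3D().eval()(torch.zeros(1, 3, 16, 64, 64))
```

One such backbone is instantiated per stream; the 512-dimensional pooled features correspond to F1 and F2 in the text.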
Each residual building block is composed of residual component units and a spatial attention module, and its structure is shown in FIG. 4. The spatial attention module performs maximum pooling and average pooling on the input feature matrix along the channel dimension to obtain two feature matrices with channel dimension 1, splices the two feature matrices to obtain a feature matrix with channel dimension 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix. The calculation formula is as follows:
Ms(F) = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])), F' = Ms(F) ⊗ F
where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a convolution layer with a 7×7×7 kernel, [·;·] denotes channel-wise concatenation, MaxPool(F) and AvgPool(F) denote the maximum pooling and average pooling operations on the feature matrix F, and ⊗ denotes element-wise multiplication.
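A minimal NumPy sketch of this spatial attention computation. To stay short and dependency-free, the learned 7×7×7 convolution is replaced by a fixed 1×1×1 convolution, i.e. a per-voxel weighted sum of the two pooled maps; the weights w and bias b are illustrative placeholders, not learned values:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat: np.ndarray, w=(0.5, 0.5), b=0.0) -> np.ndarray:
    """Spatial attention over a (C, D, H, W) feature map: channel-wise max and
    average pooling, a (simplified) convolution, Sigmoid, then re-weighting."""
    mx = feat.max(axis=0)    # channel-wise max pooling  -> (D, H, W)
    av = feat.mean(axis=0)   # channel-wise average pooling -> (D, H, W)
    att = sigmoid(w[0] * mx + w[1] * av + b)  # attention coefficients in (0, 1)
    return feat * att[None]  # broadcast multiply over the channel axis

f = np.random.default_rng(0).standard_normal((64, 16, 14, 14)).astype(np.float32)
out = spatial_attention(f)
```

Because the Sigmoid output lies in (0, 1), the module can only attenuate features, raising the relative weight of regions where motion occurs.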
Step four: and D, carrying out feature fusion on the double-flow features obtained in the step three, finally obtaining the category probability of each machine test behavior through a softmax layer, and predicting the machine test behavior of the examinee.
The output feature vectors F1 and F2 obtained in step three are spliced to obtain the fusion feature vector F;
the feature vector F passes through the fully connected layer FC to obtain the output result R, and finally a softmax layer is used to calculate the probability distribution corresponding to the different behavior categories.
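The fusion and classification above can be sketched in NumPy as follows. The FC weights w and bias b, and the choice of 5 behavior classes, are illustrative placeholders; in practice they are learned during training:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

def fuse_and_classify(f1: np.ndarray, f2: np.ndarray, w: np.ndarray, b: np.ndarray):
    """Concatenate the two stream features into F, apply the FC layer to get R,
    and convert R into a probability distribution over behavior classes."""
    f = np.concatenate([f1, f2])  # fusion feature vector F
    r = w @ f + b                 # FC output R
    return softmax(r)

rng = np.random.default_rng(1)
f1, f2 = rng.standard_normal(512), rng.standard_normal(512)  # F1, F2 from the two streams
w = rng.standard_normal((5, 1024)) * 0.01                    # hypothetical FC weights
b = np.zeros(5)
probs = fuse_and_classify(f1, f2, w, b)
```

The predicted machine test behavior is then the class with the highest probability, e.g. probs.argmax().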
the foregoing is considered as illustrative of the principles of the present invention, and has been described herein before with reference to the accompanying drawings, in which the invention is not limited to the specific embodiments shown.

Claims (6)

1. The machine test abnormal behavior detection method based on the double-flow network is characterized by comprising the following steps:
step one: collecting machine examination videos, marking the videos, and generating a machine examination abnormal behavior data set;
step two: splitting a video to extract video frame images, and extracting corresponding optical flow images through the video frame images;
step three: randomly cutting continuous 16 frames of video images and corresponding optical flow images as a group of input, and respectively obtaining double-flow characteristics through a Resnet3D network and a spatial attention module;
step four: and D, carrying out feature fusion on the double-flow features obtained in the step three, finally obtaining the category probability of each machine test behavior through a softmax layer, and predicting the machine test behavior of the examinee.
2. The method for detecting abnormal behavior of a machine test based on a dual-flow network as claimed in claim 1, wherein the initial convolution layer of the Resnet3D network designed in the step three is conv1, with a convolution kernel of size 7×7×7 and a convolution stride of 1×2×2, followed by the four residual building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4; after Resnet_block4, a global average pooling layer is used to downsample the high-dimensional features, which also greatly reduces the parameter count of the final fully connected layer FC; all building blocks Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are composed of two residual component units and a spatial attention module, the residual component units all use 3×3×3 convolution kernels, and the output channel numbers of Resnet_block1, Resnet_block2, Resnet_block3 and Resnet_block4 are 64, 128, 256 and 512 respectively.
3. The method for detecting abnormal behavior of machine test based on dual-flow network as claimed in claim 2, wherein said spatial attention module performs maximum pooling and average pooling on the input feature matrix along the channel dimension to obtain two feature matrices with channel dimension 1, splices the two feature matrices to obtain a feature matrix with channel dimension 2, obtains the spatial attention coefficient through a three-dimensional convolution layer and a Sigmoid activation function, and finally multiplies the spatial attention coefficient with the input feature matrix to obtain the final output feature matrix; the calculation formula is as follows:
Ms(F) = σ(f^(7×7×7)([MaxPool(F); AvgPool(F)])), F' = Ms(F) ⊗ F
where σ denotes the Sigmoid activation function, f^(7×7×7) denotes a convolution layer with a 7×7×7 kernel, [·;·] denotes channel-wise concatenation, MaxPool(F) and AvgPool(F) denote the maximum pooling and average pooling operations on the feature matrix F, and ⊗ denotes element-wise multiplication.
4. The machine test abnormal behavior detection method based on the double-flow network according to claim 1, wherein the specific process of obtaining the double-flow features by taking the randomly cropped 16 consecutive video frames and the corresponding optical flow images as one group of inputs and passing them through the Resnet3D convolutional neural network respectively comprises the following steps:
step 201: taking a continuous 16-frame original video image sequence as the input of the first channel of the Resnet3D convolutional neural network, the output feature vector being F1;
step 202: taking the continuous 16-frame optical flow image sequence as the input of the second channel of the Resnet3D convolutional neural network, the output feature vector being F2.
5. The method for detecting abnormal behavior of machine test based on dual-stream network as set forth in claim 4, wherein said input original video image sequence and optical flow image sequence are four-dimensional tensors, which can be expressed as C×D×H×W, wherein C, D, H and W represent the number of channels, the image depth, the image height and the image width, respectively.
6. The method for detecting abnormal behavior of a machine test based on a dual-flow network as set forth in claim 1, wherein the specific process of the fourth step comprises:
step 301: splicing the obtained feature vector F1 and the feature vector F2 to obtain the fusion feature vector F;
step 302: passing the feature vector F through the fully connected layer FC to obtain the output result R, and finally using a softmax layer to calculate the probability distribution corresponding to the different behavior categories.
CN202310278231.8A 2023-03-21 2023-03-21 Machine test abnormal behavior detection method based on double-flow network Pending CN116580330A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310278231.8A CN116580330A (en) 2023-03-21 2023-03-21 Machine test abnormal behavior detection method based on double-flow network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310278231.8A CN116580330A (en) 2023-03-21 2023-03-21 Machine test abnormal behavior detection method based on double-flow network

Publications (1)

Publication Number Publication Date
CN116580330A 2023-08-11

Family

ID=87538403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310278231.8A Pending CN116580330A (en) 2023-03-21 2023-03-21 Machine test abnormal behavior detection method based on double-flow network

Country Status (1)

Country Link
CN (1) CN116580330A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649630A (en) * 2024-01-29 2024-03-05 武汉纺织大学 Examination room cheating behavior identification method based on monitoring video stream
CN117649630B (en) * 2024-01-29 2024-04-26 武汉纺织大学 Examination room cheating behavior identification method based on monitoring video stream


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination