CN114639166A - Examination room abnormal behavior recognition method based on motion recognition - Google Patents

Examination room abnormal behavior recognition method based on motion recognition Download PDF

Info

Publication number
CN114639166A
Authority
CN
China
Prior art keywords
examination room
abnormal behavior
behavior recognition
channel
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210257498.4A
Other languages
Chinese (zh)
Inventor
闫月
刘建明
王鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202210257498.4A priority Critical patent/CN114639166A/en
Publication of CN114639166A publication Critical patent/CN114639166A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention relates to the technical field of motion recognition, and in particular to an examination room abnormal behavior recognition method based on motion recognition. A time interaction module and a channel space attention module are added to each bottleneck residual block of a ResNet-50 backbone network to generate an abnormal behavior recognition network model; video from the examination room is processed and input into this model for training until convergence, and the resulting feature outputs are fused to recognize abnormal examination room behavior. The time interaction module and the channel space attention module identify the actions of each examinee in the examination room and, in particular, effectively capture examinees' fine-grained actions, improving on existing deep-learning-based examination room abnormal behavior analysis methods, which cannot accurately recognize small-scale abnormal actions. Furthermore, the time interaction module captures temporal context information at low computational cost.

Description

Examination room abnormal behavior recognition method based on motion recognition
Technical Field
The invention relates to the technical field of motion recognition, in particular to an examination room abnormal behavior recognition method based on motion recognition.
Background
With the development of computer technology and the spread of motion recognition applications, it has become necessary for colleges and universities to apply intelligent monitoring to examination rooms in order to maintain examination fairness. Traditional on-site invigilation consumes large amounts of human and financial resources; especially when many examinations are held in a concentrated period, invigilators' physical strength is heavily taxed, their attention easily drops, and abnormal examination room behaviors are missed. Although examination rooms are equipped with monitoring cameras, the electronic equipment in common use can generally only record and store video, and a large amount of time is still needed to screen and review the monitored content manually. To reduce the manpower cost of invigilation, the field of computer vision has been combined with the monitoring task; however, the real-time performance and accuracy of existing intelligent invigilation systems do not yet meet the standard of practical application, and many shortcomings remain, such as unsatisfactory recognition under partial occlusion, complex backgrounds, and viewpoint changes. Most existing methods identify and judge difference information between adjacent video frames; they have difficulty capturing slight abnormal actions in the examination room, such as fine-grained actions like leaning over slightly or glancing at another examinee's paper, and thus recognize such actions poorly. In terms of efficiency, many methods also struggle to meet real-time monitoring requirements.
Disclosure of Invention
The invention aims to provide an examination room abnormal behavior recognition method based on motion recognition, which improves on the inability of existing deep-learning-based examination room abnormal behavior analysis methods to accurately recognize small-scale abnormal actions.
In order to achieve the purpose, the invention provides an examination room abnormal behavior identification method based on motion identification, which comprises the following steps:
collecting real-time original video content of an examination room;
carrying out image segmentation on the video to obtain a motion image of each examinee in the examination room;
selecting and processing the motion images of a single examinee to obtain an input image sequence;
inputting the image sequence into an abnormal behavior recognition network model for training until convergence, and outputting a classification result;
and fusing the classification results to realize the identification of abnormal behaviors of the examination room.
In the process of selecting and processing the motion images of a single examinee to obtain an input image sequence, the video of the single examinee is divided into 5 segments to obtain 5 continuous frame sequences, which are preprocessed to form the input image sequence.
The data preprocessing specifically adjusts the short edge of each RGB image to 256, enhances the data using position jittering, horizontal flipping, corner cropping and scale jittering, and resizes the cropped region to 224 × 224.
The abnormal behavior recognition network model is built by improving the ResNet-50 network and comprises 5 stages, wherein each stage contains several bottleneck residual blocks, and each bottleneck residual block contains a time interaction module and a channel space attention module.
Wherein the time interaction module uses a channel-based convolution to independently learn the time evolution of each channel, reducing computational complexity.
The channel space attention module consists of a channel attention module and a spatial attention module, comprising a channel attention mechanism and a spatial attention mechanism.
The weights of the model are randomly initialized during training; the model then learns by continuous back propagation according to the quality of the training results to obtain the final weights, and the classification results are fused by an average weighting method to effectively identify abnormal examination room behavior.
The invention provides an examination room abnormal behavior recognition method based on motion recognition. A time interaction module and a channel space attention module are added to each bottleneck residual block of a ResNet-50 backbone network to generate an abnormal behavior recognition network model; video from the examination room is processed and input into this model for training until convergence, and the resulting feature outputs are fused to recognize abnormal examination room behavior. The time interaction module and the channel space attention module identify the actions of each examinee in the examination room and, in particular, effectively capture examinees' fine-grained actions, improving on existing deep-learning-based examination room abnormal behavior analysis methods, which cannot accurately recognize small-scale abnormal actions. Furthermore, the time interaction module captures temporal context information at low computational cost.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of an examination room abnormal behavior recognition method based on motion recognition according to the present invention.
FIG. 2 is a process diagram of an example of the construction of the time interaction module of the present invention.
Fig. 3 is a schematic diagram of the channel space attention mechanism module of the present invention.
FIG. 4 is a schematic diagram of the configuration of the channel attention module of the present invention.
FIG. 5 is a schematic structural diagram of a spatial attention module of the present invention.
Fig. 6 is a network architecture diagram of the improved ResNet-50 of the present invention.
Fig. 7 is a schematic structural diagram of the abnormal behavior recognition network model of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
Referring to fig. 1 to 7, the present invention provides a method for identifying abnormal behaviors in an examination room based on motion identification, which includes the following steps:
s1: collecting real-time original video content of an examination room;
s2: carrying out image segmentation on the video to obtain a motion image of each examinee in the examination room;
s3: selecting and processing the motion images of a single examinee to obtain an input image sequence;
s4: inputting the image sequence into an abnormal behavior recognition network model for training until convergence, and outputting a classification result;
s5: and fusing the classification results to realize the identification of abnormal behaviors of the examination room.
In the process of selecting and processing the motion images of a single examinee to obtain an input image sequence, the video of the single examinee is divided into 5 segments to obtain 5 continuous frame sequences, which form the input image sequence after data preprocessing.
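The segment division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the segment count of 5 follows the text, while `frames_per_seg` and all names are assumptions.

```python
import torch

def sample_segments(video: torch.Tensor, num_segments: int = 5,
                    frames_per_seg: int = 8) -> torch.Tensor:
    """Split a clip of shape (T, C, H, W) into `num_segments` equal parts
    and take a continuous window of `frames_per_seg` frames from the start
    of each part, giving (num_segments, frames_per_seg, C, H, W).
    Assumes frames_per_seg <= T // num_segments."""
    t = video.shape[0]
    seg_len = t // num_segments
    windows = [video[s * seg_len : s * seg_len + frames_per_seg]
               for s in range(num_segments)]
    return torch.stack(windows)

clip = torch.randn(50, 3, 8, 8)   # 50 frames of one examinee (toy spatial size)
seq = sample_segments(clip)        # -> (5, 8, 3, 8, 8)
```

Each of the 5 stacked windows is a continuous frame sequence, matching the "5 segments, 5 continuous frame sequences" description.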
The data preprocessing specifically adjusts the short edge of each RGB image to 256, enhances the data using position jittering, horizontal flipping, corner cropping and scale jittering, and resizes the cropped region to 224 × 224.
The abnormal behavior recognition network model is built by improving the ResNet-50 network and comprises 5 stages, wherein each stage contains several bottleneck residual blocks, and each bottleneck residual block contains a time interaction module and a channel space attention module.
The time interaction module uses a channel-based convolution to independently learn the time evolution of each channel, reducing computational complexity.
The channel spatial attention module consists of a channel attention module and a spatial attention module and comprises an attention mechanism of a channel and an attention mechanism of a space.
The weights of the model are randomly initialized during training; the model then learns by continuous back propagation according to the quality of the training results to obtain the final weights, and the classification results are finally fused by an average weighting method to effectively identify abnormal examination room behavior.
The following is further illustrated from the various modules of the abnormal behavior recognition network model:
1. time interaction module
Fig. 2 shows the construction of the time interaction module (TIM). The module captures temporal context information at low computational cost: a channel-wise convolution learns the temporal evolution of each channel independently, keeping the computational complexity of the model low.
As shown in fig. 2, given an input X = {X1, X2, ..., XT}, its shape is first converted from T × C × H × W to C × T × H × W (the result is denoted X′ to avoid ambiguity). The channel-wise convolution is then applied to X′ as shown below:
Y(c, t, x, y) = Σ_i V(c, i) · X′(c, t + i, x, y)    (1)
where V is the channel-dependent convolution kernel and Y(c, t, x, y) is the output of the temporal convolution. Compared with three-dimensional convolution, channel-wise convolution greatly reduces the amount of computation. In the setup used here, the kernel size of the channel-wise convolution is 3 × 1 × 1, which means that features interact only with features at adjacent time steps, although the temporal receptive field grows gradually as feature maps pass through deeper layers of the network. After the convolution, the shape of the output Y is converted back to T × C × H × W. An ordinary 3D temporal convolution has C_out × C_in × t parameters, whereas the TIM has only C_out × 1 × t, so its parameter count is greatly reduced compared with other temporal convolution operators. In fact, the TSM module can be viewed as a channel-wise temporal convolution with a fixed kernel: [0, 1, 0] for no shift, [1, 0, 0] for a backward shift, and [0, 0, 1] for a forward shift. TIM generalizes the TSM operation into a flexible module with learnable convolution kernels, which captures temporal context information for action recognition more effectively than fixed shifts.
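The channel-wise temporal convolution of Eq. (1) can be sketched in PyTorch as a grouped 1-D convolution over the time axis. Module and variable names here are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn

class TemporalInteractionModule(nn.Module):
    """Channel-wise temporal convolution of Eq. (1): each channel has its
    own learnable 1-D kernel over time (kernel size 3, zero-padded), so the
    parameter count is C * t rather than C * C * t."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2,
                              groups=channels, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, T, C, H, W); fold space into the batch and convolve over T
        n, t, c, h, w = x.shape
        y = x.permute(0, 3, 4, 2, 1).reshape(n * h * w, c, t)
        y = self.conv(y)
        return y.reshape(n, h, w, c, t).permute(0, 4, 3, 1, 2)

tim = TemporalInteractionModule(8)
x = torch.randn(2, 5, 8, 4, 4)
y = tim(x)                         # same shape as the input
```

Fixing a channel's kernel to [0, 1, 0] reproduces the "no shift" case of TSM mentioned above; training instead lets each channel learn its own temporal mixing.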
2. Channel space attention mechanism module
The channel space attention module (CBAM) comprises a channel attention mechanism and a spatial attention mechanism. In examination room abnormal behavior recognition, a video does not contain only a single student, and changes in background illumination and differences in scale interfere with the model during feature extraction. The invention therefore introduces an attention mechanism into the convolution blocks, which effectively extracts the important features in the video content, ignores secondary features, and ensures the accuracy of the final recognition result.
Fig. 3 is a schematic diagram of the whole CBAM. The output of the convolutional layer first passes through a channel attention module to obtain a weighted result, which then passes through a spatial attention module to produce the final weighted result. Given an intermediate feature map F ∈ R^(C×H×W) as input, CBAM sequentially infers a one-dimensional channel attention map Mc ∈ R^(C×1×1) and a two-dimensional spatial attention map Ms ∈ R^(1×H×W), as shown in FIG. 3. The overall attention process can be summarized as:
F′ = Mc(F) ⊗ F    (2)
F″ = Ms(F′) ⊗ F′    (3)
where ⊗ denotes element-wise multiplication. During the multiplication, the attention values are broadcast accordingly: the channel attention values are replicated along the spatial dimensions, and vice versa. F″ is the final output.
2.1 channel attention Module
The channel attention module is shown in fig. 4. The spatial information of the feature map is first aggregated using average pooling and max pooling operations, generating two different spatial context descriptors, Fc_avg and Fc_max, which denote the average-pooled and max-pooled features respectively. Both descriptors are then forwarded to a shared network to generate the channel attention map Mc ∈ R^(C×1×1). The shared network is a multi-layer perceptron (MLP) with one hidden layer. To reduce parameter overhead, the hidden activation size is set to R^(C/r×1×1), where r is the reduction rate. After the shared network is applied to each descriptor, the output feature vectors are merged by element-wise summation. In short, the channel attention is computed as:
Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(Fc_avg)) + W1(W0(Fc_max)))
where σ denotes the sigmoid function, W0 ∈ R^(C/r×C) and W1 ∈ R^(C×C/r) are the MLP weights shared by both inputs, and W0 is followed by a ReLU activation function.
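The channel attention computation above can be sketched as a small PyTorch module. This is a minimal illustration; the class name and the reduction-rate default are assumptions, not from the patent.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM channel attention: a shared MLP applied to average- and
    max-pooled descriptors, summed and squashed with a sigmoid."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // r),   # W0 (followed by ReLU)
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),   # W1 (shared for both inputs)
        )

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        # f: (N, C, H, W)
        avg = f.mean(dim=(2, 3))                  # average-pooled descriptor
        mx = f.amax(dim=(2, 3))                   # max-pooled descriptor
        attn = torch.sigmoid(self.mlp(avg) + self.mlp(mx))
        return attn[:, :, None, None]             # Mc, shape (N, C, 1, 1)

ca = ChannelAttention(32, r=4)
m = ca(torch.randn(2, 32, 7, 7))
```

Multiplying `m` back onto the input feature map implements F′ = Mc(F) ⊗ F, with the (N, C, 1, 1) map broadcast over the spatial dimensions.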
2.2 spatial attention Module
The spatial attention module is shown in fig. 5. A spatial attention map is generated by exploiting the spatial relationships among features. Unlike the channel attention module, the spatial attention module focuses on "where" the informative part is, which is complementary to channel attention. To compute spatial attention, average pooling and max pooling are first applied along the channel axis, and the results are concatenated to generate an effective feature descriptor; pooling along the channel axis effectively highlights informative regions. A convolutional layer is then applied to the concatenated feature descriptor to generate the spatial attention map, which encodes the locations to emphasize or suppress. In detail, two two-dimensional maps, Fs_avg ∈ R^(1×H×W) and Fs_max ∈ R^(1×H×W), are generated by aggregating the channel information of the feature map with the two pooling operations; they denote the channel-wise average-pooled and max-pooled features respectively. They are then concatenated and convolved by a standard convolutional layer to generate the 2D spatial attention map. In short, the spatial attention is computed as:
Ms(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])) = σ(f^(7×7)([Fs_avg; Fs_max]))
where σ denotes the sigmoid function and f^(7×7) denotes a convolution with a 7 × 7 filter.
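The spatial attention computation can likewise be sketched as a PyTorch module; the 7 × 7 kernel follows the text, while the class and variable names are illustrative.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM spatial attention: channel-wise average and max maps are
    concatenated and convolved with a 7x7 filter, then a sigmoid."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        avg = f.mean(dim=1, keepdim=True)   # Fs_avg: (N, 1, H, W)
        mx = f.amax(dim=1, keepdim=True)    # Fs_max: (N, 1, H, W)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

sa = SpatialAttention()
m = sa(torch.randn(2, 16, 14, 14))          # Ms, shape (N, 1, H, W)
```

Multiplying `m` onto the channel-weighted features implements F″ = Ms(F′) ⊗ F′, with the single-channel map broadcast across all C channels.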
The main network architecture adopted by the invention is ResNet-50, whose structure is shown in figure 6. It comprises 5 stages; each stage contains several bottleneck residual blocks, and each bottleneck residual block contains a time interaction module and a channel space attention module. The input image passes through the convolution operation of stage 0 and enters stage 1, then passes through stage 1 into stage 2, and so on until stage 4, after which the classification result is output.
The time interaction module is added before the first convolution block of each bottleneck layer, and the channel space attention module is added after the last convolution block, without changing the structure of the intermediate convolutional layers of the bottleneck; the resulting attention information is then added to the output of the previous bottleneck layer to serve as the input of the next bottleneck residual block. Finally, the obtained feature results are fused: the features learned from the multi-frame video through the time interaction module and the channel space attention module are classified by fully connected layers, and the final classification result is obtained by fusing the classification results of each group. The resulting examination room abnormal behavior recognition model is shown in FIG. 7.
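A simplified sketch of the modified bottleneck ordering described above — temporal module before the convolutions, attention reweighting after them, and the residual addition — is shown below, together with the average-weighted fusion of group-level scores. `nn.Identity()` stands in for the patent's TIM and CBAM modules, and all names are hypothetical.

```python
import torch
import torch.nn as nn

class AttnBottleneck(nn.Module):
    """Sketch of the modified bottleneck block: a temporal module before
    the 1x1-3x3-1x1 convolutions, an attention reweighting after them,
    and a residual shortcut adding the previous block's output."""
    def __init__(self, channels: int):
        super().__init__()
        mid = channels // 4
        self.temporal = nn.Identity()       # placeholder for the TIM
        self.convs = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.attention = nn.Identity()      # placeholder for CBAM
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.attention(self.convs(self.temporal(x)))
        return self.relu(out + x)           # add to the previous block's output

block = AttnBottleneck(64)
y = block(torch.randn(2, 64, 8, 8))         # shape preserved by the block

# average-weighted fusion of the per-group classification results
scores = torch.randn(5, 10)                  # e.g. 5 groups, 10 classes
fused = scores.mean(dim=0)                   # final fused prediction
```

Replacing the two `nn.Identity()` placeholders with real TIM and CBAM implementations yields the block structure the patent describes.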
In summary, the invention combines the advantages of the time interaction module (TIM) and the channel space attention module (CBAM): it not only acquires temporal context information at low computational cost but also attends closely to the important features of actions, bringing the following beneficial effects:
1) the TIM time interaction module and the CBAM channel space attention module are used to identify the actions of each examinee in the examination room and, in particular, can effectively capture examinees' fine-grained actions;
2) the time interaction module offers greater learning flexibility than the previously proposed temporal shift module (TSM) and captures temporal context information at lower computational cost;
3) the examination room abnormal behavior model can effectively replace manual invigilation, greatly saving human resources.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (7)

1. An examination room abnormal behavior recognition method based on motion recognition is characterized by comprising the following steps:
collecting real-time original video content of an examination room;
performing image segmentation on the video to obtain a motion image of each examinee in the examination room;
selecting and processing the motion images of a single examinee to obtain an input image sequence;
inputting the image sequence into an abnormal behavior recognition network model for training until convergence, and outputting a classification result;
and fusing the classification results to realize the identification of abnormal behaviors of the examination room.
2. The examination room abnormal behavior recognition method based on motion recognition according to claim 1,
in the process of selecting and processing the motion images of a single examinee to obtain an input image sequence, the video of the single examinee is divided into 5 segments to obtain 5 continuous frame sequences, which form the input image sequence after data preprocessing.
3. The examination room abnormal behavior recognition method based on motion recognition according to claim 2,
the data preprocessing specifically adjusts the short edge of each RGB image to 256, enhances the data using position jittering, horizontal flipping, corner cropping and scale jittering, and resizes the cropped region to 224 × 224.
4. The examination room abnormal behavior recognition method based on motion recognition according to claim 1,
the abnormal behavior recognition network model is built by improving the ResNet-50 network and comprises 5 stages, wherein each stage contains several bottleneck residual blocks, and each bottleneck residual block contains a time interaction module and a channel space attention module.
5. The examination room abnormal behavior recognition method based on motion recognition according to claim 4,
the time interaction module uses a channel-based convolution to independently learn the time evolution of each channel, reducing computational complexity.
6. The examination room abnormal behavior recognition method based on motion recognition according to claim 4,
the channel spatial attention module consists of a channel attention module and a spatial attention module and comprises an attention mechanism of a channel and an attention mechanism of a space.
7. The examination room abnormal behavior recognition method based on motion recognition according to claim 1,
the weights of the model are randomly initialized during training; the model then learns by continuous back propagation according to the quality of the training results to obtain the final weights, and the classification results are finally fused by an average weighting method to effectively identify abnormal examination room behavior.
CN202210257498.4A 2022-03-16 2022-03-16 Examination room abnormal behavior recognition method based on motion recognition Pending CN114639166A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210257498.4A CN114639166A (en) 2022-03-16 2022-03-16 Examination room abnormal behavior recognition method based on motion recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210257498.4A CN114639166A (en) 2022-03-16 2022-03-16 Examination room abnormal behavior recognition method based on motion recognition

Publications (1)

Publication Number Publication Date
CN114639166A true CN114639166A (en) 2022-06-17

Family

ID=81949045

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210257498.4A Pending CN114639166A (en) 2022-03-16 2022-03-16 Examination room abnormal behavior recognition method based on motion recognition

Country Status (1)

Country Link
CN (1) CN114639166A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116935229A (en) * 2023-09-12 2023-10-24 山东博昂信息科技有限公司 Method and system for identifying hook-in state of ladle hook


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination