CN109241830B - Classroom lecture-listening abnormality detection method based on an illumination generative adversarial network - Google Patents

Classroom lecture-listening abnormality detection method based on an illumination generative adversarial network

Info

Publication number
CN109241830B
Authority
CN
China
Prior art keywords
illumination
head position
head
image
classroom
Prior art date
Legal status
Active
Application number
CN201810831224.5A
Other languages
Chinese (zh)
Other versions
CN109241830A (en)
Inventor
谢昭
张安杰
吴克伟
肖泽宇
童赟
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN201810831224.5A
Publication of CN109241830A
Application granted
Publication of CN109241830B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/44 Event detection
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a classroom lecture-listening abnormality detection method based on an illumination generative adversarial network, which comprises the following steps: collecting real classroom head pose data, rendering illuminated classroom head pose data, constructing an illumination generative adversarial network, generating adversarial samples, constructing a head pose detection model, detecting classroom head poses, and detecting classroom listening abnormalities. By using deep neural networks, the invention improves the accuracy of head-region localization and reduces the interference of non-head regions on the judgment of the not-listening state.

Description

Classroom lecture-listening abnormality detection method based on an illumination generative adversarial network
Technical Field
The invention belongs to the technical field of anomaly detection, and particularly relates to a classroom lecture-listening abnormality detection method based on an illumination generative adversarial network.
Background
Computer-based anomaly detection uses computer vision theory and video analysis methods to analyze, without human intervention, the video sequences recorded by monitoring equipment such as cameras. It locates, identifies and tracks targets in a structured scene, analyzes and judges their behavior on that basis, obtains an understanding of the image content and an interpretation of the objective scene, and can further guide and plan actions.
Existing anomaly detection methods usually adopt specific statistical analysis methods or deep learning methods. Chinese patent application No. 201510141935.6, "Crowd movement trajectory anomaly detection method for complex structured scenes", extracts and segments crowd movement trajectories from the historical surveillance video of a complex structured scene, learns a multi-center clustering algorithm based on maximum-minimum distances, and applies LOF-based anomaly detection, thereby solving the crowd trajectory detection problem efficiently and practically. Chinese patent application No. 201410795393.X, "Crowd anomaly detection and localization system and method based on a time-recurrent neural network", analyzes and trains on collected sample data with a time-recurrent neural network to perform anomaly detection. Chinese patent application No. 201710305833.2, "A video anomaly detection method", uses a gray-level projection algorithm for global motion estimation, effectively realizing image jitter detection and jitter-degree estimation. However, such methods easily fall into local optima, and when the training data set is large and the network is complex, the training time and cost become excessive and the efficiency is low.
The main application fields of current anomaly detection include intelligent transportation and intelligent surveillance. Chinese patent application No. 201410799626.3, "A traffic anomaly detection method and system", divides a normal traffic video image sequence into video block sequences, detects the number of shots in the video block sequences, builds a Gaussian model of that number, and uses the Gaussian model to perform anomaly detection on test traffic video images. Chinese patent application No. 201510670786.2, "Traffic scene anomaly detection method based on motion reconstruction technology", uses the spatial position information of motion patterns to explore the spatial structure information between different motion patterns, solving the inapplicability of existing anomaly detection methods to specific scenes. Chinese patent application No. 201710131835.4, "Urban road traffic anomaly detection method based on Isolation Forest", takes a road as the detection object, divides data sets of different types according to the average running speed of the road in different periods, trains an Isolation Forest on each data set, and decides whether the road is abnormal from the distance between the road speed and the root node in the Isolation Forest. Chinese patent application No. 201510046984.1, "An anomaly detection method based on vehicle trajectory similarity", computes similarity measures between a typical trajectory and vehicle trajectories of the same type to build a deviation statistical model, obtains a confidence interval of the similarity measure, computes the similarity between the trajectory under test and the typical trajectory, and judges from the confidence interval whether the trajectory is abnormal. However, these methods are complex, poorly adaptable across scenes, demanding in real data, and costly in data collection.
Existing anomaly detection methods do not address the analysis of the listening state in classroom teaching surveillance, and the task of the present invention differs from existing work in its classroom video and head pose data. The implementation also differs from conventional anomaly detection methods: on the basis of a deep learning method, samples are generated with a 3D model and illumination rendering, and illumination-refined head pose data are generated with a generative adversarial network, which resolves the inconsistency between illumination-rendered head position images and real head position image data. Using the generated adversarial samples effectively improves the accuracy of head pose detection, and abnormal classroom listening states are judged through statistical analysis of the detected head poses.
Disclosure of Invention
In order to remedy the deficiencies of the prior art, the invention provides a classroom lecture-listening abnormality detection method based on an illumination generative adversarial network.
The technical solution adopted by the invention is as follows:
A classroom lecture-listening abnormality detection method based on an illumination generative adversarial network, comprising the following steps:
step S1: acquiring real classroom head pose data:
collecting video frames from a real classroom, constructing a head position detection model, labelling candidate head position images, and obtaining a training set and trained parameters;
step S2: rendering illuminated classroom head pose data:
according to a designed 3D model of classroom students, setting the head pose, illumination condition and camera angle parameters in the model, rendering multiple times, and obtaining a set of classroom images under rendered illumination;
step S3: constructing the illumination generative adversarial network:
according to the 11-layer illumination generative adversarial network, deriving its target loss and training the network;
step S4: generating adversarial samples:
using the real classroom head pose data, obtaining illumination-rendered head position images under different illumination conditions, shooting angles and persons; using the trained illumination generative adversarial network model parameters, generating refined rendered-illumination head position images; computing the decision score of each illumination-refined head position image, setting a realistic-image threshold, and selecting the images above the threshold as realistic rendered-illumination head position images;
step S5: constructing a head pose detection model:
taking the realistic rendered-illumination head position images as training data for head pose detection, labelling the training data by class, setting up the head pose detection model, and obtaining its parameters by training;
step S6: detecting head poses in the classroom:
using the head pose detection model trained on the generated adversarial data to perform classroom head pose detection;
step S7: detecting abnormal class listening in the classroom:
inputting the video collected in the classroom in real time, extracting video frames, setting up the listening abnormality detection mechanism with the constructed models and trained parameters, and obtaining the proportions of students in different states.
Each step of the classroom lecture-listening abnormality detection method based on an illumination generative adversarial network is described in detail below.
Acquiring the real classroom head pose data comprises the following steps:
step S1-1: collecting real classroom video;
step S1-2: extracting video frames from the classroom video and performing sliding-window sampling to obtain candidate head position images, where each head position image contains the three RGB color channels;
step S1-3: constructing a head position detection model, which is an 8-layer neural network: the first 6 layers are convolutional layers and the 7th and 8th layers are fully connected layers;
step S1-3-1: the first 6 convolutional layers use the same parameters: each layer has 256 filters of size 3 x 3, the pooling method is sum pooling, i.e., the sum of the response values over the 256 channels is kept as the final output response, and the activation function is the ReLU function;
step S1-3-2: the 7th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S1-3-3: the 8th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label is 1 for a head position image and 0 for a non-head image;
step S1-4: labelling the candidate head position images obtained in step S1-2 to obtain head and non-head training data, and constructing the head position detection training set;
step S1-5: training the head position detection model built in step S1-3 with the training set from step S1-4 to obtain the trained head position detection model parameters w_headpos;
step S1-6: applying the trained parameters w_headpos to the candidate head position images from step S1-2 to judge heads against non-heads, thereby extracting the real head position images real_headpos of the test video.
Rendering the illuminated classroom head pose data comprises the following steps:
step S2-1: designing a 3D model of classroom students;
step S2-2: setting the head poses of students attending class in the classroom student 3D model;
step S2-3: setting the illumination conditions in the classroom student 3D model;
step S2-4: setting the camera shooting angle in the classroom student 3D model;
step S2-5: rendering multiple shots under the conditions set in steps S2-1 to S2-4 to obtain the set of classroom images under rendered illumination;
step S2-6: for the classroom images under rendered illumination, applying the head position detection model parameters w_headpos trained in step S1-5 to obtain the illumination-rendered head position images render_headpos.
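The multi-pass rendering of steps S2-1 to S2-5 amounts to enumerating combinations of head pose, illumination condition and camera angle and rendering one classroom image per combination. The sketch below only illustrates that enumeration; the render_classroom function and the concrete parameter values are hypothetical placeholders, since the text does not name a rendering engine or a parameter grid.

```python
from itertools import product

# Hypothetical parameter grids; the patent only states that head pose,
# illumination and camera angle are varied in the classroom student 3D model.
head_poses = ["facing_blackboard", "head_down", "head_turned_left", "head_turned_right"]
illuminations = ["morning_window_light", "overhead_fluorescent", "dim_evening"]
camera_angles = ["front_elevated", "rear_corner"]

def render_classroom(pose, light, camera):
    """Placeholder for the 3D renderer: returns a description of the rendered frame."""
    return {"pose": pose, "illumination": light, "camera": camera}

# Step S2-5: render once per combination to build the rendered-illumination image set.
rendered_set = [render_classroom(p, l, c)
                for p, l, c in product(head_poses, illuminations, camera_angles)]
print(len(rendered_set), "rendered classroom images")   # 4 * 3 * 2 = 24
```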
Constructing the illumination generative adversarial network comprises the following steps:
step S3-1: setting up the illumination generative adversarial network, in which the first 4 layers form the illumination refinement network (the generator) and layers 5 to 11 form the illumination decision network (the discriminator);
step S3-1-1: setting up the illumination refinement network as a 4-layer convolutional neural network;
step S3-1-1-1: in the refinement network, every convolutional layer uses the same parameters: 256 filters of size 3 x 3 per layer, max pooling, i.e., the maximum response value over the 256 channels is kept as the final output response, and the ReLU activation function;
step S3-1-1-2: denoting all parameters of the illumination refinement network as w_ref;
step S3-1-1-3: passing the input image through the illumination refinement network to obtain a refined image whose resolution is the same as that of the original image;
step S3-1-2: setting up the illumination decision network as a 7-layer neural network: the first 5 layers are convolutional layers and the 6th and 7th layers are fully connected layers;
step S3-1-2-1: in the decision network, the first 5 layers use the same parameters: 64 filters of size 3 x 3 per layer, sum pooling, i.e., the sum of the response values over the 64 channels is kept as the final output response, and the ReLU activation function;
step S3-1-2-2: the 6th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S3-1-2-3: the 7th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label y_real is 1 for a real head position image and 0 for an illumination-refined head position image;
step S3-1-2-4: denoting all parameters of the illumination decision network as w_judge;
step S3-1-2-5: an illumination-rendered image fed into the illumination decision network should be given a decision score close to 0, while a real head position image fed into the decision network should be given a decision score close to 1 (a schematic sketch of the two sub-networks follows below);
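A minimal sketch of the 11-layer illumination generative adversarial network of step S3-1, under assumptions analogous to the earlier sketch: the refinement network keeps the input resolution (a final 1 x 1 convolution back to RGB is added here), pooling is interpreted as spatial aggregation, and the decision head uses the 64 conv channels as its input width even though the text states a 256 x 4096 mapping.

```python
import torch
import torch.nn as nn

class IlluminationRefiner(nn.Module):
    """Layers 1-4: 3x3 conv, 256 filters, ReLU; the output keeps the input
    resolution (the 1x1 conv back to RGB is an added assumption)."""
    def __init__(self):
        super().__init__()
        layers, in_ch = [], 3
        for _ in range(4):
            layers += [nn.Conv2d(in_ch, 256, 3, padding=1), nn.ReLU(inplace=True)]
            in_ch = 256
        layers += [nn.Conv2d(256, 3, 1)]        # back to an RGB image (assumption)
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)                      # refined image, same H x W as input

class IlluminationJudge(nn.Module):
    """Layers 5-11: five 3x3 conv layers with 64 filters, then fully connected layers;
    the output is a realism score in (0, 1): ~1 for real, ~0 for rendered images."""
    def __init__(self):
        super().__init__()
        layers, in_ch = [], 3
        for _ in range(5):
            layers += [nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True)]
            in_ch = 64
        self.features = nn.Sequential(*layers)
        # The patent states a 256 x 4096 mapping; 64 is used here to match the conv width.
        self.fc = nn.Sequential(nn.Linear(64, 4096), nn.ReLU(inplace=True),
                                nn.Linear(4096, 1), nn.Sigmoid())

    def forward(self, x):
        x = self.features(x).sum(dim=(2, 3))    # sum pooling over spatial positions
        return self.fc(x).squeeze(1)            # decision score s_judge per image
```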
step S3-2: deriving the target loss of the illumination generative adversarial network;
step S3-2-1: computing the refinement loss of the illumination-rendered image;
step S3-2-1-1: feeding the illumination-rendered head position image render_headpos into the illumination refinement network of step S3-1-1 to obtain the illumination-refined head position image refine_headpos;
step S3-2-1-2: the refinement loss of the illumination-rendered head position image is the distance between the two images refine_headpos and render_headpos measured with the 1-norm, i.e.
d_ref = ||render_headpos - refine_headpos||_1
step S3-2-2: computing the decision loss of an image;
step S3-2-2-1: constructing a set of head position images img_headpos containing the real head position images real_headpos and the illumination-refined head position images refine_headpos;
step S3-2-2-2: according to the image type, assigning each head position image in img_headpos a label y_real, where y_real = 1 denotes a real head position image and y_real = 0 denotes an illumination-refined head position image;
step S3-2-2-3: feeding the head position image img_headpos into the illumination decision network of step S3-1-2 to obtain the decision score s_judge;
step S3-2-2-4: computing the decision loss d_judge of the image from the decision score s_judge and the image label y_real (the loss formula is reproduced as an image in the original publication);
step S3-2-3: the total loss of the illumination-rendered head position image consists of the refinement loss and the decision loss (a schematic sketch follows below), i.e.
loss = d_ref + d_judge
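The target loss of steps S3-2-1 to S3-2-3 combines the 1-norm refinement distance d_ref with the decision loss d_judge. Since the decision-loss formula is published only as an image, the binary cross-entropy used in the sketch below is an assumption consistent with the described behaviour (scores near 1 for real images and near 0 for refined ones).

```python
import torch
import torch.nn.functional as F

def refinement_loss(render_headpos, refine_headpos):
    # d_ref = || render_headpos - refine_headpos ||_1 : 1-norm distance between the two images
    return (render_headpos - refine_headpos).abs().sum()

def decision_loss(s_judge, y_real):
    # Assumed form: binary cross-entropy between the decision score s_judge and the
    # label y_real (1 = real head position image, 0 = illumination-refined image).
    return F.binary_cross_entropy(s_judge, y_real)

def total_loss(render_headpos, refine_headpos, s_judge, y_real):
    # loss = d_ref + d_judge  (step S3-2-3)
    return refinement_loss(render_headpos, refine_headpos) + decision_loss(s_judge, y_real)
```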
Step S3-3: training illumination to generate an antagonistic network;
step S3-3-1: training illumination to generate an optimized network;
step S3-3-1-1: input illumination rendering head position image renderheadpos
Step S3-3-1-2: computing illumination optimized head position image refineheadpos
Step S3-3-1-3: calculating a decision score s for an illumination-optimized head position imagejudge
Step S3-3-1-4: according to the step S3-2-3, calculating the total loss of the illumination rendering head position image;
step S3-3-1-5: adjusting model parameters for illumination generation optimization, and determining updated illumination generation optimization model parameters according to total loss and gradient descent method of illumination rendering head position image
Figure BDA0001743546990000062
Wherein t represents the t-th update of the model parameters;
step S3-3-2: training an illumination generation decision network;
step S3-3-2-1: repeating step S3-2-2-4 to calculate the decision loss d of all images in the image setjudge
Step S3-3-2-2: adjusting model parameters of illumination generation judgment, and determining updated illumination generation judgment model parameters according to gradient descent method
Figure BDA0001743546990000063
Wherein t represents the t-th update of the model parameters;
step S3-3-3: alternately repeating the steps S3-3-1 and S3-3-2, and iteratively optimizing the parameters of the illumination generation optimization model
Figure BDA0001743546990000071
And illumination generation decision model parameters
Figure BDA0001743546990000072
Until the model loss convergence no longer changes;
step S3-3-4: generating optimized model parameters by the converged illumination
Figure BDA0001743546990000073
And illumination generation decision model parameters
Figure BDA0001743546990000074
Recording as trained light generation confrontation network model parameter wadv={wref,wjudge}。
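The alternating optimization of steps S3-3-1 to S3-3-3 can be sketched as the loop below. It is again non-authoritative: the SGD optimizers, learning rate, batching, and the standard GAN convention of labelling refined images as real when updating the refiner are assumptions; rendered_loader and real_loader are hypothetical loaders yielding batches of image tensors from steps S2 and S1; refiner and judge are modules with the interfaces of the earlier sketch.

```python
import itertools
import torch
import torch.nn.functional as F

def train_illumination_gan(refiner, judge, rendered_loader, real_loader, steps=1000, lr=1e-4):
    """Alternating updates of the refinement network (w_ref) and decision network (w_judge)."""
    opt_ref = torch.optim.SGD(refiner.parameters(), lr=lr)
    opt_judge = torch.optim.SGD(judge.parameters(), lr=lr)
    pairs = zip(itertools.cycle(rendered_loader), itertools.cycle(real_loader))
    for t, (render_headpos, real_headpos) in enumerate(pairs):
        if t >= steps:
            break
        # Step S3-3-1: update w_ref from the total loss (d_ref + d_judge) by gradient descent.
        refine_headpos = refiner(render_headpos)
        d_ref = (render_headpos - refine_headpos).abs().mean()     # averaged 1-norm (choice)
        d_judge = F.binary_cross_entropy(judge(refine_headpos),
                                         torch.ones(len(render_headpos)))  # push towards "real"
        opt_ref.zero_grad(); (d_ref + d_judge).backward(); opt_ref.step()
        # Step S3-3-2: update w_judge from the decision loss over real and refined images.
        imgs = torch.cat([real_headpos, refiner(render_headpos).detach()])
        y_real = torch.cat([torch.ones(len(real_headpos)), torch.zeros(len(render_headpos))])
        loss_judge = F.binary_cross_entropy(judge(imgs), y_real)
        opt_judge.zero_grad(); loss_judge.backward(); opt_judge.step()
    return {"w_ref": refiner.state_dict(), "w_judge": judge.state_dict()}   # w_adv
```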
Generating the adversarial samples comprises the following steps:
step S4-1: using step S2, obtaining illumination-rendered head position images render_headpos under different illumination conditions, different shooting angles and different persons;
step S4-2: using the illumination generative adversarial network model parameters trained in step S3-3-4, generating the refined rendered-illumination head position images refine_headpos;
step S4-3: using the illumination decision model, computing the decision score s_judge of each illumination-refined head position image;
step S4-4: setting a realistic-image threshold and keeping the images whose decision score s_judge is greater than 0.5 as realistic rendered-illumination head position images.
Constructing the head pose detection model comprises the following steps:
step S5-1: using the realistic rendered-illumination head position images together with the real head position images obtained in step S1-6 as training data for head pose detection;
step S5-2: labelling the head pose training data with the class label y_listen; the data contain listening and not-listening samples, where y_listen = 1 denotes attentive listening and y_listen = 0 denotes not listening;
step S5-3: setting up the head pose detection model as a 7-layer neural network: the first 5 layers are convolutional layers and the 6th and 7th layers are fully connected layers;
step S5-3-1: the first 5 convolutional layers use the same parameters: 64 filters of size 3 x 3 per layer, sum pooling, i.e., the sum of the response values over the 64 channels is kept as the final output response, and the ReLU activation function;
step S5-3-2: the 6th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S5-3-3: the 7th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label y_listen is 1 for attentive listening and 0 for not listening;
step S5-4: training the neural network model built in step S5-3 with the training set built in steps S5-1 and S5-2 to obtain the trained head pose detection model parameters w_listen.
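Training the head pose (listening / not-listening) classifier in step S5-4 is ordinary supervised learning on the mixed set of realistic refined images and real head crops. The sketch below assumes a cross-entropy objective, an SGD optimizer and a data loader yielding (head_crops, y_listen) batches, none of which are specified by the text above.

```python
import torch
import torch.nn as nn

def train_head_pose_model(model, loader, epochs=10, lr=1e-3):
    """Supervised training for step S5-4: y_listen = 1 (listening) / 0 (not listening).
    The optimizer, learning rate and cross-entropy objective are assumptions."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for head_crops, y_listen in loader:     # realistic refined + real head images
            logits = model(head_crops)           # 2-way output: {not listening, listening}
            loss = ce(logits, y_listen)          # y_listen as integer class labels
            opt.zero_grad(); loss.backward(); opt.step()
    return model.state_dict()                    # w_listen
```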
The classroom head pose detection comprises the following steps:
step S6-1: inputting the video collected in the classroom in real time and extracting video frames;
step S6-2: extracting the real head position images real_headpos with the trained head position detection model parameters w_headpos from step S1-6;
step S6-3: computing each student's head pose score with the head pose detection model parameters w_listen trained in step S5-4;
step S6-4: judging from the head pose score whether the student is listening, where y_listen = 1 denotes attentive listening and y_listen = 0 denotes not listening;
step S6-5: traversing all students in the video frame, judging everyone's listening state, and computing the proportion of students who are not listening;
step S6-6: setting a state threshold on the not-listening proportion: if the proportion of students not listening is greater than or equal to 5%, outputting the class not-listening state; if it is less than 5%, outputting the normal listening state.
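Per video frame, step S6 reduces to counting how many detected heads are classified as not listening and comparing the proportion against the 5% threshold; a minimal sketch, assuming a list of per-student y_listen predictions (1 = listening, 0 = not listening):

```python
def frame_listening_state(y_listen_per_student, threshold=0.05):
    """Step S6-5 / S6-6: return 'not_listening' if the proportion of students
    predicted as not listening (y_listen == 0) reaches the 5% threshold."""
    if not y_listen_per_student:
        return "normal"                                   # no detected heads in this frame
    not_listening = sum(1 for y in y_listen_per_student if y == 0)
    ratio = not_listening / len(y_listen_per_student)
    return "not_listening" if ratio >= threshold else "normal"

# e.g. 2 of 30 students not listening -> 6.7% >= 5% -> frame flagged as not listening
print(frame_listening_state([1] * 28 + [0] * 2))
```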
The class-listening abnormality detection comprises the following steps:
step S7-1: inputting the real-time surveillance video and reading video frames;
step S7-2: judging whether the video has ended; if the surveillance video has ended, terminating the real-time judgment of the abnormal listening state;
step S7-3: if the surveillance video is still running, detecting the not-listening state of consecutive video frames using step S6-6 and extracting the not-listening state of each frame;
step S7-4: if no not-listening state appears, clearing the not-listening state, clearing its start time, and clearing the abnormal listening state;
step S7-5: if a not-listening state appears, performing the classroom listening abnormality judgment;
step S7-5-1: if the not-listening state appears for the first time, initializing its start time to the current time and initializing the not-listening duration to 1 frame;
step S7-5-2: if the not-listening state does not appear for the first time, updating its duration by increasing it by 1 frame;
step S7-5-3: if the duration of the not-listening state reaches 50 frames, outputting the classroom abnormal listening state;
step S7-6: repeating steps S7-1 to S7-5 to analyze classroom listening data in real time, providing detection of the abnormal classroom listening state with real-time judgment.
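The temporal rule of step S7 is a small state machine over consecutive frames: the not-listening duration is cleared whenever a frame is normal, incremented otherwise, and an abnormality is reported once it reaches 50 frames. A minimal sketch of that logic:

```python
def detect_listening_abnormality(frame_states, duration_threshold=50):
    """Step S7: frame_states is an iterable of per-frame states ('normal' / 'not_listening').
    Yields the frame indices at which the classroom abnormal-listening state is reported."""
    duration = 0                                   # consecutive not-listening frames
    for i, state in enumerate(frame_states):
        if state == "not_listening":
            duration += 1                          # steps S7-5-1 / S7-5-2
            if duration == duration_threshold:     # step S7-5-3: sustained for 50 frames
                yield i
        else:
            duration = 0                           # step S7-4: clear the not-listening state

# e.g. 60 consecutive not-listening frames trigger one report at frame index 49
print(list(detect_listening_abnormality(["not_listening"] * 60)))
```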
Compared with the prior art, the invention has the following advantages:
(1) Against the problems of a complex background environment and the interference of a large amount of invalid information from non-head regions, the invention uses a deep neural network to improve the accuracy of head-region localization and to reduce the interference of non-head regions on the judgment of the not-listening state.
(2) Head pose data change with factors such as the head pose itself, the illumination conditions and the camera shooting angle, which affects the accuracy of head pose detection. Through the 3D model and the ambient-illumination rendering method, the invention repeatedly generates classroom image sets under rendered illumination, providing more data samples for the head pose detection model and facilitating adequate training of the model.
(3) Through the illumination generative adversarial network, the repeatedly generated rendered-illumination samples are refined into more realistic adversarial samples, which avoids the inconsistency between the rendered-illumination samples and the characteristics of real data and improves the effectiveness of model training. On the basis of the generative adversarial network, the invention achieves effective head pose detection, which benefits the judgment of the listening state and improves the accuracy of judging sustained abnormal listening states.
Drawings
The invention is further described below with reference to the accompanying drawings:
fig. 1 is a flowchart of a classroom attendance anomaly detection method for generating an countermeasure network based on illumination.
Fig. 2(a) proposes a model for the head position.
Fig. 2(b) is the collected real classroom head pose data.
Fig. 3 is a classroom student 3D model and lighting rendering environment.
Fig. 4(a) is a model of an illumination-generated countermeasure network.
Fig. 4(b) is a lighting optimized head position image.
Fig. 5 is a head pose detection model.
Fig. 6 shows a model for detecting a non-attending state.
Fig. 7 is a model for detecting abnormal listening status in a classroom.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The invention is a classroom lecture-listening abnormality detection method based on an illumination generative adversarial network; the overall flow is shown in Fig. 1, and the implementation comprises the following steps:
step S1: acquiring the real classroom head pose data, the specific operations being:
step S1-1: collecting real classroom video;
step S1-2: extracting video frames from the classroom video and performing sliding-window sampling to obtain candidate head position images, where each head position image contains the three RGB color channels;
step S1-3: constructing a head position detection model, which is an 8-layer neural network: the first 6 layers are convolutional layers and the 7th and 8th layers are fully connected layers, as shown in Fig. 2(a);
step S1-3-1: the first 6 convolutional layers use the same parameters: each layer has 256 filters of size 3 x 3, the pooling method is sum pooling, i.e., the sum of the response values over the 256 channels is kept as the final output response, and the activation function is the ReLU function;
step S1-3-2: the 7th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S1-3-3: the 8th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label is 1 for a head position image and 0 for a non-head image;
step S1-4: labelling the candidate head position images obtained in step S1-2 to obtain head and non-head training data, and constructing the head position detection training set;
step S1-5: training the head position detection model built in step S1-3 with the training set from step S1-4 to obtain the trained head position detection model parameters w_headpos;
step S1-6: applying the trained parameters w_headpos to the candidate head position images from step S1-2 to judge heads against non-heads, thereby extracting the real head position images real_headpos of the test video, as shown in Fig. 2(b);
step S2: generating the rendered-illumination classroom head pose data, the specific operations being:
step S2-1: designing a 3D model of classroom students;
step S2-2: setting the head poses of students attending class in the classroom student 3D model;
step S2-3: setting the illumination conditions in the classroom student 3D model;
step S2-4: setting the camera shooting angle in the classroom student 3D model;
step S2-5: rendering multiple shots under the conditions set in steps S2-1 to S2-4, an example of which is shown in Fig. 3, to obtain the set of classroom images under rendered illumination;
step S2-6: for the classroom images under rendered illumination, applying the head position detection model parameters w_headpos trained in step S1-5 to obtain the illumination-rendered head position images render_headpos;
step S3: constructing the illumination generative adversarial network for listening-pose detection, the specific operations being:
step S3-1: setting up the illumination generative adversarial network, in which the first 4 layers form the illumination refinement network (the generator) and layers 5 to 11 form the illumination decision network (the discriminator), as shown in Fig. 4(a);
step S3-1-1: setting up the illumination refinement network as a 4-layer convolutional neural network;
step S3-1-1-1: in the refinement network, every convolutional layer uses the same parameters: 256 filters of size 3 x 3 per layer, max pooling, i.e., the maximum response value over the 256 channels is kept as the final output response, and the ReLU activation function;
step S3-1-1-2: denoting all parameters of the illumination refinement network as w_ref;
step S3-1-1-3: passing the input image through the illumination refinement network to obtain a refined image whose resolution is the same as that of the original image;
step S3-1-2: setting up the illumination decision network as a 7-layer neural network: the first 5 layers are convolutional layers and the 6th and 7th layers are fully connected layers;
step S3-1-2-1: in the decision network, the first 5 layers use the same parameters: 64 filters of size 3 x 3 per layer, sum pooling, i.e., the sum of the response values over the 64 channels is kept as the final output response, and the ReLU activation function;
step S3-1-2-2: the 6th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S3-1-2-3: the 7th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label y_real is 1 for a real head position image and 0 for an illumination-refined head position image;
step S3-1-2-4: denoting all parameters of the illumination decision network as w_judge;
step S3-1-2-5: an illumination-rendered image fed into the illumination decision network should be given a decision score close to 0, while a real head position image fed into the decision network should be given a decision score close to 1;
step S3-2: deriving the target loss of the illumination generative adversarial network;
step S3-2-1: computing the refinement loss of the illumination-rendered image;
step S3-2-1-1: feeding the illumination-rendered head position image render_headpos into the illumination refinement network of step S3-1-1 to obtain the illumination-refined head position image refine_headpos;
step S3-2-1-2: the refinement loss of the illumination-rendered head position image is the distance between the two images refine_headpos and render_headpos measured with the 1-norm, i.e.
d_ref = ||render_headpos - refine_headpos||_1
step S3-2-2: computing the decision loss of an image;
step S3-2-2-1: constructing a set of head position images img_headpos containing the real head position images real_headpos and the illumination-refined head position images refine_headpos;
step S3-2-2-2: according to the image type, assigning each head position image in img_headpos a label y_real, where y_real = 1 denotes a real head position image and y_real = 0 denotes an illumination-refined head position image;
step S3-2-2-3: feeding the head position image img_headpos into the illumination decision network of step S3-1-2 to obtain the decision score s_judge;
step S3-2-2-4: computing the decision loss d_judge of the image from the decision score s_judge and the image label y_real (the loss formula is reproduced as an image in the original publication);
step S3-2-3: the total loss of the illumination-rendered head position image consists of the refinement loss and the decision loss, i.e.
loss = d_ref + d_judge
step S3-3: training the illumination generative adversarial network;
step S3-3-1: training the illumination refinement network;
step S3-3-1-1: inputting the illumination-rendered head position image render_headpos;
step S3-3-1-2: computing the illumination-refined head position image refine_headpos;
step S3-3-1-3: computing the decision score s_judge of the illumination-refined head position image;
step S3-3-1-4: computing the total loss of the illumination-rendered head position image according to step S3-2-3;
step S3-3-1-5: adjusting the parameters of the illumination refinement network: from the total loss of the illumination-rendered head position image and gradient descent, determining the updated refinement network parameters w_ref^(t+1), where t denotes the t-th update of the model parameters;
step S3-3-2: training the illumination decision network;
step S3-3-2-1: repeating step S3-2-2-4 to compute the decision loss d_judge of all images in the image set;
step S3-3-2-2: adjusting the parameters of the illumination decision network: by gradient descent, determining the updated decision network parameters w_judge^(t+1), where t denotes the t-th update of the model parameters;
step S3-3-3: alternately repeating steps S3-3-1 and S3-3-2, iteratively optimizing the refinement network parameters w_ref and the decision network parameters w_judge until the model loss converges and no longer changes;
step S3-3-4: recording the converged refinement network parameters w_ref and decision network parameters w_judge as the trained illumination generative adversarial network model parameters w_adv = {w_ref, w_judge};
step S4: refining the illumination classroom head pose data and generating the adversarial samples, the specific operations being:
step S4-1: using step S2, obtaining illumination-rendered head position images render_headpos under different illumination conditions, different shooting angles and different persons;
step S4-2: using the illumination generative adversarial network model parameters trained in step S3-3-4, generating the refined rendered-illumination head position images refine_headpos;
step S4-3: using the illumination decision model, computing the decision score s_judge of each illumination-refined head position image;
step S4-4: setting a realistic-image threshold and keeping the images whose decision score s_judge is greater than 0.5 as realistic rendered-illumination head position images, as shown in Fig. 4(b);
step S5: constructing the head pose detection model with the generated adversarial data, as shown in Fig. 5, the specific operations being:
step S5-1: using the realistic rendered-illumination head position images together with the real head position images obtained in step S1-6 as training data for head pose detection;
step S5-2: labelling the head pose training data with the class label y_listen; the data contain listening and not-listening samples, where y_listen = 1 denotes attentive listening and y_listen = 0 denotes not listening;
step S5-3: setting up the head pose detection model as a 7-layer neural network: the first 5 layers are convolutional layers and the 6th and 7th layers are fully connected layers;
step S5-3-1: the first 5 convolutional layers use the same parameters: 64 filters of size 3 x 3 per layer, sum pooling, i.e., the sum of the response values over the 64 channels is kept as the final output response, and the ReLU activation function;
step S5-3-2: the 6th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S5-3-3: the 7th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label y_listen is 1 for attentive listening and 0 for not listening;
step S5-4: training the neural network model built in step S5-3 with the training set built in steps S5-1 and S5-2 to obtain the trained head pose detection model parameters w_listen;
step S6: using the head pose detection model trained on the generated adversarial data to realize classroom head pose detection and not-listening state detection, as shown in Fig. 6, the specific operations being:
step S6-1: inputting the video collected in the classroom in real time and extracting video frames;
step S6-2: extracting the real head position images real_headpos with the trained head position detection model parameters w_headpos from step S1-6;
step S6-3: computing each student's head pose score with the head pose detection model parameters w_listen trained in step S5-4;
step S6-4: judging from the head pose score whether the student is listening, where y_listen = 1 denotes attentive listening and y_listen = 0 denotes not listening;
step S6-5: traversing all students in the video frame, judging everyone's listening state, and computing the proportion of students who are not listening;
step S6-6: setting a state threshold on the not-listening proportion: if the proportion of students not listening is greater than or equal to 5%, outputting the class not-listening state; if it is less than 5%, outputting the normal listening state;
step S7: using the head pose detection model trained on the generated adversarial data to realize classroom listening abnormality detection, as shown in Fig. 7, the specific operations being:
step S7-1: inputting the real-time surveillance video and reading video frames;
step S7-2: judging whether the video has ended; if the surveillance video has ended, terminating the real-time judgment of the abnormal listening state;
step S7-3: if the surveillance video is still running, detecting the not-listening state of consecutive video frames using step S6-6 and extracting the not-listening state of each frame;
step S7-4: if no not-listening state appears, clearing the not-listening state, clearing its start time, and clearing the abnormal listening state;
step S7-5: if a not-listening state appears, performing the classroom listening abnormality judgment;
step S7-5-1: if the not-listening state appears for the first time, initializing its start time to the current time and initializing the not-listening duration to 1 frame;
step S7-5-2: if the not-listening state does not appear for the first time, updating its duration by increasing it by 1 frame;
step S7-5-3: if the duration of the not-listening state reaches 50 frames, outputting the classroom abnormal listening state;
step S7-6: repeating steps S7-1 to S7-5 to analyze classroom listening data in real time, providing detection of the abnormal classroom listening state with real-time judgment.

Claims (1)

1. A classroom lecture-listening abnormality detection method based on an illumination generative adversarial network, characterized by comprising the following steps:
step S1: acquiring real classroom head pose data:
collecting video frames from a real classroom, constructing a head position detection model, labelling candidate head position images, and obtaining a training set and trained parameters;
step S2: rendering illuminated classroom head pose data:
according to a designed 3D model of classroom students, setting the head pose, illumination condition and camera angle parameters in the model, rendering multiple times, and obtaining a set of classroom images under rendered illumination;
step S3: constructing the illumination generative adversarial network:
according to the 11-layer illumination generative adversarial network, deriving its target loss and training the network;
step S4: generating adversarial samples:
using the real classroom head pose data, obtaining illumination-rendered head position images under different illumination conditions, shooting angles and persons; using the trained illumination generative adversarial network model parameters, generating refined rendered-illumination head position images; computing the decision score of each illumination-refined head position image, setting a realistic-image threshold, and selecting the illumination-rendered head position images above the threshold as realistic rendered-illumination head position images;
step S5: constructing a head pose detection model:
taking the realistic rendered-illumination head position images as training data for head pose detection, labelling the training data, setting up the head pose detection model, and obtaining its parameters by training;
step S6: detecting head poses in the classroom:
using the head pose detection model trained on the generated adversarial data to perform classroom head pose detection;
step S7: detecting abnormal class listening in the classroom:
inputting the video collected in the classroom in real time, extracting video frames, setting up the listening abnormality detection mechanism with the constructed models and trained parameters, and obtaining the proportions of students in different states;
the method for acquiring the real classroom head pose data specifically comprises the following steps:
step S1-1: collecting real classroom video;
step S1-2: extracting video frames from the classroom video and performing sliding-window sampling to obtain candidate head position images, where each head position image contains the three RGB color channels;
step S1-3: constructing a head position detection model, which is an 8-layer neural network: the first 6 layers are convolutional layers and the 7th and 8th layers are fully connected layers;
step S1-3-1: the first 6 convolutional layers use the same parameters: each layer has 256 filters of size 3 x 3, the pooling method is sum pooling, i.e., the sum of the response values over the 256 channels is kept as the final output response, and the activation function is the ReLU function;
step S1-3-2: the 7th layer is a fully connected layer mapping 256 feature neurons to 4096 feature neurons, with a 256 x 4096 fully connected mapping matrix;
step S1-3-3: the 8th layer is a fully connected layer mapping the 4096 feature neurons to the output neuron, with a 4096 x 2 fully connected mapping matrix; the final output label is 1 for a head position image and 0 for a non-head image;
step S1-4: labelling the candidate head position images obtained in step S1-2 to obtain head and non-head training data, and constructing the head position detection training set;
step S1-5: training the head position detection model built in step S1-3 with the training set from step S1-4 to obtain the trained head position detection model parameters w_headpos;
step S1-6: applying the trained parameters w_headpos to the candidate head position images from step S1-2 to judge heads against non-heads, thereby extracting the real head position images real_headpos of the test video;
the method for rendering the illuminated classroom head pose data specifically comprises the following steps:
step S2-1: designing a 3D model of classroom students;
step S2-2: setting the head poses of students attending class in the classroom student 3D model;
step S2-3: setting the illumination conditions in the classroom student 3D model;
step S2-4: setting the camera shooting angle in the classroom student 3D model;
step S2-5: rendering multiple shots under the conditions set in steps S2-1 to S2-4 to obtain the set of classroom images under rendered illumination;
step S2-6: for the classroom images under rendered illumination, applying the head position detection model parameters w_headpos trained in step S1-5 to obtain the illumination-rendered head position images render_headpos;
The method for constructing the illumination generation countermeasure network specifically comprises the following steps:
step S3-1: setting an illumination generation countermeasure network, wherein the first 4 layers are illumination generation optimization networks, and the 5 th to 11 th layers are illumination generation judgment networks;
step S3-1-1: setting an illumination generation optimization network, and using a 4-layer convolutional neural network;
step S3-1-1-1: in the optimization network, the convolutional neural networks of each layer use the same parameters, the size of the filter of each layer is 3 x 3, the number of the filters is 256, the pooling method is maximum pooling, namely the maximum response value in 256 channels is reserved as the final output response, and the form of the excitation function is a relu function;
step S3-1-1-2: all parameters in the illumination generation optimization network are denoted as wref
Step S3-1-1-3: generating an optimized network by the input image through illumination, and obtaining an optimized image, wherein the resolution of the optimized image is the same as that of the original image;
step S3-1-2: setting an illumination generation judgment network, and using a neural network with 7 layers, wherein the first 5 layers are convolutional neural networks, and the 6 th layer and the 7 th layer are full-connection neural networks;
step S3-1-2-1: in the decision network, the same parameters are used in the first 5 layers, the size of the filter of each layer is 3 x 3, the number of the filters is 64, the pooling method is summation pooling, namely the summation result of response values in 64 channels is reserved as the final output response, and the form of the excitation function is relu function;
step S3-1-2-2: in the full-connection layer of the 6 th layer, 256 characteristic neurons are mapped into 4096 characteristic neurons, and a full-connection mapping parameter matrix is 256 x 4096;
step S3-1-2-3: and a full-connection layer of the 7 th layer, wherein 4096 characteristic neurons are mapped into single neurons, the full-connection mapping parameter matrix is 4096 x 2, and the type y of the neurons of the final output layerreal,yrealA value of 1 indicates a true head position image, and a value of 0 indicates a lighting-optimized head position image;
step S3-1-2-4: all parameters in the illumination generation decision network are denoted as wjudge
Step S3-1-2-5: rendering an illumination image, inputting the illumination to generate a judgment network, and judging that the score is closer to 0; inputting a real head position image into an illumination generation judgment network, and judging that the score is closer to 1;
step S3-2: solving the target loss of the illumination generation countermeasure network;
step S3-2-1: calculating the optimization loss of the illumination rendering image;
step S3-2-1-1: rendering head position images with lightingheadposInputting step S3-1-1 illumination generation optimization network, obtaining illumination optimization head position image refineheadpos
Step S3-2-1-2: solving the optimization loss of the illumination rendering head position image, namely illumination optimization head position image refineheadposRender head position image render with illuminationheadposUsing a 1 norm to calculate the distance between 2 images, i.e.
dref=||renderheadpos-refineheadpos||1
Step S3-2-2: calculating the judgment loss of the image;
step S3-2-2-1: constructing a set img of head position imagesheadposContaining real head position image realheadposAnd illumination optimized head position image refineheadpos
Step S3-2-2-2: according to the image type, img is set for the head position imageheadposSetting image flag yreal,yrealA value of 1 indicates a true head position image, and a value of 0 indicates a lighting-optimized head position image;
step S3-2-2-3: image img of head positionheadposInputting step S3-1-2 light generation judgment network to obtain judgment score Sjudge
Step S3-2-2-4: solving for a decision loss for an image based on the decision score and the image label
Figure FDA0003074870730000041
Step S3-2-3: solving the total loss of the illumination rendering head position image, wherein the total loss comprises 2 parts of optimization loss and judgment loss, and the total loss of the illumination rendering head position image is
loss=dref+djudge
Step S3-3: training illumination to generate an antagonistic network;
step S3-3-1: training illumination to generate an optimized network;
step S3-3-1-1: input illumination rendering head position image renderheadpos
Step S3-3-1-2: computing illumination optimized head position image refineheadpos
Step S3-3-1-3: calculating a decision score s for an illumination-optimized head position imagejudge
Step S3-3-1-4: according to the step S3-2-3, calculating the total loss of the illumination rendering head position image;
step S3-3-1-5: adjusting model for illumination generation optimizationParameters, and determining updated parameters of the illumination generation optimization model according to the total loss and gradient reduction method of the illumination rendering head position image
Figure FDA0003074870730000042
Wherein t represents the t-th update of the model parameters;
step S3-3-2: training an illumination generation decision network;
step S3-3-2-1: repeating step S3-2-2-4 to calculate the decision loss d of all images in the image setjudge
Step S3-3-2-2: adjusting model parameters of illumination generation judgment, and determining updated illumination generation judgment model parameters according to gradient descent method
Figure FDA0003074870730000051
Wherein t represents the t-th update of the model parameters;
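Both parameter-update formulas appear in the source only as embedded images. A standard gradient-descent form consistent with the surrounding text, with an assumed learning rate \eta, would be:

w_{ref}^{(t+1)} = w_{ref}^{(t)} - \eta\,\frac{\partial\, loss}{\partial\, w_{ref}^{(t)}},
\qquad
w_{judge}^{(t+1)} = w_{judge}^{(t)} - \eta\,\frac{\partial\, d_{judge}}{\partial\, w_{judge}^{(t)}}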
step S3-3-3: alternately repeating steps S3-3-1 and S3-3-2, iteratively optimizing the illumination generation optimization model parameters w_ref and the illumination generation decision model parameters w_judge until the model loss converges and no longer changes;
step S3-3-4: recording the converged illumination generation optimization model parameters w_ref and illumination generation decision model parameters w_judge as the trained illumination generation confrontation network model parameters w_adv = {w_ref, w_judge} (a sketch of this alternating training follows below);
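A minimal sketch of the alternating training of steps S3-3-1 to S3-3-3 in PyTorch. The optimizer, learning rate, epoch count, data loader and network objects are assumptions introduced only for illustration; what follows the text is the alternation between the two networks and the loss terms used for each.

import torch
import torch.nn.functional as F

def train_confrontation_network(refine_net, decision_net, loader, epochs=10, lr=1e-4):
    opt_ref = torch.optim.SGD(refine_net.parameters(), lr=lr)
    opt_judge = torch.optim.SGD(decision_net.parameters(), lr=lr)
    for _ in range(epochs):
        for render_headpos, real_headpos in loader:
            # S3-3-1: update the illumination generation optimization network
            refine_headpos = refine_net(render_headpos)
            d_ref = (render_headpos - refine_headpos).abs().sum(dim=(1, 2, 3)).mean()
            ones = torch.ones(len(refine_headpos), dtype=torch.long)
            loss = d_ref + F.cross_entropy(decision_net(refine_headpos), ones)
            opt_ref.zero_grad(); loss.backward(); opt_ref.step()
            # S3-3-2: update the illumination generation decision network on real vs. optimized images
            refine_headpos = refine_net(render_headpos).detach()
            imgs = torch.cat([real_headpos, refine_headpos])
            labels = torch.cat([torch.ones(len(real_headpos), dtype=torch.long),
                                torch.zeros(len(refine_headpos), dtype=torch.long)])
            d_judge = F.cross_entropy(decision_net(imgs), labels)
            opt_judge.zero_grad(); d_judge.backward(); opt_judge.step()
    # S3-3-4: the converged parameters together form w_adv = {w_ref, w_judge}
    return {"w_ref": refine_net.state_dict(), "w_judge": decision_net.state_dict()}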
The method for generating the confrontation sample specifically comprises the following steps:
step S4-1: using step S2 to obtain illumination-rendered head position images render_headpos under different illumination conditions, from different shooting angles and for different people;
step S4-2: generating the optimized rendered illumination head position images refine_headpos using the illumination generation confrontation network model parameters trained in step S3-3-4;
step S4-3: calculating the decision score s_judge of each illumination-optimized head position image using the illumination generation decision model;
step S4-4: setting a realism threshold and keeping the images whose decision score s_judge is greater than 0.5 as realistic rendered illumination head position images (a code sketch of steps S4-1 to S4-4 follows below);
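A minimal sketch of steps S4-1 to S4-4: optimize the rendered head position images and keep only those judged realistic (s_judge > 0.5) as confrontation samples. All names are illustrative, and the rendering step S2 is represented only by its output tensor render_headpos.

import torch

@torch.no_grad()
def generate_confrontation_samples(refine_net, decision_net, render_headpos):
    refine_headpos = refine_net(render_headpos)                         # S4-2: optimize the rendering
    s_judge = torch.softmax(decision_net(refine_headpos), dim=1)[:, 1]  # S4-3: decision score
    return refine_headpos[s_judge > 0.5]                                # S4-4: keep realistic images only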
the method for constructing the head posture detection model specifically comprises the following steps:
step S5-1: using the realistic rendered illumination head position images and the real head position images obtained in step S1-6 as training data for head posture detection;
step S5-2: labeling the head posture detection training data with class labels y_listen, the training data comprising attending-class samples and not-attending-class samples, where y_listen = 1 denotes attending class attentively and y_listen = 0 denotes not attending class;
step S5-3: setting a head posture detection model as a 7-layer neural network, wherein the first 5 layers are convolutional neural networks and the 6th and 7th layers are fully-connected neural networks;
step S5-3-1: in the head posture detection network, the first 5 layers use the same parameters: the filter size of each layer is 3 x 3, the number of filters is 64, the pooling method is summation pooling, i.e. the summed response values over the 64 channels are kept as the final output response, and the excitation function is the relu function;
step S5-3-2: in the fully-connected layer of the 6th layer, 256 feature neurons are mapped to 4096 feature neurons, with a fully-connected mapping parameter matrix of 256 x 4096;
step S5-3-3: in the fully-connected layer of the 7th layer, the 4096 feature neurons are mapped to the output layer, with a fully-connected mapping parameter matrix of 4096 x 2; the final output layer gives the class y_listen, where y_listen = 1 denotes attending class attentively and y_listen = 0 denotes not attending class;
step S5-4: training the neural network model constructed in step S5-3 with the training set constructed in steps S5-1 and S5-2 to obtain the trained head posture detection model parameters w_listen (a training-loop sketch follows below);
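A minimal sketch of step S5-4: train the head posture detector on the mixed real and realistic rendered head position images with labels y_listen. The network object listen_net, optimizer and hyperparameters are assumptions; since the layer sizes described above match the decision network sketched earlier, the same architecture could plausibly be reused with a fresh set of parameters.

import torch
import torch.nn.functional as F

def train_head_posture_detector(listen_net, loader, epochs=10, lr=1e-4):
    opt = torch.optim.SGD(listen_net.parameters(), lr=lr)
    for _ in range(epochs):
        for head_imgs, y_listen in loader:       # y_listen: 1 = attending class, 0 = not attending
            loss = F.cross_entropy(listen_net(head_imgs), y_listen)
            opt.zero_grad(); loss.backward(); opt.step()
    return listen_net.state_dict()               # trained head posture detection parameters w_listen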
The classroom head posture detection specifically comprises the following steps:
step S6-1: inputting a classroom real-time collected video and extracting a video frame;
step S6-2: extracting the real head position images real_headpos using the head position detection model parameters w_headpos trained in step S1-6;
step S6-3: calculating each student's head posture score using the head posture detection model parameters w_listen trained in step S5-4;
step S6-4: judging whether each student is attending class from the head posture score, where y_listen = 1 denotes attending class attentively and y_listen = 0 denotes not attending class;
step S6-5: traversing all students in the video frame, judging the lecture-listening state of each person, and calculating the proportion of students not attending class;
step S6-6: setting a state threshold on the not-attending proportion: if the proportion is greater than or equal to 5%, outputting a not-attending-class state for the frame, and if it is less than 5%, outputting a normal attending state (a code sketch of steps S6-3 to S6-6 follows below);
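A minimal sketch of steps S6-3 to S6-6 for one video frame: score every detected head, count the students judged as not attending, and compare the proportion against the 5% threshold. head_crops stands for the per-student head position images extracted in step S6-2 and is an assumption introduced for illustration.

import torch

@torch.no_grad()
def frame_not_attending(listen_net, head_crops, ratio_threshold=0.05):
    scores = torch.softmax(listen_net(head_crops), dim=1)   # S6-3: head posture scores
    y_listen = scores.argmax(dim=1)                          # S6-4: 1 = attending, 0 = not attending
    ratio = (y_listen == 0).float().mean().item()            # S6-5: proportion of students not attending
    return ratio >= ratio_threshold                          # S6-6: True = not-attending-class state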
the class attendance abnormity detection specifically comprises the following steps:
step S7-1: inputting a real-time monitoring video and reading a video frame;
step S7-2: judging whether the video is finished or not, and finishing the real-time judgment of abnormal class listening state if the monitoring video is finished;
step S7-3: if the monitoring video is still valid, detecting the non-lesson-listening state of the continuous video frames by using the step S6-6, and extracting the non-lesson-listening state of each frame;
step S7-4: if the non-listening state does not appear, clearing the non-listening state, clearing the starting time of the non-listening state and clearing the abnormal state of class listening;
step S7-5: if the state of not listening to the class appears, abnormal judgment of class listening in the class is carried out;
step S7-5-1: if the non-class state appears for the first time, initializing the current time as the starting time of the non-class state, and initializing the non-class duration as 1 frame;
step S7-5-2: if the non-lesson-listening state does not occur for the first time, updating the duration of the non-lesson-listening state, and increasing the duration by 1 frame;
step S7-5-3: if the duration of the non-lesson-listening state reaches 50 frames, outputting a class-listening abnormal state;
step S7-6: repeating steps S7-1 to S7-5 to analyze classroom lecture-listening data in real time and to detect abnormal classroom lecture-listening states as they occur (a sketch of this per-frame state machine follows below).
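A minimal sketch of the per-frame state machine in steps S7-1 to S7-6: the not-attending state must persist for 50 consecutive frames before an abnormal class-listening state is reported, and any attending frame resets the counter. frame_states is any iterable of per-frame booleans produced by the S6 check above, an assumption made for illustration.

def detect_listening_abnormality(frame_states, min_duration=50):
    duration = 0
    for t, not_attending in enumerate(frame_states):
        if not not_attending:
            duration = 0                      # S7-4: clear the not-attending state and its start time
        else:
            duration += 1                     # S7-5-1 / S7-5-2: start or extend the not-attending run
            if duration >= min_duration:      # S7-5-3: 50 consecutive frames reached
                yield t                       # report an abnormal class-listening state at frame t
    # S7-2: the loop ends when the monitoring video ends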
CN201810831224.5A 2018-07-26 2018-07-26 Classroom lecture listening abnormity detection method based on illumination generation countermeasure network Active CN109241830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810831224.5A CN109241830B (en) 2018-07-26 2018-07-26 Classroom lecture listening abnormity detection method based on illumination generation countermeasure network

Publications (2)

Publication Number Publication Date
CN109241830A CN109241830A (en) 2019-01-18
CN109241830B true CN109241830B (en) 2021-09-17

Family

ID=65072427

Country Status (1)

Country Link
CN (1) CN109241830B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant