CN112905811A - Teaching audio and video pushing method and system based on student classroom behavior analysis - Google Patents


Info

Publication number
CN112905811A
Authority
CN
China
Prior art keywords
video
classroom
teaching audio
recognition result
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110168340.5A
Other languages
Chinese (zh)
Inventor
冯飞
龙土仲
张国友
陈国镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Luzhen Technology Co ltd
Original Assignee
Guangzhou Luzhen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Luzhen Technology Co ltd filed Critical Guangzhou Luzhen Technology Co ltd
Priority to CN202110168340.5A priority Critical patent/CN112905811A/en
Publication of CN112905811A publication Critical patent/CN112905811A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/435 — Querying multimedia data: filtering based on additional data, e.g. user or group profiles
    • G06F16/45 — Multimedia data: Clustering; Classification
    • G06F16/483 — Retrieval of multimedia data using metadata automatically derived from the content
    • G06F16/635 — Querying audio data: filtering based on additional data, e.g. user or group profiles
    • G06F16/65 — Audio data: Clustering; Classification
    • G06F16/683 — Retrieval of audio data using metadata automatically derived from the content
    • G06F16/735 — Querying video data: filtering based on additional data, e.g. user or group profiles
    • G06F16/75 — Video data: Clustering; Classification
    • G06F16/783 — Retrieval of video data using metadata automatically derived from the content
    • G06F18/241 — Pattern recognition: classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 — Neural networks: combinations of networks
    • G06N3/08 — Neural networks: learning methods
    • G06V40/172 — Human faces: classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a teaching audio and video pushing method based on student classroom behavior analysis, comprising the following steps: collecting classroom behavior videos of the students while collecting the classroom teaching audio and video; inputting the classroom behavior video into a trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result; and packaging the classroom teaching audio and video together with the recognition result and storing them in a file library of a teaching audio and video application platform. The invention solves the problem that videos on a teaching video application platform can only be pushed along coarse dimensions such as a video's additional information and playing statistics, which is not accurate enough and cannot meet users' personalized needs.

Description

Teaching audio and video pushing method and system based on student classroom behavior analysis
Technical Field
The invention relates to the field of video pushing, in particular to a teaching audio and video pushing method and system based on student classroom behavior analysis.
Background
With the rapid development of multimedia and network technology, teaching videos have become ever easier to produce and distribute, and have gradually become one of the main carriers of teaching resources. Existing video pushing methods mainly push either according to additional information added manually to a video, or according to the video's playing and browsing statistics. In the first method, text identifiers are added to the video by hand, such as the video title, the course it belongs to, the specialty, the teaching teacher, relevant knowledge points and a course introduction; a particular piece of additional information is then chosen as the pushing basis, for example pushing by course or by teaching teacher. In the second method, the browsing count, like count, favorite count and comment count of the video are recorded, and one of these playing statistics is chosen as the pushing basis, for example pushing by browsing count or by like count.
Both approaches have drawbacks. Course information must be added manually, which is labor-intensive; the dimensions are too coarse, so every user is pushed the same content; the accuracy is low, personalized needs go unmet, and the click-through rate of pushed videos is consequently very low. With teaching videos growing rapidly, a teaching resource platform needs to push videos to users in a targeted manner in order to raise the utilization rate of the videos. Conventional pushing technology, however, only works along coarse dimensions such as the course a video belongs to or the teacher giving the lesson. Such coarse-grained pushing is not precise enough: a user may, for example, only be interested in the video of a class he or she missed, and coarse-dimension pushing cannot recognize that need.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a teaching audio and video pushing method and system based on student classroom behavior analysis, which solves the problem that videos on a teaching video application platform can only be pushed along coarse dimensions such as a video's additional information and playing statistics, which is not accurate enough and cannot meet users' personalized needs.
In order to solve this technical problem, the invention provides a teaching audio and video pushing method based on student classroom behavior analysis, comprising the following steps: collecting classroom behavior videos of the students while collecting the classroom teaching audio and video; inputting the classroom behavior video into a trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result; packaging the classroom teaching audio and video together with the recognition result and storing them in a file library of a teaching audio and video application platform; acquiring the identity information of a user logging in to the teaching audio and video application platform; and screening the file library for identity recognition results consistent with the user's identity information, and pushing the classroom teaching audio and video corresponding to those identity recognition results to the user.
Preferably, the teaching audio and video pushing method based on student classroom behavior analysis further includes: acquiring timestamp information of the behavior recognition result and generating corresponding description information; and packaging the timestamp information and description information together with the classroom teaching audio and video and storing them in the file library of the teaching audio and video application platform.
Preferably, the method further includes: acquiring the identity information of the students present in the classroom; and packaging the students' identity information together with the classroom teaching audio and video and storing them in the file library of the teaching audio and video application platform.
Preferably, the method further includes: collecting the sign-in information of the corresponding students; and packaging the students' identity information and sign-in information together with the classroom teaching audio and video and storing them in the file library of the teaching audio and video application platform.
Preferably, the trained neural network model is imported as a plug-in directory.
The invention also provides a teaching audio and video pushing system based on student classroom behavior analysis, comprising a first collection module, a second collection module, a recognition model construction module, a packaging and storage module, an acquisition module, a screening module and a pushing module. The first collection module collects the audio and video data of classroom teaching, and the second collection module collects the video data of student behaviors. The recognition model construction module constructs and trains a neural network model and inputs the classroom behavior video into the trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result. The packaging and storage module packages the classroom teaching audio and video together with the recognition result and stores them in a file library of a teaching audio and video application platform. The acquisition module acquires the identity information of a user logging in to the teaching audio and video application platform. The screening module screens the file library for identity recognition results consistent with the user's identity information, and the pushing module pushes the classroom teaching audio and video corresponding to those identity recognition results to the user.
Preferably, the system further comprises a second acquisition module for acquiring timestamp information of the behavior recognition result and generating corresponding description information; the packaging and storage module is further used to package the timestamp information and description information together with the classroom teaching audio and video and store them in the file library of the teaching audio and video application platform.
Preferably, the system further comprises a third acquisition module for acquiring the identity information of the students present in the classroom; the packaging and storage module is further used to package the students' identity information together with the classroom teaching audio and video and store them in the file library of the teaching audio and video application platform.
Preferably, the system further comprises a third collection module for collecting the students' sign-in information; the packaging and storage module is further used to package the students' identity information and sign-in information together with the classroom teaching audio and video and store them in the file library of the teaching audio and video application platform.
Preferably, the system further comprises a recognition model importing module for importing the trained neural network model as a plug-in directory.
The beneficial effects of the implementation of the invention are as follows:
the invention collects the classroom teaching audio and video and the students' classroom behavior videos at the same time, recognizes each student's identity and behavior from the classroom behavior video, screens for identity recognition results consistent with the identity information of the user logging in to the teaching audio and video application platform, and pushes the corresponding classroom teaching audio and video to that user. This solves the problem that videos on a teaching video application platform can only be pushed along coarse dimensions such as a video's additional information and playing statistics, which is not accurate enough and cannot meet users' personalized needs.
Drawings
FIG. 1 is a flow chart of a teaching audio/video pushing method based on student classroom behavior analysis according to the present invention;
FIG. 2 is a schematic diagram of a P-Net network according to the present invention;
FIG. 3 is a schematic diagram of the structure of an R-Net network provided by the present invention;
FIG. 4 is a schematic diagram of the structure of an O-Net network provided by the present invention;
FIG. 5 is a schematic structural diagram of the FaceNet model provided by the present invention;
FIG. 6 is a schematic structural diagram of the OpenPose network provided by the present invention;
fig. 7 is a schematic diagram of a teaching audio and video pushing system based on student classroom behavior analysis provided by the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings. It should be noted that the invention is not intended to be limited to the specific embodiments set forth herein or shown in the drawings.
As shown in fig. 1, the invention provides a teaching audio and video pushing method based on student classroom behavior analysis, which comprises the following steps:
s101, collecting classroom behavior videos of students while collecting classroom teaching audios and videos.
It should be noted that the classroom teaching audio and video and the students' classroom behavior videos are collected at the same time to keep them time-consistent and to prepare for the subsequent processing.
S102, inputting the classroom behavior video into a trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result.
It should be noted that these models can be deployed in a production environment, so that the trained models perform inference on a real-time video stream to recognize the students' identities and behavior postures. The invention uses an AI algorithm training platform with GPU-based CUDA parallel computing hardware, and trains network models such as MTCNN, FaceNet and OpenPose on an image data set under deep learning frameworks such as TensorFlow and Caffe, tuning hyper-parameters such as the learning rate and the loss function, to obtain the trained neural network models.
Preferably, the trained neural network model is imported as a plug-in directory. The invention uses a GPU-optimized neural network and image processing algorithm library, imports the corresponding algorithm models as a plug-in directory, and can rapidly deploy a neural network model for AI inference without installing a deep learning framework. The inference engine greatly increases the running speed of the algorithms and achieves real-time video analysis.
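The plug-in directory idea above can be sketched as follows. This is an illustrative reconstruction, not the patent's actual implementation; the recognized file suffixes and the discovery function are assumptions:

```python
from pathlib import Path

# Hypothetical sketch: each trained model is dropped into a directory as a
# serialized file, and the inference service discovers and registers models
# by scanning that directory at startup, so no deep-learning framework needs
# to be installed on the serving host.
MODEL_SUFFIXES = {".engine", ".plan", ".onnx"}  # assumed plug-in formats

def discover_plugins(plugin_dir):
    """Map model name -> file path for every recognized plug-in file."""
    registry = {}
    for path in sorted(Path(plugin_dir).iterdir()):
        if path.suffix in MODEL_SUFFIXES:
            registry[path.stem] = path
    return registry
```

The serving process would then hand each registered file to its inference engine by name, without ever importing TensorFlow or Caffe.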
First, the MTCNN network model is imported for face detection. MTCNN is a multi-task convolutional neural network divided into a three-stage network structure: P-Net, R-Net and O-Net.
The basic structure of P-Net is a fully convolutional network (FCN), as shown in fig. 2. Initial feature extraction and frame calibration are performed by the FCN, after which bounding-box regression adjusts the candidate windows and NMS (non-maximum suppression) filters out most of them. P-Net is a region proposal network for face regions: after the input passes through the network's three convolution layers, a face classifier judges whether each region contains a face, and bounding-box regression together with a facial key-point locator produces preliminary face-region proposals. This stage finally outputs a number of regions where faces may exist, which are fed into R-Net for further processing.
As shown in fig. 3, the basic structure of R-Net is a convolutional neural network; compared with the first-stage P-Net, a fully connected layer is added, so the screening of input data is stricter. After a picture passes through P-Net, many prediction windows remain; all of them are fed to R-Net, which filters out a large number of poor candidate frames, and finally bounding-box regression and NMS are applied to the selected candidates to further optimize the predictions.
Because the output of P-Net is only a set of possible face regions with a certain confidence, R-Net refines the input selection and eliminates most erroneous inputs, again applying bounding-box regression and the facial key-point locator to the face regions, and finally outputs the more credible face regions for O-Net to use. Compared with the 1×1×32 feature map output by the fully convolutional P-Net, R-Net uses a 128-unit fully connected layer after its last convolution layer, retaining more image features and achieving better accuracy than P-Net.
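The NMS step used by P-Net and R-Net above can be sketched in plain Python. This is a generic greedy NMS; the box format (corner coordinates) and the overlap threshold are illustrative, not taken from the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring boxes, drop any candidate that
    overlaps an already-kept box by more than `thresh`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep
```

This is why most of the windows proposed by P-Net never reach R-Net: heavily overlapping detections of the same face collapse to the single best-scoring one.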
As shown in FIG. 4, the basic structure of O-Net is a more complex convolutional neural network, with one more convolution layer than R-Net. O-Net differs from R-Net in that this stage identifies face regions under more supervision and also regresses the person's facial feature points, finally outputting five facial landmarks. O-Net takes richer feature input, and its network structure ends with a larger 256-unit fully connected layer, retaining more image features. It simultaneously performs face judgment, face-region bounding-box regression and facial landmark localization, and finally outputs the top-left and bottom-right coordinates of the face region together with its five landmark points. With more feature input and a more complex structure, O-Net performs better, and its output is taken as the final output of the network model.
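The bounding-box regression used in all three stages can be illustrated as follows. The offset convention (corrections expressed as fractions of the box width and height) is a common MTCNN-style choice and an assumption here, not a detail confirmed by the patent:

```python
def refine_box(box, offsets):
    """Apply bounding-box regression offsets to a candidate box.

    `box` is (x1, y1, x2, y2); `offsets` are the network's predicted
    corrections (dx1, dy1, dx2, dy2), expressed as fractions of the
    box width and height.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx1, dy1, dx2, dy2 = offsets
    return (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)
```

Each stage predicts such offsets alongside its face/non-face score, so the candidate windows tighten around the true face as they pass from P-Net through O-Net.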
Second, the FaceNet network model is imported for face recognition. Through convolutional neural network learning, FaceNet maps images into a Euclidean space in which spatial distance is directly related to image similarity: different images of the same person lie at a small distance, while images of different people lie at a large distance. FaceNet trains the neural network directly with a triplet loss inspired by LMNN (large margin nearest neighbor classification), and the network directly outputs a 128-dimensional vector. The network model structure is shown in fig. 5: it comprises a batch input layer and a deep CNN, followed by L2 normalization, which produces the embedding feature vectors; the triplet loss is then applied for training.
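The embedding comparison and triplet loss described above can be sketched as follows. This is a simplified low-dimensional version for illustration; the real FaceNet computes 128-dimensional embeddings with a deep CNN:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, as FaceNet does before comparison."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: pull same-identity pairs together and push different
    identities apart by at least `margin` in squared embedding distance."""
    a, p, n = map(l2_normalize, (anchor, positive, negative))
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)
```

At inference time the platform would compare a detected face's embedding against the enrolled students' embeddings by this same distance, taking the nearest one (within some threshold) as the identity recognition result.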
Finally, the OpenPose network model is imported for behavior recognition. OpenPose uses a "bottom-up" approach: all human key points in the picture are detected first, and then the key points are assigned to the different human bodies. The network model is shown in fig. 6, where F is the set of feature maps produced by the first 10 layers of VGG-19; this set F serves as input to the first stage. The overall scheme is a "two-branch multi-stage CNN": one branch predicts the confidence maps (S) and the other predicts the Part Affinity Fields (L). Stage 1 of the network receives the features F as input, which then pass through the Branch 1 and Branch 2 networks to obtain S_1 and L_1 respectively. From stage 2 onward, the input to the stage-t network consists of three parts: S_{t-1}, L_{t-1} and F.
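The "bottom-up" assignment of detected key points to bodies relies on scoring candidate limbs against the Part Affinity Fields. A simplified line-integral score can be sketched as follows; the `paf` callable stands in for the predicted L maps, and the sampling count is arbitrary:

```python
def paf_score(paf, p1, p2, samples=10):
    """Score a candidate limb between key points p1 and p2 by sampling the
    Part Affinity Field along the segment and averaging the dot product of
    the field vector with the limb's unit direction.

    `paf` is a function (x, y) -> (vx, vy); a high score means the field
    flows along the segment, i.e. the two key points likely belong to the
    same person's limb.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0:
        return 0.0
    ux, uy = dx / norm, dy / norm
    total = 0.0
    for k in range(samples):
        t = k / (samples - 1)
        vx, vy = paf(p1[0] + t * dx, p1[1] + t * dy)
        total += vx * ux + vy * uy
    return total / samples
```

Key-point pairs are then matched greedily (or by bipartite matching) on these scores, which is how the detected joints are grouped into individual students.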
Through these three neural network models, a student's identity and behavior can be recognized. Based on the identity and behavior, the video can be labeled, providing a basis for screening and retrieving videos.
S103, packaging the classroom teaching audio and video together with the recognition result and storing them in a file library of the teaching audio and video application platform.
It should be noted that the invention first pre-processes the collected audio and video data, including audio and video encoding and decoding, video scaling and synthesis, audio and video storage and encapsulation, and audio and video synchronization, to form structured data stored in the file library of the teaching audio and video application platform. The classroom teaching audio and video is encoded into an MP4 video file and written to disk, and the label information of the recognition results is written into a MySQL database.
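The label store described above might look like the following sketch, using Python's built-in SQLite in place of the MySQL database named in the text; the table layout and field names are assumptions for illustration:

```python
import sqlite3

# In-memory SQLite as a stand-in for the MySQL label store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE video_label (
        video_path  TEXT,  -- MP4 file written to disk
        student_id  TEXT,  -- identity recognition result
        behavior    TEXT,  -- behavior recognition result
        ts_seconds  REAL   -- timestamp of the behavior within the video
    )
""")

def add_label(video_path, student_id, behavior, ts_seconds):
    """Record one recognition result against a stored video file."""
    conn.execute("INSERT INTO video_label VALUES (?, ?, ?, ?)",
                 (video_path, student_id, behavior, ts_seconds))

def labels_for(student_id):
    """All labeled moments for one student, in playback order."""
    cur = conn.execute(
        "SELECT video_path, behavior, ts_seconds FROM video_label "
        "WHERE student_id = ? ORDER BY ts_seconds", (student_id,))
    return cur.fetchall()
```

Keeping the video payload on disk and only the structured labels in the database is what makes the later identity-based screening a cheap indexed query rather than a scan of the media files.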
And S104, acquiring the identity information of the user logging in the teaching audio and video application platform.
It should be noted that each user has an account in the teaching audio/video application platform, and each account is associated with identity information of one user.
S105, screening the file library for identity recognition results consistent with the user's identity information, and pushing the classroom teaching audio and video corresponding to those results to the user.
It should be noted that by screening on identity information and pushing teaching audio and video to the relevant users, the invention achieves accurate, on-demand pushing and increases the utilization value of the teaching audio and video.
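Step S105's screening can be sketched as a simple filter over the packaged records; the record fields are illustrative, not taken from the actual platform:

```python
def videos_to_push(file_library, user_identity):
    """Return the videos whose identity recognition results include the
    logged-in user — i.e. the classes this student actually appears in."""
    return [record["video"]
            for record in file_library
            if user_identity in record["identities"]]

# Example: two packaged records, each pairing a stored video with the set
# of identity recognition results found in its classroom behavior video.
library = [
    {"video": "calculus_0301.mp4", "identities": {"s001", "s002"}},
    {"video": "physics_0302.mp4", "identities": {"s003"}},
]
```

With real data the same filter would run as a database query keyed on the identity column rather than a Python loop.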
According to the invention, the classroom teaching audio and video and the students' classroom behavior videos are collected at the same time, each student's identity and behavior are recognized from the classroom behavior video, identity recognition results consistent with the identity information of the user logging in to the teaching audio and video application platform are screened out, and the corresponding classroom teaching audio and video is pushed to that user. This solves the problem that videos on a teaching video application platform can only be pushed along coarse dimensions such as a video's additional information and playing statistics, which is not accurate enough and cannot meet users' personalized needs.
Preferably, the teaching audio and video pushing method based on student classroom behavior analysis further includes: acquiring timestamp information of the behavior recognition result and generating corresponding description information; and packaging the timestamp information and description information together with the classroom teaching audio and video and storing them in the file library of the teaching audio and video application platform.
It should be noted that adding timestamp information for the behavior recognition results and generating corresponding description information facilitates later screening and recommendation. The platform looks up the student information from the user's login account, retrieves matching records from the file library, and generates description information from the timestamps of the behavior recognition results (finding relevant videos along different dimensions such as absenteeism, dozing or inattention, and recommending them). A user can log in to the video application platform in either of two ways (browser or app); logging in automatically triggers the back end to initiate retrieval and recommendation, matching the user with recommended video lessons and their description information. When the video description information is clicked, playback automatically jumps to the corresponding time segment. For example, if a student dozed off or played with a mobile phone at the 10th minute of a video, playback for that user automatically starts from the tenth-minute segment, achieving targeted pushing.
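The jump-to-segment behavior can be illustrated with a small helper that finds the earliest timestamp of a given behavior in a student's event list; the event format is assumed for illustration:

```python
def first_event_offset(events, behavior):
    """Return the earliest timestamp (in seconds) at which `behavior`
    occurs in a student's event list, or None if it never occurs.

    `events` is a list of (behavior, ts_seconds) pairs taken from the
    behavior recognition results stored with the video.
    """
    times = [t for b, t in events if b == behavior]
    return min(times) if times else None
```

The player would then seek to the returned offset when the user clicks the description information, so a "dozed off at minute 10" entry opens the video at the 600-second mark.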
Preferably, the teaching audio and video pushing method based on student classroom behavior analysis further includes: acquiring identity information of the corresponding students in the classroom; and packaging the identity information of the student and the classroom teaching audio and video and storing the identity information and the classroom teaching audio and video in a file library of a teaching audio and video application platform.
Further, the teaching audio and video pushing method based on student classroom behavior analysis further comprises the following steps: collecting the sign-in information of the corresponding students; and packaging the students' identity information and sign-in information together with the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
It should be noted that, to prevent students who were not in class from failing to receive the corresponding pushed videos, the identity information of the corresponding students in the classroom is acquired, and the students' identity information is packaged with the classroom teaching audio and video and stored in a file library of the teaching audio and video application platform. The students' sign-in information (signed in, not signed in, or left after signing in) is added to the corresponding classroom teaching audio and video, and for a student who left after signing in, the leaving timestamp is also added. Therefore, when students log in to the teaching audio and video application platform, whether or not they attended class, they can receive the corresponding pushed videos in a targeted manner, and playback automatically jumps to the corresponding time segment once a video is opened, achieving accurate pushing.
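A minimal sketch of the sign-in record described above. The three attendance states and the leave timestamp for early leavers follow the text; the enum values and field names are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Attendance(Enum):
    SIGNED_IN = "signed_in"
    NOT_SIGNED_IN = "not_signed_in"
    LEFT_EARLY = "left_early"  # signed in, then left partway through class

@dataclass
class SignInRecord:
    student_id: str
    video_id: str
    status: Attendance
    # Seconds into the video; only set for LEFT_EARLY, per the text above.
    leave_timestamp: Optional[int] = None

# A student who left at minute 30 still receives the lesson video, and the
# player can jump to the leave timestamp when the video is opened.
rec = SignInRecord("s001", "lesson-01", Attendance.LEFT_EARLY, leave_timestamp=1800)
```

Packaging such a record alongside the video is what lets absent or early-leaving students still be matched to the right lesson and segment.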
As shown in fig. 2, the invention further provides a teaching audio and video pushing system based on student classroom behavior analysis, which comprises a first acquisition module 1, a second acquisition module 2, a recognition model construction module 3, an encapsulation storage module 4, a first obtaining module 5, a screening module 6 and a pushing module 7. The first acquisition module 1 is used for acquiring the audio and video data of classroom teaching, and the second acquisition module 2 is used for acquiring the video data of student behaviors. The recognition model construction module 3 is used for constructing and training a neural network model, and for inputting the classroom behavior video into the trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result. The encapsulation storage module 4 is used for encapsulating the classroom teaching audio and video together with the recognition result and storing them in a file library of a teaching audio and video application platform. The first obtaining module 5 is used for obtaining the identity information of the user logged in to the teaching audio and video application platform. The screening module 6 is used for screening, from the file library, the identity recognition result consistent with the user identity information. The pushing module 7 is used for pushing the classroom teaching audio and video corresponding to the identity recognition result to the user.
According to the invention, the classroom teaching audio and video and the students' classroom behavior video are simultaneously acquired through the first acquisition module and the second acquisition module; the students' identity recognition results and behavior recognition results are obtained from the classroom behavior video; the identity recognition result consistent with the identity information of the user logged in to the teaching audio and video application platform is then screened out; and the classroom teaching audio and video corresponding to that identity recognition result is pushed to the user. This solves the problem that videos on a teaching video application platform can only be pushed on coarse dimensions, such as a video's additional information and playback information, which is not accurate and cannot meet users' personalized needs.
Preferably, the teaching audio and video pushing system based on student classroom behavior analysis further comprises a second obtaining module 8, which is used for obtaining the timestamp information of the behavior recognition result and generating the corresponding description information; the packaging storage module 4 is further used for packaging the timestamp information, the description information and the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
It should be noted that adding the timestamp information of the behavior recognition result and generating the corresponding description information facilitates later screening and recommendation. In this system, the student information is looked up from the user's login account, the student's records are then retrieved from the file library, and description information is generated from the timestamps of the corresponding behavior recognition results (for example, along different dimensions such as absence, dozing and distraction, from which the relevant videos are found for recommendation). The user can log in to the video application platform in either of two ways (a browser or an APP); after login, the background is automatically triggered to initiate retrieval and recommendation, matching the recommended video courses relevant to the user together with the courses' description information. When the video description information is clicked, playback automatically jumps to the corresponding time segment. For example, if a student dozes or plays with a mobile phone at the 10th minute of a video, playback automatically starts from the tenth-minute segment when the user views the video, thereby achieving targeted pushing.
Preferably, the teaching audio and video pushing system based on student classroom behavior analysis further comprises a third obtaining module 9, which is used for obtaining the identity information of the corresponding students in the classroom; the packaging storage module 4 is further used for packaging the students' identity information and the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
Further, the teaching audio and video pushing system based on student classroom behavior analysis further comprises a third collection module 10, which is used for collecting the sign-in information of the corresponding students; the packaging storage module 4 is further used for packaging the students' identity information and sign-in information together with the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
It should be noted that, to prevent students who were not in class from failing to receive the corresponding pushed videos, the identity information of the corresponding students in the classroom is acquired, and the students' identity information is packaged with the classroom teaching audio and video and stored in a file library of the teaching audio and video application platform. The students' sign-in information (signed in, not signed in, or left after signing in) is added to the corresponding classroom teaching audio and video, and for a student who left after signing in, the leaving timestamp is also added. Therefore, when students log in to the teaching audio and video application platform, whether or not they attended class, they can receive the corresponding pushed videos in a targeted manner, and playback automatically jumps to the corresponding time segment once a video is opened, achieving accurate pushing.
Preferably, the teaching audio and video pushing system based on student classroom behavior analysis further comprises a recognition model importing module 11, which is used for importing the trained neural network model in the form of a plug-in directory.
It should be noted that a comparison image library of various classroom behaviors is preset in the neural network model, covering, for example, classroom behaviors such as lying on a desk dozing and lowering the head to play with a mobile phone.
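One way to picture matching a detected posture against such a preset behavior library is a nearest-exemplar comparison, here reduced to cosine similarity over toy feature vectors. The exemplar vectors, labels, and the similarity measure are all illustrative assumptions; the patent only states that a comparison image library is preset.

```python
import math

# Hypothetical per-behavior exemplar feature vectors (e.g. pooled pose features).
BEHAVIOR_LIBRARY = {
    "dozing_on_desk": [0.9, 0.1, 0.0],
    "head_down_phone": [0.1, 0.9, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def classify_behavior(features):
    """Return the preset behavior whose exemplar is most similar."""
    return max(BEHAVIOR_LIBRARY, key=lambda k: cosine(features, BEHAVIOR_LIBRARY[k]))

print(classify_behavior([0.85, 0.15, 0.05]))  # closest to dozing_on_desk
```

A real behavior posture model would learn this mapping rather than use fixed exemplars, but the screening role of the preset library is the same.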
In the invention, the first acquisition module and the second acquisition module may be a recording-and-broadcasting host, a camera and a sound pickup, but are not limited thereto. After the audio data and the video data are collected, the data are distributed. Upon receiving the data, the encapsulation storage module 4 encodes and compresses the audio data, and performs processing such as scaling, synthesis and encoding compression on the video data. A neural network model is then constructed and trained by the recognition model construction module 3 and used to process the classroom behavior video to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result. The recognition result and the corresponding classroom teaching audio and video are then encapsulated and stored in the file library. When the user logs in to the video application platform through a browser or the APP, the recommendation system is triggered to run: it confirms the user's identity information from the account, initiates a screening request, and retrieves the user's video content from the file library. After the screening is finished, personalized recommended content is generated, and the personalized recommended courses and their description information are presented on the user's client. Clicking the description information automatically jumps to the corresponding time segment for playback.
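The login-triggered tail of that flow can be condensed into a few lines: the login event resolves the account to a student identity, retrieval screens the file library, and the result pairs each video with its description entries. All of the structures and names below are illustrative assumptions, not the patent's own data model.

```python
# Hypothetical file library: each video maps to the recognized students
# and the timestamped description entries generated for them.
FILE_LIBRARY = {
    "lesson-01": {
        "students": {"s001"},
        "descriptions": [{"offset": 600, "text": "dozing at 10:00"}],
    },
    "lesson-02": {"students": {"s002"}, "descriptions": []},
}

ACCOUNTS = {"alice": "s001"}  # login account -> student identity

def on_login(account: str):
    """Triggered after login: look up the student, screen the library,
    and return recommended videos with their description information."""
    student_id = ACCOUNTS[account]
    return {
        vid: meta["descriptions"]
        for vid, meta in FILE_LIBRARY.items()
        if student_id in meta["students"]
    }

recs = on_login("alice")
print(recs)  # {'lesson-01': [{'offset': 600, 'text': 'dozing at 10:00'}]}
```

The client would render each description as a clickable entry and seek the player to its `offset`, which is the jump-to-segment behavior described above.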
In conclusion, the invention converts students' classroom behavior into structured data and then performs targeted data pushing based on that data, so the pushed data better matches users' real needs. Moreover, the data conversion and pushing are completed automatically by the system without manual compilation, making the process highly efficient.
While the present disclosure has been described in considerable detail and with particular reference to a few illustrative embodiments, it is not intended to be limited to any such details or embodiments or to any particular embodiment; rather, it is to be construed, with reference to the appended claims, so as to provide the broadest possible interpretation of those claims in view of the prior art and thereby effectively encompass the intended scope of the disclosure. Furthermore, the foregoing describes the disclosure in terms of embodiments foreseen by the inventor for which an enabling description was available, notwithstanding that insubstantial modifications of the disclosure not presently foreseen may nonetheless represent equivalents thereto.

Claims (10)

1. A teaching audio and video pushing method based on student classroom behavior analysis is characterized by comprising the following steps:
collecting classroom behavior videos of students while collecting classroom teaching audios and videos;
inputting the classroom behavior video into a trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result;
packaging the classroom teaching audio and video and the recognition result and storing the packaged classroom teaching audio and video and the recognition result in a file library of a teaching audio and video application platform;
acquiring user identity information for logging in the teaching audio and video application platform;
and screening the identity recognition result consistent with the user identity information from the file library, and pushing the classroom teaching audio and video corresponding to the identity recognition result to the user.
2. The teaching audio and video pushing method based on student classroom behavior analysis as recited in claim 1, further comprising:
acquiring timestamp information of the behavior recognition result and generating corresponding description information;
and packaging the timestamp information, the description information and the classroom teaching audio and video and storing the timestamp information, the description information and the classroom teaching audio and video in a file library of a teaching audio and video application platform.
3. The teaching audio and video pushing method based on student classroom behavior analysis as recited in claim 1, further comprising:
acquiring identity information of the corresponding students in the classroom;
and packaging the identity information of the student and the classroom teaching audio and video and storing the identity information and the classroom teaching audio and video in a file library of a teaching audio and video application platform.
4. The teaching audio and video pushing method based on student classroom behavior analysis as recited in claim 3, further comprising:
collecting the sign-in information of the corresponding students;
and packaging the student's identity information and sign-in information together with the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
5. The teaching audio and video pushing method based on student classroom behavior analysis as claimed in claim 1, wherein the trained neural network model is imported in a plug-in directory manner.
6. A teaching audio and video pushing system based on student classroom behavior analysis, characterized by comprising a first acquisition module, a second acquisition module, a recognition model construction module, an encapsulation storage module, a first obtaining module, a screening module and a pushing module;
the first acquisition module is used for acquiring audio and video data of classroom teaching, and the second acquisition module is used for acquiring video data of student behaviors;
the recognition model construction module is used for constructing and training a neural network model, and for inputting the classroom behavior video into the trained neural network model to obtain a recognition result, wherein the neural network model comprises a face detection model, a face recognition model and a behavior posture model, and the recognition result comprises an identity recognition result and a behavior recognition result;
the packaging storage module is used for packaging the classroom teaching audio and video and the recognition result and storing the packaged classroom teaching audio and video and the recognition result in a file library of a teaching audio and video application platform;
the first obtaining module is used for obtaining the identity information of a user logging in to the teaching audio and video application platform;
the screening module is used for screening, from the file library, the identity recognition result consistent with the user identity information;
the pushing module is used for pushing the classroom teaching audio and video corresponding to the identity recognition result to a user.
7. The teaching audio and video pushing system based on student classroom behavior analysis as claimed in claim 6, further comprising a second obtaining module, wherein the second obtaining module is configured to obtain timestamp information of the behavior recognition result and generate corresponding description information, and the encapsulation storage module is further configured to encapsulate the timestamp information, the description information and the classroom teaching audio and video and store them in a file library of a teaching audio and video application platform.
8. The teaching audio and video pushing system based on student classroom behavior analysis as claimed in claim 6, further comprising a third obtaining module, wherein the third obtaining module is used for obtaining the identity information of the corresponding students in the classroom, and the encapsulation storage module is further used for encapsulating the students' identity information and the classroom teaching audio and video and storing them in a file library of a teaching audio and video application platform.
9. The teaching audio and video pushing system based on student classroom behavior analysis according to claim 6, further comprising a third collection module, wherein the third collection module is configured to collect the sign-in information of the corresponding students, and the encapsulation storage module is further configured to encapsulate the students' identity information and sign-in information together with the classroom teaching audio and video and store them in a file library of a teaching audio and video application platform.
10. The teaching audio and video pushing system based on student classroom behavior analysis as claimed in claim 6, further comprising a recognition model import module, wherein the recognition model import module is used for importing the trained neural network model by means of a plug-in directory.
CN202110168340.5A 2021-02-07 2021-02-07 Teaching audio and video pushing method and system based on student classroom behavior analysis Pending CN112905811A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110168340.5A CN112905811A (en) 2021-02-07 2021-02-07 Teaching audio and video pushing method and system based on student classroom behavior analysis

Publications (1)

Publication Number Publication Date
CN112905811A true CN112905811A (en) 2021-06-04

Family

ID=76123639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110168340.5A Pending CN112905811A (en) 2021-02-07 2021-02-07 Teaching audio and video pushing method and system based on student classroom behavior analysis

Country Status (1)

Country Link
CN (1) CN112905811A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443226A (en) * 2019-08-16 2019-11-12 重庆大学 A kind of student's method for evaluating state and system based on gesture recognition
CN110992741A (en) * 2019-11-15 2020-04-10 深圳算子科技有限公司 Learning auxiliary method and system based on classroom emotion and behavior analysis


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626642A (en) * 2021-08-11 2021-11-09 赞同科技股份有限公司 Assembling method and system of video script semantic structure and electronic device
CN113626642B (en) * 2021-08-11 2023-08-25 赞同科技股份有限公司 Method, system and electronic device for assembling video script semantic structure


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination