Artificial intelligence-based method for intelligently generating highlight videos of children in a kindergarten
Technical Field
The invention relates to the fields of computer vision and artificial intelligence, and in particular to an artificial intelligence-based method for intelligently generating highlight videos of children in a kindergarten.
Background
With the continued spread of compulsory education and the increasingly busy work schedules of parents, more and more children are sent to kindergartens to study and play at a very young age. Because of the natural bond between parents and children, parents are eager to know how their children spent the day at the kindergarten and do not want to miss the children's memorable moments in daily kindergarten life, such as moments of laughter, of concentrated study, of crying after an occasional tumble, or of eager participation in group activities.
To meet this expectation of parents, a real and urgent need arises: filming the children during their day at the kindergarten, capturing their various memorable moments of mood and activity, generating a highlight clip video of each child's day, and sending the video to the child's parents at the end of the day. Parents can thereby better understand their children's learning and play, and can also place more trust in the kindergarten's teaching and care. Producing such videos manually faces many difficulties: filming would consume a large amount of manpower; not every child could be covered, so omissions would be inevitable; and manually finding and clipping the most valuable moments from a large volume of footage would likewise consume an enormous amount of manpower.
Disclosure of Invention
The invention aims to provide an artificial intelligence-based method for intelligently generating highlight videos of children in a kindergarten, so as to solve the problems described in the background art.
To achieve this purpose, the invention provides an artificial intelligence-based method for intelligently generating highlight videos of children in a kindergarten, comprising the following steps:
S1, deploying a video acquisition system; the video acquisition system comprises a plurality of cameras and a background processing computer, the cameras are installed at different positions in the kindergarten, and the background processing computer receives the video data acquired by the cameras;
S2, enabling the face capture function of each camera, capturing faces, and tracking moving faces;
S3, identity recognition: according to a face data set of the kindergarten's teachers and students prepared in advance, performing identity recognition on the captured face images, and recording the times at which each face appears to a background database;
S4, detecting the highlight moments of each student to form a highlight video of each student;
S5, detecting the highlight moments of all students together to form highlight videos of the students' collective activities or learning;
and S6, automatically sending the generated highlight video of each student to the mobile phone of that student's parents according to the information configured in the background database, publishing the generated highlight video of the students' collective activities or learning on the class homepage, and sharing a link to the highlight video with the parents.
Further, the background processing computer in step S2 performs real-time face detection and face motion trajectory tracking; the face detection algorithm is a deep learning-based face detection algorithm, and after a face is detected, the moving face is tracked using a Kalman filtering-based method.
Further, in step S3, the captured faces are compared with the background database to identify the identity information of each face; for each teacher and student, the background face database records at least 5 images: frontal, upward, downward, left and right views.
Further, in step S4, for each student, the table of times at which the student's face appears is retrieved from the background database, n time points are selected by random or uniform sampling, the student's expression at each time point is recognized, and the video from 5 seconds before to 5 seconds after each time point is extracted to form 10-second highlight video clips.
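The clip-window selection in step S4 can be sketched as follows. This is a minimal illustration, not the invention's implementation: the appearance times, the random seed, and the `highlight_windows` helper are assumptions made up for the example.

```python
# Sample n time points from one student's recorded face-appearance times,
# then take a window from 5 s before to 5 s after each point, clamping
# to the video bounds (appearance times here are invented example data).
import random

def highlight_windows(appearance_times, n=6, half=5.0, video_end=None, seed=0):
    """Return up to n (start, end) clip windows of about 2*half seconds."""
    rng = random.Random(seed)
    times = sorted(appearance_times)
    picked = times if len(times) <= n else sorted(rng.sample(times, n))
    windows = []
    for t in picked:
        start = max(0.0, t - half)          # clamp to the start of the video
        end = t + half
        if video_end is not None:
            end = min(end, video_end)       # clamp to the end of the video
        windows.append((start, end))
    return windows

# The face appeared at these second offsets during the day's recording.
times = [3.0, 120.0, 121.5, 300.0, 450.0, 451.0, 600.0, 7200.0]
wins = highlight_windows(times, n=6, video_end=7202.0)
```

Each returned window is then cut from the stored footage to form one 10-second clip (shorter only when clamped at the video boundaries).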
Further, a student's highlight moments include states in which the student exhibits a happy, sad, frightened, angry, or quiet emotion or expression; expression recognition is used to detect the students' expressions, and video data shot by multiple cameras that detect the same face at the same moment are combined to form each student's highlight video clip.
Further, forming the highlight video clip of each student comprises:
(1) preparing a student expression classification data set;
(2) adopting a deep convolutional neural network that fuses facial expression feature extraction and expression classification into a single end-to-end network;
(3) classifying the expression of each student's face using the trained deep convolutional neural network; retrieving, according to the video timestamps, whether other cameras captured the same face at the same moment; and, if more than one camera captured a student's highlight moment simultaneously, scaling the clips from the multiple cameras and arranging them side by side, so that the finally generated highlight video contains multiple side-by-side views of the same scene shot from different angles.
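The side-by-side arrangement in item (3) can be sketched as a simple layout computation. This is an illustrative assumption about how the tiling could work, not the invention's actual compositor; the frame sizes and target height are made-up example values.

```python
# Scale each camera's frames to a common height and place them next to
# each other horizontally, returning the placement of every view.

def side_by_side_layout(frame_sizes, target_height=360):
    """Given (width, height) per camera, return the scaled placements and
    the total width of the tiled output frame."""
    placements = []
    x = 0
    for w, h in frame_sizes:
        scale = target_height / h          # preserve each view's aspect ratio
        new_w = round(w * scale)
        placements.append({"x": x, "width": new_w, "height": target_height})
        x += new_w                         # next view starts where this ends
    return placements, x

# Three cameras with different native resolutions captured the same moment.
sizes = [(1920, 1080), (1280, 720), (640, 480)]
placements, total_width = side_by_side_layout(sizes)
```

A video compositor would then crop or letterbox each clip into its computed slot for the 10-second output.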
Further, in step S5, highlight detection for all students specifically comprises: sampling video frames of a collective activity in which the number of detected faces is not less than 20, performing expression recognition on each detected face, determining the overall expression attribute by a vote over all expression labels, and finally cutting the video from 5 seconds before to 5 seconds after that moment into a 10-second highlight video clip.
Compared with the prior art, the invention has the following beneficial effects:
(1) In the method for intelligently generating highlight-moment videos of children in a kindergarten, cameras are installed in the kindergarten's classrooms, rest rooms, activity rooms, playground and other places, and the video data shot by the cameras are transmitted to a processing computer. Computer vision methods identify the children appearing in the video; artificial intelligence algorithms such as computer vision and machine learning automatically capture highlight-moment video clips of each child, automatically generate a short video of the child's highlight moments of the day, and automatically send it to a parent's mobile phone according to information configured in the background. The system also supports generating group or class highlight-moment videos, helping parents better understand their children's performance in group activities and helping the kindergarten and its teachers supervise and review the day's activities.
(2) The method for intelligently generating highlight-moment videos of children in a kindergarten saves a large amount of manpower and time while meeting the needs of parents, teachers, the kindergarten and other parties, and therefore has important practical significance.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages, which will be described in further detail below with reference to the drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow diagram of the artificial intelligence-based method for intelligently generating highlight-moment videos of children in a kindergarten.
Detailed Description
Embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways as defined and covered by the claims.
Referring to FIG. 1, the present embodiment provides an artificial intelligence-based method for intelligently generating highlight-moment videos of children in a kindergarten, comprising the following steps:
Step 1, deploying a highlight video acquisition system. The system comprises a plurality of cameras and a background processing computer; the cameras are installed at different positions in the kindergarten, and the background processing computer receives the video data acquired by the cameras. Cameras are installed in several places such as the kindergarten's classrooms, rest rooms, activity rooms and playground, and the video data they shoot are transmitted to the background processing computer, which is connected to storage devices such as hard disks.
Step 2, enabling the face capture function of each camera, capturing faces and tracking moving faces. A specific implementation can be as follows: (1) purchase cameras that support face snapshot and detection, such as a Hikvision light-intelligent 71-series dome network camera or a Honeywell HICC-2600T-FC 2MP face-capture bullet network camera; such cameras provide face detection, face tracking and related functions and can meet the requirements of practical applications; (2) the background processing computer provides real-time face detection and trajectory tracking, where face detection uses a deep learning-based algorithm such as CenterFace, and once a face is detected, the moving face is tracked using a Kalman filtering-based method.
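The Kalman filtering-based tracking in step 2 can be sketched with a constant-velocity filter applied to a detected face's center coordinate. This is a minimal illustration under assumed noise parameters and invented detections, not the invention's tracker.

```python
# Constant-velocity Kalman filter for one coordinate of a face center.
# Noise levels (q, r) and the simulated detections are illustrative.

class Kalman1D:
    def __init__(self, pos, dt=1.0, q=1e-2, r=1.0):
        self.x = [pos, 0.0]                 # state: [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.dt, self.q, self.r = dt, q, r  # step, process/measurement noise

    def predict(self):
        dt = self.dt
        self.x = [self.x[0] + dt * self.x[1], self.x[1]]
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        self.P = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x[0]

    def update(self, z):
        s = self.P[0][0] + self.r           # innovation covariance
        k0 = self.P[0][0] / s               # Kalman gains for pos, vel
        k1 = self.P[1][0] / s
        y = z - self.x[0]                   # innovation (detection - prediction)
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p00, p01 = self.P[0]
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [self.P[1][0] - k1 * p00, self.P[1][1] - k1 * p01],
        ]
        return self.x[0]

# Track a face center moving rightward ~3 px/frame with noisy detections.
kx = Kalman1D(pos=100.0)
estimates = []
for meas in [103.4, 105.8, 109.1, 112.2, 114.9, 118.3]:
    kx.predict()
    estimates.append(kx.update(meas))
```

In practice one filter per coordinate (or a joint 4-state filter on the box center) smooths the detections and predicts the face position when a detection is briefly missed, which is what allows associating the same face across consecutive frames.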
Step 3, identity recognition: for each captured face, identity recognition is performed according to the face data set of the kindergarten's teachers and students prepared in advance, and the time at which the face appears is recorded to the background database. The background face database records at least 5 images of each teacher and student (frontal, upward, downward, left and right views); the captured faces are compared with the background database to identify the identity information of each face.
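The identity comparison in step 3 can be sketched as nearest-neighbor matching of face embeddings against each person's enrolled gallery. This is an illustrative assumption: the embedding vectors, the names, and the 0.6 threshold are made-up stand-ins for the output of a real face-embedding model.

```python
# Assign a captured face the identity of its most similar gallery
# embedding; fall back to "unknown" below a similarity threshold.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify(query, gallery, threshold=0.6):
    """Return (name, score) of the best match, or ("unknown", score)."""
    best_name, best_score = "unknown", -1.0
    for name, embeddings in gallery.items():
        for emb in embeddings:              # several enrolled poses per person
            score = cosine(query, emb)
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return "unknown", best_score
    return best_name, best_score

# Toy gallery: two enrolled people with (here 3-dimensional) embeddings.
gallery = {
    "xiao_ming": [[0.9, 0.1, 0.0], [0.85, 0.2, 0.05]],
    "xiao_hong": [[0.1, 0.9, 0.1], [0.0, 0.95, 0.2]],
}
name, score = identify([0.88, 0.15, 0.02], gallery)
```

Enrolling the five poses per person described above makes this comparison more robust to the head orientations seen in candid kindergarten footage.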
Step 4, detecting highlight moments for each student. For each student, the table of times at which the student's face appears is retrieved from the background database, n time points are selected by random or uniform sampling, the expression at each time point is recognized, and the video from 5 seconds before to 5 seconds after each time point is extracted to form 10-second highlight video clips; preferably, n is 6. A student's highlight moments include 5 emotional or expressive states: happy (laughing or smiling), sad (tearful), frightened, angry (shouting) and quiet (a learning state). The students' expressions are detected by expression recognition, and video data shot by multiple cameras that detect the same face at the same moment are combined to form each student's highlight video clip. The method comprises the following steps:
(1) Prepare a student expression classification data set. Collect images of students' expressions in the five states of happy (e.g. laughing or smiling), sad (e.g. tearful), frightened, angry (e.g. shouting) and quiet (e.g. a learning state), normalize the data, and manually annotate the corresponding expression labels.
(2) Adopt a deep convolutional neural network that fuses facial expression feature extraction and expression classification into a single end-to-end network. VGG19 and ResNet18 can each be used to recognize and classify the expressions. Each block of VGG19 consists of a convolutional layer, a BatchNorm layer, a ReLU layer and an average pooling layer; each ResNet18 module consists of two convolutional layers and two BatchNorm layers, with a shortcut connection between its input and output. A dropout strategy is added before the fully connected layer to improve the robustness of the model. Finally, the multiple fully connected layers of the conventional VGG19 and ResNet18 are removed, and the 5 expression classes are recognized directly after a single fully connected layer.
(3) Classify the expression of each student's face using the training results of the above steps; retrieve, according to the video timestamps, whether other cameras captured the same face at the same moment; and, if more than one camera simultaneously captured a student's highlight moment, scale the clips from the multiple cameras and arrange them side by side, so that the finally generated highlight-moment video contains multiple side-by-side views of the same scene shot from different angles. The duration of each video clip is set to 10 seconds.
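The data flow of the end-to-end classifier described in item (2), convolutional feature extraction followed by a single fully connected layer over the 5 expression classes, can be sketched in miniature. The random weights, filter count, and input size below are illustrative assumptions only; the invention uses a trained VGG19 or ResNet18, not this toy network.

```python
# Minimal numpy forward pass: one 3x3 conv + ReLU, global average
# pooling, then a single fully connected layer and softmax over 5 classes.
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 5  # happy, sad, frightened, angry, quiet

def conv3x3(image, kernels):
    """Valid 3x3 conv: (H, W) image, (C, 3, 3) kernels -> (C, H-2, W-2)."""
    h, w = image.shape
    out = np.zeros((kernels.shape[0], h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = image[i:i + 3, j:j + 3]
            out[:, i, j] = (kernels * patch).sum(axis=(1, 2))
    return out

def forward(image, kernels, fc_w, fc_b):
    feat = np.maximum(conv3x3(image, kernels), 0.0)   # conv + ReLU
    pooled = feat.mean(axis=(1, 2))                   # global average pool
    logits = fc_w @ pooled + fc_b                     # the single FC layer
    e = np.exp(logits - logits.max())                 # stable softmax
    return e / e.sum()

image = rng.standard_normal((48, 48))                 # grayscale face crop
kernels = rng.standard_normal((8, 3, 3)) * 0.1        # 8 conv filters
fc_w = rng.standard_normal((NUM_CLASSES, 8)) * 0.1
fc_b = np.zeros(NUM_CLASSES)
probs = forward(image, kernels, fc_w, fc_b)
```

The output is a probability distribution over the five expression states; the argmax gives the predicted expression label used in item (3).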
Step 5, detecting highlight moments for all students, that is, for moments of collective student activity or learning. Specifically, video frames of the collective activity are sampled, with the number of detected faces in a frame required to be not less than 20; expression recognition is performed on each detected face; the overall expression attribute is determined by a vote over all expression labels; and finally the video from 5 seconds before to 5 seconds after that moment is cut into a 10-second highlight video clip.
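The vote in step 5 can be sketched as a simple majority over the per-face expression labels, applied only when the 20-face requirement is met. The example labels are invented detections for illustration.

```python
# A sampled frame qualifies only if at least 20 faces were detected;
# the overall expression attribute is the majority label among them.
from collections import Counter

MIN_FACES = 20

def collective_expression(labels):
    """Return the majority expression label, or None if too few faces."""
    if len(labels) < MIN_FACES:
        return None
    return Counter(labels).most_common(1)[0][0]

# 22 faces detected in one sampled frame of a group activity.
frame_labels = ["happy"] * 12 + ["quiet"] * 6 + ["sad"] * 3 + ["angry"]
overall = collective_expression(frame_labels)
```

Frames whose vote yields a highlight-worthy attribute (e.g. a predominantly happy group) then anchor the 10-second collective clip.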
Step 6, automatically sending the generated highlight video of each student to the mobile phone of that student's parents according to the information configured in the background database, publishing the collective highlight video on the class homepage, and sharing a link to the highlight video with the parents.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its protection scope.