CN112364737A - Facial expression recognition method, device and equipment for live webcast lessons - Google Patents


Info

Publication number
CN112364737A
CN112364737A
Authority
CN
China
Prior art keywords
facial expression
neural network
convolutional neural
network model
expression recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011193684.3A
Other languages
Chinese (zh)
Inventor
孙悦 (Sun Yue)
李天驰 (Li Tianchi)
王帅 (Wang Shuai)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Dianmao Technology Co Ltd
Original Assignee
Shenzhen Dianmao Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dianmao Technology Co Ltd filed Critical Shenzhen Dianmao Technology Co Ltd
Priority to CN202011193684.3A
Publication of CN112364737A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a facial expression recognition method, device and equipment for live webcast lessons. The method comprises: constructing an initial convolutional neural network model, optimizing it, and building a deep convolutional neural network model with a dual-channel fully-connected layer; acquiring facial expression training samples, training the deep convolutional neural network model on those samples, and generating a facial expression recognition model; and capturing facial expression images during a live webcast lesson, feeding them into the facial expression recognition model, and producing facial expression recognition results. The embodiment of the invention accounts for the influence of fully-connected layers of different scales on the expressive power of high-level semantic image features: a fused dual-channel fully-connected layer is designed, which strengthens the feature expression capability of the deep convolutional neural network model and improves facial expression recognition accuracy.

Description

Facial expression recognition method, device and equipment for live webcast lessons
Technical Field
The invention relates to the technical field of image processing, in particular to a facial expression recognition method, device and equipment for live webcast lessons.
Background
With the rise of the artificial intelligence industry, facial expression recognition based on deep learning has attracted growing attention. In a live webcast class in particular, analyzing students' facial expressions in the live video reveals their current attentiveness, which helps teachers manage the class and teach. In recent years deep learning has achieved excellent results in many computer vision tasks, such as image classification and face recognition, and facial expression recognition methods based on deep learning have followed. The feature information in a deep convolutional neural network (DCNN) is distributed hierarchically across the network: lower layers mainly capture texture and corner features, which are local to the image, while higher layers capture class-specific features better suited to complex tasks that require global information. As the layers deepen, features become more complex and global. The features extracted by the fully-connected layer are generally regarded as high-level features. Traditional DCNNs such as LeNet and AlexNet use a single-channel fully-connected layer, which retains only part of the "important" features of the last pooling layer and discards those deemed "less important". The extracted features are therefore limited in their power to express the image, and expression recognition accuracy is low.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
In view of the defects of the prior art, the present invention aims to provide a facial expression recognition method, device and equipment for live webcast lessons, solving the technical problem that prior-art facial expression recognition methods use a single-channel fully-connected layer, so that the features it extracts are limited in image expression capability and recognition accuracy is low.
The technical scheme of the invention is as follows:
a facial expression recognition method for a live webcast session, the method comprising:
constructing an initial convolutional neural network model, optimizing the initial convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer;
acquiring a facial expression training sample, training a deep convolutional neural network model according to the facial expression training sample, and generating a facial expression recognition model;
and acquiring a facial expression image in a network live broadcast course, inputting the facial expression image into a facial expression recognition model, and generating a facial expression recognition result.
Further, the constructing an initial convolutional neural network model, optimizing the initial convolutional neural network model, and constructing a deep convolutional neural network model of a dual-channel fully-connected layer includes:
constructing an initial convolutional neural network model, continuously convolving hidden layers of the initial convolutional neural network by adopting a minimum-scale convolution kernel, and then pooling;
optimizing the network internal structure of the pooled convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer.
Further preferably, the optimizing the network internal structure of the pooled convolutional neural network model includes:
and optimizing the network internal structure of the pooled convolutional neural network model according to the Maxout activating function and the Dropout algorithm.
Further preferably, the obtaining of the facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample, and generating the facial expression recognition model includes:
acquiring a facial expression training sample, and training a deep convolutional neural network model according to the facial expression training sample;
and in the training process, learning is carried out by adopting an A-Softmax algorithm, and a facial expression recognition model is generated according to the learning result.
Preferably, the constructing an initial convolutional neural network model, performing continuous convolution on the hidden layer of the initial convolutional neural network by using a minimum-scale convolution kernel, and then performing pooling includes:
constructing an initial convolutional neural network model, obtaining a convolutional layer of the convolutional neural network model, and using a 0-value filling technology in the convolutional layer;
and carrying out continuous convolution on the convolution layers of the filled convolution neural network by adopting a convolution kernel with the minimum scale, and then carrying out pooling.
Further, performing continuous convolution on hidden layers of the initial convolutional neural network by using a minimum-scale convolution kernel, and then performing pooling, includes:
and continuously convolving hidden layers of the initial convolutional neural network by using a filter convolution kernel of 3x3, and then pooling.
Further, the convolutional neural network is provided with a fully-connected fusion layer, and the network internal structure of the pooled convolutional neural network model is optimized according to the Maxout activation function and the Dropout algorithm, and the method further comprises the following steps:
and optimizing the fully-connected fusion layer of the pooled convolutional neural network model according to the Maxout activating function and the Dropout algorithm.
Another embodiment of the present invention provides a facial expression recognition device for a live webcast session, the device comprising:
the model building module is used for building an initial convolutional neural network model, optimizing the initial convolutional neural network model and building a deep convolutional neural network model of a double-channel full-connection layer;
the model training module is used for acquiring a facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample and generating a facial expression recognition model;
and the facial expression recognition module is used for acquiring a facial expression image in a live network course, inputting the facial expression image into the facial expression recognition model and generating a facial expression recognition result.
Another embodiment of the present invention provides a facial expression recognition apparatus for a live webcast lesson, the apparatus comprising at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described facial expression recognition method for a live webcast lesson.
Another embodiment of the present invention also provides a non-transitory computer-readable storage medium storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the above-mentioned facial expression recognition method for a live webcast session.
Advantageous effects: the embodiment of the invention accounts for the influence of fully-connected layers of different scales on the high-level semantic feature expression capability of the image; a fused dual-channel fully-connected layer is designed, which enhances the feature expression capability of the deep convolutional neural network model and improves facial expression recognition accuracy.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flowchart illustrating a preferred embodiment of a facial expression recognition method for live online lessons according to the present invention;
fig. 2 is a schematic network structure diagram of a face recognition model according to a specific application embodiment of the facial expression recognition method for live webcast lessons in the present invention;
FIG. 3a is a schematic diagram of parameters of each network layer of a face recognition model in the prior art;
fig. 3b is a schematic diagram of parameters of each network layer in a specific application embodiment of the facial expression recognition method for the live webcast lesson according to the present invention;
FIG. 4 is a functional block diagram of an embodiment of a facial expression recognition apparatus for live online lessons according to the present invention;
fig. 5 is a schematic diagram of a hardware structure of a facial expression recognition device for live webcast lessons according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and clearer, the present invention is described in further detail below. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. Embodiments of the present invention will be described below with reference to the accompanying drawings.
The embodiment of the invention provides a facial expression recognition method for a live webcast course. Referring to fig. 1, fig. 1 is a flowchart illustrating a facial expression recognition method for live webcast lessons according to a preferred embodiment of the present invention. As shown in fig. 1, it includes the steps of:
s100, constructing an initial convolutional neural network model, optimizing the initial convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer;
s200, obtaining a facial expression training sample, training a deep convolutional neural network model according to the facial expression training sample, and generating a facial expression recognition model;
step S300, collecting facial expression images in a live online course, inputting the facial expression images into a facial expression recognition model, and generating a facial expression recognition result.
In a specific implementation, the facial expression recognition algorithm of the embodiment recognizes the expressions of students in a live webcast class; from the recognized expressions, the students' attentiveness can be inferred, which facilitates subsequent monitoring of how well students are following the lesson.
An initial convolutional neural network model is constructed; its hidden layers are successively convolved with minimum-scale convolution kernels and then pooled, and the internal network structure is optimized to build a deep convolutional neural network model with a dual-channel fully-connected layer. Features of collected face images are fed into the deep convolutional neural network model for training, yielding a trained facial expression recognition model. Facial expression images are then captured by an image acquisition device, the face image to be recognized is input into the facial expression recognition model, and the recognized facial expression is generated.
Further, an initial convolutional neural network model is constructed, the initial convolutional neural network model is optimized, and a deep convolutional neural network model of a double-channel full-connection layer is constructed, and the method comprises the following steps:
constructing an initial convolutional neural network model, continuously convolving hidden layers of the initial convolutional neural network by adopting a minimum-scale convolution kernel, and then pooling;
optimizing the network internal structure of the pooled convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer.
In a specific implementation, an initial convolutional neural network model is constructed; small-scale convolution kernels are applied in successive convolutions over the hidden layers, followed by combined (max + average) pooling; the internal network structure is optimized; and the traditional single-channel fully-connected layer is replaced to construct a deep convolutional neural network (DCNN) model with a dual-channel fully-connected layer.
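The dual-channel fully-connected design can be illustrated with a minimal numpy sketch: two parallel fully-connected channels act on the same flattened pooled features, and their outputs are concatenated into a fusion layer. The input width (1152) and the per-channel widths (256 each, giving a 512-wide fusion, matching the 256 + 256 = 512 figure later in the description) are assumptions for illustration, not values fixed by the patent.

```python
import numpy as np

def fc(x, w, b):
    """One fully-connected channel: a linear map of the flattened pooled features."""
    return x @ w + b

def dual_channel_fc(x, w1, b1, w2, b2):
    """Two parallel fully-connected channels whose outputs are concatenated
    into a fusion layer, as in the dual-channel design described above."""
    f1 = fc(x, w1, b1)                        # channel F1, e.g. 256 units
    f2 = fc(x, w2, b2)                        # channel F2, e.g. 256 units
    return np.concatenate([f1, f2], axis=-1)  # fusion layer F3: 256 + 256 = 512

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 1152))            # flattened last pooling layer (width assumed)
w1, b1 = rng.standard_normal((1152, 256)), np.zeros(256)
w2, b2 = rng.standard_normal((1152, 256)), np.zeros(256)
fused = dual_channel_fc(x, w1, b1, w2, b2)
print(fused.shape)  # (1, 512)
```

The fusion step itself is a parameter-free concatenation, which is why the description later treats the F3 layer's parameters as non-trainable.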
Further, optimizing the network internal structure of the pooled convolutional neural network model, including:
and optimizing the network internal structure of the pooled convolutional neural network model according to the Maxout activating function and the Dropout algorithm.
In specific implementation, the internal structure of the network is optimized by combining the Maxout activation function and the Dropout technology.
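As a rough numpy sketch of these two techniques (not the patent's exact configuration): Maxout takes the maximum over k parallel linear pieces per output unit, and inverted Dropout randomly zeroes activations during training. The layer sizes and the number of pieces k = 3 are assumptions for illustration.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout activation: for each output unit, take the max over k linear
    pieces (W has shape [in_dim, out_dim, k], b has shape [out_dim, k])."""
    return np.max(np.einsum('ni,iok->nok', x, W) + b, axis=-1)

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero units with probability p during training,
    rescaling the survivors so the expected activation is unchanged."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 5, 3))   # 3 linear pieces per output unit (k assumed)
b = np.zeros((5, 3))
h = dropout(maxout(x, W, b), p=0.5, rng=rng)
print(h.shape)  # (4, 5)
```

With k = 1 Maxout reduces to a plain linear layer, so it strictly generalizes ReLU-style activations.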
Further, acquiring a facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample, and generating a facial expression recognition model, including:
acquiring a facial expression training sample, and training a deep convolutional neural network model according to the facial expression training sample;
and in the training process, learning is carried out by adopting an A-Softmax algorithm, and a facial expression recognition model is generated according to the learning result.
In particular, the A-Softmax loss is used during training: the angle serves as the distance measure, and combining the angular distance with the learned features enhances their discriminative power. This markedly improves network performance, strengthens feature extraction, reduces the number of parameters during training, and yields an effective facial expression recognition model.
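The angular idea behind A-Softmax can be sketched as follows. This is a simplified view (valid while m·θ stays below π), and the margin m = 4 is an assumed setting, not a value stated in the description: multiplying the angle to the target class by m makes the target logit harder to satisfy, pushing features into tighter angular clusters.

```python
import numpy as np

def angle(w, x):
    """Angle between a class-weight vector and a feature vector; A-Softmax
    classifies by this angle rather than by a raw inner product."""
    cos = np.dot(w, x) / (np.linalg.norm(w) * np.linalg.norm(x))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def a_softmax_logit(w, x, m=4):
    """Margin-modified target-class logit: the angle is multiplied by m
    (m = 4 is an assumed setting) before taking the cosine."""
    return np.linalg.norm(x) * np.cos(m * angle(w, x))

w = np.array([1.0, 0.0])                    # class-weight direction
x = np.array([np.cos(0.2), np.sin(0.2)])    # feature 0.2 rad off the class axis
plain = np.linalg.norm(x) * np.cos(angle(w, x))   # ordinary softmax logit
margin = a_softmax_logit(w, x, m=4)               # angular-margin logit
print(plain > margin)  # True: the margin logit is strictly harder to satisfy
```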
Further, constructing an initial convolutional neural network model, continuously convolving hidden layers of the initial convolutional neural network by adopting a minimum-scale convolution kernel, and then pooling, wherein the pooling comprises the following steps:
constructing an initial convolutional neural network model, obtaining a convolutional layer of the convolutional neural network model, and using a 0-value filling technology in the convolutional layer;
and carrying out continuous convolution on the convolution layers of the filled convolution neural network by adopting a convolution kernel with the minimum scale, and then carrying out pooling.
In a specific implementation, the dual-channel convolutional neural network, as shown in fig. 2, includes 5 convolutional layers, 3 pooling layers, a fully-connected fusion layer, and a dual-channel fully-connected layer. A 0-value padding technique is used at the convolutional layers, and two successive convolution operations are performed at the C2/C3 and C4/C5 convolutional layers, respectively. Combining max pooling and average pooling retains more diverse feature information.
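The combination of max and average pooling can be sketched in numpy. The equal-weight blend (alpha = 0.5) is an assumption, since the description does not fix how the two pooling results are combined; the point is that the blend keeps both the strongest response and the average response of each window.

```python
import numpy as np

def pool2x2(x, mode):
    """2x2, stride-2 pooling over a single-channel feature map."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    if mode == 'max':
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

def mixed_pool(x, alpha=0.5):
    """Blend max and average pooling (alpha = 0.5 is an assumed mixing weight)."""
    return alpha * pool2x2(x, 'max') + (1 - alpha) * pool2x2(x, 'avg')

x = np.array([[1., 2., 0., 0.],
              [3., 4., 0., 8.],
              [1., 1., 2., 2.],
              [1., 1., 2., 2.]])
print(mixed_pool(x))  # each output mixes the window's peak and its mean
```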
Further, performing continuous convolution on hidden layers of the initial convolutional neural network by using a minimum-scale convolution kernel, and then performing pooling, wherein the pooling comprises:
and continuously convolving hidden layers of the initial convolutional neural network by using a filter convolution kernel of 3x3, and then pooling.
In a specific implementation, as shown in fig. 2, two successive convolution operations are performed at the C2/C3 and C4/C5 convolutional layers, respectively, using filters of size 3×3. Combining max pooling and average pooling retains more diverse feature information.
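The motivation for stacking two 3×3 convolutions can be checked with a few lines: they cover the same 5×5 receptive field as a single 5×5 convolution while needing fewer weights per input/output channel pair.

```python
# Two successive 3x3 convolutions (stride 1) vs. one 5x5 convolution.
k_small, k_large = 3, 5
receptive = k_small + (k_small - 1)        # 3x3 followed by 3x3 spans a 5x5 field
weights_two_3x3 = 2 * k_small * k_small    # 18 weights per channel pair
weights_one_5x5 = k_large * k_large        # 25 weights per channel pair
print(receptive, weights_two_3x3, weights_one_5x5)  # 5 18 25
```

The stacked version also inserts an extra non-linearity between the two convolutions, which the single 5×5 layer lacks.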
Further, the convolutional neural network is provided with a fully-connected fusion layer, and the network internal structure of the pooled convolutional neural network model is optimized according to the Maxout activation function and the Dropout algorithm, and the method further comprises the following steps:
and optimizing the fully-connected fusion layer of the pooled convolutional neural network model according to the Maxout activating function and the Dropout algorithm.
In a specific implementation, Dropout is applied to the convolutional layers and the fully-connected layer respectively to prevent overfitting, and batch normalization is added after the convolutional layers to improve the generalization of the DCNN model.
As can be seen from fig. 3a and 3b, fig. 3a shows the network layer parameters of the DCNN model before modification, and fig. 3b shows those of the modified TCNN model. Using successive convolution at the C2 and C3 layers reduces the trainable parameters by (5×5×24×24 − 3×3×24×24×2) × 64 = 258048, and at the C4 and C5 layers by (5×5×12×12 − 3×3×12×12×2) × 128 = 129024, for a total reduction of 387072. Because the improved TCNN model uses two channels at the fully-connected layer, the trainable parameters of that layer increase by 256 + 256 = 512. In fig. 3b, the F3 layer is a feature fusion layer formed by fusing the F1 and F2 fully-connected layers, so its parameters are not trainable. Overall, the TCNN model optimizes the parameter count and reduces the number of trainable parameters in the network.
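The parameter arithmetic above can be reproduced directly, using the expressions exactly as the description states them:

```python
# Reproduce the trainable-parameter counts stated for the TCNN model.
c23_saved = (5*5*24*24 - 3*3*24*24*2) * 64    # C2/C3 successive convolution
c45_saved = (5*5*12*12 - 3*3*12*12*2) * 128   # C4/C5 successive convolution
fc_added  = 256 + 256                         # dual-channel fully-connected layer
print(c23_saved, c45_saved, c23_saved + c45_saved, fc_added)
# 258048 129024 387072 512
```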
The embodiments of the present invention provide a facial expression recognition method for live webcast lessons that improves network performance, increases feature extraction capability, and reduces the number of parameters during training. The method applies small-scale convolution kernels in successive convolutions over the hidden layers followed by combined (max + average) pooling, optimizes the internal network structure with the Maxout activation function and the Dropout technique, and replaces the traditional single-channel fully-connected layer to construct a DCNN model with a dual-channel fully-connected layer. The A-Softmax loss is used during training, with the angle as the distance measure; combining the angular distance with the learned features enhances discrimination. The result is a facial expression recognition model with good performance.
According to the embodiment of the invention, the influence of full-connection layers with different scales on the high-level semantic feature expression capability of the image is fully considered, the dual-channel fusion full-connection layer is designed, and the feature expression capability of the DCNN model is enhanced.
And a Maxout activation function is used for replacing a traditional ReLU activation function at a dual-channel full-connection layer, so that the network can express more accurate high-dimensional characteristic information.
Given that ideal facial features for FER should have a maximal intra-class distance smaller than the minimal inter-class distance, the A-Softmax loss is used during training so that the TCNN learns facial features separated by a geometrically interpretable angular margin.
It should be noted that, a certain order does not necessarily exist between the above steps, and those skilled in the art can understand, according to the description of the embodiments of the present invention, that in different embodiments, the above steps may have different execution orders, that is, may be executed in parallel, may also be executed interchangeably, and the like.
Another embodiment of the present invention provides a facial expression recognition apparatus for live online lessons, as shown in fig. 4, the apparatus 1 includes:
the model building module 11 is used for building an initial convolutional neural network model, optimizing the initial convolutional neural network model and building a deep convolutional neural network model of a double-channel full-connection layer;
the model training module 12 is used for acquiring a facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample, and generating a facial expression recognition model;
and the facial expression recognition module 13 is configured to collect facial expression images in a live network course, input the facial expression images into a facial expression recognition model, and generate a facial expression recognition result.
The specific implementation is shown in the method embodiment, and is not described herein again.
Another embodiment of the present invention provides a facial expression recognition apparatus for a live webcast session, as shown in fig. 5, the apparatus 10 includes:
one or more processors 110 and a memory 120, where one processor 110 is illustrated in fig. 5, the processor 110 and the memory 120 may be connected by a bus or other means, and where fig. 5 illustrates a connection by a bus.
Processor 110 is operative to implement various control logic of apparatus 10, which may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a single chip, an ARM (Acorn RISC machine) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. Also, the processor 110 may be any conventional processor, microprocessor, or state machine. Processor 110 may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The memory 120 is a non-volatile computer-readable storage medium, and can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions corresponding to the facial expression recognition method for live webcast lessons in the embodiment of the present invention. The processor 110 executes various functional applications and data processing of the device 10, namely, implements the facial expression recognition method for live webcast lessons in the above-described method embodiments, by running the nonvolatile software programs, instructions, and units stored in the memory 120.
The memory 120 may include a storage program area and a storage data area, wherein the storage program area may store an application program required for operating the device, at least one function; the storage data area may store data created according to the use of the device 10, and the like. Further, the memory 120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 120 optionally includes memory located remotely from processor 110, which may be connected to device 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more units are stored in the memory 120, and when executed by the one or more processors 110, perform the facial expression recognition method for webcast lessons in any of the above-described method embodiments, e.g., performing the above-described method steps S100 to S300 in fig. 1.
Embodiments of the present invention provide a non-transitory computer-readable storage medium storing computer-executable instructions for execution by one or more processors, for example, to perform method steps S100-S300 of fig. 1 described above.
By way of example, non-volatile storage media can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The disclosed memory components or memory of the operating environment described herein are intended to comprise one or more of these and/or any other suitable types of memory.
Another embodiment of the present invention provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the facial expression recognition method for webcast lessons of the above-described method embodiment. For example, the method steps S100 to S300 in fig. 1 described above are performed.
The above-described embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a general hardware platform, and may also be implemented by hardware. Based on such understanding, the above technical solutions essentially or contributing to the related art can be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Conditional language such as "can," "might," or "may," unless specifically stated otherwise or otherwise understood within the context as used, is generally intended to convey that particular embodiments include particular features, elements, and/or operations, while other embodiments do not. Thus, such conditional language is not generally intended to imply that features, elements, and/or operations are in any way required for one or more embodiments, or that one or more embodiments necessarily include logic for deciding, with or without input or prompting, whether such features, elements, and/or operations are included or are to be performed in any particular embodiment.
What has been described herein in the specification and drawings includes examples that can provide a facial expression recognition method and apparatus for live webcast lessons. It is, of course, not possible to describe every conceivable combination of components and/or methodologies for the purpose of describing the various features of the disclosure, but it can be appreciated that many further combinations and permutations of the disclosed features are possible. It is therefore evident that various modifications can be made to the disclosure without departing from its scope or spirit. Other embodiments of the disclosure may also be apparent from consideration of the specification and drawings and from practice of the disclosure as presented herein. The examples set forth in this specification and the drawings are to be considered in all respects illustrative and not restrictive. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (10)

1. A facial expression recognition method for a live webcast course is characterized by comprising the following steps:
constructing an initial convolutional neural network model, optimizing the initial convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer;
acquiring a facial expression training sample, training a deep convolutional neural network model according to the facial expression training sample, and generating a facial expression recognition model;
and acquiring a facial expression image in a network live broadcast course, inputting the facial expression image into a facial expression recognition model, and generating a facial expression recognition result.
2. The facial expression recognition method for the live webcast class according to claim 1, wherein the constructing of the initial convolutional neural network model, the optimizing of the initial convolutional neural network model, and the constructing of the deep convolutional neural network model of the double-channel full-connection layer comprise:
constructing an initial convolutional neural network model, performing continuous convolution on the hidden layers of the initial convolutional neural network using a minimum-scale convolution kernel, and then performing pooling;
optimizing the network internal structure of the pooled convolutional neural network model, and constructing a deep convolutional neural network model of a double-channel full-connection layer.
3. The facial expression recognition method for the live webcast class according to claim 2, wherein the optimizing the network internal structure of the pooled convolutional neural network model comprises:
and optimizing the network internal structure of the pooled convolutional neural network model according to the Maxout activation function and the Dropout algorithm.
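The Maxout-plus-Dropout optimization of claim 3 can be illustrated with a minimal numpy sketch. The function names, tensor shapes, and the inverted-dropout variant below are illustrative assumptions, not the patent's actual implementation: a Maxout unit takes the element-wise maximum over k affine pieces, and Dropout randomly zeroes activations during training while rescaling the survivors.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout unit: element-wise maximum over k affine pieces.
    x: (batch, d_in); W: (d_in, d_out, k); b: (d_out, k)."""
    z = np.einsum("bi,iok->bok", x, W) + b  # (batch, d_out, k)
    return z.max(axis=-1)

def dropout(x, p=0.5, rng=None, train=True):
    """Inverted dropout: zero each activation with probability p,
    rescale survivors by 1/(1-p) so expected activation is unchanged."""
    if not train or p == 0.0:
        return x
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

rng = np.random.default_rng(42)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 16, 3))
b = np.zeros((16, 3))
h = dropout(maxout(x, W, b), p=0.5, rng=rng)
print(h.shape)  # (4, 16)
```

Because Maxout learns a piecewise-linear activation rather than a fixed one, it pairs naturally with Dropout as a model-averaging regularizer, which is presumably the motivation for combining them here.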
4. The facial expression recognition method for the live webcast class according to claim 3, wherein the obtaining of the facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample, and generating the facial expression recognition model comprises:
acquiring a facial expression training sample, and training a deep convolutional neural network model according to the facial expression training sample;
and in the training process, learning is carried out by adopting an A-Softmax algorithm, and a facial expression recognition model is generated according to the learning result.
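The A-Softmax learning step of claim 4 refers to the angular-margin softmax introduced by SphereFace. A minimal numpy sketch of how the target-class logit is sharpened follows; it assumes an integer margin m and omits the monotonic psi(theta) extension that the full method applies when m*theta exceeds pi, so it is only valid for small angles:

```python
import numpy as np

def a_softmax_logits(x, W, y, m=2):
    """Sketch of A-Softmax (angular softmax) logits.
    x: (batch, d) features; W: (d, n_classes) weights, columns normalized;
    y: (batch,) integer target labels; m: integer angular margin.
    The target-class logit becomes ||x|| * cos(m * theta), which demands a
    smaller angle to the target weight vector than plain softmax does."""
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)
    xnorm = np.linalg.norm(x, axis=1, keepdims=True)
    cos = (x @ Wn) / np.clip(xnorm, 1e-12, None)        # cos(theta), (batch, C)
    logits = xnorm * cos
    idx = np.arange(len(y))
    theta = np.arccos(np.clip(cos[idx, y], -1.0, 1.0))  # target-class angles
    logits[idx, y] = xnorm[:, 0] * np.cos(m * theta)    # apply angular margin
    return logits
```

Ordinary softmax cross-entropy over these modified logits then pushes each feature to within roughly 1/m of the angular spread that plain softmax would tolerate, which tightens intra-class clusters; this sketch only shows the logit modification, not the training loop.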
5. The facial expression recognition method for the live webcasting class according to claim 4, wherein the constructing of the initial convolutional neural network model and the performing of pooling after continuous convolution of the hidden layers of the initial convolutional neural network using a minimum-scale convolution kernel comprise:
constructing an initial convolutional neural network model, obtaining a convolutional layer of the convolutional neural network model, and using a 0-value filling technology in the convolutional layer;
and carrying out continuous convolution on the convolution layers of the filled convolution neural network by adopting a convolution kernel with the minimum scale, and then carrying out pooling.
6. The facial expression recognition method for the live webcast class according to claim 5, wherein the performing of pooling after continuous convolution of the hidden layers of the initial convolutional neural network using a minimum-scale convolution kernel comprises:
and continuously convolving hidden layers of the initial convolutional neural network by using a filter convolution kernel of 3x3, and then pooling.
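The preference for stacked "minimum-scale" 3x3 kernels in claims 5 and 6 follows the VGG-style observation that two stacked 3x3 convolutions cover the same 5x5 receptive field as one 5x5 kernel with fewer weights (2x9 = 18 versus 25 per channel pair) and an extra non-linearity in between. A single-channel numpy sketch of zero-padded ("SAME") convolution followed by 2x2 max pooling, illustrative only since the patent does not specify the network's dimensions here:

```python
import numpy as np

def conv2d_same(img, k):
    """Single-channel 2D convolution (cross-correlation, as is conventional
    in deep learning) with 0-value filling so output size equals input size."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))  # zero padding
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * k)
    return out

def max_pool2(x):
    """2x2 max pooling with stride 2 (assumes even spatial dimensions)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
k3 = np.ones((3, 3)) / 9.0                     # a 3x3 (minimum-scale) kernel
feat = conv2d_same(conv2d_same(img, k3), k3)   # two stacked 3x3 convolutions
pooled = max_pool2(feat)
print(feat.shape, pooled.shape)  # (6, 6) (3, 3)
```

The zero padding is what lets the two convolutions be stacked without shrinking the feature map, so pooling alone controls the spatial downsampling.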
7. The facial expression recognition method for live webcast lessons according to claim 6, wherein the convolutional neural network is provided with a fully-connected fusion layer, and the optimizing of the network internal structure of the pooled convolutional neural network model according to the Maxout activation function and the Dropout algorithm further comprises:
and optimizing the fully-connected fusion layer of the pooled convolutional neural network model according to the Maxout activation function and the Dropout algorithm.
8. A facial expression recognition apparatus for a live webcast session, the apparatus comprising:
the model building module is used for building an initial convolutional neural network model, optimizing the initial convolutional neural network model and building a deep convolutional neural network model of a double-channel full-connection layer;
the model training module is used for acquiring a facial expression training sample, training the deep convolutional neural network model according to the facial expression training sample and generating a facial expression recognition model;
and the facial expression recognition module is used for acquiring a facial expression image in a live network course, inputting the facial expression image into the facial expression recognition model and generating a facial expression recognition result.
9. A facial expression recognition device for a live webcast class, the device comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7 for facial expression recognition in a live online class.
10. A non-transitory computer-readable storage medium storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the method of facial expression recognition for a live webcast session of any one of claims 1-7.
CN202011193684.3A 2020-10-30 2020-10-30 Facial expression recognition method, device and equipment for live webcast lessons Pending CN112364737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011193684.3A CN112364737A (en) 2020-10-30 2020-10-30 Facial expression recognition method, device and equipment for live webcast lessons

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011193684.3A CN112364737A (en) 2020-10-30 2020-10-30 Facial expression recognition method, device and equipment for live webcast lessons

Publications (1)

Publication Number Publication Date
CN112364737A true CN112364737A (en) 2021-02-12

Family

ID=74514219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011193684.3A Pending CN112364737A (en) 2020-10-30 2020-10-30 Facial expression recognition method, device and equipment for live webcast lessons

Country Status (1)

Country Link
CN (1) CN112364737A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569975A (en) * 2021-08-04 2021-10-29 华南师范大学 Sketch work rating method and device based on model fusion
CN113688714A (en) * 2021-08-18 2021-11-23 华南师范大学 Method, device, equipment and storage medium for identifying multi-angle facial expressions

Citations (3)

Publication number Priority date Publication date Assignee Title
CN106529503A (en) * 2016-11-30 2017-03-22 华南理工大学 Method for recognizing face emotion by using integrated convolutional neural network
CN109272107A (en) * 2018-08-10 2019-01-25 广东工业大学 A method of improving the number of parameters of deep layer convolutional neural networks
KR20190123372A (en) * 2018-04-12 2019-11-01 가천대학교 산학협력단 Apparatus and method for robust face recognition via hierarchical collaborative representation

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN106529503A (en) * 2016-11-30 2017-03-22 华南理工大学 Method for recognizing face emotion by using integrated convolutional neural network
KR20190123372A (en) * 2018-04-12 2019-11-01 가천대학교 산학협력단 Apparatus and method for robust face recognition via hierarchical collaborative representation
CN109272107A (en) * 2018-08-10 2019-01-25 广东工业大学 A method of improving the number of parameters of deep layer convolutional neural networks

Non-Patent Citations (1)

Title
ZHANG LINLIN: "Research on Facial Expression Recognition Based on Convolutional Neural Network", CNKI (China National Knowledge Infrastructure), no. 09, 15 September 2019 (2019-09-15), pages 1-4 *

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN113569975A (en) * 2021-08-04 2021-10-29 华南师范大学 Sketch work rating method and device based on model fusion
CN113688714A (en) * 2021-08-18 2021-11-23 华南师范大学 Method, device, equipment and storage medium for identifying multi-angle facial expressions
CN113688714B (en) * 2021-08-18 2023-09-01 华南师范大学 Multi-angle facial expression recognition method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
KR102071582B1 (en) Method and apparatus for classifying a class to which a sentence belongs by using deep neural network
CN107273936B (en) GAN image processing method and system
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
CN111160350B (en) Portrait segmentation method, model training method, device, medium and electronic equipment
CN111950656B (en) Image recognition model generation method and device, computer equipment and storage medium
CN111598190B (en) Training method of image target recognition model, image recognition method and device
WO2021184902A1 (en) Image classification method and apparatus, training method and apparatus, device, and medium
CN112347248A (en) Aspect-level text emotion classification method and system
CN112801146A (en) Target detection method and system
CN107784316A (en) A kind of image-recognizing method, device, system and computing device
US20220237917A1 (en) Video comparison method and apparatus, computer device, and storage medium
CN112364737A (en) Facial expression recognition method, device and equipment for live webcast lessons
CN112381763A (en) Surface defect detection method
CN109858022A (en) A kind of user's intension recognizing method, device, computer equipment and storage medium
CN112529146A (en) Method and device for training neural network model
CN112748941A (en) Feedback information-based target application program updating method and device
CN113705715B (en) Time sequence classification method based on LSTM and multi-scale FCN
CN117036834B (en) Data classification method and device based on artificial intelligence and electronic equipment
CN114169501A (en) Neural network compression method and related equipment
CN111783473B (en) Method and device for identifying best answer in medical question and answer and computer equipment
CN112465847A (en) Edge detection method, device and equipment based on clear boundary prediction
CN116758331A (en) Object detection method, device and storage medium
CN115346084B (en) Sample processing method, device, electronic equipment, storage medium and program product
CN112749797A (en) Pruning method and device for neural network model
CN112465848A (en) Semantic edge detection method, device and equipment based on dynamic feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination