CN113076813A - Mask face feature recognition model training method and device - Google Patents

Mask face feature recognition model training method and device

Info

Publication number
CN113076813A
CN113076813A (application number CN202110272296.2A)
Authority
CN
China
Prior art keywords
video
feature
facial
frame sequence
recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110272296.2A
Other languages
Chinese (zh)
Other versions
CN113076813B (en)
Inventor
Xu Erhe (许二赫)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuanwu Hospital
Institute of Computing Technology of CAS
Original Assignee
Xuanwu Hospital
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuanwu Hospital, Institute of Computing Technology of CAS filed Critical Xuanwu Hospital
Priority to CN202110272296.2A priority Critical patent/CN113076813B/en
Publication of CN113076813A publication Critical patent/CN113076813A/en
Application granted granted Critical
Publication of CN113076813B publication Critical patent/CN113076813B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a mask face feature recognition model training method and device. The method comprises the following steps: acquiring a sample facial feature video and a corresponding evaluation label, the sample facial feature video being a video of a user performing facial actions according to a set rule; extracting image frames of the sample facial feature video to form a frame sequence; performing a difference operation on adjacent frames of the frame sequence, in frame order, to obtain difference images; extracting a feature matrix of each difference image and combining the feature matrices of the difference images in frame order to obtain a video feature matrix; and training a mask face feature recognition model with the video feature matrix and the corresponding evaluation label. Compared with prior-art methods that determine mask face features by directly extracting features from the face itself, the method simplifies the calculation, enables the mask face feature recognition model to be established quickly, and achieves better accuracy.

Description

Mask face feature recognition model training method and device
Technical Field
The application relates to the technical field of machine learning, in particular to a training method and device for a mask face feature recognition model.
Background
A mask face (masked facies) refers to a dull, expressionless facial state caused by suppressed activity of the facial expression muscles, which persists even when an expression is intentionally made. Although it cannot be used directly as a diagnostic basis for Parkinson's disease, a large number of existing cases show a strong association between the mask face and neurodegenerative diseases such as Parkinson's disease, so it can serve as a preliminary screening basis for Parkinson's disease and other diseases.
With the popularization of intelligent terminals such as smart phones, health-monitoring work that originally required operation by professionals can now be carried out by intelligent terminals through processing of the data they collect. For example, a mask face can be identified by shooting a video of the user's face on the intelligent terminal and processing it with a deep learning algorithm; in this case, the core issue is the applicability and accuracy of the algorithm used to process the acquired data.
At present, there are schemes that acquire a sample facial feature video of a user's face with an intelligent terminal and establish a related algorithm model by deep learning, but the core of such an algorithm is to identify the features of every frame in the video, so the amount of calculation is large; in practical application, the motion state of the tester while shooting the feature video also has a great influence on the recognition result.
Disclosure of Invention
Based on the problems identified in the analysis of prior-art schemes, the application provides a mask face feature recognition model training method and device, and a mask face recognition method.
In one aspect, the application provides a method for training a mask face feature recognition model, comprising:
acquiring a sample facial feature video and a corresponding evaluation label; the sample facial feature video is a video formed by a user executing operation according to a set rule;
extracting image frames of the sample facial feature video to form a frame sequence;
according to the frame sequence, carrying out differential operation on adjacent frames in the frame sequence to obtain a differential image;
extracting a feature matrix of each differential image; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
and training the mask face feature recognition model by adopting the video feature matrix and the corresponding evaluation label.
Optionally, extracting image frames of the sample facial feature video to form a frame sequence comprises: dividing the sample facial feature video according to a set rule to obtain a sample divided video;
extracting image frames of each sample sub-video to form a corresponding frame sequence;
combining the feature matrices of the differential images according to the frame sequences to obtain a video feature matrix, wherein the video feature matrix comprises: and combining the characteristic matrix of each differential image according to the arrangement sequence of the sample sub-videos and the corresponding frame sequences to obtain the video characteristic matrix.
Optionally, the setting rule includes at least two facial movements and execution time of each of the facial movements;
dividing the sample facial feature video according to a set rule to obtain the sample divided video comprises: dividing the sample facial feature video according to the facial movements and the corresponding execution times to obtain the sample divided video.
Optionally, the facial action includes closing both eyes, relaxing and looking straight ahead, smiling and exposing teeth.
Optionally, extracting image frames of the sample facial feature video to form a frame sequence comprises:
extracting a face image region of the image frame;
combining the facial image regions in the order of the image frames to form the frame sequence.
Optionally, the feature matrix of the difference image includes at least two feature parameters;
combining the feature matrices of the differential images according to the frame sequences to obtain a video feature matrix, wherein the video feature matrix comprises:
extracting the same characteristic parameter of each characteristic matrix according to the frame sequence, and combining to form a vector with the same parameter;
and combining the vectors with the same parameters to obtain the video feature matrix.
In another aspect, the present application provides a mask face recognition method, including:
acquiring a facial feature video to be evaluated; the to-be-evaluated facial feature video is a video formed by a user executing operation according to a set rule;
extracting image frames of the facial feature video to be evaluated to form a frame sequence;
according to the frame sequence, carrying out differential operation on adjacent frames in the frame sequence to obtain a differential image;
extracting a feature matrix of each differential image; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
and processing the video characteristic matrix by adopting the mask face characteristic identification model obtained by the method to obtain a mask face degree evaluation result.
In another aspect, the present application provides a mask face recognition model training device, including:
the source data acquisition unit is used for acquiring a sample facial feature video and a corresponding evaluation label; the sample facial feature video is a video formed by a user executing operation according to a set rule;
a frame extraction unit for extracting image frames of the sample facial feature video to form a frame sequence;
the difference processing unit is used for carrying out difference operation on adjacent frames in the frame sequence according to the frame sequence to obtain a difference image;
a feature determination unit configured to extract a feature matrix of each of the difference images; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
and the model training unit is used for training the mask face feature recognition model by adopting the video feature matrix and the corresponding evaluation label.
According to the mask face feature recognition model training method, after a difference image is obtained by performing difference operation on adjacent frames in a frame sequence, the difference image is processed to extract a feature matrix, and the feature matrix representing a sample face feature video is obtained based on the feature matrix of the difference image.
Because the feature matrix representing the sample facial feature video is determined from the difference images, it reflects how the video content changes over time, that is, how slowly and how controllably the tester's facial expression changes under the set rule, and thus reflects the tester's mask face characteristics. Compared with prior-art methods that determine mask face features by directly extracting features from the face itself, this method simplifies the calculation, enables rapid establishment of the mask face feature recognition model, and achieves better accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below; it is obvious that those skilled in the art can obtain other drawings from these drawings without inventive labor.
FIG. 1 is a flowchart of a mask face feature recognition model training method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a mask face recognition method provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a mask face recognition device according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
wherein: 11-source data acquisition unit, 12-frame extraction unit, 13-difference processing unit, 14-feature determination unit, 15-model training unit, 21-processor, 22-memory, 23-communication interface, 24-bus system.
Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein; it is to be understood that the embodiments described in this specification are only some embodiments of the present application and not all embodiments.
The embodiment of the application provides a mask face feature recognition model training method, in which training data for training the mask face feature recognition model are obtained by a new method and the model is trained with the training data.
Fig. 1 is a flowchart of a mask face feature recognition model training method according to an embodiment of the present disclosure. As shown in fig. 1, the method provided by the embodiment of the present application includes steps S101 to S105.
S101: and acquiring a sample facial feature video and a corresponding evaluation label.
The sample facial feature video may be acquired by a tester using a mobile terminal such as a smart phone, or may be acquired by using a dedicated image acquisition device, which is not particularly limited in the embodiments of the present application.
In the embodiment of the application, the sample facial feature video is a video formed by a tester executing facial actions according to a set rule; the set rules comprise parameters such as the type of the facial action of the tester, the execution time of the facial action and the like. When obtaining a sample facial feature video, a tester must autonomously control facial muscles to perform corresponding actions according to set rules.
In the embodiment of the present application, the setting rule may include one facial action and corresponding execution time, or may include a plurality of facial actions and corresponding execution times.
In one specific application, the set rule includes three facial movements: closing both eyes, relaxing and looking straight ahead, and smiling and exposing the teeth. All three facial movements are designed around characteristics of the mask face and are used to observe the degree of facial rigidity and the sluggishness of facial changes of the tester, wherein: closing both eyes is used to observe the degree of relaxation of the tester's face, relaxing and looking straight ahead is used to observe the condition of the tester's eyes, and smiling and exposing the teeth is used to observe the condition of the tester's mouth corners and eyes.
In order to capture each facial movement reasonably and to reflect the degree of stiffness and the sluggishness of change of the face, each facial movement should be performed for an appropriate length of time. In one application of the embodiment of the present application, the execution time of each facial movement is 5 s, and the three facial movements are performed consecutively.
In the embodiment of the application, the evaluation label is obtained by a professional viewing the sample facial feature video or observing the corresponding facial features of the test subject. In one application, the evaluation labels may include five grades: normal, slight, mild, moderate, and severe. Normal corresponds to a normal facial expression; slight corresponds to a slightly reduced blink frequency; mild corresponds to a reduced blink frequency together with reduced lower-face expression (e.g., the lips do not part during a spontaneous smile); moderate corresponds to the lips being parted some of the time while the face is at rest; severe corresponds to the lips being parted most of the time while the face is at rest.
S102: image frames of a sample facial feature video are extracted to form a sequence of frames.
In the embodiment of the present application, after the sample facial feature video is obtained, it needs to be split into frames to form a frame sequence for subsequent processing.
In practical applications, the strategy for forming the frame sequence may differ according to the frame rate of the sample facial feature video. If the frame rate is low, all image frames of the sample facial feature video can be used directly as the frames of the frame sequence; if the frame rate is high, interval sampling can be used to extract part of the frames of the sample facial feature video as the frame sequence.
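For illustration only (not part of the original disclosure), this frame extraction with optional interval sampling could be sketched in Python with OpenCV as follows; the function name and the stride parameter are assumptions introduced for the example.

```python
import cv2

def extract_frame_sequence(video_path, stride=1):
    """Read a facial feature video and return a list of frames.

    stride=1 keeps every frame (low frame rate); a larger stride
    performs interval sampling (high frame rate).
    """
    capture = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % stride == 0:
            frames.append(frame)
        index += 1
    capture.release()
    return frames
```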
In practical application, when the tester captures the sample facial feature video, the image acquisition device (such as the user's smartphone) may shift relative to the user's face, so that the position of the face within the image moves.
In order to extract valid information and exclude unnecessary information, in step S102 the face region of the user in each image frame may be determined, the face region extracted and the other regions discarded, and the extracted region used as the pixel content of the corresponding frame in the frame sequence. For example, in one application of the embodiment of the present application, the face region of each image frame may be determined by an edge recognition method or a deep learning method, and the face region is brought to a size of 64 × 64 pixels by interpolation or pixel merging, so as to meet the consistency requirement of subsequent processing.
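A minimal sketch of this face-region extraction, assuming an OpenCV Haar-cascade detector, grayscale conversion, and resizing to 64 × 64 pixels; the patent does not prescribe a particular detector, so the cascade file and the rule of keeping the largest detected face are assumptions made for the example.

```python
import cv2

# Assumed detector; the patent only requires "edge recognition or deep learning".
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(frame, size=64):
    """Return the largest detected face as a size x size grayscale patch,
    or None if no face is found in this frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # Keep the largest face region and discard the rest of the frame.
    x, y, w, h = max(faces, key=lambda box: box[2] * box[3])
    return cv2.resize(gray[y:y + h, x:x + w], (size, size))
```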
In practical application of the embodiment of the present application, if the set rule in step S101 includes three actions, that is, the sample facial feature video contains the video content of a plurality of actions, then in step S102 the sample facial feature video may first be divided according to the set rule (that is, the execution time of each facial action in the set rule) to obtain sample sub-videos, and each sample sub-video is then split into frames to form the frame sequence corresponding to each facial action. Of course, in other embodiments, the frame sequence may be determined first and then divided according to the set rule, as sketched below.
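A sketch of the variant mentioned at the end of the preceding paragraph, that is, dividing an already extracted frame sequence into per-action sub-sequences; the 30 fps frame rate, the action names, and the 5 s duration follow the example given earlier but are otherwise assumptions.

```python
def split_by_actions(frames, fps=30,
                     actions=("close_eyes", "relax_gaze", "smile"),
                     seconds_per_action=5):
    """Split a frame list into one sub-sequence per facial action,
    assuming the actions were performed back to back."""
    per_action = fps * seconds_per_action
    segments = {}
    for k, action in enumerate(actions):
        segments[action] = frames[k * per_action:(k + 1) * per_action]
    return segments
```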
In the embodiment of the application, in order to improve processing efficiency, when the sample facial feature video is a color video it can be converted into a grayscale video before the frame sequence is obtained; alternatively, the frame sequence can be acquired first and its content then converted to grayscale.
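If the conversion is done after frame extraction, it can be applied per frame, for example as in this trivial sketch using OpenCV's color conversion (the function name is an assumption):

```python
import cv2

def to_grayscale(frames):
    """Convert a list of BGR color frames to single-channel grayscale frames."""
    return [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
```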
S103: and carrying out difference operation on adjacent frames in the frame sequence according to the frame sequence to obtain a difference image.
In step S103, the difference operation on adjacent frames of the frame sequence subtracts the gray values of the corresponding pixels of two adjacent frames and takes the absolute value of the result as the gray value of the corresponding pixel of the difference image. It is easy to see that the difference operation captures the change between adjacent frames, so the difference image reflects the change of the user's facial features.
Specifically, if there are k frame sequences, then for the m-th frame sequence the difference image between the t-th frame and the (t+1)-th frame is computed pixel by pixel as

$D_t^m(i, j) = \left| I_{t+1}^m(i, j) - I_t^m(i, j) \right|$

where $I_t^m(i, j)$ denotes the gray value of pixel $(i, j)$ in the t-th image frame of the m-th frame sequence.
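A direct NumPy rendering of this pixel-wise absolute difference, as a sketch; the signed-integer cast is added so the subtraction does not wrap around in unsigned 8-bit arithmetic.

```python
import numpy as np

def difference_images(frame_sequence):
    """Absolute difference of each pair of adjacent grayscale frames."""
    diffs = []
    for prev, nxt in zip(frame_sequence[:-1], frame_sequence[1:]):
        d = np.abs(nxt.astype(np.int16) - prev.astype(np.int16)).astype(np.uint8)
        diffs.append(d)
    return diffs
```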
S104: extracting a characteristic matrix of each differential image; and combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix.
After determining the difference image, feature extraction may be performed on the difference image to determine a feature matrix characterizing the difference image.
In this embodiment of the application, the feature matrix of the difference image may include the following parameters: information entropy, maximum, minimum, mean, variance, skewness, kurtosis, median, range, first quartile (Q1), third quartile (Q3), interquartile range, and correlation coefficient; these parameters are used to construct the feature matrix of the difference image. In practical applications, the feature matrix of the difference image may be just one feature vector, with the parameters arranged in a set order.
Assume that the proportion of pixels taking each gray value in a difference image is $p_i$; the information entropy is then

$H = -\sum_i p_i \log p_i$

Let $x_i$ ($i = 1, \dots, N$) denote the gray values of all pixels in a difference image, with mean $\mu$ and standard deviation $\sigma$. The remaining statistical parameters are:

maximum $\max = \max_i x_i$; minimum $\min = \min_i x_i$;

mean $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$; variance $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$;

skewness $\mathrm{Skew} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \mu}{\sigma}\right)^3$; kurtosis $\mathrm{Kurt} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \mu}{\sigma}\right)^4$;

range $\mathrm{Ptp} = \max - \min$; interquartile range $DQ = Q_3 - Q_1$;

correlation coefficient between adjacent difference images $x_t$ and $x_{t+1}$: $\rho = \mathrm{Cov}(x_t, x_{t+1}) / (\sigma_{x_t}\,\sigma_{x_{t+1}})$, where $\mathrm{Cov}(x_t, x_{t+1}) = E(x_t x_{t+1}) - E(x_t)\,E(x_{t+1})$.

The median, $Q_1$ and $Q_3$ are the 50th, 25th and 75th percentiles of the gray values, respectively.
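These thirteen parameters can be computed per difference image with NumPy and SciPy. The sketch below is illustrative only; the 256 histogram bins and base-2 logarithm for the entropy, and the pairing of each difference image with the next one for the correlation coefficient, are assumptions not fixed by the text.

```python
import numpy as np
from scipy import stats

def difference_features(diff, next_diff=None):
    """Feature vector (13 parameters) of one difference image."""
    x = diff.astype(np.float64).ravel()
    hist, _ = np.histogram(diff, bins=256, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    entropy = -np.sum(p * np.log2(p))
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    # Correlation with the next difference image (0 if this is the last one).
    if next_diff is not None:
        corr = np.corrcoef(x, next_diff.astype(np.float64).ravel())[0, 1]
    else:
        corr = 0.0
    return np.array([entropy, x.max(), x.min(), x.mean(), x.var(),
                     stats.skew(x), stats.kurtosis(x), median,
                     x.max() - x.min(), q1, q3, q3 - q1, corr])
```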
After the feature matrices of the difference images are obtained, they are combined in frame order to obtain the video feature matrix. In practical application, to obtain the video feature matrix from the feature matrices of the difference images, the same feature parameter can be extracted from the feature matrix of each difference image and the values ordered according to the order of the video frames to obtain the corresponding feature vector; the feature vectors corresponding to all the parameters are then combined into the video feature matrix.
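Reusing the difference_features helper sketched above, the video feature matrix can be assembled by stacking the per-difference-image vectors in frame order, so that each column traces one parameter over time. This is an illustrative sketch, not the patent's prescribed implementation.

```python
import numpy as np

def video_feature_matrix(diffs):
    """Rows follow frame order; each column is one feature parameter over time."""
    rows = []
    for t, d in enumerate(diffs):
        nxt = diffs[t + 1] if t + 1 < len(diffs) else None
        rows.append(difference_features(d, nxt))
    return np.vstack(rows)
```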
In some applications of the embodiment of the application, when the sample facial feature video is split into the plurality of sample sub-videos, the video feature matrix of each sample sub-video may be calculated first, and then the feature matrices of each sample sub-video may be combined into the feature matrix of the sample facial feature video.
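For the multi-action case, a sketch that combines the per-sub-video matrices in the order of the sub-videos, again reusing the hypothetical helpers above:

```python
import numpy as np

def combined_feature_matrix(sub_video_frame_sequences):
    """Concatenate the video feature matrices of the sample sub-videos in order."""
    parts = [video_feature_matrix(difference_images(seq))
             for seq in sub_video_frame_sequences]
    return np.vstack(parts)
```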
S105: and training a mask face feature recognition model by adopting the video feature matrix and the corresponding evaluation label.
After the video feature matrix and the corresponding mask face evaluation label are determined, step S105 trains the mask face feature recognition model on the association between them. In the embodiment of the present application, the mask face feature recognition model may be a model widely used in the field of machine learning, for example a support vector machine model.
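Since a support vector machine is named as one possible model, a training sketch with scikit-learn could look as follows; flattening each video feature matrix into a fixed-length vector, the label encoding, and the SVC hyperparameters are assumptions made only for the example.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed encoding of the five evaluation grades.
LABELS = {"normal": 0, "slight": 1, "mild": 2, "moderate": 3, "severe": 4}

def train_mask_face_model(video_matrices, labels):
    """video_matrices: list of (frames x parameters) arrays of equal shape."""
    X = np.stack([m.ravel() for m in video_matrices])
    y = np.array([LABELS[label] for label in labels])
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    model.fit(X, y)
    return model
```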
As can be seen from the foregoing description and analysis in steps S101 to S105, in the method for training a mask face feature recognition model provided in the embodiment of the present application, after a difference image is obtained by performing a difference operation on adjacent frames in a frame sequence, the difference image is processed to extract a feature matrix, and a feature matrix representing a sample face feature video is obtained based on the feature matrix of the difference image. Because the feature matrix of the video representing the facial features of the sample is determined based on the differential image, the change condition of the video content along with time is represented, the delay degree and the controllable degree of the change of the facial expression of the tester according to the set rule are reflected, and the facial mask characteristics of the tester can be further reflected.
Compared with the method for determining the facial mask features by directly extracting the features by using the facial features of the human face in the prior art, the method can simplify calculation, realize quick establishment of a facial mask feature recognition model and achieve better accuracy.
In addition, in some specific applications of the embodiment of the application, the sample facial feature video is captured while the tester performs actions according to the set rule, and the set rule corresponds to the test index features; therefore the feature matrix of the sample facial feature video also corresponds to those test index features, which makes the mask face feature recognition model more accurate.
After the mask face feature recognition model is obtained, the model can be embedded in an APP application and the APP distributed to the corresponding user terminals; a user terminal processes the collected facial feature video to obtain a video feature matrix, which is used as the model input to obtain the corresponding classification result.
Based on the mask face feature recognition model obtained by the training method described above, the embodiment of the application further provides a mask face recognition method. Fig. 2 is a flowchart of a mask face recognition method according to an embodiment of the present disclosure. As shown in fig. 2, the mask face recognition method provided by the embodiment of the present application includes steps S201 to S205.
S201: acquiring a facial feature video to be evaluated; the to-be-evaluated facial feature video is a video formed by a user executing operation according to a set rule.
S202: image frames of a face evaluation video to be evaluated are extracted to form a frame sequence.
S203: and carrying out difference operation on adjacent frames in the frame sequence according to the frame sequence to obtain a difference image.
S204: extracting a characteristic matrix of each differential image; and combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix.
The implementation of steps S201-S204 is substantially the same as that of steps S101-S104, except that what is processed is the facial feature video to be evaluated and no evaluation label needs to be determined; for the specific operation of these steps, refer to the foregoing description, which is not repeated here.
S205: and processing the video characteristic matrix by adopting a mask face characteristic identification model to obtain a mask face degree evaluation result.
In step S205, the video feature matrix determined in step S204 is input into the mask face feature recognition model determined above, so that the mask face degree evaluation result is obtained. In the case where the evaluation labels of the model are normal, slight, mild, moderate, and severe, the mask face degree evaluation result is one of these labels.
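Pulling the pieces together, an end-to-end inference sketch for steps S201 to S205 might look as follows; it reuses the hypothetical helpers introduced above and omits the optional per-action splitting for brevity.

```python
def evaluate_mask_face(video_path, model):
    """Return the mask face degree label for one facial feature video."""
    frames = [crop_face(f) for f in extract_frame_sequence(video_path)]
    frames = [f for f in frames if f is not None]      # drop frames without a face
    diffs = difference_images(frames)                  # S203
    features = video_feature_matrix(diffs).ravel()     # S204
    grade = model.predict(features.reshape(1, -1))[0]  # S205
    inverse = {v: k for k, v in LABELS.items()}
    return inverse[int(grade)]
```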
In addition to the aforementioned mask face feature recognition model training method, the embodiment of the present application further provides a mask face feature recognition model training device. Fig. 3 is a schematic structural diagram of this training device according to an embodiment of the present application; as shown in fig. 3, the mask face feature recognition model training apparatus includes a source data acquisition unit 11, a frame extraction unit 12, a difference processing unit 13, a feature determination unit 14, and a model training unit 15.
The source data acquiring unit 11 is used for acquiring a sample facial feature video and a corresponding evaluation label.
In the embodiment of the application, the sample facial feature video is a video formed by a tester executing facial actions according to a set rule; the set rules comprise parameters such as the type of the facial action of the tester, the execution time of the facial action and the like. When obtaining a sample facial feature video, a tester must autonomously control facial muscles to perform corresponding actions according to set rules.
In the embodiment of the present application, the setting rule may include one facial action and corresponding execution time, or may include a plurality of facial actions and corresponding execution times.
In one specific application, the set rule includes three facial movements: closing both eyes, relaxing and looking straight ahead, and smiling and exposing the teeth. All three facial movements are designed around characteristics of the mask face and are used to observe the degree of facial rigidity and the sluggishness of facial changes of the tester, wherein: closing both eyes is used to observe the degree of relaxation of the tester's face, relaxing and looking straight ahead is used to observe the condition of the tester's eyes, and smiling and exposing the teeth is used to observe the condition of the tester's mouth corners and eyes.
In order to capture each facial movement reasonably and to reflect the degree of stiffness and the sluggishness of change of the face, each facial movement should be performed for an appropriate length of time. In one application of the embodiment of the present application, the execution time of each facial movement is 5 s, and the three facial movements are performed consecutively.
In the embodiment of the application, the evaluation label is obtained by a professional viewing the sample facial feature video or observing the corresponding facial features of the test subject. In one application, the evaluation labels may include five grades: normal, slight, mild, moderate, and severe. Normal corresponds to a normal facial expression; slight corresponds to a slightly reduced blink frequency; mild corresponds to a reduced blink frequency together with reduced lower-face expression (e.g., the lips do not part during a spontaneous smile); moderate corresponds to the lips being parted some of the time while the face is at rest; severe corresponds to the lips being parted most of the time while the face is at rest.
The frame extraction unit 12 is configured to extract image frames of the sample facial feature video to form a frame sequence.
In practical applications, the strategy used by the frame extraction unit 12 to form the frame sequence may differ according to the frame rate of the sample facial feature video. For example, if the frame rate is low, all image frames of the sample facial feature video may be used directly as the frames of the frame sequence; if the frame rate is high, interval sampling can be used to extract part of the frames of the sample facial feature video as the frame sequence.
In order to extract valid information and exclude unnecessary information, the frame extraction unit 12 may further determine the face region of the user in each image frame, extract the face region, discard the other regions, and use the extracted region as the pixel content of the corresponding frame in the frame sequence.
The difference processing unit 13 is configured to perform difference operation on adjacent frames in the frame sequence according to the frame sequence to obtain a difference image.
The characteristic determining unit 14 is configured to extract a characteristic matrix of each of the difference images; and combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix.
In practical application, the video feature matrix is obtained according to the feature matrix of each differential image, the same feature parameter in the feature matrix of each differential image can be extracted, the parameters are sequenced according to the sequencing of video frames to obtain corresponding feature vectors, and the feature vectors corresponding to all the parameters are combined into the video feature matrix.
In some applications of the embodiment of the application, when the sample facial feature video is split into a plurality of sample sub-videos, the video feature matrix of each sample sub-video may be calculated first, and the feature matrices of the sample sub-videos then combined into the feature matrix of the sample facial feature video.
The model training unit 15 is configured to train the mask face feature recognition model by using the video feature matrix and the corresponding evaluation labels.
In the embodiment of the present application, the mask face feature recognition model may be a model widely used in the field of machine learning, and may be, for example, a support vector machine model.
The training method for the mask face recognition model provided by the embodiment of the application can simplify calculation, realize quick establishment of the mask face feature recognition model and achieve better accuracy.
Based on the same inventive concept, the application also provides an electronic device. Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. As shown in fig. 4, the electronic device comprises at least one processor 21, at least one memory 22 and at least one communication interface 23. The communication interface 23 is used for information transmission with external devices.
The various components in the electronic device are coupled together by a bus system 24. Understandably, the bus system 24 is used to enable connection and communication between these components. In addition to a data bus, the bus system 24 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, the various buses are all labeled as the bus system 24 in fig. 4.
It will be appreciated that the memory 22 in this embodiment may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. In some embodiments, memory 22 stores elements, executable units or data structures, or a subset thereof, or an expanded set thereof: an operating system and an application program.
The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic tasks and processing hardware-based tasks. The application programs include various application programs such as a media player (MediaPlayer), a Browser (Browser), etc. for implementing various application tasks. The program for implementing the mask face feature recognition model training method provided by the embodiment of the disclosure may be included in an application program.
In the embodiment of the present disclosure, the processor 21 is configured to execute the steps of the training method for the mask face feature recognition model provided in the embodiment of the present disclosure by calling a program or an instruction stored in the memory 22, which may be specifically a program or an instruction stored in an application program.
The mask face feature recognition model training method provided by the embodiment of the disclosure can be applied to, or implemented by, the processor 21. The processor 21 may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 21 or by instructions in the form of software. The processor 21 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of the mask face feature recognition model training method provided by the embodiment of the disclosure can be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software units in the decoding processor. The software units may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 22, and the processor 21 reads the information in the memory 22 and completes the steps of the method in combination with its hardware.
The embodiments of the present disclosure further provide a non-transitory computer-readable storage medium, where a program or an instruction is stored in the non-transitory computer-readable storage medium, where the program or the instruction causes a computer to execute the steps of the mask face feature recognition model training method in each embodiment, and in order to avoid repeated description, the steps are not repeated here.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present application and are presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A mask face feature recognition model training method is characterized by comprising the following steps:
acquiring a sample facial feature video and a corresponding evaluation label; the sample facial feature video is a video formed by a user executing operation according to a set rule;
extracting image frames of the sample facial feature video to form a frame sequence;
according to the frame sequence, carrying out differential operation on adjacent frames in the frame sequence to obtain a differential image;
extracting a feature matrix of each differential image; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
and training the mask face feature recognition model by adopting the video feature matrix and the corresponding evaluation label.
2. The mask face feature recognition model training method according to claim 1,
extracting image frames of the sample facial feature video to form a sequence of frames, comprising: dividing the sample facial feature video according to a set rule to obtain a sample divided video;
extracting image frames of each sample sub-video to form a corresponding frame sequence;
combining the feature matrices of the differential images according to the frame sequences to obtain a video feature matrix, wherein the video feature matrix comprises: and combining the characteristic matrix of each differential image according to the arrangement sequence of the sample sub-videos and the corresponding frame sequences to obtain the video characteristic matrix.
3. The training method of a mask face feature recognition model according to claim 2, wherein the setting rule includes at least two facial movements and execution time of each of the facial movements;
dividing the sample facial feature video according to a set rule to obtain the sample divided video comprises: dividing the sample facial feature video according to the facial movements and the corresponding execution times to obtain the sample divided video.
4. The mask face feature recognition model training method of claim 3, wherein the facial movements include closing both eyes, relaxing and looking straight ahead, and smiling and exposing teeth.
5. The mask face feature recognition model training method according to claim 1, wherein extracting image frames of the sample facial feature video to form a frame sequence comprises:
extracting a face image region of the image frame;
combining the facial image regions in the order of the image frames to form the frame sequence.
6. The mask face feature recognition model training method according to any one of claims 1 to 5,
the feature matrix of the differential image comprises at least two feature parameters;
combining the feature matrices of the differential images according to the frame sequences to obtain a video feature matrix, wherein the video feature matrix comprises:
extracting the same characteristic parameter of each characteristic matrix according to the frame sequence, and combining to form a vector with the same parameter;
and combining the vectors with the same parameters to obtain the video feature matrix.
7. A face recognition method for a mask, comprising:
acquiring a facial feature video to be evaluated; the to-be-evaluated facial feature video is a video formed by a user executing operation according to a set rule;
extracting image frames of the facial feature video to be evaluated to form a frame sequence;
according to the frame sequence, carrying out differential operation on adjacent frames in the frame sequence to obtain a differential image;
extracting a feature matrix of each differential image; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
processing the video feature matrix using a mask face feature recognition model obtained by the method of any one of claims 1-6 to obtain a mask face degree evaluation result.
8. A mask face recognition model training device, comprising:
the source data acquisition unit is used for acquiring a sample facial feature video and a corresponding evaluation label; the sample facial feature video is a video formed by a user executing operation according to a set rule;
a frame extraction unit for extracting image frames of the sample facial feature video to form a frame sequence;
the difference processing unit is used for carrying out difference operation on adjacent frames in the frame sequence according to the frame sequence to obtain a difference image;
a feature determination unit configured to extract a feature matrix of each of the difference images; combining the characteristic matrixes of the differential images according to the frame sequence to obtain a video characteristic matrix;
and the model training unit is used for training the mask face feature recognition model by adopting the video feature matrix and the corresponding evaluation label.
9. An electronic device comprising a processor and a memory;
the processor is configured to execute the steps of the mask face feature recognition model training method according to any one of claims 1 to 6 by calling a program or instructions stored in the memory.
10. A computer-readable storage medium storing a program or instructions for causing a computer to execute the steps of the mask face feature recognition model training method according to any one of claims 1 to 6.
CN202110272296.2A 2021-03-12 2021-03-12 Training method and device for mask face feature recognition model Active CN113076813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110272296.2A CN113076813B (en) 2021-03-12 2021-03-12 Training method and device for mask face feature recognition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110272296.2A CN113076813B (en) 2021-03-12 2021-03-12 Training method and device for mask face feature recognition model

Publications (2)

Publication Number Publication Date
CN113076813A true CN113076813A (en) 2021-07-06
CN113076813B CN113076813B (en) 2024-04-12

Family

ID=76612319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110272296.2A Active CN113076813B (en) 2021-03-12 2021-03-12 Training method and device for mask face feature recognition model

Country Status (1)

Country Link
CN (1) CN113076813B (en)

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140362091A1 (en) * 2013-06-07 2014-12-11 Ecole Polytechnique Federale De Lausanne Online modeling for real-time facial animation
CN104463100A (en) * 2014-11-07 2015-03-25 重庆邮电大学 Intelligent wheelchair man-machine interaction system and method based on facial expression recognition mode
CN107808113A (en) * 2017-09-13 2018-03-16 华中师范大学 A kind of facial expression recognizing method and system based on difference depth characteristic
CN108830237A (en) * 2018-06-21 2018-11-16 北京师范大学 A kind of recognition methods of human face expression
CN109190582A (en) * 2018-09-18 2019-01-11 河南理工大学 A kind of new method of micro- Expression Recognition
CN109190487A (en) * 2018-08-07 2019-01-11 平安科技(深圳)有限公司 Face Emotion identification method, apparatus, computer equipment and storage medium
CN109614927A (en) * 2018-12-10 2019-04-12 河南理工大学 Micro- Expression Recognition based on front and back frame difference and Feature Dimension Reduction
CN109784230A (en) * 2018-12-29 2019-05-21 中国科学院重庆绿色智能技术研究院 A kind of facial video image quality optimization method, system and equipment
CN109934158A (en) * 2019-03-11 2019-06-25 合肥工业大学 Video feeling recognition methods based on local strengthening motion history figure and recursive convolution neural network
CN110110653A (en) * 2019-04-30 2019-08-09 上海迥灵信息技术有限公司 The Emotion identification method, apparatus and storage medium of multiple features fusion
US20190294868A1 (en) * 2016-06-01 2019-09-26 Ohio State Innovation Foundation System and method for recognition and annotation of facial expressions
CN110363764A (en) * 2019-07-23 2019-10-22 安徽大学 A kind of driving license type information integrality detection method based on inter-frame difference
CN110532950A (en) * 2019-08-29 2019-12-03 中国科学院自动化研究所 Video feature extraction method, micro- expression recognition method based on micro- expression video
CN110569702A (en) * 2019-02-14 2019-12-13 阿里巴巴集团控股有限公司 Video stream processing method and device
CN110717423A (en) * 2019-09-26 2020-01-21 安徽建筑大学 Training method and device for emotion recognition model of facial expression of old people
CN111353452A (en) * 2020-03-06 2020-06-30 国网湖南省电力有限公司 Behavior recognition method, behavior recognition device, behavior recognition medium and behavior recognition equipment based on RGB (red, green and blue) images
CN111476178A (en) * 2020-04-10 2020-07-31 大连海事大学 Micro-expression recognition method based on 2D-3D CNN
CN111539290A (en) * 2020-04-16 2020-08-14 咪咕文化科技有限公司 Video motion recognition method and device, electronic equipment and storage medium
CN111814615A (en) * 2020-06-28 2020-10-23 湘潭大学 Parkinson non-contact intelligent detection method based on instruction video
CN111860414A (en) * 2020-07-29 2020-10-30 中国科学院深圳先进技术研究院 Method for detecting Deepfake video based on multi-feature fusion
CN111898703A (en) * 2020-08-14 2020-11-06 腾讯科技(深圳)有限公司 Multi-label video classification method, model training method, device and medium

Also Published As

Publication number Publication date
CN113076813B (en) 2024-04-12

Similar Documents

Publication Publication Date Title
US10832069B2 (en) Living body detection method, electronic device and computer readable medium
CN112950581B (en) Quality evaluation method and device and electronic equipment
US9852327B2 (en) Head-pose invariant recognition of facial attributes
CN111046959A (en) Model training method, device, equipment and storage medium
CN110070029B (en) Gait recognition method and device
JP6688277B2 (en) Program, learning processing method, learning model, data structure, learning device, and object recognition device
CN112381837B (en) Image processing method and electronic equipment
CN107633237B (en) Image background segmentation method, device, equipment and medium
CN108229673B (en) Convolutional neural network processing method and device and electronic equipment
CN111935479B (en) Target image determination method and device, computer equipment and storage medium
CN110059666B (en) Attention detection method and device
CN114663593A (en) Three-dimensional human body posture estimation method, device, equipment and storage medium
CN109977875A (en) Gesture identification method and equipment based on deep learning
CN112633221A (en) Face direction detection method and related device
CN114511702A (en) Remote sensing image segmentation method and system based on multi-scale weighted attention
CN112788254B (en) Camera image matting method, device, equipment and storage medium
CN111612732B (en) Image quality evaluation method, device, computer equipment and storage medium
CN113379702A (en) Blood vessel path extraction method and device of microcirculation image
JP6202938B2 (en) Image recognition apparatus and image recognition method
CN110633630B (en) Behavior identification method and device and terminal equipment
CN113076813A (en) Mask face feature recognition model training method and device
CN112581001B (en) Evaluation method and device of equipment, electronic equipment and readable storage medium
WO2019150649A1 (en) Image processing device and image processing method
CN112989924B (en) Target detection method, target detection device and terminal equipment
CN115239551A (en) Video enhancement method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Xu Erhe

Inventor after: Chen Biao

Inventor after: Sun Hong

Inventor after: Chen Yiqiang

Inventor after: Lu Wang

Inventor after: Yu Hanchao

Inventor after: Yang Xiaodong

Inventor before: Xu Erhe

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant