CN114267077B - Method, system, device and medium for identifying wearing of mask - Google Patents

Method, system, device and medium for identifying wearing of mask

Info

Publication number: CN114267077B (application CN202210201148.6A)
Authority: CN (China)
Prior art keywords: mask, branch, classification, wearing, regression
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN114267077A
Inventors: 李来, 张江峰, 王月平, 王东, 肖传宝
Current Assignee: Hangzhou Moredian Technology Co ltd
Original Assignee: Hangzhou Moredian Technology Co ltd
Application filed by Hangzhou Moredian Technology Co ltd
Priority to CN202210201148.6A
Publication of CN114267077A
Application granted; publication of CN114267077B

Abstract

The present application relates to a method, system, apparatus and medium for mask wearing identification. A multi-task collaborative network is constructed, and its first branch network is trained cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, yielding a trained multi-task collaborative network. The trained network outputs both the class number of a pedestrian's mask wearing classification and the regressed mask wearing height, so it can determine whether the pedestrian's mask is worn to standard. The output of the multi-task collaborative network in this embodiment is more accurate, which solves the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.

Description

Method, system, device and medium for identifying wearing of mask
Technical Field
The application relates to the technical field of face recognition, in particular to a method, a system, a device and a medium for mask wearing recognition.
Background
Wearing a mask in public places is an effective way to prevent infectious diseases, and going out with a mask has become a daily habit. Mask wearing checks are carried out in places with heavy pedestrian traffic, such as shopping malls, subways, campuses, hospitals and offices. In the related art, manual inspection suffers from problems such as staff leaving their posts, queuing at passages and missed identifications, so the checking effect is poor; intelligent hardware devices can only recognize whether a mask is worn, not whether it is worn to standard.
At present, no effective solution has been proposed for the problem that, when masks are checked manually or by intelligent devices in the related art, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
Disclosure of Invention
The embodiments of the present application provide a method, system, device and medium for mask wearing identification, so as to at least solve the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
In a first aspect, an embodiment of the present application provides a method for mask wearing identification, where the method includes:
constructing a multitask cooperative network, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function cycle between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard.
In a second aspect, the present application provides a system for mask wear identification, where the system includes a multitask collaborative network, where the multitask collaborative network includes a base network and a first branch network, and the first branch network includes a mask wear classification branch and a mask wear regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function cycle between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor executes the computer program to implement the method for identifying wearing of a mask according to the first aspect.
In a fourth aspect, embodiments of the present application provide a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the method for mask wearing identification according to the first aspect.
Compared with the related art, the mask wearing identification method provided by the embodiments of the present application constructs a multitask collaborative network and trains the first branch network cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, obtaining a trained multitask collaborative network. The trained network outputs the class number of a pedestrian's mask wearing classification and the mask wearing regression height, from which it can be known whether the pedestrian's mask is worn to standard. The output of the multitask collaborative network in this embodiment is more accurate, solving the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a flowchart of a method of mask wear identification according to an embodiment of the present application;
fig. 2 is a flow chart of another method of mask wear identification according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a lightweight attention module according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a multitask cooperative network according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof in this application are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "A plurality" herein means two or more. "And/or" describes an association relationship between associated objects, meaning three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The terms "first," "second," "third," and the like merely distinguish similar objects and do not denote a particular ordering.
The present embodiment provides a method for mask wearing identification, and fig. 1 is a flowchart of a method for mask wearing identification according to an embodiment of the present application, and as shown in fig. 1, the method includes the following steps:
step S101, a multitask cooperative network is constructed, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
step S102, inputting training data into the multitask cooperative network, training the first branch network through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function circulation between classification and regression of the mask wearing branch after the training data passes through a basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, and obtaining the trained multitask cooperative network;
in this embodiment, the classification loss function of the mask wearing branch is shown by the following formula 1:
Figure 708111DEST_PATH_IMAGE001
equation 1
Wherein the content of the first and second substances,
Figure 421989DEST_PATH_IMAGE002
indicating a classification loss of the mask wearing branch,
Figure 123098DEST_PATH_IMAGE003
the number of categories of the classified wearing of the mask is represented, the number of categories is 3, namely 1-wearing is not performed, 2-wearing is not performed according to the standard, 3-wearing is performed according to the standard,
Figure 924832DEST_PATH_IMAGE004
a category index representing the wearing classification of the mask, ranging from 1 to
Figure 270362DEST_PATH_IMAGE003
Figure 279776DEST_PATH_IMAGE005
The actual label of the mask wearing, which represents the current image data, i.e., 1-not worn, 2-not worn, 3-worn,
Figure 218913DEST_PATH_IMAGE006
means of
Figure 342114DEST_PATH_IMAGE004
In the same way, the first and second,
Figure 542151DEST_PATH_IMAGE007
represents the classification probability of the mask wearing branch output,
Figure 942040DEST_PATH_IMAGE008
means of and
Figure 493107DEST_PATH_IMAGE007
the same is true.
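A minimal sketch of formula 1 in plain Python (the function name and the list-based interface are illustrative, not from the patent):

```python
import math

def mask_class_loss(p, y):
    """Cross-entropy loss of the mask wearing classification branch
    (formula 1, as reconstructed): p is the predicted probability over
    the 3 classes (not worn / not to standard / to standard), y the
    one-hot ground-truth label."""
    # Only terms with a nonzero label contribute; skipping yi == 0
    # also avoids log(0) for zero-probability classes.
    return -sum(yi * math.log(pi) for yi, pi in zip(y, p) if yi > 0)
```

The loss shrinks as the probability assigned to the true class grows, which is what drives the branch toward the calibrated label.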
The mask height regression loss function of the mask wearing branch is shown by the following formula 2:

$$L_{reg} = \frac{1}{N}\sum_{j=1}^{N}\left(\hat{h}_j - h_j\right)^2$$

equation 2

wherein $L_{reg}$ denotes the mask height regression loss of the mask wearing branch; $N$ denotes the number of samples in the batch; $\hat{h}_j$ denotes the regression-predicted mask wearing height; and $h_j$ denotes the real calibrated mask wearing height. In the implementation, the mask wearing height is calibrated as a continuous value; some important calibrated heights are: not worn-0, nose tip-72, upper lip-65, eye bag-85, eyebrow-100, and the corresponding ranges are not worn-[0], not to standard-(0-72), to standard-[72-100].
The isotropic joint loss function between the classification and regression of the mask wearing branch is shown in the following formulas 3-4:

$$L_{iso} = -\sum_{i=1}^{C} q_i \log(p_i)$$

equation 3

$$q = \mathrm{onehot}\left(\mathrm{class}(\hat{h})\right)$$

equation 4

wherein $L_{iso}$ denotes the isotropic joint loss between the classification and regression of the mask wearing branch; $q$ denotes the classification probability derived from the regressed mask wearing height $\hat{h}$ via the calibrated class ranges, for example, when the height is 85 the classification probability is $[0, 0, 1]$; and $C$, $i$ and $p_i$ have the same meanings as in formula 1.
There is a strong semantic correlation between whether a mask is worn and the mask wearing height. Training the first branch network only through the classification loss function of the mask wearing branch and the mask height regression loss function of the mask wearing branch separately prevents the mask wearing classification branch and the mask wearing regression branch from cross-utilizing high-level semantic information, so the output is not accurate enough. Therefore, the first branch network is further trained through the isotropic joint loss function between the classification and regression of the mask wearing branch, so that the mask wearing classification branch and the mask wearing height regression branch are trained jointly, which improves the accuracy of the output.
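The joint training idea of formulas 3-4 can be sketched as follows (a sketch interpreting the joint loss as a cross-entropy between the classification probabilities and a pseudo-label derived from the regressed height; the function name is illustrative):

```python
import math

def iso_joint_loss(p_cls, h_reg):
    """Isotropic joint loss between classification and regression
    (formulas 3-4, as reconstructed): the regressed height is mapped to
    a one-hot pseudo-label q via the calibrated ranges, then the
    cross-entropy against the classification branch's probabilities
    p_cls is taken, tying the two branches together."""
    cls = 1 if h_reg <= 0 else (2 if h_reg < 72 else 3)
    q = [1.0 if i == cls else 0.0 for i in (1, 2, 3)]
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p_cls) if qi > 0)
```

With a regressed height of 85 (the patent's example, giving q = [0, 0, 1]), the loss penalizes the classification branch whenever it does not also predict "worn to standard".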
And S103, the face is recognized through the trained multitask cooperative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard. In this embodiment, if the mask wearing height is 72 or more, it may be determined that the mask is worn to standard; if the trained multi-task cooperative network outputs that the class number of pedestrian A's mask wearing classification is 3 and the mask wearing regression height is 75, this indicates that pedestrian A's mask is worn to standard.
Through steps S101 to S103, in contrast to the related-art problem that the checking effect is poor or it cannot be recognized whether a mask is worn to standard when masks are checked manually or by intelligent devices, this embodiment constructs a multitask collaborative network and trains the first branch network cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach the preset convergence state, obtaining the trained multitask collaborative network. The trained network outputs the class number of a pedestrian's mask wearing classification and the mask wearing regression height, from which it can be known whether the pedestrian's mask is worn to standard, and the output of the multitask collaborative network in this embodiment is more accurate, solving the problem in the related art.
In some embodiments, the multitask collaborative network further includes a second branch network, the second branch network includes a living body identification classification branch, fig. 2 is a flowchart of another mask wearing identification method according to an embodiment of the present application, and as shown in fig. 2, after the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, the method includes the following steps:
step S201, fixing mask wearing classification branch parameters and mask wearing regression branch parameters, and training a second branch network respectively through a classification loss function of a living body recognition branch and an isotropic combined loss function circulation between the regression of the mask wearing branch and the living body recognition classification until the living body recognition classification branch reaches a preset convergence state, so as to obtain a trained multi-task cooperative network;
in this embodiment, the classification loss function of the living body recognition branch is shown by the following formula 5:
Figure 769420DEST_PATH_IMAGE018
equation 5
Wherein the content of the first and second substances,
Figure 649652DEST_PATH_IMAGE019
indicating a loss of classification of the live recognition branch,
Figure 743378DEST_PATH_IMAGE020
the number of classes representing the recognition classification of the living body, which is 2, i.e., 1-prosthesis, 2-living body,
Figure 695154DEST_PATH_IMAGE021
class index representing living body recognition classification in the range of 1 to
Figure 839827DEST_PATH_IMAGE020
Figure 611999DEST_PATH_IMAGE022
The real label representing the live recognition of the current image data, i.e. 1-prosthesis, 2-live,
Figure 333967DEST_PATH_IMAGE023
means of and
Figure 964800DEST_PATH_IMAGE024
in the same way, the first and second,
Figure 947668DEST_PATH_IMAGE025
representing the classification probability of the live body recognition branch output,
Figure 763178DEST_PATH_IMAGE026
means of and
Figure 847808DEST_PATH_IMAGE025
the same is true.
The isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is shown in the following formulas 6-7:

$$L'_{iso} = -\,w \sum_{k=1}^{M} y'_k \log(p'_k)$$

equation 6

$$w = w(q)$$

equation 7

wherein $w$ denotes the penalty weight, a function of the regression-derived classification probability $q$; when the input is a living body and the mask is worn to standard, a weighted penalty is applied, improving the living body pass rate when a mask is worn to standard; $M$, $k$, $y'_k$ and $p'_k$ have the same meanings as in formula 5, and $q$ has the same meaning as in formula 3.
Through this embodiment, the loss is penalized according to the mask wearing height, which ultimately improves the living body judgment accuracy when a mask is worn.
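The height-dependent penalty of formulas 6-7 can be sketched as follows (a sketch under stated assumptions: the weights w_base and w_pen are illustrative values, not from the patent, and the height threshold 72 follows the calibration above):

```python
import math

def live_iso_loss(p_live, y_live, h_reg, w_base=1.0, w_pen=2.0):
    """Weighted liveness loss (formulas 6-7, as reconstructed): a larger
    penalty weight is applied when the sample is a living body whose
    mask is regressed as worn to standard (height >= 72), pushing such
    masked live faces harder toward correct liveness classification."""
    is_live = y_live == [0.0, 1.0]  # one-hot: [prosthesis, living body]
    w = w_pen if (is_live and h_reg >= 72) else w_base
    return -w * sum(yk * math.log(pk) for yk, pk in zip(y_live, p_live) if yk > 0)
```

A live face behind a to-standard mask thus incurs twice the loss of the same prediction on an unmasked live face, which is how the pass rate for masked living bodies is raised during training.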
And S202, recognizing the face through the trained multi-task cooperative network to obtain the class number of the mask wearing classification, the mask wearing regression height and the living body confidence coefficient.
In the related art, living body recognition mainly relies on details of the human face to distinguish real faces from fakes. Because a mask occludes the face, facial details are lost, so the living body confidence output by the model is low, conventional living body algorithms relying on a fixed threshold easily misjudge, and the mask has to be removed or pulled down manually for correct recognition. Therefore, in this embodiment, the second branch network is trained not only through the classification loss function of the living body recognition branch, but also through the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification, until the living body recognition branch reaches the preset convergence state. This improves the living body confidence when performing living body recognition on pedestrians wearing masks, solves the prior-art problem that the mask must be removed or pulled down manually for the living body to be correctly recognized, and, by performing living body recognition on pedestrians, also addresses the risk of prosthesis attacks such as printed paper and electronic screens.
In some embodiments, after the living body recognition classification branch reaches the preset convergence state, the mask wearing classification branch parameter, the mask wearing regression branch parameter and the living body recognition classification branch parameter are released, the first branch network is trained through the mask wearing classification loss function and the mask height regression loss function of the mask wearing branch, and the second branch network is trained through the living body recognition branch classification loss function until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch reach the preset convergence state, so that the trained multi-task cooperation network is obtained.
In this embodiment, the multi-task cooperative network is alternately optimized and trained through a progressive multi-stage cycle, where "progressive" means proceeding from easy to difficult, which reduces the difficulty of optimizing the multi-task cooperative network. The multi-stage cycle includes the following steps:
(1) First stage circulation: fix the living body recognition classification branch parameters and train the first branch network.
A: train the first branch network through the classification loss function of the mask wearing classification branch;
B: train the first branch network through the mask height regression loss function of the mask wearing branch;
C: train the first branch network through the isotropic joint loss function between the classification and regression of the mask wearing branch;
repeat steps A-C until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state.
(2) Second stage circulation: fix the mask wearing classification branch parameters and the mask wearing regression branch parameters and train the second branch network.
E: train the second branch network through the classification loss function of the living body recognition branch;
F: train the second branch network through the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification;
repeat steps E-F until the living body recognition classification branch reaches a preset convergence state.
(3) Third stage circulation: release the mask wearing classification branch parameters, the mask wearing regression branch parameters and the living body recognition classification branch parameters, and train the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch until they reach a preset convergence state.
(4) Repeat steps (1) to (3) until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch all reach a preset convergence state, obtaining the trained multi-task cooperative network.
Mask recognition and living body recognition are strongly correlated collaborative tasks. For the network, low-level visual information is shared, but the high-level semantic information must distinguish whether a mask is worn, the mask height, and whether the face is a living body, so direct end-to-end training converges with difficulty and struggles to balance the tasks. The above progressive multi-stage cyclic alternating optimization training strategy solves the problem that the multi-task cooperative network is difficult to converge and to balance across tasks.
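The ordering of loss updates in the schedule above can be sketched as follows (the string tags are illustrative labels for the losses of steps A-C, E-F and the joint third stage, not identifiers from the patent; real training would run each stage to convergence rather than a fixed number of rounds):

```python
def multistage_schedule(rounds):
    """Loss-update order of the progressive multi-stage cyclic training:
    stage 1 trains the mask branches with the liveness branch frozen,
    stage 2 trains the liveness branch with the mask branches frozen,
    and stage 3 releases and trains all branches jointly."""
    plan = []
    for _ in range(rounds):
        plan += ["mask_cls", "mask_reg", "iso_cls_reg"]  # stage 1 (A-C)
        plan += ["live_cls", "iso_reg_live"]             # stage 2 (E-F)
        plan += ["joint_all"]                            # stage 3
    return plan
```

The easy-before-difficult structure is visible in the plan: each round solves the two sub-problems in isolation before attempting the harder joint optimization.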
In some embodiments, the basic network includes several lightweight attention modules, that is, the basic network is constructed by stacking a plurality of lightweight attention modules, fig. 3 is a schematic structural diagram of a lightweight attention module according to an embodiment of the present application, and as shown in fig. 3, inputting training data to the lightweight attention module includes:
the input x is first global pooled (GAP) and then Channel Split (Channel Split) in the Channel dimension, resulting in two packets
Figure 196640DEST_PATH_IMAGE030
And
Figure 336635DEST_PATH_IMAGE031
reducing the operation amount, realizing light weight reasoning, respectively performing 1 × 1 convolution and Sigmoid activation on two groups, then splicing the two groups together on Channel dimension (Channel Cat) to obtain n, wherein n is a weight coefficient of each Channel, multiplying n and input x, and since n is a coefficient of each Channel, n x represents that each input Channel is configured with weight, namely applying an attention mechanism to each Channel, optimizing model feature expression, the weight coefficient corresponding to a useful Channel is large, the weight coefficient corresponding to an useless Channel is small, so that the network can be more focused on learning of the useful Channel, namely the attention mechanism, and finally performing Channel Shuffle operation, disordering the channels, increasing information flow in the channels, further enhancing the feature expression capability of the network, finally obtaining semantic information, inputting x after passing through a basic network, and finally obtaining high-level semantic information.
The attention module is described by the following formulas 8-10:

$$s_1,\, s_2 = \mathrm{CS}\big(\mathrm{GAP}(x)\big)$$

equation 8

$$n = \mathrm{CC}\big(\sigma(\mathrm{Conv}_{1\times1}(s_1)),\; \sigma(\mathrm{Conv}_{1\times1}(s_2))\big)$$

equation 9

$$y = \mathrm{Shuffle}\,(n \odot x)$$

equation 10

wherein $x$ is the input of the attention module; $y$ is the output of the attention module; the subscripts 1 and 2 are the group indexes; $s_1$, $s_2$ and $n$ are temporary variables of intermediate operations; $\odot$ denotes the element-wise product; $\mathrm{GAP}$ denotes the Global Average Pooling operation; $\mathrm{CS}$ denotes the Channel Split operation; $\sigma(\mathrm{Conv}_{1\times1}(\cdot))$ denotes a 1×1 convolution followed by a Sigmoid operation; $\mathrm{CC}$ denotes the Channel Cat operation; and $\mathrm{Shuffle}$ denotes the Channel Shuffle operation.
In this embodiment, the amount of computation is reduced through Channel Split, ensuring the light weight of the attention module, while information exchange between feature channels is promoted through Channel Shuffle, enhancing the feature expression of the model.
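The split / gate / concatenate / shuffle pipeline of formulas 8-10 can be sketched in plain Python (a simplified stand-in: x is a list of per-channel values, as if already GAP-pooled, and the 1×1 group convolutions are elided to identities for brevity; a real implementation would operate on 4-D tensors):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def light_attention(x):
    """Sketch of the lightweight attention module (formulas 8-10, as
    reconstructed): split channels into two groups, gate each with a
    Sigmoid, concatenate the gates, weight every channel, then shuffle."""
    half = len(x) // 2
    s1, s2 = x[:half], x[half:]               # Channel Split
    n = [sigmoid(v) for v in s1 + s2]         # gate each group, Channel Cat
    weighted = [w * v for w, v in zip(n, x)]  # n (x) x: per-channel attention
    # Channel Shuffle: interleave the two groups to mix channel information
    return [weighted[i // 2 + (i % 2) * half] for i in range(len(x))]
```

Channels with large (positive) pooled responses get gate weights near 1 and survive almost unchanged, while channels with strongly negative responses are suppressed, matching the useful/useless-channel behaviour described above.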
In the related art, mask recognition and living body recognition are generally used in access control scenarios, where passage is granted upon recognition. In practical applications, different places have different requirements on the mask wearing height, while the traditional scheme generally adopts a fixed threshold that cannot be set flexibly. Therefore, in some embodiments, after the face is recognized through the trained multitask collaborative network and the class number of the mask wearing classification and the mask wearing regression height are obtained, the mask wearing regression height is compared with a preset mask height, and the mask wearing output result is obtained according to the comparison result and the class number of the mask wearing classification, specifically realized by the following formula 11:
Figure 753251DEST_PATH_IMAGE046
equation 11
Wherein, the first and the second end of the pipe are connected with each other,
Figure 842430DEST_PATH_IMAGE047
shows the output result of the mask wearing, namely 1-not wearing, 2-not wearing in standard, 3-wearing in standard,
Figure 106052DEST_PATH_IMAGE048
the mask wearing regression height which represents the multitask collaborative network output ranges from 0 to 100, the larger the numerical value is, the higher the wearing height is,
Figure 331497DEST_PATH_IMAGE049
representing the mask wearing classification result output by the multitask collaborative network,
Figure 638851DEST_PATH_IMAGE050
indicate to predetermine gauze mask and wear the height, the user can set up as the demand, can set up to 72, and the tip of the nose position promptly, in this embodiment, the gauze mask that outputs in coordination with the network as multitask wears classification result and wears for the standard, but the gauze mask is worn and is regressed highly to be less than and predetermines gauze mask and wear the height, then explains that the gauze mask is worn and is accorded with general requirement, but is not conform to the requirement that the user set for, then final gauze mask is worn output result
Figure 215326DEST_PATH_IMAGE047
Still for the nonstandard wearing, can adjust through this embodiment in a flexible way and predetermine the gauze mask and wear the height, realize that the gauze mask wears the nimble deployment of discernment.
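The decision rule of Equation 11 can be sketched as follows; the constant and function names are ours, not the patent's:

```python
# Class numbers as defined in the description: 1 - not worn,
# 2 - worn non-standard, 3 - worn standard.
NOT_WORN, NON_STANDARD, STANDARD = 1, 2, 3

def mask_wear_output(cls_result, regress_height, preset_height=72):
    """Equation 11: downgrade a 'standard' classification to 'non-standard'
    when the regressed wearing height (0-100) falls below the site-specific
    preset height (72 roughly corresponds to the nose tip)."""
    if cls_result == STANDARD and regress_height < preset_height:
        return NON_STANDARD
    return cls_result
```

Raising or lowering `preset_height` per site is what gives the flexible deployment described above, without retraining the network.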
Meanwhile, the living body recognition threshold is difficult to adjust and adapt on site. Therefore, in some embodiments, after the face is recognized through the trained multitask collaborative network to obtain the class number of the mask wearing classification, the mask wearing regression height and the living body confidence, a pulled-up living body confidence is obtained from the pull-up coefficient, the mask wearing regression height and the living body confidence; the pulled-up living body confidence is then compared with a confidence threshold, and the output result of the living body recognition is obtained according to the comparison result, as in Equations 12-13:

[pull-up formula rendered as an image in the original publication]    (Equation 12)

R = pass (living body), if S' >= T;  R = fail (spoof attack), otherwise    (Equation 13)

wherein R denotes the output result of the living body recognition; α denotes the pull-up coefficient, which may be set to 0.03; S' denotes the pulled-up living body confidence, which improves the robustness of the living body judgment in mask wearing scenarios; S denotes the living body confidence output by the multitask collaborative network; and T denotes the preset living body confidence threshold, set by the user as required. Training the combined branch of mask height regression and living body recognition classification with the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification improves the accuracy of the living body judgment when a mask is worn, but cannot completely solve the problem that the living body output value is lower in that case. This embodiment therefore pulls up the living body confidence output by the multitask collaborative network, further alleviating the low-confidence problem. In an access control scenario, a living body confidence below the preset threshold is treated as a spoof attack and denied passage, forcing the user to pull down the mask and be recognized again; repeated swipes or mask adjustments would cause passage congestion, so pulling up the confidence improves the passage efficiency of living body recognition when masks are worn.
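The pull-up step can be sketched as below. Equation 12 itself is rendered only as an image in the original, so the additive form here (confidence plus pull-up coefficient times mask height) is our assumption, chosen to combine exactly the three quantities the description names; Equation 13 is the threshold comparison:

```python
def pulled_up_confidence(confidence, mask_height, alpha=0.03):
    """Assumed additive form of Equation 12: boost the liveness confidence
    in proportion to the mask wearing regression height (0-100)."""
    return confidence + alpha * mask_height

def liveness_output(confidence, mask_height, threshold, alpha=0.03):
    """Equation 13: compare the pulled-up confidence with the preset
    threshold; True means living body / pass, False means spoof / deny."""
    return pulled_up_confidence(confidence, mask_height, alpha) >= threshold
```

Under this assumed form, a high mask (large regression height) gives a larger boost, which matches the stated goal of not penalizing people who wear their masks properly.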
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment further provides a system for identifying wearing of a mask, which is used to implement the foregoing embodiments and preferred embodiments, and the description of the system is omitted here. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 4 is a schematic structural diagram of a multitask collaboration network according to an embodiment of the present application, and as shown in fig. 4, the system includes the multitask collaboration network, the multitask collaboration network includes a base network and a first branch network, and the first branch network includes a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network; after the training data passes through the base network, the first branch network is trained by cycling through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, yielding a trained multitask collaborative network; the face is recognized through the trained multitask collaborative network to obtain the class number of the mask wearing classification and the mask wearing regression height, wherein the class number includes not worn, worn in a non-standard manner, and worn in a standard manner. This solves the problem in the related art that checking masks manually or with intelligent devices has a poor inspection effect or cannot identify whether the mask is worn in a standard manner.
In some embodiments, as shown in fig. 4, the multitask coordination network further includes a second branch network, the second branch network includes a living body recognition classification branch, after the mask wearing classification branch and the mask wearing regression branch reach the preset convergence state, the mask wearing classification branch parameter and the mask wearing regression branch parameter are fixed, the second branch network is trained through a classification loss function of the living body recognition branch and an isotropic combined loss function loop between the regression of the mask wearing branch and the living body recognition classification respectively, until the living body recognition classification branch reaches the preset convergence state, and the trained multitask coordination network is obtained;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
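The staged training described in these embodiments (train the mask branches first, then freeze their parameters while training the living body branch, then release all parameters for joint fine-tuning) can be sketched as a schedule of trainable flags; the stage structure follows the description, but all names are illustrative, and a real framework would toggle gradient flags on the branch parameters instead:

```python
def run_stage(params, train_on, freeze):
    """Set per-branch 'trainable' flags for one training stage."""
    for name in freeze:
        params[name] = False
    for name in train_on:
        params[name] = True
    return dict(params)  # snapshot of this stage's configuration

def staged_schedule():
    """Three stages implied by the description: (1) train the mask wearing
    classification and regression branches, (2) freeze them and train the
    living body classification branch, (3) release everything."""
    params = {"mask_cls": True, "mask_reg": True, "live_cls": False}
    s1 = run_stage(params, ["mask_cls", "mask_reg"], ["live_cls"])
    s2 = run_stage(params, ["live_cls"], ["mask_cls", "mask_reg"])
    s3 = run_stage(params, ["mask_cls", "mask_reg", "live_cls"], [])
    return s1, s2, s3
```

Freezing the already-converged mask branches in stage 2 keeps their outputs stable while the living body branch learns against them, which is the point of the fixed-parameter step in the description.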
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the method for identifying wearing of the mask in the above embodiments, the embodiments of the present application may provide a storage medium to implement. The storage medium having stored thereon a computer program; the computer program, when executed by a processor, implements any of the above-described embodiments of a method of mask wear identification.
In one embodiment, a computer device is provided, which may be a terminal. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of mask wear identification. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of mask wear identification, the method comprising:
constructing a multitask cooperative network, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of a mask wearing branch and an isotropic combined loss function loop between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network, wherein the mask wearing branch comprises a mask wearing classification branch and a mask wearing regression branch;
the isotropic joint loss function between the classification and regression of the mask wearing branch is as follows:

[loss formula rendered as an image in the original publication]

wherein L denotes the isotropic joint loss between the classification and regression of the mask wearing branch; p_j denotes the classification probability corresponding to the different regression mask heights; h denotes the mask height; K denotes the number of categories of the mask wearing classification; j denotes the category index of the mask wearing classification, ranging from 1 to K; y denotes the real mask wearing label of the current image data; ŷ has the same meaning as j; and p̂ has the same meaning as p_j;
the face is identified through a trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the class number comprises that the mask is not worn, is not worn according to the standard and is worn according to the standard.
2. The method of claim 1, wherein the multitask collaborative network further comprises a second branch network comprising a living body recognition classification branch, and wherein after the mask worn classification branch and the mask worn regression branch reach a predetermined convergence state, the method further comprises:
fixing mask wearing classification branch parameters and mask wearing regression branch parameters, and training the second branch network respectively through a classification loss function of a living body recognition branch and an isotropic combined loss function circulation between regression and living body recognition classification of the mask wearing branch until the living body recognition classification branch reaches a preset convergence state to obtain a trained multi-task cooperative network;
wherein the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is as follows:

[loss formulas rendered as images in the original publication]

wherein w denotes the penalty weight: when the input is a living body and the mask is worn in a standard manner, a weighted penalty is applied, improving the living body pass rate under standard mask wearing; M denotes the number of categories of the living body recognition classification; m denotes the category index of the living body recognition classification, ranging from 1 to M; z denotes the real living body label of the current image data; ẑ has the same meaning as m; q denotes the classification probability output by the living body recognition branch; q̂ has the same meaning as q; and p_j denotes the classification probability corresponding to the different regression mask heights;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
3. The method of claim 2, wherein after the living body identification classification branch reaches a preset convergence state, the method further comprises:
and releasing the mask wearing classification branch parameters, mask wearing regression branch parameters and living body recognition classification branch parameters, training the first branch network through the mask wearing branched classification loss function and the mask wearing branched mask height regression loss function, training the second branch network through the living body recognition branched classification loss function until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch reach a preset convergence state, and obtaining a trained multi-task collaborative network.
4. The method of claim 1, wherein the base network comprises a number of lightweight attention modules, wherein training data input to the lightweight attention modules comprises:
performing global pooling operation on training data, performing Channel splitting on Channel dimension to obtain a plurality of groups, performing 1 × 1 convolution and Sigmoid activation on the groups respectively, splicing the groups together again on the Channel dimension to obtain a weight coefficient of each Channel, multiplying the weight coefficient of each Channel by the training data, and performing Channel Shuffle operation to obtain semantic information;
and after the training data passes through the basic network, obtaining high-level semantic information.
5. The method according to claim 1, wherein after the trained multitask collaborative network is used for recognizing the human face to obtain the class number of the mask wearing classification and the mask wearing regression height, the method further comprises the following steps:
and obtaining a comparison result of the mask wearing regression height and a preset mask wearing height, and obtaining a mask wearing output result according to the comparison result and the class number of the mask wearing classification.
6. The method according to claim 1, wherein after the trained multitask collaborative network is used for recognizing the human face, and the category number of the mask wearing classification, the mask wearing regression height and the living body confidence coefficient are obtained, the method further comprises the following steps:
obtaining the living body confidence coefficient after the height is pulled up through the height coefficient, the mask wearing regression height and the living body confidence coefficient;
and obtaining a comparison result of the living body confidence coefficient after the pulling-up and a preset living body confidence coefficient threshold value, and obtaining an output result of the living body identification according to the comparison result.
7. A system for mask wear identification, the system comprising a multitask collaborative network, the multitask collaborative network comprising a base network and a first branch network, the first branch network comprising a mask wear classification branch and a mask wear regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of a mask wearing branch and an isotropic combined loss function loop between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network, wherein the mask wearing branch comprises a mask wearing classification branch and a mask wearing regression branch;
the isotropic joint loss function between the classification and regression of the mask wearing branch is as follows:

[loss formula rendered as an image in the original publication]

wherein L denotes the isotropic joint loss between the classification and regression of the mask wearing branch; p_j denotes the classification probability corresponding to the different regression mask heights; h denotes the mask height; K denotes the number of categories of the mask wearing classification; j denotes the category index of the mask wearing classification, ranging from 1 to K; y denotes the real mask wearing label of the current image data; ŷ has the same meaning as j; and p̂ has the same meaning as p_j;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the class number comprises that the mask is not worn, is not worn according to the standard and is worn according to the standard.
8. The system according to claim 7, wherein the multitask collaborative network further comprises a second branch network, the second branch network comprises a living body recognition classification branch, after the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, mask wearing classification branch parameters and mask wearing regression branch parameters are fixed, the second branch network is trained through a classification loss function of the living body recognition branch and an isotropic combined loss function cycle between the regression of the mask wearing branch and the living body recognition classification respectively until the living body recognition classification branch reaches the preset convergence state, and a trained multitask collaborative network is obtained;
wherein the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is as follows:

[loss formulas rendered as images in the original publication]

wherein w denotes the penalty weight: when the input is a living body and the mask is worn in a standard manner, a weighted penalty is applied, improving the living body pass rate under standard mask wearing; M denotes the number of categories of the living body recognition classification; m denotes the category index of the living body recognition classification, ranging from 1 to M; z denotes the real living body label of the current image data; ẑ has the same meaning as m; q denotes the classification probability output by the living body recognition branch; q̂ has the same meaning as q; and p_j denotes the classification probability corresponding to the different regression mask heights;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method of mask wearing identification according to any one of claims 1 to 6.
10. A storage medium having a computer program stored thereon, wherein the computer program is configured to execute the method of mask wear identification according to any one of claims 1 to 6 when the computer program is run.
CN202210201148.6A 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask Active CN114267077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210201148.6A CN114267077B (en) 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask


Publications (2)

Publication Number Publication Date
CN114267077A CN114267077A (en) 2022-04-01
CN114267077B true CN114267077B (en) 2022-06-21

Family

ID=80833980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210201148.6A Active CN114267077B (en) 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask

Country Status (1)

Country Link
CN (1) CN114267077B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115116122B (en) * 2022-08-30 2022-12-16 杭州魔点科技有限公司 Mask identification method and system based on double-branch cooperative supervision

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931580A (en) * 2020-07-09 2020-11-13 陕西师范大学 Mask wearing detection method
CN111931623A (en) * 2020-07-31 2020-11-13 南京工程学院 Face mask wearing detection method based on deep learning
CN112115818A (en) * 2020-09-01 2020-12-22 燕山大学 Mask wearing identification method
CN112818953A (en) * 2021-03-12 2021-05-18 苏州科达科技股份有限公司 Mask wearing state identification method, device, equipment and readable storage medium
CN113361397A (en) * 2021-06-04 2021-09-07 重庆邮电大学 Face mask wearing condition detection method based on deep learning
CN113420675A (en) * 2021-06-25 2021-09-21 浙江大华技术股份有限公司 Method and device for detecting mask wearing standardization

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101923B (en) * 2018-08-14 2020-11-27 罗普特(厦门)科技集团有限公司 Method and device for detecting mask wearing condition of person
KR20200053751A (en) * 2018-11-09 2020-05-19 전자부품연구원 Apparatus and method for masking face
CN111507199A (en) * 2020-03-25 2020-08-07 杭州电子科技大学 Method and device for detecting mask wearing behavior
US11361445B2 (en) * 2020-07-08 2022-06-14 Nec Corporation Of America Image analysis for detecting mask compliance
CN112183471A (en) * 2020-10-28 2021-01-05 西安交通大学 Automatic detection method and system for standard wearing of epidemic prevention mask of field personnel

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931580A (en) * 2020-07-09 2020-11-13 陕西师范大学 Mask wearing detection method
CN111931623A (en) * 2020-07-31 2020-11-13 南京工程学院 Face mask wearing detection method based on deep learning
CN112115818A (en) * 2020-09-01 2020-12-22 燕山大学 Mask wearing identification method
CN112818953A (en) * 2021-03-12 2021-05-18 苏州科达科技股份有限公司 Mask wearing state identification method, device, equipment and readable storage medium
CN113361397A (en) * 2021-06-04 2021-09-07 重庆邮电大学 Face mask wearing condition detection method based on deep learning
CN113420675A (en) * 2021-06-25 2021-09-21 浙江大华技术股份有限公司 Method and device for detecting mask wearing standardization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Face mask recognition based on MTCNN and MobileNet;He-Ming T.等;《2021 3rd International Academic Exchange Conference on Science and Technology Innovation (IAECST)》;20220207;第471-474页 *
改进轻量级卷积神经网络的复杂场景口罩佩戴检测方法;薛均晓 等;《计算机辅助设计与图形学学报》;20210731;第33卷(第7期);第1045-1054页 *

Also Published As

Publication number Publication date
CN114267077A (en) 2022-04-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant