CN114267077B - Method, system, device and medium for identifying wearing of mask - Google Patents

Method, system, device and medium for identifying wearing of mask

Info

Publication number: CN114267077B (application CN202210201148.6A)
Authority: CN (China)
Prior art keywords: mask, branch, classification, wearing, regression
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN114267077A
Inventors: 李来, 张江峰, 王月平, 王东, 肖传宝
Current Assignee: Hangzhou Moredian Technology Co ltd
Original Assignee: Hangzhou Moredian Technology Co ltd
Application filed by Hangzhou Moredian Technology Co ltd
Priority to CN202210201148.6A
Publication of CN114267077A
Application granted; publication of CN114267077B

Abstract

The present application relates to a method, system, apparatus and medium for mask wearing identification. A multi-task collaborative network is constructed, and its first branch network is trained cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, yielding a trained multi-task collaborative network. The trained network outputs both the class number of a pedestrian's mask wearing classification and the regressed mask wearing height, so it can determine whether the pedestrian's mask is worn to standard. The output of the multi-task collaborative network in this embodiment is more accurate, which solves the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.

Description

Method, system, device and medium for identifying wearing of mask
Technical Field
The application relates to the technical field of face recognition, in particular to a method, a system, a device and a medium for mask wearing recognition.
Background
Wearing a mask in public places is an effective way to prevent infectious diseases, and going out with a mask has become a daily habit. Mask wearing checks are carried out in places with heavy pedestrian traffic, such as shopping malls, subways, campuses, hospitals and offices. In the related art, manual inspection suffers from problems such as staff leaving their posts, queuing at passages and missed identifications, so the checking effect is poor; intelligent hardware devices can only recognize whether a mask is worn, not whether it is worn to standard.
At present, no effective solution has been proposed for the problem that, when masks are checked manually or by intelligent devices in the related art, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
Disclosure of Invention
The embodiments of the present application provide a method, system, device and medium for mask wearing identification, so as to at least solve the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
In a first aspect, an embodiment of the present application provides a method for mask wearing identification, where the method includes:
constructing a multitask cooperative network, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function cycle between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard.
In a second aspect, the present application provides a system for mask wear identification, where the system includes a multitask collaborative network, where the multitask collaborative network includes a base network and a first branch network, and the first branch network includes a mask wear classification branch and a mask wear regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function cycle between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor executes the computer program to implement the method for identifying wearing of a mask according to the first aspect.
In a fourth aspect, embodiments of the present application provide a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the method for mask wearing identification according to the first aspect.
Compared with the related art, the mask wearing identification method provided by the embodiments of the present application constructs a multitask collaborative network and trains the first branch network cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, obtaining a trained multitask collaborative network. The trained network outputs the class number of a pedestrian's mask wearing classification and the mask wearing regression height, from which it can be known whether the pedestrian's mask is worn to standard. The output of the multitask collaborative network in this embodiment is more accurate, solving the problem in the related art that, when masks are checked manually or by intelligent devices, the checking effect is poor or it cannot be recognized whether a mask is worn to standard.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a flowchart of a method of mask wear identification according to an embodiment of the present application;
fig. 2 is a flow chart of another method of mask wear identification according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a lightweight attention module according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a multitask cooperative network according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof in this application are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "A plurality" herein means two or more. "And/or" describes an association relationship between associated objects, meaning three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The terms "first," "second," "third," and the like merely distinguish similar objects and do not denote a particular ordering.
The present embodiment provides a method for mask wearing identification, and fig. 1 is a flowchart of a method for mask wearing identification according to an embodiment of the present application, and as shown in fig. 1, the method includes the following steps:
step S101, a multitask cooperative network is constructed, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
step S102, inputting training data into the multitask cooperative network, training the first branch network through a classification loss function of a mask wearing branch, a mask height regression loss function of the mask wearing branch and an isotropic combined loss function circulation between classification and regression of the mask wearing branch after the training data passes through a basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, and obtaining the trained multitask cooperative network;
in this embodiment, the classification loss function of the mask wearing branch is shown by the following formula 1:
Figure 708111DEST_PATH_IMAGE001
equation 1
Wherein the content of the first and second substances,
Figure 421989DEST_PATH_IMAGE002
indicating a classification loss of the mask wearing branch,
Figure 123098DEST_PATH_IMAGE003
the number of categories of the classified wearing of the mask is represented, the number of categories is 3, namely 1-wearing is not performed, 2-wearing is not performed according to the standard, 3-wearing is performed according to the standard,
Figure 924832DEST_PATH_IMAGE004
a category index representing the wearing classification of the mask, ranging from 1 to
Figure 270362DEST_PATH_IMAGE003
Figure 279776DEST_PATH_IMAGE005
The actual label of the mask wearing, which represents the current image data, i.e., 1-not worn, 2-not worn, 3-worn,
Figure 218913DEST_PATH_IMAGE006
means of
Figure 342114DEST_PATH_IMAGE004
In the same way, the first and second,
Figure 542151DEST_PATH_IMAGE007
represents the classification probability of the mask wearing branch output,
Figure 942040DEST_PATH_IMAGE008
means of and
Figure 493107DEST_PATH_IMAGE007
the same is true.
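A minimal sketch of formula 1 in plain Python (the function name and the list-based interface are illustrative, not from the patent):

```python
import math

def mask_class_loss(p, y):
    """Cross-entropy loss of the mask wearing classification branch
    (formula 1, as reconstructed): p is the predicted probability over
    the 3 classes (not worn / not to standard / to standard), y the
    one-hot ground-truth label."""
    # Only terms with a nonzero label contribute; skipping yi == 0
    # also avoids log(0) for zero-probability classes.
    return -sum(yi * math.log(pi) for yi, pi in zip(y, p) if yi > 0)
```

The loss shrinks as the probability assigned to the true class grows, which is what drives the branch toward the calibrated label.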
The mask height regression loss function of the mask wearing branch is shown by the following formula 2:

$$L_{reg} = \frac{1}{N}\sum_{j=1}^{N}\left(\hat{h}_j - h_j\right)^2$$

equation 2

wherein $L_{reg}$ denotes the mask height regression loss of the mask wearing branch; $N$ denotes the number of samples in the batch; $\hat{h}_j$ denotes the regression-predicted mask wearing height; and $h_j$ denotes the real calibrated mask wearing height. In the implementation, the mask wearing height is calibrated as a continuous value; some important calibrated heights are: not worn-0, nose tip-72, upper lip-65, eye bag-85, eyebrow-100, and the corresponding ranges are not worn-[0], not to standard-(0-72), to standard-[72-100].
The isotropic joint loss function between the classification and regression of the mask wearing branch is shown in the following formulas 3-4:

$$L_{iso} = -\sum_{i=1}^{C} q_i \log(p_i)$$

equation 3

$$q = \mathrm{onehot}\left(\mathrm{class}(\hat{h})\right)$$

equation 4

wherein $L_{iso}$ denotes the isotropic joint loss between the classification and regression of the mask wearing branch; $q$ denotes the classification probability derived from the regressed mask wearing height $\hat{h}$ via the calibrated class ranges, for example, when the height is 85 the classification probability is $[0, 0, 1]$; and $C$, $i$ and $p_i$ have the same meanings as in formula 1.
There is a strong semantic correlation between whether a mask is worn and the mask wearing height. Training the first branch network only through the classification loss function of the mask wearing branch and the mask height regression loss function of the mask wearing branch separately prevents the mask wearing classification branch and the mask wearing regression branch from cross-utilizing high-level semantic information, so the output is not accurate enough. Therefore, the first branch network is further trained through the isotropic joint loss function between the classification and regression of the mask wearing branch, so that the mask wearing classification branch and the mask wearing height regression branch are trained jointly, which improves the accuracy of the output.
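The joint training idea of formulas 3-4 can be sketched as follows (a sketch interpreting the joint loss as a cross-entropy between the classification probabilities and a pseudo-label derived from the regressed height; the function name is illustrative):

```python
import math

def iso_joint_loss(p_cls, h_reg):
    """Isotropic joint loss between classification and regression
    (formulas 3-4, as reconstructed): the regressed height is mapped to
    a one-hot pseudo-label q via the calibrated ranges, then the
    cross-entropy against the classification branch's probabilities
    p_cls is taken, tying the two branches together."""
    cls = 1 if h_reg <= 0 else (2 if h_reg < 72 else 3)
    q = [1.0 if i == cls else 0.0 for i in (1, 2, 3)]
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p_cls) if qi > 0)
```

With a regressed height of 85 (the patent's example, giving q = [0, 0, 1]), the loss penalizes the classification branch whenever it does not also predict "worn to standard".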
And S103, the face is recognized through the trained multitask cooperative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the classes comprise not worn, not worn to standard, and worn to standard. In this embodiment, if the mask wearing height is 72 or more, it may be determined that the mask is worn to standard; if the trained multi-task cooperative network outputs that the class number of pedestrian A's mask wearing classification is 3 and the mask wearing regression height is 75, this indicates that pedestrian A's mask is worn to standard.
Through steps S101 to S103, in contrast to the related-art problem that the checking effect is poor or it cannot be recognized whether a mask is worn to standard when masks are checked manually or by intelligent devices, this embodiment constructs a multitask collaborative network and trains the first branch network cyclically through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach the preset convergence state, obtaining the trained multitask collaborative network. The trained network outputs the class number of a pedestrian's mask wearing classification and the mask wearing regression height, from which it can be known whether the pedestrian's mask is worn to standard, and the output of the multitask collaborative network in this embodiment is more accurate, solving the problem in the related art.
In some embodiments, the multitask collaborative network further includes a second branch network, the second branch network includes a living body identification classification branch, fig. 2 is a flowchart of another mask wearing identification method according to an embodiment of the present application, and as shown in fig. 2, after the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, the method includes the following steps:
step S201, fixing mask wearing classification branch parameters and mask wearing regression branch parameters, and training a second branch network respectively through a classification loss function of a living body recognition branch and an isotropic combined loss function circulation between the regression of the mask wearing branch and the living body recognition classification until the living body recognition classification branch reaches a preset convergence state, so as to obtain a trained multi-task cooperative network;
in this embodiment, the classification loss function of the living body recognition branch is shown by the following formula 5:
Figure 769420DEST_PATH_IMAGE018
equation 5
Wherein the content of the first and second substances,
Figure 649652DEST_PATH_IMAGE019
indicating a loss of classification of the live recognition branch,
Figure 743378DEST_PATH_IMAGE020
the number of classes representing the recognition classification of the living body, which is 2, i.e., 1-prosthesis, 2-living body,
Figure 695154DEST_PATH_IMAGE021
class index representing living body recognition classification in the range of 1 to
Figure 839827DEST_PATH_IMAGE020
Figure 611999DEST_PATH_IMAGE022
The real label representing the live recognition of the current image data, i.e. 1-prosthesis, 2-live,
Figure 333967DEST_PATH_IMAGE023
means of and
Figure 964800DEST_PATH_IMAGE024
in the same way, the first and second,
Figure 947668DEST_PATH_IMAGE025
representing the classification probability of the live body recognition branch output,
Figure 763178DEST_PATH_IMAGE026
means of and
Figure 847808DEST_PATH_IMAGE025
the same is true.
The isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is shown in the following formulas 6-7:

$$L'_{iso} = -\,w \sum_{k=1}^{M} y'_k \log(p'_k)$$

equation 6

$$w = w(q)$$

equation 7

wherein $w$ denotes the penalty weight, a function of the regression-derived classification probability $q$; when the input is a living body and the mask is worn to standard, a weighted penalty is applied, improving the living body pass rate when a mask is worn to standard; $M$, $k$, $y'_k$ and $p'_k$ have the same meanings as in formula 5, and $q$ has the same meaning as in formula 3.
Through this embodiment, the loss is penalized according to the mask wearing height, which ultimately improves the living body judgment accuracy when a mask is worn.
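The height-dependent penalty of formulas 6-7 can be sketched as follows (a sketch under stated assumptions: the weights w_base and w_pen are illustrative values, not from the patent, and the height threshold 72 follows the calibration above):

```python
import math

def live_iso_loss(p_live, y_live, h_reg, w_base=1.0, w_pen=2.0):
    """Weighted liveness loss (formulas 6-7, as reconstructed): a larger
    penalty weight is applied when the sample is a living body whose
    mask is regressed as worn to standard (height >= 72), pushing such
    masked live faces harder toward correct liveness classification."""
    is_live = y_live == [0.0, 1.0]  # one-hot: [prosthesis, living body]
    w = w_pen if (is_live and h_reg >= 72) else w_base
    return -w * sum(yk * math.log(pk) for yk, pk in zip(y_live, p_live) if yk > 0)
```

A live face behind a to-standard mask thus incurs twice the loss of the same prediction on an unmasked live face, which is how the pass rate for masked living bodies is raised during training.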
And S202, recognizing the face through the trained multi-task cooperative network to obtain the class number of the mask wearing classification, the mask wearing regression height and the living body confidence coefficient.
In the related art, living body recognition mainly relies on details of the human face to distinguish real faces from fakes. Because a mask occludes the face, facial details are lost, so the living body confidence output by the model is low, conventional living body algorithms relying on a fixed threshold easily misjudge, and the mask has to be removed or pulled down manually for correct recognition. Therefore, in this embodiment, the second branch network is trained not only through the classification loss function of the living body recognition branch, but also through the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification, until the living body recognition branch reaches the preset convergence state. This improves the living body confidence when performing living body recognition on pedestrians wearing masks, solves the prior-art problem that the mask must be removed or pulled down manually for the living body to be correctly recognized, and, by performing living body recognition on pedestrians, also addresses the risk of prosthesis attacks such as printed paper and electronic screens.
In some embodiments, after the living body recognition classification branch reaches the preset convergence state, the mask wearing classification branch parameter, the mask wearing regression branch parameter and the living body recognition classification branch parameter are released, the first branch network is trained through the mask wearing classification loss function and the mask height regression loss function of the mask wearing branch, and the second branch network is trained through the living body recognition branch classification loss function until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch reach the preset convergence state, so that the trained multi-task cooperation network is obtained.
In this embodiment, the multi-task cooperative network is alternately optimized and trained through a progressive multi-stage cycle, where "progressive" means proceeding from easy to difficult, which reduces the difficulty of optimizing the multi-task cooperative network. The multi-stage cycle includes the following steps:
(1) First stage circulation: fix the living body recognition classification branch parameters and train the first branch network.
A: train the first branch network through the classification loss function of the mask wearing classification branch;
B: train the first branch network through the mask height regression loss function of the mask wearing branch;
C: train the first branch network through the isotropic joint loss function between the classification and regression of the mask wearing branch;
repeat steps A-C until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state.
(2) Second stage circulation: fix the mask wearing classification branch parameters and the mask wearing regression branch parameters and train the second branch network.
E: train the second branch network through the classification loss function of the living body recognition branch;
F: train the second branch network through the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification;
repeat steps E-F until the living body recognition classification branch reaches a preset convergence state.
(3) Third stage circulation: release the mask wearing classification branch parameters, the mask wearing regression branch parameters and the living body recognition classification branch parameters, and train the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch until they reach a preset convergence state.
(4) Repeat steps (1) to (3) until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch all reach a preset convergence state, obtaining the trained multi-task cooperative network.
Mask recognition and living body recognition are strongly correlated collaborative tasks. For the network, low-level visual information is shared, but the high-level semantic information must distinguish whether a mask is worn, the mask height, and whether the face is a living body, so direct end-to-end training converges with difficulty and struggles to balance the tasks. The above progressive multi-stage cyclic alternating optimization training strategy solves the problem that the multi-task cooperative network is difficult to converge and to balance across tasks.
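The ordering of loss updates in the schedule above can be sketched as follows (the string tags are illustrative labels for the losses of steps A-C, E-F and the joint third stage, not identifiers from the patent; real training would run each stage to convergence rather than a fixed number of rounds):

```python
def multistage_schedule(rounds):
    """Loss-update order of the progressive multi-stage cyclic training:
    stage 1 trains the mask branches with the liveness branch frozen,
    stage 2 trains the liveness branch with the mask branches frozen,
    and stage 3 releases and trains all branches jointly."""
    plan = []
    for _ in range(rounds):
        plan += ["mask_cls", "mask_reg", "iso_cls_reg"]  # stage 1 (A-C)
        plan += ["live_cls", "iso_reg_live"]             # stage 2 (E-F)
        plan += ["joint_all"]                            # stage 3
    return plan
```

The easy-before-difficult structure is visible in the plan: each round solves the two sub-problems in isolation before attempting the harder joint optimization.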
In some embodiments, the basic network includes several lightweight attention modules, that is, the basic network is constructed by stacking a plurality of lightweight attention modules, fig. 3 is a schematic structural diagram of a lightweight attention module according to an embodiment of the present application, and as shown in fig. 3, inputting training data to the lightweight attention module includes:
the input x is first global pooled (GAP) and then Channel Split (Channel Split) in the Channel dimension, resulting in two packets
Figure 196640DEST_PATH_IMAGE030
And
Figure 336635DEST_PATH_IMAGE031
reducing the operation amount, realizing light weight reasoning, respectively performing 1 × 1 convolution and Sigmoid activation on two groups, then splicing the two groups together on Channel dimension (Channel Cat) to obtain n, wherein n is a weight coefficient of each Channel, multiplying n and input x, and since n is a coefficient of each Channel, n x represents that each input Channel is configured with weight, namely applying an attention mechanism to each Channel, optimizing model feature expression, the weight coefficient corresponding to a useful Channel is large, the weight coefficient corresponding to an useless Channel is small, so that the network can be more focused on learning of the useful Channel, namely the attention mechanism, and finally performing Channel Shuffle operation, disordering the channels, increasing information flow in the channels, further enhancing the feature expression capability of the network, finally obtaining semantic information, inputting x after passing through a basic network, and finally obtaining high-level semantic information.
The attention module is described by the following formulas 8-10:

$$s_1,\, s_2 = \mathrm{CS}\big(\mathrm{GAP}(x)\big)$$

equation 8

$$n = \mathrm{CC}\big(\sigma(\mathrm{Conv}_{1\times1}(s_1)),\; \sigma(\mathrm{Conv}_{1\times1}(s_2))\big)$$

equation 9

$$y = \mathrm{Shuffle}\,(n \odot x)$$

equation 10

wherein $x$ is the input of the attention module; $y$ is the output of the attention module; the subscripts 1 and 2 are the group indexes; $s_1$, $s_2$ and $n$ are temporary variables of intermediate operations; $\odot$ denotes the element-wise product; $\mathrm{GAP}$ denotes the Global Average Pooling operation; $\mathrm{CS}$ denotes the Channel Split operation; $\sigma(\mathrm{Conv}_{1\times1}(\cdot))$ denotes a 1×1 convolution followed by a Sigmoid operation; $\mathrm{CC}$ denotes the Channel Cat operation; and $\mathrm{Shuffle}$ denotes the Channel Shuffle operation.
In this embodiment, the amount of computation is reduced through Channel Split, ensuring the light weight of the attention module, while information exchange between feature channels is promoted through Channel Shuffle, enhancing the feature expression of the model.
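The split / gate / concatenate / shuffle pipeline of formulas 8-10 can be sketched in plain Python (a simplified stand-in: x is a list of per-channel values, as if already GAP-pooled, and the 1×1 group convolutions are elided to identities for brevity; a real implementation would operate on 4-D tensors):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def light_attention(x):
    """Sketch of the lightweight attention module (formulas 8-10, as
    reconstructed): split channels into two groups, gate each with a
    Sigmoid, concatenate the gates, weight every channel, then shuffle."""
    half = len(x) // 2
    s1, s2 = x[:half], x[half:]               # Channel Split
    n = [sigmoid(v) for v in s1 + s2]         # gate each group, Channel Cat
    weighted = [w * v for w, v in zip(n, x)]  # n (x) x: per-channel attention
    # Channel Shuffle: interleave the two groups to mix channel information
    return [weighted[i // 2 + (i % 2) * half] for i in range(len(x))]
```

Channels with large (positive) pooled responses get gate weights near 1 and survive almost unchanged, while channels with strongly negative responses are suppressed, matching the useful/useless-channel behaviour described above.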
In the related art, mask recognition and living body recognition are generally used in access control scenarios, where passage is granted upon recognition. In practical applications, different places have different requirements on the mask wearing height, while the traditional scheme generally adopts a fixed threshold that cannot be set flexibly. Therefore, in some embodiments, after the face is recognized through the trained multitask collaborative network and the class number of the mask wearing classification and the mask wearing regression height are obtained, the mask wearing regression height is compared with a preset mask height, and the mask wearing output result is obtained according to the comparison result and the class number of the mask wearing classification, specifically realized by the following formula 11:
Figure 753251DEST_PATH_IMAGE046
equation 11
Wherein, the first and the second end of the pipe are connected with each other,
Figure 842430DEST_PATH_IMAGE047
shows the output result of the mask wearing, namely 1-not wearing, 2-not wearing in standard, 3-wearing in standard,
Figure 106052DEST_PATH_IMAGE048
the mask wearing regression height which represents the multitask collaborative network output ranges from 0 to 100, the larger the numerical value is, the higher the wearing height is,
Figure 331497DEST_PATH_IMAGE049
representing the mask wearing classification result output by the multitask collaborative network,
Figure 638851DEST_PATH_IMAGE050
indicate to predetermine gauze mask and wear the height, the user can set up as the demand, can set up to 72, and the tip of the nose position promptly, in this embodiment, the gauze mask that outputs in coordination with the network as multitask wears classification result and wears for the standard, but the gauze mask is worn and is regressed highly to be less than and predetermines gauze mask and wear the height, then explains that the gauze mask is worn and is accorded with general requirement, but is not conform to the requirement that the user set for, then final gauze mask is worn output result
Figure 215326DEST_PATH_IMAGE047
Still for the nonstandard wearing, can adjust through this embodiment in a flexible way and predetermine the gauze mask and wear the height, realize that the gauze mask wears the nimble deployment of discernment.
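The decision rule of Equation 11 can be sketched as follows; the constant and function names are ours, not the patent's:

```python
# Class numbers as defined in the description: 1 - not worn,
# 2 - worn non-standard, 3 - worn standard.
NOT_WORN, NON_STANDARD, STANDARD = 1, 2, 3

def mask_wear_output(cls_result, regress_height, preset_height=72):
    """Equation 11: downgrade a 'standard' classification to 'non-standard'
    when the regressed wearing height (0-100) falls below the site-specific
    preset height (72 roughly corresponds to the nose tip)."""
    if cls_result == STANDARD and regress_height < preset_height:
        return NON_STANDARD
    return cls_result
```

Raising or lowering `preset_height` per site is what gives the flexible deployment described above, without retraining the network.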
Meanwhile, the living body recognition threshold is difficult to adjust and adapt on site. Therefore, in some embodiments, after the face is recognized through the trained multitask collaborative network to obtain the class number of the mask wearing classification, the mask wearing regression height and the living body confidence, a pulled-up living body confidence is obtained from the pull-up coefficient, the mask wearing regression height and the living body confidence; the pulled-up living body confidence is then compared with a confidence threshold, and the output result of the living body recognition is obtained according to the comparison result, as in Equations 12-13:

[pull-up formula rendered as an image in the original publication]    (Equation 12)

R = pass (living body), if S' >= T;  R = fail (spoof attack), otherwise    (Equation 13)

wherein R denotes the output result of the living body recognition; α denotes the pull-up coefficient, which may be set to 0.03; S' denotes the pulled-up living body confidence, which improves the robustness of the living body judgment in mask wearing scenarios; S denotes the living body confidence output by the multitask collaborative network; and T denotes the preset living body confidence threshold, set by the user as required. Training the combined branch of mask height regression and living body recognition classification with the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification improves the accuracy of the living body judgment when a mask is worn, but cannot completely solve the problem that the living body output value is lower in that case. This embodiment therefore pulls up the living body confidence output by the multitask collaborative network, further alleviating the low-confidence problem. In an access control scenario, a living body confidence below the preset threshold is treated as a spoof attack and denied passage, forcing the user to pull down the mask and be recognized again; repeated swipes or mask adjustments would cause passage congestion, so pulling up the confidence improves the passage efficiency of living body recognition when masks are worn.
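The pull-up step can be sketched as below. Equation 12 itself is rendered only as an image in the original, so the additive form here (confidence plus pull-up coefficient times mask height) is our assumption, chosen to combine exactly the three quantities the description names; Equation 13 is the threshold comparison:

```python
def pulled_up_confidence(confidence, mask_height, alpha=0.03):
    """Assumed additive form of Equation 12: boost the liveness confidence
    in proportion to the mask wearing regression height (0-100)."""
    return confidence + alpha * mask_height

def liveness_output(confidence, mask_height, threshold, alpha=0.03):
    """Equation 13: compare the pulled-up confidence with the preset
    threshold; True means living body / pass, False means spoof / deny."""
    return pulled_up_confidence(confidence, mask_height, alpha) >= threshold
```

Under this assumed form, a high mask (large regression height) gives a larger boost, which matches the stated goal of not penalizing people who wear their masks properly.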
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment further provides a system for identifying wearing of a mask, which is used to implement the foregoing embodiments and preferred embodiments, and the description of the system is omitted here. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 4 is a schematic structural diagram of a multitask collaboration network according to an embodiment of the present application, and as shown in fig. 4, the system includes the multitask collaboration network, the multitask collaboration network includes a base network and a first branch network, and the first branch network includes a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network; after the training data passes through the base network, the first branch network is trained by cycling through the classification loss function of the mask wearing branch, the mask height regression loss function of the mask wearing branch, and the isotropic joint loss function between the classification and regression of the mask wearing branch, until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, yielding a trained multitask collaborative network; the face is recognized through the trained multitask collaborative network to obtain the class number of the mask wearing classification and the mask wearing regression height, wherein the class number includes not worn, worn in a non-standard manner, and worn in a standard manner. This solves the problem in the related art that checking masks manually or with intelligent devices has a poor inspection effect or cannot identify whether the mask is worn in a standard manner.
In some embodiments, as shown in fig. 4, the multitask coordination network further includes a second branch network, the second branch network includes a living body recognition classification branch, after the mask wearing classification branch and the mask wearing regression branch reach the preset convergence state, the mask wearing classification branch parameter and the mask wearing regression branch parameter are fixed, the second branch network is trained through a classification loss function of the living body recognition branch and an isotropic combined loss function loop between the regression of the mask wearing branch and the living body recognition classification respectively, until the living body recognition classification branch reaches the preset convergence state, and the trained multitask coordination network is obtained;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
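The staged training described in these embodiments (train the mask branches first, then freeze their parameters while training the living body branch, then release all parameters for joint fine-tuning) can be sketched as a schedule of trainable flags; the stage structure follows the description, but all names are illustrative, and a real framework would toggle gradient flags on the branch parameters instead:

```python
def run_stage(params, train_on, freeze):
    """Set per-branch 'trainable' flags for one training stage."""
    for name in freeze:
        params[name] = False
    for name in train_on:
        params[name] = True
    return dict(params)  # snapshot of this stage's configuration

def staged_schedule():
    """Three stages implied by the description: (1) train the mask wearing
    classification and regression branches, (2) freeze them and train the
    living body classification branch, (3) release everything."""
    params = {"mask_cls": True, "mask_reg": True, "live_cls": False}
    s1 = run_stage(params, ["mask_cls", "mask_reg"], ["live_cls"])
    s2 = run_stage(params, ["live_cls"], ["mask_cls", "mask_reg"])
    s3 = run_stage(params, ["mask_cls", "mask_reg", "live_cls"], [])
    return s1, s2, s3
```

Freezing the already-converged mask branches in stage 2 keeps their outputs stable while the living body branch learns against them, which is the point of the fixed-parameter step in the description.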
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the method for identifying wearing of the mask in the above embodiments, the embodiments of the present application may provide a storage medium to implement. The storage medium having stored thereon a computer program; the computer program, when executed by a processor, implements any of the above-described embodiments of a method of mask wear identification.
In one embodiment, a computer device is provided, which may be a terminal. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of mask wear identification. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of mask wear identification, the method comprising:
constructing a multitask cooperative network, wherein the multitask cooperative network comprises a basic network and a first branch network, and the first branch network comprises a mask wearing classification branch and a mask wearing regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of a mask wearing branch and an isotropic combined loss function loop between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network, wherein the mask wearing branch comprises a mask wearing classification branch and a mask wearing regression branch;
the isotropic joint loss function between the classification and regression of the mask wearing branch is as follows:

[loss formula rendered as an image in the original publication]

wherein L denotes the isotropic joint loss between the classification and regression of the mask wearing branch; p_j denotes the classification probability corresponding to the different regression mask heights; h denotes the mask height; K denotes the number of categories of the mask wearing classification; j denotes the category index of the mask wearing classification, ranging from 1 to K; y denotes the real mask wearing label of the current image data; ŷ has the same meaning as j; and p̂ has the same meaning as p_j;
the face is identified through a trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the class number comprises that the mask is not worn, is not worn according to the standard and is worn according to the standard.
2. The method of claim 1, wherein the multitask collaborative network further comprises a second branch network comprising a living body recognition classification branch, and wherein after the mask worn classification branch and the mask worn regression branch reach a predetermined convergence state, the method further comprises:
fixing mask wearing classification branch parameters and mask wearing regression branch parameters, and training the second branch network respectively through a classification loss function of a living body recognition branch and an isotropic combined loss function circulation between regression and living body recognition classification of the mask wearing branch until the living body recognition classification branch reaches a preset convergence state to obtain a trained multi-task cooperative network;
wherein the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is as follows:

[loss formulas rendered as images in the original publication]

wherein w denotes the penalty weight: when the input is a living body and the mask is worn in a standard manner, a weighted penalty is applied, improving the living body pass rate under standard mask wearing; M denotes the number of categories of the living body recognition classification; m denotes the category index of the living body recognition classification, ranging from 1 to M; z denotes the real living body label of the current image data; ẑ has the same meaning as m; q denotes the classification probability output by the living body recognition branch; q̂ has the same meaning as q; and p_j denotes the classification probability corresponding to the different regression mask heights;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
3. The method of claim 2, wherein after the living body identification classification branch reaches a preset convergence state, the method further comprises:
and releasing the mask wearing classification branch parameters, mask wearing regression branch parameters and living body recognition classification branch parameters, training the first branch network through the mask wearing branched classification loss function and the mask wearing branched mask height regression loss function, training the second branch network through the living body recognition branched classification loss function until the mask wearing classification branch, the mask wearing regression branch and the living body recognition classification branch reach a preset convergence state, and obtaining a trained multi-task collaborative network.
4. The method of claim 1, wherein the base network comprises a number of lightweight attention modules, wherein training data input to the lightweight attention modules comprises:
performing global pooling operation on training data, performing Channel splitting on Channel dimension to obtain a plurality of groups, performing 1 × 1 convolution and Sigmoid activation on the groups respectively, splicing the groups together again on the Channel dimension to obtain a weight coefficient of each Channel, multiplying the weight coefficient of each Channel by the training data, and performing Channel Shuffle operation to obtain semantic information;
and after the training data passes through the basic network, obtaining high-level semantic information.
5. The method according to claim 1, wherein after the trained multitask collaborative network is used for recognizing the human face to obtain the class number of the mask wearing classification and the mask wearing regression height, the method further comprises the following steps:
and obtaining a comparison result of the mask wearing regression height and a preset mask wearing height, and obtaining a mask wearing output result according to the comparison result and the class number of the mask wearing classification.
6. The method according to claim 1, wherein after the trained multitask collaborative network is used for recognizing the human face, and the category number of the mask wearing classification, the mask wearing regression height and the living body confidence coefficient are obtained, the method further comprises the following steps:
obtaining the living body confidence coefficient after the height is pulled up through the height coefficient, the mask wearing regression height and the living body confidence coefficient;
and obtaining a comparison result of the living body confidence coefficient after the pulling-up and a preset living body confidence coefficient threshold value, and obtaining an output result of the living body identification according to the comparison result.
7. A system for mask wear identification, the system comprising a multitask collaborative network, the multitask collaborative network comprising a base network and a first branch network, the first branch network comprising a mask wear classification branch and a mask wear regression branch;
inputting training data into the multitask collaborative network, and training the first branch network respectively through a classification loss function of a mask wearing branch, a mask height regression loss function of a mask wearing branch and an isotropic combined loss function loop between classification and regression of the mask wearing branch after the training data passes through the basic network until the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state to obtain a trained multitask collaborative network, wherein the mask wearing branch comprises a mask wearing classification branch and a mask wearing regression branch;
the isotropic joint loss function between the classification and regression of the mask wearing branch is as follows:

[loss formula rendered as an image in the original publication]

wherein L denotes the isotropic joint loss between the classification and regression of the mask wearing branch; p_j denotes the classification probability corresponding to the different regression mask heights; h denotes the mask height; K denotes the number of categories of the mask wearing classification; j denotes the category index of the mask wearing classification, ranging from 1 to K; y denotes the real mask wearing label of the current image data; ŷ has the same meaning as j; and p̂ has the same meaning as p_j;
the face is identified through the trained multitask collaborative network, and the class number of the mask wearing classification and the mask wearing regression height are obtained, wherein the class number comprises that the mask is not worn, is not worn according to the standard and is worn according to the standard.
8. The system according to claim 7, wherein the multitask collaborative network further comprises a second branch network, the second branch network comprises a living body recognition classification branch, after the mask wearing classification branch and the mask wearing regression branch reach a preset convergence state, mask wearing classification branch parameters and mask wearing regression branch parameters are fixed, the second branch network is trained through a classification loss function of the living body recognition branch and an isotropic combined loss function cycle between the regression of the mask wearing branch and the living body recognition classification respectively until the living body recognition classification branch reaches the preset convergence state, and a trained multitask collaborative network is obtained;
wherein the isotropic joint loss function between the regression of the mask wearing branch and the living body recognition classification is as follows:

[loss formulas rendered as images in the original publication]

wherein w denotes the penalty weight: when the input is a living body and the mask is worn in a standard manner, a weighted penalty is applied, improving the living body pass rate under standard mask wearing; M denotes the number of categories of the living body recognition classification; m denotes the category index of the living body recognition classification, ranging from 1 to M; z denotes the real living body label of the current image data; ẑ has the same meaning as m; q denotes the classification probability output by the living body recognition branch; q̂ has the same meaning as q; and p_j denotes the classification probability corresponding to the different regression mask heights;
and identifying the face through the trained multi-task cooperative network to obtain the classified class number of the mask wearing, the regression height of the mask wearing and the confidence coefficient of the living body.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method of mask wearing identification according to any one of claims 1 to 6.
10. A storage medium having a computer program stored thereon, wherein the computer program is configured to execute the method of mask wear identification according to any one of claims 1 to 6 when the computer program is run.
CN202210201148.6A 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask Active CN114267077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210201148.6A CN114267077B (en) 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask


Publications (2)

Publication Number Publication Date
CN114267077A CN114267077A (en) 2022-04-01
CN114267077B true CN114267077B (en) 2022-06-21

Family

ID=80833980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210201148.6A Active CN114267077B (en) 2022-03-03 2022-03-03 Method, system, device and medium for identifying wearing of mask

Country Status (1)

Country Link
CN (1) CN114267077B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115116122B (en) * 2022-08-30 2022-12-16 杭州魔点科技有限公司 Mask identification method and system based on double-branch cooperative supervision

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931580A (en) * 2020-07-09 2020-11-13 陕西师范大学 Mask wearing detection method
CN111931623A (en) * 2020-07-31 2020-11-13 南京工程学院 Face mask wearing detection method based on deep learning
CN112115818A (en) * 2020-09-01 2020-12-22 燕山大学 Mask wearing identification method
CN112818953A (en) * 2021-03-12 2021-05-18 苏州科达科技股份有限公司 Mask wearing state identification method, device, equipment and readable storage medium
CN113361397A (en) * 2021-06-04 2021-09-07 重庆邮电大学 Face mask wearing condition detection method based on deep learning
CN113420675A (en) * 2021-06-25 2021-09-21 浙江大华技术股份有限公司 Method and device for detecting mask wearing standardization

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101923B (en) * 2018-08-14 2020-11-27 罗普特(厦门)科技集团有限公司 Method and device for detecting mask wearing condition of person
KR20200053751A (en) * 2018-11-09 2020-05-19 전자부품연구원 Apparatus and method for masking face
CN111507199A (en) * 2020-03-25 2020-08-07 杭州电子科技大学 Method and device for detecting mask wearing behavior
US11361445B2 (en) * 2020-07-08 2022-06-14 Nec Corporation Of America Image analysis for detecting mask compliance
CN112183471A (en) * 2020-10-28 2021-01-05 西安交通大学 Automatic detection method and system for standard wearing of epidemic prevention mask of field personnel

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931580A (en) * 2020-07-09 2020-11-13 陕西师范大学 Mask wearing detection method
CN111931623A (en) * 2020-07-31 2020-11-13 南京工程学院 Face mask wearing detection method based on deep learning
CN112115818A (en) * 2020-09-01 2020-12-22 燕山大学 Mask wearing identification method
CN112818953A (en) * 2021-03-12 2021-05-18 苏州科达科技股份有限公司 Mask wearing state identification method, device, equipment and readable storage medium
CN113361397A (en) * 2021-06-04 2021-09-07 重庆邮电大学 Face mask wearing condition detection method based on deep learning
CN113420675A (en) * 2021-06-25 2021-09-21 浙江大华技术股份有限公司 Method and device for detecting mask wearing standardization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Face mask recognition based on MTCNN and MobileNet;He-Ming T.等;《2021 3rd International Academic Exchange Conference on Science and Technology Innovation (IAECST)》;20220207;第471-474页 *
改进轻量级卷积神经网络的复杂场景口罩佩戴检测方法;薛均晓 等;《计算机辅助设计与图形学学报》;20210731;第33卷(第7期);第1045-1054页 *

Also Published As

Publication number Publication date
CN114267077A (en) 2022-04-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant