CN113222973B - Image processing method and device, processor, electronic equipment and storage medium - Google Patents

Image processing method and device, processor, electronic equipment and storage medium

Info

Publication number
CN113222973B
CN113222973B (application CN202110600103.1A)
Authority
CN
China
Prior art keywords
area
face
image
skin
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110600103.1A
Other languages
Chinese (zh)
Other versions
CN113222973A (en)
Inventor
张金豪
高哲峰
李若岱
庄南庆
马堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202110600103.1A priority Critical patent/CN113222973B/en
Publication of CN113222973A publication Critical patent/CN113222973A/en
Priority to PCT/CN2022/080403 priority patent/WO2022252737A1/en
Priority to TW111114745A priority patent/TWI787113B/en
Application granted granted Critical
Publication of CN113222973B publication Critical patent/CN113222973B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/187Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)
  • Plural Heterocyclic Compounds (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application discloses an image processing method and device, a processor, electronic equipment and a storage medium. The method comprises the following steps: acquiring an image to be processed, a first threshold, a second threshold and a third threshold, wherein the first threshold and the second threshold are different, the first threshold and the third threshold are different, and the second threshold is smaller than or equal to the third threshold; determining a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points with the color value being more than or equal to the second threshold value and less than or equal to the third threshold value; and obtaining a skin shielding detection result of the image to be processed according to the ratio of the first quantity to the quantity of the pixel points of the skin area and the first threshold value.

Description

Image processing method and device, processor, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, a processor, an electronic device, and a storage medium.
Background
To improve detection safety, non-contact detection of skin is applied to more and more scenes. The detection accuracy of such non-contact detection processing is largely affected by the skin shielding state. For example, if a large part of the skin area is blocked, the accuracy of the detection result obtained by performing non-contact detection processing on the skin area may be low under the influence of the object blocking the skin area. Therefore, detecting the skin shielding state is of great significance.
Disclosure of Invention
The application provides an image processing method and device, a processor, electronic equipment and a storage medium, so as to determine whether a forehead is in a shielding state.
The application provides an image processing method, which comprises the following steps:
acquiring an image to be processed, a first threshold, a second threshold and a third threshold, wherein the first threshold and the second threshold are different, the first threshold and the third threshold are different, and the second threshold is smaller than or equal to the third threshold;
determining a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points with the color value being more than or equal to the second threshold value and less than or equal to the third threshold value;
and obtaining a skin shielding detection result of the image to be processed according to the ratio of the first quantity to the quantity of the pixel points of the skin area and the first threshold value.
In combination with any one of the embodiments of the present application, the skin area includes a face area, and the skin occlusion detection result includes a face occlusion detection result;
before said determining the first number of first pixel points in the skin area of the image to be processed, the method further comprises:
Performing face detection processing on the image to be processed to obtain a first face frame;
and determining the face area from the image to be processed according to the first face frame.
In combination with any one of the embodiments of the present application, the face area includes a forehead area, the face shielding detection result includes a forehead shielding detection result, and the first face frame includes: an upper frame line and a lower frame line; the upper frame line and the lower frame line are edges, parallel to the transverse axis of the pixel coordinate system of the image to be processed, of the first face frame, and the ordinate of the upper frame line is smaller than that of the lower frame line;
the step of determining the face area from the image to be processed according to the first face frame includes:
detecting the key points of the face of the image to be processed to obtain at least one key point of the face; the at least one face key point comprises a left eyebrow key point and a right eyebrow key point;
under the condition that the ordinate of the upper frame line is kept unchanged, the lower frame line is moved along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line where the lower frame line is located coincides with the first straight line, and a second face frame is obtained; the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point;
And obtaining the forehead area according to the area contained in the second face frame.
In combination with any one of the embodiments of the present application, the obtaining the forehead area according to the area included in the second face frame includes:
under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, moving the upper frame line of the second face frame along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, and obtaining a third face frame;
and obtaining the forehead area according to the area contained in the third face frame.
In combination with any one of the embodiments of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point; the first face frame further includes: a left frame line and a right frame line; the left frame line and the right frame line are edges, parallel to the longitudinal axis of the pixel coordinate system of the image to be processed, of the first face frame, and the abscissa of the left frame line is smaller than the abscissa of the right frame line;
the obtaining the forehead area according to the area contained in the third face frame includes:
Under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, moving the right frame line of the third face frame along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, and obtaining a fourth face frame; the reference distance is the distance between two intersection points of the second straight line and the face outline contained in the third face frame; the second straight line is a straight line which is between the first straight line and the third straight line and is parallel to the first straight line or the third straight line; the third straight line is a straight line passing through the left-mouth corner key point and the right-mouth corner key point;
and taking the area contained in the fourth face frame as the forehead area.
In combination with any one of the embodiments of the present application, before the determining the first number of first pixel points in the skin area of the image to be processed, the method further includes:
determining a skin pixel point area from the pixel point areas contained in the first face frame;
acquiring a color value of a second pixel point in the skin pixel point area;
taking the difference between the color value of the second pixel point and the first value as the second threshold value, and taking the sum of the color value of the second pixel point and the second value as the third threshold value; and the first value and the second value do not exceed the maximum value of the color value of the image to be processed.
In combination with any one of the embodiments of the present application, before the determining the skin pixel area from the pixel areas contained in the first face frame, the method further includes:
carrying out mask wearing detection processing on the image to be processed to obtain a detection result;
the determining the skin pixel point area from the pixel point areas contained in the first face frame comprises the following steps:
when the detection result is that the mask is not worn in the face area, taking a pixel point area except for the forehead area, the mouth area, the eyebrow area and the eye area in the face area as the skin pixel point area;
when the detection result is that the mask is worn in the face area, taking a pixel point area between the first straight line and the fourth straight line as the skin pixel point area; the fourth straight line is a straight line passing through the left lower eyelid key point and the right lower eyelid key point; the left eye lower eyelid keypoint and the right eye lower eyelid keypoint both belong to the at least one face keypoint.
In combination with any one of the embodiments of the present application, the obtaining the color value of the second pixel in the skin pixel area includes:
Determining a rectangular region according to the at least one first key point and the at least one second key point in the case that the at least one face key point comprises at least one first key point located in the region on the inner side of the left eyebrow and at least one second key point located in the region on the inner side of the right eyebrow;
carrying out graying treatment on the rectangular region to obtain a gray map of the rectangular region;
taking the color value of the intersection point of the first row and the first column as the color value of the second pixel point; the first row is the row with the largest sum of gray values in the gray scale map, and the first column is the column with the largest sum of gray values in the gray scale map.
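For the selection rule above, a minimal sketch is given below. It assumes the rectangular region between the inner eyebrow key points has already been cropped as a BGR image, and uses OpenCV/NumPy purely for illustration; the function name and the use of an HSV color value are assumptions, not requirements of the present application.

```python
import cv2
import numpy as np

def second_pixel_color_value(rect_bgr):
    """Pick the color value of the second pixel point from the rectangular region.

    rect_bgr: H x W x 3 crop of the region between the inner eyebrow key points.
    Returns the color value (here an HSV triple) of the pixel at the intersection
    of the row and the column with the largest sums of gray values.
    """
    gray = cv2.cvtColor(rect_bgr, cv2.COLOR_BGR2GRAY)
    first_row = int(np.argmax(gray.sum(axis=1)))  # row with the largest sum of gray values
    first_col = int(np.argmax(gray.sum(axis=0)))  # column with the largest sum of gray values
    hsv = cv2.cvtColor(rect_bgr, cv2.COLOR_BGR2HSV)
    return hsv[first_row, first_col]
```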
In combination with any one of the embodiments of the present application, the obtaining, according to the ratio of the first number to the number of the pixel points of the skin area and the first threshold, a skin occlusion detection result of the image to be processed includes:
determining that the skin occlusion detection result is that the skin area is in an occlusion state under the condition that the ratio of the first number to the number of pixel points in the skin area does not exceed the first threshold value;
and under the condition that the ratio of the first number to the number of the pixel points in the skin area exceeds the first threshold value, determining that the skin occlusion detection result is that the skin area is in a non-occlusion state.
In combination with any one of the embodiments of the present application, the skin area belongs to a person to be detected, and the method further includes:
acquiring a temperature thermodynamic diagram of the image to be processed;
and when the skin occlusion detection result shows that the skin area is in a non-occlusion state, reading the temperature of the skin area from the temperature thermodynamic diagram as the body temperature of the person to be detected.
In some embodiments, the present application further provides an apparatus for image processing, the apparatus including:
the acquisition unit is used for acquiring an image to be processed, a first threshold value, a second threshold value and a third threshold value, wherein the first threshold value is different from the second threshold value, the first threshold value is different from the third threshold value, and the second threshold value is smaller than or equal to the third threshold value;
a first processing unit for determining a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points with the color value being more than or equal to the second threshold value and less than or equal to the third threshold value;
the detection unit is used for obtaining a skin shielding detection result of the image to be processed according to the ratio of the first quantity to the quantity of the pixel points of the skin area and the first threshold value.
In combination with any one of the embodiments of the present application, the skin area includes a face area, and the skin occlusion detection result includes a face occlusion detection result;
the image processing apparatus further includes: the second processing unit is used for carrying out face detection processing on the image to be processed before the first number of first pixel points in the skin area of the image to be processed is determined, so as to obtain a first face frame;
and determining the face area from the image to be processed according to the first face frame.
In combination with any one of the embodiments of the present application, the face area includes a forehead area, the face shielding detection result includes a forehead shielding detection result, and the first face frame includes: an upper frame line and a lower frame line; the upper frame line and the lower frame line are edges, parallel to the transverse axis of the pixel coordinate system of the image to be processed, of the first face frame, and the ordinate of the upper frame line is smaller than that of the lower frame line;
the second processing unit is used for:
detecting the key points of the face of the image to be processed to obtain at least one key point of the face; the at least one face key point comprises a left eyebrow key point and a right eyebrow key point;
Under the condition that the ordinate of the upper frame line is kept unchanged, the lower frame line is moved along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line where the lower frame line is located coincides with the first straight line, and a second face frame is obtained; the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point;
and obtaining the forehead area according to the area contained in the second face frame.
In combination with any one of the embodiments of the present application, the second processing unit is configured to:
under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, moving the upper frame line of the second face frame along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, and obtaining a third face frame;
and obtaining the forehead area according to the area contained in the third face frame.
In combination with any one of the embodiments of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point; the first face frame further includes: a left frame line and a right frame line; the left frame line and the right frame line are edges, parallel to the longitudinal axis of the pixel coordinate system of the image to be processed, of the first face frame, and the abscissa of the left frame line is smaller than the abscissa of the right frame line;
The second processing unit is used for:
under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, moving the right frame line of the third face frame along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, and obtaining a fourth face frame; the reference distance is the distance between two intersection points of the second straight line and the face outline contained in the third face frame; the second straight line is a straight line which is between the first straight line and the third straight line and is parallel to the first straight line or the third straight line; the third straight line is a straight line passing through the left-mouth corner key point and the right-mouth corner key point;
and taking the area contained in the fourth face frame as the forehead area.
In combination with any one of the embodiments of the present application, the image processing apparatus further includes: a determining unit, configured to determine a skin pixel point area from the pixel point areas contained in the first face frame before the determining of the first number of first pixel points in the skin area of the image to be processed;
the obtaining unit is further configured to obtain a color value of a second pixel in the skin pixel area;
The first processing unit is further configured to take a difference between the color value of the second pixel point and the first value as the second threshold value, and take a sum of the color value of the second pixel point and the second value as the third threshold value; and the first value and the second value do not exceed the maximum value of the color value of the image to be processed.
In combination with any one of the embodiments of the present application, the image processing apparatus further includes: the third processing unit is used for carrying out mask wearing detection processing on the image to be processed before the skin pixel point area is determined from the pixel point areas contained in the first face frame, so as to obtain a detection result;
the determining unit is used for:
when the detection result is that the mask is not worn in the face area, taking a pixel point area except for the forehead area, the mouth area, the eyebrow area and the eye area in the face area as the skin pixel point area;
when the detection result is that the mask is worn in the face area, taking a pixel point area between the first straight line and the fourth straight line as the skin pixel point area; the fourth straight line is a straight line passing through the left lower eyelid key point and the right lower eyelid key point; the left eye lower eyelid keypoint and the right eye lower eyelid keypoint both belong to the at least one face keypoint.
In combination with any one of the embodiments of the present application, the obtaining unit is configured to:
determining a rectangular region according to the at least one first key point and the at least one second key point in the case that the at least one face key point comprises at least one first key point located in the region on the inner side of the left eyebrow and at least one second key point located in the region on the inner side of the right eyebrow;
carrying out graying treatment on the rectangular region to obtain a gray map of the rectangular region;
taking the color value of the intersection point of the first row and the first column as the color value of the second pixel point; the first row is the row with the largest sum of gray values in the gray scale map, and the first column is the column with the largest sum of gray values in the gray scale map.
In combination with any one of the embodiments of the present application, the detection unit is configured to:
determining that the skin occlusion detection result is that the skin area is in an occlusion state under the condition that the ratio of the first number to the number of pixel points in the skin area does not exceed the first threshold value;
and under the condition that the ratio of the first number to the number of the pixel points in the skin area exceeds the first threshold value, determining that the skin occlusion detection result is that the skin area is in a non-occlusion state.
In combination with any one of the embodiments of the present application, the skin area belongs to a person to be detected, and the acquiring unit is further configured to:
acquiring a temperature thermodynamic diagram of the image to be processed;
the image processing apparatus further includes: and the fourth processing unit is used for reading the temperature of the skin area from the temperature thermodynamic diagram as the body temperature of the person to be detected when the skin occlusion detection result shows that the skin area is in a non-occlusion state.
The present application also provides a processor for performing the method of the first aspect and any one of its possible implementation manners.
The application also provides an electronic device, comprising: a processor, transmission means, input means, output means and memory for storing computer program code comprising computer instructions which, when executed by the processor, cause the electronic device to carry out the method as described in the first aspect and any one of its possible implementations.
The present application also provides a computer readable storage medium having stored therein a computer program comprising program instructions which, when executed by a processor, cause the processor to perform a method as described in the first aspect and any one of its possible implementations.
The present application also provides a computer program product comprising a computer program or instructions which, when run on a computer, cause the computer to perform the method of the first aspect and any one of the possible implementations thereof.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly describe the technical solutions in the embodiments or the background of the present application, the following description will describe the drawings that are required to be used in the embodiments or the background of the present application.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.
Fig. 1 is a schematic diagram of a pixel coordinate system according to an embodiment of the present application;
fig. 2 is a schematic flow chart of an image processing method according to an embodiment of the present application;
fig. 3 is a flowchart of another image processing method according to an embodiment of the present application;
fig. 4 is a schematic diagram of a face key point provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 6 is a schematic hardware structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The terms first, second and the like in the description and in the claims of the present application and in the above-described figures, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Before proceeding with the following explanation, a pixel coordinate system in the embodiment of the present application is first defined. As shown in fig. 1, a pixel coordinate system xoy is constructed with the upper left corner of the image as the origin o of the pixel coordinate system, the direction parallel to the rows of the image as the direction of the x-axis, and the direction parallel to the columns of the image as the direction of the y-axis. In the pixel coordinate system, the abscissa is used to represent the number of columns of pixels in the image, the ordinate is used to represent the number of rows of pixels in the image, and the units of the abscissa and the ordinate may be pixels. For example, let the coordinates of pixel a in fig. 1 be (30, 25), that is, the abscissa of pixel a is 30 pixels, the ordinate of pixel a is 25 pixels, and pixel a is the pixel of the 30 th column and 25 th row in the image.
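As an illustration of this convention (assuming a NumPy array representation of the image, which the present application does not mandate), note that array indexing is row-first while the pixel coordinate is written abscissa-first:

```python
import numpy as np

# A 480-row by 640-column image: rows correspond to the ordinate (y),
# columns correspond to the abscissa (x) of the pixel coordinate system.
image = np.zeros((480, 640, 3), dtype=np.uint8)

x, y = 30, 25          # pixel A at (30, 25): column 30, row 25
pixel_a = image[y, x]  # array indexing is [row, column] = [ordinate, abscissa]
```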
To improve detection safety, non-contact detection of skin is applied to more and more scenes. The detection accuracy of such non-contact detection processing is largely affected by the skin shielding state. For example, if a large part of the skin area is blocked, the accuracy of the detection result obtained by performing non-contact detection processing on the skin area may be low under the influence of the object blocking the skin area. Therefore, detecting the skin shielding state is of great significance.
For example, non-contact thermometry is currently widely used in the field of body temperature detection. The non-contact temperature measuring tool has the advantages of high measuring speed, over-temperature voice alarm and the like, and is specially used for rapidly screening the body temperature in public places with particularly high people flow.
The thermal imaging device mainly collects light in a thermal infrared band to detect thermal radiation emitted by an object, and finally establishes an accurate corresponding relation with temperature to realize a temperature measurement function. As a non-contact temperature measuring tool, the thermal imaging equipment can cover a larger area, improve the passing speed and reduce the group gathering time.
The thermal imaging device mainly recognizes the position of the forehead and then measures the body temperature based on that region. However, if a pedestrian wears a hat or has bangs, the thermal imaging device cannot determine whether the forehead area is in a shielding state. In this case, whether the shielding state of the forehead can be detected has a great influence on the accuracy of the measured body temperature.
Based on this, the embodiment of the application provides an image processing method to realize skin occlusion detection processing.
The execution subject of the embodiment of the present application is an image processing apparatus, which may be one of the following: a mobile phone, a computer, a server, or a tablet computer.
Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application.
Referring to fig. 2, fig. 2 is a flowchart of an image processing method according to an embodiment of the present application.
201. Acquiring an image to be processed, a first threshold, a second threshold and a third threshold, wherein the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is smaller than or equal to the third threshold.
In this embodiment of the present application, the image to be processed may be an image containing a face or an image not containing a face. The first threshold is a preset ratio of the number of skin pixel points in the forehead area to the total number of pixel points in the forehead area; it is set according to the specific embodiment and serves as the criterion for evaluating whether the forehead area is blocked.
The first threshold in the embodiments of the present application relates to the accuracy required by temperature detection or other embodiments. For example, assume that a temperature measurement operation is performed on a pedestrian's forehead area: the more skin exposed in the forehead area, the more accurate the temperature measurement result. Suppose the temperature measurement result is considered accurate when more than 60% of the skin in the forehead area is exposed. If such accuracy is desired, the first threshold is set to 60%. If higher accuracy is required in the temperature detection scenario, the first threshold may be set above 60%. If setting the first threshold to 60% is considered too strict and such accurate results are in fact not required, the first threshold may be set below 60%; in this case, the accuracy of the corresponding temperature measurement results will be correspondingly lower. Therefore, the first threshold needs to be set according to the specific implementation, and the embodiment of the present application does not limit it.
In one implementation of acquiring an image to be processed, an image processing device receives an image to be processed input by a user through an input component. The input assembly includes: a keyboard, a mouse, a touch screen, a touch pad, an audio input device, and the like.
In another implementation manner of acquiring the image to be processed, the image processing device receives the image to be processed sent by a data terminal. The data terminal may be any of the following: a mobile phone, a computer, a tablet computer, or a server.
In still another implementation manner of acquiring the image to be processed, the image processing device receives the image to be processed sent by a monitoring camera. Alternatively, the monitoring camera may be deployed on non-contact temperature measurement products such as artificial intelligence (AI) infrared imagers and security gates (such products are mainly placed in densely populated scenes such as stations, airports, subways, shops, supermarkets, schools, company lobbies and residential community entrances).
In still another implementation manner of obtaining the image to be processed, the image processing device receives the video stream sent by a monitoring camera, decodes the video stream, and uses the obtained images as the images to be processed. Alternatively, the monitoring camera may be deployed on non-contact temperature measurement products such as AI infrared imagers and security gates (such products are mainly placed in scenes with dense pedestrian traffic, such as stations, airports, subways, shops, supermarkets, schools, company lobbies and residential community entrances).
In still another implementation manner of acquiring the image to be processed, the image processing device is connected to the cameras, and the image processing device can acquire a data frame acquired in real time from each camera, where the data frame includes two forms of images and videos.
It should be understood that the number of cameras connected to the image processing apparatus is not fixed; as long as the network address of a camera is input to the image processing apparatus, the image processing apparatus can obtain the data frames acquired by that camera.
For example, if a person at place A wants to use the technical solution provided by the present application, the network address of the camera at place A only needs to be input to the image processing device; the image processing device can then acquire the data frames collected by the camera at place A, perform subsequent processing on them, and output a detection result indicating whether the forehead is blocked.
202. And determining a first number of first pixel points in the skin area of the image to be processed, wherein the first pixel points are pixel points with color values larger than or equal to the second threshold value and smaller than or equal to the third threshold value.
In the embodiment of the present application, the color value is a value in the HSV color model. In this model, the three parameters of the color value are hue (H), saturation (S) and brightness (V), so the color value carries hue, saturation and brightness information.
Specifically, the image processing apparatus regards pixels having a color value equal to or greater than the second threshold value and equal to or less than the third threshold value as skin pixels. That is, in the embodiment of the present application, the second threshold value and the third threshold value are used to determine whether the pixel point is a skin pixel point.
In the implementation manner of determining the first pixel points, a pixel point of the skin area can be considered a skin pixel point of the unblocked skin area only when every parameter of its color value is greater than or equal to the corresponding parameter of the second threshold and less than or equal to the corresponding parameter of the third threshold. For example, let the second threshold be H = 26, S = 43, V = 46 and the third threshold be H = 34, S = 255, V = 255. Then, for a skin pixel point of the unblocked skin area, H ranges from 26 to 34, S ranges from 43 to 255, and V ranges from 46 to 255. If the color value of a certain pixel point of the skin area is H = 25, S = 45, V = 200, the pixel point is not considered a skin pixel point of the unblocked skin area, because its H value is not within the set range of 26 to 34. If the color value of a pixel point of the skin area is H = 28, S = 45, V = 200, the pixel point is considered a skin pixel point of the unblocked skin area, because the values of H, S and V are all within the set ranges. That is, after converting the skin area from the RGB channels to the HSV channels, only when the color value of a certain pixel point of the skin area lies between the second threshold and the third threshold given above is the color of that pixel point the color of unblocked skin, and that pixel point is a first pixel point.
The image processing device further determines the number of first pixel points to obtain a first number after determining the first pixel points in the skin area.
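A minimal sketch of the determination in step 202 is given below, assuming the skin area has already been cropped as a BGR image and using OpenCV for the HSV conversion; the threshold triples are the illustrative values from the example above, not values mandated by the present application.

```python
import cv2
import numpy as np

def count_first_pixels(skin_area_bgr,
                       second_threshold=(26, 43, 46),     # illustrative H, S, V lower bounds
                       third_threshold=(34, 255, 255)):   # illustrative H, S, V upper bounds
    """Count pixel points whose H, S and V all lie between the second and third thresholds."""
    hsv = cv2.cvtColor(skin_area_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(second_threshold), np.array(third_threshold))
    first_number = int(cv2.countNonZero(mask))                  # the "first number"
    skin_pixel_count = skin_area_bgr.shape[0] * skin_area_bgr.shape[1]
    return first_number, skin_pixel_count
```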
203. And obtaining a skin shielding detection result of the image to be processed according to the ratio of the first quantity to the quantity of the pixel points of the skin area and the first threshold value.
In this embodiment of the present application, the skin occlusion detection result includes that the skin area is in an occluded state or the skin area is in an unoccluded state.
In this embodiment of the present application, the ratio of the first number to the number of pixel points in the skin area indicates the proportion of unblocked skin pixel points in the skin area (hereinafter simply referred to as the proportion). If the proportion is small, the skin area is blocked; conversely, if the proportion is large, the skin area is not blocked.
In this embodiment of the present application, the image processing apparatus uses the first threshold as the basis for judging the size of the proportion, so as to determine whether the skin area is blocked according to the size of the proportion, thereby obtaining the skin occlusion detection result.
In one possible implementation, if the proportion does not exceed the first threshold, the proportion is relatively small, and the skin area is determined to be in an occluded state; if the proportion exceeds the first threshold, the proportion is relatively large, and the skin area is determined to be in an unoccluded state.
In this application, the image processing apparatus determines, according to the second threshold and the third threshold, the number of skin pixel points in the skin area of the image to be processed, that is, the first number. The ratio of the first number to the number of pixel points in the skin area is then calculated to obtain the proportion of skin pixel points in the skin area, and the occlusion state of the skin area can be determined according to the magnitude relation between this proportion and the first threshold, thereby obtaining the skin occlusion detection result of the image to be processed.
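Combining the first number with the first threshold, step 203 can be sketched as follows; the 60% default is only the illustrative value discussed earlier, not a prescribed constant.

```python
def skin_occlusion_result(first_number, skin_pixel_count, first_threshold=0.60):
    """Return the skin occlusion detection result of step 203."""
    proportion = first_number / skin_pixel_count   # proportion of unblocked skin pixel points
    if proportion > first_threshold:
        return "skin area in unoccluded state"
    return "skin area in occluded state"
```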
As an alternative embodiment, the skin area comprises a face area and the skin occlusion detection result comprises a face occlusion detection result. In this embodiment, after determining the number of skin pixel points in the face area of the image to be processed, the image processing apparatus further determines the proportion of skin pixel points in the face area, so as to determine whether the face area is blocked according to this proportion and obtain the face occlusion detection result. Specifically, when the face area is blocked, the face occlusion detection result is determined to be that the face area is in an occluded state; when the face area is not blocked, the face occlusion detection result is determined to be that the face area is in an unoccluded state.
In such an embodiment, the image processing apparatus further performs the following steps before determining the first number of first pixel points in the skin area of the image to be processed:
1. and carrying out face detection processing on the image to be processed to obtain a first face frame.
In the embodiment of the application, the face detection process is used for identifying whether the image to be processed contains the person object.
And carrying out face detection processing on the image to be processed to obtain the coordinates of a first face frame (shown as D in fig. 1). The coordinates of the first face frame may be the coordinates of the upper left corner, the lower right corner or the upper right corner; they may also be a pair of diagonal corner coordinates, i.e. the upper left corner and the lower right corner, or the lower left corner and the upper right corner. The first face frame contains the area from the forehead to the chin of the face.
In one possible implementation manner, feature extraction processing is performed on the image to be processed through a pre-trained neural network to obtain feature data, and the pre-trained neural network identifies, according to the features in the feature data, whether the image to be processed contains a face. When the feature data indicate that the image to be processed contains a face, the position of the first face frame in the image to be processed is determined, that is, face detection is realized. The face detection processing on the image to be processed can be realized through a convolutional neural network.
The convolutional neural network is trained by taking a plurality of images with labeling information as training data, so that the trained convolutional neural network can finish face detection processing of the images. The labeling information of the images in the training data is the positions of the faces. In the process of training the convolutional neural network by using training data, the convolutional neural network extracts feature data of an image from the image, determines whether a face exists in the image according to the feature data, and obtains the position of the face according to the feature data of the image under the condition that the face exists in the image. And supervising the result obtained by the convolutional neural network in the training process by taking the labeling information as the supervision information, updating the parameters of the convolutional neural network, and finishing the training of the convolutional neural network. In this way, the trained convolutional neural network can be used for processing the image to be processed so as to obtain the position of the face in the image to be processed.
In another possible implementation, the face detection process may be implemented by a face detection algorithm, where the face detection algorithm may be one of the following: face detection based on histogram coarse segmentation and singular value features, face detection based on binary wavelet transform, the probabilistic decision-based neural network method (PDBNN), the hidden Markov model method (hidden Markov model, HMM), and the like. The face detection algorithm used to realize the face detection processing is not specifically limited in the present application.
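Purely as an illustration (the present application does not prescribe a particular detector), a first face frame could be obtained with a pretrained OpenCV Haar-cascade detector; the returned coordinates follow the pixel coordinate system defined above.

```python
import cv2

def detect_first_face_frame(image_bgr):
    """Return (x_left, y_top, x_right, y_bottom) of a first face frame, or None if no face is found."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                # take the first detected face
    return x, y, x + w, y + h            # left, upper, right and lower frame lines
```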
2. And determining the face area from the image to be processed according to the first face frame.
In one possible implementation, the image processing apparatus uses an area surrounded by the first face frame as the face area.
As an alternative embodiment, the first face frame includes: an upper frame line, a lower frame line, a left frame line and a right frame line. The upper frame line and the lower frame line are the edges of the first face frame that are parallel to the transverse axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than that of the lower frame line; the left frame line and the right frame line are the edges of the first face frame that are parallel to the longitudinal axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line.
In this embodiment, the face area includes a forehead area, and the image processing apparatus determines the face area from the image to be processed according to the first face frame, that is, determines the forehead area from the image to be processed according to the first face frame.
In one implementation of determining the forehead area, the distance between the upper frame line and the lower frame line is the distance from the forehead to the chin of the face contained in the first face frame, and the distance between the left frame line and the right frame line is the distance between the inner side of the left ear and the inner side of the right ear of that face. Generally, the width of the forehead area of a human face is about 1/3 of the length of the entire face; although this proportion varies from person to person, for every person it falls within the range of 30% to 40%. Therefore, with the ordinate of the upper frame line kept unchanged, the lower frame line is moved along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line and the moved lower frame line is 30% to 40% of the original distance between the upper frame line and the lower frame line; the area contained in the moved first face frame is then the forehead area. When the coordinates of the first face frame are a pair of diagonal coordinates, the coordinates of the upper left corner or the upper right corner of the first face frame determine the position of the forehead area. Therefore, by changing the size and the position of the first face frame, the area within the first face frame can be made the forehead area of the face in the image to be processed.
In another implementation of determining the forehead region, the image processing apparatus determines the forehead region by performing the steps of:
21. detecting the key points of the face of the image to be processed to obtain at least one key point of the face; the at least one face key point includes a left eyebrow key point and a right eyebrow key point.
In this embodiment, by performing face key point detection on the image to be processed, at least one face key point is obtained, where the at least one key point includes a left eyebrow key point and a right eyebrow key point.
And carrying out feature extraction processing on the image to be processed to obtain feature data, so that the detection of face key points can be realized. The feature extraction processing may be implemented through a pre-trained neural network, or may be implemented through a feature extraction model, which is not limited in this application. The feature data are used for extracting the key point information of the face in the image to be processed. The image to be processed is a digital image, and the feature data obtained by carrying out feature extraction processing on the image to be processed can be understood as deeper semantic information of the image to be processed.
In a possible implementation manner of face key point detection, a training face image set is established, and the key point positions to be detected are marked. A first-layer deep neural network is constructed to train a face region estimation model, and a second-layer deep neural network is constructed to perform preliminary detection of the face key points; the face region is then further divided into local regions, and a third-layer deep neural network is constructed for each local region; the rotation angle of each local region is estimated, the region is corrected according to the estimated rotation angle, and a fourth-layer deep neural network is constructed for the corrected data set of each local region. Given any new face image, the four-layer deep neural network model is used to detect the key points and obtain the final face key point detection result.
In another possible implementation manner of face key point detection, the convolutional neural network is trained by taking a plurality of images with labeling information as training data, so that the trained convolutional neural network can complete face key point detection processing of the images. The labeling information of the images in the training data is the key point positions of the faces. In the process of training the convolutional neural network by using training data, the convolutional neural network extracts characteristic data of the image from the image, and determines the key point positions of the faces in the image according to the characteristic data. And supervising the result obtained by the convolutional neural network in the training process by taking the labeling information as the supervision information, updating the parameters of the convolutional neural network, and finishing the training of the convolutional neural network. In this way, the trained convolutional neural network can be used for processing the image to be processed so as to obtain the key point positions of the face in the image to be processed.
In another possible implementation manner, the convolution processing is performed on the image to be processed layer by layer through at least two convolution layers, so that the feature extraction processing of the image to be processed is completed. The convolution layers in at least two layers of convolution layers are sequentially connected in series, namely, the output of the previous layer is the input of the next layer of convolution layer, and the content and semantic information extracted by each layer of convolution layer are different. Therefore, the size of the feature data extracted later is smaller, but the content and semantic information are more concentrated. The multi-layer convolution layer is used for carrying out convolution processing on the image to be processed step by step, so that the size of the image to be processed can be reduced while the content information and the semantic information in the image to be processed are obtained, the data processing capacity of the image processing device is reduced, and the operation speed of the image processing device is improved.
In another possible implementation manner of face key point detection, the convolution processing is implemented as follows: the convolution kernel is slid over the image to be processed, and the pixel on the image to be processed corresponding to the center pixel of the convolution kernel is designated as the target pixel. The pixel values covered on the image to be processed are multiplied by the corresponding values of the convolution kernel, and all the products are added to obtain the convolved pixel value, which is taken as the pixel value of the target pixel. The sliding is repeated until the pixel values of all pixels in the image to be processed are updated, completing the convolution processing of the image to be processed and obtaining the feature data. In one possible implementation manner, the key point information of the face in the image to be processed can be obtained by having the neural network that extracted the feature data identify the features in the feature data.
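As a concrete illustration of the sliding-kernel operation just described (the averaging kernel and the file name below are arbitrary examples, not prescribed by the present application):

```python
import cv2
import numpy as np

image = cv2.imread("to_be_processed.jpg", cv2.IMREAD_GRAYSCALE)  # illustrative file name
kernel = np.ones((3, 3), np.float32) / 9.0   # example 3x3 averaging kernel
# filter2D slides the kernel over every pixel, multiplies the covered pixel values
# by the kernel values, sums the products, and writes the result to the target pixel.
convolved = cv2.filter2D(image, -1, kernel)
```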
In another possible implementation manner of face keypoint detection, face keypoint detection is implemented by using a face keypoint algorithm, and the face keypoint detection algorithm may be one of OpenFace, a multi-task cascade convolutional neural network (multi-task cascaded convolutional networks, MTCNN), an adjustment convolutional neural network (tweaked convolutional neural networks, TCNN), or a task constraint depth convolutional neural network (tasks-constrained deep convolutional network, TCDCN), which is not limited in the application.
22. And under the condition that the ordinate of the upper frame line of the first face frame is kept unchanged, moving the lower frame line of the first face frame along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line of the lower frame line of the first face frame is overlapped with the first straight line, and a second face frame is obtained. Wherein the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point.
23. And obtaining the forehead area according to the area contained in the second face frame.
In this embodiment of the present application, the distance between the upper frame line and the lower frame line is the distance from the forehead to the chin of the face contained in the first face frame, and the distance between the left frame line and the right frame line is the distance between the left inner ear and the right inner ear of that face. The first straight line is the straight line passing through the left eyebrow key point and the right eyebrow key point. Because the forehead area of the face contained in the first face frame lies above the first straight line, moving the lower frame line can make the area contained in the moved first face frame the forehead area. With the ordinate of the upper frame line kept unchanged, the lower frame line is moved along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line on which the moved lower frame line lies coincides with the first straight line, and a second face frame is obtained. The second face frame contains the forehead area.
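A coordinate-level sketch of step 22 is given below, assuming the first face frame is given as (x_left, y_top, x_right, y_bottom) in the pixel coordinate system and the eyebrow key points as (x, y) pairs; approximating the first straight line by a horizontal line through the mean ordinate of the two eyebrow key points is an assumption for a roughly upright face.

```python
def second_face_frame(first_frame, left_eyebrow_kp, right_eyebrow_kp):
    """Move the lower frame line up to the line through the two eyebrow key points."""
    x_left, y_top, x_right, y_bottom = first_frame
    # First straight line: approximated here by the mean ordinate of the eyebrow key points.
    eyebrow_y = (left_eyebrow_kp[1] + right_eyebrow_kp[1]) / 2.0
    return x_left, y_top, x_right, eyebrow_y   # upper frame line unchanged
```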
As an alternative embodiment, the image processing apparatus performs the following steps in performing step 23:
24. and under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, moving the upper frame line of the second face frame along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, and obtaining a third face frame.
25. And obtaining the forehead area according to the area contained in the third face frame.
In this embodiment of the present application, the distance between the left frame line of the second face frame and the right frame line of the second face frame is the distance from the inner side of the left ear to the inner side of the right ear of the face contained in the second face frame. The distance between the upper frame line of the first face frame and the lower frame line of the first face frame is the distance from the forehead to the chin of the face contained in the first face frame. Generally, the width of the forehead area accounts for about 1/3 of the length of the whole face; although this proportion differs from person to person, it falls within the range of 30% to 40% of the face length. The preset distance is therefore set to 30% to 40% of the distance between the upper frame line of the first face frame and the lower frame line of the first face frame. In order to obtain the forehead area, the distance between the upper frame line of the second face frame and the lower frame line of the second face frame needs to be reduced to this preset distance. Under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, the upper frame line of the second face frame is moved along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is the preset distance, and a third face frame is obtained. The region included in the third face frame is the forehead region.
As an alternative embodiment, the image processing apparatus performs the following steps in performing step 25:
26. and under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, moving the right frame line of the third face frame along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, and obtaining a fourth face frame. The reference distance is a distance between two intersection points of a second straight line and a face contour included in the third face frame, the second straight line is a straight line between the first straight line and a third straight line and parallel to the first straight line or the third straight line, and the third straight line is a straight line passing through the left-mouth corner key point and the right-mouth corner key point.
27. And taking the area contained in the fourth face frame as the forehead area.
In this embodiment of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point. The third straight line is a straight line passing through the left mouth corner key point and the right mouth corner key point. The second straight line is between the first straight line and the third straight line, and the second straight line is parallel to the first straight line or the third straight line. The distance between the two intersection points of the second straight line and the face contour contained in the third face frame is taken as the reference distance. The second straight line is between the first straight line and the third straight line, that is, in the region between the eyebrows and the mouth. Because the face width in the region between the eyebrows and the mouth is close to the length of the forehead area, using the width of this region makes the determination of the length of the forehead area more accurate. The length of the forehead area is the width of the face contour, namely the reference distance. Under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, the right frame line of the third face frame is moved along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the left frame line of the third face frame and the right frame line of the third face frame is the reference distance, and a fourth face frame is obtained. The region included in the fourth face frame is the forehead region.
In still another possible implementation manner, under the condition that the abscissa of the right frame line of the third face frame is kept unchanged, the left frame line of the third face frame is moved along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the left frame line of the third face frame and the right frame line of the third face frame after movement is the reference distance, and the area included in the third face frame after movement is the forehead area.
In still another possible implementation manner, the right frame line of the third face frame is moved by half of the reference distance along the negative direction of the transverse axis of the pixel coordinate system of the image to be processed, and the left frame line of the third face frame is moved by half of the reference distance along the positive direction of the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the left frame line of the third face frame after movement and the right frame line of the third face frame after movement is the reference distance, and the area included in the third face frame after movement is the forehead area.
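As a rough illustration of steps 22 to 27 (including the alternatives above), the sketch below derives the forehead box from the first face frame. The (x_left, y_top, x_right, y_bottom) box layout, the function name, the 0.35 default ratio and the choice of holding the left frame line fixed are assumptions for illustration, not a definitive implementation.

def forehead_box(face_box, eyebrow_y, ref_distance, ratio=0.35):
    # face_box: (x_left, y_top, x_right, y_bottom) in the image pixel
    # coordinate system (the y axis grows downwards).
    # eyebrow_y: ordinate of the first straight line (through the eyebrow key points).
    # ref_distance: distance between the two intersections of the second
    #               straight line and the face contour.
    # ratio: preset distance as a fraction (30%-40%) of the face-box height.
    x_left, y_top, x_right, y_bottom = face_box
    # second face frame: move the lower frame line up onto the eyebrow line
    y_bottom2 = eyebrow_y
    # third face frame: move the upper frame line so the height equals the preset distance
    y_top3 = y_bottom2 - ratio * (y_bottom - y_top)
    # fourth face frame: shrink the width to the reference distance
    # (here the left frame line stays fixed and the right one moves)
    x_right4 = x_left + ref_distance
    return (x_left, y_top3, x_right4, y_bottom2)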
As an alternative embodiment, the image processing device further performs the following steps before determining the first number of first pixel points in the skin area of the image to be processed:
3. And determining a skin pixel point area from the pixel point areas contained in the first face frame.
In the embodiment of the present application, since a reference for the skin color exposed in the skin area is needed, the color value of a pixel point in the skin pixel point area is taken as that reference. Therefore, the skin pixel point area is determined from the pixel point areas included in the first face frame. For example, as shown in fig. 1, the skin pixel point area may be the cheek area under the eyes contained in the first face frame, the intersection area of the nose area and the mouth area contained in the first face frame, or the mouth area contained in the first face frame.
As an alternative embodiment, the image processing apparatus further performs the following steps, before determining the skin pixel area from the pixel areas included in the face frame:
31. and carrying out mask wearing detection processing on the image to be processed to obtain a detection result.
In this embodiment of the present application, mask wearing detection is performed on the image to be processed, and the obtained detection result includes: the person in the image to be processed wears a mask, or the person in the image to be processed does not wear a mask.
In one possible implementation manner, the image processing device performs first feature extraction processing on the image to be processed to obtain first feature data, wherein the first feature data carries information indicating whether the person to be detected wears a mask. The image processing device obtains the detection result according to the first feature data.
Alternatively, the first feature extraction process may be implemented through a mask detection network. The mask detection network can be obtained by training the deep convolutional neural network by taking at least one first training image with labeling information as training data, wherein the labeling information comprises whether people in the first training image wear the mask or not.
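The embodiment does not specify the structure of the mask detection network; the following PyTorch sketch is only one plausible way such a binary classifier could be built and trained. The architecture, input size, learning rate and the random stand-in data are all assumptions for illustration.

import torch
import torch.nn as nn

class MaskDetectionNet(nn.Module):
    # Illustrative binary classifier: outputs the probability that the person
    # in the input face crop wears a mask (1) or does not (0).
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(                    # first feature extraction
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, x):                                  # x: (N, 3, H, W)
        f = self.features(x).flatten(1)                    # first feature data
        return torch.sigmoid(self.classifier(f))

# training-step sketch: labels are 1 (mask worn) / 0 (no mask)
model, loss_fn = MaskDetectionNet(), nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.randn(4, 3, 112, 112)                       # stand-in for labelled first training images
labels = torch.tensor([[1.0], [0.0], [1.0], [0.0]])
loss = loss_fn(model(images), labels)
optimizer.zero_grad(); loss.backward(); optimizer.step()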
32. And when the detection result is that the mask is not worn in the face area, taking a pixel point area except for a forehead area, a mouth area, an eyebrow area and an eye area in the face area as the skin pixel point area. The key points of the human face also comprise a left lower eyelid key point and a right lower eyelid key point.
33. When the detection result is that the mask is worn in the face area, taking a pixel point area between the first straight line and the fourth straight line in the face area as the skin pixel point area; the fourth straight line is a straight line passing through the left lower eyelid key point and the right lower eyelid key point; the left lower eyelid key point and the right lower eyelid key point both belong to the at least one face key point.
In this embodiment of the present application, when the detection result is that the mask is not worn in the face area, the skin pixel point area is the area of the face area other than the forehead area, the mouth area, the eyebrow area, and the eye area. In the face area, the pixels of the eye area and the eyebrow area have color values displayed as black, and the pixels of the mouth area have color values displayed as red. Therefore, the skin pixel point area does not include the eye area, the mouth area, or the eyebrow area. Moreover, because it is uncertain whether the forehead area is covered by a hat or shielded by bangs, the forehead area cannot be used as the skin pixel point area either. Therefore, when the mask wearing detection processing performed on the image to be processed determines that the mask is not worn in the face area, the skin pixel point area includes the pixel point area of the face area other than the forehead area, the mouth area, the eyebrow area, and the eye area.
When the detection result is that the mask is worn in the face area, most of the area below the nose of the face area is blocked. Therefore, the unshielded portion of the skin may be the eyebrow area, the eyelid area and the nasal bridge area. The face key point detection can obtain the coordinates of the left lower eyelid key point, the right lower eyelid key point, the left eyebrow key point and the right eyebrow key point. The fourth straight line is a straight line passing through the left lower eyelid key point and the right lower eyelid key point, and the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point. The eyebrow area, the eyelid area and the nasal bridge area are all located between the straight line determined by the left eyebrow and the right eyebrow and the straight line determined by the left lower eyelid and the right lower eyelid in the face area. Therefore, when the detection result is that the mask is worn in the face area, the pixel point area between the first straight line and the fourth straight line in the face area is taken as the skin pixel point area.
4. Acquiring a color value of a second pixel point in the skin pixel point area;
in this embodiment of the present application, the color value of the second pixel is obtained from the skin pixel area, where the color value of the second pixel is used as a reference for measuring the skin color exposed from the skin area. Thus, the second pixel point may be any point in the skin pixel point area.
The implementation manner of obtaining the second pixel point in the skin pixel point area may be: taking the pixel point at the average coordinate of a certain skin area as the second pixel point; or taking the pixel point at the intersection of straight lines determined by certain key points as the second pixel point; or performing graying processing on the image of a part of the skin area and taking the pixel point with the largest gray value as the second pixel point. The method for acquiring the second pixel point is not limited in the embodiment of the present application.
In a possible implementation manner, in the case that the right eyebrow inner area and the left eyebrow inner area have two key points respectively, the key points are set to be a right eyebrow inner upper point, a right eyebrow inner lower point, a left eyebrow inner upper point and a left eyebrow inner lower point. And connecting the key points of the upper point on the inner side of the right eyebrow and the key point of the lower point on the inner side of the left eyebrow, and connecting the key points of the upper point on the inner side of the left eyebrow and the key point of the lower point on the inner side of the right eyebrow to obtain two intersecting straight lines. A unique intersection point can be obtained by these two intersecting straight lines. As shown, the four keypoints are assumed to be numbered 37, 38, 67, 68, respectively. That is, 37 and 68 are connected, 38 and 67 are connected, and an intersection point can be obtained after determining the two straight lines. Through the positions of the face frames, the coordinates of the four key points 37, 38, 67 and 68 can be determined, and then the coordinates of the intersection point can be solved by using Opencv. By determining the coordinates of the intersection points, the positions of the pixel points corresponding to the intersection points can be obtained. And converting the RGB channel of the pixel point into an HSV channel, and obtaining the color value of the pixel point corresponding to the intersection point coordinate. The color value of the pixel point corresponding to the intersection point coordinate is the color value of the second pixel point.
In another possible implementation manner, in the case that the right eyebrow inner area and the left eyebrow inner area have two key points, the key points are set to be a right eyebrow inner upper point, a right eyebrow inner lower point, a left eyebrow inner upper point and a left eyebrow inner lower point. A rectangular area is found by the 4 key points as an eyebrow area. As shown, assuming that the four points correspond to the numbers 37, 38, 67, 68, respectively, a rectangular area is found by the four points as an eyebrow area. The coordinates of the acquisitions 37, 38, 67, 68 are defined as (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4), respectively. A rectangular region can be obtained by taking the maximum value of the Y coordinates in (X1, Y1), (X2, Y2) as Y5, taking the minimum value of the Y coordinates in (X3, Y3), (X4, Y4) as Y6, taking the maximum value of the X coordinates in (X1, Y1), (X3, Y3) as X5, and taking the minimum value of the X coordinates in (X2, Y2), (X4, Y4) as X6. That is, the 4 coordinates of the truncated eyebrow area are (X6, Y6), (X5, Y5), (X5, Y6), (X6, Y5). The coordinates of the four key points 37, 38, 67 and 68 can be determined by the positions of the face frames, and the positions of the four points (X6, Y6), (X5, Y5), (X5, Y6) and (X6, Y5) can be determined. Connecting (X6, Y6), (X5, Y5), (X5, Y6) and (X6, Y5) to obtain two straight lines, and obtaining a unique intersection point through the two straight lines. The coordinates of the intersection point can then be solved using Opencv. By determining the coordinates of the intersection points, the positions of the pixel points corresponding to the intersection points can be obtained. And converting the RGB channel of the pixel point into an HSV channel, and obtaining the color value of the pixel point corresponding to the intersection point coordinate. The color value of the pixel point corresponding to the intersection point coordinate is the color value of the second pixel point.
As an alternative embodiment, the image processing apparatus performs the following steps in performing step 4:
41. and determining a rectangular region according to the at least one first key point and the at least one second key point when the at least one face key point comprises at least one first key point belonging to the left eyebrow inner region and the at least one face key point comprises at least one second key point belonging to the right eyebrow inner region.
42. And carrying out graying treatment on the rectangular region to obtain a gray level diagram of the rectangular region.
43. And taking the color value of the intersection point of a first row and a first column as the color value of the second pixel point, wherein the first row is the row with the largest sum of the gray values in the gray scale map, and the first column is the column with the largest sum of the gray values in the gray scale map.
In this embodiment of the present application, a plurality of schemes for obtaining a rectangular area according to the at least one first key point and the at least one second key point are included. And carrying out graying treatment on the rectangular region to obtain a gray scale map of the rectangular region. The sum of gray values of each line of the gray map is calculated, and the line with the largest sum of gray values is recorded as the first line. Similarly, the sum of gradation values of each column of the gradation map is calculated, and the column in which the sum of gradation values is largest is the first column. And finding the intersection point coordinates according to the row with the maximum sum of the gray values and the column with the maximum sum of the gray values. I.e. the intersection coordinates determined by the first row and the first column. By determining the coordinates of the intersection points, the pixel points corresponding to the intersection points can be obtained. And converting the RGB channel of the pixel point into an HSV channel, so that the color value of the pixel point at the intersection point can be obtained. The color value of the pixel point of the intersection point coordinate is the color value of the second pixel point.
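A possible OpenCV sketch of steps 41 to 43 is given below; the (x0, y0, x1, y1) layout of the rectangle and the BGR input layout assumed by OpenCV are illustrative assumptions.

import cv2
import numpy as np

def second_pixel_color(bgr_image, rect):
    # rect = (x0, y0, x1, y1): the rectangular region between the eyebrow
    # inner key points. Returns the HSV color value of the second pixel point.
    x0, y0, x1, y1 = rect
    roi = bgr_image[y0:y1, x0:x1]
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)      # graying the rectangular region
    first_row = int(np.argmax(gray.sum(axis=1)))      # row with the largest sum of gray values
    first_col = int(np.argmax(gray.sum(axis=0)))      # column with the largest sum of gray values
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)        # RGB/BGR channel -> HSV channel
    return hsv[first_row, first_col]                  # color value of the second pixel point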
In a possible implementation manner of obtaining the rectangular area, when there is only one left eyebrow inner key point and one right eyebrow inner key point and the ordinates of the two key points are different, the difference between the ordinates of the two key points is taken as the width of the rectangular area, the difference between the abscissas is taken as the length of the rectangular area, and a rectangular area with the two key points as diagonal corners is determined.
In still another possible implementation manner of obtaining the rectangular area, in the case that there are two key points on the inner side of the left eyebrow and one key point on the inner side of the right eyebrow, the line connecting the two key points on the inner side of the left eyebrow is used as the first side of the rectangular area, and the line connecting the right eyebrow inner key point with whichever left eyebrow inner key point has a different ordinate from it is used as the second side of the rectangular area. Parallel lines to the determined first side and second side are then drawn to obtain the remaining two sides, thereby determining the rectangular area.
In still another possible implementation manner of obtaining the rectangular area, in the case that there are more than two key points of the left eyebrow inner area and the right eyebrow inner area, respectively, four key points may be selected to form a quadrilateral area. And then obtaining a rectangular area according to the coordinates of the four points.
In yet another possible implementation manner of obtaining the rectangular area, the at least one first key point includes a third key point and a fourth key point; the at least one second key point includes a fifth key point and a sixth key point; the ordinate of the third key point is smaller than that of the fourth key point; the ordinate of the fifth key point is smaller than that of the sixth key point. A first coordinate is determined by a first abscissa and a first ordinate; a second coordinate is determined by a second abscissa and the first ordinate; a third coordinate is determined by the first abscissa and a second ordinate; a fourth coordinate is determined by the second abscissa and the second ordinate. The first ordinate is the maximum of the ordinates of the third key point and the fifth key point; the second ordinate is the minimum of the ordinates of the fourth key point and the sixth key point; the first abscissa is the maximum of the abscissas of the third key point and the fourth key point; the second abscissa is the minimum of the abscissas of the fifth key point and the sixth key point. The area surrounded by the first coordinate, the second coordinate, the third coordinate and the fourth coordinate is taken as the rectangular area. For example, when the left eyebrow inner area and the right eyebrow inner area each have two key points, let the third, fifth, fourth and sixth key points be (X1, Y1), (X2, Y2), (X3, Y3) and (X4, Y4) respectively. The maximum of the Y coordinates of (X1, Y1) and (X2, Y2) is taken as Y5, the first ordinate; the minimum of the Y coordinates of (X3, Y3) and (X4, Y4) is taken as Y6, the second ordinate; the maximum of the X coordinates of (X1, Y1) and (X3, Y3) is taken as X5, the first abscissa; the minimum of the X coordinates of (X2, Y2) and (X4, Y4) is taken as X6, the second abscissa. Thus, the 4 coordinates of the rectangular area are obtained as the first coordinate (X5, Y5), the second coordinate (X6, Y5), the third coordinate (X5, Y6), and the fourth coordinate (X6, Y6).
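The coordinate arithmetic above can be written compactly as follows; the tuple-based point representation and the function name are assumptions for illustration.

def eyebrow_rectangle(third, fourth, fifth, sixth):
    # Each argument is an (x, y) key point in the pixel coordinate system:
    # third/fourth belong to the left eyebrow inner area, fifth/sixth to the
    # right eyebrow inner area, with the ordinate of third < fourth and fifth < sixth.
    (x1, y1), (x3, y3) = third, fourth
    (x2, y2), (x4, y4) = fifth, sixth
    y5 = max(y1, y2)   # first ordinate
    y6 = min(y3, y4)   # second ordinate
    x5 = max(x1, x3)   # first abscissa
    x6 = min(x2, x4)   # second abscissa
    # first, second, third and fourth coordinates of the rectangular area
    return (x5, y5), (x6, y5), (x5, y6), (x6, y6)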
As another alternative embodiment, the image processing apparatus performs the following steps in performing step 4:
44. and determining average coordinates of at least one first key point and at least one second key point in the medial left eyebrow area when the at least one face key point comprises at least one first key point in the medial left eyebrow area and the at least one face key point comprises at least one second key point in the medial right eyebrow area.
45. And taking the color value of the pixel point determined according to the average value coordinate as the color value of the second pixel point in the skin pixel point area.
In this embodiment of the present application, in the case where the at least one face key point includes at least one second key point belonging to the right eyebrow inner area and at least one first key point belonging to the left eyebrow inner area, the coordinates of the at least one first key point and the at least one second key point are averaged. For example, when the right eyebrow inner area and the left eyebrow inner area each have two key points, the key points are the right eyebrow inner upper point, the right eyebrow inner lower point, the left eyebrow inner upper point and the left eyebrow inner lower point. As shown in fig. 4, it is assumed that the four points are numbered 37, 38, 67 and 68 respectively. The acquired coordinates of 37, 38, 67 and 68 are (X1, Y1), (X2, Y2), (X3, Y3) and (X4, Y4); the abscissas and the ordinates of the four coordinates are averaged respectively to obtain the average coordinate (X0, Y0). The RGB channel of the pixel point is converted into an HSV channel, and the color value of the pixel point corresponding to the average coordinate (X0, Y0) is acquired. The color value of the pixel point corresponding to the average coordinate is the color value of the second pixel point.
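A short OpenCV sketch of steps 44 and 45, assuming the key points are given as (x, y) pairs and the image uses OpenCV's BGR layout:

import cv2
import numpy as np

def second_pixel_from_mean(bgr_image, eyebrow_inner_keypoints):
    # eyebrow_inner_keypoints: list of (x, y) key points from the left and
    # right eyebrow inner areas (e.g. the points numbered 37, 38, 67, 68).
    pts = np.asarray(eyebrow_inner_keypoints, dtype=np.float32)
    x0, y0 = pts.mean(axis=0)                         # average coordinate (X0, Y0)
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)  # RGB/BGR channel -> HSV channel
    return hsv[int(round(y0)), int(round(x0))]        # color value of the second pixel point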
As a further alternative embodiment, the image processing apparatus performs the following steps in performing step 4:
46. determining a fifth straight line according to coordinates of the right eyebrow inner key point and the nose bridge left key point; and determining a sixth straight line according to coordinates of the left eyebrow inner key point and the nose bridge right key point.
47. And taking the color value of the pixel point determined according to the intersection point coordinates of the fifth straight line and the sixth straight line as the color value of the second pixel point in the skin pixel point area.
In this embodiment of the present application, the at least one face key point further includes a right eyebrow inner key point, a nose bridge left key point, a nose bridge right key point, and a left eyebrow inner key point. The right eyebrow inner key point is connected with the nose bridge left key point, and the left eyebrow inner key point is connected with the nose bridge right key point, giving two intersecting straight lines, namely the fifth straight line and the sixth straight line. This application does not limit the right eyebrow inner key point and the left eyebrow inner key point: the right eyebrow inner key point is any key point taken in the right eyebrow inner area, and the left eyebrow inner key point is any key point taken in the left eyebrow inner area. As shown in fig. 4, assuming that the numbers corresponding to the four points are 67, 68, 78 and 79 respectively, 78 and 68 are connected, 79 and 67 are connected, and an intersection point can be obtained after determining the two straight lines. The coordinates of the four points 67, 68, 79 and 78 can be determined through the positions of the face frame, and then the coordinates of the intersection point can be solved by using Opencv. By determining the coordinates of the intersection point, the position of the pixel point corresponding to the intersection point can be obtained. The RGB channel of the pixel point is converted into an HSV channel, and the color value of the pixel point corresponding to the intersection point coordinate is obtained. The color value of the pixel point corresponding to the intersection point coordinate is the color value of the second pixel point.
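The intersection of the fifth and sixth straight lines can also be computed directly with homogeneous coordinates, as in the sketch below; the coordinates in the usage line are invented purely for illustration.

import numpy as np

def line_intersection(p1, p2, p3, p4):
    # Fifth line through p1 (right eyebrow inner key point) and p2 (nose bridge
    # left key point); sixth line through p3 (left eyebrow inner key point) and
    # p4 (nose bridge right key point). Assumes the two lines are not parallel.
    l1 = np.cross([*p1, 1.0], [*p2, 1.0])   # homogeneous line through p1, p2
    l2 = np.cross([*p3, 1.0], [*p4, 1.0])   # homogeneous line through p3, p4
    x, y, w = np.cross(l1, l2)              # homogeneous intersection point
    return (x / w, y / w)

# usage with illustrative coordinates for key points 68, 78, 67, 79
print(line_intersection((120, 90), (110, 130), (90, 92), (118, 128)))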
5. And taking the difference between the color value of the second pixel point and the first value as a second threshold value and taking the sum of the color value of the second pixel point and the second value as a third threshold value, wherein the first value and the second value do not exceed the maximum value of the color value.
In the embodiment of the application, the second threshold value and the third threshold value can be determined by determining the color value of the second pixel point. The representation form of the image can be changed from an RGB channel diagram to an HSV channel diagram through the function of the Opencv algorithm, so that the color value of the second pixel point is obtained.
The color values include chromaticity, luminance, and saturation. Wherein, the range of chromaticity is 0-180, and the range of brightness and saturation is 0-255. That is, the color value has a maximum chromaticity of 180 and the luminance and saturation have a maximum value of 255. It should be understood that the first value and the second value also include three values of chromaticity, luminance, and saturation, respectively. Therefore, the chromaticity of the first value and the chromaticity of the second value are not more than 180, the luminance of the first value and the luminance of the second value are not more than 255, and the saturation of the first value and the saturation of the second value are not more than 255. Generally, the chromaticity, luminance, saturation of the first value and the second value are identical. That is, the three values of chromaticity, luminance, and saturation of the color value of the second pixel point are intermediate values of the three values of chromaticity, luminance, and saturation corresponding to the second threshold value and the third threshold value.
In one implementation manner of acquiring the mapping relation between the color value of the second pixel point and the second threshold and the third threshold, a machine learning classification algorithm, such as logistic regression or the naive Bayes algorithm, is used to classify whether an input color belongs to the color value of the second pixel point. That is, a set of color values is input and classified according to whether they belong to the color value of the second pixel point, so as to determine which color values belong to it. The mapping relation between the color value of the second pixel point and the second threshold and the third threshold can thus be obtained through the machine learning algorithm.
Optionally, the first value and the second value each take 30, 60 and 70 for chromaticity, luminance and saturation respectively. That is, after the color value of the second pixel point is obtained, the corresponding second threshold is obtained by decreasing its chromaticity by 30, its luminance by 60 and its saturation by 70, and the corresponding third threshold is obtained by increasing its chromaticity by 30, its luminance by 60 and its saturation by 70.
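A minimal sketch of step 5 with the chromaticity/luminance/saturation offsets of 30/60/70 just mentioned; reordering them into OpenCV's (H, S, V) channel layout and clamping to the valid ranges are added assumptions.

import numpy as np

def occlusion_thresholds(second_pixel_hsv, first_value=(30, 70, 60), second_value=(30, 70, 60)):
    # second_pixel_hsv: HSV color value of the second pixel point.
    # Default offsets in (H, S, V) order: chromaticity 30, saturation 70, luminance 60.
    # Second threshold = color value minus the first value; third threshold =
    # color value plus the second value; clamped to H in [0, 180], S and V in [0, 255].
    hsv = np.asarray(second_pixel_hsv, dtype=np.int32)
    max_values = np.array([180, 255, 255])
    lower = np.clip(hsv - np.asarray(first_value), 0, max_values)   # second threshold
    upper = np.clip(hsv + np.asarray(second_value), 0, max_values)  # third threshold
    return lower, upper

# usage: brightest eyebrow-area pixel with an HSV value of (110, 80, 120)
print(occlusion_thresholds((110, 80, 120)))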
As an alternative embodiment, the image processing apparatus performs the following steps in performing step 203:
6. and determining that the skin occlusion detection result is that the skin area is in an occlusion state under the condition that the ratio of the first number to the number of pixel points in the skin area does not exceed a first threshold.
In this embodiment of the present application, the image processing apparatus determines whether the skin area is in the shielding state according to a result of whether the ratio of the first number to the pixel points in the skin area exceeds the first threshold.
And determining that the skin detection result is that the skin area is in a shielding state under the condition that the ratio of the first number to the pixel points in the skin area is smaller than a first threshold value. For example, the first number is 50, the number of pixels in the skin area is 100, and the first threshold is 60%. Because the ratio of the first number to the number of pixels in the skin area is 50/100=50%, less than 60%. The skin detection result is considered to be that the skin area is in a shielded state.
When the skin detection result shows that the skin area is in the occlusion state, the image processing device outputs prompt information indicating that the skin needs to be exposed. The skin detection may simply be performed again, or the skin may first be exposed according to the prompt and the detection then performed again; this is not limited in the present application.
7. And when the ratio of the first number to the number of the pixel points in the skin area exceeds the first threshold, determining that the skin occlusion detection result is that the skin area is in a non-occlusion state.
In this embodiment of the present application, the image processing apparatus determines that the skin detection result is that the skin area is in a non-occluded state when the ratio of the first number to the number of pixel points in the skin area is equal to or greater than the first threshold. For example, the first number is 60, the number of pixels in the skin area is 100, and the first threshold is 60%. Because the ratio of the first number to the number of pixels in the skin area is 60/100 = 60%, which is equal to 60%, the skin detection result is considered to be that the skin area is in a non-occluded state. Alternatively, the first number is 70, the number of pixels in the skin area is 100, and the first threshold is 60%. Since the ratio of the first number to the number of pixels in the skin area is 70/100 = 70%, which is greater than 60%, the skin detection result is considered to be that the skin area is in a non-occluded state.
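Steps 6 and 7 reduce to a single comparison, as in this sketch; the 0.6 default and the use of ">=" (which matches the worked examples above) are illustrative assumptions.

def skin_occlusion_result(first_number, skin_area_pixels, first_threshold=0.6):
    # Compare the ratio of first pixel points to all pixel points in the
    # skin area against the first threshold.
    ratio = first_number / skin_area_pixels
    return "non-occluded" if ratio >= first_threshold else "occluded"

# the examples above: 50/100 -> occluded, 60/100 and 70/100 -> non-occluded
print(skin_occlusion_result(50, 100), skin_occlusion_result(60, 100), skin_occlusion_result(70, 100))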
In the case where it is determined that the skin detection result is that the skin area is in the non-shielded state, an operation of thermometry or other operations may be performed. If the temperature measurement is performed under the condition that the skin detection result is that the skin area is in a non-shielding state, the accuracy of detecting the temperature can be improved. The present application is not limited herein as to the subsequent operation performed when the skin detection result is that the skin area is not occluded.
As an alternative embodiment, the image processing apparatus further performs the steps of:
8. and acquiring a temperature thermodynamic diagram of the image to be processed.
In the embodiment of the application, the image processing method can be used in the field of temperature measurement, and the skin area belongs to the person to be detected. Each pixel point in the temperature thermodynamic diagram carries temperature information of the corresponding pixel point. Optionally, the temperature thermodynamic diagram is acquired by an infrared thermal imaging device on the image processing apparatus. The image processing device performs image matching processing on the temperature thermodynamic diagram and the image to be processed, and determines, from the temperature thermodynamic diagram, the pixel point area corresponding to the face area of the image to be processed.
9. And when the skin occlusion detection result shows that the skin area is in a non-occlusion state, reading the temperature of the skin area from the temperature thermodynamic diagram as the body temperature of the person to be detected.
In this embodiment of the present application, when the skin detection result indicates that the skin area is not blocked, the pixel area corresponding to the face area of the image to be processed is found from the thermogram, and generally the skin area is located in the upper 30% -40% of the whole face area, so that the temperature of the skin area in the thermogram is obtained. The average temperature of the skin area may be the body temperature of the person to be detected, or the highest temperature of the skin area may be the body temperature of the person to be detected, which is not limited in this application.
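A possible sketch of steps 8 and 9, assuming the temperature thermodynamic diagram has already been aligned with the image to be processed and the skin region is given as a rectangle; whether the mean or the maximum temperature is reported is left open by the text, so both are offered.

import numpy as np

def read_body_temperature(thermal_map, skin_region, use_max=False):
    # thermal_map: temperature thermodynamic diagram aligned with the image
    #              to be processed (each element is a temperature value).
    # skin_region: (x0, y0, x1, y1) rectangle of the unoccluded skin area.
    x0, y0, x1, y1 = skin_region
    roi = np.asarray(thermal_map)[y0:y1, x0:x1]
    return float(roi.max() if use_max else roi.mean())  # body temperature of the person to be detected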
Referring to fig. 3, fig. 3 is a flowchart illustrating an application image processing method according to an embodiment of the present application.
Based on the image processing method provided by the embodiment of the application, the embodiment of the application also provides a possible application scene of the image processing method.
When a thermal imaging device is used to measure the temperature of a pedestrian in a non-contact manner, the temperature of the forehead area of the pedestrian is generally measured. However, when a pedestrian's forehead is shielded by bangs or a hat, it cannot be determined whether the forehead area is in an occluded state, which interferes with temperature measurement and poses a challenge to current temperature measurement work. Therefore, forehead detection is performed on the pedestrian before temperature measurement, and the temperature is measured only when the forehead area is in a non-occluded state, which improves the accuracy of the temperature measurement.
As shown in fig. 3, the image processing apparatus acquires camera frame data, that is, a frame of image to be processed. Face detection is performed on the image to be processed; if the face detection result is that no face exists, the image processing device re-acquires an image to be processed. If the face detection result is that a face exists, the image processing device inputs the image to be processed into the trained neural network, which can output the face frame of the image to be processed (shown as D in fig. 1), the coordinates of the face frame (shown in fig. 1) and the coordinates of 106 key points (shown in fig. 4). It should be understood that the coordinates of the face frame may be a pair of diagonal coordinates, such as an upper left corner coordinate and a lower right corner coordinate or a lower left corner coordinate and an upper right corner coordinate; the four corner coordinates of the face frame (as shown in fig. 1) are provided in the embodiment of the present application for convenience of understanding. In the embodiment of the application, the neural network for outputting the face frame coordinates and the 106 key point coordinates of the image to be processed can be a single neural network, or can be two neural networks connected in series that respectively implement face detection and face key point detection.
In order to detect the exposed skin of the forehead area, the color value of the brightest pixel point of the eyebrow area is used as the reference for the skin exposed in the forehead area. The brightest pixel point is the second pixel point described above. It is therefore necessary to first acquire the eyebrow area. Key points of the left eyebrow inner area and the right eyebrow inner area are acquired through face key point detection. When the right eyebrow inner area and the left eyebrow inner area each have two key points, the key points are the right eyebrow inner upper point, the right eyebrow inner lower point, the left eyebrow inner upper point and the left eyebrow inner lower point. A rectangular area is obtained through these four key points as the eyebrow area. In this embodiment, the 106 key point coordinates are taken as an example, and the four key points, namely the right eyebrow inner upper point, the right eyebrow inner lower point, the left eyebrow inner upper point and the left eyebrow inner lower point, correspond to the numbers 37, 38, 67 and 68. It should be understood that the numbering and the number of key points are not limited, and any two key points taken respectively from the right eyebrow inner area and the left eyebrow inner area fall within the scope of the present application.
The coordinates of the acquired key points 37, 38, 67 and 68 are defined as (X1, Y1), (X2, Y2), (X3, Y3) and (X4, Y4) respectively. The maximum of the Y coordinates of (X1, Y1) and (X2, Y2) is taken as Y5, the minimum of the Y coordinates of (X3, Y3) and (X4, Y4) is taken as Y6, the maximum of the X coordinates of (X1, Y1) and (X3, Y3) is taken as X5, and the minimum of the X coordinates of (X2, Y2) and (X4, Y4) is taken as X6. Combining the obtained X5 and X6 coordinates with the Y5 and Y6 coordinates gives four coordinates, from which a rectangular area can be determined. The four coordinates of the rectangular area are (X6, Y6), (X5, Y5), (X5, Y6) and (X6, Y5), and this rectangular area is the eyebrow area to be cut out. The coordinates of the four points 37, 38, 67 and 68 can be determined by the face key point detection, and then the positions of the four points (X6, Y6), (X5, Y5), (X5, Y6) and (X6, Y5) can be determined. The rectangular area determined by these four points is cut out to obtain the eyebrow area.
After the eyebrow area is acquired, the brightest pixel point needs to be found. Therefore, graying processing is performed on the eyebrow area to obtain a gray scale image of the eyebrow area. In this embodiment, the graying process makes each pixel in the pixel matrix satisfy the relationship R = G = B; that is, the value of the red variable, the value of the green variable and the value of the blue variable are equal, and this common value is called the gray value. Graying is commonly performed using one of the following two methods:
Method one: R after graying = G after graying = B after graying = (R before processing + G before processing + B before processing) / 3
Example: for pixel point m of picture A, R is 100, G is 120 and B is 110. That is, before the graying process, R of pixel point m is 100, G is 120 and B is 110. After picture A is subjected to graying processing, R = G = B = (100 + 120 + 110) / 3 = 110 for pixel point m.
Method two: R after graying = G after graying = B after graying = R before processing × 0.3 + G before processing × 0.59 + B before processing × 0.11
Example: for pixel point m of picture A, R is 100, G is 120 and B is 110. That is, before the graying process, R of pixel point m is 100, G is 120 and B is 110. After picture A is subjected to graying processing, R = G = B = 100 × 0.3 + 120 × 0.59 + 110 × 0.11 = 112.9 for pixel point m.
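Both graying formulas are reproduced in the short sketch below; the function names are illustrative.

def gray_mean(r, g, b):
    # method one: R = G = B = (R + G + B) / 3 after graying
    return (r + g + b) / 3

def gray_weighted(r, g, b):
    # method two: R = G = B = R*0.3 + G*0.59 + B*0.11 after graying
    return r * 0.3 + g * 0.59 + b * 0.11

# the pixel point m from picture A above: R=100, G=120, B=110
print(gray_mean(100, 120, 110))      # 110.0
print(gray_weighted(100, 120, 110))  # 112.9 (up to floating-point rounding)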
The method for graying the eyebrow area is not limited in this embodiment. In order to find the color value of the brightest pixel point, that is, the color value of the pixel point having the largest gray value after graying of the eyebrow area, the gray values of each row of the gray map of the eyebrow area are added, and the coordinate of the row with the largest sum of gray values is recorded. Similarly, the gray values of each column of the gray map of the eyebrow area are added, and the coordinate of the column with the largest sum of gray values is recorded. The coordinate of the brightest point in the eyebrow area is obtained as the intersection point determined by the row with the largest sum of gray values and the column with the largest sum of gray values. Through the conversion relation between RGB and HSV, the RGB value at the coordinate of the brightest pixel point of the eyebrow area can be converted into the corresponding HSV value by formula; in practice, the RGB channel of the eyebrow area can be converted into the HSV channel through the cvtColor function of Opencv, and the HSV value at the coordinate of the brightest pixel point can then be read. Because there is a determined relationship between the HSV value and the second threshold and the third threshold, the corresponding second threshold and third threshold can be determined from the HSV value of the brightest pixel point of the eyebrow area.
Acquiring the forehead area requires determining the size and position of the forehead area. The length of the forehead area is the width of the face. By calculating the distance between key point 0 and key point 32, the face frame is reduced so that the distance between the left frame line and the right frame line of the face frame equals the distance between key point 0 and key point 32; that is, the distance between key point 0 and key point 32 is taken as the length of the forehead area. The width of the forehead area is about 1/3 of the whole face frame; although this proportion differs from person to person, it falls within 30% to 40% of the face length. Therefore, the distance between the upper frame line and the lower frame line of the face frame is reduced to 30% to 40% of the distance between the upper frame line and the lower frame line of the original face frame. The forehead area is the area above the eyebrows, and the horizontal line determined by key points 35 and 40 gives the position of the eyebrows. The resized face frame is therefore moved so that its lower frame line lies on the horizontal line determined by the two key points 35 and 40, obtaining the face frame with changed size and position. The rectangular area contained in this face frame is the forehead area.
The forehead area is cut out and then binarized according to the second threshold and the third threshold to obtain a binarized image of the forehead area. Using the binarized image reduces the amount of data to be processed and speeds up forehead detection by the image processing device. The binarization criterion is: if the HSV value of a pixel point in the forehead area is greater than or equal to the second threshold and less than or equal to the third threshold, the gray value of that pixel point is 255; if the HSV value of a pixel point in the forehead area is smaller than the second threshold or greater than the third threshold, the gray value of that pixel point is 0. First, the forehead area is converted from an RGB channel map to an HSV channel map. Then, the number of pixels with a gray value of 255 in the forehead area, that is, the number of white pixels in the gray map, is counted. When the ratio of the number of white pixels to the number of pixels of the forehead area reaches the threshold, the forehead area is considered to be in a non-occluded state, and the thermal imaging temperature measurement operation is carried out. When the ratio of the number of white pixels to the number of pixels of the forehead area does not reach the threshold, the forehead area is considered to be in an occluded state; measuring the temperature at this moment would affect the accuracy of the measurement, so a prompt for exposing the forehead is output and the image processing device needs to acquire an image again and detect the forehead again. Example: assume that the second threshold is (100, 50, 70), the third threshold is (120, 90, 100), the color value of pixel point q of the forehead area is (110, 60, 70), and the color value of pixel point p of the forehead area is (130, 90, 20). Then q is within the range defined by the second threshold and the third threshold and p is not. When the forehead area is binarized, the gray value of pixel point q is 255 and the gray value of pixel point p is 0. Assuming the threshold is 60%, the number of pixels of the forehead area is 100 and the number of white pixels is 50, the ratio of the number of white pixels to the number of pixels of the forehead area is 50%, which does not reach the threshold, so the forehead area is in an occluded state and a prompt for exposing the forehead is output.
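The binarization and ratio test can be expressed with OpenCV's inRange, as in the sketch below; the BGR input layout, the function name and the 0.6 default threshold are assumptions taken from the 60% example above.

import cv2
import numpy as np

def forehead_unoccluded(bgr_forehead, lower_hsv, upper_hsv, threshold=0.6):
    # lower_hsv / upper_hsv: second and third thresholds as (H, S, V) triples.
    hsv = cv2.cvtColor(bgr_forehead, cv2.COLOR_BGR2HSV)      # RGB/BGR map -> HSV map
    binary = cv2.inRange(hsv, np.array(lower_hsv, dtype=np.uint8),
                         np.array(upper_hsv, dtype=np.uint8))  # 255 inside the range, 0 outside
    white_ratio = np.count_nonzero(binary) / binary.size       # white pixels / forehead pixels
    return white_ratio >= threshold   # True: non-occluded, proceed to thermal measurement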
It will be appreciated by those skilled in the art that in the above-described method of the specific embodiments, the written order of steps is not meant to imply a strict order of execution but rather should be construed according to the function and possibly inherent logic of the steps.
The foregoing details the method of embodiments of the present application, and the apparatus of embodiments of the present application is provided below.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an image processing apparatus provided in an embodiment of the present application, where the apparatus 1 includes an obtaining unit 11, a first processing unit 12, and a detecting unit 13, and optionally, the image processing apparatus 1 further includes a second processing unit 14, a determining unit 15, a third processing unit 16, and a fourth processing unit 17, where:
an obtaining unit 11, configured to obtain an image to be processed, a first threshold, a second threshold, and a third threshold, where the first threshold and the second threshold are different, the first threshold and the third threshold are different, and the second threshold is less than or equal to the third threshold;
a first processing unit 12 for determining a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points with the color value being more than or equal to the second threshold value and less than or equal to the third threshold value;
And the detection unit 13 is configured to obtain a skin occlusion detection result of the image to be processed according to the ratio of the first number to the number of the pixels in the skin area and the first threshold.
In combination with any one of the embodiments of the present application, the skin area includes a face area, and the skin occlusion detection result includes a face occlusion detection result;
the image processing apparatus further includes: a second processing unit 14, configured to perform face detection processing on the image to be processed to obtain a first face frame before the first number of first pixel points in the skin area of the image to be processed is determined;
and determining the face area from the image to be processed according to the first face frame.
In combination with any one of the embodiments of the present application, the face area includes a forehead area, the face shielding detection result includes a forehead shielding detection result, and the first face frame includes: an upper frame line and a lower frame line; the upper frame line and the lower frame line are edges of the first face frame parallel to the transverse axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than that of the lower frame line;
the second processing unit 14 is configured to:
Detecting the key points of the face of the image to be processed to obtain at least one key point of the face; the at least one face key point comprises a left eyebrow key point and a right eyebrow key point;
under the condition that the ordinate of the upper frame line is kept unchanged, the lower frame line is moved along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line where the lower frame line is located coincides with the first straight line, and a second face frame is obtained; the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point;
and obtaining the forehead area according to the area contained in the second face frame.
In combination with any of the embodiments of the present application, the second processing unit 14 is configured to:
under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, moving the upper frame line of the second face frame along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, and obtaining a third face frame;
and obtaining the forehead area according to the area contained in the third face frame.
In combination with any one of the embodiments of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point; the first face frame further includes: a left frame line and a right frame line; the left frame line and the right frame line are edges of the first face frame parallel to the longitudinal axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line;
The second processing unit 14 is configured to:
under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, moving the right frame line of the third face frame along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, and obtaining a fourth face frame; the reference distance is the distance between two intersection points of the second straight line and the face outline contained in the third face frame; the second straight line is a straight line which is between the first straight line and the third straight line and is parallel to the first straight line or the third straight line; the third straight line is a straight line passing through the left-mouth corner key point and the right-mouth corner key point;
and taking the area contained in the fourth face frame as the forehead area.
In combination with any one of the embodiments of the present application, the image processing apparatus further includes: a determining unit 15, configured to determine a skin pixel point area from pixel point areas included in the first face frame before the determining of the first number of first pixel points in the skin area of the image to be processed;
the acquiring unit 11 is further configured to acquire a color value of a second pixel in the skin pixel area;
The first processing unit 12 is further configured to take a difference between the color value of the second pixel point and the first value as the second threshold value, and take a sum of the color value of the second pixel point and the second value as the third threshold value; and the first value and the second value do not exceed the maximum value of the color value of the image to be processed.
In combination with any one of the embodiments of the present application, the image processing apparatus further includes: a third processing unit 16, configured to perform mask wearing detection processing on the image to be processed before determining a skin pixel point area from the pixel point areas included in the first face frame, so as to obtain a detection result;
the determining unit 15 is configured to:
when the detection result is that the mask is not worn in the face area, taking a pixel point area except for the forehead area, the mouth area, the eyebrow area and the eye area in the face area as the skin pixel point area;
when the detection result is that the mask is worn in the face area, taking a pixel point area between the first straight line and the fourth straight line as the skin pixel point area; the fourth straight line is a straight line passing through the left lower eyelid key point and the right lower eyelid key point; the left eye lower eyelid keypoint and the right eye lower eyelid keypoint both belong to the at least one face keypoint.
In combination with any one of the embodiments of the present application, the obtaining unit 11 is configured to:
determining a rectangular region according to at least one first key point and at least one second key point in the medial left eyebrow region in the case that the at least one face key point comprises at least one first key point in the medial left eyebrow region and the at least one face key point comprises at least one second key point in the medial right eyebrow region;
carrying out graying treatment on the rectangular region to obtain a gray map of the rectangular region;
taking the color value of the intersection point of the first row and the first column as the color value of the second pixel point; the first row is the row with the largest sum of gray values in the gray scale map, and the first column is the column with the largest sum of gray values in the gray scale map.
In combination with any one of the embodiments of the present application, the detection unit 13 is configured to:
determining that the skin occlusion detection result is that the skin area is in an occlusion state under the condition that the ratio of the first number to the number of pixel points in the skin area does not exceed the first threshold value;
and under the condition that the ratio of the first number to the number of the pixel points in the skin area exceeds the first threshold value, determining that the skin occlusion detection result is that the skin area is in a non-occlusion state.
In combination with any of the embodiments of the present application, the skin area belongs to a person to be detected, and the acquiring unit 11 is further configured to:
acquiring a temperature thermodynamic diagram of the image to be processed;
the image processing apparatus further includes: a fourth processing unit 17, configured to, when the skin occlusion detection result indicates that the skin region is in a non-occluded state, read a temperature of the skin region from the temperature thermodynamic diagram as a body temperature of the person to be detected.
In some embodiments, functions or modules included in the apparatus provided in the embodiments of the present application may be used to perform the methods described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
Fig. 6 is a schematic diagram of a hardware structure of an image processing apparatus according to an embodiment of the present application. The image processing apparatus 2 comprises a processor 21, a memory 22, an input device 23 and an output device 24. The processor 21, the memory 22, the input device 23 and the output device 24 are coupled through connectors, which include various interfaces, transmission lines, buses and the like; this is not limited in the embodiments of the present application. It should be understood that, in the embodiments of the present application, "coupled" means interconnected in a particular manner, either directly or indirectly through other devices, for example through various interfaces, transmission lines, buses and the like.
The processor 21 may be one or more graphics processing units (GPUs). Where the processor 21 is a GPU, the GPU may be a single-core GPU or a multi-core GPU. Alternatively, the processor 21 may be a processor group formed by a plurality of GPUs, the plurality of GPUs being coupled to each other through one or more buses. Alternatively, the processor may be another type of processor, which is not limited in the embodiments of the present application.
The memory 22 may be used to store computer program instructions, as well as various types of computer program code for executing the technical solutions of the present application. Optionally, the memory includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a portable read-only memory (compact disc read-only memory, CD-ROM), and is used for storing related instructions and data.
The input means 23 are for inputting data and/or signals and the output means 24 are for outputting data and/or signals. The input device 23 and the output device 24 may be separate devices or may be an integral device.
It will be appreciated that, in the embodiments of the present application, the memory 22 may be used not only for storing related instructions but also for storing related data; for example, the memory 22 may be used for storing data obtained through the input device 23, or for storing data obtained by the processor 21. The embodiments of the present application do not limit the specific data stored in the memory.
It will be appreciated that fig. 6 shows only a simplified design of an image processing apparatus. In practical applications, the image processing apparatus may also include other necessary elements, including but not limited to any number of input/output devices, processors, memories, etc., and all image processing apparatuses capable of implementing the embodiments of the present application are within the scope of protection of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein. It will be further apparent to those skilled in the art that the descriptions of the various embodiments herein are provided with emphasis, and that the same or similar parts may not be explicitly described in different embodiments for the sake of convenience and brevity of description, and thus, parts not described in one embodiment or in detail may be referred to in the description of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that all or part of the processes of the above-described method embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above-described method embodiments. The aforementioned storage medium includes: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Claims (11)

1. An image processing method, the method comprising:
acquiring an image to be processed and a first threshold value;
performing face detection processing on the image to be processed to obtain a first face frame;
determining a face area from the image to be processed according to the first face frame;
carrying out mask wearing detection processing on the image to be processed to obtain a detection result;
when the detection result is that no mask is worn in the face area, taking the pixel point area in the face area other than a forehead area, a mouth area, an eyebrow area and an eye area as a skin pixel point area;
when the detection result is that a mask is worn in the face area, taking a pixel point area between a first straight line and a fourth straight line as the skin pixel point area; the fourth straight line is a straight line passing through a left-eye lower eyelid key point and a right-eye lower eyelid key point; the left-eye lower eyelid key point and the right-eye lower eyelid key point both belong to at least one face key point;
acquiring a color value of a second pixel point in the skin pixel point area;
taking the difference between the color value of the second pixel point and a first value as a second threshold value, and taking the sum of the color value of the second pixel point and a second value as a third threshold value; neither the first value nor the second value exceeds the maximum color value of the image to be processed;
determining a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points whose color values are greater than or equal to the second threshold value and less than or equal to the third threshold value;
and obtaining a skin occlusion detection result of the image to be processed according to the ratio of the first number to the number of pixel points in the skin area and the first threshold value.
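Purely as an informal reading aid for the method of claim 1 (and in no way part of the claim), the overall flow can be condensed as below; every helper passed in is a hypothetical stand-in for the face detection, mask detection, skin-area selection and reference-pixel steps, and requiring every color channel to fall in range is an added assumption.

```python
import numpy as np

def skin_occlusion_detection(image, first_threshold, first_value, second_value,
                             detect_face, detect_mask, select_skin_pixels,
                             pick_reference_pixel):
    face_frame = detect_face(image)                     # first face frame
    mask_worn = detect_mask(image)                      # mask wearing detection result
    skin_mask = select_skin_pixels(image, face_frame, mask_worn)
    # Color value of the second pixel point, cast to int to avoid uint8 wrap-around.
    ref = np.asarray(pick_reference_pixel(image, skin_mask), dtype=int)
    lo, hi = ref - first_value, ref + second_value      # second and third thresholds
    skin_pixels = image[skin_mask]                      # N x 3 colors of the skin area
    in_range = np.all((skin_pixels >= lo) & (skin_pixels <= hi), axis=-1)
    ratio = int(in_range.sum()) / len(skin_pixels)      # first number / pixel count
    return "occluded" if ratio <= first_threshold else "not occluded"
```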
2. The method of claim 1, wherein the face area comprises a forehead area, the skin occlusion detection result comprises a forehead occlusion detection result, and the first face frame comprises an upper frame line and a lower frame line; the upper frame line and the lower frame line are the edges of the first face frame that are parallel to the transverse axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line;
the determining the face area from the image to be processed according to the first face frame includes:
performing face key point detection on the image to be processed to obtain at least one face key point; the at least one face key point comprises a left eyebrow key point and a right eyebrow key point;
under the condition that the ordinate of the upper frame line is kept unchanged, moving the lower frame line along the negative direction of the longitudinal axis of the pixel coordinate system of the image to be processed, so that the straight line where the lower frame line is located coincides with the first straight line, to obtain a second face frame; the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point;
and obtaining the forehead area according to the area contained in the second face frame.
3. The method according to claim 2, wherein the obtaining the forehead area according to the area included in the second face frame includes:
under the condition that the ordinate of the lower frame line of the second face frame is kept unchanged, moving the upper frame line of the second face frame along the longitudinal axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, and obtaining a third face frame;
and obtaining the forehead area according to the area contained in the third face frame.
4. The method of claim 3, wherein the at least one face key point further comprises a left mouth corner key point and a right mouth corner key point; the first face frame further comprises a left frame line and a right frame line; the left frame line and the right frame line are the edges of the first face frame that are parallel to the longitudinal axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line;
the obtaining the forehead area according to the area contained in the third face frame includes:
under the condition that the abscissa of the left frame line of the third face frame is kept unchanged, moving the right frame line of the third face frame along the transverse axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, and obtaining a fourth face frame; the reference distance is the distance between the two intersection points of the second straight line and the face outline contained in the third face frame; the second straight line is a straight line which is between the first straight line and the third straight line and is parallel to the first straight line or the third straight line; the third straight line is a straight line passing through the left mouth corner key point and the right mouth corner key point;
and taking the area contained in the fourth face frame as the forehead area.
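As a reading aid only (not part of the claims), the frame-line manipulations of claims 2 to 4 can be condensed into coordinate arithmetic; the tuple layout and the assumption that the ordinate grows downward are introduced here, and how the preset distance and the reference distance are obtained is left abstract.

```python
def forehead_box(first_face_frame, brow_line_y, preset_distance, reference_distance):
    # first_face_frame: (left, top, right, bottom) in pixel coordinates, y growing downward.
    left, top, right, bottom = first_face_frame
    # Claim 2: move the lower frame line up onto the first straight line (eyebrow line).
    bottom = brow_line_y
    # Claim 3: move the upper frame line so the box height equals the preset distance.
    top = bottom - preset_distance
    # Claim 4: move the right frame line so the box width equals the reference distance
    # (the face-contour width measured along the second straight line).
    right = left + reference_distance
    return (left, top, right, bottom)   # fourth face frame, taken as the forehead area
```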
5. The method of claim 1, wherein the obtaining the color value of the second pixel point in the skin pixel point area comprises:
in the case that the at least one face key point comprises at least one first key point located in the medial left eyebrow region and at least one second key point located in the medial right eyebrow region, determining a rectangular region according to the at least one first key point and the at least one second key point;
carrying out graying processing on the rectangular region to obtain a grayscale map of the rectangular region;
taking the color value of the pixel at the intersection of the first row and the first column as the color value of the second pixel point; the first row is the row with the largest sum of gray values in the grayscale map, and the first column is the column with the largest sum of gray values in the grayscale map.
6. The method according to any one of claims 1 to 5, wherein the obtaining the skin occlusion detection result of the image to be processed according to the ratio of the first number to the number of pixel points in the skin area and the first threshold value includes:
determining that the skin occlusion detection result is that the skin area is in an occluded state under the condition that the ratio of the first number to the number of pixel points in the skin area does not exceed the first threshold value;
and determining that the skin occlusion detection result is that the skin area is in a non-occluded state under the condition that the ratio of the first number to the number of pixel points in the skin area exceeds the first threshold value.
7. The method of claim 6, wherein the skin area belongs to a person to be detected, the method further comprising:
acquiring a temperature thermodynamic diagram of the image to be processed;
and when the skin occlusion detection result shows that the skin area is in a non-occluded state, reading the temperature of the skin area from the temperature thermodynamic diagram as the body temperature of the person to be detected.
8. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit, configured to acquire an image to be processed and a first threshold value;
a second processing unit, configured to perform face detection processing on the image to be processed to obtain a first face frame;
the second processing unit is further configured to determine a face area from the image to be processed according to the first face frame;
a third processing unit, configured to perform mask wearing detection processing on the image to be processed to obtain a detection result;
a determining unit, configured to, when the detection result is that no mask is worn in the face area, take the pixel point area in the face area other than a forehead area, a mouth area, an eyebrow area and an eye area as a skin pixel point area;
the determining unit is further configured to, when the detection result is that a mask is worn in the face area, take a pixel point area between a first straight line and a fourth straight line as the skin pixel point area; the fourth straight line is a straight line passing through a left-eye lower eyelid key point and a right-eye lower eyelid key point; the left-eye lower eyelid key point and the right-eye lower eyelid key point both belong to at least one face key point;
the acquisition unit is further configured to acquire a color value of a second pixel point in the skin pixel point area;
a first processing unit, configured to take the difference between the color value of the second pixel point and a first value as a second threshold value, and take the sum of the color value of the second pixel point and a second value as a third threshold value; neither the first value nor the second value exceeds the maximum color value of the image to be processed;
the first processing unit is further configured to determine a first number of first pixel points in a skin area of the image to be processed; the first pixel points are pixel points whose color values are greater than or equal to the second threshold value and less than or equal to the third threshold value;
and a detection unit, configured to obtain a skin occlusion detection result of the image to be processed according to the ratio of the first number to the number of pixel points in the skin area and the first threshold value.
9. A processor for performing the method of any one of claims 1 to 7.
10. An electronic device, comprising: a processor and a memory for storing computer program code comprising computer instructions which, when executed by the processor, cause the electronic device to perform the method of any one of claims 1 to 7.
11. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1 to 7.
CN202110600103.1A 2021-05-31 2021-05-31 Image processing method and device, processor, electronic equipment and storage medium Active CN113222973B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202110600103.1A CN113222973B (en) 2021-05-31 2021-05-31 Image processing method and device, processor, electronic equipment and storage medium
PCT/CN2022/080403 WO2022252737A1 (en) 2021-05-31 2022-03-11 Image processing method and apparatus, processor, electronic device, and storage medium
TW111114745A TWI787113B (en) 2021-05-31 2022-04-19 Methods, apparatuses, processors, electronic equipment and storage media for image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110600103.1A CN113222973B (en) 2021-05-31 2021-05-31 Image processing method and device, processor, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113222973A CN113222973A (en) 2021-08-06
CN113222973B true CN113222973B (en) 2024-03-08

Family

ID=77082028

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600103.1A Active CN113222973B (en) 2021-05-31 2021-05-31 Image processing method and device, processor, electronic equipment and storage medium

Country Status (3)

Country Link
CN (1) CN113222973B (en)
TW (1) TWI787113B (en)
WO (1) WO2022252737A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113222973B (en) * 2021-05-31 2024-03-08 深圳市商汤科技有限公司 Image processing method and device, processor, electronic equipment and storage medium
CN113592884B (en) * 2021-08-19 2022-08-09 遨博(北京)智能科技有限公司 Human body mask generation method
CN117495855B (en) * 2023-12-29 2024-03-29 广州中科医疗美容仪器有限公司 Skin defect evaluation method and system based on image processing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426870A (en) * 2015-12-15 2016-03-23 北京文安科技发展有限公司 Face key point positioning method and device
CN105740758A (en) * 2015-12-31 2016-07-06 上海极链网络科技有限公司 Internet video face recognition method based on deep learning
CN107145833A (en) * 2017-04-11 2017-09-08 腾讯科技(上海)有限公司 The determination method and apparatus of human face region
CN108319953A (en) * 2017-07-27 2018-07-24 腾讯科技(深圳)有限公司 Occlusion detection method and device, electronic equipment and the storage medium of target object
CN110532871A (en) * 2019-07-24 2019-12-03 华为技术有限公司 The method and apparatus of image procossing

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI639137B (en) * 2017-04-27 2018-10-21 立特克科技股份有限公司 Skin detection device and the method therefor
CN107633252B (en) * 2017-09-19 2020-04-21 广州市百果园信息技术有限公司 Skin color detection method, device and storage medium
CN108427918B (en) * 2018-02-12 2021-11-30 杭州电子科技大学 Face privacy protection method based on image processing technology
US10915734B2 (en) * 2018-09-28 2021-02-09 Apple Inc. Network performance by including attributes
CN110443747B (en) * 2019-07-30 2023-04-18 Oppo广东移动通信有限公司 Image processing method, device, terminal and computer readable storage medium
CN111428581B (en) * 2020-03-05 2023-11-21 平安科技(深圳)有限公司 Face shielding detection method and system
CN111524080A (en) * 2020-04-22 2020-08-11 杭州夭灵夭智能科技有限公司 Face skin feature identification method, terminal and computer equipment
CN112633144A (en) * 2020-12-21 2021-04-09 平安科技(深圳)有限公司 Face occlusion detection method, system, device and storage medium
CN112861661B (en) * 2021-01-22 2022-11-08 深圳市慧鲤科技有限公司 Image processing method and device, electronic equipment and computer readable storage medium
CN112836625A (en) * 2021-01-29 2021-05-25 汉王科技股份有限公司 Face living body detection method and device and electronic equipment
CN113222973B (en) * 2021-05-31 2024-03-08 深圳市商汤科技有限公司 Image processing method and device, processor, electronic equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426870A (en) * 2015-12-15 2016-03-23 北京文安科技发展有限公司 Face key point positioning method and device
CN105740758A (en) * 2015-12-31 2016-07-06 上海极链网络科技有限公司 Internet video face recognition method based on deep learning
CN107145833A (en) * 2017-04-11 2017-09-08 腾讯科技(上海)有限公司 The determination method and apparatus of human face region
CN108319953A (en) * 2017-07-27 2018-07-24 腾讯科技(深圳)有限公司 Occlusion detection method and device, electronic equipment and the storage medium of target object
CN110532871A (en) * 2019-07-24 2019-12-03 华为技术有限公司 The method and apparatus of image procossing

Also Published As

Publication number Publication date
TW202248954A (en) 2022-12-16
WO2022252737A1 (en) 2022-12-08
TWI787113B (en) 2022-12-11
CN113222973A (en) 2021-08-06

Similar Documents

Publication Publication Date Title
CN113222973B (en) Image processing method and device, processor, electronic equipment and storage medium
CN110852160B (en) Image-based biometric identification system and computer-implemented method
US10445574B2 (en) Method and apparatus for iris recognition
CN101390128B (en) Detecting method and detecting system for positions of face parts
CN111539276B (en) Method for detecting safety helmet in real time in power scene
CN110443747A (en) Image processing method, device, terminal and computer readable storage medium
CN109886153B (en) Real-time face detection method based on deep convolutional neural network
CN111062891A (en) Image processing method, device, terminal and computer readable storage medium
CN112861661B (en) Image processing method and device, electronic equipment and computer readable storage medium
KR20180009303A (en) Method and apparatus for iris recognition
CN108734126B (en) Beautifying method, beautifying device and terminal equipment
CN111444555B (en) Temperature measurement information display method and device and terminal equipment
CN112052730B (en) 3D dynamic portrait identification monitoring equipment and method
CN112836625A (en) Face living body detection method and device and electronic equipment
CN111259757B (en) Living body identification method, device and equipment based on image
CN111080754B (en) Character animation production method and device for connecting characteristic points of head and limbs
CN116309590B (en) Visual computing method, system, electronic equipment and medium based on artificial intelligence
Fathy et al. Benchmarking of pre-processing methods employed in facial image analysis
CN112232205A (en) Mobile terminal CPU real-time multifunctional face detection method
CN115995097A (en) Deep learning-based safety helmet wearing standard judging method
CN106874835B (en) A kind of image processing method and device
CN114612994A (en) Method and device for training wrinkle detection model and method and device for detecting wrinkles
CN115393695A (en) Method and device for evaluating quality of face image, electronic equipment and storage medium
CN111832464A (en) Living body detection method and device based on near-infrared camera
CN108573230B (en) Face tracking method and face tracking device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40049960

Country of ref document: HK

GR01 Patent grant