WO2022252737A1 - Image processing method and apparatus, processor, electronic device, and storage medium - Google Patents
- Publication number: WO2022252737A1 (PCT/CN2022/080403)
- Authority: WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/187—Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- the present application relates to the technical field of image processing, and in particular to an image processing method and device, a processor, electronic equipment, and a storage medium.
- non-contact skin detection is being applied in more and more scenarios.
- the accuracy of this type of non-contact detection is largely affected by the occlusion state of the skin. For example, if a large portion of a skin area is covered, the detection result for that skin area may be inaccurate. Therefore, detecting whether skin is occluded is of great significance.
- the present application provides an image processing method and device, a processor, electronic equipment and a storage medium, to determine whether skin is in an occluded state.
- the present application provides an image processing method, the method comprising: acquiring an image to be processed, a first threshold, a second threshold, and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; determining a first number of first pixels in the region to be detected of the image to be processed, where a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and obtaining the skin occlusion detection result of the image to be processed according to the first threshold and the first ratio of the first number to the number of pixels in the region to be detected.
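The first-aspect steps above can be sketched in code (a minimal illustration, not the application's implementation; the function name and the per-channel HSV interpretation of the thresholds are assumptions):

```python
import numpy as np

def skin_occlusion_result(region_hsv, first_threshold, second_threshold, third_threshold):
    """Count 'first pixels' (pixels whose color value lies between the second
    and third thresholds, per HSV channel) and compare their proportion in the
    region to be detected against the first threshold."""
    lo = np.asarray(second_threshold)          # (H, S, V) lower bound
    hi = np.asarray(third_threshold)           # (H, S, V) upper bound
    mask = np.all((region_hsv >= lo) & (region_hsv <= hi), axis=-1)
    first_number = int(mask.sum())             # the "first number"
    total = mask.size                          # pixels in the region to be detected
    first_ratio = first_number / total         # the "first ratio"
    return "unoccluded" if first_ratio > first_threshold else "occluded"
```

A region whose skin-colored pixels make up more than the first threshold of its area is judged unoccluded; otherwise it is judged occluded.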
- the determining the first number of first pixels in the region to be detected of the image to be processed includes: performing face detection processing on the image to be processed to obtain a first face frame; determining the region to be detected from the image to be processed according to the first face frame; and determining the first number of first pixels in the region to be detected.
- the first face frame includes an upper frame line and a lower frame line; both are sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line;
- the determining the region to be detected from the image to be processed according to the first face frame includes: performing face key point detection on the image to be processed to obtain at least one face key point, the at least one face key point including a left eyebrow key point and a right eyebrow key point; keeping the ordinate of the upper frame line unchanged, moving the lower frame line along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed until the line on which the lower frame line lies coincides with a first straight line, to obtain a second face frame, where the first straight line is the straight line passing through the left eyebrow key point and the right eyebrow key point; and obtaining the region to be detected according to the area contained in the second face frame;
- the obtaining the region to be detected according to the area contained in the second face frame includes: keeping the ordinate of the lower frame line of the second face frame unchanged, moving the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed until the distance between the upper frame line and the lower frame line of the second face frame is a preset distance, to obtain a third face frame; and obtaining the region to be detected according to the area contained in the third face frame.
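The frame-line moves described above amount to simple box arithmetic. A sketch (assuming the pixel coordinate system defined later in this document, with y increasing downward; the function name and the use of the mean eyebrow ordinate to approximate the first straight line are assumptions):

```python
def forehead_box(face_box, left_brow, right_brow, preset_distance):
    """Derive the third face frame (the region to be detected) from the first
    face frame and the eyebrow key points.

    face_box: (x_left, y_top, x_right, y_bottom), y grows downward.
    left_brow, right_brow: (x, y) eyebrow key points on the first straight line.
    """
    x_left, y_top, x_right, y_bottom = face_box
    # Second face frame: move the lower frame line up (negative y direction)
    # onto the line through the two eyebrow key points.
    new_bottom = (left_brow[1] + right_brow[1]) / 2
    # Third face frame: move the upper frame line so that it sits exactly
    # preset_distance above the new lower frame line.
    new_top = new_bottom - preset_distance
    return (x_left, new_top, x_right, new_bottom)
```

For example, `forehead_box((10, 0, 110, 140), (40, 60), (80, 60), 30)` yields the box `(10, 30.0, 110, 60.0)`: a strip of the preset height directly above the eyebrow line.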
- the at least one human face key point also includes a left mouth corner key point and a right mouth corner key point
- the first human face frame also includes a left frame line and a right frame line
- the left frame line and the right frame line are sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line
- said obtaining the region to be detected according to the area contained in the third face frame includes: keeping the abscissa of the left frame line of the third face frame unchanged, moving the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed until the distance between the right frame line and the left frame line of the third face frame is a reference distance, to obtain a fourth face frame; the reference distance is the distance between the two intersection points of a second straight line and the face contour contained in the third face frame
- the acquiring the second threshold and the third threshold includes: determining a skin pixel area from the pixel area contained in the first face frame; acquiring the color value of a second pixel in the skin pixel area; using the difference between the color value of the second pixel and a first value as the second threshold, and using the sum of the color value of the second pixel and a second value as the third threshold; wherein neither the first value nor the second value exceeds the maximum value among the color values of the image to be processed.
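The threshold derivation in this step can be sketched as follows (the function name is hypothetical; applying the same first/second value to every color component, and clipping out-of-range results to the valid color range, are assumptions not stated in the application):

```python
import numpy as np

def thresholds_from_skin_sample(second_pixel, first_value, second_value, max_value=255):
    """second threshold = sample color - first value;
    third threshold = sample color + second value.
    Neither first_value nor second_value may exceed max_value."""
    assert 0 <= first_value <= max_value and 0 <= second_value <= max_value
    p = np.asarray(second_pixel, dtype=int)
    second_threshold = np.clip(p - first_value, 0, max_value)
    third_threshold = np.clip(p + second_value, 0, max_value)
    return second_threshold, third_threshold
```

This adapts the skin-color range to the person actually in the image instead of using one fixed range for everyone.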
- the determining the skin pixel area from the pixel area contained in the first face frame includes: when it is detected that the face area in the image to be processed is not wearing a mask, using the pixel area in the face area other than the forehead area, mouth area, eyebrow area and eye area as the skin pixel area; when it is detected that the face area in the image to be processed is wearing a mask, using the pixel area between the first straight line and a fourth straight line as the skin pixel area; the fourth straight line is the straight line passing through the key point of the lower eyelid of the left eye and the key point of the lower eyelid of the right eye, both of which belong to the at least one face key point.
- the acquiring the color value of the second pixel in the skin pixel area includes: when the at least one face key point contains at least one first key point belonging to the inner area of the left eyebrow and at least one second key point belonging to the inner area of the right eyebrow, determining a rectangular area according to the at least one first key point and the at least one second key point; performing grayscale processing on the rectangular area to obtain a grayscale image of the rectangular area; and using the color value of the pixel at the intersection of the first row and the first column as the color value of the second pixel, where the first row is the row with the largest sum of grayscale values in the grayscale image and the first column is the column with the largest sum of grayscale values in the grayscale image.
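The row/column selection described here reduces to two reductions over the grayscale image (a sketch; the function name is not from the application):

```python
import numpy as np

def second_pixel_color(rect_color, rect_gray):
    """Return the color value at the intersection of the row with the largest
    grayscale sum (the 'first row') and the column with the largest grayscale
    sum (the 'first column')."""
    first_row = int(np.argmax(rect_gray.sum(axis=1)))   # sum over each row
    first_col = int(np.argmax(rect_gray.sum(axis=0)))   # sum over each column
    return rect_color[first_row, first_col]
```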
- the obtaining the skin occlusion detection result of the image to be processed according to the first threshold and the first ratio of the first number to the number of pixels in the region to be detected includes: when the first ratio does not exceed the first threshold, determining that the skin occlusion detection result indicates that the skin area corresponding to the region to be detected is in an occluded state; when the first ratio exceeds the first threshold, determining that the skin occlusion detection result indicates that the skin area corresponding to the region to be detected is in an unoccluded state.
- the skin area belongs to a person to be detected, and the method further includes: acquiring a temperature heat map of the image to be processed; when the skin occlusion detection result indicates that the skin area is in an unoccluded state, reading the temperature of the skin area from the temperature heat map as the body temperature of the person to be detected.
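The temperature read-out step can be sketched as follows (the heat map is assumed to be a 2-D temperature array aligned with the image to be processed, and aggregating by maximum over the skin area is an assumption, since the application only says the temperature of the skin area is read from the map):

```python
import numpy as np

def body_temperature(heat_map, skin_box, occluded):
    """Read the temperature of the skin area only when it is unoccluded.

    heat_map: 2-D array of temperatures, same resolution as the image.
    skin_box: (x_left, y_top, x_right, y_bottom) of the skin area.
    """
    if occluded:
        return None  # an occluded forehead would give an unreliable reading
    x0, y0, x1, y1 = skin_box
    return float(heat_map[y0:y1, x0:x1].max())
```

Gating the read-out on the occlusion result is the point of the method: a temperature is only reported when the forehead skin is actually visible.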
- the present application also provides an image processing device, which includes: an acquisition unit, configured to acquire an image to be processed, a first threshold, a second threshold, and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; a first processing unit, configured to determine the first number of first pixels in the region to be detected of the image to be processed, where a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and a detection unit, configured to obtain the skin occlusion detection result of the image to be processed according to the first threshold and the first ratio of the first number to the number of pixels in the region to be detected.
- the area to be tested includes a human face area
- the skin occlusion detection result includes a human face occlusion detection result
- the image processing device further includes: a second processing unit, configured to, before the first number of first pixels in the region to be detected of the image to be processed is determined, perform face detection processing on the image to be processed to obtain a first face frame, and determine the face area from the image to be processed according to the first face frame.
- the face area includes a forehead area
- the face occlusion detection result includes a forehead occlusion detection result
- the first face frame includes an upper frame line and a lower frame line; both are sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line;
- the second processing unit is used to: perform face key point detection on the image to be processed to obtain at least one face key point;
- the at least one face key point includes a left eyebrow key point and a right eyebrow key point; keeping the ordinate of the upper frame line unchanged, the lower frame line is moved along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed until the line on which it lies coincides with the first straight line, to obtain a second face frame;
- the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point;
- the second processing unit is configured to: keep the ordinate of the lower frame line of the second face frame unchanged, and move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed until the distance between the upper frame line and the lower frame line of the second face frame is a preset distance, to obtain a third face frame; and obtain the forehead area according to the area contained in the third face frame.
- the at least one human face key point also includes a left mouth corner key point and a right mouth corner key point
- the first human face frame further includes: a left frame line and a right frame line
- the left frame line and the right frame line are sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line
- the second processing unit is configured to: keep the abscissa of the left frame line of the third face frame unchanged, and move the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed until the distance between the right frame line and the left frame line of the third face frame is the reference distance, to obtain a fourth face frame
- the reference distance is the distance between the second straight line and the two intersection points of the human face contour contained in the third human face frame
- the second straight line is
- the image processing device further includes: a determining unit, configured to, before the first number of first pixels in the region to be detected of the image to be processed is determined, determine the skin pixel area from the pixel area contained in the first face frame; the acquisition unit is further configured to acquire the color value of the second pixel in the skin pixel area; the first processing unit is further configured to use the difference between the color value of the second pixel and the first value as the second threshold, and use the sum of the color value of the second pixel and the second value as the third threshold; neither the first value nor the second value exceeds the maximum value among the color values of the image to be processed.
- the image processing device further includes: a third processing unit, configured to, before the skin pixel area is determined from the pixel area contained in the first face frame, perform mask-wearing detection processing on the image to be processed to obtain a detection result; the determining unit is configured to: when it is detected that the face area in the image to be processed is not wearing a mask, use the pixel area in the face area other than the forehead area, mouth area, eyebrow area and eye area as the skin pixel area; when it is detected that the face area in the image to be processed is wearing a mask, use the pixel area between the first straight line and the fourth straight line as the skin pixel area.
- the fourth straight line is the straight line passing through the key point of the lower eyelid of the left eye and the key point of the lower eyelid of the right eye; both key points belong to the at least one face key point.
- the acquisition unit is configured to: when the at least one face key point contains at least one first key point belonging to the inner area of the left eyebrow and at least one second key point belonging to the inner area of the right eyebrow,
- a rectangular area is determined according to the at least one first key point and the at least one second key point
- grayscale processing is performed on the rectangular area to obtain a grayscale image of the rectangular area
- the color value of the intersection point of the first row and the first column in the grayscale image of the rectangular area is used as the color value of the second pixel point
- the first row is the row with the largest sum of grayscale values in the grayscale image
- the first column is the column with the largest sum of gray values in the gray scale image.
- the detection unit is configured to: if the first ratio does not exceed the first threshold, determine that the skin occlusion detection result indicates that the skin area corresponding to the region to be detected is in an occluded state; if the first ratio exceeds the first threshold, determine that the skin occlusion detection result indicates that the skin area corresponding to the region to be detected is in an unoccluded state.
- the skin area belongs to a person to be detected
- the acquiring unit is further configured to acquire a temperature heat map of the image to be processed
- the image processing device further includes: a fourth processing unit, configured to, when the skin occlusion detection result indicates that the skin area is in an unoccluded state, read the temperature of the skin area from the temperature heat map as the body temperature of the person to be detected.
- the present application also provides a processor, configured to execute the method in the above first aspect and any possible implementation manner thereof.
- the present application also provides an electronic device, including: a processor, a sending device, an input device, an output device, and a memory, where the memory is used to store computer program code including computer instructions; when the processor executes the computer instructions, the electronic device executes the method in the above first aspect and any possible implementation manner thereof.
- the present application also provides a computer-readable storage medium in which a computer program is stored; the computer program includes program instructions which, when executed by a processor, cause the processor to execute the method in the above first aspect and any possible implementation manner thereof.
- the present application also provides a computer program product, the computer program product including a computer program or instructions which, when run on a computer, cause the computer to perform the method in the above first aspect and any possible implementation manner thereof.
- FIG. 1 is a schematic diagram of a pixel coordinate system provided by an embodiment of the present application.
- FIG. 2 is a schematic flow chart of an image processing method provided in an embodiment of the present application.
- FIG. 3 is a schematic flow diagram of another image processing method provided in the embodiment of the present application.
- Fig. 4 is a schematic diagram of key points of a human face provided by the embodiment of the present application.
- FIG. 5 is a schematic structural diagram of an image processing device provided in an embodiment of the present application.
- FIG. 6 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
- the pixel coordinate system xoy is constructed with the upper left corner of the image as the origin o of the pixel coordinate system, the direction parallel to the row of the image as the direction of the x-axis, and the direction parallel to the column of the image as the direction of the y-axis .
- the abscissa is used to indicate the number of columns of the pixels in the image
- the ordinate is used to indicate the number of rows of the pixels in the image.
- the units of both the abscissa and the ordinate can be pixels.
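This convention maps directly onto row-major array indexing: a pixel with abscissa x and ordinate y is element [y, x] of the image array. A small sketch:

```python
import numpy as np

# Origin o at the top-left corner; the abscissa x counts columns and the
# ordinate y counts rows, both measured in pixels.
image = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns
x, y = 2, 1                           # abscissa = column 2, ordinate = row 1
pixel = image[y, x]                   # note the [row, column] order
assert pixel == image[1, 2]
```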
- non-contact temperature measurement is widely used in the field of body temperature detection.
- the non-contact temperature measurement tool has the advantages of fast measurement speed and over-temperature voice alarm. It is suitable for rapid screening of human body temperature in public places with a particularly large flow of people.
- Thermal imaging equipment mainly detects the thermal radiation emitted by objects by collecting light in the thermal infrared band, and finally establishes an accurate correspondence between thermal radiation and temperature to realize the temperature measurement function.
- thermal imaging equipment can cover a large area. It can increase the speed of traffic and reduce the gathering time of groups in detection scenarios with a large flow of people.
- the thermal imaging device mainly recognizes the position of a pedestrian's forehead and then measures body temperature from the forehead area. However, when a pedestrian wears a hat or has bangs, the forehead area may be occluded, and the device cannot tell. In this case, whether the occlusion state of the forehead can be determined has a great influence on the accuracy of body temperature detection.
- an embodiment of the present application provides an image processing method to realize skin occlusion detection for an object to be measured.
- the object to be measured can be a human face, or specifically the forehead area of a human face, or more specifically a specific position in the forehead area.
- the area corresponding to the object to be measured in the image to be processed is referred to as the area to be measured.
- the temperature-measuring object is usually a skin area corresponding to the to-be-measured area in the image to be processed, and the skin occlusion detection result of the temperature-measuring object includes whether the corresponding skin area is occluded.
- the execution subject of the embodiment of the present application is an image processing device, and the image processing device may be one of the following: a mobile phone, a computer, a server, and a tablet computer.
- FIG. 2 is a schematic flowchart of an image processing method provided in an embodiment of the present application.
- 201 Acquire an image to be processed, a first threshold, a second threshold, and a third threshold, the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold threshold.
- the image to be processed may include image blocks containing a human face and image blocks not containing a human face.
- the first threshold is a standard ratio between the number of skin pixels in the forehead area and the number of pixels in the forehead area preset according to specific implementation conditions, and is a criterion for evaluating whether the forehead area is blocked.
- the first threshold in this embodiment of the present application is related to the accuracy required by the detection scenario. For example, suppose a temperature measurement operation is performed on the forehead area of pedestrians: the more exposed skin in the forehead area, the more accurate the temperature measurement result. If the result is considered accurate when the exposed skin accounts for more than 60% of the forehead area, the first threshold can be set to 60%. If higher accuracy is required, the first threshold can be set above 60%; if 60% is considered too strict and such accuracy is not actually needed, the first threshold can be set below 60%, at the cost of reduced accuracy of the corresponding temperature measurement results. Therefore, the first threshold should be set according to the specific implementation, which is not limited in this embodiment of the present application.
- the image processing apparatus receives an image to be processed input by a user through an input component.
- the above-mentioned input components include: a keyboard, a mouse, a touch screen, a touch panel, an audio and video input device, and the like.
- the image processing device receives the image to be processed sent by the data terminal.
- the above-mentioned data terminal may be any of the following: mobile phone, computer, tablet computer, server, etc.
- the image processing device receives the image to be processed sent by the surveillance camera.
- the monitoring camera may be deployed on non-contact temperature measurement products such as artificial intelligence (AI) infrared imagers and security gates (such products are mainly placed in scenes with dense traffic, such as stations, airports, subways, shops, supermarkets, schools, company halls, and community gates).
- the image processing device receives the video stream sent by the surveillance camera, decodes the video stream, and uses the obtained image as the image to be processed.
- the surveillance camera may be deployed on non-contact temperature measurement products such as AI infrared imagers and security gates (such products are mainly placed in crowded scenes such as stations, airports, subways, shops, supermarkets, schools, company halls, and community gates).
- the image processing device is connected to the cameras, and the image processing device can obtain real-time collected data frames from each camera, and the data frames may include images and/or videos.
- the number of cameras connected to the image processing device is not fixed, and the collected data frames can be obtained from the cameras by inputting the network addresses of the cameras into the image processing device.
- if a person at location A wants to use the technical solution provided by this application, they only need to input the network address of the camera at location A into the image processing device; the image processing device can then obtain the data frames collected by that camera, perform subsequent processing on them, and output the detection result of whether the forehead is occluded.
- the color value is a parameter of the hexagonal pyramid model (hue, saturation, value, HSV).
- the three components of the color value in this model are hue (H), saturation (S), and brightness (value, V). That is to say, the color value carries chroma, saturation, and brightness information. Because this application involves skin detection, it is necessary to determine the number of skin pixels in the region to be detected, that is, the first number of first pixels.
- the image processing device regards pixels whose color values are greater than or equal to the second threshold and less than or equal to the third threshold as skin pixels. That is, in the embodiment of the present application, the second threshold and the third threshold are used to determine whether the pixel is a skin pixel.
- the pixel can be considered as a skin pixel corresponding to an unoccluded skin area. For example, suppose the H of the second threshold is 26, the S is 43, and the V is 46; the H of the third threshold is 34, the S is 255, and the V is 255. Then, the color value range of the skin pixel is 26 to 34 for H, 43 to 255 for S, and 46 to 255 for V.
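- the threshold check described above can be sketched in code; this is an illustrative sketch only, and the threshold values follow the example in this paragraph rather than being normative:

```python
# Check whether an HSV pixel lies between the second and third thresholds.
# Threshold values (H 26-34, S 43-255, V 46-255) come from the example
# above and are illustrative only.

SECOND_THRESHOLD = (26, 43, 46)    # lower bound (H, S, V)
THIRD_THRESHOLD = (34, 255, 255)   # upper bound (H, S, V)

def is_skin_pixel(hsv, lo=SECOND_THRESHOLD, hi=THIRD_THRESHOLD):
    """Return True when every HSV channel lies within [lo, hi]."""
    return all(l <= c <= h for c, l, h in zip(hsv, lo, hi))

print(is_skin_pixel((30, 120, 200)))  # within range -> True
print(is_skin_pixel((10, 120, 200)))  # hue too low -> False
```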
- the image processing device determines the first pixel points in the region to be detected, it further determines the number of the first pixel points to obtain the first number.
- the skin occlusion detection result includes that the skin area is in an occluded state or that the skin area is in an unoccluded state.
- the first ratio between the first number and the number of pixels in the area to be tested represents the proportion of unoccluded skin pixels in the area to be tested (hereinafter referred to as the proportion). If the first ratio indicates that the proportion is small, it means that the skin area corresponding to the area to be tested is blocked; conversely, if the first ratio indicates that the proportion is large, it means that the skin area corresponding to the area to be tested is not blocked.
- the image processing device uses the first threshold as the basis for judging the proportion, and then can determine whether the skin area is blocked according to the proportion, so as to obtain the skin occlusion detection result.
- if the proportion does not exceed the first threshold, the proportion is small, and it is determined that the skin area is in a blocked state. If the proportion exceeds the first threshold, the proportion is relatively large, and it is determined that the skin area is in an unoccluded state.
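- the decision rule above can be sketched as follows; the first-threshold value of 0.4 is a hypothetical choice for illustration, not a value fixed by this application:

```python
def skin_occlusion_result(first_number, total_pixels, first_threshold=0.4):
    """Compare the skin-pixel proportion (the first ratio) with the first
    threshold: at or below the threshold -> blocked, above -> unblocked."""
    proportion = first_number / total_pixels
    return "blocked" if proportion <= first_threshold else "unblocked"

print(skin_occlusion_result(120, 1000))  # 12% skin pixels -> "blocked"
print(skin_occlusion_result(800, 1000))  # 80% skin pixels -> "unblocked"
```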
- the image processing device determines the number of skin pixels in the region to be detected in the image to be processed according to the first threshold, that is, the first number.
- the skin area includes a human face area
- the skin occlusion detection result includes a human face occlusion detection result.
- after determining the number of skin pixels in the face area in the image to be processed, the image processing device further determines the proportion of skin pixels in the face area, and can then use the proportion to determine whether the face area is occluded, obtaining the face occlusion detection result. Specifically, when it is determined that the face area is blocked, the face occlusion detection result is that the face area is in a blocked state; when it is determined that the face area is not blocked, the face occlusion detection result is that the face area is not in a blocked state.
- the image processing device before determining the first number of first pixels in the region to be detected of the image to be processed, the image processing device further performs the following steps:
- the face detection process is used to identify whether the image to be processed contains a human object.
- Face detection processing is performed on the image to be processed to obtain the coordinates of the first face frame (as shown in D in FIG. 1 ).
- the coordinates of the first face frame may be upper left corner coordinates, lower left corner coordinates, lower right corner coordinates, and upper right corner coordinates.
- the coordinates of the first face frame may also be a pair of diagonal coordinates, that is, the coordinates of the upper left corner and the lower right corner or the coordinates of the lower left corner and the upper right corner.
- the area contained in the first face frame is the area from the forehead to the chin of the face.
- feature extraction is performed on the image to be processed through a pre-trained neural network to obtain feature data, and the pre-trained neural network identifies whether the image to be processed contains a human face according to the features in the feature data.
- the convolutional neural network is trained, so that the trained convolutional neural network can complete the face detection processing of the image.
- the annotation information of the images in the training data is the face and the position of the face.
- the convolutional neural network extracts the feature data of the image from the image and determines whether there is a human face in the image according to the feature data. If there is a human face in the image, the position of the face is obtained according to the feature data of the image.
- Use the labeling information as the supervision information to supervise the results obtained by the convolutional neural network during the training process, and update the parameters of the convolutional neural network to complete the training of the convolutional neural network. In this way, the image to be processed can be processed by using the trained convolutional neural network to obtain the position of the face in the image to be processed.
- the face detection process can be implemented by a face detection algorithm, where the face detection algorithm can be at least one of the following: a face detection algorithm based on rough histogram segmentation and singular value features, face detection based on the binary wavelet transform, the probabilistic decision-based neural network (PDBNN) method, the hidden Markov model method, etc.; this application does not limit the face detection algorithm used to realize the face detection processing.
- according to the first face frame, determine the face area from the image to be processed.
- the image processing apparatus uses the area surrounded by the first human face frame as the human face area.
- the first human face frame includes an upper frame line and a lower frame line.
- the first face frame includes an upper frame line, a lower frame line, a left frame line and a right frame line; the upper frame line and the lower frame line are the sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line; the left frame line and the right frame line are the sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line.
- the face area includes the forehead area.
- the image processing device determines the face area from the image to be processed according to the first face frame, that is, it determines the forehead area from the image to be processed according to the first face frame.
- the distance between the upper frame line and the lower frame line is the distance from the upper edge of the forehead to the lower edge of the chin of the face included in the first face frame
- the distance between the left frame line and the right frame line is the distance between the inner side of the left ear and the inner side of the right ear of the face contained in the first face frame.
- the width of the forehead area of a face accounts for about 1/3 of the length of the entire face (that is, the distance between the upper and lower edges of the entire face), but the ratio of the width of the forehead area to the length of the face varies from person to person.
- the ratio of the width of the forehead area to the length of the entire human face is in the range of 30% to 40% for each person.
- while keeping the ordinate of the upper frame line unchanged, move the lower frame line along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line and the moved lower frame line is 30% to 40% of the initial distance between the upper frame line and the lower frame line; the area included in the moved first face frame is the forehead area.
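- the frame adjustment above can be sketched as follows, assuming an upright face frame given as upper-left and lower-right corners in pixel coordinates (y grows downward) and a hypothetical ratio of 35% within the stated 30% to 40% range:

```python
def forehead_box(x1, y1, x2, y2, ratio=0.35):
    """Keep the upper frame line fixed and move the lower frame line up
    so the box height becomes `ratio` of the original height."""
    height = y2 - y1
    return (x1, y1, x2, y1 + int(height * ratio))

print(forehead_box(100, 50, 200, 250))  # -> (100, 50, 200, 120)
```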
- the coordinates of the first face frame are a pair of diagonal coordinates
- the coordinates of the upper left corner of the first face frame or the coordinates of the upper right corner of the first face frame determine the position of the forehead area. Therefore, by changing the size and position of the first face frame, the area within the first face frame can be made the forehead area of the face in the image to be processed.
- the image processing device determines the forehead area by performing the following steps:
- At least one key point of human face is obtained by performing human face key point detection on the image to be processed, and at least one key point includes a left eyebrow key point and a right eyebrow key point.
- Feature extraction is performed on the image to be processed to obtain feature data, which can realize face key point detection.
- the feature extraction process can be realized by a pre-trained neural network, or by a feature extraction model, which is not limited in this application.
- the feature data is used to extract the key point information of the face in the image to be processed.
- the above image to be processed is a digital image, and the feature data obtained by performing feature extraction on the image to be processed can be understood as deeper semantic information of the image to be processed.
- a face image set for training is established, and positions of key points to be detected are marked.
- A deep neural network estimates the rotation angle of each local area and corrects it according to the estimated rotation angle, and a fourth-layer deep neural network is constructed for the corrected data set of each local area. Given any new face image, the above four-layer deep neural network model is used for key point detection to obtain the final face key point detection result.
- the convolutional neural network is trained by using multiple images with annotation information as training data, so that the trained convolutional neural network can complete the image recognition.
- For face key point detection processing, the annotation information of the images in the training data is the key point positions of the faces.
- the convolutional neural network extracts the feature data of the image from the image, and determines the key point position of the face in the image according to the feature data.
- Use the labeling information as the supervision information to supervise the results obtained by the convolutional neural network during the training process, and update the parameters of the convolutional neural network to complete the training of the convolutional neural network.
- the image to be processed can be processed using the trained convolutional neural network to obtain key point positions of faces in the image to be processed.
- At least two convolutional layers are used to perform convolution processing on the image to be processed layer by layer to complete the feature extraction process of the image to be processed.
- the convolutional layers in at least two convolutional layers are connected in sequence, that is, the output of the previous convolutional layer is the input of the next convolutional layer, and the content and semantic information extracted by each convolutional layer are different.
- in practice, the feature extraction process abstracts the features of the face in the image to be processed step by step, while gradually discarding relatively minor feature data, where relatively minor feature data refers to feature data other than that of the detected face. Therefore, the feature data extracted later is smaller in size, but its content and semantic information are more concentrated.
- convolving the image to be processed step by step through multiple convolutional layers can reduce the size of the image to be processed while obtaining its content information and semantic information, reducing the data processing load of the image processing device and improving its computing speed.
- the implementation process of convolution processing is as follows: slide the convolution kernel over the image to be processed, and call the pixel on the image corresponding to the central element of the convolution kernel the target pixel. Multiply each pixel value under the kernel by the corresponding value in the convolution kernel, and then add all the products to obtain the convolved value, which is used as the pixel value of the target pixel. Finally, after sliding over the whole image to be processed and updating the pixel values of all pixels, the convolution processing of the image to be processed is completed and the feature data is obtained.
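- the sliding-window operation described above can be sketched as a naive implementation; border handling (padding) is omitted for brevity, so the output is smaller than the input:

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (lists of rows); at each position,
    multiply overlapping values and sum them, as described above."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    oh, ow = ih - kh + 1, iw - kw + 1   # valid positions only (no padding)
    out = [[0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            out[y][x] = sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            )
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
mean_kernel = [[0.25, 0.25],
               [0.25, 0.25]]   # 2x2 averaging kernel
print(convolve2d(image, mean_kernel))  # -> [[3.0, 4.0], [6.0, 7.0]]
```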
- the features in the feature data are identified by a neural network that extracts the feature data, so as to obtain the key point information of the face in the image to be processed.
- a face key point detection algorithm is adopted to realize face key point detection; the algorithm can be at least one of OpenFace, multi-task cascaded convolutional networks (MTCNN), tweaked convolutional neural networks (TCNN), or tasks-constrained deep convolutional network (TCDCN); this application does not limit the face key point detection algorithm.
- MTCNN multi-task cascaded convolutional networks
- TCNN adjusted convolutional neural networks
- TCDCN task-constrained deep convolutional network
- the first straight line is a straight line passing through the above-mentioned left eyebrow key point and the above-mentioned right eyebrow key point.
- the distance between the above-mentioned upper frame line and the above-mentioned lower frame line is the distance from the upper edge of the forehead to the lower edge of the chin of the face included in the first face frame
- the distance between the above-mentioned left frame line and the above-mentioned right frame line is the distance between the inside of the left ear and the inside of the right ear of the face included in the first face frame.
- the forehead area is above the first straight line included in the first face frame
- moving the lower frame line to coincide with the first straight line makes the area included in the moved first face frame the forehead area. That is, while keeping the ordinate of the upper frame line unchanged, move the lower frame line along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed, so that the line on which the moved lower frame line lies coincides with the first straight line, obtaining the second face frame.
- the area contained in the second face frame is the forehead area.
- the image processing device performs the following steps during step 23:
- the distance between the left frame line of the second face frame and the right frame line of the second face frame is the distance from the inside of the left ear to the inside of the right ear of the face included in the second face frame.
- the distance between the upper frame line of the first face frame and the lower frame line of the first face frame is the distance from the upper edge of the forehead to the lower edge of the chin of the face contained in the first face frame.
- the width of the forehead area accounts for about 1/3 of the length of the entire face, but the ratio of the width of the forehead area to the length of the face differs from person to person; however, for all people this ratio is in the range of 30% to 40%.
- the preset distance is set to 30% to 40% of the distance between the upper frame line of the first face frame and the lower frame line of the first face frame. Therefore, to make the area inside the second face frame the forehead area, the distance between the upper frame line of the second face frame and the lower frame line of the second face frame must be reduced to 30% to 40% of the distance between the upper frame line and the lower frame line of the first face frame.
- while keeping the ordinate of the lower frame line of the second face frame unchanged, move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is the preset distance, obtaining a third face frame.
- the area included in the third face frame is the forehead area.
- the image processing device performs the following steps during the execution of step 25:
- the reference distance is the distance between the two intersection points of the second straight line and the face contour contained in the third face frame, where the second straight line is between the first straight line and the third straight line and parallel to the first straight line or the third straight line, and the third straight line is a straight line passing through the left mouth corner key point and the right mouth corner key point.
- the at least one human face key point further includes a left mouth corner key point and a right mouth corner key point.
- the third straight line is a straight line passing through the key points of the left corner of the mouth and the key point of the right corner of the mouth.
- the second straight line is between the first straight line and the third straight line, and the second straight line is parallel to the first straight line or the third straight line.
- the distance between the two intersection points of the second straight line and the face contour of the face image included in the third face frame is taken as the reference distance.
- the second straight line is between the first straight line and the third straight line, that is, in the middle area between the eyebrow area and the mouth area.
- the length of the forehead area is the width of the contour of the face, that is, the reference distance.
- move the right frame line of the third face frame along the negative direction of the horizontal axis of the pixel coordinate system of the image to be processed by half of the difference between the current width and the reference distance, and move the left frame line of the third face frame along the positive direction of the horizontal axis by the other half of that difference, so that the distance between the moved left frame line and the moved right frame line of the third face frame is the reference distance.
- the region included in the moved third human face frame is the forehead region.
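- the symmetric narrowing described above can be sketched as follows, moving each side frame line inward by half of the width difference so the final width equals the reference distance:

```python
def narrow_to_reference(left, right, reference):
    """Move the left line right and the right line left, each by half of
    (current width - reference distance), so the new width equals the
    reference distance."""
    half = (right - left - reference) / 2
    return left + half, right - half

print(narrow_to_reference(100, 200, 80))  # -> (110.0, 190.0)
```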
- the image processing device before determining the first number of first pixels in the region to be detected of the image to be processed, the image processing device further performs the following steps:
- the skin pixel point area can be the cheek area below the eyes included in the first face frame, the intersection of the area below the nose and the area above the mouth included in the first face frame, or the area under the mouth contained in the first face frame.
- the image processing device before determining the skin pixel point area from the pixel point area contained in the above-mentioned face frame, the image processing device further performs the following steps:
- mask wearing detection is performed on the image to be processed, and the obtained detection result includes: the person in the image to be processed is wearing a mask, or the person in the image to be processed is not wearing a mask.
- the image processing device performs first feature extraction processing on the image to be processed to obtain first feature data, where the first feature data carries information about whether the person to be detected is wearing a mask.
- the image processing device obtains the detection result according to the first feature data obtained from the mask wearing detection.
- the first feature extraction process can be implemented through a mask detection network.
- the mask detection network can be obtained by training the deep convolutional neural network.
- the annotation information includes whether the person in the first training image is wearing a mask.
- the at least one human face key point also includes key points of the lower eyelid of the left eye and key points of the lower eyelid of the right eye.
- the pixel point area between the first straight line and the fourth straight line in the face area is taken as the skin pixel point area.
- the fourth straight line is a straight line passing through the key points of the lower eyelid of the left eye and the key point of the lower eyelid of the right eye; both the key points of the lower eyelid of the left eye and the key points of the lower eyelid of the right eye belong to at least one of the aforementioned key points of the human face.
- the skin pixel area of the face area is an area other than the skin area, the mouth area, the eyebrow area, and the eye area. This is because in the face area, pixels in the eye area and eyebrow area have color values displayed as black, and pixels in the mouth area have color values displayed as red; therefore, the skin pixel area does not include the eye area, mouth area, and eyebrow area. And because it is uncertain whether the skin area is covered by a hat, bangs, or the like, the skin pixel area corresponding to the skin area cannot be determined. Therefore, when the mask wearing detection of the image to be processed determines that the face area is not wearing a mask, the skin pixel area includes the pixel area in the face area except the skin area, mouth area, eyebrow area, and eye area.
- Face key point detection can obtain the key point coordinates of the lower eyelid of the left eye, the key point coordinates of the lower eyelid of the right eye, the key point coordinates of the left eyebrow, and the key point coordinates of the right eyebrow.
- the fourth straight line is the straight line passing through the key points of the lower eyelid of the left eye and the key point of the lower eyelid of the right eye
- the first straight line is the straight line passing through the key points of the left eyebrow and the right eyebrow.
- the eyebrow area, the eyelid area, and the nasion area all lie between the horizontal line determined by the left eyebrow and the right eyebrow in the face area and the straight line determined by the lower eyelid of the left eye and the lower eyelid of the right eye. Therefore, when the detection result is that the face area is wearing a mask, the pixel point area between the first straight line and the fourth straight line in the face area is taken as the skin pixel point area.
- the color value of the second pixel point is obtained from the skin pixel point area, where the color value of the second pixel point is used as a benchmark for measuring the skin color exposed in the skin area. Therefore, the second pixel point may be any point in the skin pixel point area.
- obtaining the second pixel in the skin pixel area can be implemented as follows: take the pixel at the average of the coordinates of a certain skin pixel area as the second pixel; or take the pixel at the intersection coordinates of straight lines determined by certain key points as the second pixel; or perform grayscale processing on an image of part of the skin pixel area and take the pixel with the largest grayscale value as the second pixel.
- the embodiment of the present application does not limit the manner of acquiring the second pixel.
- when there are two key points in each of the inner area of the right eyebrow and the inner area of the left eyebrow, let the key points be the upper point on the inner side of the right eyebrow, the lower point on the inner side of the right eyebrow, the upper point on the inner side of the left eyebrow, and the lower point on the inner side of the left eyebrow.
- the unique point of intersection can be obtained by these two intersecting straight lines. As shown in the figure, suppose the numbers corresponding to these four key points are 37, 38, 67, and 68 respectively.
- the key points 37 and 68 are connected, and the key points 38 and 67 are connected. After determining these two straight lines, an intersection point can be obtained. Based on the position of the face frame, the coordinates of the four key points 37, 38, 67, and 68 can be determined, and then the coordinates of the intersection points can be solved by using Opencv. By determining the coordinates of the intersection point, the pixel point corresponding to the intersection point can be obtained. By converting the RGB channel of the pixel corresponding to the intersection point into an HSV channel, the color value of the pixel corresponding to the intersection coordinate can be obtained. The color value of the pixel corresponding to the intersection coordinate is the color value of the second pixel.
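- the two steps above (intersecting the diagonals through the key points, then converting the pixel at the intersection from RGB to HSV) can be sketched as follows. Python's standard colorsys module stands in for OpenCV here: colorsys returns H, S, V in [0, 1], whereas OpenCV's 8-bit convention uses H in [0, 179] and S, V in [0, 255]. The key-point coordinates and pixel value are made-up examples:

```python
import colorsys

def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through
    p3, p4 (assumed non-parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / d
    return x1 + t * (x2 - x1), y1 + t * (y2 - y1)

# Diagonals of a square cross at its center.
print(line_intersection((0, 0), (2, 2), (0, 2), (2, 0)))  # -> (1.0, 1.0)

r, g, b = 200, 150, 120   # hypothetical pixel value at the intersection
h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
print(round(h * 179), round(s * 255), round(v * 255))  # OpenCV-style scale
```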
- when there are two key points in each of the inner area of the right eyebrow and the inner area of the left eyebrow, let the key points be the upper point on the inner side of the right eyebrow, the lower point on the inner side of the right eyebrow, the upper point on the inner side of the left eyebrow, and the lower point on the inner side of the left eyebrow. A rectangular area determined by these 4 key points is used as the eyebrow area. As shown in the figure, assuming the numbers corresponding to these four key points are 37, 38, 67, and 68 respectively, a rectangular area is calculated through these four key points as the eyebrow area.
- the obtained coordinates of the key points 37, 38, 67, 68 are defined as (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4) respectively.
- the four coordinates of the intercepted eyebrow area are (X6, Y6), (X5, Y5), (X5, Y6), (X6, Y5).
- the coordinates of the four key points 37, 38, 67, and 68 can be determined, and (X6, Y6), (X5, Y5), (X5, Y6), (X6, Y5) can be determined The positions of these four points. Connect (X6, Y6) and (X5, Y5) and connect (X5, Y6) and (X6, Y5) to obtain two straight lines, and a unique intersection point can be obtained through these two straight lines. Then, Opencv can be used to solve the coordinates of the intersection point.
- by determining the coordinates of the intersection point, the pixel corresponding to the intersection point can be obtained. By converting the RGB channels of the pixel corresponding to the intersection point into HSV channels, the color value of the pixel at the intersection coordinates can be obtained. This color value is the color value of the second pixel.
- the image processing device performs the following steps during step 4:
- the at least one face key point includes at least one first key point belonging to the inner area of the left eyebrow and at least one second key point belonging to the inner area of the right eyebrow, and a rectangular area is determined according to the at least one first key point and the at least one second key point.
- various schemes for obtaining a rectangular area according to the at least one first key point and the at least one second key point are included.
- the pixel point corresponding to the intersection point can be obtained.
- By converting the RGB channel of the pixel corresponding to the intersection to the HSV channel the color value of the pixel corresponding to the intersection can be obtained.
- the color value of the pixel corresponding to the intersection coordinate is the color value of the second pixel.
- the line connecting the two key points on the inner side of the left eyebrow is used as the first side length of the rectangular area; from the two key points on the inner side of the left eyebrow, select the one whose ordinate differs from that of the key point on the inner side of the right eyebrow, and use the line connecting it with the key point on the inner side of the right eyebrow as the second side length of the rectangular area.
- the at least one first key point includes a third key point and a fourth key point; the at least one second key point includes a fifth key point and a sixth key point; the ordinate of the third key point is smaller than that of the fourth key point; the ordinate of the fifth key point is smaller than that of the sixth key point; the first abscissa and the first ordinate determine the first coordinate; the second abscissa and the first ordinate determine the second coordinate; the first abscissa and the second ordinate determine the third coordinate; the second abscissa and the second ordinate determine the fourth coordinate; the first ordinate is the maximum of the ordinates of the third key point and the fifth key point; the second ordinate is the minimum of the ordinates of the fourth key point and the sixth key point; the first abscissa is the maximum of the abscissas of the third key point and the fourth key point; the second abscissa is the minimum of the abscissas of the fifth key point and the sixth key point.
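- the coordinate rules above can be sketched directly with max/min; the key-point coordinates in the example are hypothetical (third/fourth key points on the inner left eyebrow, fifth/sixth on the inner right eyebrow, y growing downward):

```python
def eyebrow_rectangle(p3, p4, p5, p6):
    """Each argument is an (x, y) key point. Returns the four corners
    (first, second, third, fourth coordinates) of the rectangle."""
    y_first = max(p3[1], p5[1])     # first ordinate: max of third, fifth
    y_second = min(p4[1], p6[1])    # second ordinate: min of fourth, sixth
    x_first = max(p3[0], p4[0])     # first abscissa: max of third, fourth
    x_second = min(p5[0], p6[0])    # second abscissa: min of fifth, sixth
    return [(x_first, y_first), (x_second, y_first),
            (x_first, y_second), (x_second, y_second)]

print(eyebrow_rectangle((100, 60), (102, 80), (140, 62), (138, 78)))
# -> [(102, 62), (138, 62), (102, 78), (138, 78)]
```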
- the image processing device performs the following steps during step 4:
- the at least one face key point includes at least one first key point belonging to the inner area of the left eyebrow
- the at least one face key point includes at least one second key point belonging to the inner area of the right eyebrow
- the coordinates of at least one first keypoint and at least one second keypoint are averaged.
- the key points of the inner area of the right eyebrow and the inner area of the left eyebrow are set as the upper point on the inner side of the right eyebrow, the lower point on the inner side of the right eyebrow, the upper point on the inner side of the left eyebrow, and the lower point on the inner side of the left eyebrow, numbered 37, 38, 67, and 68 respectively.
- Obtain the coordinates of key points 37, 38, 67, and 68 as (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4) respectively, and average the abscissas and the ordinates of these four coordinates to obtain the average coordinate (X0, Y0).
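The averaging above can be sketched in a few lines; the key-point coordinates used here are hypothetical placeholders for key points 37, 38, 67, and 68.

```python
def average_coordinate(points):
    """Return (X0, Y0): the mean of the abscissas and of the ordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Hypothetical coordinates for key points 37, 38, 67 and 68.
kp37, kp38, kp67, kp68 = (120, 100), (118, 130), (80, 102), (82, 128)
x0, y0 = average_coordinate([kp37, kp38, kp67, kp68])
print((x0, y0))  # -> (100.0, 115.0)
```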
- By converting the pixel's RGB channels into HSV channels, the color value of the pixel corresponding to the average coordinate (X0, Y0) can be obtained according to the average coordinate; the color value of this pixel is taken as the color value of the second pixel.
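As a sketch of the channel conversion, the standard-library `colorsys` routine can reproduce the 8-bit HSV convention used by OpenCV (hue in 0 to 180, saturation and brightness in 0 to 255); the RGB sample below is hypothetical.

```python
import colorsys

def rgb_to_hsv_cv(r, g, b):
    """Convert an 8-bit RGB pixel to OpenCV-style HSV:
    hue in [0, 180), saturation and brightness in [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (round(h * 180), round(s * 255), round(v * 255))

# Hypothetical color of the pixel at the average coordinate (X0, Y0).
print(rgb_to_hsv_cv(0, 255, 0))  # pure green -> (60, 255, 255)
```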
- the image processing device performs the following steps during step 4:
- the at least one facial key point further includes a key point inside the right eyebrow, a key point on the left side of the nasion, a key point on the right side of the nasion, and a key point inside the left eyebrow.
- Connect the key point inside the right eyebrow with the key point on the left side of the nasion, and connect the key point inside the left eyebrow with the key point on the right side of the nasion, to obtain two intersecting straight lines, namely the fifth straight line and the sixth straight line.
- This application does not limit the key point inside the right eyebrow and the key point inside the left eyebrow.
- the key point inside the right eyebrow is any key point taken in the inner area of the right eyebrow
- the key point inside the left eyebrow is any key point taken in the inner area of the left eyebrow.
- the numbers corresponding to these four key points are 67, 68, 78, and 79 respectively, that is, connecting key points 78 and 68, and connecting key points 79 and 67
- the coordinates of the four key points 67, 68, 79, and 78 can be determined, and the coordinates of the intersection point can then be solved using OpenCV.
- By determining the coordinates of the intersection point, the pixel corresponding to the intersection point can be obtained. By converting the RGB channels of the pixel corresponding to the intersection point into HSV channels, the color value of the pixel corresponding to the intersection coordinate can be obtained; this color value is the color value of the second pixel.
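Solving for the intersection of the fifth and sixth straight lines reduces to a 2x2 linear system; a minimal sketch follows, where the point coordinates are hypothetical stand-ins for key points 78, 68, 79, and 67.

```python
def line_through(p, q):
    """Coefficients (A, B, C) of the line A*x + B*y = C through p and q."""
    a = q[1] - p[1]
    b = p[0] - q[0]
    return a, b, a * p[0] + b * p[1]

def intersection(p1, q1, p2, q2):
    """Intersection of line p1-q1 with line p2-q2; None if parallel."""
    a1, b1, c1 = line_through(p1, q1)
    a2, b2, c2 = line_through(p2, q2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

print(intersection((0, 0), (2, 2), (0, 2), (2, 0)))  # -> (1.0, 1.0)
```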
- the difference between the color value of the second pixel and the first value is used as the second threshold, and the sum of the color value of the second pixel and the second value is used as the third threshold, wherein neither the first value nor the second value exceeds the maximum value among the color values of the image to be processed.
- the second threshold and the third threshold can be determined by determining the color value of the second pixel.
- an OpenCV function can convert the representation of the image from an RGB channel map to an HSV channel map, so as to obtain the color value of the second pixel.
- the color value includes three parameter values: hue, brightness, and saturation. The range of hue is 0 to 180, and the ranges of brightness and saturation are both 0 to 255; that is, the maximum value of hue is 180, and the maximum value of brightness and of saturation is 255. It should be understood that the first value and the second value each also include the three parameters of hue, brightness, and saturation. Therefore, neither the hue of the first value nor the hue of the second value exceeds 180, neither the brightness of the first value nor the brightness of the second value exceeds 255, and neither the saturation of the first value nor the saturation of the second value exceeds 255.
- the three parameter values of hue, brightness, and saturation of the first value are consistent with those of the second value. That is to say, the hue, brightness, and saturation of the color value of the second pixel are the intermediate values of the hue, brightness, and saturation corresponding to the second threshold and the third threshold.
- the mapping relationship between the color value of the second pixel and the second and third thresholds can also be obtained through machine-learning binary classification algorithms, such as logistic regression or the naive Bayes algorithm: for an input color value, the classifier judges whether that color belongs to the color value of the second pixel. That is, a set of color values is input, each is classified as belonging or not belonging to the color value of the second pixel, and the ones that belong are thereby determined. In this way, the mapping relationship between the color value of the second pixel and the second threshold and the third threshold can be obtained through a machine-learning algorithm.
- the three parameter values of hue, brightness, and saturation corresponding to both the first value and the second value are 30, 60, and 70 respectively. That is to say, after the color value of the second pixel is obtained, the corresponding second threshold is obtained by decreasing the hue by 30, the brightness by 60, and the saturation by 70, and the corresponding third threshold is obtained by increasing the hue by 30, the brightness by 60, and the saturation by 70.
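A hedged sketch of deriving the two thresholds, clamping so that no channel leaves the valid HSV range (hue at most 180, saturation and brightness at most 255). Tuples here are in (hue, saturation, brightness) order, so the (30, 60, 70) deltas from the example appear as (30, 70, 60); the center color is hypothetical.

```python
HSV_MAX = (180, 255, 255)  # per-channel maxima: hue, saturation, brightness

def hsv_thresholds(color, delta):
    """Second threshold = color - delta, third threshold = color + delta,
    each channel clamped to its valid range."""
    lower = tuple(max(c - d, 0) for c, d in zip(color, delta))
    upper = tuple(min(c + d, m) for c, d, m in zip(color, delta, HSV_MAX))
    return lower, upper

# Hypothetical second-pixel color; deltas from the example in the text.
print(hsv_thresholds((110, 120, 130), (30, 70, 60)))
# -> ((80, 50, 70), (140, 190, 190))
print(hsv_thresholds((15, 240, 200), (30, 70, 60)))  # clamped at the edges
# -> ((0, 170, 140), (45, 255, 255))
```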
- the image processing device performs the following steps during the execution of step 203:
- If the skin occlusion detection result shows that the skin area is occluded, the image processing device outputs prompt information indicating that the skin needs to be exposed. According to the prompt, skin occlusion detection can be performed again after the skin is exposed, or other operations can be performed, which is not limited in this application.
- the image processing device determines that the skin occlusion detection result is that the skin area is in an unoccluded state according to the result that the first ratio of the first number to the number of pixels in the area to be detected is equal to or greater than the first threshold.
- the first number is 60
- the number of pixels in the region to be tested is 100
- the first number is 70
- the number of pixels in the region to be tested is 100
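The ratio test described above can be sketched as follows; the 0.65 first threshold is a hypothetical value chosen so that the two example counts (60 and 70 out of 100) fall on different sides of it.

```python
def skin_occlusion_result(first_number, total_pixels, first_threshold):
    """Unoccluded when the first ratio reaches the first threshold."""
    first_ratio = first_number / total_pixels
    return "unoccluded" if first_ratio >= first_threshold else "occluded"

print(skin_occlusion_result(60, 100, 0.65))  # 0.60 < 0.65 -> occluded
print(skin_occlusion_result(70, 100, 0.65))  # 0.70 >= 0.65 -> unoccluded
```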
- a temperature measurement operation or other operations may be performed. If the temperature measurement is performed when the skin occlusion detection result shows that the skin area is in an unoccluded state, the accuracy of temperature detection can be improved.
- the present application does not limit the subsequent operations performed when the skin occlusion detection result shows that the skin area is in an unoccluded state.
- the image processing device also performs the following steps:
- the image processing method in the embodiment of the present application can be used in the field of temperature measurement, and the above skin area belongs to the person to be detected.
- Each pixel in the temperature thermodynamic map carries the temperature information of the corresponding pixel.
- the temperature thermodynamic map is collected by an infrared thermal imaging device on the image processing device.
- the image processing device performs image matching processing on the temperature thermodynamic map and the image to be processed, determines from the temperature thermodynamic map the pixel area corresponding to the face area of the image to be processed, and thereby obtains, on the temperature thermodynamic map, the pixel area corresponding to the face area of the person in the image to be processed.
- the body temperature of the subject is determined by detecting the temperature of the forehead area of the subject
- the above-mentioned skin occlusion detection result shows that the skin area is in an unoccluded state
- generally speaking, the skin area is located in the upper 30% to 40% of the entire face area, so the temperature corresponding to the skin area can be obtained from the pixel area of the temperature thermodynamic map corresponding to the face area.
- the average temperature of the skin area can be used as the body temperature of the person to be detected, or the highest temperature of the skin area can be used as the body temperature of the person to be detected, which is not limited in this application.
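Reading the body temperature off the matched region can be sketched as below; the region box, the 2x2 map, and the temperature values are hypothetical, and either the average or the maximum may be used, as the text notes.

```python
def region_temperature(heatmap, region, use_max=False):
    """Average (or maximum) temperature over the pixels of the
    thermodynamic map inside region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    values = [heatmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return max(values) if use_max else sum(values) / len(values)

heatmap = [[36.2, 36.4],
           [36.6, 36.8]]  # hypothetical per-pixel temperatures
print(region_temperature(heatmap, (0, 0, 2, 2)))        # average
print(region_temperature(heatmap, (0, 0, 2, 2), True))  # maximum
```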
- FIG. 3 is a schematic flowchart of an applied image processing method provided by an embodiment of the present application.
- the embodiment of the present application also provides a possible application scenario of the image processing method.
- the temperature of the forehead area of pedestrians is generally measured.
- When pedestrians have bangs covering their foreheads or are wearing hats, it is impossible to determine whether the forehead area is covered, which interferes with the temperature measurement to a certain degree and brings challenges to current temperature measurement work. Therefore, before the temperature is measured, occlusion detection is performed on the pedestrian's forehead, and the temperature of the pedestrian's forehead is measured only when the forehead is not covered, which can improve the accuracy of temperature measurement.
- the image processing device acquires camera frame data, that is, an image to be processed. Face detection is performed on the image to be processed; if the result of the face detection is that there is no human face in the image to be processed, the image processing device acquires a new image to be processed. If the result of the face detection is that there is a human face, the image processing device inputs the image to be processed into the trained neural network, which outputs the coordinates of the face frame of the image to be processed (as shown in D of Figure 1) and the coordinates of the 106 key points (as shown in Figure 4).
- the coordinates of the face frame can be a pair of diagonal coordinates, namely the upper-left corner coordinate and the lower-right corner coordinate, or the lower-left corner coordinate and the upper-right corner coordinate (as shown in Figure 1).
- the neural network that outputs the coordinates of the face frame of the image to be processed and the coordinates of the 106 key points can be a single neural network, or can be two neural networks in series that implement face detection and face key point detection respectively.
- the color value of the brightest pixel in the area between the eyebrows is used as the color-value reference for the exposed skin of the forehead area.
- the brightest pixel is the above-mentioned second pixel; therefore, the area between the eyebrows needs to be obtained first.
- the key points of the inner area of the left eyebrow and the inner area of the right eyebrow are obtained.
- the key points are the upper point on the inner side of the right eyebrow, the lower point on the inner side of the right eyebrow, the upper point on the inner side of the left eyebrow, and the lower point on the inner side of the left eyebrow.
- the embodiment of the present application takes the coordinates of 106 key points as an example.
- the upper point on the inner side of the right eyebrow, the lower point on the inner side of the right eyebrow, the upper point on the inner side of the left eyebrow, and the lower point on the inner side of the left eyebrow correspond to the four key points 37, 38, 67, and 68.
- the key-point numbering here does not constitute a limitation; as long as two key points are taken from the inner area of the right eyebrow and two from the inner area of the left eyebrow respectively, the scheme is within the scope of protection claimed by this application.
- the coordinates of key points 37, 38, 67, and 68 are obtained as (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4) respectively.
- a rectangular area can be determined.
- the coordinates of the four vertices of the rectangular area are (X6, Y6), (X5, Y5), (X5, Y6), (X6, Y5), and this rectangular area is also the brow area to be intercepted.
- the coordinates of the four points 37, 38, 67, and 68 can be determined through the key point detection of the face, then (X6, Y6), (X5, Y5), (X5, Y6), (X6, Y5) can be determined The positions of these four points. Intercept the rectangular area determined according to these four points to obtain the area between the eyebrows.
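A sketch of this crop, following the construction stated earlier in the text (abscissas span between the inner eyebrow edges; ordinates span from the lower of the two upper points to the upper of the two lower points). The key-point coordinates are hypothetical, and image coordinates with y growing downward are assumed.

```python
def between_brows_rect(right_pair, left_pair):
    """Rectangle between the eyebrows.
    right_pair / left_pair: (upper point, lower point) of each
    eyebrow's inner side, as (x, y) tuples."""
    (ux1, uy1), (lx1, ly1) = right_pair
    (ux2, uy2), (lx2, ly2) = left_pair
    x_start = max(ux1, lx1)   # inner edge of one eyebrow
    x_end = min(ux2, lx2)     # inner edge of the other eyebrow
    y_top = max(uy1, uy2)     # lower of the two upper points
    y_bottom = min(ly1, ly2)  # upper of the two lower points
    return x_start, y_top, x_end, y_bottom

# Hypothetical key points 37/38 (right inner) and 67/68 (left inner).
print(between_brows_rect(((80, 100), (82, 128)), ((120, 102), (118, 130))))
# -> (82, 102, 118, 128)
```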
- gray-scale processing is performed on the brow area to obtain a gray-scale image of the brow area.
- grayscale processing is commonly performed by either of two standard methods; an OpenCV function can also be used to perform grayscale processing on the area between the eyebrows. The grayscale processing method of the area between the eyebrows is not limited in this application.
- Add the gray values of each row of the gray image of the brow area and record the coordinates of the row with the largest sum of gray values.
- the gray value of each column of the gray image of the brow area is added, and the coordinates of the column with the largest sum of gray values are recorded.
- the coordinates of the brightest pixel in the area between the eyebrows are obtained as the intersection determined by the row with the largest sum of gray values and the column with the largest sum of gray values.
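The row/column-sum search above can be sketched directly; the 3x3 grayscale grid is a hypothetical example.

```python
def brightest_pixel(gray):
    """(row, column) whose gray-value sums are largest; their
    intersection approximates the brightest pixel."""
    row_sums = [sum(row) for row in gray]
    col_sums = [sum(col) for col in zip(*gray)]
    return row_sums.index(max(row_sums)), col_sums.index(max(col_sums))

gray = [[10, 20, 10],
        [20, 90, 30],
        [10, 30, 10]]
print(brightest_pixel(gray))  # -> (1, 1)
```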
- Find the RGB value of the brightest pixel in the area between the eyebrows, and convert it to the corresponding HSV value through the conversion formula. Because the HSV value of the brightest pixel in the area between the eyebrows has a definite relationship with the second threshold and the third threshold, the HSV value of the brightest pixel can determine the corresponding second threshold and third threshold.
- the length of the forehead area is the width of the human face.
- the face frame is reduced so that the distance between the left frame line and the right frame line of the face frame is the distance between key point 0 and key point 32. That is to say, the distance between the key point 0 and the key point 32 is taken as the length of the forehead area.
- the width of the forehead area accounts for about 1/3 of the entire face frame. Although the ratio of the width of the forehead area to the length of the entire face differs from person to person, the width of the forehead area almost always falls within the range of 30% to 40% of the length of the face.
- the distance between the upper frame line and the lower frame line of the face frame is reduced to 30% to 40% of the distance between the upper frame line and the lower frame line of the original face frame, as the width of the forehead area.
- the forehead area is the area located above the eyebrows.
- the horizontal line determined by key points 35 and 40 is the position of the eyebrows. Therefore, the resized face frame is moved so that its lower frame line lies on the horizontal line determined by key points 35 and 40, obtaining a face frame with changed position and size.
- the rectangular area contained in the face frame whose size and position are changed is the forehead area.
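The forehead-area construction above can be sketched as follows. The 0.35 height ratio (within the stated 30% to 40% range) and all coordinates are assumptions for illustration; y is taken to grow downward.

```python
def forehead_box(kp0, kp32, eyebrow_y, face_top, face_bottom, ratio=0.35):
    """Forehead rectangle: horizontal span between key points 0 and 32
    (the face width), height a fixed fraction of the face-frame height,
    lower edge on the eyebrow line through key points 35 and 40."""
    left, right = kp0[0], kp32[0]
    height = (face_bottom - face_top) * ratio
    return left, eyebrow_y - height, right, eyebrow_y

box = forehead_box((40, 0), (140, 0), eyebrow_y=120,
                   face_top=60, face_bottom=260)
print(box)  # left, top, right, bottom of the forehead area
```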
- the forehead area is intercepted, and then the forehead area is binarized according to the second threshold and the third threshold to obtain a binarized image of the forehead area.
- the binary image is used here, which can reduce the amount of data processing and speed up the detection of the forehead area by the image processing device.
- the binarization standard is: if the HSV value of a pixel in the forehead area is greater than or equal to the second threshold and less than or equal to the third threshold, the gray value of this pixel is set to 255; if the HSV value of a pixel in the forehead area is less than the second threshold or greater than the third threshold, the gray value of this pixel is set to 0.
- First convert the forehead region image from RGB channel map to HSV channel map.
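The binarization standard amounts to a per-channel range test, in the manner of OpenCV's `inRange`; a minimal pure-Python sketch using the threshold values from the example given later in this section:

```python
def binarize_hsv(hsv, lower, upper):
    """255 if every HSV channel lies within [lower, upper], else 0."""
    inside = all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))
    return 255 if inside else 0

second_threshold = (100, 50, 70)
third_threshold = (120, 90, 100)
print(binarize_hsv((110, 60, 70), second_threshold, third_threshold))  # 255
print(binarize_hsv((130, 90, 20), second_threshold, third_threshold))  # 0
```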
- if the ratio of white pixels reaches the threshold, the forehead area is considered to be in an unoccluded state, and the thermal-imaging temperature measurement operation is performed.
- otherwise, the forehead area is considered to be in an occluded state; performing the temperature measurement at this time would affect its accuracy, so a prompt to expose the forehead needs to be output, and the image processing device needs to re-acquire an image to re-detect the forehead occlusion state.
- the second threshold is (100, 50, 70)
- the third threshold is (120, 90, 100)
- the color value of the pixel point q in the forehead area is (110, 60, 70)
- the color value of the pixel point p in the forehead area is (130, 90, 20).
- q is within the range of the second threshold and the third threshold
- p is not within the range of the second threshold and the third threshold.
- the gray value of the pixel point q is 255
- the gray value of the pixel point p is 0.
- assuming the threshold is 60%, the number of pixels in the forehead area is 100, and the number of white pixels is 50, the ratio of the number of white pixels to the number of pixels in the forehead area is 50%, which does not reach the threshold; the forehead area is therefore in an occluded state, and a prompt to expose the forehead needs to be output.
- the writing order of the steps does not imply a strict execution order, nor does it constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
- FIG. 5 is a schematic structural diagram of an image processing device provided by an embodiment of the present application, wherein the device 1 includes an acquisition unit 11, a first processing unit 12, and a detection unit 13.
- an image processing device 1 also includes a second processing unit 14, a determination unit 15, a third processing unit 16, and a fourth processing unit 17, wherein: the acquisition unit 11 is used to acquire the image to be processed, the first threshold, the second threshold, and the third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; the first processing unit 12 is configured to determine the first number of first pixels in the area to be detected of the image to be processed, the first pixels being pixels whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and the detection unit 13 is configured to obtain, according to the first ratio of the first number to the number of pixels in the area to be detected and the first threshold, a skin occlusion detection result of the image to be processed.
- the area to be tested includes a human face area
- the skin occlusion detection result includes a human face occlusion detection result
- the image processing device further includes: a second processing unit 14, configured to: before the first number of first pixels in the area to be detected of the image to be processed is determined, perform face detection processing on the image to be processed to obtain a first face frame; and determine the face area from the image to be processed according to the first face frame.
- the face area includes a forehead area
- the face occlusion detection result includes a forehead occlusion detection result
- the first face frame includes an upper frame line and a lower frame line; both the upper frame line and the lower frame line are sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line;
- the second processing unit 14 is used to: perform face key point detection on the image to be processed to obtain at least one face key point, the at least one face key point including a left eyebrow key point and a right eyebrow key point; and, keeping the ordinate of the upper frame line unchanged, move the lower frame line along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed so that the straight line where the lower frame line lies coincides with the first straight line, to obtain the second face frame; the first straight line is a straight line passing through the left eyebrow key point and the right eyebrow key point.
- the second processing unit 14 is configured to: keeping the ordinate of the lower frame line of the second face frame unchanged, move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper frame line of the second face frame and the lower frame line of the second face frame is a preset distance, to obtain a third face frame; and obtain the forehead area according to the area included in the third face frame.
- the at least one human face key point also includes a left mouth corner key point and a right mouth corner key point
- the first human face frame further includes: a left frame line and a right frame line
- the left frame line and the right frame line are both sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line;
- the second processing unit 14 is used to: keeping the abscissa of the left frame line of the third face frame unchanged, move the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the right frame line of the third face frame and the left frame line of the third face frame is a reference distance, to obtain a fourth face frame;
- the reference distance is the distance between the two intersection points of the second straight line with the face contour included in the third face frame;
- the image processing device further includes: a determination unit 15, configured to determine the skin pixel area from the pixel area included in the first face frame before the first number of first pixels in the area to be detected of the image to be processed is determined; the acquisition unit 11 is also used to obtain the color value of the second pixel in the skin pixel area; the first processing unit 12 is also used to take the difference between the color value of the second pixel and the first value as the second threshold, and the sum of the color value of the second pixel and the second value as the third threshold, wherein neither the first value nor the second value exceeds the maximum value among the color values of the image to be processed.
- the image processing device further includes: a third processing unit 16, configured to carry out mask-wearing detection processing on the image to be processed before the skin pixel area is determined from the pixel area included in the first face frame, to obtain a detection result; the determination unit 15 is used to: when it is detected that the face area in the image to be processed is not wearing a mask, take the pixel area of the face area outside the forehead area, the mouth area, the eyebrow area, and the eye area as the skin pixel area; when it is detected that the face area in the image to be processed is wearing a mask, take the pixel area between the first straight line and the fourth straight line as the skin pixel area.
- the fourth straight line is a straight line passing through the key points of the lower eyelid of the left eye and the key point of the lower eyelid of the right eye; both the key points of the lower eyelid of the left eye and the key points of the lower eyelid of the right eye belong to the at least one human face key point.
- the acquisition unit 11 is configured to perform the above averaging in a case where the at least one face key point includes at least one first key point belonging to the inner area of the left eyebrow and at least one second key point belonging to the inner area of the right eyebrow.
- the color value of the intersection point of the first row and the first column in the grayscale image of the rectangular area is used as the color value of the second pixel point;
- the first row is the row with the largest sum of grayscale values in the grayscale image
- the first column is the column with the largest sum of gray values in the gray scale image.
- the detection unit 13 is configured to: in a case where the first ratio does not exceed the first threshold, determine that the skin occlusion detection result is that the skin area corresponding to the area to be detected is in an occluded state; in a case where the first ratio exceeds the first threshold, determine that the skin occlusion detection result is that the skin area corresponding to the area to be detected is in an unoccluded state.
- the skin area belongs to the person to be detected, and the acquisition unit 11 is further configured to: acquire the temperature thermodynamic map of the image to be processed; the image processing device further includes: a fourth processing unit 17 , for reading the temperature of the skin area from the temperature thermodynamic map as the body temperature of the person to be detected when the skin occlusion detection result shows that the skin area is in an unoccluded state.
- the functions or modules included in the device provided by the embodiments of the present application can be used to execute the methods described in the above method embodiments; for specific implementation, reference may be made to the descriptions of the above method embodiments, which, for brevity, are not repeated here.
- FIG. 6 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
- the image processing device 2 includes a processor 21 , a memory 22 , an input device 23 and an output device 24 .
- the processor 21, the memory 22, the input device 23 and the output device 24 are coupled through a connector 25, and the connector 25 includes various interfaces, transmission lines or buses, etc., which are not limited in this embodiment of the present application.
- coupling refers to interconnection in a specific way, including direct connection or indirect connection through other devices, for example, connection through various interfaces, transmission lines, and buses.
- the processor 21 may be one or more graphics processing units (graphics processing unit, GPU), and in the case where the processor 21 is a GPU, the GPU may be a single-core GPU or a multi-core GPU.
- the processor 21 may be a processor group composed of multiple GPUs, and the multiple processors are coupled to each other through one or more buses.
- the processor may also be other types of processors, etc., which are not limited in this embodiment of the present application.
- the memory 22 can be used to store computer program instructions and various computer program codes including program codes for implementing the solutions of the present application.
- the memory includes but is not limited to random access memory (random access memory, RAM), read-only memory (read-only memory, ROM), erasable programmable read-only memory (erasable programmable read only memory, EPROM ), or portable read-only memory (compact disc read-only memory, CD-ROM), which is used for related instructions and data.
- the input device 23 is used for inputting data and/or signals and the output device 24 is used for outputting data and/or signals.
- the input device 23 and the output device 24 can be independent devices, or an integrated device.
- the memory 22 can be used not only to store relevant instructions but also to store data; for example, the memory 22 can be used to store data obtained through the input device 23, or to store data processed by the processor 21, etc. The embodiment of the present application does not limit the specific data stored in the memory.
- Fig. 6 only shows a simplified design of the image processing device.
- the image processing device can also include other necessary components, including but not limited to any number of input/output devices, processors, and memories, and all image processing devices that can implement the embodiments of the present application fall within the protection scope of this application.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the above units is only a logical function division; in actual implementation, there may be other division methods.
- for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- all or part of them may be implemented by software, hardware, firmware or any combination thereof.
- when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
- the computer program product described above comprises one or more computer instructions.
- when the above-mentioned computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present application are produced.
- the above-mentioned computers may be general-purpose computers, special-purpose computers, computer networks, or other programmable devices.
- the above computer instructions may be stored in a computer-readable storage medium, or transmitted through the above-mentioned computer-readable storage medium.
- the above computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
- the above-mentioned computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
- the above available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
- all or part of the processes may be completed by a computer program instructing related hardware.
- the programs can be stored in computer-readable storage media.
- when the programs are executed, the processes of the foregoing method embodiments may be performed.
- the aforementioned storage medium includes various media capable of storing program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Image Processing (AREA)
- Plural Heterocyclic Compounds (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The present application discloses an image processing method and apparatus, a processor, an electronic device, and a storage medium. The method comprises: obtaining an image to be processed, a first threshold, a second threshold, and a third threshold, the first threshold being different from the second threshold, the first threshold being different from the third threshold, and the second threshold being less than or equal to the third threshold; determining a first number of first pixel points in an area to be detected of said image; the first pixel points being pixel points whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and obtaining a skin occlusion detection result of said image according to a first ratio of the first number to the number of pixel points in a skin area and the first threshold.
Description
Cross References to Related Applications
This application claims priority to Chinese patent application No. 202110600103.1, filed on May 31, 2021, the entire disclosure of which is incorporated herein by reference.
The present application relates to the technical field of image processing, and in particular to an image processing method and apparatus, a processor, an electronic device, and a storage medium.
To improve detection safety, non-contact skin detection is applied in more and more scenarios. The accuracy of such non-contact detection is largely affected by the occlusion state of the skin. For example, if a large portion of a skin area is occluded, the accuracy of a non-contact detection result for that skin area may be low. Therefore, detecting the occlusion state of skin is of great significance.
Summary of the Invention
The present application provides an image processing method and apparatus, a processor, an electronic device, and a storage medium, to determine whether skin is in an occluded state.
The present application provides an image processing method, the method comprising: acquiring an image to be processed, a first threshold, a second threshold and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; determining a first number of first pixels in a region to be detected of the image to be processed, the first pixels being pixels whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and obtaining a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the region to be detected.
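The core decision of the method above can be sketched in a few lines: count the first pixels whose color value lies within [second threshold, third threshold], form the first ratio against the pixel count of the region to be detected, and compare it against the first threshold. A minimal illustration in Python with NumPy (function and variable names are illustrative, not from the application):

```python
import numpy as np

def skin_occlusion_detect(region, first_threshold, second_threshold, third_threshold):
    """region: 2-D array of per-pixel color values for the region to be detected.

    Counts the first pixels (color value in [second_threshold, third_threshold]),
    forms the first ratio against the total pixel count of the region, and
    reports "unoccluded" when the ratio exceeds the first threshold.
    """
    first_number = int(np.count_nonzero(
        (region >= second_threshold) & (region <= third_threshold)))
    first_ratio = first_number / region.size
    return "unoccluded" if first_ratio > first_threshold else "occluded"

# Toy example: 12 of the 16 pixels in a 4x4 crop are "skin-toned" (ratio 0.75).
crop = np.array([[128, 130, 129, 40],
                 [127, 131, 128, 35],
                 [129, 128, 130, 30],
                 [128, 129, 131, 25]])
print(skin_occlusion_detect(crop, 0.5, 120, 140))  # unoccluded (0.75 > 0.5)
print(skin_occlusion_detect(crop, 0.8, 120, 140))  # occluded (0.75 <= 0.8)
```

The same comparison drives both the method claim here and the detection-unit behavior described later: a ratio at or below the first threshold means the skin area is judged occluded.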
In combination with any embodiment of the present application, determining the first number of first pixels in the skin area of the image to be processed includes: performing face detection on the image to be processed to obtain a first face frame; determining the region to be detected from the image to be processed according to the first face frame; and determining the first number of first pixels in the region to be detected.
In combination with any embodiment of the present application, the first face frame includes an upper frame line and a lower frame line; both the upper frame line and the lower frame line are sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line. Determining the region to be detected from the image to be processed according to the first face frame includes: performing face keypoint detection on the image to be processed to obtain at least one face keypoint, the at least one face keypoint including a left eyebrow keypoint and a right eyebrow keypoint; while keeping the ordinate of the upper frame line unchanged, moving the lower frame line in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed so that the line on which the lower frame line lies coincides with a first straight line, to obtain a second face frame, the first straight line being a straight line passing through the left eyebrow keypoint and the right eyebrow keypoint; and obtaining the region to be detected according to the area contained in the second face frame.
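The frame adjustment described above can be illustrated with boxes expressed in the pixel coordinate system of Fig. 1 (origin at the top-left corner, y growing downward, so the upper frame line has the smaller ordinate). This sketch assumes the two eyebrow keypoints share approximately the same ordinate, so the first straight line is treated as horizontal; all names are illustrative:

```python
def second_face_frame(first_face_frame, left_brow, right_brow):
    """first_face_frame: (left, top, right, bottom) with top < bottom because
    the y-axis grows downward. left_brow / right_brow: (x, y) eyebrow keypoints.

    Moves the lower frame line in the negative y direction until it lies on
    the straight line through the two eyebrow keypoints, keeping the upper
    frame line fixed, which leaves the region above the eyebrows.
    """
    left, top, right, bottom = first_face_frame
    brow_y = (left_brow[1] + right_brow[1]) // 2  # ordinate of the first straight line
    assert top <= brow_y <= bottom, "eyebrow line should lie inside the face frame"
    return (left, top, right, brow_y)

# A face frame spanning rows 10..150 with eyebrows at row 60:
print(second_face_frame((20, 10, 120, 150), (50, 60), (90, 60)))  # (20, 10, 120, 60)
```

When the two keypoints have different ordinates, the first straight line is no longer horizontal and the coincidence condition would need the exact line equation; the averaging above is only a simplification.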
In combination with any embodiment of the present application, obtaining the region to be detected according to the area contained in the second face frame includes: while keeping the ordinate of the lower frame line of the second face frame unchanged, moving the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper frame line and the lower frame line of the second face frame is a preset distance, to obtain a third face frame; and obtaining the region to be detected according to the area contained in the third face frame.
In combination with any embodiment of the present application, the at least one face keypoint further includes a left mouth-corner keypoint and a right mouth-corner keypoint; the first face frame further includes a left frame line and a right frame line, both of which are sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line. Obtaining the region to be detected according to the area contained in the third face frame includes: while keeping the abscissa of the left frame line of the third face frame unchanged, moving the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the right frame line and the left frame line of the third face frame is a reference distance, to obtain a fourth face frame, where the reference distance is the distance between the two intersection points of a second straight line and the face contour contained in the third face frame, the second straight line is a straight line between the first straight line and a third straight line and parallel to the first straight line or the third straight line, and the third straight line is a straight line passing through the left mouth-corner keypoint and the right mouth-corner keypoint; and taking the area contained in the fourth face frame as the region to be detected.
In combination with any embodiment of the present application, acquiring the second threshold and the third threshold includes: determining a skin pixel area from the pixel area contained in the first face frame; acquiring a color value of a second pixel in the skin pixel area; and taking the difference between the color value of the second pixel and a first value as the second threshold, and the sum of the color value of the second pixel and a second value as the third threshold, where neither the first value nor the second value exceeds the maximum of the color values of the image to be processed.
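The derivation above amounts to offsetting the second pixel's color value downward by the first value and upward by the second value, producing an adaptive color band around an actual skin sample. A hedged sketch (the 8-bit maximum of 255 and the example offsets are assumptions for illustration):

```python
def derive_thresholds(second_pixel_color, first_value, second_value, max_color=255):
    """second threshold = color value of the second pixel minus the first value;
    third threshold  = color value of the second pixel plus the second value.
    Neither the first value nor the second value may exceed the maximum color
    value of the image (255 assumed here for an 8-bit image)."""
    assert 0 <= first_value <= max_color and 0 <= second_value <= max_color
    second_threshold = second_pixel_color - first_value
    third_threshold = second_pixel_color + second_value
    return second_threshold, third_threshold

# A sampled skin color of 128 with offsets 20 and 25 gives the band [108, 153].
print(derive_thresholds(128, 20, 25))  # (108, 153)
```

Because the band is centered on a color value measured from the face itself, it adapts to lighting and skin tone rather than using a fixed skin-color range.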
In combination with any embodiment of the present application, determining the skin pixel area from the pixel area contained in the first face frame includes: when it is detected that the face area in the image to be processed is not wearing a mask, taking the pixel area in the face area other than the forehead area, mouth area, eyebrow area and eye area as the skin pixel area; and when it is detected that the face area in the image to be processed is wearing a mask, taking the pixel area between the first straight line and a fourth straight line as the skin pixel area, where the fourth straight line is a straight line passing through a left-eye lower-eyelid keypoint and a right-eye lower-eyelid keypoint, both of which belong to the at least one face keypoint.
In combination with any embodiment of the present application, acquiring the color value of the second pixel in the skin pixel area includes: when the at least one face keypoint contains at least one first keypoint belonging to the inner area of the left eyebrow and at least one second keypoint belonging to the inner area of the right eyebrow, determining a rectangular area according to the at least one first keypoint and the at least one second keypoint; performing grayscale processing on the rectangular area to obtain a grayscale image of the rectangular area; and taking the color value at the intersection of a first row and a first column as the color value of the second pixel, where the first row is the row with the largest sum of gray values in the grayscale image, and the first column is the column with the largest sum of gray values in the grayscale image.
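The row/column selection above can be sketched directly with array reductions over the grayscale rectangle between the eyebrows; a minimal illustration (the gray values below are made up for the example):

```python
import numpy as np

def second_pixel_color(gray_rect):
    """gray_rect: grayscale image of the rectangular area determined by the
    inner-left-eyebrow and inner-right-eyebrow keypoints.

    The first row is the row whose gray values have the largest sum; the
    first column is the column whose gray values have the largest sum; the
    second pixel's color value is taken at their intersection.
    """
    first_row = int(np.argmax(gray_rect.sum(axis=1)))  # row with max gray sum
    first_col = int(np.argmax(gray_rect.sum(axis=0)))  # column with max gray sum
    return int(gray_rect[first_row, first_col])

gray = np.array([[10, 200, 10],
                 [20, 210, 30],
                 [ 5,  50,  5]])
# Row sums are (220, 260, 60) -> row 1; column sums are (35, 460, 45) -> column 1.
print(second_pixel_color(gray))  # 210
```

Picking the brightest row/column intersection favors a well-lit, unshadowed sample of bare skin between the eyebrows as the reference color.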
In combination with any embodiment of the present application, obtaining the skin occlusion detection result of the image to be processed according to the first threshold and the first ratio of the first number to the number of pixels in the region to be detected includes: when the first ratio does not exceed the first threshold, determining that the skin occlusion detection result is that the skin area corresponding to the region to be detected is in an occluded state; and when the first ratio exceeds the first threshold, determining that the skin occlusion detection result is that the skin area corresponding to the region to be detected is in an unoccluded state.
In combination with any embodiment of the present application, the skin area belongs to a person to be detected, and the method further includes: acquiring a temperature thermodynamic map of the image to be processed; and when the skin occlusion detection result is that the skin area is in an unoccluded state, reading the temperature of the skin area from the temperature thermodynamic map as the body temperature of the person to be detected.
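The temperature read-out described above can be sketched as a guard on the occlusion result. How a single temperature is aggregated from the skin area is not specified in this paragraph; taking the mean over the area is an assumption of this sketch, and all names are illustrative:

```python
import numpy as np

def read_body_temperature(temperature_map, skin_mask, occlusion_result):
    """temperature_map: per-pixel temperature map aligned with the image to be
    processed; skin_mask: boolean mask selecting the skin area of the person
    to be detected. The temperature is read only when the skin occlusion
    detection result is "unoccluded"; averaging over the masked pixels is an
    assumption of this sketch.
    """
    if occlusion_result != "unoccluded":
        return None  # occluded skin would yield an unreliable reading
    return float(temperature_map[skin_mask].mean())

temps = np.array([[36.5, 36.7],
                  [36.6, 25.0]])          # 25.0 is background, outside the mask
mask = np.array([[True, True],
                 [True, False]])
print(read_body_temperature(temps, mask, "unoccluded"))  # ~36.6
print(read_body_temperature(temps, mask, "occluded"))    # None
```

Gating on the occlusion result is the point of the whole pipeline: a hat or bangs over the forehead would otherwise silently corrupt the temperature reading.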
In some embodiments, the present application further provides an image processing apparatus, the apparatus including: an acquisition unit configured to acquire an image to be processed, a first threshold, a second threshold and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; a first processing unit configured to determine a first number of first pixels in a region to be detected of the image to be processed, the first pixels being pixels whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and a detection unit configured to obtain a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the region to be detected.
In combination with any embodiment of the present application, the region to be detected includes a face area, and the skin occlusion detection result includes a face occlusion detection result; the image processing apparatus further includes: a second processing unit configured to, before the first number of first pixels in the region to be detected of the image to be processed is determined, perform face detection on the image to be processed to obtain a first face frame, and determine the face area from the image to be processed according to the first face frame.
In combination with any embodiment of the present application, the face area includes a forehead area, the face occlusion detection result includes a forehead occlusion detection result, and the first face frame includes an upper frame line and a lower frame line; both the upper frame line and the lower frame line are sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line. The second processing unit is configured to: perform face keypoint detection on the image to be processed to obtain at least one face keypoint, the at least one face keypoint including a left eyebrow keypoint and a right eyebrow keypoint; while keeping the ordinate of the upper frame line unchanged, move the lower frame line in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed so that the line on which the lower frame line lies coincides with a first straight line, to obtain a second face frame, the first straight line being a straight line passing through the left eyebrow keypoint and the right eyebrow keypoint; and obtain the forehead area according to the area contained in the second face frame.
In combination with any embodiment of the present application, the second processing unit is configured to: while keeping the ordinate of the lower frame line of the second face frame unchanged, move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper frame line and the lower frame line of the second face frame is a preset distance, to obtain a third face frame; and obtain the forehead area according to the area contained in the third face frame.
In combination with any embodiment of the present application, the at least one face keypoint further includes a left mouth-corner keypoint and a right mouth-corner keypoint; the first face frame further includes a left frame line and a right frame line, both of which are sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line. The second processing unit is configured to: while keeping the abscissa of the left frame line of the third face frame unchanged, move the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the right frame line and the left frame line of the third face frame is a reference distance, to obtain a fourth face frame, where the reference distance is the distance between the two intersection points of a second straight line and the face contour contained in the third face frame, the second straight line is a straight line between the first straight line and a third straight line and parallel to the first straight line or the third straight line, and the third straight line is a straight line passing through the left mouth-corner keypoint and the right mouth-corner keypoint; and take the area contained in the fourth face frame as the forehead area.
In combination with any embodiment of the present application, the apparatus further includes: a determining unit configured to, before the first number of first pixels in the region to be detected of the image to be processed is determined, determine a skin pixel area from the pixel area contained in the first face frame. The acquisition unit is further configured to acquire a color value of a second pixel in the skin pixel area. The first processing unit is further configured to take the difference between the color value of the second pixel and a first value as the second threshold, and the sum of the color value of the second pixel and a second value as the third threshold, where neither the first value nor the second value exceeds the maximum of the color values of the image to be processed.
In combination with any embodiment of the present application, the image processing apparatus further includes: a third processing unit configured to, before the skin pixel area is determined from the pixel area contained in the first face frame, perform mask-wearing detection on the image to be processed to obtain a detection result. The determining unit is configured to: when it is detected that the face area in the image to be processed is not wearing a mask, take the pixel area in the face area other than the forehead area, mouth area, eyebrow area and eye area as the skin pixel area; and when it is detected that the face area in the image to be processed is wearing a mask, take the pixel area between the first straight line and a fourth straight line as the skin pixel area, where the fourth straight line is a straight line passing through a left-eye lower-eyelid keypoint and a right-eye lower-eyelid keypoint, both of which belong to the at least one face keypoint.
In combination with any embodiment of the present application, the acquisition unit is configured to: when the at least one face keypoint contains at least one first keypoint belonging to the inner area of the left eyebrow and at least one second keypoint belonging to the inner area of the right eyebrow, determine a rectangular area according to the at least one first keypoint and the at least one second keypoint; perform grayscale processing on the rectangular area to obtain a grayscale image of the rectangular area; and take the color value at the intersection of a first row and a first column of the grayscale image as the color value of the second pixel, where the first row is the row with the largest sum of gray values in the grayscale image, and the first column is the column with the largest sum of gray values in the grayscale image.
In combination with any embodiment of the present application, the detection unit is configured to: when the first ratio does not exceed the first threshold, determine that the skin occlusion detection result is that the skin area corresponding to the region to be detected is in an occluded state; and when the first ratio exceeds the first threshold, determine that the skin occlusion detection result is that the skin area corresponding to the region to be detected is in an unoccluded state.
In combination with any embodiment of the present application, the skin area belongs to a person to be detected, and the acquisition unit is further configured to acquire a temperature thermodynamic map of the image to be processed; the image processing apparatus further includes: a fourth processing unit configured to, when the skin occlusion detection result is that the skin area is in an unoccluded state, read the temperature of the skin area from the temperature thermodynamic map as the body temperature of the person to be detected.
The present application further provides a processor configured to execute the method of the above first aspect and any possible implementation thereof.
The present application further provides an electronic device, including a processor, a sending apparatus, an input apparatus, an output apparatus and a memory, where the memory is configured to store computer program code including computer instructions, and when the processor executes the computer instructions, the electronic device performs the method of the above first aspect and any possible implementation thereof.
The present application further provides a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method of the above first aspect and any possible implementation thereof.
The present application further provides a computer program product including a computer program or instructions that, when run on a computer, cause the computer to perform the method of the above first aspect and any possible implementation thereof.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and do not limit the present application.
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the summary of the invention, the drawings required in the embodiments or the summary are described below.
The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present application and are used to explain its technical solutions.
Fig. 1 is a schematic diagram of a pixel coordinate system provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of another image processing method provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of face keypoints provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of the present application.
In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings. The described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
The terms "first", "second", etc. in the specification, claims and drawings of the present application are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units not listed, or optionally also includes other steps or units inherent to such processes, methods, products or devices.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of this term in various places in the specification do not necessarily all refer to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will understand that any embodiment described herein may be combined with other embodiments.
Before proceeding, the pixel coordinate system used in the embodiments of the present application is defined. As shown in Fig. 1, the pixel coordinate system xoy is constructed with the upper-left corner of the image as the origin o, the direction parallel to the rows of the image as the direction of the x-axis, and the direction parallel to the columns of the image as the direction of the y-axis. In the pixel coordinate system, the abscissa indicates the column of a pixel in the image and the ordinate indicates its row; the unit of both is the pixel. For example, if the coordinates of pixel a in Fig. 1 are (30, 25), the abscissa of pixel a is 30 pixels and its ordinate is 25 pixels, i.e., pixel a is the pixel in the 30th column and 25th row of the image.
To improve detection safety, non-contact skin detection is being applied in more and more scenarios. The accuracy of this type of non-contact detection is largely affected by the occlusion state of the skin. For example, if a large area of a skin region is covered, the accuracy of a non-contact detection result for that skin region may be low. Therefore, detecting the occlusion state of skin is of great significance. For example, non-contact temperature measurement is currently widely used in the field of body temperature detection. Non-contact temperature measurement tools have the advantages of fast measurement speed and over-temperature voice alarms, and are suitable for rapid screening of human body temperature in public places with particularly heavy foot traffic.
Thermal imaging equipment mainly detects the thermal radiation emitted by objects by collecting light in the thermal infrared band, and finally establishes an accurate correspondence between thermal radiation and temperature to realize the temperature measurement function. As a non-contact temperature measurement tool, thermal imaging equipment can cover a large area; in detection scenarios with heavy foot traffic, it can increase throughput and reduce the time people spend gathered in groups.
A thermal imaging device mainly recognizes the position of a pedestrian's forehead and then measures body temperature according to the forehead area. However, when a pedestrian wears a hat or has bangs, it cannot be determined whether the forehead area is occluded. In this case, whether the occlusion state of the forehead can be determined has a great influence on the accuracy of body temperature detection.
Based on this, an embodiment of the present application provides an image processing method to realize skin occlusion detection of, for example, an object whose temperature is to be measured. For example, in a body temperature detection embodiment, the object to be measured may be a human face, or specifically the forehead area of the face, or more specifically a specific position in the forehead area. To simplify the description, in the following, the area in the image to be processed that corresponds to the object to be measured is referred to as the area to be detected. In other words, the object to be measured is usually the skin region corresponding to the area to be detected in the image to be processed, and the skin occlusion detection result for the object to be measured includes whether the corresponding skin region is occluded.
The execution subject of the embodiments of the present application is an image processing apparatus, which may be one of the following: a mobile phone, a computer, a server, or a tablet computer.
The embodiments of the present application are described below with reference to the drawings.
Please refer to FIG. 2, which is a schematic flowchart of an image processing method provided in an embodiment of the present application.
201. Acquire an image to be processed, a first threshold, a second threshold, and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold.
In the embodiments of the present application, the image to be processed includes image blocks containing a human face and image blocks not containing a human face. The first threshold is a standard ratio, preset according to the specific implementation, of the number of skin pixels in the forehead area to the total number of pixels in the forehead area; it is the criterion for evaluating whether the forehead area is occluded.
The first threshold in the embodiments of the present application is related to the required accuracy of temperature detection or of other embodiments. For example, when measuring temperature at a pedestrian's forehead, the more skin exposed in the forehead area, the more accurate the temperature measurement result. If the exposed skin accounts for more than 60% of the forehead area, the temperature measurement result is considered accurate. If this accuracy is required in a body temperature detection scenario, the first threshold can be set to 60%. If higher accuracy is required, the first threshold can be set above 60%. If setting the first threshold to 60% is considered too demanding and a highly accurate result is not actually needed, the first threshold can be set below 60%; in this case, the accuracy of the corresponding temperature measurement results will be reduced. Therefore, the setting of the first threshold needs to be determined in the specific implementation, and is not limited in the embodiments of the present application.
In one implementation of acquiring the image to be processed, the image processing apparatus receives the image to be processed input by a user through an input component. The input component includes a keyboard, a mouse, a touch screen, a touch panel, an audio/video input device, and the like.
In another implementation of acquiring the image to be processed, the image processing apparatus receives the image to be processed sent by a data terminal. The data terminal may be any of the following: a mobile phone, a computer, a tablet computer, a server, etc.
In yet another implementation of acquiring the image to be processed, the image processing apparatus receives the image to be processed sent by a surveillance camera. Optionally, the surveillance camera may be deployed on non-contact temperature measurement products such as artificial intelligence (AI) infrared imagers and security gates (such products are mainly placed in scenes with dense foot traffic, such as stations, airports, subways, shops, supermarkets, schools, company halls, and residential community gates).
In yet another implementation of acquiring the image to be processed, the image processing apparatus receives a video stream sent by a surveillance camera, decodes the video stream, and uses the obtained images as images to be processed. Optionally, the surveillance camera may be deployed on non-contact temperature measurement products such as AI infrared imagers and security gates (such products are mainly placed in scenes with dense foot traffic, such as stations, airports, subways, shops, supermarkets, schools, company halls, and residential community gates).
In yet another implementation of acquiring the image to be processed, the image processing apparatus is connected to cameras, and can obtain data frames collected in real time from each camera; the data frames may take the form of images and/or videos.
It should be understood that the number of cameras connected to the image processing apparatus is not fixed; by inputting a camera's network address into the image processing apparatus, collected data frames can be obtained from that camera through the apparatus.
For example, if personnel at location A want to use the technical solution provided by this application, they only need to input the network address of the camera at location A into the image processing apparatus; the apparatus can then obtain the data frames collected by that camera, perform subsequent processing on them, and output the detection result of whether the forehead is occluded.
202. Determine a first number of first pixels in the area to be detected of the image to be processed, where a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold.
In the embodiments of the present application, the color value is a parameter of the hexcone (hue, saturation, value, HSV) model. The three parameters of a color value in this model are hue (H), saturation (S), and value (V); that is to say, a color value carries hue, saturation, and brightness information. Because this application involves skin detection, the number of skin pixels in the area to be detected, that is, the first number of first pixels, needs to be determined.
Specifically, the image processing apparatus regards pixels whose color values are greater than or equal to the second threshold and less than or equal to the third threshold as skin pixels. That is, in the embodiments of the present application, the second threshold and the third threshold are used to determine whether a pixel is a skin pixel.
In an implementation of determining that a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold, a pixel in the area to be detected is considered a skin pixel corresponding to an unoccluded skin region only when every parameter of its color value is greater than or equal to the corresponding parameter of the second threshold and less than or equal to the corresponding parameter of the third threshold. For example, suppose the second threshold has H = 26, S = 43, and V = 46, and the third threshold has H = 34, S = 255, and V = 255. Then the color value range of a skin pixel is H from 26 to 34, S from 43 to 255, and V from 46 to 255. If a pixel in the area to be detected has H = 25, S = 45, and V = 200, it is not considered a skin pixel because its H value is outside the set H range of 26 to 34. If a pixel has H = 28, S = 45, and V = 200, it is considered a skin pixel because its H, S, and V values are all within the set ranges. That is to say, the area to be detected is converted from the RGB channels to the HSV channels, and only when all color-value components of a pixel in the area to be detected are within the ranges given by the second threshold and the third threshold does the pixel correspond to an unoccluded skin region, that is, the pixel is a first pixel.
After determining the first pixels in the area to be detected, the image processing apparatus further counts the first pixels to obtain the first number.
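The component-wise check of step 202 can be sketched with numpy alone. This is a sketch under stated assumptions: the array is assumed to already hold H, S, V values (the RGB-to-HSV conversion itself is omitted), and the threshold values are the example ones from the text above:

```python
import numpy as np

def count_skin_pixels(hsv_region, lower, upper):
    """Count pixels whose (H, S, V) components all lie within
    [lower, upper] component-wise -- the "first pixels" of step 202."""
    mask = np.all((hsv_region >= lower) & (hsv_region <= upper), axis=-1)
    return int(mask.sum())

# Second and third thresholds from the example, as (H, S, V).
lower = np.array([26, 43, 46])
upper = np.array([34, 255, 255])

# A toy 1x3 "area to be detected": one skin pixel (H=28, S=45, V=200),
# one pixel with H out of range (H=25), and one too-dark pixel (V=10).
region = np.array([[[28, 45, 200], [25, 45, 200], [30, 50, 10]]])
print(count_skin_pixels(region, lower, upper))  # -> 1
```

Only the first pixel passes all three range checks, matching the worked example in the text.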
203. Obtain a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the area to be detected.
In the embodiments of the present application, the skin occlusion detection result includes either that the skin region is occluded or that the skin region is unoccluded.
In the embodiments of the present application, the first ratio of the first number to the number of pixels in the area to be detected represents the proportion of unoccluded skin pixels in the area to be detected (hereinafter referred to as the proportion). If the first ratio indicates a small proportion, the skin region corresponding to the area to be detected is occluded; conversely, if the first ratio indicates a large proportion, the skin region corresponding to the area to be detected is unoccluded.
In the embodiments of the present application, the image processing apparatus uses the first threshold as the basis for judging the size of the proportion, and can then determine, according to the proportion, whether the skin region is occluded, thereby obtaining the skin occlusion detection result.
In one possible implementation, if the proportion does not exceed the first threshold, the proportion is small, and it is determined that the skin region is occluded. If the proportion exceeds the first threshold, the proportion is large, and it is determined that the skin region is unoccluded.
In the implementation of the present application, the image processing apparatus determines, according to the second threshold and the third threshold, the number of skin pixels in the area to be detected of the image to be processed, that is, the first number. By determining the first ratio of the first number to the number of pixels in the area to be detected, the proportion of skin pixels in the area to be detected is obtained, and the occlusion state of the skin region can then be determined according to the relationship between this proportion and the first threshold, thereby obtaining the skin occlusion detection result of the image to be processed.
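Putting steps 202 and 203 together, the final decision reduces to comparing the first ratio with the first threshold. A minimal sketch, in which the 60% default is the example threshold value from the text rather than a mandated constant:

```python
def skin_occlusion_result(first_number, total_pixels, first_threshold=0.6):
    """Step 203: return True if the skin region is judged occluded,
    i.e. the proportion of skin pixels does not exceed the first threshold."""
    first_ratio = first_number / total_pixels
    return first_ratio <= first_threshold

print(skin_occlusion_result(30, 100))  # 30% skin visible -> occluded (True)
print(skin_occlusion_result(80, 100))  # 80% skin visible -> unoccluded (False)
```

Whether "equal to the threshold" counts as occluded or unoccluded is a boundary choice not fixed by the text; the sketch treats it as occluded.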
As an optional implementation, the skin region includes a face region, and the skin occlusion detection result includes a face occlusion detection result. In this implementation, after determining the number of skin pixels in the face region of the image to be processed, the image processing apparatus further determines the proportion of skin pixels in the face region, and can then determine, according to that proportion, whether the face region is occluded, obtaining the face occlusion detection result. Specifically, if it is determined that the face region is occluded, the face occlusion detection result is that the face region is in an occluded state; if it is determined that the face region is not occluded, the face occlusion detection result is that the face region is not in an occluded state.
In this implementation, before determining the first number of first pixels in the area to be detected of the image to be processed, the image processing apparatus further performs the following steps:
1. Perform face detection processing on the image to be processed to obtain a first face frame.
In the embodiments of the present application, the face detection processing is used to identify whether the image to be processed contains a human object.
Face detection processing is performed on the image to be processed to obtain the coordinates of the first face frame (as shown by D in FIG. 1). The coordinates of the first face frame may be the upper-left corner coordinates, the lower-left corner coordinates, the lower-right corner coordinates, or the upper-right corner coordinates. The coordinates of the first face frame may also be a pair of diagonal coordinates, that is, the upper-left and lower-right corner coordinates, or the lower-left and upper-right corner coordinates. The area contained in the first face frame is the area of the face from the forehead to the chin.
In a possible implementation, feature extraction processing is performed on the image to be processed through a pre-trained neural network to obtain feature data, and the pre-trained neural network identifies, according to the features in the feature data, whether the image to be processed contains a human face. By performing feature extraction processing on the image to be processed and determining from the extracted data that the image contains a face, the position of the first face frame in the image to be processed is determined, that is, face detection is realized. The face detection processing of the image to be processed can be implemented with a convolutional neural network.
The convolutional neural network is trained using multiple images with annotation information as training data, so that the trained convolutional neural network can complete the face detection processing of images. The annotation information of the images in the training data is the face and the position of the face. In the process of training the convolutional neural network with the training data, the convolutional neural network extracts feature data from an image and determines, according to the feature data, whether there is a face in the image; if there is, it obtains the position of the face from the feature data. The annotation information is used as supervision information to supervise the results obtained by the convolutional neural network during training, and the parameters of the convolutional neural network are updated to complete the training. In this way, the trained convolutional neural network can be used to process the image to be processed to obtain the position of the face in it.
In another possible implementation, the face detection processing can be implemented by a face detection algorithm, where the face detection algorithm may be at least one of the following: a face detection algorithm based on coarse histogram segmentation and singular value features, face detection based on the binary wavelet transform, a probabilistic decision-based neural network method (PDBNN), a hidden Markov model method, etc. This application does not specifically limit the face detection algorithm used to implement the face detection processing.
2. Determine the face area from the image to be processed according to the first face frame.
In a possible implementation, the image processing apparatus uses the area enclosed by the first face frame as the face area.
As an optional implementation, the first face frame includes an upper frame line and a lower frame line. Alternatively, the first face frame includes an upper frame line, a lower frame line, a left frame line, and a right frame line. The upper frame line and the lower frame line are the sides of the first face frame parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than the ordinate of the lower frame line; the left frame line and the right frame line are the sides of the first face frame parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than the abscissa of the right frame line.
In this implementation, the face area includes the forehead area; in this case, determining the face area from the image to be processed according to the first face frame means determining the forehead area from the image to be processed according to the first face frame.
In one implementation of determining the forehead area, the distance between the upper frame line and the lower frame line is the distance from the upper edge of the forehead to the lower edge of the chin of the face contained in the first face frame, and the distance between the left frame line and the right frame line is the distance between the inner side of the left ear and the inner side of the right ear. Generally speaking, the width of the forehead area of a face (that is, the distance between the upper and lower edges of the forehead area) accounts for about 1/3 of the length of the entire face (that is, the distance between the upper and lower edges of the entire face), although this ratio varies from person to person. However, for everyone, the width of the forehead area accounts for between 30% and 40% of the length of the entire face. Keeping the ordinate of the upper frame line unchanged, the lower frame line is moved along the negative direction of the vertical axis of the pixel coordinate system of the image to be processed, so that the distance between the moved upper frame line and lower frame line is 30% to 40% of their initial distance; the area contained in the moved first face frame is the forehead area. When the coordinates of the first face frame are a pair of diagonal coordinates, the coordinates of the upper-left corner or the upper-right corner of the first face frame determine the position of the forehead area. Therefore, by changing the size and position of the first face frame, the area within the first face frame can be made the forehead area of the face in the image to be processed.
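With the face frame given as a pair of diagonal coordinates, the shrink described above can be sketched as follows. The 35% factor is one hypothetical choice within the 30%-to-40% range stated in the text:

```python
def forehead_box(x1, y1, x2, y2, ratio=0.35):
    """Shrink a face box with top-left (x1, y1) and bottom-right (x2, y2)
    (pixel coordinates, y growing downward) to its top `ratio` fraction:
    the upper frame line is kept, the lower frame line moves up."""
    new_y2 = y1 + (y2 - y1) * ratio
    return x1, y1, x2, new_y2

# A face box 100 pixels tall: the forehead box keeps the top 35 pixels.
print(forehead_box(50, 20, 150, 120))  # -> (50, 20, 150, 55.0)
```

Because y grows downward in the pixel coordinate system, "moving the lower frame line along the negative y direction" corresponds to decreasing its ordinate, as the arithmetic above does.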
In another implementation of determining the forehead area, the image processing apparatus determines the forehead area by performing the following steps:
21. Perform face key point detection on the image to be processed to obtain at least one face key point, where the at least one face key point includes a left eyebrow key point and a right eyebrow key point.
In the embodiments of the present application, at least one face key point is obtained by performing face key point detection on the image to be processed, and the at least one key point includes a left eyebrow key point and a right eyebrow key point.
Face key point detection can be realized by performing feature extraction processing on the image to be processed to obtain feature data. The feature extraction processing can be implemented by a pre-trained neural network or by a feature extraction model, which is not limited in this application. The feature data is used to extract the key point information of the face in the image to be processed. The image to be processed is a digital image, and the feature data obtained by feature extraction can be understood as deeper semantic information of the image to be processed.
In one possible implementation of face key point detection, a training face image set is established and the positions of the key points to be detected are annotated. A first-layer deep neural network is constructed and a face region estimation model is trained; a second-layer deep neural network is constructed for preliminary detection of face key points; the inner face region is further divided into local regions, and a third-layer deep neural network is constructed for each local region; the rotation angle of each local region is estimated, the region is rectified according to the estimated angle, and a fourth-layer deep neural network is constructed for the rectified data set of each local region. Given any new face image, key point detection is performed with the above four-layer deep neural network model to obtain the final face key point detection result.
In another possible implementation of face key point detection, the convolutional neural network is trained using multiple images with annotation information as training data, so that the trained convolutional neural network can complete face key point detection on images. The annotation information of the images in the training data is the positions of the key points of the face. In the process of training the convolutional neural network with the training data, the convolutional neural network extracts feature data from an image and determines the positions of the key points of the face in the image according to the feature data. The annotation information is used as supervision information to supervise the results obtained by the convolutional neural network during training, and the parameters of the convolutional neural network are updated to complete the training. In this way, the trained convolutional neural network can be used to process the image to be processed to obtain the positions of the key points of the face in it.
In yet another possible implementation, the image to be processed is convolved layer by layer through at least two convolutional layers to complete the feature extraction processing. The convolutional layers are connected in series, that is, the output of one convolutional layer is the input of the next, and the content and semantic information extracted by each layer are different. Concretely, the feature extraction processing abstracts the features of the face in the image to be processed step by step, while gradually discarding relatively minor feature data, where relatively minor feature data refers to feature data other than that of the detected face. Therefore, the feature data extracted by later layers is smaller in size, but its content and semantic information are more concentrated. Convolving the image to be processed stage by stage through multiple convolutional layers reduces the size of the image while still obtaining its content and semantic information, which reduces the data processing load of the image processing apparatus and improves its computing speed.
In yet another possible implementation of face key point detection, the convolution processing is implemented as follows: the convolution kernel is slid over the image to be processed, and the pixel of the image corresponding to the center element of the convolution kernel is called the target pixel. The pixel values on the image to be processed are multiplied by the corresponding values of the convolution kernel, and all the products are summed to obtain the convolved pixel value, which is taken as the pixel value of the target pixel. When the sliding over the image to be processed is finished, the pixel values of all pixels in the image have been updated, the convolution of the image to be processed is complete, and the feature data is obtained. In a possible implementation, the features in the feature data are recognized by the neural network that extracted them, and the key point information of the face in the image to be processed can thereby be obtained.
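The sliding multiply-and-sum described above can be sketched directly in numpy. The text does not specify edge handling, so zero padding to a same-size output is an assumption of this sketch:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image`; at each target pixel, multiply the
    overlapping values element-wise and sum them (zero padding at edges)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(9, dtype=float).reshape(3, 3)   # toy 3x3 "image": values 0..8
mean_kernel = np.ones((3, 3)) / 9.0             # averaging kernel
print(conv2d(img, mean_kernel)[1, 1])           # center pixel -> mean of all 9 values, 4.0
```

(Strictly, this is cross-correlation; deep-learning frameworks commonly use the same operation under the name convolution.)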
In another possible implementation, face key point detection is performed using a face key point detection algorithm, which may be at least one of OpenFace, multi-task cascaded convolutional networks (MTCNN), tweaked convolutional neural networks (TCNN), or tasks-constrained deep convolutional network (TCDCN); this application does not limit the face key point detection algorithm.
22. While keeping the ordinate of the upper frame line of the first face frame unchanged, move the lower frame line of the first face frame in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed, so that the line on which the lower frame line lies coincides with the first straight line, obtaining the second face frame. The first straight line is the straight line passing through the left eyebrow key point and the right eyebrow key point.
23. Obtain the forehead area according to the area contained in the second face frame.
In this embodiment of the present application, the distance between the upper frame line and the lower frame line is the distance from the upper edge of the forehead to the lower edge of the chin of the face contained in the first face frame, and the distance between the left frame line and the right frame line is the distance between the inside of the left ear and the inside of the right ear of that face. The first straight line passes through the left eyebrow key point and the right eyebrow key point. Because the forehead area lies above the first straight line within the first face frame, moving the lower frame line until it coincides with the first straight line makes the area contained in the moved frame the forehead area. That is, while keeping the ordinate of the upper frame line unchanged, the lower frame line is moved in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed until the line on which it lies coincides with the first straight line, yielding the second face frame; the area contained in the second face frame is the forehead area.
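Step 22 can be sketched as follows; the frame representation as corner coordinates, the downward-pointing y-axis (standard for image pixel coordinates, so moving the lower frame line in the negative y direction decreases its ordinate), and the example coordinates are assumptions for illustration:

```python
def to_second_face_frame(face_frame, left_brow, right_brow):
    """Step 22 sketch: raise the lower frame line of the first face frame
    to the straight line through the two eyebrow key points (the first
    straight line), keeping the upper frame line unchanged.

    face_frame: (x_left, y_top, x_right, y_bottom) in pixel coordinates.
    left_brow, right_brow: (x, y) eyebrow key points.
    """
    x_left, y_top, x_right, _ = face_frame
    # Ordinate of the first straight line; for a level face the two brow
    # ordinates coincide, otherwise their mean is used as an approximation.
    y_first_line = (left_brow[1] + right_brow[1]) / 2
    return (x_left, y_top, x_right, y_first_line)

frame2 = to_second_face_frame((100, 50, 300, 350), (160, 140), (240, 140))
print(frame2)  # (100, 50, 300, 140.0)
```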
As an optional implementation, the image processing apparatus performs the following steps in the course of step 23:
24. While keeping the ordinate of the lower frame line of the second face frame unchanged, move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed, so that the distance between the upper frame line and the lower frame line of the second face frame equals a preset distance, obtaining the third face frame.
25. Obtain the forehead area according to the area contained in the third face frame.
In this embodiment of the present application, the distance between the left frame line and the right frame line of the second face frame is the distance from the inside of the left ear to the inside of the right ear of the face it contains. The distance between the upper frame line and the lower frame line of the first face frame is the distance from the upper edge of the forehead to the lower edge of the chin of the face it contains. Generally, the width of the forehead area accounts for about 1/3 of the length of the whole face; this proportion differs from person to person, but for everyone it falls within the range of 30% to 40%. The preset distance is therefore set to 30% to 40% of the distance between the upper frame line and the lower frame line of the first face frame. To make the area inside the second face frame the forehead area, the distance between its upper and lower frame lines must be reduced to 30% to 40% of the distance between the upper and lower frame lines of the first face frame. While keeping the ordinate of the lower frame line of the second face frame unchanged, the upper frame line is moved along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper and lower frame lines equals the preset distance, yielding the third face frame. The area contained in the third face frame is then the forehead area.
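Step 24 can be sketched as follows; the frame representation, the choice of 35% within the stated 30%-40% range, and the example coordinates are assumptions for illustration:

```python
def to_third_face_frame(first_frame, second_frame, ratio=0.35):
    """Step 24 sketch: keep the lower frame line of the second face frame
    fixed and move its upper frame line so the frame height equals the
    preset distance: `ratio` (30%-40%) of the first face frame's height.
    Frames are (x_left, y_top, x_right, y_bottom); the y-axis points down."""
    first_height = first_frame[3] - first_frame[1]
    preset = ratio * first_height
    x_left, _, x_right, y_bottom = second_frame
    return (x_left, y_bottom - preset, x_right, y_bottom)

frame3 = to_third_face_frame((100, 50, 300, 350), (100, 50, 300, 140))
print(frame3)  # height is 35% of the first frame's 300-pixel height
```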
As an optional implementation, the image processing apparatus performs the following steps in the course of step 25:
26. While keeping the abscissa of the left frame line of the third face frame unchanged, move the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed, so that the distance between the right frame line and the left frame line of the third face frame equals a reference distance, obtaining the fourth face frame. The reference distance is the distance between the two intersection points of the second straight line and the face contour contained in the third face frame; the second straight line lies between the first straight line and the third straight line and is parallel to the first straight line or the third straight line, and the third straight line is the straight line passing through the left mouth corner key point and the right mouth corner key point.
27. Take the area contained in the fourth face frame as the forehead area.
In this embodiment of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point. The third straight line passes through these two key points. The second straight line lies between the first straight line and the third straight line and is parallel to one of them. The distance between the two intersection points of the second straight line and the face contour of the face image contained in the third face frame is taken as the reference distance. Since the second straight line lies between the first and third straight lines, it crosses the region midway between the eyebrows and the mouth; the face width there is close to the length of the forehead area, so using the width of this region to determine the length of the forehead area is comparatively accurate. The length of the forehead area is thus the width of the face contour, that is, the reference distance. While keeping the abscissa of the left frame line of the third face frame unchanged, the right frame line is moved along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the left and right frame lines equals the reference distance, yielding the fourth face frame. The area contained in the fourth face frame is then the forehead area.
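Step 26 can be sketched as follows; the frame representation, the precomputed reference distance, and the example coordinates are assumptions for illustration:

```python
def to_fourth_face_frame(third_frame, reference_distance):
    """Step 26 sketch: keep the left frame line fixed and move the right
    frame line so the frame width equals the reference distance (the
    face-contour width measured on a line between the eyebrow line and
    the mouth line). Frames are (x_left, y_top, x_right, y_bottom)."""
    x_left, y_top, _, y_bottom = third_frame
    return (x_left, y_top, x_left + reference_distance, y_bottom)

frame4 = to_fourth_face_frame((100, 35, 300, 140), 170)
print(frame4)  # (100, 35, 270, 140)
```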
In yet another possible implementation, while keeping the abscissa of the right frame line of the third face frame unchanged, the left frame line is moved along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the moved left frame line and the right frame line equals the reference distance; the area contained in the moved third face frame is the forehead area.
In yet another possible implementation, the right frame line of the third face frame is moved in the negative direction of the horizontal axis of the pixel coordinate system of the image to be processed by half of the difference between the current distance between the left and right frame lines and the reference distance, while the left frame line is simultaneously moved in the positive direction of the horizontal axis by the same amount, so that the distance between the moved left frame line and the moved right frame line equals the reference distance. The area contained in the moved third face frame is then the forehead area.
As an optional implementation, before determining the first number of first pixels in the area to be detected of the image to be processed, the image processing apparatus further performs the following steps:
3. Determine the skin pixel area from the pixel area contained in the first face frame.
In this embodiment of the present application, to establish a color reference for the exposed skin in the skin area, the color value of a pixel in the skin pixel area is taken as that reference; the skin pixel area therefore needs to be determined from the pixel area contained in the first face frame. For example, as shown in Figure 1, the skin pixel area may be the cheek area below the eyes contained in the first face frame, the intersection of the area below the nose and the area above the mouth contained in the first face frame, or the area below the mouth contained in the first face frame.
As an optional implementation, before determining the skin pixel area from the pixel area contained in the face frame, the image processing apparatus further performs the following steps:
31. Perform mask wearing detection processing on the image to be processed to obtain a detection result.
In this embodiment of the present application, mask wearing detection is performed on the image to be processed, and the detection result indicates either that the person in the image to be processed is wearing a mask or that the person is not wearing a mask.
In a possible implementation, the image processing apparatus performs first feature extraction processing on the image to be processed to obtain first feature data, where the first feature data carries information on whether the person to be detected is wearing a mask. The image processing apparatus obtains the detection result from the first feature data.
Optionally, the first feature extraction processing may be implemented through a mask detection network, obtained by training a deep convolutional neural network using at least one first training image with annotation information as training data, where the annotation information indicates whether the person in the first training image is wearing a mask.
32. If the detection result is that the face area is not wearing a mask, take the pixel area of the face area excluding the forehead area, mouth area, eyebrow area, and eye area as the skin pixel area. The at least one face key point further includes a left-eye lower eyelid key point and a right-eye lower eyelid key point.
If the detection result is that the face area is wearing a mask, take the pixel area between the first straight line and the fourth straight line in the face area as the skin pixel area. The fourth straight line is the straight line passing through the left-eye lower eyelid key point and the right-eye lower eyelid key point, both of which belong to the at least one face key point.
In this embodiment of the present application, when the detection result is that the face area is not wearing a mask, the skin pixel area of the face area is the area excluding the forehead area, mouth area, eyebrow area, and eye area. The eye area and eyebrow area contain pixels whose color values appear black, and the mouth area contains pixels whose color values appear red; the skin pixel area therefore excludes the eye area, mouth area, and eyebrow area. In addition, since it is uncertain whether the forehead area is occluded by a hat, bangs, or the like, the skin pixels corresponding to the forehead area cannot be determined. Therefore, when mask wearing detection on the image to be processed determines that the face area is not wearing a mask, the skin pixel area comprises the pixel area of the face area excluding the forehead area, mouth area, eyebrow area, and eye area.
When the detection result is that the face area is wearing a mask, most of the area below the nose is occluded, so the unoccluded skin may be the area between the eyebrows, the eyelid area, and the nasion area. Face key point detection yields the coordinates of the left-eye lower eyelid key point, the right-eye lower eyelid key point, the left eyebrow key point, and the right eyebrow key point. The fourth straight line passes through the two lower eyelid key points, and the first straight line passes through the two eyebrow key points. The area between the eyebrows, the eyelid area, and the nasion area all lie between the line determined by the left and right eyebrows and the line determined by the left-eye and right-eye lower eyelids within the face area. Therefore, when the detection result is that the face area is wearing a mask, the pixel area between the first straight line and the fourth straight line in the face area is taken as the skin pixel area.
4. Obtain the color value of the second pixel in the skin pixel area.
In this embodiment of the present application, the color value of the second pixel is obtained from the skin pixel area; it serves as the reference for measuring the color of the skin exposed in the skin area, so the second pixel may be any point in the skin pixel area.
The second pixel in the skin pixel area may be obtained, for example, by taking the pixel at the average coordinates of a skin pixel area; by taking the pixel at the intersection of straight lines determined by certain key points; or by converting part of the skin pixel area to grayscale and taking the pixel with the largest grayscale value as the second pixel. This embodiment of the present application does not limit the manner of obtaining the second pixel.
In a possible implementation, when there are two key points each in the inner right eyebrow area and the inner left eyebrow area, denote them as the upper inner right eyebrow point, the lower inner right eyebrow point, the upper inner left eyebrow point, and the lower inner left eyebrow point. Connecting the upper inner right eyebrow point with the lower inner left eyebrow point, and the upper inner left eyebrow point with the lower inner right eyebrow point, gives two intersecting straight lines, which determine a unique intersection point. As shown in the figure, suppose these four key points are numbered 37, 38, 67, and 68; connecting key points 37 and 68, and key points 38 and 67, determines the two straight lines and hence the intersection point. Based on the position of the face frame, the coordinates of key points 37, 38, 67, and 68 can be determined, and the coordinates of the intersection point can then be solved using OpenCV. The intersection coordinates identify the corresponding pixel; converting that pixel's RGB channels to HSV channels gives its color value. The color value of the pixel at the intersection coordinates is the color value of the second pixel.
In yet another possible implementation, when there are two key points each in the inner right eyebrow area and the inner left eyebrow area, denote them as the upper inner right eyebrow point, the lower inner right eyebrow point, the upper inner left eyebrow point, and the lower inner left eyebrow point, and determine a rectangular area between the eyebrows from these four key points. As shown in the figure, suppose the four key points are numbered 37, 38, 67, and 68, with coordinates (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4), respectively. Take the maximum of the Y coordinates of (X1, Y1) and (X2, Y2) as Y5, the minimum of the Y coordinates of (X3, Y3) and (X4, Y4) as Y6, the maximum of the X coordinates of (X1, Y1) and (X3, Y3) as X5, and the minimum of the X coordinates of (X2, Y2) and (X4, Y4) as X6, which gives a rectangular area; the four coordinates of the intercepted area between the eyebrows are (X6, Y6), (X5, Y5), (X5, Y6), and (X6, Y5). Based on the position of the face frame, the coordinates of key points 37, 38, 67, and 68 can be determined, and hence the positions of these four points. Connecting (X6, Y6) with (X5, Y5), and (X5, Y6) with (X6, Y5), gives two straight lines with a unique intersection point, whose coordinates can then be solved using OpenCV. The intersection coordinates identify the corresponding pixel; converting that pixel's RGB channels to HSV channels gives its color value. The color value of the pixel at the intersection coordinates is the color value of the second pixel.
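The diagonal-intersection scheme above can be sketched as follows; the key point coordinates and the sample RGB value are hypothetical, and the RGB-to-HSV conversion here uses the standard library's `colorsys` for illustration (the text performs the intersection and conversion with OpenCV):

```python
import colorsys

def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1-p2 with the line through p3-p4,
    each point given as (x, y), via the standard determinant formula."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    px = ((x1 * y2 - y1 * x2) * (x3 - x4)
          - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4)
          - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return px, py

# Hypothetical coordinates for key points 37 (upper inner right brow),
# 38 (lower inner right brow), 67 (upper inner left brow),
# 68 (lower inner left brow); 37 connects to 68 and 38 to 67.
kp37, kp38, kp67, kp68 = (120, 100), (122, 110), (180, 100), (178, 110)
ix, iy = line_intersection(kp37, kp68, kp38, kp67)

# Convert a hypothetical RGB value read at the intersection pixel to HSV.
r, g, b = 200, 150, 120
h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
```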
As an optional implementation, the image processing apparatus performs the following steps in the course of step 4:
41. When the at least one face key point includes at least one first key point belonging to the inner left eyebrow area and at least one second key point belonging to the inner right eyebrow area, determine a rectangular area according to the at least one first key point and the at least one second key point.
42. Perform grayscale processing on the rectangular area to obtain a grayscale image of the rectangular area.
43. Take the color value at the intersection of the first row and the first column of the grayscale image of the rectangular area as the color value of the second pixel, where the first row is the row with the largest sum of grayscale values in the grayscale image, and the first column is the column with the largest sum of grayscale values.
In this embodiment of the present application, there are multiple schemes for obtaining a rectangular area from the at least one first key point and the at least one second key point. Grayscale processing is performed on this rectangular area to obtain its grayscale image. The sum of grayscale values is computed for each row, and the row with the largest sum is recorded as the first row; likewise, the sum is computed for each column, and the column with the largest sum is recorded as the first column. The intersection of the first row and the first column gives the intersection coordinates, which identify the corresponding pixel; converting that pixel's RGB channels to HSV channels gives its color value. The color value of the pixel at the intersection coordinates is the color value of the second pixel.
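The row/column selection of steps 42-43 can be sketched with NumPy as follows; the toy grayscale image is an assumption for the example:

```python
import numpy as np

def second_pixel_position(gray):
    """Steps 42-43 sketch: in the grayscale image of the rectangular area,
    find the row with the largest sum of gray values and the column with
    the largest sum; their intersection locates the second pixel."""
    first_row = int(np.argmax(gray.sum(axis=1)))  # brightest row
    first_col = int(np.argmax(gray.sum(axis=0)))  # brightest column
    return first_row, first_col

gray = np.array([[10, 20, 30],
                 [40, 90, 50],
                 [20, 30, 10]])
print(second_pixel_position(gray))  # (1, 1)
```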
In one possible way of obtaining the rectangular area, when there is exactly one inner left eyebrow key point and one inner right eyebrow key point and their ordinates differ, the difference of their ordinates is taken as the width of the rectangular area and the difference of their abscissas as its length, determining a rectangular area with these two key points as diagonal corners.
In another possible way of obtaining the rectangular area, when there are two inner left eyebrow key points and one inner right eyebrow key point, the line connecting the two inner left eyebrow key points is taken as the first side of the rectangular area; of the two inner left eyebrow key points, the one whose ordinate differs from that of the inner right eyebrow key point is selected, and the line connecting it to the inner right eyebrow key point is taken as the second side. Drawing parallels to the determined first and second sides gives the remaining two sides, thereby determining the rectangular area.
In yet another possible way of obtaining the rectangular area, when there are more than two key points each in the inner left eyebrow area and the inner right eyebrow area, four of them may be selected to form a quadrilateral area, and the rectangular area is then obtained from the coordinates of these four key points.
In yet another possible way of obtaining the rectangular area, the at least one first key point includes a third key point and a fourth key point, and the at least one second key point includes a fifth key point and a sixth key point; the ordinate of the third key point is smaller than that of the fourth, and the ordinate of the fifth key point is smaller than that of the sixth. A first abscissa and a first ordinate determine a first coordinate; a second abscissa and the first ordinate determine a second coordinate; the first abscissa and a second ordinate determine a third coordinate; and the second abscissa and the second ordinate determine a fourth coordinate. The first ordinate is the maximum of the ordinates of the third and fifth key points; the second ordinate is the minimum of the ordinates of the fourth and sixth key points; the first abscissa is the maximum of the abscissas of the third and fourth key points; and the second abscissa is the minimum of the abscissas of the fifth and sixth key points. The area enclosed by the first, second, third, and fourth coordinates is taken as the rectangular area. For example, when there are two key points each in the inner left eyebrow area and the inner right eyebrow area, denote them as the third key point (X1, Y1), the fifth key point (X2, Y2), the fourth key point (X3, Y3), and the sixth key point (X4, Y4). Take the maximum of the Y coordinates of (X1, Y1) and (X2, Y2) as Y5, the first ordinate; the minimum of the Y coordinates of (X3, Y3) and (X4, Y4) as Y6, the second ordinate; the maximum of the X coordinates of (X1, Y1) and (X3, Y3) as X5, the first abscissa; and the minimum of the X coordinates of (X2, Y2) and (X4, Y4) as X6, the second abscissa. The four coordinates of the rectangular area are then the first coordinate (X5, Y5), the second coordinate (X6, Y5), the third coordinate (X5, Y6), and the fourth coordinate (X6, Y6).
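The min/max corner construction above can be sketched as follows; the example key point coordinates are hypothetical:

```python
def brow_rectangle(kp3, kp4, kp5, kp6):
    """Rectangle between the eyebrows from four inner-brow key points:
    kp3/kp4 are the inner-left (first) key points and kp5/kp6 the
    inner-right (second) key points, with kp3 and kp5 the upper ones
    (smaller ordinate). Returns the four corner coordinates."""
    y5 = max(kp3[1], kp5[1])   # first ordinate
    y6 = min(kp4[1], kp6[1])   # second ordinate
    x5 = max(kp3[0], kp4[0])   # first abscissa
    x6 = min(kp5[0], kp6[0])   # second abscissa
    # Corners: (X5, Y5), (X6, Y5), (X5, Y6), (X6, Y6)
    return (x5, y5), (x6, y5), (x5, y6), (x6, y6)

corners = brow_rectangle((120, 100), (122, 110), (180, 101), (178, 112))
print(corners)  # ((122, 101), (178, 101), (122, 110), (178, 110))
```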
As another optional implementation, the image processing apparatus performs the following steps in the course of step 4:
44. When the at least one face key point includes at least one first key point belonging to the inner area of the left eyebrow and at least one second key point belonging to the inner area of the right eyebrow, determine the average coordinates of the at least one first key point and the at least one second key point.
45. Take the color value of the pixel determined by the average coordinates as the color value of the second pixel in the skin pixel area.
In the embodiment of the present application, when the at least one face key point includes at least one second key point belonging to the inner area of the right eyebrow and at least one first key point belonging to the inner area of the left eyebrow, the coordinates of the at least one first key point and the at least one second key point are averaged. For example, when the inner area of the right eyebrow and the inner area of the left eyebrow each have two key points, let these key points be the upper and lower points on the inner side of the right eyebrow and the upper and lower points on the inner side of the left eyebrow. As shown in FIG. 4, assume the numbers corresponding to these four points are 37, 38, 67, and 68, with coordinates (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4) respectively. Average the abscissas and the ordinates of these four coordinates separately to obtain the average coordinates (X0, Y0). By converting the pixel's RGB channels to HSV channels, the color value of the pixel at the average coordinates (X0, Y0) can be obtained. The color value of the pixel corresponding to the average coordinates is the color value of the second pixel.
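A minimal sketch of the averaging step, together with an OpenCV-style RGB-to-HSV conversion (hue in 0-180, saturation and brightness in 0-255, matching the ranges given later in the text). The function names are illustrative:

```python
import colorsys

def average_point(points):
    """Average the coordinates of the inner-eyebrow key points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def rgb_to_opencv_hsv(r, g, b):
    """Convert an RGB color (0-255 per channel) to OpenCV-style HSV,
    where hue ranges over 0-180 and saturation/brightness over 0-255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)), int(round(s * 255)), int(round(v * 255))
```

In practice the same conversion is done in one call with OpenCV's `cvtColor`, which the text mentions; the pure-Python version above only makes the channel ranges explicit.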
As yet another optional implementation, the image processing apparatus performs the following steps in the course of step 4:
46. Determine a fifth straight line from the coordinates of the inner-right-eyebrow key point and the key point on the left side of the nasion; determine a sixth straight line from the coordinates of the inner-left-eyebrow key point and the key point on the right side of the nasion.
47. Take the color value of the pixel determined by the coordinates of the intersection of the fifth and sixth straight lines as the color value of the second pixel in the skin pixel area.
In the embodiment of the present application, the at least one face key point further includes an inner-right-eyebrow key point, a key point on the left side of the nasion, a key point on the right side of the nasion, and an inner-left-eyebrow key point. Connecting the inner-right-eyebrow key point with the key point on the left side of the nasion, and the inner-left-eyebrow key point with the key point on the right side of the nasion, yields two intersecting straight lines, the fifth and the sixth. The present application does not limit the choice of the inner-eyebrow key points: the inner-right-eyebrow key point is any key point taken in the inner area of the right eyebrow, and the inner-left-eyebrow key point is any key point taken in the inner area of the left eyebrow. As shown in FIG. 4, assuming these four key points are numbered 67, 68, 78, and 79, key points 78 and 68 are connected and key points 79 and 67 are connected; once the two lines are determined, a single intersection point is obtained. Based on the position of the face frame, the coordinates of key points 67, 68, 79, and 78 can be determined, and the coordinates of the intersection can then be solved with OpenCV. Determining the coordinates of the intersection gives the pixel at the intersection. Converting that pixel's RGB channels to HSV channels yields the color value of the pixel at the intersection coordinates, which is the color value of the second pixel.
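The intersection of the two lines can also be computed directly with the standard 2x2 determinant; the text solves it with OpenCV, so this pure-Python sketch with illustrative names only shows the underlying calculation:

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 (fifth line) and the line
    through p3, p4 (sixth line); returns None if the lines are parallel."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if d == 0:
        return None  # parallel lines: no single intersection point
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / d
    return x1 + t * (x2 - x1), y1 + t * (y2 - y1)
```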
5. Take the difference between the color value of the second pixel and a first value as the second threshold, and the sum of the color value of the second pixel and a second value as the third threshold, where neither the first value nor the second value exceeds the maximum of the color values of the object to be processed.
In the embodiment of the present application, determining the color value of the second pixel determines the second and third thresholds. An OpenCV function can convert the image representation from an RGB channel map to an HSV channel map, from which the color value of the second pixel is obtained.
A color value comprises three parameter values: hue, brightness, and saturation. Hue ranges from 0 to 180, while brightness and saturation each range from 0 to 255; that is, the maximum hue of a color value is 180 and the maximum brightness and saturation are 255. Note that the first value and the second value likewise each comprise hue, brightness, and saturation components. Therefore the hue components of the first and second values do not exceed 180, and their brightness and saturation components do not exceed 255. In general, the first and second values share the same hue, brightness, and saturation components, which means the hue, brightness, and saturation components of the second pixel's color value are the midpoints of the corresponding components of the second and third thresholds.
In one implementation of obtaining the mapping between the color value of the second pixel and the second and third thresholds, a machine-learning binary classification algorithm, such as logistic regression or naive Bayes, judges from an input color value whether that color belongs to the color value of the second pixel. That is, a set of color values is input and classified as belonging or not belonging to the color value of the second pixel, determining which color values belong to it. The mapping between the color value of the second pixel and the second and third thresholds can thus be obtained.
Optionally, the hue, brightness, and saturation components of the first and second values are 30, 60, and 70 respectively. In other words, once the color value of the second pixel is obtained, the corresponding second threshold decreases its hue by 30, brightness by 60, and saturation by 70, while the corresponding third threshold increases its hue by 30, brightness by 60, and saturation by 70.
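With the offsets (30, 60, 70), deriving the two thresholds can be sketched as below. Clamping to the valid HSV ranges (hue at most 180, brightness/saturation at most 255, all at least 0) is an added safeguard consistent with, but not spelled out in, the text:

```python
def hsv_thresholds(hsv, delta=(30, 60, 70)):
    """Derive the second and third thresholds from the second pixel's HSV
    color value by subtracting and adding the per-channel offsets,
    clamped to the valid channel ranges."""
    maxima = (180, 255, 255)  # hue, then brightness/saturation upper bounds
    second = tuple(max(0, c - d) for c, d in zip(hsv, delta))
    third = tuple(min(m, c + d) for c, d, m in zip(hsv, delta, maxima))
    return second, third
```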
As an optional implementation, the image processing apparatus performs the following steps in the course of step 203:
6. When the first ratio of the first number to the number of pixels in the area to be detected does not exceed the first threshold, determine that the skin occlusion detection result is that the skin area is in an occluded state.
In the embodiment of the present application, the image processing apparatus judges whether the skin area is occluded according to whether the first ratio of the first number to the number of pixels in the area to be detected exceeds the first threshold. When the first ratio is smaller than the first threshold, the skin occlusion detection result is determined to be that the skin area is occluded. For example, suppose the first number is 50, the number of pixels in the area to be detected is 100, and the first threshold is 60%. Since the first ratio is 50/100 = 50%, less than 60%, the skin occlusion detection result is that the skin area is occluded.
When the skin occlusion detection result is that the skin area is occluded, the image processing apparatus outputs a prompt that the skin needs to be exposed. According to the prompt, the skin occlusion detection may be performed again after the skin is exposed, or other operations may be performed; this is not limited in the present application.
7. When the first ratio exceeds the first threshold, determine that the skin occlusion detection result is that the skin area is in an unoccluded state.
In the embodiment of the present application, the image processing apparatus determines that the skin occlusion detection result is that the skin area is unoccluded when the first ratio of the first number to the number of pixels in the area to be detected is equal to or greater than the first threshold. For example, suppose the first number is 60, the number of pixels in the area to be detected is 100, and the first threshold is 60%. Since the first ratio is 60/100 = 60%, equal to 60%, the skin occlusion detection result is that the skin area is unoccluded. Alternatively, if the first number is 70 and the number of pixels in the area to be detected is 100, the first ratio is 70/100 = 70%, greater than 60%, so the skin occlusion detection result is again that the skin area is unoccluded.
When the skin occlusion detection result is determined to be that the skin area is unoccluded, a temperature measurement or other operation may be performed. Measuring temperature when the skin area is unoccluded improves the accuracy of the detected temperature. The subsequent operations performed in this case are not limited in the present application.
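The decision in steps 6 and 7 reduces to a single ratio comparison. A sketch matching the worked examples (50/100 gives occluded, 60/100 and 70/100 give unoccluded); the function name is illustrative:

```python
def is_unoccluded(first_number, pixel_count, first_threshold=0.6):
    """Skin occlusion decision: unoccluded when the first ratio is equal
    to or greater than the first threshold, occluded otherwise."""
    return first_number / pixel_count >= first_threshold
```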
As an optional implementation, the image processing apparatus further performs the following steps:
8. Obtain the temperature heat map of the above image to be processed.
The image processing method in the embodiment of the present application can be used in the field of temperature measurement, where the above skin area belongs to a person to be detected. Each pixel in the temperature heat map carries the temperature information of the corresponding pixel. Optionally, the temperature heat map is collected by an infrared thermal imaging device on the image processing apparatus. By performing image matching between the temperature heat map and the image to be processed, the image processing apparatus determines, from the temperature heat map, the pixel area corresponding to the face area of the image to be processed.
9. When the skin occlusion detection result is that the skin area is unoccluded, read the temperature of the skin area from the temperature heat map as the body temperature of the person to be detected.
In the embodiment in which the present application determines the body temperature of the subject by detecting the temperature of the forehead area, when the skin occlusion detection result is that the skin area is unoccluded, the pixel area corresponding to the face area of the image to be processed is first located in the temperature heat map. Generally, the skin area occupies the upper 30% to 40% of the entire face area, so the temperature corresponding to the skin area can be obtained from the temperature heat map. Either the average temperature of the skin area or the highest temperature of the skin area may be taken as the body temperature of the person to be detected; this is not limited in the present application.
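Reading the body temperature from the aligned heat map can be sketched as follows, modeling the heat map as a list of rows of per-pixel temperatures and the skin band as the top 30-40% of the face region, per the text. The names and the 40% default are illustrative assumptions:

```python
def forehead_temperature(heat_map, face_top, face_bottom, x_left, x_right,
                         band_ratio=0.4, use_max=False):
    """Return the mean (or, optionally, the maximum) temperature over the
    skin band at the top of the face region in the temperature heat map."""
    band_height = int((face_bottom - face_top) * band_ratio)
    rows = heat_map[face_top:face_top + band_height]
    temps = [t for row in rows for t in row[x_left:x_right]]
    return max(temps) if use_max else sum(temps) / len(temps)
```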
Please refer to FIG. 3, which is a schematic flowchart of an application of the image processing method provided by an embodiment of the present application.
Based on the image processing method provided by the embodiments of the present application, the embodiments of the present application further provide a possible application scenario of the image processing method.
When thermal imaging equipment is used for non-contact temperature measurement of pedestrians, the temperature of the pedestrian's forehead area is generally measured. However, when a pedestrian's forehead is covered by bangs or a hat, the inability to determine whether the forehead area is occluded interferes with the temperature measurement to some extent, which poses a challenge to current temperature measurement work. Therefore, detecting the occlusion state of the pedestrian's forehead before measurement, and measuring the temperature of the forehead area only when it is unoccluded, can improve the accuracy of temperature measurement.
As shown in FIG. 3, the image processing apparatus acquires camera frame data, that is, an image to be processed, and performs face detection on it. If the result of face detection is that no face exists in the image to be processed, the image processing apparatus acquires another image to be processed. If a face exists, the image processing apparatus inputs the image to be processed into a trained neural network, which outputs the face frame of the image to be processed (as shown at D in FIG. 1), the face frame coordinates (as shown in FIG. 1), and the coordinates of 106 key points (as shown in FIG. 4). It should be understood that the face frame coordinates may be a pair of diagonal coordinates, either the upper-left and lower-right corners or the lower-left and upper-right corners; for ease of understanding, the embodiment of the present application gives the four corner coordinates of the face frame (as shown in FIG. 1). The neural network that outputs the face frame coordinates and the 106 key point coordinates of the image to be processed may be a single neural network, or a cascade of two neural networks that perform face detection and face key point detection respectively.
To detect the exposed skin of the forehead area, the color value of the brightest pixel in the area between the eyebrows is used as the reference color value for the exposed skin of the forehead area; the brightest pixel is the second pixel described above. The area between the eyebrows must therefore be obtained first. Through face key point detection, the key points of the inner area of the left eyebrow and the inner area of the right eyebrow are obtained. When the inner area of the right eyebrow and the inner area of the left eyebrow each have two key points, the key points are the upper and lower points on the inner side of the right eyebrow and the upper and lower points on the inner side of the left eyebrow, and a rectangular area determined from these four key points is taken as the area between the eyebrows. Taking the 106 key point coordinates as an example, the upper and lower points on the inner side of the right eyebrow and the upper and lower points on the inner side of the left eyebrow correspond to the four key points 37, 38, 67, and 68. It should be understood that the number and numbering of the key points here are not limiting; any choice of two key points each from the inner area of the right eyebrow and the inner area of the left eyebrow falls within the scope claimed by the present application.
Let the coordinates of key points 37, 38, 67, and 68 be (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4) respectively. Take the maximum of the Y coordinates of (X1, Y1) and (X2, Y2) as Y5; the minimum of the Y coordinates of (X3, Y3) and (X4, Y4) as Y6; the maximum of the X coordinates of (X1, Y1) and (X3, Y3) as X5; and the minimum of the X coordinates of (X2, Y2) and (X4, Y4) as X6. Combining X5 and X6 with Y5 and Y6 yields four coordinates that determine a rectangular area with vertices (X6, Y6), (X5, Y5), (X5, Y6), and (X6, Y5); this rectangular area is the area between the eyebrows to be cropped. Since the coordinates of key points 37, 38, 67, and 68 can be determined by face key point detection, the positions of (X6, Y6), (X5, Y5), (X5, Y6), and (X6, Y5) can be determined, and the rectangular area they define is cropped to obtain the area between the eyebrows.
After the area between the eyebrows is obtained, the brightest pixel in it must be found, so the area is converted to grayscale to obtain its grayscale image. In the example of the present application, grayscale conversion makes every pixel in the pixel matrix satisfy R = G = B; that is, the values of the red, green, and blue variables are made equal (here "=" means mathematical equality), and this common value is called the gray value. Two methods are commonly used for grayscale conversion:
Method 1: R after grayscale = G after grayscale = B after grayscale = (R before + G before + B before) / 3
For example, suppose pixel m of picture A has R = 100, G = 120, and B = 110 before grayscale conversion. After converting picture A to grayscale with this method, pixel m has R = G = B = (100 + 120 + 110) / 3 = 110.
Method 2: R after grayscale = G after grayscale = B after grayscale = R before × 0.3 + G before × 0.59 + B before × 0.11
With the same pixel m (R = 100, G = 120, B = 110 before conversion), after converting picture A to grayscale with this method, pixel m has R = G = B = 100 × 0.3 + 120 × 0.59 + 110 × 0.11 = 112.9.
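Both grayscale methods are one-line formulas; for clarity they can be sketched as:

```python
def gray_average(r, g, b):
    """Method 1: arithmetic mean of the three channels."""
    return (r + g + b) / 3

def gray_weighted(r, g, b):
    """Method 2: weighted sum with coefficients 0.3, 0.59, 0.11."""
    return r * 0.3 + g * 0.59 + b * 0.11
```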
An OpenCV function may also be used to convert the area between the eyebrows to grayscale; the grayscale method is not limited in the present application. To find the color value of the brightest pixel, that is, the pixel with the largest gray value after grayscale conversion, the gray values of each row of the grayscale image of the area between the eyebrows are summed and the coordinate of the row with the largest sum is recorded; likewise, the gray values of each column are summed and the coordinate of the column with the largest sum is recorded. The intersection of the row and column with the largest sums gives the coordinates of the brightest pixel in the area between the eyebrows. Using the conversion relationship between RGB and HSV, the RGB value of the brightest pixel can be converted to the corresponding HSV value by formula, or the RGB channels of the area between the eyebrows can be converted to HSV channels with OpenCV's cvtColor function to find the HSV value of the brightest pixel. Because the HSV value of the brightest pixel has a definite relationship with the second and third thresholds, that HSV value determines the corresponding second and third thresholds.
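The row-sum/column-sum search just described can be sketched as follows. Note that this is the heuristic the text describes, intersecting the brightest row with the brightest column, not a full per-pixel argmax:

```python
def brightest_pixel(gray):
    """Locate the brightest pixel of a grayscale image (a list of rows) as
    the intersection of the row with the largest gray-value sum and the
    column with the largest gray-value sum."""
    row = max(range(len(gray)), key=lambda i: sum(gray[i]))
    col = max(range(len(gray[0])), key=lambda j: sum(r[j] for r in gray))
    return row, col
```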
Obtaining the forehead area requires determining its size and position. The length of the forehead area is the width of the face. By computing the distance between key point 0 and key point 32, the face frame is shrunk so that the distance between its left and right frame lines equals the distance between key points 0 and 32; that is, the distance between key points 0 and 32 is taken as the length of the forehead area. The width of the forehead area is about one third of the entire face frame. Although the proportion of the forehead width to the face length differs from person to person, it almost always falls within 30% to 40% of the face length, so the distance between the upper and lower frame lines of the face frame is reduced to 30% to 40% of the original distance and taken as the width of the forehead area. The forehead area lies above the eyebrows, and the horizontal line determined by key points 35 and 40 marks the position of the eyebrows. The resized face frame is therefore moved so that its lower frame line lies on the horizontal line determined by key points 35 and 40, yielding a face frame with changed position and size; the rectangular area enclosed by this frame is the forehead area.
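The forehead-box construction can be sketched under these assumptions: the face box is given as (left, top, right, bottom), key points 0 and 32 span the face width, and brow_y is the horizontal line through key points 35 and 40. The names, the centering choice, and the 35% default height are illustrative:

```python
def forehead_box(face_box, kp0, kp32, brow_y, height_ratio=0.35):
    """Shrink the face frame to the face width (distance between key points
    0 and 32) and to 30-40% of its height, with its lower edge on the
    eyebrow line; the enclosed rectangle is the forehead area."""
    left, top, right, bottom = face_box
    width = kp32[0] - kp0[0]
    height = (bottom - top) * height_ratio
    cx = (kp0[0] + kp32[0]) / 2  # keep the box centered on the face
    return (cx - width / 2, brow_y - height, cx + width / 2, brow_y)
```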
The forehead area is cropped and then binarized according to the second and third thresholds to obtain a binarized image of the forehead area. Using a binarized image reduces the amount of data to process and speeds up the image processing apparatus's detection of the forehead area. The binarization criterion is: if the HSV value of a pixel in the forehead area is greater than or equal to the second threshold and less than or equal to the third threshold, its gray value is 255; if it is less than the second threshold or greater than the third threshold, its gray value is 0. First, the forehead area image is converted from an RGB channel map to an HSV channel map. Then the number of pixels in the forehead area with gray value 255, that is, the number of white pixels in the grayscale image, is counted. When the ratio of the number of white pixels to the number of pixels in the forehead area reaches the threshold, the forehead area is considered unoccluded, and the thermal imaging temperature measurement is performed. When the ratio does not reach the threshold, the forehead area is considered occluded; measuring temperature in this state would impair the accuracy of the measurement, so a prompt to expose the forehead is output, and the image processing apparatus must reacquire an image and repeat the forehead occlusion detection. For example, suppose the second threshold is (100, 50, 70), the third threshold is (120, 90, 100), pixel q of the forehead area has color value (110, 60, 70), and pixel p has color value (130, 90, 20). Then q lies within the range of the second and third thresholds and p does not, so during binarization of the forehead area the gray value of q is 255 and the gray value of p is 0. Suppose the threshold is 60%, the number of pixels in the forehead area is 100, and the number of white pixels is 50; the ratio of white pixels to pixels in the forehead area is 50%, which does not reach the threshold, so the forehead area is occluded and a prompt to expose the forehead is output.
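The binarization and counting step can be sketched as below, using the worked example's thresholds. Pixels are HSV triples compared per channel (the semantics of OpenCV's `inRange`), and the region is flattened to a list, an illustrative simplification of the 2-D image:

```python
def binarize_and_count(hsv_pixels, second_threshold, third_threshold):
    """A pixel whose HSV value lies within [second, third] on every channel
    becomes 255 (white); all others become 0. Returns the mask and the
    white-pixel count used for the occlusion ratio."""
    def in_range(p):
        return all(lo <= c <= hi
                   for c, lo, hi in zip(p, second_threshold, third_threshold))
    mask = [255 if in_range(p) else 0 for p in hsv_pixels]
    return mask, mask.count(255)
```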
Those skilled in the art can understand that, in the specific implementations of the above method, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
The method of the embodiments of the present application has been described in detail above; the apparatus of the embodiments of the present application is provided below.
Please refer to FIG. 5, which is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application. The apparatus 1 includes an acquisition unit 11, a first processing unit 12, and a detection unit 13. Optionally, the image processing apparatus 1 further includes a second processing unit 14, a determination unit 15, a third processing unit 16, and a fourth processing unit 17, wherein: the acquisition unit 11 is configured to acquire an image to be processed, a first threshold, a second threshold, and a third threshold, where the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; the first processing unit 12 is configured to determine a first number of first pixels in a region to be detected of the image to be processed, a first pixel being a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and the detection unit 13 is configured to obtain a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the region to be detected.
In combination with any embodiment of the present application, the region to be detected includes a face region, and the skin occlusion detection result includes a face occlusion detection result. The image processing apparatus further includes: a second processing unit 14, configured to, before the first number of first pixels in the region to be detected of the image to be processed is determined, perform face detection processing on the image to be processed to obtain a first face frame, and determine the face region from the image to be processed according to the first face frame.
In combination with any embodiment of the present application, the face region includes a forehead region, the face occlusion detection result includes a forehead occlusion detection result, and the first face frame includes an upper frame line and a lower frame line. The upper frame line and the lower frame line are both sides of the first face frame that are parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than that of the lower frame line. The second processing unit 14 is configured to: perform face key point detection on the image to be processed to obtain at least one face key point, the at least one face key point including a left eyebrow key point and a right eyebrow key point; while keeping the ordinate of the upper frame line unchanged, move the lower frame line in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed so that the straight line on which the lower frame line lies coincides with a first straight line, thereby obtaining a second face frame, the first straight line being the straight line passing through the left eyebrow key point and the right eyebrow key point; and obtain the forehead region according to the region contained in the second face frame.
In combination with any embodiment of the present application, the second processing unit 14 is configured to: while keeping the ordinate of the lower frame line of the second face frame unchanged, move the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper frame line and the lower frame line of the second face frame equals a preset distance, thereby obtaining a third face frame; and obtain the forehead region according to the region contained in the third face frame.
In combination with any embodiment of the present application, the at least one face key point further includes a left mouth corner key point and a right mouth corner key point, and the first face frame further includes a left frame line and a right frame line. The left frame line and the right frame line are both sides of the first face frame that are parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than that of the right frame line. The second processing unit 14 is configured to: while keeping the abscissa of the left frame line of the third face frame unchanged, move the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the right frame line and the left frame line of the third face frame equals a reference distance, thereby obtaining a fourth face frame, wherein the reference distance is the distance between the two intersection points of a second straight line and the face contour contained in the third face frame, the second straight line is a straight line between the first straight line and a third straight line and parallel to the first straight line or the third straight line, and the third straight line is the straight line passing through the left mouth corner key point and the right mouth corner key point; and take the region contained in the fourth face frame as the forehead region.
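The sequence of frame-line adjustments described above (lower line moved to the eyebrow line, height set to a preset distance, width set to a reference distance) can be sketched as follows. The `Frame` representation and all names are illustrative assumptions, and the eyebrow-line ordinate, preset distance, and reference distance are taken as given inputs rather than computed here:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    # Pixel coordinates: y grows downward, so the upper frame line
    # has the smaller ordinate.
    top: float
    bottom: float
    left: float
    right: float

def forehead_frame(face, brow_y, preset_h, ref_w):
    """Sketch of the frame-line adjustments.

    face     -- first face frame from the face detector
    brow_y   -- ordinate of the line through the eyebrow key points
    preset_h -- preset distance between upper and lower frame lines
    ref_w    -- reference distance (face-contour width at the second line)
    """
    # Second frame: lower line moved up to the eyebrow line.
    f2 = Frame(face.top, brow_y, face.left, face.right)
    # Third frame: upper line moved so the height equals preset_h.
    f3 = Frame(f2.bottom - preset_h, f2.bottom, f2.left, f2.right)
    # Fourth frame: right line moved so the width equals ref_w.
    f4 = Frame(f3.top, f3.bottom, f3.left, f3.left + ref_w)
    return f4
```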
In combination with any embodiment of the present application, the image processing apparatus further includes: a determination unit 15, configured to determine a skin pixel region from the pixel region contained in the first face frame before the first number of first pixels in the region to be detected of the image to be processed is determined. The acquisition unit 11 is further configured to acquire a color value of a second pixel in the skin pixel region. The first processing unit 12 is further configured to take the difference between the color value of the second pixel and a first value as the second threshold, and take the sum of the color value of the second pixel and a second value as the third threshold, wherein neither the first value nor the second value exceeds the maximum of the color values of the image to be processed.
In combination with any embodiment of the present application, the image processing apparatus further includes: a third processing unit 16, configured to perform mask-wearing detection processing on the image to be processed to obtain a detection result before the skin pixel region is determined from the pixel region contained in the first face frame. The determination unit 15 is configured to: when it is detected that the face region in the image to be processed does not wear a mask, take the pixel region in the face region other than the forehead region, the mouth region, the eyebrow region, and the eye region as the skin pixel region; and when it is detected that the face region in the image to be processed wears a mask, take the pixel region between the first straight line and a fourth straight line as the skin pixel region, wherein the fourth straight line is the straight line passing through a left-eye lower eyelid key point and a right-eye lower eyelid key point, and both the left-eye lower eyelid key point and the right-eye lower eyelid key point belong to the at least one face key point.
In combination with any embodiment of the present application, the acquisition unit 11 is configured to: when the at least one face key point includes at least one first key point belonging to an inner left-eyebrow region and at least one second key point belonging to an inner right-eyebrow region, determine a rectangular region according to the at least one first key point and the at least one second key point; perform grayscale processing on the rectangular region to obtain a grayscale image of the rectangular region; and take the color value at the intersection of a first row and a first column of the grayscale image as the color value of the second pixel, wherein the first row is the row with the largest sum of gray values in the grayscale image, and the first column is the column with the largest sum of gray values in the grayscale image.
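The selection of the second pixel from the grayscale patch between the eyebrows can be sketched as follows (a minimal NumPy sketch; the function name is illustrative):

```python
import numpy as np

def sample_skin_pixel(gray):
    """Pick the intersection of the row with the largest sum of gray
    values and the column with the largest sum of gray values in a
    grayscale patch, returning its position and gray value."""
    row = int(np.argmax(gray.sum(axis=1)))  # row with max gray sum
    col = int(np.argmax(gray.sum(axis=0)))  # column with max gray sum
    return row, col, gray[row, col]
```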
In combination with any embodiment of the present application, the detection unit 13 is configured to: when the first ratio does not exceed the first threshold, determine that the skin occlusion detection result is that the skin region corresponding to the region to be detected is in an occluded state; and when the first ratio exceeds the first threshold, determine that the skin occlusion detection result is that the skin region corresponding to the region to be detected is in an unoccluded state.
In combination with any embodiment of the present application, the skin region belongs to a person to be detected, and the acquisition unit 11 is further configured to acquire a temperature heat map of the image to be processed. The image processing apparatus further includes: a fourth processing unit 17, configured to, when the skin occlusion detection result is that the skin region is in an unoccluded state, read the temperature of the skin region from the temperature heat map as the body temperature of the person to be detected.
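The final temperature read-out can be sketched as follows. This is illustrative: the embodiment does not specify how the temperature of the skin region is aggregated from the heat map, so taking the mean over a skin-region mask is an assumption:

```python
import numpy as np

def body_temperature(temp_map, skin_mask, unoccluded):
    """Read the person's temperature from the temperature heat map.

    temp_map   -- per-pixel temperature heat map of the image
    skin_mask  -- boolean mask selecting the skin region
    unoccluded -- skin occlusion detection result
    """
    if not unoccluded:
        return None  # occluded: prompt the user instead of measuring
    # Aggregation choice (mean) is an assumption, not from the embodiment.
    return float(temp_map[skin_mask].mean())
```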
In some embodiments, the functions or modules of the apparatus provided in the embodiments of the present application can be used to execute the methods described in the method embodiments above; for their specific implementation, reference may be made to the descriptions of the method embodiments above, which are not repeated here for brevity.
FIG. 6 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of the present application. The image processing apparatus 2 includes a processor 21, a memory 22, an input device 23, and an output device 24. The processor 21, the memory 22, the input device 23, and the output device 24 are coupled through a connector 25, and the connector 25 includes various interfaces, transmission lines, buses, and the like, which are not limited in the embodiments of the present application. It should be understood that, in the embodiments of the present application, coupling refers to interconnection in a specific way, including direct connection or indirect connection through other devices, for example, connection through various interfaces, transmission lines, and buses.
The processor 21 may be one or more graphics processing units (GPUs). When the processor 21 is one GPU, the GPU may be a single-core GPU or a multi-core GPU. Optionally, the processor 21 may be a processor group composed of multiple GPUs coupled to each other through one or more buses. Optionally, the processor may also be another type of processor, which is not limited in the embodiments of the present application.
The memory 22 can be used to store computer program instructions and various kinds of computer program code, including program code for implementing the solutions of the present application. Optionally, the memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM); the memory is used for related instructions and data.
The input device 23 is used to input data and/or signals, and the output device 24 is used to output data and/or signals. The input device 23 and the output device 24 may be independent devices or an integrated device.
It can be understood that, in the embodiments of the present application, the memory 22 can be used not only to store related instructions but also to store data; for example, the memory 22 can store data obtained through the input device 23, or data processed by the processor 21. The embodiments of the present application do not limit the specific data stored in the memory.
It can be understood that FIG. 6 shows only a simplified design of the image processing apparatus. In practical applications, the image processing apparatus may also contain other necessary components, including but not limited to any number of input/output devices, processors, and memories, and all image processing apparatuses that can implement the embodiments of the present application fall within the protection scope of the present application.
Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here. Those skilled in the art can also clearly understand that the descriptions of the embodiments of the present application each have their own emphasis; for convenience and brevity of description, the same or similar parts may not be repeated in different embodiments, and therefore, for parts not described or not described in detail in a certain embodiment, reference may be made to the descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a division by logical function, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one first processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
In the above embodiments, the implementation may be realized entirely or partially by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form, entirely or partially, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated entirely or partially. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), among others.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when the program is executed, it may include the processes of the foregoing method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.
Claims (14)
- An image processing method, characterized in that the method comprises: acquiring an image to be processed, a first threshold, a second threshold, and a third threshold, wherein the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; determining a first number of first pixels in a region to be detected of the image to be processed, wherein a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and obtaining a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the region to be detected.
- The method according to claim 1, characterized in that the determining the first number of first pixels in the region to be detected of the image to be processed comprises: performing face detection processing on the image to be processed to obtain a first face frame; determining the region to be detected from the image to be processed according to the first face frame; and determining the first number of the first pixels in the region to be detected.
- The method according to claim 2, characterized in that the first face frame includes an upper frame line and a lower frame line; the upper frame line and the lower frame line are both sides of the first face frame that are parallel to the horizontal axis of the pixel coordinate system of the image to be processed, and the ordinate of the upper frame line is smaller than that of the lower frame line; and the determining the region to be detected from the image to be processed according to the first face frame comprises: performing face key point detection on the image to be processed to obtain at least one face key point, the at least one face key point including a left eyebrow key point and a right eyebrow key point; while keeping the ordinate of the upper frame line unchanged, moving the lower frame line in the negative direction of the vertical axis of the pixel coordinate system of the image to be processed so that the straight line on which the lower frame line lies coincides with a first straight line, thereby obtaining a second face frame, wherein the first straight line is the straight line passing through the left eyebrow key point and the right eyebrow key point; and obtaining the region to be detected according to the region contained in the second face frame.
- The method according to claim 3, characterized in that the obtaining the region to be detected according to the region contained in the second face frame comprises: while keeping the ordinate of the lower frame line of the second face frame unchanged, moving the upper frame line of the second face frame along the vertical axis of the pixel coordinate system of the image to be processed so that the distance between the upper frame line and the lower frame line of the second face frame equals a preset distance, thereby obtaining a third face frame; and obtaining the region to be detected according to the region contained in the third face frame.
- The method according to claim 4, characterized in that the at least one face key point further includes a left mouth corner key point and a right mouth corner key point; the first face frame further includes a left frame line and a right frame line; the left frame line and the right frame line are both sides of the first face frame that are parallel to the vertical axis of the pixel coordinate system of the image to be processed, and the abscissa of the left frame line is smaller than that of the right frame line; and the obtaining the region to be detected according to the region contained in the third face frame comprises: while keeping the abscissa of the left frame line of the third face frame unchanged, moving the right frame line of the third face frame along the horizontal axis of the pixel coordinate system of the image to be processed so that the distance between the right frame line and the left frame line of the third face frame equals a reference distance, thereby obtaining a fourth face frame, wherein the reference distance is the distance between the two intersection points of a second straight line and the face contour contained in the third face frame, the second straight line is a straight line between the first straight line and a third straight line and parallel to the first straight line or the third straight line, and the third straight line is the straight line passing through the left mouth corner key point and the right mouth corner key point; and taking the region contained in the fourth face frame as the region to be detected.
- The method according to any one of claims 2 to 5, characterized in that the acquiring the second threshold and the third threshold comprises: determining a skin pixel region from the pixel region contained in the first face frame; acquiring a color value of a second pixel in the skin pixel region; taking the difference between the color value of the second pixel and a first value as the second threshold; and taking the sum of the color value of the second pixel and a second value as the third threshold, wherein neither the first value nor the second value exceeds the maximum of the color values of the image to be processed.
- The method according to claim 6, characterized in that the determining the skin pixel region from the pixel region contained in the first face frame comprises: when it is detected that the face region in the image to be processed does not wear a mask, taking the pixel region in the face region other than the forehead region, the mouth region, the eyebrow region, and the eye region as the skin pixel region; and when it is detected that the face region in the image to be processed wears a mask, taking the pixel region between the first straight line and a fourth straight line as the skin pixel region, wherein the fourth straight line is the straight line passing through a left-eye lower eyelid key point and a right-eye lower eyelid key point, and both the left-eye lower eyelid key point and the right-eye lower eyelid key point belong to the at least one face key point.
- The method according to claim 6 or 7, characterized in that the acquiring the color value of the second pixel in the skin pixel region comprises: when the at least one face key point includes at least one first key point belonging to an inner left-eyebrow region and at least one second key point belonging to an inner right-eyebrow region, determining a rectangular region according to the at least one first key point and the at least one second key point; performing grayscale processing on the rectangular region to obtain a grayscale image of the rectangular region; and taking the color value at the intersection of a first row and a first column as the color value of the second pixel, wherein the first row is the row with the largest sum of gray values in the grayscale image, and the first column is the column with the largest sum of gray values in the grayscale image.
- The method according to any one of claims 1 to 8, wherein obtaining the skin occlusion detection result of the image to be processed according to the first threshold and the first ratio of the first number to the number of pixels in the area to be detected comprises: in a case where the first ratio does not exceed the first threshold, determining that the skin occlusion detection result is that the skin area corresponding to the area to be detected is in an occluded state; and in a case where the first ratio exceeds the first threshold, determining that the skin occlusion detection result is that the skin area corresponding to the area to be detected is in an unoccluded state.
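A short sketch of this decision, combined with the "first pixel" counting the claims refer to (names are illustrative; color values are assumed to be per-pixel scalars):

```python
import numpy as np

def skin_occlusion_result(region, second_threshold, third_threshold, first_threshold):
    """Count 'first pixels' (color value in [second_threshold, third_threshold])
    in the area to be detected and compare their ratio with the first threshold."""
    region = np.asarray(region)
    in_range = (region >= second_threshold) & (region <= third_threshold)
    first_ratio = int(in_range.sum()) / region.size
    # A ratio exceeding the first threshold means the skin area is unoccluded.
    return "unoccluded" if first_ratio > first_threshold else "occluded"
```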
- The method according to claim 9, wherein the skin area belongs to a person to be detected, and the method further comprises: acquiring a temperature heat map of the image to be processed; and in a case where the skin occlusion detection result is that the skin area is in an unoccluded state, reading the temperature of the skin area from the temperature heat map as the body temperature of the person to be detected.
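A sketch of this body-temperature step (the claim only says the temperature is "read" from the heat map; averaging over the skin pixels is my own simplification, and all names are illustrative):

```python
import numpy as np

def body_temperature(heat_map, skin_mask, occluded):
    """Read the skin area's temperature from a per-pixel temperature heat map,
    but only when the skin occlusion result says the area is unoccluded."""
    if occluded:
        return None  # an occluded skin area gives no reliable reading
    heat_map = np.asarray(heat_map, dtype=float)
    return float(heat_map[np.asarray(skin_mask, dtype=bool)].mean())
```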
- An image processing apparatus, comprising: an acquisition unit configured to acquire an image to be processed, a first threshold, a second threshold and a third threshold, wherein the first threshold is different from the second threshold, the first threshold is different from the third threshold, and the second threshold is less than or equal to the third threshold; a first processing unit configured to determine a first number of first pixels in an area to be detected of the image to be processed, wherein a first pixel is a pixel whose color value is greater than or equal to the second threshold and less than or equal to the third threshold; and a detection unit configured to obtain a skin occlusion detection result of the image to be processed according to the first threshold and a first ratio of the first number to the number of pixels in the area to be detected.
- A processor, configured to execute the method according to any one of claims 1 to 10.
- An electronic device, comprising a processor and a memory, wherein the memory is configured to store computer program code, the computer program code comprises computer instructions, and when the processor executes the computer instructions, the electronic device performs the method according to any one of claims 1 to 10.
- A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, the computer program comprises program instructions, and when the program instructions are executed by a processor, the processor performs the method according to any one of claims 1 to 10.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110600103.1A CN113222973B (en) | 2021-05-31 | 2021-05-31 | Image processing method and device, processor, electronic equipment and storage medium |
CN202110600103.1 | 2021-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022252737A1 true WO2022252737A1 (en) | 2022-12-08 |
Family
ID=77082028
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/080403 WO2022252737A1 (en) | 2021-05-31 | 2022-03-11 | Image processing method and apparatus, processor, electronic device, and storage medium |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN113222973B (en) |
TW (1) | TWI787113B (en) |
WO (1) | WO2022252737A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113222973B (en) * | 2021-05-31 | 2024-03-08 | 深圳市商汤科技有限公司 | Image processing method and device, processor, electronic equipment and storage medium |
CN113592884B (en) * | 2021-08-19 | 2022-08-09 | 遨博(北京)智能科技有限公司 | Human body mask generation method |
CN113936292A (en) * | 2021-09-01 | 2022-01-14 | 北京旷视科技有限公司 | Skin detection method, device, equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108319953A (en) * | 2017-07-27 | 2018-07-24 | 腾讯科技(深圳)有限公司 | Occlusion detection method and device, electronic equipment and the storage medium of target object |
CN108427918A (en) * | 2018-02-12 | 2018-08-21 | 杭州电子科技大学 | Face method for secret protection based on image processing techniques |
CN110532871A (en) * | 2019-07-24 | 2019-12-03 | 华为技术有限公司 | The method and apparatus of image procossing |
US20200104570A1 (en) * | 2018-09-28 | 2020-04-02 | Apple Inc. | Network performance by including attributes |
CN111428581A (en) * | 2020-03-05 | 2020-07-17 | 平安科技(深圳)有限公司 | Face shielding detection method and system |
CN112633144A (en) * | 2020-12-21 | 2021-04-09 | 平安科技(深圳)有限公司 | Face occlusion detection method, system, device and storage medium |
CN113222973A (en) * | 2021-05-31 | 2021-08-06 | 深圳市商汤科技有限公司 | Image processing method and device, processor, electronic device and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105426870B (en) * | 2015-12-15 | 2019-09-24 | 北京文安智能技术股份有限公司 | A kind of face key independent positioning method and device |
CN105740758A (en) * | 2015-12-31 | 2016-07-06 | 上海极链网络科技有限公司 | Internet video face recognition method based on deep learning |
CN107145833A (en) * | 2017-04-11 | 2017-09-08 | 腾讯科技(上海)有限公司 | The determination method and apparatus of human face region |
TWI639137B (en) * | 2017-04-27 | 2018-10-21 | 立特克科技股份有限公司 | Skin detection device and the method therefor |
CN107633252B (en) * | 2017-09-19 | 2020-04-21 | 广州市百果园信息技术有限公司 | Skin color detection method, device and storage medium |
CN110443747B (en) * | 2019-07-30 | 2023-04-18 | Oppo广东移动通信有限公司 | Image processing method, device, terminal and computer readable storage medium |
CN111524080A (en) * | 2020-04-22 | 2020-08-11 | 杭州夭灵夭智能科技有限公司 | Face skin feature identification method, terminal and computer equipment |
CN112861661B (en) * | 2021-01-22 | 2022-11-08 | 深圳市慧鲤科技有限公司 | Image processing method and device, electronic equipment and computer readable storage medium |
CN112836625A (en) * | 2021-01-29 | 2021-05-25 | 汉王科技股份有限公司 | Face living body detection method and device and electronic equipment |
- 2021
  - 2021-05-31 CN CN202110600103.1A patent/CN113222973B/en active Active
- 2022
  - 2022-03-11 WO PCT/CN2022/080403 patent/WO2022252737A1/en active Application Filing
  - 2022-04-19 TW TW111114745A patent/TWI787113B/en not_active IP Right Cessation
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117495855A (en) * | 2023-12-29 | 2024-02-02 | 广州中科医疗美容仪器有限公司 | Skin defect evaluation method and system based on image processing |
CN117495855B (en) * | 2023-12-29 | 2024-03-29 | 广州中科医疗美容仪器有限公司 | Skin defect evaluation method and system based on image processing |
Also Published As
Publication number | Publication date |
---|---|
CN113222973A (en) | 2021-08-06 |
CN113222973B (en) | 2024-03-08 |
TW202248954A (en) | 2022-12-16 |
TWI787113B (en) | 2022-12-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022252737A1 (en) | Image processing method and apparatus, processor, electronic device, and storage medium | |
Kumar et al. | Face detection techniques: a review | |
US10783354B2 (en) | Facial image processing method and apparatus, and storage medium | |
US11043011B2 (en) | Image processing method, apparatus, terminal, and storage medium for fusing images of two objects | |
TWI777092B (en) | Image processing method, electronic device, and storage medium | |
WO2020215552A1 (en) | Multi-target tracking method, apparatus, computer device, and storage medium | |
WO2018188453A1 (en) | Method for determining human face area, storage medium, and computer device | |
US11361587B2 (en) | Age recognition method, storage medium and electronic device | |
CN111539276B (en) | Method for detecting safety helmet in real time in power scene | |
WO2022267653A1 (en) | Image processing method, electronic device, and computer readable storage medium | |
CN112232205B (en) | Mobile terminal CPU real-time multifunctional face detection method | |
CN110751635A (en) | Oral cavity detection method based on interframe difference and HSV color space | |
CN117133032A (en) | Personnel identification and positioning method based on RGB-D image under face shielding condition | |
Spivak et al. | Approach to Recognizing of Visualized Human Emotions for Marketing Decision Making Systems. | |
WO2018113206A1 (en) | Image processing method and terminal | |
RU2768797C1 (en) | Method and system for determining synthetically modified face images on video | |
Xiao et al. | Facial mask detection system based on YOLOv4 algorithm | |
CN116895090A (en) | Face five sense organ state detection method and system based on machine vision | |
KR102066892B1 (en) | Make-up evaluation system and operating method thereof | |
CN115995097A (en) | Deep learning-based safety helmet wearing standard judging method | |
KR20050052657A (en) | Vision-based humanbeing detection method and apparatus | |
Sun et al. | Flame Image Detection Algorithm Based onComputer Vision. | |
JP6467994B2 (en) | Image processing program, image processing apparatus, and image processing method | |
CN112818728B (en) | Age identification method and related products | |
JP2018147046A (en) | Face detection device and method for controlling the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22814785 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 22814785 Country of ref document: EP Kind code of ref document: A1 |