WO2023231400A1 - Face angle prediction method and apparatus, and device and readable storage medium - Google Patents
- Publication number
- WO2023231400A1 (application PCT/CN2022/142276)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- angle
- face
- probabilities
- detection window
- face image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/77—Determining position or orientation of objects or cameras using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Definitions
- the present application belongs to the field of image recognition technology, and in particular relates to a face angle detection method, device, equipment and readable storage medium.
- the facial image is first reconstructed into a three-dimensional image, the three-dimensional image is then mapped into a two-dimensional image, and face pose prediction is then performed based on the facial movement characteristics in the two-dimensional image.
- the angle can be corrected before recognition or not, to improve the accuracy of image recognition.
- This application provides a face angle prediction method, device, equipment and readable storage medium, which avoid the problem that only the approximate pose of a face can be predicted while the face angle is difficult to predict accurately, and which suit scenarios that require high face angle prediction accuracy.
- this application provides a face angle prediction method, including:
- multiple angle probabilities of multiple angle types are determined.
- the multiple angle probabilities of each angle type respectively correspond to multiple angle intervals of each angle type.
- the multiple angle types include yaw angle, pitch angle and roll angle;
- the predicted angle of each angle type of the face in the face image to be measured relative to the shooting position is determined.
- This application determines the multiple angle probabilities of multiple angle types through the facial features corresponding to the face area, and then determines, based on the multiple angle probabilities of each angle type, the predicted angle of each angle type of the face in the face image to be measured relative to the shooting position.
- Since the obtained multiple angle probabilities correspond to multiple angle intervals, the predicted angle is calculated from the angle probabilities of those intervals. This ensures the accuracy of the predicted angle, avoids the situation where the face angle is difficult to predict accurately, and suits scenarios that require high accuracy in face angle prediction.
- this application provides a face angle prediction device, which is used to perform the method in the above-mentioned first aspect or any possible implementation of the first aspect.
- the device may include:
- the acquisition module is used to obtain the face area of the face image to be tested
- the first determination module is used to determine the facial features corresponding to the facial area
- the second determination module is used to determine multiple angle probabilities of multiple angle types according to the facial features, where the multiple angle probabilities of each angle type respectively correspond to multiple angle intervals of each angle type, and the multiple angle types include yaw angle, pitch angle and roll angle;
- the third determination module is configured to determine the predicted angle of each angle type of the human face in the face image to be measured relative to the shooting position based on multiple angle probabilities of each angle type.
- the present application provides an electronic device, which includes a memory and a processor.
- the memory is used to store instructions; the processor executes the instructions stored in the memory, so that the device performs the face angle prediction method in the first aspect or any possible implementation of the first aspect.
- In a fourth aspect, a computer-readable storage medium is provided. Instructions are stored in the computer-readable storage medium. When the instructions are run on a computer, they cause the computer to execute the face angle prediction method of the first aspect or any possible implementation of the first aspect.
- a fifth aspect provides a computer program product containing instructions that, when run on a device, cause the device to execute the face angle prediction method of the first aspect or any possible implementation of the first aspect.
- Figure 1 is a schematic flowchart of a face angle prediction method provided by an embodiment of the present application
- Figure 2 is a schematic flow chart of a face angle prediction method provided by an embodiment of the present application.
- Figure 3 is a schematic flow chart of a face angle prediction method provided by an embodiment of the present application.
- Figure 4 is a schematic flowchart of a face angle prediction method provided by an embodiment of the present application.
- Figure 5 is a schematic structural diagram of a face angle prediction device provided by an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- the term “if” may be interpreted as “when” or “once” or “in response to determining” or “in response to detecting”, depending on the context. Similarly, the phrase “if determined” or “if [the described condition or event] is detected” may be interpreted, depending on the context, to mean “once determined” or “in response to a determination” or “once [the described condition or event] is detected” or “in response to detection of [the described condition or event]”.
- This application provides a face angle prediction method, device, equipment and readable storage medium.
- the method can be implemented through recognition equipment and applied in access control recognition, missing person search, case investigation, intelligent security and other scenarios.
- the face angle includes three angle types of the face relative to the shooting position, and the three angle types are pitch angle, yaw angle and roll angle respectively.
- the recognition device refers to the device used by users to predict face angles.
- Identification devices can be access control devices, smartphones, desktop computers, laptops, tablets, wearable devices, handheld devices, vehicle-mounted devices, servers, etc.
- the embodiments of this application do not place any restrictions on the specific types of identification devices.
- the identification device may include display hardware, or may have an external display.
- the recognition device can predict the face angle based on the face in the image, and determine whether the next action can be taken based on the predicted face angle. For example, when the angle of the face is too large, a prompt message such as "The angle of the face is too large and cannot be recognized" will be displayed on the display screen of the access control device.
- in the missing person search scenario, the face angle prediction method is used to input the face image into the recognition device and predict the face angle of the face in that image. The predicted face angle determines whether the next step can be taken, for example, determining whether the recognition device can recognize the face, determining whether the face image matches an image in the missing person image library, and then determining whether the person is a missing person.
- the recognition device can be connected through communication with the surveillance camera, and the recognition device can predict the face angle corresponding to the face in the image by acquiring the image captured by the surveillance camera.
- FIG. 1 shows a schematic flowchart of a face angle prediction method provided by an embodiment of the present application.
- the face angle prediction method provided by this application can include:
- the face image to be tested can be directly given by the user, or it can be extracted from video data collected by image collection equipment such as surveillance cameras and video cameras.
- the face area refers to the area containing faces in the face image to be measured.
- the face area is obtained by performing face detection on the face image to be tested, obtaining the first detection window, and intercepting the image in the first detection window.
- the recognition device can perform an external expansion process on the first detection window to obtain an expanded second detection window, and intercept an area corresponding to the size of the second detection window as the face area.
- the detection window refers to the wire frame from which the face in the face image to be detected can be extracted.
- the face detection algorithm can be used to detect the face to be tested.
- the face detection algorithm can be stored in the storage device.
- the storage device can communicate with the recognition device, so that the recognition device can retrieve the face detection algorithm from the storage device.
- This application does not limit the storage method and specific type of storage devices.
- the YOLO (you only look once) algorithm is used for face detection.
- the YOLO algorithm is an object recognition and positioning algorithm based on deep neural networks. Its biggest feature is its fast running speed.
- the identification device is an access control device
- the access control device includes a camera.
- the camera captures a face image.
- the access control device uses a face detection algorithm to detect the face image and obtain the face area of the face image to be measured.
- the recognition device is a mobile phone
- the mobile phone has a recognition applet.
- the mobile phone communicates with the surveillance camera through the recognition applet to obtain images captured by the surveillance camera.
- the recognition applet can use the face detection algorithm to detect faces in images captured by surveillance cameras and obtain the face area of the face image to be detected.
- the recognition device can obtain the face area. Therefore, the recognition device can perform feature extraction on the face area and obtain the facial features corresponding to the face area.
- the recognition device outputs the facial features by inputting the facial region into the backbone network of the facial angle recognition model.
- the backbone network is used to extract facial features from facial images.
- the backbone network is stored in advance in a storage device that communicates with the identification device.
- the identification device is an access control device.
- the camera captures the face image.
- the access control device performs face detection on the face image through the face detection algorithm. After obtaining the face area of the face image, it calls the backbone network to perform feature extraction on the face area and obtain the facial features.
- the identification device is a mobile phone
- the mobile phone has an identification applet.
- the recognition applet performs face detection on the image through the face detection algorithm. After obtaining the face area corresponding to the image, it calls the backbone network to extract features of the face area to obtain the face features.
- various angle types include yaw angle, pitch angle and roll angle.
- Multiple angle probabilities for each angle type respectively correspond to multiple angle intervals for each angle type.
- Multiple angle intervals refer to multiple angle intervals obtained by dividing the angle ranges of yaw angle, pitch angle and roll angle according to preset rules.
- the preset rule is to divide the angle range into intervals every 5 degrees.
- the angle range of yaw angle and pitch angle is [-90, 90].
- the angle range of yaw angle and pitch angle is divided into intervals every 5 degrees, resulting in 36 angle intervals respectively.
- the 36 angle intervals of the yaw angle are [-90, -85), [-85, -80)... (80, 85], (85, 90].
- the 36 angle intervals of the pitch angle are [-90, -85), [-85, -80)... (80, 85], (85, 90].
- the angle range of the roll angle is [-180, 180].
- the angle range of the roll angle is divided into intervals every 5 degrees, resulting in 72 angle intervals.
- the 72 angle intervals of the roll angle are [-180, -175), [-175, -170), [-170, -165)... (165, 170], (170, 175], (175, 180].
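The interval division described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function names `make_intervals` and `interval_index` are hypothetical, and for simplicity every bin is treated as closed on the left, with the upper bound of the range assigned to the last bin, whereas the text lists the positive intervals as closed on the right.

```python
def make_intervals(low, high, step=5):
    """Return (start, end) pairs that split [low, high] into `step`-degree bins."""
    return [(a, a + step) for a in range(low, high, step)]


def interval_index(angle, low, high, step=5):
    """Return the index of the bin that contains `angle`.

    Simplifying assumption: bins are left-closed, and the upper bound
    of the range falls into the last bin.
    """
    if not low <= angle <= high:
        raise ValueError("angle outside the supported range")
    n_bins = (high - low) // step
    return min(int((angle - low) // step), n_bins - 1)


# Yaw and pitch share the range [-90, 90]; roll uses [-180, 180].
YAW_PITCH_INTERVALS = make_intervals(-90, 90)
ROLL_INTERVALS = make_intervals(-180, 180)
```

With a 5-degree step this yields the 36 intervals for yaw and pitch and the 72 intervals for roll stated in the text.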
- the recognition device inputs the facial features into the fully connected classification network of the facial angle recognition model and outputs multiple angle probabilities for each angle type.
- the fully connected classification network is used to predict the angle probabilities corresponding to the facial features in multiple angle intervals of each angle type.
- the backbone network and fully connected classification network serve as face angle recognition models and are pre-stored in a storage device that communicates with the recognition device.
- the fully connected classification network is connected to the output end of the backbone network.
- after the backbone network of the recognition device extracts the facial features of the face image to be tested, the facial features are input to the fully connected classification network for angle probability prediction.
- the fully connected classification network includes three fully connected layers, namely the first fully connected layer, the second fully connected layer and the third fully connected layer.
- the first fully connected layer and the second fully connected layer each have 36 nodes, and the third fully connected layer has 72 nodes.
- the 36 nodes of the first fully connected layer and of the second fully connected layer are used to predict the angle probabilities of the yaw angle and the pitch angle, respectively, in their 36 angle intervals.
- the 72 nodes of the third fully connected layer are used to predict the angle probabilities of the roll angle in its 72 angle intervals.
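The three-branch output structure described above might look like the following pure-Python sketch. Several things here are assumptions made only for illustration: the feature dimension of 8, the random weight initialization, the use of softmax to turn node outputs into probabilities, and the names `AngleHead` and `build_heads`. The text only fixes the node counts (36, 36 and 72).

```python
import math
import random


def softmax(logits):
    """Convert raw node outputs into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class AngleHead:
    """One fully connected layer with one output node per angle interval."""

    def __init__(self, in_dim, n_bins, rng):
        # Hypothetical initialization; a trained model would load real weights.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(n_bins)]
        self.bias = [0.0] * n_bins

    def __call__(self, features):
        logits = [sum(w * x for w, x in zip(row, features)) + b
                  for row, b in zip(self.weights, self.bias)]
        return softmax(logits)


def build_heads(in_dim, seed=0):
    """Create the yaw (36), pitch (36) and roll (72) heads."""
    rng = random.Random(seed)
    return {
        "yaw": AngleHead(in_dim, 36, rng),
        "pitch": AngleHead(in_dim, 36, rng),
        "roll": AngleHead(in_dim, 72, rng),
    }


heads = build_heads(in_dim=8)
features = [0.5] * 8  # stand-in for the backbone network output
probs = {name: head(features) for name, head in heads.items()}
```

Each entry of `probs` is one probability per angle interval of that angle type, which is the input the angle calculation step expects.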
- the identification device is an access control device. After acquiring the facial features, the access control device predicts the angle probabilities of the yaw angle and the pitch angle in its 36 angle intervals, and predicts the angle probability of the roll angle in its 72 angle intervals based on the facial features.
- the identification device is a mobile phone
- the mobile phone has an identification applet.
- the recognition applet predicts the angle probabilities of the yaw angle and the pitch angle in its 36 angle intervals, and predicts the angle probability of the roll angle in its 72 angle intervals based on the facial features.
- the angle of the face in the face image to be measured relative to the shooting position is determined based on the multiple angle probabilities, the number of angle intervals, and the middle angle of each angle interval.
- the middle angle of each angle interval is the midpoint of that 5-degree interval. For example, if a certain angle interval is [0, 5), then the corresponding middle angle is 2.5 degrees.
- n represents the number of angle intervals
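The exact formula is not reproduced in this text, so the following is only one plausible reading of "multiple angle probabilities, number of angle intervals, and middle angle of each interval": a probability-weighted sum of the interval midpoints. The function name `predicted_angle` and the weighted-sum form are assumptions.

```python
def predicted_angle(probs, low, high):
    """Probability-weighted average of the interval midpoints.

    `probs` holds one probability per interval; n = len(probs) is the
    number of angle intervals, and [low, high] is the angle range.
    """
    n = len(probs)
    step = (high - low) / n
    middles = [low + step * (i + 0.5) for i in range(n)]
    return sum(p * m for p, m in zip(probs, middles))


# All probability mass in the interval [0, 5) of the yaw range.
one_hot = [0.0] * 36
one_hot[18] = 1.0
yaw = predicted_angle(one_hot, -90, 90)
```

Under this reading, a distribution concentrated in [0, 5) gives the middle angle of that interval, and mass spread over neighbouring intervals gives a value between their midpoints, which is how the output can be finer than the 5-degree interval width.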
- the identification device is an access control device
- the access control device includes a display screen and a camera.
- the access control device predicts, based on the facial features, the angle probabilities of the yaw angle and the pitch angle in their 36 angle intervals and the angle probabilities of the roll angle in its 72 angle intervals. It calculates the value corresponding to each angle interval from these angle probabilities, and calculates the final predicted angle of the face from the values corresponding to each interval.
- the access control device can determine whether the next action can be taken based on the predicted angle of the face. For example, when the face angle is too large, a prompt message such as "The face angle is too large to be recognized" will be displayed on the display of the access control device.
- the identification device is a mobile phone
- the mobile phone has an identification applet.
- the recognition applet determines the facial features corresponding to the face area, then predicts, based on the facial features, the angle probabilities of the yaw angle and the pitch angle in their 36 angle intervals and the angle probabilities of the roll angle in its 72 angle intervals. Finally, it calculates the value corresponding to each angle interval from these angle probabilities, and calculates the final predicted angle of the face from the values corresponding to each interval.
- the recognition applet can use the predicted angle of the face to determine whether the next action can be taken. For example, when the angle of the face is too large, the face image to be measured is corrected.
- the face angle prediction method provided by this application obtains facial features based on the face area of the face image to be measured, and then determines multiple angle probabilities for each of the multiple angle types based on the facial features. Finally, the multiple angle probabilities of each angle type are used to determine the predicted angle of each angle type of the face in the face image to be measured relative to the shooting position. For each angle type, since the obtained angle probabilities correspond to multiple angle intervals, the predicted angle is calculated from the angle probabilities of those intervals, ensuring the accuracy of the predicted angle.
- the recognition device can obtain the maximum value of the multiple angle probabilities of the roll angle.
- when the maximum value corresponds to the preset mapping angle interval, the predicted angle of the roll angle calculated directly is inaccurate.
- the range of multiple angle intervals of the roll angle is [-180, 180].
- the mapping angle interval includes multiple angle intervals corresponding to [-180, -90) or multiple angle intervals corresponding to (90, 180].
- Figure 2 shows a schematic flow chart of a face angle prediction method provided by an embodiment of the present application.
- the face angle prediction method provided by this application may include:
- the maximum probability angle interval is the angle interval corresponding to the maximum value among multiple angle probabilities.
- the maximum value among multiple angle probabilities for roll angle corresponds to the angle interval [-175, -170).
- in this case, the roll angle of the face is considered to be large, and the predicted angle of the roll angle that the recognition device calculates according to the method shown in S103 in Figure 1 is inaccurate.
- if the recognition device determines that the maximum probability angle interval corresponding to the roll angle is within [-90, 90], it can calculate the predicted angle of the roll angle according to the method shown in S103 in Figure 1.
- if the recognition device determines that the maximum probability angle interval corresponding to the roll angle is within [-180, -90) or (90, 180], the recognition device needs to linearly map the multiple angle probabilities of the roll angle to obtain the mapped angle probabilities.
- the recognition device linearly maps the multiple angle probabilities so that an accurate roll angle value can be obtained when the maximum probability angle interval of the roll angle is within [-180, -90) or (90, 180].
- linear mapping refers to:
- for example, if the predicted angle of the roll angle is calculated directly according to the method shown in S103 in Figure 1, the predicted angle value obtained may be -103.5 degrees. Obviously, -103.5 degrees is not in the angle interval [-175, -170), which is unreasonable.
- the recognition device therefore needs to first map the multiple angle probabilities of the roll angle, for example, map the angle probability corresponding to the angle interval [-175, -170) to the angle interval (-10, -5], that is, replace the angle probability corresponding to the angle interval [-175, -170) with the angle probability corresponding to the angle interval (-10, -5].
- the calculated predicted angle value can then be -5 degrees. Obviously, -5 degrees is in the angle interval (-10, -5], which is reasonable.
- Inverse linear mapping refers to mapping the predicted angle value (mapping angle) calculated based on the angle probability corresponding to the angle interval (-10, -5] back to the angle interval [-175, -170).
- for example, the angle value obtained based on S203 is -5 degrees. After -5 degrees is inversely mapped, the obtained angle value is -175 degrees, which is in the angle interval [-175, -170) and is reasonable.
- the multiple angle probabilities are linearly mapped to obtain the mapped angle probabilities; the mapping angle is then determined based on the mapped angle probabilities, the number of angle intervals of the roll angle and the middle angle of each angle interval; and the mapping angle is inversely mapped to obtain the predicted angle.
- linear mapping is applied to the angle probabilities corresponding to the mapping angle interval, the mapping angle is then calculated, and the predicted angle is calculated from the mapping angle, so that a more accurate predicted angle can be obtained.
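The definition of the linear mapping is not reproduced in this text. The sketch below shows one mapping that is consistent with the worked example ([-175, -170) maps to (-10, -5], and -5 degrees maps back to -175 degrees): a reflection about -90 degrees on the negative side and about 90 degrees on the positive side. The reflection formula and the function names are assumptions, and only the angle-level mapping is shown, not the remapping of the probability vector.

```python
def in_mapping_interval(middle_angle):
    """True if the maximum-probability interval lies in [-180, -90) or (90, 180]."""
    return middle_angle < -90 or middle_angle > 90


def to_mapped(angle):
    """Map a large roll angle into [-90, 90] (assumed reflection)."""
    if angle < -90:
        return -180 - angle
    if angle > 90:
        return 180 - angle
    return angle


def from_mapped(mapped_angle, negative_side):
    """Inverse mapping; `negative_side` records which half the peak was in."""
    if negative_side:
        return -180 - mapped_angle
    return 180 - mapped_angle


# Worked example from the text: the peak is in [-175, -170).
peak_middle = -172.5
if in_mapping_interval(peak_middle):
    mapping_angle = -5.0  # value computed from the mapped probabilities
    roll = from_mapped(mapping_angle, negative_side=peak_middle < 0)
```

The inverse mapping needs to know which half of the range the maximum-probability interval came from, which is why the sketch passes `negative_side` explicitly.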
- when the recognition device determines the face area, it can expand the first detection window to obtain the second detection window, intercept the area corresponding to the size of the second detection window, and determine that area as the face area.
- the first detection window can be expanded to obtain more face information corresponding to the face image to be measured, ensuring that the final face angle accuracy is higher.
- FIG. 3 shows a schematic flowchart of a face angle prediction method provided by an embodiment of the present application.
- the face angle prediction method provided by this application may include:
- face detection is performed on the face image to be tested, and the first detection window obtained is a rectangle.
- the identification device can obtain the third detection window by taking the longer side of the first detection window as the side length and the center of the first detection window as the center.
- face detection is performed on the face image to be tested, and the first detection window obtained is a square.
- step S302 can be directly performed on the first detection window according to the expansion coefficient.
- the first detection window obtained is a rectangle or a square. Whether the first detection window is rectangular or square is usually determined by the distance from the camera, the facial expression or movement, the angle of the human face, and other aspects.
- the first detection window is a rectangle with a length of 60 pixels and a width of 40 pixels. Then, taking the center of the first detection window as the center and the length of the first detection window as the side length of the window, the third detection window obtained is a square with a side length of 60 pixels.
- the preset expansion coefficient is 0.1.
- the expansion coefficient can also be other values, such as 0.15, which can be set according to the actual situation, and will not be described in detail here.
- the side length of the third detection window is 60 pixels, and the expansion coefficient is 0.1. Then, the third detection window is expanded, and the fourth detection window obtained is a square with a side length of 66 pixels.
- the obtained fourth detection window may exceed the original face image to be tested, that is, the fourth detection window exceeds the corresponding side length of the face image to be tested.
- the recognition device determines that the fourth detection window exceeds the side length corresponding to the face image to be measured, it removes the excess side length, and obtains the fifth detection window after removal.
- the length of the face image to be measured is 90 pixels
- the width is 60 pixels
- the side length of the fourth detection window is 66 pixels.
- the length of the fifth detection window obtained is 66 pixels and the width is 60 pixels.
- S304: Take the window centered on the center of the fifth detection window, with the shorter side of the fifth detection window as its side length, as the second detection window.
- the length of the fifth detection window is 66 pixels and the width is 60 pixels. Then, taking the center of the fifth detection window as the center and the width of the fifth detection window as the side length, the obtained second detection window is a square with a side length of 60 pixels.
- the identification device takes the window centered on the center of the first detection window, with the long side of the first detection window as its side length, as the third detection window.
- each side of the third detection window is expanded to obtain the fourth detection window. The part of the fourth detection window that exceeds the corresponding side length of the face image to be tested is removed to obtain the fifth detection window.
- the window centered on the center of the fifth detection window, with the short side of the fifth detection window as its side length, is the second detection window.
- the recognition device obtains a second detection window by expanding the first detection window, and the face area selected by the face image to be tested is larger and includes more face information. Facial features are extracted from the face area, and the facial features obtained are more accurate. By predicting the angle of the face with more accurate facial features, a more accurate prediction angle can be obtained.
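The window expansion steps above can be sketched as follows. This is an illustrative reading rather than the applicant's code: the function name `expand_window`, the `(x0, y0, x1, y1)` box layout, and the `(width, height)` image size convention are assumptions, and the expansion coefficient is applied as a multiplier of `1 + coeff` on the side length, which matches the 60 to 66 pixel example.

```python
def expand_window(box, image_size, coeff=0.1):
    """Turn the first detection window into the second detection window.

    box: (x0, y0, x1, y1) of the first detection window.
    image_size: (width, height) of the face image to be tested.
    """
    x0, y0, x1, y1 = box
    img_w, img_h = image_size
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2

    # Third window: square on the same center, side = longer side.
    side = max(x1 - x0, y1 - y0)

    # Fourth window: enlarge every side by the expansion coefficient.
    half = side * (1 + coeff) / 2

    # Fifth window: remove whatever sticks out of the image.
    fx0, fy0 = max(cx - half, 0), max(cy - half, 0)
    fx1, fy1 = min(cx + half, img_w), min(cy + half, img_h)

    # Second window: square on the fifth window's center, side = shorter side.
    fcx, fcy = (fx0 + fx1) / 2, (fy0 + fy1) / 2
    short = min(fx1 - fx0, fy1 - fy0)
    return (fcx - short / 2, fcy - short / 2,
            fcx + short / 2, fcy + short / 2)


# Example from the text: a 60 x 40 window centered in a 90 x 60 image.
second_window = expand_window((15, 10, 75, 50), (90, 60), coeff=0.1)
```

For the example values, the intermediate fifth window is 66 by 60 pixels and the returned second window is a square with a side length of 60 pixels, as in the description.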
- this application also provides a generation process of a face angle recognition model including a backbone network and a fully connected classification network.
- when the recognition device obtains the facial features corresponding to the face area, it obtains them through the backbone network in the face angle recognition model.
- when the recognition device determines the multiple angle probabilities from the facial features, it obtains them through the fully connected classification network in the face angle recognition model.
- the generation process of the face angle recognition model can be completed by a model generation device, or it can be generated by other feasible devices, which will not be described again here.
- FIG. 4 shows a schematic flowchart of generating a face angle recognition model according to an embodiment of the present application.
- the process of generating the face angle recognition model includes:
- the sample face image set includes multiple frames of sample face images and the real angle corresponding to each angle type of the face in each frame of the sample face image relative to the shooting position.
- the sample face image set at least includes a set of sample face images and real angles corresponding to each angle type of the face in the sample face image relative to the shooting position.
- the sample face image set can be selected from an existing image data set (for example, the public data set 300W-LP), or it can be a face image captured by a camera in advance.
- the camera that captures face images can be a camera, a smartphone camera, a laptop camera, or a tablet camera.
- the real angle corresponding to the sample face image can be obtained by using relevant sensors or by manual annotation.
- S402 Perform data enhancement processing on each frame of the sample face image to obtain an enhanced sample face image.
- Data enhancement processing may include one or a combination of random cropping, adding random noise, and color perturbation.
- for example, each frame of sample face image is randomly cropped, random noise is added, and color perturbation is applied to obtain an enhanced sample face image.
- the original face angle recognition model includes the original backbone network and the original fully connected classification network.
- the output end of the original backbone network is connected to three fully connected layers of the original fully connected classification network.
- The three fully connected layers are the first original fully connected layer, the second original fully connected layer, and the third original fully connected layer.
- The first original fully connected layer and the second original fully connected layer each have 36 nodes, and the third original fully connected layer has 72 nodes.
- The 36 nodes of the first and second original fully connected layers are used to predict the angle probabilities of the yaw angle and the pitch angle, respectively, in their 36 angle intervals, and the 72 nodes of the third original fully connected layer are used to predict the angle probabilities of the roll angle in its 72 angle intervals.
- After the model generation device inputs the sample image set into the original backbone network, the backbone network outputs facial features.
- The first original fully connected layer obtains the angle probabilities corresponding to the 36 angle intervals of the yaw angle based on the facial features.
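- The three-layer classification network described above can be sketched in plain Python. The sketch makes several assumptions that the text does not state: a softmax turns each layer's outputs into probabilities, the 72-node layer is taken to serve the roll angle, and the feature dimension and weight initialisation are arbitrary. The backbone is omitted, and `features` stands in for its output.

```python
import math
import random

# Node counts per angle type; roll = 72 is inferred from the three angle types.
HEAD_SIZES = {"yaw": 36, "pitch": 36, "roll": 72}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Fully connected layer: one output per weight row."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def make_head(n_nodes, feature_dim, rng):
    """Randomly initialised fully connected layer (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
               for _ in range(n_nodes)]
    return weights, [0.0] * n_nodes


def predict_probabilities(features, heads):
    """Feed the same facial features to every head; return one distribution each."""
    return {name: softmax(linear(w, b, features))
            for name, (w, b) in heads.items()}
```

- Each head sees the same feature vector, so the three angle types share the backbone's computation and differ only in their final layer.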
- the model generation device first calculates a loss function based on multiple angle probabilities of each angle type and the real angle corresponding to each angle type, and then adjusts the model parameters of the original angle recognition model through the loss function.
- the above loss function is the cross entropy loss function.
- The loss function can also be another type of loss function, which will not be described in detail here.
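- As a rough illustration of the cross entropy loss over angle intervals, the sketch below converts a real angle into an interval index and scores the predicted probabilities against it. The angle ranges ([-90, 90) for yaw and pitch, [-180, 180) for roll), the equal-width intervals, and the summation of the three losses are assumptions made for this example; the application only refers to preset rules for dividing the ranges.

```python
import math

# Assumed angle ranges per angle type: (low, high) in degrees.
ANGLE_RANGES = {"yaw": (-90.0, 90.0), "pitch": (-90.0, 90.0), "roll": (-180.0, 180.0)}


def angle_to_interval(angle, low, high, n_intervals):
    """Index of the equal-width interval of [low, high) that contains angle."""
    width = (high - low) / n_intervals
    index = int((angle - low) // width)
    return min(max(index, 0), n_intervals - 1)  # clamp out-of-range angles


def cross_entropy(probabilities, target_index, eps=1e-12):
    """Negative log-probability assigned to the true interval."""
    return -math.log(probabilities[target_index] + eps)


def total_loss(probabilities_by_type, real_angles):
    """Sum the per-angle-type cross entropy losses (assumed combination rule)."""
    loss = 0.0
    for name, probs in probabilities_by_type.items():
        low, high = ANGLE_RANGES[name]
        target = angle_to_interval(real_angles[name], low, high, len(probs))
        loss += cross_entropy(probs, target)
    return loss
```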
- The model generation device trains the original angle recognition model with the loss function through an error backpropagation algorithm to obtain a trained face angle recognition model, and takes the trained model as the face angle recognition model.
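- To make the parameter adjustment concrete, the following sketch performs gradient-descent steps on a single bias-free fully connected layer with softmax and cross entropy. It is only a toy: a real training run would also propagate the error into the backbone network, and the learning rate `lr` and the function name `train_step` are hypothetical. It relies on the standard result that the gradient of the cross entropy with respect to each logit is the predicted probability minus the one-hot target.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(weights, features, target_index, lr):
    """Update weights in place for one sample; return the loss before the update."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = -math.log(probs[target_index])
    for i, row in enumerate(weights):
        # d(loss)/d(logit_i) = p_i - 1 for the true interval, p_i otherwise.
        grad_logit = probs[i] - (1.0 if i == target_index else 0.0)
        for j, f in enumerate(features):
            row[j] -= lr * grad_logit * f
    return loss
```

- Calling `train_step` repeatedly on the same sample should drive the loss down, which is a quick sanity check that the gradient has the right sign.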
- When the model generation device generates the face angle recognition model, it first obtains sample face images and performs data enhancement processing on each frame of the sample face image to obtain enhanced sample face images. It then inputs the enhanced sample face images into the original angle recognition model, which outputs multiple angle probabilities for each angle type, and adjusts the model parameters of the original angle recognition model based on the multiple angle probabilities for each angle type and the real angle corresponding to each angle type.
- The adjusted original angle recognition model is determined as the face angle recognition model. By dividing the angle ranges of the three angle types according to preset rules, and adjusting the original angle recognition model using the multiple angle probabilities and the real angles corresponding to the three angle types, a face angle recognition model that predicts face angles more accurately can be obtained.
- this application also provides a face angle prediction device.
- FIG. 5 shows a schematic block diagram of a face angle prediction device provided by an embodiment of the present application.
- a face angle prediction device provided by an embodiment of the present application includes an acquisition module 501 , a first determination module 502 , a second determination module 503 and a third determination module 504 .
- the acquisition module 501 is used to acquire the face area of the face image to be tested
- the first determination module 502 is used to determine the facial features corresponding to the facial area
- the second determination module 503 is used to determine multiple angle probabilities of multiple angle types based on the facial features.
- The multiple angle probabilities of each angle type respectively correspond to multiple angle intervals of that angle type, and the multiple angle types include the yaw angle, the pitch angle, and the roll angle.
- the third determination module 504 is configured to determine the predicted angle of each angle type of the human face in the face image to be measured relative to the shooting position based on multiple angle probabilities of each angle type.
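- The division of work among modules 501 to 504 can be pictured as a simple pipeline. The sketch below is one possible software arrangement, with each module supplied as a callable; the class name, field names, and signatures are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class FaceAnglePredictionDevice:
    acquire_face_area: Callable[[Any], Any]                       # acquisition module 501
    extract_features: Callable[[Any], Any]                        # first determination module 502
    predict_probabilities: Callable[[Any], Dict[str, List[float]]]  # second determination module 503
    decode_angle: Callable[[str, List[float]], float]             # third determination module 504

    def predict(self, image: Any) -> Dict[str, float]:
        """Run the four modules in order; return one predicted angle per angle type."""
        face_area = self.acquire_face_area(image)
        features = self.extract_features(face_area)
        probabilities = self.predict_probabilities(features)
        return {name: self.decode_angle(name, probs)
                for name, probs in probabilities.items()}
```

- Passing the modules in as callables keeps the pipeline independent of how each step is realised, which matches the text's remark that the modules may be hardware or software.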
- the third determination module is specifically used for:
- the angle of the face in the face image to be measured relative to the shooting position is determined based on multiple angle probabilities, the number of angle intervals, and the intermediate angle of each angle interval.
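- One plausible reading of this step is a probability-weighted sum of the interval midpoints. The sketch below shows that reading only; the exact formula used in the application is not reproduced in this passage, and the equal-width intervals over `[low, high)` are an assumption.

```python
def interval_midpoints(low, high, n_intervals):
    """Intermediate (midpoint) angle of each equal-width interval of [low, high)."""
    width = (high - low) / n_intervals
    return [low + (i + 0.5) * width for i in range(n_intervals)]


def expected_angle(probabilities, low, high):
    """Probability-weighted sum of the interval midpoints."""
    mids = interval_midpoints(low, high, len(probabilities))
    return sum(p * m for p, m in zip(probabilities, mids))
```

- With this decoding, probability mass split between two neighbouring intervals yields an angle between their midpoints, so the prediction is not limited to the interval resolution.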
- the third determination module 504 is specifically used to:
- a maximum probability angle interval is determined, and the maximum probability angle interval is an angle interval corresponding to the maximum value among multiple angle probabilities;
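- Finding the maximum probability angle interval amounts to an argmax over the angle probabilities. A minimal sketch, assuming equal-width intervals over `[low, high)` (the function name and return shape are illustrative):

```python
def max_probability_interval(probabilities, low, high):
    """Return the index and (start, end) bounds of the most probable interval."""
    n = len(probabilities)
    width = (high - low) / n
    index = max(range(n), key=lambda i: probabilities[i])
    start = low + index * width
    return index, (start, start + width)
```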
- the third determination module 504 is specifically used to:
- the acquisition module 501 is specifically used to:
- the method of obtaining the face area of the face image to be tested includes:
- the acquisition module 501 is specifically used for:
- each side length of the third detection window is expanded to obtain a fourth detection window
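- Expanding each side length of a detection window can be sketched as scaling the box about its centre. The expansion ratio, the `(x1, y1, x2, y2)` box layout, and the clipping to the image bounds are assumptions for this example; the passage does not state them.

```python
def expand_window(window, ratio, image_width, image_height):
    """Scale each side length by ratio about the centre, clipped to the image."""
    x1, y1, x2, y2 = window
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * ratio / 2
    half_h = (y2 - y1) * ratio / 2
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(image_width), cx + half_w),
            min(float(image_height), cy + half_h))
```

- For instance, `expand_window((40, 40, 60, 60), 1.5, 100, 100)` grows a 20-pixel square to 30 pixels per side around the same centre.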
- the first determination module 502 is specifically used to:
- the backbone network is used to extract the face features in the face image
- the facial features are input into the fully connected classification network of the face angle recognition model, and multiple angle probabilities of each angle type are output.
- The fully connected classification network is used to predict, from the facial features, the angle probabilities corresponding to the multiple angle intervals of each angle type.
- the model generation device is used for:
- Obtain a sample face image set which includes a multi-frame sample face image and a true angle corresponding to each angle type of the face in each frame of the sample face image relative to the shooting position;
- the original face angle recognition model includes the original backbone network and the original fully connected classification network;
- the adjusted original angle recognition model is determined as the face angle recognition model.
- The device 500 of the present application can be implemented through an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- The above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
- the face angle prediction method shown in Figure 1 can also be implemented through software.
- the device 500 and its respective modules can also be software modules.
- Figure 6 is a schematic structural diagram of an electronic device provided by this application.
- the device 600 includes a processor 601, a memory 602, a communication interface 603 and a bus 604.
- the processor 601, the memory 602, and the communication interface 603 communicate through the bus 604. Communication can also be achieved through other means such as wireless transmission.
- the memory 602 is used to store instructions, and the processor 601 is used to execute the instructions stored in the memory 602.
- the memory 602 stores program code 6021, and the processor 601 can call the program code 6021 stored in the memory 602 to execute the face angle prediction method shown in Figure 2.
- The processor 601 may be a CPU, or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- a general-purpose processor can be a microprocessor or any conventional processor, etc.
- the memory 602 may include read-only memory and random access memory and provides instructions and data to the processor 601. Memory 602 may also include non-volatile random access memory.
- the memory 602 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
- Volatile memory can be random access memory (RAM), which serves as an external cache. Available forms of RAM include static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
- bus 604 may also include a power bus, a control bus, a status signal bus, etc. However, for clarity of illustration, the various buses are labeled bus 604 in FIG. 6 .
- the electronic device 600 may correspond to the device 500 in the present application, and may correspond to the device in the method shown in FIG. 1 of the present application.
- the device 600 corresponds to the device in the method shown in FIG. 2 of the present application.
- the above and other operations and/or functions of each module in the device 600 are respectively intended to implement the operating steps of the method performed by the device in Figure 2. For the sake of brevity, they will not be described again here.
- This application also provides a computer-readable storage medium that stores a computer program.
- When the computer program is executed by a processor, the steps in each of the above method embodiments can be implemented.
- the present application provides a computer program product.
- When the computer program product runs on an electronic device, the electronic device can implement the steps in each of the above method embodiments.
- The sequence number of each step in the above embodiments does not imply the order of execution.
- the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of this application.
- In practice, the functions may be assigned to different functional units or modules; that is, the internal structure of the above device is divided into different functional units or modules to complete all or part of the functions described above.
- Each functional unit and module in the embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
- The above-mentioned integrated unit can be implemented in the form of hardware or in the form of software functional units.
- the specific names of each functional unit and module are only for the convenience of distinguishing each other and are not used to limit the scope of protection of the present application.
- For the specific working processes of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments; they will not be described again here.
- the disclosed devices/network devices and methods can be implemented in other ways.
- the device/network equipment embodiments described above are only illustrative.
- The division of the above modules or units is only a logical function division. In actual implementation, there may be other division methods; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
- the units described above as separate components may or may not be physically separated.
- the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this application.
Abstract
The present invention is applicable to the technical field of image recognition. The present invention relates to a method and apparatus for predicting a facial angle, as well as a device and a readable storage medium. The method comprises: acquiring a facial area of a facial image to be detected; determining facial features corresponding to the facial area; determining, according to the facial features, a plurality of angle probabilities for each angle type among a plurality of angle types, the plurality of angle types comprising a yaw angle, a pitch angle, and a roll angle; and determining, according to the plurality of angle probabilities for each angle type, a predicted angle of each angle type of a face in the facial image relative to a photographing position. Thus, in the present invention, angle probabilities respectively corresponding to a plurality of angle intervals are acquired, and a predicted angle is calculated by means of the angle probabilities respectively corresponding to the plurality of angle intervals, such that the accuracy of the predicted angle is guaranteed. The present invention is applicable to a plurality of scenarios in which precise facial angles need to be determined.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210607682.7A CN117197853A (zh) | 2022-05-31 | 2022-05-31 | Face angle prediction method, apparatus, device, and readable storage medium |
CN202210607682.7 | 2022-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023231400A1 (fr) | 2023-12-07 |
Family
ID=89003944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/142276 WO2023231400A1 (fr) | Method and apparatus for predicting facial angle, and device and readable storage medium | 2022-05-31 | 2022-12-27 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117197853A (fr) |
WO (1) | WO2023231400A1 (fr) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
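CN110647865A (zh) * | 2019-09-30 | 2020-01-03 | 腾讯科技(深圳)有限公司 | Face pose recognition method, apparatus, device, and storage medium |
CN111814613A (zh) * | 2020-06-24 | 2020-10-23 | 浙江大华技术股份有限公司 | Face recognition method, device, and computer-readable storage medium |
CN112084856A (zh) * | 2020-08-05 | 2020-12-15 | 深圳市优必选科技股份有限公司 | Face pose detection method, apparatus, terminal device, and storage medium |
CN112818969A (zh) * | 2021-04-19 | 2021-05-18 | 南京烽火星空通信发展有限公司 | Face pose estimation method and system based on knowledge distillation |
CN112883918A (zh) * | 2021-03-22 | 2021-06-01 | 深圳市百富智能新技术有限公司 | Face detection method, apparatus, terminal device, and computer-readable storage medium |
CN113971833A (zh) * | 2021-11-29 | 2022-01-25 | 成都新潮传媒集团有限公司 | Multi-angle face recognition method, apparatus, computer host device, and storage medium |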
-
2022
- 2022-05-31 CN CN202210607682.7A patent/CN117197853A/zh active Pending
- 2022-12-27 WO PCT/CN2022/142276 patent/WO2023231400A1/fr unknown
Also Published As
Publication number | Publication date |
---|---|
CN117197853A (zh) | 2023-12-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22944701; Country of ref document: EP; Kind code of ref document: A1 |