WO2020108041A1 - Ear key point detection method, device and storage medium - Google Patents

Ear key point detection method, device and storage medium

Info

Publication number
WO2020108041A1
WO2020108041A1 (PCT/CN2019/107104)
Authority
WO
WIPO (PCT)
Prior art keywords
ear
area
key point
key points
region
Application number
PCT/CN2019/107104
Other languages
English (en)
French (fr)
Inventor
李宣平
李岩
张国鑫
Original Assignee
北京达佳互联信息技术有限公司
Application filed by 北京达佳互联信息技术有限公司 (Beijing Dajia Internet Information Technology Co., Ltd.)
Publication of WO2020108041A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Definitions

  • the present application belongs to the field of image processing, and particularly relates to an ear key point detection method, device and storage medium.
  • in the related art, a circular region of a certain size is moved within a face image containing an ear region, and the part of the image inside the circular region is scanned. Because pixels in different parts of the ear region have different gray levels, pixels with prominent gray levels are determined as outer auricle edge points; multiple outer auricle edge points are obtained by moving the circular region multiple times, the ear region of the face image is determined from those edge points, and the ear key points are determined from the gray level of each pixel in the ear region.
  • the present application discloses a method, device and storage medium for detecting key points of the ear.
  • a method for detecting key points of an ear includes:
  • acquiring a face image, where the face image includes face contour key points, and the face contour key points are used to determine an ear region in the face image;
  • acquiring an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • an ear key point detection device includes:
  • an image acquisition unit configured to acquire a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
  • a model acquisition unit configured to acquire an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
  • a determination unit configured to detect the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • an ear key point detection device includes:
  • a processor;
  • a memory for storing instructions executable by the processor;
  • wherein the processor is configured to:
  • acquire a face image, where the face image includes face contour key points, and the face contour key points are used to determine an ear region in the face image;
  • acquire an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detect the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • a non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a detection device, the detection device is enabled to perform an ear key point detection method, the method including:
  • acquiring a face image, where the face image includes face contour key points, and the face contour key points are used to determine an ear region in the face image;
  • acquiring an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • an application program/computer program product is provided; when instructions in the application program/computer program product are executed by a processor of a detection device, the detection device is enabled to perform an ear key point detection method, the method including:
  • acquiring a face image, where the face image includes face contour key points, and the face contour key points are used to determine an ear region in the face image;
  • acquiring an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • by acquiring a face image including face contour key points and acquiring an ear key point detection model, where the face contour key points are used to determine the ear region in the face image and the ear key point detection model is used to detect ear key points in an ear region, the ear key points in the face image are detected based on the ear key point detection model and the face contour key points.
  • by using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account; in addition, the ear key point detection model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
  • Fig. 1 is a flowchart of a method for detecting key points of an ear according to an exemplary embodiment
  • Fig. 2 is a flowchart of a method for detecting key points of an ear according to an exemplary embodiment
  • Fig. 3 is a schematic diagram of a face image according to an exemplary embodiment
  • Fig. 4 is a flow chart of a method for detecting key points of an ear according to an exemplary embodiment
  • Fig. 5 is a block diagram of an ear key point detection device according to an exemplary embodiment
  • Fig. 6 is a block diagram of a terminal for key point detection of an ear according to an exemplary embodiment
  • Fig. 7 is a schematic structural diagram of a server according to an exemplary embodiment.
  • Fig. 1 is a flowchart of an ear key point detection method according to an exemplary embodiment. As shown in Fig. 1, the ear key point detection method is used in a detection device and includes the following steps:
  • In step 101, a face image is obtained; the face image includes face contour key points, and the face contour key points are used to determine the ear region in the face image.
  • In step 102, an ear key point detection model is obtained; the ear key point detection model is used to detect ear key points in any ear region.
  • In step 103, the ear key points in the face image are detected based on the ear key point detection model and the positions of the face contour key points in the face image.
  • In the method provided by the embodiments of the present application, a face image including face contour key points is acquired and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points.
  • by using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account; in addition, the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
  • in a possible implementation, detecting the ear key points in the face image includes:
  • determining a first ear region and a second ear region in the face image according to the positions of the face contour key points in the face image;
  • detecting the ear key points in the first ear region and in the second ear region based on the ear key point detection model, the first ear region and the second ear region;
  • determining the position of each ear key point in the face image according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
  • determining the first ear region and the second ear region in the face image according to the positions of the face contour key points in the face image includes:
  • acquiring a first designated key point and a second designated key point among the face contour key points;
  • determining the first ear region including the first designated key point, and the second ear region including the second designated key point.
  • the first ear region belongs to a first type ear region, and the second ear region belongs to a second type ear region; the first type ear region is an ear region located on a first side of the human face, and the second type ear region is an ear region located on a second side of the human face;
  • detecting the ear key points in the first ear region and the ear key points in the second ear region includes:
  • horizontally flipping the first ear region to obtain a third ear region, the third ear region belonging to the second type ear region;
  • determining the ear key points in the second ear region and the ear key points in the third ear region based on the ear key point detection model, the second ear region and the third ear region;
  • horizontally flipping the third ear region containing the ear key points to obtain the first ear region containing the ear key points.
  • the method further includes:
  • acquiring multiple sample images, each sample image including an ear region and the ear key points in the ear region;
  • extracting the ear regions from the multiple sample images respectively;
  • performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
  • performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model includes:
  • horizontally flipping the first type ear regions among the extracted ear regions to obtain flipped ear regions, the first type ear region being an ear region located on the first side of the human face;
  • determining the second type ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, the second type ear region being an ear region located on the second side of the human face;
  • performing model training according to the sample ear regions and the ear key points in the sample ear regions to obtain the ear key point detection model.
  • Fig. 2 is a flowchart of an ear key point detection method according to an exemplary embodiment. As shown in Fig. 2, the ear key point detection method is used in a detection device.
  • the detection device may be a mobile phone, a computer, a server, a camera, a monitoring device, or another device with image processing capability; the method includes the following steps:
  • In step 201, a face image is obtained; the face image includes face contour key points.
  • the face image may be captured by the detection device, extracted from a video image captured by the detection device, downloaded by the detection device from the Internet, or sent to the detection device by another device. Alternatively, while the detection device is live-streaming video, each image in the video stream may be obtained and treated as a face image to be detected, so that ear key point detection is performed on each image in the video stream.
  • the face image includes multiple face contour key points, that is, key points on the face contour in the face image, and the multiple face contour key points are connected to form a face contour.
  • for example, the face image includes 19 face contour key points, and the 19 face contour key points are evenly distributed along the face contour in the face image.
  • the multiple face contour key points are obtained by performing face detection on the face image.
  • the face detection algorithm used in the face detection process may be a recognition algorithm based on facial feature points, a template-based recognition algorithm, a neural-network-based recognition algorithm, or the like.
  • when the detection device acquires the original face image, it performs face detection on the face image to obtain the multiple face contour key points in the face image.
  • alternatively, another device performs face detection on the face image and, after obtaining the multiple face contour key points, sends the face image including the multiple face contour key points to the detection device.
  • In step 202, an ear key point detection model is obtained; the ear key point detection model is used to detect ear key points in any ear region.
  • based on the ear key point detection model, the ear key points in any ear region can be detected, so that the ear key points in the face image can be determined.
  • the ear key point detection model may be trained by the detection device and stored by the detection device, or it may be trained by another device, sent to the detection device, and stored by the detection device.
  • when training the ear key point detection model, an initial ear key point detection model is first constructed, and multiple sample images are acquired, each sample image including an ear region and the ear key points in the ear region; the ear regions are extracted from the multiple sample images, and model training is performed according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
  • during training, the multiple ear regions and the corresponding ear key points are divided into a training data set and a test data set; the multiple ear regions in the training data set are used as the input of the ear key point detection model, and the positions of the ear key points in the corresponding ear regions are used as the output, so that the ear key point detection model learns how to detect ear key points and acquires the ability to do so.
  • afterwards, each ear region in the test data set is input into the ear key point detection model, the positions of the test ear key points in the ear region are determined based on the model, those positions are compared with the positions of the annotated actual ear key points in the ear region, and the model is corrected according to the comparison result to improve its accuracy.
  • a preset training algorithm may be used when training the ear key point detection model, and the preset training algorithm may be a convolutional neural network algorithm, a decision tree algorithm, an artificial neural network algorithm, or the like.
  • the trained ear key point detection model can be a convolutional neural network model, a decision tree model or an artificial neural network model.
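  • To make the preceding training description concrete, below is a minimal sketch in Python/PyTorch of a convolutional key point regressor trained on ear-region crops. The architecture, the number of key points, and all hyperparameters are illustrative assumptions, not the patent's concrete design; the patent only states that a convolutional neural network, decision tree, or artificial neural network may be used.

```python
# Minimal sketch (PyTorch) of an ear key point regressor trained on
# ear-region crops. Architecture and hyperparameters are assumptions.
import torch
import torch.nn as nn

class EarKeypointNet(nn.Module):
    def __init__(self, num_keypoints: int = 10):  # assumed number of points
        super().__init__()
        self.num_keypoints = num_keypoints
        # Convolutional backbone over a fixed-size ear-region crop.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        # Head regresses (x, y) per key point, normalized to [0, 1]
        # relative to the crop (the "position in the ear region").
        self.head = nn.Linear(128 * 4 * 4, num_keypoints * 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.features(x).flatten(1)
        return torch.sigmoid(self.head(z)).view(-1, self.num_keypoints, 2)

# Training: ear regions from the training set are the inputs and the
# annotated key point positions are the targets, as the text describes.
model = EarKeypointNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(regions: torch.Tensor, targets: torch.Tensor) -> float:
    optimizer.zero_grad()
    loss = loss_fn(model(regions), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```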
  • In step 203, the first ear region and the second ear region in the face image are determined according to the positions of the face contour key points in the face image.
  • the detection device detects the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • the face contour key points are used to determine an ear region that includes the entire ear in the face image. Since there is a fixed relative positional relationship between the face contour and the ear region, the ear region in the face image can be determined from that relative positional relationship and the positions of the face contour key points in the face image, so that the ear key points can be detected.
  • since a face image usually includes a left ear region and a right ear region, a first ear region and a second ear region are determined, where the first ear region is the left ear region and the second ear region is the right ear region, or the first ear region is the right ear region and the second ear region is the left ear region.
  • a face image usually includes a face region, ear regions and other regions.
  • extracting the ear regions according to the face contour key points makes use of the prior knowledge that the ears are adjacent to the face contour: regions other than the ear regions are excluded and detection is performed only on the ear regions, which both reduces the amount of computation and eliminates interference from irrelevant regions, improving accuracy.
  • the multiple face contour key points are located at different positions in the face image, and their relative positional relationships with the ear regions also differ. Therefore, in order to extract accurate ear regions, the face contour key point closest to an ear region can first be determined as a designated key point, based on the positions of the multiple face contour key points on the face contour, and the ear region in the face image can then be determined from the designated key point.
  • in a possible implementation, a first designated key point and a second designated key point among the face contour key points are acquired, the first designated key point and the second designated key point being the face contour key points closest to the ear regions; the first ear region including the first designated key point and the second ear region including the second designated key point are then determined.
  • the first designated key point and the second designated key point are determined in advance according to the distances between the multiple face contour key points and the ears. For example, when a face detection algorithm yields a fixed number of face contour key points arranged in order along the face contour, the sequence numbers of the two key points closest to the ears can be determined in advance.
  • then, when that face detection algorithm is used to obtain a face image including multiple face contour key points, the first designated key point and the second designated key point can be determined from the face image according to the two predetermined sequence numbers, as sketched below.
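  • A sketch of this selection step: the snippet picks the two designated key points by fixed indices in the ordered contour list. The index values are assumptions for an evenly spaced 19-point contour, not values from the patent, which only says the two sequence numbers are fixed in advance.

```python
# Sketch: pick the designated key points by their fixed sequence numbers
# in the ordered face-contour key point list. The indices below are
# illustrative assumptions for a 19-point contour.
FIRST_DESIGNATED_IDX = 2    # assumed index nearest one ear
SECOND_DESIGNATED_IDX = 16  # assumed index nearest the other ear

def designated_key_points(contour_points):
    """contour_points: sequence of (x, y) tuples in fixed contour order."""
    return (contour_points[FIRST_DESIGNATED_IDX],
            contour_points[SECOND_DESIGNATED_IDX])
```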
  • optionally, after the first designated key point and the second designated key point are determined, the first ear region including the first designated key point and the second ear region including the second designated key point are determined with a fixed size and shape according to the positions of the two designated key points in the face image.
  • the size is set according to the size of a typical human face, so that the determined ear region can include the entire ear; the shape may be a rectangle, a circle, a shape similar to the human ear, or another shape.
  • the position of an ear region may be determined according to the relative positional relationship between the first and second designated key points and the corresponding ear regions.
  • as shown in Fig. 3, when the first designated key point and the second designated key point are the face contour key points closest to the earlobes, each of them may be used as the center of the ear region to be extracted, or as the center of the lower edge of the ear region to be extracted, to extract the first ear region and the second ear region.
  • when the first designated key point is the face contour key point closest to the left ear region in the face image and the second designated key point is the face contour key point closest to the right ear region, the first ear region is the left ear region and the second ear region is the right ear region; when the reverse holds, the first ear region is the right ear region and the second ear region is the left ear region.
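  • Putting the crop geometry just described into code, a minimal sketch follows. Anchoring the key point at the center of the region's lower edge follows the text; the concrete pixel sizes are assumptions.

```python
# Sketch of ear-region extraction from a designated face contour key point.
import numpy as np

def extract_ear_region(image: np.ndarray, key_point: tuple,
                       width: int = 96, height: int = 128) -> tuple:
    """Crop a fixed-size rectangle whose lower-edge center is the key point.

    Returns the crop and its top-left corner, which is needed later to map
    key point coordinates back into the face image.
    """
    x, y = key_point
    left = int(x - width / 2)
    top = int(y - height)          # key point sits on the lower edge
    h, w = image.shape[:2]
    # Clamp so the crop stays inside the image.
    left = max(0, min(left, w - width))
    top = max(0, min(top, h - height))
    crop = image[top:top + height, left:left + width]
    return crop, (left, top)
```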
  • In step 204, the ear key points in the first ear region and the ear key points in the second ear region are detected based on the ear key point detection model, the first ear region and the second ear region.
  • in a possible implementation, the detection device inputs the first ear region and the second ear region into the ear key point detection model and, based on the model, detects the ear key points of the first ear region and of the second ear region respectively, thereby determining the ear key points in each region.
  • In step 205, the position of each ear key point in the face image is determined according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
  • detecting the ear key points in the first and second ear regions in step 204 actually determines the positions of the ear key points within the ear regions in which they are located. Therefore, the position of an ear key point in the face image is determined from its position within its ear region and the position of that ear region in the face image.
  • in a possible implementation, a certain point in the face image (such as a designated key point) is determined as the origin of the ear region and a coordinate system is created; after the coordinates of an ear key point within its ear region are determined, they are added to the coordinates of the origin in the face image, yielding the coordinates of the ear key point in the face image and thereby its position in the face image.
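  • This coordinate superposition is a one-line translation; a sketch, reusing the crop origin returned by the extraction sketch above:

```python
# Sketch of mapping a key point from ear-region coordinates back to
# face-image coordinates by adding the region origin, as described above.
def to_image_coords(region_point, region_origin):
    """region_point: (x, y) inside the crop; region_origin: the crop's
    top-left corner in the face image (the 'origin' the text mentions)."""
    return (region_point[0] + region_origin[0],
            region_point[1] + region_origin[1])
```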
  • after ear key point detection is achieved through the above steps, various operations can be performed based on the ear key points in the face image. For example, during a video live stream, each image in the video stream can be obtained and, after the ear key points of each image are detected, virtual decorations, stickers, glow effects and the like can be added at the position of an ear key point to enhance the live-streaming effect.
  • In the method provided by the embodiments of the present application, a face image including face contour key points is acquired and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points.
  • by using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account; in addition, the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
  • in addition, a face image usually includes a face region, ear regions and other regions; extracting the ear regions according to the face contour key points makes use of the prior knowledge that the ears are adjacent to the face contour, excludes the regions other than the ear regions, and performs detection only on the ear regions, which both reduces the amount of computation and eliminates interference from irrelevant regions, improving accuracy.
  • in addition, after ear key point detection is achieved, the ear key points can be used as operation targets and various operations can be performed based on them, which extends the application functions, improves flexibility, and makes face images more entertaining.
  • Fig. 4 is a flowchart of a method for detecting key points of the ear according to an exemplary embodiment. As shown in Fig. 4, the method for detecting key points of the ear is used in a detection device.
  • the detection device may be a mobile phone, a computer, a server, a camera, a monitoring device, or another device with image processing capability; the method includes the following steps:
  • In step 401, a face image is obtained; the face image includes face contour key points, and the face contour key points are used to determine the ear region in the face image.
  • step 401 is similar to step 201 above; for details, refer to step 201, which will not be repeated here.
  • In step 402, an ear key point detection model is acquired.
  • since the ear key point detection model used in the embodiment shown in Fig. 2 needs to detect the left ear region and the right ear region separately, it must be trained on both the left and the right ear regions so that it learns how to detect ear key points in each, which makes the ear key point detection model highly complex.
  • to solve this problem, in the embodiments of the present application the ear regions are divided into first type ear regions and second type ear regions; the first type ear region is an ear region located on a first side of the human face, and the second type ear region is an ear region located on a second side of the human face. The ear key point detection model is used to detect ear key points in second type ear regions only, and no longer detects ear key points in first type ear regions.
  • here, the first type ear region is the left ear region and the second type ear region is the right ear region, or the first type ear region is the right ear region and the second type ear region is the left ear region.
  • accordingly, in the process of training the ear key point detection model, the type of each extracted ear region is determined, and the first type ear regions among the extracted ear regions are horizontally flipped to obtain flipped ear regions, so that the flipped ear regions belong to the second type; the second type ear regions among the extracted ear regions and the flipped ear regions are determined as sample ear regions, and model training is performed according to the sample ear regions and the ear key points in the sample ear regions to obtain the ear key point detection model, so that the model learns how to detect ear key points in second type ear regions.
  • since the ear key point detection model does not need to learn how to detect ear key points on both sides of the face, but only on one side, the complexity of the model is reduced and the training speed is improved.
  • during training, the multiple sample ear regions and the corresponding ear key points are divided into a training data set and a test data set; the multiple sample ear regions in the training data set are used as the input of the ear key point detection model, and the positions of the ear key points in the corresponding ear regions are used as the output, so that the model learns how to detect ear key points in second type ear regions and acquires the ability to do so.
  • afterwards, each sample ear region in the test data set is input into the ear key point detection model, and the positions of the test ear key points in the ear region are determined based on the model. If the sample ear region is an original second type ear region, the detected test ear key points are compared with the actual ear key points in that sample ear region, and the model is corrected according to the comparison result; if the sample ear region was obtained by flipping a first type ear region, the detected test ear key points are compared with the actual ear key points of the flipped first type ear region, and the model is corrected according to the comparison result.
  • horizontally flipping any ear region includes: determining the position of each pixel in the ear region and the central axis of the ear region; determining, from the position of each pixel and the position of the central axis, the target position of each pixel symmetric about the central axis; and exchanging the pixel information of each pixel with the pixel information of the pixel at the corresponding target position, thereby achieving the horizontal flip.
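  • On an image array, this pixel swap about the vertical central axis reduces to reversing the column order; a minimal sketch:

```python
# Sketch of the horizontal flip described above: every pixel is swapped
# with its mirror about the region's vertical central axis, which on an
# array is simply a reversal of the column order.
import numpy as np

def horizontal_flip(region: np.ndarray) -> np.ndarray:
    return region[:, ::-1].copy()
```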
  • In step 403, the first ear region and the second ear region in the face image are determined according to the positions of the face contour key points in the face image.
  • step 403 is similar to step 203 above; for details, refer to step 203, which will not be repeated here.
  • In step 404, the first ear region is horizontally flipped to obtain a third ear region; the third ear region belongs to the second type ear region.
  • In step 405, the ear key points in the second ear region and the ear key points in the third ear region are determined based on the ear key point detection model, the second ear region and the third ear region.
  • In step 406, the third ear region containing the ear key points is horizontally flipped to obtain the first ear region containing the ear key points.
  • in the embodiment shown in Fig. 2 above, both first type and second type ear regions can be detected based on the ear key point detection model, whereas in this embodiment of the present application the model can only detect second type ear regions.
  • therefore, before detection, the first ear region, which belongs to the first type, is horizontally flipped to obtain the third ear region, so that the third ear region belongs to the second type, and the third ear region is detected based on the ear key point detection model. After the ear key points in the third ear region are detected, the third ear region containing the ear key points is horizontally flipped again, thereby determining the ear key points in the first ear region and realizing detection of the first type ear region.
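  • A sketch of this flip-detect-flip pipeline, assuming a `model` callable that returns pixel-coordinate key points for a second type crop (the names follow the earlier sketches and are illustrative):

```python
# Sketch of steps 404-406 for a first type ear region when the model was
# trained only on second type regions. `model` is assumed to map a crop
# to a (K, 2) array of (x, y) key points in crop pixel coordinates.
import numpy as np

def detect_first_type(region: np.ndarray, model) -> np.ndarray:
    flipped = region[:, ::-1].copy()               # step 404: obtain the third region
    kps = np.asarray(model(flipped), dtype=float)  # step 405: detect on it
    kps[:, 0] = (region.shape[1] - 1) - kps[:, 0]  # step 406: mirror x back
    return kps                                     # key points in the first region
```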
  • In step 407, the position of each ear key point in the face image is determined according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
  • step 407 is similar to step 205 above; for details, refer to step 205, which will not be repeated here.
  • In the method provided by the embodiments of the present application, a face image is acquired and an ear key point detection model is acquired; the first ear region and the second ear region in the face image are determined according to the face contour key points; the first ear region is horizontally flipped to obtain a third ear region belonging to the second type; the ear key points in the ear regions are detected based on the model; the third ear region containing the ear key points is horizontally flipped to obtain the first ear region containing the ear key points; and the position of each ear key point in the face image is determined.
  • by using the face contour key points to determine the ear regions and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear regions and the face contour is taken into account, and the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
  • moreover, by dividing the ear regions into two types, the ear key point detection model detects ear key points in second type ear regions without detecting them in first type ear regions; when training the model, only the detection of ear key points on one side of the face needs to be learned, which reduces the model's complexity and improves the training speed.
  • Fig. 5 is a block diagram of a device for detecting key points of an ear according to an exemplary embodiment.
  • the device includes an image acquisition unit 501, a model acquisition unit 502 and a determination unit 503.
  • the image acquisition unit 501 is configured to acquire a face image, and the face image includes key points of the face contour, and the key points of the face contour are used to determine the ear region in the face image;
  • the model acquisition unit 502 is configured to acquire an ear key point detection model, and the ear key point detection model is used to detect ear key points in any ear region;
  • the determining unit 503 is configured to detect the ear key points in the face image based on the ear key point detection model and the position of the face contour key points in the face image.
  • In the device provided by the embodiments of the present application, a face image including face contour key points is acquired and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points.
  • by using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account; in addition, the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
  • the determining unit 503 includes:
  • a region determination subunit configured to determine the first ear region and the second ear region in the face image according to the positions of the face contour key points in the face image;
  • a key point determination subunit configured to detect the ear key points in the first ear region and the ear key points in the second ear region based on the ear key point detection model, the first ear region and the second ear region;
  • a position determination subunit configured to determine the position of each ear key point in the face image according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
  • the region determination subunit is further configured to acquire a first designated key point and a second designated key point among the face contour key points, and to determine the first ear region including the first designated key point and the second ear region including the second designated key point.
  • the first ear region belongs to a first type ear region, and the second ear region belongs to a second type ear region; the first type ear region is an ear region located on a first side of the human face, and the second type ear region is an ear region located on a second side of the human face;
  • the key point determination subunit is further configured to horizontally flip the first ear region to obtain a third ear region, the third ear region belonging to the second type ear region; to determine the ear key points in the second ear region and the ear key points in the third ear region based on the ear key point detection model, the second ear region and the third ear region; and to horizontally flip the third ear region containing the ear key points to obtain the first ear region containing the ear key points.
  • the device further includes:
  • an acquisition unit configured to acquire multiple sample images, each sample image including an ear region and the ear key points in the ear region;
  • an extraction unit configured to extract the ear regions from the multiple sample images respectively;
  • a training unit configured to perform model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
  • in another possible implementation, the training unit includes:
  • a flip subunit configured to horizontally flip the first type ear regions among the extracted ear regions to obtain flipped ear regions, the first type ear region being an ear region located on the first side of the human face;
  • a sample determination subunit configured to determine the second type ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, the second type ear region being an ear region located on the second side of the human face;
  • the training unit is further configured to perform model training according to the sample ear regions and the ear key points in the sample ear regions to obtain the ear key point detection model.
  • Fig. 6 is a block diagram of a terminal 600 for key point detection of an ear according to an exemplary embodiment.
  • the terminal 600 is used to perform the steps performed by the detection device in the ear key point detection method described above, and may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • the terminal 600 may also be called other names such as user equipment, portable terminal, laptop terminal, and desktop terminal.
  • the terminal 600 includes a processor 601 and a memory 602.
  • the processor 601 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on.
  • the processor 601 may be implemented in at least one of the following hardware forms: digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA).
  • the processor 601 may also include a main processor and a coprocessor.
  • the main processor is a processor for processing data in a wake-up state, also called a central processing unit (Central Processing Unit, CPU); the coprocessor is A low-power processor for processing data in the standby state.
  • the processor 601 may be integrated with a graphics processor (Graphics Processing Unit, GPU), and the GPU is used to render and draw the content required to be displayed on the display screen.
  • the processor 601 may further include an artificial intelligence (Artificial Intelligence, AI) processor, which is used to process computing operations related to machine learning.
  • the memory 602 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more disk storage devices and flash storage devices.
  • the non-transitory computer-readable storage medium in the memory 602 is used to store at least one instruction, and the at least one instruction is executed by the processor 601 to implement the ear key point detection method provided by the method embodiments of the present application.
  • the terminal 600 may optionally include a peripheral device interface 603 and at least one peripheral device.
  • the processor 601, the memory 602, and the peripheral device interface 603 may be connected by a bus or a signal line.
  • Each peripheral device may be connected to the peripheral device interface 603 through a bus, a signal line, or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 604, a touch display screen 605, a camera 606, an audio circuit 607, a positioning component 608, and a power supply 609.
  • the peripheral device interface 603 may be used to connect at least one peripheral device related to input/output (Input/Output, I/O) to the processor 601 and the memory 602.
  • in some embodiments, the processor 601, the memory 602, and the peripheral device interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral device interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 604 is used to receive and transmit radio frequency (Radio Frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 604 communicates with a communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 604 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • optionally, the radio frequency circuit 604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on.
  • the radio frequency circuit 604 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or wireless fidelity (WiFi) networks.
  • the radio frequency circuit 604 may further include a circuit related to near field communication (Near Field Communication, NFC), which is not limited in this application.
  • the display screen 605 is used to display a user interface (User Interface, UI).
  • the UI may include graphics, text, icons, video, and any combination thereof.
  • the display screen 605 also has the ability to collect touch signals on or above the surface of the display screen 605.
  • the touch signal may be input to the processor 601 as a control signal for processing.
  • the display screen 605 can also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • in some embodiments, there may be one display screen 605, provided on the front panel of the terminal 600; in other embodiments, there may be at least two display screens 605, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display screen 605 may be a flexible display screen disposed on a curved or folded surface of the terminal 600. The display screen 605 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 605 may be made of materials such as a liquid crystal display (LCD) or an organic light-emitting diode (OLED) display.
  • the camera component 606 is used to collect images or videos.
  • the camera assembly 606 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • the camera assembly 606 may also include a flash.
  • the flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, which can be used for light compensation at different color temperatures.
  • the audio circuit 607 may include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 601 for processing, or input them to the radio frequency circuit 604 to implement voice communication.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is used to convert the electrical signal from the processor 601 or the radio frequency circuit 604 into sound waves.
  • the speaker can be a traditional thin-film speaker or a piezoelectric ceramic speaker.
  • when the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as distance measurement.
  • the audio circuit 607 may further include a headphone jack.
  • the positioning component 608 is used to locate the current geographic location of the terminal 600 to implement navigation or location-based services (Location Based Services, LBS).
  • the positioning component 608 may be a positioning component based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • the power supply 609 is used to supply power to various components in the terminal 600.
  • the power source 609 may be alternating current, direct current, disposable batteries, or rechargeable batteries.
  • the rechargeable battery may support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • the terminal 600 further includes one or more sensors 610.
  • the one or more sensors 610 include, but are not limited to: an acceleration sensor 611, a gyro sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
  • the acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 600.
  • the acceleration sensor 611 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 601 may control the touch display 605 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 611.
  • the acceleration sensor 611 can also be used for game or user movement data collection.
  • the gyro sensor 612 can detect the body direction and the rotation angle of the terminal 600, and the gyro sensor 612 can cooperate with the acceleration sensor 611 to collect a 3D action of the user on the terminal 600. Based on the data collected by the gyro sensor 612, the processor 601 can realize the following functions: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 613 may be disposed on the side frame of the terminal 600 and/or the lower layer of the touch display 605.
  • the processor 601 can perform left-right hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 613.
  • the processor 601 controls the operability control on the UI interface according to the user's pressure operation on the touch screen 605.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 614 is used to collect the user's fingerprint, and the processor 601 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the user's identity based on the collected fingerprint. When the user's identity is recognized as trusted, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • the fingerprint sensor 614 may be provided on the front, back, or side of the terminal 600. When a physical button or manufacturer logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or manufacturer logo.
  • the optical sensor 615 is used to collect the ambient light intensity.
  • the processor 601 can control the display brightness of the touch display 605 according to the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display 605 is increased; when the ambient light intensity is low, the display brightness of the touch display 605 is decreased.
  • the processor 601 can also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
  • the proximity sensor 616, also called a distance sensor, is usually provided on the front panel of the terminal 600.
  • the proximity sensor 616 is used to collect the distance between the user and the front of the terminal 600.
  • in one embodiment, when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually decreases, the processor 601 controls the touch display 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually increases, the processor 601 controls the touch display 605 to switch from the screen-off state to the screen-on state.
  • those skilled in the art can understand that the structure shown in FIG. 6 does not constitute a limitation on the terminal 600, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
  • FIG. 7 is a schematic structural diagram of a server according to an exemplary embodiment.
  • the server 700 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 701 and one or more memories 702, where the memory 702 stores at least one instruction that is loaded and executed by the processor 701 to implement the methods provided by the foregoing method embodiments.
  • of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input and output.
  • the server may also include other components for implementing device functions, which will not be repeated here.
  • the server 700 may be used to perform the steps performed by the ear key point detection device in the ear key point detection method.
  • In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided; when instructions in the storage medium are executed by a processor of the detection device, the detection device is enabled to perform an ear key point detection method, the method including:
  • acquiring a face image, where the face image includes face contour key points, and the face contour key points are used to determine the ear region in the face image;
  • acquiring an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
  • In an exemplary embodiment, an application program/computer program product is also provided; when instructions in the application program/computer program product are executed by a processor of the detection device, the detection device is enabled to perform an ear key point detection method, the method including:
  • acquiring a face image, where the face image includes face contour key points, and the face contour key points are used to determine the ear region in the face image;
  • acquiring an ear key point detection model, where the ear key point detection model is used to detect ear key points in any ear region;
  • detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.

Abstract

The present application relates to an ear key point detection method, device and storage medium, and belongs to the field of image processing. The method includes: acquiring a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image; acquiring an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region; and detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image. By using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account, and the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.

Description

Ear key point detection method, device and storage medium
This application claims priority to Chinese Patent Application No. 201811437331.6, entitled "Ear key point detection method, device and storage medium" and filed with the Chinese Patent Office on November 28, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the field of image processing, and in particular relates to an ear key point detection method, device and storage medium.
Background
In recent years, with the rapid development and wide application of image processing technology, in fields such as virtual reality and short video it is often necessary to detect the ear key points in a face image and to operate on the ear region of the face image according to the detected ear key points, for example by adding decorations.
In the related art, a circular region of a certain size is moved within a face image containing an ear region, and the part of the face image inside the circular region is scanned. Because pixels in different parts of the ear region have different gray levels, pixels with prominent gray levels are determined as outer auricle edge points; multiple outer auricle edge points are obtained by moving the circular region multiple times, the ear region of the face image is determined from those edge points, and the ear key points are determined from the gray level of each pixel in the ear region.
The inventors found that the above scheme determines the ear key points only from the gray level of each pixel in the ear region, so the detected ear key points are not accurate enough and the error is large.
Summary
To overcome the problems in the related art, the present application discloses an ear key point detection method, device and storage medium.
According to a first aspect of the embodiments of the present application, an ear key point detection method is provided, the method including:
acquiring a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
acquiring an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
According to a second aspect of the embodiments of the present application, an ear key point detection device is provided, the device including:
an image acquisition unit configured to acquire a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
a model acquisition unit configured to acquire an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
a determination unit configured to detect the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
According to a third aspect of the embodiments of the present application, an ear key point detection device is provided, the device including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
acquire a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
acquire an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
detect the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
According to a fourth aspect of the embodiments of the present application, a non-transitory computer-readable storage medium is provided; when instructions in the storage medium are executed by a processor of a detection device, the detection device is enabled to perform an ear key point detection method, the method including:
acquiring a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
acquiring an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
According to a fifth aspect of the embodiments of the present application, an application program/computer program product is provided; when instructions in the application program/computer program product are executed by a processor of a detection device, the detection device is enabled to perform an ear key point detection method, the method including:
acquiring a face image, the face image including face contour key points, the face contour key points being used to determine an ear region in the face image;
acquiring an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
The technical solutions provided by the embodiments of the present application may include the following beneficial effects:
A face image including face contour key points is acquired, and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points. By using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account, and the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain the principles of the present application.
Fig. 1 is a flowchart of an ear key point detection method according to an exemplary embodiment;
Fig. 2 is a flowchart of an ear key point detection method according to an exemplary embodiment;
Fig. 3 is a schematic diagram of a face image according to an exemplary embodiment;
Fig. 4 is a flowchart of an ear key point detection method according to an exemplary embodiment;
Fig. 5 is a block diagram of an ear key point detection device according to an exemplary embodiment;
Fig. 6 is a block diagram of a terminal for ear key point detection according to an exemplary embodiment;
Fig. 7 is a schematic structural diagram of a server according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of devices and methods consistent with some aspects of the present application as detailed in the appended claims.
Fig. 1 is a flowchart of an ear key point detection method according to an exemplary embodiment. As shown in Fig. 1, the ear key point detection method is used in a detection device and includes the following steps:
In step 101, a face image is acquired; the face image includes face contour key points, and the face contour key points are used to determine an ear region in the face image.
In step 102, an ear key point detection model is acquired; the ear key point detection model is used to detect ear key points in any ear region.
In step 103, the ear key points in the face image are detected based on the ear key point detection model and the positions of the face contour key points in the face image.
In the method provided by the embodiments of the present application, a face image including face contour key points is acquired and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points. By using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account, and the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
In a possible implementation, detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image includes:
determining a first ear region and a second ear region in the face image according to the positions of the face contour key points in the face image;
detecting the ear key points in the first ear region and the ear key points in the second ear region based on the ear key point detection model, the first ear region and the second ear region;
determining the position of each ear key point in the face image according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
In another possible implementation, determining the first ear region and the second ear region in the face image according to the positions of the face contour key points in the face image includes:
acquiring a first designated key point and a second designated key point among the face contour key points;
determining the first ear region including the first designated key point, and the second ear region including the second designated key point.
In another possible implementation, the first ear region belongs to a first type ear region, the second ear region belongs to a second type ear region, the first type ear region is an ear region located on a first side of the human face, and the second type ear region is an ear region located on a second side of the human face;
detecting the ear key points in the first ear region and the ear key points in the second ear region based on the ear key point detection model, the first ear region and the second ear region includes:
horizontally flipping the first ear region to obtain a third ear region, the third ear region belonging to the second type ear region;
determining the ear key points in the second ear region and the ear key points in the third ear region based on the ear key point detection model, the second ear region and the third ear region;
horizontally flipping the third ear region containing the ear key points to obtain the first ear region containing the ear key points.
In another possible implementation, the method further includes:
acquiring multiple sample images, each sample image including an ear region and the ear key points in the ear region;
extracting the ear regions from the multiple sample images respectively;
performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
In another possible implementation, performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model includes:
horizontally flipping the first type ear regions among the extracted ear regions to obtain flipped ear regions, the first type ear region being an ear region located on the first side of the human face;
determining the second type ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, the second type ear region being an ear region located on the second side of the human face;
performing model training according to the sample ear regions and the ear key points in the sample ear regions to obtain the ear key point detection model.
Fig. 2 is a flowchart of an ear key point detection method according to an exemplary embodiment. As shown in Fig. 2, the ear key point detection method is used in a detection device; the detection device may be a mobile phone, a computer, a server, a camera, a monitoring device, or another device with image processing capability. The method includes the following steps:
In step 201, a face image is acquired; the face image includes face contour key points.
The face image may be captured by the detection device, extracted from a video image captured by the detection device, downloaded by the detection device from the Internet, or sent to the detection device by another device. Alternatively, while the detection device is live-streaming video, each image in the video stream may be acquired and treated as a face image to be detected, so that ear key point detection is performed on each image in the video stream.
The face image includes multiple face contour key points, that is, key points on the face contour in the face image; connected together, these face contour key points form the face contour. For example, the face image includes 19 face contour key points evenly distributed along the face contour in the face image.
The multiple face contour key points are obtained by performing face detection on the face image. The face detection algorithm used in the face detection process may be a recognition algorithm based on facial feature points, a template-based recognition algorithm, a neural-network-based recognition algorithm, or the like. When the detection device acquires the original face image, it performs face detection on the face image to obtain the multiple face contour key points in the face image. Alternatively, another device performs face detection on the face image, obtains the multiple face contour key points, and then sends the face image including the multiple face contour key points to the detection device.
In step 202, an ear key point detection model is acquired; the ear key point detection model is used to detect ear key points in any ear region.
In the embodiments of the present application, the ear key points in any ear region can be detected based on the ear key point detection model, so that the ear key points in the face image can be determined.
The ear key point detection model may be trained by the detection device and stored by the detection device, or it may be trained by another device, sent to the detection device, and stored by the detection device.
In a possible implementation, when training the ear key point detection model, an initial ear key point detection model is first constructed; multiple sample images are acquired, each sample image including an ear region and the ear key points in the ear region; the ear regions are extracted from the multiple sample images respectively; and model training is performed according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
During training, the multiple ear regions and the corresponding ear key points are divided into a training data set and a test data set. The multiple ear regions in the training data set are used as the input of the ear key point detection model, and the positions of the ear key points in the corresponding ear regions are used as the output, so that the ear key point detection model learns how to detect ear key points and acquires the ability to do so. Afterwards, each ear region in the test data set is input into the ear key point detection model, the positions of the test ear key points in the ear region are determined based on the model, these positions are compared with the positions of the annotated actual ear key points in the ear region, and the model is corrected according to the comparison result to improve its accuracy.
In a possible implementation, a preset training algorithm may be used when training the ear key point detection model; the preset training algorithm may be a convolutional neural network algorithm, a decision tree algorithm, an artificial neural network algorithm, or the like. Accordingly, the trained ear key point detection model may be a convolutional neural network model, a decision tree model, an artificial neural network model, or the like.
In step 203, the first ear region and the second ear region in the face image are determined according to the positions of the face contour key points in the face image.
In the embodiments of the present application, the detection device detects the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
The face contour key points are used to determine an ear region that includes the entire ear in the face image. Since there is a fixed relative positional relationship between the face contour and the ear region, the ear region in the face image can be determined from this relative positional relationship and the positions of the face contour key points in the face image, so that ear key point detection can be performed.
Since a face image usually includes both a left ear region and a right ear region, a first ear region and a second ear region are determined when determining the ear regions in the face image, where the first ear region is the left ear region and the second ear region is the right ear region, or the first ear region is the right ear region and the second ear region is the left ear region.
A face image usually includes a face region, ear regions and other regions. Extracting the ear regions according to the face contour key points makes use of the prior knowledge that the ears are adjacent to the face contour, excludes the regions other than the ear regions, and performs detection only on the ear regions, which both reduces the amount of computation and eliminates interference from irrelevant regions, improving accuracy.
The multiple face contour key points are located at different positions in the face image, and their relative positional relationships with the ear regions also differ. Therefore, in order to extract accurate ear regions, the face contour key point closest to an ear region can first be determined as a designated key point, based on the positions of the multiple face contour key points on the face contour, and the ear region in the face image can then be determined from the designated key point.
In a possible implementation, a first designated key point and a second designated key point among the face contour key points are acquired, the first designated key point and the second designated key point being the face contour key points closest to the ear regions; the first ear region including the first designated key point and the second ear region including the second designated key point are determined.
The first designated key point and the second designated key point are determined in advance according to the distances between the multiple face contour key points and the ears. For example, when a face detection algorithm yields a fixed number of face contour key points arranged in order along the face contour, the sequence numbers of the two key points closest to the ears can be determined in advance. Then, when that face detection algorithm is used to obtain a face image including multiple face contour key points, the first designated key point and the second designated key point can be determined from the face image according to the two predetermined sequence numbers.
Optionally, after the first designated key point and the second designated key point are determined, the first ear region including the first designated key point and the second ear region including the second designated key point are determined with a fixed size and shape according to the positions of the two designated key points in the face image. The size is set according to the size of a typical human face so that the determined ear region can include the entire ear; the shape may be a rectangle, a circle, a shape similar to the human ear, or another shape.
In addition, the position of an ear region may be determined according to the relative positional relationship between the designated key points and the corresponding ear regions. As shown in Fig. 3, when the first designated key point and the second designated key point are the face contour key points closest to the earlobes, each of them may be used as the center of the ear region to be extracted, or as the center of the lower edge of the ear region to be extracted, to extract the first ear region and the second ear region.
When the first designated key point is the face contour key point closest to the left ear region in the face image and the second designated key point is the face contour key point closest to the right ear region, the first ear region is the left ear region and the second ear region is the right ear region. When the first designated key point is the face contour key point closest to the right ear region and the second designated key point is the face contour key point closest to the left ear region, the first ear region is the right ear region and the second ear region is the left ear region.
In step 204, the ear key points in the first ear region and the ear key points in the second ear region are detected based on the ear key point detection model, the first ear region and the second ear region.
In a possible implementation, the detection device inputs the first ear region and the second ear region into the ear key point detection model and, based on the model, detects the ear key points of the first ear region and the ear key points of the second ear region respectively, thereby determining the ear key points in the first ear region and in the second ear region.
In step 205, the position of each ear key point in the face image is determined according to the determined position of each ear key point within the ear region in which it is located and the positions of the first ear region and the second ear region in the face image.
Detecting the ear key points in the first ear region and the second ear region in step 204 actually determines the positions of the ear key points within the ear regions in which they are located. Therefore, the position of an ear key point in the face image is determined from its position within its ear region and the position of that ear region in the face image.
In a possible implementation, a certain point in the face image (such as a designated key point) is determined as the origin of the ear region and a coordinate system is created; after the coordinates of an ear key point within its ear region are determined, they are added to the coordinates of the origin in the face image to obtain the coordinates of the ear key point in the face image, thereby determining the position of the ear key point in the face image.
After the detection of the ear key points is achieved through steps 201-205, various operations can be performed based on the ear key points in the face image. For example, during a video live stream, each image in the video stream can be acquired and, after the ear key points of each image are detected, virtual decorations, stickers, glow effects and the like can be added at the position of an ear key point to enhance the live-streaming effect. One such overlay operation is sketched below.
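A minimal sketch of the sticker overlay mentioned above, pasting a decoration onto a video frame centered on a detected ear key point. The array handling and placement details are illustrative assumptions, not part of the patent's method.

```python
# Sketch: paste a sticker (H, W, 3 array) onto a frame at an ear key point.
import numpy as np

def overlay_sticker(frame: np.ndarray, sticker: np.ndarray,
                    key_point: tuple) -> np.ndarray:
    """Center `sticker` on `key_point` = (x, y) in `frame`, clipped to bounds."""
    sh, sw = sticker.shape[:2]
    x, y = int(key_point[0] - sw / 2), int(key_point[1] - sh / 2)
    h, w = frame.shape[:2]
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + sw, w), min(y + sh, h)
    if x0 < x1 and y0 < y1:
        frame[y0:y1, x0:x1] = sticker[y0 - y:y1 - y, x0 - x:x1 - x]
    return frame
```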
In the method provided by the embodiments of the present application, a face image including face contour key points is acquired and an ear key point detection model is acquired; the face contour key points are used to determine the ear region in the face image, and the ear key point detection model is used to detect the ear key points in the ear region; the ear key points in the face image are then detected based on the ear key point detection model and the face contour key points. By using the face contour key points to determine the ear region and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear region and the face contour is taken into account, and the model has learned how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces errors.
In addition, a face image usually includes a face region, ear regions and other regions. Extracting the ear regions according to the face contour key points makes use of the prior knowledge that the ears are adjacent to the face contour, excludes the regions other than the ear regions, and performs detection only on the ear regions, which both reduces the amount of computation and eliminates interference from irrelevant regions, improving accuracy.
In addition, after the detection of the ear key points is achieved, the ear key points can be used as operation targets and various operations can be performed based on the ear key points in the face image, which extends the application functions, improves flexibility, and makes face images more entertaining.
图4是根据一示例性实施例示出的一种耳部关键点检测方法的流程图,如图4所示,该耳部关键点检测方法用于检测装置中,检测装置可以为手机、计算机、服务器、摄像头、监控设备等具有图像处理功能的装置,该方法包括以下步骤:
在步骤401中,获取人脸图像,人脸图像包括人脸轮廓关键点,人脸轮廓关键点用于确定人脸图像中的耳部区域。
该步骤401与上述步骤201类似,详细描述可参见上述步骤201,在此不再赘述。
在步骤402中,获取耳部关键点检测模型。
本申请实施例中,由于图2所示实施例采用的耳部关键点检测模型需要对左耳部区域和右耳部区域分别进行检测,这就需要训练耳部关键点检测模型时,根据左耳部区域和右耳部区域进行训练,以使对耳部关键点的检测方式进行学习,造成耳部关键点检测模型的复杂度高。
为了解决上述问题,本申请实施例中将耳部区域分为第一类耳部区域和第二类耳部区域,第一类耳部区域为位于人脸第一侧的耳部区域,第二类耳部区域为位于人脸第二侧的耳部区域,而耳部关键点检测模型用于检测第二类耳部区域中的耳部关键点,而不再检测第一类耳部区域中的耳部关键点。
其中,第一类耳部区域为左耳部区域,第二类耳部区域为右耳部区域,或者,第一类耳部区域为右耳部区域,第二类耳部区域为左耳部区域。
相应地,训练耳部关键点检测模型的过程中,确定提取的耳部区域所属的类型,将提取的耳部区域中的第一类耳部区域进行水平翻转,得到翻转后的耳部区域,以使翻转后的耳部区域属于第二类耳部区域,将提取的耳部区域中的第二类耳部区域和翻转后的耳部区域确定为样本耳部区域,根据样本耳部区域以及样本耳部区域中的耳部关键点进行模型训练,得到耳部关键点检测模型,以使耳部关键点检测模型能够学习到对第二类耳部区域中的耳部关键点的检测方式。由于耳部关键点检测模型无需学习人脸两侧的耳部关键点的检测方式,只需学习人脸一侧的耳部关键点检测方式即可,因此降低了耳部关键点检测模型的复杂度,提高了训练速度。
其中,在训练过程中,将多个样本耳部区域以及对应的耳部关键点划分为训练数据集和测试数据集,将训练数据集中的多个样本耳部区域作为耳部关键点的输入,将耳部关键点在对应耳部区域中的位置作为耳部关键点检测模型的输出,对耳部关键点检测模型进行训练,使耳部关键点检测模型对第二类耳部区域中的耳部关键点的检测方式进行学习,使耳部关键点检测模型具备检测第二类耳部区域中的耳部关键点的能力。之后,将测试数据集中的每个样本耳部区域输入到耳部关键点检测模型中,基于耳部关键点检测模型确定测试耳部关键点在所处耳部区域中的位置。如果该样本耳部区域为原始的第二类耳部区域,则将检测出的测试耳部关键点与该样本耳部区域中的实际耳部关键点进行对比,根据对比结果对耳部关键点检测模型进行修正,如果该样本耳部区域为第一类耳部区域经过翻转后得到的耳部区域,则将检测出的测试耳部关键点与第一类耳部区域经过翻转后实际的耳部关键点进行对比,根据对比结果对耳部关键点检测模型进行修正。
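A sketch of assembling this flip-augmented sample set, so that the model only ever sees the second-class orientation; it relies on the hflip_region helper sketched after the next paragraph, and the "first"/"second" class labels are assumed to be given with the data:

    def build_sample_set(regions, keypoints, classes):
        """Mirror first-class crops (and their key points) into the
        second-class orientation; keep second-class crops as they are."""
        samples = []
        for region, pts, cls in zip(regions, keypoints, classes):
            if cls == "first":
                region, pts = hflip_region(region, pts)
            samples.append((region, pts))
        return samples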
Horizontally flipping any ear region includes: determining the position of every pixel in the ear region and the vertical central axis of the region; determining, according to each pixel's position and the position of the central axis, the target position symmetric to that pixel about the axis; and exchanging the pixel information of each pixel with the pixel information of the pixel at the corresponding target position, thereby accomplishing the horizontal flip.
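In practice this pairwise pixel swap is exactly a mirror about the vertical axis, which numpy performs directly; a key point's x-coordinate is mirrored as x' = W - 1 - x for a region of width W. A sketch:

    import numpy as np

    def hflip_region(region, keypoints=None):
        """Mirror an ear region (and, optionally, its key points) about
        the region's vertical central axis."""
        flipped = np.fliplr(region).copy()
        if keypoints is None:
            return flipped
        w = region.shape[1]
        return flipped, [(w - 1 - x, y) for (x, y) in keypoints]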
In step 403, a first ear region and a second ear region in the face image are determined according to the positions of the face contour key points in the face image.
Step 403 is similar to step 203 above; for details, refer to step 203, which will not be repeated here.
In step 404, the first ear region is flipped horizontally to obtain a third ear region, the third ear region belonging to the second class of ear regions.
In step 405, the ear key points in the second ear region and the ear key points in the third ear region are determined based on the ear key point detection model, the second ear region, and the third ear region.
In step 406, the third ear region containing the ear key points is flipped horizontally to obtain the first ear region containing the ear key points.
In the embodiment shown in Fig. 2 above, both first-class and second-class ear regions can be detected based on the ear key point detection model, whereas in the embodiments of the present application only second-class ear regions can be detected based on the model.
Therefore, before detection is performed, the first ear region, which belongs to the first class of ear regions, is flipped horizontally to obtain a third ear region belonging to the second class, and the third ear region is detected based on the ear key point detection model. After the ear key points in the third ear region are detected, the third ear region containing the ear key points is flipped horizontally again, thereby determining the ear key points in the first ear region and accomplishing detection of the first-class ear region.
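Steps 404-406 thus compose into a flip, detect, flip-back pipeline; a sketch, assuming detect maps a second-class crop to key points in crop coordinates and reusing the hflip_region helper above:

    def detect_first_class_region(region, detect):
        flipped = hflip_region(region)                # step 404: flip into second class
        points = detect(flipped)                      # step 405: detect on the flipped crop
        w = region.shape[1]
        return [(w - 1 - x, y) for (x, y) in points]  # step 406: mirror points back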
In step 407, the position of each determined ear key point in the face image is determined according to the position of the ear key point within the ear region where it is located and the positions of the first and second ear regions in the face image.
Step 407 is similar to step 205 above; for details, refer to step 205, which will not be repeated here.
In the method provided by the embodiments of the present application, a face image is acquired and an ear key point detection model is acquired; a first ear region and a second ear region in the face image are determined according to the face contour key points; the first ear region is flipped horizontally to obtain a third ear region belonging to the second class of ear regions; the ear key points in the ear regions are detected based on the ear key point detection model; the third ear region containing the ear key points is flipped horizontally to obtain the first ear region containing the ear key points; and the position of each ear key point in the face image is determined. By using the face contour key points to determine the ear regions and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear regions and the face contour is taken into account, and the model learns how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces error.
Moreover, by dividing the ear regions into first-class and second-class ear regions and using the ear key point detection model to detect ear key points in second-class ear regions but not in first-class ear regions, the model need not learn how to detect ear key points on both sides of the face when it is trained, but only on one side, which reduces the complexity of the ear key point detection model and speeds up training.
Fig. 5 is a block diagram of an ear key point detection apparatus according to an exemplary embodiment. Referring to Fig. 5, the apparatus includes an image acquisition unit 501, a model acquisition unit 502, and a determining unit 503.
The image acquisition unit 501 is configured to acquire a face image, the face image including face contour key points used to determine the ear regions in the face image;
the model acquisition unit 502 is configured to acquire an ear key point detection model used to detect ear key points in any ear region;
the determining unit 503 is configured to detect the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
In the apparatus provided by the embodiments of the present application, a face image including face contour key points is acquired, and an ear key point detection model is acquired; the face contour key points are used to determine the ear regions in the face image, and the ear key point detection model is used to detect the ear key points in the ear regions; the ear key points in the face image are then detected based on the model and the face contour key points. By using the face contour key points to determine the ear regions and using the ear key point detection model to detect the ear key points in the face image, the positional relationship between the ear regions and the face contour is taken into account, and the model learns how to detect ear key points within an ear region, which improves the accuracy of the ear key points and reduces error.
In a possible implementation, the determining unit 503 includes:
a region determining subunit configured to determine a first ear region and a second ear region in the face image according to the positions of the face contour key points in the face image;
a key point determining subunit configured to detect ear key points in the first ear region and ear key points in the second ear region based on the ear key point detection model, the first ear region, and the second ear region;
a position determining subunit configured to determine the position of each determined ear key point in the face image according to the position of the ear key point within the ear region where it is located and the positions of the first and second ear regions in the face image.
In another possible implementation, the region determining subunit is further configured to acquire a first designated key point and a second designated key point from the face contour key points, and to determine the first ear region including the first designated key point and the second ear region including the second designated key point.
In another possible implementation, the first ear region belongs to a first class of ear regions and the second ear region belongs to a second class of ear regions, a first-class ear region being an ear region located on a first side of the face and a second-class ear region being an ear region located on a second side of the face;
the key point determining subunit is further configured to flip the first ear region horizontally to obtain a third ear region belonging to the second class of ear regions; to determine the ear key points in the second ear region and the ear key points in the third ear region based on the ear key point detection model, the second ear region, and the third ear region; and to flip the third ear region containing the ear key points horizontally to obtain the first ear region containing the ear key points.
In another possible implementation, the apparatus further includes:
an acquisition unit configured to acquire a plurality of sample images, each sample image including an ear region and the ear key points in the ear region;
an extraction unit configured to extract ear regions from the sample images respectively;
a training unit configured to perform model training according to the extracted ear regions and the ear key points in them, yielding the ear key point detection model.
In another possible implementation, the training unit includes:
a flipping subunit configured to flip the first-class ear regions among the extracted ear regions horizontally to obtain flipped ear regions, a first-class ear region being an ear region located on a first side of the face;
a sample determining subunit configured to determine the second-class ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, a second-class ear region being an ear region located on a second side of the face;
the training unit is further configured to perform model training according to the sample ear regions and the ear key points in the sample ear regions, yielding the ear key point detection model.
Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
Fig. 6 is a block diagram of a terminal 600 for ear key point detection according to an exemplary embodiment. The terminal 600 is used to perform the steps performed by the detection apparatus in the ear key point detection method above and may be a portable mobile terminal, such as a smartphone, a tablet computer, a Moving Picture Experts Group Audio Layer III (MP3) player, a Moving Picture Experts Group Audio Layer IV (MP4) player, a laptop, or a desktop computer. The terminal 600 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
Generally, the terminal 600 includes a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one of the following hardware forms: digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 601 may also include a main processor and a coprocessor: the main processor processes data in the awake state and is also called a central processing unit (CPU); the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 601 may be integrated with a graphics processing unit (GPU) responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an artificial intelligence (AI) processor for handling machine-learning computations.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 602 stores at least one instruction to be executed by the processor 601 to implement the ear key point detection method provided by the method embodiments of the present application.
In some embodiments, the terminal 600 optionally further includes a peripheral interface 603 and at least one peripheral. The processor 601, the memory 602, and the peripheral interface 603 may be connected by buses or signal lines, and each peripheral may be connected to the peripheral interface 603 by a bus, a signal line, or a circuit board. Specifically, the peripherals include at least one of a radio frequency circuit 604, a touch display screen 605, a camera 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one input/output (I/O)-related peripheral to the processor 601 and the memory 602. In some embodiments, the processor 601, the memory 602, and the peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 604 is used to receive and transmit radio frequency (RF) signals, also called electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission, or converting received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol, including but not limited to metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or wireless fidelity (WiFi) networks. In some embodiments, the radio frequency circuit 604 may also include near field communication (NFC)-related circuits, which is not limited in the present application.
The display screen 605 is used to display a user interface (UI), which may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 is also capable of capturing touch signals on or above its surface, which may be input to the processor 601 as control signals for processing. In this case, the display screen 605 may also provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 605, arranged on the front panel of the terminal 600; in other embodiments, there may be at least two display screens 605, arranged on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display screen 605 may be a flexible display arranged on a curved or folded surface of the terminal 600. The display screen 605 may even be arranged in a non-rectangular irregular shape, i.e., a shaped screen. The display screen 605 may be made of materials such as a liquid crystal display (LCD) or an organic light-emitting diode (OLED).
The camera component 606 is used to capture images or video. Optionally, the camera component 606 includes a front camera and a rear camera. Usually, the front camera is arranged on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to fuse the main camera with the depth-of-field camera for a background blur function, fuse the main camera with the wide-angle camera for panoramic shooting and virtual reality (VR) shooting, or implement other fused shooting functions. In some embodiments, the camera component 606 may also include a flash, which may be a single-colour-temperature flash or a dual-colour-temperature flash. A dual-colour-temperature flash is a combination of a warm flash and a cold flash and can be used for light compensation at different colour temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone captures sound waves from the user and the environment and converts them into electrical signals, which are input to the processor 601 for processing or to the radio frequency circuit 604 for voice communication. For stereo capture or noise reduction, there may be multiple microphones arranged at different parts of the terminal 600; the microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker; a piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 607 may also include a headphone jack.
The positioning component 608 is used to determine the current geographic location of the terminal 600 to implement navigation or location-based services (LBS). The positioning component 608 may be based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 609 is used to supply power to the components in the terminal 600. The power supply 609 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the battery may support wired or wireless charging and may also support fast-charging technology.
In some embodiments, the terminal 600 further includes one or more sensors 610, including but not limited to an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
The acceleration sensor 611 can detect the magnitude of acceleration along the three axes of the coordinate system established by the terminal 600; for example, it can detect the components of gravitational acceleration along the three axes. The processor 601 may, according to the gravity signal captured by the acceleration sensor 611, control the touch display screen 605 to display the user interface in landscape or portrait view. The acceleration sensor 611 may also be used to capture motion data for games or for the user.
The gyroscope sensor 612 can detect the body orientation and rotation angle of the terminal 600, and may cooperate with the acceleration sensor 611 to capture the user's 3D actions on the terminal 600. From the data captured by the gyroscope sensor 612, the processor 601 can implement functions such as motion sensing (e.g., changing the UI according to the user's tilting operation), image stabilisation during shooting, game control, and inertial navigation.
The pressure sensor 613 may be arranged on the side frame of the terminal 600 and/or beneath the touch display screen 605. When the pressure sensor 613 is arranged on the side frame, it can detect the user's grip signal on the terminal 600, and the processor 601 performs left/right-hand recognition or shortcut operations according to the grip signal captured by the pressure sensor 613. When the pressure sensor 613 is arranged beneath the touch display screen 605, the processor 601 controls the operable controls on the UI according to the user's pressure operations on the touch display screen 605. The operable controls include at least one of button controls, scrollbar controls, icon controls, and menu controls.
The fingerprint sensor 614 captures the user's fingerprint; either the processor 601 identifies the user from the fingerprint captured by the fingerprint sensor 614, or the fingerprint sensor 614 itself identifies the user from the captured fingerprint. When the user's identity is recognised as trusted, the processor 601 authorises the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, paying, changing settings, and so on. The fingerprint sensor 614 may be arranged on the front, back, or side of the terminal 600; when a physical button or manufacturer logo is arranged on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or manufacturer logo.
The optical sensor 615 captures ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the touch display screen 605 according to the ambient light intensity captured by the optical sensor 615: when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when it is low, the display brightness is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera component 606 according to the ambient light intensity captured by the optical sensor 615.
The proximity sensor 616, also called a distance sensor, is usually arranged on the front panel of the terminal 600 and captures the distance between the user and the front of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 is gradually decreasing, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance is gradually increasing, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 6 does not constitute a limitation on the terminal 600, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
Fig. 7 is a schematic structural diagram of a server according to an exemplary embodiment. The server 700 may vary considerably with configuration or performance and may include one or more central processing units (CPUs) 701 and one or more memories 702, the memory 702 storing at least one instruction that is loaded and executed by the processor 701 to implement the methods provided by the method embodiments above. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may include other components for implementing device functions, which will not be described here.
The server 700 may be used to perform the steps performed by the ear key point detection apparatus in the ear key point detection method above.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided. When the instructions in the storage medium are executed by the processor of a detection apparatus, the detection apparatus is enabled to perform an ear key point detection method, the method including:
acquiring a face image, the face image including face contour key points used to determine the ear regions in the face image;
acquiring an ear key point detection model used to detect ear key points in any ear region;
detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
In an exemplary embodiment, an application program/computer program product is also provided. When the instructions in the application program/computer program product are executed by the processor of a detection apparatus, the detection apparatus is enabled to perform an ear key point detection method, the method including:
acquiring a face image, the face image including face contour key points used to determine the ear regions in the face image;
acquiring an ear key point detection model used to detect ear key points in any ear region;
detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image.
Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practising the invention disclosed here. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art not disclosed by the present application. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present application being indicated by the following claims.
It should be understood that the present application is not limited to the precise structures that have been described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (14)

  1. An ear key point detection method, the method comprising:
    acquiring a face image, the face image comprising face contour key points, the face contour key points being used to determine an ear region in the face image;
    acquiring an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
    detecting the ear key points in the face image based on the ear key point detection model and positions of the face contour key points in the face image.
  2. The method according to claim 1, wherein the detecting the ear key points in the face image based on the ear key point detection model and the positions of the face contour key points in the face image comprises:
    determining a first ear region and a second ear region in the face image according to the positions of the face contour key points in the face image;
    detecting ear key points in the first ear region and ear key points in the second ear region based on the ear key point detection model, the first ear region, and the second ear region;
    determining a position of each determined ear key point in the face image according to a position of the ear key point within the ear region where it is located and positions of the first ear region and the second ear region in the face image.
  3. The method according to claim 2, wherein the determining the first ear region and the second ear region in the face image according to the positions of the face contour key points in the face image comprises:
    acquiring a first designated key point and a second designated key point among the face contour key points;
    determining the first ear region comprising the first designated key point, and the second ear region comprising the second designated key point.
  4. The method according to claim 2, wherein the first ear region belongs to a first class of ear regions and the second ear region belongs to a second class of ear regions, the first class of ear regions being ear regions located on a first side of a face and the second class of ear regions being ear regions located on a second side of the face;
    the detecting the ear key points in the first ear region and the ear key points in the second ear region based on the ear key point detection model, the first ear region, and the second ear region comprises:
    flipping the first ear region horizontally to obtain a third ear region, the third ear region belonging to the second class of ear regions;
    determining the ear key points in the second ear region and ear key points in the third ear region based on the ear key point detection model, the second ear region, and the third ear region;
    flipping the third ear region containing the ear key points horizontally to obtain the first ear region containing the ear key points.
  5. The method according to any one of claims 1-4, the method further comprising:
    acquiring a plurality of sample images, each sample image comprising an ear region and ear key points in the ear region;
    extracting ear regions from the plurality of sample images respectively;
    performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
  6. The method according to claim 5, wherein the performing model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model comprises:
    flipping first-class ear regions among the extracted ear regions horizontally to obtain flipped ear regions, the first-class ear regions being ear regions located on a first side of a face;
    determining second-class ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, the second-class ear regions being ear regions located on a second side of the face;
    performing model training according to the sample ear regions and ear key points in the sample ear regions to obtain the ear key point detection model.
  7. An ear key point detection apparatus, the apparatus comprising:
    an image acquisition unit configured to acquire a face image, the face image comprising face contour key points, the face contour key points being used to determine an ear region in the face image;
    a model acquisition unit configured to acquire an ear key point detection model, the ear key point detection model being used to detect ear key points in any ear region;
    a determining unit configured to detect the ear key points in the face image based on the ear key point detection model and positions of the face contour key points in the face image.
  8. The apparatus according to claim 7, wherein the determining unit comprises:
    a region determining subunit configured to determine a first ear region and a second ear region in the face image according to the positions of the face contour key points in the face image;
    a key point determining subunit configured to detect ear key points in the first ear region and ear key points in the second ear region based on the ear key point detection model, the first ear region, and the second ear region;
    a position determining subunit configured to determine a position of each determined ear key point in the face image according to a position of the ear key point within the ear region where it is located and positions of the first ear region and the second ear region in the face image.
  9. The apparatus according to claim 8, wherein the region determining subunit is further configured to acquire a first designated key point and a second designated key point among the face contour key points, and to determine the first ear region comprising the first designated key point and the second ear region comprising the second designated key point.
  10. The apparatus according to claim 8, wherein the first ear region belongs to a first class of ear regions and the second ear region belongs to a second class of ear regions, the first class of ear regions being ear regions located on a first side of a face and the second class of ear regions being ear regions located on a second side of the face;
    the key point determining subunit is further configured to flip the first ear region horizontally to obtain a third ear region, the third ear region belonging to the second class of ear regions; to determine the ear key points in the second ear region and ear key points in the third ear region based on the ear key point detection model, the second ear region, and the third ear region; and to flip the third ear region containing the ear key points horizontally to obtain the first ear region containing the ear key points.
  11. The apparatus according to any one of claims 7-10, the apparatus further comprising:
    an acquisition unit configured to acquire a plurality of sample images, each sample image comprising an ear region and ear key points in the ear region;
    an extraction unit configured to extract ear regions from the plurality of sample images respectively;
    a training unit configured to perform model training according to the extracted ear regions and the ear key points in the ear regions to obtain the ear key point detection model.
  12. The apparatus according to claim 11, wherein the training unit comprises:
    a flipping subunit configured to flip first-class ear regions among the extracted ear regions horizontally to obtain flipped ear regions, the first-class ear regions being ear regions located on a first side of a face;
    a sample determining subunit configured to determine second-class ear regions among the extracted ear regions and the flipped ear regions as sample ear regions, the second-class ear regions being ear regions located on a second side of the face;
    the training unit is further configured to perform model training according to the sample ear regions and ear key points in the sample ear regions to obtain the ear key point detection model.
  13. An ear key point detection apparatus, the apparatus comprising:
    a processor;
    a memory for storing processor-executable commands;
    wherein the processor is configured to perform the ear key point detection method according to any one of claims 1-6.
  14. A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a detection apparatus, the detection apparatus is enabled to perform the ear key point detection method according to any one of claims 1-6.
PCT/CN2019/107104 2018-11-28 2019-09-20 Ear key point detection method and apparatus, and storage medium WO2020108041A1 (zh)

Applications Claiming Priority (2)

CN201811437331.6, priority date 2018-11-28
CN201811437331.6A (CN109522863B), filed 2018-11-28: Ear key point detection method and apparatus, and storage medium

Publications (1)

WO2020108041A1 (zh)

Family ID: 65793704

Also Published As

CN109522863A (zh), published 2019-03-26
CN109522863B (zh), published 2020-11-27
