CN112580462A - Feature point selection method, terminal and storage medium - Google Patents

Feature point selection method, terminal and storage medium

Info

Publication number
CN112580462A
Authority
CN
China
Prior art keywords
ear, images, image, feature, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011441502.XA
Other languages
Chinese (zh)
Inventor
刘涛
王欣欣
朱彪
王丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Horn Audio Co Ltd
Original Assignee
Shenzhen Horn Audio Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Horn Audio Co Ltd filed Critical Shenzhen Horn Audio Co Ltd
Priority to CN202011441502.XA priority Critical patent/CN112580462A/en
Publication of CN112580462A publication Critical patent/CN112580462A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G06V40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 - Summing image-intensity values; Histogram projection analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/24323 - Tree-organised classifiers

Abstract

The application is applicable to the field of image processing, and provides a feature point selection method, a terminal and a storage medium. The feature point selection method comprises the following steps: acquiring a plurality of first ear images, wherein the plurality of first ear images are all ear images of the same human ear; respectively identifying ear feature points in each first ear image, wherein the number of ear feature points in each first ear image is greater than 1; respectively calculating the combined feature of the ear feature points in each first ear image; and screening out a target ear image from the plurality of first ear images according to the combined features, and obtaining target feature points according to the ear feature points of the target ear image. The embodiment of the application can improve the reliability of the ear feature points.

Description

Feature point selection method, terminal and storage medium
Technical Field
The present application relates to the field of image processing, and in particular, to a feature point selection method, a terminal, and a storage medium.
Background
Human ear recognition is a new biometric technology that has emerged in recent years. The ear feature points identified from ear images with human ear recognition technology are an important link in biometric identification and in the ergonomic design of ear-worn products.
Results obtained by existing ear feature point extraction methods have poor reliability, and the selected ear feature points are therefore of limited use in biometric recognition or ergonomic design.
Disclosure of Invention
The embodiment of the application provides a feature point selection method, a terminal and a storage medium, which can solve the problem of low reliability of the current ear feature point.
A first aspect of the embodiments of the present application provides a feature point selection method, including:
acquiring a plurality of first ear images, wherein the plurality of first ear images are ear images of the same human ear;
respectively identifying ear feature points in each first ear image, wherein the number of the ear feature points in each first ear image is greater than 1;
respectively calculating the combination characteristics of the ear characteristic points in each first ear image;
according to the combination characteristics, screening out target ear images from the first ear images, and obtaining target feature points according to the ear feature points of the target ear images.
A second aspect of the embodiments of the present application provides a feature point selection apparatus, including:
an acquisition unit, used for acquiring a plurality of first ear images, wherein the plurality of first ear images are all ear images of the same human ear;
the identification unit is used for respectively identifying ear feature points in each first ear image, and the number of the ear feature points in each first ear image is greater than 1;
a calculating unit, configured to calculate a combined feature of the ear feature points in each of the first ear images, respectively;
and the determining unit is used for screening out a target ear image from the plurality of first ear images according to the combined features, and obtaining a target feature point according to the ear feature point of the target ear image.
A third aspect of the embodiments of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method when executing the computer program.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the above method.
A fifth aspect of embodiments of the present application provides a computer program product, which when run on a terminal, causes the terminal to perform the steps of the method.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
In the embodiment of the application, the ear feature points extracted from a plurality of first ear images are combined to form the combined feature of each first ear image. The combined feature can be used to determine whether a first ear image meets the requirements of ear feature point extraction, that is, whether the angle at which the ear appears in the first ear image allows the ear feature points to be extracted accurately. According to the combined features, the target ear image from which the ear feature points can be extracted most accurately can be screened out of the plurality of first ear images, so that the target feature points obtained from the ear feature points of the target ear image are highly reliable. The target feature points can therefore be used to carry out biometric recognition or the ergonomic design of ear-worn products more accurately and conveniently.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic flow chart of an implementation of a feature point selection method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of acquiring a first ear image according to an embodiment of the present disclosure;
fig. 3 is a schematic flow chart of an implementation of extracting ear feature points according to an embodiment of the present application;
fig. 4 is a schematic flow chart of a first implementation of removing a first ear image with a poor ear feature point recognition effect according to the embodiment of the present application;
FIG. 5 is a schematic diagram of a flow chart of implementing a training target detection model and a feature point extraction model provided in the embodiment of the present application;
FIG. 6 is a schematic diagram of a second distance of the left ear provided by embodiments of the present application;
fig. 7 is a schematic flowchart of a second implementation of removing a first ear image with a poor ear feature point recognition effect according to the embodiment of the present application;
fig. 8 is a schematic flowchart of a first implementation of step S103 provided in the embodiment of the present application;
FIG. 9 is a schematic diagram of calculating a third distance according to an embodiment of the present application;
fig. 10 is a schematic flowchart of a second implementation of step S103 provided in the embodiment of the present application;
FIG. 11 is a schematic diagram of calculating a first distance according to an embodiment of the present application;
fig. 12 is a schematic implementation flow diagram of step S104 provided in the embodiment of the present application;
fig. 13 is a schematic structural diagram of a feature point selection apparatus according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
Human ear recognition is a new biometric technology that has emerged in recent years. The ear feature points identified from ear images with human ear recognition technology are an important link in biometric identification and in the ergonomic design of ear-worn products.
Results obtained by existing ear feature point extraction methods have poor reliability, and the selected ear feature points are therefore of limited use in biometric recognition or ergonomic design.
In order to explain the technical means of the present application, the following description will be given by way of specific examples.
Fig. 1 shows a schematic implementation flow chart of a feature point selection method provided in an embodiment of the present application, where the method may be applied to a terminal and may be applied to a situation where reliability of feature point selection needs to be improved. The terminal can be a terminal such as a smart phone and a computer.
Specifically, the above feature point selection method may include the following steps S101 to S104.
Step S101, acquiring a plurality of first ear images.
All of the plurality of first ear images are ear images of the same human ear.
Research has found that when biometric identification of a target object, or the ergonomic design of ear-worn equipment for the target object, is required, the existing approach is to collect a single ear image of the target object for ear feature point extraction. However, the structure of the human ear is complicated, the angle of each person's ear differs, and the shooting angle used when the ear image is collected also varies. The ear feature points extracted by the existing method therefore have low reliability, so the accuracy of subsequent biometric recognition or ergonomic design based on these ear feature points is low, which may reduce production efficiency and increase production cost in actual production.
Therefore, in the embodiment of the application, a plurality of first ear images of the target object may be acquired to select an ear feature point from the plurality of first ear images, so as to implement biometric identification on the target object or perform an ergonomic design of ear wearing equipment for the target object.
The manner of acquiring the plurality of first ear images can be chosen according to the actual situation. For example, in some embodiments of the present application, the terminal may capture a plurality of images of a human ear of the target object with a camera to obtain the plurality of first ear images. During shooting, the terminal can prompt the target object so that the obtained first ear images are captured at different shooting angles.
In other embodiments of the present application, the terminal may further record a video of a human ear of the target object, and intercept a part of video frames in the recorded video, for example, intercept video frames at a preset frame interval with a first frame of the video as a starting point, where each intercepted video frame corresponds to one first ear image.
Specifically, as shown in fig. 2, with the target object as the center of a circle, the terminal may move along a circular arc with a preset radius from the front side of the target object to its rear side, record video during this circular motion, and intercept a plurality of video frames from the recorded video to serve as the first ear images.
In the process of recording the video, the terminal can calculate the shooting pose of the terminal according to the radius of the circular movement during shooting and a nine-axis sensor carried by the terminal, and detect whether the current shooting pose meets the preset requirement or not. Furthermore, the photographer can be reminded to adjust the shooting pose in a voice or text mode.
For example, whether the current shooting pose meets the preset requirement is judged according to whether the terminal keeps moving along the horizontal direction in a circular manner or whether the inclination angle of the terminal exceeds a threshold value, and if the current shooting pose does not meet the preset requirement, the target object can be reminded to adjust the shooting pose of the terminal according to the current shooting pose, so that the recorded video is higher in usability.
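For illustration only, the frame interception described above can be sketched in Python as follows, assuming OpenCV is available; the video file name and the frame interval are placeholder values rather than values prescribed by this embodiment:

    import cv2

    def extract_first_ear_images(video_path, frame_interval=15):
        """Intercept frames at a preset interval, starting from the first frame of the video."""
        capture = cv2.VideoCapture(video_path)
        images, index = [], 0
        while True:
            ok, frame = capture.read()
            if not ok:                       # end of the recorded video
                break
            if index % frame_interval == 0:  # keep frames spaced by the preset interval
                images.append(frame)         # each kept frame is one candidate first ear image
            index += 1
        capture.release()
        return images

    first_ear_images = extract_first_ear_images("ear_sweep.mp4", frame_interval=15)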
Step S102, ear feature points in each first ear image are respectively identified.
In the embodiment of the present application, it is necessary to determine the ear feature points in each first ear image, so as to use the appropriate ear feature points for biometric identification or ergonomic design.
In an embodiment of the present application, the method of identifying the ear feature points can be chosen according to the actual situation. For example, object detection can be performed with an algorithm based on HOG (Histogram of Oriented Gradients) features and an SVM (support vector machine) classifier, and two-dimensional image feature point extraction can then be performed with an algorithm based on an ensemble of regression trees. It should be noted that other models, such as neural network models, can also be used; the present application is not limited in this respect.
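As a rough illustration only, such a pipeline could be sketched with the dlib library as follows, assuming a HOG+SVM ear detector ("ear_detector.svm") and an ensemble-of-regression-trees shape predictor ("ear_predictor.dat") have already been trained on ear image samples; the model file names are assumptions, not part of this embodiment:

    import cv2
    import dlib

    # Assumed pre-trained models: a HOG+SVM ear detector and an
    # ensemble-of-regression-trees shape predictor for ear landmarks.
    detector = dlib.simple_object_detector("ear_detector.svm")
    predictor = dlib.shape_predictor("ear_predictor.dat")

    def identify_ear_feature_points(image_bgr):
        """Detect the ear region, then extract 2D ear feature points inside it."""
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        boxes = detector(gray)                    # candidate ear mark frames
        if not boxes:
            return None, []
        box = boxes[0]
        shape = predictor(gray, box)              # landmark positions inside the frame
        points = [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
        return box, points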
When the number of ear feature points in a single first ear image is 0 or 1, the number of ear feature points is too small, which indicates that the shooting angle of the first ear image is not favorable for identifying ear feature points, and the first ear image is of little value for subsequent biometric recognition or ergonomic design. Therefore, in the embodiment of the present application, the number of ear feature points in each of the above first ear images is greater than 1.
Step S103, calculating the combination characteristics of the ear characteristic points in each first ear image respectively.
In the embodiment of the application, a single ear feature point can hardly reflect the differences between different first ear images; that is, for first ear images obtained at different shooting angles, a single feature point can hardly reflect the differences caused by the shooting angles, or the influence of those differences on the ear feature point extraction effect.
Therefore, in the embodiment of the present application, the combined features of the ear feature points in each of the first ear images are calculated. That is, for a single first ear image, a combined feature formed by a plurality of ear feature points in the first ear image is calculated, the combined feature may represent a relative relationship between the plurality of ear feature points, and the relative relationship may reflect an ear feature point extraction effect of the first ear image. For example, in some embodiments of the present application the combined feature may be a distance feature, an angle feature, or the like, between a plurality of ear feature points.
And S104, screening out a target ear image from the plurality of first ear images according to the combination characteristics, and obtaining a target characteristic point according to the ear characteristic point of the target ear image.
In the embodiment of the present application, after the combined feature in each first ear image is calculated, the combined feature can reflect the ear feature point extraction effect of the first ear image. Accordingly, a target ear image can be screened out from the plurality of first ear images according to the combination feature. Then, according to the ear feature point of the target ear image, the target feature point can be obtained. The target feature points are ear feature points selected for biometric recognition or ergonomic design.
In the embodiment of the application, the ear feature points extracted from a plurality of first ear images are combined to form the combined feature of each first ear image. The combined feature can be used to determine whether a first ear image meets the requirements of ear feature point extraction, that is, whether the angle at which the ear appears in the first ear image allows the ear feature points to be extracted accurately. According to the combined features, the target ear image from which the ear feature points can be extracted most accurately can be screened out of the plurality of first ear images, so that the target feature points obtained from the ear feature points of the target ear image are highly reliable. The target feature points can therefore be used to carry out biometric recognition or the ergonomic design of ear-worn products more accurately and conveniently.
In some embodiments of the present application, as shown in fig. 3, in the above operation of identifying the ear feature points in each of the first ear images, the operation on a single first ear image may include the following steps S301 to S303.
Step S301, inputting the first ear image into a plurality of pre-trained target detection models for processing, and obtaining a plurality of corresponding result scores.
Research shows that in the existing ear detection approach, a single detection model is trained with sample images shot at a plurality of different shooting angles, and the trained detection model is then used for ear detection on a first ear image.
However, as described above, the angle of each ear differs and the shooting angle used when the ear image is collected also varies. If the influence of the angle is ignored and the first ear image is input directly to one detection model for processing, the accuracy of the detected ear is low.
Therefore, in some embodiments of the present application, the first ear image may be input to a plurality of pre-trained target detection models for processing.
Each target detection model is a model obtained by training on ear image samples shot at a first shooting angle, and the first shooting angles of the ear image samples corresponding to different target detection models are different. The first shooting angle refers to the shooting angle used for the ear image samples, and the specific value of each first shooting angle can be chosen by the staff according to the actual situation.
That is to say, the ear image samples shot at one first shooting angle can be used to train the target detection model corresponding to that first shooting angle; the ear image samples shot at a second first shooting angle can then be used to train the target detection model corresponding to that second first shooting angle, and so on. A plurality of target detection models is finally obtained, each corresponding to one first shooting angle.
In some embodiments of the present application, for a single first ear image, the first ear image may be input to the multiple target detection models for processing, and a result score corresponding to each target detection model is obtained. The result score may indicate an accuracy rate of detecting an ear in the first ear image.
Step S302, a first shooting angle corresponding to the target detection model with the highest result score is determined, and a feature point extraction model obtained by training the ear image sample shot based on the first shooting angle is determined.
In some embodiments of the present application, for a single first ear image, after obtaining the result score corresponding to each target detection model respectively, the first photographing angle corresponding to the target detection model with the highest result score may be determined. When the result score is the highest, it indicates that the accuracy rate of detecting the ear using the target detection model corresponding to the result score is the highest, and therefore, the shooting angle of the first ear image should be closer to the first shooting angle corresponding to the target detection model. At this time, a feature point extraction model obtained by training an ear image sample photographed based on the first photographing angle may be determined.
In some embodiments of the present application, the target detection model and the feature point extraction model may be trained based on the same sample set and test set. That is to say, for a single first shooting angle, the target detection model corresponding to the first shooting angle and the feature point extraction model corresponding to the first shooting angle may be trained simultaneously using the ear image sample shot based on the first shooting angle.
Step S303, feature point extraction is carried out on the first ear image by using the feature point extraction model, and ear feature points of the first ear image are obtained.
In some embodiments of the present application, after determining the first shooting angle corresponding to the target detection model with the highest result score, the feature point extraction model obtained by training the ear image sample shot based on the first shooting angle may be utilized to perform feature point extraction on the first ear image, so as to obtain the ear feature point of the first ear image.
In the embodiment of the application, the terminal can find the first shooting angle with the highest degree of association with the first ear image according to the result scores, so that the detection success rate of the ears at different angles can be effectively increased, the feature point extraction model associated with the first shooting angle is used for extracting the feature points, and the result of the position of each obtained feature point is more accurate.
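For illustration, the per-angle selection of steps S301 to S303 can be sketched as follows; the score() and extract() interfaces and the angle keys are hypothetical stand-ins for the trained target detection models and feature point extraction models:

    def extract_with_best_angle(image, angle_models):
        """angle_models: {first_shooting_angle: (detection_model, feature_point_model)}.

        detection_model.score(image) and feature_point_model.extract(image) are
        hypothetical interfaces standing in for the models of this embodiment.
        """
        # Step S301: score the image with every angle-specific target detection model.
        scores = {angle: det.score(image) for angle, (det, _) in angle_models.items()}
        # Step S302: the highest result score identifies the closest first shooting angle.
        best_angle = max(scores, key=scores.get)
        # Step S303: extract feature points with the model trained at that angle.
        return best_angle, angle_models[best_angle][1].extract(image)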
It should be noted that, in practical applications, there may be some differences between the left ear and the right ear of a person, so that there are some differences between the first ear image obtained by shooting the left ear and the first ear image obtained by shooting the right ear. Therefore, in some embodiments of the present application, a target detection model and a feature point extraction model corresponding to the left ear may be trained based on the ear image sample of the left ear, and a target detection model and a feature point extraction model corresponding to the right ear may be trained based on the ear image sample of the right ear. When the method shown in fig. 3 is used for ear feature point identification, the terminal may first determine whether an ear in the acquired first ear image is a left ear or a right ear, and then input the corresponding target detection model and the feature point extraction model to perform feature point detection, so as to further improve accuracy of feature point detection.
In practical application, when ear target detection and feature point extraction are performed with these models, each model reflects the average characteristics of the ear image samples shot at the corresponding first shooting angle. Given the differences between individual ears, some results produced by the models may have a poor feature point detection effect; that is, the feature point detection effect differs across the acquired first ear images. Therefore, before the combined features of the ear feature points in each first ear image are calculated, the images with a poor feature point detection effect can be removed from all the first ear images; the combined features of the ear feature points in each remaining first ear image are then calculated, the target ear image is screened out of the remaining first ear images, and the target feature points are obtained according to the ear feature points of the target ear image.
Specifically, in some embodiments of the present application, after the first ear image is input to a plurality of pre-trained target detection models for processing, an ear mark frame output by each target detection model may also be obtained. At this time, as shown in fig. 4, the following steps S401 to S402 may be included before the combined features of the ear feature points in the respective first ear images are calculated, respectively.
Step S401, calculating a first distance between the ear mark frame and one or more ear feature points in each first ear image.
The first distance represents the shortest distance between the ear mark frame and the ear feature point, and can be used for representing the relative position relationship between the ear mark frame and the ear feature point. The calculation mode of the first distance can be selected according to actual conditions.
In some embodiments of the present application, when the ear is located at the center of the ear mark frame in the first ear image, feature point extraction with the model works better. According to the first distances, the positions of the ear feature points within the ear mark frame can be determined, and it can further be determined whether the ear is located at the center of the ear mark frame in the first ear image.
Step S402, images with first distances not meeting first preset conditions are removed from the multiple first ear images.
The first preset condition is the length requirement that the distance between the ear mark frame and the ear feature points must satisfy in a first ear image from which the ear feature points can be accurately identified; this first preset condition is used to screen out the first ear images from which the ear feature points cannot be accurately identified.
Specifically, for a single first ear image, the proportional relationship between a plurality of first distances can be determined according to the first distances between the ear mark frame and the ear feature points, and when the obtained proportional relationship is outside the preset proportion interval, the first ear image is determined as an image of which the first distance does not satisfy the first preset condition.
In some embodiments of the present application, a first ratio between the largest first distance and the smallest first distance may be calculated. When the first ratio of a first ear image does not meet the first preset condition, that is, the first ratio falls outside the preset ratio interval, that first ear image is removed from the plurality of first ear images.
In other embodiments of the present application, a second ratio between the first distance corresponding to the third quartile and the first distance corresponding to the first quartile of the first distances may be calculated. When the second ratio of a first ear image does not meet the first preset condition, that is, the second ratio falls outside the preset ratio interval, that first ear image is removed.
The first ratio and the second ratio can represent the position of the ear within the ear mark frame. The preset ratio interval is the requirement that the first ratio or the second ratio must meet when the ear is located at the center of the ear mark frame, and its specific value can be adjusted by an administrator according to the actual situation.
According to the foregoing description, it can be concluded that, when the first distance satisfies the first preset condition, the ear in the first ear image should be located in the center of the ear mark frame, and the effect of extracting the ear feature point using the first ear image is better.
In the embodiment of the application, the first distance between the ear mark frame and one or more ear feature points in each first ear image is calculated, and the images with the first distances not meeting the first preset condition are removed from the plurality of first ear images. On one hand, a first ear image with poor ear characteristic point detection effect is eliminated, and the reliability of a subsequently obtained target characteristic point is high; on the other hand, the calculation amount of the target feature points determined subsequently is reduced, the efficiency of selecting the feature points can be improved, and the application in actual production is facilitated.
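A minimal sketch of this screening step is given below; it assumes the ear mark frame is given as an (x1, y1, x2, y2) rectangle, that the ear feature points lie inside the frame, and that the ratio interval is a placeholder value:

    import numpy as np

    def first_distances(box, points):
        """Shortest distance from each ear feature point to the ear mark frame border."""
        left, top, right, bottom = box   # points are assumed to lie inside the frame
        return np.array([min(x - left, right - x, y - top, bottom - y) for x, y in points],
                        dtype=float)

    def satisfies_first_condition(dists, ratio_interval=(1.0, 3.0)):
        """Check whether the first-distance ratio falls inside the preset ratio interval."""
        lo, hi = ratio_interval
        first_ratio = dists.max() / max(dists.min(), 1e-6)
        # Alternative embodiment: third-quartile over first-quartile of the first distances,
        # e.g. np.percentile(dists, 75) / np.percentile(dists, 25).
        return lo <= first_ratio <= hi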
Accordingly, in some embodiments of the present application, when the target detection model and the feature point extraction model are trained, the condition for model convergence may be further increased.
Specifically, as shown in fig. 5, before the first ear image is input to a plurality of pre-trained object detection models and processed, the training operation for the object detection model and the feature point extraction model associated with the same first photographing angle may include the following steps S501 to S502.
In step S501, a plurality of ear image samples taken at the first shooting angle are obtained.
The above ear image sample refers to an image used in training a target detection model and a feature point extraction model at the first shooting angle, and an acquisition mode of the ear image sample may be selected according to an actual situation.
Step S502, training a target detection model to be trained and a feature point extraction model to be trained respectively by using the obtained ear image sample until the accuracy rates of the target detection model to be trained and the feature point extraction model to be trained are both greater than an accuracy rate threshold value, and obtaining a trained target detection model and a trained feature point extraction model, wherein a second distance between an ear mark frame output by the target detection model to be trained and an ear feature point output by the feature point extraction model to be trained meets a distance threshold value requirement.
That is to say, when the target detection model and the feature point extraction model for a first shooting angle are trained, in addition to requiring that the accuracy rates of both models be greater than an accuracy threshold, it is also required that the second distance between the ear mark frame output by the target detection model and the ear feature points output by the feature point extraction model satisfy the distance threshold requirement.
Wherein, the accuracy threshold is the lowest accuracy value which needs to be satisfied when the model converges; the distance threshold requirement is a distance requirement that the distance between the ear mark frame output by the target detection model and the ear feature point output by the feature point extraction model needs to be satisfied when the model converges. The values required by the accuracy threshold and the distance threshold can be adjusted according to actual conditions.
Specifically, the way the accuracy is calculated can be chosen according to the actual situation. In some embodiments of the present application, the accuracy of the target detection model may be determined according to its true positive rate (TPR) and false positive rate (FPR). The accuracy of the feature point extraction model can be determined according to the relative position error between the output feature points and the actual feature points.
Whether the second distance between the ear mark frame output by the target detection model and the ear feature points output by the feature point extraction model meets the distance requirement can be evaluated as whether the second distances between the left-ear mark frame and the lowest, highest and rightmost feature points of the left ear meet the distance requirement, or whether the distances between the right-ear mark frame and the lowest, highest and leftmost feature points of the right ear meet the distance requirement. For ease of understanding, fig. 6 shows a schematic diagram of the second distances from the ear mark frame of the left ear to the lowest, highest and rightmost feature points of the left ear.
In the embodiment of the application, when the target detection model and the feature point extraction model are trained, the condition of model convergence can be further increased, so that the output result of the model can be better used for extracting the ear feature points, and the reliability of the target feature points selected by the application is further improved.
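For illustration, one possible form of the convergence check in step S502 is sketched below; the metric names, the way the accuracy is combined from TPR and FPR, and the threshold values are assumptions rather than requirements of this embodiment:

    def training_converged(det_metrics, pred_mean_error, second_distances,
                           acc_threshold=0.95, dist_threshold=10.0):
        """Check the two accuracy thresholds plus the second-distance requirement."""
        det_accuracy = det_metrics["tpr"] * (1.0 - det_metrics["fpr"])   # one simple combination
        pred_accuracy = 1.0 - pred_mean_error                            # relative position error
        distances_ok = all(d <= dist_threshold for d in second_distances)
        return det_accuracy > acc_threshold and pred_accuracy > acc_threshold and distances_ok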
In other embodiments of the present application, as shown in fig. 7, before calculating the combined features of the ear feature points in each of the first ear images, the following steps S701 to S702 may be further included.
Step S701, calculating the dispersion of the ear feature points in each first ear image.
The dispersion represents the degree of position difference between the ear feature points, and can be used for representing the relative position relationship between the ear feature points. The calculation mode of the dispersion can be selected according to actual conditions.
In some embodiments of the present application, when the dispersion in the first ear image is within a certain threshold range, the arc of the ear in the first ear image is well formed. The ear feature points identified from such a first ear image contain fewer erroneous points and have high accuracy, which facilitates subsequent biometric recognition and ergonomic design.
Step S702, images with dispersion degrees not meeting second preset conditions are removed from the multiple first ear images.
The second preset condition is the requirement on the degree of positional difference between the ear feature points in a first ear image from which the ear feature points can be accurately identified; this second preset condition is used to screen out the first ear images from which the ear feature points cannot be accurately identified.
Specifically, in some embodiments of the present application, curve fitting may be performed on the ear feature points in the first ear image to obtain a quadratic fit curve; the maximum error among the errors between the ear feature points and the quadratic fit curve is then taken as the dispersion. A larger dispersion means larger errors between the ear feature points and the quadratic fit curve, indicating that the arc formed by the ear feature points is poor; in that case the ear feature points identified from the first ear image have low accuracy, and the effect of using them for biometric recognition and ergonomic design is also poor. Therefore, images whose dispersion does not meet the second preset condition, that is, whose dispersion is greater than a preset dispersion threshold, can be removed from the plurality of first ear images.
The value of the preset dispersion threshold value can be adjusted by an administrator according to actual conditions.
In the embodiment of the application, the images of which the dispersion does not meet the second preset condition are removed from the plurality of first ear images by calculating the dispersion of the ear feature points in each first ear image. On one hand, a first ear image with poor ear radian, namely poor ear characteristic point detection effect is eliminated, and the reliability of a subsequently obtained target characteristic point is high; on the other hand, the calculation amount of the target feature points determined subsequently is reduced, the efficiency of selecting the feature points can be improved, and the application in actual production is facilitated.
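A possible sketch of this dispersion calculation, using a quadratic fit as described above (the dispersion threshold is a placeholder value):

    import numpy as np

    def dispersion(points):
        """Maximum error between the ear feature points and a quadratic fit curve."""
        xs = np.array([p[0] for p in points], dtype=float)
        ys = np.array([p[1] for p in points], dtype=float)
        coeffs = np.polyfit(xs, ys, deg=2)          # fit y = a*x^2 + b*x + c
        fitted = np.polyval(coeffs, xs)
        return float(np.max(np.abs(ys - fitted)))   # largest deviation from the curve

    def satisfies_second_condition(points, dispersion_threshold=8.0):
        return dispersion(points) <= dispersion_threshold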
It should be noted that, in some embodiments of the present application, the terminal may sequentially execute the method shown in fig. 4 and the method shown in fig. 7, and remove, from the first ear image, an image whose first distance does not satisfy a first preset condition and an image whose dispersion does not satisfy a second preset condition; and then calculating the combination characteristics of the ear characteristic points in the rest first ear images, screening out the target ear images from the plurality of first ear images according to the combination characteristics, and obtaining the target characteristic points according to the ear characteristic points of the target ear images.
Specifically, in some embodiments of the present application, the calculation manner of the above combination features may be selected according to actual situations.
In some embodiments of the present application, as shown in fig. 8, in the operation of calculating the combined features of the ear feature points in each of the first ear images, the operation on a single first ear image may include the following steps S801 to S802.
Step S801 includes selecting two feature points from the ear feature points of the first ear image, and calculating a third distance between the two feature points in the first ear image.
The manner of selecting the two feature points is set by an administrator according to the actual situation. For example, two feature points may be randomly selected from the ear feature points by a random algorithm. In some embodiments of the present application, two specific feature points may be set by an administrator, such as the tragus feature point and the antihelix feature point.
In some embodiments of the present application, the third distance refers to the distance between the two feature points and can be used to represent the arc of the ear in the first ear image. For example, as shown in fig. 9, the same two feature points are selected in three different first ear images and the third distance is calculated for each, giving third distances d1, d2 and d3 corresponding to the three first ear images respectively, where d3 is greater than d2 and d2 is greater than d1. As can be seen from fig. 9, when the third distance is larger, the ear appears better in the first ear image, and therefore the ear feature points extracted from that first ear image are also better.
Step S802, the third distance is used as a combination feature of the first ear image.
In the embodiment of the application, two feature points are selected from the ear feature points of the first ear image, the third distance between the two feature points in the first ear image is calculated, and the third distance is used as the combined feature of the first ear image. This combined feature can represent the arc of the ear in the first ear image, and can therefore reflect the ear feature point extraction effect of the first ear image.
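A minimal sketch of this combined feature is given below; the landmark indices of the two chosen feature points (for example the tragus and antihelix points) are assumptions that depend on the landmark scheme actually used:

    import math

    def third_distance(points, index_a, index_b):
        """Combined feature of one first ear image: distance between two chosen feature points."""
        (xa, ya), (xb, yb) = points[index_a], points[index_b]
        return math.hypot(xb - xa, yb - ya)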
In other embodiments of the present application, as shown in fig. 10, in the operation of calculating the combined features of the ear feature points in each of the first ear images, the operation on a single first ear image may include the following steps S1001 to S1004.
In step S1001, at least three feature points are selected from the ear feature points of the first ear image.
The selection manner of the feature points may refer to the selection manner of the feature points in step S801, which is not described in detail herein.
Step S1002, at least two eigenvectors formed by at least three feature points are calculated.
In some embodiments of the present application, a feature vector may be obtained for every two feature points, and after at least three feature points are selected, at least two feature vectors formed by the at least three feature points can be calculated. The specific calculation mode of the feature vector can be selected according to actual conditions, for example, the specific calculation mode is determined by using a coordinate calculation mode.
Step S1003, determining two feature vectors from the at least two feature vectors, and calculating a first angle between the two feature vectors in the first ear image.
In some embodiments of the present application, after obtaining at least two feature vectors, if the number of feature vectors is 2, the first angle between the two feature vectors may be calculated directly; if the number of feature vectors is greater than 2, two feature vectors need to be screened out of them, and the first angle between those two feature vectors in the first ear image is calculated. The first angle refers to the included angle between the two feature vectors, and can also be used to represent the arc of the ear in the first ear image.
The above-mentioned manner of screening the feature vectors may be selected according to actual situations, and in some embodiments of the present application, two feature vectors may be randomly selected as well.
In other embodiments of the present application, in step S1001, an administrator may set three specific ear feature points in advance, and then two feature vectors may be formed according to the three ear feature points, and then a first angle formed by the two feature vectors is calculated.
For example, as shown in fig. 11, the same three feature points are selected in three different first ear images to form two feature vectors, and the first angle between the two feature vectors is calculated for each, giving first angles a, b and c corresponding to the three first ear images respectively, where c is greater than b and b is greater than a. As can be seen from fig. 11, when the first angle is larger, the ear appears better in the first ear image, and therefore the ear feature points extracted from that first ear image are also better.
In step S1004, the first angle is taken as a combination feature of the first ear image.
In the embodiment of the application, at least three feature points are selected from the ear feature points of the first ear image, at least two feature vectors formed by the at least three feature points are calculated, two feature vectors are then determined from the at least two feature vectors, the first angle between those two feature vectors in the first ear image is calculated, and finally the first angle is used as the combined feature of the first ear image. This combined feature can represent the arc of the ear in the first ear image, and can therefore reflect the ear feature point extraction effect of the first ear image.
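A sketch of this angle-based combined feature is given below; it assumes the two feature vectors share one of the three chosen feature points as their origin, and the landmark indices are placeholders:

    import math

    def first_angle(points, i_origin, i_a, i_b):
        """Combined feature: angle (in degrees) between two vectors built from three feature points."""
        ox, oy = points[i_origin]
        ax, ay = points[i_a]
        bx, by = points[i_b]
        v1 = (ax - ox, ay - oy)
        v2 = (bx - ox, by - oy)
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / max(norm, 1e-9)))))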
That is to say, in the embodiment of the application, a distance feature or an angle feature formed by the ear feature points can be used as the combined feature, and the first ear image with the better ear imaging angle can be selected according to this combined feature, so that when ear feature point identification is performed on that first ear image, the identified ear feature points are more accurate and more reliable.
Specifically, the above-mentioned screening out the target ear image from the plurality of first ear images according to the combination feature may include: screening out a first ear image with the combination characteristics meeting extreme point conditions from a plurality of first ear images; and confirming the screened first ear image as a target ear image.
The extreme point condition means that the combined feature is a maximum value or a minimum value among the combined features of the plurality of first ear images. For the combined feature that satisfies the maximum condition or the minimum condition among the combined features calculated for the plurality of first ear images, the corresponding first ear image is the image, among the first ear images, from which the ear feature points can best be identified.
The extreme point condition can be selected according to actual conditions. Taking fig. 9 as an example, the combination feature satisfying the extreme point condition means that the third distance is the maximum value of all the third distances; taking fig. 11 as an example, the above-mentioned combination feature satisfying the extreme point condition means that the first angle is the maximum value of all the first angles.
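For example, if the combined feature is the third distance or the first angle, where a larger value is better, the screening can be sketched as follows:

    def screen_target_ear_images(combined_features):
        """combined_features maps an image index to its combined feature value.

        Keep the first ear image(s) whose combined feature reaches the extreme
        (here: maximum) value; more than one image may reach that value.
        """
        best = max(combined_features.values())
        return [idx for idx, value in combined_features.items() if value == best]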
In some embodiments of the present application, when the number of the screened target ear images is 1, the ear feature point of the target ear image may be directly determined as the target feature point. However, in some embodiments of the present application, the number of the selected target ear images may be greater than 1. For example, the number of first ear images whose combined features satisfy the extreme point condition may be greater than 1. When the number of target ear images is greater than 1, in the conventional method, a plurality of target ear images are usually retained, and ear feature points of each target ear image are respectively determined as a group of target feature points.
In some embodiments of the present application, when the number of target ear images is greater than 1, the target feature points of the multiple target ear images may be merged. Specifically, as shown in fig. 12, obtaining the target feature point according to the ear feature point of the target ear image may further include: step S1201 to step S1202.
Step S1201, if the number of target ear images is greater than 1, acquiring a shooting angle difference between the plurality of target ear images.
The shooting angle difference refers to the angle difference between the shooting angles of the plurality of target ear images, and the obtaining mode can be selected according to actual conditions. In some embodiments of the present application, if the ear feature point is obtained by the method shown in fig. 3, the shooting angle of each target ear image may be determined as the first shooting angle corresponding to the target detection model with the highest score of the corresponding result, and then the shooting angle difference between the plurality of target ear images may be calculated.
Step S1202, if the shooting angle difference value is smaller than the angle difference threshold value, calculating target feature points according to the ear feature points of the plurality of target ear images.
In some embodiments of the present application, if the feature points extracted from a plurality of good ear images can all be taken into account, the result is more usable for subsequent biometric recognition or ergonomic design.
If the shooting angle difference between the target ear images is smaller than the angle difference threshold value, the shooting angle difference between the target ear images is close, and at the moment, ear feature points recognized by different target ear images are also close. The angle difference threshold is the maximum shooting angle difference between target ear images capable of feature point merging, and is used for determining whether ear feature points between different target ear images can be merged or not, and the value of the angle difference threshold can be adjusted according to actual conditions. If the shooting angle difference value is smaller than the angle difference threshold value, the target feature point can be calculated according to the ear feature points of the multiple target ear images.
Specifically, the terminal may calculate the target feature points according to the ear feature points of the multiple target ear images by using a linear interpolation or a quadratic interpolation.
In the embodiment of the application, the target feature points are calculated according to the ear feature points of the plurality of target ear images by acquiring the shooting angle difference values among the plurality of target ear images and when the shooting angle difference values are smaller than the angle difference threshold value. Therefore, the ear feature points in the first ear images with good effects are referred to, the accuracy and the reliability of the obtained target feature points are higher, and the subsequent biological feature recognition or ergonomic design is facilitated.
It should be noted that, if the shooting angle difference is greater than or equal to the angle difference threshold, it indicates that the difference between the shooting angles of the target ear images is relatively large. In that case, the ear feature points identified from different target ear images may not be consistent and are not suitable for merging. Therefore, the plurality of target ear images can all be retained, and the ear feature points of each target ear image are determined as a separate group of target feature points, giving multiple groups of target feature points.
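A rough sketch of this merging logic is given below; simple point-wise averaging stands in for the linear or quadratic interpolation mentioned above, and the angle difference threshold is a placeholder value:

    import numpy as np

    def merge_target_feature_points(points_per_image, angles, angle_diff_threshold=10.0):
        """points_per_image: one list of (x, y) ear feature points per target ear image.

        All target ear images are assumed to provide the same set of landmarks,
        so corresponding points can be combined position by position.
        """
        if len(points_per_image) == 1:
            return points_per_image[0]
        if max(angles) - min(angles) >= angle_diff_threshold:
            # Shooting angles differ too much: keep each image's points as a separate group.
            return points_per_image
        # Shooting angles are close: combine corresponding landmarks point by point.
        return np.mean(np.array(points_per_image, dtype=float), axis=0)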
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts, as some steps may, in accordance with the present application, occur in other orders.
Fig. 13 is a schematic structural diagram of a feature point selection apparatus 1300 according to an embodiment of the present disclosure, where the feature point selection apparatus 1300 is configured on a terminal. The feature point selection device 1300 may include: an acquisition unit 1301, an identification unit 1302, a calculation unit 1303 and a determination unit 1304.
An obtaining unit 1301, configured to obtain a plurality of first ear images, where the plurality of first ear images are all ear images of the same human ear;
an identifying unit 1302, configured to identify ear feature points in each of the first ear images, where the number of the ear feature points in each of the first ear images is greater than 1;
a calculating unit 1303, configured to calculate a combination feature of the ear feature points in each of the first ear images;
a determining unit 1304, configured to screen a target ear image from the multiple first ear images according to the combined feature, and obtain a target feature point according to the ear feature point of the target ear image.
In some embodiments of the present application, the identifying unit 1302 may be further configured to: inputting the first ear image into a plurality of pre-trained target detection models for processing to obtain a plurality of corresponding result scores, wherein the target detection models are models obtained by training ear image samples shot based on a first shooting angle, and the first shooting angle values of the ear image samples corresponding to different target detection models are different; determining a first shooting angle corresponding to the target detection model with the highest result score, and determining a feature point extraction model obtained by training an ear image sample shot based on the first shooting angle; and extracting the feature points of the first ear image by using the feature point extraction model to obtain the ear feature points of the first ear image.
In some embodiments of the present application, after the first ear image is input to a plurality of pre-trained target detection models for processing, an ear mark frame output by each target detection model may also be obtained; accordingly, the above feature point selection apparatus 1300 further includes a rejecting unit, which can be configured to: calculating a first distance between the ear mark box and one or more of the ear feature points in each of the first ear images; and removing the images with the first distances not meeting first preset conditions from the plurality of first ear images.
In some embodiments of the present application, the above feature point selection apparatus 1300 further includes a training unit, which can be configured to: acquire a plurality of ear image samples shot at the first shooting angle; and train the target detection model to be trained and the feature point extraction model to be trained respectively with the acquired ear image samples, until the accuracy rates of both models are greater than an accuracy threshold and the second distance between the ear mark frame output by the target detection model to be trained and the ear feature points output by the feature point extraction model to be trained meets the distance threshold requirement, thereby obtaining the trained target detection model and feature point extraction model.
In some embodiments of the present application, the rejecting unit may be further configured to: calculate the dispersion of the ear feature points in each first ear image; and reject, from the plurality of first ear images, the images whose dispersion does not meet a second preset condition.
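The disclosure does not fix how the dispersion is computed; one plausible choice, used only for illustration, is the mean distance of the feature points from their centroid, filtered by an assumed acceptable band:

```python
import numpy as np

def dispersion(points):
    """Mean distance of the ear feature points from their centroid (one
    possible dispersion measure; other definitions are equally valid)."""
    pts = np.asarray(points, dtype=float)
    return float(np.mean(np.linalg.norm(pts - pts.mean(axis=0), axis=1)))

def keep_by_dispersion(per_image_points, low=5.0, high=80.0):
    """Reject images whose dispersion falls outside an assumed [low, high]
    band standing in for the second preset condition."""
    return [pts for pts in per_image_points if low <= dispersion(pts) <= high]
```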
In some embodiments of the present application, the calculating unit 1303 may be further configured to: select two feature points from the ear feature points of a single first ear image, and calculate a third distance between the two feature points in the first ear image; and take the third distance as the combined feature of the first ear image.
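A sketch of this distance-type combined feature follows; which two feature points are chosen is left open in the description, so the indices below are arbitrary placeholders:

```python
import numpy as np

def distance_combined_feature(points, i=0, j=1):
    """Third distance between two selected ear feature points, used as the
    combined feature of the first ear image."""
    pts = np.asarray(points, dtype=float)
    return float(np.linalg.norm(pts[i] - pts[j]))
```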
In some embodiments of the present application, the calculating unit 1303 may be further configured to: for a single first ear image, select at least three feature points from the ear feature points of the first ear image; calculate at least two feature vectors formed by the at least three feature points; determine two feature vectors from the at least two feature vectors, and calculate a first angle between the two feature vectors in the first ear image; and take the first angle as the combined feature of the first ear image.
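Correspondingly, the angle-type combined feature can be sketched as follows; the choice of the three points and of the two vectors built from them is an assumption for the example:

```python
import numpy as np

def angle_combined_feature(points, a=0, b=1, c=2):
    """First angle (in degrees) between the vectors a->b and a->c formed by
    three selected ear feature points."""
    pts = np.asarray(points, dtype=float)
    v1, v2 = pts[b] - pts[a], pts[c] - pts[a]
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```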
In some embodiments of the present application, the determining unit 1304 may be further configured to: screen out, from the plurality of first ear images, the first ear images whose combined features meet an extreme point condition; and confirm the screened first ear images as target ear images.
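One way to read the extreme point condition, assuming the first ear images are ordered by shooting angle, is to keep the images whose combined feature is a local maximum or minimum along that sequence; the sketch below reflects that assumption only:

```python
import numpy as np

def extreme_point_indices(combined_features):
    """Indices of images whose combined feature is a local extremum of the
    (angle-ordered) feature sequence; these images would be taken as the
    target ear images."""
    f = np.asarray(combined_features, dtype=float)
    idx = []
    for k in range(1, len(f) - 1):
        if (f[k] > f[k - 1] and f[k] > f[k + 1]) or (f[k] < f[k - 1] and f[k] < f[k + 1]):
            idx.append(k)
    return idx
```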
In some embodiments of the present application, the determining unit 1304 may be further configured to: if the number of target ear images is greater than 1, acquire a shooting angle difference between the plurality of target ear images; and if the shooting angle difference is smaller than an angle difference threshold, calculate the target feature point according to the ear feature points of the plurality of target ear images.
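Finally, when several target ear images survive the screening, their feature points can be fused. Averaging the per-image points and the 5-degree angle difference threshold are assumptions used only to make the sketch concrete:

```python
import numpy as np

def fuse_target_feature_points(per_image_points, shooting_angles, angle_diff_thresh=5.0):
    """per_image_points: list of (N, 2) arrays, one per target ear image;
    shooting_angles: corresponding shooting angles in degrees.
    If the angle spread is below the threshold, average the points;
    otherwise fall back to the first target image's points."""
    angles = np.asarray(shooting_angles, dtype=float)
    if len(per_image_points) > 1 and (angles.max() - angles.min()) < angle_diff_thresh:
        return np.mean(np.asarray(per_image_points, dtype=float), axis=0)
    return np.asarray(per_image_points[0], dtype=float)
```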
It should be noted that, for convenience and simplicity of description, the specific working process of the feature point selecting apparatus 1300 may refer to the corresponding process of the method described in fig. 1 to fig. 12, and is not described herein again.
Fig. 14 is a schematic diagram of a terminal according to an embodiment of the present application. The terminal 14 may include: a processor 140, a memory 141 and a computer program 142, such as a feature point selection program, stored in said memory 141 and executable on said processor 140. The processor 140 implements the steps in the above-described embodiments of the feature point selection method, such as the steps S101 to S104 shown in fig. 1, when executing the computer program 142. Alternatively, the processor 140, when executing the computer program 142, implements the functions of the modules/units in the above device embodiments, such as the functions of the units 1301 to 1304 in fig. 13.
The computer program may be divided into one or more modules/units, which are stored in the memory 141 and executed by the processor 140 to implement the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program in the terminal.
For example, the computer program may be divided into: an acquisition unit, an identification unit, a calculation unit and a determination unit. The specific functions of each unit are as follows: the acquisition unit is used for acquiring a plurality of first ear images, where the plurality of first ear images are all ear images of the same human ear; the identification unit is used for respectively identifying ear feature points in each first ear image, where the number of ear feature points in each first ear image is greater than 1; the calculation unit is used for respectively calculating a combined feature of the ear feature points in each first ear image; and the determination unit is used for screening out a target ear image from the plurality of first ear images according to the combined feature, and obtaining a target feature point according to the ear feature points of the target ear image.
The terminal may include, but is not limited to, the processor 140 and the memory 141. Those skilled in the art will appreciate that fig. 14 is merely an example of a terminal and is not intended to be limiting; the terminal may include more or fewer components than those shown, some components may be combined, or different components may be used. For example, the terminal may also include input/output devices, network access devices, buses, and the like.
The processor 140 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 141 may be an internal storage unit of the terminal, such as a hard disk or a memory of the terminal. The memory 141 may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card (Flash Card) provided on the terminal. Further, the memory 141 may include both an internal storage unit and an external storage device of the terminal. The memory 141 is used for storing the computer program and other programs and data required by the terminal, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the above-described apparatus/terminal embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (11)

1. A feature point selection method, comprising:
acquiring a plurality of first ear images, wherein the plurality of first ear images are ear images of the same human ear;
respectively identifying ear feature points in each first ear image, wherein the number of the ear feature points in each first ear image is greater than 1;
respectively calculating combined features of the ear feature points in each first ear image;
screening out a target ear image from the plurality of first ear images according to the combined features, and obtaining a target feature point according to the ear feature points of the target ear image.
2. The feature point selection method according to claim 1, wherein, in the step of separately identifying the ear feature points in the respective first ear images, the step of identifying a single first ear image includes:
inputting the first ear image into a plurality of pre-trained target detection models for processing to obtain a plurality of corresponding result scores, wherein the target detection models are models obtained by training ear image samples shot based on a first shooting angle, and the first shooting angle values of the ear image samples corresponding to different target detection models are different;
determining a first shooting angle corresponding to the target detection model with the highest result score, and determining a feature point extraction model obtained by training an ear image sample shot based on the first shooting angle;
and extracting the feature points of the first ear image by using the feature point extraction model to obtain the ear feature points of the first ear image.
3. The feature point selection method according to claim 2, wherein after the first ear image is input to a plurality of pre-trained target detection models for processing, an ear mark frame output by each target detection model is obtained;
correspondingly, before the separately calculating the combined features of the ear feature points in each of the first ear images, the method includes:
calculating a first distance between the ear mark frame and one or more of the ear feature points in each of the first ear images;
and removing the images with the first distances not meeting first preset conditions from the plurality of first ear images.
4. The feature point selection method according to claim 3, wherein the training step of the object detection model and the feature point extraction model associated with the same first photographing angle before the first ear image is input to the plurality of pre-trained object detection models for processing comprises:
acquiring a plurality of ear image samples shot at the first shooting angle;
and training a target detection model to be trained and a feature point extraction model to be trained respectively by using the acquired ear image samples, until the accuracies of the target detection model to be trained and the feature point extraction model to be trained are both greater than an accuracy threshold and the second distance between the ear mark frame output by the target detection model to be trained and the ear feature points output by the feature point extraction model to be trained meets the distance threshold requirement, so as to obtain the trained target detection model and feature point extraction model.
5. The feature point selection method according to any one of claims 2 to 4, further comprising, before the separately calculating the combined features of the ear feature points in each of the first ear images:
calculating the dispersion of the ear feature points in each first ear image;
and rejecting the images of which the dispersion does not meet a second preset condition from the plurality of first ear images.
6. The feature point selection method according to any one of claims 1 to 4, wherein, in the step of separately calculating the combined features of the ear feature points in the respective first ear images, the step of calculating a single first ear image includes:
selecting two feature points from the ear feature points of the first ear image, and calculating a third distance between the two feature points in the first ear image;
and taking the third distance as the combined feature of the first ear image.
7. The feature point selection method according to any one of claims 1 to 4, wherein, in the step of separately calculating the combined features of the ear feature points in the respective first ear images, the step of calculating a single first ear image includes:
selecting at least three feature points from the ear feature points of the first ear image;
calculating at least two feature vectors formed by the at least three feature points;
determining two feature vectors from the at least two feature vectors, and calculating a first angle between the two feature vectors in the first ear image;
and taking the first angle as the combined characteristic of the first ear image.
8. The feature point selection method according to any one of claims 1 to 4, wherein the screening out a target ear image from the plurality of first ear images according to the combined features comprises:
screening the first ear images of which the combined features meet extreme point conditions from the multiple first ear images;
confirming the screened first ear images as target ear images.
9. The feature point selection method according to any one of claims 1 to 4, wherein the obtaining a target feature point from the ear feature point of the target ear image includes:
if the number of the target ear images is larger than 1, acquiring a shooting angle difference value between a plurality of target ear images;
and if the shooting angle difference value is smaller than an angle difference threshold value, calculating the target feature point according to the ear feature points of the plurality of target ear images.
10. A terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 9 when executing the computer program.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.
CN202011441502.XA 2020-12-11 2020-12-11 Feature point selection method, terminal and storage medium Pending CN112580462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011441502.XA CN112580462A (en) 2020-12-11 2020-12-11 Feature point selection method, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011441502.XA CN112580462A (en) 2020-12-11 2020-12-11 Feature point selection method, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN112580462A true CN112580462A (en) 2021-03-30

Family

ID=75132036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011441502.XA Pending CN112580462A (en) 2020-12-11 2020-12-11 Feature point selection method, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112580462A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831390A (en) * 2012-07-02 2012-12-19 北京科技大学 Human ear authenticating system and method
US20130236066A1 (en) * 2012-03-06 2013-09-12 Gary David Shubinsky Biometric identification, authentication and verification using near-infrared structured illumination combined with 3d imaging of the human ear
CN108960076A (en) * 2018-06-08 2018-12-07 东南大学 Ear recognition and tracking based on convolutional neural networks
CN109034114A (en) * 2018-08-22 2018-12-18 上海掌门科技有限公司 A kind of ear examines method, equipment, system and computer-readable medium
CN109117864A (en) * 2018-07-13 2019-01-01 华南理工大学 Coronary heart disease risk prediction technique, model and system based on heterogeneous characteristic fusion
WO2019075656A1 (en) * 2017-10-18 2019-04-25 腾讯科技(深圳)有限公司 Image processing method and device, terminal, and storage medium
CN111062248A (en) * 2019-11-08 2020-04-24 宇龙计算机通信科技(深圳)有限公司 Image detection method, device, electronic equipment and medium
CN111680562A (en) * 2020-05-09 2020-09-18 北京中广上洋科技股份有限公司 Human body posture identification method and device based on skeleton key points, storage medium and terminal
CN111932604A (en) * 2020-08-24 2020-11-13 腾讯音乐娱乐科技(深圳)有限公司 Method and device for measuring human ear characteristic distance

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130236066A1 (en) * 2012-03-06 2013-09-12 Gary David Shubinsky Biometric identification, authentication and verification using near-infrared structured illumination combined with 3d imaging of the human ear
CN102831390A (en) * 2012-07-02 2012-12-19 北京科技大学 Human ear authenticating system and method
WO2019075656A1 (en) * 2017-10-18 2019-04-25 腾讯科技(深圳)有限公司 Image processing method and device, terminal, and storage medium
CN108960076A (en) * 2018-06-08 2018-12-07 东南大学 Ear recognition and tracking based on convolutional neural networks
CN109117864A (en) * 2018-07-13 2019-01-01 华南理工大学 Coronary heart disease risk prediction technique, model and system based on heterogeneous characteristic fusion
CN109034114A (en) * 2018-08-22 2018-12-18 上海掌门科技有限公司 A kind of ear examines method, equipment, system and computer-readable medium
CN111062248A (en) * 2019-11-08 2020-04-24 宇龙计算机通信科技(深圳)有限公司 Image detection method, device, electronic equipment and medium
CN111680562A (en) * 2020-05-09 2020-09-18 北京中广上洋科技股份有限公司 Human body posture identification method and device based on skeleton key points, storage medium and terminal
CN111932604A (en) * 2020-08-24 2020-11-13 腾讯音乐娱乐科技(深圳)有限公司 Method and device for measuring human ear characteristic distance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CELIA CINTAS et al.: "Automatic ear detection and feature extraction using Geometric Morphometrics and convolutional neural networks", IET BIOMETRICS, vol. 6, no. 3, 31 January 2017 (2017-01-31), pages 211 - 223, XP006076398, DOI: 10.1049/iet-bmt.2016.0002 *
王诗宁: "Research on Human Ear Recognition Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology, no. 01, 15 January 2019 (2019-01-15), pages 138 - 4523 *

Similar Documents

Publication Publication Date Title
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
JP4772839B2 (en) Image identification method and imaging apparatus
CN110443110B (en) Face recognition method, device, terminal and storage medium based on multipath camera shooting
US8792722B2 (en) Hand gesture detection
EP2164027A1 (en) Object detecting device, imaging apparatus, object detecting method, and program
CN109117773B (en) Image feature point detection method, terminal device and storage medium
WO2013165565A1 (en) Method of detecting a main subject in an image
CN112101124B (en) Sitting posture detection method and device
CN109815823B (en) Data processing method and related product
CN112199530B (en) Multi-dimensional face library picture automatic updating method, system, equipment and medium
CN116071790A (en) Palm vein image quality evaluation method, device, equipment and storage medium
KR101541384B1 (en) Device for Recognition of Object and method
CN111027450A (en) Bank card information identification method and device, computer equipment and storage medium
CN110363111B (en) Face living body detection method, device and storage medium based on lens distortion principle
CN113780201B (en) Hand image processing method and device, equipment and medium
CN111860057A (en) Face image blurring and living body detection method and device, storage medium and equipment
CN110717452B (en) Image recognition method, device, terminal and computer readable storage medium
CN110210425B (en) Face recognition method and device, electronic equipment and storage medium
CN112580462A (en) Feature point selection method, terminal and storage medium
CN113158773B (en) Training method and training device for living body detection model
CN113469135A (en) Method and device for determining object identity information, storage medium and electronic device
CN110971820B (en) Photographing method, photographing device, mobile terminal and computer readable storage medium
CN116109543A (en) Method and device for quickly identifying and reading data and computer readable storage medium
CN114399432A (en) Target identification method, device, equipment, medium and product
CN111275045A (en) Method and device for identifying image subject, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination