CN106203242B - Similar image identification method and equipment


Info

Publication number
CN106203242B
Authority
CN
China
Prior art keywords
image, normalized, identified, distance, specified
Prior art date
Legal status
Active
Application number
CN201510229654.6A
Other languages
Chinese (zh)
Other versions
CN106203242A (en)
Inventor
陈岳峰
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510229654.6A
Priority to PCT/CN2016/079158 (published as WO2016177259A1)
Publication of CN106203242A
Application granted
Publication of CN106203242B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761: Proximity, similarity or dissimilarity measures


Abstract

The application discloses a similar image identification method. A region to be compared, corresponding to a specified feature, is determined in a first image to be identified, and the image in that region is aligned with a preset standard image; the resolution of the aligned image is then adjusted to a preset resolution, and the adjusted image is taken as a normalized image. Finally, a metric distance between the normalized image of the first image to be identified and the normalized image of a second image to be identified is obtained, and whether the specified feature of the first image to be identified is similar to that of the second image to be identified is determined according to the metric distance and a preset threshold. In this way, on the premise of ensuring accuracy, the similarity between one image to be detected and another is identified quickly and efficiently, providing a reference basis for improving the security of existing systems.

Description

Similar image identification method and equipment
Technical Field
The present application relates to the field of communications technologies, and in particular to a similar image identification method. The application also relates to a similar image identification device.
Background
With the development of the Internet and computer information technology, online shopping has become a new norm. In a sales system with a large volume of visits and purchases, tens of millions of merchant users sell goods through the system every day; at the same time, many lawbreakers attempt to cheat the system by impersonating others, which gives rise to various illegal operations and damages the rights and interests of legitimate users. How to judge, based on a photo to be detected and the existing photo on record uploaded by the merchant, whether the two photos show the same person is therefore one of the problems the sales system must solve.
Traditional face authentication methods mainly describe the faces in the photo to be detected and in the existing photo with features such as SIFT (Scale-Invariant Feature Transform) and LBP (Local Binary Pattern), and then judge whether the two faces belong to the same person with a classifier. SIFT is a local feature descriptor used in the field of image processing; it is scale-invariant and can detect key points in an image. SIFT features are based on points of interest in the local appearance of an object, are unrelated to the size and rotation of the image, and are quite tolerant of lighting, noise, and slight changes of viewing angle. LBP is an effective texture description operator that can measure and extract local texture information of an image and is invariant to illumination.
However, in the process of implementing the present application, the inventors found the following disadvantage in the prior art: traditional feature-description-based face authentication algorithms usually extract high-dimensional features from the face region and authenticate by means of a classifier. Such algorithms tend to work only on pictures or photographs in which the facial features are particularly clear. When the background is complex and the face varies greatly, prior-art recognition techniques often cannot accurately determine whether the faces in two photos belong to the same person. How to recognize an image to be detected against an existing image quickly and efficiently while ensuring recognition accuracy has therefore become a technical problem to be urgently solved by those skilled in the art.
Disclosure of Invention
The application provides a similar image identification method, which is used for carrying out rapid and efficient identification on an image to be detected and an existing image on the premise of ensuring accuracy, and comprises the following steps:
acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
aligning the image in the area to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
determining a metric distance between the normalized image and a normalized image of a second image to be recognized, the metric distance being generated according to the distances of the normalized image and the normalized image of the second image to be recognized in a feature space, wherein the distance of a similar normalized image in the feature space is smaller than the distance of a non-similar normalized image in the feature space;
if the metric distance is greater than a preset threshold, determining that the specified features of the first image to be recognized and the second image to be recognized are not similar;
and if the metric distance is smaller than or equal to the threshold value, determining that the specified features of the first image to be recognized and the second image to be recognized are similar.
Preferably, the acquiring a region to be compared corresponding to the specified feature in the first image to be recognized specifically includes:
determining the area to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature;
and acquiring key point coordinates corresponding to a plurality of key point features of the specified features in the region to be compared through a preset key point regression model.
Preferably, the image in the region to be compared is aligned with a preset standard image, specifically:
mapping each key point coordinate of the area to be compared into the key point coordinate of the aligned image according to the parameter M;
and the parameter M is generated according to the key point coordinates of each standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
Preferably, after the aligned image is used as a normalized image of the first image to be recognized, the method further includes:
and adjusting the resolution of the normalized image to a preset resolution.
Preferably, the metric distance between the normalized image and the normalized image of the second image to be recognized is determined, specifically:
extracting specified features in the normalized image through a convolutional neural network;
determining a feature value of the specified feature after it is mapped into a feature space according to the convolutional neural network and a distance metric loss function, and taking this value as the feature value of the normalized image;
and determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
Preferably, the specified feature is specifically a face region, and the keypoint features at least include a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
Preferably, the convolutional neural network parameters are obtained by training according to labeled images, and the labeled images comprise normalized images with mutually similar specified features and normalized images with mutually dissimilar specified features.
Correspondingly, the present application also provides a similar image recognition device, including:
the acquisition module is used for acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
the alignment module is used for aligning the image in the area to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
a determining module, configured to determine a metric distance between the normalized image of the first image to be recognized and a normalized image of a second image to be recognized, where the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images;
the identification module is used for confirming that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is larger than a preset threshold value, and confirming that the specified features of the first image to be identified and the second image to be identified are similar when the metric distance is smaller than or equal to the threshold value.
Preferably, the acquisition module is specifically configured to:
determining the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature, and acquiring key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
Preferably, the alignment module is specifically configured to:
and mapping each key point coordinate of the area to be compared to the key point coordinate of the aligned image according to a parameter M, wherein the parameter M is generated according to each key point coordinate of the standard image and the key point coordinate of the image corresponding to the specified feature in the labeled image.
Preferably, the method further comprises the following steps:
and the adjusting module is used for adjusting the resolution of the normalized image to a preset resolution.
Preferably, the determining module is specifically configured to:
extracting specified features in the normalized image through a convolutional neural network;
determining a feature value of the specified feature after it is mapped into a feature space according to the convolutional neural network and a distance metric loss function, and taking this value as the feature value of the normalized image;
and determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
Preferably, the specified feature is specifically a face region, and the keypoint features at least include a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
Preferably, the convolutional neural network parameters are obtained by training according to labeled images, and the labeled images comprise normalized images with mutually similar specified features and normalized images with mutually dissimilar specified features.
Correspondingly, the application also provides a similar image identification method, which is applied to a client and comprises the following steps:
receiving an identity authentication request of a user, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user;
sending the identity authentication request to a server so that the server acquires a second image to be identified corresponding to the user according to the authentication information;
receiving an identity authentication response sent by the server;
and the client displays an authentication result to the user according to the identity authentication response.
Preferably, the receiving of the identity authentication request of the user specifically includes:
acquiring the image uploaded by the user and the information input by the user;
taking the image as the first image to be identified and the information as the authentication information;
and generating the identity authentication request according to the first image to be identified and the authentication information.
Preferably, the identity authentication response is an identity authentication success response or an identity authentication failure response, wherein:
the identity authentication success response is generated after the server confirms that the specified features of the first image to be recognized and the second image to be recognized are similar;
the identity authentication failure response is generated by the server after confirming that the specified features of the first image to be recognized and the second image to be recognized are not similar.
Preferably, the authentication result is presented to the user according to the identity authentication response, specifically:
when the identity authentication success response is received, displaying a preset interface corresponding to the identity authentication success response to the user;
and when the identity authentication failure response is received, displaying a preset interface corresponding to the identity authentication failure response to the user, and displaying to the user prompt information asking whether manual verification is needed.
Preferably, after the preset interface corresponding to the identity authentication failure response and the prompt asking whether manual verification is needed are displayed to the user, the method further includes:
and if a manual verification request of the user is received, sending the identity authentication request to a preset server.
Correspondingly, the present application also provides a client, including:
the receiving module is used for receiving an identity authentication request of a user, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user;
the sending module is used for sending the identity authentication request to a server so that the server can obtain a second image to be identified corresponding to the user according to the authentication information;
the receiving module is used for receiving the identity authentication response sent by the server;
and the display module is used for displaying the authentication result to the user according to the identity authentication response.
Preferably, the receiving module is specifically configured to:
and acquiring the image uploaded by the user and the information input by the user, taking the image as the first image to be identified, taking the information as the authentication information, and generating the identity authentication request according to the first image to be identified and the authentication information.
Preferably, the identity authentication response is an identity authentication success response or an identity authentication failure response, wherein:
the identity authentication success response is generated after the server confirms that the specified features of the first image to be recognized and the second image to be recognized are similar;
the identity authentication failure response is generated by the server after confirming that the specified features of the first image to be recognized and the second image to be recognized are not similar.
Preferably, the display module is specifically configured to display a preset interface corresponding to the identity authentication success response to the user when the receiving module receives the identity authentication success response;
or, the display module is specifically configured to, when the receiving module receives the identity authentication failure response, display a preset interface corresponding to the identity authentication failure response to the user and display to the user a prompt message asking whether manual verification is required.
Preferably, after the display module displays the preset interface corresponding to the identity authentication failure response and the prompt asking whether manual verification is required, the receiving module further receives a manual verification request of the user and instructs the sending module to send the identity authentication request to a preset server.
Correspondingly, the application also provides a similar image identification method, which is applied to a server and comprises the following steps:
receiving an identity authentication request sent by the client, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user;
inquiring a second image to be identified corresponding to the user according to the authentication information;
acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
aligning the image in the area to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
determining a metric distance between the normalized image and a normalized image of a second image to be recognized, the metric distance being generated according to the distances of the normalized image and the normalized image of the second image to be recognized in a feature space, wherein the distance of a similar normalized image in the feature space is smaller than the distance of a non-similar normalized image in the feature space;
if the metric distance is greater than a preset threshold, confirming that the specified features of the first image to be recognized and the second image to be recognized are not similar, and returning an identity authentication failure response to the client;
and if the metric distance is less than or equal to the threshold, confirming that the specified features of the first image to be recognized and the second image to be recognized are similar, and returning an identity authentication success response to the client.
Preferably, the acquiring a region to be compared corresponding to the specified feature in the first image to be recognized specifically includes:
determining the area to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature;
and acquiring key point coordinates corresponding to a plurality of key point features of the specified features in the region to be compared through a preset key point regression model.
Preferably, the image in the region to be compared is aligned with a preset standard image, specifically:
mapping each key point coordinate of the area to be compared into the key point coordinate of the aligned image according to the parameter M;
and the parameter M is generated according to the key point coordinates of each standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
Preferably, after the aligned image is used as a normalized image of the first image to be recognized, the method further includes:
and adjusting the resolution of the normalized image to a preset resolution.
Preferably, the metric distance between the normalized image and the normalized image of the second image to be recognized is determined, specifically:
extracting specified features in the normalized image through a convolutional neural network;
determining a feature value of the specified feature after it is mapped into a feature space according to the convolutional neural network and a distance metric loss function, and taking this value as the feature value of the normalized image;
and determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
Preferably, the specified feature is specifically a face region, and the keypoint features at least include a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
Preferably, the convolutional neural network parameters are obtained by training according to labeled images, and the labeled images comprise normalized images with mutually similar specified features and normalized images with mutually dissimilar specified features.
Correspondingly, the present application also proposes a server, including:
the receiving module is used for receiving an identity authentication request sent by the client, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user;
the query module is used for querying a second image to be identified corresponding to the user according to the authentication information;
the acquisition module is used for acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
the alignment module is used for aligning the image in the area to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
a determining module, configured to determine a metric distance between the normalized image of the first image to be recognized and a normalized image of a second image to be recognized, where the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images;
the identification module is used for confirming that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is greater than a preset threshold value, and confirming that the specified features of the first image to be identified and the second image to be identified are similar when the metric distance is less than or equal to the threshold value;
and the sending module is used for returning an authentication failure response to the client when the recognition module confirms that the specified features of the first image to be recognized and the second image to be recognized are not similar, and returning an authentication success response to the client when the recognition module confirms that the specified features of the first image to be recognized and the second image to be recognized are similar.
Preferably, the acquisition module is specifically configured to:
determining the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature, and acquiring key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
Preferably, the alignment module is specifically configured to map, according to a parameter M, each key point coordinate of the region to be compared to a key point coordinate of the aligned image, where the parameter M is generated according to each key point coordinate of the standard image and a key point coordinate of an image corresponding to the specified feature in the labeled image.
Preferably, the method further comprises the following steps:
and the adjusting module is used for adjusting the resolution of the normalized image to a preset resolution.
Preferably, the determining module is specifically configured to:
extracting specified features in the normalized image through a convolutional neural network;
determining a feature value of the specified feature after it is mapped into a feature space according to the convolutional neural network and a distance metric loss function, and taking this value as the feature value of the normalized image;
and determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
Preferably, the specified feature is specifically a face region, and the keypoint features at least include a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
Preferably, the convolutional neural network parameters are obtained by training according to labeled images, and the labeled images comprise normalized images with mutually similar specified features and normalized images with mutually dissimilar specified features.
Therefore, by applying the technical scheme of the application, after the region to be compared corresponding to the specified feature in the first image to be recognized is determined and the image in that region is aligned with the preset standard image, the resolution of the aligned image is adjusted to the preset resolution and the adjusted image is taken as the normalized image; the metric distance between the normalized image of the first image to be recognized and the normalized image of the second image to be recognized is then obtained, and whether the specified features of the two images are similar is determined according to the metric distance and a preset threshold. In this way, on the premise of ensuring accuracy, the similarity between one image to be detected and another is identified quickly and efficiently, providing a reference basis for improving the security of existing systems.
Drawings
Fig. 1 is a schematic flow chart of the similar image recognition method proposed in the present application;
Fig. 2 is a diagram of the convolutional neural network structure for training face feature point positioning in an embodiment of the present application;
Fig. 3 is a schematic flow chart illustrating depth metric learning according to an embodiment of the present application;
Fig. 4 is a diagram of the convolutional neural network structure for training face authentication in an embodiment of the present application;
Fig. 5 is a schematic flow chart illustrating similar image recognition performed by a client according to an embodiment of the present application;
Fig. 6 is a schematic flow chart illustrating similar image recognition performed by a server according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of the similar image recognition device proposed in the present application;
Fig. 8 is a schematic structural diagram of the client proposed in the present application;
Fig. 9 is a schematic structural diagram of the server proposed in the present application.
Detailed Description
With the popularization of mobile devices, face authentication plays an important role in more and more places. The face authentication process, however, is disturbed by many objective factors: because of viewing angle and similar causes, the face in an image that contains one generally cannot be extracted and compared directly. In view of this problem, the present application proposes a recognition method for similar images, which can be implemented by means of computer devices in a network environment. The user-facing client can be a mobile device supporting keypad or touch-screen input, or a PC device; the client and the server are connected through a wired or wireless network.
As shown in Fig. 1, the similar image recognition method proposed by the present application includes the following steps:
S101, acquiring a region to be compared corresponding to the specified feature in the first image to be recognized.
When judging the similarity of face pictures in order to accurately determine whether two pictures show the same person, accurately locating key points of the face (such as the eyes, the nose, and the mouth corners) and aligning the faces are essential steps. Therefore, in the preferred embodiment of the present application, the region to be compared can be obtained by determining the coordinates of the key points of the face. Specifically, the region to be compared corresponding to the specified feature may first be determined in the first image to be recognized according to the detection algorithm corresponding to the specified feature; the coordinates of the key points corresponding to the several key point features of the specified feature are then obtained within that region through a preset key point regression model, so that the region to be compared is determined accurately. Accordingly, the specified feature may be a face region, and the key point features include at least a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
According to one embodiment of the application, a deep convolutional neural network is adopted to perform the regression of the face key points. The structure of the network in this embodiment, shown in Fig. 2, comprises 4 convolutional layers and 2 fully-connected layers. The first 3 convolutional layers each contain a max pooling operation, while the last convolutional layer contains only the convolution operation. The first fully-connected layer contains 100 nodes; the second fully-connected layer contains 10 nodes, representing the coordinates of the 5 face key points. The regression uses the Euclidean distance as its loss function:

L(x, x̂) = ‖x − x̂‖²

where x denotes the labeled key point coordinates and x̂ denotes the key point coordinates predicted by the convolutional neural network. By minimizing this loss function with a stochastic gradient descent algorithm, the embodiment optimizes the parameters of the model and thereby trains a model for predicting the face key points.
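A minimal PyTorch sketch of such a regression network is given below, kept close to the structure just described (4 convolutional layers, the first 3 with max pooling, then 100-node and 10-node fully-connected layers). The channel counts, kernel sizes, input size, and learning rate are illustrative assumptions, since the document does not specify them.

```python
import torch
import torch.nn as nn

class KeypointRegressor(nn.Module):
    """Regresses 10 values: the (x, y) coordinates of 5 face key points."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5), nn.ReLU(), nn.MaxPool2d(2),   # conv1 + max pooling
            nn.Conv2d(16, 32, 3), nn.ReLU(), nn.MaxPool2d(2),  # conv2 + max pooling
            nn.Conv2d(32, 64, 3), nn.ReLU(), nn.MaxPool2d(2),  # conv3 + max pooling
            nn.Conv2d(64, 64, 2), nn.ReLU(),                   # conv4: convolution only
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(100), nn.ReLU(),  # first fully-connected layer: 100 nodes
            nn.Linear(100, 10),             # second: 10 nodes = 5 key points x (x, y)
        )

    def forward(self, x):                   # x: (N, 1, H, W) grayscale face crops
        return self.fc(self.features(x))

model = KeypointRegressor()
criterion = nn.MSELoss()                    # squared Euclidean regression loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # stochastic gradient descent
```

A training step would simply compute criterion(model(images), labeled_coords) on the labeled coordinates and back-propagate.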
Next, each key point coordinate of the region to be compared is mapped to a key point coordinate of the aligned image according to a parameter M, where M is generated from the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
Taking the parameters in the above embodiment as an example, 5 key point positions are defined in the standard face, namely the left eye, the right eye, the nose, the left mouth corner, and the right mouth corner, and the detected face is aligned to the standard face by rotating, translating, and scaling it. Assuming the feature point position in the standard face is (x, y) and the predicted feature point position is (x', y'), the relationship between the two is the similarity transform

x = a·x' − b·y' + c
y = b·x' + a·y' + d

where the position parameters a = s·cos θ, b = s·sin θ, c = t_x, and d = t_y are 4 unknowns, so 4 equations are needed to solve them. To make the alignment result more robust, this embodiment maps all 5 points and establishes a linear equation system. Specifically, writing the equations above for every point in the form of a linear system M·p = y, with p = (a, b, c, d), and minimizing ‖M·p − y‖², the least-squares solution is

p = (MᵀM)⁻¹Mᵀy.
The above is the detailed process of generating the parameter M; a person skilled in the art can align the face image according to M, and on this basis, other improved implementations capable of obtaining M also fall within the protection scope of the present application.
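As a concrete illustration, the least-squares solution and the subsequent warp can be sketched with numpy and OpenCV. Only the 5 key points and the 39x39 output size come from this document; the function names and everything else are an assumed minimal implementation.

```python
import numpy as np
import cv2

def solve_similarity(detected_pts, standard_pts):
    """Least-squares similarity transform (rotation + translation + scaling)
    mapping the detected key points onto the standard-face key points.
    Each of the 5 point pairs contributes 2 linear equations in the
    4 unknowns (a, b, c, d) = (s*cos(theta), s*sin(theta), t_x, t_y)."""
    rows, rhs = [], []
    for (xp, yp), (x, y) in zip(detected_pts, standard_pts):
        rows.append([xp, -yp, 1, 0]); rhs.append(x)   # x = a*x' - b*y' + c
        rows.append([yp,  xp, 0, 1]); rhs.append(y)   # y = b*x' + a*y' + d
    M = np.asarray(rows, dtype=np.float64)
    y = np.asarray(rhs, dtype=np.float64)
    a, b, c, d = np.linalg.solve(M.T @ M, M.T @ y)    # p = (M^T M)^{-1} M^T y
    return np.array([[a, -b, c],
                     [b,  a, d]])                     # 2x3 matrix for cv2.warpAffine

# detected_pts, standard_pts: (5, 2) arrays ordered as left eye, right eye,
# nose, left mouth corner, right mouth corner (the coordinate values are assumed).
# warp = solve_similarity(detected_pts, standard_pts)
# aligned = cv2.warpAffine(face_img, warp, (39, 39))  # also yields the 39x39 resolution
```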
S102, aligning the image in the region to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified.
In order to reduce the interference of other objective factors and make the comparison result more accurate, after the region to be compared corresponding to the specified feature is determined, the image in that region needs to be aligned with the preset standard image. In the preferred embodiment of the present application, this step aligns the face to a standard face; a person skilled in the art can define the standard face based on existing comparison standards, and such choices all fall within the scope of the present application.
In addition, in order to further standardize the image for processing, after the above process is finished, the application adjusts the resolution of the normalized image to the preset resolution. According to an embodiment of the present application, if the face region needs to be scaled to the 39x39 specification according to the preset parameters, the step normalizes the face key point coordinate information to the 39x39 scale space.
S103, determining a metric distance between the normalized image of the first image to be recognized and the normalized image of the second image to be recognized, wherein the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images.
Based on the above description, in a preferred embodiment of the present application, the specified feature in the normalized image is first extracted through a convolutional neural network; a feature value of the specified feature mapped into a feature space is then determined according to the convolutional neural network and a distance metric loss function and taken as the feature value of the normalized image; finally, the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be recognized is determined and taken as the metric distance.
For the particular scenario of face authentication (comparing two faces), the technical scheme of the application combines a deep convolutional neural network with metric learning to train a face authentication model. Deep convolutional neural networks are currently in wide use in the field of image understanding, including image classification, image retrieval, target detection, face recognition, and the like. Compared with the traditional approach of hand-designed features plus a classifier, a convolutional neural network has the advantages of learning the features itself and of good model generalization. Metric learning maps the feature space linearly or nonlinearly so that the distance between features of the same face is smaller than the distance between features of different faces.
It should be noted that the convolutional neural network parameters are obtained by training according to labeled images, and the labeled images include normalized images with mutually similar specified features and normalized images with mutually dissimilar specified features.
Specifically, in order to obtain a face authentication model based on depth metric learning, the specific embodiment of the present application labels samples in a pair-wise manner: each sample contains 2 portrait pictures; if the two pictures show different persons, the sample is negative, and if they show the same person, the sample is positive. Positive samples are generated by collecting several pictures of the same person and combining them two by two; negative samples are generated by pairing pictures that are not of the same person. After a sample goes through face detection, key point prediction with the trained face key point prediction model, alignment of the face to the standard face, and scaling of the image resolution to 39x39, the face authentication model based on depth metric learning can be trained.
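A sketch of this pair-wise labeling follows, under the assumption that the collected pictures are grouped by person in a dict; the structure and the random sampling of negatives are illustrative, not prescribed by this document.

```python
import itertools
import random

def make_pairs(images_by_person, negatives_per_pair=1):
    """Positive samples (+1): pictures of the same person combined two by two.
    Negative samples (-1): pictures of two different persons paired up."""
    pairs = []
    for imgs in images_by_person.values():
        pairs += [(a, b, +1) for a, b in itertools.combinations(imgs, 2)]
    for p, q in itertools.combinations(images_by_person, 2):
        for _ in range(negatives_per_pair):
            pairs.append((random.choice(images_by_person[p]),
                          random.choice(images_by_person[q]), -1))
    return pairs
```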
As shown in Fig. 3, after face detection, key point positioning, and face alignment are performed on the two images in each group of samples, the two images are input to a convolutional neural network, and the learned face features are extracted through convolution, where the parameters W of the left and right networks are shared. Finally, the distance of the features is measured in a high-level semantic space. Depth metric learning mainly consists of 2 parts: the parameter W, i.e. the parameters of the convolutional neural network that need to be trained, and the distance metric loss function. Unlike traditional face recognition, the input here is 2 faces, and the final loss is the distance of the 2 faces in the feature space. The structure of the convolutional neural network used in this embodiment is shown in Fig. 4 and comprises 4 convolutional layers and 2 fully-connected layers, where 3 of the convolutional layers are each followed by a max sampling layer. The max sampling layer gives the extracted features translation invariance and reduces the computational complexity. Finally, the face features are mapped nonlinearly into a 100-dimensional feature space.
Metric learning, in essence, looks for a transformation space in which the distance between samples of the same class is reduced and the distance between samples of different classes is increased. This step therefore first finds, through metric learning, a nonlinear transformation that maps the face from its original pixels into a feature space in which the distance between similar faces is small and the distance between dissimilar faces is large. The face features are extracted by the deep convolutional neural network, and the features learned by the network are finally mapped into the feature space in combination with metric learning. Because the convolutional neural network maps the face image nonlinearly, the resulting feature representation is more robust than artificially designed features, and the accuracy of face authentication is higher.
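A minimal sketch of the shared-weight branch and of the metric distance follows, matching the structure described for Fig. 4 (4 convolutional layers, 3 of them followed by max sampling, and a nonlinear mapping to 100 dimensions); the channel counts and kernel sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceEmbedder(nn.Module):
    """One branch of the Siamese network; both branches share the parameters W."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 20, 4), nn.ReLU(), nn.MaxPool2d(2),   # conv1 + max sampling
            nn.Conv2d(20, 40, 3), nn.ReLU(), nn.MaxPool2d(2),  # conv2 + max sampling
            nn.Conv2d(40, 60, 3), nn.ReLU(), nn.MaxPool2d(2),  # conv3 + max sampling
            nn.Conv2d(60, 80, 2), nn.ReLU(),                   # conv4
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(120), nn.ReLU(),
            nn.Linear(120, 100),        # nonlinear mapping to the 100-dim feature space
        )

    def forward(self, x):               # x: (N, 1, 39, 39) normalized faces
        return self.fc(self.conv(x))

def metric_distance(net, img_a, img_b):
    """Euclidean distance of the two faces in the learned feature space."""
    return F.pairwise_distance(net(img_a), net(img_b))
```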
S104, if the metric distance is greater than a preset threshold, determining that the specified features of the first image to be recognized and the second image to be recognized are not similar.
S105, if the metric distance is less than or equal to the threshold, determining that the specified features of the first image to be recognized and the second image to be recognized are similar.
In the specific embodiment of S103, metric learning is performed on the resulting 100-dimensional features. The loss function adopted in the process is

J(W) = Σ_{(X_i, X_j) ∈ P} g( l_ij · (d²_W(X_i, X_j) − τ) ),  with  g(z) = (1/β) · log(1 + exp(β·z)),

where g represents the generalized logistic loss function, (X_i, X_j) ∈ P represents the set of sample pairs, and l_ij indicates the class of the sample: l_ij = 1 represents that X_i and X_j are the same person, and l_ij = −1 represents that X_i and X_j are not the same person. W is the parameter of the model, F_W(X_i) represents the feature value mapped to 100 dimensions when the current model parameter is W, and d²_W(X_i, X_j) = ‖F_W(X_i) − F_W(X_j)‖² represents the distance of the two faces in the feature space: the smaller the distance, the more similar the two faces. When (X_i, X_j) is the same person (l_ij = 1), the loss function increases as d²_W(X_i, X_j) increases; correspondingly, when (X_i, X_j) is not the same person, the loss function decreases as d²_W(X_i, X_j) increases. τ represents the threshold for whether two faces are the same person.
The parameters W of the model can be obtained by minimizing the loss function. Preferably, a technician may derive the gradients of the corresponding parameters by the chain rule and optimize the model parameters with stochastic gradient descent (SGD); other optimization methods capable of achieving the same effect also fall within the protection scope of the present application.
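The sketch below implements the loss written above in PyTorch together with one SGD step; the β and τ values are assumed hyperparameters, and FaceEmbedder is the branch from the earlier sketch.

```python
import torch
import torch.nn.functional as F

def pairwise_metric_loss(feat_i, feat_j, labels, tau=1.0, beta=2.0):
    """labels l_ij: +1 for the same person, -1 for different persons.
    F.softplus(z, beta=beta) computes (1/beta) * log(1 + exp(beta * z)),
    i.e. the generalized logistic loss g(z)."""
    d2 = (feat_i - feat_j).pow(2).sum(dim=1)   # squared distance d_W^2(X_i, X_j)
    return F.softplus(labels * (d2 - tau), beta=beta).mean()

# One optimization step on a batch of labeled pairs (x_i, x_j, labels):
# net = FaceEmbedder()                         # branch from the earlier sketch
# optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)
# loss = pairwise_metric_loss(net(x_i), net(x_j), labels.float())
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```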
Based on the above description, when a pair of images is input, the faces in the two images are detected separately, feature points are extracted from the face regions, and the faces are aligned. Features are then extracted with the model trained in the previous step and mapped into the 100-dimensional space, and finally the Euclidean distance between the features of the two faces is calculated: if the distance is greater than τ, the two faces are not the same person; otherwise, they are the same person.
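In code, this final decision reduces to a threshold comparison, reusing metric_distance from the earlier sketch (tau being the trained threshold):

```python
def same_person(net, img_a, img_b, tau):
    """Similar (same person) when the metric distance does not exceed the
    threshold tau, matching steps S104/S105 above."""
    return metric_distance(net, img_a, img_b).item() <= tau
```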
The above scheme describes in detail a process of determining whether a group of images to be recognized are similar, and the process in a specific implementation scenario can be completed by a client and a server together. In the implementation scenario, the user may upload the picture and the information using a mobile terminal such as a smart phone or a tablet computer, or upload the picture and the related information via a PC terminal. As a main body for processing the image, the server may be a data server or a web server set up in advance by a system operator.
As the link between the user and the server, the client is mainly used to forward the user's input to the server; the server verifies the user's identity according to that input, and the client finally displays the result to the user according to the verification result returned by the server. The similar image recognition method on the client side is described first, as shown in Fig. 5, and includes the following steps:
S501, receiving an identity authentication request of a user, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user.
In this embodiment, the form of the client is not limited, and the client may be a PC device or a mobile device. Specifically, the client first obtains the image uploaded by the user and the information input by the user, then takes the image as the first image to be recognized, takes the information as the authentication information, and finally generates the identity authentication request according to the first image to be recognized and the authentication information.
S502, the identity authentication request is sent to a server, so that the server can obtain a second image to be identified corresponding to the user according to the authentication information.
After obtaining the authentication information of the user, the server can obtain the second image to be identified corresponding to the user according to that information. In a specific embodiment, the server queries the database for the image on the user's identification card according to the identification information provided by the user and uses it as the second image to be identified, so as to determine whether the image uploaded by the user matches the identification-card image.
S503, receiving the identity authentication response sent by the server.
In this embodiment, the identity authentication response is an identity authentication success response or an identity authentication failure response, where the identity authentication success response is generated by the server after confirming that the specified features of the first image to be recognized and the second image to be recognized are similar, and the identity authentication failure response is generated by the server after confirming that the specified features of the first image to be recognized and the second image to be recognized are not similar.
S504, an authentication result is displayed to the user according to the identity authentication response.
Depending on whether the identity authentication response indicates success or failure, this step is implemented as follows:
(1) when the client receives the identity authentication success response, the client displays a preset interface corresponding to the identity authentication success response to the user;
(2) when the client receives the identity authentication failure response, the client displays a preset interface corresponding to the identity authentication failure response to the user and displays to the user prompt information asking whether manual verification is needed.
Because the equipment judges the group of images to be identified automatically, in order to further avoid the influence of errors, prompt information asking whether manual verification is needed is displayed to the user when a failure response is returned. If the user wishes to submit the images for manual review, the user enters a manual verification request at the client, and after receiving the manual verification request the client sends the identity authentication request to a preset server.
The above is the client-side flow, which mainly realizes the interaction between the user and the server. The following embodiment is the similar image identification method on the server side, as shown in Fig. 6, including the following steps:
S601, receiving an identity authentication request sent by the client, wherein the identity authentication request carries a first image to be identified uploaded by the user and authentication information of the user;
S602, inquiring a second image to be identified corresponding to the user according to the authentication information;
S603, acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
S604, aligning the image in the region to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
S605, determining a metric distance between the normalized image and a normalized image of a second image to be identified, wherein the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images;
S606, if the metric distance is greater than a preset threshold, determining that the specified features of the first image to be recognized and the second image to be recognized are not similar, and returning an identity authentication failure response to the client;
S607, if the metric distance is less than or equal to the threshold, determining that the specified features of the first image to be recognized and the second image to be recognized are similar, and returning an identity authentication success response to the client.
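As an illustration only, the S601-S607 flow maps naturally onto a small HTTP endpoint. The transport, the JSON field names, and every helper used below (query_enrolled_image, decode_image, normalize, metric_distance, model, TAU) are assumptions for the sketch, not part of the patent.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/authenticate", methods=["POST"])
def authenticate():
    req = request.get_json()                                   # S601: image + auth info
    enrolled = query_enrolled_image(req["auth_info"])          # S602: second image lookup
    probe = normalize(decode_image(req["image"]))              # S603/S604: detect + align
    dist = metric_distance(model, probe, normalize(enrolled))  # S605: metric distance
    if float(dist) > TAU:
        return jsonify(result="failure")                       # S606: not similar
    return jsonify(result="success")                           # S607: similar
```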
To achieve the above technical object, the present application also proposes a similar image recognition apparatus, as shown in fig. 7, including:
an obtaining module 710, configured to obtain a to-be-compared region corresponding to the specified feature in the first to-be-identified image;
an alignment module 720, configured to align the image in the region to be compared with a preset standard image, and use the aligned image as a normalized image of the first image to be recognized, where the standard image corresponds to the specified feature;
a determining module 730, configured to determine a metric distance between the normalized image of the first image to be recognized and a normalized image of a second image to be recognized, where the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images;
an identifying module 740, configured to confirm that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is greater than a preset threshold, and confirm that the specified features of the first image to be identified and the second image to be identified are similar when the metric distance is less than or equal to the threshold.
In a specific application scenario, the obtaining module is specifically configured to:
determining the region to be compared in the first image to be identified according to a detection algorithm corresponding to the designated feature, and acquiring key point coordinates corresponding to a plurality of key point features of the designated feature in the region to be compared through a preset key point regression model.
In a specific application scenario, the alignment module is specifically configured to:
and mapping each key point coordinate of the area to be compared to the key point coordinate of the aligned image according to a parameter M, wherein the parameter M is generated according to each key point coordinate of the standard image and the key point coordinate of the image corresponding to the specified feature in the labeled image.
In a specific application scenario, the method further includes:
and the adjusting module is used for adjusting the resolution of the normalized image to a preset resolution.
In a specific application scenario, the determining module is specifically configured to:
extracting specified features in the normalized image through a convolutional neural network;
determining a feature value of the specified feature after it is mapped into a feature space according to the convolutional neural network and a distance metric loss function, and taking this value as the feature value of the normalized image;
and determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
In a specific application scenario, the specified feature is specifically a face region, and the keypoint feature at least includes a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
In a specific application scenario, the convolutional neural network parameters are obtained by training according to labeled images, wherein the labeled images comprise normalized images with mutually similar specified characteristics and normalized images with mutually dissimilar specified characteristics.
The present application further provides a client, as shown in fig. 8, including:
a receiving module 810, configured to receive an identity authentication request of a user, where the identity authentication request carries a first image to be identified and authentication information of the user, where the first image to be identified is uploaded by the user;
a sending module 820, configured to send the identity authentication request to a server, so that the server obtains a second image to be identified corresponding to the user according to the authentication information;
the receiving module 810 is further configured to receive an identity authentication response sent by the server;
and a display module 830, configured to display an authentication result to the user according to the identity authentication response.
In a specific application scenario, the receiving module is specifically configured to:
and acquiring the image uploaded by the user and the information input by the user, taking the image as the first image to be identified, taking the information as the authentication information, and generating the identity authentication request according to the first image to be identified and the authentication information.
In a specific application scenario, the identity authentication response is an identity authentication success response or an identity authentication failure response, wherein:
the identity authentication success response is generated after the server confirms that the specified features of the first image to be recognized and the second image to be recognized are similar;
the identity authentication failure response is generated by the server after confirming that the specified features of the first image to be recognized and the second image to be recognized are not similar.
In a specific application scenario, the display module is specifically configured to display a preset interface corresponding to the identity authentication success response to the user when the receiving module receives the identity authentication success response; or, when the receiving module receives the identity authentication failure response, to display a preset interface corresponding to the identity authentication failure response to the user and to display to the user a prompt message asking whether manual verification is required.
In a specific application scenario, after the display module displays the preset interface corresponding to the identity authentication failure response and prompts the user about manual verification, the receiving module further receives a manual verification request from the user and instructs the sending module to send the identity authentication request to a preset server.
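As a rough illustration of this client-side flow (not part of the application), the endpoint URL, field names, and response schema below are all hypothetical:

    import requests  # assumed HTTP transport; the application does not specify one

    SERVER_URL = "https://auth.example.com/verify"  # hypothetical endpoint

    def authenticate(image_path, auth_info):
        # Send the identity authentication request carrying the uploaded image
        # and the user's authentication information.
        with open(image_path, "rb") as f:
            resp = requests.post(SERVER_URL, files={"image": f},
                                 data={"auth_info": auth_info})
        success = resp.json().get("success", False)  # hypothetical response schema
        if success:
            print("Showing preset interface for the success response")
        else:
            print("Showing preset interface for the failure response")
            print("Prompting the user whether manual verification is required")
        return success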
An embodiment of the present application further provides a server, as shown in fig. 9, including:
a receiving module 910, configured to receive an identity authentication request sent by a client, where the identity authentication request carries a first image to be identified uploaded by a user and the user's authentication information;
a query module 920, configured to query a second image to be identified corresponding to the user according to the authentication information;
an obtaining module 930, configured to acquire the region to be compared corresponding to the specified feature in the first image to be identified;
an alignment module 940, configured to align the image in the region to be compared with a preset standard image and take the aligned image as a normalized image of the first image to be identified, where the standard image corresponds to the specified feature;
a determining module 950, configured to determine a metric distance between the normalized image of the first image to be identified and a normalized image of the second image to be identified, where the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images in the feature space;
an identification module 960, configured to confirm that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is greater than a preset threshold, and to confirm that the specified features are similar when the metric distance is less than or equal to the threshold;
a sending module 970, configured to return an identity authentication failure response to the client when the identification module confirms that the specified features of the first image to be identified and the second image to be identified are not similar, and to return an identity authentication success response to the client when the identification module confirms that the specified features are similar. A sketch of this server-side flow follows.
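Tying the server modules together, a minimal end-to-end sketch might look as follows; `lookup_enrolled_image` and `detect_and_normalize` are hypothetical stand-ins for the query and acquisition/alignment steps, `metric_distance` and `net` come from the embedding sketch above, and the threshold value is an assumption:

    def verify_identity(first_image, auth_info, net, threshold=1.0):
        # Query the second image to be identified that corresponds to the user.
        second_image = lookup_enrolled_image(auth_info)    # hypothetical DB lookup
        # Normalize both images: detect the region to be compared, align it with
        # the standard image, and resize to the preset resolution.
        norm_a = detect_and_normalize(first_image)         # hypothetical helper
        norm_b = detect_and_normalize(second_image)
        d = metric_distance(net, norm_a, norm_b)           # Euclidean metric distance
        # Similar iff the metric distance does not exceed the preset threshold.
        return bool(d <= threshold)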
In a specific application scenario, the obtaining module is specifically configured to:
determine the region to be compared in the first image to be identified according to the detection algorithm corresponding to the specified feature, and acquire key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
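For the face-region scenario, the detection step could use a stock detector while the key point regression model stays abstract; a minimal sketch, assuming OpenCV's bundled Haar cascade and a hypothetical `regress_keypoints` model:

    import cv2

    def region_and_keypoints(image, regress_keypoints):
        # Detect the region to be compared (here: a face) with a stock detector.
        detector = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        x, y, w, h = detector.detectMultiScale(gray)[0]  # first detection only
        region = image[y:y + h, x:x + w]
        # `regress_keypoints` is a hypothetical stand-in for the preset key point
        # regression model, returning (x, y) coordinates of the eyes, nose, and
        # mouth corners within the region.
        return region, regress_keypoints(region)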
In a specific application scenario, the alignment module is specifically configured to:
map each key point coordinate of the region to be compared to the corresponding key point coordinate of the aligned image according to a parameter M, where the parameter M is generated according to the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
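The parameter M described here behaves like a 2-D similarity/affine transform estimated from corresponding key points; a minimal sketch, assuming a least-squares estimate via OpenCV (the application does not name an estimation method):

    import numpy as np
    import cv2

    def estimate_M(region_keypoints, standard_keypoints):
        # Least-squares 2x3 similarity transform mapping the key points of the
        # region to be compared onto the key points of the standard image.
        M, _ = cv2.estimateAffinePartial2D(np.float32(region_keypoints),
                                           np.float32(standard_keypoints))
        return M

    def align(region, M, out_size=(128, 128)):
        # Warp the region onto the standard image's coordinate frame; out_size
        # doubles as the preset resolution (an assumed value).
        return cv2.warpAffine(region, M, out_size)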
In a specific application scenario, the server further includes:
an adjusting module, configured to adjust the resolution of the normalized image to a preset resolution.
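If alignment does not already output the preset resolution, the adjustment is a single resize; `aligned` below is the output of the alignment module, and the 128x128 preset resolution is an assumption:

    import cv2
    normalized = cv2.resize(aligned, (128, 128), interpolation=cv2.INTER_LINEAR)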
In a specific application scenario, the determining module is specifically configured to:
extract the specified feature from the normalized image through a convolutional neural network;
determine the feature value of the specified feature after it is mapped to a feature space according to the convolutional neural network and a distance metric loss function, and take the feature value as the feature value of the normalized image;
determine the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and take the Euclidean distance as the metric distance.
In a specific application scenario, the specified feature is specifically a face region, and the key point features include at least a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
In a specific application scenario, the convolutional neural network parameters are obtained by training on labeled images, where the labeled images comprise normalized images whose specified features are similar to one another and normalized images whose specified features are dissimilar to one another.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by hardware, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application may be embodied as a software product stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) that includes several instructions for enabling a computer device (a personal computer, a server, a network device, or the like) to execute the method of the implementation scenarios of the present application.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present application.
Those skilled in the art will appreciate that the modules of a device in an implementation scenario may be distributed across that device as described in the implementation scenario, or may be located, with corresponding changes, in one or more devices different from the present implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The serial numbers of the above implementation scenarios are for description only and do not represent their relative merits.
The above disclosure describes only a few specific implementation scenarios of the present application; the present application is not limited thereto, and any variation that those skilled in the art can conceive shall fall within its protection scope.

Claims (15)

1. A method for identifying similar images, comprising:
acquiring a region to be compared corresponding to a specified feature in a first image to be identified;
aligning the image in the region to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
determining a metric distance between the normalized image and a normalized image of a second image to be identified, the metric distance being generated according to the distance between the normalized image and the normalized image of the second image to be identified in a feature space, wherein the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images in the feature space;
if the metric distance is greater than a preset threshold, determining that the specified features of the first image to be identified and the second image to be identified are not similar;
if the metric distance is less than or equal to the threshold, determining that the specified features of the first image to be identified and the second image to be identified are similar;
wherein determining the metric distance between the normalized image and the normalized image of the second image to be identified specifically comprises:
extracting the specified feature from the normalized image through a convolutional neural network;
determining the feature value of the specified feature after it is mapped to the feature space according to the convolutional neural network and a distance metric loss function, and taking the feature value as the feature value of the normalized image;
determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
2. The method according to claim 1, wherein acquiring the region to be compared corresponding to the specified feature in the first image to be identified comprises:
determining the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature;
acquiring key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
3. The method according to claim 2, wherein aligning the image in the region to be compared with the preset standard image specifically comprises:
mapping each key point coordinate of the region to be compared to the corresponding key point coordinate of the aligned image according to a parameter M;
wherein the parameter M is generated according to the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
4. The method of claim 3, further comprising, after taking the aligned image as the normalized image of the first image to be identified:
adjusting the resolution of the normalized image to a preset resolution.
5. The method of any one of claims 2-4, wherein the specified feature is specifically a face region, and the key point features include at least a left eye region, a right eye region, a nose region, a left mouth corner region, and a right mouth corner region.
6. The method of claim 1, wherein the convolutional neural network parameters are obtained by training on labeled images, where the labeled images comprise normalized images whose specified features are similar to one another and normalized images whose specified features are dissimilar to one another.
7. A similar image identification device, comprising:
an acquisition module, configured to acquire a region to be compared corresponding to a specified feature in a first image to be identified;
an alignment module, configured to align the image in the region to be compared with a preset standard image and take the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
a determining module, configured to determine a metric distance between the normalized image of the first image to be identified and a normalized image of a second image to be identified, wherein the metric distance is generated according to the distance between the two normalized images in a feature space, and the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images in the feature space;
an identification module, configured to confirm that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is greater than a preset threshold, and to confirm that the specified features are similar when the metric distance is less than or equal to the threshold;
wherein the determining module is specifically configured to:
extract the specified feature from the normalized image through a convolutional neural network;
determine the feature value of the specified feature after it is mapped to the feature space according to the convolutional neural network and a distance metric loss function, and take the feature value as the feature value of the normalized image;
determine the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and take the Euclidean distance as the metric distance.
8. The device of claim 7, wherein the acquisition module is specifically configured to:
determine the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature, and acquire key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
9. The device of claim 8, wherein the alignment module is specifically configured to:
map each key point coordinate of the region to be compared to the corresponding key point coordinate of the aligned image according to a parameter M, wherein the parameter M is generated according to the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
10. A method for identifying similar images, applied to a server, comprising:
receiving an identity authentication request sent by a client, wherein the identity authentication request carries a first image to be identified uploaded by a user and authentication information of the user;
querying a second image to be identified corresponding to the user according to the authentication information;
acquiring a region to be compared corresponding to a specified feature in the first image to be identified;
aligning the image in the region to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
determining a metric distance between the normalized image and a normalized image of the second image to be identified, the metric distance being generated according to the distance between the normalized image and the normalized image of the second image to be identified in a feature space, wherein the distance between similar normalized images in the feature space is smaller than the distance between non-similar normalized images in the feature space;
if the metric distance is greater than a preset threshold, confirming that the specified features of the first image to be identified and the second image to be identified are not similar, and returning an identity authentication failure response to the client;
if the metric distance is less than or equal to the threshold, confirming that the specified features of the first image to be identified and the second image to be identified are similar, and returning an identity authentication success response to the client;
wherein determining the metric distance between the normalized image and the normalized image of the second image to be identified specifically comprises:
extracting the specified feature from the normalized image through a convolutional neural network;
determining the feature value of the specified feature after it is mapped to the feature space according to the convolutional neural network and a distance metric loss function, and taking the feature value as the feature value of the normalized image;
determining the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and taking the Euclidean distance as the metric distance.
11. The method according to claim 10, wherein acquiring the region to be compared corresponding to the specified feature in the first image to be identified comprises:
determining the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature;
acquiring key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
12. The method according to claim 11, wherein aligning the image in the region to be compared with the preset standard image specifically comprises:
mapping each key point coordinate of the region to be compared to the corresponding key point coordinate of the aligned image according to a parameter M;
wherein the parameter M is generated according to the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
13. A server, comprising:
the system comprises a receiving module, a judging module and a judging module, wherein the receiving module is used for receiving an identity authentication request sent by a client, and the identity authentication request carries a first image to be identified uploaded by a user and authentication information of the user;
the query module is used for querying a second image to be identified corresponding to the user according to the authentication information;
the acquisition module is used for acquiring a region to be compared corresponding to the specified feature in the first image to be identified;
the alignment module is used for aligning the image in the area to be compared with a preset standard image, and taking the aligned image as a normalized image of the first image to be identified, wherein the standard image corresponds to the specified feature;
a determining module, configured to determine a metric distance between the normalized image of the first image to be recognized and a normalized image of a second image to be recognized, where the metric distance is generated according to distances of the normalized images and the normalized image of the second image to be recognized in a feature space, and a distance of a similar normalized image in the feature space is smaller than a distance of a non-similar normalized image in the feature space;
the identification module is used for confirming that the specified features of the first image to be identified and the second image to be identified are not similar when the metric distance is greater than a preset threshold value, and confirming that the specified features of the first image to be identified and the second image to be identified are similar when the metric distance is less than or equal to the threshold value;
a sending module, configured to return an authentication failure response to the client when the recognition module confirms that the specified features of the first image to be recognized and the second image to be recognized are not similar, and return an authentication success response to the client when the recognition module confirms that the specified features of the first image to be recognized and the second image to be recognized are similar;
wherein the determining module is specifically configured to:
extract the specified feature from the normalized image through a convolutional neural network;
determine the feature value of the specified feature after it is mapped to the feature space according to the convolutional neural network and a distance metric loss function, and take the feature value as the feature value of the normalized image;
determine the Euclidean distance between the feature value of the normalized image and the feature value of the normalized image of the second image to be identified, and take the Euclidean distance as the metric distance.
14. The server according to claim 13, wherein the acquisition module is specifically configured to:
determine the region to be compared in the first image to be identified according to a detection algorithm corresponding to the specified feature, and acquire key point coordinates corresponding to a plurality of key point features of the specified feature in the region to be compared through a preset key point regression model.
15. The server according to claim 14, wherein the alignment module is specifically configured to:
map each key point coordinate of the region to be compared to the corresponding key point coordinate of the aligned image according to a parameter M, wherein the parameter M is generated according to the key point coordinates of the standard image and the key point coordinates of the image corresponding to the specified feature in the labeled image.
CN201510229654.6A 2015-05-07 2015-05-07 Similar image identification method and equipment Active CN106203242B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510229654.6A CN106203242B (en) 2015-05-07 2015-05-07 Similar image identification method and equipment
PCT/CN2016/079158 WO2016177259A1 (en) 2015-05-07 2016-04-13 Similar image recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510229654.6A CN106203242B (en) 2015-05-07 2015-05-07 Similar image identification method and equipment

Publications (2)

Publication Number Publication Date
CN106203242A CN106203242A (en) 2016-12-07
CN106203242B 2019-12-24

Family

ID=57217488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510229654.6A Active CN106203242B (en) 2015-05-07 2015-05-07 Similar image identification method and equipment

Country Status (2)

Country Link
CN (1) CN106203242B (en)
WO (1) WO2016177259A1 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897390B (en) * 2017-01-24 2019-10-15 北京大学 Target precise search method based on depth measure study
CN108428242B (en) 2017-02-15 2022-02-08 宏达国际电子股份有限公司 Image processing apparatus and method thereof
CN108573201A (en) * 2017-03-13 2018-09-25 金德奎 A kind of user identity identification matching process based on face recognition technology
WO2019000445A1 (en) * 2017-06-30 2019-01-03 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for verifying authenticity of id photo
CN107451965B (en) * 2017-07-24 2019-10-18 深圳市智美达科技股份有限公司 Distort face image correcting method, device, computer equipment and storage medium
CN108012080B (en) * 2017-12-04 2020-02-04 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113688737A (en) * 2017-12-15 2021-11-23 北京市商汤科技开发有限公司 Face image processing method, face image processing device, electronic apparatus, storage medium, and program
CN108932727B (en) * 2017-12-29 2021-08-27 浙江宇视科技有限公司 Face tracking method and device
CN110110189A (en) 2018-02-01 2019-08-09 北京京东尚科信息技术有限公司 Method and apparatus for generating information
CN108804996B (en) * 2018-03-27 2022-03-04 腾讯科技(深圳)有限公司 Face verification method and device, computer equipment and storage medium
CN108921209A (en) * 2018-06-21 2018-11-30 杭州骑轻尘信息技术有限公司 Image identification method, device and electronic equipment
CN109459873A (en) * 2018-11-12 2019-03-12 广州小鹏汽车科技有限公司 A kind of test method, device, Auto-Test System and storage medium
CN109345770A (en) * 2018-11-14 2019-02-15 深圳市尼欧科技有限公司 A kind of child leaves in-vehicle alarm system and child leaves interior alarm method
CN110084161B (en) * 2019-04-17 2023-04-18 中山大学 Method and system for rapidly detecting key points of human skeleton
CN112464689A (en) * 2019-09-06 2021-03-09 佳能株式会社 Method, device and system for generating neural network and storage medium for storing instructions
CN110781917B (en) * 2019-09-18 2021-03-02 北京三快在线科技有限公司 Method and device for detecting repeated image, electronic equipment and readable storage medium
CN111079644B (en) * 2019-12-13 2023-06-06 四川新网银行股份有限公司 Method for assisting photographing based on distance and joint point identification external force and storage medium
US11610391B2 (en) 2019-12-30 2023-03-21 Industrial Technology Research Institute Cross-domain image comparison method and system using semantic segmentation
CN112508109B (en) * 2020-12-10 2023-05-19 锐捷网络股份有限公司 Training method and device for image recognition model
CN112560971A (en) * 2020-12-21 2021-03-26 上海明略人工智能(集团)有限公司 Image classification method and system for active learning self-iteration
CN115393405A (en) * 2021-05-21 2022-11-25 北京字跳网络技术有限公司 Image alignment method and device
CN113568571A (en) * 2021-06-28 2021-10-29 西安电子科技大学 Image de-duplication method based on residual error neural network
CN113744769A (en) * 2021-09-06 2021-12-03 盐城市聚云网络科技有限公司 Storage device for computer information security product
CN114359933B (en) * 2021-11-18 2022-09-20 珠海读书郎软件科技有限公司 Cover image identification method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510257A (en) * 2009-03-31 2009-08-19 华为技术有限公司 Human face similarity degree matching method and device
CN102375970A (en) * 2010-08-13 2012-03-14 北京中星微电子有限公司 Identity authentication method based on face and authentication apparatus thereof
CN103678984A (en) * 2013-12-20 2014-03-26 湖北微模式科技发展有限公司 Method for achieving user authentication by utilizing camera
CN103824051A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Local region matching-based face search method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6738063B2 (en) * 2002-02-07 2004-05-18 Siemens Corporate Research, Inc. Object-correspondence identification without full volume registration
CN101669824B (en) * 2009-09-22 2012-01-25 浙江工业大学 Biometrics-based device for detecting indentity of people and identification
CN102629320B (en) * 2012-03-27 2014-08-27 中国科学院自动化研究所 Ordinal measurement statistical description face recognition method based on feature level
CN103207898B (en) * 2013-03-19 2016-08-03 天格科技(杭州)有限公司 A kind of similar face method for quickly retrieving based on local sensitivity Hash

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Convolutional Neural Network Approach for Face Verification; Mohamed Khalil-Hani et al.; 2014 International Conference on High Performance Computing & Simulation (HPCS); 2014-12-31; full text *

Also Published As

Publication number Publication date
WO2016177259A1 (en) 2016-11-10
CN106203242A (en) 2016-12-07

Similar Documents

Publication Publication Date Title
CN106203242B (en) Similar image identification method and equipment
CN108009528B (en) Triple Loss-based face authentication method and device, computer equipment and storage medium
CN109101602B (en) Image retrieval model training method, image retrieval method, device and storage medium
US11501514B2 (en) Universal object recognition
US9436883B2 (en) Collaborative text detection and recognition
CN109284729B (en) Method, device and medium for acquiring face recognition model training data based on video
US10803496B1 (en) Systems and methods for implementing machine vision and optical recognition
WO2018086607A1 (en) Target tracking method, electronic device, and storage medium
US11003896B2 (en) Entity recognition from an image
CN105590097B (en) Dual camera collaboration real-time face identification security system and method under the conditions of noctovision
TWI582710B (en) The method of recognizing the object of moving image and the interactive film establishment method of automatically intercepting target image
WO2019071664A1 (en) Human face recognition method and apparatus combined with depth information, and storage medium
CN108734185B (en) Image verification method and device
WO2019033571A1 (en) Facial feature point detection method, apparatus and storage medium
AU2018202767B2 (en) Data structure and algorithm for tag less search and svg retrieval
CN106575280B (en) System and method for analyzing user-associated images to produce non-user generated labels and utilizing the generated labels
WO2022001106A1 (en) Key point detection method and apparatus, and electronic device, and storage medium
WO2022188697A1 (en) Biological feature extraction method and apparatus, device, medium, and program product
WO2021184754A1 (en) Video comparison method and apparatus, computer device and storage medium
WO2019033567A1 (en) Method for capturing eyeball movement, device and storage medium
CN111738199A (en) Image information verification method, image information verification device, image information verification computing device and medium
CN113642639B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
CN113557546B (en) Method, device, equipment and storage medium for detecting associated objects in image
Kang et al. Combining random forest with multi-block local binary pattern feature selection for multiclass head pose estimation
Xu et al. Where should I stand? Learning based human position recommendation for mobile photographing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant