WO2021208373A1

WO2021208373A1 - Image identification method and apparatus, and electronic device and computer-readable storage medium

Info

Publication number: WO2021208373A1
Application number: PCT/CN2020/119613
Authority: WO
Inventors: 王亚可; 王塑; 刘宇
Original assignee: 北京迈格威科技有限公司
Priority date: 2020-04-14
Filing date: 2020-09-30
Publication date: 2021-10-21
Also published as: CN111639667A; CN111639667B

Abstract

The present application relates to the technical field of image processing. Provided are an image identification method and apparatus, and an electronic device and a computer-readable medium. The method comprises: during image identification, first extracting a feature of a target object image to be subjected to identification; then calculating a first feature distance between the feature of said target object image and a feature of a base library image, and obtaining a second feature distance between said target object image and the base library image according to the first feature distance and a target scaling parameter, wherein the target scaling parameter is related to the feature of said target object image; and determining an identification result of a target object in said target object image according to the second feature distance. As such, the feature distance from said target object image to the base library image is shortened by means of distance scaling transformation of the first feature distance under the target scaling parameter, so as to improve the passing rate of identification of an image photographed under dim light, top light or at a large angle, etc. without increasing an identification error rate.

Description

Image recognition method, device, electronic equipment and computer readable storage medium

Cross-references to related applications

This application claims priority to Chinese patent application No. 2020102932947, filed with the Chinese Patent Office on April 14, 2020 and titled "Image Recognition Method, Device, Electronic Equipment, and Computer-readable Storage Medium", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of image processing technology, and in particular to an image recognition method, device, electronic equipment, and computer-readable storage medium.

Background technique

Image recognition refers to the use of computers to process, analyze, and understand images in order to identify targets and objects of various patterns. Before image recognition is performed, an image of a target can first be enrolled in the image recognition system as a base library image in the base library. Recognition is then performed based on the similarity between the target image to be recognized and the base library image. For example, the feature distance between the target image to be recognized and the base library image can be calculated (the higher the similarity, the smaller the feature distance), and recognition can be performed by comparing the feature distance with a preset distance threshold.

However, an image taken in dim light, under top light, or at a large angle has low similarity to its corresponding base library image. In other words, the feature distance between the captured image and its corresponding base library image is generally large, so such images cannot be correctly recognized. The pass rate of existing image recognition methods is therefore low.

Summary of the invention

One of the objectives of the present application is to provide an image recognition method, device, electronic device, and computer-readable storage medium to improve the pass rate of image recognition, thereby improving user experience.

In a first aspect, an embodiment of the present application provides an image recognition method, including: extracting the feature of a target image to be recognized; calculating a first feature distance between the feature of the target image to be recognized and the feature of a base library image; obtaining a second feature distance between the target image to be recognized and the base library image according to the first feature distance and a target scaling parameter, wherein the target scaling parameter is related to the feature of the target image to be recognized; and determining the target recognition result in the target image to be recognized according to the second feature distance.

Optionally, the step of calculating the first feature distance between the feature of the target image to be recognized and the feature of the base library image includes: calculating the first feature distance d12 between the feature of the target image to be recognized and the feature of the base library image using the following formula:

Here, f_{1,i} denotes the i-th element of the feature of the base library image, and f_{2,i} denotes the i-th element of the feature of the target image to be recognized.

Optionally, the step of obtaining the second feature distance between the target image to be recognized and the base library image according to the first feature distance and the target scaling parameter includes: inputting the feature of the target image to be recognized into a neural network model to obtain the target scaling parameter corresponding to the target image to be recognized; and numerically transforming the first feature distance using the target scaling parameter to obtain the second feature distance between the target image to be recognized and the base library image.

Optionally, the target scaling parameter includes a target scaling coefficient or a target scaling value; the step of numerically transforming the first feature distance using the target scaling parameter to obtain the second feature distance between the target image to be recognized and the base library image includes: multiplying the first feature distance by the target scaling coefficient to obtain the second feature distance between the target image to be recognized and the base library image, the target scaling coefficient being greater than 0 and less than 1; or subtracting the target scaling value from the first feature distance to obtain the second feature distance between the target image to be recognized and the base library image.

Optionally, there is one base library image, which corresponds to one second feature distance; the step of determining the target recognition result in the target image to be recognized according to the second feature distance includes: judging whether the second feature distance is less than or equal to a distance threshold; and if the second feature distance is less than or equal to the distance threshold, determining the target in the base library image as the target recognition result in the target image to be recognized.

Optionally, there are multiple base library images, and each base library image corresponds to one second feature distance; the step of determining the target recognition result in the target image to be recognized according to the second feature distance includes: judging the numerical relationship between each second feature distance and a distance threshold; and when a target second feature distance smaller than the distance threshold exists among the second feature distances, determining the target in the base library image corresponding to the target second feature distance as the target recognition result.

Optionally, the step of judging the numerical relationship between each second feature distance and the distance threshold includes: judging whether the minimum of the second feature distances is less than or equal to the distance threshold; and if the minimum of the second feature distances is less than or equal to the distance threshold, determining that minimum as the target second feature distance.

Optionally, the target scaling parameter is determined by a neural network model, and the neural network model is trained through the following steps: extracting the feature of a sample image; inputting the feature of the sample image into an initial neural network model to obtain a predicted scaling parameter; determining the label scaling parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in a target image set; determining the loss value of the initial neural network model according to the predicted scaling parameter and the label scaling parameter; and updating the parameters of the initial neural network model according to the loss value to obtain the trained neural network model.
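
The training steps above can be sketched as a single parameter update. This is only an illustration under stated assumptions: the text does not specify the loss function or the optimizer, so the squared-error loss, the plain gradient-descent update, the learning rate `lr`, and the name `train_step` are all hypothetical, and the model is taken to be one fully connected layer with a scalar output.

```python
from typing import List, Sequence, Tuple


def train_step(
    weights: Sequence[float],
    bias: float,
    feature: Sequence[float],
    label: float,
    lr: float = 0.1,
) -> Tuple[List[float], float, float]:
    """One gradient-descent update; returns (new_weights, new_bias, loss)."""
    # Forward pass: predicted scaling parameter (a single real value).
    pred = sum(w * x for w, x in zip(weights, feature)) + bias
    # Assumed loss: squared error between the predicted and label parameters.
    loss = (pred - label) ** 2
    # Gradient of the loss with respect to the prediction.
    grad = 2.0 * (pred - label)
    # Update the layer parameters according to the loss gradient.
    new_weights = [w - lr * grad * x for w, x in zip(weights, feature)]
    new_bias = bias - lr * grad
    return new_weights, new_bias, loss


# Two consecutive updates on the same (feature, label) pair.
w, b, loss_before = train_step([0.0, 0.0], 0.0, feature=[1.0, 2.0], label=0.5)
_, _, loss_after = train_step(w, b, feature=[1.0, 2.0], label=0.5)
```

In this toy run the loss measured at the second step is smaller than at the first, which is the behaviour the update is meant to produce; a real system would iterate over many sample images.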

Optionally, the step of determining the label scaling parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set includes: calculating the third feature distance between the feature of the sample image and the feature of each image in the target image set; and determining the label scaling parameter corresponding to the sample image according to each third feature distance and the distance threshold.

Optionally, the step of determining the label scaling parameter corresponding to the sample image according to each third feature distance and the distance threshold includes: judging whether a target feature distance is the smallest of the third feature distances, the target feature distance being the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set; when the target feature distance is the smallest of the third feature distances, judging whether the target feature distance is greater than the distance threshold; and when the target feature distance is greater than the distance threshold, determining the label scaling parameter according to the target feature distance and the distance threshold.

Optionally, the label scaling parameter includes a label scaling coefficient; the step of determining the label scaling parameter according to the target feature distance and the distance threshold includes: determining the label scaling coefficient to be a first value related to the target feature distance and the distance threshold, where the first value is greater than 0 and less than 1.

Optionally, the step of determining the label scaling coefficient to be a first value related to the target feature distance and the distance threshold includes: determining the first value according to the ratio of the distance threshold to the target feature distance and a preset coefficient, and using the first value as the label scaling coefficient, wherein the preset coefficient is greater than 0 and less than 1.

Optionally, the label scaling parameter includes a label scaling coefficient; the method further includes: when the target feature distance is not the smallest of the third feature distances, determining the label scaling coefficient to be a second value, the second value being greater than or equal to 1.

Optionally, the label scaling parameter includes a label scaling coefficient; the method further includes: when the target feature distance is less than the distance threshold, determining the label scaling coefficient to be 1.
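
The three labelling cases above can be collected into one function. The following is a minimal sketch, not the applicant's implementation. The text only says that the first value is determined "according to" the ratio and a preset coefficient, so multiplying them is an assumption. The defaults `preset=0.9` and `non_match_value=1.0`, and the handling of a target feature distance exactly equal to the threshold (treated here like the "less than" case), are likewise hypothetical.

```python
from typing import Sequence


def label_scaling_coefficient(
    third_distances: Sequence[float],
    target_index: int,
    threshold: float,
    preset: float = 0.9,
    non_match_value: float = 1.0,
) -> float:
    """Label (expansion/scaling) coefficient for one sample image.

    third_distances[i] is the distance from the sample to image i of the
    target image set; target_index points at the sample's standard image.
    """
    if not 0.0 < preset < 1.0:
        raise ValueError("preset coefficient must lie in (0, 1)")
    if non_match_value < 1.0:
        raise ValueError("second value must be >= 1")
    target = third_distances[target_index]
    # Case 1: another image is closer than the standard image.
    if target > min(third_distances):
        return non_match_value
    # Case 2: closest image is the standard one, but beyond the threshold.
    if target > threshold:
        # Assumed combination: preset * (threshold / target), which is in (0, 1).
        return preset * threshold / target
    # Case 3: already within the threshold, so no scaling is needed.
    return 1.0


hard = label_scaling_coefficient([2.0, 3.0], 0, threshold=1.0, preset=0.5)
wrong_nearest = label_scaling_coefficient([2.0, 1.5], 0, threshold=1.0)
easy = label_scaling_coefficient([0.5, 3.0], 0, threshold=1.0)
```

Under the assumed product form, multiplying the target feature distance by the label coefficient in case 2 gives `preset * threshold`, which lies strictly inside the threshold.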

Optionally, the target image to be recognized is a face image to be recognized, and the method further includes: extracting the feature of the face image to be recognized; calculating the first feature distance between the feature of the face image to be recognized and the feature of each base library image; inputting the feature of the face image to be recognized into the neural network model to obtain the target scaling coefficient corresponding to the face image to be recognized; multiplying each first feature distance by the target scaling coefficient to obtain the second feature distance between the face image to be recognized and each base library image, the target scaling coefficient being greater than 0 and less than 1; judging the numerical relationship between each second feature distance and the distance threshold; and when a target second feature distance smaller than the distance threshold exists among the second feature distances, determining the face in the base library image corresponding to the target second feature distance as the face recognition result of the face image to be recognized.

In a second aspect, an embodiment of the present application also provides an image recognition device, including: an extraction module configured to extract the feature of a target image to be recognized; a calculation module configured to calculate a first feature distance between the feature of the target image to be recognized and the feature of a base library image; a transformation module configured to obtain a second feature distance between the target image to be recognized and the base library image according to the first feature distance and a target scaling parameter, wherein the target scaling parameter is related to the feature of the target image to be recognized; and a determining module configured to determine the target recognition result in the target image to be recognized according to the second feature distance.

In a third aspect, an embodiment of the present application also provides an electronic device, including a memory and a processor, wherein the memory stores a computer program that can run on the processor, and the processor implements the image recognition method of the first aspect when executing the computer program.

In a fourth aspect, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when run by a processor, executes the image recognition method of the first aspect.

The embodiments of the present application provide an image recognition method, device, electronic equipment, and computer-readable storage medium. When performing image recognition on a target image to be recognized, the feature of the target image to be recognized is first extracted; the first feature distance between the feature of the target image to be recognized and the feature of the base library image is then calculated, and the second feature distance between the target image to be recognized and the base library image is obtained according to the first feature distance and the target scaling parameter, wherein the target scaling parameter is related to the feature of the target image to be recognized; finally, the target recognition result in the target image to be recognized is determined according to the second feature distance. In this way, performing a distance scaling transformation on the first feature distance under the target scaling parameter shortens the feature distance from the target image to be recognized to the base library image. Because the target scaling parameter is related to the feature of the target image to be recognized, the recognition accuracy and pass rate of images taken in dim light, under top light, or at a large angle are improved without increasing the false recognition rate, and the false rejection rate is reduced, thereby improving the recognition effect and the user experience.

Other features and advantages of the embodiments of the present application will be described in the following specification, or some of the features and advantages can be inferred from the specification or determined without doubt, or can be learned by implementing the above-mentioned technology of the embodiments of the present application.

In order to make the above objectives, features, and advantages of the present application more comprehensible, optional embodiments accompanied with accompanying drawings are described in detail below.

Description of the drawings

In order to illustrate the specific embodiments of this application or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative work.

FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present application;

Fig. 2 shows a flowchart of an image recognition method provided by an embodiment of the present application;

FIG. 3 shows a schematic diagram of the principle of distance expansion and contraction in an image recognition method provided by an embodiment of the present application;

FIG. 4 shows a flowchart of another image recognition method provided by an embodiment of the present application;

Fig. 5 shows a structural block diagram of an image recognition device provided by an embodiment of the present application;

Fig. 6 shows a structural block diagram of another image recognition device provided by an embodiment of the present application.

Detailed description

In order to make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the present application are described below in conjunction with the accompanying drawings. Obviously, the implementations described in this embodiment are only some of the possible implementations, not all of them.

When performing image recognition, the feature distance between the target image to be recognized and the base library image can first be calculated and then compared with a preset distance threshold. If the feature distance is less than or equal to the distance threshold, the recognition result is determined to be the target in the base library image. In current image recognition systems, the comparison distance threshold must be determined according to a given false recognition rate. However, the feature distance from some difficult samples (such as images taken in dim light, under top light, or at a large angle) to their corresponding base library images may be greater than the distance threshold, so these difficult samples cannot be correctly recognized. In view of this problem, the image recognition method, device, electronic equipment, and computer-readable storage medium provided by the embodiments of the present application can improve the pass rate of image recognition for difficult samples without increasing the false recognition rate, thereby improving the user experience.

First, refer to FIG. 1, which shows a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device 100 shown in FIG. 1 is an example electronic device that can be used to implement the image recognition method and apparatus of the embodiments of the present application.

As shown in FIG. 1, the electronic device 100 can be configured with one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110. These components can be interconnected through a bus system 112 and/or other forms of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive. As needed, the electronic device may have only some of the components shown in FIG. 1, and may also have other components and structures.

In some possible examples, the processor 102 may be implemented in at least one hardware form among a digital signal processor (DSP), a field programmable gate array (FPGA), and a programmable logic array (PLA). The processor 102 may be one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing capabilities and/or instruction execution capabilities, and can control other components in the electronic device 100 to perform desired functions.

In some possible examples, the storage device 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) and/or other desired functions in the embodiments of the present application described below. Various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.

In some possible examples, the input device 106 may be a device used by the user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.

In some possible examples, the output device 108 may output various information (for example, text, image, or sound) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.

In some possible examples, the image capture device 110 may capture images (such as photos, videos, etc.) desired by the user, and store the captured images in the storage device 104 for use by other components.

Exemplarily, the exemplary electronic device 100 for implementing the image recognition method according to the embodiment of the present application may be implemented as a smart terminal such as a smart phone, a tablet computer, or a computer.

Referring to FIG. 2, FIG. 2 shows a flowchart of an image recognition method provided by an embodiment of the present application. The method mainly includes the following steps S202 to S208:

Step S202: Extract the features of the image of the target object to be recognized.

In some possible embodiments, the target to be recognized may be, but is not limited to, a human face, a human body, an animal, or a vehicle (such as a car, boat, or bicycle). The target image to be recognized may be an image taken in dim light, under top light, or at a large angle; the image recognition method provided in the embodiments of the present application is suitable for recognizing such images. In a possible implementation, the feature of the target image to be recognized can be extracted by a corresponding pre-trained neural network model. For the specific feature extraction process, refer to the related prior art; it is not repeated here.

Step S204: Calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image.

In some possible embodiments, the electronic device used to execute the image recognition method provided in the embodiments of the present application may pre-store the base library images. There may be one or more base library images, and each base library image contains one target. In this embodiment, one base library image corresponds to one target. The first feature distance represents the distance in the feature space from the target image to be recognized to a base library image. The first feature distances correspond one-to-one to the base library images; for example, if multiple base library images are stored in the electronic device, multiple first feature distances are obtained.

In some possible embodiments, the above-mentioned feature of the base library image may be obtained by performing feature extraction on the base library image before step S204 is performed, or may be pre-extracted and stored in the electronic device. The method of extracting the features of the base library image is the same as the method of extracting the features of the target image to be recognized.

In some possible embodiments, the feature of the target image to be recognized and the feature of the base library image may both be in the form of a matrix. Since a matrix usually contains multiple elements, both features contain multiple elements. For ease of understanding, the embodiments of the present application take as an example the case where the feature of the target image to be recognized and the feature of the base library image are both one-dimensional matrices (that is, vectors), and provide one possible way of calculating the first feature distance between them, as follows:

The first feature distance d12 between the feature of the target image to be recognized and the feature of the base library image is calculated by the following formula:

Here, f_{1,i} denotes the i-th element of the feature of the base library image, and f_{2,i} denotes the i-th element of the feature of the target image to be recognized.
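
The formula for d12 does not survive in this text, so the metric is not known from the source. As a hedged illustration only, the sketch below assumes the Euclidean (L2) distance, a common choice that is consistent with the element-wise symbols defined above; the function name `first_feature_distance` is hypothetical.

```python
import math
from typing import Sequence


def first_feature_distance(f1: Sequence[float], f2: Sequence[float]) -> float:
    """Distance between a base library feature f1 and a query feature f2.

    ASSUMPTION: Euclidean distance; the source formula is missing.
    """
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    # Sum the squared element-wise differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


# Toy two-element features standing in for real feature vectors.
d12 = first_feature_distance([0.0, 3.0], [4.0, 0.0])
```

With several base library images, the same function would be called once per base library feature, giving one first feature distance per image.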

Step S206: Obtain a second characteristic distance between the image of the target object to be identified and the image of the base library according to the first characteristic distance and the target expansion parameter; wherein, the target expansion parameter is related to the characteristics of the target object image to be identified.

In some possible embodiments, if the target image to be recognized is one of the difficult samples described above, the first feature distance obtained in step S204 may be greater than the preset distance threshold, in which case the target image to be recognized may not be correctly recognized. To solve this technical problem, the embodiments of the present application provide a possible implementation: by performing a distance scaling transformation on the first feature distance, difficult samples are brought closer to the base library image. This improves the pass rate of image recognition for difficult samples, avoids recognizing an image that contains the target as one that does not, and reduces the false rejection rate.

In the embodiments of the present application, the target scaling parameter used in the distance scaling transformation is related to the feature of the target image to be recognized. This ensures that the scaling does not bring an image that does not contain the target to within the distance threshold, so that no false recognition is introduced; that is, the false recognition rate does not increase.

In some possible implementations, to facilitate understanding of step S206, refer to FIG. 3, which shows a schematic diagram of the principle of distance scaling in an image recognition method provided by an embodiment of the present application. As shown in FIG. 3, "base" denotes a base library image, "simple query" denotes simple samples (that is, images that are easily recognized correctly), and "difficult query" denotes difficult samples. The dots and triangles represent the positions in the feature space of the images (base library images, simple samples, and difficult samples) corresponding to two different targets, and the circle corresponds to the distance threshold (samples inside the circle can be correctly recognized; samples outside the circle cannot).

As shown in FIG. 3, before the distance scaling transformation is applied to the first feature distance, the difficult samples all lie outside the circle and therefore cannot be correctly recognized. After the distance scaling transformation is applied, the difficult samples that were outside the circle are drawn into it: the feature distance between each difficult sample and its corresponding base library image becomes smaller, so the difficult sample can be correctly recognized.

Optionally, in a possible implementation, step S206 can be implemented as follows: input the feature of the target image to be recognized into the neural network model to obtain the target scaling parameter corresponding to the target image to be recognized; then numerically transform the first feature distance using the target scaling parameter to obtain the second feature distance between the target image to be recognized and the base library image.

In some possible embodiments, the neural network model may be a pre-trained neural network model, and may be a single-layer fully connected neural network. In the embodiments of the present application, the input of the neural network model may be the feature of the target image to be recognized, and its output is a real value, which can serve as the target scaling parameter. The training process of the neural network model is explained below.
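
A single fully connected layer that maps a feature vector to one real value can be sketched as a dot product plus a bias. This is a minimal illustration with hypothetical names (`predict_scaling_parameter`, `weights`, `bias`) and hand-picked toy weights; the text does not say whether any activation function is applied to the output, so none is used here.

```python
from typing import Sequence


def predict_scaling_parameter(
    feature: Sequence[float],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Forward pass of one fully connected layer with a scalar output."""
    if len(feature) != len(weights):
        raise ValueError("feature and weights must have the same length")
    # Weighted sum of the feature elements plus a bias term.
    return sum(w * x for w, x in zip(weights, feature)) + bias


# Toy weights chosen by hand; a trained model would supply real ones.
alpha = predict_scaling_parameter([1.0, 2.0], weights=[0.25, 0.125], bias=0.0)
```

If the output is used as a multiplicative coefficient, a deployed system would also need some way to keep it in the required range of greater than 0 and less than 1; how that is done is not stated in the text.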

In some possible embodiments, the target scaling parameter may include a target scaling coefficient or a target scaling value. Depending on which of the two it is, the second feature distance can be obtained as follows. In one implementation, the first feature distance is multiplied by the target scaling coefficient to obtain the second feature distance between the target image to be recognized and the base library image, where the target scaling coefficient is greater than 0 and less than 1. In another implementation, the target scaling value is subtracted from the first feature distance to obtain the second feature distance between the target image to be recognized and the base library image.
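
The two transformations just described are simple enough to state directly in code. The sketch below uses hypothetical function names. It enforces the stated range of the coefficient, and it leaves the subtractive variant unclamped because the text does not say what should happen if the result becomes negative.

```python
def scale_by_coefficient(first_distance: float, coefficient: float) -> float:
    """Multiplicative variant: second distance = first distance * coefficient."""
    if not 0.0 < coefficient < 1.0:
        raise ValueError("target coefficient must be greater than 0 and less than 1")
    return first_distance * coefficient


def scale_by_value(first_distance: float, value: float) -> float:
    """Subtractive variant: second distance = first distance - value."""
    return first_distance - value


# Both variants shorten a first feature distance of 2.0.
d2_mul = scale_by_coefficient(2.0, 0.5)
d2_sub = scale_by_value(2.0, 0.75)
```

With a threshold of, say, 1.5, the original distance 2.0 would be rejected while both transformed distances would pass, which is the effect the transformation is meant to have on difficult samples.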

Step S208: Determine the target recognition result in the image of the target to be recognized according to the above-mentioned second characteristic distance.

In a possible implementation manner, the target recognition result in the image of the target to be recognized can be determined by comparing the second characteristic distance with a preset distance threshold. The distance threshold can be set according to the required misrecognition rate, which is not limited here.

In some possible examples, there may be one base library image, in which case there is also one second feature distance. In this case, the above step S208 may be implemented by the following process: determine whether the second feature distance is less than or equal to the distance threshold; if the second feature distance is less than or equal to the distance threshold, determine the target object in the base library image as the target object recognition result in the image of the target object to be recognized.

In other possible examples, there may be multiple base library images, and each base library image corresponds to one second feature distance, so there are also multiple second feature distances. In this case, the above step S208 can be implemented by the following process: judge the numerical relationship between each second feature distance and the distance threshold; when a target second feature distance smaller than the distance threshold exists among the second feature distances, determine the target object in the base library image corresponding to the target second feature distance as the target object recognition result in the image of the target object to be recognized.

In an optional implementation manner, the step of judging the numerical relationship between each second feature distance and the distance threshold may be: judge whether the minimum value of the second feature distances is less than or equal to the distance threshold; if the minimum value of the second feature distances is less than or equal to the distance threshold, determine that minimum value as the target second feature distance.
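A minimal Python sketch of this minimum-based decision is given below. The function name and the dictionary keyed by base library identity are hypothetical; returning `None` when no distance satisfies the threshold is likewise an assumption about how a rejection might be represented.

```python
from typing import Dict, Optional


def recognize(second_distances: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the base library identity with the smallest second feature
    distance if that distance is within the threshold, otherwise None."""
    if not second_distances:
        return None
    # The minimum second feature distance is the candidate target distance.
    best_id = min(second_distances, key=second_distances.get)
    if second_distances[best_id] <= threshold:
        return best_id
    return None


accepted = recognize({"alice": 0.9, "bob": 0.4}, threshold=0.5)
rejected = recognize({"alice": 0.9}, threshold=0.5)
```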

It is understandable that the image recognition of the image of the target object to be recognized can be realized through the above steps S202 to S208.

The above-mentioned image recognition method of this embodiment makes full use of the distinguishability of difficult samples. When performing image recognition on the image of the target object to be recognized, the first feature distance is subjected to a distance expansion and contraction transformation under the target expansion parameter, which narrows the feature distance from the image of the target object to be recognized to the base library image. Because the target expansion parameter is related to the features of the image of the target object to be recognized, the recognition accuracy and pass rate for images taken under dark light, top light or large angles are improved without increasing the misrecognition rate, and the false rejection rate is reduced, thereby improving the user experience.

For the above steps S204 and S206, the following takes the case where the target expansion parameter is the target expansion coefficient as an example and gives an implementation for obtaining the second feature distance. The second feature distance d''_{12} can be calculated by the following formula:

$$d''_{12} = h(f_2) \cdot d_{12}$$

Among them, d_{12} represents the first feature distance, f_2 represents the feature of the image of the target object to be recognized, and h(f_2) represents the target expansion coefficient.

In some possible embodiments, the embodiment of the present application also provides a training process of a neural network model, which mainly includes the following steps 302 to 310:

Step 302: Extract the features of the sample image.

In some possible embodiments, the sample images in the training set may be images taken under shooting scenes such as dark light, overhead light, or large angles. The process of extracting the features of the sample image can refer to the related prior art, which will not be repeated here.

Step 304: Input the features of the above-mentioned sample image into the initial neural network model to obtain the predicted expansion parameter. The predicted expansion parameter may include a predicted expansion coefficient or a predicted expansion value.

Step 306: Determine a label expansion parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set.

In some possible implementation examples, each image in the target image set may be a base library image in the embodiment of the present application. In the embodiment of the present application, the label expansion parameter corresponds to the aforementioned predicted expansion parameter, which can be understood as: if the predicted expansion parameter is the predicted expansion coefficient, the label expansion parameter is the label expansion coefficient; if the predicted expansion parameter is the predicted expansion value, then the label expansion parameter is the label expansion value.

Optionally, in some possible embodiments, the label expansion parameter corresponding to the sample image may be determined from the third feature distances as follows: calculate the third feature distance between the feature of the sample image and the feature of each image in the target image set, and then determine the label expansion parameter corresponding to the sample image according to each third feature distance and the distance threshold.

It is understandable that the above distance threshold is the same as the distance threshold used when determining the target object recognition result in the image of the target object to be recognized based on the second feature distance, so as to ensure that the trained neural network model can accurately recognize the image of the target object.

In a possible implementation manner, the label expansion parameter may be determined according to the comparison result between each third feature distance and the distance threshold. For example, the average of the third feature distances may be compared with the distance threshold, or the maximum or minimum of the third feature distances may be compared with the distance threshold, and a corresponding strategy for determining the label expansion parameter is obtained according to the comparison result.

In another possible implementation manner, in the process of obtaining the corresponding strategy for determining the label expansion parameter based on the comparison result, the value of the label expansion parameter may be jointly determined based on the third feature distance and the distance threshold, or may be determined based on a preset coefficient, the third feature distance, and the distance threshold. To facilitate understanding of the above-mentioned implementation of obtaining the label expansion parameter, determining the label expansion parameter corresponding to the sample image according to each third feature distance and the distance threshold can be achieved by the following process: determine whether the target feature distance is the minimum of the third feature distances, where the target feature distance is the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set; when the target feature distance is the minimum of the third feature distances, judge whether the target feature distance is greater than the distance threshold; when the target feature distance is greater than the distance threshold, determine the label expansion parameter according to the target feature distance and the distance threshold.

This application also provides another possible way to obtain the label expansion parameter: when the label expansion parameter is the label expansion coefficient, and the target feature distance is the minimum of the third feature distances and is greater than the distance threshold (this is the case corresponding to difficult samples), the label expansion coefficient is determined to be a first value that is related to the target feature distance and the distance threshold and is greater than 0 and less than 1. In a possible implementation manner, the first value can be determined according to the ratio of the distance threshold to the target feature distance and a preset coefficient, and the first value is used as the label expansion coefficient, where the preset coefficient is greater than 0 and less than 1.

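In a possible implementation manner, the above-mentioned first value may be determined according to the following formula:

$$h(f) = k \cdot \frac{d}{d(q_i, b_i)}$$

Among them, h(f) represents the first value, d represents the distance threshold, d(q_i, b_i) represents the target feature distance (the feature distance from the sample image q_i to its corresponding standard image b_i in the target image set), k represents the preset coefficient, and 0<k<1.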

In some possible embodiments, in order not to increase the misrecognition rate, and again taking the label expansion parameter as the label expansion coefficient as an example, a further way is given to determine the label expansion parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set. That is, the above step 306 may also include: when the target feature distance is not the minimum of the third feature distances (this is the case of misrecognition), determining the label expansion coefficient to be a second value, where the second value is greater than or equal to 1; when the target feature distance is less than the distance threshold (this is the case of correct identification), determining the label expansion coefficient to be 1. In this way, in the case of misrecognition, setting the label expansion coefficient to a second value greater than or equal to 1 does not further increase the misrecognition rate; in the case of correct recognition, setting the label expansion coefficient to 1 does not affect the recognition result.

In some possible embodiments, the label expansion coefficient in the embodiments of this application can be denoted as h(f). In a possible implementation manner, the relationship between the target feature distance, the third feature distances, and the distance threshold can be divided into the following three situations:

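1. $d(q_i, b_i) = \min_b d(q_i, b)$ and $d(q_i, b_i) \le d$;

2. $d(q_i, b_i) = \min_b d(q_i, b)$ and $d(q_i, b_i) > d$;

3. $d(q_i, b_i) > \min_b d(q_i, b)$.

Among them, $d(q_i, b_i)$ represents the feature distance from the sample image q_i to the corresponding standard image b_i in the target image set (i.e., the target feature distance); $\min_b d(q_i, b)$ represents the minimum of the third feature distances from the sample image q_i to each image b in the target image set; d represents the distance threshold.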

In some possible embodiments, for the first situation that can be correctly identified, h(f)=1 is set, so that it will not affect the identification result;

In some possible embodiments, for the third type of misrecognition, h(f)≥1 is set, so that the misrecognition rate will not increase;

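In some possible embodiments, for the second case, corresponding to difficult samples, you can set $h(f) = 0.99 \cdot d / d(q_i, b_i)$, where $d(q_i, b_i)$ denotes the target feature distance (that is, the above-mentioned preset coefficient k is 0.99), so that $h(f) \cdot d(q_i, b_i) = 0.99\,d < d$ and the sample can be correctly identified.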
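The three situations can be summarized in a short Python sketch for computing a label expansion coefficient. The function and argument names are hypothetical; the default `second_value` of 1.0 is simply one admissible choice satisfying the stated condition that the second value is greater than or equal to 1, and treating a target feature distance equal to the threshold as correctly identified is an assumption of this sketch.

```python
from typing import Sequence


def label_expansion_coefficient(
    third_distances: Sequence[float],
    target_index: int,
    threshold: float,
    k: float = 0.99,
    second_value: float = 1.0,
) -> float:
    """Label coefficient h(f) for one sample image.

    third_distances[j] is the distance from the sample to image j of the
    target image set; target_index points at its corresponding standard image.
    """
    target = third_distances[target_index]
    if target > min(third_distances):
        # Misrecognition: another image is closer, so do not shrink distances.
        return second_value
    if target <= threshold:
        # Correctly identified already: leave the distance unchanged.
        return 1.0
    # Difficult sample: closest image is correct but beyond the threshold.
    return k * threshold / target


h_difficult = label_expansion_coefficient([0.8, 1.5], target_index=0, threshold=0.5)
```

With these hypothetical numbers, multiplying the target feature distance 0.8 by `h_difficult` gives 0.99 times the threshold, which falls just inside the threshold.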

Step 308: Determine the loss value of the initial neural network model according to the aforementioned predicted expansion and contraction parameters and the label expansion and contraction parameters.

In some possible implementations, the predicted expansion parameter and the label expansion parameter can be substituted into the loss function of the initial neural network model to obtain the loss value of the initial neural network model.

Step 310: Update the parameters in the initial neural network model according to the aforementioned loss value to obtain a trained neural network model.

It should be noted that there is no required order of execution between the above step 304 and step 306; for steps not described in detail in the above step 302 to step 310, refer to the corresponding content of the foregoing embodiment or related prior art, which will not be repeated here.
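To make steps 304, 308 and 310 concrete, the following Python sketch performs gradient descent updates on a one-layer linear model. The squared-error loss, the learning rate, the function name and the sample values are all assumptions of this sketch; the application does not specify the loss function or the optimizer.

```python
from typing import List, Sequence, Tuple


def train_step(
    weights: Sequence[float],
    bias: float,
    feature: Sequence[float],
    label: float,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float, float]:
    """One update of a one-layer linear model; returns (weights, bias, loss)."""
    # Step 304: predicted expansion parameter from the sample feature.
    predicted = sum(w * x for w, x in zip(weights, feature)) + bias
    # Step 308: loss value (squared error is an assumption of this sketch).
    error = predicted - label
    loss = error * error
    # Step 310: gradient descent update of the model parameters.
    new_weights = [w - learning_rate * 2.0 * error * x for w, x in zip(weights, feature)]
    new_bias = bias - learning_rate * 2.0 * error
    return new_weights, new_bias, loss


# Hypothetical sample: a 2-dimensional feature with label coefficient 0.6.
weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(5):
    weights, bias, loss = train_step(weights, bias, [1.0, 0.5], 0.6)
    losses.append(loss)
```

In this toy run the recorded loss shrinks at every step, which is the behaviour one would expect when the predicted parameter is being pulled toward its label.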

On the basis of the image recognition method provided in the embodiment of this application, this embodiment also provides a possible example of applying the aforementioned image recognition method. In this example, the target object to be recognized is a human face, that is, the aforementioned image of the target object to be recognized is a face image to be recognized; there are multiple base library images, and the target expansion parameter is the target expansion coefficient. Referring to the flowchart of another image recognition method shown in FIG. 4, the method mainly includes the following steps S402 to S412:

Step S402: Extract the features of the face image to be recognized.

Step S404: Calculate the first feature distance between the feature of the face image to be recognized and the feature of each base library image.

Step S406: Input the features of the face image to be recognized into the neural network model to obtain the target expansion coefficient corresponding to the face image to be recognized.

Step S408: Multiply each first feature distance and the target expansion coefficient to obtain a second feature distance between the face image to be recognized and the base library image, and the target expansion coefficient is greater than 0 and less than 1.

Step S410: Determine the numerical value relationship between each second characteristic distance and the distance threshold.

Step S412: When there is a target second characteristic distance smaller than the distance threshold in each second characteristic distance, determine the face in the base library image corresponding to the target second characteristic distance as the face recognition result of the face image to be recognized.

In the above-mentioned image recognition method provided by this embodiment, when performing face recognition on the face image to be recognized, the first feature distance is subjected to a distance scaling transformation under the target expansion coefficient, which narrows the feature distance from the face image to be recognized to the base library image. Because the target expansion coefficient is related to the features of the face image to be recognized, the accuracy and pass rate of face recognition for images taken under dark light, top light or large angles are improved without increasing the misrecognition rate, and the false rejection rate is reduced, thereby improving the user experience.
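Steps S404 to S412 can be sketched end to end in Python as follows. Feature extraction (step S402) is assumed to have been done already, the Euclidean distance is an assumption of this sketch rather than the distance formula of this application, and the names `recognize_face`, `base_library` and `coefficient_model` are hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Sequence

Feature = Sequence[float]


def euclidean(a: Feature, b: Feature) -> float:
    """Assumed distance measure between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_face(
    query_feature: Feature,
    base_library: Dict[str, Feature],
    coefficient_model: Callable[[Feature], float],
    threshold: float,
) -> Optional[str]:
    if not base_library:
        return None
    # S404: first feature distance to every base library image.
    first = {name: euclidean(query_feature, f) for name, f in base_library.items()}
    # S406: target expansion coefficient predicted from the query feature.
    h = coefficient_model(query_feature)
    # S408: second feature distance = first feature distance times coefficient.
    second = {name: h * d for name, d in first.items()}
    # S410 and S412: accept the closest identity if it is under the threshold.
    best = min(second, key=second.get)
    return best if second[best] < threshold else None


library = {"alice": [0.0, 0.0], "bob": [3.0, 4.0]}
passed = recognize_face([0.6, 0.8], library, lambda f: 0.8, threshold=0.9)
failed = recognize_face([0.6, 0.8], library, lambda f: 0.95, threshold=0.9)
```

In this toy example the query lies at distance 1.0 from "alice", so it is accepted only when the coefficient shrinks that distance below the threshold of 0.9.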

Corresponding to the image recognition method provided by the embodiment of the application, the embodiment of the application provides an image recognition device. Refer to the structural block diagram of the image recognition device shown in FIG. 5, the device includes the following modules:

The extraction module 52 is configured to extract features of the image of the target object to be recognized;

The calculation module 54 is configured to calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image;

The transformation module 56 is configured to obtain the second characteristic distance between the image of the target object to be identified and the base library image according to the first characteristic distance and the target expansion parameter; wherein the target expansion parameter is related to the characteristics of the target object image to be identified;

The determining module 58 is configured to determine the target object recognition result in the image of the target object to be recognized according to the second characteristic distance.

The above-mentioned image recognition device provided by the embodiment of the present application makes full use of the distinguishability of difficult samples. When image recognition is performed on the image of the target object to be recognized, the first feature distance is subjected to a distance expansion and contraction transformation under the target expansion parameter, which narrows the feature distance from the image of the target object to be recognized to the base library image. Because the target expansion parameter is related to the features of the image of the target object to be recognized, the recognition accuracy and pass rate for images taken under dark light, top light or large angles are improved without increasing the misrecognition rate, and the false rejection rate is reduced, thereby improving the user experience.

As a possible implementation manner, the foregoing calculation module 54 may be configured as:

As a possible implementation manner, the above-mentioned transformation module 56 may be configured as:

Input the characteristics of the image of the target to be recognized into the neural network model to obtain the target expansion parameters corresponding to the image of the target to be recognized;

The first characteristic distance is numerically transformed using the target stretch parameter to obtain the second characteristic distance between the image of the target object to be identified and the image of the base library.

As a possible implementation manner, the aforementioned target expansion parameter includes a target expansion coefficient or a target expansion value; the aforementioned transformation module 56 may also be configured as:

Multiply the first characteristic distance and the target expansion coefficient to obtain the second characteristic distance between the target image to be identified and the base library image; the target expansion coefficient is greater than 0 and less than 1;

Or, perform a subtraction operation on the first characteristic distance and the target stretch value to obtain the second characteristic distance between the image of the target object to be identified and the image of the base library.

In an optional implementation manner, the above determining module 58 may be configured as:

Determine whether the second characteristic distance is less than or equal to the distance threshold;

If the second characteristic distance is less than or equal to the distance threshold, the target in the base library image is determined as the target recognition result in the to-be-recognized target image.

In another optional implementation manner, there are multiple base library images, and each base library image corresponds to a second characteristic distance; the determining module 58 may also be configured to:

Judge the numerical relationship between each second characteristic distance and the distance threshold;

When there is a target second characteristic distance smaller than the distance threshold in each second characteristic distance, the target object in the base library image corresponding to the target second characteristic distance is determined as the target object recognition result.

As a possible implementation manner, the above determining module 58 may also be configured as:

Determine whether the minimum value of each second characteristic distance is less than or equal to the distance threshold;

If the minimum value of each second characteristic distance is less than or equal to the distance threshold, the minimum value of each second characteristic distance is determined as the target second characteristic distance.

In an implementation manner, the above-mentioned target expansion parameters are determined by a neural network model. Refer to the structural block diagram of another image recognition device shown in FIG. 6. On the basis of FIG. 5, the above-mentioned device is also equipped with a training module 62. The training module 62 can be configured as:

Extract the features of the sample image;

Input the features of the sample image into the initial neural network model to obtain the predicted expansion and contraction parameters;

Determine the label expansion parameter corresponding to the sample image according to the third characteristic distance between the characteristic of the sample image and the characteristic of each image in the target image set;

Determine the loss value of the initial neural network model according to the predicted expansion and contraction parameters and the label expansion and contraction parameters;

The parameters in the initial neural network model are updated according to the loss value to obtain the trained neural network model.

As a possible implementation manner, the above-mentioned training module 62 may be configured as:

Determine whether the target feature distance is the minimum of the third feature distances; the target feature distance is the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set;

When the target characteristic distance is the minimum value among the third characteristic distances, it is judged whether the target characteristic distance is greater than the distance threshold;

When the target feature distance is greater than the distance threshold, the label stretch parameter is determined according to the target feature distance and the distance threshold.

As a possible implementation manner, the aforementioned label expansion parameter includes a label expansion coefficient; the aforementioned training module 62 is further configured to:

It is determined that the label expansion coefficient is the first value related to the target feature distance and the distance threshold, and the first value is greater than 0 and less than 1.

As a possible implementation manner, the above-mentioned training module 62 is further configured to:

The first value is determined according to the ratio of the distance threshold to the target characteristic distance and the preset coefficient, and the first value is used as the label expansion coefficient; wherein the preset coefficient is greater than 0 and less than 1.

As a possible implementation manner, the above-mentioned label expansion parameter includes a label expansion coefficient; the above-mentioned training module 62 may also be configured as:

When the target feature distance is not the minimum value among the third feature distances, it is determined that the label expansion coefficient is the second value, and the second value is greater than or equal to 1.

When the target feature distance is not greater than the distance threshold, the label expansion coefficient is determined to be 1.

The implementation principles and technical effects of the device provided in this embodiment are the same as those of the foregoing method embodiments. For brevity, for parts not mentioned in the device embodiments, please refer to the corresponding content in the embodiments of the image recognition method provided in this application.

In addition, the embodiments of the present application also provide a computer-readable storage medium having a computer program stored thereon; when the computer program is run by a processor, the image recognition method described in the foregoing method embodiment is executed.

The computer program product of the image recognition method and device provided in the embodiments of the present application includes a computer-readable storage medium storing program code, and the instructions included in the program code can be configured to execute the method described in the previous method embodiment. For specific implementation, please refer to the method embodiment, which will not be repeated here.

If the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM for short), random access memory (RAM), magnetic disk, optical disk, and other media that can store program code.

In all the examples shown and described herein, any specific value should be interpreted as merely exemplary, rather than as a limitation, and therefore, other examples of the exemplary embodiment may have different values.

In the description of this application, it should be noted that the orientation or positional relationship indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. is based on the orientation or positional relationship shown in the drawings, and is only for the convenience of describing the application and simplifying the description. It does not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore cannot be understood as a limitation of this application. In addition, the terms "first", "second", and "third" are used only for descriptive purposes, and cannot be understood as indicating or implying relative importance.

Finally, it should be noted that the above-mentioned embodiments are only specific implementations of this application, which are used to illustrate the technical solution of this application rather than to limit it, and the scope of protection of this application is not limited to them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art can still, within the technical scope disclosed in this application, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features; these modifications, changes or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should be covered within the scope of protection of this application. Therefore, the protection scope of this application should be subject to the protection scope of the claims.

Industrial applicability

This application provides an image recognition method, device, electronic equipment, and computer-readable storage medium, and relates to the technical field of image processing. When performing image recognition, first extract the features of the image of the target object to be recognized; then calculate the first feature distance between the feature of the image of the target object to be recognized and the feature of the base library image, and obtain the second feature distance between the image of the target object to be recognized and the base library image according to the first feature distance and the target expansion parameter, where the target expansion parameter is related to the feature of the image of the target object to be recognized; and then determine, according to the second feature distance, the target object recognition result in the image of the target object to be recognized. In this way, by performing the distance scaling transformation under the target expansion parameter on the first feature distance, the feature distance from the image of the target object to be recognized to the base library image is narrowed, thereby improving the recognition pass rate for images taken under dark light, top light or large angles without increasing the misrecognition rate.

Claims

An image recognition method, characterized in that it comprises:

Extract the features of the target image to be recognized;

Calculating the first feature distance between the feature of the target image to be recognized and the feature of the base library image;

According to the first characteristic distance and the target expansion parameter, the second characteristic distance between the image of the target object to be recognized and the base library image is obtained; wherein the target expansion parameter is related to the features of the image of the target object to be recognized;

According to the second characteristic distance, a target recognition result in the image of the target to be recognized is determined.
The method according to claim 1, wherein the step of calculating the first feature distance between the feature of the target image to be recognized and the feature of the base library image comprises:

The first feature distance d_{12} between the feature of the target image to be recognized and the feature of the base library image is calculated by the following formula:

Wherein, f_{1,i} represents the i-th element of the feature of the base library image, and f_{2,i} represents the i-th element of the feature of the target image to be recognized.
The method according to claim 1, wherein the step of obtaining the second characteristic distance between the image of the target object to be recognized and the image of the base library according to the first characteristic distance and the target expansion parameter comprises:

Inputting the features of the image of the target object to be recognized into a neural network model to obtain the target expansion parameters corresponding to the image of the target object to be recognized;

The first characteristic distance is numerically transformed by using the target stretch parameter to obtain the second characteristic distance between the image of the target object to be recognized and the image of the base library.
The method according to claim 3, wherein the target expansion parameter comprises a target expansion coefficient or a target expansion value; the step of performing a numerical transformation on the first characteristic distance by using the target expansion parameter to obtain the second characteristic distance between the image of the target object to be recognized and the base library image comprises:

Multiplying the first characteristic distance and the target expansion coefficient to obtain a second characteristic distance between the target object image to be identified and the base library image; the target expansion coefficient is greater than 0 and less than 1;

Or, performing a subtraction operation on the first characteristic distance and the target stretch value to obtain the second characteristic distance between the image of the target object to be recognized and the image of the base library.
The method according to any one of claims 1 to 4, wherein there is one base library image, and the base library image corresponds to one second characteristic distance; the step of determining, according to the second characteristic distance, the target recognition result in the image of the target to be recognized comprises:

Judging whether the second characteristic distance is less than or equal to a distance threshold;

If the second characteristic distance is less than or equal to the distance threshold, the target in the base library image is determined as the target recognition result in the to-be-recognized target image.
The method according to any one of claims 1 to 4, wherein there are multiple base library images, and each base library image corresponds to one second characteristic distance; the step of determining the target recognition result in the image of the target to be recognized according to the second characteristic distance comprises:

Judging the numerical magnitude relationship between each of the second characteristic distances and the distance threshold;

When there is a target second characteristic distance smaller than the distance threshold in each of the second characteristic distances, the target object in the base library image corresponding to the target second characteristic distance is determined as the target object recognition result.
The method according to claim 6, wherein the step of judging the numerical value relationship between each of the second characteristic distances and a distance threshold comprises:

Judging whether the minimum value of each of the second characteristic distances is less than or equal to the distance threshold;

If the minimum value of each of the second characteristic distances is less than or equal to the distance threshold, the minimum value of each of the second characteristic distances is determined as the target second characteristic distance.
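The threshold decision over several base library images can be sketched as follows. This is an illustrative sketch only: the function name `identify` and the use of a dictionary keyed by base library image identifier are assumptions, and the single-image case corresponds to a dictionary with one entry.

```python
def identify(second_distances, threshold):
    """Return the identifier of the matching base library image, or None.

    `second_distances` maps a (hypothetical) base library image identifier
    to its second feature distance. The smallest distance is selected and
    accepted only if it is less than or equal to the threshold.
    """
    if not second_distances:
        return None
    best = min(second_distances, key=second_distances.get)
    return best if second_distances[best] <= threshold else None
```

Selecting the minimum first means only one comparison against the threshold is needed, and it resolves the case where several base library images fall under the threshold at once.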
The method according to any one of claims 1-7, wherein the target expansion and contraction parameters are determined by a neural network model, and the neural network model is obtained by training in the following steps:

Extract the features of the sample image;

Input the features of the sample image into the initial neural network model to obtain the predicted expansion and contraction parameters;

Determine the label expansion parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set;

Determine the loss value of the initial neural network model according to the predicted expansion and contraction parameters and the label expansion and contraction parameters;

The parameters in the initial neural network model are updated according to the loss value to obtain the trained neural network model.
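The training steps above can be sketched with a deliberately tiny stand-in model. Everything below is an assumption made for illustration: the application does not specify the network architecture, the loss function, or the optimizer, so the sketch uses a single linear unit, a squared-error loss, and plain gradient descent. The names `predict` and `train_step` are hypothetical.

```python
def predict(weights, bias, feature):
    """Predicted expansion parameter: a linear function of the feature (assumed model)."""
    return sum(w * x for w, x in zip(weights, feature)) + bias


def train_step(weights, bias, feature, label, lr=0.1):
    """One gradient-descent update on the squared error (assumed loss).

    Returns the updated weights, the updated bias, and the loss value
    computed before the update.
    """
    pred = predict(weights, bias, feature)
    error = pred - label
    loss = error ** 2
    # d(loss)/d(w_i) = 2 * error * x_i ; d(loss)/d(bias) = 2 * error
    new_weights = [w - lr * 2.0 * error * x for w, x in zip(weights, feature)]
    new_bias = bias - lr * 2.0 * error
    return new_weights, new_bias, loss
```

Repeating `train_step` on a sample whose label expansion parameter is 0.8 drives the prediction toward 0.8; a real model would iterate over many sample images and would likely use a deeper network.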
The method according to claim 8, wherein the step of determining the label expansion parameter corresponding to the sample image according to the third characteristic distance between the characteristic of the sample image and the characteristic of each image in the target image set comprises:

Calculate the third feature distance between the feature of the sample image and the feature of each image in the target image set;

According to each third characteristic distance and the distance threshold, the label expansion and contraction parameters corresponding to the sample image are determined.
The method according to claim 8 or 9, wherein the step of determining the label expansion parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set comprises:

Determine whether the target feature distance is the minimum of the third feature distances; the target feature distance is the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set;

When the target characteristic distance is the minimum value among the third characteristic distances, judging whether the target characteristic distance is greater than a distance threshold;

When the target feature distance is greater than the distance threshold, the label expansion parameter is determined according to the target feature distance and the distance threshold.
The method according to claim 10, wherein the label expansion and contraction parameter comprises a label expansion coefficient; the step of determining the label expansion and contraction parameter according to the target feature distance and the distance threshold comprises:

It is determined that the label expansion coefficient is a first value related to the target feature distance and the distance threshold, and the first value is greater than 0 and less than 1.
The method according to claim 11, wherein the step of determining that the label expansion coefficient is a first value related to the target feature distance and the distance threshold comprises:

The first value is determined according to the ratio of the distance threshold to the target characteristic distance and a preset coefficient, and the first value is used as the label expansion coefficient; wherein the preset coefficient is greater than 0 and less than 1.
The method according to claim 10, wherein the label expansion parameter comprises a label expansion coefficient; the method further comprises:

When the target characteristic distance is not the minimum value among the third characteristic distances, it is determined that the label expansion coefficient is a second value, and the second value is greater than or equal to 1.
The method according to claim 10, wherein the label expansion parameter comprises a label expansion coefficient; the method further comprises:

When the target feature distance is less than the distance threshold, it is determined that the label expansion coefficient is 1.
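The rules for choosing a label expansion coefficient can be gathered into one hedged sketch. The function name `label_coefficient` and its parameters are hypothetical. Two points are assumptions rather than statements from the claims: the first value is computed as the product of the preset coefficient and the ratio of the distance threshold to the target feature distance, and a target feature distance exactly equal to the threshold is treated like one below it.

```python
def label_coefficient(third_distances, target_index, threshold,
                      preset=0.9, second_value=1.0):
    """Choose a label expansion coefficient for one sample image.

    `third_distances` holds the distance from the sample feature to every
    image in the target image set; `target_index` points at the sample's
    own standard image. `preset` must lie in (0, 1) and `second_value`
    must be at least 1 (illustrative defaults).
    """
    target = third_distances[target_index]
    if target > min(third_distances):
        # Another image is closer than the correct one: do not shrink.
        return second_value
    if target > threshold:
        # Correct image is nearest but too far: shrink it under the threshold.
        # Assumption: first value = preset * threshold / target, which lies in (0, 1).
        return preset * threshold / target
    # Already within the threshold: leave the distance unchanged.
    return 1.0
```

Under this assumed formula, multiplying the target feature distance by the first value gives `preset * threshold`, which is strictly below the threshold.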
The method according to any one of claims 1-14, wherein the target image to be recognized is a face image to be recognized, and the method further comprises:

Extract the features of the face image to be recognized;

Calculate the first feature distance between the feature of the face image to be recognized and the feature of each base library image;

Input the features of the face image to be recognized into the neural network model to obtain the target expansion coefficient corresponding to the face image to be recognized;

Multiply each first feature distance and the target expansion coefficient to obtain the second feature distance between the face image to be recognized and the base library image, and the target expansion coefficient is greater than 0 and less than 1;

Judge the numerical relationship between each second characteristic distance and the distance threshold;

When there is a target second characteristic distance smaller than the distance threshold in each second characteristic distance, the face in the base image corresponding to the target second characteristic distance is determined as the face recognition result of the face image to be recognized.
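The face recognition steps above can be strung together in a short sketch. It rests on several assumptions: features are plain lists of numbers, the first feature distance is Euclidean, the trained neural network is represented by an arbitrary callable `coefficient_model`, and the nearest base library image is the one reported when several fall under the threshold. None of these names come from the application.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(query_feature, base_features, coefficient_model, threshold):
    """Return the identifier of the matched base library face, or None.

    `base_features` maps an identifier to a feature vector;
    `coefficient_model` stands in for the trained neural network and
    must return a value strictly between 0 and 1.
    """
    coefficient = coefficient_model(query_feature)
    if not 0.0 < coefficient < 1.0:
        raise ValueError("expansion coefficient must lie strictly between 0 and 1")
    # Second feature distance = first feature distance * expansion coefficient.
    scaled = {name: euclidean(query_feature, feat) * coefficient
              for name, feat in base_features.items()}
    if not scaled:
        return None
    best = min(scaled, key=scaled.get)
    # Strict comparison, as recited in this claim.
    return best if scaled[best] < threshold else None
```

Because one coefficient is applied to every first feature distance of a given query, the ranking of base library images is unchanged; only the comparison against the threshold is affected.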
An image recognition device, characterized in that it comprises:

An extraction module, configured to extract features of the target image to be recognized;

A calculation module configured to calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image;

The transformation module is configured to obtain the second characteristic distance between the target object image to be identified and the base library image according to the first characteristic distance and the target expansion parameter; wherein the target expansion parameter is related to the features of the target object image to be identified;

The determining module is configured to determine the target recognition result in the image of the target to be recognized according to the second characteristic distance.
An electronic device, comprising a memory and a processor, wherein a computer program that can be run on the processor is stored in the memory, and the processor executes the computer program to implement the method according to any one of claims 1-15.
A computer-readable storage medium storing a computer program, wherein the method according to any one of claims 1-15 is executed when the computer program is run by a processor.