WO2021208373A1 - Image recognition method and apparatus, electronic device, and computer-readable storage medium


Info

Publication number
WO2021208373A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
distance
feature
characteristic
Prior art date
Application number
PCT/CN2020/119613
Other languages
English (en)
Chinese (zh)
Inventor
王亚可
王塑
刘宇
Original Assignee
北京迈格威科技有限公司 (Beijing Megvii Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京迈格威科技有限公司 (Beijing Megvii Technology Co., Ltd.)
Publication of WO2021208373A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • This application relates to the field of image processing technology, and in particular to an image recognition method, device, electronic equipment, and computer-readable storage medium.
  • Image recognition refers to the use of computers to process, analyze, and understand images in order to identify targets and objects in various patterns. Before performing image recognition, a target image of the target can first be entered into the image recognition system as a base library image in the base library, and image recognition is then performed based on the similarity between the target image to be recognized and the base library image. For example, the feature distance between the target image to be recognized and the base library image can be calculated (the higher the similarity, the smaller the feature distance), and image recognition can be performed by comparing the feature distance against a preset distance threshold.
  • However, the similarity between an image taken in a dark-light, top-light, or large-angle scene and its corresponding base library image is low. In other words, the feature distance between such a captured image and the corresponding base library image is generally large, so these captured images cannot be correctly identified. Therefore, the pass rate of the existing image recognition method is low.
  • One of the objectives of the present application is to provide an image recognition method, device, electronic device, and computer-readable storage medium to improve the pass rate of image recognition, thereby improving user experience.
  • In a first aspect, an embodiment of the present application provides an image recognition method, including: extracting features of the target image to be recognized; calculating a first feature distance between the features of the target image to be recognized and the features of the base library image; obtaining, according to the first feature distance and a target expansion parameter, a second feature distance between the target image to be recognized and the base library image, wherein the target expansion parameter is related to the features of the target image to be recognized; and determining, according to the second feature distance, the target recognition result in the target image to be recognized.
  • In some embodiments, the step of calculating the first feature distance between the feature of the target image to be recognized and the feature of the base library image includes: calculating the first feature distance d12 between the features of the target image to be recognized and the features of the base library image using the following formula (the Euclidean distance over the feature elements):

    d12 = sqrt( Σi (f1,i − f2,i)^2 )

    where f1,i represents the i-th element of the feature of the base library image, and f2,i represents the i-th element of the feature of the target image to be recognized.
  • In some embodiments, the step of obtaining the second feature distance between the target image to be recognized and the base library image according to the first feature distance and the target expansion parameter includes: inputting the features of the target image to be recognized into the neural network model to obtain the target expansion parameter corresponding to the target image to be recognized; and performing a numerical transformation on the first feature distance using the target expansion parameter to obtain the second feature distance between the target image to be recognized and the base library image.
  • In some embodiments, the target expansion parameter includes a target expansion coefficient or a target expansion value; the step of performing a numerical transformation on the first feature distance using the target expansion parameter to obtain the second feature distance between the target image to be recognized and the base library image includes: multiplying the first feature distance by the target expansion coefficient to obtain the second feature distance between the target image to be recognized and the base library image, wherein the target expansion coefficient is greater than 0 and less than 1.
  • In some embodiments, there is one base library image, and the base library image corresponds to one second feature distance; the step of determining the target recognition result in the target image to be recognized according to the second feature distance includes: determining whether the second feature distance is less than or equal to a distance threshold; and if the second feature distance is less than or equal to the distance threshold, determining the target in the base library image as the target recognition result of the target image to be recognized.
  • In some embodiments, there are multiple base library images, and each base library image corresponds to one second feature distance; the step of determining the target recognition result in the target image to be recognized according to the second feature distance includes: judging the numerical relationship between each second feature distance and a distance threshold; and when there is a target second feature distance smaller than the distance threshold among the second feature distances, determining the target in the base library image corresponding to the target second feature distance as the target recognition result.
  • In some embodiments, the step of judging the numerical relationship between each second feature distance and the distance threshold includes: judging whether the minimum value of the second feature distances is less than or equal to the distance threshold; and if the minimum value of the second feature distances is less than or equal to the distance threshold, determining that minimum value as the target second feature distance.
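  The minimum-distance decision described above can be sketched as follows (a minimal illustration; the function name and the list-based interface are assumptions, not from the patent):

```python
def recognize(second_distances, distance_threshold):
    """Pick the base library image with the minimal second feature
    distance; accept it only if that minimum does not exceed the
    distance threshold, otherwise report no match (None)."""
    best = min(range(len(second_distances)),
               key=lambda i: second_distances[i])
    if second_distances[best] <= distance_threshold:
        return best  # index of the matching base library image
    return None
```

  For example, with second feature distances [0.9, 0.4, 0.7] and a threshold of 0.5, index 1 is returned; if every distance exceeds the threshold, the target is rejected.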
  • In some embodiments, the target expansion parameter is determined by a neural network model, and the neural network model is obtained by training through the following steps: extracting the features of a sample image; inputting the features of the sample image into an initial neural network model to obtain a predicted expansion parameter; determining the label expansion parameter corresponding to the sample image according to the third feature distance between the features of the sample image and the features of each image in a target image set; determining the loss value of the initial neural network model according to the predicted expansion parameter and the label expansion parameter; and updating the parameters in the initial neural network model according to the loss value to obtain the trained neural network model.
  • In some embodiments, the step of determining the label expansion parameter corresponding to the sample image according to the third feature distance between the features of the sample image and the features of each image in the target image set includes: calculating the third feature distance between the features of the sample image and the features of each image in the target image set; and determining the label expansion parameter corresponding to the sample image according to each third feature distance and the distance threshold.
  • the step of determining the label expansion parameter corresponding to the sample image according to each third characteristic distance and the distance threshold includes: judging whether the target characteristic distance is the smallest value among the third characteristic distances; the target The feature distance is the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set; when the target feature distance is the smallest of the third feature distances Value, judging whether the target feature distance is greater than the distance threshold; when the target feature distance is greater than the distance threshold, the label stretch parameter is determined according to the target feature distance and the distance threshold.
  • In some embodiments, the label expansion parameter includes a label expansion coefficient, and the step of determining the label expansion parameter according to the target feature distance and the distance threshold includes: determining the label expansion coefficient to be a first value related to the target feature distance and the distance threshold, where the first value is greater than 0 and less than 1.
  • In some embodiments, the step of determining the label expansion coefficient to be the first value related to the target feature distance and the distance threshold includes: determining the first value according to the ratio of the distance threshold to the target feature distance and a preset coefficient, and using the first value as the label expansion coefficient; wherein the preset coefficient is greater than 0 and less than 1.
  • In some embodiments, the label expansion parameter includes a label expansion coefficient; the method further includes: when the target feature distance is not the smallest value among the third feature distances, determining that the label expansion coefficient is a second value, where the second value is greater than or equal to 1.
  • the label expansion parameter includes a label expansion coefficient; the method further includes: when the target feature distance is less than the distance threshold, determining that the label expansion coefficient is 1.
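  Combining the three cases above (not the smallest distance, smaller than the threshold, or the smallest but above the threshold), the label expansion coefficient rule can be sketched as below. How the threshold-to-distance ratio combines with the preset coefficient is not spelled out in the text, so a simple product is assumed; all names and default values are illustrative:

```python
def label_stretch_coefficient(target_dist, third_dists, threshold,
                              preset_coeff=0.9, second_value=1.0):
    """target_dist: third feature distance to the sample's own standard
    image; third_dists: distances to every image in the target image set
    (target_dist included). preset_coeff is in (0, 1), second_value >= 1."""
    if target_dist > min(third_dists):
        # nearest image belongs to the wrong target: do not shrink
        return second_value
    if target_dist < threshold:
        # already correctly recognized: leave the distance unchanged
        return 1.0
    # hard sample: shrink so the scaled distance falls below the threshold
    return preset_coeff * (threshold / target_dist)
```

  With the product form, the scaled distance preset_coeff * threshold is always strictly below the threshold, so a hard sample whose standard image is the nearest neighbor is pulled inside the decision boundary.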
  • In some embodiments, the target image to be recognized is a face image to be recognized, and the method further includes: extracting features of the face image to be recognized; calculating the first feature distance between the features of the face image to be recognized and the features of each base library image; inputting the features of the face image to be recognized into the neural network model to obtain the target expansion coefficient corresponding to the face image to be recognized; multiplying each first feature distance by the target expansion coefficient to obtain the second feature distance between the face image to be recognized and each base library image, wherein the target expansion coefficient is greater than 0 and less than 1; determining the numerical relationship between each second feature distance and the distance threshold; and when there is a target second feature distance smaller than the distance threshold among the second feature distances, determining the face in the base library image corresponding to the target second feature distance as the face recognition result of the face image to be recognized.
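  The face recognition embodiment above can be sketched end to end as follows. The Euclidean first feature distance and the function name are assumptions for illustration, and the coefficient would come from the trained neural network model in the actual method:

```python
import numpy as np

def recognize_face(query_feat, base_feats, coeff, threshold):
    """Scale the Euclidean first feature distances to every base
    library feature by the target expansion coefficient (0 < coeff < 1)
    and accept the closest base image if it clears the threshold."""
    q = np.asarray(query_feat, dtype=float)
    first = [float(np.linalg.norm(q - np.asarray(b, dtype=float)))
             for b in base_feats]
    second = [coeff * d for d in first]   # second feature distances
    best = int(np.argmin(second))
    return best if second[best] <= threshold else None
```

  Because every first distance is multiplied by the same coefficient, the ranking of base library images is unchanged; only the comparison against the threshold is relaxed for hard samples.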
  • An embodiment of the present application also provides an image recognition device, including: an extraction module configured to extract features of a target image to be recognized; a calculation module configured to calculate the first feature distance between the features of the target image to be recognized and the features of the base library image; a transformation module configured to obtain, according to the first feature distance and a target expansion parameter, the second feature distance between the target image to be recognized and the base library image, wherein the target expansion parameter is related to the features of the target image to be recognized; and a determining module configured to determine, according to the second feature distance, the target recognition result in the target image to be recognized.
  • An embodiment of the present application also provides an electronic device, including a memory and a processor, wherein the memory stores a computer program that can run on the processor, and when the processor executes the computer program, the image recognition method of the first aspect is implemented.
  • An embodiment of the present application also provides a computer-readable storage medium having a computer program stored thereon, and when the computer program is run by a processor, the image recognition method of the first aspect is executed.
  • The embodiments of the application provide an image recognition method, device, electronic device, and computer-readable storage medium. When performing image recognition on a target image to be recognized, the features of the target image to be recognized are first extracted; the first feature distance between the features of the target image to be recognized and the features of the base library image is then calculated, and the second feature distance between the target image to be recognized and the base library image is obtained according to the first feature distance and the target expansion parameter, wherein the target expansion parameter is related to the features of the target image to be recognized; finally, the target recognition result in the target image to be recognized is determined according to the second feature distance.
  • In this way the feature distance of the target image to be recognized is shortened, and because the target expansion coefficient is related to the features of the target image to be recognized, the misrecognition rate does not increase: the recognition accuracy and pass rate of images taken under dark light, top light, or large angles are improved, and the false rejection rate is reduced, thereby improving the recognition effect and user experience.
  • FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present application
  • Fig. 2 shows a flowchart of an image recognition method provided by an embodiment of the present application
  • FIG. 3 shows a schematic diagram of the principle of distance expansion and contraction in an image recognition method provided by an embodiment of the present application
  • FIG. 4 shows a flowchart of another image recognition method provided by an embodiment of the present application.
  • Fig. 5 shows a structural block diagram of an image recognition device provided by an embodiment of the present application
  • Fig. 6 shows a structural block diagram of another image recognition device provided by an embodiment of the present application.
  • In the existing image recognition method, when the feature distance between a captured image and a base library image is smaller than the distance threshold, the recognition result is determined to be the target in that base library image. However, for some difficult samples, such as images taken under dark light, top light, or large angles, the feature distance may be greater than the distance threshold, causing these difficult samples to not be correctly identified.
  • The image recognition method, device, electronic device, and computer-readable storage medium provided by the embodiments of the present application can improve the pass rate of image recognition of difficult samples without increasing the misrecognition rate, thereby enhancing the user experience.
  • FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present application
  • The electronic device 100 shown in FIG. 1 is an example of an electronic device that can be used to implement the image recognition method of an embodiment of the present application.
  • The electronic device 100 can be configured with one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110, and these components can be interconnected through a bus system 112 and/or other forms of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive; as required, the electronic device may have only some of the components shown in FIG. 1, or may have other components and structures.
  • The processor 102 may be implemented in at least one hardware form of a digital signal processor (DSP), a field programmable gate array (FPGA), or a programmable logic array (PLA). The processor 102 may be one of, or a combination of, a central processing unit (CPU), a graphics processing unit (GPU), or other forms of processing units with data processing capabilities and/or instruction execution capabilities, and can control other components in the electronic device 100 to perform the desired functions.
  • the storage device 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions and/or other desired functions in the embodiments of the present application described below. Various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.
  • the input device 106 may be a device used by the user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.
  • the output device 108 may output various information (for example, text, image, or sound) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
  • the image capture device 110 may capture images (such as photos, videos, etc.) desired by the user, and store the captured images in the storage device 104 for use by other components.
  • the exemplary electronic device 100 for implementing the image recognition method according to the embodiment of the present application may be implemented as a smart terminal such as a smart phone, a tablet computer, or a computer.
  • FIG. 2 shows a flowchart of an image recognition method provided by an embodiment of the present application.
  • the method mainly includes the following steps S202 to S208:
  • Step S202 Extract the features of the image of the target object to be recognized.
  • The aforementioned target to be recognized may be, but is not limited to, a human face, a human body, an animal, or a vehicle (such as a car, a boat, or a bicycle), and the target image to be recognized may be an image taken under dark light, top light, or at a large angle. The image recognition method provided in the embodiments of the present application is suitable for image recognition of such images.
  • the characteristics of the target image to be recognized can be extracted through the corresponding neural network model after pre-training. The specific extraction process of the characteristics can refer to the related prior art, which will not be repeated here.
  • Step S204 Calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image.
  • The electronic device used to execute the image recognition method provided in the embodiments of the present application may pre-store base library images. There may be one or more base library images, and each base library image contains one target; in this embodiment, one base library image corresponds to one target.
  • The above-mentioned first feature distance may represent the distance in feature space from the target image to be recognized to the base library image. It is understandable that the first feature distances may correspond one-to-one to the base library images: if multiple base library images are stored in the electronic device, multiple first feature distances are obtained.
  • the above-mentioned feature of the base library image may be obtained by performing feature extraction on the base library image before step S204 is performed, or may be pre-extracted and stored in the electronic device.
  • the method of extracting the features of the base library image is the same as the method of extracting the features of the target image to be recognized.
  • In some embodiments, the features of the target image to be recognized and the features of the base library image may both be in the form of a matrix. Since a matrix usually contains multiple elements, the features of the target image to be recognized and the features of the base library image each contain multiple elements.
  • Taking as an example the case where the features of the target image to be recognized and the features of the base library image are both one-dimensional matrices, the embodiments of the present application provide a possible implementation for calculating the first feature distance between the features of the target image to be recognized and the features of the base library image, as follows:
  • The first feature distance d12 between the feature of the target image to be recognized and the feature of the base library image is calculated by the following formula (the Euclidean distance over the feature elements):

    d12 = sqrt( Σi (f1,i − f2,i)^2 )

    where f1,i represents the i-th element of the feature of the base library image, and f2,i represents the i-th element of the feature of the target image to be recognized.
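  Assuming the formula is the element-wise Euclidean distance implied by the definitions of f1,i and f2,i, a minimal sketch (the function name is illustrative):

```python
import numpy as np

def first_feature_distance(f1, f2):
    """Euclidean first feature distance d12 between the base library
    feature f1 and the to-be-recognized target feature f2 (1-D arrays)."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    return float(np.sqrt(np.sum((f1 - f2) ** 2)))
```

  Identical features give a distance of 0, and the distance grows as the two feature vectors diverge.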
  • Step S206 Obtain a second characteristic distance between the image of the target object to be identified and the image of the base library according to the first characteristic distance and the target expansion parameter; wherein, the target expansion parameter is related to the characteristics of the target object image to be identified.
  • The first feature distance obtained through the above step S204 may be greater than the preset distance threshold, and in this case the target image to be recognized may not be correctly identified.
  • For this reason, the embodiment of the present application provides a possible implementation: difficult samples can be brought closer to the base library images by performing a distance scaling transformation on the first feature distance, thereby improving the pass rate of image recognition for difficult samples, avoiding recognizing images that contain the target as images that do not, and reducing the false rejection rate.
  • The target expansion parameter used when performing the distance scaling transformation is related to the features of the target image to be recognized. This ensures that the scaling process does not bring an image without the target closer than the distance threshold, so no misrecognition is introduced; that is, the misrecognition rate does not increase.
  • FIG. 3 shows a schematic diagram of the principle of distance expansion and contraction in an image recognition method provided by an embodiment of the present application
  • In FIG. 3, base represents the base library image, simple query represents simple samples (that is, images that are easily recognized correctly), and difficult query represents difficult samples. The dots and triangles represent the positions in the feature space of the images corresponding to two different targets (including base library images, simple samples, and difficult samples), and the circle corresponds to the distance threshold (samples inside the circle can be correctly identified, samples outside the circle cannot).
  • Before the distance scaling transformation, the difficult samples are all located outside the circle, so they cannot be correctly identified; after the distance scaling transformation is performed on the first feature distance, the difficult samples located outside the circle are drawn into the circle, the feature distance between each difficult sample and its corresponding base library image becomes smaller, and the difficult samples can be correctly identified.
  • In some embodiments, step S206 can be implemented by the following process: input the features of the target image to be recognized into the neural network model to obtain the target expansion parameter corresponding to the target image to be recognized; and perform a numerical transformation on the first feature distance using the target expansion parameter to obtain the second feature distance between the target image to be recognized and the base library image.
  • The aforementioned neural network model may be a pre-trained neural network model, for example a single fully connected layer. The input of the neural network model may be the features of the target image to be recognized, and its output is a real value, which can serve as the target expansion parameter in the embodiments of this application.
  • the aforementioned target expansion parameter may include a target expansion coefficient or a target expansion value.
  • The second feature distance can be obtained by the following implementations. In one implementation, the first feature distance is multiplied by the target expansion coefficient to obtain the second feature distance between the target image to be recognized and the base library image, where the target expansion coefficient is greater than 0 and less than 1; in another implementation, the target expansion value is subtracted from the first feature distance to obtain the second feature distance between the target image to be recognized and the base library image.
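  The two variants of the numerical transformation in step S206 can be sketched as follows. A single fully connected layer with a sigmoid output is used here to keep the coefficient in (0, 1); that squashing is an illustrative choice, since the text only states that the model outputs a real value:

```python
import numpy as np

def stretch_coefficient(feature, W, b):
    """One fully connected layer h(f2) = sigmoid(W . f2 + b) mapping the
    target feature to a scalar in (0, 1). W and b would come from training."""
    z = float(np.dot(W, np.asarray(feature, dtype=float)) + b)
    return 1.0 / (1.0 + np.exp(-z))

def second_feature_distance(d12, coeff=None, stretch_value=None):
    """Distance scaling transformation of step S206: multiply by a
    coefficient in (0, 1), or subtract a target expansion value."""
    if coeff is not None:
        return d12 * coeff          # multiplication variant
    return d12 - stretch_value      # subtraction variant
```

  Both variants shrink the first feature distance; the multiplicative form scales hard samples proportionally, while the subtractive form shifts every distance by the same amount.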
  • Step S208 Determine the target recognition result in the image of the target to be recognized according to the above-mentioned second characteristic distance.
  • the target recognition result in the image of the target to be recognized can be determined by comparing the second characteristic distance with a preset distance threshold.
  • the distance threshold can be set according to the required misrecognition rate, which is not limited here.
  • In a specific implementation, step S208 may be realized by the following process: determine whether the second feature distance is less than or equal to the distance threshold; if the second feature distance is less than or equal to the distance threshold, determine the target in the base library image as the target recognition result in the target image to be recognized.
  • When there are multiple base library images, each base library image corresponds to one second feature distance, so there are also multiple second feature distances. In this case, step S208 can be implemented by the following process: judge the numerical relationship between each second feature distance and the distance threshold; when there is a target second feature distance smaller than the distance threshold among the second feature distances, determine the target in the base library image corresponding to the target second feature distance as the target recognition result in the target image to be recognized.
  • In some embodiments, the step of judging the numerical relationship between each second feature distance and the distance threshold may be: judge whether the minimum value of the second feature distances is less than or equal to the distance threshold; if the minimum value of the second feature distances is less than or equal to the distance threshold, determine the minimum value as the target second feature distance.
  • The above-described image recognition method of this embodiment makes full use of the distinguishability of difficult samples. The first feature distance undergoes a distance scaling transformation under the target expansion parameter, which shortens the feature distance from the target image to be recognized to the base library image; and because the target expansion parameter is related to the features of the target image to be recognized, the recognition accuracy and pass rate of images taken under dark light, top light, or large angles are improved without increasing the misrecognition rate, and the false rejection rate is reduced, thereby improving the user experience.
  • The following takes the target expansion parameter being the target expansion coefficient as an example and gives an implementation for obtaining the second feature distance. The second feature distance d″12 can be calculated by the following formula:

    d″12 = h(f2) · d12

    where d12 is the first feature distance and h(f2) represents the target expansion coefficient.
  • the embodiment of the present application also provides a training process of a neural network model, which mainly includes the following steps 302 to 310:
  • Step 302 Extract the features of the sample image.
  • the sample images in the training set may be images taken under shooting scenes such as dark light, overhead light, or large angles.
  • the process of extracting the features of the sample image can refer to the related prior art, which will not be repeated here.
  • Step 304 Input the features of the above sample image into the initial neural network model to obtain a predicted expansion parameter.
  • The predicted expansion parameter may include a predicted expansion coefficient or a predicted expansion value.
  • Step 306 Determine a label expansion parameter corresponding to the sample image according to the third feature distance between the feature of the sample image and the feature of each image in the target image set.
  • each image in the target image set may be the bottom library image in the embodiment of the present application.
  • The label expansion parameter corresponds to the aforementioned predicted expansion parameter: if the predicted expansion parameter is a predicted expansion coefficient, the label expansion parameter is a label expansion coefficient; if the predicted expansion parameter is a predicted expansion value, the label expansion parameter is a label expansion value.
  • The label expansion parameter corresponding to the sample image may be determined as follows: calculate the third feature distance between the feature of the sample image and the feature of each image in the target image set, and then determine the label expansion parameter corresponding to the sample image according to each third feature distance and the distance threshold.
  • The above distance threshold is the same as the distance threshold used when determining the target recognition result in the target image to be recognized based on the second feature distance, which ensures that the trained neural network model can accurately identify the target image.
  • The label expansion parameter may be determined according to the comparison result between each third feature distance and the distance threshold. For example, the average of the third feature distances may be compared with the distance threshold; or the maximum or minimum of the third feature distances may be compared with the distance threshold; a corresponding strategy for determining the label expansion parameter is then obtained according to the comparison result.
  • The value of the label expansion parameter may be determined jointly from the third feature distance and the distance threshold, or from a preset coefficient together with the third feature distance and the distance threshold. The following implementation of obtaining the label expansion parameter is given to facilitate understanding.
  • The way of determining the label expansion parameter corresponding to the sample image according to the third feature distances and the distance threshold can be achieved by the following process: judge whether the target feature distance is the minimum among the third feature distances, where the target feature distance is the third feature distance between the feature of the sample image and the feature of the standard image in the target image set corresponding to the sample image; when the target feature distance is the minimum among the third feature distances, judge whether the target feature distance is greater than the distance threshold; when the target feature distance is greater than the distance threshold, determine the label expansion parameter according to the target feature distance and the distance threshold.
  • When the label expansion parameter is the label expansion coefficient, the target feature distance is the minimum among the third feature distances, and the target feature distance is greater than the distance threshold (this is the case of a difficult sample), the label expansion coefficient is a first value related to the target feature distance and the distance threshold, greater than 0 and less than 1.
  • the first value can be determined according to the ratio of the distance threshold to the target characteristic distance and a preset coefficient, and the first value is used as the label expansion coefficient; wherein the preset coefficient is greater than 0 and less than 1.
  • The above first value may be determined according to the following formula: h(f) = k·d/dt, where h(f) represents the first value, d represents the distance threshold, k represents the preset coefficient, and dt represents the target feature distance.
  • According to the third feature distances between the feature of the sample image and the features of the images in the target image set, the method of determining the label expansion parameter corresponding to the sample image may also include: when the target feature distance is not the minimum among the third feature distances (this is the case of misidentification), determining the label expansion coefficient as a second value, where the second value is greater than or equal to 1; and when the target feature distance is less than the distance threshold (this is the case of correct identification), determining the label expansion coefficient as 1.
  • In the case of misidentification, setting the label expansion coefficient to a second value greater than or equal to 1 will not further increase the misrecognition rate; in the case of correct identification, setting the label expansion coefficient to 1 will not affect the recognition result.
  • Denoting the label expansion coefficient as h(f), the relationship between the target feature distance, the third feature distances, and the distance threshold can be divided into the following three situations: (1) the target feature distance is the minimum among the third feature distances and is greater than the distance threshold (a difficult sample), in which case 0 < h(f) < 1; (2) the target feature distance is not the minimum (misidentification), in which case h(f) ≥ 1 is set, so that the misrecognition rate will not increase; (3) the target feature distance is the minimum and does not exceed the distance threshold (correct identification), in which case h(f) = 1.
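The three situations for the label expansion coefficient can be sketched as a single function. The defaults k=0.9 and second_value=1.0 below are illustrative choices, not values from the original text.

```python
def label_expansion_coefficient(target_dist, third_dists, d, k=0.9, second_value=1.0):
    """Label expansion coefficient h(f) for one sample, per the three
    situations above.

    target_dist  -- third feature distance to the sample's own standard image
    third_dists  -- all third feature distances for this sample
    d            -- distance threshold
    k            -- preset coefficient, 0 < k < 1 (illustrative default)
    second_value -- value >= 1 used in the misidentification case
    """
    if target_dist != min(third_dists):
        return second_value              # (2) misidentification: h(f) >= 1
    if target_dist > d:
        return k * d / target_dist       # (1) difficult sample: 0 < h(f) < 1
    return 1.0                           # (3) correct identification: h(f) = 1
```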
  • Step 308 Determine the loss value of the initial neural network model according to the aforementioned predicted expansion parameter and label expansion parameter.
  • The predicted expansion parameter and label expansion parameter can be substituted into the loss function of the initial neural network model to obtain the loss value of the initial neural network model.
  • Step 310 Update the parameters in the initial neural network model according to the aforementioned loss value to obtain a trained neural network model.
  • There is no fixed order of execution between the above step 304 and step 306; for the steps not described in detail in steps 302 to 310, refer to the corresponding content of the foregoing embodiments or related prior art, which will not be repeated here.
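Steps 302 to 310 can be sketched with a deliberately minimal stand-in model. The real architecture, loss function, and optimizer are not specified in the text, so the single sigmoid unit, mean-squared-error loss, and plain gradient descent below are assumptions for illustration, and the label expansion coefficients are dummy targets rather than values computed per step 306.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 302: features extracted from sample images (illustrative random data)
feats = rng.normal(size=(32, 8))
# Label expansion coefficients that step 306 would produce (dummy targets)
labels = rng.uniform(0.5, 1.0, size=32)

# Stand-in for the initial neural network model: one sigmoid unit
w = np.zeros(8)
b = 0.0

def predict(f, w, b):
    """Step 304: predicted expansion coefficient for each feature vector."""
    return 1.0 / (1.0 + np.exp(-(f @ w + b)))

lr = 0.5
for _ in range(200):
    p = predict(feats, w, b)                 # step 304
    err = p - labels
    grad = err * p * (1.0 - p)               # step 308: MSE loss gradient
    w -= lr * (feats.T @ grad) / len(feats)  # step 310: parameter update
    b -= lr * grad.mean()

loss = float(np.mean((predict(feats, w, b) - labels) ** 2))
```

After training, the loss is lower than for the untrained model, whose sigmoid output is 0.5 everywhere.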
  • this embodiment also provides a possible example of applying the aforementioned image recognition method.
  • Suppose the target object to be recognized is a human face, that is, the aforementioned target image to be recognized is a face image to be recognized; there are multiple base library images, and the target expansion parameter is the target expansion coefficient.
  • the method mainly includes the following steps S402 to S412:
  • Step S402 Extract the features of the face image to be recognized.
  • Step S404 Calculate the first feature distance between the feature of the face image to be recognized and the feature of each base library image.
  • Step S406 Input the features of the face image to be recognized into the neural network model to obtain the target expansion coefficient corresponding to the face image to be recognized.
  • Step S408 Multiply each first feature distance by the target expansion coefficient to obtain the second feature distance between the face image to be recognized and each base library image; the target expansion coefficient is greater than 0 and less than 1.
  • Step S410 Determine the numerical value relationship between each second characteristic distance and the distance threshold.
  • Step S412 When there is a target second characteristic distance smaller than the distance threshold in each second characteristic distance, determine the face in the base image corresponding to the target second characteristic distance as the face recognition result of the face image to be recognized.
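Steps S402 to S412 can be sketched end to end as follows. The neural network that produces the target expansion coefficient is replaced here by a plain parameter h, and the Euclidean distance metric, base library features, and threshold are illustrative assumptions.

```python
import numpy as np

def recognize(face_feat, base_feats, base_ids, h, threshold):
    """Sketch of steps S402-S412 (feature extraction and the neural
    network are outside this sketch; h stands in for the model output)."""
    face_feat = np.asarray(face_feat, dtype=float)
    base = np.asarray(base_feats, dtype=float)
    # S404: first feature distances to each base library image
    d1 = np.linalg.norm(base - face_feat, axis=1)
    # S408: second feature distances via the scaling transformation
    d2 = h * d1
    # S410/S412: compare each second feature distance with the threshold
    best = int(np.argmin(d2))
    if d2[best] < threshold:
        return base_ids[best]
    return None  # no base library face is close enough: reject

ids = ["alice", "bob"]
base = [[1.0, 0.0], [0.0, 1.0]]
print(recognize([0.9, 0.1], base, ids, h=0.8, threshold=0.5))  # prints: alice
```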
  • In this way, the first feature distance undergoes a distance scaling transformation under the target expansion coefficient, which narrows the feature distance from the face image to be recognized to the base library image, and the target expansion coefficient is related to the features of the face image to be recognized. As a result, the accuracy and pass rate of face recognition for images taken under dark light, top light, or at large angles are improved without increasing the misrecognition rate, the false rejection rate is reduced, and the user experience is improved.
  • the embodiment of the application provides an image recognition device.
  • the device includes the following modules:
  • the extraction module 52 is configured to extract features of the image of the target object to be recognized
  • the calculation module 54 is configured to calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image;
  • the transformation module 56 is configured to obtain the second characteristic distance between the image of the target object to be identified and the base library image according to the first characteristic distance and the target expansion parameter; wherein the target expansion parameter is related to the characteristics of the target object image to be identified;
  • the determining module 58 is configured to determine the target object recognition result in the image of the target object to be recognized according to the second characteristic distance.
  • The above image recognition device makes full use of the distinguishability of difficult samples: the first feature distance undergoes a distance scaling transformation under the target expansion parameter, which narrows the feature distance from the target image to be recognized to the base library image, and the target expansion parameter is related to the features of the target image to be recognized. As a result, the recognition accuracy and pass rate for images shot under dark light, top light, or at large angles are improved without increasing the misrecognition rate, the false rejection rate is reduced, and the user experience is improved.
  • the foregoing calculation module 54 may be configured as:
  • The first feature distance d12 between the feature of the target image to be recognized and the feature of the base library image is calculated by the following formula: d12 = √( Σi (f1,i − f2,i)² ), where f1,i represents the i-th element of the feature of the base library image and f2,i represents the i-th element of the feature of the target image to be recognized.
  • the above-mentioned transformation module 56 may be configured as:
  • the first characteristic distance is numerically transformed using the target stretch parameter to obtain the second characteristic distance between the image of the target object to be identified and the image of the base library.
  • the aforementioned target expansion parameter includes a target expansion coefficient or a target expansion value; the aforementioned transformation module 56 may also be configured as:
  • the target expansion coefficient is greater than 0 and less than 1;
  • the above determining module 58 may be configured as:
  • the target in the base library image is determined as the target recognition result in the to-be-recognized target image.
  • each base library image corresponds to a second characteristic distance; the determining module 58 may also be configured to:
  • the target object in the base library image corresponding to the target second characteristic distance is determined as the target object recognition result.
  • the above determining module 58 may also be configured as:
  • the minimum value of each second characteristic distance is less than or equal to the distance threshold, the minimum value of each second characteristic distance is determined as the target second characteristic distance.
  • the above-mentioned target expansion parameters are determined by a neural network model.
  • For the case where the above target expansion parameters are determined by a neural network model, refer to the structural block diagram of another image recognition device shown in FIG. 6. On the basis of FIG. 5, the above-mentioned device is further equipped with a training module 62.
  • the training module 62 can be configured as:
  • the parameters in the initial neural network model are updated according to the loss value to obtain the trained neural network model.
  • the above-mentioned training module 62 may be configured as:
  • judge whether the target feature distance is the minimum of the third feature distances; the target feature distance is the third feature distance between the feature of the sample image and the feature of the standard image corresponding to the sample image in the target image set;
  • the target characteristic distance is the minimum value among the third characteristic distances, it is judged whether the target characteristic distance is greater than the distance threshold;
  • the label stretch parameter is determined according to the target feature distance and the distance threshold.
  • the aforementioned label expansion parameter includes a label expansion coefficient; the aforementioned training module 62 is further configured to:
  • the label expansion coefficient is the first value related to the target feature distance and the distance threshold, and the first value is greater than 0 and less than 1.
  • the above-mentioned training module 62 is further configured to:
  • the first value is determined according to the ratio of the distance threshold to the target characteristic distance and the preset coefficient, and the first value is used as the label expansion coefficient; wherein the preset coefficient is greater than 0 and less than 1.
  • the above-mentioned label expansion parameter includes a label expansion coefficient;
  • the above-mentioned training module 62 may also be configured as:
  • the target feature distance is not the minimum value among the third feature distances, it is determined that the label expansion coefficient is the second value, and the second value is greater than or equal to 1.
  • the above-mentioned label expansion parameter includes a label expansion coefficient;
  • the above-mentioned training module 62 may also be configured as:
  • the label expansion coefficient is determined to be 1.
  • the embodiments of the present application also provide a computer-readable storage medium having a computer program stored on the computer-readable storage medium, and the computer program executes the image recognition method described in the foregoing method embodiment when the computer program is run by a processor.
  • The computer program product of the image recognition method and device provided in the embodiments of the present application includes a computer-readable storage medium storing program code; the instructions included in the program code can be configured to execute the method described in the foregoing method embodiments. For specific implementation, refer to the method embodiments, which will not be repeated here.
  • If the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a RAM, a magnetic disk, an optical disk, and other media that can store program code.
  • This application provides an image recognition method, device, electronic equipment, and computer-readable storage medium, and relates to the technical field of image processing.
  • When performing image recognition, first extract the features of the image of the target to be recognized; then calculate the first feature distance between the feature of the target image to be recognized and the feature of the base library image, and obtain the second feature distance between the target image to be recognized and the base library image according to the first feature distance and the target expansion parameter, where the target expansion parameter is related to the feature of the target image to be recognized; then, according to the second feature distance, determine the target recognition result in the target image to be recognized.
  • In this way, the feature distance from the target image to be recognized to the base library image is narrowed through the distance scaling transformation of the first feature distance under the target expansion parameter, thereby improving the recognition pass rate for images shot under dark light, top light, or at large angles without increasing the misrecognition rate.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the technical field of image processing. Disclosed are an image recognition method and apparatus, an electronic device, and a computer-readable medium. The method comprises: during image recognition, first extracting a feature of a target object image to be recognized; then calculating a first feature distance between the feature of said target object image and a feature of a base library image, and obtaining a second feature distance between said target object image and the base library image according to the first feature distance and a target scaling parameter, the target scaling parameter being related to the feature of said target object image; and determining a recognition result of a target object in said target object image according to the second feature distance. In this way, the feature distance from said target object image to the base library image is shortened by means of a distance scaling transformation of the first feature distance under the target scaling parameter, so as to improve the recognition pass rate for an image photographed under dim light, under top light, or at a large angle, etc., without increasing the misrecognition rate.
PCT/CN2020/119613 2020-04-14 2020-09-30 Procédé et appareil d'identification d'image, et dispositif électronique et support d'enregistrement lisible par ordinateur WO2021208373A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010293294.7 2020-04-14
CN202010293294.7A CN111639667B (zh) 2020-04-14 2020-04-14 图像识别方法、装置、电子设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2021208373A1 true WO2021208373A1 (fr) 2021-10-21

Family

ID=72331390

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/119613 WO2021208373A1 (fr) 2020-04-14 2020-09-30 Procédé et appareil d'identification d'image, et dispositif électronique et support d'enregistrement lisible par ordinateur

Country Status (2)

Country Link
CN (1) CN111639667B (fr)
WO (1) WO2021208373A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114429663A (zh) * 2022-01-28 2022-05-03 北京百度网讯科技有限公司 人脸底库的更新方法、人脸识别方法、装置及系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639667B (zh) * 2020-04-14 2023-06-16 北京迈格威科技有限公司 图像识别方法、装置、电子设备及计算机可读存储介质
CN112579803B (zh) * 2020-11-16 2024-04-02 北京迈格威科技有限公司 一种图像数据清洗方法、装置、电子设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7792386B1 (en) * 2004-02-27 2010-09-07 Adobe Systems Incorporated Using difference kernels for image filtering
CN104424483A (zh) * 2013-08-21 2015-03-18 中移电子商务有限公司 一种人脸图像的光照预处理方法、装置及终端
CN110188641A (zh) * 2019-05-20 2019-08-30 北京迈格威科技有限公司 图像识别和神经网络模型的训练方法、装置和系统
CN111639667A (zh) * 2020-04-14 2020-09-08 北京迈格威科技有限公司 图像识别方法、装置、电子设备及计算机可读存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3475422B2 (ja) * 1997-08-04 2003-12-08 オムロン株式会社 画像認識装置、画像認識方法、および記録媒体
JP2005173995A (ja) * 2003-12-11 2005-06-30 Nippon Telegr & Teleph Corp <Ntt> 奥行き算出装置、奥行き算出方法、および、プログラム
WO2012114464A1 (fr) * 2011-02-23 2012-08-30 富士通株式会社 Dispositif d'imagerie, programme et procédé de support d'imagerie
CN107766864B (zh) * 2016-08-23 2022-02-01 斑马智行网络(香港)有限公司 提取特征的方法和装置、物体识别的方法和装置
CN107895166B (zh) * 2017-04-24 2021-05-25 长春工业大学 基于特征描述子的几何哈希法实现目标鲁棒识别的方法
CN108596110A (zh) * 2018-04-26 2018-09-28 北京京东金融科技控股有限公司 图像识别方法及装置、电子设备、存储介质
CN109102020A (zh) * 2018-08-10 2018-12-28 新华三技术有限公司 一种图像对比方法及装置
CN110874921A (zh) * 2018-08-31 2020-03-10 百度在线网络技术(北京)有限公司 智能路侧单元及其信息处理方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7792386B1 (en) * 2004-02-27 2010-09-07 Adobe Systems Incorporated Using difference kernels for image filtering
CN104424483A (zh) * 2013-08-21 2015-03-18 中移电子商务有限公司 一种人脸图像的光照预处理方法、装置及终端
CN110188641A (zh) * 2019-05-20 2019-08-30 北京迈格威科技有限公司 图像识别和神经网络模型的训练方法、装置和系统
CN111639667A (zh) * 2020-04-14 2020-09-08 北京迈格威科技有限公司 图像识别方法、装置、电子设备及计算机可读存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114429663A (zh) * 2022-01-28 2022-05-03 北京百度网讯科技有限公司 人脸底库的更新方法、人脸识别方法、装置及系统
CN114429663B (zh) * 2022-01-28 2023-10-20 北京百度网讯科技有限公司 人脸底库的更新方法、人脸识别方法、装置及系统

Also Published As

Publication number Publication date
CN111639667A (zh) 2020-09-08
CN111639667B (zh) 2023-06-16

Similar Documents

Publication Publication Date Title
WO2021208373A1 (fr) Procédé et appareil d&#39;identification d&#39;image, et dispositif électronique et support d&#39;enregistrement lisible par ordinateur
CN109255352B (zh) 目标检测方法、装置及系统
US10354168B2 (en) Systems and methods for recognizing characters in digitized documents
CN109829506B (zh) 图像处理方法、装置、电子设备和计算机存储介质
CN109145766B (zh) 模型训练方法、装置、识别方法、电子设备及存储介质
WO2019041519A1 (fr) Procédé et dispositif de suivi de cible, et support de stockage lisible par ordinateur
CN111797893A (zh) 一种神经网络的训练方法、图像分类系统及相关设备
CN111666960A (zh) 图像识别方法、装置、电子设备及可读存储介质
WO2018082308A1 (fr) Procédé de traitement d&#39;image et terminal
Liu et al. Real-time facial expression recognition based on cnn
CN111242222A (zh) 分类模型的训练方法、图像处理方法及装置
CN111160288A (zh) 手势关键点检测方法、装置、计算机设备和存储介质
CN113255557B (zh) 一种基于深度学习的视频人群情绪分析方法及系统
WO2021238586A1 (fr) Procédé et appareil d&#39;entraînement, dispositif, et support de stockage lisible par ordinateur
CN114266897A (zh) 痘痘类别的预测方法、装置、电子设备及存储介质
CN112597918A (zh) 文本检测方法及装置、电子设备、存储介质
CN112232506A (zh) 网络模型训练方法、图像目标识别方法、装置和电子设备
CN112819011A (zh) 对象间关系的识别方法、装置和电子系统
CN110490058B (zh) 行人检测模型的训练方法、装置、系统和计算机可读介质
CN111985556A (zh) 关键点识别模型的生成方法和关键点识别方法
KR20230043318A (ko) 영상 내 객체를 분류하는 객체 분류 방법 및 장치
CN112560791B (zh) 识别模型的训练方法、识别方法、装置及电子设备
CN110889290B (zh) 文本编码方法和设备、文本编码有效性检验方法和设备
CN109871814B (zh) 年龄的估计方法、装置、电子设备和计算机存储介质
CN111753583A (zh) 一种识别方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20931629

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20931629

Country of ref document: EP

Kind code of ref document: A1