CN112288012A - Image recognition method, device and storage medium - Google Patents


Info

Publication number
CN112288012A
CN112288012A (application number CN202011188442.5A)
Authority
CN
China
Prior art keywords
image
neural network
network model
dimensional feature
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011188442.5A
Other languages
Chinese (zh)
Inventor
陈畅怀 (Chen Changhuai)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202011188442.5A
Publication of CN112288012A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiment of the application discloses an image recognition method, an image recognition device, and a storage medium, belonging to the technical field of deep learning. In the embodiment of the application, the high-dimensional feature vector of the image to be recognized can be obtained directly through the first neural network model. Because the high-dimensional feature vectors of the plurality of reference image samples are also extracted through the first neural network model, the image category of the image to be recognized can be determined from the high-dimensional feature vector of the image to be recognized together with the high-dimensional feature vectors and image categories of the reference image samples. On this basis, even if the image category of the image to be recognized did not participate in the training of the first neural network model, the image category can still be recognized. That is, in the embodiment of the present application, by extracting high-dimensional feature vectors through the first neural network model and setting a plurality of reference image samples, the recognition of image categories that did not participate in the training of the first neural network model can be supported.

Description

Image recognition method, device and storage medium
Technical Field
The present disclosure relates to the field of deep learning technologies, and in particular, to an image recognition method, an image recognition device, and a storage medium.
Background
With the rapid development of deep learning technology, image recognition technology based on neural network models has been widely applied. In the related art, for a neural network model applied to image recognition in a certain scene, the neural network model may be trained by collecting training image samples in that scene. Then, the image to be recognized is input into the trained neural network model, which processes the image to be recognized and outputs the corresponding image category. However, the trained neural network model can only recognize images of the same classes as its training image samples; images of other classes cannot be recognized.
Disclosure of Invention
The embodiment of the application provides an image recognition method, an image recognition device, and a storage medium, so that a trained neural network model can support the recognition of images whose categories differ from the image categories of its training samples. The technical scheme is as follows:
in one aspect, an image recognition method is provided, and the method includes:
acquiring an image to be identified;
extracting a high-dimensional feature vector of the image to be recognized through a first neural network model, wherein the first neural network model is a trained neural network model;
and determining the image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of a plurality of reference image samples, wherein the high-dimensional feature vectors of the plurality of reference image samples are extracted through the first neural network model.
In a possible implementation manner of the embodiment of the present application, the method further includes:
extracting, by the first neural network model, a high-dimensional feature vector for each of the plurality of reference image samples;
and correspondingly storing the high-dimensional feature vector of each reference image sample and the image class of the corresponding reference image sample.
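The correspondence-storage step above can be sketched as follows; this is a minimal illustration assuming a plain in-memory dictionary as the store, and all identifiers, vectors, and class labels are hypothetical rather than taken from the patent:

```python
# Hypothetical feature store mapping a reference image sample's identifier to
# its high-dimensional feature vector and calibrated image class.
feature_store = {}

def store_reference(sample_id, feature_vector, image_class):
    # Store the reference image sample's high-dimensional feature vector
    # together with its corresponding image class.
    feature_store[sample_id] = {"vector": feature_vector, "class": image_class}

# Illustrative entries; real vectors would come from the first neural network model.
store_reference("ref_001", [0.12, 0.85, 0.33], "A")
store_reference("ref_002", [0.91, 0.05, 0.40], "B")
```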
In a possible implementation manner of the embodiment of the present application, the determining, according to the high-dimensional feature vector of the image to be recognized and the high-dimensional feature vectors and the image categories of a plurality of reference image samples, the image category of the image to be recognized includes:
determining the similarity between the high-dimensional feature vector of the image to be identified and the high-dimensional feature vector of each reference image sample;
determining a target reference image sample with the highest similarity with the high-dimensional feature vector of the image to be identified;
and taking the image category of the target reference image sample as the image category of the image to be identified.
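The three determination steps above amount to a nearest-neighbor lookup over the reference image samples. The sketch below assumes cosine similarity as the similarity measure (the patent does not fix a specific one) and plain Python lists as feature vectors; all names are illustrative:

```python
import math

def cosine_similarity(a, b):
    # One common similarity measure between high-dimensional feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def classify(query_vector, references):
    # references: (feature_vector, image_class) pairs whose vectors were
    # extracted beforehand by the first neural network model.
    best_class, best_similarity = None, float("-inf")
    for vector, image_class in references:
        similarity = cosine_similarity(query_vector, vector)
        if similarity > best_similarity:
            best_similarity, best_class = similarity, image_class
    # The image category of the most similar target reference image sample is
    # taken as the image category of the image to be recognized.
    return best_class

references = [([1.0, 0.0], "A"), ([0.0, 1.0], "B")]
classify([0.9, 0.1], references)  # nearest to class "A"
```

Adding a new image class under this scheme only requires storing reference vectors for it; the model itself is unchanged, which is the basis of the benefit described in the abstract.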
In a possible implementation manner of the embodiment of the present application, the plurality of reference image samples include image samples whose image categories differ from those of a plurality of training image samples, where the plurality of training image samples refer to the image samples used to train the first neural network model.
In a possible implementation manner of the embodiment of the present application, the method further includes:
obtaining a plurality of training image samples;
training a second neural network model through the plurality of training image samples, the second neural network model comprising a classification layer;
and deleting the classification layer of the trained second neural network model to obtain the first neural network model.
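The step of deleting the classification layer can be sketched as follows. This is a minimal illustration that represents the trained second neural network model as an ordered list of named layers; the patent does not prescribe a concrete framework or representation:

```python
# Illustrative layer list for the trained second neural network model.
second_model = ["input", "conv1", "pool1", "fc1", "classification"]

def to_first_model(layers):
    # Delete the trailing classification layer; the remaining last feature
    # extraction layer then outputs the high-dimensional feature vector.
    if layers[-1] != "classification":
        raise ValueError("expected a classification layer at the end")
    return layers[:-1]

first_model = to_first_model(second_model)  # ["input", "conv1", "pool1", "fc1"]
```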
In a possible implementation manner of the embodiment of the present application, before deleting the classification layer of the trained second neural network model, the method further includes:
obtaining a plurality of test image samples;
identifying the plurality of test image samples through the trained second neural network model;
and if the identification accuracy rate of the plurality of test image samples is greater than a reference threshold value, executing the step of deleting the classification layer of the trained second neural network model.
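The accuracy gate described above can be sketched as a simple comparison; the threshold value here is an assumed example, since the patent leaves the reference threshold configurable:

```python
REFERENCE_THRESHOLD = 0.9  # assumed value; e.g. 0.8 or 0.9 per the description

def should_delete_classification_layer(recognition_accuracy):
    # The classification layer is deleted only when the recognition accuracy
    # on the plurality of test image samples exceeds the reference threshold.
    return recognition_accuracy > REFERENCE_THRESHOLD

should_delete_classification_layer(0.95)  # True
should_delete_classification_layer(0.85)  # False
```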
In another aspect, there is provided an image recognition apparatus, the apparatus including:
the acquisition module is used for acquiring an image to be identified;
the extraction module is used for extracting the high-dimensional feature vector of the image to be recognized through a first neural network model, and the first neural network model is a trained neural network model;
and the identification module is used for determining the image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of a plurality of reference image samples, and the high-dimensional feature vectors of the plurality of reference image samples are extracted through the first neural network model.
In a possible implementation manner of the embodiment of the present application, the extraction module is further configured to:
extracting, by the first neural network model, a high-dimensional feature vector for each of the plurality of reference image samples;
and correspondingly storing the high-dimensional feature vector of each reference image sample and the image class of the corresponding reference image sample.
In a possible implementation manner of the embodiment of the present application, the identification module is specifically configured to:
determining the similarity between the high-dimensional feature vector of the image to be identified and the high-dimensional feature vector of each reference image sample;
determining a target reference image sample with the highest similarity with the high-dimensional feature vector of the image to be identified;
and taking the image category of the target reference image sample as the image category of the image to be identified.
In a possible implementation manner of the embodiment of the present application, the plurality of reference image samples include image samples whose image categories differ from those of a plurality of training image samples, where the plurality of training image samples refer to the image samples used to train the first neural network model.
In a possible implementation manner of the embodiment of the present application, the apparatus further includes a training module;
the acquisition module is further configured to acquire a plurality of training image samples;
the training module is configured to train a second neural network model through the plurality of training image samples, the second neural network model including a classification layer, and to delete the classification layer of the trained second neural network model to obtain the first neural network model.
In a possible implementation manner of the embodiment of the present application, the apparatus further includes a testing module, where the testing module is configured to:
obtaining a plurality of test image samples;
identifying the plurality of test image samples through the trained second neural network model;
and if the identification accuracy of the plurality of test image samples is greater than a reference threshold value, triggering the training module to execute a step of deleting the classification layer of the trained second neural network model.
In another aspect, a computer device is provided, which includes a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus, the memory is used for storing computer programs, and the processor is used for executing the programs stored in the memory to implement the steps of the image recognition method.
In another aspect, a computer-readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the image recognition method described above.
In another aspect, a computer program product is provided comprising instructions which, when run on a computer, cause the computer to perform the steps of the image recognition method described above.
The technical scheme provided by the application can at least bring the following beneficial effects:
in the embodiment of the application, the high-dimensional feature vector of the image to be recognized can be obtained directly through the first neural network model. Because the high-dimensional feature vectors of the plurality of reference image samples are also extracted through the first neural network model, the image category of the image to be recognized can be determined from the high-dimensional feature vector of the image to be recognized together with the high-dimensional feature vectors and image categories of the reference image samples. On this basis, even if the image category of the image to be recognized did not participate in the training of the first neural network model, the image category can still be recognized by comparing the high-dimensional feature vector extracted by the first neural network model with the high-dimensional feature vectors of the reference image samples extracted by the same model. Therefore, by extracting high-dimensional feature vectors through the first neural network model and setting a plurality of reference image samples, the recognition of images whose image classes did not participate in the training of the first neural network model can be supported.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a diagram of an implementation environment related to an image recognition method provided in an embodiment of the present application;
FIG. 2 is a flow chart of a method for obtaining a first neural network model provided by an embodiment of the present application;
FIG. 3 is a flowchart of an image recognition method according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of another image recognition apparatus provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.
Before explaining the image recognition method provided in the embodiment of the present application in detail, an application scenario of the embodiment of the present application is introduced.
With the rapid development of deep learning technology, image recognition technology based on neural network models has been widely applied. For example, in the field of intelligent transportation, collected license plate images of vehicles are classified and recognized; in e-commerce scenarios such as vending machines, the commodities purchased by customers are classified and recognized through collected commodity images. Because the neural network models applied to different scenes need to recognize different targets, currently, for each such scene, a professional algorithm engineer is required to collect data according to the targets to be recognized in the corresponding scene, train the model on the collected data, and then develop components and deploy the trained model, and a large amount of human resources are consumed in the whole process. Moreover, after most inference platforms deploy a trained model, only the classes identical to the training classes can be recognized through the trained model. That is, assuming that the image classes of the training image samples used in training the model are A, B, and C, the trained model will support image recognition only for the A, B, and C image classes. In this case, if the user wants to add a new image class, for example, if the user wants the neural network model to be able to recognize images of class D, the user needs to collect training image samples of the corresponding image class and retrain the neural network model, which makes the model costly to apply.
Based on this, the embodiment of the application provides an image recognition method. When the image recognition method provided by the embodiment of the application is used for image recognition, a new image category can be recognized without retraining the neural network model, which reduces the application cost of the model. In addition, the processes of model training, testing, deployment, inference, and recognition application are automated, which improves development efficiency.
Next, an implementation environment related to the image recognition method provided by the embodiment of the present application will be described.
Fig. 1 is an implementation environment diagram of an image recognition method according to an embodiment of the present application. As shown in fig. 1, the implementation environment includes a training node 101 and a recognition node 102. Wherein training node 101 and recognition node 102 establish a communication connection.
It should be noted that the training node 101 is configured to obtain training image samples and train a predefined neural network model through the training image samples, so as to obtain a trained neural network model. Then, test image samples are obtained, and the trained neural network model is tested through the test image samples to judge whether its performance meets the requirements. If the requirements are met, the training node directly uses the trained neural network model as the first neural network model, or the training node 101 deletes part of the layers in the trained neural network model, thereby obtaining a first neural network model capable of outputting the high-dimensional feature vector of an image. The first neural network model is then sent to the recognition node 102 for deployment. The deleted layers may be the classification layer in the trained neural network model, or the classification layer together with the N layers preceding it; for example, the classification layer and the layer immediately before it (e.g., a fully connected layer) may be deleted. The embodiment of the present application does not limit this.
When the first neural network model is deployed, in a possible implementation manner, the training node 101 may adjust an inference frame of the first neural network model according to a type of an inference chip used on the recognition node 102, and send the first neural network model after the adjusted inference frame to the recognition node 102 for deployment and application.
Optionally, in another possible implementation manner, the implementation environment further includes a model adjusting node 103, in which case the training node 101 packages the first neural network model and sends the first neural network model to the model adjusting node 103, and the model adjusting node 103 adjusts an inference framework of the first neural network model according to a type of an inference chip adopted by a recognition node to be deployed by the first neural network model. And then, sending the first neural network model after the inference framework is adjusted to the recognition node 102 for deployment application.
After deploying the first neural network model, the recognition node 102 may input the image to be recognized into the first neural network model. If the first neural network model is a neural network model from which no layer has been deleted, the recognition node 102 obtains the high-dimensional feature vector of the image to be recognized output by the layers preceding the classification layer of the first neural network model. If the first neural network model is a neural network model from which partial layers including the classification layer have been deleted, the first neural network model can directly output the high-dimensional feature vector of the image; in this case, the recognition node 102 directly obtains the high-dimensional feature vector of the image to be recognized output by the first neural network model. Then, the recognition node 102 determines the image category of the image to be recognized according to the high-dimensional feature vector of the image to be recognized and the high-dimensional feature vectors and image categories of the plurality of reference image samples. The high-dimensional feature vectors of the plurality of reference image samples are also extracted through the first neural network model.
It should be noted that the training node 101, the recognition node 102, and the model adjustment node 103 may be multiple functional modules deployed on one device, may be independent devices, or may be deployed on multiple devices in one server cluster, which is not limited in the embodiment of the present application.
As can be seen from the foregoing description, before the image to be recognized is recognized by the first neural network model, the predefined neural network model may be first trained and tested to obtain the first neural network model, and then the process of obtaining the first neural network model is first described. Referring to fig. 2, the process includes the steps of:
step 201: a plurality of training image samples are acquired.
In an embodiment of the present application, a training node obtains a training set, which includes a plurality of training image samples. The training set may be collected in advance and stored on a training node, or may be obtained by the training node from an image database on another node, which is not limited in this embodiment of the application. Moreover, each training image sample in the training set is subjected to class calibration, that is, each training image sample corresponds to a calibrated image class.
It should be noted that the training set includes training image samples of several specific image classes. For example, assuming that images of the three image classes A, B, and C are included in the training set, the second neural network model trained from these training samples via step 202 will be able to recognize images of those three image classes. Which image classes of training samples are included in the training set can be set by the user. Moreover, in order to guarantee the training precision, the number of training image samples of each image class is not less than a first threshold.
Step 202: a second neural network model is trained with a plurality of training image samples, the second neural network model including a classification layer.
After the training set is acquired, the training node trains the second neural network model through the training image samples included in the training set. The second neural network model refers to a predefined initial network model which is not trained, and comprises a classification layer, wherein the classification layer is also connected with a loss function for guiding a training process, so that the training of the second neural network model is realized.
It should be noted that the second neural network model includes an input layer, a plurality of feature extraction layers, and a classification layer. The input layer is used to pre-process the image input into the second neural network model, for example, to perform size normalization on the input image. The image output by the input layer is input to the subsequent feature extraction layers, which extract the image features of the image and output the extracted image features to the classification layer for classification. The plurality of feature extraction layers may include convolution layers, pooling layers, fully connected layers, and the like; the embodiment of the present application does not limit the types and number of feature extraction layers. The classification layer comprises a plurality of classification nodes; the number of classification nodes is the same as the total number of image classes of the training image samples included in the training set, and each classification node corresponds to one image class. For example, assuming that the training set includes training image samples of the three image classes A, B, and C, the classification layer of the second neural network model includes three classification nodes, each corresponding to a different one of the three image classes.
Based on this, the training node first obtains a training image sample from the training set. After the training image sample is processed by the input layer and the feature extraction layers of the second neural network model, the image features of the training image sample are output to the classification layer. Each classification node in the classification layer performs operations according to the received image features of the training image sample, thereby outputting the confidence of the image class corresponding to that classification node; the confidence of the image class corresponding to each classification node is the probability that the training image sample belongs to that image class. Then, the training node calculates the loss function according to the confidence distribution of the image classes output by the classification layer and the calibrated image class of the training image sample, and further adjusts the parameter values of the nodes in each layer of the second neural network model through the loss function. In this way, the training node processes the plurality of training image samples according to the above manner so as to adjust the parameter values of the second neural network model; when the calculated loss function value meets a preset condition, the training ends and the trained second neural network model is obtained.
The loss function may be a softmax loss function, a triplet loss function, or the like, which is not limited in the embodiment of the present application.
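The per-node confidences described above can be illustrated with the softmax function, under which each classification node's output is the probability that the training image sample belongs to that node's image class and the confidences sum to 1. The logits below are hypothetical values for three classification nodes:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three classification nodes for image classes A, B, C (illustrative logits).
confidences = softmax([2.0, 1.0, 0.1])
```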
After the trained second neural network model is obtained, the training node may obtain the first neural network model through step 203.
Optionally, in a possible case, after obtaining the trained second neural network model, the training node may further obtain a plurality of test image samples; identifying a plurality of test image samples through the trained second neural network model; if the recognition accuracy of the plurality of test image samples is greater than the reference threshold, a first neural network model is obtained through step 203.
The training node acquires a test set, which includes a plurality of test image samples. The test set may be collected in advance and stored on the training node, or may be obtained by the training node from an image database on another node, which is not limited in the embodiment of the present application. In addition, each test image sample in the test set is calibrated with a corresponding image class. Based on this, the training node acquires all or part of the test image samples from the test set, and the acquired test image samples are recognized through the trained second neural network model to obtain the output image class corresponding to each test image sample. The output image class of each test image sample is compared with its calibrated image class: if the two are consistent, the corresponding test image sample is determined to be correctly recognized; if they are inconsistent, the corresponding test image sample is determined to be incorrectly recognized. Then, the training node counts the number of correctly recognized test image samples, and the ratio of this number to the total number of test image samples recognized by the trained second neural network model is determined as the recognition accuracy of the trained second neural network model on the plurality of test samples.
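The accuracy computation described above reduces to the ratio of correct recognitions to total recognitions; a minimal sketch with hypothetical class labels:

```python
def recognition_accuracy(output_classes, calibrated_classes):
    # Ratio of correctly recognized test image samples to the total number of
    # test image samples recognized by the trained second neural network model.
    correct = sum(1 for out, cal in zip(output_classes, calibrated_classes)
                  if out == cal)
    return correct / len(output_classes)

recognition_accuracy(["A", "B", "A", "C"], ["A", "B", "C", "C"])  # 0.75
```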
After the recognition accuracy of the trained second neural network model to the plurality of test samples is obtained, the training node compares the recognition accuracy with a reference threshold, if the recognition accuracy is greater than the reference threshold, the recognition accuracy of the trained second neural network model meets the requirement, at the moment, the trained second neural network model is processed through the step 203, and therefore the first neural network model is obtained. The reference threshold is a preset accuracy that the second neural network model can achieve, and may be, for example, 0.8, 0.9, and the like, which is not limited in this embodiment of the present application.
Optionally, if the recognition accuracy is not greater than the reference threshold, it indicates that the accuracy of the trained second neural network model does not meet the requirements. In this case, the training node may display the reason for the training failure or send it to the client; the reason for the training failure includes an insufficient number of training image samples or abnormal sample quality, so as to remind the user to provide more training image samples or to improve the quality of the training image samples.
Step 203: and determining the first neural network model according to the trained second neural network model.
After the trained second neural network model is obtained, or after the recognition accuracy test of the trained second neural network model passes, the training node may obtain the first neural network model according to the trained second neural network model.
In one case, the training node directly takes the trained second neural network model as the first neural network model. Alternatively, the training node deletes part of the layers and the loss function of the trained second neural network model, thereby obtaining a first neural network model capable of outputting the high-dimensional feature vector of an image.
When deleting part of layers of the trained second neural network model, in a possible implementation manner, the training nodes can directly delete the classification layers of the trained second neural network model, so that the first neural network model is obtained. In this case, the first neural network model includes an input layer and a plurality of feature extraction layers, so that when an image is processed by the first neural network model, a high-dimensional feature vector of the image can be output through a last feature extraction layer of the plurality of feature extraction layers.
In another possible implementation, the training node may delete the classification layer of the trained second neural network model together with the N feature extraction layers immediately preceding the classification layer, where N is less than the total number of feature extraction layers included in the second neural network model. For example, assuming that 5 feature extraction layers precede the classification layer in sequence, the training node may delete the classification layer and the last two of the 5 feature extraction layers. In this way, when an image is processed by the first neural network model, the high-dimensional feature vector of the image is output by the last of the remaining feature extraction layers.
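The truncation described above can be sketched by modeling the network as a pipeline of layers. This is a minimal sketch under assumed shapes (8-dimensional vectors, five random linear-plus-tanh layers); the layer definitions and function names are illustrative, not the application's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five stand-in feature-extraction layers, each a random 8x8 linear map + tanh.
feature_layers = [
    (lambda W: (lambda x: np.tanh(W @ x)))(rng.standard_normal((8, 8)))
    for _ in range(5)
]

def extract_features(x, layers):
    # Forward pass through the remaining feature-extraction layers only; the
    # output of the last remaining layer is the high-dimensional feature vector.
    for layer in layers:
        x = layer(x)
    return x

# Delete the classification layer and, with N = 2, the last two feature layers:
first_model_layers = feature_layers[:-2]
vec = extract_features(rng.standard_normal(8), first_model_layers)
```

Because no classification layer remains, the pipeline's output is the feature vector itself rather than a class score.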
After the first neural network model is obtained, the training node can adjust the inference framework of the first neural network model according to the type of the inference chip of the recognition node, then package the first neural network model and send it to the recognition node for deployment.
Optionally, in some possible cases, when the implementation environment further includes a model adjusting node, the training node may directly package the first neural network model obtained by deleting the classification layer and the loss function and send it to the model adjusting node. The model adjusting node then adjusts the inference framework of the first neural network model according to the type of the inference chip of the recognition node and deploys the adjusted first neural network model on the recognition node.
After the first neural network model is deployed to the recognition nodes, the recognition nodes can recognize the image to be recognized through the steps shown in fig. 3.
Step 301: and acquiring an image to be identified.
The image to be recognized may be an image acquired by the recognition node in real time, or may be an image that is acquired by another device and received by the recognition node, which is not limited in this embodiment of the present application.
Step 302: and extracting the high-dimensional feature vector of the image to be recognized through a first neural network model, wherein the first neural network model is a trained neural network model.
After the image to be recognized is obtained, the recognition node takes the image to be recognized as an input image of a first neural network model, and extracts a high-dimensional feature vector of the image to be recognized through the first neural network model.
As can be seen from the foregoing description, in one possible implementation the first neural network model is the trained second neural network model itself. In this case, the recognition node obtains a high-dimensional feature vector that is extracted from the image to be recognized by a feature extraction layer of the first neural network model and has not yet been input to the classification layer. This may be the high-dimensional feature vector output by the last of the plurality of feature extraction layers (the vector that would be input to the classification layer), or the high-dimensional feature vector extracted by an earlier feature extraction layer and destined for the next feature extraction layer, which is not limited in this embodiment of the present application.
Alternatively, the first neural network model may be obtained by deleting part of the layers of the trained second neural network model. In this case, the first neural network model includes an input layer and some or all of the feature extraction layers, but no classification layer. Accordingly, after the image to be recognized is input to the first neural network model, the model extracts the high-dimensional feature vector of the image to be recognized and outputs it directly.
Step 303: and determining the image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of the multiple reference image samples, wherein the high-dimensional feature vectors of the multiple reference image samples are extracted through a first neural network model.
After obtaining the high-dimensional feature vector of the image to be identified, the identification node compares the high-dimensional feature vector of the image to be identified with the high-dimensional feature vectors of a plurality of reference image samples, so as to determine the reference image sample which is most similar to the high-dimensional feature vector of the image to be identified, and further determine the image category of the image to be identified according to the determined image category of the reference image sample.
For example, the recognition node determines the similarity between the high-dimensional feature vector of the image to be recognized and the high-dimensional feature vector of each reference image sample; determines the target reference image sample whose high-dimensional feature vector has the highest similarity to that of the image to be recognized; and takes the image category of the target reference image sample as the image category of the image to be recognized.
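The matching step above can be sketched as nearest-neighbour classification over the reference vectors. Cosine similarity is an assumed choice here — the application does not fix a particular similarity measure — and all names and sample vectors are illustrative.

```python
import numpy as np

def classify(query_vec, ref_vecs, ref_classes):
    """Return the class of the reference sample most similar to the query,
    using cosine similarity (an assumed measure), plus that similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    R = ref_vecs / np.linalg.norm(ref_vecs, axis=1, keepdims=True)
    sims = R @ q                 # similarity to every reference sample
    best = int(np.argmax(sims))  # index of the target reference image sample
    return ref_classes[best], float(sims[best])

refs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
label, sim = classify(np.array([0.9, 0.1]), refs, ["cat", "dog", "fox"])
```

The query vector lies closest in direction to the first reference vector, so the query is assigned that sample's class.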
It should be noted that the plurality of reference image samples may be stored in the recognition node or on other devices. The recognition node may obtain the plurality of reference image samples and extract a high-dimensional feature vector for each reference image sample, each such vector being extracted through the first neural network model.
Illustratively, the recognition node obtains a reference image set that includes a plurality of reference image samples, each labeled with an image category. In this embodiment of the present application, among the reference image samples there are image samples whose image categories differ from those of the plurality of training image samples, the training image samples being those used for training to obtain the first neural network model. That is, in this embodiment of the present application, the image categories of the reference image samples in the reference image set may or may not intersect the image categories of the training image samples.
For example, when the image categories of the reference image samples in the reference image set intersect the image categories of the training image samples in the training set, the number of distinct image categories in the reference image set may be greater than, less than, or equal to the number in the training set. For example, assuming the training set includes training image samples of three image categories A, B, and C, the reference image set may include reference image samples of one or more of these three categories, and may further include reference image samples of additional categories, for example categories D, E, and F.
When the image categories of the reference image samples do not intersect the image categories of the training image samples, no reference image sample in the reference image set shares an image category with any training image sample in the training set. For example, the training set may include training image samples of three image categories A, B, and C, while the reference image set includes reference image samples of categories D and E.
On this basis, after the first neural network model is deployed to the recognition node, the recognition node first performs feature extraction on each reference image sample through the first neural network model to obtain the high-dimensional feature vector of each reference image sample. The recognition node then stores the high-dimensional feature vector of each reference image sample in correspondence with the image category of that sample. Optionally, a sample identifier of each reference image sample may also be stored in the correspondence; the sample identifier uniquely identifies the reference image sample and may be, for example, its sample number. In this way, after obtaining the high-dimensional feature vector of an image to be recognized, the recognition node calculates the similarity between that vector and each high-dimensional feature vector stored in the correspondence, determines the maximum similarity among the calculated similarities, and takes the reference image sample corresponding to the maximum similarity as the target reference image sample. The target reference image sample is in fact the image most similar to the image to be recognized, and the recognition node takes its image category as the image category of the image to be recognized.
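Building the stored correspondence can be sketched as follows. The store layout (sample identifier mapped to feature vector and image class) and the stand-in extractor are illustrative assumptions, not the application's actual data structures.

```python
def build_correspondence(reference_samples, extract):
    """reference_samples: iterable of (sample_id, image, image_class).
    Extracts each sample's feature vector once and stores the correspondence."""
    correspondence = {}
    for sample_id, image, image_class in reference_samples:
        vec = extract(image)  # would be the first neural network model's output
        correspondence[sample_id] = (vec, image_class)
    return correspondence

# Stand-in for the deployed first neural network model:
fake_extract = lambda img: [float(p) for p in img]

store = build_correspondence(
    [("ref-001", (0.1, 0.9), "cat"), ("ref-002", (0.8, 0.2), "dog")],
    fake_extract,
)
```

At recognition time, the node only needs to scan this store for the maximum-similarity entry; the reference images themselves are no longer required.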
As can be seen from the above description, in this embodiment of the present application the reference image set can include more image categories than the training set. The first neural network model is trained on the training image samples in the training set but does not include a classification layer, so it is responsible only for extracting high-dimensional feature vectors. Therefore, even if the image category of the image to be recognized does not exist in the training set, that category can still be recognized by comparing the high-dimensional feature vector extracted by the first neural network model with the high-dimensional feature vectors likewise extracted from the reference image set. That is, by removing the classification layer of the trained neural network model and providing a reference image set, recognition of a greater variety of images is supported.
On this basis, if the user has images of a new image category to be recognized, reference image samples of that category can be added directly to the reference image set. Correspondingly, the recognition node can detect in real time whether a newly added reference image sample exists in the reference image set; if so, it detects whether the image category of the newly added reference image sample is also a new image category, and if it is, extracts the high-dimensional feature vector of the newly added reference image sample and stores the extracted vector together with the new image category in the correspondence. In this way, after receiving an image to be recognized of that category, the recognition node can recognize it. Therefore, with the recognition method provided by this embodiment of the present application, reference image samples can be added to the reference image set at any time according to user needs, supporting recognition of images of new categories without retraining the neural network model, which is flexible, simple, and low in cost.
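The run-time extension described above can be sketched as follows: a newly added reference sample has its feature vector extracted and the correspondence extended, with no retraining of the first neural network model. The variable names and the stand-in extractor are illustrative assumptions.

```python
known_classes = {"cat", "dog"}
correspondence = {"ref-001": ([0.1, 0.9], "cat")}

def on_new_reference(sample_id, image, image_class, extract):
    """Extend the correspondence with a newly added reference sample.
    Returns True when the sample introduces a previously unseen class."""
    correspondence[sample_id] = (extract(image), image_class)
    is_new_class = image_class not in known_classes
    known_classes.add(image_class)
    return is_new_class

# Stand-in for the deployed first neural network model:
fake_extract = lambda img: [float(p) for p in img]

added_new_class = on_new_reference("ref-101", (0.3, 0.7), "fox", fake_extract)
```

After this call, images of the new "fox" category can be matched against the extended correspondence, illustrating why no retraining is needed.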
In summary, in this embodiment of the present application, the high-dimensional feature vector of the image to be recognized can be obtained directly through the first neural network model. Because the high-dimensional feature vectors of the plurality of reference image samples are also extracted through the first neural network model, the image category of the image to be recognized can be determined from its high-dimensional feature vector together with the high-dimensional feature vectors and image categories of the reference image samples. In this way, even if the image category of the image to be recognized is not among the categories involved in training the first neural network model, it can still be recognized by comparing the high-dimensional feature vector extracted by the first neural network model with the high-dimensional feature vectors likewise extracted from the reference image set. That is, by extracting high-dimensional feature vectors with the first neural network model and providing a plurality of reference image samples, recognition of a greater variety of images is supported. On this basis, if images of a new image category need to be recognized, recognition of that category can be achieved merely by adding reference image samples of the corresponding category, without retraining the first neural network model, which reduces the application cost of the model and improves its flexibility and value.
In addition, in the embodiment of the application, the procedures of training, testing, deploying, applying and the like of the neural network model are automated, so that the development cost of the application of the neural network model can be reduced, and the development efficiency is improved.
It should be noted that the above steps 201 to 203 are optional in this embodiment of the present application; that is, training the second neural network model to obtain the first neural network model may also be implemented in other ways. Steps 201 to 203 are only one possible example and do not limit the embodiments of the present application.
Fig. 4 is a schematic structural diagram of an image recognition apparatus 400 according to an embodiment of the present application, please refer to fig. 4, in which the apparatus 400 includes: an obtaining module 401, an extracting module 402 and an identifying module 403, wherein:
an obtaining module 401, configured to obtain an image to be identified;
an extracting module 402, configured to extract a high-dimensional feature vector of an image to be identified through a first neural network model, where the first neural network model is a trained neural network model;
the identifying module 403 is configured to determine an image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of the multiple reference image samples, where the high-dimensional feature vectors of the multiple reference image samples are extracted through the first neural network model.
In one possible implementation, the extraction module 402 is further configured to:
extracting a high-dimensional feature vector of each reference image sample in a plurality of reference image samples through a first neural network model;
and correspondingly storing the high-dimensional feature vector of each reference image sample and the image class of the corresponding reference image sample.
In a possible implementation manner, the identifying module 403 is specifically configured to:
determining the similarity between the high-dimensional feature vector of the image to be identified and the high-dimensional feature vector of each reference image sample;
determining a target reference image sample with the highest similarity with a high-dimensional feature vector of an image to be identified;
and taking the image category of the target reference image sample as the image category of the image to be identified.
In one possible implementation, there are image samples among the plurality of reference image samples whose image categories differ from the image categories of a plurality of training image samples, where the plurality of training image samples are the image samples used for training to obtain the first neural network model.
In one possible implementation, referring to fig. 5, the apparatus 400 further includes a training module 404, wherein:
the obtaining module 401 is further configured to obtain a plurality of training image samples;
the training module 404 is configured to train a second neural network model through the plurality of training image samples, the second neural network model including a classification layer, and to delete the classification layer of the trained second neural network model to obtain the first neural network model.
In a possible implementation manner, the apparatus 400 further includes a testing module 405, and the testing module 405 is configured to:
obtaining a plurality of test image samples;
identifying a plurality of test image samples through the trained second neural network model;
and if the identification accuracy rate of the plurality of test image samples is greater than the reference threshold value, triggering the training module to execute the step of deleting the classification layer of the trained second neural network model.
In this embodiment of the present application, the high-dimensional feature vector of the image to be recognized can be obtained directly through the first neural network model. Because the high-dimensional feature vectors of the plurality of reference image samples are also extracted through the first neural network model, the image category of the image to be recognized can be determined from its high-dimensional feature vector together with the high-dimensional feature vectors and image categories of the reference image samples. In this way, even if the image category of the image to be recognized is not among the categories involved in training the first neural network model, it can still be recognized by comparing the high-dimensional feature vector extracted by the first neural network model with the high-dimensional feature vectors likewise extracted from the reference image set. That is, by extracting high-dimensional feature vectors with the first neural network model and providing a plurality of reference image samples, recognition of a greater variety of images is supported. On this basis, if images of a new image category need to be recognized, recognition of that category can be achieved merely by adding reference image samples of the corresponding category, without retraining the first neural network model, which reduces the application cost of the model and improves its flexibility and value.
It should be noted that: in the image recognition apparatus provided in the above embodiment, only the division of the functional modules is illustrated when performing image recognition, and in practical applications, the functions may be distributed by different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the image recognition apparatus and the image recognition method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application. The training node and the recognition node in the foregoing embodiments may be implemented by this computer device, or may be two functional modules within it. The computer device 600 includes a central processing unit (CPU) 601, a system memory 604 including a random access memory (RAM) 602 and a read-only memory (ROM) 603, and a system bus 605 connecting the system memory 604 and the central processing unit 601. The computer device 600 also includes a basic input/output system (I/O system) 606 for facilitating information transfer between elements within the computer, and a mass storage device 607 for storing an operating system 613, application programs 614, and other program modules 615.
The basic input/output system 606 includes a display 608 for displaying information and an input device 609, such as a mouse or keyboard, through which a user inputs information. The display 608 and the input device 609 are connected to the central processing unit 601 through an input/output controller 610 connected to the system bus 605. The input/output controller 610 may also receive and process input from a number of other devices, such as a keyboard, mouse, or electronic stylus, and may likewise provide output to a display screen, a printer, or another type of output device.
The mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605. The mass storage device 607 and its associated computer-readable media provide non-volatile storage for the computer device 600. That is, mass storage device 607 may include a computer-readable medium (not shown), such as a hard disk or CD-ROM drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 604 and mass storage device 607 described above may be collectively referred to as memory.
According to various embodiments of the present application, the computer device 600 may also operate through remote computers connected over a network, such as the Internet. That is, the computer device 600 may be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or the network interface unit 611 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
In some embodiments, a computer-readable storage medium is also provided, in which a computer program is stored, which, when being executed by a processor, implements the steps of the image recognition method in the above embodiments. For example, the computer readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It is noted that the computer-readable storage medium referred to in the embodiments of the present application may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It should be understood that all or part of the steps of the above embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product, which includes one or more computer instructions that may be stored in the computer-readable storage medium described above.
That is, in some embodiments, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the image recognition method described above.
The above-mentioned embodiments are provided by way of example and are not intended to be limiting; any modifications, equivalents, improvements, and the like made within the spirit and principles of the embodiments shall be included in their scope.

Claims (13)

1. An image recognition method, characterized in that the method comprises:
acquiring an image to be identified;
extracting a high-dimensional feature vector of the image to be recognized through a first neural network model, wherein the first neural network model is a trained neural network model;
and determining the image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of a plurality of reference image samples, wherein the high-dimensional feature vectors of the plurality of reference image samples are extracted through the first neural network model.
2. The method of claim 1, further comprising:
extracting, by the first neural network model, a high-dimensional feature vector for each of the plurality of reference image samples;
and correspondingly storing the high-dimensional feature vector of each reference image sample and the image class of the corresponding reference image sample.
3. The method according to claim 1, wherein the determining the image category of the image to be recognized according to the high-dimensional feature vector of the image to be recognized and the high-dimensional feature vectors and the image categories of a plurality of reference image samples comprises:
determining the similarity between the high-dimensional feature vector of the image to be identified and the high-dimensional feature vector of each reference image sample;
determining a target reference image sample with the highest similarity with the high-dimensional feature vector of the image to be identified;
and taking the image category of the target reference image sample as the image category of the image to be identified.
4. The method according to any one of claims 1 to 3, wherein there are image samples in the plurality of reference image samples having different image classes from image classes of a plurality of training image samples, and the plurality of training image samples are image samples used for training to obtain the first neural network model.
5. The method according to any one of claims 1-3, further comprising:
obtaining a plurality of training image samples;
training a second neural network model through the plurality of training image samples, the second neural network model comprising a classification layer;
and deleting the classification layer of the trained second neural network model to obtain the first neural network model.
6. The method of claim 5, wherein before the deleting of the classification layer of the trained second neural network model, the method further comprises:
obtaining a plurality of test image samples;
identifying the plurality of test image samples through the trained second neural network model;
and if the identification accuracy rate of the plurality of test image samples is greater than a reference threshold value, executing the step of deleting the classification layer of the trained second neural network model.
7. An image recognition apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an image to be identified;
the extraction module is used for extracting the high-dimensional feature vector of the image to be recognized through a first neural network model, and the first neural network model is a trained neural network model;
and the identification module is used for determining the image category of the image to be identified according to the high-dimensional feature vector of the image to be identified and the high-dimensional feature vectors and the image categories of a plurality of reference image samples, and the high-dimensional feature vectors of the plurality of reference image samples are extracted through the first neural network model.
8. The apparatus of claim 7, wherein the extraction module is further configured to:
extracting, by the first neural network model, a high-dimensional feature vector for each of the plurality of reference image samples;
and correspondingly storing the high-dimensional feature vector of each reference image sample and the image class of the corresponding reference image sample.
9. The apparatus of claim 8, wherein the identification module is specifically configured to:
determining the similarity between the high-dimensional feature vector of the image to be identified and the high-dimensional feature vector of each reference image sample;
determining a target reference image sample with the highest similarity with the high-dimensional feature vector of the image to be identified;
and taking the image category of the target reference image sample as the image category of the image to be identified.
10. The apparatus according to any one of claims 7 to 9, wherein there are image samples with different image classes from those of a plurality of training image samples in the plurality of reference image samples, and the plurality of training image samples are image samples used for training to obtain the first neural network model.
11. The apparatus of any of claims 7-9, further comprising a training module to:
the acquisition module is used for acquiring a plurality of training image samples;
the training module is to train a second neural network model through the plurality of training image samples, the second neural network model including a classification layer; and deleting the classification layer of the trained second neural network model to obtain the first neural network model.
12. The apparatus of claim 11, further comprising a testing module to:
obtaining a plurality of test image samples;
identifying the plurality of test image samples through the trained second neural network model;
and if the identification accuracy of the plurality of test image samples is greater than a reference threshold value, triggering the training module to execute a step of deleting the classification layer of the trained second neural network model.
13. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202011188442.5A 2020-10-30 2020-10-30 Image recognition method, device and storage medium Pending CN112288012A (en)

Publications (1)

Publication Number Publication Date
CN112288012A true CN112288012A (en) 2021-01-29


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113140012A (en) * 2021-05-14 2021-07-20 北京字节跳动网络技术有限公司 Image processing method, image processing apparatus, image processing medium, and electronic device


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753978A (en) * 2017-11-01 2019-05-14 腾讯科技(深圳)有限公司 Image classification method, device and computer readable storage medium
CN110399929A (en) * 2017-11-01 2019-11-01 腾讯科技(深圳)有限公司 Eye fundus image classification method, device and computer readable storage medium
CN109002562A (en) * 2018-08-30 2018-12-14 北京信立方科技发展股份有限公司 A kind of instrument identification model training method and device and instrument recognition methods and device
CN109522967A (en) * 2018-11-28 2019-03-26 广州逗号智能零售有限公司 A kind of commodity attribute recognition methods, device, equipment and storage medium
CN110414432A (en) * 2019-07-29 2019-11-05 腾讯科技(深圳)有限公司 Training method, object identifying method and the corresponding device of Object identifying model
CN111523621A (en) * 2020-07-03 2020-08-11 腾讯科技(深圳)有限公司 Image recognition method and device, computer equipment and storage medium


Similar Documents

Publication Publication Date Title
WO2019051941A1 (en) Method, apparatus and device for identifying vehicle type, and computer-readable storage medium
CN110276406B (en) Expression classification method, apparatus, computer device and storage medium
CN110852983A (en) Method for detecting defects in semiconductor device
US20240070554A1 (en) Optimizing training data for image classification
CN111310800B (en) Image classification model generation method, device, computer equipment and storage medium
CN111881707B (en) Image reproduction detection method, identity verification method, model training method and device
CN111325260B (en) Data processing method and device, electronic equipment and computer readable medium
CN110705489B (en) Training method and device for target recognition network, computer equipment and storage medium
CN110909868A (en) Node representation method and device based on graph neural network model
CN112860676B (en) Data cleaning method applied to big data mining and business analysis and cloud server
CN112418065A (en) Equipment operation state identification method, device, equipment and storage medium
CN112288012A (en) Image recognition method, device and storage medium
US11847187B2 (en) Device identification device, device identification method, and device identification program
CN117294727A (en) Cloud edge end collaborative management method based on cloud primordia and container technology
KR102433598B1 (en) A System and Method for Deriving Data Boundary
CN112528058B (en) Fine-grained image classification method based on image attribute active learning
US11544960B2 (en) Attribute recognition system, learning server and non-transitory computer-readable recording medium
CN112699908B (en) Method for labeling picture, electronic terminal, computer readable storage medium and equipment
CN113780335A (en) Small sample commodity image classification method, device, equipment and storage medium
CN112232380A (en) Neural network robustness detection method and device
KR102321039B1 (en) Apparatus, method and computer program for categorizing videos based on machine learning
CN112529038B (en) Method and device for identifying main board material and storage medium
CN116996527B (en) Method for synchronizing data of converging current divider and storage medium
CN112348040B (en) Model training method, device and equipment
CN111553418B (en) Method and device for detecting neuron reconstruction errors and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination