CN109063776B - Image re-recognition network training method and device and image re-recognition method and device - Google Patents

Image re-recognition network training method and device and image re-recognition method and device Download PDF

Info

Publication number
CN109063776B
Authority
CN
China
Prior art keywords
image
training
network
feature
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810893815.5A
Other languages
Chinese (zh)
Other versions
CN109063776A (en)
Inventor
张弛
张思朋
金昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN201810893815.5A
Publication of CN109063776A
Application granted
Publication of CN109063776B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image re-identification network training method and device and an image re-identification method and device, and relates to the technical field of image re-identification. The image re-identification network training method comprises the following steps: acquiring a reference feature set, the reference feature set comprising at least one feature vector corresponding to the same object distributed in a feature space; generating a training image through a generative adversarial network according to the reference feature set; adding the training image to a training image set; and training the image re-identification network by using the training image set. The methods and devices provided by the embodiments of the invention generate training images through the generative adversarial network and add them to the training image set, increasing the number of images in the set; training the image re-identification network with more images improves the accuracy of the image re-identification network.

Description

Image re-recognition network training method and device and image re-recognition method and device
Technical Field
The invention relates to the technical field of image re-identification, and in particular to an image re-identification network training method and device and an image re-identification method and device.
Background
With the development of video structuring technology, image re-identification is widely applied. For example, pedestrian re-identification, a branch of image re-identification, is widely used in fields such as security and video retrieval. One application scenario of pedestrian re-identification is as follows: in a video surveillance network, for a pedestrian specified in one camera, an image re-identification network is used to judge whether the pedestrian appears in images shot by other cameras, or whether the pedestrian appears in an image stored in an image library.
The image re-identification network needs to be trained with a large number of images. In practical applications, however, the number of collected images available for training an image re-identification network is often small, which affects the accuracy of the network.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a training method and apparatus for an image re-identification network, and an electronic device, which can increase the number of images used in training the image re-identification network, thereby improving its accuracy.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
In a first aspect, an embodiment of the present invention provides a training method for an image re-identification network, including:
acquiring a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object distributed in a feature space;
generating a training image through a generative adversarial network according to the reference feature set;
adding the training image to a set of training images;
and training an image re-identification network by using the training image set.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner of the first aspect, where the step of obtaining a reference feature set includes:
acquiring a training image set; the training image set comprises at least one reference image of the same object;
respectively extracting the feature vector of each reference image in the training image set through a convolutional neural network; the feature vector is located in the feature space;
and selecting at least one feature vector corresponding to the same object from the feature space to generate the reference feature set.
With reference to the first possible implementation manner of the first aspect, an embodiment of the present invention provides a second possible implementation manner of the first aspect, where the step of generating a training image through a generative adversarial network according to the reference feature set includes:
randomly selecting a feature vector in the reference feature set;
and inputting the selected feature vector into the generative adversarial network to obtain the training image.
With reference to the second possible implementation manner of the first aspect, an embodiment of the present invention provides a third possible implementation manner of the first aspect, where the step of randomly selecting a feature vector in the reference feature set includes:
dividing the reference feature set into a plurality of layers from the center to the edge according to the spatial position of the feature vector in the reference feature set in the feature space; each layer of reference feature set comprises at least one feature vector;
feature vectors are randomly selected from each layer of the reference feature set in order from edge to center.
With reference to the second possible implementation manner of the first aspect, an embodiment of the present invention provides a fourth possible implementation manner of the first aspect, where the generative adversarial network includes a generation network and a discriminator network; the step of inputting the selected feature vector into the generative adversarial network to obtain the training image includes:
inputting the feature vector into the generation network to obtain an image to be discriminated;
inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination;
and if the discrimination result output by the discriminator network shows that the image to be discriminated and the reference image corresponding to the feature vector contain the same object, taking the image to be discriminated as a training image.
With reference to the fourth possible implementation manner of the first aspect, an embodiment of the present invention provides a fifth possible implementation manner of the first aspect, where after the step of inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination, the method further includes:
and adjusting the parameters of the generation network according to the discrimination result output by the discriminator network.
With reference to the first aspect, an embodiment of the present invention provides a sixth possible implementation manner of the first aspect, where the image re-identification network includes a convolutional neural network and a distance metric function; the convolutional neural network comprises at least one convolutional layer, a global pooling layer and a fully-connected layer which are connected in sequence.
With reference to the sixth possible implementation manner of the first aspect, an embodiment of the present invention provides a seventh possible implementation manner of the first aspect, where the step of training an image re-identification network by using the training image set includes:
randomly selecting two training images from the training image set, and respectively extracting feature vectors corresponding to the two training images through a convolutional neural network;
calculating a first distance between the feature vectors corresponding to the two training images with a distance metric function;
checking the accuracy of the first distance through a loss function according to preset labels corresponding to the two training images to obtain a loss function value;
and training the parameters of the image re-identification network through a back propagation algorithm based on the loss function values.
In a second aspect, an embodiment of the present invention further provides a training apparatus for an image re-recognition network, including:
the feature set acquisition module is used for acquiring a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object distributed in a feature space;
the image generation module is used for generating a training image through a generative adversarial network according to the reference feature set;
the image adding module is used for adding the training image to a training image set;
and the network training module is used for training the image re-identification network by using the training image set.
In a third aspect, an embodiment of the present invention further provides an image re-recognition method, including:
acquiring a query picture containing a target object and at least one picture to be compared;
determining whether the picture to be compared contains the target object through an image re-identification network; the image re-identification network is obtained by training using the training method of any one of the first aspect.
With reference to the third aspect, an embodiment of the present invention provides a first possible implementation manner of the third aspect, where the image re-identification network includes a convolutional neural network and a distance metric function; the step of determining whether the picture to be compared contains the target object through the image re-identification network includes:
extracting the feature vector of the query picture and the feature vector of the picture to be compared through a convolutional neural network;
calculating a second distance between the feature vector of the query picture and the feature vector of the picture to be compared with a distance metric function;
and determining whether the picture to be compared contains the target object according to the second distance.
In a fourth aspect, an embodiment of the present invention provides an image re-recognition apparatus, including:
the picture acquisition module is used for acquiring a query picture containing a target object and at least one picture to be compared;
the re-identification module is used for determining whether the picture to be compared contains the target object through an image re-identification network; the image re-identification network is obtained by training using the training method of any one of the first aspect.
In a fifth aspect, an embodiment of the present invention provides an electronic device, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and the processor implements the steps of the method according to any one of the first aspect and/or any one of the third aspect when executing the computer program.
In a sixth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of any one of the first aspect and/or any one of the third aspect.
The embodiment of the invention has the following beneficial effects:
According to the training method and apparatus for an image re-identification network and the electronic device provided by the embodiments of the invention, training images can be generated through a generative adversarial network and added to the training image set, which increases the number of images in the training image set; training the image re-identification network with more images improves the accuracy of the image re-identification network.
Additional features and advantages of the disclosure will be set forth in the description which follows, or may in part be learned by practice of the disclosure.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating a training method of an image re-identification network according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a process of image generation by a generative adversarial network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of training an image re-identification network with a training image set according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a training apparatus for an image re-identification network according to an embodiment of the present invention;
FIG. 6 is a flow chart illustrating an image re-identification method provided by an embodiment of the invention;
Fig. 7 is a schematic structural diagram illustrating an image re-identification apparatus according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The first embodiment is as follows:
First, an example electronic device 100 for implementing the image re-identification network training method and the image re-identification method of embodiments of the present invention is described with reference to fig. 1. The example electronic device 100 may be a computer, a mobile terminal such as a smart phone or a tablet computer, or an identity verification device such as an integrated ID-and-face verification terminal.
As shown in FIG. 1, electronic device 100 includes one or more processors 102, one or more memory devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may execute the program instructions to implement the client-side functionality and/or other desired functionality in the embodiments of the invention described below. Various applications and data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The image capture device 110 may take images (e.g., photographs, videos, etc.) desired by the user and store the taken images in the storage device 104 for use by other components.
Example two:
Since the number of training images available for training an image re-identification network affects the accuracy of the network, this embodiment provides a training method for an image re-identification network in order to improve that accuracy. It should be noted that the steps shown in the flowcharts of the figures may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one here. The present embodiment is described in detail below.
Fig. 2 shows a flowchart of a training method of an image re-identification network according to an embodiment of the present invention. The image re-identification network may include a convolutional neural network and a distance metric function. The convolutional neural network is used for extracting feature vectors from the two input images, and the distance metric function is used for calculating the distance between the feature vectors of the two images, so as to determine whether the two images show the same object. As shown in fig. 2, the training method includes the following steps:
step S202, acquiring a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object distributed in a feature space.
The reference feature set here is used to generate more training images; therefore, the feature vectors in the reference feature set may be feature vectors derived from initial images in the training image set. The initial images in the training image set may be referred to as reference images. The reference images in the training image set may be obtained from pictures captured by one or more image capture devices (e.g., cameras) or from an image library.
One optional way to obtain the reference feature set is as follows. First, a training image set is acquired; the training image set contains one or more reference images of the same object, and may also contain one or more reference images of other objects. The feature vector of each reference image in the training image set is then extracted through a convolutional neural network; each feature vector is located in a feature space. Therefore, at least one feature vector corresponding to the same object exists in the feature space, and feature vectors corresponding to other objects may exist as well. At least one feature vector corresponding to the same object is selected from the feature space to generate a reference feature set. For example, for a specific object (a specific pedestrian, a specific vehicle, a specific animal, or something else), all feature vectors corresponding to that object are selected from the feature space to generate the reference feature set. The reference feature set retains the mutual positional relationship of its feature vectors in the feature space.
In the above process, each reference image in the training image set yields a corresponding feature vector that exists in the feature space, so the feature space includes feature vectors of a plurality of different objects. After feature vectors have been extracted from all reference images in the training image set, reference feature sets corresponding to the different objects can be generated, so that more training images can be generated in the following steps.
Another alternative way of obtaining the set of reference features is as follows: firstly, acquiring a training image set; all reference images containing the specified object are selected from the training image set, and one or more reference images containing the same object can be obtained to form a reference image set. Respectively extracting the characteristic vector of each reference image in the reference image set through a convolutional neural network; the feature vector is located in a feature space. And forming a reference feature set by all the obtained feature vectors. Also, the reference feature set retains the mutual positional relationship of the respective feature vectors in the feature space.
Similarly, because the training image set includes reference images of a plurality of different objects, the reference images can be selected to form a corresponding reference image set for the different objects, and further, reference feature sets corresponding to the different objects are generated.
The convolutional neural network used in this process may be the convolutional neural network of the image re-identification network.
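As a non-limiting illustration of this step, the following sketch shows how per-object reference feature sets could be assembled from a training image set. It is written in PyTorch purely for concreteness, and the names `backbone` and `train_images` are hypothetical:

```python
import torch
from collections import defaultdict

def build_reference_feature_sets(backbone, train_images):
    """Hypothetical sketch: `backbone` is any convolutional feature extractor
    and `train_images` yields (image_tensor, object_id) pairs."""
    backbone.eval()
    reference_sets = defaultdict(list)  # object_id -> feature vectors
    with torch.no_grad():
        for image, object_id in train_images:
            # Extract the feature vector of each reference image; all vectors
            # live in the shared feature space of the backbone.
            feature = backbone(image.unsqueeze(0)).squeeze(0)
            reference_sets[object_id].append(feature)
    # Each value is the reference feature set of one object; stacking keeps
    # the mutual positions of its feature vectors in the feature space.
    return {oid: torch.stack(vecs) for oid, vecs in reference_sets.items()}
```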
Step S204: generating a training image through a generative adversarial network according to the reference feature set.
A feature vector can be randomly selected from the reference feature set and input into the generative adversarial network to obtain a training image output by the generative adversarial network.
Generally, in the feature space, the feature vectors of the same object are gathered in a limited region, and the feature vectors located at the edge of the region are more representative of the features of the object than the feature vectors located at the center of the region. To generate realistic images more quickly, the reference feature set can be divided into multiple layers from the center to the edge according to the spatial positions of its feature vectors in the feature space, each layer comprising at least one feature vector; feature vectors are then randomly selected from each layer in order from edge to center. For example, feature vectors are randomly chosen first from the edge layer, then from the layer immediately inside it, and so on until they are chosen from the center layer.
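A minimal sketch of this layered, edge-to-center sampling, assuming one object's reference feature set is an N x D tensor and an illustrative layer count of three:

```python
import torch

def sample_edge_to_center(features, num_layers=3):
    # Layer the set by distance from its center: layer 0 is the center
    # layer, the last layer is the edge layer.
    center = features.mean(dim=0)
    distances = torch.norm(features - center, dim=1)
    layers = torch.chunk(torch.argsort(distances), num_layers)
    picks = []
    for layer in reversed(layers):  # edge layer first, center layer last
        choice = layer[torch.randint(len(layer), (1,))]
        picks.append(features[choice].squeeze(0))
    return picks  # one randomly selected feature vector per layer
```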
The selected feature vectors are input into the generative adversarial network to obtain training images, where each training image contains the object corresponding to the feature vector. For example, take a certain pedestrian as the specified object: a reference feature set of the pedestrian is obtained, feature vectors are randomly selected from the reference feature set and input into the generative adversarial network, and the generative adversarial network can output as many training images containing the pedestrian as required. These generated training images may have different lighting or a different background than the reference images containing the pedestrian in the training image set, or may show the pedestrian in a different pose, and so on.
The generative adversarial network comprises a generation network and a discriminator network; the generation network is used for generating an image containing the object corresponding to the feature vector, and the discriminator network is used for judging the fidelity of the image generated by the generation network. Fig. 3 shows a process diagram of image generation by the generative adversarial network. As shown in fig. 3, taking an object A as an example, a feature vector corresponding to the object A is randomly selected from the reference feature set of the object A and input into the generation network to obtain an image to be discriminated, which is an image containing the object A as simulated by the network. The generation network can be understood simply as an image synthesizer that produces, as the image to be discriminated, an image which differs from the reference image but may contain the object A. The image to be discriminated and the reference image corresponding to the feature vector are input into the discriminator network, which judges whether the image to be discriminated contains the object A. If so, the discrimination result output by the discriminator network shows that the image to be discriminated contains the object A, and the image to be discriminated can be used as a training image; if not, the discriminator network determines that the image to be discriminated does not contain the object A, which indicates that the fidelity of the image generated by the generation network is insufficient, and the image cannot be output as a training image.
In the process of generating training images with the generative adversarial network, the discriminator network and the generation network can themselves be trained. For example, the parameters of the generation network are adjusted according to the discrimination result output by the discriminator network. Optionally, the parameters of the generation network and the convolutional neural network may be adjusted simultaneously according to the discrimination result, so that the feature vectors extracted by the convolutional neural network better reflect the characteristics of the object A and the images generated by the generation network become more realistic.
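The generate-and-screen loop described above might look as follows. This is a sketch under assumptions: `generator` maps a feature vector to an image, and `discriminator` returns a probability that a candidate image and a reference image contain the same object; both names and signatures are hypothetical:

```python
import torch

def generate_training_images(generator, discriminator, feature_set,
                             reference_images, threshold=0.5):
    accepted = []
    for feature, reference in zip(feature_set, reference_images):
        candidate = generator(feature.unsqueeze(0))  # image to be discriminated
        score = discriminator(candidate, reference.unsqueeze(0))
        if score.item() > threshold:
            # The discriminator judges that the candidate shows the same
            # object as the reference image, so it becomes a training image.
            accepted.append(candidate.squeeze(0))
        # Rejected candidates are discarded; during GAN training the score
        # would also drive parameter updates of the generation network.
    return accepted
```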
With the above steps, the generative adversarial network can generate a large number of training images for different objects.
Step S206, adding the training images to the training image set.
For example, when the training image set contains n reference images and m training images are generated by the generative adversarial network, the m training images are added to the training image set to obtain a new training image set containing n + m images, thereby increasing the number of images in the training image set. Training the image re-identification network with more images improves the accuracy of the trained network.
Step S208: training the image re-identification network by using the training image set.
The image re-identification network comprises a convolutional neural network and a distance metric function. The convolutional neural network may be one of GoogLeNet, VGG, or ResNet. The distance metric function may be one of a Euclidean distance function (e.g., the L2 distance), a Manhattan distance function, a cosine similarity function, a Chebyshev distance function, a Hamming distance function, or a Mahalanobis distance function.
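For concreteness, two of these metrics written out for 1-D feature vectors (a sketch in PyTorch; the embodiment does not prescribe an implementation):

```python
import torch

def l2_distance(a, b):
    # Euclidean (L2) distance between two feature vectors.
    return torch.norm(a - b)

def cosine_distance(a, b):
    # 1 - cosine similarity: small when the vectors point the same way.
    return 1.0 - torch.dot(a, b) / (torch.norm(a) * torch.norm(b))
```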
The specific process of training the image re-identification network may be as follows: two training images are randomly selected from the training image set, and the feature vectors corresponding to the two training images are extracted through the convolutional neural network; a first distance between the two feature vectors is calculated with the distance metric function; the accuracy of the first distance is checked through a loss function according to preset labels corresponding to the two training images to obtain a loss function value; and the parameters of the image re-identification network are trained through a back propagation algorithm based on the loss function value.
FIG. 4 illustrates an example of training the image re-identification network using the training image set. As shown in fig. 4, a feature vector P1 and a feature vector P2 are extracted from training image 1 and training image 2, respectively, through the convolutional neural network. Training images 1 and 2 may both be reference images in the training image set, may both be training images generated by the generative adversarial network, or one may be a reference image and the other a generated training image. After the feature vectors P1 and P2 are obtained, a first distance between them is calculated with the distance metric function, the accuracy of the first distance is checked through a loss function according to the preset labels corresponding to training images 1 and 2 to obtain a loss function value, and the parameters of the convolutional neural network and the matrix of the distance metric function are trained through a back propagation algorithm based on the loss function value. The loss function used in this embodiment may include, but is not limited to, one or more of a square loss function, a cross-entropy loss function, and a triplet loss function (e.g., Triplet Hard Loss).
The effect of training the parameters of the image re-identification network with the loss function is that the first distance calculated for any two images containing the same object becomes smaller than the first distance calculated for any two images containing different objects. In other words, the smaller the first distance between two images containing the same object, and the larger the first distance between two images containing different objects, the better.
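One training iteration of this procedure could be sketched as below; the contrastive-style objective and the L2 metric are illustrative choices (the embodiment equally permits square, cross-entropy, or triplet hard losses), and all names are hypothetical:

```python
import torch
import torch.nn.functional as F

def training_step(backbone, optimizer, img1, img2, same_object, margin=1.0):
    feat1 = backbone(img1.unsqueeze(0))
    feat2 = backbone(img2.unsqueeze(0))
    # First distance between the two feature vectors (L2 metric).
    first_distance = F.pairwise_distance(feat1, feat2)
    if same_object:                      # preset labels say: same object
        loss = first_distance.pow(2).mean()                  # pull pair together
    else:
        loss = F.relu(margin - first_distance).pow(2).mean()  # push pair apart
    optimizer.zero_grad()
    loss.backward()                      # back propagation
    optimizer.step()                     # update network parameters
    return loss.item()
```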
According to the training method for an image re-identification network provided by the embodiment of the invention, a reference feature set is obtained, training images are generated through the generative adversarial network according to the reference feature set, and the training images are added to the training image set, increasing the number of images in the set; the image re-identification network is then trained with the enlarged training image set, which improves its accuracy.
Example three:
Corresponding to the training method of the image re-identification network provided in the second embodiment, this embodiment provides a training apparatus for an image re-identification network. Fig. 5 is a schematic structural diagram of a training apparatus for an image re-identification network according to an embodiment of the present invention. As shown in fig. 5, the apparatus includes the following modules:
a feature set obtaining module 51, configured to obtain a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object distributed in a feature space;
the image generation module 52 is configured to generate a training image through a generative adversarial network according to the reference feature set;
an image adding module 53, configured to add the training image to a set of training images;
and a network training module 54, configured to train the image re-identification network by using the training image set.
The feature set obtaining module 51 may be further configured to: acquire a training image set, the training image set comprising at least one reference image of the same object; extract the feature vector of each reference image in the training image set through a convolutional neural network, the feature vectors being located in the feature space; and select at least one feature vector corresponding to the same object from the feature space to generate the reference feature set.
Further, the image generation module 52 may include an extraction sub-module and a generation sub-module. The extraction sub-module is configured to randomly select feature vectors in the reference feature set. The generation sub-module is configured to input the selected feature vectors into the generative adversarial network to obtain the training image.
Optionally, the extraction sub-module may be further configured to: divide the reference feature set into a plurality of layers from the center to the edge according to the spatial positions of the feature vectors in the reference feature set in the feature space, each layer of the reference feature set comprising at least one feature vector; and randomly select feature vectors from each layer of the reference feature set in order from the edge to the center.
The generative adversarial network comprises a generation network and a discriminator network. Optionally, the generation sub-module may be further configured to: input the feature vector into the generation network to obtain an image to be discriminated; input the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination; and if the discrimination result output by the discriminator network shows that the image to be discriminated and the reference image corresponding to the feature vector contain the same object, take the image to be discriminated as a training image.
Optionally, the image generation module 52 may include an adjusting module, configured to adjust the parameters of the generation network according to the discrimination result output by the discriminator network after the image to be discriminated and the reference image corresponding to the feature vector are input into the discriminator network for discrimination.
Optionally, the image re-identification network comprises a convolutional neural network and a distance metric function; the convolutional neural network comprises at least one convolutional layer, a global pooling layer and a fully-connected layer which are connected in sequence.
Optionally, the network training module 54 may be further configured to: randomly select two training images from the training image set, and extract the feature vectors corresponding to the two training images through a convolutional neural network; calculate a first distance between the feature vectors corresponding to the two training images with a distance metric function; check the accuracy of the first distance through a loss function according to preset labels corresponding to the two training images to obtain a loss function value; and train the parameters of the image re-identification network through a back propagation algorithm based on the loss function value.
According to the training apparatus for an image re-identification network provided by the embodiment of the invention, a reference feature set is obtained, training images are generated through the generative adversarial network according to the reference feature set, and the training images are added to the training image set, increasing the number of images in the set; the image re-identification network is then trained with the enlarged training image set. By increasing the number of images through the generative adversarial network, the accuracy of the image re-identification network is improved.
The device provided by the embodiment has the same implementation principle and technical effect as the foregoing embodiment, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiment for the portion of the embodiment of the device that is not mentioned.
Example four:
the embodiment of the invention also provides an image re-identification method, as shown in fig. 6, the method comprises the following steps:
step S602: and acquiring a query picture containing a target object and at least one picture to be compared.
The query picture may be captured by an image capturing apparatus of the electronic device or other image capturing devices (such as a camera) connected to the electronic device via a network, or may be provided by a user. The picture to be compared can be obtained from pictures captured by one or more cameras, and can also be extracted from an image library. The number of the pictures to be compared can be one or more. For example, the target object included in the query picture may be a designated pedestrian.
Step S604: determining whether the picture to be compared contains the target object through an image re-identification network.
The image re-identification network is obtained by training using the training method described in the second embodiment. The image re-identification network includes a convolutional neural network and a distance metric function. The convolutional neural network comprises at least one convolutional layer, a global pooling layer and a fully-connected layer connected in sequence. Each convolutional layer in the convolutional neural network comprises one or more convolution kernels used for extracting feature information from the pixel matrix of an input picture; a convolution kernel traverses the pixel matrix of the input picture with a certain stride to obtain at least one feature value, and the at least one feature value forms a feature map. The feature map is reduced in dimension through the global pooling layer and the fully-connected layer to obtain the feature vector corresponding to the input picture. The distance metric function may be one of a Euclidean distance function, a Manhattan distance function, a cosine similarity function, a Chebyshev distance function, a Hamming distance function, or a Mahalanobis distance function.
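A minimal sketch of such a backbone, with illustrative channel sizes and embedding dimension (the actual network may be GoogLeNet, VGG, ResNet, etc.):

```python
import torch.nn as nn

class ReIDBackbone(nn.Module):
    """Convolutional layers, a global pooling layer and a fully-connected
    layer connected in sequence; all sizes here are assumptions."""
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.global_pool = nn.AdaptiveAvgPool2d(1)  # global pooling layer
        self.fc = nn.Linear(64, embedding_dim)      # fully-connected layer

    def forward(self, x):
        x = self.conv(x)                  # feature maps from convolution kernels
        x = self.global_pool(x).flatten(1)
        return self.fc(x)                 # feature vector of the input picture
```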
The query picture and the picture to be compared are respectively input into the convolutional neural network to obtain the feature vector of the query picture and the feature vector of the picture to be compared. A second distance between the two feature vectors is calculated with the distance metric function, and whether the picture to be compared contains the target object is determined according to the second distance.
For example, when there is a single picture to be compared, if the second distance between the feature vector of the query picture and the feature vector of the picture to be compared is greater than a set threshold, it may be determined that the picture to be compared does not contain the target object; if the second distance is smaller than the set threshold, it is determined that the picture to be compared contains the target object.
When there are multiple pictures to be compared, the second distance between the feature vector of each picture to be compared and the feature vector of the query picture is calculated with the distance metric function, giving one second distance per picture to be compared. The pictures to be compared can then be sorted in ascending order of second distance; the picture with the smallest distance has the highest probability of containing the target object. A set number of top-ranked pictures can be output, or the pictures whose second distance is smaller than the set threshold can be output in order. The output pictures are regarded as the pictures most likely to contain the target object, and the user only needs to screen these carefully, which can greatly reduce the user's workload.
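The multi-picture case could be sketched as follows, assuming the query feature is a 1-D tensor and the candidate features form an N x D tensor; the threshold value is an assumption:

```python
import torch
import torch.nn.functional as F

def rank_candidates(query_feature, candidate_features, threshold=0.8):
    # Second distance from the query feature to every candidate feature.
    second_distances = F.pairwise_distance(
        query_feature.unsqueeze(0).expand_as(candidate_features),
        candidate_features)
    order = torch.argsort(second_distances)  # smallest second distance first
    # Keep, in ascending order, the candidates under the set threshold.
    return [i.item() for i in order if second_distances[i] < threshold]
```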
Because the image re-identification network adopted in this embodiment is trained with a large number of training images, it has higher precision and therefore higher recognition accuracy.
Example five:
Corresponding to the image re-identification method provided in the fourth embodiment, and with reference to fig. 7, this embodiment provides an image re-identification apparatus, including a picture acquisition module 71 and a re-identification module 72.
The picture acquisition module 71 is configured to acquire a query picture containing a target object and at least one picture to be compared; the re-identification module 72 is configured to determine whether the picture to be compared contains the target object through an image re-identification network; the image re-identification network is obtained by training using the training method of any one of the preceding embodiments.
Optionally, the image re-identification network comprises a convolutional neural network and a distance metric function. The re-identification module 72 may be further configured to: extract the feature vector of the query picture and the feature vector of the picture to be compared through the convolutional neural network; calculate a second distance between the two feature vectors with the distance metric function; and determine whether the picture to be compared contains the target object according to the second distance.
Furthermore, an embodiment of the present invention provides an electronic device including a memory and a processor, where the memory stores a computer program operable on the processor, and the processor, when executing the computer program, implements the steps of the training method for an image re-identification network and/or of the image re-identification method provided in the foregoing embodiments.
Further, an embodiment of the present invention provides a computer program product including a computer-readable storage medium storing program code. The instructions included in the program code may be used to execute the training method for an image re-identification network and/or the image re-identification method described in the foregoing embodiments; for specific implementation, reference may be made to the method embodiments, which are not repeated here.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above-described embodiments are only specific embodiments of the present invention, used to illustrate the technical solutions of the present invention and not to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may still modify or easily conceive of changes to the technical solutions described in the foregoing embodiments, or substitute equivalents for some of their technical features, within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention and shall be covered by it. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (12)

1. A training method of an image re-identification network, characterized by comprising the following steps:
acquiring a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object and feature vectors corresponding to other objects distributed in a feature space;
generating a training image through a generative adversarial network according to the reference feature set;
adding the training image to a training image set;
training an image re-identification network by using the training image set;
wherein the step of generating a training image through a generative adversarial network according to the reference feature set comprises:
dividing the reference feature set into a plurality of layers from the center to the edge according to the spatial positions of the feature vectors in the reference feature set in the feature space, each layer of the reference feature set comprising at least one feature vector;
randomly selecting a feature vector from each layer of the reference feature set in order from the edge to the center;
and inputting the selected feature vectors into the generative adversarial network to obtain the training image.
2. The training method of claim 1, wherein the step of obtaining the reference feature set comprises:
acquiring a training image set; the training image set comprises at least one reference image of the same object;
respectively extracting the feature vector of each reference image in the training image set through a convolutional neural network; the feature vector is located in the feature space;
and selecting at least one feature vector corresponding to the same object from the feature space to generate the reference feature set.
3. The training method of claim 2, wherein the generative adversarial network comprises a generation network and a discriminator network, and the step of inputting the selected feature vector into the generative adversarial network to obtain the training image comprises:
inputting the feature vector into the generation network to obtain an image to be discriminated;
inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination;
and if the discrimination result output by the discriminator network shows that the image to be discriminated and the reference image corresponding to the feature vector contain the same object, taking the image to be discriminated as a training image.
4. The training method according to claim 3, wherein after the step of inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination, the method further comprises:
adjusting the parameters of the generation network according to the discrimination result output by the discriminator network.
5. The training method of claim 1, wherein the image re-identification network comprises a convolutional neural network and a distance metric function; the convolutional neural network comprises at least one convolutional layer, a global pooling layer and a fully-connected layer connected in sequence.
6. The training method of claim 5, wherein the step of training an image re-recognition network using the set of training images comprises:
randomly selecting two training images from the training image set, and respectively extracting feature vectors corresponding to the two training images through a convolutional neural network;
calculating a first distance between the feature vectors corresponding to the two training images with a distance metric function;
checking the accuracy of the first distance through a loss function according to preset labels corresponding to the two training images to obtain a loss function value;
and training the parameters of the image re-identification network through a back propagation algorithm based on the loss function values.
7. An apparatus for training an image re-identification network, comprising:
a feature set acquisition module, configured to acquire a reference feature set; the reference feature set comprises at least one feature vector corresponding to the same object and feature vectors corresponding to other objects distributed in a feature space;
an image generation module, configured to generate a training image through a generative adversarial network according to the reference feature set;
an image adding module, configured to add the training image to a training image set;
a network training module, configured to train an image re-identification network by using the training image set;
wherein the image generation module is further configured to:
divide the reference feature set into a plurality of layers from the center to the edge according to the spatial positions of the feature vectors in the reference feature set in the feature space, each layer of the reference feature set comprising at least one feature vector;
randomly select a feature vector from each layer of the reference feature set in order from the edge to the center;
and input the selected feature vectors into the generative adversarial network to obtain the training image.
8. An image re-identification method, comprising:
acquiring a query picture containing a target object and at least one picture to be compared;
determining whether the picture to be compared contains the target object through an image re-identification network; the image re-identification network is obtained by training with the training method of any one of claims 1-6.
9. The image re-identification method of claim 8, wherein the image re-identification network comprises a convolutional neural network and a distance metric function; the step of determining whether the picture to be compared contains the target object through the image re-identification network comprises:
extracting the feature vector of the query picture and the feature vector of the picture to be compared through a convolutional neural network;
calculating a second distance between the feature vector of the query picture and the feature vector of the picture to be compared with a distance metric function;
and determining whether the picture to be compared contains the target object according to the second distance.
10. An image re-identification apparatus, comprising:
a picture acquisition module, configured to acquire a query picture containing a target object and at least one picture to be compared;
a re-identification module, configured to determine whether the picture to be compared contains the target object through an image re-identification network; the image re-identification network is obtained by training with the training method of any one of claims 1-6.
11. An electronic device comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6 and/or any of claims 8 to 9.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any of the claims 1 to 6 and/or 8 to 9.
CN201810893815.5A 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device Active CN109063776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810893815.5A CN109063776B (en) 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810893815.5A CN109063776B (en) 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device

Publications (2)

Publication Number Publication Date
CN109063776A CN109063776A (en) 2018-12-21
CN109063776B (en) 2021-08-10

Family

ID=64678092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810893815.5A Active CN109063776B (en) 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device

Country Status (1)

Country Link
CN (1) CN109063776B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886185B (en) * 2019-02-18 2023-08-04 深圳市商汤科技有限公司 Target identification method, device, electronic equipment and computer storage medium
CN109948561B (en) * 2019-03-25 2019-11-08 广东石油化工学院 The method and system that unsupervised image/video pedestrian based on migration network identifies again
CN110414336A (en) * 2019-06-21 2019-11-05 中国矿业大学 A kind of depth complementation classifier pedestrian's searching method of triple edge center loss
CN110659582A (en) * 2019-08-29 2020-01-07 深圳云天励飞技术有限公司 Image conversion model training method, heterogeneous face recognition method, device and equipment
CN111898581B (en) * 2020-08-12 2024-05-17 成都佳华物链云科技有限公司 Animal detection method, apparatus, electronic device, and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133601A (en) * 2017-05-13 2017-09-05 五邑大学 A kind of pedestrian's recognition methods again that network image super-resolution technique is resisted based on production
CN107292813A (en) * 2017-05-17 2017-10-24 浙江大学 A kind of multi-pose Face generation method based on generation confrontation network
CN107423701A (en) * 2017-07-17 2017-12-01 北京智慧眼科技股份有限公司 The non-supervisory feature learning method and device of face based on production confrontation network
CN108108754A (en) * 2017-12-15 2018-06-01 北京迈格威科技有限公司 The training of identification network, again recognition methods, device and system again
CN108256439A (en) * 2017-12-26 2018-07-06 北京大学 A kind of pedestrian image generation method and system based on cycle production confrontation network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008140584A2 (en) * 2006-11-21 2008-11-20 California Institute Of Technology Second harmonic imaging nanoprobes and techniques for use thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pedestrian re-identification with a DGB convolutional neural network; Yang Zhongtao et al.; Journal of China University of Metrology; 2017-12-31; Vol. 28, No. 4; pp. 504-508 *

Also Published As

Publication number Publication date
CN109063776A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
CN108710847B (en) Scene recognition method and device and electronic equipment
CN109063776B (en) Image re-recognition network training method and device and image re-recognition method and device
Kayhan et al. On translation invariance in cnns: Convolutional layers can exploit absolute spatial location
CN109376667B (en) Target detection method and device and electronic equipment
CN109255352B (en) Target detection method, device and system
CN109325954B (en) Image segmentation method and device and electronic equipment
CN109214366B (en) Local target re-identification method, device and system
CN109815843B (en) Image processing method and related product
JP6471448B2 (en) Noise identification method and noise identification apparatus for parallax depth image
CN105069424B (en) Quick face recognition system and method
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
US9269025B1 (en) Object detection in images
CN108875487B (en) Training of pedestrian re-recognition network and pedestrian re-recognition based on training
CN109241888B (en) Neural network training and object recognition method, device and system and storage medium
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
Fang et al. Deep3DSaliency: Deep stereoscopic video saliency detection model by 3D convolutional networks
CN111428805B (en) Method for detecting salient object, model, storage medium and electronic device
CN109086690B (en) Image feature extraction method, target identification method and corresponding device
CN113255685B (en) Image processing method and device, computer equipment and storage medium
CN112819875A (en) Monocular depth estimation method and device and electronic equipment
CN111767750A (en) Image processing method and device
CN111797971A (en) Method, device and electronic system for processing data by using convolutional neural network
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
CN111382791A (en) Deep learning task processing method, image recognition task processing method and device
Liu et al. Two-stream refinement network for RGB-D saliency detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant