CN109063776A - Image re-identification network training method and apparatus, and image re-identification method and apparatus - Google Patents

Image re-identification network training method and apparatus, and image re-identification method and apparatus

Info

Publication number
CN109063776A
CN109063776A
Authority
CN
China
Prior art keywords
image
training
network
feature
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810893815.5A
Other languages
Chinese (zh)
Other versions
CN109063776B (en)
Inventor
张弛 (Zhang Chi)
张思朋 (Zhang Sipeng)
金昊 (Jin Hao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN201810893815.5A priority Critical patent/CN109063776B/en
Publication of CN109063776A publication Critical patent/CN109063776A/en
Application granted granted Critical
Publication of CN109063776B publication Critical patent/CN109063776B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides an image re-identification network training method and apparatus, and an image re-identification method and apparatus, relating to the technical field of image re-identification. The training method comprises: obtaining a reference feature set, the reference feature set comprising at least one feature vector corresponding to a same object distributed in a feature space; generating a training image from the reference feature set by means of a generative adversarial network; adding the training image to a training image set; and training the image re-identification network using the training image set. By generating training images with a generative adversarial network and adding them to the training image set, the embodiments of the present invention increase the number of images in the training image set; training the image re-identification network on more images can improve the accuracy of the network.

Description

Image re-identification network training method and apparatus, and image re-identification method and apparatus
Technical field
The present invention relates to the technical field of image re-identification, and more particularly to an image re-identification network training method and apparatus, and an image re-identification method and apparatus.
Background technique
With the development of video structuring technology, image re-identification has been widely applied. Pedestrian re-identification, as a branch of image re-identification, is widely used in fields such as security and video retrieval. One application scenario of pedestrian re-identification is as follows: in a video surveillance network, for a pedestrian specified in one camera, an image re-identification network is used to determine whether that pedestrian appears in images captured by other cameras, or whether the pedestrian appears in an image stored in an image library.
An image re-identification network needs to be trained on a large number of images. In practical applications, however, the number of collected images that can be used for training is often small, which limits the accuracy of the image re-identification network.
Summary of the invention
In view of this, an object of the present invention is to provide a training method and apparatus for an image re-identification network, and an electronic device, which can increase the number of images used in training and thereby improve the accuracy of the image re-identification network.
To achieve the above object, the technical solutions adopted in the embodiments of the present invention are as follows:
In a first aspect, an embodiment of the present invention provides a training method for an image re-identification network, comprising:
obtaining a reference feature set, the reference feature set comprising at least one feature vector corresponding to a same object distributed in a feature space;
generating a training image from the reference feature set by means of a generative adversarial network;
adding the training image to a training image set; and
training the image re-identification network using the training image set.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein the step of obtaining the reference feature set comprises:
obtaining a training image set, the training image set comprising at least one reference image of a same object;
extracting, by a convolutional neural network, a feature vector for each reference image in the training image set, the feature vectors being located in the feature space; and
selecting, from the feature space, at least one feature vector corresponding to the same object to generate the reference feature set.
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein the step of generating a training image from the reference feature set by means of a generative adversarial network comprises:
randomly selecting a feature vector from the reference feature set; and
inputting the selected feature vector into the generative adversarial network to obtain the training image.
With reference to the second possible implementation of the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, wherein the step of randomly selecting a feature vector from the reference feature set comprises:
dividing the reference feature set into multiple layers from center to edge according to the spatial positions, in the feature space, of the feature vectors in the reference feature set, each layer containing at least one feature vector; and
randomly selecting feature vectors from each layer of the reference feature set in order from edge to center.
With reference to the second possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein the generative adversarial network comprises a generator network and a discriminator network, and the step of inputting the selected feature vector into the generative adversarial network to obtain the training image comprises:
inputting the feature vector into the generator network to obtain an image to be discriminated;
inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination; and
if the discrimination result output by the discriminator network indicates that the image to be discriminated and the reference image corresponding to the feature vector contain the same object, taking the image to be discriminated as a training image.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein after the step of inputting the image to be discriminated and the reference image corresponding to the feature vector into the discriminator network for discrimination, the method further comprises:
adjusting the parameters of the generator network according to the discrimination result output by the discriminator network.
With reference to the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein the image re-identification network comprises a convolutional neural network and a distance metric function, and the convolutional neural network comprises at least one convolutional layer, a global pooling layer, and a fully connected layer connected in sequence.
With reference to the sixth possible implementation of the first aspect, an embodiment of the present invention provides a seventh possible implementation of the first aspect, wherein the step of training the image re-identification network using the training image set comprises:
randomly selecting two training images from the training image set, and extracting, by the convolutional neural network, the feature vectors corresponding to the two training images respectively;
calculating a first distance between the feature vectors corresponding to the two training images using the distance metric function;
checking the accuracy of the first distance by a loss function according to preset labels corresponding to the two training images, to obtain a loss function value; and
training the parameters of the image re-identification network by a back-propagation algorithm based on the loss function value.
In a second aspect, an embodiment of the present invention further provides a training apparatus for an image re-identification network, comprising:
a feature set obtaining module, configured to obtain a reference feature set, the reference feature set comprising at least one feature vector corresponding to a same object distributed in a feature space;
an image generation module, configured to generate a training image from the reference feature set by means of a generative adversarial network;
an image adding module, configured to add the training image to a training image set; and
a network generation module, configured to train the image re-identification network using the training image set.
In a third aspect, an embodiment of the present invention further provides an image re-identification method, comprising:
obtaining a query picture containing a target object and at least one picture to be compared; and
determining, by an image re-identification network, whether the picture to be compared contains the target object, the image re-identification network being trained by the training method of any one of the implementations of the first aspect.
With reference to the third aspect, an embodiment of the present invention provides a first possible implementation of the third aspect, wherein the image re-identification network comprises a convolutional neural network and a distance metric function, and the step of determining, by the image re-identification network, whether the picture to be compared contains the target object comprises:
extracting, by the convolutional neural network, the feature vector of the query picture and the feature vector of the picture to be compared;
calculating a second distance between the feature vector of the query picture and the feature vector of the picture to be compared using the distance metric function; and
determining, according to the second distance, whether the picture to be compared contains the target object.
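The comparison step above can be sketched as follows. This is an illustrative outline under stated assumptions, not the patented implementation: the convolutional neural network is replaced by precomputed toy feature vectors, the L2 distance stands in for the distance metric function, and the threshold value is an arbitrary choice.

```python
import math

def l2_distance(u, v):
    # Second distance between the query feature and a candidate feature.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contains_target(query_feat, candidate_feats, threshold=0.5):
    # A picture to be compared is judged to contain the target object when
    # its feature vector lies within `threshold` of the query feature.
    return [l2_distance(query_feat, f) <= threshold for f in candidate_feats]

# Toy features standing in for CNN outputs.
query = [0.9, 0.1]
candidates = [[0.85, 0.12], [0.1, 0.95]]
print(contains_target(query, candidates))  # [True, False]
```

In practice the threshold would be tuned on a validation set, or the candidates simply ranked by second distance.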
In a fourth aspect, an embodiment of the present invention provides an image re-identification apparatus, comprising:
a picture obtaining module, configured to obtain a query picture containing a target object and at least one picture to be compared; and
a re-identification module, configured to determine, by an image re-identification network, whether the picture to be compared contains the target object, the image re-identification network being trained by the training method of any one of the implementations of the first aspect.
In a fifth aspect, an embodiment of the present invention provides an electronic device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of any one of the implementations of the first aspect and/or the third aspect.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when run by a processor, executes the steps of the method of any one of the implementations of the first aspect and/or the third aspect.
The embodiments of the present invention bring the following beneficial effects:
The training method and apparatus for an image re-identification network and the electronic device provided by the embodiments of the present invention can generate training images by means of a generative adversarial network and add them to the training image set, increasing the number of images in the set; training the image re-identification network on more images can improve its accuracy.
Other features and advantages of the present disclosure will be set forth in the following description, or some features and advantages may be inferred or unambiguously determined from the description, or may be learned by implementing the above techniques of the present disclosure.
To make the above objects, features, and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present invention;
Fig. 2 shows a flowchart of a training method for an image re-identification network provided by an embodiment of the present invention;
Fig. 3 shows a process diagram of image regeneration by a generative adversarial network provided by an embodiment of the present invention;
Fig. 4 shows a flowchart of a method for training an image re-identification network using a training image set provided by an embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of a training apparatus for an image re-identification network provided by an embodiment of the present invention;
Fig. 6 shows a flowchart of an image re-identification method provided by an embodiment of the present invention;
Fig. 7 shows a schematic structural diagram of an image re-identification apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiment one:
First, referring to Fig. 1, an exemplary electronic device 100 for implementing the methods of the embodiments of the present invention is described. The exemplary electronic device 100 may be a computer, a mobile terminal such as a smartphone or a tablet computer, or an authentication device such as an ID-verification all-in-one machine.
As shown in Fig. 1, the electronic device 100 comprises one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110, which are interconnected by a bus system 112 and/or connection mechanisms of other forms (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are merely exemplary rather than restrictive; the electronic device may have other components and structures as needed.
The processor 102 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control the other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or nonvolatile memory. The volatile memory may include, for example, random access memory (RAM) and/or a cache. The nonvolatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functionality (implemented by the processor) and/or other desired functions in the embodiments of the present invention described below. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
The image acquisition device 110 may capture images desired by the user (such as photos or videos) and store the captured images in the storage device 104 for use by other components.
Embodiment two:
Since the number of training images available for training an existing image re-identification network is small, the accuracy of the network is affected. To improve the accuracy of the image re-identification network, this embodiment first provides a training method for an image re-identification network. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from that herein. This embodiment is described in detail below.
Fig. 2 shows a flowchart of a training method for an image re-identification network provided by an embodiment of the present invention. The image re-identification network may comprise a convolutional neural network and a distance metric function. The convolutional neural network is used to extract feature vectors from the two input images to be identified, and the distance metric function is used to calculate the distance between the feature vectors of the two images, to determine whether the two images depict the same object. As shown in Fig. 2, the training method comprises the following steps:
Step S202: obtain a reference feature set; the reference feature set comprises at least one feature vector corresponding to a same object distributed in a feature space.
The reference feature set here is used to generate more training images; therefore, the feature vectors in the reference feature set may be derived from the feature vectors of the initial images in the training image set. The initial images in the training image set may be called reference images. The reference images in the training image set may be obtained from pictures captured by one or more image acquisition devices (such as cameras), or may be obtained from an image library.
One optional way of obtaining the reference feature set is as follows. First, a training image set is obtained; the training image set contains one or more reference images of a same object, and also contains one or more reference images of other objects. A convolutional neural network extracts a feature vector from each reference image in the training image set; the feature vectors are located in a feature space. The feature space therefore contains at least one feature vector corresponding to the same object, and may also contain feature vectors corresponding to other objects. From the feature space, at least one feature vector corresponding to the same object is selected to generate the reference feature set. For example, for a specified object (which may be a specified pedestrian, a specified vehicle, a specified animal, or some other thing), all feature vectors corresponding to that object are selected from the feature space to generate the reference feature set. The reference feature set preserves the mutual positional relationships of the feature vectors in the feature space.
In the above process, since each reference image in the training image set yields a corresponding feature vector in the feature space, the feature space contains feature vectors of multiple different objects. After feature vectors have been extracted from all reference images in the training image set, a corresponding reference feature set can be generated for each different object, so that more training images can be generated in the following steps.
Another optional way of obtaining the reference feature set is as follows. First, a training image set is obtained; all reference images containing a specified object are selected from the training image set, yielding one or more reference images of the same object, which form a reference image set. A convolutional neural network extracts a feature vector from each reference image in the reference image set; the feature vectors are located in the feature space. All the obtained feature vectors form the reference feature set. Likewise, the reference feature set preserves the mutual positional relationships of the feature vectors in the feature space.
Similarly, since the training image set contains reference images of multiple different objects, a corresponding reference image set can be selected for each object, and thus multiple reference feature sets, one per object, can be generated.
The convolutional neural network used in this process may be the convolutional neural network in the image re-identification network.
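The first way of building reference feature sets can be sketched as follows. This is a minimal illustration rather than the patent's implementation: `extract_feature` is a hypothetical stub standing in for the convolutional neural network, and the reference images are assumed to carry object labels.

```python
from collections import defaultdict

def extract_feature(image):
    # Stub for the re-identification CNN: maps an image to a feature vector.
    # Here the "image" is already a short list of numbers for illustration.
    return tuple(image)

def build_reference_feature_sets(training_set):
    # training_set: list of (object_id, image) pairs.
    # Every reference image yields one feature vector in the feature space;
    # the vectors are then grouped per object into reference feature sets.
    feature_space = [(obj, extract_feature(img)) for obj, img in training_set]
    ref_sets = defaultdict(list)
    for obj, feat in feature_space:
        ref_sets[obj].append(feat)
    return dict(ref_sets)

training_set = [("A", [0.1, 0.2]), ("B", [0.9, 0.8]), ("A", [0.15, 0.25])]
print(build_reference_feature_sets(training_set)["A"])
# [(0.1, 0.2), (0.15, 0.25)]
```

The second optional way differs only in filtering the images for one specified object before extracting features.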
Step S204: generate a training image from the reference feature set by means of a generative adversarial network.
A feature vector may be randomly selected from the reference feature set and input into the generative adversarial network, and the training image output by the generative adversarial network is obtained.
In general, in the feature space, the feature vectors of the same object cluster in a limited region, and a feature vector at the edge of the region is more representative of the object's features than a feature vector at the center of the region. To quickly generate more realistic images, the reference feature set may be divided into multiple layers from center to edge according to the spatial positions of its feature vectors in the feature space, each layer containing at least one feature vector, and feature vectors may then be randomly selected from each layer in order from edge to center. For example, a feature vector is first randomly selected from the edge layer, then from the layer adjacent to the edge layer, and so on, until a feature vector is randomly selected from the central layer.
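The center-to-edge layering and edge-first sampling described above can be sketched as follows. This is an illustration under an assumption: the patent does not fix a concrete layering rule, so here the layers are formed by distance from the centroid of the reference feature set, and each layer contributes one randomly chosen vector.

```python
import math
import random

def layer_and_sample(ref_set, n_layers=3):
    # Partition a reference feature set into layers by distance from its
    # centroid, then visit the layers from edge (farthest) to center
    # (nearest), picking one random feature vector per layer.
    dim = len(ref_set[0])
    centroid = [sum(v[i] for v in ref_set) / len(ref_set) for i in range(dim)]
    def dist(v):
        return math.sqrt(sum((v[i] - centroid[i]) ** 2 for i in range(dim)))
    ordered = sorted(ref_set, key=dist, reverse=True)  # edge layer first
    size = max(1, math.ceil(len(ordered) / n_layers))
    layers = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    return [random.choice(layer) for layer in layers]

ref = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (0.1, 0.1), (9.0, 9.0), (5.0, 5.0)]
print(layer_and_sample(ref))
```

Each returned vector would then be fed to the generative adversarial network in turn.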
The selected feature vector is input into the generative adversarial network to obtain a training image, which contains the object corresponding to the feature vector. For example, taking a certain pedestrian as the specified object, the reference feature set of that pedestrian is obtained, feature vectors are randomly selected from the reference feature set and input into the generative adversarial network, and the generative adversarial network can output training images containing that pedestrian as required. These generated training images may differ from the reference images of that pedestrian in the training image set in illumination, background, the pedestrian's pose, and so on.
The generative adversarial network comprises a generator network and a discriminator network. The purpose of the generator network is to generate an image containing the object corresponding to the feature vector, and the role of the discriminator network is to judge how realistic the image generated by the generator network is. Fig. 3 shows a process diagram of image regeneration by a generative adversarial network. As shown in Fig. 3, taking object A as an example, a feature vector corresponding to object A is randomly selected from the reference feature set of object A, and the selected feature vector is input into the generator network to obtain an image to be discriminated, which is an image containing object A simulated by the generator network. The generator network can be loosely understood as a virtual camera that shoots an image differing from the reference images but possibly containing object A; that image is the image to be discriminated. The image to be discriminated and the reference image corresponding to the feature vector are input into the discriminator network, which judges whether the image to be discriminated contains object A. If so, that is, the discrimination result output by the discriminator network indicates that the image to be discriminated contains object A, the image to be discriminated can be taken as a training image. If not, that is, the discriminator network judges that the image to be discriminated does not contain object A, the image generated by the generator network is not realistic enough, and the image to be discriminated cannot be output as a training image.
While the generative adversarial network generates training images, the discriminator network and the generator network may also be trained. For example, the parameters of the generator network are adjusted according to the discrimination result output by the discriminator network. Optionally, the parameters of the generator network and of the above-mentioned convolutional neural network may be adjusted simultaneously according to the discrimination result, so that the feature vectors extracted by the convolutional neural network better capture the characteristics of object A, and the images generated by the generator network become more realistic.
Following the above steps, the generative adversarial network can generate a large number of training images for different objects.
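The generate-and-discriminate loop above can be sketched as follows. Both networks are toy stand-ins (a real GAN would decode features into pixels and learn the discriminator adversarially); only the control flow, keeping a generated image as a training image when the discriminator accepts it, mirrors the description.

```python
import random

def generator(feature, noise_scale=0.05):
    # Toy generator: perturbs the feature to simulate a new image of the
    # same object. A real generator network would decode into an image.
    return [x + random.uniform(-noise_scale, noise_scale) for x in feature]

def discriminator(candidate, reference, tol=0.2):
    # Toy discriminator: accepts the candidate as depicting the same object
    # as the reference when the two are close enough.
    return max(abs(a - b) for a, b in zip(candidate, reference)) <= tol

def generate_training_images(ref_features, n=10):
    training_images = []
    for _ in range(n):
        feat = random.choice(ref_features)
        candidate = generator(feat)
        if discriminator(candidate, feat):  # keep only realistic outputs
            training_images.append(candidate)
    return training_images

random.seed(0)
imgs = generate_training_images([[0.2, 0.4], [0.3, 0.5]], n=5)
print(len(imgs))  # 5 (noise_scale < tol, so every candidate is accepted)
```

With a weaker toy generator (larger `noise_scale`), some candidates would be rejected, matching the case where the discriminator deems an image insufficiently realistic.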
Step S206: add the training images to the training image set.
For example, when the training image set contains n reference images, if m training images are generated by the generative adversarial network, the m training images are added to the training image set to obtain a new training image set containing n+m images, thereby increasing the number of images in the training image set. Training the image re-identification network on more images can improve the accuracy of the trained network.
Step S208: train the image re-identification network using the training image set.
The image re-identification network comprises a convolutional neural network and a distance metric function. The convolutional neural network may adopt one of GoogLeNet, VGG, or ResNet. The distance metric function may adopt one of a Euclidean distance function (such as the L2 distance function), a Manhattan distance function, a cosine similarity function, a Chebyshev distance function, a Hamming distance function, or a Mahalanobis distance function.
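A few of the listed distance metric functions can be sketched in plain Python as follows; these are the textbook definitions, not code from the patent.

```python
import math

def l2_distance(u, v):          # Euclidean (L2) distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan_distance(u, v):   # L1 distance
    return sum(abs(a - b) for a, b in zip(u, v))

def chebyshev_distance(u, v):   # L-infinity distance
    return max(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):      # 1 minus cosine similarity
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

p, q = [1.0, 0.0], [0.0, 1.0]
print(manhattan_distance(p, q))  # 2.0
print(chebyshev_distance(p, q))  # 1.0
print(cosine_distance(p, q))     # 1.0
```

The Mahalanobis distance additionally involves a learnable matrix, which matches the mention below of training "the matrix of the distance metric function".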
Training image identifies that the detailed process of network can be such that again and randomly selects two instructions from training image sample set Practice image, by convolutional neural networks, extracts the corresponding feature vector of two training images respectively;Using distance metric function meter Calculate the first distance between the corresponding feature vector of two training images;According to the corresponding default label of two training images, lead to It crosses loss function and accuracy inspection is carried out to first distance, obtain loss function value;Based on loss function value, pass through backpropagation Algorithm identifies that the parameter of network is trained to image again.
Fig. 4 shows an example for identifying network again using training image collection training image.As shown in connection with fig. 4, it will instruct Practice image 1 and training image 2 by convolutional neural networks, extracts feature vector P1 and feature vector P2 respectively.Wherein, training figure As 1 and training image 2 all can be the reference picture concentrated of training image, or production all can be passed through to fight network raw At training image or one for training image concentrate reference picture, one for pass through production fight network generate Training image.After obtaining feature vector P1 and feature vector P2, feature vector P1 and feature are calculated using distance metric function First distance between vector P2, according to training image 1 and the corresponding default label of training image 2, by loss function to One distance carries out accuracy inspection, obtains loss function value, loss function value is based on, by back-propagation algorithm to convolutional Neural The parameter of network and the matrix of distance metric function are trained.Wherein, loss function used by the present embodiment may include But it is not limited to quadratic loss function, cross entropy loss function, difficult sample and (Triplet Hard Loss) letter is lost using triple One of number is a variety of.
The effect of training the parameters of the image re-identification network with the loss function is that the first distance computed for any two images containing the same object becomes smaller than the first distance computed for any two images containing different objects. In other words, the smaller the first distance between two images of the same object, and the larger the first distance between two images of different objects, the better.
With the training method for an image re-identification network provided by the embodiment of the present invention, a reference feature set is obtained; training images are generated from the reference feature set through a generative adversarial network; and the training images are added to the training image set, increasing the number of images it contains. The enlarged training image set is then used to train the image re-identification network. The present invention thus increases the number of images through generative adversarial generation, achieving the goal of improving the precision of the image re-identification network.
Embodiment three:
Corresponding to the training method for an image re-identification network provided in embodiment two, this embodiment provides a training device for an image re-identification network. Fig. 5 shows a schematic structural diagram of a training device for an image re-identification network provided by an embodiment of the present invention. As shown in Fig. 5, the device comprises the following modules:
A feature set acquisition module 51, configured to obtain a reference feature set; the reference feature set contains at least one feature vector of the same object distributed in a feature space;
An image generation module 52, configured to generate training images from the reference feature set through a generative adversarial network;
An image adding module 53, configured to add the training images to a training image set;
A network training module 54, configured to train the image re-identification network with the training image set.
The feature set acquisition module 51 may further be configured to: obtain a training image set, the training image set containing at least one reference image of the same object; extract, through a convolutional neural network, the feature vector of each reference image in the training image set, the feature vectors lying in the feature space; and select at least one feature vector of the same object from the feature space to generate the reference feature set.
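The grouping step of this module can be sketched as follows, assuming the per-image feature vectors have already been extracted by the convolutional network; `build_reference_sets` and its inputs are hypothetical names for illustration.

```python
import numpy as np

def build_reference_sets(features, object_ids):
    """Group per-image feature vectors by object id, so that each
    object's vectors form its reference feature set. `features` is an
    (n_images, dim) array standing in for the CNN outputs; `object_ids`
    labels each row with the object it depicts."""
    return {obj: features[object_ids == obj] for obj in np.unique(object_ids)}
```

Selecting "at least one feature vector of the same object" is then just indexing one of these per-object arrays.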
Further, the image generation module 52 may include an extraction submodule and a generation submodule. The extraction submodule is configured to randomly select a feature vector from the reference feature set. The generation submodule is configured to input the extracted feature vector into the generative adversarial network to obtain the training image.
Optionally, the extraction submodule may further be configured to: divide the reference feature set into multiple layers from center to edge according to the spatial positions of its feature vectors in the feature space, each layer containing at least one feature vector; and randomly select feature vectors from each layer in order from edge to center.
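One way to realise this layering is to bucket vectors by their distance from the set's centroid, as sketched below. The quantile-based split is an assumption for illustration; the patent does not prescribe how the center-to-edge boundaries are chosen.

```python
import numpy as np

def layer_reference_set(features, n_layers=3):
    """Divide a reference feature set into concentric layers, center to
    edge, by each vector's distance from the set centroid. Quantile
    boundaries keep the layers roughly equally populated."""
    center = features.mean(axis=0)
    dist = np.linalg.norm(features - center, axis=1)
    edges = np.quantile(dist, np.linspace(0.0, 1.0, n_layers + 1))
    idx = np.clip(np.digitize(dist, edges[1:-1]), 0, n_layers - 1)
    return [features[idx == k] for k in range(n_layers)]  # [0]=center ... [-1]=edge

def sample_edge_to_center(layers, rng):
    """Draw one feature vector from each non-empty layer, edge first,
    matching the edge-to-center selection order described above."""
    return [layer[rng.integers(len(layer))]
            for layer in reversed(layers) if len(layer)]
```

Sampling edge layers first favours feature vectors far from the object's typical appearance, which may yield more varied generated training images.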
The generative adversarial network includes a generation network and a discrimination network. Optionally, the generation submodule may further be configured to: input the feature vector into the generation network to obtain an image to be identified; input the image to be identified and the reference image corresponding to the feature vector into the discrimination network for identification; and, if the identification result output by the discrimination network indicates that the image to be identified and the reference image corresponding to the feature vector contain the same object, take the image to be identified as a training image.
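The generate-then-filter flow can be sketched as follows. Both networks are replaced by hypothetical stand-ins (an outer-product "generator" and a correlation-threshold "discriminator") purely to show the control flow; they are not the patent's architectures.

```python
import numpy as np

rng = np.random.default_rng(1)

def generation_network(feature):
    """Hypothetical stand-in for the generation network: maps a
    feature vector to a candidate image."""
    return np.outer(feature, feature)

def discrimination_network(candidate, reference):
    """Hypothetical stand-in for the discrimination network: judges
    whether the candidate shows the same object as the reference."""
    corr = np.corrcoef(candidate.ravel(), reference.ravel())[0, 1]
    return corr > 0.5

feature = rng.normal(size=8)
# reference image of the same object (here: a noisy variant)
reference_image = generation_network(feature) + rng.normal(scale=0.01, size=(8, 8))

image_to_identify = generation_network(feature)
if discrimination_network(image_to_identify, reference_image):
    training_images = [image_to_identify]  # same object: keep as training image
else:
    training_images = []                   # rejected: adjust generator parameters
```

Rejected candidates are not discarded silently in the full method; as described below, the identification result also drives adjustment of the generation network's parameters.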
Optionally, the image generation module 52 may include an adjustment module configured to adjust the parameters of the generation network according to the identification result output by the discrimination network, after the image to be identified and the reference image corresponding to the feature vector have been input into the discrimination network for identification.
Optionally, the image re-identification network includes a convolutional neural network and a distance metric function; the convolutional neural network includes at least one sequentially connected convolutional layer, a global pooling layer, and a fully connected layer.
Optionally, the network training module 54 may further be configured to: randomly select two training images from the training image set; extract, through the convolutional neural network, the feature vector of each of the two training images; compute the first distance between the feature vectors of the two training images using the distance metric function; check the accuracy of the first distance against the preset labels of the two training images through a loss function, obtaining a loss function value; and, based on the loss function value, train the parameters of the image re-identification network through a back-propagation algorithm.
With the training device for an image re-identification network provided by the embodiment of the present invention, a reference feature set is obtained; training images are generated from the reference feature set through a generative adversarial network; and the training images are added to the training image set, increasing the number of images it contains. The enlarged training image set is then used to train the image re-identification network. The present invention thus increases the number of images through generative adversarial generation, achieving the goal of improving the precision of the image re-identification network.
The device provided by this embodiment has the same technical effects and implementation principles as the preceding method embodiment. For brevity, where this device embodiment is silent, reference may be made to the corresponding content of the preceding method embodiment.
Embodiment four:
An embodiment of the present invention further provides an image re-identification method. As shown in Fig. 6, the method comprises the following steps:
Step S602: obtain a query picture containing a target object and at least one picture to be compared.
The query picture may be captured by an image acquisition device of the electronic device or by another image capture device (such as a camera) connected to the electronic device over a network, or may be provided by a user. The pictures to be compared may be obtained from pictures captured by one or more cameras, or may be extracted from an image library. There may be one picture to be compared or several. Illustratively, the target object contained in the query picture may be a specified pedestrian.
Step S604: determine, through an image re-identification network, whether the picture to be compared contains the target object.
The image re-identification network is obtained by training with the training method documented in embodiment two. The image re-identification network includes a convolutional neural network and a distance metric function. The convolutional neural network includes at least one sequentially connected convolutional layer, a global pooling layer, and a fully connected layer. Each convolutional layer in the convolutional neural network includes one or more convolution kernels for extracting feature information from the pixel matrix of the input picture; the convolution kernels traverse the pixel matrix of the input picture with a certain stride to obtain at least one feature value, and the feature values form a feature map. The feature map is dimension-reduced through the global pooling layer and the fully connected layer to obtain the feature vector of the input picture. The distance metric function may adopt one of a Euclidean distance function, a Manhattan distance function, a cosine-of-included-angle function, a Chebyshev distance function, a Hamming distance function, or a Mahalanobis distance function.
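For reference, the distance metric functions listed above (other than the Hamming distance, which applies to binary codes) can be written out directly for real-valued feature vectors; this is a textbook sketch, not the patent's implementation.

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    return float(np.abs(a - b).sum())

def chebyshev(a, b):
    return float(np.abs(a - b).max())

def cosine_of_angle(a, b):
    # similarity in [-1, 1]; 1 - cosine gives a distance-like score
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mahalanobis(a, b, inv_cov):
    # inv_cov is the inverse covariance matrix; with the identity
    # matrix this reduces to the Euclidean distance
    diff = a - b
    return float(np.sqrt(diff @ inv_cov @ diff))
```

The Mahalanobis variant is notable here because its matrix is a learnable parameter, consistent with the earlier statement that "the matrix of the distance metric function" is trained by back-propagation.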
The query picture and the picture to be compared are each input into the convolutional neural network to obtain the feature vector of the query picture and the feature vector of the picture to be compared. The second distance between the feature vector of the query picture and the feature vector of the picture to be compared is computed using the distance metric function; according to the second distance, it is determined whether the picture to be compared contains the target object.
Illustratively, when there is one picture to be compared: if the second distance between the feature vector of the query picture and the feature vector of the picture to be compared is greater than a set threshold, it can be determined that the picture to be compared does not contain the target object; if the second distance is less than the set threshold, it can be determined that the picture to be compared contains the target object.
When there are several pictures to be compared, the second distance between the feature vector of each picture to be compared and the feature vector of the query picture is computed through the distance metric function, obtaining multiple second distances, one per picture to be compared. The pictures to be compared can then be sorted by second distance in ascending order; the picture with the smallest distance has the highest probability of containing the target object. The pictures to be compared ranked within a set number of positions at the front of the sequence may be output, or the pictures to be compared whose second distance is less than the set threshold may be output. The output pictures are regarded as those with a higher probability of containing the target object; the user can then make a careful final check among them, greatly reducing the user's workload.
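The sort-and-filter step above can be sketched as follows, assuming feature vectors have already been extracted; `rank_pictures` and its parameters are illustrative names, not from the patent.

```python
import numpy as np

def rank_pictures(query_feature, compare_features, top_k=3, threshold=None):
    """Sort the pictures to be compared by their second distance to the
    query feature, smallest first; optionally keep only those whose
    distance falls below the set threshold. Returns the selected picture
    indices and their distances."""
    d = np.linalg.norm(compare_features - query_feature, axis=1)
    order = np.argsort(d)                    # ascending second distance
    if threshold is not None:
        order = order[d[order] < threshold]  # drop over-threshold pictures
    return order[:top_k].tolist(), d[order[:top_k]].tolist()
```

Either output rule from the paragraph above is covered: `top_k` alone gives "a set number of positions at the front of the sequence", while supplying `threshold` gives "second distance less than the set threshold".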
Because the image re-identification network used in this embodiment has been trained with a large number of training images, its precision is higher and its identification accuracy is correspondingly higher.
Embodiment five:
Corresponding to the image re-identification method provided in embodiment four, this embodiment provides an image re-identification device, as shown in Fig. 7, comprising: a picture acquisition module 71 and a re-identification module 72.
The picture acquisition module 71 is configured to obtain a query picture containing a target object and at least one picture to be compared. The re-identification module 72 is configured to determine, through an image re-identification network, whether the picture to be compared contains the target object; the image re-identification network is obtained by training with the training method described in any one of the preceding embodiments.
Optionally, the image re-identification network includes a convolutional neural network and a distance metric function. The re-identification module 72 may further be configured to: extract, through the convolutional neural network, the feature vector of the query picture and the feature vector of the picture to be compared; compute the second distance between the feature vector of the query picture and the feature vector of the picture to be compared using the distance metric function; and determine, according to the second distance, whether the picture to be compared contains the target object.
In addition, an embodiment of the present invention provides an electronic device comprising a memory and a processor, the memory storing a computer program runnable on the processor; when the processor executes the computer program, the steps of the training method for an image re-identification network and/or the image re-identification method provided by the preceding embodiments are implemented.
Further, an embodiment of the present invention also provides a computer-readable storage medium and a computer program product, the computer-readable storage medium storing program code whose instructions can be used to execute the training method for an image re-identification network and/or the image re-identification method described in the preceding embodiments. For the specific implementation, reference may be made to the method embodiments; details are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disk.
Finally, it should be noted that the embodiments described above are merely specific embodiments of the present invention, used to illustrate its technical solution rather than limit it, and the scope of protection of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, or readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations, or replacements do not cause the essence of the corresponding technical solution to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered within the scope of protection of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A training method for an image re-identification network, characterized by comprising:
obtaining a reference feature set, the reference feature set containing at least one feature vector of the same object distributed in a feature space;
generating training images from the reference feature set through a generative adversarial network;
adding the training images to a training image set;
training the image re-identification network with the training image set.
2. The training method according to claim 1, characterized in that the step of obtaining the reference feature set comprises:
obtaining a training image set, the training image set containing at least one reference image of the same object;
extracting, through a convolutional neural network, the feature vector of each reference image in the training image set, the feature vectors lying in the feature space;
selecting, from the feature space, at least one feature vector of the same object to generate the reference feature set.
3. The training method according to claim 2, characterized in that the step of generating training images from the reference feature set through the generative adversarial network comprises:
randomly selecting a feature vector from the reference feature set;
inputting the extracted feature vector into the generative adversarial network to obtain the training image.
4. The training method according to claim 3, characterized in that the step of randomly selecting a feature vector from the reference feature set comprises:
dividing the reference feature set into multiple layers from center to edge according to the spatial positions of the feature vectors of the reference feature set in the feature space, each layer containing at least one feature vector;
randomly selecting feature vectors from each layer in order from edge to center.
5. The training method according to claim 3, characterized in that the generative adversarial network includes a generation network and a discrimination network, and the step of inputting the extracted feature vector into the generative adversarial network to obtain the training image comprises:
inputting the feature vector into the generation network to obtain an image to be identified;
inputting the image to be identified and the reference image corresponding to the feature vector into the discrimination network for identification;
if the identification result output by the discrimination network indicates that the image to be identified and the reference image corresponding to the feature vector contain the same object, taking the image to be identified as a training image.
6. The training method according to claim 5, characterized in that, after the step of inputting the image to be identified and the reference image corresponding to the feature vector into the discrimination network for identification, the method further comprises:
adjusting the parameters of the generation network according to the identification result output by the discrimination network.
7. The training method according to claim 1, characterized in that the image re-identification network includes a convolutional neural network and a distance metric function, and the convolutional neural network includes at least one sequentially connected convolutional layer, a global pooling layer, and a fully connected layer.
8. The training method according to claim 7, characterized in that the step of training the image re-identification network with the training image set comprises:
randomly selecting two training images from the training image set, and extracting, through the convolutional neural network, the feature vector of each of the two training images;
computing a first distance between the feature vectors of the two training images using the distance metric function;
checking the accuracy of the first distance against preset labels of the two training images through a loss function, obtaining a loss function value;
training the parameters of the image re-identification network through a back-propagation algorithm based on the loss function value.
9. A training device for an image re-identification network, characterized by comprising:
a feature set acquisition module, configured to obtain a reference feature set, the reference feature set containing at least one feature vector of the same object distributed in a feature space;
an image generation module, configured to generate training images from the reference feature set through a generative adversarial network;
an image adding module, configured to add the training images to a training image set;
a network training module, configured to train the image re-identification network with the training image set.
10. An image re-identification method, characterized by comprising:
obtaining a query picture containing a target object and at least one picture to be compared;
determining, through an image re-identification network, whether the picture to be compared contains the target object, the image re-identification network being obtained by training with the training method according to any one of claims 1 to 8.
11. The image re-identification method according to claim 10, characterized in that the image re-identification network includes a convolutional neural network and a distance metric function, and the step of determining, through the image re-identification network, whether the picture to be compared contains the target object comprises:
extracting, through the convolutional neural network, the feature vector of the query picture and the feature vector of the picture to be compared;
computing a second distance between the feature vector of the query picture and the feature vector of the picture to be compared using the distance metric function;
determining, according to the second distance, whether the picture to be compared contains the target object.
12. An image re-identification device, characterized by comprising:
a picture acquisition module, configured to obtain a query picture containing a target object and at least one picture to be compared;
a re-identification module, configured to determine, through an image re-identification network, whether the picture to be compared contains the target object, the image re-identification network being obtained by training with the training method according to any one of claims 1 to 8.
13. An electronic device, comprising a memory and a processor, the memory storing a computer program runnable on the processor, characterized in that, when executing the computer program, the processor implements the steps of the method according to any one of claims 1 to 8 and/or claims 10 to 11.
14. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is run by a processor, the steps of the method according to any one of claims 1 to 8 and/or claims 10 to 11 are executed.
CN201810893815.5A 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device Active CN109063776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810893815.5A CN109063776B (en) 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device


Publications (2)

Publication Number Publication Date
CN109063776A true CN109063776A (en) 2018-12-21
CN109063776B CN109063776B (en) 2021-08-10

Family

ID=64678092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810893815.5A Active CN109063776B (en) 2018-08-07 2018-08-07 Image re-recognition network training method and device and image re-recognition method and device

Country Status (1)

Country Link
CN (1) CN109063776B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120141981A1 (en) * 2006-11-21 2012-06-07 Periklis Pantazis Second harmonic imaging nanoprobes and techniques for use thereof
CN107133601A (en) * 2017-05-13 2017-09-05 五邑大学 A kind of pedestrian's recognition methods again that network image super-resolution technique is resisted based on production
CN107292813A (en) * 2017-05-17 2017-10-24 浙江大学 A kind of multi-pose Face generation method based on generation confrontation network
CN107423701A (en) * 2017-07-17 2017-12-01 北京智慧眼科技股份有限公司 The non-supervisory feature learning method and device of face based on production confrontation network
CN108108754A (en) * 2017-12-15 2018-06-01 北京迈格威科技有限公司 The training of identification network, again recognition methods, device and system again
CN108256439A (en) * 2017-12-26 2018-07-06 北京大学 A kind of pedestrian image generation method and system based on cycle production confrontation network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHEDONG ZHENG,AND ETC: "Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro", 《2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 *
杨忠桃等: "DGB卷积神经网络行人重识别", 《中国计量大学学报》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886185A (en) * 2019-02-18 2019-06-14 深圳市商汤科技有限公司 A kind of target identification method, device, electronic equipment and computer storage medium
CN109886185B (en) * 2019-02-18 2023-08-04 深圳市商汤科技有限公司 Target identification method, device, electronic equipment and computer storage medium
CN109948561A (en) * 2019-03-25 2019-06-28 广东石油化工学院 The method and system that unsupervised image/video pedestrian based on migration network identifies again
CN109948561B (en) * 2019-03-25 2019-11-08 广东石油化工学院 The method and system that unsupervised image/video pedestrian based on migration network identifies again
CN110414336A (en) * 2019-06-21 2019-11-05 中国矿业大学 A kind of depth complementation classifier pedestrian's searching method of triple edge center loss
WO2021036059A1 (en) * 2019-08-29 2021-03-04 深圳云天励飞技术有限公司 Image conversion model training method, heterogeneous face recognition method, device and apparatus
CN111898581A (en) * 2020-08-12 2020-11-06 成都佳华物链云科技有限公司 Animal detection method, device, electronic equipment and readable storage medium
CN111898581B (en) * 2020-08-12 2024-05-17 成都佳华物链云科技有限公司 Animal detection method, apparatus, electronic device, and readable storage medium
CN112396005A (en) * 2020-11-23 2021-02-23 平安科技(深圳)有限公司 Biological characteristic image recognition method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN109063776B (en) 2021-08-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant