CN107609645A - Method and apparatus for training convolutional neural networks - Google Patents


Info

Publication number
CN107609645A
CN107609645A (application CN201710859122.XA); granted as CN107609645B
Authority
CN
China
Prior art keywords
input information
layer
convolutional neural
neural networks
characteristic vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710859122.XA
Other languages
Chinese (zh)
Other versions
CN107609645B (en)
Inventor
刘文献
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201710859122.XA priority Critical patent/CN107609645B/en
Publication of CN107609645A publication Critical patent/CN107609645A/en
Application granted granted Critical
Publication of CN107609645B publication Critical patent/CN107609645B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose a method and apparatus for training a convolutional neural network. One embodiment of the method includes: for each layer of an initialized convolutional neural network, storing the layer's set of input information on at least one graphics card; computing the mean and variance of the part of the layer's input information set stored on each of the at least one graphics card; sending each graphics card's mean and variance to the other graphics cards, so as to compute the mean and variance of the layer's entire input information set; normalizing the layer's input information set using that mean and variance to obtain the layer's normalized input information set; and training the initialized convolutional neural network with each layer's normalized input information set to obtain a trained convolutional neural network. This embodiment improves the stability of the convolutional neural network.

Description

Method and apparatus for training convolutional neural networks
Technical field
The present application relates to the field of computer technology, specifically to the field of Internet technology, and more particularly to a method and apparatus for training convolutional neural networks.
Background technology
A convolutional neural network (Convolutional Neural Network, CNN) is a kind of feedforward neural network whose artificial neurons can respond to surrounding units within a limited receptive field; such networks perform outstandingly in large-scale image processing.
However, as the depth of convolutional neural networks keeps increasing, the distributions of the input information sets of the individual layers used for training diverge, which makes the trained convolutional neural network unstable. How to improve the stability of convolutional neural networks has therefore become a problem in urgent need of a solution.
Summary of the invention
The purpose of the embodiments of the present application is to propose an improved method and apparatus for training convolutional neural networks, so as to solve the technical problem mentioned in the Background section above.
In a first aspect, an embodiment of the present application provides a method for training convolutional neural networks, the method including: for each layer of an initialized convolutional neural network, storing the layer's input information set on at least one graphics card, where each of the at least one graphics card stores at least part of the layer's input information set; computing the mean and variance of the part of the layer's input information set stored on each graphics card; sending each graphics card's mean and variance to the other graphics cards, so as to compute the mean and variance of the layer's entire input information set; normalizing the layer's input information set using that mean and variance to obtain the layer's normalized input information set; and training the initialized convolutional neural network with each layer's normalized input information set to obtain a trained convolutional neural network.
In some embodiments, training the initialized convolutional neural network with each layer's normalized input information set to obtain a trained convolutional neural network includes performing the following training step: input each layer's normalized input information set into the corresponding layer of the initialized convolutional neural network to obtain a set of feature vectors; determine whether the set of feature vectors satisfies a preset condition; and, if it does, take the initialized convolutional neural network as the trained convolutional neural network. In response to the preset condition not being satisfied, adjust the parameters of the initialized convolutional neural network and continue performing the training step.
In some embodiments, the input information set includes multiple pieces of input information of the same category, and determining whether the set of feature vectors satisfies the preset condition includes: computing the distance between each pair of feature vectors among the feature vectors corresponding to the same-category input information to obtain a first computation result; and determining, based on the first computation result, whether the preset condition is satisfied.
In some embodiments, computing the distance between each pair of feature vectors corresponding to the same-category input information includes computing the Euclidean distance between each pair of those feature vectors to obtain the first computation result.
In some embodiments, determining based on the first computation result whether the preset condition is satisfied includes: determining whether every Euclidean distance between the feature vectors corresponding to the same-category input information is smaller than a first preset distance threshold; if every such distance is smaller than the first preset distance threshold, the preset condition is satisfied; otherwise, it is not satisfied.
In some embodiments, the input information set includes multiple pieces of input information of different categories, and determining whether the set of feature vectors satisfies the preset condition includes: computing the distance between each pair of feature vectors among the feature vectors corresponding to the different-category input information to obtain a second computation result; and determining, based on the second computation result, whether the preset condition is satisfied.
In some embodiments, computing the distance between each pair of feature vectors corresponding to the different-category input information includes computing the Euclidean distance between each pair of those feature vectors to obtain the second computation result.
In some embodiments, determining based on the second computation result whether the preset condition is satisfied includes: determining whether every Euclidean distance between the feature vectors corresponding to the different-category input information is greater than a second preset distance threshold; if every such distance is greater than the second preset distance threshold, the preset condition is satisfied; otherwise, it is not satisfied.
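The two preset-condition checks above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the pairwise iteration via `itertools.combinations`, and the use of NumPy are all assumptions.

```python
import itertools

import numpy as np


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.linalg.norm(a - b))


def same_category_condition(vectors, first_threshold):
    """First preset condition: every pairwise Euclidean distance between
    feature vectors of same-category inputs is below the first threshold."""
    return all(euclidean(a, b) < first_threshold
               for a, b in itertools.combinations(vectors, 2))


def different_category_condition(vectors, second_threshold):
    """Second preset condition: every pairwise Euclidean distance between
    feature vectors of different-category inputs exceeds the second threshold."""
    return all(euclidean(a, b) > second_threshold
               for a, b in itertools.combinations(vectors, 2))
```

Training would repeat the training step, adjusting parameters, until both conditions hold for the output feature-vector set.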
In some embodiments, the method further includes: acquiring first input information and second input information; inputting the first input information and the second input information into the trained convolutional neural network to obtain a feature vector of the first input information and a feature vector of the second input information; computing the distance between the two feature vectors; determining, based on the computed distance, whether the first input information and the second input information belong to the same category; and outputting the determination result.
In a second aspect, an embodiment of the present application provides an apparatus for training convolutional neural networks, the apparatus including: a normalization unit configured to, for each layer of an initialized convolutional neural network, store the layer's input information set on at least one graphics card, where each of the at least one graphics card stores at least part of the layer's input information set; compute the mean and variance of the part of the layer's input information set stored on each graphics card; send each graphics card's mean and variance to the other graphics cards, so as to compute the mean and variance of the layer's entire input information set; and normalize the layer's input information set using that mean and variance to obtain the layer's normalized input information set; and a training unit configured to train the initialized convolutional neural network with each layer's normalized input information set to obtain a trained convolutional neural network.
In some embodiments, the training unit includes: a training subunit configured to perform the following training step: input each layer's normalized input information set into the corresponding layer of the initialized convolutional neural network to obtain a set of feature vectors; determine whether the set of feature vectors satisfies a preset condition; and, if it does, take the initialized convolutional neural network as the trained convolutional neural network; and an adjustment subunit configured to, in response to the preset condition not being satisfied, adjust the parameters of the initialized convolutional neural network and continue performing the training step.
In some embodiments, the input information set includes multiple pieces of input information of the same category, and the training subunit includes: a first computation module configured to compute the distance between each pair of feature vectors corresponding to the same-category input information to obtain a first computation result; and a first determination module configured to determine, based on the first computation result, whether the preset condition is satisfied.
In some embodiments, the first computation module is further configured to compute the Euclidean distance between each pair of feature vectors corresponding to the same-category input information to obtain the first computation result.
In some embodiments, the first determination module is further configured to determine whether every Euclidean distance between the feature vectors corresponding to the same-category input information is smaller than a first preset distance threshold; if every such distance is smaller than the first preset distance threshold, the preset condition is satisfied; otherwise, it is not satisfied.
In some embodiments, the input information set includes multiple pieces of input information of different categories, and the training subunit includes: a second computation module configured to compute the distance between each pair of feature vectors corresponding to the different-category input information to obtain a second computation result; and a second determination module configured to determine, based on the second computation result, whether the preset condition is satisfied.
In some embodiments, the second computation module is further configured to compute the Euclidean distance between each pair of feature vectors corresponding to the different-category input information to obtain the second computation result.
In some embodiments, the second determination module is further configured to determine whether every Euclidean distance between the feature vectors corresponding to the different-category input information is greater than a second preset distance threshold; if every such distance is greater than the second preset distance threshold, the preset condition is satisfied; otherwise, it is not satisfied.
In some embodiments, the apparatus further includes: an acquisition unit configured to acquire first input information and second input information; an input unit configured to input the first input information and the second input information into the trained convolutional neural network to obtain a feature vector of the first input information and a feature vector of the second input information; a computation unit configured to compute the distance between the two feature vectors; and a determination unit configured to determine, based on the computed distance, whether the first input information and the second input information belong to the same category, and to output the determination result.
In a third aspect, an embodiment of the present application provides a server including: one or more processors; and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the method described in any implementation of the first aspect.
In the method and apparatus for training convolutional neural networks provided by the embodiments of the present application, for each layer of the initialized convolutional neural network, the mean and variance of the part of the layer's input information set stored on each of the at least one graphics card are computed, and each card's mean and variance are sent to the other cards so that the mean and variance of the layer's entire input information set can be computed; the layer's input information set is then normalized with that mean and variance to obtain the layer's normalized input information set; finally, the initialized convolutional neural network is trained with each layer's normalized input information set to obtain a trained convolutional neural network. Normalizing each layer's input information set with its own mean and variance makes the distributions of the normalized input information sets of all layers identical, which improves the stability of the normalized input information sets and, in turn, the stability of the trained convolutional neural network.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the embodiments of the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for training convolutional neural networks according to the present application;
Fig. 3 is a decomposed flowchart of the step, in the flowchart of Fig. 2, of training the initialized convolutional neural network with each layer's normalized input information set;
Fig. 4 is a schematic structural diagram of one embodiment of the apparatus for training convolutional neural networks according to the present application;
Fig. 5 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.
Detailed description of embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the relevant invention and not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.
It should be noted that, where there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the method for training convolutional neural networks or the apparatus for training convolutional neural networks of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, and 103 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The terminal devices 101, 102, and 103 interact with the server 105 via the network 104 to receive or send messages. The terminal devices 101, 102, and 103 may be various electronic devices, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 105 may provide various services. For example, the server 105 may acquire input information sets from the terminal devices 101, 102, and 103 via the network 104 so as to train the initialized convolutional neural network and obtain a trained convolutional neural network.
It should be noted that the method for training convolutional neural networks provided by the embodiments of the present application is generally performed by the server 105, and accordingly the apparatus for training convolutional neural networks is generally disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative; any number of terminal devices, networks, and servers may be provided as needed. When the input information sets are already stored on the server 105, the system architecture 100 may omit the terminal devices 101, 102, and 103.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for training convolutional neural networks according to the present application is shown. The method for training convolutional neural networks comprises the following steps:
Step 201: for each layer of the initialized convolutional neural network, store the layer's input information set on at least one graphics card.
In this embodiment, the electronic device on which the method for training convolutional neural networks runs (e.g., the server 105 shown in Fig. 1) may store each layer's input information set on at least one graphics card. As the depth of convolutional neural networks keeps increasing, the volume of data in the input information sets used for training keeps growing, so the electronic device is usually equipped with at least one graphics card to store each layer's input information set. Each of the at least one graphics card may store at least part of the layer's input information set. Specifically, for each layer of the initialized convolutional neural network, the layer's input information set may be divided into several parts, with one graphics card storing one part. As an example, if the input information set of the initialized convolutional neural network contains m × n pieces of input information in total, the set may be divided into n subsets of m pieces each, with each of n graphics cards storing one subset, where m and n are positive integers.
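The m × n split in the example above can be sketched as follows. The function name and the even-divisibility requirement are illustrative assumptions; in practice each subset would reside on a different graphics card rather than in one list.

```python
def partition_inputs(inputs, n_cards):
    """Divide a layer's input information set of m * n items into n subsets
    of m items each, one subset per graphics card. Requires the item count
    to divide evenly across the cards."""
    m, remainder = divmod(len(inputs), n_cards)
    if remainder:
        raise ValueError("number of inputs must be divisible by the number of cards")
    return [inputs[i * m:(i + 1) * m] for i in range(n_cards)]
```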
In this embodiment, the electronic device on which the method for training convolutional neural networks runs may be a single server with at least one graphics card, or a server cluster in which every server has at least one graphics card. As an example, the electronic device may be a server cluster with four graphics cards in each server.
In this embodiment, a convolutional neural network may be a feedforward neural network whose artificial neurons can respond to surrounding units within a limited receptive field, performing outstandingly in large-scale image processing. Generally, the basic structure of a convolutional neural network includes two kinds of layers. The first is the feature-extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the local feature is extracted; once the local feature is extracted, its positional relationship to other features is also determined. The second is the feature-mapping layer: each computational layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons in a plane share equal weights. As one example, the convolutional neural network may be AlexNet. AlexNet is an existing convolutional neural network architecture; in the 2012 competition of ImageNet (a computer-vision recognition project and currently the world's largest image-recognition database), the architecture used by Geoffrey Hinton and his student Alex Krizhevsky came to be known as AlexNet. Generally, AlexNet has 8 layers, of which the first 5 are convolutional layers and the last 3 are fully connected layers. As another example, the convolutional neural network may be GoogleNet. GoogleNet is also an existing convolutional neural network architecture and was the champion model of the 2014 ImageNet competition; its basic components are similar to AlexNet's, and it is a 22-layer model.
Step 202: compute the mean and variance of the part of the layer's input information set stored on each of the at least one graphics card.
In this embodiment, the electronic device may separately compute, for each of the at least one graphics card, the mean and variance of the part of the layer's input information set stored on that card.
Step 203: send the mean and variance of the part of the layer's input information set stored on each graphics card to the other graphics cards, so as to compute the mean and variance of the layer's entire input information set.
In this embodiment, the electronic device may send each graphics card's mean and variance of the part of the layer's input information set it stores to the other graphics cards, so that every card holds the means and variances computed on all cards; the mean and variance of the layer's entire input information set are then computed from the per-card statistics. As an example, the electronic device may compute the mean of the per-card means as the mean of the layer's input information set, and may also compute the variance of the per-card means as the variance of the layer's input information set.
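The mean-of-means example above can be sketched as follows. This follows only the scheme the text describes: in a real multi-GPU setup the exchange would be an all-gather across the cards, and production synchronized batch normalization typically aggregates per-sample sums and sums of squares rather than per-card means, so the result here is an approximation specific to this example.

```python
import numpy as np


def aggregate_layer_stats(card_means):
    """Combine the per-card means (after every card has received the
    others' statistics) into layer-wide statistics, per the example in
    the text: the layer mean is the mean of the per-card means, and the
    layer variance is the variance of the per-card means."""
    card_means = np.asarray(card_means, dtype=np.float64)
    layer_mean = card_means.mean(axis=0)
    layer_var = card_means.var(axis=0)
    return layer_mean, layer_var
```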
Step 204: normalize the layer's input information set using the mean and variance of the layer's input information set to obtain the layer's normalized input information set.
In this embodiment, the electronic device may normalize the layer's input information set using the mean and variance of the layer's input information set, thereby obtaining the layer's normalized input information set.
In this embodiment, during the training of the convolutional neural network, the initialization parameters of every layer of the initialized convolutional neural network are adjusted continuously, which causes the distribution of each subsequent layer's input information set to change as well, while the training process requires every layer's input information set to keep the same distribution; the electronic device therefore needs to normalize every layer's input information set. Here, a BN (Batch Normalization) layer may be inserted after every layer of the initialized convolutional neural network, which uses the mean and variance of the layer's input information set to normalize it to mean 0 and variance 1.
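The mean-0, variance-1 normalization performed by the inserted BN layer can be sketched as follows. Standard batch normalization also applies a learnable scale and shift, which the text does not describe, so they are omitted here; the epsilon term is the usual numerical-stability assumption.

```python
import numpy as np


def batch_normalize(x, eps=1e-5):
    """Normalize a layer's batch of inputs to mean 0 and variance 1
    using the batch's own per-feature mean and variance."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
```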
Step 205: train the initialized convolutional neural network with each layer's normalized input information set to obtain a trained convolutional neural network.
In this embodiment, the electronic device may train the initialized convolutional neural network with each layer's normalized input information set, thereby obtaining a trained convolutional neural network.
In this embodiment, the trained convolutional neural network can be used to characterize the correspondence between input information and the feature vector of that input information. The electronic device may train the convolutional neural network in a number of ways.
As one example, the electronic device may feed the input information set in at the input side of the initialized convolutional neural network, pass it in turn through each layer of the initialized convolutional neural network and the BN layer following each layer, and output it at the output side of the initialized convolutional neural network. Here, the electronic device may use the BN layer after each layer to normalize the next layer's input information set, and process each layer's normalized input information set with the layer's parameter matrix (e.g., by product or convolution). The initialized convolutional neural network stores initialization parameters, which are adjusted continuously during training until the set of feature vectors output by the network under training satisfies a preset constraint.
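The forward pass described above can be sketched as follows, with a plain matrix product standing in for each layer's parameter-matrix operation and a BN step normalizing every layer's input. This is a simplification under those stated assumptions, not the patent's actual architecture.

```python
import numpy as np


def bn(x, eps=1e-5):
    """The BN step after each layer: normalize to mean 0, variance 1."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


def forward(inputs, parameter_matrices):
    """Feed the input set in at the input side, process it with each
    layer's parameter matrix, and normalize each layer's output before
    the next layer, yielding feature vectors at the output side."""
    x = np.asarray(inputs, dtype=np.float64)
    for w in parameter_matrices:
        x = bn(x @ w)
    return x


rng = np.random.default_rng(0)
features = forward(rng.normal(size=(8, 4)),
                   [rng.normal(size=(4, 4)), rng.normal(size=(4, 2))])
```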
As another example, the electronic device may, based on statistics over a large number of normalized input information items and their feature vectors, generate a correspondence table storing the correspondences between multiple pieces of input information and their feature vectors, and use that correspondence table as the trained convolutional neural network.
In the present embodiment, the convolutional neural networks of completion are trained to can apply in several scenes.Alternatively, electronics is set It is standby to obtain the first input information and the second input information first;Afterwards by the first input information and the second input information input The convolutional neural networks completed to training, obtain the characteristic vector of the first input information and the characteristic vector of the second input information; Then the distance between the characteristic vector of the first input information and the characteristic vector of the second input information are calculated;It is finally based on and is counted The distance of calculation, determine whether the first input information and the second input information are same category of input information, and export determination knot Fruit.
As an example, the electronic device may calculate the Euclidean distance between the feature vector of the first input information and the feature vector of the second input information. The Euclidean distance, also called the Euclidean metric, usually refers to the actual distance between two points in m-dimensional space, or the natural length of a vector (i.e., the distance from the point to the origin). In two- and three-dimensional space, the Euclidean distance is simply the actual distance between the two points. Generally, the smaller the Euclidean distance between two vectors, the more likely the input information items corresponding to the two vectors belong to the same category; the larger the Euclidean distance between two vectors, the less likely the corresponding input information items belong to the same category.
As another example, the electronic device may calculate the cosine distance between the feature vector of the first input information and the feature vector of the second input information. The cosine distance, also called cosine similarity, assesses the similarity of two vectors by calculating the cosine of the angle between them. Generally, the smaller the angle between two vectors, the closer the cosine is to 1 and the higher the similarity, so the more likely the corresponding input information items belong to the same category; the larger the angle, the further the cosine deviates from 1 and the lower the similarity, so the less likely the corresponding input information items belong to the same category.
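The two metrics can be sketched concretely as follows; the function names and example vectors are illustrative, not part of the application.

```python
import numpy as np

def euclidean_distance(u, v):
    """Actual distance between two points in m-dimensional space."""
    return float(np.linalg.norm(u - v))

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; close to 1 means similar."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Feature vectors of a hypothetical first and second input information
a = np.array([1.0, 0.0, 1.0])
b = np.array([0.9, 0.1, 1.1])   # near a: likely the same category
c = np.array([-1.0, 2.0, 0.0])  # far from a: likely a different category

print(euclidean_distance(a, b) < euclidean_distance(a, c))  # True
print(cosine_similarity(a, b) > cosine_similarity(a, c))    # True
```

A small Euclidean distance and a cosine near 1 both point to the same conclusion here: the inputs behind `a` and `b` are more likely to share a category than those behind `a` and `c`.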
In the method for training a convolutional neural network provided by the embodiment of the present application, for each layer of the initialized convolutional neural network, the mean and variance of the partial input information set of the layer stored on each of at least one GPU are calculated, and the mean and variance of the partial input information set stored on each GPU are sent to the other GPUs, so that the mean and variance of the input information set of the whole layer can be calculated; the input information set of the layer is then normalized using this mean and variance, to obtain the normalized input information set of the layer; finally, the initialized convolutional neural network is trained using the normalized input information set of each layer, to obtain a trained convolutional neural network. Normalizing each layer's input information set with the mean and variance of the layer's entire input information set improves the stability of the normalized input information sets of the layers, and thereby improves the stability of the trained convolutional neural network.
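The multi-GPU statistics exchange at the heart of the method can be sketched in NumPy. The sketch simulates each GPU's shard as an array slice; `combine_shard_stats` and the shard layout are assumptions for illustration. The weighted combination uses the standard law-of-total-variance identity, so only per-shard means, variances and sample counts — not the raw inputs — would need to be exchanged between GPUs.

```python
import numpy as np

def combine_shard_stats(shards):
    """Combine per-GPU (per-shard) means and variances into the mean and
    variance of the whole layer's input information set, without access
    to the other shards' raw data."""
    counts = np.array([s.shape[0] for s in shards], dtype=float)
    means = np.array([s.mean(axis=0) for s in shards])     # per-shard means
    variances = np.array([s.var(axis=0) for s in shards])  # per-shard (population) variances
    n = counts.sum()
    global_mean = (counts[:, None] * means).sum(axis=0) / n
    # law of total variance: within-shard variance + variance of shard means
    global_var = (counts[:, None] * (variances + (means - global_mean) ** 2)).sum(axis=0) / n
    return global_mean, global_var

def batch_norm(x, mean, var, eps=1e-5):
    """Normalize a layer's inputs with the combined statistics."""
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
full = rng.normal(3.0, 2.0, size=(64, 8))   # the layer's full input information set
shards = np.split(full, 4)                  # 4 simulated GPUs, 16 samples each
mean, var = combine_shard_stats(shards)
assert np.allclose(mean, full.mean(axis=0)) # identical to single-device statistics
assert np.allclose(var, full.var(axis=0))
normalized = batch_norm(full, mean, var)    # ~zero mean, ~unit variance per feature
```

Because the combined statistics equal those computed on the undivided set, normalization behaves as if the whole batch fit on one device, which is what stabilizes training when the batch is split across GPUs.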
In an optional way of training the convolutional neural network, the step in the flow chart of Fig. 2 of training the initialized convolutional neural network using the normalized input information set of each layer may be decomposed into several sub-steps. Referring specifically to Fig. 3, it illustrates a decomposition flow 300 of the step in the flow chart of Fig. 2 of training the initialized convolutional neural network using the normalized input information set of each layer. In Fig. 3, this step is decomposed into the following four sub-steps: step 301, step 302, step 303 and step 304.
Step 301: input the normalized input information set of each layer into the corresponding layer of the initialized convolutional neural network, to obtain a feature vector set.
In the present embodiment, the electronic device may input the normalized input information set of each layer into the initialized convolutional neural network, thereby obtaining a feature vector set. Specifically, the electronic device may feed the input information set in at the input side of the initialized convolutional neural network, pass it successively through each layer of the initialized convolutional neural network and the BN layer following each layer, and output it at the output side of the initialized convolutional neural network. Here, the electronic device may use the BN layer following each layer to normalize the input information set of the next layer, and may process each layer's normalized input information set with that layer's parameter matrix (e.g., by matrix product or convolution).
Step 302: determine whether the feature vector set satisfies a preset condition.
In the present embodiment, based on the feature vector set obtained in step 301, the electronic device may determine whether the feature vector set satisfies a preset condition; if the preset condition is satisfied, step 303 is performed, and if it is not satisfied, step 304 is performed. Specifically, the electronic device may first obtain certain regularities exhibited by the feature vector set, and then determine whether the obtained regularities conform to preset rules; if they conform to the preset rules, the preset condition is satisfied; if they do not conform to the preset rules, the preset condition is not satisfied.
In some optional implementations of the present embodiment, the electronic device may determine whether the feature vector set satisfies the preset condition in at least one of the following ways:
1. First, calculate the distance between each pair of feature vectors among the multiple feature vectors corresponding to multiple input information items of the same category, to obtain a first calculation result. Here, the input information set may include multiple input information items of the same category.
As an example, the electronic device may calculate the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, to obtain the first calculation result.
As another example, the electronic device may calculate the cosine distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, to obtain the first calculation result.
Then, based on the first calculation result, determine whether the preset condition is satisfied.
As an example, the electronic device may determine whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple same-category input information items are all smaller than a first preset distance threshold; if they are all smaller than the first preset distance threshold, the preset condition is satisfied; if they are not all smaller than the first preset distance threshold, the preset condition is not satisfied.
As another example, the electronic device may compare the cosine distances between all pairs of feature vectors corresponding to the multiple same-category input information items with 1; if they are close to 1, the preset condition is satisfied; if they deviate from 1, the preset condition is not satisfied.
2. First, calculate the distance between each pair of feature vectors among the multiple feature vectors corresponding to multiple input information items of different categories, to obtain a second calculation result. Here, the input information set may include multiple input information items of different categories.
As an example, the electronic device may calculate the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, to obtain the second calculation result.
As another example, the electronic device may calculate the cosine distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, to obtain the second calculation result.
Then, based on the second calculation result, determine whether the preset condition is satisfied.
As an example, the electronic device may determine whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple different-category input information items are all larger than a second preset distance threshold; if they are all larger than the second preset distance threshold, the preset condition is satisfied; if they are not all larger than the second preset distance threshold, the preset condition is not satisfied.
As another example, the electronic device may compare the cosine distances between all pairs of feature vectors corresponding to the multiple different-category input information items with 1; if they deviate from 1, the preset condition is satisfied; if they are close to 1, the preset condition is not satisfied.
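The two checks of step 302 can be sketched together using Euclidean distances; the thresholds `t1`/`t2`, the function name, and the toy clusters are illustrative assumptions, not values from the application.

```python
import numpy as np
from itertools import combinations, product

def condition_met(class_a, class_b, t1=0.5, t2=2.0):
    """Sketch of the preset condition: every distance between feature
    vectors of same-category inputs must fall below a first preset
    distance threshold t1, and every distance between feature vectors
    of different-category inputs must exceed a second preset distance
    threshold t2."""
    within = [np.linalg.norm(u - v)
              for vecs in (class_a, class_b)
              for u, v in combinations(vecs, 2)]
    across = [np.linalg.norm(u - v) for u, v in product(class_a, class_b)]
    return all(d < t1 for d in within) and all(d > t2 for d in across)

cats = [np.array([0.0, 0.1]), np.array([0.1, 0.0])]   # one tight category cluster
dogs = [np.array([5.0, 5.0]), np.array([5.1, 4.9])]   # another, well separated
bad = [np.array([0.2, 0.2]), np.array([0.3, 0.2])]    # tight but too close to cats

print(condition_met(cats, dogs))  # True: clusters tight and well separated
print(condition_met(cats, bad))   # False: cross-category distances too small
```

When the check returns `False`, training would proceed to step 304 and adjust the network's parameters before trying again.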
Step 303: take the initialized convolutional neural network as the trained convolutional neural network.
In the present embodiment, if the preset condition is satisfied, the training of the convolutional neural network is complete, and the electronic device may take the initialized convolutional neural network as the trained convolutional neural network. The trained convolutional neural network makes the distances between the feature vectors of same-category input information as small as possible, and makes the distances between the feature vectors of different-category input information as large as possible.
Step 304: adjust the parameters of the initialized convolutional neural network.
In the present embodiment, if the preset condition is not satisfied, the electronic device may adjust the parameters of the initialized convolutional neural network and return to step 301, until a convolutional neural network satisfying the preset condition has been trained.
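Sub-steps 301-304 form a loop that can be sketched as follows. The toy forward pass, the preset condition, and the sign-based parameter "adjustment" (standing in for gradient-based updates) are all placeholders under stated assumptions.

```python
import numpy as np

def train(inputs, params, forward, meets_condition, lr=0.1, max_iters=100):
    """Step 301: compute the feature vector set; step 302: test the
    preset condition; step 303: if it holds, the current parameters are
    the trained network; step 304: otherwise adjust the parameters and
    repeat from step 301."""
    for _ in range(max_iters):
        features = forward(inputs, params)        # step 301
        if meets_condition(features):             # step 302
            return params                         # step 303: training complete
        params = params - lr * np.sign(params)    # step 304: placeholder adjustment
    return params

rng = np.random.default_rng(2)
inputs = rng.normal(size=(8, 4))
weights = np.full(4, 3.0)
forward = lambda x, w: x * w                      # toy "network"
meets = lambda f: float(np.abs(f).mean()) < 1.0   # toy preset condition
trained = train(inputs, weights, forward, meets)
print(meets(forward(inputs, trained)))            # True once the condition is met
```

Shrinking the weights step by step eventually satisfies the toy condition; in the actual method, the adjustment would instead be driven by a loss over the feature vector distances.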
With further reference to Fig. 4, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for training a convolutional neural network. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied in various electronic devices.
As shown in Fig. 4, the apparatus 400 for training a convolutional neural network of the present embodiment may include a normalization unit 401 and a training unit 402. The normalization unit 401 is configured, for each layer of the initialized convolutional neural network, to store the input information set of the layer on at least one GPU, wherein each of the at least one GPU stores a partial input information set of the layer; to calculate the mean and variance of the partial input information set of the layer stored on each of the at least one GPU; to send the mean and variance of the partial input information set stored on each GPU to the other GPUs, so as to calculate the mean and variance of the input information set of the layer; and to normalize the input information set of the layer using the mean and variance of the layer's input information set, obtaining the normalized input information set of the layer. The training unit 402 is configured to train the initialized convolutional neural network using the normalized input information set of each layer, obtaining a trained convolutional neural network.
In the apparatus 400 for training a convolutional neural network of the present embodiment, the specific processing of the normalization unit 401 and the training unit 402 and the technical effects thereof may refer to the related descriptions of steps 201-204 and step 205 in the embodiment corresponding to Fig. 2, respectively, and will not be repeated here.
In some optional implementations of the present embodiment, the training unit 402 may include: a training subunit (not shown), configured to perform the following training step: input the normalized input information set of each layer into the corresponding layer of the initialized convolutional neural network to obtain a feature vector set, and determine whether the feature vector set satisfies a preset condition; if the preset condition is satisfied, take the initialized convolutional neural network as the trained convolutional neural network; and an adjustment subunit (not shown), configured, in response to the preset condition not being satisfied, to adjust the parameters of the initialized convolutional neural network and continue to perform the training step.
In some optional implementations of the present embodiment, the input information set may include multiple input information items of the same category; and the training subunit may include: a first calculation module (not shown), configured to calculate the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, obtaining a first calculation result; and a first determination module (not shown), configured to determine, based on the first calculation result, whether the preset condition is satisfied.
In some optional implementations of the present embodiment, the first calculation module may be further configured to: calculate the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, obtaining the first calculation result.
In some optional implementations of the present embodiment, the first determination module may be further configured to: determine whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple same-category input information items are all smaller than a first preset distance threshold; if they are all smaller than the first preset distance threshold, the preset condition is satisfied; if they are not all smaller than the first preset distance threshold, the preset condition is not satisfied.
In some optional implementations of the present embodiment, the input information set may include multiple input information items of different categories; and the training subunit may include: a second calculation module (not shown), configured to calculate the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, obtaining a second calculation result; and a second determination module (not shown), configured to determine, based on the second calculation result, whether the preset condition is satisfied.
In some optional implementations of the present embodiment, the second calculation module may be further configured to: calculate the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, obtaining the second calculation result.
In some optional implementations of the present embodiment, the second determination module may be further configured to: determine whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple different-category input information items are all larger than a second preset distance threshold; if they are all larger than the second preset distance threshold, the preset condition is satisfied; if they are not all larger than the second preset distance threshold, the preset condition is not satisfied.
In some optional implementations of the present embodiment, the apparatus 400 for training a convolutional neural network may further include: an acquisition unit (not shown), configured to obtain first input information and second input information; an input unit (not shown), configured to input the first input information and the second input information into the trained convolutional neural network, obtaining a feature vector of the first input information and a feature vector of the second input information; a calculation unit (not shown), configured to calculate the distance between the feature vector of the first input information and the feature vector of the second input information; and a determination unit (not shown), configured to determine, based on the calculated distance, whether the first input information and the second input information are input information of the same category, and output the determination result.
Referring now to Fig. 5, it illustrates a schematic structural diagram of a computer system 500 suitable for implementing the server of the embodiments of the present application. The server shown in Fig. 5 is only an example, and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage portion 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, the ROM 502 and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, etc.; an output portion 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage portion 508 including a hard disk, etc.; and a communication portion 509 including a network interface card such as a LAN card, a modem, etc. The communication portion 509 performs communication processing via a network such as the Internet. A driver 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the driver 510 as needed, so that a computer program read therefrom is installed into the storage portion 508 as needed.
In particular, according to an embodiment of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-mentioned functions defined in the method of the present application are performed.
It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to: an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program, where the program may be used by, or in combination with, an instruction execution system, apparatus or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate or transmit a program used by, or in combination with, an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any appropriate combination of the above.
The flow charts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of the systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flow chart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing a specified logic function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the accompanying drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the opposite order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flow charts, and combinations of blocks in the block diagrams and/or flow charts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by means of software or by means of hardware. The described units may also be provided in a processor; for example, a processor may be described as including a normalization unit and a training unit. The names of these units do not, in some cases, constitute limitations on the units themselves; for example, the training unit may also be described as "a unit that trains the initialized convolutional neural network using the normalized input information set of each layer, obtaining a trained convolutional neural network".
As another aspect, the present application further provides a computer-readable medium, which may be included in the server described in the above embodiments, or may exist separately without being assembled into the server. The computer-readable medium carries one or more programs which, when executed by the server, cause the server to: for each layer of an initialized convolutional neural network, store the input information set of the layer on at least one GPU, wherein each of the at least one GPU stores a partial input information set of the layer; calculate the mean and variance of the partial input information set of the layer stored on each of the at least one GPU; send the mean and variance of the partial input information set stored on each GPU to the other GPUs, so as to calculate the mean and variance of the input information set of the layer; normalize the input information set of the layer using the mean and variance of the layer's input information set, obtaining the normalized input information set of the layer; and train the initialized convolutional neural network using the normalized input information set of each layer, obtaining a trained convolutional neural network.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (16)

  1. A method for training a convolutional neural network, characterized in that the method comprises:
    for each layer of an initialized convolutional neural network, storing the input information set of the layer on at least one GPU, wherein each of the at least one GPU stores a partial input information set of the layer; calculating the mean and variance of the partial input information set of the layer stored on each of the at least one GPU; sending the mean and variance of the partial input information set of the layer stored on each of the at least one GPU to the other GPUs, so as to calculate the mean and variance of the input information set of the layer; and normalizing the input information set of the layer using the mean and variance of the layer's input information set, obtaining the normalized input information set of the layer;
    training the initialized convolutional neural network using the normalized input information set of each layer, obtaining a trained convolutional neural network.
  2. The method according to claim 1, characterized in that training the initialized convolutional neural network using the normalized input information set of each layer, obtaining a trained convolutional neural network, comprises:
    performing the following training step: inputting the normalized input information set of each layer into the corresponding layer of the initialized convolutional neural network to obtain a feature vector set, and determining whether the feature vector set satisfies a preset condition; if the preset condition is satisfied, taking the initialized convolutional neural network as the trained convolutional neural network;
    in response to the preset condition not being satisfied, adjusting the parameters of the initialized convolutional neural network, and continuing to perform the training step.
  3. The method according to claim 2, characterized in that the input information set includes multiple input information items of the same category; and
    determining whether the feature vector set satisfies the preset condition comprises:
    calculating the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, obtaining a first calculation result;
    determining, based on the first calculation result, whether the preset condition is satisfied.
  4. The method according to claim 3, characterized in that calculating the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, obtaining the first calculation result, comprises:
    calculating the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple same-category input information items, obtaining the first calculation result.
  5. The method according to claim 4, characterized in that determining, based on the first calculation result, whether the preset condition is satisfied comprises:
    determining whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple same-category input information items are all smaller than a first preset distance threshold;
    if they are all smaller than the first preset distance threshold, the preset condition is satisfied;
    if they are not all smaller than the first preset distance threshold, the preset condition is not satisfied.
  6. The method according to any one of claims 2-5, characterized in that the input information set includes multiple input information items of different categories; and
    determining whether the feature vector set satisfies the preset condition comprises:
    calculating the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, obtaining a second calculation result;
    determining, based on the second calculation result, whether the preset condition is satisfied.
  7. The method according to claim 6, characterized in that calculating the distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, obtaining the second calculation result, comprises:
    calculating the Euclidean distance between each pair of feature vectors among the multiple feature vectors corresponding to the multiple different-category input information items, obtaining the second calculation result.
  8. The method according to claim 7, characterized in that determining, based on the second calculation result, whether the preset condition is satisfied comprises:
    determining whether the Euclidean distances between all pairs of feature vectors corresponding to the multiple different-category input information items are all larger than a second preset distance threshold;
    if they are all larger than the second preset distance threshold, the preset condition is satisfied;
    if they are not all larger than the second preset distance threshold, the preset condition is not satisfied.
  9. The method according to claim 1, characterized in that the method further includes:
    Acquiring first input information and second input information;
    Inputting the first input information and the second input information into the trained convolutional neural network, to obtain a feature vector of the first input information and a feature vector of the second input information;
    Calculating the distance between the feature vector of the first input information and the feature vector of the second input information;
    Determining, based on the calculated distance, whether the first input information and the second input information are input information of the same category, and outputting the determination result.
  10. An apparatus for training a convolutional neural network, characterized in that the apparatus includes:
    A normalization unit, configured to, for each layer of an initialized convolutional neural network, store the input information set of the layer on at least one video card, wherein each of the at least one video card stores at least part of the input information set of the layer; calculate the mean and variance of the at least part of the input information set of the layer stored on each of the at least one video card; send the mean and variance of the at least part of the input information set of the layer stored on each of the at least one video card to the other video cards, to calculate the mean and variance of the input information set of the layer; and normalize the input information set of the layer using the mean and variance of the input information set of the layer, to obtain a normalized input information set of the layer;
    A training unit, configured to train the initialized convolutional neural network using the normalized input information sets of the layers, to obtain a trained convolutional neural network.
  11. The apparatus according to claim 10, characterized in that the training unit includes:
    A training subunit, configured to perform the following training steps: inputting the normalized input information set of each layer into the corresponding layer of the initialized convolutional neural network, to obtain a feature vector set; determining whether the feature vector set meets a preset condition; and if the preset condition is met, using the initialized convolutional neural network as the trained convolutional neural network;
    An adjusting subunit, configured to, in response to the preset condition not being met, adjust the parameters of the initialized convolutional neural network and continue to perform the training steps.
  12. The apparatus according to claim 11, characterized in that the input information set includes a plurality of input information of the same category; and
    The training subunit includes:
    A first calculating module, configured to calculate the distance between each pair of feature vectors among the plurality of feature vectors corresponding to the plurality of input information of the same category, to obtain a first calculation result;
    A first determining module, configured to determine, based on the first calculation result, whether the preset condition is met.
  13. The apparatus according to claim 11 or 12, characterized in that the input information set includes a plurality of input information of different categories; and
    The training subunit includes:
    A second calculating module, configured to calculate the distance between each pair of feature vectors among the plurality of feature vectors corresponding to the plurality of input information of different categories, to obtain a second calculation result;
    A second determining module, configured to determine, based on the second calculation result, whether the preset condition is met.
  14. The apparatus according to claim 10, characterized in that the apparatus further includes:
    An acquiring unit, configured to acquire first input information and second input information;
    An input unit, configured to input the first input information and the second input information into the trained convolutional neural network, to obtain a feature vector of the first input information and a feature vector of the second input information;
    A calculating unit, configured to calculate the distance between the feature vector of the first input information and the feature vector of the second input information;
    A determining unit, configured to determine, based on the calculated distance, whether the first input information and the second input information are input information of the same category, and output the determination result.
  15. A server, characterized in that the server includes:
    One or more processors; and
    A storage device for storing one or more programs;
    Wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
  16. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1-9.
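The cross-card statistics merge recited in claim 10, and the distance-threshold preset condition of claims 5 and 8, can be sketched as follows. This is an illustrative Python sketch only, not part of the claimed apparatus: it assumes each video card reports a `(count, mean, variance)` triple for its shard of a layer's input information set, and all function and variable names are hypothetical.

```python
import math
import random
import statistics

def combine_stats(shards):
    """Merge per-card (count, mean, variance) into layer-wide statistics.

    Uses E[x^2] = var + mean^2, so each card only needs to send its
    count, mean, and variance to the other cards (as in claim 10),
    rather than its raw inputs.
    """
    total = sum(n for n, _, _ in shards)
    mean = sum(n * mu for n, mu, _ in shards) / total
    ex2 = sum(n * (var + mu * mu) for n, mu, var in shards) / total
    return mean, ex2 - mean * mean

def normalize(xs, mean, var, eps=1e-5):
    # Normalize the layer's input information set with the merged statistics.
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in xs]

def meets_preset_condition(same_class_vecs, diff_class_vecs, t1, t2):
    # Claims 5 and 8: every pairwise Euclidean distance between feature
    # vectors of the same category must fall below a first threshold t1,
    # and every distance between different categories must exceed t2.
    def pairwise(vectors):
        return [math.dist(u, v)
                for i, u in enumerate(vectors) for v in vectors[i + 1:]]
    return (all(d < t1 for d in pairwise(same_class_vecs)) and
            all(d > t2 for d in pairwise(diff_class_vecs)))

# Simulate two cards, each storing a shard of one layer's inputs.
random.seed(0)
layer_inputs = [random.gauss(3.0, 2.0) for _ in range(1000)]
shard_a, shard_b = layer_inputs[:400], layer_inputs[400:]
per_card = [(len(s), statistics.fmean(s), statistics.pvariance(s))
            for s in (shard_a, shard_b)]
g_mean, g_var = combine_stats(per_card)
normalized = normalize(layer_inputs, g_mean, g_var)
```

Because only three scalars per card cross the inter-card link, the merge reproduces exactly the statistics that a single card holding the full input information set would compute, which is why the normalized set has (near) zero mean and unit variance regardless of how the set is sharded.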
CN201710859122.XA 2017-09-21 2017-09-21 Method and apparatus for training convolutional neural network Active CN107609645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710859122.XA CN107609645B (en) 2017-09-21 2017-09-21 Method and apparatus for training convolutional neural network

Publications (2)

Publication Number Publication Date
CN107609645A true CN107609645A (en) 2018-01-19
CN107609645B CN107609645B (en) 2024-04-02

Family

ID=61061752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710859122.XA Active CN107609645B (en) 2017-09-21 2017-09-21 Method and apparatus for training convolutional neural network

Country Status (1)

Country Link
CN (1) CN107609645B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140180989A1 (en) * 2012-12-24 2014-06-26 Google Inc. System and method for parallelizing convolutional neural networks
CN104268524A (en) * 2014-09-24 2015-01-07 朱毅 Convolutional neural network image recognition method based on dynamic adjustment of training targets
CN104809426A (en) * 2014-01-27 2015-07-29 日本电气株式会社 Convolutional neural network training method and target identification method and device
CN106960243A (en) * 2017-03-06 2017-07-18 中南大学 A kind of method for improving convolutional neural networks structure

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
KIRANVAIDHYA: "Batch Normalization for Multi-GPU / Data Parallelism #7439", HTTPS://GITHUB.COM/TENSORFLOW/TENSORFLOW/ISSUES/7439 *
SERGEY IOFFE, CHRISTIAN SZEGEDY: "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", HTTPS://ARXIV.ORG/ABS/1502.03167V1 *
WAN SHINING: "Research and Implementation of Face Recognition Based on Convolutional Neural Networks", China Master's Theses Full-text Database, Information Science and Technology *
TENCENT BIG DATA: "Multi-GPU Parallel Framework for Deep Convolutional Neural Networks (CNNs) and Its Application in Image Recognition", HTTPS://DATA.QQ.COM//ARTICLE?ID=1516 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598304A (en) * 2018-12-04 2019-04-09 北京字节跳动网络技术有限公司 Disaggregated model calibration method, device, equipment and readable medium
CN109598304B (en) * 2018-12-04 2019-11-08 北京字节跳动网络技术有限公司 Disaggregated model calibration method, device, equipment and readable medium

Also Published As

Publication number Publication date
CN107609645B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
CN108038469B (en) Method and apparatus for detecting human body
CN108898185A (en) Method and apparatus for generating image recognition model
CN107578017A (en) Method and apparatus for generating image
US20230081645A1 (en) Detecting forged facial images using frequency domain information and local correlation
CN108898186A (en) Method and apparatus for extracting image
CN108985257A (en) Method and apparatus for generating information
CN107633218A (en) Method and apparatus for generating image
CN108509915A (en) The generation method and device of human face recognition model
CN108846440A (en) Image processing method and device, computer-readable medium and electronic equipment
CN108427939A (en) model generating method and device
CN108595628A (en) Method and apparatus for pushed information
CN107679466A (en) Information output method and device
CN107609506A (en) Method and apparatus for generating image
CN109086719A (en) Method and apparatus for output data
CN107507153A (en) Image de-noising method and device
CN108345387A (en) Method and apparatus for output information
CN108062544A (en) For the method and apparatus of face In vivo detection
CN108363999A (en) Operation based on recognition of face executes method and apparatus
CN107958247A (en) Method and apparatus for facial image identification
CN108491823A (en) Method and apparatus for generating eye recognition model
CN108446658A (en) The method and apparatus of facial image for identification
CN108960110A (en) Method and apparatus for generating information
CN109934191A (en) Information processing method and device
CN107729928A (en) Information acquisition method and device
CN109241934A (en) Method and apparatus for generating information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant