Detailed Description of the Embodiments
The technical solutions in the embodiments of the present application are described below clearly and completely in conjunction with the accompanying drawings of the embodiments. It should be understood that the specific embodiments described herein are only intended to explain the application, not to limit it. It should also be noted that, for ease of description, the accompanying drawings show only the parts relevant to the application rather than the entire structure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without making creative efforts shall fall within the protection scope of this application.
The terms "first", "second", and the like in this application are used to distinguish different objects rather than to describe a specific order. In addition, the terms "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device comprising a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to independent or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a flow diagram of a training method for a neural network model provided by an embodiment of this application. The method includes:
Step 11: obtaining a neural network model, where the neural network model is a neural network model that has been trained, and the neural network model includes at least a first branch network.
The neural network model is a carrier of deep learning (DL). Deep learning is one of the technologies and research fields of machine learning; it realizes artificial intelligence in computing systems by building artificial neural networks (ANNs) with a hierarchical structure. Since a hierarchical ANN can extract and filter input information layer by layer, deep learning has representation learning capability and can implement end-to-end supervised learning and unsupervised learning. In addition, deep learning may also participate in building reinforcement learning systems, forming deep reinforcement learning.
Taking a convolutional neural network as an example, convolutional neural networks (CNNs) are a class of feedforward neural networks that include convolution computation and have a deep structure; they are one of the representative algorithms of deep learning.
A convolutional neural network includes an input layer, hidden layers, and an output layer. The hidden layers include convolution modules, pooling layers, and fully connected layers.
1) The input layer of a convolutional neural network can process multidimensional data. The input layer of a one-dimensional convolutional neural network receives a one-dimensional or two-dimensional array, where a one-dimensional array is usually a time or spectral sample and a two-dimensional array may include multiple channels; the input layer of a two-dimensional convolutional neural network receives a two-dimensional or three-dimensional array; and the input layer of a three-dimensional convolutional neural network receives a four-dimensional array.
In this embodiment, the neural network model is mainly used to process images, so the input may be three-dimensional data, i.e., two-dimensional pixel points plus RGB (red, green, blue) data channels.
2) The function of the convolution module is to extract features from the input data. Its interior contains multiple convolution kernels, and each element of a convolution kernel corresponds to a weight coefficient and a bias (bias vector), analogous to a neuron of a feedforward neural network. Each neuron in the convolution module is connected to multiple neurons in a nearby region of the previous layer, and the size of that region depends on the size of the convolution kernel.
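As a concrete illustration of the feature extraction described above, the following is a minimal generic sketch (not code from this application) of the 2D cross-correlation that a convolution kernel performs as it sweeps an input feature map without padding:

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation without padding ("valid" mode): slide the kernel
    over the image and sum the elementwise products at each position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out
```

For example, a 3 × 3 input swept by a 2 × 2 kernel yields a 2 × 2 feature map, reflecting the size reduction discussed later in connection with padding.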
After the convolution module performs feature extraction, the output feature map is passed to a pooling layer for feature selection and information filtering. The pooling layer contains a preset pooling function whose role is to replace the value at a single point in the feature map with a statistic of its neighboring region. Like the convolution kernel scanning the feature map, the pooling layer selects pooling regions controlled by the pooling size, stride, and padding.
The fully connected layers in a convolutional neural network are equivalent to the hidden layers in a conventional feedforward neural network. The fully connected layers are usually built in the last part of the hidden layers of the convolutional neural network and transmit signals only to other fully connected layers. The feature map loses its three-dimensional structure in the fully connected layers: it is flattened into a vector and passed to the next layer through an activation function.
3) In a convolutional neural network, the layer upstream of the output layer is usually a fully connected layer, so its structure and working principle are the same as those of the output layer in a conventional feedforward neural network. For an image classification problem, the output layer uses a logistic function or a normalized exponential function (softmax function) to output classification labels.
For example, in this embodiment, when performing recognition on an image, the output layer may be designed to output the center coordinates, size, and class of the objects in the image. In image semantic segmentation, the output layer directly outputs the classification result of each pixel.
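The softmax function mentioned above can be sketched in a few lines; this is a generic illustration, not code from this application:

```python
import math

def softmax(logits):
    """Normalized exponential function: map raw classification scores
    to a probability distribution over class labels."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The outputs are non-negative and sum to 1, so the largest logit corresponds to the predicted class label.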
Step 12: adding a second branch network to the neural network model.
Optionally, step 12 may specifically include: determining the output scales of multiple convolution modules of the first branch network; and adding the second branch network to a specific convolution module in the first branch network based on an output scale requirement.
As shown in Fig. 2, Fig. 2 is a schematic diagram of the neural network model provided by an embodiment of this application.
The first branch network includes: an input layer (INPUT); a first convolution module (ConvBlock); a first pooling layer (Pooling); a second convolution module; a second pooling layer; a third convolution module; a fourth convolution module; a fifth convolution module; a first global average pooling layer (Global Average Pooling, GAP); a first fully connected layer (fully connected layers, FC); a first classification network layer (Softmax); and a first branch network output layer (Main_output).
The second branch network includes: a feature selection layer (SelectBlock), connected to the fourth convolution module; a sixth convolution module; a second global average pooling layer; a second fully connected layer; a second classification network layer; and a second branch network output layer (Branch_output).
In addition, the model further includes a fusion layer (Fusing) and a fusion output layer (Fusing_output); the fusion layer connects the first branch network output layer and the second branch network output layer.
In one embodiment, the output scale of the first convolution module may be set as required; for example, its output scale may be N*N, where 100 < N < 300. A common value of N is 227. Further, the output scale of the first pooling layer is N/2*N/2; the output scale of the second convolution module is N/2*N/2; the output scale of the second pooling layer is N/4*N/4; the output scale of the third convolution module is N/8*N/8; the output scale of the fourth convolution module is N/16*N/16; and the output scale of the fifth convolution module is N/16*N/16. The feature selection layer connects to the fourth convolution module, and the two have the same output scale of N/16*N/16.
In a specific embodiment, the output scale of the first convolution module is 168*168; the output scale of the first pooling layer is 84*84; the output scale of the second convolution module is 84*84; the output scale of the second pooling layer is 42*42; the output scale of the third convolution module is 21*21; the output scale of the fourth convolution module is 11*11; the output scale of the fifth convolution module is 11*11; and the output scale of the feature selection layer is 11*11.
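The sequence of scales above follows repeated halving, with odd sizes rounding up (21 to 11). A small sketch reproducing the chain, assuming stride-2 downsampling with ceiling rounding (the rounding rule is an inference from the numbers, not stated in the application):

```python
import math

def halve(n):
    """Stride-2 downsampling; ceiling assumed so odd sizes round up (21 -> 11)."""
    return math.ceil(n / 2)

scales = [168]                      # first convolution module output
scales.append(halve(scales[-1]))    # first pooling layer: 84
scales.append(scales[-1])           # second convolution module keeps 84
scales.append(halve(scales[-1]))    # second pooling layer: 42
scales.append(halve(scales[-1]))    # third convolution module: 21
scales.append(halve(scales[-1]))    # fourth convolution module: 11
scales.append(scales[-1])           # fifth convolution module keeps 11
```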
In this embodiment, the second branch network bifurcates at the penultimate decline of the feature map scale, i.e., at the 11*11 scale, because the shallow layers are mainly used for extracting features, while the deep layers are mainly used for transforming features and extracting high-level semantic information. If the bifurcation were made only at the last fully connected layer, the extracted information would be heavily influenced by the first branch network in the final result, and the final effect would be poor. The right side of Fig. 2 is the second branch network portion: the input first passes through a feature selection layer (SelectBlock), which performs a weighted recombination of the features, assigning larger weights to the features that matter most for the new samples; the several convolution transformations of the original network are then performed again, and a fully connected layer is connected for classification.
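The exact form of the SelectBlock is not detailed in this section; one plausible reading of "weighted recombination" is a channel-wise reweighting, sketched below. The sigmoid gating and the per-channel weights are assumptions for illustration, standing in for parameters that would be learned while training the second branch:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def select_block(feature_maps, channel_weights):
    """Weighted recombination sketch: scale each channel's feature map by a
    per-channel weight squashed to (0, 1), so channels important for the new
    samples can be emphasized and others suppressed."""
    return [[[v * sigmoid(w) for v in row] for row in fmap]
            for fmap, w in zip(feature_maps, channel_weights)]
```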
Step 13: inputting a to-be-trained data set into the second branch network, so as to train the second branch network individually.
Optionally, as shown in Fig. 3, Fig. 3 is a flow diagram of training the second branch network provided by an embodiment of this application. Step 13 may specifically include:
Step 131: obtaining a to-be-trained data set.
The to-be-trained data set should be data with a new feature. Taking images as an example, in one application scenario, images with feature A need to be recognized, and the first branch network is trained on images with feature A to obtain this capability. Further, when feature B is newly added, the second branch network is added and images with feature B are input for training.
Step 132: performing data augmentation processing on the to-be-trained data set.
In general, a neural network needs a large number of parameters; the parameters of many neural networks number in the millions, and making these parameters work correctly requires a large amount of training data, whereas in practice the data are rarely as plentiful as we imagine. Data augmentation uses the existing data, e.g., by flipping, translating, or rotating it, to create more data so that the neural network generalizes better.
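The flip and translate operations mentioned above can be sketched directly on a pixel grid; this is generic illustration code, not part of the application:

```python
def hflip(image):
    """Horizontal flip: mirror each pixel row."""
    return [row[::-1] for row in image]

def translate_right(image, shift, fill=0):
    """Translate pixels to the right, padding vacated columns with `fill`."""
    return [[fill] * shift + row[:-shift] for row in image] if shift else image
```

Each transformed copy is a new training sample with the same label, multiplying the effective size of the data set.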
Step 133: inputting the data-augmented to-be-trained data set into the second branch network, so as to train the second branch network individually.
In addition, as shown in Fig. 4, Fig. 4 is another flow diagram of training the second branch network provided by an embodiment of this application. Step 13 may specifically include:
Step 136: setting convolution initialization parameters of the second branch network.
The convolution module parameters include the convolution kernel size, the stride, and the padding; the three jointly determine the size of the feature map output by the convolution module and are hyperparameters of the convolutional neural network. The convolution kernel size may be specified as any value smaller than the input image size; the larger the convolution kernel, the more complex the input features that can be extracted.
The convolution stride defines the distance between the positions of the convolution kernel in two adjacent sweeps over the feature map. When the stride is 1, the convolution kernel sweeps over the feature map element by element; when the stride is n, it skips n-1 pixels in the next sweep.
From the cross-correlation computation of the convolution kernel, it can be seen that as convolution modules are stacked, the size of the feature map gradually shrinks; for example, a 16 × 16 input image, after a 5 × 5 convolution kernel with stride 1 and no padding, yields a 12 × 12 feature map. Padding is therefore a method of artificially enlarging the feature map before it passes through the convolution kernel to counteract the shrinking effect of the computation. Common padding methods are zero padding and replication padding (repeating the boundary values).
Step 137: fixing the parameters of the multiple convolution modules of the first branch network, and inputting the data-augmented to-be-trained data set into the second branch network, so as to train the second branch network individually.
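Fixing the first branch's parameters while training only the second branch can be sketched as a gradient step that skips frozen parameters. The parameter names and the scalar update below are illustrative stand-ins, not from the application:

```python
def sgd_step(params, grads, lr=0.1):
    """One gradient step that leaves frozen (first-branch) parameters untouched."""
    for name, p in params.items():
        if p["trainable"]:
            p["value"] -= lr * grads[name]

params = {
    "first_branch.conv1": {"value": 1.0, "trainable": False},   # fixed
    "second_branch.conv6": {"value": 1.0, "trainable": True},   # being trained
}
sgd_step(params, {"first_branch.conv1": 1.0, "second_branch.conv6": 1.0})
```

After the step, the first branch's value is unchanged while the second branch's value has moved, which is what preserves the original model's recognition effect during the new training.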
Step 14: fusing the first branch network and the second branch network, so as to complete the training of the neural network model.
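The application does not fix the rule used by the fusion layer to combine the two output layers; one simple possibility is a convex combination of the two branches' class scores, sketched here purely for illustration:

```python
def fuse(main_output, branch_output, weight=0.5):
    """Fusion-layer sketch: convex combination of the first branch's output
    (Main_output) and the second branch's output (Branch_output).
    The averaging rule and the weight are assumptions, not from the source."""
    return [weight * a + (1 - weight) * b
            for a, b in zip(main_output, branch_output)]
```

If both inputs are probability vectors, the fused output remains a probability vector for any weight in [0, 1].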
The training method for a neural network model provided in this embodiment includes: obtaining a neural network model, where the neural network model is a neural network model that has been trained and includes at least a first branch network; adding a second branch network to the neural network model; inputting a to-be-trained data set into the second branch network to train the second branch network individually; and fusing the first branch network and the second branch network to complete the training of the neural network model. In this way, when an existing neural network model needs to recognize a new feature, there is no need to train a new neural network model or to retrain the original neural network model, which improves the efficiency of neural network model training and does not affect the recognition effect of the original neural network model.
It is to be appreciated that the method of this embodiment can be applied to training and recognition of illegal pictures or videos on a network. For example, a pornography-screening model already in service for external screening and for short-video screening may need to produce different outputs when the application scenarios of the external screening differ; branch networks can be used to adapt it. As another example, sudden new types of violation pictures may appear in short videos that the existing model cannot recognize, while adding them to the training set would affect its existing performance; branch networks can solve such problems, for instance violation videos that slip through short-video screening and keep spreading on the platform, or a watermark of a pornographic website. Such problems involve certain shared features and specific pictures; with the method of the invention, the recognition rate of violation pictures is high and misrecognition is rare.
Taking the SE-BN-Inception model as an example, the results are as shown in the table:
For the model provided in this embodiment, which adds the second branch network on the basis of the first branch network, when processing a single picture, the per-picture computation time increases by 4.8 ms on average and the video memory consumption increases by 69 MB; when the batch size is 12, the per-picture computation time increases by 1 ms on average and the video memory consumption increases by 129 MB. It can be seen from these data that when the new neural network model with the branch network is used for picture processing, the added time compared with the original neural network model is small and the added video memory consumption is relatively low; compared with performing picture processing twice through two different neural network models, the processing time is greatly shortened and memory consumption is reduced.
Referring to Fig. 5, Fig. 5 is a flow diagram of an image recognition method provided by this application. The method includes:
Step 51: obtaining an image to be recognized.
The image to be recognized may be a single picture or an image frame in a video stream; no restriction is imposed here.
Step 52: inputting the image to be recognized into a set neural network model.
The set neural network model is obtained by training with the method of the above embodiments, which is not repeated here.
Step 53: outputting a recognition result.
Referring to Fig. 6, Fig. 6 is a structural schematic diagram of an image recognition apparatus provided by an embodiment of this application. The image recognition apparatus 60 includes a processor 61 and a memory 62 connected to the processor 61; the memory 62 is used to store program data, and the processor 61 is used to execute the program data to implement the following method:
obtaining a neural network model, where the neural network model is a neural network model that has been trained and includes at least a first branch network; adding a second branch network to the neural network model; inputting a to-be-trained data set into the second branch network to train the second branch network individually; and fusing the first branch network and the second branch network to complete the training of the neural network model.
Optionally, in another embodiment, the processor 61 is used to execute the program data to implement the following method: obtaining an image to be recognized; inputting the image to be recognized into a set neural network model; and outputting a recognition result.
Referring to Fig. 7, Fig. 7 is a structural schematic diagram of a computer storage medium provided by an embodiment of this application. The computer storage medium 70 stores program data 71, and the program data 71, when executed by a processor, implements the following method:
obtaining a neural network model, where the neural network model is a neural network model that has been trained and includes at least a first branch network; adding a second branch network to the neural network model; inputting a to-be-trained data set into the second branch network to train the second branch network individually; and fusing the first branch network and the second branch network to complete the training of the neural network model.
Optionally, in another embodiment, the program data 71, when executed by a processor, also implements the following method: determining the output scales of multiple convolution modules of the first branch network; and adding the second branch network to a specific convolution module in the first branch network based on an output scale requirement.
The first branch network includes: an input layer; a first convolution module with an output scale of 168*168; a first pooling layer with an output scale of 84*84; a second convolution module with an output scale of 84*84; a second pooling layer with an output scale of 42*42; a third convolution module with an output scale of 21*21; a fourth convolution module with an output scale of 11*11; a fifth convolution module with an output scale of 11*11; a first global average pooling layer; a first fully connected layer; a first classification network layer; and a first branch network output layer.
The second branch network includes: a feature selection layer, connected to the fourth convolution module, with a scale of 11*11; a sixth convolution module; a second global average pooling layer; a second fully connected layer; a second classification network layer; and a second branch network output layer.
The network model further includes: a fusion layer, connecting the first branch network output layer and the second branch network output layer; and a fusion output layer.
Optionally, in another embodiment, the program data 71, when executed by a processor, also implements the following method: obtaining a to-be-trained data set; performing data augmentation processing on the to-be-trained data set; and inputting the data-augmented to-be-trained data set into the second branch network to train the second branch network individually.
Optionally, in another embodiment, the program data 71, when executed by a processor, also implements the following method: setting convolution initialization parameters of the second branch network; fixing the parameters of the multiple convolution modules of the first branch network; and inputting the data-augmented to-be-trained data set into the second branch network to train the second branch network individually.
In the several embodiments provided in this application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device embodiments described above are merely schematic; the division into modules or units is only a logical functional division, and other division manners are possible in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
The units illustrated as separate components may or may not be physically separated, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated in one processing unit, or each unit may exist physically alone, or two or more units may be integrated in one unit. The integrated unit may be realized in the form of hardware or in the form of a software functional unit.
If the integrated unit in the above other embodiments is realized in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above are merely embodiments of this application and are not intended to limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of this application.