A deep-learning-based data center cloud target recognition method
Technical field
The present invention relates to the field of target recognition, and more particularly to a deep-learning-based data center cloud target recognition method.
Background art
In recent years, the development of deep learning has made end-to-end learning, which takes raw data directly as input, feasible. Convolutional neural networks (CNNs) have strong feature extraction and learning capabilities. Most existing deep-learning-based object detection methods are therefore built on convolutional neural networks: by training on and learning from large-scale data they obtain highly effective deep features, and by constructing relatively complex network structures they fully mine the associations within the data, realizing end-to-end target recognition.
As shown in Fig. 1, an existing deep-learning-based object detection system consists of an input/output module, a functional module, a parsing module, a network layer module, a neural network construction module, a cfg configuration file, and a network weight file.
The cfg configuration file is connected to the parsing module. It records the network parameters for building the convolutional neural network, to be read by the parsing module; these parameters divide into network layer parameters and network structure parameters. The network layer parameters include, for each network layer (a convolutional neural network comprises an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer), the number of neurons in each feature map (the size of the layer) and the number of feature maps (the dimension of the layer). The output layer of the convolutional neural network records the prior box sizes; the prior boxes are set according to the size and aspect ratio of ordinary images (generally the size and ratio of an ordinary image, scaled up or down), and their values are fixed. The network structure parameters include the types, numbers, and combination order of the network layers that make up the neural network.
The network weight file is connected to the neural network construction module and the functional module. It stores the network weight parameters received from the functional module, to be read by the neural network construction module. The network weight parameters are the coefficients of the expressions relating the input and output of each neuron across the connections between layers.
The parsing module is connected to the cfg configuration file, the network layer module, and the neural network construction module. It reads the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
The network layer module is connected to the parsing module and the neural network construction module. It receives the network layer parameters from the parsing module, uses them to instantiate each network layer, and sends the instantiated layers to the neural network construction module. A loss function is defined in the output layer of the convolutional neural network; it measures the gap between the values the network predicts and the true values. The smaller the loss, the better the model fits. Common loss functions include the 0-1 loss, the absolute-error loss, the quadratic (squared-error) loss, and the cross-entropy loss. The loss functions used by existing convolutional neural networks for target recognition are based on the quadratic loss. They are mainly suited to ordinary image recognition (in ordinary images the targets occupy a high proportion of the image, are few in number, and are sparsely distributed) and make no distinction between large and small targets.
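The quadratic and cross-entropy losses mentioned above can be illustrated with a small sketch (an illustration only, not part of the system described here; the function names are my own):

```python
import math

def quadratic_loss(pred, true):
    # Squared-error loss: penalizes errors symmetrically,
    # regardless of how rare or hard the example is.
    return (pred - true) ** 2

def cross_entropy_loss(p, y):
    # Binary cross-entropy for a predicted probability p of true label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A confidently wrong prediction is punished far more by cross-entropy:
print(quadratic_loss(0.1, 1.0))                 # ~0.81
print(round(cross_entropy_loss(0.1, 1.0), 3))   # 2.303
```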
A CNN is trained by a training function, which divides training into two phases: forward propagation and backpropagation. In the forward phase, each layer's Forward (forward propagation) function is called in turn to obtain layer-by-layer outputs; the last layer (the output layer) emits the predicted values, and the loss function compares them with the sample's true values to yield a loss value. The output layer's Backward (backpropagation) function then computes the weight parameter updates, and each layer's Backward function is called in turn until backpropagation reaches the first layer; at the end of the backward pass the network weight parameters are updated together. As training proceeds, the weights are updated continually and the loss keeps decreasing, i.e., the error between the network's predictions and the true values shrinks. When the loss no longer decreases, training is complete and the network weight parameters are obtained. Choosing a different loss function makes the training of the neural network emphasize different aspects: the per-layer weight updates during training differ, and so does the detection performance of the final model.
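The two-phase scheme above can be sketched for a single linear neuron trained by gradient descent (a toy illustration with invented names; a real CNN applies the same two phases layer by layer):

```python
# Toy forward/backward training of one linear neuron y = w*x + b
# on the target function y = 2x + 1, with quadratic loss.
samples = [(x, 2 * x + 1) for x in range(-3, 4)]
w, b, lr = 0.0, 0.0, 0.05

for epoch in range(200):
    for x, y_true in samples:
        y_pred = w * x + b              # forward propagation
        loss = (y_pred - y_true) ** 2   # loss at the output layer
        grad = 2 * (y_pred - y_true)    # backpropagation: dLoss/dy_pred
        w -= lr * grad * x              # weight updates at the end
        b -= lr * grad                  # of the backward pass

print(round(w, 3), round(b, 3))  # approaches 2.0 and 1.0
```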
The neural network construction module is connected to the parsing module, the network layer module, the network weight file, and the functional module. It receives the network structure parameters from the parsing module and the instantiated layers from the network layer module, combines the layers according to the structure parameters, and builds the basic framework of the neural network. It also obtains the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module.
The input/output module is connected to the functional module. It reads the images to be tested from the user-supplied test set, converts them into structures the functional module can recognize and process (such as image, data, and box structures), and sends these structures to the functional module; it also receives the recognition results from the functional module and outputs them to the user.
The functional module is connected to the input/output module, the neural network construction module, and the network weight file. It calls the training function (see Jake Bouvrie, Notes on Convolutional Neural Networks, 2006) to train the neural network and stores the network weight parameters in the network weight file; it calls the detection function to perform target recognition with the neural network, obtains the network's recognition result for the image, and sends the result to the input/output module.
The existing deep-learning-based target recognition system performs target recognition as follows:
1) The parsing module reads the network parameters from the cfg configuration file, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module;
2) The network layer module receives the network layer parameters from the parsing module, defines and implements each network layer, and outputs the layers to the neural network construction module;
3) The neural network construction module receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers in order according to the structure parameters, and builds the basic framework of the neural network. It then obtains the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module;
4) The input/output module receives the images to be tested from the user, scales each image to a fixed size M×M (generally set to 416×416), converts it into structures the functional module can recognize and process (such as image, data, and box structures), and feeds these structures to the functional module. The functional module calls the training function to train the neural network, obtains the network weight parameters, and stores them in the network weight file; it then calls the detection function to perform target detection with the neural network. The prediction for an image is computed by the network's output layer; target positions are predicted by adding position offsets to the prior boxes (for the detection principle, see Joseph Redmon, YOLO9000: Better, Faster, Stronger, CVPR 2017, pp. 7263-7271, page 3 line 1 to page 4 line 4). Once the network's recognition result for the image is obtained, it is passed to the input/output module, which outputs it to the user.
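The prior-box offset prediction cited from YOLO9000 can be sketched as follows (a simplified illustration of the decode step; the variable names are my own, and the sigmoid/exp parameterization follows the cited paper):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw network outputs (tx, ty, tw, th) into a predicted box,
    given the grid-cell offset (cx, cy) and prior box size (pw, ph)."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = cx + sigmoid(tx)     # centre x, constrained inside the cell
    by = cy + sigmoid(ty)     # centre y
    bw = pw * math.exp(tw)    # width  = prior width scaled by exp(tw)
    bh = ph * math.exp(th)    # height = prior height scaled by exp(th)
    return bx, by, bw, bh

# With zero offsets the prediction sits at the cell centre
# and takes exactly the prior box's size:
print(decode_box(0, 0, 0, 0, cx=3, cy=5, pw=2.0, ph=4.0))  # (3.5, 5.5, 2.0, 4.0)
```

This makes concrete why a fixed prior size matters: every predicted width and height is a multiplicative perturbation of the prior, so priors far from the true target sizes force large, hard-to-learn offsets.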
However, existing deep-learning-based target recognition methods have the following technical problems:
1) Existing object detection methods predict target box positions as relative offsets from the prior boxes. The preset prior boxes are sized according to the size and ratio of ordinary images, and their values are fixed and stored in the cfg file. In practice, however, the sizes and ratios of many targets differ widely, so the preset fixed prior box sizes should not be used directly, and the accuracy of target recognition is low;
2) The loss functions of existing object detection methods are designed for ordinary images, in which the targets occupy a high proportion of the image, are few in number, and are sparsely distributed. In practice, however, in many aerial images (such as remote sensing images) the targets occupy only a tiny fraction of the image and are densely distributed. During training the network therefore balances targets and background poorly: small targets are insufficiently trained, and small-target recognition is inaccurate;
3) The input/output module of existing recognition methods applies size-normalization preprocessing to every input image, scaling it to M×M (generally several hundred by several hundred; 416×416 in YOLOv2). Some images, however, such as remote sensing images, often reach thousands or even tens of thousands of pixels on a side, while the targets in them are usually only tens of pixels across. If such an image were fed directly to the neural network for detection, the size normalization would discard much detail (most targets would shrink to a single point), and detection performance on large-format images would be greatly degraded.
In view of this, how to solve the low recognition accuracy for small targets and large-format images, and to effectively improve small-target recognition accuracy and large-format image detection, has become an urgent problem for researchers in this field.
Summary of the invention
The technical problem to be solved by the present invention is to propose a deep-learning-based data center cloud target recognition method that makes full use of the features of the training set images during training, refines prior information through a dimension clustering module, and detects large-format images block by block with a slice detection function, thereby solving the low recognition accuracy for small targets and large-format images and effectively improving small-target recognition accuracy and large-format image detection.
The technical scheme of the present invention is as follows:
In the first step, a deep-learning-based data center cloud target recognition system is built. The system consists of a cloud server and clients. Telnet software is installed on each client, and the client stores the data set required by the detection task. The data set includes the test set images to be detected, the training set images used to train the neural network, and the training set label file; the label file records the mark box information of the targets in the training set images, including each mark box's position coordinates, width, and height, and the class of the target (e.g., aircraft or ship). A client logs in to the cloud server via telnet, uploads the data set to the cloud server, sends training and detection instructions to the cloud server before training and detection begin, and performs remote training and detection on the cloud server; the cloud server performs neural network training and target recognition according to the client's instructions, schedules its own computing and storage resources, and sends the neural network's training progress information and the recognition results to the client.
Besides the input/output module, functional module, parsing module, network layer module, neural network construction module, cfg configuration file, and network weight file, the cloud server is also equipped with a dimension clustering module.
The dimension clustering module is connected to the client and the cfg configuration file. It receives the training set label file from the client, performs refinement analysis on the mark box information in the file, computes the prior box sizes, and writes them into the cfg configuration file.
The cfg configuration file is connected to the dimension clustering module and the parsing module. Besides recording the network parameters for building the convolutional neural network, it also stores the prior box sizes received from the dimension clustering module as output-layer parameters of the network (a kind of network layer parameter).
The parsing module is connected to the cfg configuration file, the network layer module, and the neural network construction module. It reads the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
The network layer module is connected to the parsing module and the neural network construction module. It receives the network layer parameters from the parsing module, uses them to instantiate each network layer, and sends the instantiated layers to the neural network construction module. Unlike the network layer module shown in Fig. 1, the loss function defined in the output layer of the convolutional neural network is the focal loss function. The focal loss is an improvement on the cross-entropy loss: it distinguishes detection targets by how difficult they are to detect and increases the weight that hard targets such as small objects carry in the loss, enhancing detection performance on small targets (for the principle of the focal loss, see Lin T. Y., Goyal P., Girshick R., He K., Focal Loss for Dense Object Detection, ICCV 2017, arXiv preprint arXiv:1708.02002, pp. 1-10). The focal loss is designed for the aerial images encountered in practice, in which the targets usually occupy only a tiny fraction of the image and are densely distributed.
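A minimal sketch of the focal loss for a single binary prediction, following the cited Lin et al. formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) (the parameter values alpha = 0.25 and gamma = 2 are the paper's defaults; the function name is my own):

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one prediction: p is the predicted probability of
    the positive class, y the true label (0 or 1)."""
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights easy, well-classified examples,
    # so hard examples (e.g. small targets) dominate the loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9, 1)   # well-classified positive: tiny loss
hard = focal_loss(0.1, 1)   # misclassified positive: much larger loss
print(easy < hard)  # True
```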
The neural network construction module is connected to the parsing module, the network layer module, the network weight file, and the functional module. It receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers in order according to the structure parameters, and builds the basic framework of the neural network. It also reads the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module.
The network weight file is connected to the neural network construction module and the functional module. It stores the network weight parameters received from the functional module, to be read by the neural network construction module.
The input/output module is connected to the functional module. It receives the test set images to be detected from the client, converts them into structures the program can recognize and process (such as image, data, and box structures), and sends these structures to the functional module.
The functional module is connected to the input/output module, the neural network construction module, the network weight file, and the client. As in Fig. 1, the functional module contains a training function and a detection function; it calls the training function to train the neural network and sends the network weight parameters to the network weight file. Unlike Fig. 1, the detection function is changed into a slice detection function: the functional module calls the slice detection function to perform target detection with the neural network, obtains the network's recognition result for the image, and sends the result to the input/output module.
In the second step, the cloud server's dimension clustering module, parsing module, network layer module, and neural network construction module cooperate to build the basic framework of the neural network, as follows:
2.1 The dimension clustering module receives the training set label file from the client, reads the mark box information from it, and computes the prior box sizes, as follows:
2.1.1 The dimension clustering module obtains from the training set label file the mark box information of the targets in the training set images (this information has been marked beforehand by the user). The width and height of each mark box form a pair (w_i, h_i) (w denotes width, h denotes height, and i is the serial number of the mark box); these pairs are the elements of a set S. The number of elements in S is N, the number of mark boxes in the training images, with i ∈ [1, N];
2.1.2 The dimension clustering module sets the number of cluster centres to k, a positive integer, and the maximum number of iterations to Num, generally an integer between 10 and 100. It initializes the first cluster centre set C_1 to the empty set; let N' denote the current number of elements in C_1, with initial value 0.
2.1.3 The k cluster centres are initialized as follows:
2.1.3.1 The dimension clustering module randomly selects an element (w_l, h_l) from S, l ∈ [1, N], sets it as the first cluster centre, adds it to the set C_1, and lets N' = 1;
2.1.3.2 Let m = 1 and n = 1;
2.1.3.3 The dimension clustering module computes the distance d((w_m, h_m), (w_n, h_n)) between element (w_m, h_m) in S and element (w_n, h_n) in C_1:
d((w_m, h_m), (w_n, h_n)) = 1 - IOU((w_m, h_m), (w_n, h_n))
where, for any element (a, b) in S (a and b being the width w_m and height h_m of a mark box) and any element (c, d) in C_1 (c and d being the width w_n and height h_n of a cluster centre), the dimension clustering module computes the intersection-over-union IOU of the two rectangles, aligned at a common corner, as follows:
If a ≥ c and b ≥ d, then IOU((a, b), (c, d)) = cd / (ab);
If a ≥ c and b ≤ d, then IOU((a, b), (c, d)) = cb / (ab + cd - cb);
If a ≤ c and b ≥ d, then IOU((a, b), (c, d)) = ad / (ab + cd - ad);
If a ≤ c and b ≤ d, then IOU((a, b), (c, d)) = ab / (cd).
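The four cases reduce to a single formula: with both boxes aligned at a common corner, the intersection is min(a, c) * min(b, d) and the union is ab + cd minus that intersection. A small sketch (the function name is my own):

```python
def iou_wh(a, b, c, d):
    """IOU of two boxes of sizes (a, b) and (c, d) aligned at one corner."""
    inter = min(a, c) * min(b, d)
    union = a * b + c * d - inter
    return inter / union

# Case a >= c, b >= d collapses to cd/(ab): (4,4) vs (2,2) -> 4/16
print(iou_wh(4, 4, 2, 2))   # 0.25
# Mixed case a >= c, b <= d gives cb/(ab + cd - cb): 4/12
print(iou_wh(4, 2, 2, 4))
```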
2.1.3.4 If n < N', let n = n + 1 and go to 2.1.3.3; if n = N', go to 2.1.3.5;
2.1.3.5 If m < N, let m = m + 1 and n = 1 and go to 2.1.3.3; if m = N, go to 2.1.3.6;
2.1.3.6 Let m = 1, n = 1, and D(w_m, h_m) = 1; D(w_m, h_m) denotes the minimum of the distances between element (w_m, h_m) in S and the elements (w_n, h_n) in C_1;
2.1.3.7 If d((w_m, h_m), (w_n, h_n)) < D(w_m, h_m), let D(w_m, h_m) = d((w_m, h_m), (w_n, h_n)) and go to 2.1.3.8; otherwise go directly to 2.1.3.8;
2.1.3.8 If n < N', let n = n + 1 and go to 2.1.3.7; if n = N', go to 2.1.3.9;
2.1.3.9 If m < N, let m = m + 1, n = 1, and D(w_m, h_m) = 1, and go to 2.1.3.7; if m = N, go to 2.1.3.10;
2.1.3.10 The dimension clustering module computes the sum of the minimum distances, SUM = D(w_1, h_1) + D(w_2, h_2) + … + D(w_N, h_N);
2.1.3.11 The dimension clustering module selects the (N' + 1)-th cluster centre with probability proportional to distance:
2.1.3.11.1 Multiply SUM by a random value random (random ∈ [0, 1]) to obtain r; initialize the running sum cur = 0 and let m = 1;
2.1.3.11.2 The dimension clustering module computes cur = cur + D(w_m, h_m);
2.1.3.11.3 If cur ≤ r, let m = m + 1 and go to 2.1.3.11.2; if cur > r, add the element (w_m, h_m) of S to the set C_1, let N' = N' + 1, and go to 2.1.3.12;
2.1.3.12 If N' < k, go to step 2.1.3.2; if N' = k, the first cluster centre set C_1 has been obtained; go to 2.1.4.
2.1.4 Let the iteration counter t = 1. The dimension clustering module iterates to generate the (t + 1)-th cluster centre set, as follows:
2.1.4.1 According to the distance from each element in S to the k cluster centres in C_t, the dimension clustering module assigns each element of S to the cluster of its nearest centre: for each element of S there is one cluster centre in C_t at minimum distance d. The elements closest to the first cluster centre (w_1, h_1) are grouped into a set C1, those closest to the second cluster centre (w_2, h_2) into a set C2, and so on, yielding k sets denoted C1, C2, …, Cp, …, Ck, p ∈ [1, k].
2.1.4.2 Compute the means (w'_1, h'_1), (w'_2, h'_2), …, (w'_p, h'_p), …, (w'_k, h'_k) of the elements of C1, C2, …, Cp, …, Ck respectively, where w'_p is the arithmetic mean of the abscissas (widths) of the elements of Cp and h'_p is the arithmetic mean of their ordinates (heights). The k means form the (t + 1)-th cluster centre set C_{t+1}; let t = t + 1;
2.1.4.3 If t < Num, go to step 2.1.4.1; if t = Num, write the k elements of the current C_{t+1} into the cfg configuration file as the widths and heights of the prior boxes, and go to 2.2.
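Steps 2.1.2 to 2.1.4 amount to k-means over the box sizes with 1 - IOU as the distance, seeded by choosing new centres with probability proportional to distance (a k-means++-style scheme). A self-contained sketch under that reading (the names are mine, and the seeding is slightly simplified relative to the step-by-step loops above):

```python
import random

def iou(box, centre):
    (a, b), (c, d) = box, centre
    inter = min(a, c) * min(b, d)
    return inter / (a * b + c * d - inter)

def dimension_cluster(boxes, k, num_iter=30, seed=0):
    """Cluster (width, height) mark boxes into k prior boxes
    using the distance d = 1 - IOU, as in steps 2.1.2-2.1.4."""
    rng = random.Random(seed)
    centres = [rng.choice(boxes)]
    while len(centres) < k:                  # distance-weighted seeding (2.1.3)
        dists = [min(1 - iou(b, c) for c in centres) for b in boxes]
        r = rng.random() * sum(dists)
        cur = 0.0
        for b, d in zip(boxes, dists):
            cur += d
            if cur > r:
                centres.append(b)
                break
    for _ in range(num_iter):                # mean-update iterations (2.1.4)
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = min(range(k), key=lambda i: 1 - iou(b, centres[i]))
            clusters[j].append(b)
        centres = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centres[i]
            for i, cl in enumerate(clusters)
        ]
    return centres  # written to the cfg file as the prior box sizes

# Hypothetical mark boxes from three size groups:
boxes = [(10, 12), (11, 11), (50, 48), (52, 55), (30, 90), (28, 85)]
print(sorted(dimension_cluster(boxes, k=3)))
```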
2.2 The parsing module receives the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
2.4 The network layer module receives the network layer parameters from the parsing module, uses them to instantiate each network layer, defines the focal loss function in the output layer, and sends the layers to the neural network construction module.
2.5 The neural network construction module receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers according to the structure parameters, and builds the basic framework of the neural network.
In the third step, the cloud server and the client cooperate to train the neural network and complete its construction, as follows:
3.1 The functional module obtains a training instruction from the client;
3.2 The input/output module, the functional module, and the neural network construction module train the basic framework of the neural network, as follows:
3.2.1 The input/output module receives the training set images from the client and converts them into structures the program can recognize and process, such as image, data, and box structures.
3.2.2 The input/output module sends the structures to the functional module.
3.2.3 The neural network construction module takes random numbers as input, initializes the network weight parameters of the neural network, assigns these weight parameters to the basic framework of the neural network, and completes the construction of the initial neural network.
3.2.4 The neural network construction module sends the initial neural network to the functional module;
3.2.5 The functional module trains the neural network with the structures: taking the structures as input, it invokes the focal loss function in the output layer of the initial neural network, guides the training of the neural network with the focal loss, and generates the trained network weight parameters (for the principle and method of network weight updates, see Jake Bouvrie, Notes on Convolutional Neural Networks, 2006).
3.3 The functional module stores the trained network weight parameters in the network weight file.
3.4 The neural network construction module reads the trained network weight parameters from the network weight file, assigns them to the basic framework of the neural network, and completes the construction of the neural network.
In the fourth step, the cloud server and the client cooperate to perform target detection and recognition on the images to be tested, as follows:
4.1 The functional module receives a detection instruction from the client;
4.2 The functional module, the input/output module, and the neural network construction module cooperate to perform target detection and recognition, as follows:
4.2.1 The input/output module receives a test set image P to be detected from the client and converts P into structures the functional module can recognize and process, such as image, data, and box structures;
4.2.2 The functional module receives the structures from the input/output module and the neural network from the neural network construction module;
4.2.3 The functional module calls the slice detection function and uses the structures to perform slice-by-slice detection on the test set image P, as follows:
4.2.3.1 Let the width and height of P be W and H, with the upper-left corner as the coordinate origin (0, 0). Let m = 0 and n = 0. M is the size of the neural network's input layer, generally between 100 and 1000;
4.2.3.2 The slice detection function uses the neural network to perform target detection on the slice whose width coordinates lie in the interval [m, m + M] and whose height coordinates lie in the interval [n, n + M]. The prediction for the slice is computed by the neural network's output layer; target positions are predicted by adding position offsets to the prior boxes. This yields the recognition result for the slice with width coordinates in [m, m + M] and height coordinates in [n, n + M], i.e., the position coordinates and class of each target;
4.2.3.3 If m < W - M, let m = m + M and go to 4.2.3.2; if W - M ≤ m ≤ W, go to 4.2.3.4;
4.2.3.4 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, W] and height coordinates in [n, n + M], obtaining the recognition result for that slice; let m = 0;
4.2.3.5 If n < H - M, let n = n + M and go to 4.2.3.2; if H - M ≤ n ≤ H, go to 4.2.3.6;
4.2.3.6 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, m + M] and height coordinates in [n, H], obtaining the recognition result for that slice;
4.2.3.7 If m < W - M, let m = m + M and go to 4.2.3.6; if W - M ≤ m ≤ W, go to 4.2.3.8;
4.2.3.8 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, W] and height coordinates in [n, H], obtaining the recognition result for that slice;
4.2.3.9 The slice detection function integrates the recognition results of all the slices obtained in 4.2.3.2 (width coordinates in [m, m + M], height coordinates in [n, n + M]), 4.2.3.4 (width coordinates in [m, W], height coordinates in [n, n + M]), 4.2.3.6 (width coordinates in [m, m + M], height coordinates in [n, H]), and 4.2.3.8 (width coordinates in [m, W], height coordinates in [n, H]) to obtain the recognition result for the entire image P (of width W and height H).
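The bookkeeping in steps 4.2.3.1 to 4.2.3.9 amounts to tiling the W×H image into M×M windows plus right-edge and bottom-edge remainders. A compact sketch of the slice-coordinate generation only (the names are mine; running the detector on each slice and merging its boxes, offset by the slice origin, is as described in the text):

```python
def slice_ranges(W, H, M):
    """Yield (x0, x1, y0, y1) coordinate ranges covering a W x H image with
    M x M slices plus right/bottom edge remainders (steps 4.2.3.1-4.2.3.8)."""
    xs = list(range(0, W, M))
    ys = list(range(0, H, M))
    for n in ys:
        for m in xs:
            yield m, min(m + M, W), n, min(n + M, H)

# A hypothetical 1000 x 900 image with M = 416 yields 3 x 3 = 9 slices;
# each slice's detections get (x0, y0) added to their box coordinates
# before the per-slice results are merged, as in step 4.2.3.9.
tiles = list(slice_ranges(1000, 900, 416))
print(len(tiles))           # 9
print(tiles[0], tiles[-1])  # (0, 416, 0, 416) (832, 1000, 832, 900)
```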
4.2.4 The functional module transmits the recognition result of P to the input/output module;
4.3 The input/output module outputs the recognition result of P to the client.
Following technical effect can achieve using the present invention:
1. second step of the present invention can extract the elder generation of training set by design dimension cluster module from training set label file
Information is tested, priori frame size is calculated, improves the positional accuracy of target;
2. the present invention is added to focused lost function instead of existing loss function, the training of network is made to lay particular emphasis on image
In Small object, improve the recognition accuracy of Small object;
3. the present invention loses serious situation for big map sheet image detection information, using segmentation detection function, improve
The detection speed of big map sheet image and the accuracy of detection.
Description of the drawings
Fig. 1 is the architecture diagram of an existing deep-learning-based target recognition system;
Fig. 2 is the overall flowchart of the present invention;
Fig. 3 is the architecture diagram of the deep-learning-based data center cloud target recognition system designed by the present invention.
Specific embodiment
Fig. 2 is the overall flowchart of the present invention. As shown in Fig. 2, the present invention comprises the following steps:
In the first step, a deep-learning-based data center cloud target recognition system is built. As shown in Fig. 3, the system consists of a cloud server and clients. Telnet software is installed on each client, and the client stores the data set required by the detection task; the data set includes the test set images to be detected, the training set images to be trained on, and the training set label file. A client logs in to the cloud server via telnet, uploads the data set to the cloud server, sends training and detection instructions to the cloud server before training and detection begin, and performs remote training and detection on the cloud server; the cloud server performs neural network training and target recognition according to the client's instructions, schedules its own computing and storage resources, and sends the neural network's training progress information and the recognition results to the client.
Besides being equipped with an input/output module, a functional module, a parsing module, a network-layer module, a neural network construction module, a cfg configuration file and a network weight file, the cloud server is also equipped with a dimension clustering module.
The dimension clustering module is connected with the client and the cfg configuration file. The dimension clustering module receives the training-set label file from the client, analyzes the bounding-box information in the training-set label file, calculates the prior-box sizes by clustering, and writes the prior-box sizes into the cfg configuration file.
The cfg configuration file is connected with the dimension clustering module and the parsing module. Besides recording the network parameters for constructing the convolutional neural network, the cfg configuration file also stores the prior-box sizes received from the dimension clustering module as output-layer parameters of the network.
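For illustration only — the source does not specify the file layout, so the section and key names below are hypothetical, modeled on common YOLO-style cfg files — the prior-box sizes stored as output-layer parameters might look like:

```ini
; hypothetical output-layer section of the cfg configuration file
[output]
classes=5
num=9
; prior-box width,height pairs written by the dimension clustering module
anchors=10,13, 16,30, 33,23, 62,45, 59,119, 116,90, 156,198, 373,326, 373,90
```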
The parsing module is connected with the cfg configuration file, the network-layer module and the neural network construction module. The module reads the network parameters for constructing the neural network from the cfg configuration file, parses the network parameters into network-layer parameters and network-structure parameters, sends the network-layer parameters to the network-layer module, and sends the network-structure parameters to the neural network construction module.
The network-layer module is connected with the parsing module and the neural network construction module. The module receives the network-layer parameters from the parsing module, instantiates each network layer using the network-layer parameters, and sends the instantiated network layers to the neural network construction module. Unlike the network-layer module shown in Fig. 1, the loss function defined in the output layer of the convolutional neural network is a focal loss function. The focal loss function is an improvement of the cross-entropy loss function: it distinguishes detection targets by their detection difficulty and increases the weight of hard-to-detect targets, such as small objects, in the loss function, enhancing the detection effect for small objects.
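The source does not give the exact form of the focal loss, so the sketch below assumes the standard formulation FL(p_t) = −α_t(1 − p_t)^γ·log(p_t), which modifies the cross-entropy by down-weighting well-classified (easy) examples; all names and default values are illustrative:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for a binary detection confidence.

    p: predicted probability of the positive class; y: 0/1 label.
    The modulating factor (1 - p_t)**gamma shrinks the loss of easy
    examples, so hard targets (e.g. small objects) dominate training.
    With gamma = 0 this reduces to alpha-weighted cross-entropy.
    """
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(np.clip(p_t, 1e-12, 1.0))

# An easy positive (p = 0.9) contributes far less than a hard one (p = 0.1):
easy = focal_loss(np.array([0.9]), np.array([1]))
hard = focal_loss(np.array([0.1]), np.array([1]))
```

Relative to plain cross-entropy, the easy example here is suppressed by the factor (1 − 0.9)² = 0.01 while the hard example keeps most of its loss, which is the sense in which training lays emphasis on hard, small targets.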
The neural network construction module is connected with the parsing module, the network-layer module, the network weight file and the functional module. The neural network construction module receives the network-structure parameters from the parsing module and the network layers from the network-layer module, combines the network layers in order according to the network-structure parameters, and constructs the basic framework of the neural network. The neural network construction module also reads the network weight parameters from the network weight file and assigns the weight parameters to the basic framework of the neural network, completing the construction of the neural network; it then sends the neural network to the functional module.
The network weight file is connected with the neural network construction module and the functional module. The network weight file stores the network weight parameters received from the functional module, to be read by the neural network construction module.
The input/output module is connected with the functional module. The input/output module receives the test-set images to be detected from the client, converts the images into structures that the program can identify and process (such as image, data and box structures), and sends these structures to the functional module.
The functional module is connected with the input/output module, the neural network construction module, the network weight file and the client. As in Fig. 1, the functional module contains a training function and a detection function: the functional module calls the training function to train the neural network and sends the network weight parameters to the network weight file. Different from Fig. 1, the detection function is changed into a sliced detection function: the functional module calls the sliced detection function to perform target detection using the neural network, obtains the neural network's recognition result for the image, and sends the recognition result to the input/output module.
The second step: the dimension clustering module, parsing module, network-layer module and neural network construction module of the cloud server cooperate to construct the basic framework of the neural network. The method is as follows:
2.1 The dimension clustering module receives the training-set label file from the client, reads the bounding-box information from the training-set label file, and finds the prior-box sizes. The method is:
2.1.1 The dimension clustering module obtains the bounding-box information of the targets in the training-set pictures to be trained from the training-set label file. With the two-tuple (wi, hi) formed by the width and height of each bounding box as an element (w denotes width, h denotes height, i denotes the bounding-box serial number), a set S is constituted. The number of elements in S is N, where N is the number of bounding boxes in the pictures to be trained, i ∈ [1, N];
2.1.2 The dimension clustering module sets the number of cluster centers to k, where k is a positive integer, and defines the maximum number of iterations Num, where Num is generally an integer between 10 and 100. It initializes the first cluster-center set C1 as the empty set; let the current number of elements in C1 be N', with initial value 0.
2.1.3 Initialize the k cluster centers. The method is:
2.1.3.1 The dimension clustering module randomly selects an element (wl, hl) from S, l ∈ [1, N], sets it as the first cluster center, adds it to the set C1, and lets the variable N' = 1;
2.1.3.2 Let the variables m = 1 and n = 1;
2.1.3.3 The dimension clustering module calculates the distance d((wm, hm), (wn, hn)) between the element (wm, hm) in S and the element (wn, hn) in C1:
d((wm, hm), (wn, hn)) = 1 − IOU((wm, hm), (wn, hn))
Wherein, for any element (a, b) in S (a and b being the width wm and height hm of a bounding box) and any element (c, d) in C1 (c and d being the width wn and height hn of a cluster center), the dimension clustering module calculates the intersection-over-union IOU of the two rectangular boxes as follows:
If a ≥ c and b ≥ d, then IOU = cd/(ab);
If a ≥ c and b ≤ d, then IOU = cb/(ab + cd − cb);
If a ≤ c and b ≥ d, then IOU = ad/(ab + cd − ad);
If a ≤ c and b ≤ d, then IOU = ab/(cd).
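The four cases above are equivalent to the single expression IOU = min(a, c)·min(b, d) / (ab + cd − min(a, c)·min(b, d)), since the boxes are compared by size only, as if they share a corner. A small sketch (function names are illustrative):

```python
def iou_wh(a, b, c, d):
    """Intersection-over-union of two boxes of sizes (a, b) and (c, d),
    compared as if they share a corner: intersection = min(a, c) * min(b, d)."""
    inter = min(a, c) * min(b, d)
    return inter / (a * b + c * d - inter)

def box_distance(wm, hm, wn, hn):
    """Clustering distance d = 1 - IOU used by the dimension clustering module."""
    return 1.0 - iou_wh(wm, hm, wn, hn)

# Identical boxes have distance 0; very different shapes approach 1:
d_same = box_distance(30.0, 40.0, 30.0, 40.0)
d_diff = box_distance(300.0, 10.0, 10.0, 300.0)
```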
2.1.3.4 If n < N', let n = n + 1 and go to 2.1.3.3; if n = N', go to 2.1.3.5;
2.1.3.5 If m < N, let m = m + 1 and n = 1, and go to 2.1.3.3; if m = N, go to 2.1.3.6;
2.1.3.6 Let the variables m = 1, n = 1, and D(wm, hm) = 1; D(wm, hm) is the minimum of the distances between the element (wm, hm) in S and the elements (wn, hn) in C1;
2.1.3.7 If d((wm, hm), (wn, hn)) < D(wm, hm), let D(wm, hm) = d((wm, hm), (wn, hn)) and go to 2.1.3.8; otherwise go directly to 2.1.3.8;
2.1.3.8 If n < N', let n = n + 1 and go to 2.1.3.7; if n = N', go to 2.1.3.9;
2.1.3.9 If m < N, let m = m + 1, n = 1 and D(wm, hm) = 1, and go to 2.1.3.7; if m = N, go to 2.1.3.10;
2.1.3.10 The dimension clustering module calculates the sum of the minimum distances, SUM = D(w1, h1) + D(w2, h2) + … + D(wN, hN);
2.1.3.11 The dimension clustering module selects the (N'+1)-th cluster center by distance-weighted probability:
2.1.3.11.1 Obtain the value r = SUM × random, where random is a random value in [0, 1]; initialize the running-sum variable cur = 0 and let m = 1;
2.1.3.11.2 The dimension clustering module calculates cur = cur + D(wm, hm);
2.1.3.11.3 If cur ≤ r, let m = m + 1 and go to 2.1.3.11.2; if cur > r, add the element (wm, hm) in S to the set C1, let N' = N' + 1, and go to 2.1.3.12;
2.1.3.12 If N' < k, go to step 2.1.3.2; if N' = k, the first cluster-center set C1 is obtained; go to 2.1.4.
2.1.4 Let the iteration number t = 1. The dimension clustering module iteratively generates the (t+1)-th cluster-center set. The steps are as follows:
2.1.4.1 According to the distance between each element in S and the k cluster centers in Ct, the dimension clustering module assigns each element in S to the cluster of its nearest cluster center. The method is: for each element in S, there is one cluster center in Ct whose distance d to it is minimal. The elements closest to the first cluster center (w1, h1) are divided into a set C1, the elements closest to the second cluster center (w2, h2) are divided into a set C2, and so on, obtaining k sets denoted C1, C2, …, Cp, …, Ck, p ∈ [1, k].
2.1.4.2 Find the means (w'1, h'1), (w'2, h'2), …, (w'p, h'p), …, (w'k, h'k) of the elements in C1, C2, …, Cp, …, Ck respectively, where w'p is the arithmetic mean of the widths of the elements in Cp and h'p is the arithmetic mean of the heights of the elements in Cp. The k means obtained form the (t+1)-th cluster-center set Ct+1; let t = t + 1;
2.1.4.3 If t < Num, go to step 2.1.4.1; if t = Num, write the k elements of the current Ct+1 into the cfg configuration file as the widths and heights of the prior boxes, and go to 2.2.
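Steps 2.1.3-2.1.4 amount to a k-means++-style initialization followed by k-means iterations under the 1 − IOU distance. A minimal sketch under those assumptions (the helper names are illustrative, not from the source):

```python
import random

def iou(box, center):
    """IOU of two (width, height) pairs compared as if sharing a corner."""
    w1, h1 = box
    w2, h2 = center
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def dist(box, center):
    return 1.0 - iou(box, center)

def init_centers(S, k):
    """k-means++-style initialization (steps 2.1.3.1-2.1.3.12): pick the
    first center at random, then pick each next center with probability
    proportional to its distance to the nearest already-chosen center."""
    centers = [random.choice(S)]
    while len(centers) < k:
        D = [min(dist(s, c) for c in centers) for s in S]
        r = random.random() * sum(D)
        cur = 0.0
        for s, d in zip(S, D):
            cur += d
            if cur > r:
                centers.append(s)
                break
    return centers

def cluster_priors(S, k, num_iter=30):
    """k-means with the 1 - IOU distance (steps 2.1.4.1-2.1.4.3):
    returns k (width, height) prior boxes."""
    centers = init_centers(S, k)
    for _ in range(num_iter):
        clusters = [[] for _ in range(k)]
        for s in S:  # assign each box to its nearest center
            j = min(range(k), key=lambda j: dist(s, centers[j]))
            clusters[j].append(s)
        centers = [  # move each center to the mean of its cluster
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return centers
```

For example, boxes drawn from two well-separated size groups yield one prior box near the mean of each group.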
2.2 The parsing module receives the network parameters for constructing the neural network from the cfg configuration file, parses the network parameters into network-layer parameters and network-structure parameters, sends the network-layer parameters to the network-layer module, and sends the network-structure parameters to the neural network construction module.
2.4 The network-layer module receives the network-layer parameters from the parsing module, instantiates each network layer using the network-layer parameters, defines the focal loss function in the output layer, and sends the network layers to the neural network construction module.
2.5 The neural network construction module receives the network-structure parameters from the parsing module and the network layers from the network-layer module, combines the network layers according to the network-structure parameters, and constructs the basic framework of the neural network.
The third step: the cloud server and the client cooperate to train the neural network and complete its construction. The method is:
3.1 The functional module obtains the training instruction from the client;
3.2 The input/output module, functional module and neural network construction module train the basic framework of the neural network. The method is:
3.2.1 The input/output module receives the training-set pictures to be trained from the client and converts them into structures that the program can identify and process.
3.2.2 The input/output module sends the structures to the functional module.
3.2.3 The neural network construction module initializes the network weight parameters of the neural network with random numbers and assigns the weight parameters to the basic framework of the neural network, completing the construction of the initial neural network.
3.2.4 The neural network construction module sends the initial neural network to the functional module;
3.2.5 The functional module trains the neural network using the structures. The method is: the functional module takes the structures as input, calls the focal loss function in the output layer of the initial neural network, trains the neural network under the guidance of the focal loss function, and generates the trained network weight parameters.
3.3 The functional module stores the trained network weight parameters into the network weight file.
3.4 The neural network construction module reads the trained network weight parameters from the network weight file and assigns them to the basic framework of the neural network, completing the construction of the neural network.
The fourth step: the cloud server and the client cooperate to perform target detection and identification on the images to be detected. The method is:
4.1 The functional module receives the detection instruction from the client;
4.2 The input/output module, functional module and neural network construction module cooperate to perform target detection and identification. The method is:
4.2.1 The input/output module receives the test-set picture P to be detected from the client and converts P into structures that the functional module can identify and process, such as image, data and box structures;
4.2.2 The functional module receives the structures from the input/output module and receives the neural network from the neural network construction module;
4.2.3 The functional module calls the sliced detection function and uses the structures to perform sliced detection on the test-set picture P to be detected. The method is as follows:
4.2.3.1 Suppose the width and height of P are W and H, with the upper-left corner as the coordinate origin (0, 0). Let m = 0 and n = 0; M is the size of the input layer of the neural network, generally between 100 and 1000;
4.2.3.2 The sliced detection function uses the neural network to perform target detection on the slice whose width coordinate lies in the interval [m, m+M] and whose height coordinate lies in the interval [n, n+M]. The prediction result of the slice is calculated by the output layer of the neural network; target positions are predicted by adding position offsets to the prior boxes. This yields the recognition result of the slice with width coordinate in [m, m+M] and height coordinate in [n, n+M], i.e. the position coordinates and class of each target;
4.2.3.3 If m < W − M, let m = m + M and go to 4.2.3.2; if W − M ≤ m ≤ W, go to 4.2.3.4;
4.2.3.4 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, W] and height coordinate in the interval [n, n+M], obtaining the recognition result of that slice; let m = 0;
4.2.3.5 If n < H − M, let n = n + M and go to 4.2.3.2; if H − M ≤ n ≤ H, go to 4.2.3.6;
4.2.3.6 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, m+M] and height coordinate in the interval [n, H], obtaining the recognition result of that slice;
4.2.3.7 If m < W − M, let m = m + M and go to 4.2.3.6; if W − M ≤ m ≤ W, go to 4.2.3.8;
4.2.3.8 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, W] and height coordinate in the interval [n, H], obtaining the recognition result of that slice;
4.2.3.9 The sliced detection function integrates the recognition results of all slices obtained in 4.2.3.2 (width coordinate in [m, m+M], height coordinate in [n, n+M]), in 4.2.3.4 (width coordinate in [m, W], height coordinate in [n, n+M]), in 4.2.3.6 (width coordinate in [m, m+M], height coordinate in [n, H]) and in 4.2.3.8 (width coordinate in [m, W], height coordinate in [n, H]), obtaining the recognition result of the entire image P (of width W and height H).
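The scan in steps 4.2.3.1-4.2.3.9 can be sketched as follows; the detect_slice callback stands in for the neural network and is an assumption, not part of the source:

```python
def slice_windows(W, H, M):
    """Enumerate slice rectangles as in steps 4.2.3.2-4.2.3.8: full
    M x M tiles scanned left-to-right, top-to-bottom, plus the
    right-edge, bottom-edge and corner remainders, so that every
    pixel of the W x H image falls inside some slice."""
    def spans(L):
        s, out = 0, []
        while s < L - M:          # interior tiles
            out.append((s, s + M))
            s += M
        out.append((s, L))        # edge remainder, at most M long
        return out
    return [(x0, x1, y0, y1) for y0, y1 in spans(H) for x0, x1 in spans(W)]

def sliced_detect(W, H, M, detect_slice):
    """Run the detector on each slice, shift its slice-local results
    back into whole-image coordinates, and merge them (step 4.2.3.9).
    detect_slice(x0, x1, y0, y1) -> list of (x, y, w, h, cls)."""
    merged = []
    for x0, x1, y0, y1 in slice_windows(W, H, M):
        for x, y, w, h, cls in detect_slice(x0, x1, y0, y1):
            merged.append((x + x0, y + y0, w, h, cls))
    return merged
```

For example, a 1000 x 700 image with M = 300 yields a 4 x 3 grid of slices, the last column being 100 pixels wide and the last row 100 pixels high.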
4.2.4 The functional module transmits the recognition result of P to the input/output module;
4.3 The input/output module outputs the recognition result of P to the client.