A deep-learning-based data center cloud target recognition method
Technical field
The present invention relates to the field of target recognition, and more particularly to a deep-learning-based data center cloud target recognition method.
Background art
In recent years, the development of deep learning has made end-to-end learning, which takes raw data directly as input, feasible. Convolutional neural networks (CNNs) have strong feature extraction and learning capabilities. Most existing deep-learning-based object detection methods are therefore built on convolutional neural networks: by training on and learning from large-scale data they obtain highly effective deep features, and by constructing relatively complex network structures they fully mine the associations within the data, realizing end-to-end target recognition.
As shown in Fig. 1, an existing deep-learning-based object detection system consists of an input/output module, a functional module, a parsing module, a network layer module, a neural network construction module, a cfg configuration file, and a network weight file.
The cfg configuration file is connected to the parsing module. It records the network parameters for building the convolutional neural network, to be read by the parsing module; these parameters divide into network layer parameters and network structure parameters. The network layer parameters include, for each network layer (a convolutional neural network comprises an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer), the number of neurons in each feature map (the size of the layer) and the number of feature maps (the dimension of the layer). The output layer of the convolutional neural network records the prior box sizes; the prior boxes are set according to the size and aspect ratio of ordinary images (generally the size and ratio of an ordinary image, scaled up or down), and their values are fixed. The network structure parameters include the types, numbers, and combination order of the network layers that make up the neural network.
The network weight file is connected to the neural network construction module and the functional module. It stores the network weight parameters received from the functional module, to be read by the neural network construction module. The network weight parameters are the coefficients of the expressions relating the input and output of each neuron across the connections between layers.
The parsing module is connected to the cfg configuration file, the network layer module, and the neural network construction module. It reads the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
The network layer module is connected to the parsing module and the neural network construction module. It receives the network layer parameters from the parsing module, uses them to instantiate each network layer, and sends the instantiated layers to the neural network construction module. A loss function is defined in the output layer of the convolutional neural network; it measures the gap between the values the network predicts and the true values. The smaller the loss, the better the model fits. Common loss functions include the 0-1 loss, the absolute-error loss, the quadratic (squared-error) loss, and the cross-entropy loss. The loss functions used by existing convolutional neural networks for target recognition are based on the quadratic loss. They are mainly suited to ordinary image recognition (in ordinary images the targets occupy a high proportion of the image, are few in number, and are sparsely distributed) and make no distinction between large and small targets.
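The quadratic and cross-entropy losses mentioned above can be illustrated with a small sketch (an illustration only, not part of the system described here; the function names are my own):

```python
import math

def quadratic_loss(pred, true):
    # Squared-error loss: penalizes errors symmetrically,
    # regardless of how rare or hard the example is.
    return (pred - true) ** 2

def cross_entropy_loss(p, y):
    # Binary cross-entropy for a predicted probability p of true label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A confidently wrong prediction is punished far more by cross-entropy:
print(quadratic_loss(0.1, 1.0))                 # ~0.81
print(round(cross_entropy_loss(0.1, 1.0), 3))   # 2.303
```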
A CNN is trained by a training function, which divides training into two phases: forward propagation and backpropagation. In the forward phase, each layer's Forward (forward propagation) function is called in turn to obtain layer-by-layer outputs; the last layer (the output layer) emits the predicted values, and the loss function compares them with the sample's true values to yield a loss value. The output layer's Backward (backpropagation) function then computes the weight parameter updates, and each layer's Backward function is called in turn until backpropagation reaches the first layer; at the end of the backward pass the network weight parameters are updated together. As training proceeds, the weights are updated continually and the loss keeps decreasing, i.e., the error between the network's predictions and the true values shrinks. When the loss no longer decreases, training is complete and the network weight parameters are obtained. Choosing a different loss function makes the training of the neural network emphasize different aspects: the per-layer weight updates during training differ, and so does the detection performance of the final model.
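The two-phase scheme above can be sketched for a single linear neuron trained by gradient descent (a toy illustration with invented names; a real CNN applies the same two phases layer by layer):

```python
# Toy forward/backward training of one linear neuron y = w*x + b
# on the target function y = 2x + 1, with quadratic loss.
samples = [(x, 2 * x + 1) for x in range(-3, 4)]
w, b, lr = 0.0, 0.0, 0.05

for epoch in range(200):
    for x, y_true in samples:
        y_pred = w * x + b              # forward propagation
        loss = (y_pred - y_true) ** 2   # loss at the output layer
        grad = 2 * (y_pred - y_true)    # backpropagation: dLoss/dy_pred
        w -= lr * grad * x              # weight updates at the end
        b -= lr * grad                  # of the backward pass

print(round(w, 3), round(b, 3))  # approaches 2.0 and 1.0
```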
The neural network construction module is connected to the parsing module, the network layer module, the network weight file, and the functional module. It receives the network structure parameters from the parsing module and the instantiated layers from the network layer module, combines the layers according to the structure parameters, and builds the basic framework of the neural network. It also obtains the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module.
The input/output module is connected to the functional module. It reads the images to be tested from the user-supplied test set, converts them into structures the functional module can recognize and process (such as image, data, and box structures), and sends these structures to the functional module; it also receives the recognition results from the functional module and outputs them to the user.
The functional module is connected to the input/output module, the neural network construction module, and the network weight file. It calls the training function (see Jake Bouvrie, Notes on Convolutional Neural Networks, 2006) to train the neural network and stores the network weight parameters in the network weight file; it calls the detection function to perform target recognition with the neural network, obtains the network's recognition result for the image, and sends the result to the input/output module.
The existing deep-learning-based target recognition system performs target recognition as follows:
1) The parsing module reads the network parameters from the cfg configuration file, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module;
2) The network layer module receives the network layer parameters from the parsing module, defines and implements each network layer, and outputs the layers to the neural network construction module;
3) The neural network construction module receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers in order according to the structure parameters, and builds the basic framework of the neural network. It then obtains the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module;
4) The input/output module receives the images to be tested from the user, scales each image to a fixed size M×M (generally set to 416×416), converts it into structures the functional module can recognize and process (such as image, data, and box structures), and feeds these structures to the functional module. The functional module calls the training function to train the neural network, obtains the network weight parameters, and stores them in the network weight file; it then calls the detection function to perform target detection with the neural network. The prediction for an image is computed by the network's output layer; target positions are predicted by adding position offsets to the prior boxes (for the detection principle, see Joseph Redmon, YOLO9000: Better, Faster, Stronger, CVPR 2017, pp. 7263-7271, page 3 line 1 to page 4 line 4). Once the network's recognition result for the image is obtained, it is passed to the input/output module, which outputs it to the user.
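The prior-box offset prediction cited from YOLO9000 can be sketched as follows (a simplified illustration of the decode step; the variable names are my own, and the sigmoid/exp parameterization follows the cited paper):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw network outputs (tx, ty, tw, th) into a predicted box,
    given the grid-cell offset (cx, cy) and prior box size (pw, ph)."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = cx + sigmoid(tx)     # centre x, constrained inside the cell
    by = cy + sigmoid(ty)     # centre y
    bw = pw * math.exp(tw)    # width  = prior width scaled by exp(tw)
    bh = ph * math.exp(th)    # height = prior height scaled by exp(th)
    return bx, by, bw, bh

# With zero offsets the prediction sits at the cell centre
# and takes exactly the prior box's size:
print(decode_box(0, 0, 0, 0, cx=3, cy=5, pw=2.0, ph=4.0))  # (3.5, 5.5, 2.0, 4.0)
```

This makes concrete why a fixed prior size matters: every predicted width and height is a multiplicative perturbation of the prior, so priors far from the true target sizes force large, hard-to-learn offsets.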
However, existing deep-learning-based target recognition methods have the following technical problems:
1) Existing object detection methods predict target box positions as relative offsets from the prior boxes. The preset prior boxes are sized according to the size and ratio of ordinary images, and their values are fixed and stored in the cfg file. In practice, however, the sizes and ratios of many targets differ widely, so the preset fixed prior box sizes should not be used directly, and the accuracy of target recognition is low;
2) The loss functions of existing object detection methods are designed for ordinary images, in which the targets occupy a high proportion of the image, are few in number, and are sparsely distributed. In practice, however, in many aerial images (such as remote sensing images) the targets occupy only a tiny fraction of the image and are densely distributed. During training the network therefore balances targets and background poorly: small targets are insufficiently trained, and small-target recognition is inaccurate;
3) The input/output module of existing recognition methods applies size-normalization preprocessing to every input image, scaling it to M×M (generally several hundred by several hundred; 416×416 in YOLOv2). Some images, however, such as remote sensing images, often reach thousands or even tens of thousands of pixels on a side, while the targets in them are usually only tens of pixels across. If such an image were fed directly to the neural network for detection, the size normalization would discard much detail (most targets would shrink to a single point), and detection performance on large-format images would be greatly degraded.
In view of this, how to solve the low recognition accuracy for small targets and large-format images, and to effectively improve small-target recognition accuracy and large-format image detection, has become an urgent problem for researchers in this field.
Summary of the invention
The technical problem to be solved by the present invention is to propose a deep-learning-based data center cloud target recognition method that makes full use of the features of the training set images during training, refines prior information through a dimension clustering module, and detects large-format images block by block with a slice detection function, thereby solving the low recognition accuracy for small targets and large-format images and effectively improving small-target recognition accuracy and large-format image detection.
The technical scheme of the present invention is as follows:
In the first step, a deep-learning-based data center cloud target recognition system is built. The system consists of a cloud server and clients. Telnet software is installed on each client, and the client stores the data set required by the detection task. The data set includes the test set images to be detected, the training set images used to train the neural network, and the training set label file; the label file records the mark box information of the targets in the training set images, including each mark box's position coordinates, width, and height, and the class of the target (e.g., aircraft or ship). A client logs in to the cloud server via telnet, uploads the data set to the cloud server, sends training and detection instructions to the cloud server before training and detection begin, and performs remote training and detection on the cloud server; the cloud server performs neural network training and target recognition according to the client's instructions, schedules its own computing and storage resources, and sends the neural network's training progress information and the recognition results to the client.
Besides the input/output module, functional module, parsing module, network layer module, neural network construction module, cfg configuration file, and network weight file, the cloud server is also equipped with a dimension clustering module.
The dimension clustering module is connected to the client and the cfg configuration file. It receives the training set label file from the client, performs refinement analysis on the mark box information in the file, computes the prior box sizes, and writes them into the cfg configuration file.
The cfg configuration file is connected to the dimension clustering module and the parsing module. Besides recording the network parameters for building the convolutional neural network, it also stores the prior box sizes received from the dimension clustering module as output-layer parameters of the network (a kind of network layer parameter).
The parsing module is connected to the cfg configuration file, the network layer module, and the neural network construction module. It reads the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
The network layer module is connected to the parsing module and the neural network construction module. It receives the network layer parameters from the parsing module, uses them to instantiate each network layer, and sends the instantiated layers to the neural network construction module. Unlike the network layer module shown in Fig. 1, the loss function defined in the output layer of the convolutional neural network is the focal loss function. The focal loss is an improvement on the cross-entropy loss: it distinguishes detection targets by how difficult they are to detect and increases the weight that hard targets such as small objects carry in the loss, enhancing detection performance on small targets (for the principle of the focal loss, see Lin T. Y., Goyal P., Girshick R., He K., Focal Loss for Dense Object Detection, ICCV 2017, arXiv preprint arXiv:1708.02002, pp. 1-10). The focal loss is designed for the aerial images encountered in practice, in which the targets usually occupy only a tiny fraction of the image and are densely distributed.
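A minimal sketch of the focal loss for a single binary prediction, following the cited Lin et al. formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) (the parameter values alpha = 0.25 and gamma = 2 are the paper's defaults; the function name is my own):

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one prediction: p is the predicted probability of
    the positive class, y the true label (0 or 1)."""
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights easy, well-classified examples,
    # so hard examples (e.g. small targets) dominate the loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9, 1)   # well-classified positive: tiny loss
hard = focal_loss(0.1, 1)   # misclassified positive: much larger loss
print(easy < hard)  # True
```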
The neural network construction module is connected to the parsing module, the network layer module, the network weight file, and the functional module. It receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers in order according to the structure parameters, and builds the basic framework of the neural network. It also reads the network weight parameters from the network weight file, assigns them to the framework, completes the construction of the neural network, and sends the network to the functional module.
The network weight file is connected to the neural network construction module and the functional module. It stores the network weight parameters received from the functional module, to be read by the neural network construction module.
The input/output module is connected to the functional module. It receives the test set images to be detected from the client, converts them into structures the program can recognize and process (such as image, data, and box structures), and sends these structures to the functional module.
The functional module is connected to the input/output module, the neural network construction module, the network weight file, and the client. As in Fig. 1, the functional module contains a training function and a detection function; it calls the training function to train the neural network and sends the network weight parameters to the network weight file. Unlike Fig. 1, the detection function is changed into a slice detection function: the functional module calls the slice detection function to perform target detection with the neural network, obtains the network's recognition result for the image, and sends the result to the input/output module.
In the second step, the cloud server's dimension clustering module, parsing module, network layer module, and neural network construction module cooperate to build the basic framework of the neural network, as follows:
2.1 The dimension clustering module receives the training set label file from the client, reads the mark box information from it, and computes the prior box sizes, as follows:
2.1.1 The dimension clustering module obtains from the training set label file the mark box information of the targets in the training set images (this information has been marked beforehand by the user). The width and height of each mark box form a pair (w_i, h_i) (w denotes width, h denotes height, and i is the serial number of the mark box); these pairs are the elements of a set S. The number of elements in S is N, the number of mark boxes in the training images, with i ∈ [1, N];
2.1.2 The dimension clustering module sets the number of cluster centres to k, a positive integer, and the maximum number of iterations to Num, generally an integer between 10 and 100. It initializes the first cluster centre set C_1 to the empty set; let N' denote the current number of elements in C_1, with initial value 0.
2.1.3 The k cluster centres are initialized as follows:
2.1.3.1 The dimension clustering module randomly selects an element (w_l, h_l) from S, l ∈ [1, N], sets it as the first cluster centre, adds it to the set C_1, and lets N' = 1;
2.1.3.2 Let m = 1 and n = 1;
2.1.3.3 The dimension clustering module computes the distance d((w_m, h_m), (w_n, h_n)) between element (w_m, h_m) in S and element (w_n, h_n) in C_1:
d((w_m, h_m), (w_n, h_n)) = 1 - IOU((w_m, h_m), (w_n, h_n))
where, for any element (a, b) in S (a and b being the width w_m and height h_m of a mark box) and any element (c, d) in C_1 (c and d being the width w_n and height h_n of a cluster centre), the dimension clustering module computes the intersection-over-union IOU of the two rectangles, aligned at a common corner, as follows:
If a ≥ c and b ≥ d, then IOU((a, b), (c, d)) = cd / (ab);
If a ≥ c and b ≤ d, then IOU((a, b), (c, d)) = cb / (ab + cd - cb);
If a ≤ c and b ≥ d, then IOU((a, b), (c, d)) = ad / (ab + cd - ad);
If a ≤ c and b ≤ d, then IOU((a, b), (c, d)) = ab / (cd).
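The four cases reduce to a single formula: with both boxes aligned at a common corner, the intersection is min(a, c) * min(b, d) and the union is ab + cd minus that intersection. A small sketch (the function name is my own):

```python
def iou_wh(a, b, c, d):
    """IOU of two boxes of sizes (a, b) and (c, d) aligned at one corner."""
    inter = min(a, c) * min(b, d)
    union = a * b + c * d - inter
    return inter / union

# Case a >= c, b >= d collapses to cd/(ab): (4,4) vs (2,2) -> 4/16
print(iou_wh(4, 4, 2, 2))   # 0.25
# Mixed case a >= c, b <= d gives cb/(ab + cd - cb): 4/12
print(iou_wh(4, 2, 2, 4))
```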
2.1.3.4 If n < N', let n = n + 1 and go to 2.1.3.3; if n = N', go to 2.1.3.5;
2.1.3.5 If m < N, let m = m + 1 and n = 1 and go to 2.1.3.3; if m = N, go to 2.1.3.6;
2.1.3.6 Let m = 1, n = 1, and D(w_m, h_m) = 1; D(w_m, h_m) denotes the minimum of the distances between element (w_m, h_m) in S and the elements (w_n, h_n) in C_1;
2.1.3.7 If d((w_m, h_m), (w_n, h_n)) < D(w_m, h_m), let D(w_m, h_m) = d((w_m, h_m), (w_n, h_n)) and go to 2.1.3.8; otherwise go directly to 2.1.3.8;
2.1.3.8 If n < N', let n = n + 1 and go to 2.1.3.7; if n = N', go to 2.1.3.9;
2.1.3.9 If m < N, let m = m + 1, n = 1, and D(w_m, h_m) = 1, and go to 2.1.3.7; if m = N, go to 2.1.3.10;
2.1.3.10 The dimension clustering module computes the sum of the minimum distances, SUM = D(w_1, h_1) + D(w_2, h_2) + … + D(w_N, h_N);
2.1.3.11 The dimension clustering module selects the (N' + 1)-th cluster centre with probability proportional to distance:
2.1.3.11.1 Multiply SUM by a random value random (random ∈ [0, 1]) to obtain r; initialize the running sum cur = 0 and let m = 1;
2.1.3.11.2 The dimension clustering module computes cur = cur + D(w_m, h_m);
2.1.3.11.3 If cur ≤ r, let m = m + 1 and go to 2.1.3.11.2; if cur > r, add the element (w_m, h_m) of S to the set C_1, let N' = N' + 1, and go to 2.1.3.12;
2.1.3.12 If N' < k, go to step 2.1.3.2; if N' = k, the first cluster centre set C_1 has been obtained; go to 2.1.4.
2.1.4 Let the iteration counter t = 1. The dimension clustering module iterates to generate the (t + 1)-th cluster centre set, as follows:
2.1.4.1 According to the distance from each element in S to the k cluster centres in C_t, the dimension clustering module assigns each element of S to the cluster of its nearest centre: for each element of S there is one cluster centre in C_t at minimum distance d. The elements closest to the first cluster centre (w_1, h_1) are grouped into a set C1, those closest to the second cluster centre (w_2, h_2) into a set C2, and so on, yielding k sets denoted C1, C2, …, Cp, …, Ck, p ∈ [1, k].
2.1.4.2 Compute the means (w'_1, h'_1), (w'_2, h'_2), …, (w'_p, h'_p), …, (w'_k, h'_k) of the elements of C1, C2, …, Cp, …, Ck respectively, where w'_p is the arithmetic mean of the abscissas (widths) of the elements of Cp and h'_p is the arithmetic mean of their ordinates (heights). The k means form the (t + 1)-th cluster centre set C_{t+1}; let t = t + 1;
2.1.4.3 If t < Num, go to step 2.1.4.1; if t = Num, write the k elements of the current C_{t+1} into the cfg configuration file as the widths and heights of the prior boxes, and go to 2.2.
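Steps 2.1.2 to 2.1.4 amount to k-means over the box sizes with 1 - IOU as the distance, seeded by choosing new centres with probability proportional to distance (a k-means++-style scheme). A self-contained sketch under that reading (the names are mine, and the seeding is slightly simplified relative to the step-by-step loops above):

```python
import random

def iou(box, centre):
    (a, b), (c, d) = box, centre
    inter = min(a, c) * min(b, d)
    return inter / (a * b + c * d - inter)

def dimension_cluster(boxes, k, num_iter=30, seed=0):
    """Cluster (width, height) mark boxes into k prior boxes
    using the distance d = 1 - IOU, as in steps 2.1.2-2.1.4."""
    rng = random.Random(seed)
    centres = [rng.choice(boxes)]
    while len(centres) < k:                  # distance-weighted seeding (2.1.3)
        dists = [min(1 - iou(b, c) for c in centres) for b in boxes]
        r = rng.random() * sum(dists)
        cur = 0.0
        for b, d in zip(boxes, dists):
            cur += d
            if cur > r:
                centres.append(b)
                break
    for _ in range(num_iter):                # mean-update iterations (2.1.4)
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = min(range(k), key=lambda i: 1 - iou(b, centres[i]))
            clusters[j].append(b)
        centres = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centres[i]
            for i, cl in enumerate(clusters)
        ]
    return centres  # written to the cfg file as the prior box sizes

# Hypothetical mark boxes from three size groups:
boxes = [(10, 12), (11, 11), (50, 48), (52, 55), (30, 90), (28, 85)]
print(sorted(dimension_cluster(boxes, k=3)))
```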
2.2 The parsing module receives the network parameters for building the neural network from the cfg configuration file, parses them into network layer parameters and network structure parameters, sends the network layer parameters to the network layer module, and sends the network structure parameters to the neural network construction module.
2.4 The network layer module receives the network layer parameters from the parsing module, uses them to instantiate each network layer, defines the focal loss function in the output layer, and sends the layers to the neural network construction module.
2.5 The neural network construction module receives the network structure parameters from the parsing module and the layers from the network layer module, combines the layers according to the structure parameters, and builds the basic framework of the neural network.
In the third step, the cloud server and the client cooperate to train the neural network and complete its construction, as follows:
3.1 The functional module obtains a training instruction from the client;
3.2 The input/output module, the functional module, and the neural network construction module train the basic framework of the neural network, as follows:
3.2.1 The input/output module receives the training set images from the client and converts them into structures the program can recognize and process, such as image, data, and box structures.
3.2.2 The input/output module sends the structures to the functional module.
3.2.3 The neural network construction module takes random numbers as input, initializes the network weight parameters of the neural network, assigns these weight parameters to the basic framework of the neural network, and completes the construction of the initial neural network.
3.2.4 The neural network construction module sends the initial neural network to the functional module;
3.2.5 The functional module trains the neural network with the structures: taking the structures as input, it invokes the focal loss function in the output layer of the initial neural network, guides the training of the neural network with the focal loss, and generates the trained network weight parameters (for the principle and method of network weight updates, see Jake Bouvrie, Notes on Convolutional Neural Networks, 2006).
3.3 The functional module stores the trained network weight parameters in the network weight file.
3.4 The neural network construction module reads the trained network weight parameters from the network weight file, assigns them to the basic framework of the neural network, and completes the construction of the neural network.
In the fourth step, the cloud server and the client cooperate to perform target detection and recognition on the images to be tested, as follows:
4.1 The functional module receives a detection instruction from the client;
4.2 The functional module, the input/output module, and the neural network construction module cooperate to perform target detection and recognition, as follows:
4.2.1 The input/output module receives a test set image P to be detected from the client and converts P into structures the functional module can recognize and process, such as image, data, and box structures;
4.2.2 The functional module receives the structures from the input/output module and the neural network from the neural network construction module;
4.2.3 The functional module calls the slice detection function and uses the structures to perform slice-by-slice detection on the test set image P, as follows:
4.2.3.1 Let the width and height of P be W and H, with the upper-left corner as the coordinate origin (0, 0). Let m = 0 and n = 0. M is the size of the neural network's input layer, generally between 100 and 1000;
4.2.3.2 The slice detection function uses the neural network to perform target detection on the slice whose width coordinates lie in the interval [m, m + M] and whose height coordinates lie in the interval [n, n + M]. The prediction for the slice is computed by the neural network's output layer; target positions are predicted by adding position offsets to the prior boxes. This yields the recognition result for the slice with width coordinates in [m, m + M] and height coordinates in [n, n + M], i.e., the position coordinates and class of each target;
4.2.3.3 If m < W - M, let m = m + M and go to 4.2.3.2; if W - M ≤ m ≤ W, go to 4.2.3.4;
4.2.3.4 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, W] and height coordinates in [n, n + M], obtaining the recognition result for that slice; let m = 0;
4.2.3.5 If n < H - M, let n = n + M and go to 4.2.3.2; if H - M ≤ n ≤ H, go to 4.2.3.6;
4.2.3.6 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, m + M] and height coordinates in [n, H], obtaining the recognition result for that slice;
4.2.3.7 If m < W - M, let m = m + M and go to 4.2.3.6; if W - M ≤ m ≤ W, go to 4.2.3.8;
4.2.3.8 The slice detection function uses the neural network to perform target detection on the slice with width coordinates in [m, W] and height coordinates in [n, H], obtaining the recognition result for that slice;
4.2.3.9 The slice detection function integrates the recognition results of all the slices obtained in 4.2.3.2 (width coordinates in [m, m + M], height coordinates in [n, n + M]), 4.2.3.4 (width coordinates in [m, W], height coordinates in [n, n + M]), 4.2.3.6 (width coordinates in [m, m + M], height coordinates in [n, H]), and 4.2.3.8 (width coordinates in [m, W], height coordinates in [n, H]) to obtain the recognition result for the entire image P (of width W and height H).
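The bookkeeping in steps 4.2.3.1 to 4.2.3.9 amounts to tiling the W×H image into M×M windows plus right-edge and bottom-edge remainders. A compact sketch of the slice-coordinate generation only (the names are mine; running the detector on each slice and merging its boxes, offset by the slice origin, is as described in the text):

```python
def slice_ranges(W, H, M):
    """Yield (x0, x1, y0, y1) coordinate ranges covering a W x H image with
    M x M slices plus right/bottom edge remainders (steps 4.2.3.1-4.2.3.8)."""
    xs = list(range(0, W, M))
    ys = list(range(0, H, M))
    for n in ys:
        for m in xs:
            yield m, min(m + M, W), n, min(n + M, H)

# A hypothetical 1000 x 900 image with M = 416 yields 3 x 3 = 9 slices;
# each slice's detections get (x0, y0) added to their box coordinates
# before the per-slice results are merged, as in step 4.2.3.9.
tiles = list(slice_ranges(1000, 900, 416))
print(len(tiles))           # 9
print(tiles[0], tiles[-1])  # (0, 416, 0, 416) (832, 1000, 832, 900)
```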
4.2.4 The functional module transmits the recognition result of P to the input/output module;
4.3 The input/output module outputs the recognition result of P to the client.
Following technical effect can achieve using the present invention:
1. second step of the present invention can extract the elder generation of training set by design dimension cluster module from training set label file
Information is tested, priori frame size is calculated, improves the positional accuracy of target;
2. the present invention is added to focused lost function instead of existing loss function, the training of network is made to lay particular emphasis on image
In Small object, improve the recognition accuracy of Small object;
3. the present invention loses serious situation for big map sheet image detection information, using segmentation detection function, improve
The detection speed of big map sheet image and the accuracy of detection.
Description of the drawings
Fig. 1 is the architecture diagram of an existing deep-learning-based target recognition system;
Fig. 2 is the overall flowchart of the present invention;
Fig. 3 is the architecture diagram of the deep-learning-based data center cloud target recognition system designed by the present invention.
Specific embodiment
Fig. 2 is the overall flowchart of the present invention. As shown in Fig. 2, the present invention comprises the following steps:
In the first step, a deep-learning-based data center cloud target recognition system is built. As shown in Fig. 3, the system consists of a cloud server and clients. Telnet software is installed on each client, and the client stores the data set required by the detection task; the data set includes the test set images to be detected, the training set images to be trained on, and the training set label file. A client logs in to the cloud server via telnet, uploads the data set to the cloud server, sends training and detection instructions to the cloud server before training and detection begin, and performs remote training and detection on the cloud server; the cloud server performs neural network training and target recognition according to the client's instructions, schedules its own computing and storage resources, and sends the neural network's training progress information and the recognition results to the client.
Besides being equipped with an input/output module, a functional module, a parsing module, a network-layer module, a neural network construction module, a cfg configuration file and a network weight file, the cloud server is also equipped with a dimension clustering module.
The dimension clustering module is connected with the client and the cfg configuration file. The dimension clustering module receives the training-set label file from the client, analyzes the bounding-box information in the training-set label file, calculates the prior-box sizes by clustering, and writes the prior-box sizes into the cfg configuration file.
The cfg configuration file is connected with the dimension clustering module and the parsing module. Besides recording the network parameters for constructing the convolutional neural network, the cfg configuration file also stores the prior-box sizes received from the dimension clustering module as output-layer parameters of the network.
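For illustration only — the source does not specify the file layout, so the section and key names below are hypothetical, modeled on common YOLO-style cfg files — the prior-box sizes stored as output-layer parameters might look like:

```ini
; hypothetical output-layer section of the cfg configuration file
[output]
classes=5
num=9
; prior-box width,height pairs written by the dimension clustering module
anchors=10,13, 16,30, 33,23, 62,45, 59,119, 116,90, 156,198, 373,326, 373,90
```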
The parsing module is connected with the cfg configuration file, the network-layer module and the neural network construction module. The module reads the network parameters for constructing the neural network from the cfg configuration file, parses the network parameters into network-layer parameters and network-structure parameters, sends the network-layer parameters to the network-layer module, and sends the network-structure parameters to the neural network construction module.
The network-layer module is connected with the parsing module and the neural network construction module. The module receives the network-layer parameters from the parsing module, instantiates each network layer using the network-layer parameters, and sends the instantiated network layers to the neural network construction module. Unlike the network-layer module shown in Fig. 1, the loss function defined in the output layer of the convolutional neural network is a focal loss function. The focal loss function is an improvement of the cross-entropy loss function: it distinguishes detection targets by their detection difficulty and increases the weight of hard-to-detect targets, such as small objects, in the loss function, enhancing the detection effect for small objects.
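The source does not give the exact form of the focal loss, so the sketch below assumes the standard formulation FL(p_t) = −α_t(1 − p_t)^γ·log(p_t), which modifies the cross-entropy by down-weighting well-classified (easy) examples; all names and default values are illustrative:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for a binary detection confidence.

    p: predicted probability of the positive class; y: 0/1 label.
    The modulating factor (1 - p_t)**gamma shrinks the loss of easy
    examples, so hard targets (e.g. small objects) dominate training.
    With gamma = 0 this reduces to alpha-weighted cross-entropy.
    """
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(np.clip(p_t, 1e-12, 1.0))

# An easy positive (p = 0.9) contributes far less than a hard one (p = 0.1):
easy = focal_loss(np.array([0.9]), np.array([1]))
hard = focal_loss(np.array([0.1]), np.array([1]))
```

Relative to plain cross-entropy, the easy example here is suppressed by the factor (1 − 0.9)² = 0.01 while the hard example keeps most of its loss, which is the sense in which training lays emphasis on hard, small targets.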
The neural network construction module is connected with the parsing module, the network-layer module, the network weight file and the functional module. The neural network construction module receives the network-structure parameters from the parsing module and the network layers from the network-layer module, combines the network layers in order according to the network-structure parameters, and constructs the basic framework of the neural network. The neural network construction module also reads the network weight parameters from the network weight file and assigns the weight parameters to the basic framework of the neural network, completing the construction of the neural network; it then sends the neural network to the functional module.
The network weight file is connected with the neural network construction module and the functional module. The network weight file stores the network weight parameters received from the functional module, to be read by the neural network construction module.
The input/output module is connected with the functional module. The input/output module receives the test-set images to be detected from the client, converts the images into structures that the program can identify and process (such as image, data and box structures), and sends these structures to the functional module.
The functional module is connected with the input/output module, the neural network construction module, the network weight file and the client. As in Fig. 1, the functional module contains a training function and a detection function: the functional module calls the training function to train the neural network and sends the network weight parameters to the network weight file. Different from Fig. 1, the detection function is changed into a sliced detection function: the functional module calls the sliced detection function to perform target detection using the neural network, obtains the neural network's recognition result for the image, and sends the recognition result to the input/output module.
The second step: the dimension clustering module, parsing module, network-layer module and neural network construction module of the cloud server cooperate to construct the basic framework of the neural network. The method is as follows:
2.1 The dimension clustering module receives the training-set label file from the client, reads the bounding-box information from the training-set label file, and finds the prior-box sizes. The method is:
2.1.1 The dimension clustering module obtains the bounding-box information of the targets in the training-set pictures to be trained from the training-set label file. With the two-tuple (wi, hi) formed by the width and height of each bounding box as an element (w denotes width, h denotes height, i denotes the bounding-box serial number), a set S is constituted. The number of elements in S is N, where N is the number of bounding boxes in the pictures to be trained, i ∈ [1, N];
2.1.2 The dimension clustering module sets the number of cluster centers to k, where k is a positive integer, and defines the maximum number of iterations Num, where Num is generally an integer between 10 and 100. It initializes the first cluster-center set C1 as the empty set; let the current number of elements in C1 be N', with initial value 0.
2.1.3 Initialize the k cluster centers. The method is:
2.1.3.1 The dimension clustering module randomly selects an element (wl, hl) from S, l ∈ [1, N], sets it as the first cluster center, adds it to the set C1, and lets the variable N' = 1;
2.1.3.2 Let the variables m = 1 and n = 1;
2.1.3.3 The dimension clustering module calculates the distance d((wm, hm), (wn, hn)) between the element (wm, hm) in S and the element (wn, hn) in C1:
d((wm, hm), (wn, hn)) = 1 − IOU((wm, hm), (wn, hn))
Wherein, for any element (a, b) in S (a and b being the width wm and height hm of a bounding box) and any element (c, d) in C1 (c and d being the width wn and height hn of a cluster center), the dimension clustering module calculates the intersection-over-union IOU of the two rectangular boxes as follows:
If a ≥ c and b ≥ d, then IOU = cd/(ab);
If a ≥ c and b ≤ d, then IOU = cb/(ab + cd − cb);
If a ≤ c and b ≥ d, then IOU = ad/(ab + cd − ad);
If a ≤ c and b ≤ d, then IOU = ab/(cd).
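The four cases above are equivalent to the single expression IOU = min(a, c)·min(b, d) / (ab + cd − min(a, c)·min(b, d)), since the boxes are compared by size only, as if they share a corner. A small sketch (function names are illustrative):

```python
def iou_wh(a, b, c, d):
    """Intersection-over-union of two boxes of sizes (a, b) and (c, d),
    compared as if they share a corner: intersection = min(a, c) * min(b, d)."""
    inter = min(a, c) * min(b, d)
    return inter / (a * b + c * d - inter)

def box_distance(wm, hm, wn, hn):
    """Clustering distance d = 1 - IOU used by the dimension clustering module."""
    return 1.0 - iou_wh(wm, hm, wn, hn)

# Identical boxes have distance 0; very different shapes approach 1:
d_same = box_distance(30.0, 40.0, 30.0, 40.0)
d_diff = box_distance(300.0, 10.0, 10.0, 300.0)
```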
2.1.3.4 If n < N', let n = n + 1 and go to 2.1.3.3; if n = N', go to 2.1.3.5;
2.1.3.5 If m < N, let m = m + 1 and n = 1, and go to 2.1.3.3; if m = N, go to 2.1.3.6;
2.1.3.6 Let the variables m = 1, n = 1, and D(wm, hm) = 1; D(wm, hm) is the minimum of the distances between the element (wm, hm) in S and the elements (wn, hn) in C1;
2.1.3.7 If d((wm, hm), (wn, hn)) < D(wm, hm), let D(wm, hm) = d((wm, hm), (wn, hn)) and go to 2.1.3.8; otherwise go directly to 2.1.3.8;
2.1.3.8 If n < N', let n = n + 1 and go to 2.1.3.7; if n = N', go to 2.1.3.9;
2.1.3.9 If m < N, let m = m + 1, n = 1 and D(wm, hm) = 1, and go to 2.1.3.7; if m = N, go to 2.1.3.10;
2.1.3.10 The dimension clustering module calculates the sum of the minimum distances, SUM = D(w1, h1) + D(w2, h2) + … + D(wN, hN);
2.1.3.11 The dimension clustering module selects the (N'+1)-th cluster center by distance-weighted probability:
2.1.3.11.1 Obtain the value r = SUM × random, where random is a random value in [0, 1]; initialize the running-sum variable cur = 0 and let m = 1;
2.1.3.11.2 The dimension clustering module calculates cur = cur + D(wm, hm);
2.1.3.11.3 If cur ≤ r, let m = m + 1 and go to 2.1.3.11.2; if cur > r, add the element (wm, hm) in S to the set C1, let N' = N' + 1, and go to 2.1.3.12;
2.1.3.12 If N' < k, go to step 2.1.3.2; if N' = k, the first cluster-center set C1 is obtained; go to 2.1.4.
2.1.4 Let the iteration number t = 1. The dimension clustering module iteratively generates the (t+1)-th cluster-center set. The steps are as follows:
2.1.4.1 According to the distance between each element in S and the k cluster centers in Ct, the dimension clustering module assigns each element in S to the cluster of its nearest cluster center. The method is: for each element in S, there is one cluster center in Ct whose distance d to it is minimal. The elements closest to the first cluster center (w1, h1) are divided into a set C1, the elements closest to the second cluster center (w2, h2) are divided into a set C2, and so on, obtaining k sets denoted C1, C2, …, Cp, …, Ck, p ∈ [1, k].
2.1.4.2 Find the means (w'1, h'1), (w'2, h'2), …, (w'p, h'p), …, (w'k, h'k) of the elements in C1, C2, …, Cp, …, Ck respectively, where w'p is the arithmetic mean of the widths of the elements in Cp and h'p is the arithmetic mean of the heights of the elements in Cp. The k means obtained form the (t+1)-th cluster-center set Ct+1; let t = t + 1;
2.1.4.3 If t < Num, go to step 2.1.4.1; if t = Num, write the k elements of the current Ct+1 into the cfg configuration file as the widths and heights of the prior boxes, and go to 2.2.
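Steps 2.1.3-2.1.4 amount to a k-means++-style initialization followed by k-means iterations under the 1 − IOU distance. A minimal sketch under those assumptions (the helper names are illustrative, not from the source):

```python
import random

def iou(box, center):
    """IOU of two (width, height) pairs compared as if sharing a corner."""
    w1, h1 = box
    w2, h2 = center
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def dist(box, center):
    return 1.0 - iou(box, center)

def init_centers(S, k):
    """k-means++-style initialization (steps 2.1.3.1-2.1.3.12): pick the
    first center at random, then pick each next center with probability
    proportional to its distance to the nearest already-chosen center."""
    centers = [random.choice(S)]
    while len(centers) < k:
        D = [min(dist(s, c) for c in centers) for s in S]
        r = random.random() * sum(D)
        cur = 0.0
        for s, d in zip(S, D):
            cur += d
            if cur > r:
                centers.append(s)
                break
    return centers

def cluster_priors(S, k, num_iter=30):
    """k-means with the 1 - IOU distance (steps 2.1.4.1-2.1.4.3):
    returns k (width, height) prior boxes."""
    centers = init_centers(S, k)
    for _ in range(num_iter):
        clusters = [[] for _ in range(k)]
        for s in S:  # assign each box to its nearest center
            j = min(range(k), key=lambda j: dist(s, centers[j]))
            clusters[j].append(s)
        centers = [  # move each center to the mean of its cluster
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return centers
```

For example, boxes drawn from two well-separated size groups yield one prior box near the mean of each group.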
2.2 The parsing module receives the network parameters for constructing the neural network from the cfg configuration file, parses the network parameters into network-layer parameters and network-structure parameters, sends the network-layer parameters to the network-layer module, and sends the network-structure parameters to the neural network construction module.
2.4 The network-layer module receives the network-layer parameters from the parsing module, instantiates each network layer using the network-layer parameters, defines the focal loss function in the output layer, and sends the network layers to the neural network construction module.
2.5 The neural network construction module receives the network-structure parameters from the parsing module and the network layers from the network-layer module, combines the network layers according to the network-structure parameters, and constructs the basic framework of the neural network.
The third step: the cloud server and the client cooperate to train the neural network and complete its construction. The method is:
3.1 The functional module obtains the training instruction from the client;
3.2 The input/output module, functional module and neural network construction module train the basic framework of the neural network. The method is:
3.2.1 The input/output module receives the training-set pictures to be trained from the client and converts them into structures that the program can identify and process.
3.2.2 The input/output module sends the structures to the functional module.
3.2.3 The neural network construction module initializes the network weight parameters of the neural network with random numbers and assigns the weight parameters to the basic framework of the neural network, completing the construction of the initial neural network.
3.2.4 The neural network construction module sends the initial neural network to the functional module;
3.2.5 The functional module trains the neural network using the structures. The method is: the functional module takes the structures as input, calls the focal loss function in the output layer of the initial neural network, trains the neural network under the guidance of the focal loss function, and generates the trained network weight parameters.
3.3 The functional module stores the trained network weight parameters into the network weight file.
3.4 The neural network construction module reads the trained network weight parameters from the network weight file and assigns them to the basic framework of the neural network, completing the construction of the neural network.
The fourth step: the cloud server and the client cooperate to perform target detection and identification on the images to be detected. The method is:
4.1 The functional module receives the detection instruction from the client;
4.2 The input/output module, functional module and neural network construction module cooperate to perform target detection and identification. The method is:
4.2.1 The input/output module receives the test-set picture P to be detected from the client and converts P into structures that the functional module can identify and process, such as image, data and box structures;
4.2.2 The functional module receives the structures from the input/output module and receives the neural network from the neural network construction module;
4.2.3 The functional module calls the sliced detection function and uses the structures to perform sliced detection on the test-set picture P to be detected. The method is as follows:
4.2.3.1 Suppose the width and height of P are W and H, with the upper-left corner as the coordinate origin (0, 0). Let m = 0 and n = 0; M is the size of the input layer of the neural network, generally between 100 and 1000;
4.2.3.2 The sliced detection function uses the neural network to perform target detection on the slice whose width coordinate lies in the interval [m, m+M] and whose height coordinate lies in the interval [n, n+M]. The prediction result of the slice is calculated by the output layer of the neural network; target positions are predicted by adding position offsets to the prior boxes. This yields the recognition result of the slice with width coordinate in [m, m+M] and height coordinate in [n, n+M], i.e. the position coordinates and class of each target;
4.2.3.3 If m < W − M, let m = m + M and go to 4.2.3.2; if W − M ≤ m ≤ W, go to 4.2.3.4;
4.2.3.4 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, W] and height coordinate in the interval [n, n+M], obtaining the recognition result of that slice; let m = 0;
4.2.3.5 If n < H − M, let n = n + M and go to 4.2.3.2; if H − M ≤ n ≤ H, go to 4.2.3.6;
4.2.3.6 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, m+M] and height coordinate in the interval [n, H], obtaining the recognition result of that slice;
4.2.3.7 If m < W − M, let m = m + M and go to 4.2.3.6; if W − M ≤ m ≤ W, go to 4.2.3.8;
4.2.3.8 The sliced detection function uses the neural network to perform target detection on the slice with width coordinate in the interval [m, W] and height coordinate in the interval [n, H], obtaining the recognition result of that slice;
4.2.3.9 The sliced detection function integrates the recognition results of all slices obtained in 4.2.3.2 (width coordinate in [m, m+M], height coordinate in [n, n+M]), in 4.2.3.4 (width coordinate in [m, W], height coordinate in [n, n+M]), in 4.2.3.6 (width coordinate in [m, m+M], height coordinate in [n, H]) and in 4.2.3.8 (width coordinate in [m, W], height coordinate in [n, H]), obtaining the recognition result of the entire image P (of width W and height H).
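The scan in steps 4.2.3.1-4.2.3.9 can be sketched as follows; the detect_slice callback stands in for the neural network and is an assumption, not part of the source:

```python
def slice_windows(W, H, M):
    """Enumerate slice rectangles as in steps 4.2.3.2-4.2.3.8: full
    M x M tiles scanned left-to-right, top-to-bottom, plus the
    right-edge, bottom-edge and corner remainders, so that every
    pixel of the W x H image falls inside some slice."""
    def spans(L):
        s, out = 0, []
        while s < L - M:          # interior tiles
            out.append((s, s + M))
            s += M
        out.append((s, L))        # edge remainder, at most M long
        return out
    return [(x0, x1, y0, y1) for y0, y1 in spans(H) for x0, x1 in spans(W)]

def sliced_detect(W, H, M, detect_slice):
    """Run the detector on each slice, shift its slice-local results
    back into whole-image coordinates, and merge them (step 4.2.3.9).
    detect_slice(x0, x1, y0, y1) -> list of (x, y, w, h, cls)."""
    merged = []
    for x0, x1, y0, y1 in slice_windows(W, H, M):
        for x, y, w, h, cls in detect_slice(x0, x1, y0, y1):
            merged.append((x + x0, y + y0, w, h, cls))
    return merged
```

For example, a 1000 x 700 image with M = 300 yields a 4 x 3 grid of slices, the last column being 100 pixels wide and the last row 100 pixels high.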
4.2.4 The functional module transmits the recognition result of P to the input/output module;
4.3 The input/output module outputs the recognition result of P to the client.