CN110163337A - Neural-network-based data processing method, apparatus, device and storage medium - Google Patents

Neural-network-based data processing method, apparatus, device and storage medium

Info

Publication number
CN110163337A
Authority
CN
China
Prior art keywords
layer
data
network
layers
hidden
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811340948.6A
Other languages
Chinese (zh)
Other versions
CN110163337B (en)
Inventor
周谦
周方云
詹成君
方允福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201811340948.6A
Publication of CN110163337A
Application granted
Publication of CN110163337B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Abstract

Embodiments of the present invention provide a neural-network-based data processing method, apparatus, device, and storage medium, belonging to the technical field of data processing. The neural network includes at least one network merging layer, where a network merging layer includes n cascaded hidden layers, n ≥ 2. The data processing method includes: controlling the network merging layer to perform data processing, where during data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer. The solution of the embodiments of the present invention can effectively improve data processing efficiency.

Description

Neural-network-based data processing method, apparatus, device and storage medium
Technical field
The present invention relates to the technical field of data processing, and in particular to a neural-network-based data processing method, apparatus, device, and storage medium.
Background technique
With the continuous development of neural network technology, neural networks have been widely applied in many fields. For example, convolutional neural networks, owing to their distinctive performance advantages, have been widely applied in computer vision and image processing, and in recent years their application to visual recognition has also achieved good results.
However, neural networks are characterized by large capacity and high dimensionality, and their network parameters are numerous. When data processing is performed based on a neural network, computation times can therefore be long, and how to improve data processing efficiency is an urgent problem to be solved.
Summary of the invention
The main purpose of the embodiments of the present invention is to provide a neural-network-based data processing method, apparatus, device, and storage medium, to solve the problem of slow data processing in existing data processing approaches.
In a first aspect, an embodiment of the present invention provides a neural-network-based data processing method, where the neural network includes at least one network merging layer, and a network merging layer includes n cascaded hidden layers, n ≥ 2. The data processing method includes:
controlling the network merging layer to perform data processing, where during data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
In an optional embodiment of the first aspect, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
In an optional embodiment of the first aspect, controlling the network merging layer to perform data processing includes:
controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and to output the processing results serially, where 2 ≤ i ≤ n;
controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an optional embodiment of the first aspect, controlling the i-th hidden layer to process the data already output by the (i-1)-th hidden layer includes:
when the output data satisfies a preset condition, controlling the i-th hidden layer to process the already-output data.
In an optional embodiment of the first aspect, the preset condition includes: the output data satisfies the minimum computation condition for the i-th hidden layer to perform its in-layer operation.
In an optional embodiment of the first aspect, the output data is partial output data of the (i-1)-th hidden layer.
In an optional embodiment of the first aspect, the data processing method further includes:
controlling at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an optional embodiment of the first aspect, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer includes:
when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
where the maximum output duration refers to the duration between the current time and the moment at which the earliest-obtained datum in the output data was obtained.
In an optional embodiment of the first aspect, the set duration is determined according to the maximum duration for which the register can temporarily store data and/or the maximum duration for which the Cache can cache data.
In an optional embodiment of the first aspect, the neural network is a first convolutional neural network, and the network merging layer includes a cascaded first convolutional layer and first activation function (ReLU) layer; controlling the network merging layer to perform data processing includes:
controlling the first convolutional layer to perform convolution operations on the input data of the first convolutional layer and to output each operation result of the first convolutional layer serially, and controlling the first ReLU layer to perform a ReLU operation on each output datum of the first convolutional layer;
alternatively,
the neural network is a second convolutional neural network, and the network merging layer includes a sequentially cascaded second convolutional layer, batch normalization (Batch Normalization) layer, scaling-and-translation (Scale) layer, and second ReLU layer; controlling the network merging layer to perform data processing includes:
controlling the second convolutional layer to perform convolution operations on the input data of the second convolutional layer and to output each operation result of the second convolutional layer serially;
controlling the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer, and to output each operation result of the Batch Normalization layer serially;
controlling the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer and to output each operation result of the Scale layer serially, and controlling the second ReLU layer to perform a ReLU operation on each output datum of the Scale layer;
alternatively,
the neural network is a third convolutional neural network, and the network merging layer includes a cascaded arbitrary hidden layer and element-wise operation (Eltwise) layer, where the output data of the arbitrary hidden layer is input data of the Eltwise layer; controlling the network merging layer to perform data processing includes:
controlling the arbitrary hidden layer to perform its corresponding operation on the input data of the arbitrary hidden layer, and to output each operation result of the arbitrary hidden layer serially;
controlling the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, where the output data of the other hidden layer is also input data of the Eltwise layer.
In a second aspect, an embodiment of the present invention provides a neural-network-based data processing apparatus, where the neural network includes at least one network merging layer, and a network merging layer includes n cascaded hidden layers, n ≥ 2. The data processing apparatus includes:
a data processing module, configured to control the network merging layer to perform data processing, where during data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
In an optional embodiment of the second aspect, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
In an optional embodiment of the second aspect, the data processing module is specifically configured to:
control the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and to output the processing results serially, where 2 ≤ i ≤ n;
control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an optional embodiment of the second aspect, when controlling the i-th hidden layer to process the data already output by the (i-1)-th hidden layer, the data processing module is specifically configured to:
when the output data satisfies a preset condition, control the i-th hidden layer to process the already-output data.
In an optional embodiment of the second aspect, the preset condition includes: the output data satisfies the minimum computation condition for the i-th hidden layer to perform its in-layer operation.
In an optional embodiment of the second aspect, the output data is partial output data of the (i-1)-th hidden layer.
In an optional embodiment of the second aspect, the data processing module is further configured to:
control at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an optional embodiment of the second aspect, when controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer, the data processing module is specifically configured to:
when the maximum output duration of the output data does not exceed a set duration, control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
where the maximum output duration refers to the duration between the current time and the moment at which the earliest-obtained datum in the output data was obtained.
In an optional embodiment of the second aspect, the set duration is determined according to the maximum duration for which the register can temporarily store data and/or the maximum duration for which the Cache can cache data.
In an optional embodiment of the second aspect, the neural network is a first convolutional neural network, and the network merging layer includes a cascaded first convolutional layer and first ReLU layer; when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the first convolutional layer to perform convolution operations on the input data of the first convolutional layer and to output each operation result of the first convolutional layer serially, and control the first ReLU layer to perform a ReLU operation on each output datum of the first convolutional layer.
In an optional embodiment of the second aspect, the neural network is a second convolutional neural network, and the network merging layer includes a sequentially cascaded second convolutional layer, Batch Normalization layer, Scale layer, and second ReLU layer; when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the second convolutional layer to perform convolution operations on the input data of the second convolutional layer and to output each operation result of the second convolutional layer serially;
control the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer, and to output each operation result of the Batch Normalization layer serially;
control the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer and to output each operation result of the Scale layer serially, and control the second ReLU layer to perform a ReLU operation on each output datum of the Scale layer.
In an optional embodiment of the second aspect, the neural network is a third convolutional neural network, and the network merging layer includes a cascaded arbitrary hidden layer and Eltwise layer, where the output data of the arbitrary hidden layer is input data of the Eltwise layer; when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the arbitrary hidden layer to perform its corresponding operation on the input data of the arbitrary hidden layer, and to output each operation result of the arbitrary hidden layer serially;
control the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, where the output data of the other hidden layer is also input data of the Eltwise layer.
In a third aspect, an embodiment of the present invention provides an electronic device including a processor and a memory; readable instructions are stored in the memory, and when the readable instructions are loaded and executed by the processor, the data processing method of the first aspect or of any optional embodiment of the first aspect is realized.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium in which readable instructions are stored; when the readable instructions are loaded and executed by a processor, the data processing method of the first aspect or of any optional embodiment of the first aspect is realized.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
In the neural-network-based data processing method, apparatus, device, and storage medium provided by the embodiments of the present invention, while the network merging layer of the neural network performs data processing, the cascaded hidden layers of the network merging layer are controlled to carry out interlayer parallel processing of data, so that different hidden layers of the merging layer can process data synchronously. Compared with the prior art, among the multiple cascaded hidden layers of a network merging layer, the next hidden layer can start processing data without waiting for the preceding hidden layer to finish processing all of its data. The solution of the embodiments of the present invention can therefore effectively reduce the time overhead of data processing and improve data processing efficiency.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings to be used in describing the embodiments of the present invention are briefly introduced below.
Fig. 1a, Fig. 1b and Fig. 1c respectively show structural schematic diagrams of three existing neural network structures;
Fig. 2 shows a structural schematic diagram of a network merging layer in an example of the present invention;
Fig. 3a and Fig. 3b respectively show schematic diagrams of two input structures of a network merging layer in examples of the present invention;
Fig. 4a shows a structural schematic diagram of an existing neural network structure;
Fig. 4b shows a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 4a;
Fig. 5a shows a structural schematic diagram of an existing neural network structure;
Fig. 5b shows a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 5a;
Fig. 6a shows a structural schematic diagram of an existing neural network structure;
Fig. 6b shows a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 6c shows a structural schematic diagram of another network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 6d shows a structural schematic diagram of yet another network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 7 shows a structural schematic diagram of an electronic device provided by an embodiment of the present invention.
Specific embodiment
To make the purpose, features, and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative work shall fall within the protection scope of the present invention.
The embodiments of the present invention are described in detail below; examples of the embodiments are shown in the drawings, where the same or similar labels throughout indicate the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "the" and "said" used herein may also include plural forms. It should be further understood that the word "comprising" used in the specification of the present invention means that the stated features, integers, steps, operations, elements and/or components are present, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intermediate elements may also be present. In addition, "connection" or "coupling" as used herein may include wireless connection or wireless coupling. The word "and/or" as used herein includes all or any unit of, and all combinations of, one or more of the associated listed items.
The technical solution of the present invention, and how it solves the above technical problems, are described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below with reference to the drawings.
A neural network is a computational model consisting of a large number of nodes (or neurons) coupled to each other. A complete neural network can be divided into an input layer, hidden layers, and an output layer, and a neural network may include one or more hidden layers. The input layer is the first layer of the neural network; it receives the network's original input data and passes the received input data to the hidden layers. The hidden layers carry out the required calculations and output their results to the output layer. The output layer is the last layer of the neural network; it receives the final hidden-layer output, from which the desired number of values in the desired range, i.e., the final processing result of the neural network, is obtained.
An L-layer neural network generally refers to a neural network with L hidden layers; the input layer and output layer are usually not counted. For example, the common Conv (convolution) layer, ReLU (Rectified Linear Unit, an activation function) layer, POOL (pooling) layer, Batch Normal/BN (Batch Normalization) layer, Scale (scaling and translation) layer, and Eltwise (element-wise operation) layer are all hidden layers in a neural network.
Owing to their distinctive nonlinear adaptive information processing capability, neural networks have been successfully applied in fields such as pattern recognition, intelligent control, combinatorial optimization, and prediction. In recent years, neural networks have developed further along the path of simulating human cognition and have become an important direction of AI (Artificial Intelligence). However, existing neural-network-based data processing methods suffer from long computation times and low data processing efficiency, and cannot adequately satisfy the demand for efficient data processing in practical applications.
Take convolutional neural networks as an example. Owing to their outstanding performance in visual recognition in recent years, convolutional neural networks are applied ever more widely, including in various game scenarios, such as MOBA (Multiplayer Online Battle Arena) games. In a MOBA game, problems often arise such as players disconnecting, going idle, or suffering losing streaks that leave them strongly frustrated. In order not to affect the experience of other players, an AI can take over for disconnected and idle players; or, to console players on a losing streak, such players can be matched against an AI with weaker combat ability.
In many mobile MOBA games, to improve the real-time performance of the computation and to reduce cost, prediction calculations such as AI hosting are placed on the mobile terminal. On a mobile terminal, since the GPU (Graphics Processing Unit) is needed for game rendering and its resources are limited, AI inference calculations usually occupy limited CPU (Central Processing Unit) resources and memory resources, while the AI is required to quickly compute the operation instructions for the hosted player. This imposes stricter requirements on the mobile AI forward-computation inference framework.
When existing mobile deep learning inference frameworks, such as Caffe2 and TensorFlow Lite, process a convolutional neural network model, a corresponding object is created for each hidden layer of the convolutional neural network, and the output of one hidden layer serves as the input of the next. The processing of the next hidden layer must wait until the processing results of the previous hidden layer have all been output. By the time the next hidden layer fetches the output data of the previous hidden layer for calculation, the data is usually no longer in a register, often not even in the cache (cache memory); the data must be fetched from memory into the cache and then loaded from the cache into registers, which incurs a large time overhead and slows down the forward computation of the whole neural network. Moreover, since most of the output data of the previous hidden layer resides in memory, existing neural-network-based data computation approaches also occupy excessive memory resources, which can affect the overall performance of the mobile terminal.
Fig. 1a, Fig. 1b, and Fig. 1c show structural schematic diagrams of partial layer structures of three different existing convolutional neural network models, where the input of the Eltwise layer in Fig. 1c is connected to the outputs of two layers; its input data is the output data of "any layer1" and "any layer2" shown in the figure. When an existing deep learning inference framework processes a convolutional neural network model with the three structures shown in Fig. 1a, Fig. 1b, and Fig. 1c, it computes while retaining each layer structure of the original model: each layer creates a corresponding object, the output of one layer serves as the input of the next, and interlayer data is transferred through memory. For the network structure shown in Fig. 1a, for example, the Conv layer and the ReLU layer correspond to separate objects; when performing data processing based on this network structure, the two layer objects must be initialized and called separately, and only after the Conv layer completes all operations on its input data can the ReLU layer perform its processing based on all of the Conv layer's output data. Since each hidden layer of the neural network takes a long time to finish processing all of its data, by the time the ReLU layer is to process data, most or all of the Conv layer's output data has already gone to memory, making the time overhead of data loading large and hurting data processing efficiency.
In addition, the comparison of Fig. 1a, Fig. 1b, and Fig. 1c also shows that the more hidden layers a neural network has, the more time is consumed in interlayer data loading.
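To make the bottleneck concrete, the following C++ sketch (a simplified 1-D stand-in with assumed names; it is not code from Caffe2 or TensorFlow Lite) shows the conventional unfused execution order, in which ReLU cannot begin until the convolution's entire output buffer has been materialized in memory:

```cpp
#include <cstddef>
#include <vector>

// Minimal 1-D "convolution" stub (3-tap sum) standing in for a real Conv
// layer; shapes and names here are illustrative assumptions.
static std::vector<float> conv_forward(const std::vector<float>& in) {
    std::vector<float> out;
    for (std::size_t i = 0; i + 2 < in.size(); ++i)
        out.push_back(in[i] + in[i + 1] + in[i + 2]);
    return out;
}

// Conventional unfused execution: Conv materializes its ENTIRE output in a
// memory buffer before ReLU starts, so ReLU must reload every value from
// memory/cache instead of consuming it while it is still in a register.
std::vector<float> unfused_conv_relu(const std::vector<float>& input) {
    std::vector<float> conv_out = conv_forward(input);  // pass 1: all of Conv
    for (float& v : conv_out)                           // pass 2: all of ReLU
        v = v > 0.0f ? v : 0.0f;
    return conv_out;
}
```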
To solve at least one of the above technical problems of existing neural-network-based data processing, an embodiment of the present invention provides a neural-network-based data processing method. The method merges some hidden layers of the neural network across layers and, during data processing, controls the merged network merging layer so that there is interlayer parallel processing of data between its hidden layers, reducing the time overhead of interlayer data loading and improving data processing efficiency. In addition, the method of the embodiment of the present invention can also effectively reduce the occupation of memory resources. The specific solution provided by the embodiment of the present invention is further described below.
In the neural-network-based data processing method provided by the embodiment of the present invention, the neural network includes at least one network merging layer. As shown in Fig. 2, a network merging layer may include n cascaded hidden layers, n ≥ 2; that is, each network merging layer includes at least two cascaded hidden layers. The data processing method includes:
controlling the network merging layer to perform data processing, where during data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
Here, interlayer parallel processing means that there is parallel processing of data between different hidden layers: while data passes through a network merging layer of the neural network, at least two of the n cascaded hidden layers can process data in parallel.
For example, suppose a network merging layer of the neural network includes one Conv layer and one ReLU layer, where the input data of the ReLU layer is the output data of the Conv layer. While this network merging layer performs data processing, there can be parallel processing of data between the Conv layer and the ReLU layer; that is, the ReLU layer's data processing can start before the Conv layer has finished processing all Conv-layer data. Typically, whenever the Conv layer has finished processing part of its data and the input data required for the ReLU operation has been obtained, the partial data already processed by the Conv layer is input to the ReLU layer to start the ReLU operation, so that the Conv layer and the ReLU layer process data simultaneously.
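As a minimal sketch of this interlayer parallelism (again with assumed 1-D shapes and names, matching the unfused sketch above rather than the patent's actual implementation), the fused loop below applies the ReLU operation to each convolution result the moment it is produced, while the value is still in a register:

```cpp
#include <cstddef>
#include <vector>

// Fused Conv+ReLU sketch: each 3-tap convolution result passes through ReLU
// immediately, before it is ever written back to the output buffer, so the
// two layers' processing overlaps element by element.
std::vector<float> fused_conv_relu(const std::vector<float>& input) {
    std::vector<float> out;
    if (input.size() >= 3) out.reserve(input.size() - 2);
    for (std::size_t i = 0; i + 2 < input.size(); ++i) {
        float acc = input[i] + input[i + 1] + input[i + 2];  // Conv result, still in a register
        out.push_back(acc > 0.0f ? acc : 0.0f);              // ReLU applied at once
    }
    return out;
}
```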
Hidden layers being cascaded means that the output of the former hidden layer is connected to the input of the latter hidden layer, and the input data of the latter hidden layer depends on the output data of the former hidden layer. For example, for a convolutional neural network, a network merging layer may include 2 hidden layers, such as a Conv layer and a ReLU layer, where the input data of the ReLU layer is the output data of the Conv layer, and the Conv layer is cascaded with the ReLU layer.
It should be noted that, in practical applications, the number of network merging layers in the neural network and the number of hidden layers included in a single network merging layer can all be configured according to practical application requirements, practical application scenarios, and the attributes of each hidden layer of the neural network. For example, two or more hidden layers with relatively small data processing volumes can be merged into one network merging layer. As another example, when performing data processing through the neural network, in application scenarios where the data volume is relatively small, a network merging layer may include a relatively large number of hidden layers, while in application scenarios where the data volume is relatively large, a network merging layer may include a relatively small number of hidden layers.
For a network merging layer, its input data includes the input data of the first hidden layer of the network merging layer, and its output data is the output data of the last hidden layer it includes. In addition, depending on which hidden layers a network merging layer includes, the source of its input data may differ. For example, in the schematic diagram of a partial neural network structure shown in Fig. 3a, when the network merging layer includes the first hidden layer of the neural network, the input of the network merging layer is the output of the input layer of the neural network, i.e., the original input data of the neural network. When the network merging layer does not include the first hidden layer of the neural network, the input of the network merging layer is the input data of the first hidden layer that the network merging layer itself includes; in the schematic diagram of a partial neural network structure shown in Fig. 3b, the input data of the first hidden layer of the network merging layer is the output data of the preceding hidden layer connected to that hidden layer, and that data is the input data of the network merging layer.
Likewise, the output data of a network merging layer may be the output data of the neural network, i.e., the input data of the output layer of the neural network, or it may be the input data of the next hidden layer of the neural network (the hidden layer connected to the last hidden layer of the network merging layer).
As an example, Fig. 4a shows a schematic diagram of a partial network structure of an existing neural network, including a cascaded Conv layer and ReLU layer; the input of the network structure is the Conv layer's input Input1, and its output is the ReLU layer's output Output1. Fig. 4b shows a structural schematic diagram of the network merging layer (the Conv+relu merging layer shown in the figure) obtained by merging the Conv layer and ReLU layer of the network structure in Fig. 4a across layers according to the method of the embodiment of the present invention. That is, the network merging layer shown in Fig. 4b includes the Conv layer and the ReLU layer and has the functions of both the Conv calculation and the ReLU calculation. The input of this network merging layer is the original Conv layer's input Input1 shown in Fig. 4a, and its output is the original ReLU layer's output Output1 shown in Fig. 4a.
As another example, Fig. 5a shows a schematic diagram of a partial network structure of an existing neural network, including a sequentially cascaded Conv layer, Batch Normal layer, Scale layer, and ReLU layer; the input of the network structure is the Conv layer's input Input1, and its output is the ReLU layer's output Output1. Fig. 5b shows a structural schematic diagram of the network merging layer (the Conv+batchnormal+scale+relu merging layer shown in the figure) obtained by merging the Conv layer, Batch Normal layer, Scale layer, and ReLU layer of the network structure in Fig. 5a across layers according to the method of the embodiment of the present invention. This network merging layer has the functions of the 4 cascaded hidden layers shown in Fig. 5a; its input is the original Conv layer's input Input1 shown in Fig. 5a, and its output is the original ReLU layer's output Output1 shown in Fig. 5b.
As yet another example, Fig. 6a shows a schematic diagram of a partial network structure of an existing neural network, in which the outputs of two arbitrary hidden layers, layer1 and layer2, are the inputs of the Eltwise layer, and the Eltwise layer's output Output2 is the output of the network structure. Fig. 6b, Fig. 6c, and Fig. 6d respectively show structural schematic diagrams of the network merging layers obtained by merging layer1, layer2, and the Eltwise layer of the network structure in Fig. 6a across layers according to the method of the embodiment of the present invention: the Layer2+eltwise merging layer shown in Fig. 6b, the Layer1+eltwise merging layer shown in Fig. 6c, and the Layer1+layer2+eltwise layer shown in Fig. 6d. The network merging layer shown in Fig. 6d contains two cascade relationships: layer1 is cascaded with the Eltwise layer, and layer2 is also cascaded with the Eltwise layer.
As can be seen from Fig. 6b, Fig. 6c, and Fig. 6d, either one or both of layer1 and layer2 can be merged with the Eltwise layer across layers to obtain a network merging layer. As shown in Fig. 6b, the eltwise calculation performed by the Eltwise layer can be placed into layer2, i.e., the Eltwise layer is merged with layer2 to obtain the Layer2+eltwise merging layer; layer1 processes its input data Input1, the output Output1 of layer1 serves as an input of the network merging layer, layer2's pre-merge input Input2 still serves as an input of the Layer2+eltwise merging layer, and the output of the Layer2+eltwise merging layer is the output of the original Eltwise layer. In Fig. 6c, the eltwise calculation performed by the Eltwise layer is instead placed into layer1, i.e., the Eltwise layer is merged with layer1 to obtain the Layer1+eltwise merging layer; layer2 processes its input data Input2, the output Output1 of layer2 serves as an input of the network merging layer, and layer1's pre-merge input Input1 still serves as an input of the Layer1+eltwise merging layer. In Fig. 6d, layer1, layer2, and the Eltwise layer are all merged across layers.
During data processing through the multiple cascaded hidden layers included in a network merging layer of the neural network, the data processing method of the embodiment of the present invention controls the cascaded hidden layers to carry out interlayer parallel processing of data, so that different hidden layers of the merging layer can partially process data synchronously. Compared with existing methods of data processing with neural networks, among the multiple cascaded hidden layers of the network merging layer, the next hidden layer does not have to wait until the preceding hidden layer has finished processing all of its data, which effectively reduces the latency of data processing and improves data processing efficiency.
For example, for the network merging layer shown in Fig. 4b, the ReLU layer is merged into the Conv layer for calculation. When data calculation is performed based on this network merging layer, as soon as the merging layer has performed a conv calculation and obtained one corresponding output datum, the merging layer can be controlled to immediately perform the related ReLU operation, so that the network merging layer performs Conv operations and ReLU operations in parallel based on the Conv layer and ReLU layer it includes, improving data processing efficiency.
For the network merging layer shown in Fig. 5b, the Batch Normal layer, Scale layer, and ReLU layer are merged into the Conv layer for calculation. When data calculation is performed based on this network merging layer, as soon as the merging layer has performed a Conv calculation and obtained one corresponding output datum, the related Batch Normal operation can be performed on that output datum immediately, followed closely by the Scale and ReLU operations.
For the network merging layers shown in Fig. 6b, Fig. 6c, and Fig. 6d: when forward data calculation is performed based on the merging layer of Fig. 6b or Fig. 6c, taking Fig. 6c as an example, whenever the calculation of the original layer within the merging layer produces one output datum, the Eltwise calculation can be performed immediately on that output datum and the corresponding output of the other layer. When forward data calculation is performed based on the merging layer shown in Fig. 6d, whenever the calculations of the original layer1 and layer2 have each produced an output datum, the Eltwise calculation can be performed immediately on the data calculated by layer1 and layer2.
It should be noted that when data processing is performed based on the network structure corresponding to Fig. 6d, the calculation of layer1 and the calculation of layer2 within the merging layer can, according to actual needs, be controlled to be either serial or parallel. Serial calculation corresponds to the calculation manner of layer1 and layer2 described above for Fig. 6b and Fig. 6c, e.g., the layer2 operation starts only after the layer1 calculation is complete; parallel calculation means the calculations of layer1 and layer2 are carried out simultaneously.
In an optional embodiment of the present invention, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
Instantiation refers, in object-oriented programming, to the process of creating an object from a class. To create and use an object, the loading of the class and the instantiation of the class must be completed: loading the class means loading the class into memory in advance, and instantiating the class is the process of going from the class to a concrete object; the object is the concrete instance, including a set of attributes and methods (a method is a segment of code that accomplishes some function). In data processing, by calling the methods of an object, the functions corresponding to those methods are realized.
In the embodiment of the present invention, to better guarantee the interlayer parallel processing of data between the hidden layers of the network merging layer during data processing, the hidden layers of the network merging layer are instantiated as one object, so that when this object is called, the initialization of all hidden layers of the network merging layer is completed at the same time, loading the attribute information and methods of each hidden layer; the i-th hidden layer of the network merging layer can then process data based on the output data of the (i-1)-th hidden layer at any time, improving data processing efficiency.
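A minimal sketch of this single-object idea (class and member names are assumptions for illustration, not the patent's code): the n cascaded hidden layers are held as stages of one object, so one instantiation loads every stage, and each stage consumes the previous stage's value as soon as it exists:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// One object instance per network merging layer: all n cascaded hidden
// layers live in a single MergedLayer, instead of one object per layer.
class MergedLayer {
public:
    // A stage models one element-wise hidden-layer operation (a simplifying
    // assumption to keep the sketch small; real layers also carry weights).
    using Stage = std::function<float(float)>;

    explicit MergedLayer(std::vector<Stage> stages) : stages_(std::move(stages)) {}

    std::vector<float> forward(const std::vector<float>& in) const {
        std::vector<float> out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) {
            float v = in[i];
            for (const Stage& s : stages_)  // stage i runs as soon as stage
                v = s(v);                   // i-1's value for this element exists
            out[i] = v;
        }
        return out;
    }

private:
    std::vector<Stage> stages_;
};
```

For instance, a merging layer combining a Scale stage and a ReLU stage could be instantiated once as `MergedLayer m({[](float v){ return 2.0f * v; }, [](float v){ return v > 0.0f ? v : 0.0f; }});` and then invoked through the single object's `forward()`.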
In an optional embodiment of the present invention, controlling the network merging layer to perform data processing may specifically include:
controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and to output the processing results serially, where 2 ≤ i ≤ n;
controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
Here, output data refers to data for which the (i-1)-th hidden layer has completed its corresponding operation; for example, for a Conv layer, the Conv layer's output data is data for which the Conv calculation is complete. Outputting the processing results serially means that the data whose processing is completed first is output first, and data whose processing completes later is output later, rather than all output being unified only after the operations on all of the layer's data are complete.
When processing is performed through the network merging layer, each hidden layer is controlled, after completing each operation, to output the computed data to the next hidden layer, so that the next hidden layer can process the data promptly, without waiting for the preceding hidden layer to complete all of its operations before starting. This improves the network merging layer's efficiency of data processing. Moreover, since the data whose operation the (i-1)-th hidden layer completes first is output first, and that data can immediately take part in the data operation of the i-th hidden layer, the output data of the (i-1)-th hidden layer on which the current operation of the i-th hidden layer relies is, with high probability, located in the registers of the terminal device; and since registers have a very high read/write speed, when data processing is performed by the i-th hidden layer, the reading of data can be completed very quickly, effectively improving data processing efficiency.
In an optional embodiment of the present invention, controlling the i-th hidden layer to process the data already output by the (i-1)-th hidden layer includes:
when the output data satisfies a preset condition, controlling the i-th hidden layer to process the already-output data.
The preset condition can be configured according to one or more kinds of information, such as the application scenario, practical application requirements, the attributes of each hidden layer included in the network merging layer, and the duration for which data can be temporarily stored in a register. In an optional embodiment of the present invention, the preset condition may include: the output data satisfies the minimum computation condition for the i-th hidden layer to perform its in-layer operation.
The minimum computation condition for a hidden layer to perform its in-layer operation refers to the minimum data required for at least one neuron of the hidden layer to start its operation when data operations are performed by that hidden layer. For different hidden layers, this minimum computation condition may be the same or different.
When the preset condition corresponding to each hidden layer of the network merging layer is the minimum computation condition for that hidden layer's in-layer operation, then for the network merging layer shown in Fig. 4b, for example, if the minimum computation condition for the ReLU layer of the network merging layer is one output datum of the Conv layer (the operation result after completing one Conv operation), whenever the Conv layer performs a Conv calculation and obtains one corresponding output datum, the ReLU layer can immediately perform a ReLU operation based on that output datum.
It is understood that, besides the output data satisfying the minimum computation condition for the i-th hidden layer's in-layer operation, the preset condition may also include other configured conditions.
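As a hedged illustration of such per-layer readiness checks (function names and the window size are assumptions, not taken from the patent): a ReLU stage can fire as soon as a single upstream result exists, whereas a pooling stage must wait until a full pooling window of upstream results has been output:

```cpp
#include <cstddef>

// Minimum computation condition sketch: how many upstream output data must
// exist before the downstream layer's smallest unit of work can start.
bool relu_ready(std::size_t upstream_results) {
    return upstream_results >= 1;       // ReLU needs only one Conv output
}

bool pool_ready(std::size_t upstream_results, std::size_t window = 4) {
    return upstream_results >= window;  // e.g. a 2x2 POOL window needs 4 values
}
```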
In an optional embodiment of the present invention, the above output data is partial output data of the (i-1)-th hidden layer.
To guarantee that there is interlayer parallel processing of data between the multiple hidden layers of a network merging layer when data is processed through the network merging layer, the i-th hidden layer must be able to start its data operation based on partial output data of the (i-1)-th hidden layer. That is, the minimum computation condition for the i-th hidden layer's in-layer operation depends on partial output data of the (i-1)-th hidden layer, rather than on all of the (i-1)-th hidden layer's output data.
In an optional embodiment of the present invention, the data processing method may further include:
controlling at least part of the above output data to be stored in a register and/or the Cache.
A register is an element inside the CPU with a very high read/write speed. A Cache, i.e., a cache memory, is a memory located between the CPU and main memory, with a small capacity but very high speed. From fast to slow, the data access rates of register, Cache, and memory are: register, then Cache, then memory. When performing data processing, if the data to be read is located in memory, the data must first be loaded from memory into the Cache and then loaded from the Cache into registers, which incurs a large time overhead and results in low data processing efficiency.
To improve data processing efficiency, in the embodiment of the present invention, when data processing is performed through the network merging layer, part or all of the output data of the (i-1)-th hidden layer on which the operation of the i-th hidden layer relies can be controlled to be stored in registers and/or the Cache, further reducing the time overhead caused by data loading and improving processing efficiency.
In practical applications, according to the attribute information of each hidden layer included in the neural network, the amount of data each hidden layer needs to process, and the attributes of the registers and/or Cache of the electronic device, it can be decided which two or more hidden layers are merged into one network merging layer, so that the above output data of the (i-1)-th hidden layer is located entirely in registers, or entirely in registers and the Cache, reducing the data loading time as much as possible and improving processing efficiency.
For the network merging layer shown in Fig. 4b, after the Conv layer of the network merging layer performs a Conv calculation on its input data and obtains one corresponding output datum, since this datum is still in a register, the related ReLU operation can be performed on it at once, so that the input data for the ReLU calculation is fetched from a register; compared with the prior art, this effectively reduces the time overhead of data loading and improves data processing efficiency. In addition, with the data processing method of the embodiment of the present invention, the memory occupied by data storage can be greatly reduced, so that while computation latency is reduced, the memory resources occupied during data processing are also effectively reduced.
In an optional embodiment of the present invention, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer includes:
when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
where the maximum output duration refers to the duration between the current time and the moment at which the earliest-obtained datum in the output data was obtained; that is, among the output data of the (i-1)-th hidden layer, the duration from the moment the earliest-output datum was obtained (the moment it was calculated in the (i-1)-th hidden layer) to the current time.
In practical applications, the above set duration can be preconfigured to realize control over the storage location of the output data of the (i-1)-th layer; the set duration can be determined based on empirical values and/or experimental values.
In an optional embodiment of the present invention, the set duration can be determined according to the maximum duration for which a register can temporarily store data and/or the maximum duration for which the Cache can cache data.
The maximum duration for which a register can temporarily store data refers to how long data can be temporarily stored in a register; it can be understood as the time after which a datum that has entered a register, if unused, moves from the register into the Cache. Likewise, the maximum duration for which the Cache can cache data refers to how long data can be temporarily stored in the Cache; it can be understood as the time after which a datum that has entered the Cache, if unused, moves from the Cache into memory.
In practical applications, the set duration can be set according to the maximum duration for which registers can temporarily store data, to control the output data to be located entirely in registers; or the set duration can be set according to the maximum duration for which the Cache can temporarily store data, to control the data to be located entirely in the Cache, or entirely in the Cache and registers.
It should be noted that, in practical applications, the condition that the maximum output duration of the output data does not exceed the set duration can be configured separately, or it can be included in the above preset condition; that is, the above preset condition may include that the maximum output duration of the already-output data does not exceed the set duration.
In an optional embodiment of the present invention, the neural network may be a first convolutional neural network, and the network merging layer includes a cascaded first convolutional layer and first ReLU layer; controlling the network merging layer to perform data processing may specifically include:
controlling the first convolutional layer to perform convolution operations on the input data of the first convolutional layer and to output each operation result of the first convolutional layer serially, and controlling the first ReLU layer to perform a ReLU operation on each output datum of the first convolutional layer.
Specifically, for example, when the mobile inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 4a in a convolutional neural network model, the Conv layer and the ReLU layer can be merged into one layer structure when the forward-computation framework loads the model structure and creates the graph structure, i.e., only one layer object instance is created. The merged network merging layer (such as the Conv+relu merging layer shown in Fig. 4b) uses the original Conv layer's input as its input and the original ReLU layer's output as its output. In subsequent forward-inference calculation, the network merging layer first performs a Conv calculation to obtain one corresponding output datum; since the datum is still in a register at this point, the related ReLU operation can be performed on it at once, so that the input data for the ReLU operation is obtained from a register, effectively improving the calculation speed of the entire convolutional neural network forward computation.
In an optional embodiment of the present invention, the neural network is a second convolutional neural network, and the network merging layer includes a sequentially cascaded second convolutional layer, Batch Normalization layer, scaling-and-translation Scale layer, and second ReLU layer; controlling the network merging layer to perform data processing may specifically include:
controlling the second convolutional layer to perform convolution operations on the input data of the second convolutional layer and to output each operation result of the second convolutional layer serially;
controlling the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer, and to output each operation result of the Batch Normalization layer serially;
controlling the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer and to output each operation result of the Scale layer serially, and controlling the second ReLU layer to perform a ReLU operation on each output datum of the Scale layer.
Specifically, for example, when the mobile inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 5a in a convolutional neural network model, the Conv layer, Batch Normalization layer, Scale layer, and ReLU layer can be merged into one layer structure when the forward-computation framework loads the model structure and creates the graph structure, creating only one layer object instance. The merged network merging layer (such as the Conv+batchnormal+scale+relu merging layer shown in Fig. 5b) uses the original Conv layer's input as its input and the original ReLU layer's output as its output. In subsequent forward-inference calculation, the network merging layer first performs a Conv calculation to obtain one corresponding output datum; since the datum is still in a register at this point, the related Batch Normal operation can be performed on it at once, followed by the Scale and ReLU operations, so that the data required by each successive calculation within the merging layer can be obtained from a register, reducing the time consumed by interlayer data loading.
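A minimal sketch of this fused epilogue (parameter names are assumptions consistent with standard Batch Normalization and Scale definitions, not the patent's code): each convolution result flows through all three element-wise stages while it is still in a register:

```cpp
#include <cmath>

// Fused BatchNorm + Scale + ReLU applied to one Conv output while it is
// still in a register; mean/var are the BN statistics, gamma/beta the
// Scale layer's scaling and translation parameters (assumed names).
struct BnScaleRelu {
    float mean, var, eps, gamma, beta;

    float operator()(float conv_out) const {
        float bn = (conv_out - mean) / std::sqrt(var + eps);  // Batch Normalization
        float sc = gamma * bn + beta;                         // Scale (scale + shift)
        return sc > 0.0f ? sc : 0.0f;                         // ReLU
    }
};
```

In a fused Conv loop, such a functor would be invoked on each accumulator right after the convolution sum completes, mirroring the serial per-result output described above.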
In an optional embodiment of the present invention, the neural network may be a third convolutional neural network, and the network merging layer includes a cascaded arbitrary hidden layer and Eltwise layer, where the output data of the arbitrary hidden layer is input data of the Eltwise layer; controlling the network merging layer to perform data processing may specifically include:
controlling the arbitrary hidden layer to perform its corresponding operation on the input data of the arbitrary hidden layer, and to output each operation result of the arbitrary hidden layer serially;
controlling the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, where the output data of the other hidden layer is also input data of the Eltwise layer.
Here, the arbitrary hidden layer and the other hidden layer in this scheme are the preceding-level hidden layers cascaded with the Eltwise layer; the Eltwise layer performs a fusion operation on the output data of the arbitrary hidden layer and the output data of the other hidden layer, such as product (element-wise multiplication), sum (element-wise addition or subtraction), or max (element-wise maximum). It is understood that there may be one other hidden layer or several. For example, in the network structure shown in Fig. 6a, the Eltwise layer performs an Eltwise operation on the output data of two hidden layers (layer1 and layer2 shown in the figure); in this case, one of layer1 and layer2 corresponds to the arbitrary hidden layer in this scheme, and the other corresponds to the other hidden layer. The following explanation uses the network structure shown in Fig. 6.
Specifically, when the mobile inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 6a in a convolutional neural network model, layer2 and the Eltwise layer can be merged into one layer structure (the Layer2+eltwise merging layer shown in Fig. 6b) when the forward-computation framework loads the model structure and creates the graph structure, creating only one layer object instance; or layer1 and the Eltwise layer can be merged into one layer structure (the Layer1+eltwise merging layer shown in Fig. 6c). For the network merging layer shown in Fig. 6b, the output of layer1 serves as an input of the network merging layer, and layer2's pre-merge input still serves as an input of the network merging layer. Whenever the calculation of the original layer2 within the merging layer produces one output datum, an immediate Eltwise operation is performed on that output datum and the output data of layer1; compared with the situation before merging, the layer2 output datum used in the Eltwise operation is obtained directly from a register.
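A minimal sketch of the Fig. 6b-style fusion (layer2's own operation is replaced by an illustrative element-wise stand-in, and the element-wise sum variant of Eltwise is assumed): layer2's result for each element is combined with layer1's already-available output immediately, so the Eltwise input comes straight from a register:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Layer2+eltwise merging layer sketch: for each element, compute layer2's
// result and fuse it with layer1's output at once (Eltwise "sum" assumed).
std::vector<float> layer2_plus_eltwise(const std::vector<float>& input2,
                                       const std::vector<float>& layer1_out) {
    const std::size_t n = std::min(input2.size(), layer1_out.size());
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        float l2 = 2.0f * input2[i];   // stand-in for the original layer2 op
        out[i] = l2 + layer1_out[i];   // Eltwise sum, fused in the same step
    }
    return out;
}
```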
The data processing method of the embodiments of the present invention can be applied in any module or product that performs data processing based on a neural network. Specifically, it can be applied in various terminal devices (including client devices and servers) and implemented by the processor of the terminal device; for example, it can be applied in a mobile terminal device (such as a smartphone) and implemented by the processor (such as the CPU) of the mobile terminal device.
The scheme of the embodiments of the present invention is particularly suitable for mobile terminal devices. Compared with terminal devices such as computers and servers, mobile terminal devices such as mobile phones are constrained in many respects by system architecture, manufacturing process, cost, and hardware resource configuration (CPU, GPU, memory, etc.), so their processor performance (mainly the computing capability of the processor) is far below that of devices such as computers; in floating-point computing capability in particular, the gap is usually a factor of thousands to tens of thousands. Meanwhile, in data processing schemes based on neural networks, such as game applications involving a neural network architecture, for example the AI model in a MOBA game that uses a convolutional neural network structure, the amount of computation is usually very large and the data is usually multi-dimensional. Therefore, for mobile terminal devices, improving the processing efficiency of data is all the more important. With the data processing method provided by the embodiments of the present invention, different hidden layers of a network merging layer can process data partly in parallel, which effectively reduces the latency of data processing and improves processing efficiency; moreover, the scheme can also effectively reduce memory occupation. When the scheme is applied in a mobile terminal device, the improvement in computing speed is therefore even more significant, and it can effectively alleviate problems such as low data processing efficiency and application stuttering caused by the limitations of the mobile terminal device itself. Specifically, for example, when the scheme of the embodiments of the present invention is applied in a mobile-side MOBA game based on a neural network architecture, it can effectively improve data processing efficiency and reduce the occurrence of situations such as memory overflow (out of memory).
The data processing method provided by the embodiments of the present invention can be applied in various structures, devices, products and models that perform data processing with neural networks. The scheme of the embodiments of the present invention is further described below with reference to specific examples.
Example one
In this example, the data processing method provided by the embodiments of the present invention is applied in the application scenario described hereinbefore, in which AI hosting in a MOBA game is realized by an AI forward-computation inference framework that includes convolutional neural networks. Specifically, it can be applied in the forward inference computation of the convolutional neural networks corresponding to the AI global-view model and the AI micro-operation model. Here, the AI global view refers to controlling where on the whole map the AI should go; taking Honor of Kings as an example, this includes defending towers, clearing minion waves, jungling, ganking and supporting. The AI micro-operation refers to controlling the AI's concrete actions, such as movement and skill release.
The AI global-view model and the AI micro-operation model contain the three kinds of structures shown in Fig. 4a, Fig. 5a and Fig. 6a. Based on the method of the embodiments of the present invention, when the forward-computation framework creates the network instance, interlayer merge computation of the hidden layers can be performed. Specifically, the two hidden layers shown in Fig. 4a can be merged into the network merging layer in Fig. 4b, the multiple hidden layers shown in Fig. 5a can be merged into the network merging layer shown in Fig. 5b, and the multiple hidden layers shown in Fig. 6a can be merged into the network merging layer shown in Fig. 6b, Fig. 6c or Fig. 6d.
Take the offline-hosting AI of a MOBA game as an example. When a player goes offline or performs no relevant operation for longer than a set threshold time, the server can, according to the hardware resource configuration of each game player's terminal device (such as a mobile phone) obtained since the start of the match, select one or more users whose phones have better performance and run the AI on those users' phones, which then perform the forward prediction computation. In this example, the input data for prediction is the game frame data of the current player, i.e., the image data of the current game image; the AI global-view model and the AI micro-operation model perform the relevant forward prediction computation on the input data, and the computation results are sent to the server, which issues the corresponding manipulation instructions to drive the hosted player to move and release skills accordingly. For a MOBA game running at 15 logic frames per second, the global-view computation can be performed once every 15 frames and the micro-operation computation once every 2 frames, as sketched below.
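The following sketch shows only this scheduling skeleton; the model objects, function names and the frame source are hypothetical placeholders, not part of the patented method:

```python
def hosted_ai_loop(frames, global_model, micro_model, send_to_server):
    """Assumed hosting schedule for a MOBA game at 15 logic frames/second:
    the global-view model runs once every 15 frames (where to go on the
    map), the micro-operation model once every 2 frames (movement and
    skill release); results go to the server, which issues the actual
    manipulation instructions for the hosted player."""
    for n, frame in enumerate(frames):
        if n % 15 == 0:
            send_to_server("global", global_model(frame))
        if n % 2 == 0:
            send_to_server("micro", micro_model(frame))
```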
Taking the forward-computation framework with the network structure shown in Fig. 4b as an example, with the data processing method of the embodiments of the present invention, when the framework loads the structures of the AI global-view model and the AI micro-operation model and creates the graph structure, the Conv layer and the Relu layer are merged into one layer structure, i.e., only one layer object instance is created. The merged structure, namely the Conv+relu merging layer shown in Fig. 4b, takes the input Input1 of the original Conv layer (the Conv layer shown in Fig. 4a before merging) as its input and the output Output1 of the original Relu layer as its output. In subsequent forward inference, the merging layer first performs the Conv computation to obtain one output value; since that value is still in a register, the Relu operation can be applied to it at once, so that the input data of the Relu computation is obtained from a register. Compared with an existing forward-computation framework based on the network structure shown in Fig. 4a, computing efficiency is greatly improved; accordingly, the efficiency with which the server issues manipulation instructions to drive the hosted player's movement and skill release is improved, better meeting the high performance demands of the data processing.
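As a concrete illustration of what "creating only one layer object instance" means at graph-creation time, here is a hedged sketch of the merge pass; the list-of-tuples layer description and the function name are assumptions, since the actual Caffe2/TensorFlow Lite graph structures are more involved:

```python
def merge_conv_relu(layers):
    """Collapse every Conv layer immediately followed by a Relu layer into
    one merged-layer description, so only a single layer object instance
    is created for the pair. `layers` is a hypothetical list of
    (type, params) tuples in topological order with linear connectivity."""
    merged, k = [], 0
    while k < len(layers):
        kind, params = layers[k]
        if kind == "Conv" and k + 1 < len(layers) and layers[k + 1][0] == "Relu":
            # The merged layer keeps Conv's input and Relu's output,
            # exactly as the Conv+relu merging layer of Fig. 4b does.
            merged.append(("Conv+Relu", params))
            k += 2
        else:
            merged.append((kind, params))
            k += 1
    return merged

# Example: the Fig. 4a chain collapses to a single merged node.
print(merge_conv_relu([("Conv", {"w": "..."}), ("Relu", None)]))
```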
Likewise, based on the forward-computation framework with the network structure shown in Fig. 5b, when forward inference is performed with the data processing method of the embodiments of the present invention, the Conv+batchnormal+scale+relu merging layer first performs the Conv computation to obtain one output value; since that value is still in a register, the Batch Normalization operation can be applied to it at once, followed by the Scale and Relu operations, so that the data required by each next computation inside the merging layer is always available in a register. Compared with the network structure shown in Fig. 5a, computing efficiency is greatly improved, better meeting the high-efficiency demands of the data processing. Based on the forward-computation framework with the network structure shown in Fig. 6b, when forward inference is performed with the data processing method of the embodiments of the present invention, as soon as the Layer2+eltwise merging layer produces one output value from the computation that originally belonged to layer2, the eltwise computation can be performed immediately on that value and the corresponding output of layer1. Compared with the unmerged network, the layer2 data used by the eltwise computation is obtained directly from a register, which effectively reduces the time spent loading data and improves the processing efficiency of the data.
Online statistics show that, for the MOBA game AI global-view model and AI micro-operation model containing the three kinds of merged-layer structures shown in Fig. 4b, Fig. 5b and Fig. 6b (or Fig. 6c), when the data processing method of the embodiments of the present invention is used for AI inference prediction, the average computation time of the global-view model drops from 31 ms to 23 ms, a speedup of 1.35, and the average computation time of the AI micro-operation model drops from 2 ms to 1 ms, a speedup of 2; the corresponding AI hosting win rate rises from 27% to 38%. It can be seen that the scheme provided by the embodiments of the present invention improves the efficiency of data processing.
Example two
As an example, Table 1 and Table 2 respectively show comparisons of the time consumption and the memory consumption of two schemes when image recognition is performed with an existing neural-network-based data processing method and with the data processing method of the present invention. The comparison results shown in Table 1 and Table 2 were obtained with all other conditions identical, except for the neural network structures used.
Table 1 shows the time-consumption comparison of the two schemes when image recognition is performed based on a mobilenet neural network model containing the network structure shown in Fig. 5a, i.e., when the image data computation is performed in the existing way and with the method of the embodiments of the present invention, and the corresponding comparison when image recognition is performed based on a squeezenet neural network model containing the network structure shown in Fig. 4a.
Table 2 shows the memory-consumption comparison of the two schemes in the same two settings: image recognition based on the mobilenet neural network model containing the network structure shown in Fig. 5a, and image recognition based on the squeezenet neural network model containing the network structure shown in Fig. 4a. In Table 1 and Table 2, "merged layers" corresponds to the scheme using the neural network structure of the embodiments of the present invention, and "unmerged layers" corresponds to the scheme using the existing neural network structure.
Table 1  Time-consumption comparison (unit: ms)
                  mobilenet    squeezenet
Merged layers       110.66        120.03
Unmerged layers     126.90        129.17
Table 2  Memory-consumption comparison (unit: MB)
                  mobilenet    squeezenet
Merged layers        75.23         76.68
Unmerged layers     114.54         83.29
As can be seen from Table 1, for the image recognition model based on mobilenet, the merged-layer scheme compared with the unmerged one achieves a computation-time speedup of 1.15 (126.90/110.66); for the image recognition model based on squeezenet, the merged-layer scheme achieves a computation-time speedup of 1.08 (129.17/120.03).
In terms of memory occupation, Table 2 shows that for the image recognition model based on mobilenet, the merged-layer scheme saves (114.54-75.23)/114.54, i.e., 34.3% of memory, compared with the unmerged one; for the image recognition model based on squeezenet, the merged-layer scheme saves 7.9% of memory.
Example three
Table 3 of this example shows the time-consumption comparison, on a mobilenet neural network, between the data processing method of the embodiments of the present invention and the existing Caffe2 and TensorFlow Lite inference frameworks when performing prediction computation, and the corresponding comparison on a squeezenet neural network.
Table 3  Time-consumption comparison (unit: ms)
                  mobilenet    squeezenet
Caffe2              327.82        187.79
Merged layers       117.66        124.03
TensorFlow Lite     176.11        252.21
As can be seen from Table 3, on the mobilenet neural network, the scheme of the embodiments of the present invention achieves a computation-time speedup of 2.78 over the existing Caffe2 (327.82/117.66) and a speedup of 1.50 over the existing TensorFlow Lite (176.11/117.66); on the squeezenet neural network, the scheme of the embodiments of the present invention achieves a computation-time speedup of 1.51 over Caffe2 (187.79/124.03) and a speedup of 2.03 over TensorFlow Lite (252.21/124.03).
It can be understood that the scheme of the embodiments of the present invention can be applied in various structures, models and products that use a neural network structure for data processing, and is not limited to the application fields or scenarios involved above. For example, it is not limited to the control models in the above MOBA game; the scheme can be carried into any product using a neural network structure, such as products using any convolutional neural network structure with the structures of Fig. 1a, Fig. 1b and Fig. 1c, for example, classical neural-network-based image recognition models such as vgg16, mobilenet and squeezenet.
Based on the same principle as the neural-network-based data processing method provided by the embodiments of the present invention, an embodiment of the present invention further provides a neural-network-based data processing apparatus, wherein the neural network includes at least one network merging layer, and the network merging layer includes n cascaded hidden layers, n >= 2.
The data processing apparatus of the embodiments of the present invention includes a data processing module, which is used to control the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
The data processing module of the embodiments of the present invention can be applied in any electronic device; for example, it can be applied in a mobile terminal device, in a fixed terminal device, or in a server, and the functions of the data processing module can specifically be realized under the control of the processor of the electronic device.
It can be understood that the above module of the data processing apparatus in the embodiments of the present disclosure has the function of realizing the corresponding steps of the data processing method shown in any embodiment of the present invention. The function can be realized by hardware, or by hardware executing corresponding software, and the hardware or software includes one or more modules corresponding to the above function. Each of the above modules can be implemented separately, or multiple modules can be implemented in an integrated manner. For a concrete functional description of the data processing apparatus, reference may be made to the corresponding description of the data processing method hereinbefore, and details are not repeated here.
In an optional embodiment of the present invention, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
In an optional embodiment of the present invention, the data processing module can specifically be used to:
control the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and serially output the processing results, where 2 <= i <= n;
control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an optional embodiment of the present invention, when controlling the i-th hidden layer to process the data output by the (i-1)-th hidden layer, the data processing module can specifically be used to:
when the output data meets a preset condition, control the i-th hidden layer to process the output data.
In an optional embodiment of the present invention, the preset condition includes that the output data meets the minimum computation condition for the i-th hidden layer to perform its intra-layer computation.
In an optional embodiment of the present invention, the output data is a partial output of the (i-1)-th hidden layer.
In an optional embodiment of the present invention, the data processing module is also used to:
control at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an optional embodiment of the present invention, when controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer, the data processing module is specifically used to:
when the maximum output duration of the output data does not exceed a set duration, control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current moment and the moment at which the earliest-obtained datum in the output data was obtained.
In an optional embodiment of the present invention, the set duration is determined according to the maximum storage duration of data temporarily stored in the register, and/or the maximum storage duration of data cached in the Cache.
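A minimal sketch of this pipelined interlayer processing follows, under stated assumptions: the (i-1)-th layer emits its output row by row, and the "preset condition" is taken to be the number of rows a kh-row sliding window needs before the i-th layer can start its intra-layer computation. The names `row_op` and `consume_rows` are hypothetical placeholders for the two layers' computations:

```python
import numpy as np

def pipelined_rows(x, row_op, kh, consume_rows):
    """Layer i-1 serially outputs one processed row at a time; as soon as
    the buffered partial output reaches the minimum amount (kh rows) that
    layer i needs, layer i consumes it, interleaved with the remaining
    rows of layer i-1 (a real implementation would overlap the two layers
    in time). Only kh rows are ever buffered, which is what lets the data
    stay in registers/Cache instead of a full intermediate tensor."""
    buf = []
    for row in x:                        # serial, row-wise output of layer i-1
        buf.append(row_op(row))          # partial output data
        if len(buf) == kh:               # minimum computation condition met
            consume_rows(np.stack(buf))  # layer i processes this window
            buf.pop(0)                   # slide the window by one row
```

Bounding how long a buffered row may wait before it is consumed corresponds to the "maximum output duration" check described above.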
In an optional embodiment of the present invention, the neural network is a first convolutional neural network, the network merging layer includes a cascaded first convolutional layer and first Relu layer, and when controlling the network merging layer to perform data processing, the data processing module is specifically used to:
control the first convolutional layer to perform the convolution operation on the input data of the first convolutional layer, serially output each operation result of the first convolutional layer, and control the first Relu layer to perform the Relu operation on each output datum of the first convolutional layer.
In an optional embodiment of the present invention, the neural network is a second convolutional neural network, the network merging layer includes a successively cascaded second convolutional layer, Batch Normalization layer, Scale layer and second Relu layer, and when controlling the network merging layer to perform data processing, the data processing module is specifically used to:
control the second convolutional layer to perform the convolution operation on the input data of the second convolutional layer, and serially output each operation result of the second convolutional layer;
control the Batch Normalization layer to perform the Batch Normalization operation on each output datum of the second convolutional layer, and serially output each operation result of the Batch Normalization layer;
control the Scale layer to perform the Scale operation on each output datum of the Batch Normalization layer, serially output each operation result of the Scale layer, and control the second Relu layer to perform the Relu operation on each output datum of the Scale layer.
In an optional embodiment of the present invention, the neural network is a third convolutional neural network, the network merging layer includes a cascaded hidden layer and an Eltwise layer, the output data of the hidden layer being the input data of the Eltwise layer, and when controlling the network merging layer to perform data processing, the data processing module is specifically used to:
control the hidden layer to perform its corresponding operation on its input data, and serially output each operation result of the hidden layer;
control the Eltwise layer to perform the Eltwise operation on each output datum of the hidden layer together with the output data of the other hidden layer, the output data of the other hidden layer also being input data of the Eltwise layer.
Since the data processing apparatus provided by the embodiments of the present invention is an apparatus capable of executing the data processing method in the embodiments of the present invention, those skilled in the art can, based on the data processing method provided in the embodiments of the present invention, understand the specific implementations and variations of the data processing apparatus of the embodiments of the present invention; therefore, how the apparatus realizes the data processing method in the embodiments of the present invention is not discussed in detail here. Any apparatus used by those skilled in the art to implement the data processing in the embodiments of the present invention falls within the scope of protection of the present application.
Based on the same principle as the data processing method and apparatus provided by the embodiments of the present invention, an embodiment of the present invention further provides an electronic device, which may include a processor and a memory. Readable instructions are stored in the memory, and when the readable instructions are loaded and executed by the processor, the data processing method shown in any embodiment of the present invention can be realized.
An embodiment of the present invention also provides a computer-readable storage medium in which readable instructions are stored; when the readable instructions are loaded and executed by a processor, the data processing method shown in any embodiment of the present invention is realized.
Fig. 7 shows a schematic structural diagram of an electronic device to which the embodiments of the present invention are applicable. As shown in Fig. 7, the electronic device 2000 includes a processor 2001 and a memory 2003, where the processor 2001 is connected with the memory 2003, for example through a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that in practical applications the transceiver 2004 is not limited to one, and the structure of the electronic device 2000 does not constitute a limitation on the embodiments of the present invention.
The processor 2001 is applied in the embodiments of the present invention to realize the functions of the data processing module in the embodiments of the present invention. The transceiver 2004 includes a receiver and a transmitter and is applied in the embodiments of the present invention to realize communication between the electronic device 2000 and other devices, i.e., the sending and receiving of data.
The processor 2001 can be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the disclosure of the present invention. The processor 2001 can also be a combination realizing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 2002 can include a path that transmits information between the above components. The bus 2002 can be a PCI bus, an EISA bus, or the like, and can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in Fig. 7, but this does not mean that there is only one bus or one type of bus.
The memory 2003 can be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
Optionally, the memory 2003 is used to store the application program code for executing the scheme of the present invention, and execution is controlled by the processor 2001. The processor 2001 is used to execute the application program code stored in the memory 2003 to realize the actions of the apparatus provided by the embodiments of the present invention.
It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least part of the steps in the flowcharts of the accompanying drawings may include multiple sub-steps or stages, and these sub-steps or stages are not necessarily executed and completed at the same moment; they can be executed at different moments, and their execution order is not necessarily sequential: they can be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as within the scope of protection of the present invention.

Claims (13)

1. A neural-network-based data processing method, characterized in that the neural network includes at least one network merging layer, the network merging layer including n cascaded hidden layers, n >= 2; the method includes:
controlling the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
2. The method according to claim 1, characterized in that when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
3. The method according to claim 1 or 2, characterized in that controlling the network merging layer to perform data processing comprises:
controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer, and serially outputting the processing results, wherein 2 <= i <= n;
controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
4. The method according to claim 3, characterized in that controlling the i-th hidden layer to process the data output by the (i-1)-th hidden layer comprises:
when the output data meets a preset condition, controlling the i-th hidden layer to process the output data.
5. The method according to claim 4, characterized in that the preset condition includes that the output data meets the minimum computation condition for the i-th hidden layer to perform its intra-layer computation.
6. The method according to any one of claims 3 to 5, characterized in that the output data is a partial output of the (i-1)-th hidden layer.
7. The method according to any one of claims 3 to 6, characterized in that it further comprises:
controlling at least part of the output data to be stored in a register and/or a cache memory (Cache).
8. The method according to claim 7, characterized in that controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer comprises:
when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current moment and the moment at which the earliest-obtained datum in the output data was obtained.
9. The method according to claim 8, characterized in that the set duration is determined according to the maximum storage duration of data temporarily stored in the register, and/or the maximum storage duration of data cached in the Cache.
10. The method according to any one of claims 1 to 9, characterized in that the neural network is a first convolutional neural network, the network merging layer includes a cascaded first convolutional layer and first activation function (Relu) layer, and controlling the network merging layer to perform data processing comprises:
controlling the first convolutional layer to perform the convolution operation on the input data of the first convolutional layer, serially outputting each operation result of the first convolutional layer, and controlling the first Relu layer to perform the Relu operation on each output datum of the first convolutional layer;
alternatively,
the neural network is a second convolutional neural network, the network merging layer includes a successively cascaded second convolutional layer, batch normalization (Batch Normalization) layer, scale-and-shift (Scale) layer and second Relu layer, and controlling the network merging layer to perform data processing comprises:
controlling the second convolutional layer to perform the convolution operation on the input data of the second convolutional layer, and serially outputting each operation result of the second convolutional layer;
controlling the Batch Normalization layer to perform the Batch Normalization operation on each output datum of the second convolutional layer, and serially outputting each operation result of the Batch Normalization layer;
controlling the Scale layer to perform the Scale operation on each output datum of the Batch Normalization layer, serially outputting each operation result of the Scale layer, and controlling the second Relu layer to perform the Relu operation on each output datum of the Scale layer;
alternatively,
the neural network is a third convolutional neural network, the network merging layer includes a cascaded hidden layer and an element-wise operation (Eltwise) layer, the output data of the hidden layer being the input data of the Eltwise layer, and controlling the network merging layer to perform data processing comprises:
controlling the hidden layer to perform its corresponding operation on its input data, and serially outputting each operation result of the hidden layer;
controlling the Eltwise layer to perform the Eltwise operation on each output datum of the hidden layer together with the output data of another hidden layer, the output data of the other hidden layer also being input data of the Eltwise layer.
11. A neural-network-based data processing apparatus, characterized in that the neural network includes at least one network merging layer, the network merging layer including n cascaded hidden layers, n >= 2; the apparatus includes:
a data processing module for controlling the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
12. An electronic device, characterized in that the electronic device includes a processor and a memory;
readable instructions are stored in the memory, and when the readable instructions are loaded and executed by the processor, the data processing method according to any one of claims 1 to 10 is realized.
13. A computer-readable storage medium, characterized in that readable instructions are stored in the storage medium, and when the readable instructions are loaded and executed by a processor, the data processing method according to any one of claims 1 to 10 is realized.
CN201811340948.6A 2018-11-12 2018-11-12 Data processing method, device and equipment based on neural network and storage medium Active CN110163337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811340948.6A CN110163337B (en) 2018-11-12 2018-11-12 Data processing method, device and equipment based on neural network and storage medium

Publications (2)

Publication Number Publication Date
CN110163337A true CN110163337A (en) 2019-08-23
CN110163337B CN110163337B (en) 2023-01-20

Family

ID=67645220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811340948.6A Active CN110163337B (en) 2018-11-12 2018-11-12 Data processing method, device and equipment based on neural network and storage medium

Country Status (1)

Country Link
CN (1) CN110163337B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844833A (en) * 2017-11-28 2018-03-27 郑州云海信息技术有限公司 A kind of data processing method of convolutional neural networks, device and medium
CN108074211A (en) * 2017-12-26 2018-05-25 浙江大华技术股份有限公司 A kind of image processing apparatus and method
WO2018120016A1 (en) * 2016-12-30 2018-07-05 上海寒武纪信息科技有限公司 Apparatus for executing lstm neural network operation, and operational method
CN108446758A (en) * 2018-02-11 2018-08-24 江苏金羿智芯科技有限公司 A kind of serial flow processing method of Neural Network Data calculated towards artificial intelligence
CN108491924A (en) * 2018-02-11 2018-09-04 江苏金羿智芯科技有限公司 A kind of serial stream treatment device of Neural Network Data calculated towards artificial intelligence

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022206536A1 (en) * 2021-03-29 2022-10-06 维沃移动通信有限公司 Data processing method and apparatus, and chip

Also Published As

Publication number Publication date
CN110163337B (en) 2023-01-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant