CN110163337A - Neural-network-based data processing method, device, equipment and storage medium - Google Patents
- Publication number
- CN110163337A (application number CN201811340948.6A)
- Authority
- CN
- China
- Prior art keywords
- layer
- data
- network
- layers
- hidden
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
Embodiments of the present invention provide a neural-network-based data processing method, device, equipment, and storage medium, belonging to the technical field of data processing. The neural network includes at least one network merging layer, and a network merging layer includes n cascaded hidden layers, n >= 2. The data processing method includes: controlling the network merging layer to perform data processing, wherein during the data processing there is inter-layer parallel processing of data between the hidden layers of the network merging layer. The scheme of the embodiments of the present invention can effectively improve data processing efficiency.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a neural-network-based data processing method, device, equipment, and storage medium.
Background technique
With the continuous development of neural network technology, neural networks have been widely used in many fields. For example, convolutional neural networks, owing to their distinctive performance advantages, have been applied extensively in computer vision and image processing, and their application to visual recognition in recent years has also achieved good results.
However, neural networks are large in capacity and high in dimensionality, and have numerous network parameters. When data processing is performed on the basis of a neural network, the computation time can therefore be long; how to improve data processing efficiency is a problem to be solved urgently.
Summary of the invention
The main purpose of the embodiments of the present invention is to provide a neural-network-based data processing method, device, equipment, and storage medium, so as to solve the problem of slow data processing in existing data processing approaches.
In a first aspect, an embodiment of the present invention provides a neural-network-based data processing method, wherein the neural network includes at least one network merging layer, and a network merging layer includes n cascaded hidden layers, n >= 2. The data processing method includes:
controlling the network merging layer to perform data processing, wherein during the data processing there is inter-layer parallel processing of data between the hidden layers of the network merging layer.
In an optional embodiment of the first aspect, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
In an optional embodiment of the first aspect, controlling the network merging layer to perform data processing includes:
controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and to output the processing results serially, wherein 2 <= i <= n;
controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an optional embodiment of the first aspect, controlling the i-th hidden layer to process the data already output by the (i-1)-th hidden layer includes:
when the output data meets a preset condition, controlling the i-th hidden layer to process the data already output.
In an optional embodiment of the first aspect, the preset condition includes the output data meeting the minimum calculation condition for the i-th hidden layer to perform its in-layer operation.
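To illustrate the "minimum calculation condition" above, the following is a minimal, hypothetical sketch (not the patent's implementation): a downstream 1-D convolution starts consuming the upstream layer's serial output as soon as the buffered data covers one kernel window, rather than waiting for the full upstream output. All names are illustrative.

```python
def conv1d_window(window, kernel):
    """Dot product of one sliding window with the kernel."""
    return sum(w * k for w, k in zip(window, kernel))

def pipelined_conv1d(upstream_stream, kernel):
    """Consume upstream outputs one element at a time; emit a result
    whenever the buffer satisfies the minimum calculation condition
    (at least len(kernel) elements available)."""
    buffer, results = [], []
    for value in upstream_stream:        # elements arrive serially
        buffer.append(value)
        if len(buffer) >= len(kernel):   # minimum calculation condition met
            results.append(conv1d_window(buffer[-len(kernel):], kernel))
    return results

out = pipelined_conv1d(iter([1, 2, 3, 4, 5]), kernel=[1, 0, -1])
# out == [-2, -2, -2]
```

With a kernel of size 3, the first result is computed as soon as three upstream elements have arrived, so the two layers overlap in time.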
In an optional embodiment of the first aspect, the output data is a part of the output data of the (i-1)-th hidden layer.
In an optional embodiment of the first aspect, the data processing method further includes:
controlling at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an optional embodiment of the first aspect, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer includes:
when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current time and the moment at which the earliest-obtained data in the output data was obtained.
In an optional embodiment of the first aspect, the set duration is determined according to the maximum storage duration for which the register holds temporary data and/or the maximum storage duration for which the Cache holds cached data.
In an optional embodiment of the first aspect, the neural network is a first convolutional neural network, the network merging layer includes a cascaded first convolutional layer and first activation function (Relu) layer, and controlling the network merging layer to perform data processing includes:
controlling the first convolutional layer to perform convolution operations on the input data of the first convolutional layer and to output each operation result of the first convolutional layer serially, and controlling the first Relu layer to perform a Relu operation on each output datum of the first convolutional layer;
or,
the neural network is a second convolutional neural network, the network merging layer includes a second convolutional layer, a batch normalization (Batch Normalization) layer, a scaling-translation (Scale) layer, and a second Relu layer cascaded in sequence, and controlling the network merging layer to perform data processing includes:
controlling the second convolutional layer to perform convolution operations on the input data of the second convolutional layer and to output each operation result of the second convolutional layer serially;
controlling the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer and to output each operation result of the Batch Normalization layer serially;
controlling the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer and to output each operation result of the Scale layer serially, and controlling the second Relu layer to perform a Relu operation on each output datum of the Scale layer;
or,
the neural network is a third convolutional neural network, the network merging layer includes a cascaded arbitrary hidden layer and element-wise operation (Eltwise) layer, the output data of the arbitrary hidden layer being input data of the Eltwise layer, and controlling the network merging layer to perform data processing includes:
controlling the arbitrary hidden layer to perform its corresponding operation on the input data of the arbitrary hidden layer and to output each operation result of the arbitrary hidden layer serially;
controlling the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer and on the output data of another hidden layer, the output data of the other hidden layer being input data of the Eltwise layer.
In a second aspect, an embodiment of the present invention provides a neural-network-based data processing device, wherein the neural network includes at least one network merging layer, and a network merging layer includes n cascaded hidden layers, n >= 2. The data processing device includes:
a data processing module, configured to control the network merging layer to perform data processing, wherein during the data processing there is inter-layer parallel processing of data between the hidden layers of the network merging layer.
In an optional embodiment of the second aspect, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
In an optional embodiment of the second aspect, the data processing module is specifically configured to:
control the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and to output the processing results serially, wherein 2 <= i <= n;
control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an optional embodiment of the second aspect, when controlling the i-th hidden layer to process the data already output by the (i-1)-th hidden layer, the data processing module is specifically configured to:
when the output data meets a preset condition, control the i-th hidden layer to process the data already output.
In an optional embodiment of the second aspect, the preset condition includes the output data meeting the minimum calculation condition for the i-th hidden layer to perform its in-layer operation.
In an optional embodiment of the second aspect, the output data is a part of the output data of the (i-1)-th hidden layer.
In an optional embodiment of the second aspect, the data processing module is further configured to:
control at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an optional embodiment of the second aspect, when controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer, the data processing module is specifically configured to:
when the maximum output duration of the output data does not exceed a set duration, control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current time and the moment at which the earliest-obtained data in the output data was obtained.
In an optional embodiment of the second aspect, the set duration is determined according to the maximum storage duration for which the register holds temporary data and/or the maximum storage duration for which the Cache holds cached data.
In an optional embodiment of the second aspect, the neural network is a first convolutional neural network, the network merging layer includes a cascaded first convolutional layer and first Relu layer, and when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the first convolutional layer to perform convolution operations on the input data of the first convolutional layer and to output each operation result of the first convolutional layer serially, and control the first Relu layer to perform a Relu operation on each output datum of the first convolutional layer.
In an optional embodiment of the second aspect, the neural network is a second convolutional neural network, the network merging layer includes a second convolutional layer, a Batch Normalization layer, a Scale layer, and a second Relu layer cascaded in sequence, and when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the second convolutional layer to perform convolution operations on the input data of the second convolutional layer and to output each operation result of the second convolutional layer serially;
control the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer and to output each operation result of the Batch Normalization layer serially;
control the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer and to output each operation result of the Scale layer serially, and control the second Relu layer to perform a Relu operation on each output datum of the Scale layer.
In an optional embodiment of the second aspect, the neural network is a third convolutional neural network, the network merging layer includes a cascaded arbitrary hidden layer and Eltwise layer, the output data of the arbitrary hidden layer being input data of the Eltwise layer, and when controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the arbitrary hidden layer to perform its corresponding operation on the input data of the arbitrary hidden layer and to output each operation result of the arbitrary hidden layer serially;
control the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer and on the output data of another hidden layer, the output data of the other hidden layer being input data of the Eltwise layer.
In a third aspect, an embodiment of the present invention provides an electronic device, the electronic device including a processor and a memory; the memory stores readable instructions which, when loaded and executed by the processor, implement the data processing method of the first aspect or of any optional embodiment of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing readable instructions which, when loaded and executed by a processor, implement the data processing method of the first aspect or of any optional embodiment of the first aspect.
The technical solutions provided by the embodiments of the present invention have the following beneficial effects:
In the neural-network-based data processing method, device, equipment, and storage medium provided by the embodiments of the present invention, while data processing is performed through a network merging layer of the neural network, the cascaded hidden layers of the network merging layer are controlled to carry out inter-layer parallel processing of data, so that data may be processed simultaneously between different hidden layers of the merging layer. Compared with the prior art, among the multiple cascaded hidden layers of a network merging layer, a next hidden layer does not need to wait until the previous hidden layer has finished processing all of the data before it starts processing data. Through the scheme of the embodiments of the present invention, the time overhead of data processing can be effectively reduced and the data processing efficiency improved.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below.
Fig. 1a, Fig. 1b and Fig. 1c are structural schematic diagrams of three existing neural network structures;
Fig. 2 is a structural schematic diagram of a network merging layer in an example of the present invention;
Fig. 3a and Fig. 3b are schematic diagrams of two input structures of the network merging layer in examples of the present invention;
Fig. 4a is a structural schematic diagram of an existing neural network structure;
Fig. 4b is a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 4a;
Fig. 5a is a structural schematic diagram of an existing neural network structure;
Fig. 5b is a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 5a;
Fig. 6a is a structural schematic diagram of an existing neural network structure;
Fig. 6b is a structural schematic diagram of a network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 6c is a structural schematic diagram of another network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 6d is a structural schematic diagram of yet another network merging layer corresponding to the neural network structure in Fig. 6a;
Fig. 7 is a structural schematic diagram of an electronic device provided by an embodiment of the present invention.
Specific embodiments
To make the purpose, features, and advantages of the present invention more obvious and easy to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.
The embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting the claims.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include plural forms. It should be further understood that the word "comprising" used in the specification of the present invention indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may also be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes all or any unit and all combinations of one or more of the associated listed items.
How the technical solutions of the present invention solve the above technical problems is described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below in conjunction with the accompanying drawings.
A neural network is a computational model composed of a large number of interconnected nodes (or neurons). A whole neural network can be divided into an input layer, hidden layers, and an output layer, and a neural network may include one or more hidden layers. The input layer is the first layer of the neural network and is responsible for receiving the network's raw input data and passing the received input data to the hidden layers; the hidden layers are responsible for the required computation and output their computation results to the output layer; the output layer is the last layer of the neural network and is responsible for receiving the final output of the hidden layers, from which the desired values in the desired range, i.e. the final processing result of the neural network, are obtained.
An L-layer neural network generally refers to a neural network with L hidden layers, the input layer and output layer not being counted. For example, common Conv (convolution) layers, Relu (Rectified Linear Unit, an activation function) layers, POOL (pooling) layers, Batch Normal/BN (Batch Normalization) layers, Scale (scaling and translation) layers, and Eltwise (element-wise operation) layers are all hidden layers of a neural network.
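The layered structure described above can be reduced to a minimal, generic sketch: an input layer hands data to a chain of hidden layers, and the last hidden layer's result becomes the network output. The two hidden "layers" below are simple illustrative functions standing in for real Conv/Relu operations, not anything specified by the patent.

```python
def forward(network_input, hidden_layers):
    data = network_input                 # input layer: receives raw input
    for layer in hidden_layers:          # hidden layers: do the computation
        data = [layer(x) for x in data]
    return data                          # final result passed to the output layer

out = forward([1.0, -2.0, 3.0],
              [lambda x: 2 * x,          # stand-in for a linear/conv layer
               lambda x: max(0.0, x)])   # stand-in for a Relu layer
# out == [2.0, 0.0, 6.0]
```

Note that this baseline processes the layers strictly one after another; the merging scheme below relaxes exactly that constraint.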
Owing to their distinctive nonlinear adaptive information-processing capability, neural networks have been successfully applied in fields such as pattern recognition, intelligent control, combinatorial optimization, and prediction. In recent years, neural networks have developed further along the path of imitating human cognition and have become an important direction of AI (Artificial Intelligence). However, existing neural-network-based data processing methods suffer from long computation time and low data processing efficiency, and cannot well meet the efficient data processing needs of practical applications.
Taking convolutional neural networks as an example: owing to their outstanding performance in visual recognition in recent years, convolutional neural networks are being applied more and more widely, including in various game scenarios such as MOBA (Multiplayer Online Battle Arena) games. In a MOBA game, problems often arise such as players going offline, players being away from keyboard, and players on a losing streak feeling strongly frustrated. In order not to affect the experience of other players, an AI can take over the play of offline or away players; or, to console players on a losing streak, such players can be matched against AI opponents with weaker combat ability.
In many MOBA mobile games, to improve the real-time performance of the computation, predictions such as the AI takeover computation are performed on the mobile terminal, which reduces cost and improves real-time performance. On a mobile terminal, the GPU (Graphics Processing Unit) is needed for rendering the game and its resources are limited, so AI inference on the mobile terminal usually occupies the limited CPU (Central Processing Unit) and memory resources, while the AI is required to compute the operation instructions of the taken-over player quickly. This places stricter requirements on the AI forward-inference framework of the mobile terminal.
When processing a convolutional neural network model, existing common deep learning inference frameworks on mobile terminals, such as caffe2 and TensorFlow Lite, create a corresponding object for each hidden layer of the convolutional neural network; the output of one hidden layer serves as the input of the next hidden layer, and the processing of the next hidden layer's data must wait until the processing results of the previous hidden layer have all been output. However, by the time the next hidden layer obtains the output data of the previous hidden layer for computation, the data is usually no longer in a register, often not even in the cache (cache memory); the data must be fetched from memory into the cache and then loaded from the cache into registers, which incurs a large time overhead and slows the forward computation of the entire neural network. Furthermore, since most of the output data of the previous hidden layer resides in memory, existing neural-network-based data computation approaches also occupy excessive memory resources, which affects the overall performance of the mobile terminal.
Fig. 1a, Fig. 1b and Fig. 1c are structural schematic diagrams of partial layer structures of three different existing convolutional neural network models, in which the input of the Eltwise layer in Fig. 1c is connected to the outputs of two layers, its input data being the output data of the arbitrary layer1 and arbitrary layer2 shown in the figure. When processing a convolutional neural network model such as the three structures shown in Fig. 1a, Fig. 1b and Fig. 1c, existing deep learning inference frameworks compute according to the per-layer structure retained from the original model: a corresponding object is created for each layer, the output of one layer serves as the input of the next, and inter-layer data is transmitted through memory. For instance, in the network structure shown in Fig. 1a, the Conv layer and the Relu layer each correspond to their own object; when performing data processing based on this network structure, the objects corresponding to the two layers must be initialized and called separately, and only after the Conv layer has completed all operations on its input data can the Relu layer carry out its processing based on all of the Conv layer's output data. Since it takes each hidden layer of the neural network a long time to finish processing all of the data, by the time the Relu layer is needed for data processing, most or all of the Conv layer's output data has already gone to memory, making the time overhead of data loading large and harming data processing efficiency.
In addition, the comparison of Fig. 1a, Fig. 1b and Fig. 1c also shows that the more hidden layers a neural network has, the more time is consumed in inter-layer data loading.
To solve at least one of the technical problems of existing neural-network-based data processing, an embodiment of the present invention provides a neural-network-based data processing method. The method merges some hidden layers of the neural network across layers, and controls the resulting network merging layer so that, during data processing, there is inter-layer parallel processing of data between its hidden layers, thereby reducing the time overhead of inter-layer data loading and improving data processing efficiency. In addition, the method of the embodiments of the present invention can also effectively reduce the occupation of memory resources. The concrete schemes provided by the embodiments of the present invention are further described below.
In the neural-network-based data processing method provided by the embodiments of the present invention, the neural network includes at least one network merging layer which, as shown in Fig. 2, may include n cascaded hidden layers, n >= 2; that is, each network merging layer includes at least two cascaded hidden layers. The data processing method includes:
controlling the network merging layer to perform data processing, wherein during the data processing there is inter-layer parallel processing of data between the hidden layers of the network merging layer.
Here, inter-layer parallel processing refers to parallel processing of data between different hidden layers: while data passes through the network merging layer of the neural network, at least two of the n cascaded hidden layers may be processing data in parallel.
For example, suppose a network merging layer of the neural network includes a Conv layer and a Relu layer, the input data of the Relu layer being the output data of the Conv layer. While data processing is performed through this network merging layer, the Conv layer and the Relu layer may process data in parallel; that is, the Relu layer's data processing can start before the Conv layer has finished processing all of the Conv layer's data. Typically, once the Conv layer has finished processing part of its data and the input data required for the Relu layer's operation is available, the partial data already processed by the Conv layer is input to the Relu layer under control, so that the Relu layer's operation starts and the Conv layer and the Relu layer process data simultaneously.
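The Conv/Relu example above can be sketched as a generator pipeline (a simplified illustration under stated assumptions, not the patent's actual implementation): each Conv result is handed to the Relu stage as soon as it is produced, instead of after the whole Conv output is finished. The multiplication standing in for a convolution is purely illustrative.

```python
def conv_layer(inputs, weight):
    for x in inputs:                 # serial output, one result at a time
        yield x * weight             # stand-in for a convolution operation

def relu_layer(stream):
    for y in stream:                 # starts before conv_layer has finished
        yield max(0.0, y)

# The merged Conv+Relu layer: Relu pulls each Conv result immediately,
# so the intermediate value never needs to round-trip through memory.
fused = list(relu_layer(conv_layer([-2.0, 1.0, 3.0], weight=2.0)))
# fused == [0.0, 2.0, 6.0]
```

Because the intermediate Conv result is consumed right away, it can plausibly stay in a register or cache, which is the time-overhead saving the scheme targets.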
Cascaded hidden layers means that the output of the previous hidden layer is connected to the input of the next hidden layer, the input data of the next hidden layer depending on the output data of the previous hidden layer. For example, a network merging layer of a convolutional neural network may include 2 hidden layers, such as a Conv layer and a Relu layer; the input data of the Relu layer is the output data of the Conv layer, and the Conv layer is cascaded with the Relu layer.
It should be noted that, in practical applications, the number of network merging layers in the neural network and the number of hidden layers included in a single network merging layer can both be configured according to the practical application requirements, the practical application scenario, and the attributes of each hidden layer of the neural network. For example, two or more hidden layers with relatively small data processing loads can be merged into one network merging layer. As another example, when data processing is performed through the neural network, in application scenarios where the amount of data is relatively small, a network merging layer may include a relatively large number of hidden layers, whereas in application scenarios where the amount of data is relatively large, a network merging layer may include a relatively small number of hidden layers.
For a network merging layer, its input data includes the input data of its first hidden layer, and its output data is the output data of the last hidden layer it includes. In addition, depending on which hidden layers a network merging layer itself includes, the source of its input data may differ. For example, in the schematic diagram of part of a neural network structure shown in Fig. 3a, when the network merging layer includes the first hidden layer of the neural network, the input of the network merging layer is the output of the input layer of the neural network, i.e. the raw input data of the neural network. When the network merging layer does not include the first hidden layer of the neural network, the input of the network merging layer is the input data of the first hidden layer included in the network merging layer itself; in the schematic diagram of part of a neural network structure shown in Fig. 3b, the input data of the first hidden layer of the network merging layer is the output data of the previous hidden layer connected to that hidden layer, and this data is the input data of the network merging layer.
Likewise, the output data of a network merging layer may be the output data of the neural network, i.e. the input data of the output layer of the neural network, or it may be the input data of the next hidden layer of the neural network (the hidden layer connected to the last hidden layer of the network merging layer).
As an example, Fig. 4a shows a schematic diagram of a partial network structure of an existing neural network, which includes a cascaded Conv layer and Relu layer; the input of this network structure is the input Input1 of the Conv layer, and its output is the output Output1 of the Relu layer. Fig. 4b shows a structural schematic diagram of the network merging layer (the Conv+relu merging layer shown in the figure) obtained by performing inter-layer merging of the Conv layer and the Relu layer of the network structure shown in Fig. 4a based on the method of the embodiment of the present invention. That is, the network merging layer shown in Fig. 4b includes the Conv layer and the Relu layer, and has the functions of both the Conv calculation and the Relu calculation. The input of this network merging layer is the input Input1 of the former Conv layer shown in Fig. 4a, and its output is the output Output1 of the former Relu layer shown in Fig. 4a.
As another example, Fig. 5a shows a schematic diagram of a partial network structure of an existing neural network, which includes, cascaded in sequence, a Conv layer, a Batch Normal layer, a Scale layer and a Relu layer; the input of this network structure is the input Input1 of the Conv layer, and its output is the output Output1 of the Relu layer. Fig. 5b shows a structural schematic diagram of the network merging layer (the Conv+batchnormal+scale+relu merging layer shown in the figure) obtained by performing inter-layer merging of the Conv layer, Batch Normal layer, Scale layer and Relu layer of the network structure shown in Fig. 5a based on the method of the embodiment of the present invention. This network merging layer has the functions of the four cascaded hidden layers shown in Fig. 5a; its input is the input Input1 of the former Conv layer shown in Fig. 5a, and its output is the output Output1 of the former Relu layer shown in Fig. 5a.
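A minimal sketch of such a four-layer merging, again under assumed names and with scalar Batch Normalization parameters for simplicity (the patent does not give an implementation), applies the normalization, scaling and Relu to each Conv output value while it is still fresh:

```python
import numpy as np

def fused_conv_bn_scale_relu(x, kernel, mean, var, gamma, beta, eps=1e-5):
    """Hypothetical sketch of the Conv+batchnormal+scale+relu merging
    layer of Fig. 5b: Batch Normalization, Scale and Relu are applied
    to each Conv output value immediately, not in three later passes."""
    k = len(kernel)
    out = np.empty(len(x) - k + 1)
    inv_std = 1.0 / np.sqrt(var + eps)
    for i in range(len(out)):
        v = float(np.dot(x[i:i + k], kernel))   # Conv
        v = (v - mean) * inv_std                # Batch Normalization
        v = gamma * v + beta                    # Scale
        out[i] = max(v, 0.0)                    # Relu
    return out
```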
As another example, Fig. 6a shows a schematic diagram of a partial network structure of an existing neural network, which includes an arbitrary hidden layer layer1 and an arbitrary hidden layer layer2; the outputs of layer1 and layer2 are the inputs of an Eltwise layer, and the output Output2 of the Eltwise layer is the output of this network structure. Fig. 6b, Fig. 6c and Fig. 6d respectively show structural schematic diagrams of network merging layers obtained by performing inter-layer merging on layer1, layer2 and the Eltwise layer of the network structure shown in Fig. 6a based on the method of the embodiment of the present invention: the Layer2+eltwise merging layer shown in Fig. 6b, the Layer1+eltwise merging layer shown in Fig. 6c, and the Layer1+layer2+eltwise merging layer shown in Fig. 6d. The network merging layer shown in Fig. 6d includes two cascades of hidden layers: layer1 is cascaded with the Eltwise layer, and layer2 is also cascaded with the Eltwise layer.

As can be seen from Fig. 6b, Fig. 6c and Fig. 6d, either one of layer1 and layer2, or both of them, can be merged inter-layer with the Eltwise layer to obtain a network merging layer. As shown in Fig. 6b, the eltwise calculation performed by the Eltwise layer can be moved into layer2, i.e., the Eltwise layer is merged with layer2 to obtain the Layer2+eltwise merging layer; layer1 processes its input data Input1, the output Output1 of layer1 serves as one input of the network merging layer, the input Input2 of layer2 before the merging still serves as an input of the Layer2+eltwise merging layer, and the output of the Layer2+eltwise merging layer is the output of the former Eltwise layer. In Fig. 6c, the eltwise calculation performed by the Eltwise layer is instead moved into layer1, i.e., the Eltwise layer is merged with layer1 to obtain the Layer1+eltwise merging layer; layer2 processes its input data Input2, the output of layer2 serves as one input of the network merging layer, and the input Input1 of layer1 before the merging still serves as an input of the Layer1+eltwise merging layer. In Fig. 6d, layer1, layer2 and the Eltwise layer are all merged inter-layer.
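A sketch of the Fig. 6b case, with assumed names and a callback standing in for the layer2 computation (the patent describes the idea, not an API), shows the Eltwise step being performed on each layer2 output element as soon as it is produced, using the already-available layer1 output:

```python
import numpy as np

def layer2_plus_eltwise(input2, layer2_fn, output1, op="sum"):
    """Hypothetical sketch of the Layer2+eltwise merging layer of
    Fig. 6b: the Eltwise computation is moved into layer2, so each
    layer2 output element is combined with the corresponding element
    of layer1's output Output1 immediately after it is produced."""
    out = np.empty_like(output1)
    for i in range(len(out)):
        v = layer2_fn(input2, i)          # one layer2 output element
        if op == "sum":
            out[i] = v + output1[i]       # Eltwise sum, done at once
        elif op == "prod":
            out[i] = v * output1[i]       # Eltwise element-wise product
        else:
            out[i] = max(v, output1[i])   # Eltwise max (larger value)
    return out
```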
In the data processing method of the embodiment of the present invention, while the multiple cascaded hidden layers included in the network merging layer of the neural network perform data processing, each cascaded hidden layer is controlled so that inter-layer parallel processing of data occurs during the processing, allowing different hidden layers of the merging layer to process data partly in synchrony. Compared with existing methods of performing data processing with a neural network, among the multiple cascaded hidden layers of the network merging layer, the next hidden layer does not need to wait until the preceding hidden layer has finished processing all of its data, which can effectively reduce the latency of data processing and improve the processing efficiency of data.
For example, for the network merging layer shown in Fig. 4b, the Relu layer is merged into the Conv layer for calculation. When data calculation is performed based on this network merging layer, the network merging layer first performs the conv calculation to obtain one corresponding output datum; at that moment, the network merging layer can be controlled to perform the Relu-related operation immediately, so that the network merging layer, based on the Conv layer and Relu layer it includes, performs the Conv operation and the Relu-related operation in parallel, improving the processing efficiency of data.
For the network merging layer shown in Fig. 5b, the Batch Normal layer, the Scale layer and the Relu layer are merged into the Conv layer for calculation. When data calculation is performed based on this network merging layer, the network merging layer first performs the Conv calculation to obtain one corresponding output datum; the Batch Normal related operation can then be performed on that output datum immediately, followed at once by the Scale and Relu related operations.
For the network merging layers shown in Fig. 6b, Fig. 6c and Fig. 6d: when forward data calculation is performed based on the network merging layer shown in Fig. 6b or Fig. 6c, taking Fig. 6c as an example, whenever the calculation corresponding to the former layer1 generates one output datum within the network merging layer, the Eltwise calculation can be performed immediately on that output datum and the output of layer2. When forward data calculation is performed based on the network merging layer shown in Fig. 6d, as soon as the calculations corresponding to the former layer1 and layer2 have each generated an output datum, the Eltwise calculation can be performed immediately on the data calculated by layer1 and layer2.

It should be noted that, for the network merging layer corresponding to the network structure in Fig. 6d, when data processing is performed, the calculation of layer1 and the calculation of layer2 within the merging layer can, according to actual needs, be controlled to be either serial or parallel. So-called serial calculation corresponds to the calculation manner of layer1 and layer2 described above for Fig. 6b and Fig. 6c, e.g., the layer2 operation starts only after the layer1 calculation completes; so-called parallel calculation means performing the calculation of layer1 and the calculation of layer2 at the same time.
In an optional embodiment of the present invention, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.

Instantiation refers, in object-oriented programming, to the process of creating an object from a class, i.e., creating an object of a class. To create and use an object, the loading of the class and the instantiation of the class must first be completed. Loading a class means loading the class into memory; instantiating a class is the process of going from the class to a specific object, the object being a specific instance that includes a set of attributes and methods (a method being a segment of code that accomplishes a certain function). In data processing, the methods of an object are invoked by calling the object, realizing the functions corresponding to the object's methods.
In the embodiment of the present invention, in order to better guarantee the inter-layer parallel processing of data among the hidden layers of the network merging layer during data processing, all the hidden layers of the network merging layer are instantiated as a single object, so that when this object is called, the initialization of every hidden layer of the network merging layer can be completed at the same time, i.e., the loading of the attribute information and methods of each hidden layer is completed, and the i-th hidden layer of the network merging layer can process data based on the output data of the (i-1)-th hidden layer at any time, improving the efficiency of data processing.
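The single-object-instance idea can be sketched as follows (a minimal illustration with assumed names; the patent does not prescribe a class design): one construction initializes every merged hidden layer, and the forward pass cascades each layer's output into the next.

```python
class MergedLayer:
    """Hypothetical sketch of instantiating n cascaded hidden layers
    as one object instance: a single construction loads every merged
    layer's attributes and methods together, so layer i can consume
    layer i-1's output without a separate per-layer object."""
    def __init__(self, layer_fns):
        # one instantiation initializes all merged hidden layers
        self.layer_fns = list(layer_fns)

    def forward(self, x):
        for fn in self.layer_fns:   # cascade: the output of layer i-1
            x = fn(x)               # feeds directly into layer i
        return x
```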
In an optional embodiment of the present invention, controlling the network merging layer to perform data processing may specifically include:

controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and output the processing results serially, wherein 2 ≤ i ≤ n;

controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
Here, the output data refer to data for which the (i-1)-th hidden layer has completed the operation corresponding to the (i-1)-th layer; for a Conv layer, for example, the output data of the Conv layer are data for which the Conv calculation has been completed. Outputting the processing results serially means that data whose processing is completed first are output first, and data whose processing is completed later are output later, rather than outputting everything together only after the operations on all of this layer's data are complete.
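Serial output of this kind can be sketched with Python generators (an illustrative assumption; the patent names no mechanism): the (i-1)-th layer yields each result the moment it is computed, and the i-th layer starts consuming immediately instead of waiting for a complete buffer.

```python
def conv_layer_stream(xs, kernel):
    """Layer i-1: emit each result as soon as it is computed
    (serial output), instead of buffering the whole layer's output."""
    k = len(kernel)
    for i in range(len(xs) - k + 1):
        yield sum(a * b for a, b in zip(xs[i:i + k], kernel))

def relu_layer_stream(values):
    """Layer i: start processing each incoming value immediately."""
    for v in values:
        yield v if v > 0 else 0.0
```

Chaining the two generators gives the inter-layer pipeline: `relu_layer_stream(conv_layer_stream(xs, kernel))` produces each Relu result as soon as the corresponding Conv result exists.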
When data processing is performed by the network merging layer, each hidden layer is controlled, after completing each operation, to output the calculated data to the next hidden layer, so that the next hidden layer can process the data quickly without waiting until the preceding hidden layer has completed all of its operations, thereby improving the efficiency with which the network merging layer processes data. Furthermore, since the data whose operations the (i-1)-th hidden layer completes first are output first, and those data can immediately participate in the data operations of the i-th hidden layer, the output data of the (i-1)-th hidden layer on which the current operation of the i-th hidden layer relies are, with high probability, still located in the registers of the terminal device; and since registers have a very high read/write speed, the reading of data can be finished very quickly when data processing is performed by the i-th hidden layer, effectively improving the processing efficiency of data.
In an optional embodiment of the present invention, controlling the i-th hidden layer to process the data output by the (i-1)-th hidden layer includes:

when the output data satisfy a preset condition, controlling the i-th hidden layer to process the data that have been output.
Here, the preset condition can be configured according to one or more kinds of information such as the application scenario, practical application requirements, the attribute information of each hidden layer included in the network merging layer, and the length of time data can be temporarily held in a register. In an optional embodiment of the present invention, the preset condition may include: the output data satisfy the minimum intra-layer calculation condition for the i-th hidden layer to perform its operation.

Here, the minimum intra-layer calculation condition for a hidden layer to perform its operation refers to the minimum data required for at least one neuron of that hidden layer to be able to start its operation when data operations are performed by that hidden layer. For different hidden layers, this condition may be the same or different.

When the preset condition corresponding to each hidden layer of the network merging layer is the minimum intra-layer calculation condition of that hidden layer, then, taking the network merging layer shown in Fig. 4b as an example, if the minimum intra-layer calculation condition of the Relu layer of the network merging layer is one output datum of the Conv layer (the operation result after one Conv operation is completed), the Relu layer can perform the Relu operation on that output datum immediately once the Conv layer has calculated one corresponding output datum.

It will be understood that, besides the output data satisfying the minimum intra-layer calculation condition for the i-th hidden layer to perform its operation, the preset condition may also include other configured conditions.
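The minimum-calculation-condition check can be modelled very simply (an assumed formulation: here the condition is a count of required input values, e.g. 1 for a Relu neuron, 3 for a 3-tap convolution):

```python
def can_start(layer_min_inputs, available_outputs):
    """Hypothetical sketch of the preset-condition check: the i-th
    hidden layer may start its operation once the data already output
    by the (i-1)-th hidden layer satisfy the layer's minimum
    intra-layer calculation condition, modelled as a required count."""
    return len(available_outputs) >= layer_min_inputs
```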
In an optional embodiment of the present invention, the above output data are a partial output of the (i-1)-th hidden layer.

In order to guarantee that inter-layer parallel processing of data exists among the multiple hidden layers of the network merging layer when data processing is performed by the network merging layer, the i-th hidden layer needs to be able to start its data operations based on a partial output of the (i-1)-th hidden layer. That is, the minimum intra-layer calculation condition for the i-th hidden layer to perform its operation depends on a partial output of the (i-1)-th hidden layer, rather than on the entire output of the (i-1)-th hidden layer.
In an optional embodiment of the present invention, the data processing method may further include:

controlling at least part of the above output data to be stored in registers and/or the Cache.

A register is an element inside the CPU with a very high read/write speed. The Cache, i.e., the cache memory, is a memory located between the CPU and main memory that has a relatively small capacity but a very high speed. The access rates of registers, the Cache and main memory, from fast to slow, are: registers, Cache, memory. When data processing is performed, if the data to be read are in main memory, they first need to be loaded from memory into the Cache and then loaded from the Cache into registers, which produces a large time overhead and leads to lower data processing efficiency.

In order to improve the processing efficiency of data, in the embodiment of the present invention, when data processing is performed by the network merging layer, part or all of the output data of the (i-1)-th hidden layer on which the operations of the i-th hidden layer rely can be controlled to be stored in registers and/or the Cache, so as to further reduce the time overhead caused by data loading and improve processing efficiency.
In practical applications, which two or more hidden layers are merged into one network merging layer can be determined according to the attribute information of each hidden layer included in the neural network, the amount of data each hidden layer needs to process, and the attributes of the registers and/or Cache of the electronic device, so that the above output data of the (i-1)-th hidden layer are located entirely in registers, or entirely in registers and the Cache, reducing the data loading time as much as possible and improving processing efficiency.
For the network merging layer shown in Fig. 4b, after the Conv layer of the network merging layer performs the Conv calculation on the input data of the Conv layer and obtains one corresponding output datum, that datum is still in a register, so the Relu-related operation can be performed on it immediately, allowing the input data of the Relu calculation to be fetched from registers; compared with the prior art, this effectively reduces the time overhead of data loading and improves data processing efficiency. In addition, since the data processing method of the embodiment of the present invention can greatly reduce the memory occupied by data storage, it can effectively reduce the memory resources occupied during data processing while also reducing computation latency.
In an optional embodiment of the present invention, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer includes:

when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;

wherein the maximum output duration refers to the duration between the current time and the moment at which the earliest of the output data was obtained, that is, among the output data of the (i-1)-th hidden layer, the duration from the moment the earliest-output datum was obtained (the moment it was calculated in the (i-1)-th hidden layer) to the current time.
In practical applications, the above set duration can be pre-configured to control the storage location of the output data of the (i-1)-th layer; the set duration can be determined based on empirical values and/or experimental values.

In an optional embodiment of the present invention, the set duration can be determined according to the maximum storage duration of data temporarily held in registers, and/or the maximum storage duration of data cached in the Cache.
Here, the maximum storage duration of data temporarily held in a register refers to the length of time data can remain temporarily in the register; it can be understood as the time after which a datum that has entered a register and is not used will move from the register into the Cache. Likewise, the maximum storage duration of data cached in the Cache refers to the length of time data can remain temporarily in the Cache, which can be understood as the time after which a datum that has entered the Cache and is not used will move from the Cache into memory.

In practical applications, the set duration can be configured according to the maximum storage duration of data temporarily held in registers, so as to control the output data to all be located in registers; or the set duration can be configured according to the maximum storage duration of data temporarily held in the Cache, so as to control the data to all be located in the Cache and registers.
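The reasoning behind these thresholds can be sketched as a simple tiering rule (the numeric thresholds and function name are assumptions for illustration): given how long ago a datum was produced, it is expected to still reside in a register, in the Cache, or only in memory.

```python
def storage_tier(age, reg_max_age, cache_max_age):
    """Hypothetical sketch of the duration-based placement reasoning:
    within the register's maximum storage duration the datum is
    expected in a register, within the Cache's in the Cache,
    otherwise it has fallen back to main memory."""
    if age <= reg_max_age:
        return "register"
    if age <= cache_max_age:
        return "cache"
    return "memory"
```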
It should be noted that, in practical applications, the condition that the maximum output duration of the output data does not exceed the set duration can be configured separately, or it can be included in the above preset condition; that is, the above preset condition may include: the maximum output duration of the output data does not exceed the set duration.
In an optional embodiment of the present invention, the neural network may be a first convolutional neural network, and the network merging layer includes a cascaded first convolutional layer and first Relu layer. Controlling the network merging layer to perform data processing may specifically include:

controlling the first convolutional layer to perform the convolution operation on the input data of the first convolutional layer and output each operation result of the first convolutional layer serially, and controlling the first Relu layer to perform the Relu operation on each output datum of the first convolutional layer.

Specifically, for example, when the mobile-terminal inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 4a in a convolutional neural network model, the Conv layer and the Relu layer can be merged into one layer structure when the forward-calculation framework loads the model structure and creates the graph structure, i.e., only one layer object instance is created. The merged network merging layer (such as the Conv+relu merging layer shown in Fig. 4b) takes the input of the former Conv layer as its input and the output of the former Relu layer as its output. In subsequent forward inference calculation, the network merging layer first performs the Conv calculation to obtain one corresponding output datum; since the datum is still in a register at that moment, the Relu-related operation can be performed on it immediately, allowing the input data of the Relu operation to be obtained from registers and effectively improving the calculation speed of the entire forward calculation of the convolutional neural network.
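The graph-construction step described above can be sketched as a simple rewrite pass (the list-of-(name, op) graph representation is an assumption; real frameworks such as Caffe2 or TensorFlow Lite use their own graph formats): while building the graph, an adjacent Conv → Relu pair is replaced by a single merged node, so only one layer object instance is created for the pair.

```python
def merge_conv_relu(graph):
    """Hypothetical sketch of merging adjacent Conv and Relu nodes
    at graph-creation time; `graph` is assumed to be a list of
    (name, op) tuples in cascade order."""
    merged, i = [], 0
    while i < len(graph):
        name, op = graph[i]
        if op == "Conv" and i + 1 < len(graph) and graph[i + 1][1] == "Relu":
            merged.append((name + "+relu", "ConvRelu"))  # one fused layer
            i += 2                                       # consume both nodes
        else:
            merged.append(graph[i])
            i += 1
    return merged
```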
In an optional embodiment of the present invention, the neural network is a second convolutional neural network, and the network merging layer includes, cascaded in sequence, a second convolutional layer, a Batch Normalization layer, a scaling-and-translation Scale layer and a second Relu layer. Controlling the network merging layer to perform data processing may specifically include:

controlling the second convolutional layer to perform the convolution operation on the input data of the second convolutional layer and output each operation result of the second convolutional layer serially;

controlling the Batch Normalization layer to perform the Batch Normalization operation on each output datum of the second convolutional layer and output each operation result of the Batch Normalization layer serially;

controlling the Scale layer to perform the Scale operation on each output datum of the Batch Normalization layer and output each operation result of the Scale layer serially, and controlling the second Relu layer to perform the Relu operation on each output datum of the Scale layer.

Specifically, for example, when the mobile-terminal inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 5a in a convolutional neural network model, the Conv layer, the Batch Normalization layer, the Scale layer and the Relu layer can be merged into one layer structure when the forward-calculation framework loads the model structure and creates the graph structure, so that only one layer object instance is created. The merged network merging layer (such as the Conv+batchnormal+scale+relu merging layer shown in Fig. 5b) takes the input of the former Conv layer as its input and the output of the former Relu layer as its output. In subsequent forward inference calculation, the network merging layer first performs the Conv calculation to obtain one corresponding output datum; since the datum is still in a register at that moment, the Batch Normal related operation can be performed on it immediately, followed by the Scale operation and the Relu operation, so that the data required for each next calculation within the merging layer can be obtained from registers, reducing the time consumed by inter-layer data loading.
In an optional embodiment of the present invention, the neural network may be a third convolutional neural network, and the network merging layer includes a cascaded arbitrary hidden layer and Eltwise layer, the output data of the arbitrary hidden layer being an input of the Eltwise layer. Controlling the network merging layer to perform data processing may specifically include:

controlling the arbitrary hidden layer to perform the corresponding operation on the input data of the arbitrary hidden layer and output each operation result of the arbitrary hidden layer serially;

controlling the Eltwise layer to perform the Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, the output data of the other hidden layer being an input of the Eltwise layer.

Here, both the arbitrary hidden layer in this scheme and the other hidden layer are upper-level hidden layers cascaded with the Eltwise layer. The Eltwise layer performs a fusion operation on the output data of the arbitrary hidden layer and the output data of the other hidden layer, such as prod (element-wise product), sum (element-wise addition or subtraction) or max (taking the larger value). It will be understood that there may be one other hidden layer, or there may be several. For example, in the network structure shown in Fig. 6a, the Eltwise layer performs the Eltwise operation on the output data of two hidden layers (layer1 and layer2 shown in the figure); in that case, one of layer1 and layer2 corresponds to the arbitrary hidden layer in this scheme and the other corresponds to the other hidden layer. The following explanation uses the network structure shown in Fig. 6.

Specifically, when the mobile-terminal inference framework Caffe2 or TensorFlow Lite processes a network structure such as that shown in Fig. 6a in a convolutional neural network model, layer2 and the Eltwise layer can be merged into one layer structure (the Layer2+eltwise merging layer shown in Fig. 6b) when the forward-calculation framework loads the model structure and creates the graph structure, with only one layer object instance created; alternatively, layer1 and the Eltwise layer can be merged into one layer structure (the Layer1+eltwise merging layer shown in Fig. 6c). For the network merging layer shown in Fig. 6b, the output of layer1 serves as one input of the network merging layer, and the input of layer2 before the merging still serves as an input of the network merging layer. Whenever the calculation corresponding to the former layer2 generates one output datum within the network merging layer, the Eltwise operation is immediately performed on that output datum and the output data of layer1; compared with before the merging, the output data of layer2 used in the Eltwise operation are obtained directly from registers.
The data processing method of the embodiment of the present invention can be applied to various modules or products that perform data processing based on a neural network; specifically, it can be applied in various terminal devices (including client devices, servers, etc.) and realized by the processor of the terminal device. For example, it can be applied in a mobile terminal device (such as a smartphone) and realized by the processor (such as the CPU) of the mobile terminal device.
The scheme of the embodiment of the present invention is particularly suitable for mobile terminal devices. This is because, compared with terminal devices such as computers and servers, and owing to various limitations of system architecture, manufacturing process, cost, hardware resource configuration (CPU, GPU, memory, etc.) and so on, the processor performance (mainly the computing capability of the processor) of mobile terminal devices such as mobile phones is far below that of devices such as computers; in floating-point computing capability in particular, the gap is usually several thousand to ten thousand times. Moreover, in data processing schemes based on neural networks, such as game applications involving a neural network architecture, for example the AI model in a MOBA game using a convolutional neural network structure, the amount of data computation is usually very large and the data are usually multi-dimensional. Therefore, for mobile terminal devices, how to improve the processing efficiency of data is all the more important. In the data processing method provided by the embodiment of the present invention, since different hidden layers of the network merging layer of the neural network can process data partly in synchrony, the latency of data processing can be effectively reduced and the processing efficiency of data improved; furthermore, since the scheme of the embodiment of the present invention can also effectively reduce memory occupancy, when the scheme is applied in a mobile terminal device, the effect of improving data computation speed and the like can be even more significant, effectively alleviating problems such as low data processing efficiency and application stuttering caused by the various limitations of the mobile terminal device itself. Specifically, for example, when the scheme of the embodiment of the present invention is applied in a mobile-terminal MOBA game based on a neural network architecture, data processing efficiency can be effectively improved and the occurrence of situations such as out-of-memory errors reduced.
The data processing method provided by the embodiment of the present invention can be applied in various neural network structures, devices, products and models that perform data processing. The scheme of the embodiment of the present invention is further described below with reference to specific examples.
Example one
In this example, the data processing method provided by the embodiment of the present invention can be applied to the AI-hosting application scenario in a MOBA game described above, realized by an AI forward-calculation inference framework that includes a convolutional neural network. Specifically, it can be applied in the convolutional neural network forward inference calculations corresponding to the AI global view and the AI micro-operations. The AI global view refers to controlling where on the whole map the AI should go; taking Honor of Kings as an example, this includes defending towers, clearing minion waves, jungling, ganking, providing support, etc. The AI micro-operations refer to controlling the AI's specific actions, such as movement and skill release.
The AI global-view model and the AI micro-operation model contain the three kinds of structures shown in Fig. 4a, Fig. 5a and Fig. 6a. Based on the method of the embodiment of the present invention, when the forward-calculation framework creates the network instance, inter-layer merged calculation of the hidden layers can be performed. Specifically, the two hidden layers shown in Fig. 4a can be merged into the network merging layer in Fig. 4b, the multiple hidden layers shown in Fig. 5a can be merged into the network merging layer shown in Fig. 5b, and the multiple hidden layers shown in Fig. 6a can be merged into the network merging layer shown in Fig. 6b, Fig. 6c or Fig. 6d.
For disconnection-hosting AI in a MOBA game, when a certain player disconnects or performs no relevant operation for longer than a set threshold time, the server can, according to the hardware configuration resources of each player's terminal device (e.g., mobile phone) obtained since the start of the match, select one or more users with better phone performance and run the AI on those users' phones, with those phones performing the forward prediction calculation. In this example, the input data for prediction are the game frame data of the current player, i.e., the image data of the image corresponding to the current game frame; the AI global-view model and the AI micro-operation model perform the relevant forward prediction calculation on the input data and send the calculation results to the server, and the server issues the relevant manipulation instructions according to the calculation results to drive the hosted player to perform the relevant movement and skill release. For a MOBA game running at 15 operation frames per second, one global-view calculation can be performed every 15 frames and one micro-operation calculation every 2 frames.
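The stated cadence can be sketched as a tiny frame scheduler (a 0-based frame index and the function name are assumptions for illustration): at 15 operation frames per second, one global-view calculation runs every 15 frames and one micro-operation calculation every 2 frames.

```python
def ai_calls_for_frame(frame_idx):
    """Hypothetical sketch of the per-frame AI scheduling described
    above: global-view every 15 frames, micro-operation every 2."""
    calls = []
    if frame_idx % 15 == 0:
        calls.append("global_view")   # one global-view calc per second
    if frame_idx % 2 == 0:
        calls.append("micro_op")      # micro-operation every 2 frames
    return calls
```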
Taking a forward-calculation framework with the network structure shown in Fig. 4b as an example, with the data processing method of the embodiment of the present invention, when the forward-calculation framework loads the AI global-view and AI micro-operation model structures and creates the graph structure, the Conv layer and the Relu layer are merged into one layer structure, i.e., only one layer object instance is created. The merged structure, i.e., the Conv+relu merging layer shown in Fig. 4b, takes the input Input1 of the former Conv layer (the Conv layer shown in Fig. 4a before merging) as its input and the output Output1 of the former Relu layer as its output. In subsequent forward inference calculation, the network merging layer first performs the Conv calculation to obtain one corresponding output datum; since the datum is still in a register at that moment, the Relu-related operation can be performed on it immediately, allowing the input data of the Relu calculation to be obtained from registers. Compared with an existing forward-calculation framework based on the network structure shown in Fig. 4a, computation efficiency can be greatly improved, thereby improving the efficiency with which the server issues the relevant manipulation instructions according to the calculation results to drive the hosted player's movement and skill release, better meeting the high-performance demands of data processing.
Likewise, when forward-inference calculation is performed with the data processing method of the embodiment of the present invention based on a forward-calculation framework with the network structure shown in Fig. 5b, the Conv+batchnormal+scale+Relu merged layer first performs the Conv calculation to obtain a corresponding output datum; since that datum is still in a register at that moment, the Batch Normal operation can be performed on it immediately, followed by the Scale and Relu operations, so that within the merged layer the data required by each next calculation can be obtained from a register. Compared with the network structure shown in Fig. 5a, the calculation efficiency can be greatly improved, better meeting the high-efficiency demands of data processing. When forward-inference calculation is performed with the data processing method of the embodiment of the present invention based on a forward-calculation framework with the network structure shown in Fig. 6b, once the Layer2+eltwise merged layer produces an output datum from the calculation in the layer2 layer, that output datum can immediately be combined with the output of the layer1 layer in an eltwise calculation. Compared with the situation before merging, the layer2 data are read directly from a register when the eltwise calculation is performed, which effectively reduces the data loading time overhead and improves the data processing efficiency.
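The longer Conv+BatchNorm+Scale+Relu chain of Fig. 5b fuses in the same per-value fashion: each freshly computed convolution value flows through the whole elementwise tail before the next value is produced, with no intermediate buffer between stages. A minimal sketch, assuming a 1-D convolution and scalar normalization parameters (the function name and parameters are illustrative, not taken from the patent):

```python
import math

def fused_conv_bn_scale_relu(x, kernel, mean, var, gamma, beta, eps=1e-5):
    """Per-output-value pipeline: conv -> batch norm -> scale -> relu,
    with no intermediate buffer between the four stages."""
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        v = sum(x[i + j] * kernel[j] for j in range(k))   # Conv
        v = (v - mean) / math.sqrt(var + eps)             # Batch Normalization
        v = gamma * v + beta                              # Scale
        out.append(max(v, 0.0))                           # Relu
    return out

print(fused_conv_bn_scale_relu([1.0, 2.0, 3.0], [1.0, 1.0],
                               mean=0.0, var=1.0, gamma=2.0, beta=-5.0))
```

In the unmerged graph each of the four stages would write its full output to memory before the next stage starts; here each value completes all four operations while it is still in flight.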
Online statistics obtained through experiments show that, for the MOBA game AI global-view model and AI micro-operation model with the three kinds of merged-layer structures shown in Fig. 4b, Fig. 5b and Fig. 6b (or Fig. 6c), when AI inference prediction calculation is performed with the data processing method of the embodiment of the present invention, the average calculation time of the AI global-view model is reduced from 31 ms to 23 ms, a speed-up ratio of 1.35; the average calculation time of the AI micro-operation model is reduced from 2 ms to 1 ms, a speed-up ratio of 2; and the corresponding AI-hosting win rate is raised from 27% to 38%. It can be seen that the scheme provided by the embodiment of the present invention can improve the efficiency of data processing.
Example two
As an example, Table 1 and Table 2 respectively show comparisons of the time consumption and the memory consumption of two schemes when image recognition is performed with an existing neural-network-based data processing method and with the data processing method of the present invention. The comparison results shown in Table 1 and Table 2 were obtained with all conditions identical for the two schemes except the neural network structure used.
Table 1 shows the time-consumption comparison results of the two schemes when image data calculation is performed with the existing method and with the method of the embodiment of the present invention: on the one hand for image recognition based on a mobilenet neural network model including the network structure shown in Fig. 5a, and on the other hand for image recognition based on a squeezenet neural network model including the network structure shown in Fig. 4a.
Table 2 shows the corresponding memory-consumption comparison results of the two schemes for the same two cases: image recognition based on the mobilenet neural network model including the network structure shown in Fig. 5a, and image recognition based on the squeezenet neural network model including the network structure shown in Fig. 4a. In Table 1 and Table 2, "merged layers" corresponds to the scheme using the neural network structure of the embodiment of the present invention, and "unmerged layers" corresponds to the scheme using the existing neural network structure.
| | mobilenet | squeezenet |
|---|---|---|
| Merged layers | 110.66 | 120.03 |
| Unmerged layers | 126.90 | 129.17 |

Table 1: Time consumption comparison (unit: ms)
| | mobilenet | squeezenet |
|---|---|---|
| Merged layers | 75.23 | 76.68 |
| Unmerged layers | 114.54 | 83.29 |

Table 2: Memory consumption comparison (unit: MB)
As can be seen from Table 1, for the image recognition model based on mobilenet, the calculation-time speed-up ratio of the merged-layer scheme over the unmerged-layer scheme is 1.15 (126.90/110.66); for the image recognition model based on squeezenet, the speed-up ratio of the merged-layer scheme over the unmerged-layer scheme reaches 1.08 (129.17/120.03).
In terms of memory occupation, Table 2 shows that for the image recognition model based on mobilenet, the merged-layer scheme saves (114.54-75.23)/114.54, i.e. 34.3%, of the memory compared with the unmerged-layer scheme; for the image recognition model based on squeezenet, the merged-layer scheme saves 7.9% of the memory.
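The ratios quoted from Tables 1 and 2 can be reproduced directly from the table values:

```python
# Table 1 (time, ms): unmerged-layer time divided by merged-layer time.
mobilenet_speedup = 126.90 / 110.66   # approx. 1.15
squeezenet_speedup = 129.17 / 120.03  # approx. 1.08

# Table 2 (memory, MB): fraction of memory saved by merging.
mobilenet_saving = (114.54 - 75.23) / 114.54  # approx. 0.343, i.e. 34.3%
squeezenet_saving = (83.29 - 76.68) / 83.29   # approx. 0.079, i.e. 7.9%

print(round(mobilenet_speedup, 2), round(squeezenet_speedup, 2))
print(round(mobilenet_saving, 3), round(squeezenet_saving, 3))
```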
Example three
Table 3 of this example shows time-consumption comparison results between the data processing method of the embodiment of the present invention and the existing caffe2 and TensorFlowLite inference frameworks when prediction calculation is performed based on a mobilenet neural network, and the corresponding time-consumption comparison results when prediction calculation is performed based on a squeezenet neural network.
| | mobilenet | squeezenet |
|---|---|---|
| Caffe2 | 327.82 | 187.79 |
| Merged layers | 117.66 | 124.03 |
| TensorFlowLite | 176.11 | 252.21 |

Table 3: Time consumption comparison (unit: ms)
As can be seen from Table 3, when based on the mobilenet neural network, with the scheme of the embodiment of the present invention, the calculation-time speed-up ratio over the existing caffe2 is 2.78 (327.82/117.66) and over the existing TensorFlowLite is 1.50 (176.11/117.66); when based on the squeezenet neural network, with the scheme of the embodiment of the present invention, the calculation-time speed-up ratio over caffe2 is 1.51 (187.79/124.03) and over TensorFlowLite is 2.03 (252.21/124.03).
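The Table 3 speed-up ratios follow the same pattern, dividing each framework's time by the merged-layer time (note that the quoted figures truncate or round the last digit):

```python
# Table 3 (time, ms): merged-layer scheme vs. existing inference frameworks.
merged = {"mobilenet": 117.66, "squeezenet": 124.03}
caffe2 = {"mobilenet": 327.82, "squeezenet": 187.79}
tflite = {"mobilenet": 176.11, "squeezenet": 252.21}

for model in ("mobilenet", "squeezenet"):
    print(f"{model}: vs caffe2 {caffe2[model] / merged[model]:.3f}, "
          f"vs TensorFlowLite {tflite[model] / merged[model]:.3f}")
```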
It can be understood that the scheme of the embodiment of the present invention can be applied to various structures, models and products that use a neural network structure for data processing, and is not limited to the application fields or application scenarios involved above. For example, it is not limited to the control model corresponding to the above-mentioned MOBA game; the scheme can be migrated into any product using a neural network structure, for example into a product with any of the convolutional neural network structures of the above Fig. 1a, Fig. 1b and Fig. 1c, for instance into classical neural-network-based image recognition models such as vgg16, mobilenet and squeezenet.
Based on the same principle as the neural-network-based data processing method provided by the embodiment of the present invention, an embodiment of the present invention further provides a neural-network-based data processing apparatus, wherein the neural network includes at least one network merging layer, and the network merging layer includes n cascaded hidden layers, n >= 2.
The data processing apparatus of the embodiment of the present invention includes a data processing module, the data processing module being configured to control the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
The data processing module of the embodiment of the present invention can be applied in various electronic devices; for example, it can be applied in a mobile terminal device, in a fixed terminal device, or in a server. The functions of the data processing module can specifically be realized under the control of the processor of the electronic device.
It can be understood that the above module of the data processing apparatus in the embodiment of the present disclosure has the function of realizing the corresponding steps of the data processing method shown in any embodiment of the present invention. The function can be realized by hardware, or by hardware executing corresponding software, the hardware or software including one or more modules corresponding to the above function. The above modules can each be realized separately, or multiple modules can be realized in an integrated manner. For the specific functional description of the data processing apparatus, reference may be made to the corresponding description in the data processing method above, which is not repeated here.
In an alternative embodiment of the invention, when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
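What "the n cascaded hidden layers correspond to the same object instance" can look like is sketched below. This is a hypothetical illustration (the class name `MergedLayer` and the lambda stand-ins are not from the patent): the graph builder creates one object for the whole cascade instead of one object per layer.

```python
class MergedLayer:
    """One object instance standing in for n cascaded hidden layers.
    Each stage is a plain callable; forward() pipes a value through all
    of them without materializing per-layer output buffers."""

    def __init__(self, *stages):
        self.stages = stages  # e.g. (conv_fn, bn_fn, scale_fn, relu_fn)

    def forward(self, value):
        for stage in self.stages:
            value = stage(value)
        return value

# A single instance replaces four separate layer objects:
layer = MergedLayer(lambda v: v * 3,        # stand-in for Conv
                    lambda v: v - 1,        # stand-in for Batch Normalization
                    lambda v: v * 0.5,      # stand-in for Scale
                    lambda v: max(v, 0.0))  # Relu
print(layer.forward(2.0))  # (2*3 - 1) * 0.5 = 2.5
```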
In an alternative embodiment of the invention, the data processing module can specifically be configured to:
control the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and serially output the processing results, wherein 2 <= i <= n; and
control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
In an alternative embodiment of the invention, when controlling the i-th hidden layer to process the data output by the (i-1)-th hidden layer, the data processing module can specifically be configured to:
when the output data meet a preset condition, control the i-th hidden layer to process the output data.
In an alternative embodiment of the invention, the preset condition includes that the output data satisfy the minimum calculation condition for the i-th hidden layer to perform an intra-layer operation.
In an alternative embodiment of the invention, the output data are part of the output data of the (i-1)-th hidden layer.
In an alternative embodiment of the invention, the data processing module is further configured to:
control at least part of the output data to be stored in a register and/or a cache memory (Cache).
In an alternative embodiment of the invention, when controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer, the data processing module is specifically configured to:
when the maximum output duration of the output data does not exceed a set duration, control the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current moment and the moment at which the earliest-obtained datum among the output data was obtained.
In an alternative embodiment of the invention, the set duration is determined according to the maximum storage duration for which the register temporarily stores data, and/or the maximum storage duration for which the Cache caches data.
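The duration-gated trigger described above can be sketched as follows. All names (`OutputBuffer`, `should_process`) and the policy value `set_duration` are illustrative assumptions; the point is only the condition: layer i is triggered while the oldest pending output of layer i-1 is still young enough to be expected in the register or cache.

```python
import time
from collections import deque

class OutputBuffer:
    """Holds (timestamp, value) pairs produced by layer i-1. Layer i is
    triggered only while the oldest pending value has been waiting no
    longer than set_duration, an assumed policy value derived from how
    long the register/cache level retains data."""

    def __init__(self, set_duration):
        self.set_duration = set_duration
        self.pending = deque()

    def push(self, value, now=None):
        self.pending.append((time.monotonic() if now is None else now, value))

    def should_process(self, now=None):
        if not self.pending:
            return False
        now = time.monotonic() if now is None else now
        oldest_ts = self.pending[0][0]
        # maximum output duration = now - moment the earliest datum was obtained
        return (now - oldest_ts) <= self.set_duration

buf = OutputBuffer(set_duration=0.001)
buf.push(1.0, now=0.0)
print(buf.should_process(now=0.0005))  # True: within the set duration
print(buf.should_process(now=0.01))    # False: the data are likely evicted
```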
In an alternative embodiment of the invention, the neural network is a first convolutional neural network, and the network merging layer includes a cascaded first convolutional layer and first Relu layer. When controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the first convolutional layer to perform a convolution operation on the input data of the first convolutional layer, serially output each operation result of the first convolutional layer, and control the first Relu layer to perform a Relu operation on each output datum of the first convolutional layer.
In an alternative embodiment of the invention, the neural network is a second convolutional neural network, and the network merging layer includes a successively cascaded second convolutional layer, Batch Normalization layer, Scale layer and second Relu layer. When controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the second convolutional layer to perform a convolution operation on the input data of the second convolutional layer, and serially output each operation result of the second convolutional layer;
control the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer, and serially output each operation result of the Batch Normalization layer;
control the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer, serially output each operation result of the Scale layer, and control the second Relu layer to perform a Relu operation on each output datum of the Scale layer.
In an alternative embodiment of the invention, the neural network is a third convolutional neural network, and the network merging layer includes a cascaded arbitrary hidden layer and Eltwise layer, the output data of the arbitrary hidden layer being input data of the Eltwise layer. When controlling the network merging layer to perform data processing, the data processing module is specifically configured to:
control the arbitrary hidden layer to perform the corresponding operation on the input data of the arbitrary hidden layer, and serially output each operation result of the arbitrary hidden layer;
control the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, the output data of the other hidden layer also being input data of the Eltwise layer.
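The Eltwise-merging embodiment can be sketched as follows. This is an illustrative assumption (function names are hypothetical, and elementwise addition stands in for whichever Eltwise operation is configured): each value produced by the merged hidden layer is combined with the matching value from the other branch immediately, instead of first writing the whole layer output to memory.

```python
def fused_layer2_eltwise(layer1_out, layer2_in, layer2_fn, op=lambda a, b: a + b):
    """As each layer2 output value is computed it is immediately combined
    (eltwise, here: addition) with the matching layer1 output value."""
    out = []
    for a, x in zip(layer1_out, layer2_in):
        b = layer2_fn(x)        # one layer2 output value, still "hot"
        out.append(op(a, b))    # the eltwise operation uses it right away
    return out

print(fused_layer2_eltwise([1.0, 2.0, 3.0], [10.0, 20.0, 30.0],
                           layer2_fn=lambda v: v * 0.1))
# elementwise: 1+1, 2+2, 3+3 -> [2.0, 4.0, 6.0]
```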
Since the data processing apparatus provided by the embodiment of the present invention is an apparatus capable of executing the data processing method in the embodiment of the present invention, on the basis of the data processing method provided in the embodiment of the present invention, those skilled in the art can understand the specific implementation of the data processing apparatus of the embodiment of the present invention and its various variations; therefore, how the apparatus realizes the data processing method in the embodiment of the present invention is not discussed in detail here. Any apparatus adopted by those skilled in the art to implement the data processing in the embodiment of the present invention falls within the scope to be protected by the present application.
Based on the same principle as the data processing method and data processing apparatus provided by the embodiments of the present invention, an embodiment of the present invention further provides an electronic device, which may include a processor and a memory. Readable instructions are stored in the memory, and when the readable instructions are loaded and executed by the processor, the data processing method shown in any embodiment of the present invention can be realized.
An embodiment of the present invention also provides a computer-readable storage medium in which readable instructions are stored; when the readable instructions are loaded and executed by a processor, the data processing method shown in any embodiment of the present invention is realized.
Fig. 7 shows a schematic structural diagram of an electronic device to which the embodiment of the present invention is applicable. As shown in Fig. 7, the electronic device 2000 includes a processor 2001 and a memory 2003, the processor 2001 being connected with the memory 2003, for example through a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that in practical applications the transceiver 2004 is not limited to one, and the structure of the electronic device 2000 does not constitute a limitation on the embodiment of the present invention.
The processor 2001 is applied in the embodiment of the present invention to realize the functions of the data processing module in the embodiment of the present invention. The transceiver 2004 includes a receiver and a transmitter; the transceiver 2004 is applied in the embodiment of the present invention to realize the communication between the electronic device 2000 and other devices, i.e., the sending and receiving of data.
The processor 2001 can be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the disclosure of the present invention. The processor 2001 can also be a combination realizing computing functions, for example a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, etc.
The bus 2002 may include a path for transmitting information between the above components. The bus 2002 can be a PCI bus, an EISA bus, etc., and can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in Fig. 7, but this does not mean that there is only one bus or only one type of bus.
The memory 2003 can be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions; it can also be an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
Optionally, the memory 2003 is used to store the application program code for executing the scheme of the present invention, and execution is controlled by the processor 2001. The processor 2001 is configured to execute the application program code stored in the memory 2003 to realize the actions of the apparatus provided by the embodiment of the present invention.
It should be understood that although the steps in the flowcharts of the drawings are shown successively as indicated by the arrows, these steps are not necessarily executed successively in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts of the drawings may include multiple sub-steps or multiple stages, which are not necessarily completed at the same moment but can be executed at different moments, and whose execution order is not necessarily successive; they can be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that, for those of ordinary skill in the art, various improvements and modifications may be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (13)
1. A neural-network-based data processing method, characterized in that the neural network includes at least one network merging layer, the network merging layer including n cascaded hidden layers, n >= 2; the method comprising:
controlling the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
2. The method according to claim 1, characterized in that when the n cascaded hidden layers are instantiated, the n cascaded hidden layers correspond to the same object instance.
3. The method according to claim 1 or 2, characterized in that the controlling the network merging layer to perform data processing comprises:
controlling the (i-1)-th hidden layer of the network merging layer to process the input data of the (i-1)-th hidden layer and serially output the processing results, wherein 2 <= i <= n; and
controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer.
4. The method according to claim 3, characterized in that the controlling the i-th hidden layer to process the data output by the (i-1)-th hidden layer comprises:
when the output data meet a preset condition, controlling the i-th hidden layer to process the output data.
5. The method according to claim 4, characterized in that the preset condition includes that the output data satisfy the minimum calculation condition for the i-th hidden layer to perform an intra-layer operation.
6. The method according to any one of claims 3 to 5, characterized in that the output data are part of the output data of the (i-1)-th hidden layer.
7. The method according to any one of claims 3 to 6, characterized by further comprising:
controlling at least part of the output data to be stored in a register and/or a cache memory (Cache).
8. The method according to claim 7, characterized in that the controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer comprises:
when the maximum output duration of the output data does not exceed a set duration, controlling the i-th hidden layer of the network merging layer to process the output data of the (i-1)-th hidden layer;
wherein the maximum output duration refers to the duration between the current moment and the moment at which the earliest-obtained datum among the output data was obtained.
9. The method according to claim 8, characterized in that the set duration is determined according to the maximum storage duration for which the register temporarily stores data, and/or the maximum storage duration for which the Cache caches data.
10. The method according to any one of claims 1 to 9, characterized in that the neural network is a first convolutional neural network, the network merging layer includes a cascaded first convolutional layer and first activation function (Relu) layer, and the controlling the network merging layer to perform data processing comprises:
controlling the first convolutional layer to perform a convolution operation on the input data of the first convolutional layer, serially outputting each operation result of the first convolutional layer, and controlling the first Relu layer to perform a Relu operation on each output datum of the first convolutional layer;
Alternatively,
the neural network is a second convolutional neural network, the network merging layer includes a successively cascaded second convolutional layer, batch normalization (Batch Normalization) layer, scaling-translation (Scale) layer and second Relu layer, and the controlling the network merging layer to perform data processing comprises:
controlling the second convolutional layer to perform a convolution operation on the input data of the second convolutional layer, and serially outputting each operation result of the second convolutional layer;
controlling the Batch Normalization layer to perform a Batch Normalization operation on each output datum of the second convolutional layer, and serially outputting each operation result of the Batch Normalization layer;
controlling the Scale layer to perform a Scale operation on each output datum of the Batch Normalization layer, serially outputting each operation result of the Scale layer, and controlling the second Relu layer to perform a Relu operation on each output datum of the Scale layer;
Alternatively,
the neural network is a third convolutional neural network, the network merging layer includes a cascaded arbitrary hidden layer and element-wise operation (Eltwise) layer, the output data of the arbitrary hidden layer being input data of the Eltwise layer, and the controlling the network merging layer to perform data processing comprises:
controlling the arbitrary hidden layer to perform the corresponding operation on the input data of the arbitrary hidden layer, and serially outputting each operation result of the arbitrary hidden layer;
controlling the Eltwise layer to perform an Eltwise operation on each output datum of the arbitrary hidden layer together with the output data of another hidden layer, the output data of the other hidden layer also being input data of the Eltwise layer.
11. A neural-network-based data processing apparatus, characterized in that the neural network includes at least one network merging layer, the network merging layer including n cascaded hidden layers, n >= 2; the apparatus comprising:
a data processing module for controlling the network merging layer to perform data processing, wherein during the data processing there is interlayer parallel processing of data between the hidden layers of the network merging layer.
12. An electronic device, characterized in that the electronic device includes a processor and a memory;
readable instructions are stored in the memory, and when the readable instructions are loaded and executed by the processor, the data processing method according to any one of claims 1 to 10 is realized.
13. A computer-readable storage medium, characterized in that readable instructions are stored in the storage medium, and when the readable instructions are loaded and executed by a processor, the data processing method according to any one of claims 1 to 10 is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811340948.6A CN110163337B (en) | 2018-11-12 | 2018-11-12 | Data processing method, device and equipment based on neural network and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110163337A true CN110163337A (en) | 2019-08-23 |
CN110163337B CN110163337B (en) | 2023-01-20 |
Family
ID=67645220
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811340948.6A Active CN110163337B (en) | 2018-11-12 | 2018-11-12 | Data processing method, device and equipment based on neural network and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110163337B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022206536A1 (en) * | 2021-03-29 | 2022-10-06 | 维沃移动通信有限公司 | Data processing method and apparatus, and chip |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107844833A (en) * | 2017-11-28 | 2018-03-27 | 郑州云海信息技术有限公司 | A kind of data processing method of convolutional neural networks, device and medium |
CN108074211A (en) * | 2017-12-26 | 2018-05-25 | 浙江大华技术股份有限公司 | A kind of image processing apparatus and method |
WO2018120016A1 (en) * | 2016-12-30 | 2018-07-05 | 上海寒武纪信息科技有限公司 | Apparatus for executing lstm neural network operation, and operational method |
CN108446758A (en) * | 2018-02-11 | 2018-08-24 | 江苏金羿智芯科技有限公司 | A kind of serial flow processing method of Neural Network Data calculated towards artificial intelligence |
CN108491924A (en) * | 2018-02-11 | 2018-09-04 | 江苏金羿智芯科技有限公司 | A kind of serial stream treatment device of Neural Network Data calculated towards artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN110163337B (en) | 2023-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111858009B (en) | Task scheduling method of mobile edge computing system based on migration and reinforcement learning | |
CN107578095B (en) | Neural computing device and processor comprising the computing device | |
CN106447034B (en) | A kind of neural network processor based on data compression, design method, chip | |
CN107918794A (en) | Neural network processor based on computing array | |
CN110188795A (en) | Image classification method, data processing method and device | |
CN107066239A (en) | A kind of hardware configuration for realizing convolutional neural networks forward calculation | |
CN108090560A (en) | The design method of LSTM recurrent neural network hardware accelerators based on FPGA | |
CN106709565A (en) | Optimization method and device for neural network | |
CN113067873A (en) | Edge cloud collaborative optimization method based on deep reinforcement learning | |
CN110222717A (en) | Image processing method and device | |
CN107622305A (en) | Processor and processing method for neutral net | |
CN109446996A (en) | Facial recognition data processing unit and processing method based on FPGA | |
CN107292458A (en) | A kind of Forecasting Methodology and prediction meanss applied to neural network chip | |
CN110059747A (en) | A kind of net flow assorted method | |
CN110050282A (en) | Convolutional neural networks compression | |
CN113760511B (en) | Vehicle edge calculation task unloading method based on depth certainty strategy | |
CN108545556A (en) | Information processing unit based on neural network and method | |
CN110600020B (en) | Gradient transmission method and device | |
CN113241064A (en) | Voice recognition method, voice recognition device, model training method, model training device, electronic equipment and storage medium | |
CN108985449A (en) | A kind of control method and device of pair of convolutional neural networks processor | |
CN110163337A (en) | Data processing method, device, equipment and storage medium neural network based | |
CN116957698A (en) | Electricity price prediction method based on improved time sequence mode attention mechanism | |
Zhang et al. | Communication-computation efficient device-edge co-inference via AutoML | |
CN117193992B (en) | Model training method, task scheduling device and computer storage medium | |
Jin et al. | An intelligent scheduling algorithm for resource management of cloud platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||