WO2020041934A1 - 一种数据处理设备以及一种数据处理方法 - Google Patents

一种数据处理设备以及一种数据处理方法 Download PDF

Info

Publication number
WO2020041934A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
data
neural network
calculation
determination module
Prior art date
Application number
PCT/CN2018/102515
Other languages
English (en)
French (fr)
Inventor
许若圣
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN201880090383.2A priority Critical patent/CN111788567B/zh
Priority to PCT/CN2018/102515 priority patent/WO2020041934A1/zh
Publication of WO2020041934A1 publication Critical patent/WO2020041934A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations

Definitions

  • the present application relates to the technical field of data processing, and in particular, to a data processing device and a data processing method.
  • Deep neural networks can be used in the fields of image classification, image recognition, and audio recognition.
  • the data processed by a general-purpose processor in a deep neural network is in 32-bit floating-point format (FP32), which places high demands on the processor's computing power and power consumption.
  • FP32 floating-point format
  • to avoid excessive power consumption, a quantized neural network calculation method referred to here as XNOR-Net has been proposed: the DNN quantizes the data format to a lower-bit format such as 1 bit, and matrix operations are then performed to complete the DNN's frame-data processing flow.
  • the quantization of the neural network is divided into two parts. The first type is the quantization of the network weight coefficients. The second type is the quantization of the input and output feature maps of each layer in the neural network.
  • I is the feature map matrix of FP32
  • the weight matrix of FP32 is W
  • sign(I) is the feature map matrix of each layer after 1-bit quantization
  • sign(W) is the 1-bit quantized weight matrix
  • α is a scalar value
  • K is a quantization parameter, which can be a two-dimensional matrix
  • ⊙ represents matrix point multiplication
  • K is calculated by a convolutional layer in the DNN using the input feature map before quantization.
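  • Putting the definitions above together, the matrix operation takes the form I * W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα (reconstructed from the description below). The following is a minimal numpy sketch of this XNOR-Net-style approximation for a single layer; the shapes, the per-row choice of K, and the helper name are illustrative assumptions, not the implementation described in this application.

```python
import numpy as np

def xnor_net_layer(I, W, alpha, K):
    """Approximate I @ W using 1-bit quantized operands (XNOR-Net style).

    I:     FP32 feature map matrix of one layer
    W:     FP32 weight matrix
    alpha: scalar scaling factor for the weights
    K:     quantization parameter (2-D matrix), computed from the
           un-quantized input feature map
    """
    sign_I = np.sign(I)          # 1-bit quantized feature map (+1 / -1)
    sign_W = np.sign(W)          # 1-bit quantized weights (+1 / -1)
    # Binary matrix product, then element-wise rescaling by K and alpha.
    return (sign_I @ sign_W) * K * alpha

# Illustrative shapes: a 4x8 feature map and an 8x3 weight matrix.
rng = np.random.default_rng(0)
I = rng.standard_normal((4, 8)).astype(np.float32)
W = rng.standard_normal((8, 3)).astype(np.float32)
alpha = np.abs(W).mean()                      # common XNOR-Net choice
K = np.abs(I).mean(axis=1, keepdims=True)     # simple per-row K for the sketch
print(xnor_net_layer(I, W, alpha, K).shape)   # (4, 3)
```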
  • the above DNN processing of frame data is computed layer by layer through the neural network: only after the quantization parameter calculation is completed for one layer can it be brought into the matrix operation, after which the parameter calculation of the next layer proceeds, completing the DNN's frame-data processing flow.
  • because the matrix operation must wait for the quantization parameter calculation to finish, and that calculation is in turn limited by the computation of the preceding layer, the latency of the neural network's frame-data processing is increased.
  • the application provides a data processing device and a data processing method, which improve data processing efficiency and reduce data processing delay.
  • the first aspect of the present application provides a data processing method, which is characterized in that the method includes: performing parameter calculation on the first data through a parameter determination module to obtain a first parameter set used for calculation of the first neural network.
  • the first parameter set includes at least one first parameter, and each first parameter corresponds to a feature map of each layer in the first neural network.
  • the first neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multilayer perceptron.
  • a first neural network calculation is performed on the first data by using a first parameter set through a neural network calculation module to obtain a calculation result. After the first data is input into the neural network calculation module, a feature map of each layer in the first neural network can be obtained.
  • the first parameter in the first parameter set and the corresponding feature map matrix are substituted into a preset calculation formula to obtain the calculation result; this is repeated until the calculation for each first parameter in the first parameter set is completed, thereby completing the first neural network calculation process for the first data.
  • the parameter determination module and the neural network calculation module are two modules with independent data processing capabilities, so the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.
  • the embodiments of the present application have the following advantage: because a data processing device generally needs to process a large amount of data, this application has a parameter determination module and a neural network calculation module cooperate to complete the neural network calculation on the data, and because the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module, the two modules can process data in parallel, reducing data processing latency.
  • the performing parameter calculation on the first data includes: using a second neural network to perform the parameter calculation on the first data to obtain the first parameter set, where the second neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multilayer perceptron.
  • performing the parameter calculation on the first data to obtain the first parameter set includes: performing parameter calculation on the first data to obtain a second parameter set.
  • One possible situation is to use a second neural network to perform parameter calculation on the first data to obtain a second parameter set.
  • the first parameter set is obtained by performing weighted averaging, smoothing, or alpha filtering on the second parameter set and a third parameter set, where the third parameter set is a historical parameter set calculated by the parameter determination module.
  • the method of bringing in the historical parameter set to obtain the first parameter set is described, which increases the practicability and implementation flexibility of the solution.
  • performing parameter calculation on the first data includes: performing a matrix operation on the first data with a preset matrix, where the preset matrix can be set in advance by the parameter determination module.
  • the method further includes parallel processing of the parameter determination module and the neural network calculation module in the time domain; one possible case is: when the neural network calculation module is in the state of performing the first neural network calculation, parameter calculation is performed on second data by the parameter determination module.
  • the second data is earlier than the first data in the time domain.
  • the second data may be data at any time before the first data, or data at the time immediately before the first data, which is not limited here.
  • the parallel processing method in the time domain of the parameter determination module and the neural network calculation module is described, so that data can be processed in parallel and the data processing delay is reduced.
  • the first parameter set includes: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, where a person skilled in the art can readily obtain the quantization parameter from the associated parameter.
  • the first neural network calculation is a quantized neural network calculation, for example performing the quantized neural network calculation on the first data using the XNOR-Net method.
  • the specific reference of the first parameter set and the specific method of the first neural network calculation are described, which is beneficial to the implementation of the solution.
  • a second aspect of the present application provides a data processing device, including: a parameter determination module and a neural network calculation module coupled to the parameter determination module.
  • the parameter determination module is configured to perform parameter calculation on the first data to obtain a first parameter set used for the first neural network calculation, where the first parameter set includes at least one first parameter and each first parameter corresponds one-to-one to the feature map of a layer in the first neural network.
  • the first neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and / or a multilayer perceptron.
  • the neural network calculation module is configured to use the first parameter set to perform the first neural network calculation on the first data to obtain a calculation result; after the first data is input to the neural network calculation module, the feature map of each layer in the first neural network is obtained, the first parameter in the first parameter set and the corresponding feature map matrix are substituted into a preset calculation formula to obtain the calculation result, and this is repeated until the calculation for each first parameter in the first parameter set is completed.
  • the parameter determination module and the neural network calculation module are two modules each with independent data processing capability; therefore, the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.
  • the parameter determination module is specifically configured to perform the parameter calculation on the first data using a second neural network to obtain the first parameter set, where the second neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multilayer perceptron.
  • the parameter determination module is specifically configured to: perform parameter calculation on the first data to obtain a second parameter set; one possible case is to use the second neural network to perform parameter calculation on the first data to obtain the second parameter set.
  • the first parameter set is obtained by performing weighted average or smooth calculation or alpha filtering on the second parameter set and the third parameter set, wherein the third parameter set is a historical parameter set calculated by the parameter determination module.
  • performing parameter calculation on the first data includes: performing a matrix operation on the first data with a preset matrix, where the preset matrix can be set in advance by the parameter determination module.
  • the parameter determination module and the neural network calculation module perform parallel processing in the time domain.
  • when the neural network calculation module is in the state of performing the first neural network calculation, the parameter determination module can be used to perform parameter calculation on second data, where the second data is earlier than the first data in the time domain; the second data may be data at any time before the first data, or data at the time immediately before the first data, which is not limited here.
  • the first parameter set includes: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, where a person skilled in the art can readily obtain the quantization parameter from the associated parameter.
  • the first neural network calculation is a quantized neural network calculation, for example performing the quantized neural network calculation on the first data using the XNOR-Net method.
  • the parameter determination module may be a first circuit and the neural network calculation module may be a second circuit, and the first circuit and the second circuit may be located on one or more chips.
  • a third aspect of the present application provides a data processing device.
  • the data processing device includes a first processor and a second processor.
  • the first processor corresponds to the parameter determination module described in the foregoing aspects and can perform, by running a software program, the operations performed by the parameter determination module of the foregoing aspects.
  • the second processor corresponds to the neural network calculation module described in the foregoing aspects and can perform, by running a software program, the operations performed by the neural network calculation module of the foregoing aspects.
  • the first processor and the second processor may be located on one or more chips.
  • a fourth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the methods described in the above aspects.
  • a fifth aspect of the present application provides a computer program product containing instructions that, when run on a computer, causes the computer to perform the methods described in the above aspects.
  • FIG. 1 (a) is a possible structure of a data processing device of this application
  • FIG. 1 (b) is a processing flowchart of a data processing device during low-bit quantization
  • FIG. 2 is a schematic diagram of an embodiment of generating a quantization parameter by a parameter determination module of the present application
  • FIG. 3 is a schematic diagram of another embodiment of generating a quantization parameter by a parameter determination module of the present application.
  • FIG. 4 is a schematic diagram of an embodiment in which the parameter determination module and the neural network calculation module of this application process data in parallel;
  • FIG. 5 is another possible structure of the data processing device of the present application.
  • FIG. 6 is a schematic diagram of an embodiment of a data processing method of the present application.
  • the first neural network or the second neural network may specifically be a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), and/or a multi-layer perceptron (MLP).
  • CNN convolutional neural network
  • DNN deep neural network
  • RNN recurrent neural network
  • MLP multi-layer perceptron
  • This application provides a data processing device that can be applied to the processing of video data, voice data, and picture data by a neural network.
  • the data processing device uses the XNOR-Net method to quantize the data format of the data to be processed to a lower bit format.
  • the data processing device includes a parameter determination module and a neural network calculation module coupled to the parameter determination module.
  • the parameter determination module and the neural network calculation module are two mutually independent modules in the data processing device: when processing data, they can process the data in parallel, and at the same time the output of the parameter determination module also serves as the input of the neural network calculation module, so the two can divide the work and cooperate to complete the device's processing of the data.
  • that is to say, the calculation of the parameter determination module no longer depends on the calculation of the neural network calculation module; specifically, the parameter determination module does not need to wait for the calculation result of the neural network calculation module before performing further processing. Compared with the prior art, calculation efficiency is improved and latency is shorter.
  • the data processing device processes data in two steps: Step 1 is the calculation of the first parameter, for example the calculation of the quantization parameter K or a variant thereof.
  • Step 2 is the first neural network calculation, that is, the matrix calculation of each layer.
  • specifically, this may be a quantized neural network calculation, for example as in the XNOR-Net operation introduced in the prior art.
  • the following takes the data processing device's processing of an image as an example. Step 1: an external device inputs the image data to be processed into the data processing device; first, the parameter determination module in the data processing device receives the image data to be processed and performs parameter calculation on the image data through a preset second neural network or a preset matrix to obtain a first parameter set, where both the preset second neural network and the preset matrix are obtained in advance through data training.
  • the parameter determination module processes the image data through the second neural network as follows: the parameter determination module first determines the feature map of each layer in the second neural network, each layer's feature map corresponds to one first parameter, and the parameter determination module can obtain the first parameter from the matrix expression of each layer's feature map, finally obtaining the first parameter set.
  • the parameter determination module processes the image data through the preset matrix as follows: the parameter determination module performs a matrix operation, which may specifically be matrix multiplication, on the image data and the preset matrix to obtain the first parameter set (a sketch of this follows below).
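  • A minimal sketch of the preset-matrix manner just described, assuming the image is flattened ("expanded") into a matrix Dt and multiplied by a hypothetical trained matrix W1; both names follow the examples given later in this description, and the shapes are illustrative only.

```python
import numpy as np

def params_from_preset_matrix(image, W1):
    """Sketch: derive the first parameter set from an image via a preset matrix.

    image: image data, flattened ("expanded") into a row matrix Dt
    W1:    preset weighting matrix obtained from offline training
    """
    Dt = image.reshape(1, -1)   # expand the first data into a matrix Dt
    return Dt @ W1              # K = Dt x W1 (matrix multiplication)

# Hypothetical sizes: a 16x16 single-channel image, one scalar parameter
# per layer of a 5-layer network.
rng = np.random.default_rng(1)
image = rng.standard_normal((16, 16))
W1 = rng.standard_normal((256, 5))    # stand-in for a matrix trained offline
first_parameter_set = params_from_preset_matrix(image, W1)
print(first_parameter_set.shape)      # (1, 5): one parameter per layer
```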
  • Step 2: subsequently, the parameter determination module inputs the determined first parameters to the neural network calculation module.
  • the neural network calculation module performs low-bit quantization of the image data based on the XNOR-Net method and then, according to the received first parameters, processes the image data through the first neural network; the specific processing is to perform the matrix operation I * W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα and finally obtain the output of the first neural network.
  • both the parameter determination module and the neural network calculation module have a function of processing image data.
  • the process of parameter calculation performed by the parameter determination module and the process of neural network calculation performed by the neural network calculation module are independent of each other. In other words, the process of parameter calculation performed by the parameter determination module does not depend on the process of neural network calculation performed by the neural network calculation module.
  • both the parameter determination module and the neural network calculation module can process the image data in parallel.
  • their parallelism in the time domain is embodied as follows: after the external device inputs second image data into the data processing device, the parameter determination module begins to perform parameter calculation on the second image data, while the neural network calculation module at this time is still in the state of performing the neural network calculation on the first image data preceding the second image. That is, the further processing of the parameter determination module need not wait for the processing result of the neural network calculation module.
  • the data processing device 10 in the present application includes a parameter determination module 101 and a neural network calculation module 102 coupled to the parameter determination module 101; the parameter determination module 101 is configured to perform parameter calculation on the first data to obtain a first parameter set used for the first neural network calculation. In this embodiment, the parameter determination module 101 may be obtained through neural network training, or through other machine learning algorithms such as support vector machine (SVM) or decision tree training; it is used to implement parameter calculation and can be implemented in software, hardware, or a combination thereof.
  • the first data may be video data, audio data, or picture data, and may also be other types of data, which is not limited herein.
  • the parameter determination module 101 being configured to perform parameter calculation on the first data may specifically be: the parameter determination module 101 performs matrix calculation on the first data to obtain the parameters, or the parameter determination module 101 further performs smoothing calculation over multiple pieces of first data; other forms of calculation are also possible, which is not limited here.
  • the obtained first parameter may be a scalar parameter, or may be a 2D matrix, and may also be other cases, which are not specifically limited here.
  • the first parameter set is used by the neural network calculation module 102 to perform the first neural network calculation.
  • the processing of the first data in a neural network usually passes through four kinds of layers, namely convolutional layers, pooling layers, non-linear layers, and fully connected layers; these four kinds of layers alternate in varying numbers, and each layer corresponds to a feature map.
  • the XNOR-Net method is used to quantize the data format of the first data into a low-bit format (for example, 1 bit), and this quantization of the first data's format includes quantization of the feature maps.
  • when the input first data differs, the value of the feature map before quantization also changes.
  • the parameter determination module 101 is a module dedicated to calculating the quantization parameter K or a variant thereof, which can be implemented in software, hardware, or a combination thereof.
  • the neural network calculation module 102 is configured to perform the first neural network calculation on the first data using the first parameter set to obtain a calculation result.
  • the neural network calculation module 102 may specifically be a DNN processor.
  • the DNN processor has the same functions as a common AI processor and can perform neural network calculations; for example, it can be implemented in software, hardware, or a combination thereof.
  • the parameter determination module 101 inputs the first parameter set to the neural network calculation module 102; at the same time, the neural network calculation module 102 obtains the first data and performs the first neural network calculation on the first data in combination with the first parameter set to obtain the neural network output result for each layer's feature map.
  • the formula for the first neural network calculation is I * W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα, where I is the feature map matrix before quantization, W is the weight matrix before quantization, sign(I) is the quantized feature map matrix, sign(W) is the quantized weight matrix, ⊛ is matrix multiplication, ⊙ is matrix point multiplication, α is a preset scalar value, and K is the quantization parameter, obtained from the first parameter; K may be a two-dimensional matrix.
  • the first parameter may be the quantization parameter K, or an associated parameter from which a person skilled in the art can obtain the quantization parameter K without creative effort; for example, the first parameter may also be an adjustment amount of the quantization parameter, which is not limited here and is introduced in detail later.
  • the parameter calculation of the parameter determination module 101 is independent of the first neural network calculation of the neural network calculation module 102, and the first neural network calculation includes the calculation of multiple layers: the input data is processed by each layer in a pipelined manner, with the next layer receiving the output of the previous layer as its input feature map.
  • the parameter determination module 101 and the neural network calculation module 102 can process the input data respectively, that is, the data processing processes of the parameter determination module 101 and the neural network calculation module 102 are independent of each other and do not affect each other.
  • because the parameter determination module 101 and the neural network calculation module 102 are two independent modules in the data processing device 10, their data calculation processes are independent.
  • this application uses the parameter determination module 101 to process the data to obtain the first parameter set, after which the neural network calculation module 102 uses the first parameter set to complete the neural network calculation process for the first data; thus, compared with a single device, the parameter determination module 101 and the neural network calculation module 102 can perform further processing on other data in parallel while the first data is being processed, thereby reducing data processing latency.
  • the neural network training is performed in advance, and the data processing device 10 uses the trained neural network to process the first data to obtain the neural network calculation result; this method involves a relatively small amount of computation and is simple to implement.
  • the parameter determination module 101 may perform parameter calculation on the first data to obtain the first parameter set used for the first neural network calculation in the following implementable manners, described below: 1. the second neural network performs parameter calculation on the first data to obtain the first parameter set.
  • the second neural network is obtained by training on a large amount of data in advance; depending on the meaning of the first parameter, the training may be training that obtains a quantization parameter set from the input data, training that obtains an associated parameter set from the input data, or training that obtains a quantization parameter adjustment amount set from the input data, which is not specifically limited here. It should be noted that when the specific data represented by the first parameter differs, the internal parameter calculation process of the second neural network differs, as described below:
  • the first parameter indicates a quantization parameter K; the parameter determination module 101 performs parameter calculation on the first data by using a second neural network to obtain a quantization parameter set.
  • as shown in FIG. 2, the convolutional layers and fully connected layers that need to be quantized total N layers, of which the first n layers are used for feature extraction and the subsequent layers are used to generate the N quantization parameters K, respectively.
  • the first n layers may be a convolutional layer and a non-linear layer, or a flexible combination of other layers of other convolutional neural networks, which are not specifically limited here.
  • Conv in FIG. 2 refers to convolution.
  • A is the value obtained by averaging, at each pixel, the absolute values of the elements across all channels of the feature map; calculating this average involves the number of channels and the elements.
  • the input of each neural network layer is a three-dimensional matrix M(X, Y, Z), where X, Y, and Z are the sizes of the respective dimensions; taking the third dimension Z as the number of channels, a two-dimensional matrix Mz(X, Y, zi) corresponds to a given channel zi.
  • an element of this two-dimensional matrix, Mz(xi, yk, zi), is analogized here to a pixel of the image.
  • k is obtained according to the size of the convolution kernel of the convolutional layer.
  • the quantization parameter K calculated here is a 2D matrix.
  • the quantization parameter set output by the second neural network can be written as K = f(W_nn_para_est, D_t), where f(·) is the response function of the second neural network, W_nn_para_est denotes its trained weights, and D_t is the input data of the second neural network.
  • the first parameter represents a parameter associated with the quantization parameter K, that is, a variant of K; the parameter determination module 101 uses the second neural network to perform parameter calculation on the first data to obtain an associated parameter set of the quantization parameter. After using the second neural network to obtain the associated parameters, a person skilled in the art can obtain the quantization parameter K through reasonable derivation without creative effort.
  • the first parameter is the aforementioned parameter A.
  • the method of obtaining the first parameter using the second neural network is similar to the method of obtaining the quantization parameter using the convolutional neural network in FIG. 2 above.
  • the second neural network may also perform the XNOR-Net calculation on the first data to obtain, for example, the above-mentioned parameter A.
  • A is an average of the absolute values of the elements of all channels at each pixel.
  • the calculation of A is: A = (Σ|I_i|)/c, where I is the matrix of each layer's feature map before quantization and c is the number of channels of the feature map.
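  • The following is a minimal sketch of computing A and, following the cited XNOR-Net paper, deriving a 2-D quantization parameter K by convolving A with an averaging kernel of the convolution kernel's size; the use of scipy's convolve2d and the box kernel are assumptions based on that paper, not necessarily this application's implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def compute_A(I):
    """A: per-pixel average of absolute values across all channels.

    I: un-quantized feature map of shape (X, Y, c), c = number of channels.
    """
    return np.abs(I).mean(axis=2)          # A = sum(|I_i|) / c

def compute_K(A, kernel_size):
    """K: A convolved with an averaging kernel of the conv layer's size."""
    k = np.ones((kernel_size, kernel_size)) / kernel_size**2
    return convolve2d(A, k, mode='same')   # 2-D quantization parameter K

rng = np.random.default_rng(2)
I = rng.standard_normal((8, 8, 16))        # 8x8 feature map, 16 channels
A = compute_A(I)
K = compute_K(A, kernel_size=3)
print(A.shape, K.shape)                    # (8, 8) (8, 8)
```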
  • the first parameter represents the adjustment amount of the quantization parameter K, which is another variant of K.
  • the parameter determining module 101 performs parameter calculation on the first data by using the second neural network to obtain a set of quantization parameter adjustment amounts.
  • a quantization parameter K1 can be estimated offline in advance, determined according to the distribution of the quantization parameters obtained from multiple data processing experiments.
  • for example, this ideal quantization parameter can be the average of the quantization parameters obtained from the multiple experiments.
  • the parameter determination module 101 performs parameter calculation on the first data by using the second neural network to obtain a quantization parameter adjustment amount ⁇ K, which may be an offset between the calculated quantization parameter and the quantization parameter K1.
  • the manner in which the second neural network performs parameter calculation on the first data to obtain the quantization parameter adjustment amount ΔK is similar to the method of obtaining the quantization parameter calculated by the convolutional neural network in FIG. 2 above; that is, the second neural network may be used to perform feature extraction and then generate a parameter adjustment amount set, and details are not described here again.
  • the calculation amount for the parameter determination module 101 to determine the quantization parameter adjustment amount ΔK is smaller than that for determining the quantization parameter K itself; therefore, the parameter determination module 101 only needs to obtain the quantization parameter adjustment amount ΔK online to obtain the quantization parameter K, which reduces the calculation amount of the parameter determination module 101.
  • 2. a matrix operation may also be performed on the first data with a preset matrix to obtain the first parameter set.
  • Matrix operations can be matrix addition, subtraction, and / or multiplication and division operations, which are not specifically limited here.
  • the preset matrix is a matrix obtained by training a large amount of data in advance.
  • the training may be training that obtains a quantization parameter set from the input data, training that obtains an associated parameter set from the input data, or training that obtains a quantization parameter adjustment amount from the input data, which is not specifically limited here.
  • the first parameter may be a quantization parameter or an adjustment amount of the quantization parameter or a parameter associated with the quantization parameter.
  • for example, when the first parameter is the quantization parameter K, the first data can be expanded into a matrix Dt, a weighting matrix W1 is trained in advance, and the quantization parameter is obtained by matrix multiplication, K = Dt·W1.
  • when the first parameter is the quantization parameter adjustment amount ΔK, the first data can likewise be expanded into a matrix Dt, a weighting matrix W2 is trained in advance, ΔK = Dt·W2, and the quantization parameter is then K = K2 opt ΔK, where opt can be any operation, such as addition, subtraction, or multiplication.
  • the quantization parameter K2 may be an ideal quantization parameter determined according to the distribution of the quantization parameters obtained from multiple data processing experiments, for example the average value of the quantization parameters obtained from those experiments.
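  • A minimal sketch of this adjustment-amount manner, assuming Dt, W2, K2, and opt as defined above; the shapes and the offline estimate are illustrative only.

```python
import numpy as np

def adjust_quantization(Dt, W2, K2, opt=np.add):
    """Sketch: obtain K from an offline estimate K2 plus an online offset.

    Dt:  first data expanded into a matrix
    W2:  preset weighting matrix trained in advance
    K2:  quantization parameter estimated offline (e.g. the mean of
         quantization parameters observed over many experiments)
    opt: the combining operation; the text allows addition, subtraction,
         multiplication, etc.
    """
    delta_K = Dt @ W2          # online: only the small adjustment is computed
    return opt(K2, delta_K)    # K = K2 opt delta_K

rng = np.random.default_rng(3)
Dt = rng.standard_normal((1, 64))
W2 = rng.standard_normal((64, 5))
K2 = np.full((1, 5), 0.8)      # hypothetical offline estimate
print(adjust_quantization(Dt, W2, K2).shape)   # (1, 5)
```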
  • 3. the parameter determination module 101 may also first obtain a second parameter set through the second neural network or the preset matrix, and then combine historical parameters with the second parameter set to calculate the first parameter set.
  • Step 1: the parameter determination module 101 performs parameter calculation on the first data to obtain a second parameter set.
  • one implementable manner is that the parameter determination module 101 performs parameter calculation on the first data using the second neural network to obtain the second parameter set.
  • the second parameter may be a quantization parameter or an adjustment amount of the quantization parameter or a parameter associated with the quantization parameter obtained by the parameter determination module 101 processing the first data through the second neural network.
  • the method for processing the first data by the second neural network to obtain the second parameter is similar to the method for directly obtaining the first parameter after processing the first data through the second neural network, and details are not described herein again.
  • the parameter determining module 101 performs parameter calculation on the first data by using a preset matrix to obtain a second parameter set.
  • the second parameter may be a quantization parameter or an adjustment amount of the quantization parameter obtained by processing the first data by the parameter determination module 101 through a preset matrix, or a parameter associated with the quantization parameter.
  • the method of processing the first data with the preset matrix to obtain the second parameter is similar to the method of directly obtaining the first parameter by processing the first data with the preset matrix, that is, performing matrix addition, subtraction, and/or multiplication and division operations with the preset matrix, and details are not described here again.
  • for example, when the second parameter is a quantization parameter K3, the first data can be expanded into a matrix D3, a weighting matrix W3 is trained in advance, and K3 = D3·W3.
  • the weighting matrix W3 is preset and may be the same matrix as W2 and W1 above.
  • Step 2: the parameter determination module 101 processes the second parameter set and the third parameter set to obtain the first parameter set.
  • the multiple second parameters in the second parameter set correspond one-to-one to the multiple third parameters in the third parameter set.
  • because the distribution of the data processed within a certain period of time has a certain similarity, the third parameters, that is, the historical parameters, can be used in generating the first parameters, thereby reducing the noise of the first parameter estimate and improving its accuracy; the third parameter set is at least one parameter set obtained by the parameter determination module 101 from processing at least one piece of data before the first data.
  • when a quantization parameter needs to be generated, the second parameter is a quantization parameter and the historical parameter set is at least one quantization parameter set obtained by the second neural network before processing the first data; when a parameter associated with the quantization parameter K needs to be generated, such as the above-mentioned parameter A, the second parameter is an associated parameter of the quantization parameter and the historical parameter set is at least one associated parameter set obtained by the second neural network before processing the first data; when an adjustment amount ΔK of the quantization parameter K needs to be generated, the second parameter is a parameter adjustment amount and the historical parameter set is at least one set of quantization parameter adjustment amounts ΔK obtained by the second neural network before processing the first data.
  • one implementable manner is that the parameter determination module 101 processes the second parameter set and the third parameter set with a neural network to obtain the first parameter set. As shown in FIG. 3, the second parameter and the third parameter are subjected to alpha filtering through the neural network, that is, a weighted average is taken to obtain the first parameter.
  • the third parameter Y(t-1) is the parameter obtained by the parameter determination module 101 from processing the data immediately preceding the first data.
  • for example, when the third parameter is the quantization parameter K(t-1), linear filtering is applied first: K(t) = αK′(t) + (1-α)K(t-1), where K(t) represents the first parameter and K′(t) represents the second parameter.
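  • A minimal sketch of this alpha filter, with the filter coefficient α chosen arbitrarily for illustration.

```python
import numpy as np

def alpha_filter(K_second, K_prev, alpha=0.7):
    """K(t) = alpha * K'(t) + (1 - alpha) * K(t-1).

    K_second: second parameter K'(t), freshly estimated from the first data
    K_prev:   third (historical) parameter K(t-1) from the previous data
    alpha:    filter coefficient; 0.7 is an illustrative value only
    """
    return alpha * K_second + (1.0 - alpha) * K_prev

# Smooth a stream of noisy per-frame estimates against the history.
K_prev = np.zeros(5)
for t in range(3):
    K_second = np.random.default_rng(t).standard_normal(5)
    K_prev = alpha_filter(K_second, K_prev)   # becomes K(t-1) for next frame
print(K_prev.shape)   # (5,)
```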
  • another implementable manner is that the parameter determination module 101 processes the second parameter set and the third parameter set by matrix operations to obtain the first parameter set: the second parameter and the third parameter are subjected to matrix addition and subtraction and/or multiplication and division operations to obtain the first parameter.
  • the manner in which the parameter determination module 101 processes the second parameter set and the third parameter set to obtain the first parameter set is not limited here.
  • the parameter determination module 101 when calculating the first parameter, takes historical parameters into consideration, reduces the estimated noise of the first parameter, and improves the accuracy of the first parameter estimation.
  • the parameter determination module 101 is further configured to perform parameter calculation on the second data.
  • the second data is earlier than the first data in the time domain.
  • as shown in FIG. 4, the parameter determination module 101 first processes data 1 to obtain a first parameter set; the parameter determination module 101 then processes data 2, which follows data 1, while the neural network calculation module 102 is still in the state of processing data 1.
  • in this way, the two modules achieve the effect that their data processing does not affect each other, thereby completing the parallel processing of the two pieces of data.
  • this embodiment is equivalent to having the parameter determination module 101 and the neural network calculation module 102 operate in a pipelined manner: while the neural network calculation module 102 is processing the current data, the parameter determination module 101 can already calculate the parameters of the next piece of data, realizing pipelined operation of the two modules and improving efficiency.
  • the processing of the first data and the second data by the parameter determination module 101 need not be completely seamless in time: the parameter determination module 101 can wait for the neural network calculation module 102 to finish processing data 1 before it processes data 3, or the parameter determination module 101 may process continuously without waiting, which is not limited in this embodiment. A thread-based sketch of this pipelining follows below.
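  • The pipelining described above can be sketched with two threads and a hand-off queue; the stage functions here are toy stand-ins, not the modules of this application.

```python
import queue
import threading

def run_pipeline(frames, param_calc, nn_calc):
    """Sketch: parameter module and NN module as two pipeline stages.

    While the NN stage is still computing on frame t, the parameter stage
    can already compute parameters for frame t+1; only the hand-off of the
    parameter set couples the two stages.
    """
    handoff = queue.Queue(maxsize=1)   # (frame, parameter_set) hand-off
    results = []

    def parameter_stage():
        for frame in frames:
            handoff.put((frame, param_calc(frame)))
        handoff.put(None)              # end-of-stream marker

    def nn_stage():
        while (item := handoff.get()) is not None:
            frame, params = item
            results.append(nn_calc(frame, params))

    t1 = threading.Thread(target=parameter_stage)
    t2 = threading.Thread(target=nn_stage)
    t1.start(); t2.start(); t1.join(); t2.join()
    return results

# Toy stand-ins for the two modules.
out = run_pipeline(range(4), lambda f: f * 2, lambda f, p: f + p)
print(out)   # [0, 3, 6, 9]
```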
  • because the data processing device 10 usually needs to process a large amount of data, this application divides the work between two independent modules that process the data in parallel, which speeds up processing and reduces data processing latency.
  • as shown in FIG. 5, the data processing device 50 includes a first processor 501 and a second processor 502, which may correspond to the parameter determination module 101 and the neural network calculation module 102 in FIG. 1(a), respectively.
  • the first processor 501 and the second processor 502 are connected through a bus 503 as an example.
  • the first processor 501 and the second processor 502 may be further connected to the memory 504 through a bus 503.
  • the number of memories 504 may be one or more.
  • the first processor 501 and the second processor 502 may use different memories, respectively, or the first processor 501 and the second processor 502 may share the same memory, which is not limited in this embodiment.
  • the memory 504 may include non-volatile memory (non-volatile random access memory, NVRAM), such as read-only memory, and volatile memory, such as random access memory, and is used to store the instructions and data required by the first processor 501 and the second processor 502.
  • NVRAM Non-Volatile Random Access Memory
  • the first processor 501 and the second processor 502 are respectively used for calculating parameters and neural network operations. Therefore, each memory stores program instructions and data required by a corresponding processor.
  • any memory stores an operating system, operating instructions, executable software modules or data structures, or a subset of them, or an extended set thereof.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
  • any one of the first processor 501 and the second processor 502 may include at least one of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, or a digital signal processor (DSP), and each processor may include one or more cores.
  • the memory 504 of FIG. 5 may be integrated with the two processors 501 and 502, but a more common implementation is for it to be located outside the two processors 501 and 502.
  • the two processors 501 and 502 may be located on one or more chips, and the one or more memories 504 may be located on another chip or chips.
  • the functions of the modules may be implemented by software programs run on the processors 501 and 502.
  • in addition to running software, the processors 501 or 502 can also include the necessary hardware accelerators, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • ASIC Application Specific Integrated Circuit
  • FPGA Field-Programmable Gate Array
  • the embodiment in FIG. 5 above mainly gives an implementation based on software or a combination of software and hardware: different processors run software programs that implement the functions of the parameter determination module 101 and the neural network calculation module 102. Although the two processors are separate and can process in parallel, achieving the beneficial effects of the previous embodiment, both realize their functions by reading and running software programs.
  • the software program required for each processor may be stored in any one or more readable storage media. For example, the memory 504 mentioned earlier.
  • the parameter determination module 101 and the neural network calculation module 102 both exist in hardware, that is, each of the parameter determination module 101 and the neural network calculation module 102 is implemented in circuit hardware.
  • each module may include an ASIC, FPGA, other programmable logic device, discrete gate or transistor logic device, and discrete hardware components.
  • each module 101 and 102 may be a hardware neural processing unit (NPU), a neural network circuit, or a deep neural network processor.
  • NPU hardware neural processing unit
  • each module 101 and 102 may include a large number of logic circuits, transistors, or arithmetic circuits, and the calculation function is realized by hardware circuits without running software.
  • in one design, the modules 101 and 102 are implemented as software, stored in the memory 504 shown in FIG. 5 and run by the corresponding hardware processors 501 and 502, respectively, so that the parallel calculation of this embodiment is achieved through different processors working in parallel.
  • in another design, the modules 101 and 102 are implemented as two hardware circuits located on one or more chips to achieve the corresponding computing capabilities; that is, the modules 101 and 102 in FIG. 1(a) are hardware circuits, and the two hardware circuits are either integrated in one chip or distributed across multiple chips.
  • Modules 101 and 102 implemented by hardware have strong computing capabilities.
  • for example, module 102 may be a deep neural network processor, and module 101 may also include a parameter calculation circuit of a deep neural network; the coordinated operation of the two maximizes computing power.
  • the two parts of the hardware circuit are independent of each other, which can realize parallel computing as described above, improve efficiency and reduce delay.
  • the software program or software module involved in this embodiment may be located in any type of readable storage medium, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, register, etc. Storage media.
  • a data processing device of the present application has been described above. Please refer to FIG. 6, and a data processing method of the present application is described below.
  • a data processing method, characterized in that the data processing method is applied to a data processing device, the data processing device including: a parameter determination module 101 and a neural network calculation module 102 coupled to the parameter determination module 101; as mentioned before, each module may be a software module run by a different processor to implement parallel processing, or each module may be a hardware circuit, so that two independent hardware circuits perform parallel processing.
  • the method includes: 601, performing parameter calculation on the first data through the parameter determination module 101 to obtain a first parameter set used for the first neural network calculation; 602, performing, through the neural network calculation module 102 and using the first parameter set, the first neural network calculation on the first data to obtain a calculation result.
  • the parameter calculation of the parameter determination module 101 is independent of the first neural network calculation of the neural network calculation module 102. For specific calculation methods and implementations, please refer to the description of the previous embodiments.
  • the first parameter set is input to the neural network calculation module 102 to complete the neural network calculation process.
  • the parameter determination module 101 and the neural network calculation module 102 can each complete their part of the data processing flow independently, so that, compared with a single device, the data processing device can process further data while the first data is being processed. By providing two independent modules, this application can process data in parallel, reducing data processing latency.
  • performing parameter calculation on the first data through the parameter determination module 101 to obtain the first parameter set for the first neural network calculation includes: performing parameter calculation on the first data using the second neural network through the parameter determination module 101 to obtain the first parameter set, or performing a matrix operation on the first data and the preset matrix through the parameter determination module 101 to obtain the first parameter set.
  • the manner in which the parameter determination module 101 obtains the first parameter set using the second neural network and the manner in which it obtains the first parameter set using the preset matrix are similar to those described in the foregoing embodiment and are not repeated here.
  • the possible situation of the first parameter is similar to the possible situation of the first parameter described in the corresponding part of FIG. 1 (a), and details are not described herein again.
  • the device embodiments described above are only schematic; the modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, which can be specifically implemented as one or more communication buses or signal lines.
  • the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or a CD-ROM, including several instructions for causing a computer device to execute the methods described in the embodiments of the present application.
  • a computer device can be a personal computer, a server, or a network device, etc.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave).
  • wire for example, coaxial cable, optical fiber, digital subscriber line (DSL)
  • wireless for example, infrared, wireless, microwave, etc.
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or a data center integrating one or more available media.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose a data processing device that achieves parallel processing of data and reduces data processing latency by providing mutually independent modules: a parameter determination module and a neural network calculation module coupled to the parameter determination module. The method of the embodiments of the present application includes: a data processing device, the data processing device including a parameter determination module and a neural network calculation module coupled to the parameter determination module; the parameter determination module is configured to perform parameter calculation on first data to obtain a first parameter set used for a first neural network calculation; the neural network calculation module is configured to perform the first neural network calculation on the first data using the first parameter set to obtain a calculation result; and the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.

Description

A data processing device and a data processing method

Technical Field

This application relates to the field of data processing technologies, and in particular, to a data processing device and a data processing method.

Background

Deep neural networks (deep neural network, DNN) can be used in fields such as image classification, image recognition, and audio recognition. In a deep neural network, the data processed by a general-purpose processor is in 32-bit floating-point format (FP32), which places very high demands on the processor's computing power and power consumption.

To avoid excessive power consumption when a processor processes data, researchers have in recent years proposed a quantized neural network calculation method, referred to here as XNOR-Net: the DNN quantizes the data format to a lower-bit format, for example 1 bit, before performing matrix operations, thereby completing the DNN's frame-data processing flow. Quantization of the neural network is divided into two parts: the first is quantization of the network weight coefficients, and the second is quantization of the input and output feature maps of each layer in the neural network. The specific calculation formula of the matrix operation is:

I * W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα

where I is the FP32 feature map matrix, W is the FP32 weight matrix, sign(I) is the feature map matrix of each layer after 1-bit quantization, sign(W) is the 1-bit quantized weight matrix, α is a scalar value, K is the quantization parameter, which can be a two-dimensional matrix, ⊙ denotes matrix point multiplication, and ⊛ denotes matrix multiplication. It can be seen that when the DNN performs low-bit quantization, the calculation of the quantization parameter K is also involved: K, as the quantization parameter of the current frame data of a layer, is calculated by a convolutional layer in the DNN using the input feature map before quantization. For more on the XNOR-Net operation, see "Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks".

It can be seen that the above DNN processing of frame data is computed layer by layer through the neural network: the quantization parameter calculation must first be completed for one layer before it can be brought into the matrix operation, after which the parameter calculation of the next layer is performed, thereby completing the DNN's frame-data processing flow. Because the matrix operation can only be executed after the quantization parameter calculation finishes, and the calculation time of the quantization parameter is in turn limited by the computation of the preceding layer, the latency of the neural network's frame-data processing is increased.
Summary

This application provides a data processing device and a data processing method, which improve data processing efficiency and reduce data processing latency.

A first aspect of this application provides a data processing method, the method including: performing, by a parameter determination module, parameter calculation on first data to obtain a first parameter set used for a first neural network calculation, where the first parameter set includes at least one first parameter, each first parameter corresponds one-to-one to the feature map of a layer in the first neural network, and the first neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multi-layer perceptron; and performing, by a neural network calculation module, the first neural network calculation on the first data using the first parameter set to obtain a calculation result. After the first data is input to the neural network calculation module, the feature map of each layer in the first neural network can be obtained; the first parameter in the first parameter set and the corresponding feature map matrix are substituted into a preset calculation formula to obtain the calculation result, and this is repeated until the calculation for each first parameter in the first parameter set is completed, thereby completing the first neural network calculation process for the first data. The parameter determination module and the neural network calculation module are two modules each with independent data processing capability, so the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module. The embodiments of this application have the following advantage: because a data processing device generally needs to process a large amount of data, this application has the parameter determination module and the neural network calculation module cooperate to complete the neural network calculation on the data, and since the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module, the two modules can process data in parallel, reducing data processing latency.

Based on the first aspect, in a first implementable manner of the first aspect, performing parameter calculation on the first data includes: performing the parameter calculation on the first data using a second neural network to obtain the first parameter set, where the second neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multi-layer perceptron. In this embodiment, one possible manner of performing parameter calculation on the first data is described, which increases the implementability of the solution.

Based on the first aspect and its first implementable manner, in a second implementable manner of the first aspect, performing the parameter calculation on the first data to obtain the first parameter set includes: performing parameter calculation on the first data to obtain a second parameter set (one possible case is performing parameter calculation on the first data using the second neural network to obtain the second parameter set), and performing weighted averaging, smoothing, or alpha filtering on the second parameter set and a third parameter set to obtain the first parameter set, where the third parameter set is a historical parameter set calculated by the parameter determination module. In this embodiment, the manner of bringing in the historical parameter set to obtain the first parameter set is described, which increases the practicability and implementation flexibility of the solution.

Based on the first aspect and its first and second implementable manners, in a third implementable manner of the first aspect, performing parameter calculation on the first data includes: performing a matrix operation on the first data with a preset matrix, where the preset matrix can be set in advance by the parameter determination module. In this embodiment, another possible manner of performing parameter calculation on the first data is described, which increases the implementation flexibility of the solution.

Based on the first aspect and its first to third implementable manners, in a fourth implementable manner of the first aspect, the method further includes parallel processing of the parameter determination module and the neural network calculation module in the time domain; one possible case is: when the neural network calculation module is in the state of performing the first neural network calculation, parameter calculation is performed on second data by the parameter determination module, where the second data is earlier than the first data in the time domain; the second data may be data at any time before the first data, or data at the time immediately before the first data, which is not limited here. In this embodiment, the parallel processing of the parameter determination module and the neural network calculation module in the time domain is described, so that data can be processed in parallel and data processing latency is reduced.

Based on the first aspect and its first to fourth implementable manners, in a fifth implementable manner of the first aspect, the first parameter set includes: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, where a person skilled in the art can readily obtain the quantization parameter from the associated parameter; and the first neural network calculation is a quantized neural network calculation, for example performing the quantized neural network calculation on the first data in the XNOR-Net manner. In this embodiment, what the first parameter set specifically refers to and the specific manner of the first neural network calculation are described, which facilitates implementation of the solution.
A second aspect of this application provides a data processing device, including: a parameter determination module and a neural network calculation module coupled to the parameter determination module. The parameter determination module is configured to perform parameter calculation on first data to obtain a first parameter set used for a first neural network calculation, where the first parameter set includes at least one first parameter, each first parameter corresponds one-to-one to the feature map of a layer in the first neural network, and the first neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multi-layer perceptron. The neural network calculation module is configured to perform the first neural network calculation on the first data using the first parameter set to obtain a calculation result; after the first data is input to the neural network calculation module, the feature map of each layer in the first neural network can be obtained, the first parameter in the first parameter set and the corresponding feature map matrix are substituted into a preset calculation formula to obtain the calculation result, and this is repeated until the calculation for each first parameter in the first parameter set is completed, thereby completing the first neural network calculation process for the first data. The parameter determination module and the neural network calculation module are two modules each with independent data processing capability, so the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.

Based on the second aspect, in a first implementable manner of the second aspect, the parameter determination module is specifically configured to perform the parameter calculation on the first data using a second neural network to obtain the first parameter set, where the second neural network may be a convolutional neural network, a deep neural network, a recurrent neural network, and/or a multi-layer perceptron.

Based on the second aspect and its first implementable manner, in a second implementable manner of the second aspect, the parameter determination module is specifically configured to: perform parameter calculation on the first data to obtain a second parameter set (one possible case is performing parameter calculation on the first data using the second neural network to obtain the second parameter set), and perform weighted averaging, smoothing, or alpha filtering on the second parameter set and a third parameter set to obtain the first parameter set, where the third parameter set is a historical parameter set calculated by the parameter determination module.

Based on the second aspect and its first and second implementable manners, in a third implementable manner of the second aspect, performing parameter calculation on the first data includes: performing a matrix operation on the first data with a preset matrix, where the preset matrix can be set in advance by the parameter determination module.

Based on the second aspect and its first to third implementable manners, in a fourth implementable manner of the second aspect, the parameter determination module and the neural network calculation module perform parallel processing in the time domain; one possible case is: when the neural network calculation module is in the state of performing the first neural network calculation, the parameter determination module can be used to perform parameter calculation on second data, where the second data is earlier than the first data in the time domain; the second data may be data at any time before the first data, or data at the time immediately before the first data, which is not limited here.

Based on the second aspect and its first to fourth implementable manners, in a fifth implementable manner of the second aspect, the first parameter set includes: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, where a person skilled in the art can readily obtain the quantization parameter from the associated parameter; and the first neural network calculation is a quantized neural network calculation, for example performing the quantized neural network calculation on the first data in the XNOR-Net manner.

Based on the second aspect and its first to fifth implementable manners, in a sixth implementable manner of the second aspect, the parameter determination module may be a first circuit and the neural network calculation module may be a second circuit, and the first circuit and the second circuit may be located on one or more chips.

A third aspect of this application provides a data processing device, the data processing device including a first processor and a second processor, where the first processor corresponds to the parameter determination module described in the foregoing aspects and can perform, by running a software program, the operations performed by the parameter determination module of the foregoing aspects; the second processor corresponds to the neural network calculation module described in the foregoing aspects and can perform, by running a software program, the operations performed by the neural network calculation module of the foregoing aspects; and the first processor and the second processor may be located on one or more chips.

A fourth aspect of this application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.

A fifth aspect of this application provides a computer program product containing instructions that, when run on a computer, causes the computer to perform the methods described in the foregoing aspects.
Brief Description of the Drawings

FIG. 1(a) is a possible structure of the data processing device of this application;

FIG. 1(b) is a processing flowchart of the data processing device during low-bit quantization;

FIG. 2 is a schematic diagram of an embodiment in which the parameter determination module of this application generates quantization parameters;

FIG. 3 is a schematic diagram of another embodiment in which the parameter determination module of this application generates quantization parameters;

FIG. 4 is a schematic diagram of an embodiment in which the parameter determination module and the neural network calculation module of this application process data in parallel;

FIG. 5 is another possible structure of the data processing device of this application;

FIG. 6 is a schematic diagram of an embodiment of the data processing method of this application.
DESCRIPTION OF EMBODIMENTS
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and accompanying drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so termed are interchangeable where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The first neural network or the second neural network may specifically be a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), and/or a multi-layer perceptron (MLP).
The present application provides a data processing device applicable to neural network processing of video data, audio data, and image data. The data processing device uses the XNOR-Net method to quantize the data format of the data to be processed to a lower bit format. The data processing device includes a parameter determination module and a neural network calculation module coupled to the parameter determination module; the two are mutually independent modules within the data processing device, so when processing data they can operate in parallel, while the output of the parameter determination module also serves as the input of the neural network calculation module, allowing the two to divide the work and jointly complete the device's processing of the data. That is, the computation of the parameter determination module no longer depends on the computation of the neural network calculation module: the parameter determination module can proceed with further processing without waiting for the result of the neural network calculation module, so computation efficiency is improved and latency is shorter than in the prior art.
The device's processing of data is divided into two steps. Step 1 is the calculation of the first parameter, for example the quantization parameter K or a variant thereof. Step 2 is the first neural network calculation, namely the per-layer matrix computation, which may specifically be a quantized neural network calculation, for example as in the XNOR-Net operation described in the prior art. The following uses the device's processing of an image as an example. Step 1: an external device inputs the image data to be processed into the data processing device. The parameter determination module in the device first receives the image data and performs parameter calculation on it through a preset second neural network or a preset matrix to obtain the first parameter set, where both the preset second neural network and the preset matrix are obtained in advance through training on data. When using the second neural network, the parameter determination module first determines the feature map of each layer of the second neural network; each layer's feature map corresponds to one first parameter, and the module can obtain the first parameter from the matrix expression of each layer's feature map, finally yielding the first parameter set. When using the preset matrix, the parameter determination module performs a matrix operation, specifically a matrix multiplication, between the image data and the preset matrix to obtain the first parameter set. Step 2: the parameter determination module then inputs the determined first parameters into the neural network calculation module, which quantizes the image data to a low bit-width using the XNOR-Net method and then, based on the received first parameters, processes the image data through the first neural network. The specific processing performs, according to the algorithm:
I ∗ W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα
the matrix operations, finally yielding the output of the first neural network. In this solution, both the parameter determination module and the neural network calculation module are able to process the image data, and the process in which the parameter determination module performs parameter calculation is independent of the process in which the neural network calculation module performs neural network calculation; in other words, the former does not depend on the latter. For this embodiment, the two modules can process image data in parallel. Their time-domain parallelism is embodied as follows: after the external device inputs a second image into the data processing device, the parameter determination module begins parameter calculation on the second image while the neural network calculation module is still in the state of performing neural network calculation on the first image that preceded it. That is, further processing by the parameter determination module need not wait for the processing result of the neural network calculation module.
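As a schematic sketch of this two-step flow, the per-frame processing might be organized as below; the two callables stand in for the parameter determination module and the neural network calculation module, whose internals are detailed in the following sections, and all names are illustrative rather than part of the claimed device:

```python
def process_frame(frame, param_module, nn_module):
    """Step 1: derive the first parameter set from the unquantized frame.
    Step 2: run the quantized first-network calculation with that set."""
    first_params = param_module(frame)      # step 1: parameter calculation
    return nn_module(frame, first_params)   # step 2: first neural network
```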
Based on the above application scenario, the data processing device 10 of the present application is introduced below. As shown in FIG. 1(a), the data processing device 10 includes a parameter determination module 101 and a neural network calculation module 102 coupled to the parameter determination module 101. The parameter determination module 101 is configured to perform parameter calculation on first data to obtain a first parameter set used for first neural network calculation. In this embodiment, the parameter determination module 101 may be obtained through neural network training, or through other machine learning algorithms such as support vector machine (SVM) or decision tree training; it implements the parameter calculation and may be realized in software, hardware, or a combination thereof. The first data may be video data, audio data, or image data, or other types of data, which is not limited here.
The parameter determination module 101 performing parameter calculation on the first data may specifically mean: the module 101 performs matrix calculation on the first data to obtain parameters, or further performs smoothing calculation over multiple pieces of first data; other forms of calculation are also possible and are not limited here. The resulting first parameter may be a scalar parameter or a 2D matrix, among other possibilities, which is likewise not limited here.
The first parameter set is used by the neural network calculation module 102 for the first neural network calculation. Processing of the first data in a neural network usually passes through four kinds of layers, namely convolutional layers, pooling layers, non-linear layers, and fully connected layers, which alternate in varying numbers; each layer corresponds to one feature map. When the first neural network performs neural network calculation on the first data, the XNOR-Net method is used to quantize the data format of the first data to a low-bit format (for example 1 bit), and this quantization includes quantization of the feature maps. Different input first data yield different unquantized feature map values; yet in existing quantization methods, no matter how the input first data changes, the resulting change in the unquantized feature map values is not taken into account, and the feature map is always quantized to 1 bit regardless of its unquantized values. Such quantization reduces the accuracy of the neural network. The first parameter K is therefore introduced: K can be obtained from the parameter expression of each layer's feature map before quantization, and since the unquantized feature map values of each layer vary with the input first data, the resulting first parameter varies accordingly. By introducing a first parameter that changes in step with the first data, the present application improves the accuracy of the neural network. The parameter determination module 101 is thus a module dedicated to computing the quantization parameter K or a variant thereof, and may be realized in software, hardware, or a combination thereof.
The neural network calculation module 102 is configured to perform the first neural network calculation on the first data using the first parameter set, to obtain a calculation result. In this embodiment, the neural network calculation module 102 may specifically be a DNN processor, which has the same functions as a common AI processor and can execute neural network calculations; it may be realized in software, hardware, or a combination thereof. The parameter determination module 101 inputs the first parameter set into the neural network calculation module 102, which obtains the above first data and, combined with the first parameter set, performs the first neural network calculation on it to obtain the neural network output of each layer's feature map.
In this embodiment, the formula of the first neural network calculation is:

I ∗ W ≈ (sign(I) ⊛ sign(W)) ⊙ Kα

where I ∗ W is the original full-precision operation being approximated, I is the feature map matrix before quantization, W is the weight matrix before quantization, sign(I) is the quantized feature map matrix, sign(W) is the quantized weight matrix, ⊛ denotes matrix multiplication (carried out on the binarized operands with XNOR and bit-count operations), ⊙ denotes matrix point multiplication, α is a preset scalar value, and K is the quantization parameter, obtained from the first parameter; K may be a two-dimensional matrix. The first parameter may be the quantization parameter K itself, or an associated parameter from which a person skilled in the art can derive the quantization parameter K without creative effort; for example, the first parameter may also be an adjustment amount of the quantization parameter, which is not limited here and is described in detail below.
Referring also to FIG. 1(b), the parameter calculation of the parameter determination module 101 is independent of the first neural network calculation of the neural network calculation module 102. The first neural network calculation includes the computation of multiple layers; the input data is pipelined through each layer, with the next layer receiving the output of the previous layer as its input feature map. The parameter determination module 101 and the neural network calculation module 102 can each process the input data separately; that is, their data processing is mutually independent and does not interfere.
In this embodiment, because the parameter determination module 101 and the neural network calculation module 102 are two independent modules of the data processing device 10 whose computations are independent of each other, the present application uses the parameter determination module 101 to process the first data and obtain the first parameter set, after which the neural network calculation module 102 uses that set to complete the neural network calculation flow of the first data. Compared with processing the data on a single device, the parameter determination module 101 and the neural network calculation module 102 of the present application can, while completing the processing of the first data, process further data in parallel, which reduces data processing latency.
Meanwhile, by performing neural network training in advance, the data processing device 10 uses the trained neural network to process the first data and obtain the neural network calculation result; this approach involves a smaller computational load, which facilitates implementation of the solution.
In this embodiment, the parameter determination module 101 can perform parameter calculation on the first data to obtain the first parameter set for the first neural network calculation in the following ways. One: performing parameter calculation on the first data using a second neural network to obtain the first parameter set. The second neural network is obtained in advance by training on a large amount of data; depending on the meaning of the first parameter, the training may be training to obtain a quantization parameter set from input data, training to obtain a set of parameters associated with the quantization parameter, or training to obtain a set of quantization parameter adjustment amounts, which is not limited here. It should be noted that the internal parameter computation of the second neural network differs according to the specific data represented by the first parameter, as described below.
1. The first parameter represents the quantization parameter K. The parameter determination module 101 performs parameter calculation on the first data using the second neural network to obtain the quantization parameter set.
As shown in FIG. 2, take obtaining the quantization parameter set with a convolutional neural network as an example: there are N convolutional and fully connected layers to be quantized in total, of which the first n layers are used for feature extraction, after which N layers are used to generate the N quantization parameters K, respectively. The first n layers may be one convolutional layer and one non-linear layer, or any other flexible combination of convolutional neural network layers, which is not limited here; "conv" in FIG. 2 refers to convolution.
In one possible case, the first data in the second neural network may use the XNOR-Net method to obtain the quantization parameter K from the parameter expression of each layer's feature map before quantization, with the formula K = A * k, where A is a matrix, k is a constant, and * denotes multiplication. Taking image data as an example, A is obtained by averaging the absolute values of the elements of all channels of the feature map at each pixel; the number of channels and elements enters into this averaging. The input of each neural network layer is a three-dimensional matrix M(X, Y, Z), where X, Y, Z are the sizes along each dimension; let the third dimension Z be the number of channels, then for any channel zi there is a 2D matrix Mz(X, Y, zi), and an element of this 2D matrix, Mz(xi, yk, zi), is here analogous to a pixel of an image. The constant k is obtained from the kernel size of the convolutional layer; for details, refer to the implementation of quantizing data in the XNOR-Net manner, which is not repeated here. The quantization parameter K computed here is a 2D matrix; a sketch of this case is given after the regression case below.
Another possible case is: a second neural network is used for regression. Let the parameters of the second neural network used for regression be W_nn_para_est; then the quantization parameter is K = f(W_nn_para_est, D_t), where f(W_nn_para_est, input) is the response function of the second neural network and D_t is the input data of the second neural network.
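For the first case above (K = A * k), a minimal sketch is as follows, assuming for illustration that the constant k is taken as the reciprocal of the convolution kernel area, as in XNOR-Net; the function name and kernel-size arguments are illustrative:

```python
import numpy as np

def quant_param_K(M, kernel_h=3, kernel_w=3):
    """Compute K = A * k for one layer.

    M: unquantized input feature map of shape (X, Y, Z), Z = channels.
    A: per-pixel mean of absolute values over all channels.
    k: constant derived from the convolution kernel size (assumed here
       to be 1 / (kernel_h * kernel_w)).
    """
    A = np.abs(M).mean(axis=2)        # shape (X, Y)
    k = 1.0 / (kernel_h * kernel_w)   # scalar
    return A * k                      # K is a 2D matrix
```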
2. The first parameter represents a parameter associated with the quantization parameter K, that is, a variant of K. The parameter determination module 101 performs parameter calculation on the first data using the second neural network to obtain a set of parameters associated with the quantization parameter. After obtaining the associated parameter with the second neural network, a person skilled in the art can derive the quantization parameter K through reasonable deduction without creative effort; for example, the first parameter may be the above parameter A.
In this embodiment, the way of obtaining the first parameter with the second neural network is similar to the way of obtaining the quantization parameter with the convolutional neural network in FIG. 2 above: the second neural network may be used for feature extraction and then generate the associated parameter set. The first data in the second neural network may likewise use the XNOR-Net method to compute the associated parameter, for example the above parameter A. As stated above, A is obtained by averaging the absolute values of the elements of all channels at each pixel, and is computed as:

A = (Σ_{i=1}^{c} |I(:,:,i)|) / c

where I is the matrix of each layer's feature map before quantization and c is the number of channels of the feature map.
3. The first parameter represents an adjustment amount of the quantization parameter K, that is, another variant of K. The parameter determination module 101 performs parameter calculation on the first data using the second neural network to obtain a set of quantization parameter adjustment amounts.
To reduce the computational load of the parameter determination module 101, a quantization parameter K1 can be estimated offline in advance. K1 may be a fairly ideal quantization parameter determined from the distribution of quantization parameters obtained in repeated data-processing experiments, for example the average of the quantization parameters obtained over many experiments. The parameter determination module 101 then performs parameter calculation on the first data using the second neural network to obtain the quantization parameter adjustment amount ΔK, which may be an offset between the computed quantization parameter and K1. For example, computing K = opt(K1, ΔK) from K1 and ΔK yields the quantization parameter K, where opt may be any operation, such as addition, subtraction, or multiplication, which is not limited here. In this embodiment, the way in which the second neural network computes the adjustment amount ΔK from the first data is similar to the way the quantization parameter is computed with the convolutional neural network in FIG. 2 above, that is, the second neural network may be used for feature extraction and then generate the set of parameter adjustment amounts, which is not repeated here.
In this embodiment, the computational load of determining the adjustment amount ΔK is smaller than that of determining the quantization parameter K itself. Therefore, in the present application, the parameter determination module 101 only needs to obtain ΔK online to arrive at the quantization parameter K, which reduces its computational load.
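A minimal sketch of recovering K from the offline estimate and the online adjustment, assuming addition as the combining operation opt (which the text deliberately leaves open), could be:

```python
def apply_adjustment(K1, delta_K, opt=lambda a, b: a + b):
    """Combine the offline estimate K1 with the online adjustment
    delta_K; `opt` may be any operation (addition is only an example)."""
    return opt(K1, delta_K)
```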
Two: performing parameter calculation on the first data using a preset matrix to obtain the first parameter set. In this embodiment, the first parameter set may also be obtained by performing a matrix operation between the first data and a preset matrix. The matrix operation may be matrix addition/subtraction and/or multiplication/division, which is not limited here. The preset matrix is obtained in advance by training on a large amount of data; depending on the specific meaning of the first parameter, the training may be training to obtain a quantization parameter set from input data, training to obtain a set of parameters associated with the quantization parameter, or training to obtain quantization parameter adjustment amounts, which is not limited here.
The first parameter may be the quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter. For example, when the first parameter is the quantization parameter K, the first data can be unfolded into a matrix Dt, and a weighting matrix W1 is trained in advance; then K = Dt · W1.
As another example, when the first parameter is the quantization parameter adjustment amount ΔK, the first data can be unfolded into a matrix Dt and a weighting matrix W2 is trained in advance; the parameter determination module 101 performs the matrix operation ΔK = Dt · W2 between the first data and this preset matrix to obtain ΔK, and finally computes K = opt(K2, ΔK) with the offline quantization parameter K2 to obtain K, where opt may be any operation, such as addition, subtraction, or multiplication, which is not limited here. In this embodiment, K2 may be a fairly ideal quantization parameter determined from the distribution of quantization parameters obtained in repeated data-processing experiments, for example the average of the quantization parameters obtained over many experiments.
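A minimal sketch of this matrix-based parameter calculation, assuming the first data is unfolded by flattening all but its first dimension (the unfolding convention is not specified in the text, and the names are illustrative):

```python
import numpy as np

def param_from_matrix(first_data, W_trained):
    """Unfold the first data into a matrix Dt and multiply it by a
    pre-trained weighting matrix (W1 to obtain K directly, or W2 to
    obtain the adjustment amount delta_K)."""
    Dt = first_data.reshape(first_data.shape[0], -1)  # unfold to a matrix
    return Dt @ W_trained
```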
Besides methods one and two above, in which the first parameter set is obtained directly through the second neural network or the preset matrix, the parameter determination module 101 may further first obtain a second parameter set through the second neural network or the preset matrix, and then combine historical parameters with the second parameter set to compute the first parameter set.
Step 1: the parameter determination module 101 performs parameter calculation on the first data to obtain a second parameter set. (1) One realizable way is for the parameter determination module 101 to perform parameter calculation on the first data using the second neural network to obtain the second parameter set. Depending on the specific type of the first parameter, the second parameter may be a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, obtained by processing the first data through the second neural network; the way the second parameter is obtained here is similar to the way the first parameter is obtained directly by processing the first data through the second neural network as described above, and is not repeated here.
(2) Another realizable way is for the parameter determination module 101 to perform parameter calculation on the first data using the preset matrix to obtain the second parameter set. Depending on the specific type of the first parameter, the second parameter may be a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter, obtained by processing the first data through the preset matrix; the way the second parameter is obtained here is similar to the way the first parameter is obtained directly by processing the first data through the preset matrix as described above, that is, by matrix addition/subtraction and/or multiplication/division with the preset matrix. For example, when the second parameter is the quantization parameter K3, the first data can be unfolded into a matrix D3 and a weighting matrix W3 is trained in advance; then K3 = D3 · W3. In this embodiment, the preset weighting matrix W3 and the above W2 and W1 may be the same matrix.
Step 2: the parameter determination module 101 processes the second parameter set and a third parameter set to obtain the first parameter set, where the multiple second parameters in the second parameter set correspond one-to-one to the multiple third parameters in the third parameter set. In this embodiment, the data processed within a given period of time has a certain similarity of distribution, so the third parameters, i.e., historical parameters, can be used to generate the first quantization parameter, thereby reducing the noise of the first-parameter estimate and improving its accuracy. The historical parameters are at least one parameter set obtained by the parameter determination module 101 from at least one piece of data at a moment before the first data. When the quantization parameter K is to be generated, the second parameter is a quantization parameter and the historical parameter set is at least one quantization parameter set obtained by the second neural network before processing the first data; when a parameter associated with the quantization parameter K, such as the above parameter A, is to be generated, the second parameter is an associated parameter and the historical parameter set is at least one associated parameter set obtained by the second neural network before processing the first data; when the adjustment amount ΔK of the quantization parameter K is to be generated, the second parameter is a parameter adjustment amount and the historical parameter set is at least one set of adjustment amounts ΔK obtained by the second neural network before processing the first data.
(1) In one possible case, the parameter determination module 101 may process the second parameter set and the third parameter set with a neural network to obtain the first parameter set. As shown in FIG. 3, the neural network performs α-filtering of the second parameter and the third parameter, i.e., a weighted average, to obtain the first parameter. For example, the third parameter Y(t-1) is a parameter obtained by the parameter determination module 101 from the data at the moment before the first data. When the third parameter is the third quantization parameter K(t-1), linear regression is first used to perform a matrix operation on the first data to obtain K′(t): K′(t) = W_est · D_t, where D_t is the matrix expansion of the first data and W_est is a pre-trained weighting matrix. The first parameter is then computed as K(t) = αK′(t) + (1-α)K(t-1), where K(t) denotes the first parameter and K′(t) denotes the second parameter (a sketch of this α-filter follows these two cases).
(2) In another possible case, the parameter determination module 101 may process the second parameter set and the third parameter set by matrix operations: the module performs matrix addition/subtraction and/or multiplication/division on the second parameter and the third parameter to obtain the first parameter. In this embodiment, the way in which the parameter determination module 101 processes the second and third parameter sets to obtain the first parameter set is not limited here.
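For case (1) above, a minimal sketch of the α-filter update, with α = 0.9 chosen purely for illustration (the text does not fix its value):

```python
def alpha_filter(K_new, K_hist, alpha=0.9):
    """Weighted average of the fresh estimate K'(t) and the historical
    parameter K(t-1): K(t) = alpha * K'(t) + (1 - alpha) * K(t-1)."""
    return alpha * K_new + (1.0 - alpha) * K_hist
```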
In this embodiment, when computing the first parameter, the parameter determination module 101 also takes historical parameters into account, which reduces the estimation noise of the first parameter and improves the accuracy of the estimate.
The specific processing of the first data by the parameter determination module 101 and the neural network calculation module 102 has been described above. Referring to FIG. 4, the time-domain parallel data processing of the parameter determination module 101 and the neural network calculation module 102 of the present application is described below.
While the neural network calculation module 102 is in the state of performing the first neural network calculation, the parameter determination module 101 is further configured to perform parameter calculation on second data. In this embodiment, the second data is earlier than the first data in the time domain. As shown in FIG. 4, the parameter determination module 101 first processes data 1 to obtain the first parameter set, then processes data 2, which follows data 1, while the neural network calculation module 102 is still processing data 1. By designing the parameter determination module 101 and the neural network calculation module 102 as mutually independent yet coupled, the present application achieves the effect that the two modules' processing of data does not interfere, thereby accomplishing their parallel processing of data. This embodiment therefore amounts to operating the parameter determination module 101 and the neural network calculation module 102 in a pipelined fashion: while the neural network calculation module 102 is computing one portion of the data, the parameter determination module 101 can already compute the parameters of the next portion, so the two modules run as a pipeline and efficiency is improved.
It can be understood that, in this embodiment, the processing of the first data and the second data by the parameter determination module 101 is not necessarily seamless in time. For example, in FIG. 4, after data 2 is processed, the parameter determination module 101 may wait until the neural network calculation module 102 finishes processing data 1 before processing data 3. Of course, the parameter determination module 101 may also process continuously without waiting, which is not limited in this embodiment.
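A minimal sketch of this pipelined operation, assuming the two modules run as independent workers connected by queues; compute_first_params and run_first_network are trivial stand-ins for the parameter calculation and the first-network calculation, not the actual computations:

```python
import queue
import threading

def compute_first_params(frame):        # stand-in for the parameter module's work
    return frame * 0.5

def run_first_network(frame, params):   # stand-in for the network module's work
    return frame + params

data_q, param_q = queue.Queue(), queue.Queue()

def parameter_module():
    while True:
        frame = data_q.get()
        if frame is None:                # sentinel: no more frames
            param_q.put(None)
            break
        param_q.put((frame, compute_first_params(frame)))

def nn_module(results):
    while True:
        item = param_q.get()
        if item is None:
            break
        frame, params = item
        results.append(run_first_network(frame, params))

results = []
t1 = threading.Thread(target=parameter_module)
t2 = threading.Thread(target=nn_module, args=(results,))
t1.start(); t2.start()
for frame in [1.0, 2.0, 3.0]:            # frame t+1 is parameterized while
    data_q.put(frame)                    # frame t is still in the network stage
data_q.put(None)
t1.join(); t2.join()
print(results)
```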
In this embodiment, the data processing device 10 usually needs to process a large amount of data. Compared with the prior art, in which a single processing device completes the neural network's processing of the data, the present application provides two mutually independent modules that process data in parallel, which reduces data processing latency.
Referring to FIG. 5, another possible structure of the data processing device of the present application is described below. The data processing device 50 includes a first processor 501 and a second processor 502, which may correspond to the parameter determination module 101 and the neural network calculation module 102 in FIG. 1(a). Further, the first processor 501 and the second processor 502 are, by way of example, connected through a bus 503, and may be further connected through the bus 503 to a memory 504. The number of memories 504 may be one or more; for example, the first processor 501 and the second processor 502 may each use a different memory, or may share the same memory, which is not limited in this embodiment.
The memory 504 may include non-volatile memory (for example, non-volatile random access memory (NVRAM) or read-only memory) and volatile memory (for example, random access memory), and is used to store the instructions and data required by the first processor 501 and the second processor 502. The first processor 501 and the second processor 502 are used for parameter calculation and neural network operation, respectively, so each memory stores the program instructions and data required by the corresponding processor. For example, any of the memories stores an operating system, operation instructions, executable software modules, or data structures, or a subset or extended set thereof, where the operation instructions may include various operation instructions for implementing various operations, and the operating system may include various system programs for implementing various basic services and processing hardware-based tasks.
Either of the first processor 501 and the second processor 502 may include at least one of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, or a digital signal processor (DSP), and each processor may include one or more cores. The memory 504 of FIG. 5 may be integrated with the two processors 501 and 502, but a more common implementation places it in a device outside the two processors; for example, the two processors 501 and 502 may be located on one or more chips, while the one or more memories 504 are located on one or more other chips.
The methods disclosed in the above embodiments of the present application may be applied in the processors 501 and 502, or implemented by the processors 501 and 502 running software programs. Besides the part that runs software, each processor 501 or 502 may also include necessary hardware accelerator parts, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiment of FIG. 5 above mainly presents implementations realized by software or by a combination of software and hardware: different processors run software programs that implement the functions of the parameter determination module 101 and the neural network calculation module 102. Although the two processors are separate and can process in parallel to achieve the beneficial effects of the preceding embodiments, both implement their functions by reading software programs; the software program required by each processor may be stored in any one or more readable storage media, such as the aforementioned memory 504. In a more common implementation, the parameter determination module 101 and the neural network calculation module 102 both exist in hardware form, that is, each of them is implemented as circuit hardware. In that case both modules are functional circuits, and together they form a chip. For example, each module may include an ASIC, an FPGA, another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. For example, each of the modules 101 and 102 may be a hardware neural processing unit (NPU), a neural network circuit, or a deep neural network processor. For example, each of the modules 101 and 102 may include a large number of logic circuits, transistors, or arithmetic circuits, implementing the calculation functions through hardware circuits without running software.
In the above embodiments, the modules 101 and 102 may be designed as software, stored in the memory 504 of FIG. 5 and run by the corresponding hardware processors 501 and 502, respectively, so as to realize the parallel computing method of this embodiment through different processors in parallel. Alternatively, the modules 101 and 102 are designed as two parts of hardware circuitry located on one or more chips to provide the corresponding computing capability.
It can be understood that a typical implementation of this solution is realized in hardware, that is, the modules 101 and 102 in FIG. 1(a) are both hardware circuits, integrated in one chip or distributed over multiple chips. The hardware-implemented modules 101 and 102 have strong computing power; module 102 is a deep neural network processor, and module 101 may also be a parameter calculation circuit that includes a deep neural network, so that the two running in coordination can achieve optimal computing capability. Moreover, the two hardware circuit parts are mutually independent and can realize the parallel computation described above, improving efficiency and reducing latency.
The software programs or software modules involved in this embodiment may reside in any kind of readable storage medium mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
A data processing apparatus of the present application has been described above. Referring to FIG. 6, a data processing method of the present application is described below. The data processing method is applied to a data processing device that includes a parameter determination module 101 and a neural network calculation module 102 coupled to the parameter determination module 101; as described above, each module may be a software module run by a different processor to realize parallel processing, or each module may be a hardware circuit, so that two independent parts of hardware circuitry achieve parallel processing. Specifically, the method includes: 601. performing, by the parameter determination module 101, parameter calculation on first data to obtain a first parameter set used for first neural network calculation; 602. performing, by the neural network calculation module 102, the first neural network calculation on the first data using the first parameter set, to obtain a calculation result; where the parameter calculation of the parameter determination module 101 is independent of the first neural network calculation of the neural network calculation module 102. For the specific calculation methods and implementations, refer to the descriptions of the preceding embodiments.
In this embodiment, after the parameter determination module 101 determines the first parameter set, the first parameter set is input into the neural network calculation module 102 to complete the neural network calculation flow, where the parameter determination module 101 and the neural network calculation module 102 can each complete their data processing flows independently. Thus, compared with processing the data on a single device, the present application, by providing two independent modules, can process data in parallel while completing the processing of the first data, which reduces data processing latency.
Further, performing, by the parameter determination module 101, parameter calculation on the first data to obtain the first parameter set used for the first neural network calculation includes: performing, by the parameter determination module 101, parameter calculation on the first data using a second neural network to obtain the first parameter set; or performing, by the parameter determination module 101, a matrix operation between the first data and the preset matrix to obtain the first parameter set.
In this embodiment, the way in which the parameter determination module 101 obtains the first parameter set using the second neural network, and the way it obtains the first parameter set using the preset matrix, are similar to those described in Step 1 and Step 2 of the above embodiments, and are not repeated here.
Further, the possible forms of the first parameter are similar to those described in the part corresponding to FIG. 1(a), and are not repeated here.
It should additionally be noted that the apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the solutions of the embodiments. In addition, in the drawings of the apparatus embodiments provided in the present application, the connection relationships between modules indicate communication connections between them, which may specifically be implemented as one or more communication buses or signal lines.
From the description of the above implementations, a person skilled in the art can clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware, and of course also by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function completed by a computer program can easily be realized by corresponding hardware, and the specific hardware structures used to realize the same function can be diverse, such as analog circuits, digital circuits, or dedicated circuits. Based on such understanding, the technical solutions of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer or a processor therein, the procedures or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)).

Claims (12)

  1. A data processing device, comprising: a parameter determination module and a neural network calculation module coupled to the parameter determination module;
    the parameter determination module being configured to perform parameter calculation on first data, to obtain a first parameter set used for first neural network calculation;
    the neural network calculation module being configured to perform the first neural network calculation on the first data using the first parameter set, to obtain a calculation result; wherein
    the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.
  2. The data processing device according to claim 1, wherein the parameter determination module is specifically configured to perform the parameter calculation on the first data using a second neural network, to obtain the first parameter set.
  3. The data processing device according to claim 1 or 2, wherein the parameter determination module is specifically configured to:
    perform parameter calculation on the first data to obtain a second parameter set;
    process the second parameter set and a third parameter set to obtain the first parameter set, the third parameter set being a historical parameter set computed by the parameter determination module.
  4. The data processing device according to any one of claims 1 to 3, wherein performing parameter calculation on the first data comprises: performing a matrix operation between the first data and a preset matrix.
  5. The data processing device according to any one of claims 1 to 4, wherein, while the neural network calculation module is in the state of performing the first neural network calculation, the parameter determination module is able to perform parameter calculation on second data, the second data being earlier than the first data in the time domain.
  6. The data processing device according to any one of claims 1 to 5, wherein the first parameter set comprises: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter;
    and the first neural network calculation is a quantized neural network calculation.
  7. A data processing method, comprising:
    performing, by a parameter determination module, parameter calculation on first data, to obtain a first parameter set used for first neural network calculation;
    performing, by a neural network calculation module, the first neural network calculation on the first data using the first parameter set, to obtain a calculation result; wherein
    the parameter calculation of the parameter determination module is independent of the first neural network calculation of the neural network calculation module.
  8. The method according to claim 7, wherein performing parameter calculation on the first data comprises:
    performing parameter calculation on the first data using a second neural network.
  9. The method according to claim 7 or 8, wherein performing the parameter calculation on the first data to obtain the first parameter set comprises:
    performing parameter calculation on the first data to obtain a second parameter set;
    processing the second parameter set and a third parameter set to obtain the first parameter set, the third parameter set being a historical parameter set computed by the parameter determination module.
  10. The method according to any one of claims 7 to 9, wherein performing parameter calculation on the first data comprises:
    performing a matrix operation between the first data and a preset matrix.
  11. The method according to any one of claims 7 to 10, further comprising:
    while the neural network calculation module is in the state of performing the first neural network calculation, performing, by the parameter determination module, parameter calculation on second data, the second data being earlier than the first data in the time domain.
  12. The method according to any one of claims 7 to 11, wherein the first parameter set comprises: a quantization parameter, an adjustment amount of the quantization parameter, or a parameter associated with the quantization parameter;
    and the first neural network calculation is a quantized neural network calculation.