WO2018140294A1 - Neural network based on fixed-point operations - Google Patents

Neural network based on fixed-point operations

Info

Publication number
WO2018140294A1
Authority
WO
WIPO (PCT)
Prior art keywords
fixed
parameters
layer
convolutional layer
point format
Prior art date
Application number
PCT/US2018/014303
Other languages
English (en)
Inventor
Ningyi Xu
Hucheng Zhou
Wenqiang WANG
Xi Chen
Original Assignee
Microsoft Technology Licensing, LLC
Priority date
Filing date
Publication date
Application filed by Microsoft Technology Licensing, LLC
Publication of WO2018140294A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • Neural networks have been widely and deeply applied in computer vision, natural language processing, and speech recognition.
  • A convolutional neural network is a special type of neural network that includes a large number of learnable parameters.
  • Most current convolutional neural networks, even when deployed on one or more fast but power-hungry Graphics Processing Units (GPUs), take a great amount of time to train.
  • Various solutions have been proposed to improve the computing speed of neural networks. However, the current solutions still have a number of problems to be solved in memory consumption and/or computation complexity.
  • a solution for training a neural network is provided. In the solution, a fixed-point format is used to store parameters of the neural network, such as weights and biases. The parameters, also known as primal parameters, are updated at each iteration. Parameters in the fixed-point format have a predefined bit-width and can be stored in a memory unit of a special-purpose processing device.
  • the special-purpose processing device when executing the solution, receives an input to a layer of a neural network, reads parameters of the layer from the memory unit, and computes an output of the layer based on the input of the layer and the read parameters. In this way, the requirements for the memory and computing resources of the special-purpose processing device can be reduced.
  • FIG. 1 illustrates a block diagram of a computing environment in which implementations of the subject matter described herein can be implemented
  • FIG. 2 illustrates a block diagram of a neural network in accordance with an implementation of the subject matter described herein;
  • FIG. 3 illustrates an internal architecture for a forward pass of a convolutional layer of the neural network in accordance with an implementation of the subject matter described herein;
  • Fig. 4 illustrates an internal architecture for a backward pass of a layer of the neural network in accordance with an implementation of the subject matter described herein;
  • FIG. 5 illustrates a flowchart of a method for training a neural network in accordance with an implementation of the subject matter described herein;
  • FIG. 6 illustrates a block diagram of a device for training a neural network in accordance with an implementation of the subject matter described herein;
  • Fig. 7 illustrates a block diagram of a forward pass of the neural network in accordance with one implementation of the subject matter described herein;
  • Fig. 8 illustrates a block diagram of a backward pass of the neural network in accordance with one implementation of the subject matter described herein.
  • the term “includes” and its variants are to be read as open terms that mean “includes, but is not limited to.”
  • the term “based on” is to be read as “based at least in part on.”
  • the terms “one implementation” and “an implementation” are to be read as “at least one implementation.”
  • the term “another implementation” is to be read as “at least one other implementation.”
  • the terms “first,” “second,” and the like may refer to different or same objects. Other definitions, explicit and implicit, may be included below.
  • model quantization has been considered one of the most promising approaches, because it not only significantly accelerates computation and improves power efficiency, but also achieves comparable accuracy.
  • Model quantization is intended to quantize the model parameters (as well as activations and gradients) to low bit-width values, while model binarization pushes the limit of quantization further by quantizing the parameters to a binary value (a single bit, -1 or 1).
  • Fig. 1 illustrates a block diagram of a computing device 100 in which implementations of the subject matter described herein can be implemented. It would be appreciated that the computing device 100 shown in Fig. 1 is merely illustration but not limiting the function and scope of the implementations of the subject matter described herein in any way. As illustrated in Fig. 1, the computing device 100 may include a memory 102, a controller 104, and a special-purpose processing device 106.
  • the computing device 100 can be implemented as various user terminals or service terminals with computing capability.
  • the service terminals may be servers, large-scale computer devices, and other devices provided by various service providers.
  • the user terminals for example, are any type of mobile terminals, fixed terminals, or portable terminals, including mobile phones, stations, units, devices, multimedia computers, multimedia tablets, Internet nodes, communicators, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, Personal Communication System (PCS) devices, personal navigation devices, Personal Digital Assistants (PDAs), audio/video players, digital camera/camcorders, positioning devices, television receivers, radio broadcast receivers, electronic book devices, game devices, or any combination thereof, including the accessories and peripherals of these devices or any combination thereof. It is also contemplated that the computing device 100 can support any type of interface to the user (such as "wearable" circuitry and the like).
  • the special-purpose processing device 106 may further include a memory unit 108 and a processing unit 110.
  • the special-purpose processing device 106 may be a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a processor or a Central Processing Unit (CPU) with a customized processing unit, or a Graphics Processing Unit (GPU). Therefore, the memory unit 108 may be referred to as a memory-on-chip and the memory 102 may be referred to as a memory-off-chip accordingly.
  • the processing unit 110 can control the overall operations of the special-purpose processing device 106 and perform various computations.
  • the memory 102 may be implemented by various storage media, including but not limited to volatile and non-volatile media, and removable and non-removable media.
  • the memory 102 can be a volatile memory (such as a register, cache, or Random Access Memory (RAM)), a non-volatile memory (such as a Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), or flash memory), or any combination thereof.
  • the memory 102 may include removable and non-removable media, and may include a machine-readable medium, such as a memory, flash drive, magnetic disk, or any other medium that can be used to store information and/or data and can be accessed by the computing device 100.
  • the controller 104 can control the start and end of the computing process and may further provide inputs required for forward pass of the convolutional neural network. In addition, the controller 104 can also provide the weight data for the neural network.
  • the controller 104 communicates with the special-purpose processing device 106 via a standard interface such as a PCIe bus. The controller 104 assigns the computing tasks to the processing unit 110 on the special-purpose processing device 106.
  • the processing unit 110 begins the computing process after receiving the start signal from the controller 104.
  • the controller 104 provides the inputs and weights to the processing unit 110 for computation.
  • the memory unit 108 of the special-purpose processing device 106 may be used to store parameters, such as convolution kernel weights, while the memory 102 may store input and output feature maps and intermediate data generated during computation.
  • the special-purpose processing device 106 completes the computation of the forward pass of the neural network and then returns the output result obtained from the previous layer of the convolutional neural network to the controller 104.
  • the above control process is merely exemplary. Those skilled in the art may change the control process after understanding the implementations of the subject matter described herein.
  • the computing device 100 or the special-purpose processing device 106 can perform the training of the neural networks in the implementations of the subject matter described herein.
  • model parameters also known as primal parameters
  • the parameters are updated during each iteration.
  • in the current solutions, the parameters are stored in a high-resolution format. These parameters are quantized or binarized before every forward pass, and the associated gradient accumulation is still performed in the floating-point domain.
  • as a result, special-purpose processing devices, such as the FPGA and the ASIC, still need to implement expensive floating-point multiply-accumulate operations to handle parameter updates, and even more expensive nonlinear quantization methods.
  • the limits of quantization are further pushed by representing the parameters in a fixed-point format, which can decrease the bit-width of the parameters, so as to dramatically reduce the total memory.
  • an 8-bit fixed-point number can reduce the total memory space to a quarter of that required by a 32-bit floating-point number.
  • This makes it possible to store the parameters on the memory-on-chip of the special-purpose processing device rather than memory-off-chip.
  • on a 45nm CMOS process node, this means 100 times energy efficiency.
  • fixed-point arithmetic operations with low resolution on the special-purpose processing device are much faster and more energy-efficient than floating-point operations.
  • fixed-point operations generally reduce logic usage and power consumption dramatically, combined with higher clock frequencies, shorter pipelines, and increased throughput capabilities.
  • A convolutional neural network is a particular type of neural network, which usually includes a plurality of layers, each layer including one or more neurons. Each neuron obtains input data from the input of the neural network or from the previous layer, performs respective operations, and outputs the result to the next layer or to the output of the neural network model.
  • the input of the neural network may be, for example, images, e.g., RGB images of particular pixels.
  • the output of the neural network is a score or a probability of a different class.
  • the last layer (usually the fully-connected layer) of the neural network may be provided with a loss function, which can be a cross entropy loss function. During training of the neural network, it is generally required to minimize the loss function.
  • each layer is arranged in three dimensions: width, height, and depth.
  • Each layer of the convolutional neural network converts the three-dimensional input data to a three-dimensional activation data and outputs the converted three-dimensional activation data.
  • the convolutional neural network includes various layers arranged in a sequence and each layer sends the activation data from one layer to another.
  • the convolutional neural network mainly includes three types of layers: a convolutional layer, a pooling layer, and a fully-connected layer. By stacking the layers over each other, a complete convolutional neural network may be constructed.
  • Fig. 2 illustrates example architecture of a convolutional neural network (CNN) 200 in accordance with some implementations of the subject matter described herein. It should be understood that the structure and functions of the convolutional neural network 200 are described for illustration, and do not limit the scope of the subject matter described herein. The subject matter described herein can be implemented by different structures and/or functions.
  • the CNN 200 includes an input layer 202, convolutional layers 204 and 208, pooling layers 206 and 210, and an output layer 212.
  • the convolutional layer and the pooling layer are arranged alternately.
  • the convolutional layer 204 is followed by the adjacent pooling layer 206 and the convolutional layer 208 is followed by the adjacent pooling layer 210.
  • the convolutional layer may not be followed by a pooling layer.
  • the CNN 200 only includes one of the pooling layers 206 and 210. In some implementations, there are even no pooling layers.
  • each of the input layer 202, the convolutional layers 204 and 208, the pooling layers 206 and 210, and the output layer 212 includes one or more planes, also known as feature maps or channels.
  • the planes are arranged along the depth dimension and each plane may include two space dimensions, i.e., width and height, also known as space domain.
  • the input layer 202 may be represented by the input images, such as 32*32 RGB images.
  • the dimension for the input layer 202 is 32*32*3.
  • the width and height for the image is 32 and there are three color channels.
  • Feature map of each of the convolutional layers 204 and 208 may be obtained by applying convolutional operations on the feature map of the previous layer. By convolutional operations, each neuron in the feature map of the convolutional layers is only connected to a part of neurons of the previous layer. Therefore, applying convolutional operations on the convolutional layers indicates the presence of sparse connection between these two layers. After applying convolutional operations on the convolutional layers, the result may be applied with an activation function to determine the output of the convolutional layers.
  • each neuron is connected to a local area in the input layer 202 and computes the inner product of the local area and its weights.
  • the convolutional layer 204 may compute the output of all neurons and, if 12 filters (also known as convolution kernels) are used, the obtained output data will have a dimension of [32*32*12].
  • activation operations may be performed on each output data in the convolutional layer 204 and the common activation functions include Sigmoid, tanh, ReLU, and so on.
  • the pooling layers 206 and 210 down sample the output of the previous layer in space dimension (width and height), so as to reduce data size in space dimension.
  • the output layer 212 is usually a fully-connected layer, in which each neuron is connected to all neurons of the previous layer.
  • the output layer 212 computes classification scores and converts the data into a one-dimensional vector, each element of which corresponds to a respective category. For instance, for a convolutional network classifying the images in CIFAR-10, the last output layer has a dimension of 1*1*10, because the convolutional neural network finally compresses each image into one vector of classification scores, where the vector is arranged along the depth direction.
  • the convolutional neural network converts the images one by one from original pixel values to final classification score values.
  • both the activation function and the learning parameters can be used, where parameters in the convolutional layers and the fully-connected layer can be optimized based on different optimization solutions.
  • the optimization solutions include, but are not limited to, the stochastic gradient descent algorithm, the adaptive momentum estimation (ADAM) method, and the like. Therefore, the errors between the classification scores obtained by the convolutional neural network and the labels of each image can be reduced as much as possible for the data in the training set.
  • Training of the neural network can be further implemented by a backward pass.
  • the training set is input into the input layer of the neural network in batches, and the parameters of the neural network are updated iteratively batch by batch. The samples of each batch can be regarded as a mini-batch. After multiple iterations, all samples in the training set have been used for training once, which is called an epoch.
  • a plurality of inputs is grouped into a mini-batch, which is provided to the input layer.
  • the inputs are propagated layer by layer to the output layer of the neural network, so as to determine the output of the neural network, such as classification scores.
  • the classification scores are then compared with the labels in the training set to compute prediction errors by the loss function, for example.
  • if the output layer finds that the output is inconsistent with the correct label, the parameters of the last layer in the neural network are adjusted first, and then the parameters of the second-to-last layer connected to the last layer are adjusted. In this manner, the layers are adjusted layer by layer in the backward direction.
  • after the adjustment of all parameters in the neural network is completed, the same process is performed on the next mini-batch. In this way, the process is repeated iteratively until a predefined termination condition is satisfied, as sketched below.
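To make the loop concrete, the following is a minimal sketch of the mini-batch iteration described above. The helpers `forward`, `backward`, `loss_fn`, `update_parameters`, and `termination_condition` are hypothetical names introduced only for illustration; they are not part of this disclosure.

```python
def train(network, training_set, batch_size, num_epochs):
    """Hypothetical mini-batch training loop; all helpers are illustrative."""
    for epoch in range(num_epochs):
        # One epoch: every sample in the training set is used once.
        for inputs, labels in training_set.batches(batch_size):  # one mini-batch
            scores = forward(network, inputs)      # forward pass, layer by layer
            error = loss_fn(scores, labels)        # compare scores with labels
            grads = backward(network, error)       # backward pass, last layer first
            update_parameters(network, grads)      # adjust parameters layer by layer
        if termination_condition(network):         # e.g., target accuracy reached
            break
```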
  • in a binary neural network (BNN), weights and activations can be binarized to significantly speed up computation by using bit convolution kernels.
  • the floating-point value is converted into a single bit by the stochastic method. Although stochastic binarization can achieve better performance, the computation complexity of this solution is higher, since it requires hardware resources to generate random bits when quantizing.
  • a deterministic method is adopted to convert the floating-point value into a single bit and the deterministic solution is lower in computation complexity. For example, a simple sign function sign(·) is used to binarize the floating-point value, as shown in equation (1): w^b = sign(w) = +1 if w >= 0, and -1 otherwise. (1)
  • 1_{|r| <= 1} represents an indicator function, which equals 1 when the input r satisfies |r| <= 1 and equals 0 otherwise.
  • the STE can also be regarded as applying the hard-tanh activation function HT to the input, where HT is defined as HT(x) = max(-1, min(1, x)); a sketch of the binarization and the STE follows below.
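As an illustration, the deterministic binarization of equation (1) and the hard-tanh-based straight-through estimator (STE) can be sketched as follows. This follows the standard BNN formulation; the function names are ours, not the patent's.

```python
import numpy as np

def binarize(w):
    """Deterministic binarization of equation (1): +1 if w >= 0, else -1."""
    return np.where(w >= 0, 1.0, -1.0)

def ste_grad(grad_out, r):
    """Straight-through estimator: pass the gradient through where |r| <= 1,
    zero elsewhere. This is the gradient of HT(x) = max(-1, min(1, x))."""
    return grad_out * (np.abs(r) <= 1.0)
```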
  • weights and gradients can be stored in a fixed-point format, e.g., weights can be stored in the memory unit 108 of the special-purpose processing device 106 in a fixed-point format.
  • the fixed-point format includes an l-bit signed integer mantissa and a global scaling factor (e.g., 2^(-n)) shared among the fixed-point values, as shown in equation (5): v_i = m_i · 2^(-n), i = 1, ..., K. (5)
  • n and the mantissas m_1, ..., m_K are integers.
  • the vector v includes K elements v_1, ..., v_K, which share one scaling factor 2^(-n).
  • the integer n actually indicates the radix point position of the l-bit fixed-point number.
  • the scaling factor in fact refers to the position of the radix point. Because the scaling factor is usually fixed, i.e., the radix point is fixed, this data format is called a fixed-point number. Lowering the scaling factor reduces the range of the fixed-point format but increases its accuracy.
  • the scaling factor is usually set to be a power of 2, since the multiplication can be replaced by bit shift to reduce the complexity of computation.
  • the following equation (6) may be used to convert data x (such as a floating-point number) into an l-bit fixed-point number with the scaling factor 2^(-n): FXP(x, l, n) = Clip(⌊x / 2^(-n)⌋ · 2^(-n), MIN, MAX), (6) where MIN = -2^(l-1) · 2^(-n) and MAX = (2^(l-1) - 1) · 2^(-n).
  • equation (6) defines the rounding behavior, indicated by the rounding-down operation ⌊·⌋.
  • equation (6) also defines the saturating behavior: when x is greater than MAX, the value of the converted fixed-point number is MAX; when x is less than MIN, the value of the converted fixed-point number is MIN. A sketch of this conversion follows below.
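A minimal sketch of the conversion of equation (6), assuming the scaling factor 2^(-n) and a signed l-bit mantissa as defined in equation (5); the function name `fxp` is ours.

```python
import numpy as np

def fxp(x, l, n):
    """Convert x to an l-bit fixed-point number with scaling factor 2**(-n),
    per the reconstruction of equation (6): round down, then saturate."""
    scale = 2.0 ** (-n)
    max_val = (2 ** (l - 1) - 1) * scale   # largest representable value (MAX)
    min_val = -(2 ** (l - 1)) * scale      # smallest representable value (MIN)
    q = np.floor(np.asarray(x) / scale) * scale  # round down to a multiple of 2**(-n)
    return np.clip(q, min_val, max_val)          # saturate to [MIN, MAX]
```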
  • the scaling factor may be updated based on the data range. Specifically, it may be determined, based on overflow of the data (e.g., overflow rate and/or overflow amount), whether to update the scaling factor and how to update the scaling factor.
  • the method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters.
  • if the overflow rate exceeds a predefined threshold, the range of the fixed-point format is too small, and the scaling factor may be multiplied by the cardinal number (e.g., 2); that is, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weights are multiplied by 2, the overflow rate is still below the predefined threshold, the range of the fixed-point format is too large; the scaling factor may then be reduced by dividing it by the cardinal number (e.g., 2), that is, shifting the radix point left by one bit. A sketch of this adaptive update follows below.
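The overflow-driven update of the scaling factor can be sketched as follows, assuming a cardinal number of 2 so that changing the exponent n of the scaling factor 2^(-n) by one corresponds to shifting the radix point by one bit. The threshold is a predefined constant whose value is not specified in the text.

```python
import numpy as np

def overflow_rate(x, l, n):
    """Fraction of elements of x outside the l-bit range with factor 2**(-n)."""
    max_val = (2 ** (l - 1) - 1) * 2.0 ** (-n)
    return np.mean(np.abs(x) > max_val)

def update_n(x, l, n, threshold):
    """Return the new exponent n of the scaling factor 2**(-n) for data x."""
    if overflow_rate(x, l, n) > threshold:
        return n - 1   # too many overflows: double the scaling factor (larger range)
    if overflow_rate(2.0 * np.asarray(x), l, n) <= threshold:
        return n + 1   # even doubled data fits: halve the scaling factor (finer accuracy)
    return n
```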
  • Fig. 3 illustrates an internal architecture for a forward pass of a convolutional layer 300 of the convolutional neural network in accordance with an implementation of the subject matter described herein.
  • the convolutional layer 300 may be a k-th layer of the neural network.
  • the convolutional layer 300 may be a convolutional layer 204 or 208 in the convolutional neural network as shown in Fig. 2.
  • legend 10 represents binary numbers and legend 20 represents fixed-point numbers. It would be appreciated that, although Fig. 3 illustrates a plurality of modules or sub-layers, in specific implementations one or more sub-layers may be omitted or modified for different purposes.
  • parameters of the convolutional layer 300 include weights 302 and biases 304, denoted respectively as w_k and b_k, i.e., the weights and biases of the k-th layer.
  • parameters of the convolutional layer 300 may be represented and stored in fixed-point format instead of floating-point format.
  • the parameters in fixed-point format may be stored in the memory unit 108 of the special-purpose processing device 106 and may be read from the memory unit 108 during operation.
  • the weights 302 in fixed-point format are converted by a binary sub-layer 308 into binary weights 310, which may be represented by w_k^b.
  • the binary sub-layer 308 may convert the fixed-point weights 302 into binary weights 310 by a sign function, as shown in equation (1).
  • the convolutional layer 300 further receives an input 306, which may be represented by x .
  • the input 306 can be the input images of the neural network. In this case, the input 306 can be regarded as an 8-bit integer vector (0-255).
  • the convolutional layer 300 when the convolutional layer 300 is a hidden layer or an output layer of the neural network, for example, the input 306 may be an output of the previous layer, which may be a binary vector (+1 or -1). In both cases, convolutional operation only includes integer multiplication and accumulation and may be computed by bit convolution kernels. In some implementations, if the convolutional layer 300 is the first layer, it may be processed according to equation (8),
  • in equation (8), s = x · w^b = Σ_{n=1}^{8} 2^(n-1) (x^n · w^b), (8) where x represents the input 306 in an 8-bit fixed-point format, w^b represents a binary weight, and x^n represents the n-th bit of the mantissa of the vector x; a sketch of this bit-plane computation follows below.
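Under the assumption that the first-layer input is an unsigned 8-bit mantissa vector and the weights are binary (±1), the dot product of equation (8) can be sketched as a sum over bit planes. Names and the exact bit indexing are illustrative.

```python
import numpy as np

def first_layer_dot(x_mantissa, w_b):
    """Dot product of an 8-bit input with binary weights, one bit plane at a time.

    x_mantissa: integer mantissas in [0, 255]; w_b: binary weights in {-1, +1}.
    Since x = sum_n 2**n * bit_n(x), the product x . w_b reduces to integer
    additions over bit planes (bit convolution kernels)."""
    x = np.asarray(x_mantissa, dtype=np.int64)
    w = np.asarray(w_b, dtype=np.int64)
    s = 0
    for bit in range(8):
        plane = (x >> bit) & 1            # the bit-th bit of every mantissa
        s += (1 << bit) * np.dot(plane, w)
    return s
```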
  • a normalization sub-layer 316 represents an integer batch normalization (IBN) sub-layer, which normalizes the input tensor within a mini-batch with its mean and variance. Different from conventional batch normalization performed in the floating-point domain, all intermediate results involved in the sub-layer 316 are either 32-bit integers or low-resolution fixed-point values. Since an integer is a special fixed-point number, the IBN sub-layer 316 only includes respective fixed-point operations. Subsequently, the quantization sub-layer 318 converts the output of the IBN sub-layer 316 to a predefined fixed-point format.
  • for the inputs x_i, i = 1, ..., N within a mini-batch, the sum sum1 = Σ_i x_i and the sum of squares sum2 = Σ_i x_i^2 of all inputs can be determined.
  • the mean and the variance of the input are computed based on sum1 and sum2, wherein Round(·) means rounding to the nearest 32-bit integer.
  • the normalized output can be converted to a predefined fixed-point format via the sub-layer 318.
  • the method for updating scaling factors described in the Quantization section above can be used to update the scaling factors. For example, it may first be determined whether the overflow rate of the IBN output exceeds the predefined threshold. If the overflow rate is greater than the predefined threshold, the range of the IBN output is extended, that is, the scaling factor is increased, or equivalently the radix point is shifted right in the fixed-point format when the cardinal number is 2. This is not repeated here because it is substantially the same as the method for updating scaling factors described with reference to quantization; a sketch of the IBN computation itself is given below.
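An illustrative sketch of the IBN computation: the mini-batch statistics are accumulated as integer sums and rounded to 32-bit integers. The final normalization is written in ordinary arithmetic here, whereas the sub-layer 316 would realize it with fixed-point operations only; exact rounding details are not fully recoverable from the text.

```python
import numpy as np

def ibn_forward(x, eps=1):
    """Illustrative integer batch normalization over a mini-batch.

    x: integer-valued inputs. sum1 and sum2 are integer accumulations;
    Round() is modeled with np.rint."""
    n = x.size
    x = x.astype(np.int64)
    sum1 = int(np.sum(x))                  # sum of all inputs
    sum2 = int(np.sum(x * x))              # sum of squares of all inputs
    mean = int(np.rint(sum1 / n))          # Round(): nearest integer
    var = max(int(np.rint(sum2 / n - mean * mean)), 0)
    # The result is then converted to a predefined fixed-point format
    # by the following quantization sub-layer (318).
    return (x - mean) / np.sqrt(var + eps)
```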
  • a summing sub-layer 320 adds the output of the IBN sub-layer 316 to the bias 304 to provide an output s_k.
  • the bias 304 may be read from the memory unit 108 of the special-purpose processing device 106.
  • the activation sub-layer 322 represents an activation function, which is usually implemented by a non-linear activation function, e.g., the hard-tanh function HT.
  • the output of the activation sub-layer 322 is converted via the quantization sub-layer 324 into an output 326 in a fixed-point format, which is denoted by x_{k+1}, to be provided to the next ((k+1)-th) layer of the neural network.
  • the last layer of the neural network may not include the activation sub-layer 322 and the quantization sub-layer 324; i.e., the loss function layer is computed in the floating-point domain.
  • a pooling layer is located after the convolutional layer 300.
  • the convolutional layers 204 and 208 are followed by the pooling layers 206 and 210, respectively, in the convolutional neural network 200.
  • the pooling layer may be incorporated into the convolutional layer 300 to further reduce computation complexity.
  • the pooling layer 206 is incorporated into the convolutional layer 204 in the convolutional neural network 200.
  • the pooling sub-layer 314 indicated by the dotted line may be incorporated into the convolutional layer 300 and arranged between the convolutional sub-layer 312 and the IBN sub-layer 316.
  • the forward pass of the entire neural network may be stacked by a plurality of similar processes.
  • the output of the k-th layer is provided to the (k+1)-th layer to serve as an input of the (k+1)-th layer; the process continues layer by layer.
  • the output of the convolutional layer 204 is determined from the architecture of the convolutional layer 300 (without the sub-layer 314). If the pooling layer 206 is incorporated into the convolutional layer 204, the output of the pooling layer 206 may be determined by the architecture of the convolutional layer 300 (including the sub-layer 314). Then, the output is provided to the convolutional layer 208 and the classification category is provided at the output layer 212.
  • Fig. 4 illustrates an internal architecture for a backward pass of a convolutional layer 400 of the convolutional neural network in accordance with an implementation of the subject matter described herein.
  • the backward pass process is shown in Fig. 4 from right to left.
  • legend 30 represents floating-point numbers and legend 20 represents fixed-point numbers.
  • the forward pass and backward pass processes of the convolutional layer are indicated by the reference signs 300 and 400, respectively.
  • the convolutional layers 300 and 400 may refer to the same layer in the neural network.
  • the convolutional layers 300 and 400 may be the architecture for implementing the forward pass and backward pass of the convolutional layer 204 or 208 in the convolutional neural network 200.
  • although Fig. 4 illustrates a plurality of modules or sub-layers, one or more sub-layers may be omitted or modified in specific implementations for different purposes.
  • the convolutional layer 400 receives a backward input 426 from the next layer of the neural network; e.g., if the convolutional layer 400 is the k-th layer, it receives the backward input 426 from the (k+1)-th layer.
  • the backward input 426 may be a gradient of the loss function with respect to the forward output 326 of the convolutional layer 300.
  • the gradient may be in a floating-point format and may be represented as g.
  • the backward input 426 is converted into a fixed-point value 430 (denoted by g^fxp).
  • the activation sub-layer 422 computes its output based on the fixed-point value 430, i.e., the gradient of the loss function with respect to the input s_k of the activation sub-layer 322, denoted by g_{s_k}.
  • the activation sub-layer 322 in Fig. 3 corresponds to the activation sub-layer 422 in Fig. 4, which serves as a backward gradient operation for the activation sub-layer 322.
  • the input of the activation sub-layer 322 is x
  • the output thereof is y
  • the backward input of the corresponding activation sub-layer 422 is a gradient of the loss function with respect to the output y
  • the backward output is a gradient of the loss function with respect to the input x.
  • the backward output of the activation sub-layer 422 is provided to the summing sub-layer 420, which corresponds to the summing sub-layer 320, and the gradients of the loss function with respect to the two inputs of the summing sub-layer 320 may be determined. Because one input of the sub-layer 320 is the bias, the gradient of the loss function with respect to the bias may be determined and provided to the quantization sub-layer 428. The gradient is then converted into a fixed-point gradient by the quantization sub-layer 428 for updating the bias 404 (represented by b_k^fxp).
  • the fixed- point format has a specific scaling factor, which may be updated in accordance with the method for updating scaling factors as described in the Quantization section above.
  • Another backward output of the summing sub-layer 420 is propagated to the IBN sub-layer 418.
  • different from the forward pass, in which the IBN sub-layer 316 is computed in a fixed-point format, the IBN sub-layer 418 returns to the floating-point domain for its operations, so as to provide an intermediate gradient output.
  • the intermediate gradient output is a gradient of the loss function with respect to the convolution of the input and parameters.
  • an additional quantization sub-layer 416 is utilized after the IBN sub-layer 418 for converting the floating-point format into a fixed-point format.
  • the quantization sub-layer 416 converts the intermediate gradient output to a fixed-point format having a specific scaling factor, which may be updated according to the method for updating scaling factors as described in the Quantization section above.
  • the convolutional sub-layer 412 further propagates a gradient g_{w^b} of the loss function with respect to the binary weight w^b and a gradient of the loss function with respect to the input of the convolutional layer.
  • the convolutional sub-layer 412 only contains fixed-point multiplication and accumulation, thereby resulting in a very low computation complexity.
  • one backward output of the convolutional sub-layer 412 provides a backward input to the previous layer of the neural network.
  • the backward output g_{w^b} of the convolutional sub-layer 412 is converted into a fixed-point format via the quantization layer 408 to update the weight 402 (represented by w_k^fxp).
  • the fixed-point format has a specific scaling factor, which may be updated according to the method for updating scaling factors as described in the Quantization section above.
  • the parameters may be updated.
  • the parameter may be updated by various updating rules, e.g., stochastic gradient descent, Adaptive Momentum Estimation (ADAM), or the like.
  • the updating rules are performed in the fixed-point domain to further reduce floating-point computation. It would be appreciated that, although reference is made to the ADAM optimization method, any other suitable methods currently known or to be developed in the future may also be implemented.
  • ADAM method dynamically adjusts the learning rate for each parameter based on a first moment estimate and a second moment estimate of the gradient of the loss function with respect to each parameter.
  • Fixed-point ADAM optimization method differs from the standard ADAM optimization method in that the fixed-point ADAM method operates entirely within the fixed-point domain. In other words, its intermediate variables (e.g., first moment estimate and second moment estimate) are represented by fixed-point numbers.
  • one fixed-point ADAM learning rule is represented by the following equations: m_t = FXP(β_1 · m_{t-1} + (1 - β_1) · g_t), v_t = FXP(β_2 · v_{t-1} + (1 - β_2) · g_t^2), θ_t = FXP(θ_{t-1} - α · m_t / (√v_t + ε)).
  • g_t^2 denotes the element-by-element square g_t ⊙ g_t.
  • β_1 and β_2 are respectively fixed constants, and FXP(·) represents the conversion function of equation (6).
  • θ_{t-1} represents the current fixed-point parameter value with a fixed-point format (l_1, n_1), and θ_t represents the updated fixed-point parameter value.
  • the fixed-point format for the gradient g_t is (l_2, n_2), and α is the learning rate. It can be seen that the fixed-point ADAM method computes the updated parameters by calculating the intermediate variables m_t and v_t, and only includes respective fixed-point operations; a sketch follows below.
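A sketch of one step of the fixed-point ADAM rule reconstructed above, reusing the `fxp()` function from the sketch after equation (6). The moment formats, β_1, β_2, and ε values are assumptions, as the original constants are not recoverable from the text.

```python
import numpy as np

def fxp_adam_step(theta, m, v, g, lr, l1, n1, l2, n2,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """One fixed-point ADAM step: every intermediate is re-quantized with
    fxp() (the equation-(6) conversion sketched earlier). theta uses format
    (l1, n1); g, m, v use format (l2, n2)."""
    g = fxp(g, l2, n2)                                   # gradient re-quantized
    m = fxp(beta1 * m + (1.0 - beta1) * g, l2, n2)       # first moment estimate
    v = fxp(beta2 * v + (1.0 - beta2) * g * g, l2, n2)   # second moment (elementwise square)
    theta = fxp(theta - lr * m / (np.sqrt(v) + eps), l1, n1)
    return theta, m, v
```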
  • the updated weight w_k^fxp and bias b_k^fxp can be computed.
  • these parameters can be stored in a memory unit 108 of the special-purpose processing device 106 in a fixed-point format.
  • the scaling factors for the fixed-point format of the parameters may also be updated as described above.
  • the scaling factors may be updated according to the method for updating scaling factors as described in the Quantization section above.
  • if a pooling layer is incorporated into the convolutional layer 300 to serve as its pooling sub-layer 314 in the forward pass, a corresponding pooling layer should be incorporated into the convolutional layer 400 to serve as its pooling sub-layer 414 in the backward pass.
  • the quantization sub-layer may be implemented by a linear quantization method, and an adaptive updating method for scaling factors of the fixed-point parameters corresponding to the quantization sub-layer may be used to ensure that no significant drop will occur in accuracy.
  • the linear quantization method can greatly lower computation complexity, which can further facilitate the deployment of the convolutional neural network on the special-purpose processing device.
  • the backward pass process has been introduced above with reference to a convolutional layer 400. It would be appreciated that the backward pass of the entire neural network can be stacked by a plurality of similar processes. For example, the backward output of the (k+1)-th layer is provided to the k-th layer to serve as a backward input of the k-th layer; and the parameters of each layer are updated layer by layer.
  • the backward output of the convolutional layer 204 can be determined by the architecture of the convolutional layer 400 (including the sub-layer 414).
  • the backward output is provided to the input layer 202, finally finishing the updating of all parameters of the neural network 200 and thereby completing an iteration of a mini-batch. Completing the iterations of all mini-batches in the training set may be referred to as finishing a full pass over the data set, which is also known as an epoch. After a plurality of epochs, if the training result satisfies the predefined threshold condition, the training is complete.
  • the threshold condition can be a predefined number of epochs or a predefined accuracy.
  • the adaptive updating method may be performed once after a plurality of iterations.
  • the frequency for applying the adaptive updating method may vary for different quantities.
  • the adaptive updating method may be applied more frequently for the gradients, because the gradients tend to fluctuate more extensively.
  • Fig. 5 illustrates a flowchart of a method 500 for training a neural network in accordance with implementations of the subject matter described herein.
  • the method 500 may be performed on the special-purpose processing device 106 as shown in Fig. 1.
  • the special-purpose processing device 106 may be an FPGA or ASIC, for example.
  • an input to a convolutional layer of the neural network is received.
  • the input may be received from the previous layer, or may be an input image for the neural network.
  • the input may correspond to samples of a mini-batch in the training set.
  • parameters of the convolutional layer are read from the memory unit 108 of the special-purpose processing device 106, where the parameters are stored in the memory unit 108 in a first fixed-point format and have a predefined bit-width.
  • the parameters may represent either weight parameters or bias parameters of the convolutional layer, or may represent both the weight parameters and the bias parameters.
  • the bit-width of the first fixed-point format is smaller than that of a floating-point number, so as to reduce the memory space of the memory unit 108.
  • the output of the convolutional layer is computed by fixed-point operations based on the input of the convolutional layer and the read parameters.
  • the convolutional operations may be performed on the input and the parameters of the convolutional layer to obtain an intermediate output, which is normalized to obtain a normalized output.
  • the normalization only includes respective fixed-point operations.
  • the normalization may be implemented by the IBN layer 316 as shown in Fig. 3.
  • the scaling factors of the parameters above are adaptively updated. For example, a backward input to the convolutional layer is received at the output of the convolutional layer, where the backward input is a gradient of the loss function of the neural network with respect to the output of the convolutional layer. Based on the backward input, the gradient of the loss function of the neural network with respect to parameters of the convolutional layer is computed.
  • the parameters in the first fixed-point format may be updated based on the gradient of the loss function of the neural network with respect to parameters.
  • the scaling factors of the first fixed-point format may be updated based on the updated parameter range. For example, the fixed-point format of the parameters may be updated by the method described above with reference to quantization.
  • the updated parameters may be stored in the memory unit 108 of the special-purpose processing device 106 to be read at the next iteration.
  • the fixed-point format of the parameters may be updated at a certain frequency.
  • updating parameters only include respective fixed-point operations, which may be implemented by a fixed-point ADAM optimization method, for example.
  • the gradient of the loss function with respect to the parameters may first be converted to a second fixed-point format for updating the parameters in the first fixed-point format.
  • the first fixed-point format may be identical to or different from the second fixed-point format.
  • the conversion method can be carried out by a linear quantization method.
  • the gradient of the loss function of the neural network with respect to parameters may be converted to the second fixed-point format by a linear quantization method.
  • the parameters in the first fixed-point format may be updated based on the gradient in the second fixed-point format.
  • the scaling factors of the second fixed-point format may be updated based on the range of the gradient of the loss function with respect to the parameters.
  • the linear quantization method has a lower computation complexity and the performance will not be substantially degraded, because the scaling factor updating method is employed in the implementations of the subject matter described herein.
  • computing the output of the convolutional layer further comprises: converting the normalized output to a normalized output in a third fixed-point format, where the scaling factors of the third fixed-point format may be updated based on the range of the normalized output in the third fixed-point format.
  • the output of the IBN sub-layer 316 may be provided to the quantization layer 318, which can convert the normalized output of the IBN sub-layer 316 into a normalized output in the third fixed-point format.
  • the scaling factors of the third fixed-point format can be updated depending on various factors.
  • the updating method may be configured to be carried out after a given number of iterations, which updating method may be the method described in the Quantization section above.
  • the method further comprises: receiving a backward input to the convolutional layer at the output of the convolutional layer, which backward input is a gradient of the loss function of the neural network with respect to the output of the convolutional layer. Then, an intermediate backward output is obtained by the backward gradient operations corresponding to the normalization.
  • the gradient of the loss function with respect to the convolution above is computed based on the backward input.
  • the backward gradient operations of the IBN sub-layer 418 shown in Fig. 4 correspond to the normalization of the IBN sub-layer 316 shown in Fig. 3.
  • the backward gradient operations can be performed by the IBN sub-layer 418 to obtain an intermediate backward output.
  • the intermediate backward output is converted to a fourth fixed-point format and the scaling factors of the fourth fixed-point format can be updated based on the range of the intermediate backward output.
  • the scaling factors of the fourth fixed-point format may be updated according to the updating method described above with reference to quantization.
  • the training process of the entire neural network may be stacked by the process of method 500 as described above with reference to Figs. 3 and 4.
  • Fig. 1 illustrates an example implementation of the special-purpose processing device 106.
  • the special-purpose processing device 106 includes a memory unit 108 for storing parameters of the neural network, and a processing unit 110 for reading the stored parameters from the memory unit 108 and using the parameters to process the input.
  • Fig. 6 illustrates a block diagram of a further example implementation of the special-purpose processing device 106.
  • the special-purpose processing device 106 may be an FPGA or ASIC, for example.
  • the special-purpose processing device 106 includes a memory module 602 configured to store parameters of the convolutional layer of the neural network in a first fixed-point format, where the parameters in the first fixed-point format have a predefined bit-width.
  • the memory module 602 is similar to the memory unit 108 of Fig. 1 in terms of functionality and both of them may be implemented using the same or different techniques or processes.
  • the bit-width of the first fixed-point format is smaller than that of floating-point numbers, so as to reduce the memory space of the memory module 602.
  • the special-purpose processing device 106 further includes an interface module 604 configured to receive an input to the convolutional layer. In some implementations, the interface module 604 may be used for processing various inputs and outputs between various layers of the neural network.
  • the special-purpose processing device 106 further includes a data access module 606 configured to read parameters of the convolutional layer from the memory module 602. In some implementations, the data access module 606 may interact with the memory module 602 to process the access to the parameters of the neural network.
  • the special-purpose processing device 106 may further include a computing module 608 configured to compute, based on the input of the convolutional layer and the read parameters, the output of the convolutional layer by a fixed-point operation.
  • the interface module 604 is further configured to receive a backward input to the convolutional layer at the output of the convolutional layer, where the backward input is a gradient of the loss function of the neural network with respect to the output of the convolutional layer.
  • the computing module 608 is further configured to compute a gradient of the loss function of the neural network with respect to the parameters of the convolutional layer based on the backward input; and update parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters, where the scaling factors of the first fixed- point format can be updated based on the range of the updated parameters.
  • updating parameters only includes respective fixed-point operations.
  • the computing module 608 is further configured to convert the gradient of the loss function of the neural network with respect to the parameters to a second fixed-point format by a linear quantization method, where the scaling factors of the second fixed-point format can be updated based on the gradient of the loss function with respect to the parameters; and update the parameters based on the gradient in the second fixed-point format.
  • the computing module 608 is further configured to normalize a convolution of the input of the convolutional layer and the parameters to obtain a normalized output, where the normalization only includes respective fixed-point operations.
  • the computing module 608 is further configured to convert the normalized output to a normalized output in a third fixed-point format, where the scaling factors of the third fixed-point format can be updated based on the range of the normalized output in the third fixed-point format.
  • the interface module 604 is further configured to obtain a backward input to the convolutional layer at the output of the convolutional layer, where the backward input is a gradient of the loss function of the neural network with respect to the output of the convolutional layer.
  • the computing module 608 may be configured to compute the gradient of the loss function with respect to the convolution based on the backward input; and convert the gradient of the loss function with respect to the convolution to a fourth fixed-point format, where the scaling factor of the fourth fixed-point format can be updated based on the range of the gradient of the loss function with respect to the convolution.
  • the following section will introduce the important factors that affect the final prediction accuracy of the training model of the neural network in accordance with implementations of the subject matter described herein.
  • the factors comprise the batch normalization (BN) scheme, bit-width of the primal parameters, and bit-width of gradients.
  • the data set CIFAR-10 is used, which is an image classification benchmark with 60K 32x32 RGB tiny images. It consists of 10 object classes, including airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each class has 5K training images and 1K test images.
  • three networks with different sizes are designed by stacking the basic structural modules of the neural network shown in Figs. 3 and 4, namely Model-S (Small), Model-M (Medium), and Model-L (Large). The overall network structure is illustrated in Figs. 7 and 8.
  • Fig. 7 illustrates a block diagram of a forward pass of the convolutional neural network 700 in accordance with an implementation of the subject matter described herein and Fig. 8 illustrates a block diagram of a backward pass of the convolutional neural network 800 in accordance with an implementation of the subject matter described herein.
  • each epoch is one pass of training that uses all samples in the training set, and each iteration uses the samples of one batch for training; each epoch accordingly has 250 iterations. Furthermore, in the experiments, the fixed-point ADAM optimization method or the standard ADAM optimization method is used with an initial learning rate of 2^(-6), and the learning rate is decreased by a factor of 2^(-4) every 50 epochs.
  • the accuracy of the neural network is quite stable with respect to the bit-width of the IBN output, down to as low as 6 bits. If the bit-width of the IBN output is decreased further, the accuracy suffers a cliff-like drop.
  • the effect of the gradient bit-width is also evaluated.
  • the gradients are more unstable than the parameters, which shows that the scaling factors of the gradients should be updated more frequently.
  • the update occurs every 375 iterations (1% of total iterations) and the fixed-point ADAM method is employed.
  • the primal parameters are set with floating-point values. It can be seen from the testing that the prediction accuracy decreases very slowly as the bit-width of the gradient is reduced. The prediction accuracy also suffers a cliff-like drop when the bit-width of the gradient is lower than 12 bits, which is similar to the effect of the IBN output and the parameter bit-width.
  • the test is performed by combining the three effects, i.e., the neural network is implemented to substantially involve fixed-point computations only. In this way, the result in Table 2 can be obtained.
  • the relative storage is characterized by a product of the parameter number and the bits of the primal weight. It can be seen from Table 2 that a comparable accuracy with a larger bit-width can be obtained when the bit-width of the primal weight is 12 and the bit-width of the gradient is also 12. As the weight bit-width decreases, the storage will be substantially decreased. Therefore, the training solution for the neural network according to implementations of the subject matter described herein can lower the storage while maintaining computation accuracy.
  • the method can achieve results comparable with the state-of-the-art works (not shown) when the bit-width of each of the primal weight and the gradient is 12.
  • the method dramatically reduces the storage and significantly improves system performance.
  • a special-purpose processing device comprises a memory unit configured to store parameters of a layer of a neural network in a first fixed-point format, the parameters in the first fixed-point format having a predefined bit-width; a processing unit coupled to the memory unit and configured to perform acts including: receiving an input to the layer; reading the parameters of the layer from the memory unit; and computing, based on the input of the layer and the read parameters, an output of the layer through a fixed-point operation.
  • the layer of the neural network includes a convolutional layer.
  • the acts further include: receiving a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of a loss function of the neural network with respect to the output of the convolutional layer; computing, based on the backward input, a gradient of the loss function of the neural network with respect to the parameters of the convolutional layer; and updating the parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer, a scaling factor of the first fixed-point format being updatable based on a range of the updated parameters.
  • updating the parameters only includes a respective fixed-point operation.
  • updating the parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer comprises: converting the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer into a second fixed-point format by a linear quantization method, the scaling factor of the second fixed-point format being updatable based on a range of the gradient of the loss function with respect to the parameters of the convolutional layer; and updating the parameters in the first fixed-point format based on the gradient in the second fixed-point format.
  • computing the output of the layer comprises: normalizing a convolution of the input of the convolutional layer and the parameters in the first fixed-point format to obtain a normalized output, the normalizing only including a respective fixed-point operation.
  • computing the output of the convolutional layer further comprises: converting the normalized output into the normalized output in a third fixed-point format, a scaling factor of the third fixed-point format being updatable based on a range of the normalized output in the third fixed-point format.
  • the acts further include: obtaining a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of the loss function of the neural network with respect to the output of the convolutional layer; computing, based on the backward input, a gradient of the loss function with respect to the convolution; and converting the gradient of the loss function with respect to a convolution into a fourth fixed-point format, a scaling factor of the fourth fixed-point format being updatable based on a range of the gradient of the loss function with respect to the convolution.
  • the special-purpose processing device is a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a processor having a customized processing unit, or a graphics processing unit (GPU).
  • a method executed by a special-purpose processing device including a memory unit and a processing unit.
  • the method comprises receiving an input to a layer of a neural network; reading parameters of the layer from the memory unit of the special-purpose processing device, the parameters being stored in the memory unit in a first fixed-point format and having a predefined bit-width; and computing, by the processing unit and based on the input of the layer and the read parameters, an output of the layer through a fixed-point operation.
  • the layer of the neural network includes a convolutional layer.
  • the method further comprises: receiving a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of a loss function of the neural network with respect to the output of the convolutional layer; computing, based on the backward input, a gradient of the loss function of the neural network with respect to the parameters of the convolutional layer; and updating the parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer, a scaling factor of the first fixed-point format being updatable based on a range of the updated parameters.
  • updating the parameters includes only a respective fixed-point operation.
  • updating the parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer comprises: converting the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer into a second fixed-point format by a linear quantization method, the scaling factor of the second fixed-point format being updatable based on a range of the gradient of the loss function with respect to the parameters of the convolutional layer; and updating the parameters in the first fixed-point format based on the gradient in the second fixed-point format.
  • computing the output of the layer comprises: normalizing a convolution of the input of the convolutional layer and the parameters in the first fixed-point format to obtain a normalized output, the normalizing including only a respective fixed-point operation.
  • computing the output of the convolutional layer further comprises: converting the normalized output into the normalized output in a third fixed-point format, a scaling factor of the third fixed-point format being updatable based on a range of the normalized output in the third fixed-point format.
  • the method further comprises: obtaining a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of the loss function of the neural network with respect to the output of the convolutional layer; computing, based on the backward input, a gradient of the loss function with respect to the convolution; and converting the gradient of the loss function with respect to the convolution into a fourth fixed-point format, a scaling factor of the fourth fixed-point format being updatable based on a range of the gradient of the loss function with respect to the convolution.
  • the special-purpose processing device is a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a processor having customized processing units, or a graphics processing unit (GPU).
  • FPGA field programmable gate array
  • ASIC application-specific integrated circuit
  • GPU graphics processing unit
  • a special-purpose processing device comprises: a memory module configured to store parameters of a layer of a neural network in a first fixed-point format, the parameters in the first fixed-point format having a predefined bit-width; an interface module configured to receive an input to the layer; a data access module configured to read the parameters of the layer from the memory module; and a computing module configured to compute, based on the input of the layer and the read parameters, an output of the layer through a fixed-point operation.
  • the layer of the neural network includes a convolutional layer.
  • the interface module is further configured to receive a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of a loss function of the neural network with respect to the output of the convolutional layer;
  • the computing module is further configured to: compute, based on the backward input, a gradient of the loss function of the neural network with respect to the parameters of the convolutional layer, and update the parameters in the first fixed-point format based on the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer, a scaling factor of the first fixed-point format being updatable based on a range of the updated parameters.
  • updating the parameters includes only a respective fixed-point operation.
  • the computing module is further configured to: convert the gradient of the loss function of the neural network with respect to the parameters of the convolutional layer into a second fixed-point format by a linear quantization method, the scaling factor of the second fixed-point format being updatable based on a range of the gradient of the loss function with respect to the parameters of the convolutional layer; and update the parameters in the first fixed-point format based on the gradient in the second fixed-point format (a minimal sketch of this quantization-and-update step follows this list).
  • the computing module is further configured to: normalize a convolution of the input of the convolutional layer and the parameters in the first fixed-point format to obtain a normalized output, the normalizing including only a respective fixed-point operation.
  • the computing module is further configured to: convert the normalized output into the normalized output in a third fixed-point format, a scaling factor of the third fixed-point format being updatable based on a range of the normalized output in the third fixed-point format.
  • the interface module is further configured to obtain a backward input to the convolutional layer at an output of the convolutional layer, the backward input being a gradient of the loss function of the neural network with respect to the output of the convolutional layer; and the computing module is further configured to: compute, based on the backward input, a gradient of the loss function with respect to the convolution, and convert the gradient of the loss function with respect to the convolution into a fourth fixed-point format, a scaling factor of the fourth fixed-point format being updatable based on a range of the gradient of the loss function with respect to the convolution.
  • the special-purpose processing device is a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a processor having customized processing units, or a graphics processing unit (GPU).
  • FPGA field programmable gate array
  • ASIC application-specific integrated circuit
  • GPU graphics processing unit
  • FPGAs Field-Programmable Gate Arrays
  • ASICs Application-specific Integrated Circuits
  • ASSP Application-specific Standard Product
  • SOC System-on-a-chip systems
  • CPLD Complex Programmable Logic Devices
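
The bullets above recite linear quantization into a fixed-point format with an updatable scaling factor, and a parameter update that uses only fixed-point operations. As a minimal sketch of how such steps can look, assuming power-of-two scaling factors (a value is stored as a signed integer q standing for q * 2**scale_exp) and a power-of-two learning rate, consider the following Python/NumPy fragment; the names quantize_linear, update_scale_exp, and fixed_point_sgd_step are illustrative and not part of the claimed implementation:

    import numpy as np

    def quantize_linear(x, bit_width, scale_exp):
        """Linearly quantize float values to signed fixed-point integers."""
        qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
        q = np.round(x / 2.0 ** scale_exp)
        return np.clip(q, qmin, qmax).astype(np.int64)

    def update_scale_exp(x, bit_width):
        """Re-derive the power-of-two scaling factor from the current value range."""
        max_abs = float(np.max(np.abs(x)))
        if max_abs == 0.0:
            max_abs = 1.0
        # Smallest exponent such that max_abs fits in bit_width - 1 magnitude bits.
        return int(np.ceil(np.log2(max_abs))) - (bit_width - 1)

    def fixed_point_sgd_step(w_q, w_exp, g_q, g_exp, lr_exp):
        """Compute w - 2**lr_exp * g using only integer shift and subtraction."""
        # Align the gradient's scaling factor (g_exp + lr_exp) with w_exp.
        shift = (g_exp + lr_exp) - w_exp
        delta = g_q << shift if shift >= 0 else g_q >> -shift
        return w_q - delta

In one iteration, the float gradient would be quantized with update_scale_exp and quantize_linear into the second fixed-point format, the weights updated with fixed_point_sgd_step, and the weights' scaling factor then refreshed from the range of the updated parameters, mirroring the claim language above.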

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

Implementations of the subject matter described herein provide a solution for training a convolutional neural network. In this solution, parameters of the neural network, such as weights and biases, are stored in a fixed-point format. The parameters in the first fixed-point format have a predefined bit-width and can be stored in a memory unit of the special-purpose processing device. When executing the solution, the special-purpose processing device receives an input to a convolutional layer, reads parameters of the convolutional layer from the memory unit, and computes an output of the convolutional layer based on the input of the convolutional layer and the read parameters. In this way, the demands on the storage space and computing resources of the special-purpose processing device are lowered.
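
To make the forward computation described in the abstract concrete, the following is a rough Python/NumPy illustration rather than the patented implementation; it assumes the same power-of-two scaling-factor convention as the earlier sketch, and the name conv2d_fixed_point is illustrative. The convolution of a fixed-point input with fixed-point parameters is accumulated in wider integers, and a single shift re-quantizes the accumulator into the output fixed-point format, so the whole step involves only integer operations:

    import numpy as np

    def conv2d_fixed_point(x_q, w_q, x_exp, w_exp, out_bits, out_exp):
        """Valid 2-D convolution (CNN-style cross-correlation) on integer arrays."""
        kh, kw = w_q.shape
        oh, ow = x_q.shape[0] - kh + 1, x_q.shape[1] - kw + 1
        acc = np.zeros((oh, ow), dtype=np.int64)  # wide accumulator
        for i in range(oh):
            for j in range(ow):
                acc[i, j] = np.sum(
                    x_q[i:i + kh, j:j + kw].astype(np.int64) * w_q.astype(np.int64)
                )
        # acc carries scaling factor 2**(x_exp + w_exp); shift it into out_exp.
        shift = out_exp - (x_exp + w_exp)
        y = acc >> shift if shift >= 0 else acc << -shift
        qmax = 2 ** (out_bits - 1) - 1
        return np.clip(y, -qmax - 1, qmax)

A shift-based normalization of the accumulator (for example, subtracting an integer mean and rescaling by a power of two) can be applied on the same representation before the final conversion, consistent with the requirement that the normalizing include only fixed-point operations.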
PCT/US2018/014303 2017-01-25 2018-01-19 Réseau neuronal basé sur des opérations à virgule fixe WO2018140294A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710061333.9 2017-01-25
CN201710061333.9A CN108345939B (zh) 2017-01-25 2017-01-25 基于定点运算的神经网络

Publications (1)

Publication Number Publication Date
WO2018140294A1 true WO2018140294A1 (fr) 2018-08-02

Family

ID=61569403

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/014303 WO2018140294A1 (fr) 2017-01-25 2018-01-19 Réseau neuronal basé sur des opérations à virgule fixe

Country Status (2)

Country Link
CN (1) CN108345939B (fr)
WO (1) WO2018140294A1 (fr)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796244B (zh) * 2018-08-01 2022-11-08 上海天数智芯半导体有限公司 用于人工智能设备的核心计算单元处理器及加速处理方法
KR20200026455A (ko) * 2018-09-03 2020-03-11 삼성전자주식회사 인공 신경망 시스템 및 인공 신경망의 고정 소수점 제어 방법
US10331983B1 (en) * 2018-09-11 2019-06-25 Gyrfalcon Technology Inc. Artificial intelligence inference computing device
US20200117981A1 (en) * 2018-10-11 2020-04-16 International Business Machines Corporation Data representation for dynamic precision in neural network cores
US10387772B1 (en) * 2018-10-22 2019-08-20 Gyrfalcon Technology Inc. Ensemble learning based image classification systems
CN111126558B (zh) * 2018-10-31 2024-04-02 嘉楠明芯(北京)科技有限公司 一种卷积神经网络计算加速方法及装置、设备、介质
CN111191783B (zh) * 2018-11-15 2024-04-05 嘉楠明芯(北京)科技有限公司 一种自适应量化方法及装置、设备、介质
FR3089329A1 (fr) * 2018-11-29 2020-06-05 Stmicroelectronics (Rousset) Sas Procédé d’analyse d’un jeu de paramètres d’un réseau de neurones en vue d’obtenir une amélioration technique, par exemple un gain en mémoire.
CN109800859B (zh) * 2018-12-25 2021-01-12 深圳云天励飞技术有限公司 一种神经网络批归一化的优化方法及装置
CN109740733B (zh) * 2018-12-27 2021-07-06 深圳云天励飞技术有限公司 深度学习网络模型优化方法、装置及相关设备
CN109697083B (zh) * 2018-12-27 2021-07-06 深圳云天励飞技术有限公司 数据的定点化加速方法、装置、电子设备及存储介质
CN109508784B (zh) * 2018-12-28 2021-07-27 四川那智科技有限公司 一种神经网络激活函数的设计方法
CN109670582B (zh) * 2018-12-28 2021-05-07 四川那智科技有限公司 一种全定点化神经网络的设计方法
CN110222821B (zh) * 2019-05-30 2022-03-25 浙江大学 基于权重分布的卷积神经网络低位宽量化方法
CN112085187A (zh) * 2019-06-12 2020-12-15 安徽寒武纪信息科技有限公司 数据处理方法、装置、计算机设备和存储介质
CN112308216B (zh) * 2019-07-26 2024-06-18 杭州海康威视数字技术股份有限公司 数据块的处理方法、装置及存储介质
JP7294017B2 (ja) * 2019-09-13 2023-06-20 富士通株式会社 情報処理装置、情報処理方法および情報処理プログラム
CN110705696B (zh) * 2019-10-11 2022-06-28 阿波罗智能技术(北京)有限公司 神经网络的量化与定点化融合方法及装置
CN111027691B (zh) * 2019-12-25 2023-01-17 上海寒武纪信息科技有限公司 用于神经网络运算、训练的装置、设备及板卡
JP2021111081A (ja) * 2020-01-09 2021-08-02 富士通株式会社 情報処理装置、ニューラルネットワークの演算プログラム及びニューラルネットワークの演算方法
US11610128B2 (en) * 2020-03-31 2023-03-21 Amazon Technologies, Inc. Neural network training under memory restraint
CN113554159A (zh) * 2020-04-23 2021-10-26 意法半导体(鲁塞)公司 用于在集成电路中实现人工神经网络的方法和装置
WO2022007879A1 (fr) 2020-07-09 2022-01-13 北京灵汐科技有限公司 Procédé et appareil de configuration de précision de poids, dispositif informatique et support de stockage
CN111831354B (zh) * 2020-07-09 2023-05-16 北京灵汐科技有限公司 数据精度配置方法、装置、芯片、芯片阵列、设备及介质
CN111831355B (zh) * 2020-07-09 2023-05-16 北京灵汐科技有限公司 权重精度配置方法、装置、设备及存储介质
CN111831356B (zh) * 2020-07-09 2023-04-07 北京灵汐科技有限公司 权重精度配置方法、装置、设备及存储介质
CN113255901B (zh) * 2021-07-06 2021-10-08 上海齐感电子信息科技有限公司 实时量化方法及实时量化系统
CN117992578A (zh) * 2024-04-02 2024-05-07 淘宝(中国)软件有限公司 基于大语言模型处理数据的方法、大语言模型及电子设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102200787B (zh) * 2011-04-18 2013-04-17 重庆大学 机器人行为多层次集成学习方法及系统
US20150269481A1 (en) * 2014-03-24 2015-09-24 Qualcomm Incorporated Differential encoding in neural networks
CN105488563A (zh) * 2015-12-16 2016-04-13 重庆大学 面向深度学习的稀疏自适应神经网络、算法及实现装置
CN105760933A (zh) * 2016-02-18 2016-07-13 清华大学 卷积神经网络的逐层变精度定点化方法及装置

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102015007943A1 (de) * 2014-07-22 2016-01-28 Intel Corporation Mechanismen für eine Gewichtungsverschiebung in faltenden neuronalen Netzwerken
US20160328647A1 (en) * 2015-05-08 2016-11-10 Qualcomm Incorporated Bit width selection for fixed point neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN XI ET AL: "FxpNet: Training a deep convolutional neural network in fixed-point representation", 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), IEEE, 14 May 2017 (2017-05-14), pages 2494 - 2501, XP033112353, DOI: 10.1109/IJCNN.2017.7966159 *
JIANTAO QIU ET AL: "Going Deeper with Embedded FPGA Platform for Convolutional Neural Network", PROCEEDINGS OF THE 2016 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, FPGA '16, 1 January 2016 (2016-01-01), New York, New York, USA, pages 26 - 35, XP055423746, ISBN: 978-1-4503-3856-1, DOI: 10.1145/2847263.2847265 *
PHILIPP GYSEL ET AL: "HARDWARE-ORIENTED APPROXIMATION OF CONVOLUTIONAL NEURAL NETWORKS", 11 April 2016 (2016-04-11), XP055398866, Retrieved from the Internet <URL:https://arxiv.org/pdf/1604.03168v1.pdf> [retrieved on 20170816] *
SUYOG GUPTA ET AL: "Deep Learning with Limited Numerical Precision", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 9 February 2015 (2015-02-09), XP080677454 *

Cited By (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11755880B2 (en) * 2018-03-09 2023-09-12 Canon Kabushiki Kaisha Method and apparatus for optimizing and applying multilayer neural network model, and storage medium
US20190279072A1 (en) * 2018-03-09 2019-09-12 Canon Kabushiki Kaisha Method and apparatus for optimizing and applying multilayer neural network model, and storage medium
CN109165736A (zh) * 2018-08-08 2019-01-08 北京字节跳动网络技术有限公司 应用于卷积神经网络的信息处理方法和装置
CN109165736B (zh) * 2018-08-08 2023-12-12 北京字节跳动网络技术有限公司 应用于卷积神经网络的信息处理方法和装置
US11636319B2 (en) 2018-08-22 2023-04-25 Intel Corporation Iterative normalization for machine learning applications
EP3617954A1 (fr) * 2018-08-22 2020-03-04 INTEL Corporation Normalisation itérative pour des applications d'apprentissage machine
CN109284761A (zh) * 2018-09-04 2019-01-29 苏州科达科技股份有限公司 一种图像特征提取方法、装置、设备及可读存储介质
CN109284761B (zh) * 2018-09-04 2020-11-27 苏州科达科技股份有限公司 一种图像特征提取方法、装置、设备及可读存储介质
CN110929838A (zh) * 2018-09-19 2020-03-27 杭州海康威视数字技术股份有限公司 神经网络中位宽定点化方法、装置、终端和存储介质
CN110929838B (zh) * 2018-09-19 2023-09-26 杭州海康威视数字技术股份有限公司 神经网络中位宽定点化方法、装置、终端和存储介质
WO2020063715A1 (fr) * 2018-09-26 2020-04-02 Huawei Technologies Co., Ltd. Procédé et système de formation de poids quantifié binaire et fonction d'activation pour réseaux neuronaux profonds
CN110969217A (zh) * 2018-09-28 2020-04-07 杭州海康威视数字技术股份有限公司 基于卷积神经网络进行图像处理的方法和装置
CN110969217B (zh) * 2018-09-28 2023-11-17 杭州海康威视数字技术股份有限公司 基于卷积神经网络进行图像处理的方法和装置
CN112930543A (zh) * 2018-10-10 2021-06-08 利普麦德股份有限公司 神经网络处理装置、神经网络处理方法和神经网络处理程序
JP2020064635A (ja) * 2018-10-17 2020-04-23 三星電子株式会社Samsung Electronics Co.,Ltd. ニューラルネットワークのパラメータを量子化する方法及びその装置
CN111062475A (zh) * 2018-10-17 2020-04-24 三星电子株式会社 用于对神经网络的参数进行量化的方法和装置
JP7117280B2 (ja) 2018-10-17 2022-08-12 三星電子株式会社 ニューラルネットワークのパラメータを量子化する方法及びその装置
EP3640858A1 (fr) * 2018-10-17 2020-04-22 Samsung Electronics Co., Ltd. Procédé et appareil de quantification de paramètres d'un réseau neuronal
US11170534B2 (en) 2018-10-19 2021-11-09 Samsung Electronics Co., Ltd. Methods and apparatuses for performing artificial intelligence encoding and artificial intelligence decoding on image
US11616988B2 (en) 2018-10-19 2023-03-28 Samsung Electronics Co., Ltd. Method and device for evaluating subjective quality of video
US11720997B2 (en) 2018-10-19 2023-08-08 Samsung Electronics Co.. Ltd. Artificial intelligence (AI) encoding device and operating method thereof and AI decoding device and operating method thereof
US11688038B2 (en) 2018-10-19 2023-06-27 Samsung Electronics Co., Ltd. Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image
US11663747B2 (en) 2018-10-19 2023-05-30 Samsung Electronics Co., Ltd. Methods and apparatuses for performing artificial intelligence encoding and artificial intelligence decoding on image
US11647210B2 (en) 2018-10-19 2023-05-09 Samsung Electronics Co., Ltd. Methods and apparatuses for performing encoding and decoding on image
EP3811619A4 (fr) * 2018-10-19 2021-08-18 Samsung Electronics Co., Ltd. Appareil de codage d'ia et son procédé de fonctionnement, et appareil de décodage d'ia et son procédé de fonctionnement
WO2020080827A1 (fr) 2018-10-19 2020-04-23 Samsung Electronics Co., Ltd. Appareil de codage d'ia et son procédé de fonctionnement, et appareil de décodage d'ia et son procédé de fonctionnement
US11170472B2 (en) 2018-10-19 2021-11-09 Samsung Electronics Co., Ltd. Method and apparatus for streaming data
US11748847B2 (en) 2018-10-19 2023-09-05 Samsung Electronics Co., Ltd. Method and apparatus for streaming data
US11170473B2 (en) 2018-10-19 2021-11-09 Samsung Electronics Co., Ltd. Method and apparatus for streaming data
US20210358083A1 (en) 2018-10-19 2021-11-18 Samsung Electronics Co., Ltd. Method and apparatus for streaming data
US20200126185A1 (en) 2018-10-19 2020-04-23 Samsung Electronics Co., Ltd. Artificial intelligence (ai) encoding device and operating method thereof and ai decoding device and operating method thereof
US11288770B2 (en) 2018-10-19 2022-03-29 Samsung Electronics Co., Ltd. Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image
US11190782B2 (en) 2018-10-19 2021-11-30 Samsung Electronics Co., Ltd. Methods and apparatuses for performing encoding and decoding on image
US11200702B2 (en) 2018-10-19 2021-12-14 Samsung Electronics Co., Ltd. AI encoding apparatus and operation method of the same, and AI decoding apparatus and operation method of the same
CN111144560A (zh) * 2018-11-05 2020-05-12 杭州海康威视数字技术股份有限公司 一种深度神经网络运算方法及装置
CN111144560B (zh) * 2018-11-05 2024-02-02 杭州海康威视数字技术股份有限公司 一种深度神经网络运算方法及装置
US11995532B2 (en) * 2018-12-05 2024-05-28 Arm Limited Systems and devices for configuring neural network circuitry
CN111353517A (zh) * 2018-12-24 2020-06-30 杭州海康威视数字技术股份有限公司 一种车牌识别方法、装置及电子设备
CN111353517B (zh) * 2018-12-24 2023-09-26 杭州海康威视数字技术股份有限公司 一种车牌识别方法、装置及电子设备
EP3686808A1 (fr) * 2019-01-23 2020-07-29 StradVision, Inc. Procédé et dispositif permettant de transformer des couches de cnn afin d'optimiser la quantification de paramètres de cnn à utiliser pour des dispositifs mobiles ou des réseaux compacts à grande précision par l'intermédiaire de l'optimisation de matériel
CN109800877A (zh) * 2019-02-20 2019-05-24 腾讯科技(深圳)有限公司 神经网络的参数调整方法、装置及设备
CN109800877B (zh) * 2019-02-20 2022-12-30 腾讯科技(深圳)有限公司 神经网络的参数调整方法、装置及设备
CN111723901A (zh) * 2019-03-19 2020-09-29 百度在线网络技术(北京)有限公司 神经网络模型的训练方法及装置
CN111723901B (zh) * 2019-03-19 2024-01-12 百度在线网络技术(北京)有限公司 神经网络模型的训练方法及装置
CN110110852A (zh) * 2019-05-15 2019-08-09 电科瑞达(成都)科技有限公司 一种深度学习网络移植到fpag平台的方法
JP7167405B2 (ja) 2019-06-12 2022-11-09 寒武紀(西安)集成電路有限公司 ニューラルネットワークにおける量子化パラメータの確定方法および関連製品
JP2021179966A (ja) * 2019-06-12 2021-11-18 シャンハイ カンブリコン インフォメーション テクノロジー カンパニー リミテッドShanghai Cambricon Information Technology Co., Ltd. ニューラルネットワークにおける量子化パラメータの確定方法および関連製品
CN110378470B (zh) * 2019-07-19 2023-08-18 Oppo广东移动通信有限公司 神经网络模型的优化方法、装置以及计算机存储介质
CN112561028A (zh) * 2019-09-25 2021-03-26 华为技术有限公司 训练神经网络模型的方法、数据处理的方法及装置
US11922316B2 (en) 2019-10-15 2024-03-05 Lg Electronics Inc. Training a neural network using periodic sampling over model weights
WO2021075735A1 (fr) * 2019-10-15 2021-04-22 Lg Electronics Inc. Formation d'un réseau neuronal à l'aide d'un échantillonnage périodique sur des poids modèles
US11720998B2 (en) 2019-11-08 2023-08-08 Samsung Electronics Co., Ltd. Artificial intelligence (AI) encoding apparatus and operating method thereof and AI decoding apparatus and operating method thereof
CN111144564A (zh) * 2019-12-25 2020-05-12 上海寒武纪信息科技有限公司 对神经网络执行训练的设备及其集成电路板卡
CN111368978B (zh) * 2020-03-02 2023-03-24 开放智能机器(上海)有限公司 一种离线量化工具的精度提升方法
CN111368978A (zh) * 2020-03-02 2020-07-03 开放智能机器(上海)有限公司 一种离线量化工具的精度提升方法
CN113468935A (zh) * 2020-05-08 2021-10-01 上海齐感电子信息科技有限公司 人脸识别方法
CN113468935B (zh) * 2020-05-08 2024-04-02 上海齐感电子信息科技有限公司 人脸识别方法
CN113673664B (zh) * 2020-05-14 2023-09-12 杭州海康威视数字技术股份有限公司 数据溢出检测方法、装置、设备及存储介质
CN113673664A (zh) * 2020-05-14 2021-11-19 杭州海康威视数字技术股份有限公司 数据溢出检测方法、装置、设备及存储介质
WO2022009449A1 (fr) * 2020-07-10 2022-01-13 富士通株式会社 Dispositif, procédé et programme de traitement d'informations
WO2022009433A1 (fr) * 2020-07-10 2022-01-13 富士通株式会社 Dispositif, procédé et programme de traitement d'informations
CN113780523A (zh) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 图像处理方法、装置、终端设备及存储介质
CN113780523B (zh) * 2021-08-27 2024-03-29 深圳云天励飞技术股份有限公司 图像处理方法、装置、终端设备及存储介质
WO2023115814A1 (fr) * 2021-12-22 2023-06-29 苏州浪潮智能科技有限公司 Architecture matérielle fpga, procédé de traitement de données associé et support de stockage
CN114492779A (zh) * 2022-02-16 2022-05-13 安谋科技(中国)有限公司 神经网络模型的运行方法、可读介质和电子设备

Also Published As

Publication number Publication date
CN108345939A (zh) 2018-07-31
CN108345939B (zh) 2022-05-24

Similar Documents

Publication Publication Date Title
WO2018140294A1 (fr) Réseau neuronal basé sur des opérations à virgule fixe
US11568258B2 (en) Operation method
Zhou et al. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients
US10929744B2 (en) Fixed-point training method for deep neural networks based on dynamic fixed-point conversion scheme
US10678508B2 (en) Accelerated quantized multiply-and-add operations
US10096134B2 (en) Data compaction and memory bandwidth reduction for sparse neural networks
KR20190050698A (ko) 신경망의 최적화 방법
CN113424202A (zh) 针对神经网络训练调整激活压缩
EP3816873A1 (fr) Dispositif de circuit de réseau neuronal, procédé de traitement de réseau neuronal et programme d'exécution de réseau neuronal
CN107944545B (zh) 应用于神经网络的计算方法及计算装置
CN113273082A (zh) 具有异常块浮点的神经网络激活压缩
WO2020154083A1 (fr) Compression d'activation de réseau neuronal avec des mantisses non uniformes
CN112673383A (zh) 神经网络核中动态精度的数据表示
CN111026544B (zh) 图网络模型的节点分类方法、装置及终端设备
Zhou et al. Deep learning binary neural network on an FPGA
CN113826122A (zh) 人工神经网络的训练
US11704556B2 (en) Optimization methods for quantization of neural network models
Choi et al. Retrain-less weight quantization for multiplier-less convolutional neural networks
CN113994347A (zh) 用于负值和正值的非对称缩放因子支持的系统和方法
CN114450891A (zh) 利用误差校正码的二进制神经元和二进制神经网络的设计和训练
US20240104342A1 (en) Methods, systems, and media for low-bit neural networks using bit shift operations
CN114444686A (zh) 一种卷积神经网络的模型参数量化方法、装置及相关装置
CN111126557A (zh) 神经网络量化、应用方法、装置和计算设备
Scanlan Low power & mobile hardware accelerators for deep convolutional neural networks
US20220405576A1 (en) Multi-layer neural network system and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18709181

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18709181

Country of ref document: EP

Kind code of ref document: A1