WO2020240687A1 - Calculation processing method, calculation processing device, and program - Google Patents

Calculation processing method, calculation processing device, and program

Info

Publication number
WO2020240687A1
WO2020240687A1 PCT/JP2019/021060
Authority
WO
WIPO (PCT)
Prior art keywords
output value
output
layer
neurons
quantization
Prior art date
Application number
PCT/JP2019/021060
Other languages
English (en)
Japanese (ja)
Inventor
圭一 黒川
中江 達哉
伸也 高前田
由華 大羽
Original Assignee
株式会社ソシオネクスト
国立大学法人北海道大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社ソシオネクスト and 国立大学法人北海道大学
Priority to PCT/JP2019/021060 priority Critical patent/WO2020240687A1/fr
Publication of WO2020240687A1 publication Critical patent/WO2020240687A1/fr


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • the present invention relates to an arithmetic processing method, an arithmetic processing device, and a program.
  • a neural network is generally composed of a plurality of layers, each including one or more neurons.
  • in recent years, neural networks composed of many layers have also become known, and high performance has been obtained in various tasks such as image recognition and voice recognition.
  • the calculation cost of the neural network is generally high.
  • Non-Patent Document 1 and Non-Patent Document 2 propose quantization methods. For example, in Non-Patent Document 1, the calculation cost of the neural network is reduced by quantizing the output value of each neuron and expressing it in 1 bit.
  • with such quantization methods, the calculation cost is reduced, but the performance of tasks such as image recognition and voice recognition (for example, image classification accuracy and voice recognition accuracy) may deteriorate. That is, while the calculation cost can be reduced because the output value of each neuron can be expressed with a small bit length, information in the output value is lost by the quantization, so the task performance may deteriorate.
  • the embodiment of the present invention has been made in view of the above points, and an object thereof is to suppress a deterioration in performance while reducing the calculation cost of the neural network.
  • the arithmetic processing method is an arithmetic processing method of a neural network including K layers (K is an integer of 2 or more). For the k-th layer of the neural network, a computer executes: an output value calculation procedure for calculating the output values of the plurality of neurons included in the k-th layer; and a quantization procedure for quantizing, using the output values of two or more of the plurality of neurons included in the k-th layer (where k ≠ K-1), the output value of one of the two or more neurons.
  • in the present embodiment, a calculation method capable of suppressing the performance deterioration due to quantization, while reducing the calculation cost by quantizing the output value of each neuron of the neural network, will be described.
  • the parameters of the neural network (for example, weighting coefficients and biases) are assumed to have been learned in advance.
  • FIG. 1 is a diagram for explaining an example of a conventional calculation method.
  • the case will be described where the neural network is composed of a total of three layers from the 0th layer to the 2nd layer, each layer other than the output layer (2nd layer) is a fully connected layer, and each layer includes four neurons.
  • the neuron may be referred to as a "node" or the like.
  • in the input layer (0th layer), the input values to the neural network are input to the n-th neuron N_n^0, and the output value M_n^0 is calculated.
  • in the conventional method, each output value is then quantized (that is, binarized) by itself to either "+1" or "-1". For example, the output value M_0^0 is binarized to calculate the quantized output value B_0^0, and the output value M_1^0 is binarized to calculate the quantized output value B_1^0.
  • the same applies to n = 2 and 3. Next, in the 1st layer, the output value M_n^1 of each neuron is calculated, and the quantized output value B_n^1 is calculated from the output value M_n^1.
  • the parameter of the neural network is only the weighting coefficient, but there may be a bias as the parameter of the neural network.
  • the bias is added to the above equations (1) to (3), respectively.
  • the sum of products of the output value (or the input value to the neural network) from the previous layer and the weighting coefficient is used as the output value to the next layer.
  • the output value of a predetermined activation function with the sum of products (or the value obtained by adding a bias to the sum of products) as an input may be used as the output value to the next layer.
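  • for illustration, the conventional calculation flow described above (sum of products per neuron, then independent 1-bit binarization of every output except the output layer's) can be sketched as follows. The layer sizes, weights, and the sign-based binarizer are illustrative assumptions, not the patent's equations (1) to (3).

```python
def binarize(m):
    """Quantize a single output value to +1 or -1 (1-bit representation)."""
    return 1.0 if m >= 0.0 else -1.0

def dense_outputs(inputs, weights):
    """Sum of products of the previous layer's outputs and the weighting
    coefficients (one row of weights per neuron in the current layer)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def conventional_forward(x, layer_weights):
    """Binarize every neuron's output independently, except in the output layer."""
    a = x
    for i, weights in enumerate(layer_weights):
        m = dense_outputs(a, weights)
        if i < len(layer_weights) - 1:  # the output layer is not quantized
            a = [binarize(v) for v in m]
        else:
            a = m
    return a
```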
  • FIG. 2 is a diagram for explaining an example of the calculation method of the present embodiment.
  • the neural network is composed of a total of three layers from the 0th layer to the second layer, and each layer other than the output layer (second layer) is a fully connected layer. Moreover, the case where each layer contains four neurons will be described. Further, the meaning of each symbol shall be the same as in FIG.
  • first, the output value M_n^0 of the n-th neuron N_n^0 is calculated by the above equation (1).
  • next, the quantized output value B_n^0 is calculated using a set of two output values. More specifically, the quantized output value B_0^0 is calculated using the output value M_0^0 and the output value M_1^0. Similarly, the quantized output value B_1^0 is calculated using the output values M_1^0 and M_2^0, and the quantized output value B_2^0 is calculated using the output values M_2^0 and M_3^0. Likewise, the quantized output value B_3^0 is calculated using the output values M_3^0 and M_0^0.
  • next, in the 1st layer, the output value M_n^1 of each neuron is calculated.
  • then, the quantized output value B_n^1 is calculated using a set of two output values. More specifically, the quantized output value B_0^1 is calculated using the output value M_0^1 and the output value M_1^1. Similarly, the quantized output value B_1^1 is calculated using the output values M_1^1 and M_2^1, and the quantized output value B_2^1 is calculated using the output values M_2^1 and M_3^1. Likewise, the quantized output value B_3^1 is calculated using the output values M_3^1 and M_0^1.
  • the output value after quantization is calculated using the set of two output values. As a result, as will be described later, it is possible to suppress the deterioration of the performance of the neural network while maintaining the reduction of the calculation cost due to the quantization.
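  • the pairing of FIG. 2 can be sketched as follows: each quantized output value B_n is calculated from the set (M_n, M_{(n+1) mod N}), so the last neuron pairs with the first. The sign-of-difference quantizer is an illustrative assumption; as described later, any quantization method may be applied to the set.

```python
def quantize_pairwise(m):
    """Return quantized outputs B_0..B_{N-1} from raw outputs M_0..M_{N-1}."""
    n_count = len(m)
    b = []
    for n in range(n_count):
        p, q = n, (n + 1) % n_count          # the pair (M_n, M_{n+1}), wrapping around
        d = m[p] - m[q]                      # difference value of the pair
        b.append(1.0 if d >= 0.0 else -1.0)  # illustrative 1-bit quantizer
    return b
```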
  • in the above, the number of layers of the neural network is 3 and the number of neurons included in each layer is 4, but the number of layers of the neural network may be any number, and the number of neurons included in each layer may also be any number.
  • FIG. 3 is a diagram showing an example of the overall configuration of the arithmetic processing unit 10 according to the present embodiment.
  • the arithmetic processing unit 10 has an arithmetic processing unit 100 and a storage unit 200.
  • the arithmetic processing unit 100 executes the arithmetic processing of the neural network by the arithmetic method of the present embodiment described above.
  • the storage unit 200 stores various data necessary for executing the neural network arithmetic processing by the calculation method of the present embodiment (for example, learned neural network parameters such as weighting coefficients, and input values to the neural network).
  • the storage unit 200 also stores the output value of each layer calculated during the execution of the arithmetic processing of the neural network.
  • the arithmetic processing unit 100 includes an output calculation unit 110 and a quantization unit 120.
  • the output calculation unit 110 calculates the output value of each neuron included in each layer of the neural network by, for example, the above equations (1) to (3).
  • the quantization unit 120 calculates the output value after quantization using the set of output values calculated by the output calculation unit 110.
  • FIG. 4 is a diagram showing an example of the hardware configuration of the arithmetic processing unit 10 according to the present embodiment.
  • the arithmetic processing unit 10 includes an input device 301, a display device 302, a processor 303, a memory device 304, an external I/F 305, and a communication I/F 306. Each of these pieces of hardware is communicably connected via a bus 307.
  • the input device 301 is, for example, a keyboard, a mouse, a touch panel, or the like.
  • the display device 302 is, for example, a display or the like.
  • the arithmetic processing unit 10 does not have to have at least one of the input device 301 and the display device 302.
  • the processor 303 is, for example, various arithmetic units such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and an FPGA (Field-Programmable Gate Array).
  • the memory device 304 is various storage devices such as a RAM (Random Access Memory), a ROM (Read Only Memory), an HDD (Hard Disk Drive), and an SSD (Solid State Drive).
  • the arithmetic processing unit 100 is realized, for example, by the processor 303 executing one or more programs stored in the memory device 304 or the like. Further, the storage unit 200 can be realized by using, for example, the memory device 304 or the like.
  • the external I / F 305 is an interface with the recording medium 305a.
  • Examples of the recording medium 305a include a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), a USB (Universal Serial Bus) memory card, and the like.
  • the communication I / F 306 is an interface for connecting the arithmetic processing unit 10 to the communication network.
  • since the arithmetic processing unit 10 according to the present embodiment has the hardware configuration shown in FIG. 4, it can realize the arithmetic processing of the neural network by the calculation method of the present embodiment.
  • FIG. 4 shows a case where the arithmetic processing unit 10 according to the present embodiment is realized by one device (computer), but the present invention is not limited to this.
  • the arithmetic processing unit 10 according to the present embodiment may be realized by a plurality of devices (computers). Further, one device (computer) may include a plurality of processors 303 and a plurality of memory devices 304.
  • the arithmetic processing unit 100 initializes the index k representing each layer of the neural network to 0 (step S101).
  • the arithmetic processing unit 100 calculates the output of the kth layer (that is, the output value of each neuron included in the kth layer) by the output calculation unit 110 (step S102). Details of the processing in this step will be described later.
  • the arithmetic processing unit 100 quantizes the output of the kth layer by the quantization unit 120 (step S103). Details of the processing in this step will be described later.
  • step S103 does not necessarily have to be executed after the process of step S102 is completed; the calculation of the corresponding quantized output value may be executed as soon as the output values required for the quantization have been calculated. For example, as soon as the output values M_n^k and M_{n+1}^k are calculated, the quantized output value B_n^k may be calculated in step S103.
  • FIG. 6 is a flowchart showing an example of the process of calculating the output of the kth layer.
  • the output calculation unit 110 determines whether or not there is a neuron whose output value has not been calculated among the neurons in the kth layer (step S201).
  • if it is determined in step S201 that there are neurons whose output value has not yet been calculated, the output calculation unit 110 selects one neuron from these neurons (step S202).
  • the output calculation unit 110 calculates the output value of the selected neuron (step S203). At this time, the output calculation unit 110 calculates the output value of the selected neuron by any of the following according to the value of k. The output value of the neuron calculated by the output calculation unit 110 is stored in the storage unit 200.
  • when k = 0 (that is, the input layer), the output calculation unit 110 calculates the output value M_n^0 of the selected neuron N_n^0 by the following equation (4).
  • when 1 ≤ k ≤ K-2 (that is, an intermediate layer), the output calculation unit 110 calculates the output value M_n^k of the selected neuron N_n^k by the following equation (5).
  • when k = K-1 (that is, the output layer), the output calculation unit 110 calculates the output value O_n of the selected neuron N_n^{K-1} by the following equation (6).
  • the output calculation unit 110 then returns to step S201.
  • steps S202 to S203 are repeatedly executed until the output values of all the neurons in the k-th layer are calculated.
  • on the other hand, when it is determined in step S201 that there are no neurons whose output value has not been calculated (that is, when the output values of all the neurons included in the k-th layer have been calculated), the output calculation unit 110 ends the process of calculating the output of the k-th layer.
  • step S203 may be executed in parallel for a plurality of neurons. That is, for example, the output calculation unit 110 may select a plurality of neurons in step S202 and then calculate the output values of these plurality of neurons in parallel in step S203.
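  • the per-neuron calculation of steps S201 to S203 can be sketched as follows for a fully connected layer. The weight layout and the optional bias are assumptions for illustration; the concrete equations (4) to (6) are not reproduced in this text.

```python
def layer_output(prev_values, weights, bias=None):
    """Output values of one layer: for each neuron n, the sum of products of
    the previous layer's (quantized) outputs and the weighting coefficients,
    plus an optional per-neuron bias."""
    outs = []
    for n, row in enumerate(weights):
        m = sum(w * x for w, x in zip(row, prev_values))
        if bias is not None:
            m += bias[n]
        outs.append(m)
    return outs
```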
  • FIG. 7 is a flowchart showing an example of the process of quantizing the output of the kth layer. In the following description, it is assumed that the output values of all the neurons in the kth layer are calculated.
  • first, the quantization unit 120 selects a set of output values from the output of the k-th layer (that is, the output values of the neurons in the k-th layer) (step S301).
  • next, the quantization unit 120 calculates the quantized output value from the set of output values selected in step S301 (step S302). Details of the processing in this step will be described later.
  • the quantization unit 120 initializes the index n representing the neuron to 0 (step S303).
  • next, the quantization unit 120 selects the set of output values (M_n^k, M_{n+1}^k) from the output of the k-th layer (step S305).
  • then, the quantization unit 120 calculates the quantized output value B_n^k from the set of output values selected in step S305 (step S306). Details of the processing in this step will be described later.
  • on the other hand, when it is determined in step S304 that n > N_k - 2, the quantization unit 120 ends the process of quantizing the output of the k-th layer.
  • step S306 may be executed in parallel for a plurality of sets of output values. That is, for example, the quantization unit 120 may select a plurality of sets of output values in the above step S305, and then calculate the quantized output values from these sets in parallel in the above step S306.
  • FIG. 8 is a flowchart showing an example of a process of calculating the output value after quantization from the set of output values.
  • in the following, the case where the quantized output value B_p^k is calculated from the set of output values (M_p^k, M_q^k) will be described. Here, q = p + 1 when p < N_k - 1, and (p, q) = (N_k - 1, 0) for the wrap-around pair.
  • first, the quantization unit 120 calculates the difference value D_p^k between the output value M_p^k and the output value M_q^k included in the set of output values (M_p^k, M_q^k) (step S401).
  • next, the quantization unit 120 calculates the quantized output value B_p^k from the difference value D_p^k by an arbitrary quantization method (step S402). The quantized output value B_p^k is stored in the storage unit 200.
  • any method can be used as the quantization method.
  • the quantization includes converting an input value represented by a first number of values into an output value represented by a second number of values less than the first number.
  • the quantization may include non-linear quantization in addition to linear quantization, and includes, for example, logarithmic quantization.
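  • the two-step flow of FIG. 8 (difference value in step S401, then an arbitrary quantizer in step S402) can be sketched as follows; the default sign quantizer is an illustrative placeholder for "an arbitrary quantization method".

```python
def difference_value(m_p, m_q):
    """Step S401: the difference value D_p = M_p - M_q of the selected pair."""
    return m_p - m_q

def quantize_difference(m_p, m_q, quantizer=lambda d: 1 if d >= 0 else -1):
    """Step S402: apply an arbitrary quantizer to the difference value.
    The default 1-bit sign quantizer is an illustrative assumption."""
    return quantizer(difference_value(m_p, m_q))
```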
  • (Quantization method 1) For example, quantization is performed based on the quantization method described in the literature "Sergey Ioffe, Christian Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", arXiv:1502.03167v3 [cs.LG]". In this method, the training data included in a mini-batch at the time of learning are normalized by the following equation (7), using the mean value μ, the variance σ^2, and the parameters γ and β acquired by learning.
  • the mean value μ and the variance σ^2 are in principle calculated at the time of inference, but since the calculation cost is high, the values calculated at the time of learning are averaged (that is, the mean of the mean values μ calculated in mini-batch units, and the mean of the variances σ^2 calculated in mini-batch units) and are often used at the time of inference. Therefore, also in this embodiment, the mean value μ and the variance σ^2 are assumed to be the averaged values calculated at the time of learning.
  • the constants a and b are used, respectively.
  • in this case, the quantization unit 120 calculates the quantized output value B_p^k according to the following Step 1-1 to Step 1-2.
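  • since Step 1-1 and Step 1-2 themselves are not reproduced in this text, the following is only a hedged sketch of quantization method 1: the fixed inference-time statistics (μ, σ^2) and learned parameters (γ, β) are folded into constants a and b, the difference value is normalized as y = a·D + b, and the normalized value is binarized. The folding and the final sign step are assumptions.

```python
import math

def bn_constants(mu, var, gamma, beta, eps=1e-5):
    """Fold fixed inference-time statistics into y = a*x + b."""
    a = gamma / math.sqrt(var + eps)
    b = beta - a * mu
    return a, b

def bn_quantize(d, a, b):
    """Normalize the difference value and binarize the result."""
    y = a * d + b
    return 1.0 if y >= 0.0 else -1.0
```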
  • (Quantization method 2) For example, quantization may be performed by the logarithmic quantization method described in Non-Patent Document 2 described above. In this case, the quantization unit 120 calculates the quantized output value B_p^k by the following Step 2-1 to Step 2-2.
  • here, Round represents a rounding function.
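  • likewise, Step 2-1 and Step 2-2 are not reproduced here; the following is a hedged sketch of logarithmic quantization in the style of the cited literature, encoding the difference value by its sign and the rounded base-2 logarithm of its magnitude. The clipping range and the zero handling are illustrative assumptions.

```python
import math

def log_quantize(d, min_exp=-4, max_exp=4):
    """Return (sign, exponent) with exponent = Round(log2(|d|)), clipped."""
    if d == 0.0:
        return 0, min_exp
    sign = 1 if d > 0 else -1
    exp = round(math.log2(abs(d)))
    return sign, max(min_exp, min(max_exp, exp))

def log_dequantize(sign, exp):
    """Reconstruct the quantized value as sign * 2**exponent."""
    return sign * (2.0 ** exp)
```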
  • after the quantized output value B_p^k has been calculated, the output value M_p^k may be deleted from the storage unit 200 if it is no longer necessary for calculating the remaining quantized output values. Similarly, after the quantized output values that use the output value M_0^k have been calculated, the output value M_0^k may be deleted from the storage unit 200.
  • FIG. 9 is a flowchart showing another example of the process of quantizing the output of the kth layer.
  • the process of quantizing the output of the k-th layer shown in FIG. 9 is configured to perform the process of quantizing the output of the k-th layer described in FIG. 7 efficiently by using the remainder operator "%".
  • the quantization unit 120 initializes the index n representing the neuron to 0 (step S501).
  • when it is determined in step S502 that n ≤ N_k - 1, the quantization unit 120 substitutes n % N_k for p and (n + 1) % N_k for q (step S503). Here, % is the remainder operator. That is, n % N_k represents "the remainder of n divided by N_k", and (n + 1) % N_k represents "the remainder of n + 1 divided by N_k".
  • next, the quantization unit 120 selects the set of output values (M_p^k, M_q^k) from the output of the k-th layer (step S504).
  • then, the quantization unit 120 calculates the quantized output value B_p^k from the set of output values selected in step S504 (step S505).
  • the details of the processing in this step are as described with reference to FIG. 8.
  • on the other hand, when it is determined in step S502 that n > N_k - 1, the quantization unit 120 ends the process of quantizing the output of the k-th layer.
  • this makes it possible, for example, to delete the output value M_0^k from the storage unit 200 once the quantized output values that use it have been calculated.
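  • the modulo-based loop of FIG. 9 can be sketched as follows: p = n % N_k and q = (n + 1) % N_k select every pair, including the wrap-around pair, in a single loop. The sign quantizer is an illustrative assumption.

```python
def quantize_layer_mod(m):
    """Quantize a whole layer using modulo-selected pairs, as in FIG. 9."""
    n_count = len(m)
    b = [None] * n_count
    for n in range(n_count):
        p = n % n_count            # "remainder of n divided by N_k"
        q = (n + 1) % n_count      # "remainder of n+1 divided by N_k"
        d = m[p] - m[q]            # difference value of the selected pair
        b[p] = 1.0 if d >= 0.0 else -1.0
    return b
```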
  • a modification 1 of the calculation method of the present embodiment will be described with reference to FIG.
  • FIG. 10 is a diagram for explaining an example of the calculation method of the modification 1 of the present embodiment.
  • the neural network is composed of a total of three layers from the 0th layer to the second layer, and each layer other than the output layer (second layer) is a fully connected layer. Moreover, the case where each layer contains four neurons will be described. Further, the meaning of each symbol shall be the same as in FIG.
  • first, the output value M_n^0 of the n-th neuron N_n^0 is calculated by the above equation (1).
  • next, each quantized output value B_n^0 other than B_3^0 (that is, n = 0, 1, 2) is calculated using a set of two output values. More specifically, the quantized output value B_0^0 is calculated using the output values M_0^0 and M_1^0. Similarly, the quantized output value B_1^0 is calculated using the output values M_1^0 and M_2^0, and the quantized output value B_2^0 is calculated using the output values M_2^0 and M_3^0. On the other hand, the quantized output value B_3^0 is calculated using only the output value M_3^0.
  • next, in the 1st layer, the output value M_n^1 of each neuron is calculated.
  • then, each quantized output value B_n^1 other than B_3^1 (that is, n = 0, 1, 2) is calculated using a set of two output values. More specifically, the quantized output value B_0^1 is calculated using the output values M_0^1 and M_1^1. Similarly, the quantized output value B_1^1 is calculated using the output values M_1^1 and M_2^1, and the quantized output value B_2^1 is calculated using the output values M_2^1 and M_3^1. On the other hand, the quantized output value B_3^1 is calculated using only the output value M_3^1.
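  • modification 1 can be sketched as follows: every neuron except the last is quantized from a pair of output values, while the last neuron is quantized using only its own output value, so no wrap-around pair is needed. The sign quantizer is an illustrative assumption.

```python
def quantize_layer_mod1(m):
    """Quantize a layer as in modification 1 (FIG. 10 / FIG. 11)."""
    n_count = len(m)
    b = []
    for n in range(n_count - 1):
        d = m[n] - m[n + 1]                      # pair (M_n, M_{n+1})
        b.append(1.0 if d >= 0.0 else -1.0)
    b.append(1.0 if m[-1] >= 0.0 else -1.0)      # last neuron: its value alone
    return b
```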
  • FIG. 11 is a flowchart showing a modification 1 of the process of quantizing the output of the kth layer.
  • since steps S601 to S605 are the same as steps S303 to S307 of FIG. 7, their description is omitted.
  • in this case, the quantization unit 120 may perform the quantization by, for example, the above-mentioned quantization method 1 or quantization method 2.
  • FIG. 12 is a diagram for explaining an example of the calculation method of the second modification of the present embodiment.
  • the neural network is composed of a total of three layers from the 0th layer to the second layer, and each layer other than the output layer (second layer) is a fully connected layer. Moreover, the case where each layer contains four neurons will be described. Further, the meaning of each symbol shall be the same as in FIG.
  • first, the output value M_n^0 of the n-th neuron N_n^0 is calculated by the above equation (1).
  • for example, the quantized output value B_3^0 is calculated using the output values M_3^0, M_0^0, and M_1^0.
  • next, in the 1st layer, the output value M_n^1 of each neuron is calculated.
  • the output value after quantization is calculated using the output values of three or more neurons (three neurons in the example shown in FIG. 12). As a result, it is possible to realize higher performance by the calculation method of the present embodiment while maintaining the reduction of the calculation cost by quantization.
  • FIG. 13 is a flowchart showing a modification 2 of the process of quantizing the output of the kth layer.
  • first, the quantization unit 120 selects a set of output values from the output of the k-th layer (that is, the output values of the neurons in the k-th layer).
  • the quantization unit 120 initializes the index n representing the neuron to 0 (step S705).
  • next, the quantization unit 120 selects the set of output values (M_n^k, M_{n+1}^k, M_{n+2}^k) from the output of the k-th layer (step S707).
  • then, the quantization unit 120 calculates the quantized output value B_n^k from the set of output values selected in step S707 (step S708). Details of the processing in this step will be described later.
  • on the other hand, when it is determined in step S706 that n > N_k - 3, the quantization unit 120 ends the process of quantizing the output of the k-th layer.
  • step S708 may be executed in parallel for a plurality of sets of output values. That is, for example, the quantization unit 120 may select a plurality of sets of output values in the above step S707, and then calculate the quantized output values from these sets in parallel in the above step S708.
  • in FIG. 13, the quantized output value is calculated using the output values of three neurons, but as described above, it may be calculated using the output values of any number J of neurons (J is an integer of 3 or more). In this case, a set of the output values of J neurons is selected as the set of output values.
  • FIG. 14 is a flowchart showing an example of a process of calculating the output value after quantization from the set of output values.
  • in the following, the case where the quantized output value B_p^k is calculated from the set of output values (M_p^k, M_q^k, M_r^k) will be described.
  • first, the quantization unit 120 calculates the difference value D_p^k between the output value M_p^k included in the set of output values (M_p^k, M_q^k, M_r^k) and the larger of the output values M_q^k and M_r^k (step S801).
  • next, similarly to step S402 of FIG. 8, the quantization unit 120 calculates the quantized output value B_p^k from the difference value D_p^k by an arbitrary quantization method (step S802).
  • in general, when a set of J output values is used, the difference between the output value M_p^k and the maximum of the remaining output values may be used as the difference value.
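  • the difference calculation of FIG. 14 can be sketched as follows: the difference value is taken between M_p and the larger of the two remaining output values. The sign quantizer standing in for "an arbitrary quantization method" is an illustrative assumption.

```python
def difference_value_3(m_p, m_q, m_r):
    """Step S801: D_p = M_p minus the maximum of the remaining output values."""
    return m_p - max(m_q, m_r)

def quantize_triple(m_p, m_q, m_r):
    """Step S802: quantize the difference value (illustrative sign quantizer)."""
    d = difference_value_3(m_p, m_q, m_r)
    return 1.0 if d >= 0.0 else -1.0
```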
  • FIG. 15 is a flowchart showing another example of the second modification of the process of quantizing the output of the kth layer.
  • this process of quantizing the output of the k-th layer is configured to perform the process of quantizing the output of the k-th layer described with reference to FIG. 13 and FIG. 14 efficiently by using the remainder operator "%".
  • the quantization unit 120 initializes the index n representing the neuron to 0 (step S901).
  • when it is determined in step S902 that n ≤ N_k - 1, the quantization unit 120 substitutes n % N_k for p, (n + 1) % N_k for q, and (n + 2) % N_k for r (step S903).
  • next, the quantization unit 120 selects the set of output values (M_p^k, M_q^k, M_r^k) from the output of the k-th layer (step S904).
  • then, the quantization unit 120 calculates the quantized output value B_p^k from the set of output values selected in step S904 (step S905).
  • the details of the processing in this step are as described with reference to FIG. 14.
  • on the other hand, when it is determined in step S902 that n > N_k - 1, the quantization unit 120 ends the process of quantizing the output of the k-th layer.
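  • the modulo-based loop of FIG. 15 can be sketched as follows, combining the triple selection p, q, r (taken modulo N_k) with the max-based difference of FIG. 14. The sign quantizer is an illustrative assumption.

```python
def quantize_layer_mod2(m):
    """Quantize a layer with modulo-selected triples, as in FIG. 15."""
    n_count = len(m)
    b = [None] * n_count
    for n in range(n_count):
        p, q, r = n % n_count, (n + 1) % n_count, (n + 2) % n_count
        d = m[p] - max(m[q], m[r])   # difference against the larger remaining value
        b[p] = 1.0 if d >= 0.0 else -1.0
    return b
```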
  • as described above, when quantizing the output values of the neurons in the input layer and the intermediate layers of the neural network, the arithmetic processing apparatus 10 according to the present embodiment performs the quantization using the output values of two neurons.
  • this makes it possible to suppress the deterioration of performance (for example, image classification accuracy and voice recognition accuracy) due to quantization while reducing the calculation cost.
  • further, when the arithmetic processing device 10 according to the present embodiment quantizes the output values of the neurons in the input layer and the intermediate layers of the neural network, it is also possible to quantize the output value of a specific neuron using only that output value. As a result, memory can be used more efficiently and the calculation cost can be reduced.
  • further, when quantizing the output values of the neurons in the input layer and the intermediate layers of the neural network, it is also possible for the arithmetic processing device 10 to perform the quantization using the output values of three or more neurons. This makes it possible to further suppress the deterioration of performance due to quantization.
  • modification 1 and modification 2 can be used in combination. That is, when the output values of each neuron in the input layer and the intermediate layer of the neural network are quantized, the output value of a specific neuron is quantized using only the output value, and the output values of other neurons are quantized. It is also possible to quantize using the output values of three or more neurons.
  • in the present embodiment, the case where each layer of the neural network is a fully connected layer has been described, but the present invention is not limited to this.
  • the present embodiment is similarly applicable to various neural networks such as convolutional neural networks.
  • further, the arithmetic processing unit 10 is applicable not only to the calculation of neural networks used for image recognition and voice recognition but also to the calculation of neural networks applied to any other task.
  • 10 Arithmetic processing device, 100 Arithmetic processing unit, 110 Output calculation unit, 120 Quantization unit, 200 Storage unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A calculation processing method for a neural network including K layers (K being an integer of at least 2), the calculation processing method being characterized in that, for a layer k (k = 0, ..., K-1) of the neural network, a computer executes: an output value calculation procedure for calculating each of the output values of a plurality of neurons included in layer k using, as input, the input value to the neural network if k = 0, or the output value of layer (k-1) if k ≠ 0; and a quantization procedure for quantizing the output value of one of at least two neurons among the plurality of neurons included in layer k using the output values of the two or more neurons (provided that k ≠ K-1).
PCT/JP2019/021060 2019-05-28 2019-05-28 Procédé de traitement de calcul, dispositif de traitement de calcul et programme WO2020240687A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/021060 WO2020240687A1 (fr) 2019-05-28 2019-05-28 Procédé de traitement de calcul, dispositif de traitement de calcul et programme

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/021060 WO2020240687A1 (fr) 2019-05-28 2019-05-28 Procédé de traitement de calcul, dispositif de traitement de calcul et programme

Publications (1)

Publication Number Publication Date
WO2020240687A1 true WO2020240687A1 (fr) 2020-12-03

Family

ID=73552093

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/021060 WO2020240687A1 (fr) 2019-05-28 2019-05-28 Procédé de traitement de calcul, dispositif de traitement de calcul et programme

Country Status (1)

Country Link
WO (1) WO2020240687A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08272759A (ja) * 1995-03-22 1996-10-18 Cselt Spa (Cent Stud E Lab Telecomun) 相関信号処理用ニューラルネットワークの実行スピードアップの方法
US20180341857A1 (en) * 2017-05-25 2018-11-29 Samsung Electronics Co., Ltd. Neural network method and apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08272759A (ja) * 1995-03-22 1996-10-18 Cselt Spa (Cent Stud E Lab Telecomun) 相関信号処理用ニューラルネットワークの実行スピードアップの方法
US20180341857A1 (en) * 2017-05-25 2018-11-29 Samsung Electronics Co., Ltd. Neural network method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
OHBA, YUKA ET AL.: "Study of Hardware- Oriented High-Precision Model Based on Binarized Neural Network", IEICE TECHNICAL REPORT., vol. 118, no. 63, 2018, pages 21 - 26 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931319

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19931319

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP