TWI672643B - Full index operation method for deep neural networks, computer devices, and computer readable recording media - Google Patents


Info

Publication number
TWI672643B
Authority
TW
Taiwan
Prior art keywords
exponential
quantized
deep neural
neural network
neuron
Prior art date
Application number
TW107117479A
Other languages
Chinese (zh)
Other versions
TW202004568A (en)
Inventor
吳昕益
蕭文菁
Original Assignee
倍加科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 倍加科技股份有限公司
Priority to TW107117479A
Priority to CN201810772630.9A
Application granted
Publication of TWI672643B
Publication of TW202004568A

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/556 Logarithmic or exponential functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

A full-exponent operation method applied to a deep neural network. The weight values in the weight matrix of every neuron of the deep neural network are normalized in advance and quantized into quantized weight values expressible as exponents 2^-i, and the plural pixel values of an image data to be input into the deep neural network are normalized in advance and quantized into quantized pixel values expressible as exponents 2^-j. The quantized pixel values are then input into the deep neural network, so that each neuron of its first layer performs convolution operations on the quantized pixel values using the quantized weight matrix, an exponent multiplier, an exponent adder, and an exponent subtractor. This lowers the computational complexity and circuit complexity of the deep neural network, raises its operation speed, and reduces the memory space it occupies.

Description

Full-exponent operation method applied to deep neural networks, computer device, and computer-readable recording medium

The present invention relates to full-exponent operation methods, and in particular to a full-exponent operation method applied to deep neural networks.

A deep neural network is a deep-learning method in machine learning. By feeding its mathematical model, which imitates a biological nervous system, large amounts of data for repeated operations and training across different levels and architectures, an optimized and highly effective data recognition model can be trained. As shown in Fig. 1, a deep neural network generally comprises an input layer 11, an output layer 12, and a hidden layer 13 located between and connecting the input layer 11 and the output layer 12. The hidden layer 13 is composed of a plurality of layers 14 connected one after another; each layer 14 has a plurality of neurons 141, and each neuron 141 has a weight matrix 10 composed of a plurality of (for example, 3x3) weight values W, as shown in Fig. 2. Data D input through the input layer 11, for example an image 20 with 5x5 pixels D as shown in Fig. 2, is fed to every neuron 141 of the first layer 14 of the hidden layer 13. Each neuron 141 moves the weight matrix 10 across the image position by position and, at every position the weight matrix 10 passes, multiplies the weight values W in the weight matrix 10 with the overlapping (corresponding) pixels D of the image 20 and sums the products (a convolution operation) to obtain a feature value R. Each neuron 141 then outputs the feature values R obtained after the weight matrix 10 has passed every position of the image 20 to every neuron 141 of the second layer 14, so that each neuron 141 of the second layer 14 performs the convolution operation described above on the input feature values R and outputs its results to every neuron 141 of its next layer 14, and so on, until every neuron 141 of the last layer 14 outputs its results to the output layer 12.

However, the weight values W in the weight matrix 10 and the input data D are usually represented as floating-point numbers, so a neuron 141 must use a floating-point multiplier to multiply the weight values W with the input data D. Floating-point multiplication is computationally heavy, and multiplication is relatively more complex than addition, so floating-point multiplication takes considerably more time than addition. Moreover, a floating-point multiplier implemented as a logic circuit is much bulkier than an adder, so a deep neural network that uses floating-point multipliers becomes a comparatively large hardware circuit when implemented. In addition, the weight values W of the deep neural network and the final output results are stored as floating-point numbers and therefore occupy a large amount of memory space.

The purpose of the present invention is therefore to provide a full-exponent operation method applied to deep neural networks, together with a computer device and a computer-readable recording medium implementing the method, which can reduce the computational load, computational complexity, and circuit complexity of a deep neural network, reduce its memory footprint, and raise its operation speed.

Accordingly, the present invention provides a full-exponent operation method applied to a deep neural network, where the deep neural network is carried on a computer device and has a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1). The method comprises: a preprocessing module of the computer device normalizing in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizing each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and representing the plural groups of quantized weight values with X bits; the preprocessing module normalizing in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizing each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and representing the plural groups of quantized pixel values with Y bits; and the preprocessing module inputting the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix. In each convolution operation, each neuron uses an exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products, where the exponent multiplier computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);

where N = X if X = Y, and N is the larger of X and Y if X ≠ Y. Also, in each convolution operation, each neuron inputs the positive ones of the m products into an exponent adder to be accumulated into a positive accumulated value 2^-p, separately inputs the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into an exponent subtractor, whose subtraction yields a feature value r. The exponent adder computes the sum of two exponents 2^-a and 2^-b as:

2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);

and the exponent subtractor computes the feature value r as:

r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);

where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.
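Read as pseudocode, the three operators above reduce to a few integer comparisons and additions. The sketch below is one illustrative reading of the formulas, not the patented circuit: a value is stored only as its exponent code, the reserved code 2^N - 1 (written ZERO) stands for zero, and the names exp_mul, exp_add, and exp_sub are ours.

```python
# Minimal sketch of the exponent arithmetic described above (our reading,
# not the patented circuit). A value is stored only as its exponent code e,
# meaning 2**-e; the largest code ZERO = 2**N - 1 is reserved to represent 0.

N = 2                    # bits per exponent code (N = max(X, Y) in the text)
ZERO = 2**N - 1          # reserved code standing for the value 0

def exp_mul(i: int, j: int) -> int:
    """2**-i * 2**-j: add the exponents, saturating to the zero code."""
    if i == ZERO or j == ZERO or i + j > ZERO:
        return ZERO
    return i + j

def exp_add(a: int, b: int) -> int:
    """2**-a + 2**-b: keep the dominant term; equal terms double."""
    if b == ZERO:
        return a                  # adding zero
    if a == ZERO:
        return b
    if a != b:
        return min(a, b)          # the larger-magnitude term wins
    return max(a - 1, 0)          # 2**-a + 2**-a = 2**(-a+1), capped at 2**0

def exp_sub(p: int, q: int) -> tuple[int, int]:
    """2**-p - 2**-q, returned as (sign, code) with sign in {+1, -1}."""
    if p == q:
        return (1, ZERO)          # exact cancellation to zero
    if q == ZERO or p <= q - 1:
        return (1, p)             # the positive term dominates
    return (-1, q)                # q <= p - 1 or p == ZERO

print(exp_mul(1, 1), exp_add(2, 2), exp_sub(1, 2))   # -> 2 1 (1, 1)
```

Note that the product is literally i + j with saturation, which is the point of the scheme: multiplication in the value domain becomes addition in the exponent domain.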

In some embodiments of the present invention, each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function, producing a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next connected layer, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

In some embodiments of the present invention, the rectified linear unit (ReLU) function may be one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

Furthermore, the present invention provides a computer device implementing the above method, comprising a deep neural network and a preprocessing module. The deep neural network has a hidden layer composed of a plurality of layers connected one after another; each layer of the hidden layer has a plurality of neurons, and each neuron has a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor. The preprocessing module normalizes in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and represents the plural groups of quantized weight values with X bits; it likewise normalizes in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and represents the plural groups of quantized pixel values with Y bits. The preprocessing module inputs the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix. In each convolution operation, each neuron uses the exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products; the positive products are accumulated by the exponent adder into a positive accumulated value 2^-p, the negative products into a negative accumulated value 2^-q, and the exponent subtractor subtracts the two to yield a feature value r. The exponent multiplier, exponent adder, and exponent subtractor compute by the same formulas given above for the method, with i, j, p, q, r, a, b positive integers.

In some embodiments of the present invention, the deep neural network and the preprocessing module are software programs stored in a storage unit of the computer device that can be read and executed by a processing unit of the computer device.

In some embodiments of the present invention, the deep neural network and/or the preprocessing module are integrated in an application-specific integrated circuit chip or a programmable logic device of the computer device, or are firmware burned into a microprocessor of the computer device.

In addition, the present invention provides a computer-readable recording medium implementing the above method, which stores a software program comprising a deep neural network and a preprocessing module. The deep neural network has a hidden layer composed of a plurality of layers connected one after another; each layer of the hidden layer has a plurality of neurons, and each neuron has a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor. When the software program is loaded and executed by a computer device, the computer device can complete the full-exponent operation method applied to deep neural networks described above.

The effect of the present invention lies in normalizing in advance the weight values in the weight matrix of each neuron of the deep neural network and the image data about to be input into the deep neural network, quantizing both into base-2 exponents 2^-i and 2^-j, and then feeding the quantized pixel values of the image data into the hidden layer of the deep neural network for convolution with the quantized weight matrices of the neurons of its first layer. The exponent multiplier, exponent adder, and exponent subtractor in each neuron replace conventional floating-point multiplication with simple additions, subtractions, and comparisons of the input exponents, lowering the computational complexity of each neuron and completing convolution operations quickly. This not only raises the operation speed of the deep neural network but also effectively simplifies and shrinks the circuitry of a deep neural network implemented in hardware; when the deep neural network is implemented as a software program, its operation speed is likewise effectively improved.

10‧‧‧weight matrix
11‧‧‧input layer
12‧‧‧output layer
13‧‧‧hidden layer
14‧‧‧layer
141‧‧‧neuron
20‧‧‧image
4‧‧‧computer device
41‧‧‧storage unit
42‧‧‧processing unit
43‧‧‧deep neural network
44‧‧‧preprocessing module
51‧‧‧exponent multiplier
52‧‧‧exponent adder
53‧‧‧exponent subtractor
W‧‧‧weight value
D‧‧‧pixel
S1~S3‧‧‧steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which:
Fig. 1 is a schematic diagram of the basic architecture of a conventional deep neural network;
Fig. 2 illustrates the process of a neuron performing convolution operations on an input data with a weight matrix;
Fig. 3 is the main flowchart of an embodiment of the full-exponent operation method applied to deep neural networks according to the present invention;
Fig. 4 is a block diagram of the main components of an embodiment of the computer device of the present invention that implements the method of Fig. 3;
Fig. 5 illustrates that each neuron in the deep neural network of this embodiment has an exponent multiplier, an exponent adder, and an exponent subtractor; and
Fig. 6 illustrates the process by which the preprocessing module of this embodiment normalizes and quantizes the weight values in the neurons of the deep neural network and the data to be input into the deep neural network.

Before the present invention is described in detail, it should be noted that in the following description, similar elements are denoted by the same reference numerals.

Referring to Fig. 3, the main flow of an embodiment of the full-exponent operation method applied to deep neural networks according to the present invention is executed by a computer device 4 shown in Fig. 4. The computer device 4 mainly comprises a storage unit 41 (that is, a computer-readable recording medium) and a processing unit 42. The storage unit 41 stores a program comprising a deep neural network 43 and a preprocessing module 44, and the program can be loaded and executed by the processing unit 42 of the computer device 4 to complete the method flow shown in Fig. 3, though this is not a limitation. That is, the deep neural network 43 and/or the preprocessing module 44 may instead be integrated in an application-specific integrated circuit (ASIC) chip or a programmable logic device (PLD) of the computer device 4, so that the ASIC chip or programmable logic device completes the method flow shown in Fig. 3; in that case the ASIC chip or programmable logic device is the processing unit 42 of this embodiment. Alternatively, the deep neural network 43 and/or the preprocessing module 44 may be firmware burned into a microprocessor of the computer device 4, so that the microprocessor completes the method flow shown in Fig. 3 by executing the firmware; the microprocessor is then the processing unit 42 of this embodiment.

As shown in Fig. 1, besides an input layer 11 and an output layer 12, the deep neural network 43 of this embodiment also has a hidden layer 13 located between them and composed of a plurality of layers 14 connected one after another. Each layer 14 of the hidden layer 13 has a plurality of neurons 141, and each neuron 141 has a weight matrix composed of m weight values (m being an integer and m ≧ 1), for example the weight matrix 10 with 3x3 weight values W shown in Fig. 2, as well as an exponent multiplier 51, an exponent adder 52, and an exponent subtractor 53, as shown in Fig. 5.

To reduce the computational load and computational complexity of the deep neural network, in step S1 of Fig. 3 the preprocessing module 44 first normalizes the m weight values W of each neuron 141 of the deep neural network 43 so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value by log2 into a quantized weight value expressible as an exponent 2^-i, divides the quantized weight values into a plurality of groups, and represents those groups with X bits.

For example, as shown in Fig. 6(a), suppose the weight values W include -2, -1.5, 0, 0.5, 1, and 1.5, distributed between -2 and 1.5. The preprocessing module 44 divides the weight values W by 2 to normalize them, so that the normalized weight values (-1, -0.75, 0, 0.25, 0.5, 0.75) fall within the range of -1 to +1, as shown in Fig. 6(b). Each normalized weight value (-1, -0.75, 0, 0.25, 0.5, 0.75) is then quantized by log2 into a quantized weight value expressed as an exponent 2^-i. The absolute values of the four normalized weight values 1, 0.5, 0.25, and 0 are thus represented as 2^0, 2^-1, 2^-2, and 2^-3 respectively, and since 0.75 cannot be represented exactly by a base-2 exponent, it is quantized to the nearest value 1 and represented as 2^0. The preprocessing module 44 therefore divides the absolute values of the quantized weight values into four groups (2^0, 2^-1, 2^-2, and 2^-3) and represents those groups with 2 bits, that is, 00, 01, 10, and 11 stand for 2^0, 2^-1, 2^-2, and 2^-3 respectively, while separately recording (flagging) whether each quantized weight value is positive or negative.

Accordingly, if the normalized weight values falling between -1 and +1 also include positive or negative 0.125, 0.0625, 0.03125, and 0.015625, or values close to them, their quantized weight values expressed as exponents 2^-i will be 2^-4, 2^-5, 2^-6, and 2^-7 respectively, and the preprocessing module 44 will need 3 bits to represent the eight groups 2^0 to 2^-7, with 000, 001, 010, 011, 100, 101, 110, and 111 standing for 2^0, 2^-1, 2^-2, ..., 2^-7. Likewise, if the quantized weight values also include values smaller than 2^-7, the preprocessing module 44 will need 4 or more bits to represent more than eight groups of quantized weight value magnitudes, and so on.

Then, in step S2 of Fig. 3, the preprocessing module 44 normalizes in advance an image data to be input into the deep neural network 43, for example the image 20 with 5x5 pixels D shown in Fig. 2, so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value by log2 into a quantized pixel value expressible as an exponent 2^-j, divides the quantized pixel values into a plurality of groups, and represents those groups with Y bits. Similarly, in the example of Fig. 6, if the quantized pixel values form the four magnitude groups 2^0, 2^-1, 2^-2, and 2^-3, the preprocessing module 44 represents them with 2 bits, with 00, 01, 10, and 11 standing for 2^0, 2^-1, 2^-2, and 2^-3 respectively, and separately records whether each quantized pixel value is positive or negative. If the quantized pixel values also include smaller magnitudes (for example 0.125, 0.0625, 0.03125, and 0.015625 or smaller), the preprocessing module 44 uses 3 or more bits to represent the eight groups 2^0 to 2^-7 or more than eight groups, with 000, 001, 010, 011, 100, 101, 110, and 111 standing for 2^0, 2^-1, 2^-2, ..., 2^-7, and so on. The numbers of bits used to represent the quantized weight values 2^-i and the quantized pixel values 2^-j may therefore be the same or different, depending on the number of magnitude groups of each. In addition, steps S1 and S2 are not ordered and may be performed simultaneously.
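A compact sketch of steps S1 and S2, assuming a simple divide-by-maximum normalization (the text normalizes the example weights by dividing by 2, which happens to be their largest magnitude); quantize_to_exponent is a hypothetical helper name:

```python
import math

def quantize_to_exponent(values, n_bits):
    """Normalize values into [-1, +1], then quantize each magnitude to the
    nearest power of two 2**-e with e in 0..2**n_bits-2; the largest code
    2**n_bits-1 is reserved for zero. Returns (sign, code) pairs. Sketch only."""
    zero_code = 2**n_bits - 1
    scale = max(abs(v) for v in values) or 1.0   # assumed normalization rule
    out = []
    for v in values:
        x = v / scale                            # now in [-1, +1]
        if x == 0.0:
            out.append((1, zero_code))
            continue
        e = round(-math.log2(abs(x)))            # nearest exponent via log2
        e = min(max(e, 0), zero_code - 1)        # clamp to representable range
        out.append((1 if x > 0 else -1, e))
    return out

# Example from the text: weights -2..1.5 normalized by 2, 0.75 snapping to 2**0.
print(quantize_to_exponent([-2, -1.5, 0, 0.5, 1, 1.5], n_bits=2))
# -> [(-1, 0), (-1, 0), (1, 3), (1, 2), (1, 1), (1, 0)]
```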

Then, in step S3 of Fig. 3, the preprocessing module 44 inputs the quantized pixel values 2^-j through the input layer 11 of the deep neural network 43 into the first layer 14 of the hidden layer 13, so that each neuron 141 of the first layer 14 performs convolution operations on the quantized pixel values 2^-j with its quantized weight matrix 10. That is, as shown in Fig. 2, the neuron 141 moves the quantized weight matrix 10 across the image 20 one unit (pixel) at a time and, at every position the quantized weight matrix 10 passes, multiplies the quantized weight values 2^-i in the quantized weight matrix 10 with the overlapping (corresponding) quantized pixel values 2^-j of the image 20 and sums the products (a convolution operation) to obtain a feature value r. A sketch of this per-position accumulation follows below.
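The convolution at a single position of the weight matrix can be sketched under the same assumptions as the earlier operator sketch (values as (sign, code) pairs, code ZERO = 2^N - 1 representing zero); the operator definitions are repeated compactly so the snippet runs on its own:

```python
# Sketch of one convolution position in step S3: m products accumulated into
# a positive and a negative sum, then a single exponent subtraction.
N, ZERO = 2, 3

def exp_mul(i, j):                       # 2**-i * 2**-j
    return ZERO if i == ZERO or j == ZERO or i + j > ZERO else i + j

def exp_add(a, b):                       # 2**-a + 2**-b
    if b == ZERO: return a
    if a == ZERO: return b
    return min(a, b) if a != b else max(a - 1, 0)

def conv_at(weights, pixels):
    """weights, pixels: lists of (sign, code). Returns the feature value."""
    pos, neg = ZERO, ZERO                # both running sums start at zero
    for (sw, w), (sp, x) in zip(weights, pixels):
        prod = exp_mul(w, x)
        if sw * sp >= 0:
            pos = exp_add(pos, prod)     # positive products -> 2**-p
        else:
            neg = exp_add(neg, prod)     # negative products -> 2**-q
    # exponent subtraction 2**-p - 2**-q
    if pos == neg:
        return (1, ZERO)
    if neg == ZERO or pos <= neg - 1:
        return (1, pos)
    return (-1, neg)

w = [(1, 0), (-1, 1), (1, 2)]            # hypothetical 1x3 weight row
x = [(1, 1), (1, 0), (-1, 1)]            # hypothetical pixel row
print(conv_at(w, x))                     # -> (1, 3): the sums cancel to zero
```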

In each convolution operation, each neuron 141 uses the exponent multiplier 51 to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products, where the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);

where N = X if X = Y, and N is the larger of X and Y if X ≠ Y. For example, if the numbers of magnitude groups of the quantized weight values 2^-i and the quantized pixel values 2^-j are both 4 or fewer, then X = Y = 2 and N = 2, and the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ 3 and i + j ≦ 3;
2^-i × 2^-j = 2^-3, if i + j > 3 or i == 3 or j == 3.

If the quantized weight values 2^-i have 4 or fewer magnitude groups but the quantized pixel values 2^-j have between 5 and 7, then X = 2, Y = 3, and N = 3, and the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ 7 and i + j ≦ 7;
2^-i × 2^-j = 2^-7, if i + j > 7 or i == 7 or j == 7.

It follows that when i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1), the exponent multiplier 51 in fact only adds the exponents i and j of the quantized weight value 2^-i and the quantized pixel value 2^-j to obtain their product, with no multiplication required; and when i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1), the exponent multiplier 51 outputs the product of the quantized weight value 2^-i and the quantized pixel value 2^-j without performing any actual computation at all.

And in each convolution operation, each neuron 141 inputs the positive ones of the m products into the exponent adder 52 to be accumulated into a positive accumulated value 2^-p, separately inputs the negative ones of the m products into the exponent adder 52 to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into the exponent subtractor 53, whose subtraction yields the feature value r.

The exponent adder 52 computes the sum of two exponents (for example two products 2^-a and 2^-b, or a product 2^-a and an accumulated value 2^-b) as:

2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1).

For example, if a = 2 and b = 3, then c = a and 2^-a + 2^-b = 2^-a; if a = b = 2 and N = 2, then 2^-a + 2^-b = 2^(-a+1) since a == b, a ≠ 3, and 2^(-a+1) < 2^0; the sum saturates to 2^0 when a == b, a ≠ 3, and 2^(-a+1) ≧ 2^0; and 2^-a + 2^-b = 2^-a when b == 3.

The exponent subtractor 53 computes the feature value r as:

r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1).

For example, if p = 1 and q = 3, then r = 2^-p - 2^-q = 2^-p; if p = q = 3 and N = 2, then r = 2^-p - 2^-q = 2^-3; if p = 2 and q = 1, then r = 2^-p - 2^-q = -2^-q; if p = 1 and q = 2, then r = 2^-p - 2^-q = 2^-p; and if q = 1 and p = 3, then r = 2^-p - 2^-q = -2^-q.
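The worked cases above can be checked numerically with a short script under the same assumptions (N = 2, the code 3 standing for zero); it also makes the approximation error of the dominant-term rules visible:

```python
# Quick numeric check of the adder/subtractor cases worked above (N = 2, so
# code 3 stands for zero). Each tuple: (inputs, code result per the formulas).
N, ZERO = 2, 3
val = lambda e: 0.0 if e == ZERO else 2.0**-e

adder_cases = [
    ((2, 3), 2),   # b == ZERO: 2**-2 + 0 = 2**-2
    ((2, 2), 1),   # a == b:    2**-2 + 2**-2 = 2**-1 (exact)
    ((1, 2), 1),   # a != b:    keep dominant term 2**-1 (exact sum is 0.75)
]
for (a, b), c in adder_cases:
    print(f"2^-{a} + 2^-{b} -> 2^-{c}  (exact {val(a) + val(b)}, approx {val(c)})")

sub_cases = [
    ((1, 3), (1, 1)),    # q == ZERO: result 2**-1
    ((3, 3), (1, ZERO)), # p == q: exact zero
    ((1, 2), (1, 1)),    # q == p+1: keep 2**-p (exact difference is 2**-2)
    ((2, 1), (-1, 1)),   # p == q+1: keep -2**-q (exact difference is -2**-2)
]
for (p, q), (s, r) in sub_cases:
    print(f"2^-{p} - 2^-{q} -> {'-' if s < 0 else '+'}2^-{r}"
          f"  (exact {val(p) - val(q)}, approx {s * val(r)})")
```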

Thus, by means of the exponent multiplier 51 (which in fact only performs additions), the exponent adder 52, and the exponent subtractor 53, each neuron 141 replaces the conventional floating-point multiplier, substituting simple additions and subtractions for multiplication. This not only lowers the computational load and speeds up the operation, but the exponent multiplier 51, exponent adder 52, and exponent subtractor 53 can also be implemented with comparatively simple logic circuits, so when the deep neural network 43 is implemented as a physical circuit, the circuitry of the neurons 141 is simplified, effectively shrinking the overall circuit size of the deep neural network 43.

Furthermore, after each neuron 141 of the first layer 14 of the deep neural network 43 completes all of its convolution operations and obtains a plurality of feature values r, it passes those feature values r through a rectified linear unit (ReLU) function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons 141 of the next connected layer 14. Each neuron 141 of the next layer 14 likewise performs convolution operations on the rectified feature values r' using its quantized weight matrix 10, the exponent multiplier 51, the exponent adder 52, and the exponent subtractor 53, then passes the feature values r produced by completing all of its convolution operations through the rectified linear function to produce corresponding rectified feature values r', which it inputs into the neurons 141 of its next connected layer 14, and so on, until the last layer 14 of the deep neural network 43 outputs its results to the output layer 12. The rectified linear function may be one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function, but is not limited to these.
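In this encoding the rectification step is also cheap. Below is a sketch of a ramp function and a Leaky ReLU whose leak slope is a power of two (our illustration, not the patent's circuit), so that scaling a negative feature is again just an exponent addition:

```python
# Sketch: ReLU variants applied to a feature value in the (sign, code)
# encoding (code ZERO = 2**N - 1 represents zero). The leak handling for
# negative inputs is our own illustration.
N, ZERO = 2, 3

def relu(sign, code):
    """Ramp function: negative features become zero, positives pass through."""
    return (1, ZERO) if sign < 0 else (sign, code)

def leaky_relu(sign, code, shift=2):
    """Leaky ReLU with a power-of-two slope 2**-shift: scaling a negative
    feature is just an exponent addition; underflow saturates to zero."""
    if sign > 0 or code == ZERO:
        return (sign, code)
    return (-1, code + shift) if code + shift < ZERO else (1, ZERO)

print(relu(-1, 1))        # -> (1, 3): clipped to zero
print(leaky_relu(-1, 0))  # -> (-1, 2): -2**0 leaked down to -2**-2
```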

And because the image data 20 input into the deep neural network 43 and the weight values in the neurons 141 have been converted from their original floating-point representation into one requiring as few as 2 bits, and the results output by the deep neural network 43 can likewise be represented with as few as 2 bits, the memory space occupied in the computer device 4 is greatly reduced.
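For a feel of the saving, four 2-bit exponent codes fit in one byte where each float32 value needs four bytes; the packing layout below is our own choice, and the separately recorded sign flags are omitted for brevity:

```python
# Sketch of the storage saving: four 2-bit exponent codes pack into one byte,
# versus 4 bytes per float32 value. Sign flags are stored separately in the
# text and are omitted here.
def pack_codes(codes):                 # codes: ints in 0..3 (N = 2)
    out = bytearray()
    for k in range(0, len(codes), 4):
        group, byte = codes[k:k + 4], 0
        for bit_pos, c in enumerate(group):
            byte |= (c & 0b11) << (2 * bit_pos)
        out.append(byte)
    return bytes(out)

codes = [0, 1, 2, 3, 1, 1]             # six quantized values
packed = pack_codes(codes)
print(len(packed), "bytes vs", 4 * len(codes), "bytes as float32")  # 2 vs 24
```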

In summary, the above embodiment normalizes in advance the weight values W in the weight matrix 10 of each neuron 141 of the deep neural network 43 and the image data 20 about to be input into the deep neural network 43, quantizes both into base-2 exponents 2^-i and 2^-j, and then feeds the quantized pixel values of the image data 20 into the deep neural network 43 for convolution with the quantized weight matrices 10 of its neurons 141. The exponent multiplier 51, exponent adder 52, and exponent subtractor 53 in each neuron 141 perform simple additions, subtractions, and comparisons on the input exponents in place of conventional floating-point multiplication, lowering computational complexity and completing convolution operations quickly. This not only raises the operation speed of the deep neural network 43 but, by replacing multipliers with simple adders, also effectively simplifies and shrinks the circuitry of a deep neural network 43 implemented in hardware. When the deep neural network 43 is implemented in software, the convolution operations likewise require only simple additions, subtractions, and comparisons on the input exponents, with no multiplication, so its operation speed is effectively improved, truly achieving the effects and purpose of the present invention.

The above, however, is merely an embodiment of the present invention and cannot be used to limit the scope of its implementation; all simple equivalent changes and modifications made according to the claims and the content of the specification of the present invention remain within the scope covered by the patent of the present invention.

Claims (10)

1. A full-exponent operation method applied to a deep neural network, the deep neural network being carried on a computer device and having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), the method comprising:
a preprocessing module of the computer device normalizing in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizing each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and representing the plural groups of quantized weight values with X bits;
the preprocessing module normalizing in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizing each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and representing the plural groups of quantized pixel values with Y bits;
the preprocessing module inputting the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix, and in each convolution operation each neuron uses an exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j to obtain m products, the exponent multiplier computing each product as:
2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);
where N = X if X = Y, and N is the larger of X and Y if X ≠ Y; and
in each convolution operation, each neuron inputting the positive ones of the m products into an exponent adder to be accumulated into a positive accumulated value 2^-p, separately inputting the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputting the positive accumulated value 2^-p and the negative accumulated value 2^-q into an exponent subtractor to obtain a feature value r by subtraction;
wherein the exponent adder computes the sum of two exponents 2^-a and 2^-b as:
2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);
and the exponent subtractor computes the feature value r as:
r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);
where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.

2. The full-exponent operation method applied to a deep neural network according to claim 1, wherein each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next layer connected to it, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

3. The full-exponent operation method applied to a deep neural network according to claim 2, wherein the rectified linear unit (ReLU) function is one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

4. A computer device, comprising:
a deep neural network having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor; and
a preprocessing module that normalizes in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and represents the plural groups of quantized weight values with X bits;
wherein the preprocessing module further normalizes in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and represents the plural groups of quantized pixel values with Y bits;
and wherein the preprocessing module inputs the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix, and in each convolution operation each neuron uses the exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j to obtain m products, the exponent multiplier computing each product as:
2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);
where N = X if X = Y, and N is the larger of X and Y if X ≠ Y; and in each convolution operation each neuron inputs the positive ones of the m products into the exponent adder to be accumulated into a positive accumulated value 2^-p, inputs the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into the exponent subtractor to obtain a feature value r by subtraction;
wherein the exponent adder computes the sum of two exponents 2^-a and 2^-b as:
2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);
and the exponent subtractor computes the feature value r as:
r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);
where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.

5. The computer device according to claim 4, wherein each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next layer connected to it, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

6. The computer device according to claim 4, wherein the rectified linear unit (ReLU) function is one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

7. The computer device according to any one of claims 4 to 6, wherein the deep neural network and the preprocessing module are software programs stored in a storage unit of the computer device and readable and executable by a processing unit of the computer device.

8. The computer device according to any one of claims 4 to 6, wherein the deep neural network and/or the preprocessing module are integrated in an application-specific integrated circuit chip or a programmable logic device of the computer device.

9. The computer device according to any one of claims 4 to 6, wherein the deep neural network and/or the preprocessing module are firmware burned into a microprocessor of the computer device.

10. A computer-readable recording medium storing a software program comprising a deep neural network and a preprocessing module, the deep neural network having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor, wherein when the software program is loaded and executed by a computer device, the computer device completes the full-exponent operation method applied to a deep neural network according to any one of claims 1 to 3.
TW107117479A 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media TWI672643B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media
CN201810772630.9A CN110531955B (en) 2018-05-23 2018-07-13 Index calculation method for deep neural network, computer device, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media

Publications (2)

Publication Number Publication Date
TWI672643B 2019-09-21
TW202004568A TW202004568A (en) 2020-01-16

Family

ID=68619274

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media

Country Status (2)

Country Link
CN (1) CN110531955B (en)
TW (1) TWI672643B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021168644A1 (en) * 2020-02-25 2021-09-02 深圳市大疆创新科技有限公司 Data processing apparatus, electronic device, and data processing method
TWI743710B (en) * 2020-03-18 2021-10-21 國立中山大學 Method, electric device and computer program product for convolutional neural network
CN112199072B (en) * 2020-11-06 2023-06-02 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment based on neural network layer

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10134018A (en) * 1996-07-08 1998-05-22 Nippon Telegr & Teleph Corp <Ntt> Method and device for finding rule, storage device stored with rule finding program, method and device for neural learning, and storage medium stored with neural learning program
CN101350155A (en) * 2008-09-09 2009-01-21 无敌科技(西安)有限公司 Method and system for generating and verifying cipher through genus nerval network
JP2017049907A (en) * 2015-09-04 2017-03-09 国立研究開発法人情報通信研究機構 Neural network, learning method therefor and computer program
US10776690B2 (en) * 2015-10-08 2020-09-15 Via Alliance Semiconductor Co., Ltd. Neural network unit with plurality of selectable output functions
CN106228238B (en) * 2016-07-27 2019-03-22 中国科学技术大学苏州研究院 Accelerate the method and system of deep learning algorithm on field programmable gate array platform
US11222263B2 (en) * 2016-07-28 2022-01-11 Samsung Electronics Co., Ltd. Neural network method and apparatus
US20180053086A1 (en) * 2016-08-22 2018-02-22 Kneron Inc. Artificial neuron and controlling method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160026912A1 (en) * 2014-07-22 2016-01-28 Intel Corporation Weight-shifting mechanism for convolutional neural networks
EP3154001A2 (en) * 2015-10-08 2017-04-12 VIA Alliance Semiconductor Co., Ltd. Neural network unit with neural memory and array of neural processing units that collectively shift row of data received from neural memory
US20170103306A1 (en) * 2015-10-08 2017-04-13 Via Alliance Semiconductor Co., Ltd. Neural network unit with neural memory and array of neural processing units and sequencer that collectively shift row of data received from neural memory

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhixi Shen, Yong Liu, "A Novel Connectivity of Deep Convolutional Neural Networks," 2017 Chinese Automation Congress, 2017-10-22, https://ieeexplore.ieee.org/document/8244187 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI773398B (en) * 2020-06-25 2022-08-01 英商普立N科技有限公司 Analog hardware realization of neural networks
TWI796257B (en) * 2020-06-25 2023-03-11 英商普立N科技有限公司 Analog hardware realization of neural networks

Also Published As

Publication number Publication date
CN110531955B (en) 2023-10-10
CN110531955A (en) 2019-12-03
TW202004568A (en) 2020-01-16

Similar Documents

Publication Publication Date Title
TWI672643B (en) Full index operation method for deep neural networks, computer devices, and computer readable recording media
CN111684473B (en) Improving performance of neural network arrays
JP6977864B2 (en) Inference device, convolution operation execution method and program
US11580719B2 (en) Dynamic quantization for deep neural network inference system and method
KR20190062129A (en) Low-power hardware acceleration method and system for convolution neural network computation
TW202101302A (en) Circuit system and processing method for neural network activation function
WO2021135715A1 (en) Image compression method and apparatus
CN110109646B (en) Data processing method, data processing device, multiplier-adder and storage medium
EP3924895A1 (en) Outlier quantization for training and inference
CN110647974A (en) Network layer operation method and device in deep neural network
Adams et al. Energy-efficient approximate MAC unit
US11270196B2 (en) Multi-mode low-precision inner-product computation circuits for massively parallel neural inference engine
US20220043630A1 (en) Electronic device and control method therefor
US11907834B2 (en) Method for establishing data-recognition model
WO2018196750A1 (en) Device for processing multiplication and addition operations and method for processing multiplication and addition operations
Chen et al. Semantic attention and relative scene depth-guided network for underwater image enhancement
CN114978189A (en) Data coding method and related equipment
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
KR102153167B1 (en) Matrix operator and matrix operation method for artificial neural network
TWM570477U (en) Computer device using full exponential operation on deep neural network
Wang et al. EASNet: searching elastic and accurate network architecture for stereo matching
CN113919479B (en) Method for extracting data features and related device
Peng et al. MBFQuant: A Multiplier-Bitwidth-Fixed, Mixed-Precision Quantization Method for Mobile CNN-Based Applications
JP7247418B2 (en) Computing unit, method and computer program for multiplication
WO2022222068A1 (en) Methods and systems for multiplier sharing in neural networks