TWI672643B - Full index operation method for deep neural networks, computer devices, and computer readable recording media - Google Patents


Info

Publication number
TWI672643B
Authority
TW
Taiwan
Prior art keywords
exponential
quantized
deep neural
neural network
neuron
Prior art date
Application number
TW107117479A
Other languages
Chinese (zh)
Other versions
TW202004568A (en)
Inventor
吳昕益
蕭文菁
Original Assignee
倍加科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 倍加科技股份有限公司
Priority to TW107117479A
Priority to CN201810772630.9A
Application granted
Publication of TWI672643B
Publication of TW202004568A

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/556 Logarithmic or exponential functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

A full-exponent operation method applied to a deep neural network. The weight values in the weight matrix of every neuron of the deep neural network are normalized in advance and quantized into quantized weight values expressible as exponents 2^-i, and the plural pixel values of an image data to be input into the deep neural network are normalized in advance and quantized into quantized pixel values expressible as exponents 2^-j. The quantized pixel values are then input into the deep neural network, so that each neuron of its first layer performs convolution operations on the quantized pixel values using the quantized weight matrix, an exponent multiplier, an exponent adder, and an exponent subtractor. This lowers the computational complexity and circuit complexity of the deep neural network, raises its operation speed, and reduces the memory space it occupies.

Description

Full-exponent operation method applied to deep neural networks, computer device, and computer-readable recording medium

The present invention relates to full-exponent operation methods, and in particular to a full-exponent operation method applied to deep neural networks.

A deep neural network is a deep-learning method in machine learning. By feeding its mathematical model, which imitates a biological nervous system, large amounts of data for repeated operations and training across different levels and architectures, an optimized and highly effective data recognition model can be trained. As shown in Fig. 1, a deep neural network generally comprises an input layer 11, an output layer 12, and a hidden layer 13 located between and connecting the input layer 11 and the output layer 12. The hidden layer 13 is composed of a plurality of layers 14 connected one after another; each layer 14 has a plurality of neurons 141, and each neuron 141 has a weight matrix 10 composed of a plurality of (for example, 3x3) weight values W, as shown in Fig. 2. Data D input through the input layer 11, for example an image 20 with 5x5 pixels D as shown in Fig. 2, is fed to every neuron 141 of the first layer 14 of the hidden layer 13. Each neuron 141 moves the weight matrix 10 across the image position by position and, at every position the weight matrix 10 passes, multiplies the weight values W in the weight matrix 10 with the overlapping (corresponding) pixels D of the image 20 and sums the products (a convolution operation) to obtain a feature value R. Each neuron 141 then outputs the feature values R obtained after the weight matrix 10 has passed every position of the image 20 to every neuron 141 of the second layer 14, so that each neuron 141 of the second layer 14 performs the convolution operation described above on the input feature values R and outputs its results to every neuron 141 of its next layer 14, and so on, until every neuron 141 of the last layer 14 outputs its results to the output layer 12.

However, the weight values W in the weight matrix 10 and the input data D are usually represented as floating-point numbers, so a neuron 141 must use a floating-point multiplier to multiply the weight values W with the input data D. Floating-point multiplication is computationally heavy, and multiplication is relatively more complex than addition, so floating-point multiplication takes considerably more time than addition. Moreover, a floating-point multiplier implemented as a logic circuit is much bulkier than an adder, so a deep neural network that uses floating-point multipliers becomes a comparatively large hardware circuit when implemented. In addition, the weight values W of the deep neural network and the final output results are stored as floating-point numbers and therefore occupy a large amount of memory space.

The purpose of the present invention is therefore to provide a full-exponent operation method applied to deep neural networks, together with a computer device and a computer-readable recording medium implementing the method, which can reduce the computational load, computational complexity, and circuit complexity of a deep neural network, reduce its memory footprint, and raise its operation speed.

Accordingly, the present invention provides a full-exponent operation method applied to a deep neural network, where the deep neural network is carried on a computer device and has a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1). The method comprises: a preprocessing module of the computer device normalizing in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizing each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and representing the plural groups of quantized weight values with X bits; the preprocessing module normalizing in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizing each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and representing the plural groups of quantized pixel values with Y bits; and the preprocessing module inputting the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix. In each convolution operation, each neuron uses an exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products, where the exponent multiplier computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);

where N = X if X = Y, and N is the larger of X and Y if X ≠ Y. Also, in each convolution operation, each neuron inputs the positive ones of the m products into an exponent adder to be accumulated into a positive accumulated value 2^-p, separately inputs the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into an exponent subtractor, whose subtraction yields a feature value r. The exponent adder computes the sum of two exponents 2^-a and 2^-b as:

2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);

and the exponent subtractor computes the feature value r as:

r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);

where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.
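Read as pseudocode, the three operators above reduce to a few integer comparisons and additions. The sketch below is one illustrative reading of the formulas, not the patented circuit: a value is stored only as its exponent code, the reserved code 2^N - 1 (written ZERO) stands for zero, and the names exp_mul, exp_add, and exp_sub are ours.

```python
# Minimal sketch of the exponent arithmetic described above (our reading,
# not the patented circuit). A value is stored only as its exponent code e,
# meaning 2**-e; the largest code ZERO = 2**N - 1 is reserved to represent 0.

N = 2                    # bits per exponent code (N = max(X, Y) in the text)
ZERO = 2**N - 1          # reserved code standing for the value 0

def exp_mul(i: int, j: int) -> int:
    """2**-i * 2**-j: add the exponents, saturating to the zero code."""
    if i == ZERO or j == ZERO or i + j > ZERO:
        return ZERO
    return i + j

def exp_add(a: int, b: int) -> int:
    """2**-a + 2**-b: keep the dominant term; equal terms double."""
    if b == ZERO:
        return a                  # adding zero
    if a == ZERO:
        return b
    if a != b:
        return min(a, b)          # the larger-magnitude term wins
    return max(a - 1, 0)          # 2**-a + 2**-a = 2**(-a+1), capped at 2**0

def exp_sub(p: int, q: int) -> tuple[int, int]:
    """2**-p - 2**-q, returned as (sign, code) with sign in {+1, -1}."""
    if p == q:
        return (1, ZERO)          # exact cancellation to zero
    if q == ZERO or p <= q - 1:
        return (1, p)             # the positive term dominates
    return (-1, q)                # q <= p - 1 or p == ZERO

print(exp_mul(1, 1), exp_add(2, 2), exp_sub(1, 2))   # -> 2 1 (1, 1)
```

Note that the product is literally i + j with saturation, which is the point of the scheme: multiplication in the value domain becomes addition in the exponent domain.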

In some embodiments of the present invention, each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function, producing a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next connected layer, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

In some embodiments of the present invention, the rectified linear unit (ReLU) function may be one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

Furthermore, the present invention provides a computer device implementing the above method, comprising a deep neural network and a preprocessing module. The deep neural network has a hidden layer composed of a plurality of layers connected one after another; each layer of the hidden layer has a plurality of neurons, and each neuron has a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor. The preprocessing module normalizes in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and represents the plural groups of quantized weight values with X bits; it likewise normalizes in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and represents the plural groups of quantized pixel values with Y bits. The preprocessing module inputs the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix. In each convolution operation, each neuron uses the exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products; the positive products are accumulated by the exponent adder into a positive accumulated value 2^-p, the negative products into a negative accumulated value 2^-q, and the exponent subtractor subtracts the two to yield a feature value r. The exponent multiplier, exponent adder, and exponent subtractor compute by the same formulas given above for the method, with i, j, p, q, r, a, b positive integers.

In some embodiments of the present invention, the deep neural network and the preprocessing module are software programs stored in a storage unit of the computer device that can be read and executed by a processing unit of the computer device.

In some embodiments of the present invention, the deep neural network and/or the preprocessing module are integrated in an application-specific integrated circuit chip or a programmable logic device of the computer device, or are firmware burned into a microprocessor of the computer device.

In addition, the present invention provides a computer-readable recording medium implementing the above method, which stores a software program comprising a deep neural network and a preprocessing module. The deep neural network has a hidden layer composed of a plurality of layers connected one after another; each layer of the hidden layer has a plurality of neurons, and each neuron has a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor. When the software program is loaded and executed by a computer device, the computer device can complete the full-exponent operation method applied to deep neural networks described above.

The effect of the present invention lies in normalizing in advance the weight values in the weight matrix of each neuron of the deep neural network and the image data about to be input into the deep neural network, quantizing both into base-2 exponents 2^-i and 2^-j, and then feeding the quantized pixel values of the image data into the hidden layer of the deep neural network for convolution with the quantized weight matrices of the neurons of its first layer. The exponent multiplier, exponent adder, and exponent subtractor in each neuron replace conventional floating-point multiplication with simple additions, subtractions, and comparisons of the input exponents, lowering the computational complexity of each neuron and completing convolution operations quickly. This not only raises the operation speed of the deep neural network but also effectively simplifies and shrinks the circuitry of a deep neural network implemented in hardware; when the deep neural network is implemented as a software program, its operation speed is likewise effectively improved.

10‧‧‧weight matrix
11‧‧‧input layer
12‧‧‧output layer
13‧‧‧hidden layer
14‧‧‧layer
141‧‧‧neuron
20‧‧‧image
4‧‧‧computer device
41‧‧‧storage unit
42‧‧‧processing unit
43‧‧‧deep neural network
44‧‧‧preprocessing module
51‧‧‧exponent multiplier
52‧‧‧exponent adder
53‧‧‧exponent subtractor
W‧‧‧weight value
D‧‧‧pixel
S1~S3‧‧‧steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which:
Fig. 1 is a schematic diagram of the basic architecture of a conventional deep neural network;
Fig. 2 illustrates the process of a neuron performing convolution operations on an input data with a weight matrix;
Fig. 3 is the main flowchart of an embodiment of the full-exponent operation method applied to deep neural networks according to the present invention;
Fig. 4 is a block diagram of the main components of an embodiment of the computer device of the present invention that implements the method of Fig. 3;
Fig. 5 illustrates that each neuron in the deep neural network of this embodiment has an exponent multiplier, an exponent adder, and an exponent subtractor; and
Fig. 6 illustrates the process by which the preprocessing module of this embodiment normalizes and quantizes the weight values in the neurons of the deep neural network and the data to be input into the deep neural network.

Before the present invention is described in detail, it should be noted that in the following description, similar elements are denoted by the same reference numerals.

Referring to Fig. 3, the main flow of an embodiment of the full-exponent operation method applied to deep neural networks according to the present invention is executed by a computer device 4 shown in Fig. 4. The computer device 4 mainly comprises a storage unit 41 (that is, a computer-readable recording medium) and a processing unit 42. The storage unit 41 stores a program comprising a deep neural network 43 and a preprocessing module 44, and the program can be loaded and executed by the processing unit 42 of the computer device 4 to complete the method flow shown in Fig. 3, though this is not a limitation. That is, the deep neural network 43 and/or the preprocessing module 44 may instead be integrated in an application-specific integrated circuit (ASIC) chip or a programmable logic device (PLD) of the computer device 4, so that the ASIC chip or programmable logic device completes the method flow shown in Fig. 3; in that case the ASIC chip or programmable logic device is the processing unit 42 of this embodiment. Alternatively, the deep neural network 43 and/or the preprocessing module 44 may be firmware burned into a microprocessor of the computer device 4, so that the microprocessor completes the method flow shown in Fig. 3 by executing the firmware; the microprocessor is then the processing unit 42 of this embodiment.

As shown in Fig. 1, besides an input layer 11 and an output layer 12, the deep neural network 43 of this embodiment also has a hidden layer 13 located between them and composed of a plurality of layers 14 connected one after another. Each layer 14 of the hidden layer 13 has a plurality of neurons 141, and each neuron 141 has a weight matrix composed of m weight values (m being an integer and m ≧ 1), for example the weight matrix 10 with 3x3 weight values W shown in Fig. 2, as well as an exponent multiplier 51, an exponent adder 52, and an exponent subtractor 53, as shown in Fig. 5.

To reduce the computational load and computational complexity of the deep neural network, in step S1 of Fig. 3 the preprocessing module 44 first normalizes the m weight values W of each neuron 141 of the deep neural network 43 so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value by log2 into a quantized weight value expressible as an exponent 2^-i, divides the quantized weight values into a plurality of groups, and represents those groups with X bits.

For example, as shown in Fig. 6(a), suppose the weight values W include -2, -1.5, 0, 0.5, 1, and 1.5, distributed between -2 and 1.5. The preprocessing module 44 divides the weight values W by 2 to normalize them, so that the normalized weight values (-1, -0.75, 0, 0.25, 0.5, 0.75) fall within the range of -1 to +1, as shown in Fig. 6(b). Each normalized weight value (-1, -0.75, 0, 0.25, 0.5, 0.75) is then quantized by log2 into a quantized weight value expressed as an exponent 2^-i. The absolute values of the four normalized weight values 1, 0.5, 0.25, and 0 are thus represented as 2^0, 2^-1, 2^-2, and 2^-3 respectively, and since 0.75 cannot be represented exactly by a base-2 exponent, it is quantized to the nearest value 1 and represented as 2^0. The preprocessing module 44 therefore divides the absolute values of the quantized weight values into four groups (2^0, 2^-1, 2^-2, and 2^-3) and represents those groups with 2 bits, that is, 00, 01, 10, and 11 stand for 2^0, 2^-1, 2^-2, and 2^-3 respectively, while separately recording (flagging) whether each quantized weight value is positive or negative.

Accordingly, if the normalized weight values falling between -1 and +1 also include positive or negative 0.125, 0.0625, 0.03125, and 0.015625, or values close to them, their quantized weight values expressed as exponents 2^-i will be 2^-4, 2^-5, 2^-6, and 2^-7 respectively, and the preprocessing module 44 will need 3 bits to represent the eight groups 2^0 to 2^-7, with 000, 001, 010, 011, 100, 101, 110, and 111 standing for 2^0, 2^-1, 2^-2, ..., 2^-7. Likewise, if the quantized weight values also include values smaller than 2^-7, the preprocessing module 44 will need 4 or more bits to represent more than eight groups of quantized weight value magnitudes, and so on.

Then, in step S2 of Fig. 3, the preprocessing module 44 normalizes in advance an image data to be input into the deep neural network 43, for example the image 20 with 5x5 pixels D shown in Fig. 2, so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value by log2 into a quantized pixel value expressible as an exponent 2^-j, divides the quantized pixel values into a plurality of groups, and represents those groups with Y bits. Similarly, in the example of Fig. 6, if the quantized pixel values form the four magnitude groups 2^0, 2^-1, 2^-2, and 2^-3, the preprocessing module 44 represents them with 2 bits, with 00, 01, 10, and 11 standing for 2^0, 2^-1, 2^-2, and 2^-3 respectively, and separately records whether each quantized pixel value is positive or negative. If the quantized pixel values also include smaller magnitudes (for example 0.125, 0.0625, 0.03125, and 0.015625 or smaller), the preprocessing module 44 uses 3 or more bits to represent the eight groups 2^0 to 2^-7 or more than eight groups, with 000, 001, 010, 011, 100, 101, 110, and 111 standing for 2^0, 2^-1, 2^-2, ..., 2^-7, and so on. The numbers of bits used to represent the quantized weight values 2^-i and the quantized pixel values 2^-j may therefore be the same or different, depending on the number of magnitude groups of each. In addition, steps S1 and S2 are not ordered and may be performed simultaneously.
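A compact sketch of steps S1 and S2, assuming a simple divide-by-maximum normalization (the text normalizes the example weights by dividing by 2, which happens to be their largest magnitude); quantize_to_exponent is a hypothetical helper name:

```python
import math

def quantize_to_exponent(values, n_bits):
    """Normalize values into [-1, +1], then quantize each magnitude to the
    nearest power of two 2**-e with e in 0..2**n_bits-2; the largest code
    2**n_bits-1 is reserved for zero. Returns (sign, code) pairs. Sketch only."""
    zero_code = 2**n_bits - 1
    scale = max(abs(v) for v in values) or 1.0   # assumed normalization rule
    out = []
    for v in values:
        x = v / scale                            # now in [-1, +1]
        if x == 0.0:
            out.append((1, zero_code))
            continue
        e = round(-math.log2(abs(x)))            # nearest exponent via log2
        e = min(max(e, 0), zero_code - 1)        # clamp to representable range
        out.append((1 if x > 0 else -1, e))
    return out

# Example from the text: weights -2..1.5 normalized by 2, 0.75 snapping to 2**0.
print(quantize_to_exponent([-2, -1.5, 0, 0.5, 1, 1.5], n_bits=2))
# -> [(-1, 0), (-1, 0), (1, 3), (1, 2), (1, 1), (1, 0)]
```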

Then, in step S3 of Fig. 3, the preprocessing module 44 inputs the quantized pixel values 2^-j through the input layer 11 of the deep neural network 43 into the first layer 14 of the hidden layer 13, so that each neuron 141 of the first layer 14 performs convolution operations on the quantized pixel values 2^-j with its quantized weight matrix 10. That is, as shown in Fig. 2, the neuron 141 moves the quantized weight matrix 10 across the image 20 one unit (pixel) at a time and, at every position the quantized weight matrix 10 passes, multiplies the quantized weight values 2^-i in the quantized weight matrix 10 with the overlapping (corresponding) quantized pixel values 2^-j of the image 20 and sums the products (a convolution operation) to obtain a feature value r. A sketch of this per-position accumulation follows below.
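The convolution at a single position of the weight matrix can be sketched under the same assumptions as the earlier operator sketch (values as (sign, code) pairs, code ZERO = 2^N - 1 representing zero); the operator definitions are repeated compactly so the snippet runs on its own:

```python
# Sketch of one convolution position in step S3: m products accumulated into
# a positive and a negative sum, then a single exponent subtraction.
N, ZERO = 2, 3

def exp_mul(i, j):                       # 2**-i * 2**-j
    return ZERO if i == ZERO or j == ZERO or i + j > ZERO else i + j

def exp_add(a, b):                       # 2**-a + 2**-b
    if b == ZERO: return a
    if a == ZERO: return b
    return min(a, b) if a != b else max(a - 1, 0)

def conv_at(weights, pixels):
    """weights, pixels: lists of (sign, code). Returns the feature value."""
    pos, neg = ZERO, ZERO                # both running sums start at zero
    for (sw, w), (sp, x) in zip(weights, pixels):
        prod = exp_mul(w, x)
        if sw * sp >= 0:
            pos = exp_add(pos, prod)     # positive products -> 2**-p
        else:
            neg = exp_add(neg, prod)     # negative products -> 2**-q
    # exponent subtraction 2**-p - 2**-q
    if pos == neg:
        return (1, ZERO)
    if neg == ZERO or pos <= neg - 1:
        return (1, pos)
    return (-1, neg)

w = [(1, 0), (-1, 1), (1, 2)]            # hypothetical 1x3 weight row
x = [(1, 1), (1, 0), (-1, 1)]            # hypothetical pixel row
print(conv_at(w, x))                     # -> (1, 3): the sums cancel to zero
```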

In each convolution operation, each neuron 141 uses the exponent multiplier 51 to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j, obtaining m products, where the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);

where N = X if X = Y, and N is the larger of X and Y if X ≠ Y. For example, if the numbers of magnitude groups of the quantized weight values 2^-i and the quantized pixel values 2^-j are both 4 or fewer, then X = Y = 2 and N = 2, and the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ 3 and i + j ≦ 3;
2^-i × 2^-j = 2^-3, if i + j > 3 or i == 3 or j == 3.

If the quantized weight values 2^-i have 4 or fewer magnitude groups but the quantized pixel values 2^-j have between 5 and 7, then X = 2, Y = 3, and N = 3, and the exponent multiplier 51 computes each product as:

2^-i × 2^-j = 2^-(i+j), if i, j ≠ 7 and i + j ≦ 7;
2^-i × 2^-j = 2^-7, if i + j > 7 or i == 7 or j == 7.

It follows that when i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1), the exponent multiplier 51 in fact only adds the exponents i and j of the quantized weight value 2^-i and the quantized pixel value 2^-j to obtain their product, with no multiplication required; and when i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1), the exponent multiplier 51 outputs the product of the quantized weight value 2^-i and the quantized pixel value 2^-j without performing any actual computation at all.

And in each convolution operation, each neuron 141 inputs the positive ones of the m products into the exponent adder 52 to be accumulated into a positive accumulated value 2^-p, separately inputs the negative ones of the m products into the exponent adder 52 to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into the exponent subtractor 53, whose subtraction yields the feature value r.

The exponent adder 52 computes the sum of two exponents (for example two products 2^-a and 2^-b, or a product 2^-a and an accumulated value 2^-b) as:

2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1).

For example, if a = 2 and b = 3, then c = a and 2^-a + 2^-b = 2^-a; if a = b = 2 and N = 2, then 2^-a + 2^-b = 2^(-a+1) since a == b, a ≠ 3, and 2^(-a+1) < 2^0; the sum saturates to 2^0 when a == b, a ≠ 3, and 2^(-a+1) ≧ 2^0; and 2^-a + 2^-b = 2^-a when b == 3.

The exponent subtractor 53 computes the feature value r as:

r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1).

For example, if p = 1 and q = 3, then r = 2^-p - 2^-q = 2^-p; if p = q = 3 and N = 2, then r = 2^-p - 2^-q = 2^-3; if p = 2 and q = 1, then r = 2^-p - 2^-q = -2^-q; if p = 1 and q = 2, then r = 2^-p - 2^-q = 2^-p; and if q = 1 and p = 3, then r = 2^-p - 2^-q = -2^-q.
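The worked cases above can be checked numerically with a short script under the same assumptions (N = 2, the code 3 standing for zero); it also makes the approximation error of the dominant-term rules visible:

```python
# Quick numeric check of the adder/subtractor cases worked above (N = 2, so
# code 3 stands for zero). Each tuple: (inputs, code result per the formulas).
N, ZERO = 2, 3
val = lambda e: 0.0 if e == ZERO else 2.0**-e

adder_cases = [
    ((2, 3), 2),   # b == ZERO: 2**-2 + 0 = 2**-2
    ((2, 2), 1),   # a == b:    2**-2 + 2**-2 = 2**-1 (exact)
    ((1, 2), 1),   # a != b:    keep dominant term 2**-1 (exact sum is 0.75)
]
for (a, b), c in adder_cases:
    print(f"2^-{a} + 2^-{b} -> 2^-{c}  (exact {val(a) + val(b)}, approx {val(c)})")

sub_cases = [
    ((1, 3), (1, 1)),    # q == ZERO: result 2**-1
    ((3, 3), (1, ZERO)), # p == q: exact zero
    ((1, 2), (1, 1)),    # q == p+1: keep 2**-p (exact difference is 2**-2)
    ((2, 1), (-1, 1)),   # p == q+1: keep -2**-q (exact difference is -2**-2)
]
for (p, q), (s, r) in sub_cases:
    print(f"2^-{p} - 2^-{q} -> {'-' if s < 0 else '+'}2^-{r}"
          f"  (exact {val(p) - val(q)}, approx {s * val(r)})")
```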

Thus, by means of the exponent multiplier 51 (which in fact only performs additions), the exponent adder 52, and the exponent subtractor 53, each neuron 141 replaces the conventional floating-point multiplier, substituting simple additions and subtractions for multiplication. This not only lowers the computational load and speeds up the operation, but the exponent multiplier 51, exponent adder 52, and exponent subtractor 53 can also be implemented with comparatively simple logic circuits, so when the deep neural network 43 is implemented as a physical circuit, the circuitry of the neurons 141 is simplified, effectively shrinking the overall circuit size of the deep neural network 43.

Furthermore, after each neuron 141 of the first layer 14 of the deep neural network 43 completes all of its convolution operations and obtains a plurality of feature values r, it passes those feature values r through a rectified linear unit (ReLU) function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons 141 of the next connected layer 14. Each neuron 141 of the next layer 14 likewise performs convolution operations on the rectified feature values r' using its quantized weight matrix 10, the exponent multiplier 51, the exponent adder 52, and the exponent subtractor 53, then passes the feature values r produced by completing all of its convolution operations through the rectified linear function to produce corresponding rectified feature values r', which it inputs into the neurons 141 of its next connected layer 14, and so on, until the last layer 14 of the deep neural network 43 outputs its results to the output layer 12. The rectified linear function may be one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function, but is not limited to these.
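In this encoding the rectification step is also cheap. Below is a sketch of a ramp function and a Leaky ReLU whose leak slope is a power of two (our illustration, not the patent's circuit), so that scaling a negative feature is again just an exponent addition:

```python
# Sketch: ReLU variants applied to a feature value in the (sign, code)
# encoding (code ZERO = 2**N - 1 represents zero). The leak handling for
# negative inputs is our own illustration.
N, ZERO = 2, 3

def relu(sign, code):
    """Ramp function: negative features become zero, positives pass through."""
    return (1, ZERO) if sign < 0 else (sign, code)

def leaky_relu(sign, code, shift=2):
    """Leaky ReLU with a power-of-two slope 2**-shift: scaling a negative
    feature is just an exponent addition; underflow saturates to zero."""
    if sign > 0 or code == ZERO:
        return (sign, code)
    return (-1, code + shift) if code + shift < ZERO else (1, ZERO)

print(relu(-1, 1))        # -> (1, 3): clipped to zero
print(leaky_relu(-1, 0))  # -> (-1, 2): -2**0 leaked down to -2**-2
```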

And because the image data 20 input into the deep neural network 43 and the weight values in the neurons 141 have been converted from their original floating-point representation into one requiring as few as 2 bits, and the results output by the deep neural network 43 can likewise be represented with as few as 2 bits, the memory space occupied in the computer device 4 is greatly reduced.
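For a feel of the saving, four 2-bit exponent codes fit in one byte where each float32 value needs four bytes; the packing layout below is our own choice, and the separately recorded sign flags are omitted for brevity:

```python
# Sketch of the storage saving: four 2-bit exponent codes pack into one byte,
# versus 4 bytes per float32 value. Sign flags are stored separately in the
# text and are omitted here.
def pack_codes(codes):                 # codes: ints in 0..3 (N = 2)
    out = bytearray()
    for k in range(0, len(codes), 4):
        group, byte = codes[k:k + 4], 0
        for bit_pos, c in enumerate(group):
            byte |= (c & 0b11) << (2 * bit_pos)
        out.append(byte)
    return bytes(out)

codes = [0, 1, 2, 3, 1, 1]             # six quantized values
packed = pack_codes(codes)
print(len(packed), "bytes vs", 4 * len(codes), "bytes as float32")  # 2 vs 24
```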

In summary, the above embodiment normalizes in advance the weight values W in the weight matrix 10 of each neuron 141 of the deep neural network 43 and the image data 20 about to be input into the deep neural network 43, quantizes both into base-2 exponents 2^-i and 2^-j, and then feeds the quantized pixel values of the image data 20 into the deep neural network 43 for convolution with the quantized weight matrices 10 of its neurons 141. The exponent multiplier 51, exponent adder 52, and exponent subtractor 53 in each neuron 141 perform simple additions, subtractions, and comparisons on the input exponents in place of conventional floating-point multiplication, lowering computational complexity and completing convolution operations quickly. This not only raises the operation speed of the deep neural network 43 but, by replacing multipliers with simple adders, also effectively simplifies and shrinks the circuitry of a deep neural network 43 implemented in hardware. When the deep neural network 43 is implemented in software, the convolution operations likewise require only simple additions, subtractions, and comparisons on the input exponents, with no multiplication, so its operation speed is effectively improved, truly achieving the effects and purpose of the present invention.

The above, however, is merely an embodiment of the present invention and cannot be used to limit the scope of its implementation; all simple equivalent changes and modifications made according to the claims and the content of the specification of the present invention remain within the scope covered by the patent of the present invention.

Claims (10)

1. A full-exponent operation method applied to a deep neural network, the deep neural network being carried on a computer device and having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), the method comprising:
a preprocessing module of the computer device normalizing in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizing each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and representing the plural groups of quantized weight values with X bits;
the preprocessing module normalizing in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizing each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and representing the plural groups of quantized pixel values with Y bits;
the preprocessing module inputting the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix, and in each convolution operation each neuron uses an exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j to obtain m products, the exponent multiplier computing each product as:
2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);
where N = X if X = Y, and N is the larger of X and Y if X ≠ Y; and
in each convolution operation, each neuron inputting the positive ones of the m products into an exponent adder to be accumulated into a positive accumulated value 2^-p, separately inputting the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputting the positive accumulated value 2^-p and the negative accumulated value 2^-q into an exponent subtractor to obtain a feature value r by subtraction;
wherein the exponent adder computes the sum of two exponents 2^-a and 2^-b as:
2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);
and the exponent subtractor computes the feature value r as:
r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);
where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.

2. The full-exponent operation method applied to a deep neural network according to claim 1, wherein each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next layer connected to it, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

3. The full-exponent operation method applied to a deep neural network according to claim 2, wherein the rectified linear unit (ReLU) function is one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

4. A computer device, comprising:
a deep neural network having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor; and
a preprocessing module that normalizes in advance the m weight values of each neuron of the deep neural network so that the m normalized weight values fall within the range of -1 to +1, quantizes each normalized weight value into a quantized weight value expressible as an exponent 2^-i, and represents the plural groups of quantized weight values with X bits;
wherein the preprocessing module further normalizes in advance the plural pixel values of an image data to be input into the deep neural network so that the normalized pixel values fall within the range of -1 to +1, quantizes each normalized pixel value into a quantized pixel value expressible as an exponent 2^-j, and represents the plural groups of quantized pixel values with Y bits;
and wherein the preprocessing module inputs the quantized pixel values into the first layer of the hidden layer of the deep neural network, so that each neuron of the first layer performs convolution operations on the quantized pixel values with its quantized weight matrix, and in each convolution operation each neuron uses the exponent multiplier to multiply the m quantized weight values 2^-i by the overlapping portion of the quantized pixel values 2^-j to obtain m products, the exponent multiplier computing each product as:
2^-i × 2^-j = 2^-(i+j), if i, j ≠ (2^N - 1) and i + j ≦ (2^N - 1);
2^-i × 2^-j = 2^-(2^N - 1), if i + j > (2^N - 1) or i == (2^N - 1) or j == (2^N - 1);
where N = X if X = Y, and N is the larger of X and Y if X ≠ Y; and in each convolution operation each neuron inputs the positive ones of the m products into the exponent adder to be accumulated into a positive accumulated value 2^-p, inputs the negative ones of the m products into the exponent adder to be accumulated into a negative accumulated value 2^-q, and then inputs the positive accumulated value 2^-p and the negative accumulated value 2^-q into the exponent subtractor to obtain a feature value r by subtraction;
wherein the exponent adder computes the sum of two exponents 2^-a and 2^-b as:
2^-a + 2^-b = 2^-c, where c is the smaller of a and b, if a ≠ b;
2^-a + 2^-b = 2^(-a+1), if a == b and a ≠ (2^N - 1) and 2^(-a+1) < 2^0;
2^-a + 2^-b = 2^0, if a == b and a ≠ (2^N - 1) and 2^(-a+1) ≧ 2^0;
2^-a + 2^-b = 2^-a, if b == (2^N - 1);
and the exponent subtractor computes the feature value r as:
r = 2^-p - 2^-q = 2^-p, if p ≦ q-1 or q == (2^N - 1);
r = 2^-p - 2^-q = 2^-(2^N - 1), if p == q;
r = 2^-p - 2^-q = -2^-q, if p == q+1;
r = 2^-p - 2^-q = 2^-p, if q == p+1;
r = 2^-p - 2^-q = -2^-q, if q ≦ p-1 or p == (2^N - 1);
where == is the comparison operator testing whether the variable on its left equals the variable on its right, and i, j, p, q, r, a, b are positive integers.

5. The computer device according to claim 4, wherein each neuron further passes the feature values r produced by completing all of its convolution operations through a rectified linear function to produce a corresponding plurality of rectified feature values r', and inputs the rectified feature values r' into the neurons of the next layer connected to it, so that each neuron of the next layer performs convolution operations on the rectified feature values r' using its quantized weight matrix, the exponent multiplier, the exponent adder, and the exponent subtractor.

6. The computer device according to claim 4, wherein the rectified linear unit (ReLU) function is one of a ramp function, a Leaky ReLU function, a Randomized Leaky ReLU function, and a Noisy ReLU function.

7. The computer device according to any one of claims 4 to 6, wherein the deep neural network and the preprocessing module are software programs stored in a storage unit of the computer device and readable and executable by a processing unit of the computer device.

8. The computer device according to any one of claims 4 to 6, wherein the deep neural network and/or the preprocessing module are integrated in an application-specific integrated circuit chip or a programmable logic device of the computer device.

9. The computer device according to any one of claims 4 to 6, wherein the deep neural network and/or the preprocessing module are firmware burned into a microprocessor of the computer device.

10. A computer-readable recording medium storing a software program comprising a deep neural network and a preprocessing module, the deep neural network having a hidden layer composed of a plurality of layers connected one after another, each layer of the hidden layer having a plurality of neurons, and each neuron having a weight matrix composed of m weight values (m being an integer and m ≧ 1), an exponent multiplier, an exponent adder, and an exponent subtractor, wherein when the software program is loaded and executed by a computer device, the computer device completes the full-exponent operation method applied to a deep neural network according to any one of claims 1 to 3.
TW107117479A 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media TWI672643B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media
CN201810772630.9A CN110531955B (en) 2018-05-23 2018-07-13 Index calculation method for deep neural network, computer device, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media

Publications (2)

Publication Number Publication Date
TWI672643B 2019-09-21
TW202004568A TW202004568A (en) 2020-01-16

Family

ID=68619274

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107117479A TWI672643B (en) 2018-05-23 2018-05-23 Full index operation method for deep neural networks, computer devices, and computer readable recording media

Country Status (2)

Country Link
CN (1) CN110531955B (en)
TW (1) TWI672643B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021168644A1 (en) * 2020-02-25 2021-09-02 深圳市大疆创新科技有限公司 Data processing apparatus, electronic device, and data processing method
TWI743710B (en) * 2020-03-18 2021-10-21 國立中山大學 Method, electric device and computer program product for convolutional neural network
CN112199072B (en) * 2020-11-06 2023-06-02 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment based on neural network layer

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10134018A (en) * 1996-07-08 1998-05-22 Nippon Telegr & Teleph Corp <Ntt> Method and device for finding rule, storage device stored with rule finding program, method and device for neural learning, and storage medium stored with neural learning program
CN101350155A (en) * 2008-09-09 2009-01-21 无敌科技(西安)有限公司 Method and system for generating and verifying cipher through genus nerval network
JP2017049907A (en) * 2015-09-04 2017-03-09 国立研究開発法人情報通信研究機構 Neural network, learning method therefor and computer program
US10776690B2 (en) * 2015-10-08 2020-09-15 Via Alliance Semiconductor Co., Ltd. Neural network unit with plurality of selectable output functions
CN106228238B (en) * 2016-07-27 2019-03-22 中国科学技术大学苏州研究院 Accelerate the method and system of deep learning algorithm on field programmable gate array platform
US11222263B2 (en) * 2016-07-28 2022-01-11 Samsung Electronics Co., Ltd. Neural network method and apparatus
US20180053086A1 (en) * 2016-08-22 2018-02-22 Kneron Inc. Artificial neuron and controlling method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160026912A1 (en) * 2014-07-22 2016-01-28 Intel Corporation Weight-shifting mechanism for convolutional neural networks
EP3154001A2 (en) * 2015-10-08 2017-04-12 VIA Alliance Semiconductor Co., Ltd. Neural network unit with neural memory and array of neural processing units that collectively shift row of data received from neural memory
US20170103306A1 (en) * 2015-10-08 2017-04-13 Via Alliance Semiconductor Co., Ltd. Neural network unit with neural memory and array of neural processing units and sequencer that collectively shift row of data received from neural memory

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhixi Shen, Yong Liu, "A Novel Connectivity of Deep Convolutional Neural Networks," 2017 Chinese Automation Congress, 2017-10-22, https://ieeexplore.ieee.org/document/8244187 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI773398B (en) * 2020-06-25 2022-08-01 英商普立N科技有限公司 Analog hardware realization of neural networks
TWI796257B (en) * 2020-06-25 2023-03-11 英商普立N科技有限公司 Analog hardware realization of neural networks

Also Published As

Publication number Publication date
CN110531955B (en) 2023-10-10
CN110531955A (en) 2019-12-03
TW202004568A (en) 2020-01-16

Similar Documents

Publication Publication Date Title
TWI672643B (en) Full index operation method for deep neural networks, computer devices, and computer readable recording media
CN111684473B (en) Improving performance of neural network arrays
JP6977864B2 (en) Inference device, convolution operation execution method and program
US11580719B2 (en) Dynamic quantization for deep neural network inference system and method
KR20190062129A (en) Low-power hardware acceleration method and system for convolution neural network computation
TW202101302A (en) Circuit system and processing method for neural network activation function
WO2021135715A1 (en) Image compression method and apparatus
CN110109646B (en) Data processing method, data processing device, multiplier-adder and storage medium
EP3924895A1 (en) Outlier quantization for training and inference
CN110647974A (en) Network layer operation method and device in deep neural network
Adams et al. Energy-efficient approximate MAC unit
US11270196B2 (en) Multi-mode low-precision inner-product computation circuits for massively parallel neural inference engine
US20220043630A1 (en) Electronic device and control method therefor
US11907834B2 (en) Method for establishing data-recognition model
WO2018196750A1 (en) Device for processing multiplication and addition operations and method for processing multiplication and addition operations
Chen et al. Semantic attention and relative scene depth-guided network for underwater image enhancement
CN114978189A (en) Data coding method and related equipment
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
KR102153167B1 (en) Matrix operator and matrix operation method for artificial neural network
TWM570477U (en) Computer device using full exponential operation on deep neural network
Wang et al. EASNet: searching elastic and accurate network architecture for stereo matching
CN113919479B (en) Method for extracting data features and related device
Peng et al. MBFQuant: A Multiplier-Bitwidth-Fixed, Mixed-Precision Quantization Method for Mobile CNN-Based Applications
JP7247418B2 (en) Computing unit, method and computer program for multiplication
WO2022222068A1 (en) Methods and systems for multiplier sharing in neural networks