TR202014798A1

TR202014798A1 - A SCALABLE, FLEXIBLE AND ULTRA-FAST METHOD FOR THE RESOURCE-EFFICIENT REALIZATION OF ARTIFICIAL NEURAL NETWORKS

Info

Publication number: TR202014798A1
Application number: TR2020/14798A
Authority: TR
Inventors: Yeşil Soner; Şen Cansu; Kaya Altuğ
Original assignee: Aselsan Elektronik Sanayi Ve Ticaret As
Priority date: 2020-09-17
Filing date: 2020-09-17
Publication date: 2022-03-21
Also published as: WO2022060314A1

Abstract

The present invention relates to a scalable, flexible and ultra-fast method for implementing artificial neural networks with efficient resource utilization. The proposed architecture uses very few resources because the coefficient update process requires no resources beyond the multiplication and addition units. Constraining the neural network coefficients yields a highly scalable system in which the number of neurons can be increased. In addition, the proposed method makes it possible to reach high operating clock speeds in artificial neural network implementations.

Description

A SCALABLE, FLEXIBLE AND ULTRA-FAST METHOD FOR THE RESOURCE-EFFICIENT IMPLEMENTATION OF ARTIFICIAL NEURAL NETWORKS

Technical Field

The present invention relates to a scalable, flexible and ultra-fast method that implements artificial neural networks (ANN) in a resource-efficient way. In this method, the artificial neural network coefficients are constrained, and the coefficient update process uses no resources beyond the proposed multiplication and addition units.

State of the Art

An artificial neural network is a set of algorithms that attempts to recognize underlying relationships within a group of data through a process that mimics the way the human brain works. Neural networks can adapt to changing inputs, so the network produces the best possible result without the output criteria having to be redesigned. The concept of neural networks, which has its roots in artificial intelligence, is rapidly gaining popularity.

Artificial neural networks are typically organized into layers consisting of a series of interconnected nodes that contain an activation function. Samples are presented to the network via the input layer. The input layer communicates with one or more hidden layers, where the processing is done by a system of weighted connections and activation functions. The hidden layers are then connected to an output layer, which outputs the results of this process.

Neural networks require parallel, simultaneous multiplications and additions. They are therefore mostly implemented in FPGA and/or ASIC technologies. In such implementations, resource usage, operating speed and scalability are very important. With the rapid growth of artificial intelligence over the last decade, the need for neural network implementations is increasing; however, hardware resources are quite expensive and limited. Existing neural network implementations make excessive use of available resources such as memory elements, and they run at lower operating speeds. In addition, scalability is not easily achieved in current implementations.
The European patent application with publication number EP2122542B presents an architecture for a scalable artificial neural network. This structure includes an input layer, at least one hidden layer, an output layer and a parallelization subsystem. The subsystem is configured to provide a variable degree of parallelization to the input layer, the at least one hidden layer and the output layer. Providing a parallelization subsystem allows a less parallel configuration to be used if necessary, for example to match the available hardware resources or to provide sufficient performance without increasing hardware resource requirements. However, that application does not propose a very low-resource architecture in which the coefficient update process uses no resources beyond the multiplication and addition units. Nor does the document contain any information about constraining the neural network coefficients by keeping the positions of their fraction points the same.

Brief Description of the Invention

The aim of the present invention is to provide a method that implements neural networks in a resource-efficient way by constraining the ANN coefficients and by using no resources for the coefficient update process beyond the proposed multiplication and addition units. Constraining the neural network coefficients provides scalability under different system requirements. With constrained coefficients, all linear operation blocks of a general artificial neural network can be implemented with cascaded multiplication and addition units, each consisting of one two-input multiplier, one two-input adder and register elements. The proposed architecture based on these units eliminates the need for additional memory elements and provides scalability together with ultra-fast operation. The proposed method uses very few resources because the coefficient update process requires no resources beyond the multiplication and addition units.
Constraining the ANN coefficients provides a highly scalable system in which the number of neurons can be increased. Additionally, this method reaches operating clock speeds not previously seen in neural network implementations.

Brief Description of Figures

Figure 1 shows a general artificial neural network structure.
Figure 2 shows the multiplication and addition units of the present invention.
Figure 3 shows an example of a fully pipelined neuron implementation.
Figure 4 shows the coefficient shifting process in an example part of the neural network structure.
Figure 5 shows an exemplary design of the neuron implementation of the present invention.

References

HL: Hidden layer
OL: Output layer
S: Smart unit
CC: Coefficient constraint
R: Total number of registers in the pipeline
$U_0, U_1, U_2, U_3, \ldots, U_R$: Multiplication and addition units
$N_j^l$: j-th neuron of the l-th layer
$v_j^l$: Output of the j-th neuron of the l-th layer
w: Weights $w_{ij}^l$: $w_{1j}^l, w_{2j}^l, \ldots, w_{kj}^l$
x: Input signals $x_{ij}^l$: $x_{1j}^l, x_{2j}^l, \ldots, x_{kj}^l$
$\Psi$: Activation function

Detailed Description of the Invention

Let L be the number of layers in a neural network, $w_{ij}^l$ the connection weight of the l-th layer from the i-th neuron of the (l-1)-th layer to the j-th neuron of the l-th layer, and $b_j^l$ the bias of the j-th neuron of the l-th layer. Let $x_{ij}^l$ be the input signal from the i-th neuron of the (l-1)-th layer to the j-th neuron of the l-th layer, $\Psi$ the activation function of a neuron, and $v_j^l$ the output signal of the j-th neuron of the l-th layer, as given by the expression at the end of this paragraph. Layers consist of a series of interconnected nodes that contain an activation function. Samples are presented to the network via the input layer. The input layer communicates with one or more hidden layers, in which artificial neurons receive a set of weighted inputs and produce an output through the activation function. These hidden layers are then connected to the output layer to complete the neural network processing.

$$v_j^l = \Psi\left(\sum_{i=1}^{k} x_{ij}^l \, w_{ij}^l + b_j^l\right)$$

where k is the number of neurons in the (l-1)-th layer.
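As a plain software illustration of the neuron expression above, the following minimal Python sketch computes one neuron output. The function name `neuron_output` and the default `math.tanh` activation are assumptions made for the example; the description does not name a specific activation function.

```python
import math


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Compute v = Psi(sum_i x_i * w_i + b) for a single neuron.

    `activation` stands in for Psi; tanh is only an illustrative choice.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    accumulator = bias  # the bias enters the sum once
    for x, w in zip(inputs, weights):
        accumulator += x * w  # one multiply and one add per connection
    return activation(accumulator)


# Identity activation makes the weighted sum itself visible.
weighted_sum = neuron_output([1.0, 2.0], [0.5, -0.25], 0.5, activation=lambda s: s)
```

Each loop iteration corresponds to one multiplication and one addition, which is the operation the multiplication and addition units described later perform in hardware.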
Because of the parallel processing structure of artificial neural networks, the coefficients normally have to be stored in several memory elements that must operate at the same time. Implementations of such structures use hardware resources excessively and have a low operating clock speed.

In the proposed method, the ANN coefficients are fed from a smart unit (S). During the adaptation process, these coefficients are constrained to have the same fixed-point format and are assigned to the multiplication and addition units within the neurons. The length of the coefficients and the position of the fraction point do not change throughout the design. Keeping the position of the fraction point of the coefficients constant during the training phase prevents the coefficients from growing uncontrollably and makes a reliable and automatic neural network implementation possible. This constraint contributes greatly to scalability. Additionally, the fixed-point format of all input signals is assumed to remain unchanged within the system. The smart unit (S) puts all coefficients into the same fixed-point format. For example, when all coefficients are 18 bits long, the first 3 bits of each lie before the fraction point and the remaining 15 bits lie after it.

To implement the neural network, the proposed method uses multiplication and addition units that contain addition, multiplication and delay (register) elements. Let R be the total number of pipeline registers located after the multiplication in the proposed units, as shown in Figure 2. R is a design parameter chosen to achieve the required operating clock frequency.
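The 18-bit example above (3 bits before the fraction point, 15 after) can be sketched in Python as follows. This is only an illustration under stated assumptions: the description does not say whether the format is signed or how out-of-range values are handled, so two's complement with the sign bit counted among the 3 leading bits, round-to-nearest and saturation are all assumptions, as are the names `to_fixed` and `to_float`.

```python
TOTAL_BITS = 18  # coefficient length from the example in the text
FRAC_BITS = 15   # bits after the fraction point, from the example in the text
SCALE = 1 << FRAC_BITS

# Assumed: signed two's complement, sign bit counted among the 3 leading bits.
MAX_RAW = (1 << (TOTAL_BITS - 1)) - 1
MIN_RAW = -(1 << (TOTAL_BITS - 1))


def to_fixed(value):
    """Quantize a real value to the shared fixed-point format.

    Saturation on overflow is an assumption; it keeps every coefficient in
    the same format so the fraction point never moves.
    """
    raw = round(value * SCALE)
    return max(MIN_RAW, min(MAX_RAW, raw))


def to_float(raw):
    """Interpret a raw integer as a value in the shared format."""
    return raw / SCALE
```

Under these assumptions the representable range is from -4 up to just below 4 in steps of 2^-15, and every weight and bias shares that range, which is what lets identical units be chained without per-coefficient format handling.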
The proposed method for the resource-efficient implementation of artificial neural networks includes the following process steps:

- Equalizing the positions of the fraction points with a smart unit (S) in order to constrain the coefficients of the artificial neural network,
- Feeding the coefficients into one port of the multiplication element of the multiplication and addition units in the neurons,
- Feeding the input signals into the other port of the multiplication element of the multiplication and addition units in the neurons,
- Feeding the bias signal into the addition element of the first multiplication and addition unit in the neurons,
- Feeding the output of each multiplication and addition unit into the addition element of the next unit, and repeating this feeding process up to the last unit in the neuron,
- Giving the output of the last multiplication and addition unit in the neurons as input to the activation function,
- Giving the activation function outputs of the neurons as the result of each neuron.

The neural network bias vector is fed in through the addition elements, while the neural network weights and the input signals are fed in through the multiplication elements. The differences between the types of multiplication and addition units arise from the pipelining option of the design. If no pipeline registers are chosen, only block 0 ($U_0$) can be used in the neurons. If pipeline registers are chosen, each successive block must contain more registers (delay elements) than the previous one. The delay elements and the pipelining option make it possible to reach operating clock speeds higher than those seen in implementations in the literature. In addition, the multiplication and addition units do not have to be used in order starting from block 0 ($U_0$); the first block of each neuron can be selected according to the speed requirements. The output of the first multiplication and addition unit is fed into the addition element of the second unit.
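The cascade described above can be sketched as a small cycle-by-cycle simulation in Python. This is one possible reading, not the circuit of Figures 2 and 3, which are not reproduced here: it assumes that unit k delays its product by k registers and that every adder output is registered. The names `MacUnit` and `run_neuron` are hypothetical, and the activation function is left out so the pipelined sums stay visible.

```python
from collections import deque


class MacUnit:
    """One multiply-add unit: a multiplier, an adder and delay registers."""

    def __init__(self, weight, product_delay):
        self.weight = weight
        # Assumed: product_delay registers sit between multiplier and adder.
        self.delay_line = deque([0] * product_delay)
        self.out_reg = 0  # assumed: the adder output is registered

    def clock(self, x, add_in):
        # Push the new product in and take the delayed product out.
        self.delay_line.append(self.weight * x)
        product = self.delay_line.popleft()
        self.out_reg = product + add_in


def run_neuron(weights, bias, samples):
    """Return the last unit's registered output after every clock cycle."""
    units = [MacUnit(w, k) for k, w in enumerate(weights)]
    # Extra zero samples flush the pipeline after the real inputs.
    flush = [[0] * len(weights)] * (len(weights) - 1)
    outputs = []
    for xs in list(samples) + flush:
        # Read the registered outputs as they were before this clock edge.
        previous = [u.out_reg for u in units]
        for k, (unit, x) in enumerate(zip(units, xs)):
            # The bias enters the first adder; later adders take the
            # previous unit's registered output.
            unit.clock(x, bias if k == 0 else previous[k - 1])
        outputs.append(units[-1].out_reg)
    return outputs


# Two back-to-back samples through a three-unit neuron.
results = run_neuron([2, 3, 4], bias=1, samples=[[1, 1, 1], [2, 0, 1]])
```

In this sketch the weighted sum of the first sample appears after three clock cycles and the second sample follows one cycle later, which illustrates why the growing register count per block lets a new sample enter on every clock.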
The output of the second multiplication and addition unit is fed into the addition element of the third unit, and this process continues up to the last unit in the neuron. The proposed multiplication and addition units have 3 inputs, as shown in the figures.

In the ANN coefficient update procedure, the coefficients are shifted into the relevant multiplication and addition units, one step per clock cycle, by the delay elements shown in Figure 3. When the coefficient update procedure ends, the coefficients no longer shift and remain in their assigned blocks. The input signals are fed into the multiplication elements of the blocks, as seen in the figures; they are connected directly to the multiplier. The bias vector is constrained by the smart unit (S) and fed into the addition elements of the blocks.

Constrained neural network coefficients (weights and bias signals) provide a highly scalable system in which the number of neurons can be increased without compromising the automatic implementation environment. This constraint allows the coefficients to be shifted continuously into the multiplication and addition units by delay elements (registers). In the artificial neural network architecture, not only the coefficients within one neuron but the coefficients of all neurons are shifted continuously by delay elements and placed in the relevant blocks. The coefficients are shifted from the last multiplication and addition unit of one neuron to the first multiplication and addition unit of the next neuron, as shown in Figure 4.

Eliminating the need for additional memory elements makes the proposed method superior, in terms of efficient resource use, to the existing designs in the literature. Additionally, this design is not tailored to a specific artificial neural network application. The proposed design method is well suited to any application, and an example of a specific design implementation is given in Figure 5.
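The coefficient shifting described above behaves like one long shift register that runs through every multiplication and addition unit of every neuron. The Python sketch below models that behavior under stated assumptions: coefficients enter at the first unit of the first neuron, one per clock cycle, and the stream is sent in reverse order so that each coefficient stops in its own unit. The function names `shift_in` and `load_network` are hypothetical.

```python
def shift_in(chain_length, stream):
    """Shift a coefficient stream through a register chain, one per clock."""
    registers = [0] * chain_length
    for coefficient in stream:
        # Every register takes its neighbour's value; the new value enters
        # at position 0 and the oldest value falls off the end.
        registers = [coefficient] + registers[:-1]
    return registers


def load_network(weights_per_neuron):
    """Load all neurons through a single chain that spans neuron boundaries."""
    flat = [w for neuron in weights_per_neuron for w in neuron]
    # Assumed ordering: the coefficient for the last unit is sent first.
    registers = shift_in(len(flat), reversed(flat))
    loaded, start = [], 0
    for neuron in weights_per_neuron:
        loaded.append(registers[start:start + len(neuron)])
        start += len(neuron)
    return loaded
```

After as many clock cycles as there are coefficients, shifting stops and every unit holds its own value, so no separate coefficient memory is needed in this model.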

Claims

1. A method for the resource-efficient implementation of artificial neural networks, characterized in that it comprises the process steps of:
- Equalizing the positions of the fraction points with a smart unit (S) in order to constrain the coefficients of the artificial neural network,
- Feeding the coefficients into one port of the multiplication element of the multiplication and addition units in the neurons,
- Feeding the input signals into the other port of the multiplication element of the multiplication and addition units in the neurons,
- Feeding the bias signal into the addition element of the first multiplication and addition unit in the neurons,
- Feeding the output of each multiplication and addition unit into the addition element of the next unit, and repeating this feeding process up to the last unit in the neuron,
- Giving the output of the last multiplication and addition unit in the neurons as input to the activation function,
- Giving the activation function outputs of the neurons as the result of each neuron.

2. A method according to claim 1, characterized in that all linear operation blocks of a general artificial neural network are implemented by cascading multiplication and addition units consisting of one two-input multiplier, one two-input adder and register elements.

3. A method according to claim 1, characterized in that each step can be applied in cases where the number of neurons increases.

4. A method according to the preceding claims, characterized in that, in the process of updating the coefficients, the coefficients are shifted to the relevant multiplication and addition units in all neurons in the network structure.

5. A method according to the preceding claims, characterized in that the artificial neural network design is managed by a finite state machine.

6. A method according to claim 5, characterized in that, when the coefficient update occurs, the coefficients are shifted to the relevant multiplication and addition units one by one in each clock cycle by the delay elements, and the shifting process then stops.

7. A method according to claim 5, characterized in that, when the coefficients are updated, the coefficients are shifted to the relevant multiplication and addition units one by one in each clock cycle by the delay elements and then remain in the units to which they are assigned.

8. A method according to the preceding claims, characterized in that the fraction point positions of the fixed-point coefficients are constrained to be the same for all neurons.