WO2018058427A1 - Neural network computing apparatus and method - Google Patents

Neural network computing apparatus and method

Info

Publication number
WO2018058427A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
data
unit
sparse
network data
Prior art date
Application number
PCT/CN2016/100784
Other languages
English (en)
Chinese (zh)
Inventor
陈天石
刘少礼
陈云霁
Original Assignee
北京中科寒武纪科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京中科寒武纪科技有限公司 filed Critical 北京中科寒武纪科技有限公司
Priority to PCT/CN2016/100784 priority Critical patent/WO2018058427A1/fr
Publication of WO2018058427A1 publication Critical patent/WO2018058427A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Definitions

  • the present invention relates to the field of information technology, and in particular, to a neural network operation device and method compatible with general neural network data, sparse neural network data, and discrete neural network data.
  • Artificial neural networks (ANNs), also referred to simply as neural networks (NNs)
  • This kind of network relies on the complexity of the system to adjust the relationship between a large number of internal nodes to achieve the purpose of processing information.
  • neural networks have made great progress in many fields such as intelligent control and machine learning. With the continuous development of deep learning technology, the current model of neural network is getting larger and larger, and the computing performance and memory bandwidth requirements are getting higher and higher.
  • existing neural network computing platforms (CPUs, GPUs, traditional neural network accelerators) can no longer meet users' needs.
  • sparse neural network data and discrete neural network data are developed based on the general neural network data.
  • current neural network computing platforms need a separate processing module for each type of neural network data, which strains computing resources and causes associated problems such as insufficient memory bandwidth and high power consumption.
  • the present invention provides a neural network computing device and method for improving the degree of multiplexing of neural network data processing and saving computing resources.
  • a neural network computing device includes a control unit, a storage unit, a sparse selection unit, and a neural network operation unit, wherein: the storage unit is configured to store neural network data; the control unit is configured to generate microinstructions corresponding respectively to the sparse selection unit and the neural network operation unit, and to send each microinstruction to the corresponding unit; the sparse selection unit is configured to select, according to the microinstruction delivered to it by the control unit and the location information carried by the sparse data representation, the neural network data corresponding to the effective weights from the neural network data stored in the storage unit, so that it participates in the operation; and the neural network operation unit is configured to perform, according to the microinstruction delivered to it by the control unit, a neural network operation on the neural network data selected by the sparse selection unit, to obtain an operation result.
  • the neural network data processing method comprises: step D, the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each sub-network containing only one real value, with all remaining weights being 0; step E, the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data and obtain an operation result for each; and step F, the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data, and the neural network data processing ends.
  • the neural network data processing method includes: step A, the data type judging unit judges the type of the neural network data: if it is sparse neural network data, step B is performed; if it is discrete neural network data, step D is performed; if it is general neural network data, step G is performed; step B, the sparse selection unit selects, in the storage unit, the neural network data corresponding to the effective weights according to the location information carried by the sparse data representation; step C, the neural network operation unit performs a neural network operation on the neural network data acquired by the sparse selection unit and obtains the operation result of the sparse neural network data, and the neural network data processing ends; step D, the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each sub-network containing only one real value, with all remaining weights being 0;
  • step E, the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data and obtain an operation result for each;
  • step F, the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data, and the neural network data processing ends;
  • step G, the neural network operation unit performs a neural network operation on the general neural network data and obtains the operation result, and the neural network data processing ends.
  • the device can determine whether data are interdependent. For example, the input data of the next calculation may be the output of the previous calculation. Without a dependency processing module, the next calculation would start without waiting for the previous one to finish, and the calculation result would be incorrect.
  • the dependency processing unit determines the data dependency relationship so that the device waits for the required data before performing the next calculation, thereby ensuring both the correctness and the efficiency of the device's operation.
  • FIG. 1 is a schematic structural diagram of a neural network computing device according to a first embodiment of the present invention
  • FIG. 2 is a schematic diagram of sparse neural network weight model data
  • FIG. 5 is a schematic structural diagram of a neural network computing device according to a second embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a neural network computing device according to a third embodiment of the present invention.
  • FIG. 7 is a flowchart of a neural network data processing method according to a fourth embodiment of the present invention.
  • FIG. 8 is a flowchart of a neural network data processing method according to a fifth embodiment of the present invention.
  • FIG. 9 is a flowchart of a neural network data processing method according to a sixth embodiment of the present invention.
  • FIG. 10 is a flowchart of a neural network data processing method according to a seventh embodiment of the present invention.
  • general-purpose neural network data refers to general-purpose computer data, that is, data types commonly used in computers, such as 32-bit floating-point data, 16-bit floating-point data, 32-bit fixed-point data, and the like.
  • discrete neural network data is computer data in which part or all of the data is represented by discrete data. Unlike the 32-bit and 16-bit floating-point representations used for general neural network data, discrete neural network data means that all data involved in the operation takes values only from a fixed set of discrete real numbers.
  • the data in the neural network includes input data and neural network model data. It includes the following types:
  • when the input data and the neural network model data both consist entirely of these real numbers, the representation is called the all-discrete data representation
  • the discrete data representation in the present invention refers to the three discrete data representations described above.
  • for example, the input data may be original general neural network data, such as RGB image data,
  • while the neural network model data is represented by discrete data:
  • if the weight data of a certain layer takes only the two values -1 and +1, the network is a neural network represented by discrete neural network data.
  • the sparse neural network data is: data that is discontinuous in position, specifically including data and data location information.
  • the model data of a neural network is sparse.
  • each bit of the 01-bit string indicates whether the model data at the corresponding position is valid: 0 means the data at that position is invalid, that is, sparsified away, and 1 means the data at that position is valid.
  • the data information stored in this way and the position information stored in the 01-bit string together constitute our sparse neural network data, which is also called sparse data representation in the present invention.
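The sparse data representation described above (a 01-bit string plus the stored valid data) can be sketched as follows. This is a minimal illustration only: the function names `to_sparse` and `to_dense` are hypothetical, and treating a weight of exactly 0 as "invalid" is an assumption made for the example, not something the text specifies.

```python
from typing import List, Tuple


def to_sparse(weights: List[float]) -> Tuple[str, List[float]]:
    """Split dense model data into a 01-bit string and the valid values.

    Assumption for this sketch: a weight equal to 0 is 'invalid'.
    """
    bits = "".join("1" if w != 0 else "0" for w in weights)
    values = [w for w in weights if w != 0]
    return bits, values


def to_dense(bits: str, values: List[float]) -> List[float]:
    """Rebuild the dense model data from the bit string and valid values."""
    remaining = iter(values)
    return [next(remaining) if b == "1" else 0.0 for b in bits]


dense = [0.5, 0.0, 0.0, -1.25, 0.0, 2.0]
bits, values = to_sparse(dense)
print(bits)    # 100101
print(values)  # [0.5, -1.25, 2.0]
```

The bit string has one bit per weight position, while the value list holds only the valid entries, so the pair carries the same information as the dense list.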
  • the neural network computing device and method provided by the present invention support sparse neural network data and neural network operations of discrete neural network data by multiplexing sparse selection units.
  • the invention can be applied to the following scenarios (including but not limited to): data processing; electronic products such as robots, computers, printers, scanners, telephones, tablets, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, camcorders, projectors, watches, earphones, mobile storage, and wearable devices; transportation such as aircraft, ships, and vehicles; household appliances such as televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lights, gas stoves, and range hoods; and medical equipment such as nuclear magnetic resonance instruments, B-mode ultrasound scanners, and electrocardiographs.
  • a neural network computing device in a first exemplary embodiment of the present invention, includes: a control unit 100, a storage unit 200, a sparse selection unit 300, and a neural network operation unit 400.
  • the storage unit 200 is configured to store neural network data.
  • the control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and send the microinstructions to the corresponding units.
  • the sparse selection unit 300 is configured to select, according to the microinstruction delivered to it by the control unit and the location information carried by the sparse data representation, the neural network data corresponding to the effective weights from the neural network data stored in the storage unit, so that it participates in the operation.
  • the neural network operation unit 400 is configured to perform a neural network operation on the neural network data selected by the sparse selection unit according to the micro-instruction of the corresponding neural network operation unit delivered by the control unit, to obtain an operation result.
  • the storage unit 200 is configured to store three types of neural network data - general neural network data, sparse neural network data, and discrete neural network data.
  • the storage unit may be a scratchpad memory and can support data of different sizes.
  • the present invention temporarily stores the necessary calculation data in the scratchpad memory, so that the computing device can support data of different sizes more flexibly and efficiently while performing neural network operations.
  • the storage unit can be implemented with a variety of memory devices (SRAM, eDRAM, DRAM, memristor, 3D-DRAM, non-volatile memory, etc.).
  • the control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and send the microinstructions to the corresponding units.
  • the control unit can support a plurality of different types of neural network algorithms, including but not limited to CNN/DNN/DBN/MLP/RNN/LSTM/SOM/RCNN/FastRCNN/Faster-RCNN.
  • the sparse selection unit 300 is configured to select, according to the location information of the sparse neural network data, the neurons corresponding to the effective weights to participate in the operation. Discrete neural network data is also processed through the sparse selection unit.
  • the neural network operation unit 400 is configured to acquire input data from the storage unit according to the microinstruction generated by the control unit, perform a general neural network operation, a sparse neural network operation, or a neural network operation on a discrete data representation to obtain the operation result, and store the operation result in the storage unit.
  • the sparse selection unit can process both the sparse data representation and the discrete data representation. Specifically, according to the location information of the sparse data, that is, the 01-bit string, the sparse selection unit selects the input data of the neural network layer at the corresponding positions and sends it to the neural network operation unit. In the 01-bit string, each bit corresponds to one weight in the neural network model: 0 indicates that the corresponding weight is invalid, that is, absent, and 1 indicates that it is valid, that is, present. The data portion of the sparse data representation stores only the valid data. For example, consider the sparse neural network weight model data shown in FIG. 2.
  • the sparse selection module sends the effective weight portion of the sparse data representation directly to the neural network operation unit, and then, according to the 01-bit string, selects the input neuron data at the positions whose bits are 1 and sends it to the neural network operation unit.
  • in FIG. 2, the sparse selection module sends the input neurons corresponding to weight positions 1/2/3/7/9/11/15 to the neural network operation unit (these numbers are the left-to-right positions of the 1 bits in the position information of FIG. 2, equivalent to array indices).
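The selection step described above can be sketched in a few lines. The valid positions 1/2/3/7/9/11/15 come from the text; everything else is an assumption made for illustration: the positions are read as 1-based, the bit string is given a length of 16, the input values and weights are made up, and the names `select_inputs` and `sparse_dot` are hypothetical.

```python
from typing import List, Sequence


def select_inputs(bits: str, inputs: Sequence[float]) -> List[float]:
    """Keep only the inputs whose position is marked 1 in the bit string."""
    if len(bits) != len(inputs):
        raise ValueError("bit string and inputs must have the same length")
    return [x for b, x in zip(bits, inputs) if b == "1"]


def sparse_dot(bits: str, values: Sequence[float],
               inputs: Sequence[float]) -> float:
    """Multiply the stored valid weights by the selected inputs and sum."""
    selected = select_inputs(bits, inputs)
    if len(selected) != len(values):
        raise ValueError("number of valid weights must match the 1 bits")
    return sum(w * x for w, x in zip(values, selected))


# Assumed layout: 16 weight positions, counted from 1.
valid_positions = [1, 2, 3, 7, 9, 11, 15]
bits = "".join("1" if i in valid_positions else "0" for i in range(1, 17))
inputs = [float(i) for i in range(1, 17)]  # made-up input neurons
weights = [1.0] * len(valid_positions)     # made-up valid weights

print(bits)                               # 1110001010100010
print(select_inputs(bits, inputs))        # [1.0, 2.0, 3.0, 7.0, 9.0, 11.0, 15.0]
print(sparse_dot(bits, weights, inputs))  # 48.0
```

Only seven of the sixteen inputs reach the multiply-accumulate step, which is the reduction in data movement that the selection is meant to provide.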
  • a major feature of the present invention is the reuse of the sparse selection module 300 in discrete neural network data.
  • for discrete neural network data, each of the several real values is treated as a separate set of sparse neural network data for the operation.
  • the neural network model is split into N sub-networks.
  • the subnetwork is the same size as the original discrete neural network data.
  • Each subnetwork contains only one real number, and the remaining weights are all 0, so each subnetwork is similar to the sparse representation described above.
  • the only difference from the sparse data described above is that, after the neural network operation unit has processed the sub-networks, an external command is needed to make the neural network operation unit sum the sub-network results to obtain the final result.
  • the neural network computing device further includes: a discrete neural network data splitting unit 500.
  • the discrete neural network data splitting unit 500 is configured to:
  • the neural network model of the discrete neural network model data is divided into N sub-networks, each sub-network contains only one real number, and the remaining weights are all 0;
  • the sparse selection unit 300 and the neural network operation unit 400 process each sub-network according to the sparse neural network data, and respectively obtain the operation result.
  • the neural network operation unit is further configured to sum the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data.
  • the weight data of a certain layer in the neural network model data is represented by discrete data.
  • the four sub-networks are prepared externally, and the sparse selection module reads the four sub-networks in turn. Subsequent processing is the same as for sparse neural network data: the input data corresponding to the weight position information is selected and sent to the operation unit.
  • the only difference is that, after the operation unit has finished, an external command is needed to make the neural network operation unit sum the operation results of the four sub-networks.
  • the sparse selection module selects the input data twice: the input data corresponding to the position information of weight 1 and of weight -1 is sent to the operation unit in turn. Similarly, an external command is required to make the operation unit sum the outputs of the two sub-networks.
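The splitting of discrete weight data into sub-networks, followed by the final summation, can be sketched as below for the -1/+1 case mentioned in the text. The function names `split_discrete` and `discrete_dot`, the example weights, and the example inputs are hypothetical. Multiplying each sub-network's single value by the sum of its selected inputs is a simplification chosen for this sketch, not a statement of how the operation unit is built.

```python
from typing import Dict, Sequence


def split_discrete(weights: Sequence[float]) -> Dict[float, str]:
    """Map each distinct weight value to the 01-bit string of its sub-network.

    Each sub-network keeps one real value; every other position is 0.
    """
    return {
        value: "".join("1" if w == value else "0" for w in weights)
        for value in sorted(set(weights))
    }


def discrete_dot(weights: Sequence[float], inputs: Sequence[float]) -> float:
    """Process each sub-network like sparse data, then sum the results."""
    total = 0.0
    for value, bits in split_discrete(weights).items():
        selected = [x for b, x in zip(bits, inputs) if b == "1"]
        total += value * sum(selected)  # result of one sub-network
    return total


weights = [1.0, -1.0, -1.0, 1.0]  # made-up layer with only -1/+1 weights
inputs = [0.5, 2.0, 1.0, 4.0]     # made-up input data

print(split_discrete(weights))        # {-1.0: '0110', 1.0: '1001'}
print(discrete_dot(weights, inputs))  # 1.5
```

The result matches the ordinary dense computation 0.5 - 2.0 - 1.0 + 4.0 = 1.5, which is the check that the per-sub-network results were summed correctly.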
  • a neural network computing device is provided. As shown in FIG. 5, the neural network computing device of the present embodiment differs from the first embodiment in that the data type determining unit 600 is added to determine the type of the neural network data.
  • the neural network data type of the present invention is specified in the instruction.
  • the control unit controls the operation mode of the sparse selection unit and the operation unit according to the output of the data type judging unit:
  • for sparse neural network data, the sparse selection module selects the corresponding input data according to the location information and sends it to the neural network operation unit;
  • that is, the sparse selection unit selects, in the storage unit, the neurons corresponding to the effective weights to participate in the operation, according to the location information carried by the sparse data representation;
  • the neural network operation unit performs a neural network operation on the neural network data acquired by the sparse selection unit and obtains an operation result.
  • for discrete neural network data, the sparse selection module selects the corresponding input data according to the position information and sends it to the operation unit, and the operation unit sums the calculation results according to an external operation instruction;
  • that is, the discrete neural network data splitting unit operates and splits the neural network model of the discrete neural network data into N sub-networks;
  • the sparse selection unit and the neural network operation unit operate and process each sub-network as sparse neural network data, obtaining an operation result for each;
  • the neural network operation unit then sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data.
  • for general neural network data, the sparse selection module does not operate and no selection is made based on location information.
  • that is, the sparse selection unit is disabled, and the neural network operation unit performs a neural network operation on the general neural network data to obtain an operation result.
  • a neural network computing device is provided. Compared with the second embodiment, the neural network computing device of the present embodiment differs in that a dependency processing function is added to the control unit.
  • the control unit 100 includes: an instruction cache module 110, configured to store the neural network instructions to be executed, where each neural network instruction includes address information of the neural network data to be processed;
  • an instruction fetch module 120, configured to fetch a neural network instruction from the instruction cache module; a decoding module 130, configured to decode the neural network instruction to obtain microinstructions corresponding respectively to the storage unit, the sparse selection unit, and the neural network operation unit,
  • where each microinstruction includes address information of the corresponding neural network data; an instruction queue 140, configured to store the decoded microinstructions; a scalar register file 150, configured to store address information of the neural network data to be processed;
  • and a dependency processing module 160, configured to determine whether a microinstruction in the instruction queue accesses the same data as a preceding microinstruction. If so, the microinstruction is stored in a storage queue and, after the preceding microinstruction has finished executing,
  • is transmitted from the storage queue to the corresponding unit; otherwise, the microinstruction is transmitted directly to the corresponding unit.
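The dependency check described above can be illustrated with a coarse software model. This is only a sketch under stated assumptions: the `MicroInstruction` class, the `issue_waves` function, the instruction names, and the addresses are all hypothetical, and grouping microinstructions into "waves" stands in for the hardware's storage queue and timing, which the text does not detail. Any shared address is treated as a dependency, following the "access the same data" wording.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class MicroInstruction:
    name: str
    addresses: FrozenSet[int]  # data addresses the microinstruction accesses


def issue_waves(instructions: List[MicroInstruction]) -> List[List[str]]:
    """Group microinstructions into waves that may issue together.

    A microinstruction that shares data with an earlier one is held back
    until the wave after that earlier microinstruction.
    """
    levels: List[int] = []
    waves: List[List[str]] = []
    for i, instr in enumerate(instructions):
        blocking = [levels[j] for j in range(i)
                    if instr.addresses & instructions[j].addresses]
        level = max(blocking) + 1 if blocking else 0
        levels.append(level)
        while len(waves) <= level:
            waves.append([])
        waves[level].append(instr.name)
    return waves


program = [
    MicroInstruction("load_a", frozenset({0x100})),
    MicroInstruction("layer1", frozenset({0x100, 0x200})),  # uses load_a data
    MicroInstruction("load_b", frozenset({0x300})),         # independent
    MicroInstruction("layer2", frozenset({0x200, 0x300})),  # uses both
]
print(issue_waves(program))
# [['load_a', 'load_b'], ['layer1'], ['layer2']]
```

In this model `load_b` is not delayed because it touches no data used by earlier microinstructions, while `layer2` waits for both `layer1` and `load_b`.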
  • the instruction cache module is configured to store the neural network instructions to be executed. An instruction remains cached in the instruction cache module while it is being executed. When an instruction has finished executing, if it is also the earliest uncommitted instruction in the instruction cache module, it is committed. Once an instruction is committed, the changes its operation made to the device state can no longer be undone.
  • the instruction cache module can be a reordering cache.
  • the neural network computing device of this embodiment further includes: an input and output unit for storing data in the storage unit, or acquiring a neural network operation result from the storage unit.
  • the direct storage unit is responsible for reading data from or writing data to the memory.
  • the present invention also provides a processing method for general neural network data (a general neural network is one whose data uses neither the discrete data representation nor the sparse representation), used to perform a general neural network operation according to an operation instruction. As shown in FIG. 7, the processing method of the general neural network data in this embodiment includes:
  • Step S701 the instruction fetch module extracts the neural network instruction from the instruction cache module, and sends the neural network instruction to the decoding module.
  • Step S702 the decoding module decodes the neural network instruction, and obtains micro-instructions corresponding to the storage unit, the sparse selection unit, and the neural network operation unit respectively, and sends each micro-instruction to the instruction queue;
  • Step S703 the operation code and operands of the neural network operation of the microinstruction are obtained from the scalar register file, and the microinstruction is then sent to the dependency processing unit;
  • Step S704 the dependency processing unit analyzes whether the microinstruction has a data dependency on an earlier microinstruction that has not finished executing. If so, the microinstruction waits in the storage queue until it no longer has a data dependency on any earlier unfinished microinstruction, after which it is sent to the neural network operation unit and the storage unit;
  • Step S705 the neural network operation unit extracts the required data (including input data, neural network model data, etc.) from the scratchpad memory according to the address and size of the required data.
  • Step S706 then completing the neural network operation corresponding to the operation instruction in the neural network operation unit, and writing the result obtained by the neural network operation back to the storage unit.
  • the present invention also provides a sparse neural network data processing method for performing a sparse neural network operation according to an operation instruction.
  • the processing method of the sparse neural network data in this embodiment includes:
  • Step S801 the instruction fetch module fetches the neural network instruction from the instruction cache module and sends the neural network instruction to the decoding module;
  • Step S802 the decoding module decodes the neural network instruction, obtains micro-instructions respectively corresponding to the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue;
  • Step S803 the operation code and operands of the neural network operation of the microinstruction are obtained from the scalar register file, and the microinstruction is then sent to the dependency processing unit;
  • Step S804 the dependency processing unit analyzes whether the microinstruction has a data dependency on an earlier microinstruction that has not finished executing. If so, the microinstruction waits in the storage queue until it no longer has a data dependency on any earlier unfinished microinstruction, after which it is sent to the neural network operation unit and the storage unit;
  • Step S805 the operation unit fetches the required data (including input data, neural network model data, and the sparse representation data of the neural network) from the scratchpad memory according to the address and size of the required data, and the sparse selection module then selects, according to the sparse representation, the input data corresponding to the effective neural network weight data.
  • for example, the input data uses the general data representation and the neural network model data uses the sparse representation. The sparse selection module selects the input data corresponding to the weights according to the 01-bit string of the neural network model data. The length of the 01-bit string equals the length of the neural network model data, as shown in FIG. 2: where a bit is 1, the input data corresponding to the weight at that position is fed into the device; otherwise, the input data corresponding to that weight is not fed in.
  • Step S806 the neural network operation corresponding to the operation instruction is completed in the operation unit (because the input data corresponding to the sparse weight data has already been selected in step S805, the calculation process is the same as step S106 in FIG. 3: the inputs are multiplied by the weights and summed, the offset is added, and the excitation is finally applied), and the result of the neural network operation is written back to the storage unit.
  • the present invention also provides a discrete neural network data processing method for performing a neural network operation of discrete data representation according to an operation instruction.
  • the data processing method for the discrete neural network in this embodiment includes:
  • Step S901 the instruction fetch module fetches the neural network instruction from the instruction cache module and sends the neural network instruction to the decoding module;
  • Step S902 the decoding module decodes the neural network instruction, obtains micro-instructions corresponding to the storage unit, the sparse selection unit, and the neural network operation unit, respectively, and sends each micro-instruction to the instruction queue;
  • Step S903 the operation code and operands of the neural network operation of the microinstruction are obtained from the scalar register file, and the microinstruction is then sent to the dependency processing unit;
  • Step S904 the dependency processing unit analyzes whether the microinstruction has a data dependency on an earlier microinstruction that has not finished executing. If so, the microinstruction waits in the storage queue until it no longer has a data dependency on any earlier unfinished microinstruction, after which it is sent to the neural network operation unit and the storage unit;
  • Step S905 the operation unit fetches the required data from the cache memory according to the address and size of the required data (including the input data and the model data of the multiple sub-networks described above, where each sub-network contains only one discretely represented weight value, together with the sparse representation of each sub-network). The sparse selection module then selects the input data corresponding to the effective weight data of each sub-network according to that sub-network's sparse representation.
  • the discrete data is stored, for example, as shown in FIG. 3 and FIG. 4. The sparse selection module operates as described above: according to the position information given by the 01-bit string of the sparse representation, it selects the corresponding input data and fetches it from the scratchpad memory into the device.
  • Step S906 the operation of each sub-network corresponding to the operation instruction is completed in the operation unit. This process is similar to the calculation process described above. The only difference, like the difference between FIG. 2 and FIG. 3, is that under the sparse representation the model data has only one sparse representation, whereas under the discrete data representation one set of model data may produce multiple sub-models, so the operation results of all sub-models must be accumulated. The operation results of the sub-networks are summed, and the final operation result is written back to the storage unit.
  • the present invention also provides a neural network data processing method.
  • the neural network data processing method of this embodiment includes:
  • Step A the data type judging unit judges the type of the neural network data: if the neural network data is sparse neural network data, step B is performed; if it is discrete neural network data, step D is performed; if it is general neural network data, step G is performed;
  • Step B The sparse selection unit selects neural network data corresponding to the effective weight in the storage unit according to the location information represented by the sparse data;
  • Step C The neural network operation unit performs a neural network operation on the neural network data acquired by the sparse selection unit, and obtains an operation result of the sparse neural network data, and the neural network data processing ends;
  • Step D the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each sub-network contains only one real number, and the remaining weights are all 0;
  • Step E the sparse selection unit and the neural network operation unit process each sub-network according to the sparse neural network data, and respectively obtain the operation result;
  • Step F the neural network operation unit sums the operation results of the N sub-networks to obtain a neural network operation result of the discrete neural network data, and the neural network data processing ends;
  • step G the neural network operation unit performs a neural network operation on the general neural network data, and obtains the operation result, and the neural network data processing ends.
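Steps A to G can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the patented hardware: the function names (`sparse_operation`, `split_discrete`, `process`), the string type tags, and the storage layout (a boolean position mask plus a row-major array of effective weights) are all hypothetical, and only the matrix-vector part of a layer is modelled.

```python
import numpy as np

def sparse_operation(x, mask, values):
    """Steps B-C: select the inputs at effective-weight positions, then multiply-accumulate.

    mask   -- boolean matrix (outputs x inputs) standing in for the position information
    values -- effective (non-zero) weights in row-major order (assumed layout)
    """
    rows = mask.shape[0]
    out = np.zeros(rows)
    offset = 0
    for i in range(rows):
        idx = np.flatnonzero(mask[i])            # positions of effective weights
        selected = x[idx]                        # sparse selection of input data
        weights = values[offset:offset + idx.size]
        out[i] = np.dot(selected, weights)       # neural network operation
        offset += idx.size
    return out

def split_discrete(weights):
    """Step D: one sparse sub-network per distinct non-zero weight value."""
    subnets = []
    for v in np.unique(weights):
        if v == 0:
            continue
        mask = weights == v
        subnets.append((mask, np.full(int(mask.sum()), v)))
    return subnets

def process(kind, x, data):
    """Step A: dispatch on the (hypothetical) type tag of the neural network data."""
    if kind == "sparse":
        mask, values = data
        return sparse_operation(x, mask, values)
    if kind == "discrete":
        total = np.zeros(data.shape[0])
        for mask, values in split_discrete(data):            # Step D
            total += sparse_operation(x, mask, values)       # Steps E and F
        return total
    if kind == "general":
        return data @ x                                      # Step G
    raise ValueError(f"unknown neural network data type: {kind}")
```

In this sketch the discrete path reuses `sparse_operation` unchanged, which mirrors the idea of multiplexing the sparse selection unit: for a weight matrix drawn from a small set of values, `process("discrete", x, W)` should agree with the dense product `W @ x`.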
  • This concludes the description of the neural network data processing method of the seventh embodiment of the present invention.
  • The different types of neural network data may be mixed: for example, the input data is general neural network data, some layers use discrete data, and some layers use sparse data. The basic flow in the apparatus of the present invention is described using a single layer of neural network operation as an example, whereas the neural networks actually used are often multi-layered, so it is common for different layers to adopt different data types in real use.
  • By multiplexing the sparse selection unit, the present invention efficiently supports neural network operations on both sparse neural networks and discrete data representations. This reduces the amount of data required for the operation and increases data reuse during the operation, thereby addressing problems such as insufficient computing performance, insufficient memory access bandwidth, and excessive power consumption.
  • In addition, the dependency processing module ensures the correct operation of the neural network while improving running efficiency and shortening running time. The invention has wide application in many fields, with strong application prospects and great economic value.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a neural network operation apparatus and method. The neural network operation apparatus comprises: a control unit (100), a storage unit (200), a sparse selection unit (300), and a neural network operation unit (400). The control unit (100) is used to generate microinstructions corresponding respectively to each unit and to send the microinstructions to the corresponding units. The sparse selection unit (300) is used to select, from the neural network data stored in the storage unit (200), the neural network data corresponding to effective weights for the operation, on the basis of the microinstruction corresponding to the sparse selection unit (300) issued by the control unit (100) and according to the position information indicated by the sparse data therein. The neural network operation unit (400) is used to perform a neural network operation on the neural network data selected by the sparse selection unit (300), on the basis of the microinstruction corresponding to the neural network operation unit (400) issued by the control unit (100), in order to obtain operation results. The apparatus and method improve the ability of the neural network operation apparatus to process different types of data, increasing the speed of neural network operation while reducing power consumption.
PCT/CN2016/100784 2016-09-29 2016-09-29 Appareil et procédé de calcul de réseau neuronal WO2018058427A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/100784 WO2018058427A1 (fr) 2016-09-29 2016-09-29 Appareil et procédé de calcul de réseau neuronal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/100784 WO2018058427A1 (fr) 2016-09-29 2016-09-29 Appareil et procédé de calcul de réseau neuronal

Publications (1)

Publication Number Publication Date
WO2018058427A1 true WO2018058427A1 (fr) 2018-04-05

Family

ID=61763580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/100784 WO2018058427A1 (fr) 2016-09-29 2016-09-29 Appareil et procédé de calcul de réseau neuronal

Country Status (1)

Country Link
WO (1) WO2018058427A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109032669A (zh) * 2018-02-05 2018-12-18 上海寒武纪信息科技有限公司 神经网络处理装置及其执行向量最小值指令的方法
CN111222632A (zh) * 2018-11-27 2020-06-02 中科寒武纪科技股份有限公司 计算装置、计算方法及相关产品
CN111523655A (zh) * 2019-02-03 2020-08-11 上海寒武纪信息科技有限公司 处理装置及方法
CN111767995A (zh) * 2019-04-02 2020-10-13 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111860796A (zh) * 2019-04-30 2020-10-30 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111985634A (zh) * 2020-08-21 2020-11-24 北京灵汐科技有限公司 神经网络的运算方法、装置、计算机设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184366A (zh) * 2015-09-15 2015-12-23 中国科学院计算技术研究所 一种时分复用的通用神经网络处理器
CN105488563A (zh) * 2015-12-16 2016-04-13 重庆大学 面向深度学习的稀疏自适应神经网络、算法及实现装置
CN105512723A (zh) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 一种用于稀疏连接的人工神经网络计算装置和方法
US20160196488A1 (en) * 2013-08-02 2016-07-07 Byungik Ahn Neural network computing device, system and method

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109032669A (zh) * 2018-02-05 2018-12-18 上海寒武纪信息科技有限公司 神经网络处理装置及其执行向量最小值指令的方法
CN109101273A (zh) * 2018-02-05 2018-12-28 上海寒武纪信息科技有限公司 神经网络处理装置及其执行向量最大值指令的方法
CN109101273B (zh) * 2018-02-05 2023-08-25 上海寒武纪信息科技有限公司 神经网络处理装置及其执行向量最大值指令的方法
CN109032669B (zh) * 2018-02-05 2023-08-29 上海寒武纪信息科技有限公司 神经网络处理装置及其执行向量最小值指令的方法
CN111222632A (zh) * 2018-11-27 2020-06-02 中科寒武纪科技股份有限公司 计算装置、计算方法及相关产品
CN111523655A (zh) * 2019-02-03 2020-08-11 上海寒武纪信息科技有限公司 处理装置及方法
CN111523655B (zh) * 2019-02-03 2024-03-29 上海寒武纪信息科技有限公司 处理装置及方法
CN111767995A (zh) * 2019-04-02 2020-10-13 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111767995B (zh) * 2019-04-02 2023-12-05 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111860796A (zh) * 2019-04-30 2020-10-30 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111860796B (zh) * 2019-04-30 2023-10-03 上海寒武纪信息科技有限公司 运算方法、装置及相关产品
CN111985634A (zh) * 2020-08-21 2020-11-24 北京灵汐科技有限公司 神经网络的运算方法、装置、计算机设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16917181

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16917181

Country of ref document: EP

Kind code of ref document: A1