WO2022105805A1 - Data processing method and integrated storage-computing chip - Google Patents

Data processing method and integrated storage-computing chip

Info

Publication number
WO2022105805A1
WO2022105805A1 (PCT/CN2021/131247, CN2021131247W)
Authority
WO
WIPO (PCT)
Prior art keywords: data, computing, storage, chip, array
Application number
PCT/CN2021/131247
Other languages
English (en)
French (fr)
Inventor
何伟
沈杨书
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 北京灵汐科技有限公司
Publication of WO2022105805A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00: Digital computers in general; Data processing equipment in general
    • G06F 15/76: Architectures of general purpose stored program computers
    • G06F 15/78: Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7807: System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of neural networks, and in particular to a data processing method and an integrated storage-computing chip.
  • The embodiments of the present application provide a data processing method and an integrated storage-computing chip, to solve the problem that, when an integrated storage-computing chip is used to process data, a large amount of data must be computed in the storage-compute array, resulting in a heavy computational load and high power consumption in that array.
  • In a first aspect, the present application provides a data processing method applied to an integrated storage-computing chip. The chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module. The method includes: the storage-compute array operates on first data, having a first data attribute, that is input to the chip; the compute module operates on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
  • In a second aspect, the present application provides an integrated storage-computing chip. The chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module. The storage-compute array is configured to operate on first data, having a first data attribute, that is input to the chip; the compute module is configured to operate on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
  • In a third aspect, embodiments of the present application further provide an electronic device, comprising a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the method described in the first aspect.
  • In a fourth aspect, embodiments of the present application further provide a readable storage medium on which a program or instruction is stored, where the program or instruction, when executed by a processor, implements the steps of the method described in the first aspect.
  • In a fifth aspect, embodiments of the present application further provide a computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in a processor of an electronic device, the processor executes the steps of the method described in the first aspect.
  • With the integrated storage-computing chip of the embodiments, different modules can operate on data with different data attributes: data with the first data attribute is processed by the storage-compute array, and data with the second data attribute is processed by the compute module. This reduces the computational load on the storage-compute array and solves the related-art problem that, when an integrated storage-computing chip processes data, a large amount of data must be computed in the storage-compute array, causing a heavy computational load and high power consumption.
  • FIG. 1 is a schematic structural diagram of a storage-computing integrated chip in the related art
  • FIG. 2 is a schematic structural diagram of a memory-computing integrated chip according to an embodiment of the present application
  • FIG. 3 is a flowchart of a data processing method according to an embodiment of the present application.
  • The terms “first” and “second” are used for description only and are not to be understood as indicating or implying relative importance, or as implying the number of indicated technical features.
  • A feature qualified by “first” or “second” may explicitly or implicitly include one or more of that feature.
  • Unless otherwise specified, “plural” means two or more.
  • FIG. 1 is a schematic structural diagram of an integrated storage-computing chip in the related art. The chip 100 includes a storage-compute array 101, a digital-to-analog conversion module 102, and an analog-to-digital conversion module 103. When the chip 100 is used to process data, a large amount of data must be computed in the storage-compute array 101, resulting in a heavy computational load and high power consumption in the array 101.
  • An embodiment of the present application provides a data processing method applied to an integrated storage-computing chip. As shown in FIG. 2, the chip 200 includes at least one compute core 210, and the compute core 210 includes a storage-compute array 201 and a compute module 202. Note that FIG. 2 schematically shows only one compute core 210; in a possible implementation, the chip 200 may include multiple compute cores 210, forming a many-core integrated storage-computing chip. The embodiments of the present disclosure use a chip 200 containing a single compute core 210 as an illustrative example.
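The core organization just described can be modeled as a minimal Python sketch. All class names here (`StorageComputeChip`, `ComputeCore`, `StorageComputeArray`, `ComputeModule`) are illustrative assumptions, not identifiers from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class StorageComputeArray:
    """Non-volatile array that both stores weights and performs matrix operations."""
    weights: dict = field(default_factory=dict)  # network name -> weight matrix

@dataclass
class ComputeModule:
    """Digital unit for data routed away from the array (vector and/or matrix ops)."""
    supported_ops: tuple = ("vector", "matrix")

@dataclass
class ComputeCore:
    array: StorageComputeArray
    module: ComputeModule

@dataclass
class StorageComputeChip:
    """A many-core chip is simply several cores; FIG. 2 shows one."""
    cores: list

# A single-core chip, matching the example used in the embodiments.
chip = StorageComputeChip(cores=[ComputeCore(StorageComputeArray(), ComputeModule())])
```

A many-core variant would only differ in the length of `cores`.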
  • FIG. 3 is a flowchart of a data processing method according to an embodiment of the present application. As shown in FIG. 3, the method includes the following steps:
  • Step S302: the storage-compute array operates on first data, having a first data attribute, that is input to the integrated storage-computing chip;
  • Step S304: the compute module operates on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
  • With the integrated storage-computing chip of the embodiments, different modules can operate on data with different data attributes: data with the first data attribute is processed by the storage-compute array, and data with the second data attribute is processed by the compute module. This reduces the computational load on the storage-compute array and solves the problem that, when the chip processes data, a large amount of data must be computed in the storage-compute array, causing a heavy computational load and high power consumption.
  • The integrated storage-computing chip in the embodiments can process the data of a hybrid neural network. For example, the chip can allocate the data of neural networks with different data attributes within the hybrid network to one or more compute cores for processing. In the embodiments, the first neural network and the second neural network are networks within the hybrid network, and their data have different data attributes. The chip therefore distinguishes neural-network data by attribute, identifying data suited to the compute module and data suited to the storage-compute array within a compute core, and allocates data accordingly. This balances the processing of the different networks in the hybrid network and improves overall processing efficiency.
  • In an optional implementation, the storage-compute array includes non-volatile storage. In specific application scenarios the non-volatile storage may include a NOR-type flash memory cell array, a NAND-type flash memory cell array, or resistive random access memory (RRAM) devices; these are only examples, and in other scenarios the non-volatile storage may also be other non-volatile memory (NVM), such as non-volatile magnetic random access memory (MRAM).
  • In addition, the operation mode of the storage-compute array may be matrix operation.
  • The compute module in the embodiments may include at least one of a vector operation module and a matrix operation module; accordingly, its operation mode may be matrix operation and/or vector operation.
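The two operation modes can be illustrated with a toy sketch: the storage-compute array is modeled as a matrix-vector product over stored weights, and the compute module as an elementwise vector unit. This is a deliberate simplification for illustration; a real array computes in the analog domain inside the memory cells:

```python
def array_matvec(weights, x):
    """Stand-in for the array's matrix operation: y = W @ x over stored weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def module_vector_op(x, y):
    """Stand-in for the compute module's vector operation: elementwise addition."""
    return [a + b for a, b in zip(x, y)]

W = [[1, 0], [0, 2]]                         # weights held in the array
print(array_matvec(W, [3, 4]))               # [3, 8]
print(module_vector_op([1, 2], [3, 4]))      # [4, 6]
```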
  • In another optional implementation, the first data is the data of a first neural network, and the storage-compute array pre-stores the weights of the first neural network, that is, the weights of the functions contained in the first neural network, for example the weights of its activation function, loss function, and other functions. Because the weights are stored in advance, the first network's data can be operated on as soon as it enters the array, which greatly improves the array's computational efficiency.
  • The first data in the embodiments includes at least one of: data whose usage frequency exceeds a preset frequency threshold, data whose computation power consumption exceeds a preset power-consumption threshold, and data whose transfer latency is below a preset latency threshold.
  • The second data in the embodiments includes at least one of: data whose usage frequency is less than or equal to the preset frequency threshold, data whose computation power consumption is less than or equal to the preset power-consumption threshold, and data whose transfer latency is greater than or equal to the preset latency threshold.
  • During operation, frequently used data (data whose usage frequency exceeds the preset frequency threshold) can be fed directly into the storage-compute array, while infrequently used data (usage frequency at or below the threshold) is fed into the compute module. Alternatively, compute-intensive, power-hungry data can be given to the storage-compute array and latency- and bandwidth-saving data to the compute module, or low-latency data can be fed into the storage-compute array. This reduces the likelihood of frequent off-chip data transfers during operation.
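The attribute-based routing described above can be sketched as a dispatch function. The threshold values and dictionary keys are invented for illustration; the patent only specifies that such preset thresholds exist:

```python
# Illustrative thresholds; the patent leaves the concrete values open.
FREQ_THRESHOLD = 100      # uses per unit time
POWER_THRESHOLD = 50.0    # e.g. mW per operation
LATENCY_THRESHOLD = 1.0   # e.g. ms of tolerable transfer latency

def route(data):
    """Return 'array' for first-attribute data, 'module' for second-attribute data.

    Per the text, meeting any one criterion (high frequency, high power
    cost, or low latency requirement) routes the data to the array.
    """
    if (data["freq"] > FREQ_THRESHOLD
            or data["power"] > POWER_THRESHOLD
            or data["latency"] < LATENCY_THRESHOLD):
        return "array"
    return "module"

print(route({"freq": 500, "power": 10.0, "latency": 5.0}))  # array (high frequency)
print(route({"freq": 10, "power": 10.0, "latency": 5.0}))   # module (all below thresholds)
```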
  • The weights of the first neural network may be pre-stored before the storage-compute array and the compute module have performed any operation at all, so that the first network's data can be fed directly into the array for computation; alternatively, weights of other networks, for example the other networks in the hybrid network, may be added after operation has run for some time. In other words, in the embodiments the first network's weights are stored first and its data is operated on afterwards.
  • A neural network whose usage frequency exceeds the preset frequency threshold may be characterized by its main purpose, such as face recognition, image classification, or data risk assessment. Taking a face-recognition network as an example: before face recognition is performed on input images, the network's weights are pre-stored in the storage-compute array, so that incoming image data for that network goes directly into the array for computation while the data of other networks goes to the compute module. This both lightens the array's computational burden and improves the chip's overall efficiency.
  • In another optional implementation, the second data is the data of a second neural network. Before the compute module operates on the second data input to the chip, the method includes:
  • Step S11: the storage-compute array receives and stores the weights of the second neural network.
  • Step S12: in response to a weight-acquisition request from the compute module, the storage-compute array sends the weights of the second neural network to the compute module.
  • The weights of the second neural network may also be stored in advance, before any operation is performed, at the same time as the weights of the first neural network.
  • Because the compute module operates on data with the second data attribute, once the chip recognizes that the currently input data does not have the first data attribute but the second, that data is routed to the compute module. The compute module can then retrieve the second network's weights from the storage-compute array via a weight-acquisition request and, during computation, apply those weights to the second network's data.
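The weight-acquisition flow (steps S11 and S12) might be sketched as follows, with a dot product standing in for the real operation; the class and method names are assumptions for illustration:

```python
class StorageComputeArray:
    def __init__(self):
        self._weights = {}

    def store_weights(self, network, weights):
        """Step S11: receive and store a network's weights."""
        self._weights[network] = weights

    def fetch_weights(self, network):
        """Step S12: serve a compute module's weight-acquisition request."""
        return self._weights[network]

class ComputeModule:
    def __init__(self, array):
        self._array = array

    def run(self, network, data):
        # Pull the second network's weights from the array on demand,
        # then apply them (a dot product stands in for the real operation).
        w = self._array.fetch_weights(network)
        return sum(a * b for a, b in zip(w, data))

array = StorageComputeArray()
array.store_weights("net2", [0.5, 0.5])
module = ComputeModule(array)
print(module.run("net2", [2.0, 4.0]))  # 3.0
```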
  • the storage computing array and the computing module can perform data operations in parallel.
  • Based on this data processing approach, data with different data attributes input to the integrated storage-computing chip can be operated on separately, reducing the occupation of the storage-compute array by low-frequency data, lowering computation power consumption, and cutting input/output (I/O) data movement during operation.
  • The data processing method in the embodiments may be performed by hardware, or by a processor running computer-executable code. Where logic permits, different embodiments of the present application may be combined; each embodiment emphasizes certain aspects, and for parts not described, reference may be made to the other embodiments.
  • The integrated storage-computing chip includes at least one compute core, and the at least one compute core includes a storage-compute array and a compute module;
  • the storage-compute array is configured to operate on first data, having a first data attribute, that is input to the chip;
  • the compute module is configured to operate on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
  • the first data in the embodiment of the present application is data of the first neural network; the storage and calculation array pre-stores the weight of the first neural network.
  • Optionally, the first data includes at least one of: data whose usage frequency exceeds a preset frequency threshold, data whose computation power consumption exceeds a preset power-consumption threshold, and data whose transfer latency is below a preset latency threshold.
  • Optionally, the second data includes at least one of: data whose usage frequency is less than or equal to the preset frequency threshold, data whose computation power consumption is less than or equal to the preset power-consumption threshold, and data whose transfer latency is greater than or equal to the preset latency threshold.
  • Optionally, the second data is the data of a second neural network; the storage-compute array is further configured, before the compute module operates on the second network's data input to the chip, to receive and store the weights of the second neural network, and, in response to a weight-acquisition request from the compute module, to send those weights to the compute module.
  • the storage computing array and the computing module in this embodiment of the present application perform data operations in parallel.
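The parallel operation of the array and the compute module can be sketched with a thread pool; the two task functions below are placeholders for the array's matrix operation and the module's vector operation, not anything specified in the patent:

```python
from concurrent.futures import ThreadPoolExecutor

def array_task(x):
    """Placeholder for an in-array matrix operation on first-attribute data."""
    return [2 * v for v in x]

def module_task(x):
    """Placeholder for a vector operation in the compute module."""
    return [v + 1 for v in x]

# The array and the compute module work on their respective batches concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    f_array = pool.submit(array_task, [1, 2, 3])
    f_module = pool.submit(module_task, [4, 5, 6])
    print(f_array.result())   # [2, 4, 6]
    print(f_module.result())  # [5, 6, 7]
```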
  • the storage computing array includes non-volatile storage.
  • the calculation module includes at least one of the following: a vector operation module and a matrix operation module.
  • An embodiment of the present application further provides an electronic device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor; when executed by the processor, the program or instruction implements each process of the above data processing method embodiments and achieves the same technical effects, which are not repeated here.
  • the electronic devices in the embodiments of the present application include the aforementioned mobile electronic devices and non-mobile electronic devices.
  • The embodiments of the present application further provide a readable storage medium storing a program or instruction; when executed by a processor, the program or instruction implements each process of the above data processing method embodiments and achieves the same technical effects, which are not repeated here.
  • the processor is the processor in the electronic device described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • The modules or steps of the present application may be implemented on a general-purpose computing device; they may be centralized on a single device or distributed across a network of devices. Optionally, they may be implemented as program code executable by a computing device, stored in a storage device and executed by that device; in some cases the steps shown or described may be performed in a different order than here, or the modules or steps may be fabricated separately as individual integrated-circuit modules, or multiple of them may be fabricated as a single integrated-circuit module.
  • the present application is not limited to any particular combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Power Sources (AREA)

Abstract

A data processing method and an integrated storage-computing chip (200). The method is applied to the integrated storage-computing chip (200), which includes at least one compute core (210); the at least one compute core (210) includes a storage-compute array (201) and a compute module (202). The method includes: the storage-compute array (201) operates on first data, having a first data attribute, that is input to the chip (200) (S302); the compute module (202) operates on second data, having a second data attribute, that is input to the chip (200), the first data attribute being different from the second data attribute (S304). The method solves the problem that, when an integrated storage-computing chip processes data, a large amount of data must be computed in the storage-compute array, causing a heavy computational load and high power consumption in the array.

Description

Data processing method and integrated storage-computing chip
Technical Field
The present application relates to the field of neural networks, and in particular to a data processing method and an integrated storage-computing chip.
Background
In recent years, to overcome the bottleneck of the traditional von Neumann architecture, storage-computing integrated (compute-in-memory) architectures have attracted wide attention. As shown in FIG. 1, the basic idea is to perform logic computation directly in the memory, reducing both the volume of data transferred between memory and processor and the distance it travels, thereby lowering power consumption while improving performance. However, when a large amount of data must be processed, considerable time is still spent writing that data row by row into the storage-compute array of the chip, making it difficult to further improve the overall working efficiency of compute-in-memory in practical applications.
Summary
Embodiments of the present application provide a data processing method and an integrated storage-computing chip, to solve the problem that, when an integrated storage-computing chip is used to process data, a large amount of data must be computed in the storage-compute array, resulting in a heavy computational load and high power consumption in the array.
To solve the above technical problem, the present application is implemented as follows:
In a first aspect, the present application provides a data processing method applied to an integrated storage-computing chip. The chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module. The method includes: the storage-compute array operates on first data, having a first data attribute, that is input to the chip; the compute module operates on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
In a second aspect, the present application provides an integrated storage-computing chip. The chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module. The storage-compute array is configured to operate on first data, having a first data attribute, that is input to the chip; the compute module is configured to operate on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
In a third aspect, embodiments of the present application further provide an electronic device, comprising a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the method described in the first aspect.
In a fourth aspect, embodiments of the present application further provide a readable storage medium on which a program or instruction is stored, where the program or instruction, when executed by a processor, implements the steps of the method described in the first aspect.
In a fifth aspect, embodiments of the present application further provide a computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in a processor of an electronic device, the processor executes the steps of the method described in the first aspect.
In the present application, with the integrated storage-computing chip of the embodiments, different modules can operate on data with different data attributes: data with the first data attribute is processed by the storage-compute array, and data with the second data attribute is processed by the compute module. This reduces the computational load on the storage-compute array and solves the related-art problem that, when an integrated storage-computing chip processes data, a large amount of data must be computed in the storage-compute array, causing a heavy computational load and high power consumption.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an integrated storage-computing chip in the related art;
FIG. 2 is a schematic structural diagram of an integrated storage-computing chip according to an embodiment of the present application;
FIG. 3 is a flowchart of a data processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in some embodiments of the present application will now be described clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the embodiments of the present application.
In the description of the embodiments, it should be understood that the terms "first" and "second" are used for description only and are not to be understood as indicating or implying relative importance, or as implying the number of indicated technical features. Accordingly, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. Unless otherwise specified, "plural" means two or more.
FIG. 1 is a schematic structural diagram of an integrated storage-computing chip in the related art. In FIG. 1, the chip 100 includes a storage-compute array 101, a digital-to-analog conversion module 102, and an analog-to-digital conversion module 103. When the chip 100 is used to process data, a large amount of data must be computed in the storage-compute array 101, resulting in a heavy computational load and high power consumption in the array 101.
The data processing method provided by the embodiments of the present application is described in detail below through specific embodiments and application scenarios, with reference to the accompanying drawings.
An embodiment of the present application provides a data processing method applied to an integrated storage-computing chip. As shown in FIG. 2, the chip 200 includes at least one compute core 210, and the compute core 210 includes a storage-compute array 201 and a compute module 202. Note that FIG. 2 schematically shows only one compute core 210; in a possible implementation, the chip 200 may include multiple compute cores 210, forming a many-core integrated storage-computing chip. The embodiments of the present disclosure use a chip 200 containing a single compute core 210 as an illustrative example.
On this basis, FIG. 3 is a flowchart of the data processing method of this embodiment. As shown in FIG. 3, the method includes the following steps:
Step S302: the storage-compute array operates on first data, having a first data attribute, that is input to the integrated storage-computing chip;
Step S304: the compute module operates on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
With the integrated storage-computing chip of this embodiment, different modules can operate on data with different data attributes: data with the first data attribute is processed by the storage-compute array, and data with the second data attribute is processed by the compute module. This reduces the computational load on the storage-compute array and solves the problem that, when the chip processes data, a large amount of data must be computed in the storage-compute array, causing a heavy computational load and high power consumption.
Note that the integrated storage-computing chip in the embodiments can process the data of a hybrid neural network; for example, the chip can allocate the data of neural networks with different data attributes within the hybrid network to one or more compute cores for processing. In the embodiments, the first neural network and the second neural network are networks within the hybrid network, and their data have different data attributes. The chip therefore distinguishes neural-network data by attribute, identifying data suited to the compute module and data suited to the storage-compute array within a compute core, and allocates data accordingly. This balances the processing of the different networks in the hybrid network and improves overall processing efficiency.
Further, in an optional implementation, the storage-compute array includes non-volatile storage. In specific application scenarios the non-volatile storage may include a NOR-type flash memory cell array, a NAND-type flash memory cell array, or resistive random access memory (RRAM) devices; these are only examples, and in other scenarios the non-volatile storage may also be other non-volatile memory (NVM), such as non-volatile magnetic random access memory (MRAM). In addition, the operation mode of the storage-compute array may be matrix operation.
The compute module in the embodiments may include at least one of a vector operation module and a matrix operation module; accordingly, its operation mode may be matrix operation and/or vector operation.
In another optional implementation, the first data is the data of a first neural network, and the storage-compute array pre-stores the weights of the first neural network, that is, the weights of the functions contained in the first neural network, for example the weights of its activation function, loss function, and other functions. Because the weights are stored in advance, the first network's data can be operated on as soon as it enters the array, which greatly improves the array's computational efficiency.
The first data in the embodiments includes at least one of: data whose usage frequency exceeds a preset frequency threshold, data whose computation power consumption exceeds a preset power-consumption threshold, and data whose transfer latency is below a preset latency threshold. The second data includes at least one of: data whose usage frequency is less than or equal to the preset frequency threshold, data whose computation power consumption is less than or equal to the preset power-consumption threshold, and data whose transfer latency is greater than or equal to the preset latency threshold.
During operation, frequently used data (data whose usage frequency exceeds the preset frequency threshold) can be fed directly into the storage-compute array, while infrequently used data (usage frequency at or below the threshold) is fed into the compute module. Alternatively, compute-intensive, power-hungry data can be given to the storage-compute array and latency- and bandwidth-saving data to the compute module, or low-latency data can be fed into the storage-compute array. This reduces the likelihood of frequent off-chip data transfers during operation.
Note that the weights of the first neural network may be pre-stored before the storage-compute array and the compute module have performed any operation at all, so that the first network's data can be fed directly into the array for computation; alternatively, weights of other networks, for example the other networks in the hybrid network, may be added after operation has run for some time. In other words, in the embodiments the first network's weights are stored first and its data is operated on afterwards.
A neural network whose usage frequency exceeds the preset frequency threshold may be characterized by its main purpose, such as face recognition, image classification, or data risk assessment. Taking a face-recognition network as an example: before face recognition is performed on input images, the network's weights are pre-stored in the storage-compute array, so that incoming image data for that network goes directly into the array while the data of other networks goes to the compute module. This both lightens the array's computational burden and improves the chip's overall efficiency.
In another optional implementation, the second data is the data of a second neural network. Before the compute module operates on the second data input to the chip, the method includes:
Step S11: the storage-compute array receives and stores the weights of the second neural network.
Step S12: in response to a weight-acquisition request from the compute module, the storage-compute array sends the weights of the second neural network to the compute module.
Note that the weights of the second neural network may also be stored in advance, before any operation is performed, at the same time as the weights of the first neural network. Moreover, because the compute module operates on data with the second data attribute, once the chip recognizes that the currently input data does not have the first data attribute but the second, that data is routed to the compute module. The compute module can then retrieve the second network's weights from the storage-compute array via a weight-acquisition request and, during computation, apply those weights to the second network's data.
In the embodiments, the storage-compute array and the compute module can perform data operations in parallel.
Based on this data processing approach, data with different data attributes input to the integrated storage-computing chip can be operated on separately, reducing the occupation of the storage-compute array by low-frequency data, lowering computation power consumption, and cutting input/output (I/O) data movement during operation.
From the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be realized by software plus a necessary general-purpose hardware platform, or by hardware, the former being the better implementation in many cases. Based on this understanding, the technical solution of the present application, in essence or in its contribution over the related art, can be embodied as a software product stored on a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including instructions that cause a terminal device (a mobile phone, computer, server, network device, or the like) to execute the methods of the above embodiments.
Note that the data processing method of the embodiments may be performed by hardware, or by a processor running computer-executable code. Where logic permits, different embodiments of the present application may be combined; each embodiment emphasizes certain aspects, and for parts not described, reference may be made to the other embodiments.
An embodiment of the present application provides an integrated storage-computing chip. As shown in FIG. 2, the chip includes at least one compute core, and the at least one compute core includes a storage-compute array and a compute module;
the storage-compute array is configured to operate on first data, having a first data attribute, that is input to the chip;
the compute module is configured to operate on second data, having a second data attribute, that is input to the chip, where the first data attribute is different from the second data attribute.
Optionally, the first data is the data of a first neural network, and the storage-compute array pre-stores the weights of the first neural network.
Optionally, the first data includes at least one of: data whose usage frequency exceeds a preset frequency threshold, data whose computation power consumption exceeds a preset power-consumption threshold, and data whose transfer latency is below a preset latency threshold.
Optionally, the second data includes at least one of: data whose usage frequency is less than or equal to the preset frequency threshold, data whose computation power consumption is less than or equal to the preset power-consumption threshold, and data whose transfer latency is greater than or equal to the preset latency threshold.
Optionally, the second data is the data of a second neural network; the storage-compute array is further configured, before the compute module operates on the second network's data input to the chip, to receive and store the weights of the second neural network, and, in response to a weight-acquisition request from the compute module, to send those weights to the compute module.
Optionally, the storage-compute array and the compute module perform data operations in parallel.
Optionally, the storage-compute array includes non-volatile storage.
Optionally, the compute module includes at least one of: a vector operation module and a matrix operation module.
Optionally, embodiments of the present application further provide an electronic device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor; when executed by the processor, the program or instruction implements each process of the above data processing method embodiments and achieves the same technical effects, which are not repeated here to avoid duplication.
Note that the electronic devices in the embodiments of the present application include the aforementioned mobile and non-mobile electronic devices.
Embodiments of the present application further provide a readable storage medium storing a program or instruction; when executed by a processor, the program or instruction implements each process of the above data processing method embodiments and achieves the same technical effects, which are not repeated here to avoid duplication.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media such as computer read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.
Obviously, a person skilled in the art should understand that the modules or steps of the present application may be implemented on a general-purpose computing device; they may be centralized on a single device or distributed across a network of devices. Optionally, they may be implemented as program code executable by a computing device, stored in a storage device and executed by that device; in some cases the steps shown or described may be performed in a different order than here, or the modules or steps may be fabricated separately as individual integrated-circuit modules, or multiple of them may be fabricated as a single integrated-circuit module. Thus, the present application is not limited to any particular combination of hardware and software.
The above are only optional embodiments of the present application and are not intended to limit it; to a person skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall be included within its protection scope.

Claims (11)

  1. A data processing method, applied to an integrated storage-computing chip, wherein the chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module; the method comprising:
    the storage-compute array operating on first data, having a first data attribute, that is input to the integrated storage-computing chip;
    the compute module operating on second data, having a second data attribute, that is input to the integrated storage-computing chip, wherein the first data attribute is different from the second data attribute.
  2. The method according to claim 1, wherein the first data is data of a first neural network, and the storage-compute array pre-stores weights of the first neural network.
  3. The method according to claim 1, wherein
    the first data includes at least one of: data whose usage frequency exceeds a preset frequency threshold, data whose computation power consumption exceeds a preset power-consumption threshold, and data whose transfer latency is below a preset latency threshold.
  4. The method according to claim 1, wherein the second data includes at least one of: data whose usage frequency is less than or equal to a preset frequency threshold, data whose computation power consumption is less than or equal to a preset power-consumption threshold, and data whose transfer latency is greater than or equal to a preset latency threshold.
  5. The method according to claim 1, wherein the second data is data of a second neural network; and before the compute module operates on the second data, having the second data attribute, that is input to the integrated storage-computing chip, the method further comprises:
    the storage-compute array receiving and storing weights of the second neural network;
    the storage-compute array, in response to a weight-acquisition request from the compute module, sending the weights of the second neural network to the compute module.
  6. The method according to claim 1, wherein
    the storage-compute array and the compute module perform data operations in parallel;
    the storage-compute array includes non-volatile storage;
    the compute module includes at least one of: a vector operation module and a matrix operation module.
  7. An integrated storage-computing chip, wherein the chip includes at least one compute core, and the compute core includes a storage-compute array and a compute module;
    the storage-compute array is configured to operate on first data, having a first data attribute, that is input to the integrated storage-computing chip;
    the compute module is configured to operate on second data, having a second data attribute, that is input to the integrated storage-computing chip, wherein the first data attribute is different from the second data attribute.
  8. The integrated storage-computing chip according to claim 7, wherein the first data is data of a first neural network, and the storage-compute array pre-stores weights of the first neural network.
  9. An electronic device, comprising a processor, a memory, and a program or instruction stored in the memory and executable on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the data processing method according to any one of claims 1-6.
  10. A readable storage medium, wherein a program or instruction is stored on the readable storage medium, and the program or instruction, when executed by a processor, implements the steps of the data processing method according to any one of claims 1-6.
  11. A computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, wherein, when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the steps of the data processing method according to any one of claims 1-6.
PCT/CN2021/131247 2020-11-18 2021-11-17 Data processing method and integrated storage-computing chip WO2022105805A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011293845.6 2020-11-18
CN202011293845.6A CN112395247B (zh) 2020-11-18 Data processing method and integrated storage-computing chip

Publications (1)

Publication Number Publication Date
WO2022105805A1 true WO2022105805A1 (zh) 2022-05-27

Family

ID=74607396

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131247 WO2022105805A1 (zh) 2020-11-18 2022-05-27 Data processing method and integrated storage-computing chip

Country Status (2)

Country Link
CN (1) CN112395247B (zh)
WO (1) WO2022105805A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115439566A * 2022-08-23 2022-12-06 中国电子科技南湖研究院 Compressed sensing system and method based on an integrated storage-computing architecture
CN115665268A * 2022-11-21 2023-01-31 上海亿铸智能科技有限公司 Data transmission apparatus and method suitable for an integrated storage-computing chip
CN116151343A * 2023-04-04 2023-05-23 荣耀终端有限公司 Data processing circuit and electronic device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395247B (zh) * 2020-11-18 2024-05-03 北京灵汐科技有限公司 Data processing method and compute-in-memory chip
CN113138957A (zh) * 2021-03-29 2021-07-20 北京智芯微电子科技有限公司 Chip for neural network inference and method for accelerating neural network inference
CN113190208B (zh) * 2021-05-07 2022-12-27 电子科技大学 Integrated storage-computation unit, state control method, integrated module, processor, and device
CN114997388B (zh) * 2022-06-30 2024-05-07 杭州知存算力科技有限公司 Linear-programming-based neural network bias processing method for a compute-in-memory chip
CN116167424B (zh) * 2023-04-23 2023-07-14 深圳市九天睿芯科技有限公司 CIM-based neural network accelerator, method, storage-computation processing system, and device
CN116777727B (zh) * 2023-06-21 2024-01-09 北京忆元科技有限公司 Compute-in-memory chip, image processing method, electronic device, and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
US20190180170A1 (en) * 2017-12-12 2019-06-13 Amazon Technologies, Inc. Multi-memory on-chip computational network
CN209766043U (zh) * 2019-06-26 2019-12-10 北京知存科技有限公司 Compute-in-memory chip and storage cell array structure
CN210924662U (zh) * 2020-01-16 2020-07-03 北京比特大陆科技有限公司 Apparatus and system for neural network processing
CN111611195A (zh) * 2019-02-26 2020-09-01 北京知存科技有限公司 Software-definable compute-in-memory chip and software definition method therefor
CN111611197A (zh) * 2019-02-26 2020-09-01 北京知存科技有限公司 Operation control method and apparatus for a software-definable compute-in-memory chip
CN112395247A (zh) * 2020-11-18 2021-02-23 北京灵汐科技有限公司 Data processing method and compute-in-memory chip

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN111241028A (zh) * 2018-11-28 2020-06-05 北京知存科技有限公司 Digital-analog hybrid compute-in-memory chip and computing device
CN109766309B (zh) * 2018-12-29 2020-07-28 北京航空航天大学 Spintronic compute-in-memory chip
CN110147880A (zh) * 2019-05-22 2019-08-20 苏州浪潮智能科技有限公司 Neural network data processing structure, method, system, and related apparatus


Cited By (5)

Publication number Priority date Publication date Assignee Title
CN115439566A (zh) * 2022-08-23 2022-12-06 中国电子科技南湖研究院 Compressed sensing system and method based on a compute-in-memory architecture
CN115665268A (zh) * 2022-11-21 2023-01-31 上海亿铸智能科技有限公司 Data transmission apparatus and method suitable for a compute-in-memory chip
CN115665268B (zh) * 2022-11-21 2023-04-18 苏州亿铸智能科技有限公司 Data transmission apparatus and method suitable for a compute-in-memory chip
CN116151343A (zh) * 2023-04-04 2023-05-23 荣耀终端有限公司 Data processing circuit and electronic device
CN116151343B (zh) * 2023-04-04 2023-09-05 荣耀终端有限公司 Data processing circuit and electronic device

Also Published As

Publication number Publication date
CN112395247B (zh) 2024-05-03
CN112395247A (zh) 2021-02-23

Similar Documents

Publication Publication Date Title
WO2022105805A1 (zh) Data processing method and compute-in-memory chip
JP2019036298A (ja) Intelligent high-bandwidth memory system and logic die therefor
WO2021088688A1 (zh) Convolution acceleration operation method and apparatus, storage medium, and terminal device
CN115880132B (zh) Graphics processor, matrix multiplication task processing method, apparatus, and storage medium
CN110209472B (zh) Task data processing method and board card
WO2023051505A1 (zh) Task solving method and apparatus
CN110555700A (zh) Blockchain smart contract execution method, apparatus, and computer-readable storage medium
WO2020119188A1 (zh) Program detection method, apparatus, device, and readable storage medium
CN111967608A (zh) Data processing method, apparatus, device, and storage medium
WO2019001323A1 (zh) Signal processing system and method
CN115203126B (zh) Operator fusion processing method, apparatus, device, and storage medium
US20220374742A1 (en) Method, device and storage medium for running inference service platform
WO2021012506A1 (zh) Load balancing implementation method and apparatus in a speech recognition system, and computer device
CN114461384A (zh) Task execution method, apparatus, computer device, and storage medium
WO2021057811A1 (zh) Network node processing method and apparatus, storage medium, and electronic device
CN112966054A (zh) Group division method based on inter-node relationships in an enterprise graph, and computer device
CN114579187B (zh) Instruction distribution method, apparatus, electronic device, and readable storage medium
US20230306236A1 (en) Device and method for executing lstm neural network operation
US20220206554A1 (en) Processor and power supply ripple reduction method
CN112261023A (zh) Data transmission method and apparatus for a convolutional neural network
US20240020510A1 (en) System and method for execution of inference models across multiple data processing systems
US20240020550A1 (en) System and method for inference generation via optimization of inference model portions
CN115361285B (zh) Method, apparatus, device, and medium for hybrid deployment of offline and online services
WO2024016894A1 (zh) Neural network training method and related device
CN117041161A (zh) Service request processing method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21893952; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established; Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.09.2023)
122 Ep: pct application non-entry in european phase (Ref document number: 21893952; Country of ref document: EP; Kind code of ref document: A1)