WO2021196158A1 - Data access circuit and method - Google Patents

Data access circuit and method

Info

Publication number
WO2021196158A1
WO2021196158A1 PCT/CN2020/083195 CN2020083195W WO2021196158A1 WO 2021196158 A1 WO2021196158 A1 WO 2021196158A1 CN 2020083195 W CN2020083195 W CN 2020083195W WO 2021196158 A1 WO2021196158 A1 WO 2021196158A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
circuit
memory
control
internal
Prior art date
Application number
PCT/CN2020/083195
Other languages
French (fr)
Chinese (zh)
Inventor
王维伟
罗飞
Original Assignee
北京希姆计算科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京希姆计算科技有限公司
Priority to PCT/CN2020/083195 (WO2021196158A1)
Priority to CN202080098538.4A (CN115280272A)
Publication of WO2021196158A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • the present disclosure relates to the field of data access, and in particular to a data access circuit and method.
  • The chip is the cornerstone of data processing and fundamentally determines people's ability to process data. From the perspective of application fields, there are two main routes for chips. One is the general-purpose chip route, such as the central processing unit (CPU); such chips provide great flexibility, but their effective computing power when processing algorithms in specific fields is relatively low. The other is the dedicated chip route, such as the Tensor Processing Unit (TPU); such chips deliver higher effective computing power in certain specific fields, but in more flexible and general fields they have poor processing capability or cannot handle the task at all.
  • multi-core or many-core chips are often used.
  • the processing cores in the multi-core architecture all have a certain degree of independent processing capability, and have a relatively large internal storage space for storing their own programs, data, and weights.
  • the basic computing power of a single processing core determines the ability of the entire chip to calculate neural networks.
  • the performance of the basic computing power of a single processing core is determined by the ideal computing power and storage access efficiency of the computing unit of the single processing core.
  • The access speed of a register is the fastest, at a few hundred ps (picoseconds); next is Static Random Access Memory (SRAM), whose access speed is generally in the range of a few hundred ps to a few ns (nanoseconds); next is the memory unit, that is, Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), whose access speed is generally tens to hundreds of ns; finally, other memories accessed through IO (input/output) ports, such as hard disks, are slow, with access speeds generally at the ms (millisecond) level.
  • the general concern is the access of the processing unit to the memory unit.
  • The processing unit is very fast: its main frequency is generally several hundred MHz (megahertz) to several GHz (gigahertz), that is, a clock period at the ps to ns level, whereas the access speed of the memory unit is at the level of tens of ns, so there is a large difference in speed between the two.
  • How to solve the speed difference between the processing unit and the memory access, and effectively utilize the computing power of the processing unit, is a difficult point in modern CPU design.
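  • The size of this gap can be illustrated with a short calculation. The following is a minimal sketch; the 1 GHz clock and the 50 ns access time are illustrative values chosen from the ranges quoted above, not figures from the disclosure, and the function names are hypothetical.

```python
def clock_period_ns(freq_hz: float) -> float:
    """Return the clock period in nanoseconds for a clock frequency in hertz."""
    return 1e9 / freq_hz


def cycles_per_access(access_ns: float, freq_hz: float) -> float:
    """Return how many clock periods elapse during one memory access."""
    return access_ns / clock_period_ns(freq_hz)


# Illustrative values only (assumptions, not from the disclosure).
FREQ_HZ = 1e9      # 1 GHz processing unit
ACCESS_NS = 50.0   # 50 ns memory access

if __name__ == "__main__":
    print("clock period (ns):", clock_period_ns(FREQ_HZ))
    print("clock periods per memory access:", cycles_per_access(ACCESS_NS, FREQ_HZ))
```

  • Under these assumed values, one memory access spans tens of clock periods, during which a processing unit that waits on the memory does no useful work.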
  • In order to solve the problem of speed matching between the processing unit and the memory unit, the solution shown in FIG. 1 is generally used in the prior art, where PU is a processing unit, Cache is a high-speed cache, and Memory is a memory.
  • a high-speed cache Cache is inserted between the processing unit PU and the memory Memory.
  • the PU accesses the Memory in a hierarchical and indirect manner. It directly accesses the Cache and indirectly accesses the Memory through the Cache.
  • Cache is a mapping of Memory, and its content is a subset of memory content. It is transparent to the program run by the PU, has no functional significance, and has no independent addressing space. Its address is the same as the memory address accessed, that is, the program cannot access the Cache alone.
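  • The transparent behaviour of the prior-art Cache can be sketched as follows. This is a simplified model under stated assumptions: the class names are hypothetical, writes are assumed to be write-through, and eviction is not modelled. It only shows that the program uses Memory addresses throughout and never addresses the Cache separately.

```python
class Memory:
    """Backing memory; counts reads so cache hits can be observed."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.reads = 0

    def read(self, addr: int) -> int:
        self.reads += 1
        return self.cells[addr]

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = value


class Cache:
    """Transparent cache: uses Memory addresses and holds a subset of Memory."""

    def __init__(self, memory: Memory) -> None:
        self.memory = memory
        self.lines = {}  # address -> cached value

    def read(self, addr: int) -> int:
        # On a miss, fetch from Memory; on a hit, Memory is not touched.
        if addr not in self.lines:
            self.lines[addr] = self.memory.read(addr)
        return self.lines[addr]

    def write(self, addr: int, value: int) -> None:
        # Write-through policy (an assumption of this sketch).
        self.lines[addr] = value
        self.memory.write(addr, value)
```

  • Reading the same address twice through the Cache reaches Memory only once, yet the caller never uses anything other than the Memory address.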
  • a data access circuit including:
  • a first data interface circuit configured to connect a first internal memory of the plurality of internal memories with an external processing unit
  • a second data interface circuit for connecting a second internal memory of the plurality of internal memories with an external memory
  • the first control circuit is configured to receive a first control instruction to control the first data interface circuit and the second data interface circuit;
  • the second control circuit is configured to receive a second control instruction to control the data transmission direction of the second data interface circuit.
  • the first data interface circuit includes:
  • a first switch circuit, wherein the first switch circuit includes a plurality of connection states, and each connection state is used to connect the processing unit with one of the plurality of internal memories.
  • the second data interface circuit includes:
  • a second switch circuit, wherein the second switch circuit includes a plurality of connection states, and each connection state is used to connect the external memory with one of the plurality of internal memories.
  • the second data interface circuit further includes:
  • a storage controller, which is connected between the external memory and the plurality of internal memories and is used to control the data exchange between the external memory and the plurality of internal memories.
  • the first control circuit includes:
  • a switch control circuit, which is used to receive a first control instruction sent by the external processing unit to generate a first switch control signal and a second switch control signal, wherein the first switch control signal is used to set the connection state of the first switch circuit, and the second switch control signal is used to set the connection state of the second switch circuit.
  • the second control circuit includes:
  • an access control circuit, which is used to receive a second control instruction sent by the external processing unit to generate a control signal, wherein the control signal is used to control the storage controller to fetch the data indicated by the control signal from the second internal memory and store it in the external memory, or to control the storage controller to fetch the data indicated by the control signal from the external memory and store it in the second internal memory.
  • embodiments of the present disclosure provide a data access method, which is characterized in that it includes:
  • the external processing unit obtains data from the first internal memory or sends data to the first internal memory;
  • the data in the second internal memory is sent to the external memory or the data obtained from the external memory is stored in the second internal memory.
  • a data access device including:
  • At least one data access circuit as described in any one of the first aspect.
  • an embodiment of the present disclosure provides an electronic device, including: a memory, configured to store computer-readable instructions; and one or more processors, configured to execute the computer-readable instructions so that, when running, the processor implements the data access method described in any one of the foregoing second aspects.
  • embodiments of the present disclosure provide a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions are used to make a computer execute any of the data access methods described in the aforementioned second aspect.
  • embodiments of the present disclosure provide a computer program product, characterized by including computer instructions, wherein when the computer instructions are executed by a computing device, the computing device can execute the data access method of any of the foregoing second aspects.
  • an embodiment of the present disclosure provides a chip characterized by comprising at least one data processing device described in the third aspect.
  • an embodiment of the present disclosure provides a computing device, which is characterized in that it includes at least one chip as described in the seventh aspect.
  • the embodiment of the present disclosure discloses a data access circuit and method.
  • The data access circuit includes: a plurality of internal memories; a first data interface circuit for connecting a first internal memory of the plurality of internal memories with a processing unit outside the circuit; a second data interface circuit for connecting a second internal memory of the plurality of internal memories with an external memory; a first control circuit for receiving a first control instruction to control the first data interface circuit and the second data interface circuit; and a second control circuit for receiving a second control instruction to control the data transmission direction of the second data interface circuit.
  • Figure 1 is a schematic structural diagram of a processing scheme in the prior art
  • FIG. 2 is a schematic diagram of an application scenario of a data access circuit in an embodiment of the disclosure
  • FIG. 3a is a schematic diagram of the structure of a data access circuit in an embodiment of the disclosure.
  • FIG. 3b is a schematic diagram of a specific structure of a data access circuit in an embodiment of the disclosure.
  • FIG. 4 is a schematic flowchart of a data access method in an embodiment of the disclosure.
  • FIG. 5 is a schematic flowchart of another data access method in an embodiment of the disclosure.
  • FIG. 6 is a schematic flowchart of another data access method in an embodiment of the disclosure.
  • FIG. 7 is a schematic diagram of an actual application scenario of an embodiment of the disclosure.
  • Fig. 8 is a sequence diagram of neural network calculations using an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of an application scenario of a data access circuit in an embodiment of the disclosure.
  • Fig. 2 shows a data access device including the data access circuit of the present disclosure.
  • the device includes a processing unit PU, a data access circuit, and an external memory Memory.
  • the data access circuit is located between the PU and the Memory, and is responsible for data transfer between the PU and the Memory.
  • FIG. 3a is a schematic diagram of the structure of a data access circuit in an embodiment of the disclosure.
  • The data access circuit 300 includes: a plurality of internal memories 301; a first data interface circuit 302 for connecting a first internal memory of the plurality of internal memories 301 with the processing unit PU outside the circuit; a second data interface circuit 303 for connecting a second internal memory of the plurality of internal memories 301 with the external memory Memory; a first control circuit 304 for receiving a first control instruction to control the first data interface circuit 302 and the second data interface circuit 303; and a second control circuit 305 for receiving a second control instruction to control the data transmission direction of the second data interface circuit 303.
  • The multiple internal memories are random access memories (RAM), which can exchange data directly with the processor. It can be understood that although only two internal memories are shown in FIG. 3, the number of internal memories in practical applications can be set arbitrarily, which is not described in detail here.
  • The first data interface circuit 302, under the control of the first control circuit 304, determines at least one first internal memory from the plurality of internal memories 301 and connects it with the external processing unit, so that data can be transmitted between the first internal memory and the external processing unit PU. The second data interface circuit 303, under the control of the first control circuit 304, determines at least one second internal memory from the plurality of internal memories 301 and connects it with the external memory, so that data can be transmitted between the second internal memory and the external memory.
  • The first control circuit 304 receives a first control instruction to control the first data interface circuit 302 and the second data interface circuit 303. The first control instruction is issued by the external processing unit. After receiving the first control instruction, the first control circuit 304 decodes and executes it, and configures the first data interface circuit 302 and the second data interface circuit 303 according to the parameters in the first control instruction, so that the first data interface circuit 302 connects the corresponding first internal memory with the external processing unit, and the second data interface circuit 303 connects the corresponding second internal memory with the external memory.
  • The second data interface circuit 303 determines its data transfer direction under the control of the second control circuit 305, that is, whether data is sent from the second internal memory to the external memory or data acquired from the external memory is stored in the second internal memory.
  • The first internal memory and the second internal memory may be the same memory, that is, the external processing unit and the external memory are connected to the same internal memory.
  • The first data interface circuit 302 includes a first switch circuit, which includes a plurality of connection states, wherein each connection state is used to connect the processing unit to one of the plurality of internal memories.
  • FIG. 3b is a schematic diagram of a specific structure of a data access circuit in an embodiment of the disclosure.
  • The first data interface circuit 302 includes a first switch circuit SW_1, in which the number of connection states is the same as the number of internal memories. In the illustrated example, the internal memories are the two memories E_RAM and O_RAM, so the first switch circuit SW_1 includes two connection states, 0 and 1, used to connect E_RAM and O_RAM respectively. Both connection states lead to the external processing unit, and the first switch circuit SW_1 can be in only one connection state at a time, connecting the external processing unit with one internal memory.
  • The second data interface circuit 303 includes a second switch circuit, which includes a plurality of connection states, wherein each connection state is used to connect the external memory to one of the plurality of internal memories.
  • The second data interface circuit 303 includes a second switch circuit SW_2, in which the number of connection states is the same as the number of internal memories. In the illustrated example, the internal memories are the two memories E_RAM and O_RAM, so the second switch circuit SW_2 includes two connection states, 0 and 1, used to connect E_RAM and O_RAM respectively. Both connection states lead to the external memory, and the second switch circuit SW_2 can be in only one connection state at a time, connecting the external memory with one internal memory.
  • The second data interface circuit 303 further includes a storage controller connected between the external memory and the multiple internal memories and configured to control the data exchange between the external memory and the multiple internal memories. Exemplarily, as shown in FIG. 3b, the second data interface circuit 303 further includes a memory controller DMAC, which is located between the external memory and the plurality of internal memories, specifically connected between the second switch circuit SW_2 and the external memory, and is used to control the data exchange between the external memory and the multiple internal memories, that is, to control the transfer of data from the external memory to the multiple internal memories or from the multiple internal memories to the external memory.
  • The first control circuit 304 includes a switch control circuit for receiving a first control instruction sent by the external processing unit to generate a first switch control signal and a second switch control signal, wherein the first switch control signal is used to set the connection state of the first switch circuit, and the second switch control signal is used to set the connection state of the second switch circuit.
  • The first control circuit 304 includes a switch control circuit SW_ctrl, which receives the first control instruction Isw_dis sent by the external processing unit PU and, after receiving it, decodes and executes it to generate a first switch control signal C_SW1 and a second switch control signal C_SW2. The first control instruction includes parameters for controlling the first switch and the second switch, which indicate the connection state each switch should be in at the current moment. The first switch control signal C_SW1 and the second switch control signal C_SW2 are thus generated to set the connection states of the first switch SW_1 and the second switch SW_2 respectively, thereby determining which internal memory is connected to the external processing unit PU at this time and which internal memory is connected to the external memory. In the example of FIG. 3b, the external processing unit PU is connected to E_RAM, and the external memory Memory is connected to O_RAM.
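  • The decoding performed by SW_ctrl can be sketched in software as follows. This is a behavioural model only, not the circuit itself. The field names `sw1_state` and `sw2_state` are a hypothetical encoding of the parameters carried by Isw_dis, which the disclosure does not specify, and the mapping of state 0 to E_RAM and state 1 to O_RAM follows the two-memory example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Connection state 0 selects E_RAM, connection state 1 selects O_RAM.
RAM_NAMES = ("E_RAM", "O_RAM")


@dataclass(frozen=True)
class IswDis:
    """Hypothetical encoding of the first control instruction Isw_dis."""
    sw1_state: int  # requested connection state of SW_1 (PU side)
    sw2_state: int  # requested connection state of SW_2 (Memory side)


def sw_ctrl(instr: IswDis) -> Tuple[int, int]:
    """Decode Isw_dis into the switch control signals (C_SW1, C_SW2)."""
    for state in (instr.sw1_state, instr.sw2_state):
        if state not in (0, 1):
            raise ValueError(f"invalid connection state: {state}")
    return instr.sw1_state, instr.sw2_state


def connections(c_sw1: int, c_sw2: int) -> Dict[str, str]:
    """Return which internal memory each external side is connected to."""
    return {"PU": RAM_NAMES[c_sw1], "Memory": RAM_NAMES[c_sw2]}
```

  • For instance, `connections(*sw_ctrl(IswDis(0, 1)))` models the state of FIG. 3b, with the PU on E_RAM and the Memory on O_RAM.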
  • The second control circuit 305 includes an access control circuit, which is configured to receive a second control instruction sent by the external processing unit to generate a control signal, wherein the control signal is used to control the storage controller to fetch the data indicated by the control signal from the second internal memory and store it in the external memory, or to control the storage controller to fetch the data indicated by the control signal from the external memory and store it in the second internal memory.
  • Exemplarily, the second control circuit 305 includes an access control circuit LS_mem_ctrl, which receives the second control instruction Ils_dis sent by the external processing unit PU and, after receiving it, decodes and executes it to generate the control signal C_DMAC. The second control instruction Ils_dis is either the fetch instruction ld_mem, which fetches data from the external memory Memory and stores it in the internal memory RAM, or the store instruction st_mem, which stores data from the internal memory RAM to the external memory Memory. The control signal C_DMAC is used to configure and start the memory controller DMAC according to the parameters in the second control instruction, so that the DMAC fetches the data indicated by C_DMAC from the internal memory O_RAM and stores it in the external memory Memory, or fetches the data indicated by C_DMAC from the external memory Memory and stores it in the internal memory O_RAM.
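  • The role of LS_mem_ctrl and the DMAC can be sketched as a behavioural model. The instruction fields (`opcode`, `mem_addr`, `ram_addr`, `size`) and the dictionary layout of C_DMAC are hypothetical; the disclosure states only that the DMAC is configured with a read address, a write address, and a data size.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IlsDis:
    """Hypothetical encoding of the second control instruction Ils_dis."""
    opcode: str    # "ld_mem" (Memory -> RAM) or "st_mem" (RAM -> Memory)
    mem_addr: int  # address in the external memory
    ram_addr: int  # address in the internal memory
    size: int      # number of words to transfer


def ls_mem_ctrl(instr: IlsDis) -> dict:
    """Decode Ils_dis into a DMAC configuration (modelling C_DMAC)."""
    if instr.opcode == "ld_mem":
        return {"src": "Memory", "read_addr": instr.mem_addr,
                "write_addr": instr.ram_addr, "size": instr.size}
    if instr.opcode == "st_mem":
        return {"src": "RAM", "read_addr": instr.ram_addr,
                "write_addr": instr.mem_addr, "size": instr.size}
    raise ValueError(f"unknown opcode: {instr.opcode}")


def dmac_run(c_dmac: dict, memory: List[int], ram: List[int]) -> None:
    """Copy `size` words in the direction selected by the configuration."""
    src, dst = (memory, ram) if c_dmac["src"] == "Memory" else (ram, memory)
    for i in range(c_dmac["size"]):
        dst[c_dmac["write_addr"] + i] = src[c_dmac["read_addr"] + i]
```

  • Here `ram` stands for whichever internal memory the second switch circuit currently selects; a single instruction thus fixes both the transfer direction and the data block to be moved.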
  • The first switch circuit and the second switch circuit are two independent circuits whose connection states are controlled by the switch control circuit, so that the external processing unit PU accesses the internal memory RAM through the first switch circuit while the storage controller DMAC accesses the internal memory RAM through the second switch circuit; the two can run independently and in parallel.
  • The addresses of the multiple internal memories are the same, so that whichever internal memory the PU accesses, its addressing space is unchanged. The internal memory can also be addressed uniformly with the external memory; for example, the low address range is allocated to the internal memory RAM and the high address range is allocated to the external memory Memory.
  • The external processing unit PU only needs to access the addresses of the internal memory RAM, so its access address range is limited to the internal memory RAM. The memory controller DMAC needs to access both the internal memory RAM and the external memory Memory, so its access address range is the full range of the unified addressing of the internal and external memories. In this way, the external processing unit PU uses the same addresses whenever it reads or writes the internal memory: from its perspective it always uses the same internal memory RAM and does not distinguish which internal memory RAM it is. Similarly, the memory controller DMAC also uses a single address range for the internal memory (such as the low address range of the unified addressing), so it does not need to distinguish which specific internal memory it accesses. Calculation and storage therefore need no special consideration when the program is written: a traditional serial program can be written, which greatly reduces the complexity of the program.
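  • The unified addressing described above can be sketched as a simple address decoder. The sizes `RAM_SIZE` and `MEM_SIZE` are illustrative assumptions, and the function names are hypothetical; only the layout (low range for the internal RAM, high range for the external Memory) comes from the text.

```python
from typing import Tuple

RAM_SIZE = 0x1000   # assumed size of one internal RAM (illustrative)
MEM_SIZE = 0x10000  # assumed size of the external Memory (illustrative)


def decode(addr: int) -> Tuple[str, int]:
    """Map a unified address to (target, local offset).

    The low range addresses the internal RAM currently selected by a
    switch circuit; the high range addresses the external Memory.
    """
    if 0 <= addr < RAM_SIZE:
        return ("RAM", addr)
    if RAM_SIZE <= addr < RAM_SIZE + MEM_SIZE:
        return ("Memory", addr - RAM_SIZE)
    raise ValueError(f"address out of range: {addr:#x}")


def pu_may_access(addr: int) -> bool:
    """The PU's address range is limited to the internal RAM."""
    return 0 <= addr < RAM_SIZE
```

  • Because E_RAM and O_RAM share the same low range, the same decoded offset is valid regardless of which one a switch circuit has selected.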
  • FIG. 4 is a schematic flowchart of a data access method in an embodiment of the disclosure. As shown in Figure 4, the data access method includes:
  • Step S401 receiving a first control instruction to determine a first internal memory connected to an external processing unit from a plurality of internal memories;
  • Step S402 The external processing unit obtains data from the first internal memory or sends data to the first internal memory.
  • In step S401, after receiving the first control instruction sent by the external processing unit PU to the data access circuit 300, the switch control circuit of the data access circuit decodes and executes it to generate the first switch control signal, which controls the connection state of the first switch circuit so that the external processing unit is connected to the first internal memory of the plurality of internal memories. Then, in step S402, the external processing unit performs its arithmetic operation, acquiring data from the first internal memory or sending data to the first internal memory.
  • FIG. 5 is a schematic flowchart of another data access method in an embodiment of the disclosure. As shown in FIG. 5, the data access method includes:
  • Step S501 receiving a first control instruction to determine a second internal memory connected to an external memory from a plurality of internal memories
  • Step S502 receiving a second control instruction to determine the data transfer direction between the external memory and the second internal memory
  • Step S503 According to the data transfer direction, the data in the second internal memory is sent to the external memory or the data obtained from the external memory is stored in the second internal memory.
  • In step S501, after receiving the first control instruction sent by the external processing unit PU to the data access circuit 300, the switch control circuit of the data access circuit decodes and executes it to generate the second switch control signal, which controls the connection state of the second switch circuit so that the external memory is connected to the second internal memory of the plurality of internal memories. Then, in step S502, after receiving the second control instruction sent by the external processing unit PU to the data access circuit 300, the access control circuit of the data access circuit decodes and executes it to generate a control signal that controls the storage controller and determines the data transfer direction between the external memory and the second internal memory. Then, in step S503, according to the data transfer direction, the storage controller either sends the data indicated in the second control instruction from the second internal memory to the external memory, or acquires the data indicated in the second control instruction from the external memory and stores it in the second internal memory.
  • the above two data access methods are methods independently executed by the circuits at both ends of the data access circuit, which can be executed independently and in parallel, and the two methods can also be put together to complete more complex data access tasks.
  • the embodiments of the present disclosure also provide a data access method, including:
  • Step S601 receiving a first control instruction to determine a first internal memory connected to the external processing unit and a second internal memory connected to the external memory;
  • Step S602 receiving a second control instruction to determine the data transfer direction between the external memory and the second internal memory
  • Step S603 The external processing unit obtains data from the first internal memory or sends data to the first internal memory;
  • Step S604 According to the data transfer direction, the data in the second internal memory is sent to the external memory or the data obtained from the external memory is stored in the second internal memory.
  • The external processing unit PU sends a first control instruction Isw_dis to the data access circuit 300; the switch control circuit of the data access circuit receives the instruction and generates a first switch control signal C_SW1 and a second switch control signal C_SW2, where C_SW1 sets the connection state of the first switch to 0, so that the external processing unit is connected to E_RAM, and C_SW2 sets the connection state of the second switch to 1, so that the DMAC is connected to O_RAM.
  • The external processing unit PU sends a second control instruction Ils_dis to the data access circuit 300, where Ils_dis is the fetch instruction ld_mem. After the access control circuit of the data access circuit 300 receives the instruction ld_mem, it decodes and executes it to generate the control signal C_DMAC of the memory controller DMAC, which configures the read address, write address, and data size of the DMAC so that the data block determined by the instruction ld_mem is read out from the storage area at the read address in the external memory Memory and written into the storage area at the write address in O_RAM, where the read address and the write address may each be the first address of the corresponding storage area.
  • the external processing unit PU can execute its own instruction, the operand of the instruction is read from E_RAM, or the execution result data of the instruction is stored in E_RAM.
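  • Steps S601 to S604 can be put together in a small behavioural model. This is a sketch under stated assumptions: the class and method names are hypothetical, the two internal memories are plain lists, and the transfers that the hardware performs in parallel are executed one after another here.

```python
from typing import List


class DataAccessCircuit:
    """Behavioural model with two internal memories: index 0 = E_RAM, 1 = O_RAM."""

    def __init__(self, ram_size: int) -> None:
        self.rams = [[0] * ram_size, [0] * ram_size]
        self.sw1 = 0  # connection state of SW_1 (PU side)
        self.sw2 = 1  # connection state of SW_2 (DMAC side)

    def isw_dis(self, sw1: int, sw2: int) -> None:
        """S601: select the internal memories seen by the PU and by the DMAC."""
        self.sw1, self.sw2 = sw1, sw2

    def pu_read(self, addr: int) -> int:
        """S603: the PU obtains data from the first internal memory."""
        return self.rams[self.sw1][addr]

    def pu_write(self, addr: int, value: int) -> None:
        """S603: the PU sends data to the first internal memory."""
        self.rams[self.sw1][addr] = value

    def ld_mem(self, memory: List[int], mem_addr: int, ram_addr: int, size: int) -> None:
        """S602 and S604: transfer from the external memory to the second internal memory."""
        for i in range(size):
            self.rams[self.sw2][ram_addr + i] = memory[mem_addr + i]

    def st_mem(self, memory: List[int], mem_addr: int, ram_addr: int, size: int) -> None:
        """S602 and S604: transfer from the second internal memory to the external memory."""
        for i in range(size):
            memory[mem_addr + i] = self.rams[self.sw2][ram_addr + i]
```

  • In this model the PU always uses the same addresses; after `isw_dis(1, 0)` a read at address 0 returns what the DMAC previously loaded into O_RAM, while the DMAC side now reaches what the PU wrote into E_RAM.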
  • FIG. 7 is a schematic diagram of an actual application scenario of an embodiment of the disclosure.
  • the data access circuit in the embodiment of the present disclosure is used to make the external processing unit PU execute the calculation of the neural network.
  • Figure 7 shows a schematic diagram of the neural network.
  • the neural network has two layers.
  • The parameters and data used by each layer of the neural network are smaller than the capacity of a single block of RAM, that is, E_RAM and O_RAM can each accommodate the parameters and data for the calculation of one layer of the neural network.
  • E_RAM corresponds to the first layer of neural network layer1
  • O_RAM corresponds to the second layer of neural network layer2
  • the external processing unit PU can switch between E_RAM and O_RAM at different times to execute the two-layer neural network program.
  • Fig. 8 is a sequence diagram of neural network calculations using an embodiment of the present disclosure.
  • First, the first switch circuit selects E_RAM to connect with the PU, and the second switch circuit selects O_RAM to connect with the DMAC. The PU obtains the operands of layer1 through E_RAM and performs the calculation of layer1, or stores the result of the layer1 calculation in E_RAM; the DMAC updates the data in O_RAM according to the second control instruction of the PU, for example by storing data from Memory in O_RAM.
  • At t1, the PU issues the first control instruction, and the switch control circuit generates the first switch control signal and the second switch control signal to control the first switch circuit to select O_RAM and the second switch circuit to select E_RAM. The PU obtains the operands of layer2 through O_RAM and performs the calculation of layer2, or stores the result of the layer2 calculation in O_RAM; the DMAC updates the data in E_RAM according to the second control instruction of the PU, for example by storing data from Memory in E_RAM.
  • At t2, the PU issues the first control instruction, and the switch control circuit generates the first switch control signal and the second switch control signal to control the first switch circuit to select E_RAM and the second switch circuit to select O_RAM. The PU obtains the operands of layer1 through E_RAM and performs the calculation of layer1, or stores the result of the layer1 calculation in E_RAM; the DMAC updates the data in O_RAM according to the second control instruction of the PU, for example by storing data from Memory in O_RAM. This cycle alternates until the calculation task of the neural network is completed.
  • the RAM may not be switched, so that the calculation and the parameter/data update are performed on the same block of RAM, so that the calculation and storage will be performed serially on the same block of RAM.
  • An embodiment of the present disclosure also provides a data access device, which is characterized by comprising: at least one data access circuit as described in any of the above embodiments.
  • the data access device is, for example, a processing core.
  • An embodiment of the present disclosure also provides a chip, which is characterized by including at least one data access circuit as described in any of the above embodiments.
  • the embodiments of the present disclosure provide a computer program product, which is characterized in that it includes computer instructions, and when the computer instructions are executed by a computing device, the computing device can execute any of the data access methods in the foregoing embodiments.
  • the embodiments of the present disclosure provide a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions are used to make a computer execute any of the data access methods in the foregoing embodiments.
  • An embodiment of the present disclosure provides a computing device, which is characterized by comprising at least one chip described in any one of the foregoing embodiments.
  • each block in the flowchart or block diagram can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logic function.
  • the functions marked in the blocks can also occur in a different order from the order marked in the drawings. For example, two blocks shown one after another can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and each combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure can be implemented in software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
  • exemplary types of hardware logic components include: Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), Application Specific Standard Product (ASSP), System on Chip (SOC), Complex Programmable Logic Device (CPLD), and so on.
  • a machine-readable medium may be a tangible medium, which may contain or store a program for use by the instruction execution system, apparatus, or device or in combination with the instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • the machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • machine-readable storage media would include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
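Returning to the FIG. 8 timing described earlier in this list, the alternation between E_RAM and O_RAM can be sketched as a small schedule generator. This is only an illustrative software model, not the circuit itself: the function name `ping_pong_schedule` and the dictionary keys are assumptions introduced here, while the RAM and layer names come from the text.

```python
RAMS = ("E_RAM", "O_RAM")
LAYERS = ("layer1", "layer2")  # per the text: E_RAM holds layer1, O_RAM holds layer2


def ping_pong_schedule(num_phases):
    """Return one entry per phase: the RAM the PU computes on and the RAM the DMAC refreshes.

    Hypothetical helper for illustration; phase 0 runs before t1, phase 1
    between t1 and t2, and so on.
    """
    schedule = []
    for phase in range(num_phases):
        pu_side = phase % 2      # both switches flip at every phase boundary
        dmac_side = 1 - pu_side  # the DMAC gets the RAM the PU is not using
        schedule.append(
            {
                "phase": phase,
                "pu_ram": RAMS[pu_side],
                "pu_layer": LAYERS[pu_side],
                "dmac_ram": RAMS[dmac_side],
            }
        )
    return schedule


if __name__ == "__main__":
    for entry in ping_pong_schedule(4):
        print(entry)
```

In this model the PU and the DMAC never touch the same RAM in the same phase, which is what lets computation and data refresh overlap.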

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

Disclosed are a data access circuit and method. The data access circuit comprises: a plurality of internal memories; a first data interface circuit for connecting a first internal memory from among the plurality of internal memories to an external processing unit; a second data interface circuit for connecting a second internal memory from among the plurality of internal memories to an external memory; a first control circuit for receiving a first control instruction to control the first data interface circuit and the second data interface circuit; and a second control circuit for receiving a second control instruction to control a data transmission direction of the second data interface circuit. Because the control circuits in the data access circuit control the interface circuits, both the capacity and the access efficiency of the internal memories are improved, thereby solving the technical problems in the prior art of insufficient memory capacity, low memory access efficiency, and complex circuits.

Description

Data access circuit and method
Technical field
The present disclosure relates to the field of data access, and in particular to a data access circuit and method.
Background
With the development of science and technology, human society is rapidly entering the intelligent era. A defining feature of this era is that people obtain more and more kinds of data in ever larger volumes, while the required processing speed keeps rising. Chips are the cornerstone of data processing and fundamentally determine people's ability to process data. In terms of application fields, chips follow two main routes. One is the general-purpose route, for example the central processing unit (CPU); such chips offer great flexibility, but their effective computing power is relatively low when processing domain-specific algorithms. The other is the special-purpose route, for example the tensor processing unit (TPU); such chips deliver high effective computing power in certain specific fields, but handle flexible, more general workloads poorly or not at all. Because the data of the intelligent era is both varied and enormous, chips are required to be highly flexible, able to handle rapidly changing algorithms from different fields, and also highly capable, able to quickly process very large and sharply growing volumes of data.
In neural network computing, multi-core or many-core chips are often used. Each processing core in a multi-core (many-core) architecture has a certain independent processing capability and a relatively large in-core storage space for its own programs, data, and weights. How well the basic computing power of a single core is exploited determines the neural network computing capability of the whole chip, and that in turn is determined by the ideal computing power of the core's computing unit and by its memory access efficiency.
Different storage units are accessed at different speeds. In general, registers are the fastest, with one access taking a few hundred ps (picoseconds). Next is static random access memory (SRAM), whose access time generally ranges from a few hundred ps to a few ns (nanoseconds). Next is main memory, that is, double data rate synchronous dynamic random access memory (DDR SDRAM), whose access time is generally tens to hundreds of ns. Slowest are other storage devices accessed through IO (input/output) ports, such as hard disks, whose access time is generally on the order of ms (milliseconds).
In neural network processing, the main concern is the processing unit's access to memory. A processing unit is very fast: its clock frequency is generally a few hundred MHz (megahertz) to a few GHz (gigahertz), so its cycle time is on the ps-to-ns scale, whereas a memory access takes tens of ns. The two speeds differ considerably. Bridging this gap so that the computing power of the processing unit is used effectively is a difficult point in modern CPU design.
To match the speeds of the processing unit and the memory, the prior art generally uses the scheme of FIG. 1. As shown in FIG. 1, PU (Processing Unit) is the processing unit, Cache is the cache, and Memory is the memory. In this scheme a Cache is inserted between the PU and the Memory, and the PU accesses the Memory in a layered, indirect manner: it accesses the Cache directly and accesses the Memory indirectly through the Cache. The Cache is a mapping of the Memory, and its content is a subset of the content of the Memory. It is transparent to programs running on the PU, has no functional meaning, and has no independent address space; its addresses are the same as the Memory addresses being accessed, so a program cannot access the Cache separately.
With this existing scheme, the parameters and data used in neural network computing are huge and usually far exceed the capacity of the Cache. The measures that a Cache takes to reduce its miss rate, which rely on the temporal locality and spatial locality of data, therefore become ineffective, which greatly reduces the computing power the processing unit can deliver. In addition, because the Cache circuitry is complex, it greatly increases the difficulty of chip design and the cost of the chip.
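The capacity argument above can be illustrated with a toy cache simulation. The sketch below assumes a fully associative cache with least-recently-used replacement and a program that sweeps repeatedly over its data; neither assumption comes from the disclosure, and the helper names `lru_hit_rate` and `repeated_sweep` are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(capacity, accesses):
    """Fraction of accesses served by a fully associative LRU cache of `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for addr in accesses:
        total += 1
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used line
            cache[addr] = True
    return hits / total if total else 0.0


def repeated_sweep(working_set, passes):
    """Address trace that walks addresses 0..working_set-1 `passes` times."""
    return [addr for _ in range(passes) for addr in range(working_set)]


if __name__ == "__main__":
    print(lru_hit_rate(4, repeated_sweep(4, 2)))   # working set fits in the cache
    print(lru_hit_rate(4, repeated_sweep(5, 10)))  # working set one line too large
```

Under these assumptions, a working set only one line larger than the cache already drives the hit rate of a repeated sweep to zero, which is the kind of locality breakdown the paragraph describes.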
Summary of the invention
This summary is provided to introduce concepts in a brief form; these concepts are described in detail in the detailed description below. This summary is not intended to identify key features or essential features of the claimed technical solution, nor is it intended to limit the scope of the claimed technical solution.
To solve the technical problems in the prior art that a cache has insufficient capacity when the amount of data is large and that its circuitry is complex, the embodiments of the present disclosure propose the following technical solutions:
In a first aspect, an embodiment of the present disclosure provides a data access circuit, including:
a plurality of internal memories;
a first data interface circuit, configured to connect a first internal memory among the plurality of internal memories to an external processing unit;
a second data interface circuit, configured to connect a second internal memory among the plurality of internal memories to an external memory;
a first control circuit, configured to receive a first control instruction to control the first data interface circuit and the second data interface circuit; and
a second control circuit, configured to receive a second control instruction to control the data transfer direction of the second data interface circuit.
Further, the plurality of internal memories have the same addresses.
Further, the first data interface circuit includes:
a first switch circuit having a plurality of connection states, each connection state connecting the processing unit to one of the plurality of internal memories.
Further, the second data interface circuit includes:
a second switch circuit having a plurality of connection states, each connection state connecting the external memory to one of the plurality of internal memories.
Further, the second data interface circuit also includes:
a memory controller, connected between the external memory and the plurality of internal memories and configured to control the data exchange between the external memory and the plurality of internal memories.
Further, the first control circuit includes:
a switch control circuit, configured to receive the first control instruction sent by the external processing unit and generate a first switch control signal and a second switch control signal, where the first switch control signal sets the connection state of the first switch circuit and the second switch control signal sets the connection state of the second switch circuit.
Further, the second control circuit includes:
an access control circuit, configured to receive the second control instruction sent by the external processing unit and generate a control signal, where the control signal controls the memory controller either to fetch the data indicated by the control signal from the second internal memory and store it in the external memory, or to fetch the data indicated by the control signal from the external memory and store it in the second internal memory.
Further, the plurality of internal memories and the external memory are uniformly addressed.
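As a rough behavioural illustration of the first aspect, the two switch circuits can be modelled as selectors over internal memories that share one address range. This is a minimal sketch under assumptions: the class name `SwitchedRams`, its method names, and the four-word memory size are invented here, and real switch circuits are hardware rather than Python objects.

```python
class SwitchedRams:
    """Toy model: two switch circuits selecting among identically addressed internal memories."""

    def __init__(self, names=("E_RAM", "O_RAM"), words=4):
        self.names = tuple(names)
        # Every internal memory covers the same addresses 0..words-1.
        self.rams = {name: [0] * words for name in self.names}
        self.sw1 = 0  # connection state of the first switch (processing unit side)
        self.sw2 = 1  # connection state of the second switch (external memory side)

    def set_states(self, sw1, sw2):
        """Apply the connection states carried by a first control instruction."""
        if sw1 not in range(len(self.names)) or sw2 not in range(len(self.names)):
            raise ValueError("unknown connection state")
        self.sw1, self.sw2 = sw1, sw2

    def pu_port(self):
        """Internal memory currently visible to the processing unit."""
        return self.rams[self.names[self.sw1]]

    def memory_port(self):
        """Internal memory currently visible to the external memory side."""
        return self.rams[self.names[self.sw2]]


if __name__ == "__main__":
    circuit = SwitchedRams()
    circuit.pu_port()[2] = 11      # processing unit writes address 2 of E_RAM
    circuit.memory_port()[2] = 22  # external side fills address 2 of O_RAM
    circuit.set_states(1, 0)       # swap the two connections
    print(circuit.pu_port()[2])    # same address, now served by O_RAM
```

Because the internal memories share their addresses, the program on the processing unit keeps using address 2 after the swap; only the switch state decides which physical memory answers.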
In a second aspect, an embodiment of the present disclosure provides a data access method, including:
receiving a first control instruction to determine a first internal memory connected to an external processing unit and a second internal memory connected to an external memory;
receiving a second control instruction to determine the data transfer direction between the external memory and the second internal memory;
obtaining, by the external processing unit, data from the first internal memory, or sending, by the external processing unit, data to the first internal memory; and
according to the data transfer direction, sending data in the second internal memory to the external memory, or storing data obtained from the external memory in the second internal memory.
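The four steps of the method can be walked through in software as follows. This is a hedged sketch only: the instruction formats (plain dictionaries with keys such as `pu_side`, `direction`, and `count`) and the function name `data_access_method` are assumptions made for illustration, and the steps are executed one after another here even though the hardware is meant to let them overlap.

```python
def data_access_method(internal, external, first_instr, second_instr, pu_request):
    """Run one pass of the four method steps on list-backed memories (illustrative only)."""
    # Step 1: the first control instruction picks the two internal memories.
    first_mem = internal[first_instr["pu_side"]]
    second_mem = internal[first_instr["memory_side"]]

    # Step 2: the second control instruction fixes the transfer direction.
    direction = second_instr["direction"]  # "load" or "store" (assumed encoding)
    if direction not in ("load", "store"):
        raise ValueError("direction must be 'load' or 'store'")

    # Step 3: the processing unit reads from or writes to the first internal memory.
    result = None
    if pu_request["op"] == "read":
        result = first_mem[pu_request["addr"]]
    else:
        first_mem[pu_request["addr"]] = pu_request["value"]

    # Step 4: data moves between the second internal memory and the external memory.
    ram_addr = second_instr["ram_addr"]
    ext_addr = second_instr["ext_addr"]
    count = second_instr["count"]
    if direction == "store":
        external[ext_addr:ext_addr + count] = second_mem[ram_addr:ram_addr + count]
    else:
        second_mem[ram_addr:ram_addr + count] = external[ext_addr:ext_addr + count]
    return result


if __name__ == "__main__":
    internal = {"E_RAM": [0] * 4, "O_RAM": [0] * 4}
    external = list(range(10, 20))
    data_access_method(
        internal,
        external,
        {"pu_side": "E_RAM", "memory_side": "O_RAM"},
        {"direction": "load", "ram_addr": 0, "ext_addr": 2, "count": 3},
        {"op": "write", "addr": 1, "value": 99},
    )
    print(internal)
```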
In a third aspect, an embodiment of the present disclosure provides a data access device, including:
at least one data access circuit according to any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: a memory configured to store computer-readable instructions; and one or more processors configured to run the computer-readable instructions, such that the processors, when running, implement any data access method of the second aspect.
In a fifth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions being used to cause a computer to execute any data access method of the second aspect.
In a sixth aspect, an embodiment of the present disclosure provides a computer program product including computer instructions; when the computer instructions are executed by a computing device, the computing device can execute any data access method of the second aspect.
In a seventh aspect, an embodiment of the present disclosure provides a chip including at least one data access device according to the third aspect.
In an eighth aspect, an embodiment of the present disclosure provides a computing device including at least one chip according to the seventh aspect.
The embodiments of the present disclosure disclose a data access circuit and method. The data access circuit includes: a plurality of internal memories; a first data interface circuit, configured to connect a first internal memory among the plurality of internal memories to an external processing unit; a second data interface circuit, configured to connect a second internal memory among the plurality of internal memories to an external memory; a first control circuit, configured to receive a first control instruction to control the first data interface circuit and the second data interface circuit; and a second control circuit, configured to receive a second control instruction to control the data transfer direction of the second data interface circuit. Because the control circuits in the data access circuit control the interface circuits, both the capacity and the access efficiency of the internal memories are improved, which solves the technical problems in the prior art of insufficient memory capacity, low access efficiency, and complex circuits.
The above is only an overview of the technical solutions of the present disclosure. So that the technical means of the present disclosure can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present disclosure are more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent with reference to the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference signs denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic structural diagram of a processing scheme in the prior art;
FIG. 2 is a schematic diagram of an application scenario of the data access circuit in an embodiment of the present disclosure;
FIG. 3a is a schematic structural diagram of the data access circuit in an embodiment of the present disclosure;
FIG. 3b is a schematic diagram of a specific structure of the data access circuit in an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of a data access method in an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of another data access method in an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of another data access method in an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a practical application scenario of an embodiment of the present disclosure;
FIG. 8 is a timing diagram of a neural network calculation performed using an embodiment of the present disclosure.
Detailed description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps recorded in the method embodiments of the present disclosure may be executed in a different order and/or in parallel. In addition, method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and its variants as used herein are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms are given in the description below.
Note that concepts such as "first" and "second" mentioned in the present disclosure are used only to distinguish different devices, modules, or units, and are not used to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units.
Note that the modifiers "a" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "one or more".
The names of messages or information exchanged between devices in the embodiments of the present disclosure are used for illustrative purposes only and are not intended to limit the scope of these messages or information.
图2为本公开实施例中的数据存取电路的应用场景示意图。如图2所示为一个包括本公开的数据存取电路的数据存取装置。该装置包括处理单元PU,数据存取电路,以及外部存储器Memory。所述数据存取电路位于所述PU和所述Memory之间,负责PU和Memory之间的数据传递。FIG. 2 is a schematic diagram of an application scenario of a data access circuit in an embodiment of the disclosure. Fig. 2 shows a data access device including the data access circuit of the present disclosure. The device includes a processing unit PU, a data access circuit, and an external memory Memory. The data access circuit is located between the PU and the Memory, and is responsible for data transfer between the PU and the Memory.
图3a为本公开实施例中的数据存取电路的结构示意图。如图3所示,所述数据存取电路300中包括:多个内部存储器301;第一数据接口电路302,用于将所述多个内部存储器301中的第一内部存储器和所述存储器管理电路外部的处理单元PU相连接;第二数据接口电路303,用于将所述多个内部存储器301中的第二内部存储器与外部存储器Memory相连接;第一控制电路304,用于接收第一控制指令以控制所述第一数据接口电路302和所述第二数据接口电路303;第二控制电路305,用于接收第二控制指令以控制所述第二数据接口电路303的数据传送方向。FIG. 3a is a schematic diagram of the structure of a data access circuit in an embodiment of the disclosure. As shown in FIG. 3, the data access circuit 300 includes: a plurality of internal memories 301; a first data interface circuit 302 for managing the first internal memory and the memory among the plurality of internal memories 301 The processing unit PU outside the circuit is connected; the second data interface circuit 303 is used to connect the second internal memory of the plurality of internal memories 301 with the external memory Memory; the first control circuit 304 is used to receive the first The control instruction is used to control the first data interface circuit 302 and the second data interface circuit 303; the second control circuit 305 is used to receive a second control instruction to control the data transmission direction of the second data interface circuit 303.
示例性的,所述多个内部存储器为随机存储器RAM,其可以直接与处理器交换数据,可以理解的,虽然图3中仅示出两个内部存储器,在实际应用中所述内部存储器的数量可以任意设置,在此不再赘述。Exemplarily, the multiple internal memories are random access memory RAM, which can directly exchange data with the processor. It is understandable that although only two internal memories are shown in FIG. 3, the number of internal memories in practical applications It can be set arbitrarily, so I won't repeat it here.
所述第一数据接口电路302在所述第一控制电路304的控制下,从所述多个内部存储器301中确定至少一个第一内部存储器,使其与所述外部处理单元连接,使得所述第一内部存储器与所述外部处理单元PU之间可以进行数据传输;所述第二数据接口电路303在所述第一控制电路304的控制下,从所述多个内部存储器301中确定至少一个第二内部存储器,使其与所述外部存储器连接,使得所述第二内部存储器与所述外部存储器之间可以进行数据传输。所述第一控制电路304通过接收第一控制指令以控制所述第一数据接口电路302和第二数据接口电路303,可选的,所述第一控制指令由所述外部处理单元发出,所述第一控制电路304在接收到所述第一控制指令之后,经过解码和执行,根据所述第一控制指令中的参数来配置所述第一数据接口电路302和第二数据接口电路303,使得所述第一数据接口电路302连通相应的第一内部存储器和外部处理单元,得到所述第二数据接口电路303连通相应的第二内部存储器和外部处理单元。The first data interface circuit 302, under the control of the first control circuit 304, determines at least one first internal memory from the plurality of internal memories 301 to connect it with the external processing unit, so that the Data transmission can be performed between the first internal memory and the external processing unit PU; the second data interface circuit 303, under the control of the first control circuit 304, determines at least one from the plurality of internal memories 301 The second internal memory is connected to the external memory, so that data transmission can be performed between the second internal memory and the external memory. The first control circuit 304 receives a first control instruction to control the first data interface circuit 302 and the second data interface circuit 303. Optionally, the first control instruction is issued by the external processing unit, so The first control circuit 304, after receiving the first control instruction, undergoes decoding and execution, and configures the first data interface circuit 302 and the second data interface circuit 303 according to the parameters in the first control instruction, The first data interface circuit 302 is made to communicate with the corresponding first internal memory and the external processing unit, and the second data interface circuit 303 is obtained to communicate with the corresponding second internal memory and the external processing unit.
所述第二数据接口电路303在所述第二控制电路305的控制下,确定所述第二数据接口电路的数据传送方向,即从第二内部存储器发送数据到所述外部存储器中或者从外部存储器中获取数存入所述第二内部存储器中。The second data interface circuit 303 determines the data transfer direction of the second data interface circuit under the control of the second control circuit 305, that is, sends data from the second internal memory to the external memory or from the outside. The acquired data in the memory is stored in the second internal memory.
可以理解的,所述第一内部存储器和所述第二内部存储器可以是同一个存储器,即外部处理单元和外部存储器连接到同一个内部存储器上。It can be understood that the first internal memory and the second internal memory may be the same memory, that is, the external processing unit and the external memory are connected to the same internal memory.
可选的,所述第一数据接口电路302包括:第一开关电路,所述第一开关电路包括多个连接状态,其中每个连接状态用于将所述处理单元与所述多个内部存储器中的一个连接。图3b为本公开实施例中的数据存取电路的具体结构示意图。示例性的,如图3b所述,所述第一数据接口电路302包括第一开关电路SW_1,其中所述第一开关电路SW_1中的连接状态与内部存储器的数量相同,如图3b中的示例所示,内部存储器包括E_RAM和O_RAM两个,所以第一开关电路SW_1包括0和1两个连接状态,分别用于连接E_RAM和O_RAM,而两个连接状态均与外部处理单元连接,这样第一开关电路SW_1在每个时刻只能处于一个连接状态,以将外部处理单元和内部存储单元 连接。Optionally, the first data interface circuit 302 includes: a first switch circuit, the first switch circuit includes a plurality of connection states, wherein each connection state is used to connect the processing unit to the plurality of internal memories One of the connections. FIG. 3b is a schematic diagram of a specific structure of a data access circuit in an embodiment of the disclosure. Exemplarily, as shown in FIG. 3b, the first data interface circuit 302 includes a first switch circuit SW_1, wherein the connection state in the first switch circuit SW_1 is the same as the number of internal memories, as shown in the example in FIG. 3b As shown, the internal memory includes two E_RAM and O_RAM, so the first switch circuit SW_1 includes two connection states of 0 and 1, respectively used to connect E_RAM and O_RAM, and the two connection states are both connected to the external processing unit, so that the first The switch circuit SW_1 can only be in one connection state at a time to connect the external processing unit and the internal storage unit.
可选的,所述第二数据接口电路303包括:第二开关电路,所述第二开关电路包括多个连接状态,其中每个连接状态用于将所述外部存储器与所述多个内部存储器中的一个连接。,示例性的,如图3b所示,所述第二数据接口电路303包括第二开关电路SW_2,中所述第一开关电路SW_2中的连接状态与内部存储器的数量相同,如图3b中的示例所示,内部存储器包括E_RAM和O_RAM两个,所以第一开关电路SW_2包括0和1两个连接状态,分别用于连接E_RAM和O_RAM,而两个连接状态均与外部存储器连接,这样第一开关电路SW_2在每个时刻只能处于一个连接状态,以将外部存储器和内部存储单元连接。Optionally, the second data interface circuit 303 includes: a second switch circuit, the second switch circuit includes a plurality of connection states, wherein each connection state is used to connect the external memory to the plurality of internal memories. One of the connections. Exemplarily, as shown in FIG. 3b, the second data interface circuit 303 includes a second switch circuit SW_2, and the connection state in the first switch circuit SW_2 is the same as the number of internal memories, as shown in FIG. 3b As shown in the example, the internal memory includes two E_RAM and O_RAM, so the first switch circuit SW_2 includes two connection states of 0 and 1, respectively used to connect E_RAM and O_RAM, and the two connection states are both connected to the external memory, so that the first The switch circuit SW_2 can only be in one connection state at a time to connect the external memory and the internal memory unit.
Optionally, the second data interface circuit 303 further includes a memory controller connected between the external memory and the plurality of internal memories, which controls the data exchange between the external memory and the plurality of internal memories. Exemplarily, as shown in FIG. 3b, the second data interface circuit 303 further includes a memory controller DMAC located between the external memory and the plurality of internal memories; specifically, it is connected between the second switch circuit SW_2 and the external memory. The DMAC controls the data exchange between the external memory and the plurality of internal memories, that is, it moves data from the external memory to the internal memories or from the internal memories to the external memory.
Optionally, the first control circuit 304 includes a switch control circuit that receives a first control instruction sent by the external processing unit and generates a first switch control signal and a second switch control signal, where the first switch control signal sets the connection state of the first switch circuit and the second switch control signal sets the connection state of the second switch circuit. Exemplarily, as shown in FIG. 3b, the first control circuit 304 includes a switch control circuit SW_ctrl, which receives the first control instruction Isw_dis sent by the external processing unit PU. After receiving Isw_dis, SW_ctrl decodes and executes it to generate the first switch control signal C_SW1 and the second switch control signal C_SW2. The first control instruction carries parameters for the first switch and the second switch that indicate which connection state each switch should be in at the current moment, so C_SW1 and C_SW2 can be generated to set the connection states of the first switch SW_1 and the second switch SW_2 respectively. This determines which internal memory the external processing unit PU is connected to at that moment and which internal memory the external memory is connected to. In the example of FIG. 3b, at that moment the external processing unit PU is connected to E_RAM and the external memory Memory is connected to O_RAM.
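The decoding step performed by SW_ctrl can be illustrated with a small software model. This is only a sketch under stated assumptions: the disclosure does not give the encoding of Isw_dis, so the `SwitchInstruction` fields, the `BANKS` table and both function names below are hypothetical.

```python
from dataclasses import dataclass

# Assumption: connection state 0 selects E_RAM and state 1 selects O_RAM, as in FIG. 3b.
BANKS = ("E_RAM", "O_RAM")


@dataclass(frozen=True)
class SwitchInstruction:
    """Hypothetical stand-in for the first control instruction Isw_dis."""
    sw1_state: int  # requested connection state of SW_1 (PU side)
    sw2_state: int  # requested connection state of SW_2 (external memory side)


def decode_switch_instruction(instr):
    """Model of SW_ctrl: turn Isw_dis into the control signals (C_SW1, C_SW2)."""
    for state in (instr.sw1_state, instr.sw2_state):
        if state not in range(len(BANKS)):
            raise ValueError(f"unknown connection state: {state}")
    return instr.sw1_state, instr.sw2_state


def connected_banks(c_sw1, c_sw2):
    """Report which internal memory the PU and the external memory are each connected to."""
    return {"PU": BANKS[c_sw1], "Memory": BANKS[c_sw2]}


c_sw1, c_sw2 = decode_switch_instruction(SwitchInstruction(sw1_state=0, sw2_state=1))
print(connected_banks(c_sw1, c_sw2))  # {'PU': 'E_RAM', 'Memory': 'O_RAM'}
```

The model deliberately allows both switches to select the same bank, since the description also permits computation and data updates to run serially on one RAM.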
Optionally, the second control circuit 305 includes an access control circuit that receives a second control instruction sent by the external processing unit and generates a control signal. The control signal controls the memory controller either to fetch the data indicated by the control signal from the second internal memory and store it in the external memory, or to fetch the data indicated by the control signal from the external memory and store it in the second internal memory. Exemplarily, as shown in FIG. 3b, the second control circuit 305 includes an access control circuit LS_mem_ctrl, which receives the second control instruction Ils_dis sent by the external processing unit PU. After receiving Ils_dis, LS_mem_ctrl decodes and executes it to generate the control signal C_DMAC. The second control instruction Ils_dis is either the load instruction ld_mem, which fetches data from the external memory Memory and stores it in the internal memory RAM, or the store instruction st_mem, which stores data from the internal memory RAM into the external memory Memory. According to the parameters in the second control instruction, the control signal C_DMAC configures and starts the memory controller DMAC, so that the DMAC either fetches the data indicated by C_DMAC from the internal memory O_RAM and stores it in the external memory Memory, or fetches the data indicated by C_DMAC from the external memory Memory and stores it in the internal memory O_RAM.
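A minimal sketch of how LS_mem_ctrl might translate ld_mem and st_mem into a DMAC configuration is shown below. The instruction fields, the dictionary layout used for C_DMAC, and the function names are all assumptions made for illustration; the disclosure only states that the instruction carries parameters that configure and start the DMAC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemInstruction:
    """Hypothetical stand-in for the second control instruction Ils_dis."""
    opcode: str    # "ld_mem" (Memory -> RAM) or "st_mem" (RAM -> Memory)
    ext_addr: int  # start address in the external memory
    ram_addr: int  # start address in the internal RAM
    size: int      # number of words to move


def decode_mem_instruction(instr):
    """Model of LS_mem_ctrl: build the DMAC configuration (C_DMAC)."""
    if instr.opcode == "ld_mem":
        return {"src": "Memory", "src_addr": instr.ext_addr,
                "dst": "RAM", "dst_addr": instr.ram_addr, "size": instr.size}
    if instr.opcode == "st_mem":
        return {"src": "RAM", "src_addr": instr.ram_addr,
                "dst": "Memory", "dst_addr": instr.ext_addr, "size": instr.size}
    raise ValueError(f"unknown opcode: {instr.opcode}")


def run_dmac(c_dmac, memory, ram):
    """Model of the DMAC: copy the configured block between Memory and the selected RAM."""
    spaces = {"Memory": memory, "RAM": ram}
    src, dst = spaces[c_dmac["src"]], spaces[c_dmac["dst"]]
    s, d, n = c_dmac["src_addr"], c_dmac["dst_addr"], c_dmac["size"]
    if s < 0 or d < 0 or s + n > len(src) or d + n > len(dst):
        raise ValueError("transfer exceeds the source or destination range")
    dst[d:d + n] = src[s:s + n]


memory = list(range(100, 116))  # external memory contents: 100..115
o_ram = [0] * 8                 # the internal memory currently selected by SW_2
run_dmac(decode_mem_instruction(MemInstruction("ld_mem", 4, 2, 3)), memory, o_ram)
print(o_ram)  # [0, 0, 104, 105, 106, 0, 0, 0]
```

A st_mem instruction uses the same path with source and destination swapped, which is why the direction is resolved once in the decoder rather than inside the copy routine.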
In the above embodiment, the first switch circuit and the second switch circuit are two independent circuits whose connection states are controlled by the switch control circuit. The external processing unit PU accesses the internal memory RAM through the first switch circuit, and the memory controller DMAC accesses the internal memory RAM through the second switch circuit; the two can run independently and in parallel.
In the embodiments of the present disclosure, the plurality of internal memories share the same addresses, so the addressing space seen by the PU does not change whichever internal memory it accesses. The internal memories can also be addressed uniformly with the external memory, for example with the low address range allocated to the internal memory RAM and the high address range allocated to the external memory Memory. The external processing unit PU only needs to access the internal memory RAM, so its access range is limited to the RAM addresses. The memory controller DMAC must access both the internal memory RAM and the external memory Memory, so its access range is the full, uniformly addressed range covering both. When the external processing unit PU reads or writes the internal memory, it therefore always uses the same addresses; from the PU's point of view it is always using one and the same internal memory RAM, and it does not distinguish which physical internal memory that is. Likewise, the memory controller DMAC uses the same addresses (for example, the low address range of the unified address space) and does not need to distinguish between the internal memories either. As a result, the programmer does not need to account for the parallel execution of computation and storage and can write the program as a conventional serial program, which greatly reduces programming complexity.
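The addressing scheme can be made concrete with a short sketch. The sizes `RAM_SIZE` and `MEM_SIZE` and the function names are hypothetical; the disclosure only says that a low range may map to the internal RAM and a high range to the external memory.

```python
# Assumed sizes for illustration only; the disclosure does not specify them.
RAM_SIZE = 0x1000  # low range [0, RAM_SIZE) maps to the selected internal RAM
MEM_SIZE = 0x4000  # high range [RAM_SIZE, RAM_SIZE + MEM_SIZE) maps to Memory
BANKS = ("E_RAM", "O_RAM")


def resolve(addr, selected_bank):
    """Map a unified address to (target, offset within the target)."""
    if 0 <= addr < RAM_SIZE:
        return selected_bank, addr
    if RAM_SIZE <= addr < RAM_SIZE + MEM_SIZE:
        return "Memory", addr - RAM_SIZE
    raise ValueError(f"address out of range: {addr:#x}")


def pu_resolve(addr, c_sw1):
    """The PU may only use the RAM range; SW_1 decides which bank backs it."""
    if not 0 <= addr < RAM_SIZE:
        raise ValueError("the PU can only address the internal RAM range")
    return resolve(addr, BANKS[c_sw1])


def dmac_resolve(addr, c_sw2):
    """The DMAC sees the full unified range; SW_2 decides which bank backs the low part."""
    return resolve(addr, BANKS[c_sw2])


print(pu_resolve(0x10, 0))      # ('E_RAM', 16)
print(pu_resolve(0x10, 1))      # ('O_RAM', 16)
print(dmac_resolve(0x1000, 1))  # ('Memory', 0)
```

The same PU address 0x10 lands in E_RAM or O_RAM depending only on the switch state, which is the property that lets the program be written without naming a bank.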
FIG. 4 is a schematic flowchart of a data access method in an embodiment of the present disclosure. As shown in FIG. 4, the data access method includes:
Step S401: receive a first control instruction to determine, from a plurality of internal memories, a first internal memory connected to an external processing unit;
Step S402: the external processing unit obtains data from the first internal memory or sends data to the first internal memory.
In step S401, after the external processing unit PU sends the first control instruction to the data access circuit 300, the switch control circuit of the data access circuit receives, decodes and executes the instruction to generate the control signal of the first switch, which sets the connection state of the first switch circuit so that the external processing unit is connected to the first internal memory among the plurality of internal memories. Then, in step S402, the external processing unit performs its computation, obtaining data from the first internal memory or sending data to the first internal memory.
FIG. 5 is a schematic flowchart of another data access method in an embodiment of the present disclosure. As shown in FIG. 5, the data access method includes:
Step S501: receive a first control instruction to determine, from a plurality of internal memories, a second internal memory connected to an external memory;
Step S502: receive a second control instruction to determine the data transfer direction between the external memory and the second internal memory;
Step S503: according to the data transfer direction, send the data in the second internal memory to the external memory, or store the data obtained from the external memory in the second internal memory.
In step S501, after the external processing unit PU sends the first control instruction to the data access circuit 300, the switch control circuit of the data access circuit receives, decodes and executes the instruction to generate the control signal of the second switch, which sets the connection state of the second switch circuit so that the external memory is connected to the second internal memory among the plurality of internal memories. Then, in step S502, after the external processing unit PU sends the second control instruction to the data access circuit 300, the access control circuit of the data access circuit receives, decodes and executes the instruction to generate the control signal for the memory controller, which determines the data transfer direction between the external memory and the second internal memory. Then, in step S503, according to the data transfer direction, the memory controller either sends the data indicated in the second control instruction from the second internal memory to the external memory, or obtains the data indicated in the second control instruction from the external memory and stores it in the second internal memory.
The two data access methods above are performed independently by the circuits at the two ends of the data access circuit and can be executed independently and in parallel. The two methods can also be combined to complete more complex data access tasks.
Therefore, an embodiment of the present disclosure further provides a data access method, including:
Step S601: receive a first control instruction to determine a first internal memory connected to the external processing unit and a second internal memory connected to the external memory;
Step S602: receive a second control instruction to determine the data transfer direction between the external memory and the second internal memory;
Step S603: the external processing unit obtains data from the first internal memory or sends data to the first internal memory;
Step S604: according to the data transfer direction, send the data in the second internal memory to the external memory, or store the data obtained from the external memory in the second internal memory.
An example of the above method is as follows. As shown in FIG. 3b, the external processing unit PU sends the first control instruction Isw_dis to the data access circuit 300. The switch control circuit of the data access circuit receives the instruction and generates the first switch control signal C_SW1 and the second switch control signal C_SW2, where C_SW1 sets the connection state of the first switch to 0, so that the external processing unit is connected to E_RAM, and C_SW2 sets the connection state of the second switch to 1, so that the DMAC is connected to O_RAM. The external processing unit PU then sends the second control instruction Ils_dis to the data access circuit 300; here Ils_dis is the load instruction ld_mem. After receiving ld_mem, the access control circuit of the data access circuit 300 decodes and executes it to obtain the control signal C_DMAC of the memory controller DMAC, which configures the read address, the write address and the data size of the DMAC. The data block specified by ld_mem is then read from the storage region at the read address in the external memory Memory and written to the storage region at the write address in O_RAM, where the read address and the write address may each be a start address. Meanwhile, the external processing unit PU can execute its own instructions, reading their operands from E_RAM or storing their results in E_RAM.
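The example above can be walked through with a toy software model. This is a sketch, not the disclosed hardware: the class `DataAccessCircuit` and its method names are hypothetical, memories are plain Python lists, and the model runs the DMAC transfer and the PU access one after the other, whereas in the circuit they proceed in parallel.

```python
class DataAccessCircuit:
    """Toy model of FIG. 3b with two banks: index 0 is E_RAM, index 1 is O_RAM."""

    def __init__(self, ram_words, memory):
        self.banks = [[0] * ram_words, [0] * ram_words]
        self.memory = memory  # external memory, modelled as a list of words
        self.c_sw1 = 0        # bank currently connected to the PU
        self.c_sw2 = 1        # bank currently connected to the DMAC

    def isw_dis(self, sw1_state, sw2_state):
        """First control instruction: set the connection states of SW_1 and SW_2."""
        self.c_sw1, self.c_sw2 = sw1_state, sw2_state

    def ld_mem(self, read_addr, write_addr, size):
        """Second control instruction (ld_mem): copy a block from Memory to the DMAC-side bank."""
        bank = self.banks[self.c_sw2]
        if read_addr + size > len(self.memory) or write_addr + size > len(bank):
            raise ValueError("transfer exceeds the memory or bank size")
        bank[write_addr:write_addr + size] = self.memory[read_addr:read_addr + size]

    def pu_read(self, addr):
        """PU operand fetch from whichever bank SW_1 currently selects."""
        return self.banks[self.c_sw1][addr]

    def pu_write(self, addr, value):
        """PU result store into whichever bank SW_1 currently selects."""
        self.banks[self.c_sw1][addr] = value


circuit = DataAccessCircuit(ram_words=4, memory=[10, 20, 30, 40])
circuit.isw_dis(0, 1)    # PU <-> E_RAM, DMAC <-> O_RAM
circuit.ld_mem(1, 0, 2)  # Memory[1:3] -> O_RAM[0:2]
circuit.pu_write(0, 7)   # the PU stores a result in E_RAM
print(circuit.banks)     # [[7, 0, 0, 0], [20, 30, 0, 0]]
```

After a further `circuit.isw_dis(1, 0)`, the same PU address 0 returns the freshly loaded value 20, because the PU is now connected to O_RAM.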
FIG. 7 is a schematic diagram of a practical application scenario of an embodiment of the present disclosure, in which the data access circuit of the embodiments is used to let the external processing unit PU perform neural network computation. FIG. 7 shows the neural network, which has two layers. The parameters and data used by each layer are smaller than the capacity of a single RAM block, that is, E_RAM and O_RAM can each hold the parameters and data for the computation of one layer. E_RAM is assigned to the first layer, layer1, and O_RAM is assigned to the second layer, layer2, so the external processing unit PU can switch between E_RAM and O_RAM at different moments to execute the program of the two-layer neural network.
FIG. 8 is a timing diagram of neural network computation using an embodiment of the present disclosure. As shown in FIG. 8, at time t0 the first switch circuit connects E_RAM to the PU and the second switch circuit connects O_RAM to the DMAC. The PU obtains the operands for layer1 from E_RAM and performs the layer1 computation, or saves the layer1 results to E_RAM, while the DMAC updates the data in O_RAM according to the second control instruction from the PU, for example by storing data from Memory into O_RAM. At time t1, the PU issues the first control instruction, and the switch control circuit generates the first and second switch control signals so that the first switch circuit selects O_RAM and the second switch circuit selects E_RAM. The PU now obtains the operands for layer2 from O_RAM and performs the layer2 computation, or saves the layer2 results to O_RAM, while the DMAC updates the data in E_RAM according to the second control instruction from the PU, for example by storing data from Memory into E_RAM. At time t2, the PU issues the first control instruction, and the switch control circuit generates the first and second switch control signals so that the first switch circuit selects E_RAM and the second switch circuit selects O_RAM. The PU again obtains the operands for layer1 from E_RAM and performs the layer1 computation, or saves the layer1 results to E_RAM, while the DMAC updates the data in O_RAM according to the second control instruction from the PU, for example by storing data from Memory into O_RAM. The cycle alternates in this way until the computation task of the neural network is complete.
As FIG. 8 shows, while E_RAM is occupied by the PU's computation, O_RAM is being updated with the parameters and data for the next computation; likewise, while O_RAM is occupied by the PU's computation, E_RAM is being updated with the parameters and data for the next computation. Computation and parameter/data updates thus proceed in parallel, which makes the fullest use of the available computing power.
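The alternation in FIG. 8 follows the familiar ping-pong (double-buffering) pattern, which the sketch below reproduces. The function names are hypothetical, and the timing estimate is a simplified model that is not taken from the disclosure: it assumes every step needs exactly one load and one computation with fixed durations, and that only the first load cannot be hidden behind a computation.

```python
BANKS = ("E_RAM", "O_RAM")
LAYERS = ("layer1", "layer2")


def ping_pong_schedule(steps):
    """Return, per time step, (time, bank used by the PU, layer computed, bank updated by the DMAC)."""
    return [(f"t{t}", BANKS[t % 2], LAYERS[t % 2], BANKS[(t + 1) % 2])
            for t in range(steps)]


def total_time(steps, compute, load, overlapped):
    """Simplified time estimate in arbitrary units (an assumption, not a measured result)."""
    if steps == 0:
        return 0
    if not overlapped:
        # Single RAM: every load and every computation run back to back.
        return steps * (load + compute)
    # Two RAMs: after the first load, each load overlaps the previous computation.
    return load + (steps - 1) * max(compute, load) + compute


for row in ping_pong_schedule(3):
    print(row)
# ('t0', 'E_RAM', 'layer1', 'O_RAM')
# ('t1', 'O_RAM', 'layer2', 'E_RAM')
# ('t2', 'E_RAM', 'layer1', 'O_RAM')
print(total_time(4, compute=5, load=3, overlapped=False))  # 32
print(total_time(4, compute=5, load=3, overlapped=True))   # 23
```

With these illustrative numbers the overlapped schedule takes 23 units instead of 32; the gain in a real design would depend on the actual compute and transfer times.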
It is understood that the RAM need not be switched during the above computation. Both the computation and the parameter/data updates can be performed on the same RAM block, in which case computation and storage run serially on that block.
An embodiment of the present disclosure further provides a data access device, characterized by including at least one data access circuit as described in any of the above embodiments.
The data access device is, for example, a processing core.
An embodiment of the present disclosure further provides a chip, characterized by including at least one data access circuit as described in any of the above embodiments.
An embodiment of the present disclosure provides a computer program product, characterized by including computer instructions which, when executed by a computing device, cause the computing device to perform any of the data access methods in the foregoing embodiments.
An embodiment of the present disclosure provides a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions that cause a computer to perform any of the data access methods of the foregoing third aspect.
An embodiment of the present disclosure provides a computing device, characterized by including at least one chip as described in any of the foregoing embodiments.
The flowcharts and block diagrams in the drawings of the present disclosure illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described above may be performed at least in part by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that can be used include field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Claims (10)

  1. A data access circuit, characterized by comprising:
    a plurality of internal memories;
    a first data interface circuit, configured to connect a first internal memory of the plurality of internal memories to an external processing unit;
    a second data interface circuit, configured to connect a second internal memory of the plurality of internal memories to an external memory;
    a first control circuit, configured to receive a first control instruction to control the first data interface circuit and the second data interface circuit;
    a second control circuit, configured to receive a second control instruction to control the data transfer direction of the second data interface circuit.
  2. The data access circuit according to claim 1, wherein the addresses of the plurality of internal memories are the same.
  3. The data access circuit according to claim 1 or 2, wherein the first data interface circuit comprises:
    a first switch circuit, the first switch circuit having a plurality of connection states, wherein each connection state is used to connect the processing unit to one of the plurality of internal memories.
  4. The data access circuit according to any one of claims 1-3, wherein the second data interface circuit comprises:
    a second switch circuit, the second switch circuit having a plurality of connection states, wherein each connection state is used to connect the external memory to one of the plurality of internal memories.
  5. The data access circuit according to claim 4, wherein the second data interface circuit further comprises:
    a memory controller, connected between the external memory and the plurality of internal memories and configured to control the data exchange between the external memory and the plurality of internal memories.
  6. The data access circuit according to any one of claims 1-4, wherein the first control circuit comprises:
    a switch control circuit, configured to receive the first control instruction sent by the external processing unit to generate a first switch control signal and a second switch control signal, wherein the first switch control signal is used to set the connection state of the first switch circuit, and the second switch control signal is used to set the connection state of the second switch circuit.
  7. The data access circuit according to any one of claims 1-6, wherein the second control circuit comprises:
    an access control circuit, configured to receive the second control instruction sent by the external processing unit to generate a control signal, wherein the control signal is used to control the memory controller to fetch the data indicated by the control signal from the second internal memory and store it in the external memory, or to control the memory controller to fetch the data indicated by the control signal from the external memory and store it in the second internal memory.
  8. The data access circuit according to claim 1, wherein the plurality of internal memories and the external memory are uniformly addressed.
  9. A data access method, characterized by comprising:
    receiving a first control instruction to determine a first internal memory connected to an external processing unit and a second internal memory connected to an external memory;
    receiving a second control instruction to determine the data transfer direction between the external memory and the second internal memory;
    the external processing unit obtaining data from the first internal memory or sending data to the first internal memory;
    according to the data transfer direction, sending the data in the second internal memory to the external memory, or storing the data obtained from the external memory in the second internal memory.
  10. A chip, comprising at least one data access circuit according to any one of claims 1-8.
PCT/CN2020/083195 2020-04-03 2020-04-03 Data access circuit and method WO2021196158A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/083195 WO2021196158A1 (en) 2020-04-03 2020-04-03 Data access circuit and method
CN202080098538.4A CN115280272A (en) 2020-04-03 2020-04-03 Data access circuit and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/083195 WO2021196158A1 (en) 2020-04-03 2020-04-03 Data access circuit and method

Publications (1)

Publication Number Publication Date
WO2021196158A1 true WO2021196158A1 (en) 2021-10-07

Family

ID=77927131

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/083195 WO2021196158A1 (en) 2020-04-03 2020-04-03 Data access circuit and method

Country Status (2)

Country Link
CN (1) CN115280272A (en)
WO (1) WO2021196158A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1183842A (en) * 1995-05-09 1998-06-03 软体未来设计股份有限公司 Interface circuit and data processing apparatus and method
US20030208654A1 (en) * 2002-05-03 2003-11-06 Compaq Information Technologies Group, L.P. Computer system architecture with hot pluggable main memory boards
CN107239823A (en) * 2016-08-12 2017-10-10 北京深鉴科技有限公司 A kind of apparatus and method for realizing sparse neural network
CN107689948A (en) * 2016-08-22 2018-02-13 北京深鉴科技有限公司 Efficient data memory access managing device applied to neural network hardware acceleration system
CN108875920A (en) * 2018-02-12 2018-11-23 北京旷视科技有限公司 Operation method, device, system and the storage medium of neural network

Also Published As

Publication number Publication date
CN115280272A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
US11301340B2 (en) Memory-based distributed processor architecture
TWI567551B (en) Allocating and configuring persistent memory
TWI620194B (en) Apparatuses and methods for memory device as a store for program instructions
US11868299B2 (en) Network-on-chip data processing method and device
US11061742B2 (en) System, apparatus and method for barrier synchronization in a multi-threaded processor
US9141173B2 (en) Thread consolidation in processor cores
US20210373799A1 (en) Method for storing data and method for reading data
CN104115230A (en) Efficient PCMS refresh mechanism background
JP2023508660A (en) Reduced double data rate memory training for memory context restoration, faster system-on-chip boot time
JPH1097464A (en) Information processing system
CN107624178A (en) The cabinet-type framework being quickly zeroed (RSA) with shared memory controller (SMC) technology
WO2021196158A1 (en) Data access circuit and method
JP6378775B2 (en) Reconfigurable device
TW202008172A (en) Memory system
KR20230041593A (en) Scalable address decoding scheme for cxl type-2 devices with programmable interleave granularity
KR20230169684A (en) Pim computing system and method for pim arithmetic offloading thereof
WO2021196160A1 (en) Data storage management apparatus and processing core
US20230259486A1 (en) Neural processing unit synchronization systems and methods
TWI828052B (en) Computer system and memory management method based on wafer-on-wafer architecture
US20230273733A1 (en) In-memory compute core for machine learning acceleration
US11080059B1 (en) Reducing firmware size and increasing firmware performance
CN117389621A (en) Supporting multiple vector lengths with a configurable vector register file
CN115794604A (en) Data generation method, apparatus, device, medium, and program product
JPH06231085A (en) Incorporated register access control system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20929559

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20929559

Country of ref document: EP

Kind code of ref document: A1