CN113220346A - Hardware circuit, data moving method, chip and electronic equipment - Google Patents

Publication number
CN113220346A
Authority
CN
China
Prior art keywords
memory
information
data
storage
controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110518007.2A
Other languages
Chinese (zh)
Inventor
李越
朱志岐
王文强
徐宁仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Power Tensors Intelligent Technology Co Ltd
Original Assignee
Shanghai Power Tensors Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Power Tensors Intelligent Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file
    • G06F9/3013 Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a hardware circuit, a data moving method, a chip and an electronic device. The hardware circuit may include: an access controller; a source memory and a destination memory connected to the access controller; and a storage controller connected to the source memory, the destination memory and the access controller. The access controller is configured to, in response to a received data moving instruction, acquire first information for reading data and second information for writing data, both included in the data moving instruction, generate a read instruction according to the first information, and send the read instruction to the source memory. The source memory is configured to read data according to the first information included in the read instruction and send the read data to the storage controller. The storage controller is configured to acquire the second information and generate a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory.

Description

Hardware circuit, data moving method, chip and electronic equipment
Technical Field
The present application relates to computer technologies, and in particular, to a hardware circuit, a data moving method, a chip, and an electronic device.
Background
With the continuous development of artificial intelligence and high-performance computing, the amount of data that a computing system must process keeps growing. Because the memory space of a computing system is limited, large amounts of raw data and intermediate results must frequently be moved between the shared memory and the external memory during computation.
In the existing scheme, when data is moved from the shared memory to the external memory, the instruction issue unit sends a read instruction to a thread to read the data to be moved from the shared memory into the thread register, and then sends a storage instruction to store the data in the thread register to the external memory. Moving data from the external memory to the shared memory follows the same process in the opposite direction and is not described in detail here.
It is not difficult to see that this data moving process has two drawbacks. On one hand, thread register resources are occupied, and the crossbar used to transfer data between threads and thread registers is frequently occupied as well, which reduces the performance of the thread registers and the crossbar. On the other hand, the instruction issue unit must issue two instructions, a read and a store, which degrades the performance of the instruction issue unit.
Disclosure of Invention
In view of this, the first aspect of the present application discloses a hardware circuit. The hardware circuit includes: an access controller; a source memory and a destination memory connected to the access controller; and a storage controller connected to the source memory, the destination memory, and the access controller. The source memory is the memory that sends data; the destination memory is the memory that receives data. The access controller is configured to, in response to a received data moving instruction, acquire first information for reading data and second information for writing data, both included in the data moving instruction, generate a read instruction according to the first information, and send the read instruction to the source memory. The source memory is configured to read data according to the first information included in the read instruction and send the read data to the storage controller. The storage controller is configured to acquire the second information and generate a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory.
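The data path of the first aspect can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit itself: the class and method names are hypothetical, the first and second information are assumed to be plain source and destination addresses, and the way the storage controller acquires the second information (handed over directly by the access controller) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MoveInstruction:
    first_info: int   # assumed encoding: source address to read from
    second_info: int  # assumed encoding: destination address to write to


class DestinationMemory:
    def __init__(self, size):
        self.cells = [0] * size

    def store(self, address, data):
        self.cells[address] = data


class StorageController:
    """Combines the second information with the read data into a store."""

    def __init__(self, destination):
        self.destination = destination
        self.second_info = None

    def acquire_second_info(self, second_info):
        self.second_info = second_info

    def on_read_data(self, data):
        # The "storage instruction" is modeled as a direct store call.
        self.destination.store(self.second_info, data)


class SourceMemory:
    def __init__(self, cells, storage_controller):
        self.cells = list(cells)
        self.storage_controller = storage_controller

    def on_read_instruction(self, first_info):
        # Read data goes straight to the storage controller,
        # without passing through any thread register.
        self.storage_controller.on_read_data(self.cells[first_info])


class AccessController:
    def __init__(self, source, storage_controller):
        self.source = source
        self.storage_controller = storage_controller

    def on_move_instruction(self, instr):
        # A single move instruction triggers both the read and the store.
        self.storage_controller.acquire_second_info(instr.second_info)
        self.source.on_read_instruction(instr.first_info)


destination = DestinationMemory(size=4)
storage_controller = StorageController(destination)
source = SourceMemory([10, 20, 30, 40], storage_controller)
access_controller = AccessController(source, storage_controller)
access_controller.on_move_instruction(MoveInstruction(first_info=2, second_info=1))
print(destination.cells)  # [0, 30, 0, 0]
```

In this model one call to `on_move_instruction` replaces the separate read and store instructions of the existing scheme.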
In some embodiments, the hardware circuit further comprises an information memory connected to the access controller and the storage controller. The access controller is configured to store the second information in the information memory to obtain a storage address at which the second information is stored, generate a read instruction according to the storage address and the first information, and send the read instruction to the source memory. The source memory is configured to read data according to the first information included in the read instruction and send the read data, together with the storage address included in the read instruction, to the storage controller. The storage controller is configured to acquire the second information from the information memory according to the storage address and generate a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory.
The second aspect of the present application further provides a hardware circuit, including: an access controller; a shared memory and an external memory connected to the access controller; a first controller connected to the shared memory, the access controller, and the external memory; and a second controller connected to the external memory, the access controller, and the shared memory. The access controller is configured to, in response to a received data moving instruction, acquire first information for reading data and second information for writing data, both included in the data moving instruction, generate a read instruction according to the first information, and send the read instruction to a source memory. The source memory is configured to read data according to the first information included in the read instruction and send the read data to a storage controller. The storage controller is configured to acquire the second information and generate a storage instruction according to the acquired second information and the read data, so as to store the data to a destination memory. The source memory is the shared memory, the destination memory is the external memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, and the storage controller is the second controller.
In some embodiments, the hardware circuit further comprises a first memory and a second memory; the first memory is connected to the access controller and the first controller; the second memory is connected to the access controller and the second controller. The access controller is configured to store the second information in an information memory to obtain a storage address at which the second information is stored, generate a read instruction according to the storage address and the first information, and send the read instruction to a source memory. The source memory is configured to read data according to the first information included in the read instruction and send the read data, together with the storage address included in the read instruction, to the storage controller. The storage controller is configured to acquire the second information from the information memory according to the storage address and generate a storage instruction according to the acquired second information and the read data, so as to store the data to a destination memory. The source memory is the shared memory, the destination memory is the external memory, the information memory is the first memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, the information memory is the second memory, and the storage controller is the second controller.
In some embodiments, the data moving instruction comprises instructions initiated for each thread within a thread group, and the information memory comprises a plurality of storage units. The access controller is configured to acquire, according to the data moving instruction, the first information and the second information corresponding to each thread, and send the second information corresponding to each thread to the information memory. The information memory is configured to determine, among the plurality of storage units, a target storage unit that stores no data, store the second information corresponding to each thread in the target storage unit, and send the storage address of the target storage unit to the access controller, so that the access controller obtains the storage address.
In some embodiments shown, the target storage unit includes storage spaces corresponding to the threads respectively; the information memory is used for storing second information corresponding to each thread into a storage space corresponding to each thread in the target storage unit.
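The allocation behavior of the information memory described above can be sketched as follows. The names `InformationMemory`, `allocate` and `lookup` are hypothetical, the storage address is assumed to be simply the index of the storage unit, and returning `None` when no unit is free is an assumption, since the text does not say what happens in that case.

```python
class InformationMemory:
    """Holds per-thread second information in fixed-size storage units."""

    def __init__(self, num_units, threads_per_group):
        self.units = [None] * num_units  # None marks a unit that stores no data
        self.threads_per_group = threads_per_group

    def allocate(self, second_info_per_thread):
        """Store one entry per thread in a free unit and return its storage address."""
        if len(second_info_per_thread) != self.threads_per_group:
            raise ValueError("expected one second-information entry per thread")
        for storage_addr, unit in enumerate(self.units):
            if unit is None:
                # Entry i of the unit is the storage space of thread i.
                self.units[storage_addr] = list(second_info_per_thread)
                return storage_addr
        return None  # assumption: the caller must retry when no unit is free

    def lookup(self, storage_addr, thread_id):
        return self.units[storage_addr][thread_id]


info_memory = InformationMemory(num_units=2, threads_per_group=4)
addr_a = info_memory.allocate([100, 101, 102, 103])
addr_b = info_memory.allocate([200, 201, 202, 203])
print(addr_a, addr_b)                      # 0 1
print(info_memory.lookup(addr_b, 2))       # 202
print(info_memory.allocate([0, 0, 0, 0]))  # None
```

Because the read instruction carries only the storage address, the second information itself never has to travel through the source memory.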
In some embodiments, the access controller is configured to determine, according to the source address represented by the first information corresponding to each thread, a plurality of first threads that have address conflicts among the threads, and to combine the first information corresponding to each of the first threads with the storage address, obtaining a plurality of first read instructions, one for each first thread.
In some embodiments, the access controller is configured to combine the first information corresponding to the second threads, which have no address conflicts among the threads, with the storage address to obtain a second read instruction corresponding to the second threads.
In some embodiments, the source memory is configured to, in response to a first read instruction, obtain the first information included in the first read instruction, read the first data corresponding to the first thread of that first read instruction according to the source address represented by the obtained first information, and send the read first data, the storage address included in the first read instruction, and the thread ID corresponding to the first data to the storage controller. The source memory is further configured to, in response to the second read instruction, obtain the first information that is included in the second read instruction and corresponds to each second thread, read the second data corresponding to each second thread according to the source address represented by the first information corresponding to that second thread, and send each read second data, the storage address included in the second read instruction, and the thread ID corresponding to each second data to the storage controller.
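The split into first and second read instructions can be sketched as below. The text does not define what counts as an address conflict, so this sketch assumes a bank conflict model in which two threads conflict when their source addresses fall in the same bank; `NUM_BANKS`, `bank_of` and `ReadInstruction` are hypothetical names introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass

NUM_BANKS = 4  # hypothetical number of banks in the source memory


def bank_of(address):
    # Assumed mapping from a source address to a memory bank.
    return address % NUM_BANKS


@dataclass
class ReadInstruction:
    storage_addr: int  # where the second information sits in the information memory
    sources: dict      # thread ID -> source address (the first information)


def build_read_instructions(src_addrs, storage_addr):
    """src_addrs is indexed by thread ID; returns the list of read instructions."""
    bank_use = Counter(bank_of(a) for a in src_addrs)
    first_instrs = []
    second_sources = {}
    for thread_id, addr in enumerate(src_addrs):
        if bank_use[bank_of(addr)] > 1:
            # Conflicting ("first") thread: gets its own read instruction.
            first_instrs.append(ReadInstruction(storage_addr, {thread_id: addr}))
        else:
            # Non-conflicting ("second") thread: merged into one instruction.
            second_sources[thread_id] = addr
    instrs = first_instrs
    if second_sources:
        instrs.append(ReadInstruction(storage_addr, second_sources))
    return instrs


# Threads 0 and 1 both hit bank 0; threads 2 and 3 use banks 1 and 2.
instructions = build_read_instructions([0, 4, 1, 2], storage_addr=7)
for instr in instructions:
    print(instr)
```

Every instruction carries the same storage address, so the storage controller can later find the second information for whichever thread IDs come back with the data.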
In some embodiments, the target storage unit further includes access identifiers corresponding to the threads respectively; an access identifier indicates whether the data in the storage space corresponding to a thread has been accessed. The storage controller is configured to acquire, from the information memory according to the storage address, the second information corresponding to the thread to which the read data corresponds. The information memory is configured to, in response to the storage controller acquiring the second information from the information memory according to the storage address, set the access identifier corresponding to the thread to which the read data corresponds to a first identifier, where the first identifier indicates that the data in the storage space corresponding to the thread has been accessed.
In some embodiments, the information memory is configured to determine whether the access identifiers corresponding to the threads in each storage unit are all the first identifier, and if so, release the data stored in that storage unit.
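The access identifiers and the release rule can be modeled in a few lines. This is a sketch under assumptions: the first identifier is represented as `True`, the release check is performed immediately after each fetch (the text does not say when the check runs), and the names `StorageUnit`, `TrackedInformationMemory`, `store` and `fetch` are hypothetical.

```python
class StorageUnit:
    def __init__(self, second_info_per_thread):
        self.second_info = list(second_info_per_thread)
        # One access identifier per thread; True plays the role of the first identifier.
        self.accessed = [False] * len(self.second_info)


class TrackedInformationMemory:
    def __init__(self, num_units):
        self.units = [None] * num_units

    def store(self, second_info_per_thread):
        for storage_addr, unit in enumerate(self.units):
            if unit is None:
                self.units[storage_addr] = StorageUnit(second_info_per_thread)
                return storage_addr
        return None  # assumption: no free unit available

    def fetch(self, storage_addr, thread_id):
        """Called by the storage controller when read data for a thread arrives."""
        unit = self.units[storage_addr]
        info = unit.second_info[thread_id]
        unit.accessed[thread_id] = True
        if all(unit.accessed):
            # Every thread's entry has been consumed, so the unit is released.
            self.units[storage_addr] = None
        return info


memory = TrackedInformationMemory(num_units=1)
addr = memory.store([500, 501])
first = memory.fetch(addr, 1)
still_held = memory.units[addr] is not None
second = memory.fetch(addr, 0)
released = memory.units[addr] is None
print(first, second, still_held, released)  # 501 500 True True
```

Tracking access per thread lets the data for conflicting threads arrive in any order while the unit is still freed as soon as the last entry is used.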
In some embodiments, the access controller includes a configuration register, an address register, and an arithmetic unit. The address register is configured to store a preset source address among a plurality of source addresses when the target data moved by the data moving instruction comprises data at the plurality of source addresses. The configuration register is configured to store the operation relation by which the other source addresses among the plurality of source addresses are obtained from the preset source address. The arithmetic unit is configured to generate, according to the operation relation and the preset source address, a plurality of instructions for moving the data at the plurality of source addresses, so as to complete the moving of the target data.
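As a minimal sketch of this address generation, assume the operation relation is a constant stride applied a fixed number of times. The text does not specify the form of the relation, so `stride`, `count` and the function name below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfigRegister:
    stride: int  # assumed operation relation: each address is the previous plus stride
    count: int   # assumed number of source addresses in the target data


def generate_source_addresses(address_register, config):
    """Arithmetic unit: derive every source address from the preset one."""
    return [address_register + i * config.stride for i in range(config.count)]


preset_source_address = 0x100
config = ConfigRegister(stride=16, count=4)
addresses = generate_source_addresses(preset_source_address, config)
print([hex(a) for a in addresses])  # ['0x100', '0x110', '0x120', '0x130']
```

With such a scheme, a single data moving instruction from the processor can expand into several moves inside the access controller.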
In some embodiments, the hardware circuit includes a plurality of shared memories and a plurality of external memories; a crossbar is connected between the plurality of shared memories and the second controller, and between the plurality of external memories and the first controller. The data moving instruction is used to move data between the plurality of shared memories and the plurality of external memories. The storage controller is configured to generate a storage instruction according to the acquired second information and the read data, and send the storage instruction to the destination memory through the crossbar to complete the data moving.
In some embodiments, the access controller further comprises a state feedback unit. The state feedback unit is configured to, in response to receiving a data moving instruction sent by a processor, send first state information to the processor to indicate that the hardware circuit is currently busy; and, in response to a storage instruction having been generated according to the acquired second information and the read data to store the target data to the destination memory, send second state information to the processor to indicate that the hardware circuit is currently idle.
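The busy and idle feedback can be pictured as a two-state machine. In this sketch the processor is represented by a callback, and the names `Status`, `StateFeedbackUnit` and its methods are hypothetical.

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"  # stands for the second state information
    BUSY = "busy"  # stands for the first state information


class StateFeedbackUnit:
    def __init__(self, notify_processor):
        self.notify_processor = notify_processor  # assumed callback to the processor
        self.status = Status.IDLE

    def on_move_instruction_received(self):
        self.status = Status.BUSY
        self.notify_processor(self.status)

    def on_data_stored(self):
        self.status = Status.IDLE
        self.notify_processor(self.status)


processor_log = []
feedback = StateFeedbackUnit(processor_log.append)
feedback.on_move_instruction_received()
feedback.on_data_stored()
print([s.value for s in processor_log])  # ['busy', 'idle']
```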
In some embodiments, the first memory and/or the second memory include a buffer that supports read and write access.
The third aspect of the present application further provides a data moving method, which is applied to a hardware circuit shown in any of the foregoing embodiments, and the method includes: the access controller responds to the received data moving instruction, and acquires first information used for reading data and second information used for writing data which are included in the data moving instruction; generating a reading instruction according to the first information, and sending the reading instruction to the source memory; the source memory reads data according to the first information included in the reading instruction and sends the read data to the storage controller; and the storage controller acquires the second information and generates a storage instruction according to the acquired second information and the read data so as to store the data to the destination memory.
In some embodiments, the hardware circuit further comprises an information memory connected to the access controller and the storage controller. The access controller, in response to the received data moving instruction, acquiring the first information for reading data and the second information for writing data included in the data moving instruction, generating a read instruction according to the first information, and sending the read instruction to the source memory includes: the access controller, in response to the received data moving instruction, acquires the first information and the second information included in the data moving instruction, and stores the second information in the information memory to obtain a storage address at which the second information is stored; and generates a read instruction according to the storage address and the first information, and sends the read instruction to the source memory. The source memory reading data according to the first information included in the read instruction and sending the read data to the storage controller includes: the source memory reads data according to the first information included in the read instruction, and sends the read data and the storage address included in the read instruction to the storage controller. The storage controller acquiring the second information and generating a storage instruction according to the acquired second information and the read data to store the data to the destination memory includes: the storage controller acquires the second information from the information memory according to the storage address, and generates a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory.
The fourth aspect of the present application further provides a data moving method, which is applied to the hardware circuit shown in any of the foregoing embodiments. The method includes: the access controller, in response to a received data moving instruction, acquires first information for reading data and second information for writing data, both included in the data moving instruction, generates a read instruction according to the first information, and sends the read instruction to a source memory; the source memory reads data according to the first information included in the read instruction and sends the read data to the storage controller; and the storage controller acquires the second information and generates a storage instruction according to the acquired second information and the read data, so as to store the data to a destination memory.
In some embodiments, the hardware circuit further comprises a first memory and a second memory; the first memory is connected to the access controller and the first controller; the second memory is connected to the access controller and the second controller. The access controller, in response to the received data moving instruction, acquiring the first information for reading data and the second information for writing data included in the data moving instruction, generating a read instruction according to the first information, and sending the read instruction to a source memory includes: the access controller, in response to the received data moving instruction, acquires the first information and the second information included in the data moving instruction, and stores the second information in an information memory to obtain a storage address at which the second information is stored; and generates a read instruction according to the storage address and the first information, and sends the read instruction to the source memory. The source memory reading data according to the first information included in the read instruction and sending the read data to the storage controller includes: the source memory reads data according to the first information included in the read instruction, and sends the read data and the storage address included in the read instruction to the storage controller. The storage controller acquiring the second information and generating a storage instruction according to the acquired second information and the read data to store the data to a destination memory includes: the storage controller acquires the second information from the information memory according to the storage address, and generates a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory. The source memory is the shared memory, the destination memory is the external memory, the information memory is the first memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, the information memory is the second memory, and the storage controller is the second controller.
In some embodiments, the data moving instruction comprises instructions initiated for each thread within a thread group, and the information memory comprises a plurality of storage units. The access controller, in response to the received data moving instruction, acquiring the first information and the second information included in the data moving instruction and storing the second information in the information memory to obtain the storage address includes: the access controller acquires, according to the data moving instruction, the first information and the second information corresponding to each thread, and sends the second information corresponding to each thread to the information memory. The method further comprises: the information memory determines, among the plurality of storage units, a target storage unit that stores no data, stores the second information corresponding to each thread in the target storage unit, and sends the storage address of the target storage unit to the access controller, so that the access controller obtains the storage address.
In some embodiments, the target storage unit includes storage spaces corresponding to the threads respectively. The information memory determining the target storage unit and storing the second information corresponding to each thread in the target storage unit includes: the information memory stores the second information corresponding to each thread in the storage space corresponding to that thread in the target storage unit.
In some embodiments, the access controller generating a read instruction according to the storage address and the first information includes: the access controller determines, according to the source address represented by the first information corresponding to each thread, a plurality of first threads that have address conflicts among the threads; and combines the first information corresponding to each of the first threads with the storage address, obtaining a plurality of first read instructions, one for each first thread.
In some embodiments, the access controller generating a read instruction according to the storage address and the first information includes: the access controller combines the first information corresponding to a plurality of second threads, which have no address conflicts among the threads, with the storage address to obtain a second read instruction corresponding to the plurality of second threads.
In some embodiments, the source memory reading data according to the first information included in the read instruction and sending the read data and the storage address included in the read instruction to the storage controller includes: the source memory, in response to a first read instruction, obtains the first information included in the first read instruction, reads the first data corresponding to the first thread of that first read instruction according to the source address represented by the obtained first information, and sends the read first data, the storage address included in the first read instruction, and the thread ID corresponding to the first data to the storage controller; and the source memory, in response to the second read instruction, obtains the first information that is included in the second read instruction and corresponds to each second thread, reads the second data corresponding to each second thread according to the source address represented by the first information corresponding to that second thread, and sends each read second data, the storage address included in the second read instruction, and the thread ID corresponding to each second data to the storage controller.
In some embodiments, the target storage unit further includes access identifiers corresponding to the threads respectively; an access identifier indicates whether the data in the storage space corresponding to a thread has been accessed. The storage controller acquiring the second information from the information memory according to the storage address includes: the storage controller acquires, from the information memory according to the storage address, the second information corresponding to the thread to which the read data corresponds. The method further comprises: the information memory, in response to the storage controller acquiring the second information from the information memory according to the storage address, sets the access identifier corresponding to the thread to which the read data corresponds to a first identifier, where the first identifier indicates that the data in the storage space corresponding to the thread has been accessed.
In some embodiments, the method further comprises: the information memory determines whether the access identifiers corresponding to the threads in each storage unit are all the first identifier, and if so, releases the data stored in that storage unit.
In some embodiments, the access controller includes a configuration register, an address register, and an arithmetic unit. The method further comprises: the address register, in response to the target data moved by the data moving instruction comprising data at a plurality of source addresses, stores a preset source address among the plurality of source addresses; the configuration register stores the operation relation by which the other source addresses among the plurality of source addresses are obtained from the preset source address; and the arithmetic unit generates, according to the operation relation and the preset source address, a plurality of instructions for moving the data at the plurality of source addresses, so as to complete the moving of the target data.
In some embodiments, the hardware circuit includes a plurality of shared memories and a plurality of external memories; a crossbar is connected between the plurality of shared memories and the second controller, and between the plurality of external memories and the first controller. The data moving instruction is used to move data between the plurality of shared memories and the plurality of external memories. The storage controller generating a storage instruction according to the acquired second information and the read data to store the data to a destination memory includes: the storage controller generates a storage instruction according to the acquired second information and the read data, and sends the storage instruction to the destination memory through the crossbar to complete the data moving.
In some embodiments, the access controller further comprises a state feedback unit. The method further comprises: the state feedback unit, in response to receiving a data moving instruction sent by a processor, sends first state information to the processor to indicate that the hardware circuit is currently busy; and, in response to a storage instruction having been generated according to the acquired second information and the read data to store the target data to the destination memory, sends second state information to the processor to indicate that the hardware circuit is currently idle.
In some embodiments, the first memory and/or the second memory include a buffer that supports read and write access.
The fifth aspect of the present application also provides a chip including the hardware circuit shown in any of the foregoing embodiments.
The sixth aspect of the present application further provides an electronic device, which includes a hardware circuit as shown in any of the foregoing embodiments or a chip as shown in any of the foregoing embodiments.
The seventh aspect of the present application also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a controller, implements the data moving method shown in any one of the foregoing embodiments.
The invention has at least the following effects:
In the solutions described in the first, third, fifth, and seventh aspects of the present application, on the one hand, point-to-point data movement between a source memory and a destination memory can be formed, which frees thread register resources and the crossbar (Crossbar) between threads and thread registers, thereby reducing thread power consumption and improving crossbar performance; on the other hand, the instruction issue unit only needs to send a data moving instruction for the data move to be completed, which improves data moving efficiency.
In the solutions described in the second and fourth aspects of the present application, first, point-to-point data movement between a source memory (shared memory) and a destination memory (external memory) can be formed, which frees thread register resources and the crossbar (Crossbar) between threads and thread registers, thereby reducing thread power consumption and improving crossbar performance; second, the instruction issue unit only needs to send a data moving instruction for the data move to be completed, which improves data moving efficiency; third, for different types of data moving instructions, the data moving method disclosed in the foregoing embodiments is carried out by different devices in the hardware circuit, so that the two types of data moving instructions can be processed asynchronously, which improves data moving efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to illustrate more clearly the technical solutions of one or more embodiments of the present application or of the related art, the drawings needed in the description of the embodiments or the related art are briefly introduced below. The drawings described below show only some of the embodiments described in the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a hardware circuit shown in the present application;
FIG. 2 is a schematic diagram of a hardware circuit according to the present application;
FIG. 3 is a schematic diagram of a hardware circuit according to the present application;
FIG. 4 is a schematic diagram of a hardware circuit shown in the present application;
FIG. 5 is a schematic diagram of a hardware circuit configuration shown in the present application;
FIG. 6 is a flowchart of a data moving method according to the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
In view of the above, the present application provides a hardware circuit (hereinafter referred to as a circuit). The devices in the circuit may cooperate with each other to perform a data transfer method. The hardware circuit may be part of a chip or an integrated circuit board.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a hardware circuit shown in the present application. The hardware circuit comprises memories, and data can be moved among the memories by using the method disclosed by the application. It should be noted that the hardware circuit structure shown in fig. 1 is only a schematic illustration, and other structures may exist in a practical situation. The present application does not specifically limit the configuration of the hardware circuit.
As shown in fig. 1, the hardware circuitry may include an access controller 101; a source memory 102 and a destination memory 103 connected with the access controller; and a memory controller 104 connected to the source memory 102 and the destination memory 103.
The source memory refers to the memory that sends data; the destination memory refers to the memory that receives data. In different scenarios, the source memory and the destination memory may refer to different memories. For example, when data is moved from the shared memory to the external memory, the source memory may be the shared memory and the destination memory may be the external memory. As another example, when data is moved from the external memory to the shared memory, the source memory may be the external memory and the destination memory may be the shared memory.
The access controller may receive the data moving instruction through an exposed interface and may then be configured, in response to the received data moving instruction, to obtain the first information for reading data and the second information for writing data included in the data moving instruction.
The data move instruction may be an instruction initiated by the instruction issue unit for a thread in response to a data move request. The data moving instruction may include first information for reading data and second information for writing data.
The first information may be used to read data from a source memory. In some embodiments, the first information may include information such as a source address. Data can be read from the source memory through the source address.
The second information may be used to store the read data to a destination memory. In some embodiments, the second information may include information such as a destination address. The read data can be stored to the destination memory through the destination address.
In some embodiments, the access controller may parse the received data moving instruction to obtain the first information and the second information.
Thereafter, the access controller may be configured to generate a read instruction according to the first information, and send the read instruction to the source memory. In some embodiments, the access controller may generate a read instruction according to the first information through a built-in instruction editing unit, and send the read instruction to the source memory.
Thereafter, the source memory may be configured to read data according to the first information included in the read instruction and send the read data to the memory controller. In some embodiments, the source memory may address according to a source address represented by the first information included in the read instruction, and then read out data stored in the source address. The read data may then be sent to the memory controller according to a destination address (an address corresponding to the memory controller) included in the read command.
Then, the storage controller may be configured to obtain the second information, and generate a storage instruction according to the obtained second information and the read data to store the data in the destination memory. In some embodiments, the access controller may send the second information to the storage controller for storage in advance. The storage controller may obtain corresponding second information from the stored second information, and then may generate a storage instruction according to the obtained second information and the read data through the instruction editing unit, and send the storage instruction to the destination memory. The destination memory may store the read data to a storage space corresponding to the destination address in response to the destination address represented by the second information included in the store instruction.
In the hardware circuit, the access controller may, in response to the received data moving instruction, acquire the first information for reading data and the second information for writing data included in the data moving instruction, generate a read instruction according to the first information, and send the read instruction to the source memory. The source memory reads data according to the first information included in the read instruction and sends the read data to the storage controller. The storage controller may obtain the second information and generate a storage instruction according to the obtained second information and the read data to store the data to the destination memory.
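The flow summarized above can be modeled in a few lines of software. The sketch below is a behavioral illustration only, not the circuit itself: the class names, the `tag` used to match read data with its second information, and the dictionary-backed memories are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MoveInstruction:
    src_addr: int  # first information: where to read in the source memory
    dst_addr: int  # second information: where to write in the destination memory


@dataclass
class Memory:
    cells: dict = field(default_factory=dict)


class StorageController:
    """Holds second information in advance, then issues the store."""

    def __init__(self, destination: Memory):
        self.destination = destination
        self.pending = {}  # hypothetical tag -> destination address

    def hold(self, tag: int, dst_addr: int) -> None:
        self.pending[tag] = dst_addr

    def on_read_data(self, tag: int, data) -> None:
        dst_addr = self.pending.pop(tag)
        # "Storage instruction": destination address plus the data that was read.
        self.destination.cells[dst_addr] = data


class AccessController:
    def __init__(self, source: Memory, storage_controller: StorageController):
        self.source = source
        self.storage_controller = storage_controller
        self.next_tag = 0

    def handle(self, instr: MoveInstruction) -> None:
        tag = self.next_tag
        self.next_tag += 1
        # The second information reaches the storage controller ahead of the data.
        self.storage_controller.hold(tag, instr.dst_addr)
        # "Read instruction": the source memory reads and forwards the data
        # to the storage controller without passing through a thread register.
        data = self.source.cells[instr.src_addr]
        self.storage_controller.on_read_data(tag, data)


source = Memory({0x10: "payload"})
destination = Memory()
controller = AccessController(source, StorageController(destination))
controller.handle(MoveInstruction(src_addr=0x10, dst_addr=0x80))
```

After the call, `destination.cells` holds the payload at address `0x80`, and the source memory is left unchanged.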
Therefore, in the hardware circuit, on one hand, point-to-point data movement between the source memory and the destination memory can be formed, so that thread register resources and a Crossbar (Crossbar) between threads and the thread registers are released, further thread power consumption is reduced, and Crossbar performance is improved; on the other hand, the data transfer can be completed only by sending a data transfer instruction by the instruction sending unit, so that the performance of the instruction sending unit is improved.
In some embodiments, the hardware circuit may include an information storage for storing the second information, so that the storage space of the storage controller may be released, and the performance of the storage controller may be improved.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a hardware circuit according to the present application. The hardware circuit comprises memories, and data can be moved among the memories by using the method disclosed by the application. It should be noted that the hardware circuit structure shown in fig. 2 is only a schematic illustration, and other structures may exist in a practical situation. The present application does not specifically limit the configuration of the hardware circuit.
As shown in fig. 2, the hardware circuitry may include an access controller 201; a source memory 202, a destination memory 203 and an information memory 204 which are connected with the access controller; and a memory controller 205 connected to the source memory 202, the destination memory 203, and the information memory 204.
After acquiring the first information and the second information in the data transfer instruction, the access controller may store the second information in the information memory to obtain a storage address at which the second information is stored.
The information memory may include several storage units. The information memory may store the received second information into an empty storage unit, and then return the storage address corresponding to that storage unit to the access controller.
Then, the access controller may generate a read instruction according to the storage address and the first information, and send the read instruction to the source memory. In some embodiments, the access controller may use a built-in instruction editing unit to generate the read instruction according to the first information, the obtained storage address, and the address corresponding to the storage controller, and send the read instruction to the source memory, so that the data read from the source memory can be sent to the storage controller.
The source memory is configured to read data according to the first information included in the read instruction and to send the read data, together with the storage address included in the read instruction, to the storage controller. In some embodiments, the source memory may address according to the source address represented by the first information included in the read instruction, and then read out the data stored at the source address. Then, according to the destination address (the address corresponding to the storage controller) included in the read instruction, the read data and the storage address included in the read instruction may be sent to the storage controller.
The storage controller is configured to acquire the second information from the information memory according to the storage address and to generate a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory. In some embodiments, the storage controller may generate a second information reading instruction according to the storage address and send the second information reading instruction to the information memory. The information memory may, in response to the second information reading instruction, address according to the storage address and return the second information stored at the storage address to the storage controller. The storage controller can then generate a storage instruction according to the acquired second information and the read data through an instruction editing unit, and send the storage instruction to the destination memory. The destination memory may, in response to the destination address represented by the second information included in the storage instruction, store the read data to the storage space corresponding to the destination address.
Therefore, the information memory can be used as an intermediate storage device to store the second information, so that the storage space of the storage controller is released, and the performance of the storage controller is improved.
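The role of the information memory as an intermediate store can be illustrated with the following sketch. It is a software model under stated assumptions: the class and function names are hypothetical, memories are plain dictionaries, and a storage unit is freed as soon as its second information is read, a policy this embodiment does not specify.

```python
class InformationMemory:
    """Fixed number of storage units; each unit has one storage address."""

    def __init__(self, num_units: int):
        self.units = [None] * num_units

    def store(self, second_info) -> int:
        # Pick the first empty unit and return its storage address.
        for address, unit in enumerate(self.units):
            if unit is None:
                self.units[address] = second_info
                return address
        raise RuntimeError("no empty storage unit")

    def load(self, address: int):
        # Assumption: reading the second information frees the unit.
        second_info = self.units[address]
        self.units[address] = None
        return second_info


def move(source: dict, destination: dict, info: InformationMemory,
         src_addr: int, dst_addr: int) -> None:
    # Access controller: park the second information, get a storage address.
    storage_address = info.store(dst_addr)
    # Read instruction = first information + storage address.
    read_instruction = {"src_addr": src_addr, "storage_address": storage_address}
    # Source memory: read the data and forward it with the storage address.
    forwarded = (source[read_instruction["src_addr"]],
                 read_instruction["storage_address"])
    # Storage controller: look up the second information, then store the data.
    data, storage_address = forwarded
    destination[info.load(storage_address)] = data


info = InformationMemory(num_units=2)
src, dst = {4: "a", 8: "b"}, {}
move(src, dst, info, src_addr=4, dst_addr=100)
move(src, dst, info, src_addr=8, dst_addr=104)
```

Because the storage controller only ever receives a small storage address with the data, it does not need to keep the second information itself, which is the point made in the paragraph above.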
In some embodiments, the information storage may be a buffer supporting read-write, so that a process of storing the second information analyzed by the access controller and a process of reading the second information from the information storage by the storage controller may be performed in parallel, thereby improving data moving efficiency.
The application also provides a hardware circuit. Referring to fig. 3, fig. 3 is a schematic structural diagram of a hardware circuit according to the present application.
As shown in fig. 3, the hardware circuit may include: access controller 301; a shared memory 302 and an external memory 303 connected with the access controller; a first controller 304 coupled to the shared memory 302, the access controller 301, and the external memory 303; and a second controller 305 coupled to the external memory 303, the access controller 301, and the shared memory 302.
The shared memory is used for providing shared data service for threads, thread groups, thread blocks or processes. The application does not limit the type of shared memory. In some embodiments, the shared Memory may be an SRAM (Static Random-Access Memory). The shared memory can store data needed by the current task.
The external memory is defined relative to the shared memory and is used for storing task result data. Data often needs to be transferred between the external memory and the shared memory. The present application does not limit the type of external memory. In some embodiments, the external memory may be a DDR SDRAM (Double Data Rate Synchronous Dynamic Random-Access Memory).
In response to the data move instruction being a first type of instruction, data may be moved from the shared memory to the external memory. In this case, the shared memory may be regarded as a source storage, and the external storage may be regarded as a destination storage.
In response to the data move instruction being a second type of instruction, data may be moved from external memory to shared memory. In this case, the external memory may be used as the source memory, and the shared memory may be used as the destination memory.
The first controller and the second controller may function as the aforementioned storage controller. In response to the data move instruction being the first type instruction, the first controller may be adapted to obtain the second information, and generate a store instruction according to the second information and the data read from the shared memory to store the data in the external memory. In response to the data moving instruction being the second type instruction, the second controller may be adapted to obtain the second information, and generate a storage instruction according to the second information and the data read from the external memory to store the data in the shared memory.
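The mapping from instruction type to device roles can be written down compactly. The sketch below is illustrative only; the enum and function names are hypothetical, and the devices are represented by descriptive strings rather than hardware blocks.

```python
from enum import Enum


class MoveType(Enum):
    SHARED_TO_EXTERNAL = 1  # "first type" instruction
    EXTERNAL_TO_SHARED = 2  # "second type" instruction


def select_path(move_type: MoveType) -> dict:
    """Return which devices act as source, destination and storage controller."""
    if move_type is MoveType.SHARED_TO_EXTERNAL:
        return {"source": "shared memory",
                "destination": "external memory",
                "storage_controller": "first controller"}
    return {"source": "external memory",
            "destination": "shared memory",
            "storage_controller": "second controller"}


first_path = select_path(MoveType.SHARED_TO_EXTERNAL)
second_path = select_path(MoveType.EXTERNAL_TO_SHARED)
```

Since the two instruction types use different storage controllers, the two directions of movement do not contend for the same controller.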
The data move instruction may be an instruction initiated by the instruction issue unit for a thread or a thread group (Warp) in response to a data move request. It will be appreciated that an instruction issued for a thread group is in effect an instruction issued for each thread within the thread group. Hereinafter, an instruction issued to a thread group is referred to as a thread group instruction, and an instruction issued to a thread is referred to as a thread instruction. A thread group instruction can carry the information common to the individual thread instructions once, as shared information, so the instruction format is simpler than when an instruction is sent for each thread separately; this improves the performance of the instruction issue unit on the one hand and the instruction processing efficiency on the other, thereby improving data moving efficiency.
The following description will be given taking the data transfer instruction as an example of an instruction issued to each thread in a thread group.
The type of the data movement instruction can comprise a first type instruction and a second type instruction. The following description will be given taking the data transfer instruction as a first type instruction as an example. It is understood that the data movement method for the second type of instruction may refer to the first type of instruction and will not be described in detail herein.
In response to the data move instruction being a first type of instruction, the shared memory may be considered a source memory (shared memory), the external memory may be considered a destination memory (external memory), and the first controller may be considered a storage controller (first controller).
The access controller is used for responding to the received data moving instruction, acquiring first information used for reading data and second information used for writing data which are included in the data moving instruction, generating a reading instruction according to the first information, and sending the reading instruction to a source memory.
The source memory is configured to read data according to the first information included in the read instruction and to send the read data to the storage controller.
The storage controller is configured to acquire the second information and to generate a storage instruction according to the acquired second information and the read data, so as to store the data to a destination memory.
The source storage is the shared memory, the destination storage is the external memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, and the storage controller is the second controller.
In the hardware circuit, firstly, point-to-point data transfer between a source memory (shared memory) and a target memory (external memory) can be formed, so that thread register resources and a Crossbar (Crossbar) between threads and thread registers are released, thread power consumption is reduced, and Crossbar performance is improved; secondly, the data transfer can be completed only by sending a data transfer instruction by the instruction sending unit, so that the performance of the instruction sending unit is improved; third, the data transfer method disclosed in the foregoing embodiment is completed by different devices in the hardware circuit for the first type of data transfer instruction and the second type of data transfer instruction, so that two types of data transfer instructions can be processed asynchronously, and the data transfer efficiency is improved.
In some embodiments, the hardware circuit may include an information storage for storing the second information, so that the storage space of the storage controller may be released, and the performance of the storage controller may be improved.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a hardware circuit shown in the present application.
As shown in fig. 4, the hardware circuit may include: an access controller 401; a shared memory 402, an external memory 403, a first storage 404 and a second storage 405 connected with the access controller; a first controller 406 connected to the shared memory 402, the first storage 404, and the external memory 403; and a second controller 407 coupled to the external memory 403, the second storage 405, and the shared memory 402.
The first memory and the second memory may function as the aforementioned information memory. In response to the data move instruction being a first type of instruction, the first memory may be employed to store second information. In response to the data move instruction being a second type of instruction, the second memory may be employed to store second information.
Taking the case where the data transfer instruction is a first type instruction as an example, the shared memory may be regarded as the source memory (shared memory), the external memory as the destination memory (external memory), the first memory as the information memory (first memory), and the first controller as the storage controller (first controller).
The access controller may receive the data moving instruction through a provided interface, and then may be configured to obtain first information for reading data and second information for writing data included in the data moving instruction in response to the received data moving instruction, and store the second information in an information memory to obtain a storage address where the second information is stored.
In some embodiments, the access controller may be configured to obtain the first information and the second information corresponding to the threads respectively according to the data moving instruction.
In some embodiments, the data move instruction may be split to obtain a thread instruction for each thread in the thread group, and then each thread instruction may be analyzed to obtain first information and second information corresponding to each thread. Therefore, asynchronous processing of all thread instructions included in the thread group instructions is achieved, and data moving efficiency is improved.
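A minimal sketch of this splitting step follows. The encoding of the thread group instruction is an assumption made for illustration (a shared `group_id` plus per-thread lists of source and destination addresses); the application does not define the instruction format, and the names used here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadInstruction:
    group_id: int
    thread_id: int
    src_addr: int  # first information
    dst_addr: int  # second information


def split_group_instruction(group: dict) -> list:
    """Expand one thread group instruction into one instruction per thread."""
    return [
        ThreadInstruction(group["group_id"], thread_id, src, dst)
        for thread_id, (src, dst) in enumerate(
            zip(group["src_addrs"], group["dst_addrs"]))
    ]


group_instruction = {"group_id": 7,
                     "src_addrs": [0, 4, 8],
                     "dst_addrs": [100, 104, 108]}
thread_instructions = split_group_instruction(group_instruction)
# Parse each thread instruction into its first and second information.
first_info = {t.thread_id: t.src_addr for t in thread_instructions}
second_info = {t.thread_id: t.dst_addr for t in thread_instructions}
```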
Then, the access controller may be configured to send the second information corresponding to each thread to the information storage (first storage). In some embodiments, the access controller may send the second information of each thread to the information memory (first memory) separately or in its entirety.
In some embodiments, the information storage (first storage) may comprise a number of storage units; each memory location may correspond one-to-one to each memory address.
After receiving the second information, the information storage (first storage) may be configured to determine a target storage unit that does not store data in the plurality of storage units, store the second information corresponding to each thread in the target storage unit, and send a storage address of the target storage unit to the access controller, so that the access controller obtains the storage address.
In some embodiments, the second information corresponding to the threads in the same thread group may be stored in the same storage unit, and the threads may share the same storage address, so as to simplify the generation difficulty of the read instruction and the storage instruction and improve the data transfer efficiency.
In some embodiments, the target storage unit includes storage spaces corresponding to the threads, respectively. The information memory (first memory) may be configured to store second information corresponding to each of the threads in a storage space corresponding to each of the threads in the target storage unit.
In some embodiments, a correspondence between the thread IDs of the threads (each thread within a thread group has a unique ID) and the respective storage spaces may be maintained in the information memory (first memory). After the second information corresponding to a thread is received, it may be stored in the corresponding storage space according to the thread ID. This facilitates the subsequent reading of the second information corresponding to each thread.
In some embodiments, the target storage unit further includes access identifiers corresponding to the threads. An access identifier may indicate whether the data in the storage space corresponding to the thread has been accessed, or whether data is stored there. In some embodiments, a first identifier may indicate that the data in the storage space has been accessed or that the storage space holds no data, and a second identifier may indicate that the storage space holds data that has not yet been accessed.
After storing the second information of a thread in the corresponding storage space, the information memory (first memory) may be configured to set the access identifier corresponding to that thread to the second identifier, indicating that the storage space holds data that has not yet been accessed. This makes it easy to track whether the second information corresponding to each thread has been read (accessed); if the access identifiers corresponding to all threads in the same storage unit are the first identifier, the storage unit can be released, which increases the reuse rate of the storage units.
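The per-thread access identifiers and the release condition can be sketched as below. The numeric values chosen for the two identifiers, the class name, and the method names are assumptions for illustration; only the behavior (set on write, cleared on read, release when all are cleared) follows the description.

```python
ACCESSED_OR_EMPTY = 0  # "first identifier"
UNREAD = 1             # "second identifier"


class StorageUnit:
    """One storage unit with a storage space and an access identifier per thread."""

    def __init__(self, num_threads: int):
        self.slots = [None] * num_threads
        self.flags = [ACCESSED_OR_EMPTY] * num_threads

    def write(self, thread_id: int, second_info) -> None:
        self.slots[thread_id] = second_info
        self.flags[thread_id] = UNREAD

    def read(self, thread_id: int):
        self.flags[thread_id] = ACCESSED_OR_EMPTY
        return self.slots[thread_id]

    def releasable(self) -> bool:
        # The unit can be reused once every thread's identifier is the first identifier.
        return all(flag == ACCESSED_OR_EMPTY for flag in self.flags)


unit = StorageUnit(num_threads=3)
for thread_id, dst_addr in enumerate([100, 104, 108]):
    unit.write(thread_id, dst_addr)
partially_read = (unit.read(0), unit.read(1), unit.releasable())
fully_read = (unit.read(2), unit.releasable())
```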
In some embodiments, the information store (first store) returns the memory address of the target memory location to the access controller after completing the second information store. The access controller may be configured to generate a read instruction according to the storage address and the first information, and send the read instruction to a source memory.
In some embodiments, the access controller may be configured to determine, according to a source address represented by the first information corresponding to each thread, a number of first threads with address conflicts among the threads. Then, the first information corresponding to the plurality of first threads may be combined with the storage address, respectively, to obtain a plurality of first read instructions corresponding to the first threads, respectively. The plurality of first read instructions may then be sequentially sent to a source storage (shared memory) to complete data reading. Therefore, the problem of abnormal data reading caused by address conflict can be solved, and the data extraction accuracy is improved.
In some embodiments, the access controller may be further configured to combine the first information corresponding to the second threads that do not conflict among the threads with the storage address to obtain second read instructions corresponding to the second threads. The second read instruction may then be sent to the source storage (shared memory) to complete the data read for each thread. Therefore, the common information in each thread instruction can be extracted as the shared information in a mode of second reading instructions (thread group instructions), so that compared with a mode of sending the reading instructions for each thread independently, the instruction format can be simplified, the performance of the access controller is improved, the instruction processing efficiency is improved, and the data transfer efficiency is improved.
For example, suppose threads 0-9, which have no access conflict and belong to the same thread group, each need to read data from the source memory (shared memory).
The access controller may generate one second read instruction for threads 0-9 according to the data read operations corresponding to threads 0-9. The second read instruction needs to include only one copy of the basic thread group information (e.g., the thread group number). Compared with generating a separate data read instruction for each of threads 0-9, this simplifies the instruction format, which on the one hand improves the instruction generation efficiency of the access controller and thereby its performance, and on the other hand improves the instruction processing efficiency of the source memory (shared memory) and thereby the data transfer efficiency.
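The separation of conflicting and non-conflicting threads can be sketched as follows. The application does not define what counts as an address conflict, so this example assumes, purely for illustration, that two threads conflict when their source addresses fall into the same bank of a banked shared memory; `NUM_BANKS`, `bank_of`, and the dictionary layout of the read instructions are all hypothetical.

```python
from collections import defaultdict

NUM_BANKS = 4  # hypothetical number of shared-memory banks


def bank_of(src_addr: int) -> int:
    # Assumed mapping from source address to bank.
    return src_addr % NUM_BANKS


def build_read_instructions(first_info: dict, storage_address: int):
    """first_info maps thread ID -> source address."""
    by_bank = defaultdict(list)
    for thread_id, src_addr in first_info.items():
        by_bank[bank_of(src_addr)].append(thread_id)

    conflicting = sorted(t for tids in by_bank.values() if len(tids) > 1 for t in tids)
    free = sorted(t for tids in by_bank.values() if len(tids) == 1 for t in tids)

    # One first read instruction per conflicting thread, to be sent in sequence.
    first_reads = [{"threads": {t: first_info[t]}, "storage_address": storage_address}
                   for t in conflicting]
    # One second read instruction shared by all non-conflicting threads.
    second_read = ({"threads": {t: first_info[t] for t in free},
                    "storage_address": storage_address} if free else None)
    return first_reads, second_read


first_reads, second_read = build_read_instructions(
    {0: 0, 1: 4, 2: 1, 3: 2}, storage_address=5)
```

With this input, threads 0 and 1 map to the same bank and are issued one after another, while threads 2 and 3 share a single read instruction.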
After receiving the read instruction, the source storage (shared memory) may be configured to read data according to first information included in the read instruction, and send the read data and the storage address included in the read instruction to the storage controller.
In some embodiments, the source storage (shared memory) may be configured to, in response to the first read instruction, obtain first information included in the first read instruction, and read first data corresponding to a first thread corresponding to the first read instruction according to a source address represented by the obtained first information. The read first data, the memory address included in the first read instruction, and the thread ID corresponding to the first data may then be sent to the memory controller (first controller).
In some embodiments, the first read instruction may include a thread ID, first information, a storage address, and a storage controller (first controller) address. The source memory (shared memory) can address according to the source address represented by the first information and read the stored first data. The thread ID, the first data, and the storage address may then be sent to the storage controller (first controller), based on the storage controller (first controller) address, so that the data can be stored to the destination memory (external memory). The data read can thereby be completed for the first read instruction.
In some embodiments, the source storage (shared memory) may be configured to, in response to the second read instruction, obtain first information, included in the second read instruction, corresponding to each second thread, and read second data corresponding to each second thread according to a source address represented by the first information corresponding to each second thread. The read second data, the memory address included in the second read instruction, and the thread ID corresponding to the second data may then be sent to the memory controller (first controller).
In some embodiments, the second read instruction may include each thread ID, first information corresponding to each thread ID, a storage address, and a storage controller (first controller) address. The source memory (shared memory) may take each thread in turn as the current thread and perform the following:
acquire the first information corresponding to the current thread through the current thread ID, and then address and read the second data corresponding to the current thread through the source address represented by the first information. The read second data, the current thread ID, and the storage address may then be sent to the storage controller (first controller), based on the storage controller (first controller) address, so that the data can be stored to the destination memory (external memory). The data read can thereby be completed for the second read instruction.
After receiving the read data and other information, the storage controller (first controller) may be configured to obtain the second information from the information memory according to the storage address, and generate a storage instruction according to the obtained second information and the read data to store the data in a destination memory.
In some embodiments, the storage controller (first controller) may obtain, from the information memory (first memory) and according to the storage address, the second information corresponding to the thread to which the read data belongs.
In some embodiments, the storage controller (first controller) may construct a second information reading instruction according to the thread ID and the storage address, and send the second information reading instruction to the information memory (first memory). In response to the second information reading instruction, the information memory (first memory) may address according to the storage address, read the second information corresponding to the thread ID, and return the second information to the storage controller (first controller).
After receiving the second information corresponding to the thread, the storage controller (first controller) may be configured to generate a storage instruction according to the obtained second information and the read data, and send the storage instruction to a destination memory (external memory). The destination memory (external memory) can address according to the destination address represented by the second information and store the read data to the corresponding address. The data movement from the shared memory to the external memory is thus completed for the data moving instruction (the first type instruction).
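The first-type path described above can be modeled in software as follows. This is only an illustrative sketch, assuming dictionary-backed memories; the names `InfoMemory` and `move_shared_to_external` are hypothetical and do not come from the application.

```python
class InfoMemory:
    """Hypothetical model of the information memory (first memory)."""

    def __init__(self):
        self.units = {}      # storage address -> {thread ID: second information}
        self.next_addr = 0

    def store(self, second_info_by_thread):
        # Store the second information and return the storage address.
        addr = self.next_addr
        self.units[addr] = dict(second_info_by_thread)
        self.next_addr += 1
        return addr

    def read(self, storage_addr, thread_id):
        # Address by storage address, then select the entry for the thread ID.
        return self.units[storage_addr][thread_id]


def move_shared_to_external(shared, external, info_mem, requests):
    """requests maps thread ID -> (source address, destination address)."""
    # Access controller: keep the second information in the information memory.
    storage_addr = info_mem.store({t: dst for t, (_, dst) in requests.items()})
    for thread_id, (src, _) in requests.items():
        # Source memory: read the data at the source address.
        data = shared[src]
        # Storage controller: fetch the second information, then store the data.
        dst = info_mem.read(storage_addr, thread_id)
        external[dst] = data


shared = {0: "a", 1: "b"}
external = {}
move_shared_to_external(shared, external, InfoMemory(), {0: (0, 10), 1: (1, 11)})
```

In this model the storage controller never holds the destination addresses itself; it only carries the storage address and thread ID, which mirrors the role of the information memory as an intermediate store.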
In the hardware circuit, the information memory can be used as an intermediate memory device to store second information, so that the memory space of the memory controller is released, and the performance of the memory controller is improved.
In some embodiments, the information memory may be a buffer that supports simultaneous reading and writing, so that the process of storing the second information parsed by the access controller and the process of the storage controller reading the second information from the information memory may be performed in parallel, thereby improving data moving efficiency.
In some embodiments, after the second information has been read from the information memory (the first memory), the information memory may be configured to set the access identifier of the thread to which the read data belongs to the first identifier, where the first identifier indicates that the data in the storage space corresponding to the thread has been accessed. The information memory (first memory) may periodically determine whether the access identifiers of all threads in each storage unit are the first identifier and, if so, release the data stored in that storage unit. This increases the reuse rate of the storage units.
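The access-identifier bookkeeping can be illustrated with a short sketch. It assumes one flag per thread inside each storage unit; `StorageUnit`, `release_accessed_units`, and the flag values are hypothetical names chosen for illustration.

```python
ACCESSED = 1       # the "first identifier": the thread's slot has been read
NOT_ACCESSED = 0


class StorageUnit:
    def __init__(self, second_info_by_thread):
        self.second_info = dict(second_info_by_thread)
        self.access_flags = {t: NOT_ACCESSED for t in second_info_by_thread}

    def read(self, thread_id):
        # Reading a thread's second information marks its slot as accessed.
        info = self.second_info[thread_id]
        self.access_flags[thread_id] = ACCESSED
        return info

    def fully_accessed(self):
        return all(f == ACCESSED for f in self.access_flags.values())


def release_accessed_units(units):
    """Periodic check: release every unit whose threads have all been served."""
    for addr in [a for a, u in units.items() if u.fully_accessed()]:
        del units[addr]


units = {0: StorageUnit({0: 100, 1: 101}), 1: StorageUnit({0: 200})}
units[0].read(0)   # unit 0 still has thread 1 pending
units[1].read(0)   # unit 1 is now fully accessed
release_accessed_units(units)
remaining = sorted(units)
```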
In some embodiments, the access controller may include a configuration register, an address register, and an arithmetic unit.
The address register may be configured to store a preset source address among the plurality of source addresses when the target data moved by the data moving instruction includes data at a plurality of source addresses. The configuration register may be configured to store an operation relationship by which the source addresses other than the preset source address are obtained from the preset source address. The arithmetic unit may generate, according to the operation relationship and the preset source address, a plurality of instructions for moving the data at the plurality of source addresses, so as to complete moving the target data.
In some embodiments, the data of the plurality of source addresses have a spatial relationship therebetween. For example, the data of the plurality of source addresses may constitute one data block. The plurality of source addresses may be viewed as a cube.
The preset source address may be any address among the plurality of source addresses. In some embodiments, the preset source address may be the smallest address among the plurality of source addresses. The other source addresses can be obtained from the preset source address according to the operation relationship.
The preset source address may be stored in the address register, the operation relationship may be stored in the configuration register, and the access controller may generate, through the arithmetic unit, a plurality of instructions for moving the data at the plurality of source addresses. The source addresses included in these instructions are calculated from the operation relationship and the preset source address. The access controller may then execute the data moving method shown in any of the foregoing embodiments for each of the plurality of instructions, to complete moving the target data (data block).
In this example, only one instruction is needed for moving data of a plurality of source addresses, and the access controller can split the instruction into a plurality of instructions according to the address register, the configuration register and the arithmetic unit, so as to complete moving of data of a plurality of source addresses, and further improve the working performance of the instruction sending unit.
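As a minimal sketch of the splitting step, assume the operation relationship is a constant step length and that every generated instruction keeps the same destination; both are assumptions made here for illustration, and `split_move_instruction` is a hypothetical name.

```python
def split_move_instruction(preset_source, stride, count, destination):
    """Expand one move instruction into `count` (source, destination) pairs.

    preset_source plays the role of the address register content and
    stride the role of the operation relationship in the configuration register.
    """
    return [(preset_source + i * stride, destination) for i in range(count)]


instructions = split_move_instruction(preset_source=16, stride=4, count=3, destination=64)
# instructions == [(16, 64), (20, 64), (24, 64)]
```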
In some embodiments, the hardware circuit includes a plurality of shared memories and a plurality of external memories, and a crossbar (Crossbar) is connected between the plurality of shared memories and the second controller and between the plurality of external memories and the first controller. The data moving instruction is used for moving data between the shared memories and the external memories.
The crossbar connects each shared memory with each external memory through a crossbar architecture, so that data can be moved between each shared memory and each external memory. The storage controller (first controller) may be configured to generate a storage instruction according to the acquired second information and the read data, and send the storage instruction to a destination memory (external memory) through the crossbar to complete the data movement. The data moving method disclosed by the application can therefore extend point-to-point data movement to multipoint-to-multipoint data movement, which broadens its range of application.
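The routing role of the crossbar can be sketched as below. The sketch assumes that each storage instruction carries an index identifying the destination memory; that field and the function name `route_through_crossbar` are hypothetical.

```python
def route_through_crossbar(store_instructions, external_memories):
    """Deliver each (memory index, address, data) store instruction to its memory."""
    for mem_index, address, data in store_instructions:
        external_memories[mem_index][address] = data


external_memories = [{}, {}]   # two external memories, modeled as dictionaries
route_through_crossbar([(1, 5, "x"), (0, 7, "y")], external_memories)
```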
In some embodiments, the access controller further comprises a state feedback unit.
The state feedback unit is used for feeding back the current working state of the access controller.
In some embodiments, the status feedback unit may be configured to send, in response to receiving a data movement instruction sent by a processor, first status information to the processor to indicate that the hardware circuit is currently in a busy status.
The state feedback unit may be configured to, in response to generating a storage instruction according to the obtained second information and the read data to store the target data in the destination memory (external memory), send second state information to the processor to indicate that the hardware circuit is currently in an idle state.
Therefore, the processor can track the working states of the access controller and the hardware circuit in real time, which allows it to schedule tasks efficiently.
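The busy/idle handshake can be sketched as follows, assuming the link to the processor is modeled as a callback; `StateFeedbackUnit` and its method names are hypothetical.

```python
class StateFeedbackUnit:
    BUSY, IDLE = "busy", "idle"

    def __init__(self, notify):
        self.notify = notify   # callback standing in for the link to the processor

    def on_instruction_received(self):
        self.notify(self.BUSY)   # first state information

    def on_store_issued(self):
        self.notify(self.IDLE)   # second state information


reported = []
unit = StateFeedbackUnit(reported.append)
unit.on_instruction_received()
unit.on_store_issued()
```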
Referring to fig. 5, fig. 5 is a schematic diagram of a hardware circuit structure shown in the present application.
As shown in fig. 5, the hardware circuit includes: an access controller connected with the plurality of shared memories, the plurality of external memories, the first memory and the second memory; a first controller coupled to the plurality of shared memories, the first memory, and the plurality of external memories; a second controller coupled to the plurality of external memories, the second memory, and the plurality of shared memories; and a crossbar (Crossbar) connected between the shared memories and the second controller and between the external memories and the first controller.
The access controller comprises a state feedback unit, a configuration register, an address register and an arithmetic unit.
Assume that data at 4 addresses of 0-3 (hereinafter referred to as addresses 0-3) of the shared memory needs to be stored in address 5 (hereinafter referred to as address 5) of the external memory 1. The processor may generate a first type of instruction with the instruction issue unit using addresses 0-3 as the first information (source address) and address 5 as the second information (destination address) and issue the instruction to the access controller.
After receiving the first type instruction, the access controller may feed back, to the processor, first state information indicating that the current state is busy through a state feedback unit. The address 0 can also be stored as a preset source address to an address register, and the step length 1 can be stored as an operation relation to a configuration register. Then, 4 instructions with source addresses of 0 to 3 and destination addresses of 5 are generated by the arithmetic unit.
Then, the access controller may execute the data moving method for the 4 instructions, respectively, until the data in addresses 0-3 are all moved to address 5. Therefore, the number of the instructions generated by the instruction sending unit can be reduced, and the working performance of the instruction sending unit is improved.
After the access controller completes the data movement, second state information can be sent to the processor through the state feedback unit to indicate that the hardware circuit is currently in an idle state. The processor can therefore track the working state of the hardware circuit in real time, which facilitates task scheduling.
The application also provides a data moving method. The method can be applied to the hardware circuit shown in any of the foregoing embodiments.
Referring to fig. 6, fig. 6 is a flowchart illustrating a method for moving data according to the present application. As shown in fig. 6, the method may include:
S602, the access controller, in response to a received data moving instruction, acquires first information used for reading data and second information used for writing data that are included in the data moving instruction, generates a read instruction according to the first information, and sends the read instruction to the source memory.
S604, the source memory reads data according to the first information included in the read instruction and sends the read data to the storage controller.
S606, the storage controller acquires the second information and generates a storage instruction according to the acquired second information and the read data, so as to store the data to the destination memory.
The method can acquire the first information used for reading data and the second information used for writing data that are included in a data moving instruction, generate a read instruction by using the first information, read data from a source memory, generate a storage instruction by using the second information and the read data, and store the read data to a destination memory to complete the data movement. The source memory is the memory that sends the data, such as the shared memory. The destination memory is the memory that receives the data, such as the external memory.
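Steps S602 to S606 can be traced with a small software model. The dictionary layout of the instructions and the function names below are assumptions made for illustration only.

```python
def s602_access_controller(move_instruction):
    first_info = move_instruction["source"]         # used for reading
    second_info = move_instruction["destination"]   # used for writing
    read_instruction = {"source": first_info}
    return read_instruction, second_info


def s604_source_memory(source_memory, read_instruction):
    # Read the data at the source address named by the first information.
    return source_memory[read_instruction["source"]]


def s606_storage_controller(destination_memory, second_info, data):
    # Build the storage instruction, then carry it out on the destination memory.
    store_instruction = {"destination": second_info, "data": data}
    destination_memory[store_instruction["destination"]] = store_instruction["data"]
    return store_instruction


source_memory = {3: 42}
destination_memory = {}
read_instr, second = s602_access_controller({"source": 3, "destination": 9})
data = s604_source_memory(source_memory, read_instr)
s606_storage_controller(destination_memory, second, data)
```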
Therefore, on one hand, the method can form point-to-point data movement between the source memory and the target memory, so that thread register resources and a Crossbar (Crossbar) between threads and the thread registers are released, the thread power consumption is reduced, and the Crossbar performance is improved; on the other hand, the data transfer can be completed only by sending a data transfer instruction by the instruction transmitting unit, so that the data transfer efficiency is improved.
In some embodiments shown, the hardware circuit further comprises: an information storage; the information memory is connected with the access controller and the storage controller; the access controller responds to the received data moving instruction, and acquires first information used for reading data and second information used for writing data which are included in the data moving instruction; and generating a reading instruction according to the first information, and sending the reading instruction to the source memory, including:
the access controller responds to a received data moving instruction, acquires first information used for reading data and second information used for writing data which are included in the data moving instruction, and stores the second information into the information memory to obtain a storage address for storing the second information; generating a reading instruction according to the storage address and the first information, and sending the reading instruction to the source memory;
the source memory reads data according to first information included in the reading instruction and sends the read data to the storage controller, and the method includes:
the source memory reads data according to first information included in the reading instruction and sends the read data and the storage address included in the reading instruction to the storage controller;
the storage controller acquires the second information, and generates a storage instruction according to the acquired second information and the read data to store the data to the destination memory, including:
and the storage controller acquires the second information from the information storage according to the storage address, and generates a storage instruction according to the acquired second information and the read data so as to store the data to the destination storage.
Therefore, the second information can be stored in the information memory, so that the memory space of the memory controller is released, and the performance of the memory controller is improved.
The application also provides a data moving method. The method can be applied to the hardware circuit shown in any of the foregoing embodiments.
The method may include:
S702, the access controller, in response to the received data moving instruction, acquires first information used for reading data and second information used for writing data which are included in the data moving instruction, generates a read instruction according to the first information, and sends the read instruction to a source memory.
S704, the source memory reads data according to the first information included in the read instruction and sends the read data to a storage controller.
S706, the storage controller acquires the second information and generates a storage instruction according to the obtained second information and the read data, so as to store the data to a destination memory.
The data moving instruction may be a first type of data moving instruction or a second type of data moving instruction. In response to a first type of data transfer instruction, the source memory is the shared memory, the destination memory is the external memory, and the storage controller is the first controller. In response to a second type of data transfer instruction, the source storage is the external storage, the destination storage is the shared memory, and the storage controller is the second controller.
The method can complete the data moving method disclosed in the foregoing embodiment through different devices in a hardware circuit respectively for a first type of data moving instruction and a second type of data moving instruction, so that two types of data moving instructions can be processed asynchronously, and the data moving efficiency is improved.
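The mapping from instruction type to the devices involved can be written out as a small lookup. The enum and function names are hypothetical; the device roles follow the description above.

```python
from enum import Enum


class MoveType(Enum):
    FIRST = 1    # shared memory -> external memory
    SECOND = 2   # external memory -> shared memory


def select_path(move_type):
    """Return the source, destination and storage controller for a move type."""
    if move_type is MoveType.FIRST:
        return {"source": "shared memory",
                "destination": "external memory",
                "controller": "first controller"}
    if move_type is MoveType.SECOND:
        return {"source": "external memory",
                "destination": "shared memory",
                "controller": "second controller"}
    raise ValueError(f"unknown move type: {move_type!r}")


path = select_path(MoveType.SECOND)
```

Because the two types use different controllers, a model like this makes it easy to see why the two kinds of instructions can be in flight at the same time.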
In some embodiments shown, the hardware circuit further comprises: a first memory and a second memory; the first memory is connected with the access controller and the first controller; the second memory is connected with the access controller and the second controller;
the access controller, in response to the received data moving instruction, acquiring first information for reading data and second information for writing data included in the data moving instruction, generating a read instruction according to the first information, and sending the read instruction to a source memory includes:
the access controller responds to a received data moving instruction, acquires first information used for reading data and second information used for writing data which are included in the data moving instruction, and stores the second information into an information memory to obtain a storage address for storing the second information; generating a reading instruction according to the storage address and the first information, and sending the reading instruction to a source memory;
the source memory reads data according to first information included in the reading instruction and sends the read data to the storage controller, and the method comprises the following steps:
the source memory reads data according to first information included in the reading instruction and sends the read data and the storage address included in the reading instruction to a storage controller;
the storage controller acquiring the second information and generating a storage instruction according to the obtained second information and the read data to store the data to a destination memory includes:
the storage controller acquires the second information from the information storage according to the storage address, and generates a storage instruction according to the acquired second information and the read data so as to store the data to a destination storage;
the source storage is the shared memory, the destination storage is the external memory, the information storage is the first storage, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, the information memory is the second memory, and the storage controller is the second controller.
In some embodiments shown, the data move instructions comprise instructions initiated for each thread within a thread group; the information memory comprises a plurality of memory cells;
the access controller, in response to the received data moving instruction, acquiring first information for reading data and second information for writing data included in the data moving instruction, and storing the second information in the information memory to obtain a storage address for storing the second information includes:
the access controller acquires the first information and the second information respectively corresponding to each thread according to the data moving instruction; sending second information corresponding to each thread to the information memory;
the method further comprises the following steps:
the information memory determines a target memory cell which does not store data in the memory cells, stores second information corresponding to each thread to the target memory cell, and sends a memory address of the target memory cell to the access controller so that the access controller can obtain the memory address.
In some embodiments shown, the target storage unit includes storage spaces corresponding to the threads respectively; the information memory determines a target memory cell which does not store data in the plurality of memory cells, and stores second information corresponding to each thread to the target memory cell, including:
the information memory stores second information corresponding to each thread in a storage space corresponding to each thread in the target storage unit.
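The allocation of a target storage unit can be sketched as below. It assumes the information memory is a fixed list of units in which `None` marks a unit that stores no data; that representation and the name `allocate_target_unit` are hypothetical.

```python
def allocate_target_unit(units, second_info_by_thread):
    """Store per-thread second information in the first empty unit.

    Returns the storage address (here, the list index) of the target unit.
    """
    for address, unit in enumerate(units):
        if unit is None:
            # One storage space per thread inside the target unit.
            units[address] = dict(second_info_by_thread)
            return address
    raise RuntimeError("no free storage unit")


units = [{0: 7}, None, None]   # unit 0 is occupied, units 1 and 2 are free
address = allocate_target_unit(units, {0: 40, 1: 41})
```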
In some embodiments shown, the access controller generating a read instruction according to the storage address and the first information includes:
the access controller determines a plurality of first threads with address conflict in each thread according to the source address represented by the first information corresponding to each thread;
and combining the first information corresponding to the plurality of first threads with the storage address respectively to obtain a plurality of first reading instructions corresponding to the first threads respectively.
In some embodiments shown, the access controller generating a read instruction according to the storage address and the first information includes:
and the access controller combines the first information corresponding to a plurality of second threads which do not conflict in the threads with the storage address to obtain second reading instructions corresponding to the plurality of second threads.
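The grouping into first and second read instructions can be sketched as follows. This passage does not define what counts as an address conflict, so the sketch assumes a bank-conflict rule: two threads conflict when their source addresses map to the same bank. `NUM_BANKS`, the modulo mapping, and `build_read_instructions` are all assumptions for illustration.

```python
from collections import defaultdict

NUM_BANKS = 4   # assumed number of shared-memory banks


def build_read_instructions(source_by_thread, storage_address):
    """Split threads into per-thread first reads and one merged second read."""
    by_bank = defaultdict(list)
    for thread_id, source in source_by_thread.items():
        by_bank[source % NUM_BANKS].append(thread_id)

    # Threads sharing a bank conflict; each gets its own first read instruction.
    first_threads = sorted(t for ts in by_bank.values() if len(ts) > 1 for t in ts)
    # Threads alone in their bank are merged into a single second read instruction.
    second_threads = sorted(t for ts in by_bank.values() if len(ts) == 1 for t in ts)

    first_reads = [
        {"thread_ids": [t],
         "sources": {t: source_by_thread[t]},
         "storage_address": storage_address}
        for t in first_threads
    ]
    second_read = None
    if second_threads:
        second_read = {
            "thread_ids": second_threads,
            "sources": {t: source_by_thread[t] for t in second_threads},
            "storage_address": storage_address,
        }
    return first_reads, second_read


# Threads 0 and 1 both map to bank 0; threads 2 and 3 use banks 1 and 2.
first_reads, second_read = build_read_instructions({0: 0, 1: 4, 2: 1, 3: 2}, storage_address=3)
```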
In some embodiments shown, the source memory reading data according to the first information included in the read instruction, and sending the read data and the storage address included in the read instruction to the storage controller, includes:
the source memory responds to the first reading instruction, acquires first information included in the first reading instruction, and reads first data corresponding to a first thread corresponding to the first reading instruction according to a source address represented by the acquired first information;
sending the read first data, the storage address included in the first read instruction, and a thread ID corresponding to the first data to the storage controller;
responding to the second read instruction, acquiring first information which is included in the second read instruction and respectively corresponds to each second thread, and reading second data respectively corresponding to each second thread according to a source address represented by the first information respectively corresponding to each second thread;
and sending each read second data, the memory address included in the second read instruction and the thread ID corresponding to each second data to the memory controller.
In some embodiments, the target storage unit further includes access identifiers corresponding to the threads; the access identifier indicates whether data in a memory space corresponding to the thread is accessed;
the storage controller acquires the second information from the information storage according to the storage address, and the method comprises the following steps:
the storage controller acquires second information corresponding to the thread corresponding to the read data from the information storage according to the storage address;
the method further comprises the following steps:
the information memory responds to the storage controller to acquire the second information from the information memory according to the storage address, and sets an access identifier corresponding to a thread corresponding to the read data as a first identifier; wherein the first identifier characterizes that data in a memory space corresponding to the thread has been accessed.
In some embodiments shown, the method further comprises:
and the information memory determines whether the access identifications, corresponding to the threads, included in each storage unit are the first identifications, and if so, releases the data stored in the corresponding storage unit.
In some embodiments shown, the access controller includes a configuration register, an address register, and an arithmetic unit; the method further comprises the following steps:
the address register, in response to the target data moved by the data moving instruction including data at a plurality of source addresses, stores a preset source address among the plurality of source addresses;
the configuration register stores an operation relationship by which the source addresses other than the preset source address are obtained from the preset source address;
and the arithmetic unit generates a plurality of instructions for moving the data of the plurality of source addresses according to the arithmetic relation and the preset source address so as to finish moving the target data.
In some illustrative embodiments, the hardware circuit includes a plurality of shared memories and a plurality of external memories, and a crossbar is connected between the plurality of shared memories and the second controller and between the plurality of external memories and the first controller; the data transfer instruction is used for carrying out data transfer between the shared memories and the external memories;
the storage controller generates a storage instruction according to the acquired second information and the read data to store the data to a destination memory, and the method comprises the following steps:
and the storage controller generates a storage instruction according to the acquired second information and the read data, and sends the storage instruction to a target memory through the crossbar to finish data movement.
In some embodiments shown, the access controller further comprises a state feedback unit; the method further comprises the following steps:
the state feedback unit, in response to receiving a data moving instruction sent by a processor, sends first state information to the processor to indicate that the hardware circuit is currently in a busy state;
and, in response to a storage instruction being generated according to the acquired second information and the read data to store the target data to the destination memory, sends second state information to the processor to indicate that the hardware circuit is currently in an idle state.
In some embodiments shown, the first memory and/or the second memory include a buffer that supports simultaneous reading and writing. For the effects of the method embodiments, reference can be made to the description of the corresponding hardware circuit embodiments; they are not repeated here.
The application also provides a chip. The chip may include the hardware circuitry shown in any of the embodiments described above. When data is moved inside the chip, the data moving method shown in any one of the embodiments can be realized by using a hardware circuit inside the chip, so that point-to-point data moving between a source memory and a destination memory can be formed on one hand, thread register resources and a Crossbar (Crossbar) between threads and thread registers are released, thread power consumption is reduced, and Crossbar performance is improved; on the other hand, the data transfer can be completed only by sending a data transfer instruction by the instruction transmitting unit, so that the data transfer efficiency is improved, and the chip performance is further improved.
The application also provides an electronic device, which comprises the hardware circuit shown in any embodiment or the chip.
For example, the electronic device may be a smart terminal such as a mobile phone, or may be another device that has a camera and can perform image processing. Exemplarily, when the electronic device needs data transfer, the data transfer method shown in any of the foregoing embodiments may be implemented by a chip or a hardware circuit inside the device, so that on one hand, point-to-point data transfer between a source memory and a destination memory may be formed, thereby releasing a thread register resource and a Crossbar (Crossbar) between threads and a thread register, further reducing thread power consumption, and improving Crossbar performance; on the other hand, the data transfer can be completed only by sending a data transfer instruction by the instruction transmitting unit, so that the data transfer efficiency is improved, and the equipment performance is further improved.
The present application also proposes a computer-readable storage medium on which a computer program is stored, which, when executed by a controller, implements the data migration method as shown in any of the foregoing embodiments.
One skilled in the art will recognize that one or more embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
"and/or" as recited herein means having at least one of two; for example, "A and/or B" includes three scenarios: A, B, and "A and B".
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the data processing apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
Specific embodiments of the present application have been described. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Embodiments of the subject matter and functional operations described in this application may be implemented in the following: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this application and their structural equivalents, or a combination of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs include, for example, general and/or special purpose microprocessors, or any other type of central processing system. Generally, a central processing system will receive instructions and data from a read-only memory and/or a random access memory. The essential components of a computer include a central processing system for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., an internal hard disk or a removable disk), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Although this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as merely describing features of particular disclosed embodiments. Certain features that are described in this application in the context of separate embodiments can also be implemented in combination in a single embodiment. In other instances, features described in connection with one embodiment may be implemented as discrete components or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order described or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the described embodiments is not to be understood as requiring such separation in all embodiments, and it is to be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Moreover, the processes depicted in the accompanying figures do not require the particular order described, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The above description merely illustrates the preferred embodiments of the present application and is not intended to limit the present application to those particular embodiments. Any modifications, equivalents, improvements and the like that are within the spirit and principle of the present application are intended to be included within the scope of the present application.

Claims (20)

1. A hardware circuit, comprising: an access controller; a source memory and a destination memory connected with the access controller; and a storage controller connected to the source memory, the destination memory, and the access controller; the source memory indicates a memory to transmit data; the destination memory indicates a memory to receive data;
the access controller is used for responding to the received data moving instruction, and acquiring first information used for reading data and second information used for writing data which are included in the data moving instruction; generating a reading instruction according to the first information, and sending the reading instruction to the source memory;
the source memory is used for reading data according to first information included in the reading instruction and sending the read data to the storage controller;
and the storage controller is used for acquiring the second information and generating a storage instruction according to the acquired second information and the read data so as to store the data to the destination memory.
2. The hardware circuit of claim 1, further comprising: an information memory; the information memory is connected with the access controller and the storage controller;
the access controller is used for storing the second information to the information memory to obtain a storage address for storing the second information; generating a reading instruction according to the storage address and the first information, and sending the reading instruction to the source memory;
the source memory is used for reading data according to first information included by the reading instruction and sending the read data and the storage address included by the reading instruction to the storage controller;
and the storage controller is used for acquiring the second information from the information memory according to the storage address and generating a storage instruction according to the acquired second information and the read data so as to store the data to the destination memory.
3. A hardware circuit, comprising: an access controller; a shared memory and an external memory connected with the access controller; a first controller coupled to the shared memory, the access controller, and the external memory; and a second controller coupled to the external memory, the access controller, and the shared memory;
the access controller is used for responding to the received data moving instruction, acquiring first information used for reading data and second information used for writing data which are included in the data moving instruction, generating a reading instruction according to the first information, and sending the reading instruction to a source memory;
the source memory is used for reading data according to the first information included in the reading instruction and sending the read data to the storage controller;
the storage controller is used for acquiring the second information and generating a storage instruction according to the acquired second information and the read data so as to store the data to a destination memory;
the source memory is the shared memory, the destination memory is the external memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, and the storage controller is the second controller.
4. The hardware circuit of claim 3, further comprising: a first memory and a second memory; the first memory is connected with the access controller and the first controller; the second memory is connected with the access controller and the second controller;
the access controller is used for storing the second information to an information memory to obtain a storage address for storing the second information; generating a reading instruction according to the storage address and the first information, and sending the reading instruction to a source memory;
the source memory is used for reading data according to first information included in the reading instruction and sending the read data and the storage address included in the reading instruction to the storage controller;
the storage controller is used for acquiring the second information from the information memory according to the storage address and generating a storage instruction according to the acquired second information and the read data so as to store the data to a destination memory;
the source memory is the shared memory, the destination memory is the external memory, the information memory is the first memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, the information memory is the second memory, and the storage controller is the second controller.
5. The hardware circuit of claim 4, wherein the data moving instruction comprises an instruction initiated for each thread within a thread group; the information memory comprises a plurality of memory cells;
the access controller is used for acquiring the first information and the second information respectively corresponding to each thread according to the data moving instruction; sending second information corresponding to each thread to the information memory;
the information memory is used for determining, among the memory cells, a target memory cell which does not store data, storing second information corresponding to each thread to the target memory cell, and sending the storage address of the target memory cell to the access controller so that the access controller can obtain the storage address.
6. The hardware circuit of claim 5, wherein the target memory cell comprises storage spaces corresponding to the respective threads;
the information memory is used for storing second information corresponding to each thread into the storage space corresponding to that thread in the target memory cell.
7. The hardware circuit of claim 5 or 6, wherein the access controller is used for determining, according to the source address represented by the first information corresponding to each thread, a plurality of first threads among the threads between which an address conflict occurs;
and combining the first information corresponding to each of the plurality of first threads with the storage address, to obtain a plurality of first read instructions respectively corresponding to the first threads.
8. The hardware circuit of claim 7, wherein the access controller is used for combining the first information corresponding to second threads, namely those among the threads whose addresses do not conflict with one another, with the storage address to obtain a second read instruction corresponding to the second threads.
9. The hardware circuit of claim 8, wherein the source memory is configured to, in response to the first read instruction, obtain first information included in the first read instruction, and read first data corresponding to a first thread corresponding to the first read instruction according to a source address represented by the obtained first information;
sending the read first data, the storage address included in the first read instruction, and a thread ID corresponding to the first data to the storage controller;
responding to the second read instruction, acquiring first information which is included in the second read instruction and respectively corresponds to each second thread, and reading second data respectively corresponding to each second thread according to a source address represented by the first information respectively corresponding to each second thread;
and sending each read second data, the storage address included in the second read instruction and the thread ID corresponding to each second data to the storage controller.
10. The hardware circuit of claim 9, wherein the target memory cell further comprises access identifiers corresponding to the threads; each access identifier indicates whether the data in the storage space corresponding to the thread has been accessed;
the storage controller is used for acquiring second information corresponding to the thread corresponding to the read data from the information memory according to the storage address;
the information memory is used for, in response to the storage controller acquiring the second information from the information memory according to the storage address, setting the access identifier corresponding to the thread corresponding to the read data to a first identifier; wherein the first identifier indicates that the data in the storage space corresponding to the thread has been accessed.
11. The hardware circuit of claim 10, wherein the information memory is used for determining whether the access identifiers corresponding to the threads in each memory cell are all the first identifier, and if so, releasing the data stored in that memory cell.
12. The hardware circuit of any of claims 1-11, wherein the access controller comprises a configuration register, an address register, and an arithmetic unit;
the address register is used for, in response to the target data moved by the data moving instruction being stored at a plurality of source addresses, storing a preset source address among the plurality of source addresses;
the configuration register is used for storing the operation relation by which the source addresses other than the preset source address among the plurality of source addresses are obtained from the preset source address;
and the operation unit is used for generating a plurality of instructions for moving the data of the plurality of source addresses according to the operation relation and the preset source address so as to complete the moving of the target data.
13. The hardware circuit of any of claims 1-12, wherein the hardware circuit comprises a plurality of shared memories and a plurality of external memories, and a crossbar coupled between the plurality of shared memories and the second controller and between the plurality of external memories and the first controller; the data moving instruction is used for moving data between the shared memories and the external memories;
and the storage controller is used for generating a storage instruction according to the acquired second information and the read data, and sending the storage instruction to a destination memory through the crossbar so as to complete the data moving.
14. The hardware circuit of any of claims 1-13, wherein the access controller further comprises a state feedback unit;
the state feedback unit is used for responding to a received data moving instruction sent by a processor and sending first state information to the processor to indicate that the hardware circuit is in a busy state currently;
and responding to a storage instruction generated according to the acquired second information and the read data to store the target data to the destination memory, and sending second state information to the processor to indicate that the hardware circuit is in an idle state currently.
15. The hardware circuit of claim 4, wherein the first memory and/or the second memory comprises a buffer that supports reading and writing.
16. A data migration method applied to the hardware circuit as claimed in claim 1 or 2, the method comprising:
the access controller responds to the received data moving instruction, and acquires first information used for reading data and second information used for writing data which are included in the data moving instruction; generating a reading instruction according to the first information, and sending the reading instruction to the source memory;
the source memory reads data according to the first information included in the reading instruction and sends the read data to the storage controller;
and the storage controller acquires the second information and generates a storage instruction according to the acquired second information and the read data so as to store the data to the destination memory.
17. A data migration method applied to a hardware circuit as claimed in any one of claims 3 to 15, the method comprising:
the access controller responds to a received data moving instruction, acquires first information used for reading data and second information used for writing data which are included in the data moving instruction, generates a reading instruction according to the first information, and sends the reading instruction to a source memory;
the source memory reads data according to the first information included in the reading instruction and sends the read data to the storage controller;
the storage controller acquires the second information and generates a storage instruction according to the acquired second information and the read data so as to store the data to a destination memory;
the source memory is the shared memory, the destination memory is the external memory, and the storage controller is the first controller; and/or the source memory is the external memory, the destination memory is the shared memory, and the storage controller is the second controller.
18. A chip comprising a hardware circuit as claimed in any one of claims 1 to 15.
19. An electronic device comprising a hardware circuit as claimed in any one of claims 1 to 15 or a chip as claimed in claim 18.
20. A computer-readable storage medium on which a computer program is stored, the program, when executed by a controller, implementing the data migration method according to claim 16 or 17.
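
For illustration only, the data flow recited in claims 1, 2, 5 to 8, 10 and 11 can be sketched as a small software model. This is not the claimed hardware and not an implementation disclosed in this application. The names `InfoMemory`, `InfoCell`, `make_read_instructions`, `move` and `NUM_BANKS` are hypothetical, memories are modeled as Python dictionaries, the second information is assumed to be a destination address, and an address conflict is assumed to mean that two source addresses fall in the same bank; the claims do not define the conflict condition.

```python
from dataclasses import dataclass

NUM_BANKS = 4  # assumption: two source addresses conflict when they fall in the same bank


@dataclass
class InfoCell:
    second_info: dict  # thread ID -> destination address (the "second information")
    accessed: dict     # thread ID -> access identifier (True once fetched)


class InfoMemory:
    def __init__(self, num_cells):
        self.cells = [None] * num_cells  # None marks a cell that stores no data

    def store(self, second_info):
        """Store per-thread second information in a free cell; return the storage address."""
        addr = self.cells.index(None)  # raises ValueError if no cell is free
        self.cells[addr] = InfoCell(dict(second_info), dict.fromkeys(second_info, False))
        return addr

    def fetch(self, addr, tid):
        """Return one thread's second information and set its access identifier."""
        cell = self.cells[addr]
        cell.accessed[tid] = True
        if all(cell.accessed.values()):  # every thread has been served: release the cell
            self.cells[addr] = None
        return cell.second_info[tid]


def make_read_instructions(first_info, storage_addr):
    """Access controller: one read per conflicting thread, one merged read for the rest."""
    by_bank = {}
    for tid, src in first_info.items():
        by_bank.setdefault(src % NUM_BANKS, []).append(tid)
    merged, reads = {}, []
    for tids in by_bank.values():
        if len(tids) > 1:  # "first threads": conflicting, so each gets its own instruction
            reads += [({t: first_info[t]}, storage_addr) for t in tids]
        else:              # "second threads": no conflict, so they share one instruction
            merged[tids[0]] = first_info[tids[0]]
    if merged:
        reads.append((merged, storage_addr))
    return reads


def move(instruction, source, dest, info_mem):
    """Run one data moving instruction given as {thread ID: (source addr, destination addr)}."""
    first_info = {t: s for t, (s, _) in instruction.items()}
    second_info = {t: d for t, (_, d) in instruction.items()}
    storage_addr = info_mem.store(second_info)
    for srcs, addr in make_read_instructions(first_info, storage_addr):
        for tid, src in srcs.items():
            data = source[src]                      # source memory reads the data
            dest[info_mem.fetch(addr, tid)] = data  # storage controller writes it


shared = {0: "a", 1: "b", 4: "c", 6: "d"}  # source memory contents (hypothetical)
external = {}
info = InfoMemory(num_cells=2)
move({0: (0, 100), 1: (1, 101), 2: (4, 102), 3: (6, 103)}, shared, external, info)
```

In this model the read instruction carries only the source addresses and the storage address, so the destination addresses never pass through the source memory. The storage controller looks them up by storage address and thread ID once the data arrives, and the cell is freed after every thread's entry has been fetched.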
CN202110518007.2A 2021-04-29 2021-05-12 Hardware circuit, data moving method, chip and electronic equipment Pending CN113220346A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110476825 2021-04-29
CN2021104768250 2021-04-29

Publications (1)

Publication Number Publication Date
CN113220346A true CN113220346A (en) 2021-08-06

Family

ID=77095209

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110518007.2A Pending CN113220346A (en) 2021-04-29 2021-05-12 Hardware circuit, data moving method, chip and electronic equipment

Country Status (2)

Country Link
CN (1) CN113220346A (en)
WO (1) WO2022227563A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022227563A1 (en) * 2021-04-29 2022-11-03 上海阵量智能科技有限公司 Hardware circuit, data migration method, chip, and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019047026A1 (en) * 2017-09-05 2019-03-14 华为技术有限公司 Data migration method and system and intelligent network card
CN111782154A (en) * 2020-07-13 2020-10-16 北京四季豆信息技术有限公司 Data moving method, device and system
CN111984395A (en) * 2019-05-22 2020-11-24 中移(苏州)软件技术有限公司 Data migration method and system, and computer readable storage medium
CN112506437A (en) * 2020-12-10 2021-03-16 上海阵量智能科技有限公司 Chip, data moving method and electronic equipment
CN112578997A (en) * 2019-09-30 2021-03-30 华为技术有限公司 Data migration method, system and related equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105988953B (en) * 2015-02-12 2019-03-12 深圳市中兴微电子技术有限公司 A kind of direct memory access dma controller and the method for data transmission
CN107783918A (en) * 2016-08-31 2018-03-09 北京信威通信技术股份有限公司 A kind of method and device of data transfer
CN111190842B (en) * 2019-12-30 2021-07-20 Oppo广东移动通信有限公司 Direct memory access, processor, electronic device, and data transfer method
CN113220346A (en) * 2021-04-29 2021-08-06 上海阵量智能科技有限公司 Hardware circuit, data moving method, chip and electronic equipment

Also Published As

Publication number Publication date
WO2022227563A1 (en) 2022-11-03

Similar Documents

Publication Publication Date Title
CN103488580A (en) Method for establishing address mapping table of solid-state memory
CN112506437A (en) Chip, data moving method and electronic equipment
WO2013121085A2 (en) Method, apparatus, and computer program product for inter-core communication in multi-core processors
US11941514B2 (en) Method for execution of computational graph in neural network model and apparatus thereof
CN115033184A (en) Memory access processing device and method, processor, chip, board card and electronic equipment
CN114880259B (en) Data processing method, device, system, electronic equipment and storage medium
WO2007037843A2 (en) Method and apparatus for sharing memory in a multiprocessor system
CN113220346A (en) Hardware circuit, data moving method, chip and electronic equipment
CN110245024B (en) Dynamic allocation system and method for static storage blocks
CN113033785B (en) Chip, neural network training system, memory management method, device and equipment
CN113312182B (en) Cloud computing node, file processing method and device
CN116521096B (en) Memory access circuit, memory access method, integrated circuit, and electronic device
CN101421705A (en) Multi media card with high storage capacity
CN103970714A (en) Apparatus and method for sharing function logic and reconfigurable processor thereof
US20190065075A1 (en) Method to improve mixed workload performance on storage devices that use cached operations
CN116737083B (en) Memory access circuit, memory access method, integrated circuit, and electronic device
CN112596669A (en) Data processing method and device based on distributed storage
KR20110037492A (en) Multi-port memory system and access control method thereof
CN104142802A (en) Memory control apparatus and method
CN116661703A (en) Memory access circuit, memory access method, integrated circuit, and electronic device
US8627031B2 (en) Semiconductor memory device and method of reading data from and writing data into a plurality of storage units
CN102200961B (en) Expansion method of sub-units in dynamically reconfigurable processor
CN110096355B (en) Shared resource allocation method, device and equipment
CN113296972A (en) Information registration method, computing device and storage medium
CN105718993A (en) Cell array calculation system and communication method therein

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40049955

Country of ref document: HK