WO2020052171A1 - Hardware system and electronic device - Google Patents

Hardware system and electronic device Download PDF

Info

Publication number
WO2020052171A1
WO2020052171A1 (PCT/CN2018/124854, CN2018124854W)
Authority
WO
WIPO (PCT)
Prior art keywords
task, cache access, unit, sub, access
Prior art date
Application number
PCT/CN2018/124854
Other languages
French (fr)
Chinese (zh)
Inventor
李炜
曹庆新
黎立煌
Original Assignee
深圳云天励飞技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术有限公司 filed Critical 深圳云天励飞技术有限公司
Publication of WO2020052171A1 publication Critical patent/WO2020052171A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3004Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30047Prefetch instructions; cache control instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode

Definitions

  • the present application relates to the field of computer technology, and in particular, to a hardware system and an electronic device.
  • the traditional solution generally exchanges data between functional modules through a shared data cache.
  • when a functional module needs to pass data to another functional module through the shared data cache, it needs to interrupt the central processing unit, and the central processing unit then notifies the other functional module that the data is ready, so that the other functional module can read the data it needs from the shared data cache.
  • the embodiments of the present application provide a hardware system and an electronic device, so as to improve the operating efficiency of the hardware system and reduce the interruption of the central processing unit.
  • a first aspect of the present application provides a hardware system, including:
  • a hardware system comprising: a central processing unit, a data buffer, a cache access unit, and a task manager;
  • the central processing unit is connected to the task manager and the cache access unit;
  • the data buffer is connected to the task manager and the cache access unit;
  • the central processor is configured to deliver at least one cache access task to the task manager and the cache access unit;
  • the cache access unit is configured to execute the at least one cache access task to access the data buffer;
  • the task manager is configured to monitor, based on the at least one cache access task, the execution of the at least one cache access task by the cache access unit.
  • the at least one cache access task includes a cache access task T1 and a cache access task T0, and the cache access task T0 is a cache access task on which the cache access task T1 depends;
  • the task manager is specifically configured to: after receiving an access request q1 for the cache access task T1 from the cache access unit, determine, based on the completion status of the cache access task T0, whether a response to the access request q1 is allowed;
  • if the response is allowed, the task manager sends an access response aq1 for responding to the access request q1 to the cache access unit, where the access response aq1 is used to indicate that the cache access unit is allowed to access the data buffer to execute the cache access task T1.
  • the task manager is specifically configured to: when the cache access task T0 is completed, determine that the response to the access request q1 is allowed; when the cache access task T0 is not completed, compare the pointer of the cache requested by the cache access task T1 with the current pointer of the cache access task T0; if the pointer comparison passes, determine that the response to the access request q1 is allowed; if the pointer comparison fails, determine that the response to the access request q1 is not allowed.
  • the data buffer includes a plurality of cache slices
  • the cache access task T1 includes the following fields: a first field for indicating the slice identifier of the accessed cache slice, a second field for indicating the start address within the accessed cache slice, a third field for indicating the length of the accessed data, and a field for indicating the cache access task T0 on which the cache access task T1 depends.
  • the access request q1 includes the following fields: the first field and a fourth field for indicating a unit identifier of a cache access unit that issued the access request q1;
  • the access request q1 includes the following fields: the second field, the third field, and the fourth field;
  • the access request q1 includes the following fields: the second field, the third field, and a fifth field for indicating a task identifier of the cache access task T1.
  • the cache access unit includes the following access subunits: an external data reading subunit, an external data return subunit, an internal data reading subunit, and an internal data return subunit;
  • the hardware system further includes an external memory and an operation unit;
  • the external data reading subunit and the external data return subunit are connected between the external memory and the data buffer; the internal data reading subunit and the internal data return subunit are connected between the operation unit and the data buffer.
  • the external data reading subunit is configured to store data read from the external memory into the data buffer;
  • the external data return subunit is configured to store data read from the data buffer into the external memory;
  • the internal data reading subunit is configured to provide data read from the data buffer to the operation unit for operation;
  • the internal data return subunit is configured to store result data obtained by the operation of the operation unit into the data buffer.
  • the central processing unit is connected, through a bus, to the task manager, the external data reading sub-unit, the external data return sub-unit, the internal data reading sub-unit, and the internal data return sub-unit;
  • the central processing unit is specifically configured to issue, through the bus, the cache access task of the external data reading sub-unit to the external data reading sub-unit, the cache access task of the external data return sub-unit to the external data return sub-unit, the cache access task of the internal data reading sub-unit to the internal data reading sub-unit, and the cache access task of the internal data return sub-unit to the internal data return sub-unit.
  • the task manager includes: a cache access controller and a cache access task queue;
  • the cache access task queue is configured to store the cache access tasks, issued by the central processing unit, of the external data reading subunit, the external data return subunit, the internal data reading subunit, and the internal data return subunit;
  • the cache access controller in the task manager is configured to:
  • when the cache access task T0 on which the cache access task T1 depends is completed, determine that the response to the access request q1 is allowed; when the cache access task T0 is not completed, compare the pointer of the cache requested by the cache access task T1 with the current pointer of the cache access task T0; if the pointer comparison passes, determine that the response to the access request q1 is allowed; if the pointer comparison fails, determine that the response to the access request q1 is not allowed.
  • the cache access task queue includes a first sub-queue, a second sub-queue, a third sub-queue, and a fourth sub-queue:
  • the first sub-queue is configured to store a cache access task of the external data reading sub-unit issued by the central processing unit;
  • the second sub-queue is configured to store a cache access task of the external data return sub-unit delivered by the central processing unit;
  • the third sub-queue is configured to store a cache access task of the internal data reading sub-unit issued by the central processing unit;
  • the fourth sub-queue is configured to store a cache access task of the internal data return sub-unit delivered by the central processing unit;
  • the cache access controller includes: a first sub-controller, a second sub-controller, a third sub-controller, and a fourth sub-controller:
  • the first sub-controller is a cache access controller of the external data reading sub-unit; the second sub-controller is a cache access controller of the external data return sub-unit; the third sub-controller is a cache access controller of the internal data reading sub-unit; and the fourth sub-controller is a cache access controller of the internal data return sub-unit.
  • a second aspect of the present application provides an electronic device including a casing and a hardware system housed in the casing, where the hardware system is any one of the hardware systems provided in the first aspect.
  • a task manager, which is different from the central processor, is introduced.
  • the central processor is mainly used to deliver cache access tasks to the task manager and the cache access unit, and the task manager is configured to monitor, based on the cache access tasks of the cache access unit, the execution of the at least one cache access task by the cache access unit. In other words, the execution of cache access tasks no longer requires the central processor to exercise control through interrupts; instead, the task manager is responsible for managing the data buffer and scheduling the cache access unit, which helps reduce interruptions of the central processor.
  • since the execution of cache access tasks can be completed automatically by hardware other than the CPU, dedicated hardware outside the CPU (that is, the task manager, which is distinct from the CPU) can automatically manage the data buffer and the cache access units.
  • compared with the traditional approach in which the central processor manages access to the data buffer (in traditional technology, in addition to its main job, the central processor also needs to manage access to the data buffer, so management of the data buffer can be regarded as a part-time job of the central processor), dedicated hardware responds faster when managing the data buffer, which improves the utilization of the data buffer; and because interruptions of the central processing unit are reduced, the central processing unit can focus more on its own work related to system operation, which greatly improves the operating efficiency of the entire system.
  • within the cache access unit, further subunits can be distinguished based on different access mechanisms. For example, when the cache access unit includes sub-units such as an external data reading subunit, an external data return subunit, an internal data reading subunit, and an internal data return subunit, the task manager can coordinate the data access work among these sub-units, which improves the access efficiency of the data buffer and further improves the utilization of the data buffer.
  • FIG. 1-A is a schematic diagram of a hardware system architecture according to an embodiment of the present application.
  • FIG. 1-B is a schematic diagram of another hardware system architecture according to an embodiment of the present application.
  • FIG. 1-C is a schematic diagram of another hardware system architecture according to an embodiment of the present application.
  • FIG. 1-D is a schematic diagram of another hardware system architecture according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a central processing unit according to an embodiment of the present application delivering a task to a task manager and each cache access unit through a task bus;
  • FIG. 3 is a schematic diagram of an internal architecture of a task manager according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an internal architecture of another task manager according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a processing method of a hardware system according to an embodiment of the present application.
  • FIG. 6-A and FIG. 6-B are schematic diagrams of two data formats of a task provided by an embodiment of the present application.
  • FIG. 7-A is a schematic diagram of a situation in which tasks are cached in each sub-queue in the task manager provided in the embodiment of the present application;
  • FIG. 7-B is a schematic diagram of another situation of caching tasks in each sub-queue in the task manager provided in the embodiment of the present application;
  • FIG. 8 is a schematic diagram of an execution association relationship of some tasks provided by an embodiment of the present application.
  • the embodiments of the present application provide a hardware system and an electronic device, so as to improve the operating efficiency of the hardware system and reduce the interruption of the central processing unit.
  • the embodiments of the present application strive to design a hardware system, which is conducive to reducing the interruption of the central processing unit and improving the operating efficiency of the hardware system.
  • FIG. 1-A is a schematic diagram of a hardware system architecture provided by an embodiment of the present application.
  • the hardware system 100 mainly includes a central processing unit 110, a data buffer 120, a cache access unit 130, and a task manager 140.
  • the central processing unit 110 is connected to the task manager 140 and the cache access unit 130.
  • the data buffer 120 is connected to the task manager 140 and the cache access unit 130.
  • the central processing unit 110 is configured to deliver at least one cache access task (for example, one cache access task or multiple cache access tasks) to the task manager 140 and the cache access unit 130.
  • the cache access unit 130 is configured to execute the at least one cache access task to access the data buffer 120.
  • the task manager 140 is configured to monitor the execution of the at least one cache access task by the cache access unit 130 based on at least one cache access task of the cache access unit 130.
  • the central processor is mainly used to deliver cache access tasks to the task manager and the cache access unit, and the task manager is used to monitor, based on the cache access tasks of the cache access unit 130, the execution of the at least one cache access task by the cache access unit 130.
  • the execution of cache access tasks no longer requires the central processor to exercise control through interrupts; instead, the task manager is responsible for managing the data buffer and scheduling the cache access unit, which helps reduce interruptions of the central processor. Since the execution of a cache access task can be completed automatically by hardware other than the CPU (that is, the task manager), automatic management of the data buffer by dedicated hardware outside the CPU can be realized.
  • At least two components of the central processing unit 110, the data buffer 120, the cache access unit 130, and the task manager 140 may be connected through a bus.
  • the central processing unit 110, the data buffer 120, the cache access unit 130, and the task manager 140 are all connected through a bus.
  • the hardware system 100 may further include some other components.
  • FIG. 1-B is a schematic diagram of another hardware system architecture provided by an embodiment of the present application.
  • the hardware system 100 may further include an operation unit 150.
  • the cache access unit 130 may include the following access subunits: an internal data reading subunit 133 and an internal data return subunit 134.
  • the internal data reading subunit 133 and the internal data return subunit 134 are connected between the operation unit 150 and the data buffer 120.
  • the internal data reading subunit 133 may be configured to provide data read from the data buffer 120 to the operation unit 150 for operations.
  • the internal data return sub-unit 134 may be configured to store the result data obtained by the operation unit 150 into the data buffer 120.
  • the operation unit 150 may include at least one operator unit, and the at least one operator unit may include, for example, at least one convolution operator unit, at least one addition operator unit, at least one subtraction operator unit, and/or at least one other operator unit.
  • FIG. 1-C is a schematic diagram of another hardware system architecture provided by an embodiment of the present application.
  • the hardware system 100 may further include an external memory 160.
  • the cache access unit 130 may include the following access subunits: an external data reading subunit 131 and an external data return subunit 132.
  • the external data reading subunit 131 and the external data return subunit 132 are connected between the external memory 160 and the data buffer 120.
  • the external data reading subunit 131 is configured to store data read from the external memory 160 into the data buffer 120.
  • the external data return sub-unit 132 is configured to store data read from the data buffer 120 into the external memory 160.
  • FIG. 1-D is a schematic diagram of still another hardware system architecture according to an embodiment of the present application.
  • the hardware system 100 may include both the operation unit 150 and the external memory 160.
  • the cache access unit 130 may include the following access subunits: an external data reading subunit 131, an external data return subunit 132, an internal data reading subunit 133, and an internal data return subunit 134.
  • when the cache access unit 130 includes sub-units such as the external data reading sub-unit 131, the external data return sub-unit 132, the internal data reading sub-unit 133, and the internal data return sub-unit 134, the task manager 140 can coordinate the data access work among these sub-units, which improves the access efficiency of the data buffer and further improves the utilization of the data buffer 120.
  • the external memory 160 is a memory relatively far from the operation unit 150, and the data buffer 120 is a memory relatively close to the operation unit 150.
  • the external memory 160 may be, for example, an on-chip random access memory or an off-chip random access memory, and specifically may be a double data rate synchronous dynamic random access memory (DDR SDRAM) or another type of random access memory.
  • the external data reading subunit 131 may be, for example, eidma (external input direct memory access), and the external data reading subunit 131 may read data in the external memory 160 into the data buffer 120.
  • the external data return sub-unit 132 may be, for example, an eodma (external output direct memory access) unit, and the external data return sub-unit 132 may store the data in the data buffer 120 into the external memory 160.
  • the internal data reading sub-unit 133 may be, for example, an idma (input direct memory access) unit, and the internal data reading sub-unit 133 may read the data in the data buffer 120 to the operation unit 150 for calculation.
  • the internal data return sub-unit 134 may be, for example, an odma (output direct memory access) unit, and the internal data return sub-unit 134 may return the operation result of the operation unit 150 to the data buffer 120.
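As an illustrative summary (not part of the patent text), the division of labor among the four DMA subunits described above can be captured in a small mapping; the names eidma/eodma/idma/odma follow the text, while the dictionary itself is a hypothetical sketch:

```python
# role of each cache access subunit: (reads from, writes to)
DMA_ROLES = {
    "eidma": ("external memory", "data buffer"),   # external input DMA (131)
    "eodma": ("data buffer", "external memory"),   # external output DMA (132)
    "idma":  ("data buffer", "operation unit"),    # internal input DMA (133)
    "odma":  ("operation unit", "data buffer"),    # internal output DMA (134)
}

for name, (src, dst) in DMA_ROLES.items():
    print(f"{name}: {src} -> {dst}")
```

Note how the data buffer sits in the middle of every path: each subunit either fills it or drains it, which is what makes centralized coordination by the task manager possible.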
  • the central processing unit 110 may be, for example, a CPU or another processor, such as a digital signal processor (DSP), a microprocessor, a micro central processing unit, or a neural network processor.
  • the components of the hardware system may be coupled together through a bus system, for example.
  • the bus system may include a data bus, a power bus, a control bus, and a status signal bus.
  • the central processing unit 110 may be an integrated circuit chip with signal processing capabilities.
  • the central processing unit 110 may also include other hardware accelerators, such as an application-specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the central processing unit 110 is, for example, configured to issue a task through a task bus (task_bus).
  • the task manager 140 manages the data buffer 120 and performs coordinated and synchronized management of each cache access unit.
  • FIG. 2 is a schematic diagram of the central processing unit 110 sending tasks to the task manager and each cache access unit through a task bus (task_bus).
  • the following examples illustrate some possible ways for the task manager 140 to cooperatively manage the cache access unit to access the data buffer.
  • the task manager 140 may be configured to: after receiving an access request q1 for the cache access task T1 from the cache access unit, determine, based on the completion status of the cache access task T0 on which the cache access task T1 depends, whether to respond to the access request q1. If it is determined that the access request q1 may be responded to, an access response aq1 for responding to the access request q1 is sent to the cache access unit, where the access response aq1 is used to indicate that the cache access unit is allowed to access the data buffer to execute the cache access task T1.
  • specifically, the task manager 140 may be configured to: when the cache access task T0 on which the cache access task T1 depends is completed, determine that the access request q1 may be responded to; when the cache access task T0 is not completed, further compare the pointer of the cache that the cache access task T1 requests to access with the current pointer of the cache access task T0. If the pointer comparison passes, it is determined that the access request q1 may be responded to; if the pointer comparison fails, it is determined that the access request q1 may not be responded to.
  • when the pointer that the cache access task T1 requests to access lags behind the current pointer of the cache access task T0, the pointer comparison passes; when the pointer that the cache access task T1 requests to access is not behind the current pointer of the cache access task T0, the pointer comparison fails.
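As a minimal sketch of the dependency check described above (not the patent's implementation; all names are hypothetical), the decision reduces to a completion flag plus a pointer comparison, where `t0_current_ptr` tracks how far T0 has progressed and `t1_requested_ptr` is the position T1 asks to access:

```python
def allow_response(t0_done: bool, t0_current_ptr: int, t1_requested_ptr: int) -> bool:
    """Decide whether access request q1 for task T1 may be answered.

    T1 depends on T0: if T0 has finished, the request is always allowed;
    otherwise it is allowed only if the region T1 wants to access has
    already been covered by T0 (the pointer comparison passes).
    """
    if t0_done:
        return True
    # Pointer comparison: T1's requested pointer must lag behind
    # T0's current progress pointer.
    return t1_requested_ptr < t0_current_ptr

# T0 unfinished but already past the requested position: allowed
assert allow_response(False, t0_current_ptr=512, t1_requested_ptr=128) is True
# T0 unfinished and not yet at the requested position: denied
assert allow_response(False, t0_current_ptr=128, t1_requested_ptr=512) is False
```

The pointer branch is what lets a dependent reader start consuming data before its producer has fully finished, rather than waiting for task-level completion.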
  • that the cache access task T1 depends on the cache access task T0 means that the execution of the cache access task T1 must be based on the successful execution of the cache access task T0; that is, only after the cache access task T0 is successfully executed can the cache access task T1 be executed.
  • for example, the cache access task T1 reads the data buffer 120 (data memory) and the cache access task T0 writes the data buffer 120.
  • the data that the cache access task T1 wants to read from the data buffer 120 may be some or all of the data written by the cache access task T0 to the data buffer 120. Only after the cache access task T0 is successfully executed can the cache access task T1 read the corresponding data from the data buffer 120. If the cache access task T0 is not successfully executed, the data buffer 120 does not contain the data to be read by the cache access task T1, and the cache access task T1 cannot be successfully executed. In this case, the cache access task T1 depends on the cache access task T0.
  • the data buffer 120 may include multiple cache slices (buffers).
  • the cache access task T1 may include the following fields: a field for indicating the slice ID (buffer ID) of the accessed cache slice, a field for indicating the start address within the accessed cache slice, a field for indicating the length of the accessed data, a field for indicating the cache access task T0 on which the cache access task T1 depends, and the like. It can be seen that the above-mentioned fields included in the cache access task help clearly indicate the corresponding cache access task, the cache location accessed, and the length of the cache space accessed.
  • the access request q1 may include the following fields: a field for indicating the start address of the accessed cache slice, and a field for indicating the unit identifier of the cache access unit that issued the access request q1;
  • alternatively, the access request q1 may include the following fields: a field for indicating the start address of the accessed cache slice, a field for indicating the length of the accessed data, and a field for indicating the unit identifier of the cache access unit that issued the access request q1;
  • alternatively, the access request q1 may include the following fields: a field for indicating the start address of the accessed cache slice, a field for indicating the length of the accessed data, and a field for indicating the task identifier of the cache access task T1.
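The three alternative field layouts for the access request q1 could be modeled as one record type with optional fields (the field names here are illustrative, not taken from the patent):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AccessRequest:
    start_addr: int                 # start address within the accessed cache slice
    length: Optional[int] = None    # length of the accessed data (variants 2 and 3)
    unit_id: Optional[int] = None   # identifier of the requesting cache access unit (variants 1 and 2)
    task_id: Optional[int] = None   # identifier of the cache access task T1 (variant 3)

# Variant 1: start address + unit identifier
q1_a = AccessRequest(start_addr=0x100, unit_id=1)
# Variant 2: start address + length + unit identifier
q1_b = AccessRequest(start_addr=0x100, length=64, unit_id=1)
# Variant 3: start address + length + task identifier
q1_c = AccessRequest(start_addr=0x100, length=64, task_id=7)
```

The trade-off the variants suggest: a unit identifier lets the task manager look the task up in that unit's sub-queue, while a task identifier names the task directly.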
  • the central processing unit 110 is connected, through a bus, to the task manager 140, the external data reading subunit 131, the external data return subunit 132, the internal data reading subunit 133, and the internal data return subunit 134. The central processing unit 110 may be specifically configured to: issue, through the bus, the cache access task of the external data reading subunit 131 to the external data reading subunit 131; issue the cache access task of the external data return subunit 132 to the external data return subunit 132; issue the cache access task of the internal data reading subunit 133 to the internal data reading subunit 133; issue the cache access task of the internal data return subunit 134 to the internal data return subunit 134; and issue, to the task manager 140, the cache access tasks of the external data reading subunit 131, the external data return subunit 132, the internal data reading subunit 133, and the internal data return subunit 134.
  • the task manager 140 may include a cache access controller 142 and a cache access task queue 141.
  • the cache access task queue is configured to store the cache access tasks, issued by the central processing unit, of the external data reading subunit, the external data return subunit, the internal data reading subunit, and the internal data return subunit.
  • the cache access controller 142 in the task manager 140 may be used to: when the cache access task T0 on which the cache access task T1 depends is completed, determine that the access request q1 may be responded to; when the cache access task T0 is not completed, compare the pointer of the cache requested by the access request q1 with the current pointer of the cache access task T0. If the pointer comparison passes, it is determined that the access request q1 may be responded to; if the pointer comparison fails, it is determined that the access request q1 may not be responded to.
  • FIG. 4 illustrates an example of a cache access task queue 141 and a cache access controller 142 included in the task manager 140.
  • the cache access task queue 141 includes a first sub-queue (such as an eidma task queue), a second sub-queue (such as an eodma task queue), a third sub-queue (such as an idma task queue), and a fourth sub-queue (such as an odma task queue).
  • the first sub-queue is used to store a cache access task of the external data reading sub-unit issued by the central processing unit 110.
  • the second sub-queue is configured to store a cache access task of the external data return sub-unit delivered by the central processing unit 110.
  • the third sub-queue is used to store a cache access task of the internal data reading sub-unit issued by the central processing unit 110.
  • the fourth sub-queue is used to store a cache access task of the internal data return sub-unit delivered by the central processing unit 110.
  • sub-queues are further divided inside the cache access task queue, so that different sub-queues store the cache access tasks to be executed by different cache access subunits. This makes it easier to implement classified management of cache access tasks, helps simplify the reading of cache access tasks, and further improves the efficiency with which cache access tasks are read.
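The per-subunit classification described above can be sketched as one FIFO queue per cache access subunit, with each CPU-issued task routed by the subunit that must execute it (a hypothetical sketch; `dispatch` and the task dictionary keys are illustrative names):

```python
from collections import deque

# one FIFO sub-queue per cache access subunit, as in FIG. 4
task_queues = {
    "eidma": deque(),  # first sub-queue
    "eodma": deque(),  # second sub-queue
    "idma":  deque(),  # third sub-queue
    "odma":  deque(),  # fourth sub-queue
}

def dispatch(task: dict) -> None:
    """Route a cache access task issued by the CPU to the sub-queue
    of the subunit that will execute it."""
    task_queues[task["unit"]].append(task)

dispatch({"unit": "eidma", "task_id": 1})
dispatch({"unit": "idma",  "task_id": 2})
```

Because each sub-controller only ever reads the head of its own queue, no search across a mixed queue is needed, which is the "simpler reading" benefit the text claims.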
  • the cache access controller 142 includes:
  • a first sub-controller (such as an eidma cache access controller), a second sub-controller (such as an eodma cache access controller), a third sub-controller (such as an idma cache access controller), and a fourth sub-controller (such as an odma cache access controller).
  • the first sub-controller is a cache access controller of the external data reading sub-unit.
  • the second sub-controller is a cache access controller of the external data return sub-unit;
  • the third sub-controller is a cache access controller of the internal data reading sub-unit;
  • the fourth sub-controller is a cache access controller of the internal data return sub-unit.
  • since each cache access subunit is configured with a corresponding cache access sub-controller, the cache access subunits can be controlled independently, which further facilitates the coordinated management and control of the data access units.
  • other working processes of the hardware system 100 are described below with reference to FIG. 5.
  • the following describes the collaborative execution process of data access tasks to describe how the central processor and the task manager work together to complete the data access of the data buffer.
  • the central processing unit 110 sends tasks to the task manager 140, eodma 132, eidma 131, idma 133, and odma 134 through a task bus.
  • the tasks that access the data buffer 120 are stored in the respective buffer queues of the task manager 140.
  • the tasks that operate on the data buffer 120 and have a dependency relationship with each other are entered into the buffer queue in the task manager 140.
  • the tasks in the buffer task can indicate the following information:
  • the buffer ID of the accessed buffer (cache chip), the start address of the accessed buffer, the length of the accessed data, and the task on which the current task depends.
  • If the task on which the current task depends reads the data memory, the task can also indicate from which unit the input data of the current task comes (this unit can be called the data source unit); if the task on which the current task depends writes the data memory, the task can also indicate to which unit the output data of the current task goes (this unit can be called the data target unit).
  • Figure 6-A and Figure 6-B illustrate two data formats for tasks.
  • the task may include a bufferID field indicating the buffer ID of the accessed buffer, an address field indicating the start address of the accessed buffer, a length field indicating the length of the accessed data, and a field recording the task on which the current task depends.
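As an illustration, such a task record can be modeled as a small data structure. This is a sketch only; the field names (`task_id`, `buffer_id`, `start_addr`, `length`, `depends_on`, `src_unit`, `dst_unit`) are assumptions chosen for readability, not the patent's actual bit-level encoding:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CacheAccessTask:
    """One entry in a task-manager sub-queue (illustrative field names)."""
    task_id: str                       # e.g. "idma_task1"
    buffer_id: int                     # bufferID field: which buffer (buf0, buf1, ...)
    start_addr: int                    # address field: start address in that buffer
    length: int                        # length field: length of the accessed data
    depends_on: Optional[str] = None   # task on which this task depends
    src_unit: Optional[str] = None     # data source unit (when the dependency reads)
    dst_unit: Optional[str] = None     # data target unit (when the dependency writes)

# Example: idma_task1 reads the data that eidma_task1 wrote into buf0.
idma_task1 = CacheAccessTask("idma_task1", buffer_id=0, start_addr=0x0,
                             length=1024, depends_on="eidma_task1")
```

The `depends_on` field is what lets the task manager decide, on a later memory request, whether the prerequisite task has finished.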
  • eidma_task1: reads data001 from the external memory into buf0 of the data buffer 120.
  • idma_task1: reads data001, written into buf0 by eidma_task1, and sends it to the operation unit for calculation.
  • odma_task1: stores the operation result data002, obtained by the operation unit operating on data001, into buf1 of the data buffer 120.
  • eodma_task1: reads the operation result data002, stored into buf1 by odma_task1, and stores it into the external memory.
  • FIG. 7-A illustrates an example of the tasks cached by each sub-queue in the task manager used to store the cache access tasks of eodma, eidma, idma, and odma.
  • the eodma task queue is used to store the cache access tasks of eodma;
  • the eidma task queue is used to store the cache access tasks of eidma;
  • the idma task queue is used to store the cache access tasks of idma;
  • the odma task queue is used to store the cache access tasks of odma.
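These four sub-queues can be modeled as simple FIFOs keyed by unit name. A minimal sketch, under the assumption that tasks are enqueued in the order the central processing unit issues them and retired from the front on completion:

```python
from collections import deque

class TaskManagerQueues:
    """Four FIFO sub-queues, one per cache access unit (illustrative model)."""
    def __init__(self):
        self.queues = {u: deque() for u in ("eidma", "eodma", "idma", "odma")}

    def push(self, unit, task):
        self.queues[unit].append(task)        # CPU issues a task over the task bus

    def front(self, unit):
        q = self.queues[unit]
        return q[0] if q else None            # task the sub-controller inspects next

    def pop(self, unit):
        return self.queues[unit].popleft()    # retire a completed task

qs = TaskManagerQueues()
qs.push("eidma", "eidma_task1")
qs.push("idma", "idma_task1")
```

Each per-unit sub-controller only ever looks at the front of its own queue, which is why independent control of the units is straightforward.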
  • the eidma sends a memory request (task execution request) of the task eidma_task1 to the task manager.
  • When the task manager receives the memory request for task eidma_task1, the eidma cache access controller in the task manager reads task eidma_task1 from the eidma task queue and determines whether to respond to the memory request for task eidma_task1.
  • If the response is not allowed, the data memory controller can backpressure the corresponding unit that initiated the request.
  • Here, backpressure means notifying the requesting unit to extend the tolerance time for which it waits for the corresponding response. For example, if the normal time to wait for the response is 2 seconds, backpressure can inform the requesting unit to extend its tolerance time to 5 seconds or another period of not less than 2 seconds.
  • If the response is allowed, the eidma cache access controller in the task manager can notify the data memory controller to respond to the memory request of task eidma_task1, specifically through the xxx_buf_rdy instruction.
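The admission decision described above — respond if the dependency is satisfied, otherwise backpressure the requester — can be sketched as follows. The `completed_tasks` set and the string outcomes are modeling assumptions; in hardware the positive outcome is signaled with an instruction such as xxx_buf_rdy:

```python
def decide(dep_task, completed_tasks):
    """Admission check for one memory request (illustrative model).

    dep_task: name of the task the requesting task depends on, or None.
    completed_tasks: set of task names already known to have finished.
    """
    if dep_task is None or dep_task in completed_tasks:
        # Dependency satisfied: notify the data memory controller to respond.
        return "RESPOND"
    # Dependency still pending: backpressure the requesting unit,
    # i.e. tell it to extend its tolerance time and keep waiting.
    return "BACKPRESSURE"

# eidma_task1 depends on nothing, so its request is served at once;
# idma_task1 depends on eidma_task1 and is backpressured until it finishes.
first = decide(None, set())
second = decide("eidma_task1", set())
third = decide("eidma_task1", {"eidma_task1"})
```

The same check runs in each per-unit sub-controller against the front task of its own sub-queue.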
  • According to the guidance of eidma_task1, eidma 131 reads the data to be processed, data001, from the external memory 160 into buf0 of the data buffer 120.
  • the task manager 140 notifies idma (for example, via the xxx_buf_rdy instruction) to read data001 from the data buffer 120.
  • After receiving the notification from the task manager 140, idma 133 reads the data001 required for calculation into the operation unit 150. Correspondingly, the operation unit 150 performs an operation on data001 and obtains the operation result data002.
  • the odma 134 stores the operation result data002, obtained by the operation unit 150 operating on data001, into buf1 of the data buffer 120.
  • the task manager 140 may notify the eodma 132 to read the data data002 from buf1 of the data buffer 120 and then save the data002 to the external memory 160.
  • When the task manager 140 instructs eodma 132 to read data002 from the data buffer 120 and save it to the external memory 160, eodma 132 reads data002 from the data buffer 120 and saves it to the external memory 160.
  • the central processing unit assigns tasks to eidma, idma, odma, eodma, and task manager through the task bus (task_bus).
  • eidma, idma, odma, and eodma will work according to the assigned task.
  • the task manager dynamically monitors each cache access unit (eidma, idma, odma, and eodma) and coordinates the task execution of each access unit according to the issued tasks.
  • eidma_task1: reads the initial data data001 stored in the external memory into buf0 of the data memory.
  • idma_task1: reads data001, written into buf0 by eidma_task1, and sends it to the processing element for calculation.
  • odma_task1: stores the result data002, obtained by the processing element calculating on data001, into buf1 of the data memory.
  • idma_task2: reads data002, stored into buf1 by odma_task1, and sends it to the processing element for calculation.
  • odma_task2: stores the result data003, obtained by the processing element calculating on data002, into buf0 of the data memory.
  • idma_task3: reads data003, stored into buf0 by odma_task2, and sends it to the processing element for calculation.
  • odma_task3: stores the result data004, obtained by the processing element calculating on data003, into buf1 of the data memory.
  • eodma_task1: reads data004, stored into buf1 by odma_task3, and stores it into the external memory.
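The chain above ping-pongs results between buf0 and buf1. A toy simulation of that flow (the `data + 1` step stands in for the processing element's actual calculation, which the patent does not specify):

```python
def run_pipeline(initial, rounds):
    """Model the eidma -> (idma -> PE -> odma) x rounds -> eodma chain.

    Each round reads the previous result from one buffer and writes the
    new result to the other, so results alternate between buf1 and buf0.
    """
    bufs = {0: None, 1: None}
    bufs[0] = initial          # eidma_task1: external memory -> buf0 (data001)
    data, dst = initial, 0
    for _ in range(rounds):    # each idma_taskN / odma_taskN pair
        data = data + 1        # processing element (toy stand-in calculation)
        dst ^= 1               # odma writes the result to the other buffer
        bufs[dst] = data
    return data, dst           # eodma_task1 reads the final result

final_data, final_buf = run_pipeline(1, 3)  # data001 -> data002 -> data003 -> data004
```

With three rounds, the final result lands in buf1, matching the example where eodma_task1 reads data004 from buf1.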
  • FIG. 7-B illustrates another example of the tasks cached by each sub-queue in the task manager used to store the cache access tasks of eodma, eidma, idma, and odma.
  • The execution dependency relationships of these tasks are illustrated in FIG. 8.
  • the dotted arrows indicate the order in which tasks are performed.
  • eidma_task1 is executed first.
  • eidma_task1 instructs to write data001 to buf0.
  • idma_task1, which depends on eidma_task1, executes after eidma_task1; it instructs reading data001 from buf0 into the operation unit for calculation.
  • odma_task1 executes after idma_task1 and instructs writing data002, generated by calculating on data001, to buf1. idma_task2, which depends on odma_task1, executes after odma_task1 and instructs reading data002 from buf1 for calculation by the operation unit.
  • odma_task2 executes after idma_task2 and instructs writing data003, resulting from the calculation on data002, to buf0. idma_task3, which depends on odma_task2, executes after odma_task2 and instructs reading data003 from buf0 for calculation by the operation unit.
  • odma_task3 executes after idma_task3 and instructs writing data004, generated by calculating on data003, to buf1. eodma_task1, which depends on odma_task3, executes after odma_task3 and instructs reading data004 from buf1 and storing it into the external memory.
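The dotted arrows amount to a rule: a task may execute only after the task it depends on has finished. A sketch that records the dependency list from the example above and checks a proposed execution order against it:

```python
# Dependency map from the example: task -> task it depends on.
deps = {
    "idma_task1": "eidma_task1", "odma_task1": "idma_task1",
    "idma_task2": "odma_task1",  "odma_task2": "idma_task2",
    "idma_task3": "odma_task2",  "odma_task3": "idma_task3",
    "eodma_task1": "odma_task3",
}

def order_is_valid(order, deps):
    """True iff every task in `order` appears after the task it depends on."""
    done = set()
    for task in order:
        if task in deps and deps[task] not in done:
            return False     # dependency has not completed yet
        done.add(task)
    return True

chain = ["eidma_task1", "idma_task1", "odma_task1", "idma_task2",
         "odma_task2", "idma_task3", "odma_task3", "eodma_task1"]
```

This is exactly the invariant the task manager enforces at run time by withholding responses (or applying backpressure) until the prerequisite task completes.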

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A hardware system and an electronic device. The hardware system (100) comprises: a central processing unit (110), a data memory (120), a buffer access element (130) and a task manager (140), wherein the central processing unit (110) is connected to the task manager (140) and the buffer access element (130); the data memory (120) is connected to the task manager (140) and the buffer access element (130); the central processing unit (110) is used for issuing at least one buffer access task to the task manager (140) and the buffer access element (130); the buffer access element (130) is used for executing the at least one buffer access task to access the data memory (120); and the task manager (140) is used for monitoring, based on the at least one buffer access task, the execution of the at least one buffer access task by the buffer access element (130). The hardware system and the electronic device facilitate the improvement of the operating efficiency of the hardware system and reduction of the interruption of the central processing unit.

Description

Hardware System and Electronic Device
This application claims priority to a Chinese patent application filed with the Chinese Patent Office on September 11, 2018, with application number 201811056568.X and the invention title "Hardware System and Electronic Device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to a hardware system and an electronic device.
Background
Generally speaking, if multiple functional modules within a system exchange data, the traditional solution typically does so through a shared data cache. In the traditional solution, after one functional module writes the data to be transferred into the shared data cache, it needs to interrupt the central processing unit, and the central processing unit then notifies another functional module that the data is ready, so that the other functional module can read the data it needs from the shared data cache.
However, in practice it has been found that the traditional solution requires constant interruption of the central processing unit, which makes the operating efficiency of the system relatively low.
Summary of the Invention
The embodiments of the present application provide a hardware system and an electronic device, so as to improve the operating efficiency of the hardware system and reduce interruptions of the central processing unit.
A first aspect of the present application provides a hardware system, including:
a hardware system, characterized by including: a central processing unit, a data buffer, a cache access unit, and a task manager;
wherein the central processing unit is connected to the task manager and the cache access unit;
wherein the data buffer is connected to the task manager and the cache access unit;
wherein the central processing unit is configured to deliver at least one cache access task to the task manager and the cache access unit;
the cache access unit is configured to execute the at least one cache access task to access the data buffer;
the task manager is configured to monitor, based on the at least one cache access task, the execution of the at least one cache access task by the cache access unit.
In some possible implementations, the at least one cache access task includes a cache access task T1 and a cache access task T0, where the cache access task T0 is a cache access task on which the cache access task T1 depends;
the task manager is specifically configured to, after receiving an access request q1 for the cache access task T1 from the cache access unit, determine, based on the completion status of the cache access task T0, whether to allow a response to the access request q1;
if it is determined that responding to the access request q1 is allowed, the task manager sends to the cache access unit an access response aq1 for responding to the access request q1, where the access response aq1 is used to indicate that the cache access unit is allowed to execute the cache access task T1 on the data buffer.
In some possible implementations, when determining, based on the completion status of the cache access task T0, whether to allow a response to the access request q1,
the task manager is specifically configured to: when the cache access task T0 has been completed, determine that responding to the access request q1 is allowed; when the cache access task T0 has not been completed, compare the pointer of the cache that the cache access task T1 requests to access with the current pointer of the cache access task T0; if the pointer comparison passes, determine that responding to the access request q1 is allowed; if the pointer comparison fails, determine that responding to the access request q1 is not allowed.
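The pointer comparison for a partially completed dependency can be sketched as follows. It assumes `t0_current_ptr` is the address up to which T0 has already written and that T1's request passes only if the region it wants to read lies entirely behind that pointer; the exact comparison rule is an assumption for illustration:

```python
def allow_response(t0_done, t0_current_ptr, t1_start_addr, t1_length):
    """Decide whether access request q1 (for task T1) may be served, given
    the completion state of the cache access task T0 it depends on."""
    if t0_done:
        return True                            # T0 finished: always allow
    # Pointer comparison: allow only if the region T1 wants lies entirely
    # within what T0 has produced so far (an assumed comparison rule).
    return t1_start_addr + t1_length <= t0_current_ptr

# T0 has written up to 0x200, so a read of 0x100 bytes at 0x100 passes,
# while a read reaching past 0x200 would be refused.
ok = allow_response(False, 0x200, 0x100, 0x100)
```

This lets T1 start consuming data while T0 is still producing it, instead of waiting for T0 to complete entirely.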
In some possible implementations, the data buffer includes multiple cache slices, and the cache access task T1 includes the following fields:
a first field indicating the slice identifier of the accessed cache slice, a second field indicating the start address of the accessed cache slice, a third field indicating the length of the accessed data, and a field indicating the cache access task T0 on which the cache access task T1 depends.
In some possible implementations, the access request q1 includes the following fields: the first field and a fourth field indicating the unit identifier of the cache access unit that issued the access request q1;
or,
the access request q1 includes the following fields: the second field, the third field, and the fourth field;
or, the access request q1 includes the following fields: the second field, the third field, and a fifth field indicating the task identifier of the cache access task T1.
In some possible implementations, the cache access unit includes the following access subunits: an external data reading subunit, an external data return subunit, an internal data reading subunit, and an internal data return subunit;
the hardware system further includes an external memory and an operation unit;
wherein the external data reading subunit and the external data return subunit are connected between the external memory and the data buffer, and the internal data reading subunit and the internal data return subunit are connected between the operation unit and the data buffer.
wherein the external data reading subunit is configured to store data read from the external memory into the data buffer;
the external data return subunit is configured to store data read from the data buffer into the external memory;
wherein the internal data reading subunit is configured to provide data read from the data buffer to the operation unit for operation;
the internal data return subunit is configured to store result data obtained by the operation of the operation unit into the data buffer.
In some possible implementations, the central processing unit is connected through a bus to the task manager, the external data reading subunit, the external data return subunit, the internal data reading subunit, and the internal data return subunit;
wherein the central processing unit is specifically configured to deliver, through the bus, the cache access task of the external data reading subunit to the external data reading subunit; deliver the cache access task of the external data return subunit to the external data return subunit; deliver the cache access task of the internal data reading subunit to the internal data reading subunit; deliver the cache access task of the internal data return subunit to the internal data return subunit; and deliver the cache access tasks of the external data reading subunit, the external data return subunit, the internal data reading subunit, and the internal data return subunit to the task manager.
In some possible implementations, the task manager includes: a cache access controller and a cache access task queue;
wherein the cache access task queue is configured to store the cache access tasks, delivered by the central processing unit, of the external data reading subunit, the external data return subunit, the internal data reading subunit, and the internal data return subunit;
wherein, in the aspect of determining, based on the completion status of the cache access task T0 on which the cache access task T1 depends, whether to allow a response to the access request q1, the cache access controller in the task manager is configured to:
when the cache access task T0 on which the cache access task T1 depends has been completed, determine that responding to the access request q1 is allowed;
when the cache access task T0 has not been completed, compare the pointer of the cache that the cache access task T1 requests to access with the current pointer of the cache access task T0; if the pointer comparison passes, determine that responding to the access request q1 is allowed; if the pointer comparison fails, determine that responding to the access request q1 is not allowed.
In some possible implementations, the cache access task queue includes a first sub-queue, a second sub-queue, a third sub-queue, and a fourth sub-queue:
wherein the first sub-queue is configured to store the cache access tasks of the external data reading subunit delivered by the central processing unit;
the second sub-queue is configured to store the cache access tasks of the external data return subunit delivered by the central processing unit;
the third sub-queue is configured to store the cache access tasks of the internal data reading subunit delivered by the central processing unit;
the fourth sub-queue is configured to store the cache access tasks of the internal data return subunit delivered by the central processing unit;
wherein the cache access controller includes: a first sub-controller, a second sub-controller, a third sub-controller, and a fourth sub-controller:
the first sub-controller is a cache access controller of the external data reading subunit; the second sub-controller is a cache access controller of the external data return subunit; the third sub-controller is a cache access controller of the internal data reading subunit; the fourth sub-controller is a cache access controller of the internal data return subunit.
A second aspect of the present application provides an electronic device, including: a casing and a hardware system housed in the casing, where the hardware system is any one of the hardware systems provided in the first aspect.
It can be seen that the above hardware system introduces a task manager, a piece of hardware distinct from the central processing unit. The central processing unit is mainly used to deliver cache access tasks to the task manager and the cache access unit, while the task manager monitors, based on the cache access tasks of the cache access unit, the execution of the at least one cache access task by the cache access unit. In other words, the execution of cache access tasks no longer needs to be controlled by the central processing unit through interrupts; instead, the task manager is responsible for managing the data buffer and scheduling the cache access unit, which helps reduce interruptions of the central processing unit. Since the execution of cache access tasks can be completed automatically and cooperatively by hardware other than the central processing unit (namely the task manager), dedicated hardware outside the central processing unit (i.e., the task manager, distinct from the central processing unit) can automatically manage the data buffer and the cache access unit. Compared with the traditional technique in which the central processing unit manages the data buffer in a part-time manner (in the traditional technique, besides its main duties, the central processing unit also has to manage access to the data buffer, so the management of the data buffer can be regarded as a part-time job of the central processing unit), dedicated hardware that manages the data buffer responds relatively faster, which is more conducive to improving the utilization of the data buffer; and since interruptions of the central processing unit are reduced, the central processing unit can focus more on its own work related to system operation, which helps greatly improve the operating efficiency of the entire system.
Within the cache access unit, more subunits can be subdivided based on different access mechanisms. For example, when the cache access unit includes subunits such as an external data reading subunit, an external data return subunit, an internal data reading subunit, and an internal data return subunit, the task manager can coordinate the data access work among these subunits, which helps improve the access efficiency of the data buffer and further improve the utilization of the data buffer.
Brief Description of the Drawings
The drawings used in the description of the embodiments are briefly introduced below.
FIG. 1-A is a schematic diagram of a hardware system architecture provided by an embodiment of the present application;
FIG. 1-B is a schematic diagram of another hardware system architecture provided by an embodiment of the present application;
FIG. 1-C is a schematic diagram of yet another hardware system architecture provided by an embodiment of the present application;
FIG. 1-D is a schematic diagram of still another hardware system architecture provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the central processing unit delivering tasks to the task manager and each cache access unit through a task bus according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the internal architecture of a task manager provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the internal architecture of another task manager provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of a processing method of the hardware system provided by an embodiment of the present application;
FIG. 6-A and FIG. 6-B are schematic diagrams of two data formats of tasks provided by an embodiment of the present application;
FIG. 7-A is a schematic diagram of one situation of tasks cached in each sub-queue of the task manager provided by an embodiment of the present application;
FIG. 7-B is a schematic diagram of another situation of tasks cached in each sub-queue of the task manager provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the execution dependency relationships of some tasks provided by an embodiment of the present application.
Detailed Description
The embodiments of the present application provide a hardware system and an electronic device, so as to improve the operating efficiency of the hardware system and reduce interruptions of the central processing unit.
In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art shall fall within the protection scope of the present application.
Each aspect is described in detail below. The terms "first", "second", "third", "fourth", and "fifth" in the specification, claims, and drawings of the present application are used to distinguish different objects, not to describe a particular order. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
The embodiments of the present application seek to design a hardware system that helps reduce interruptions of the central processing unit and improve the operating efficiency of the hardware system.
Referring to FIG. 1-A, FIG. 1-A is a schematic diagram of a hardware system architecture provided by an embodiment of the present application. As shown in the example of FIG. 1-A, the hardware system 100 mainly includes: a central processing unit 110, a data buffer 120, a cache access unit 130, and a task manager 140.
The central processing unit 110 is connected to the task manager 140 and the cache access unit 130.
The data buffer 120 is connected to the task manager 140 and the cache access unit 130.
The central processing unit 110 is configured to deliver at least one cache access task (for example, one cache access task or multiple cache access tasks) to the task manager 140 and the cache access unit 130.
The cache access unit 130 is configured to execute the at least one cache access task to access the data buffer 120.
The task manager 140 is configured to monitor, based on the at least one cache access task of the cache access unit 130, the execution of the at least one cache access task by the cache access unit 130.
It can be seen that the above solution introduces the task manager hardware into the hardware system. The central processing unit is mainly used to deliver cache access tasks to the task manager and the cache access unit, while the task manager monitors, based on the cache access tasks of the cache access unit, the execution of the at least one cache access task by the cache access unit 130. In other words, the execution of cache access tasks no longer needs to be controlled by the central processing unit through interrupts; instead, the task manager is responsible for managing the data buffer and scheduling the cache access unit, which helps reduce interruptions of the central processing unit. Since the execution of cache access tasks can be completed automatically and cooperatively by hardware other than the central processing unit (namely the task manager), dedicated hardware outside the central processing unit (i.e., the task manager, distinct from the central processing unit) can automatically manage the data buffer and the cache access unit. Compared with the traditional technique in which the central processing unit manages the data buffer in a part-time manner (in the traditional technique, besides its main duties, the central processing unit also has to manage access to the data buffer, so the management of the data buffer can be regarded as a part-time job of the central processing unit), dedicated hardware that manages the data buffer responds relatively faster, which is more conducive to improving the utilization of the data buffer; and since interruptions of the central processing unit are reduced, the central processing unit can focus more on its own work related to system operation, which helps greatly improve the operating efficiency of the entire system.
In some possible implementations, at least two of the central processing unit 110, the data buffer 120, the cache access unit 130, and the task manager 140 may be connected through a bus. For example, the central processing unit 110, the data buffer 120, the cache access unit 130, and the task manager 140 are all connected through a bus.
Of course, in addition to the components shown in the example of FIG. 1-A, the hardware system 100 may further include other components. For example, referring to FIG. 1-B, FIG. 1-B is a schematic diagram of another hardware system architecture provided by an embodiment of the present application. As shown in the example of FIG. 1-B, the hardware system 100 may further include an operation unit 150, and the cache access unit 130 may include the following access sub-units: an internal data reading sub-unit 133 and an internal data write-back sub-unit 134. The internal data reading sub-unit 133 and the internal data write-back sub-unit 134 are connected between the operation unit 150 and the data buffer 120.
The internal data reading sub-unit 133 may be configured to provide the data read from the data buffer 120 to the operation unit 150 for computation. The internal data write-back sub-unit 134 may be configured to store the result data computed by the operation unit 150 into the data buffer 120.
The operation unit 150 may include at least one operation sub-unit, and the at least one operation sub-unit may include, for example, at least one convolution sub-unit, at least one addition sub-unit, at least one subtraction sub-unit, and/or at least one other operation sub-unit.
Referring to FIG. 1-C, FIG. 1-C is a schematic diagram of yet another hardware system architecture provided by an embodiment of the present application. As shown in the example of FIG. 1-C, the hardware system 100 may further include an external memory 160, and the cache access unit 130 may include the following access sub-units: an external data reading sub-unit 131 and an external data write-back sub-unit 132. The external data reading sub-unit 131 and the external data write-back sub-unit 132 are connected between the external memory 160 and the data buffer 120.
The external data reading sub-unit 131 is configured to store data read from the external memory 160 into the data buffer 120. The external data write-back sub-unit 132 is configured to store data read from the data buffer 120 into the external memory 160.
Referring to FIG. 1-D, FIG. 1-D is a schematic diagram of still another hardware system architecture provided by an embodiment of the present application. As shown in the example of FIG. 1-D, the hardware system 100 may include both the operation unit 150 and the external memory 160.

As shown in the example of FIG. 1-D, the cache access unit 130 may include the following access sub-units: the external data reading sub-unit 131, the external data write-back sub-unit 132, the internal data reading sub-unit 133, and the internal data write-back sub-unit 134. For the working mechanism of each access sub-unit in the cache access unit 130, refer to the foregoing example descriptions.
Inside the cache access unit, more sub-units may be subdivided according to different access mechanisms. For example, when the cache access unit 130 includes sub-units such as the external data reading sub-unit 131, the external data write-back sub-unit 132, the internal data reading sub-unit 133, and the internal data write-back sub-unit 134, the task manager 140 can coordinate data accesses among these sub-units, which helps improve the access efficiency of the data buffer and further improve the utilization of the data buffer 120.
The external memory 160 is a memory relatively far from the operation unit 150, while the data buffer 120 is a memory relatively close to the operation unit 150. The external memory 160 may be, for example, an on-chip random access memory or an off-chip random access memory; specifically, it may be a double data rate synchronous dynamic random access memory (DDR SDRAM) or another type of random access memory.
In some embodiments of the present application, the cache access unit may, for example, use DMA (direct memory access) technology to access the relevant data.
The external data reading sub-unit 131 may be, for example, an eidma (external input DMA), which can read data from the external memory 160 into the data buffer 120.

The external data write-back sub-unit 132 may be, for example, an eodma (external output DMA), which can store data from the data buffer 120 back into the external memory 160.

The internal data reading sub-unit 133 may be, for example, an idma (input DMA), which can read data from the data buffer 120 and feed it to the operation unit 150 for computation.

The internal data write-back sub-unit 134 may be, for example, an odma (output DMA), which can store the computation results of the operation unit 150 back into the data buffer 120.
The central processing unit 110 may include, for example, a central processing unit (CPU) or another processor, such as a digital signal processor (DSP), a microprocessor, a micro central processing unit, or a neural network processor. In some specific applications, the components of the hardware system may be coupled together through a bus system. Besides a data bus, the bus system may also include a power bus, a control bus, a status signal bus, and so on. The central processing unit 110 may be an integrated circuit chip with signal processing capability. In some implementations, in addition to units that execute software instructions, the central processing unit 110 may also include other hardware accelerators, for example an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, discrete gate or transistor logic devices, discrete hardware components, and so on.
The central processing unit 110 is configured, for example, to issue tasks through a task bus (task_bus). The task manager 140 manages the data buffer 120 and performs coordinated, synchronized management of the individual cache access units.

Referring to FIG. 2, FIG. 2 is a schematic diagram of the central processing unit 110 issuing tasks to the task manager and the individual cache access units through the task bus (task_bus).
The following describes, by way of example, some possible ways in which the task manager 140 cooperatively manages the cache access units to access the data buffer.
In some possible implementations, the task manager 140 may be configured to: after receiving, from a cache access unit, an access request q1 for a cache access task T1, decide, based on the completion status of the cache access task T0 on which the cache access task T1 depends, whether to allow a response to the access request q1. If it decides to allow a response to the access request q1, it sends to the cache access unit an access response aq1 for the access request q1, the access response aq1 indicating that the cache access unit is allowed to execute the access request q1 against the data buffer.
For example, in deciding, based on the completion status of the cache access task T0 on which the cache access task T1 depends, whether to allow a response to the access request q1, the task manager 140 may be configured as follows: if the cache access task T0 on which the cache access task T1 depends has been completed, decide to allow a response to the access request q1; if the cache access task T0 has not been completed, further compare the pointer of the cache region that the cache access task T1 requests to access with the current pointer of the cache access task T0. If the pointer comparison passes, decide to allow a response to the access request q1; if the pointer comparison fails, decide not to allow a response to the access request q1.

Here, if the current pointer of the cache access task T0 lags behind the pointer of the cache region that the cache access task T1 requests to access, the pointer comparison passes; if the current pointer of the cache access task T0 does not lag behind the pointer that the cache access task T1 requests to access, the pointer comparison fails.
That the cache access task T1 depends on the cache access task T0 means that the execution of the cache access task T1 must rely on the execution of the cache access task T0; that is, the execution of the cache access task T1 is premised on the successful execution of the cache access task T0, so the cache access task T1 can be executed only after the cache access task T0 has been successfully executed.
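The admission decision described above can be condensed into a small illustrative model. This is a sketch only, not the patented hardware logic: the names Task and may_grant are invented here for illustration, and the pointer comparison encodes the rule exactly as the description states it (the comparison passes when T0's current pointer lags behind the pointer that T1 requests).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    task_id: str
    depends_on: Optional["Task"] = None  # the depended-on task T0, if any
    done: bool = False
    current_ptr: int = 0                 # progress pointer of an in-flight task

def may_grant(t1: Task, requested_ptr: int) -> bool:
    """Decide whether the task manager may respond to T1's access request q1."""
    t0 = t1.depends_on
    if t0 is None or t0.done:
        return True  # the depended-on task is complete: allow the response
    # T0 still in flight: the comparison, as stated in the description,
    # passes when T0's current pointer lags behind the requested pointer.
    return t0.current_ptr < requested_ptr

t0 = Task("eidma_task1", current_ptr=8)
t1 = Task("idma_task1", depends_on=t0)
granted = may_grant(t1, requested_ptr=16)  # comparison passes: 8 lags behind 16
```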
For example, suppose the cache access task T1 reads the data buffer 120 (data memory) while the cache access task T0 writes the data buffer 120. The data that the cache access task T1 is to read from the data buffer 120 may then be some or all of the data that the cache access task T0 writes into the data buffer 120. Only after the cache access task T0 has been successfully executed can the cache access task T1 read the corresponding data from the data buffer 120. If the cache access task T0 has not been successfully executed, the data to be read by the cache access task T1 does not exist in the data buffer 120, so the cache access task T1 cannot be successfully executed either. In this situation, the cache access task T1 depends on the cache access task T0.
The data buffer 120 may include multiple buffer slices (buffers).
In some possible implementations, the cache access task T1 may include the following fields: a field indicating the slice identifier (buffer ID) of the buffer slice to be accessed, a field indicating the start address within the accessed buffer slice, a field indicating the length of the accessed data, a field indicating the cache access task T0 on which the cache access task T1 depends, and so on. It can be seen that these fields included in a cache access task help indicate clearly the depended-on cache access task, the cache location to be accessed, and the length of the cache region to be accessed.
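The fields listed above can be pictured as a small task descriptor. This is an illustrative sketch under assumptions: the names buffer_id, start_addr, length, and depends_on are invented here, and the description does not fix any field widths or encoding.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CacheAccessTask:
    buffer_id: int                    # slice identifier (buffer ID) of the accessed slice
    start_addr: int                   # start address within the accessed buffer slice
    length: int                       # length of the accessed data
    depends_on: Optional[str] = None  # identifier of the depended-on task, if any

t0_desc = CacheAccessTask(buffer_id=0, start_addr=0x0, length=256)
t1_desc = CacheAccessTask(buffer_id=0, start_addr=0x0, length=256, depends_on="T0")
```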
In some possible implementations, the access request q1 may include the following fields: a field indicating the start address within the accessed buffer slice, and a field indicating the unit identifier of the cache access unit that issued the access request q1.

Alternatively, the access request q1 may include the following fields: a field indicating the start address within the accessed buffer slice, a field indicating the length of the accessed data, and a field indicating the unit identifier of the cache access unit that issued the access request q1.
Alternatively, the access request q1 may include the following fields: a field indicating the start address within the accessed buffer slice, a field indicating the length of the accessed data, and a field indicating the task identifier of the cache access task T1.
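The three request formats above can be sketched as simple constructors. Purely illustrative: the key names start_addr, length, unit_id, and task_id are invented here, and no wire format is implied by the description.

```python
def make_request_v1(start_addr: int, unit_id: str) -> dict:
    # Variant 1: start address plus the issuing unit's identifier.
    return {"start_addr": start_addr, "unit_id": unit_id}

def make_request_v2(start_addr: int, length: int, unit_id: str) -> dict:
    # Variant 2: additionally carries the length of the accessed data.
    return {"start_addr": start_addr, "length": length, "unit_id": unit_id}

def make_request_v3(start_addr: int, length: int, task_id: str) -> dict:
    # Variant 3: identifies the task T1 rather than the issuing unit.
    return {"start_addr": start_addr, "length": length, "task_id": task_id}

q1 = make_request_v3(0x100, 64, "T1")
```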
In some possible implementations, the central processing unit 110 is connected through a bus to the task manager 140, the external data reading sub-unit 131, the external data write-back sub-unit 132, the internal data reading sub-unit 133, and the internal data write-back sub-unit 134. The central processing unit 110 may then be specifically configured to issue, through the bus: the cache access tasks of the external data reading sub-unit 131 to the external data reading sub-unit 131; the cache access tasks of the external data write-back sub-unit 132 to the external data write-back sub-unit 132; the cache access tasks of the internal data reading sub-unit 133 to the internal data reading sub-unit 133; the cache access tasks of the internal data write-back sub-unit 134 to the internal data write-back sub-unit 134; and the cache access tasks of the external data reading sub-unit 131, the external data write-back sub-unit 132, the internal data reading sub-unit 133, and the internal data write-back sub-unit 134 to the task manager 140.
In some possible implementations, referring to FIG. 3, the task manager 140 may include a cache access controller 142 and a cache access task queue 141. The cache access task queue is used to store the cache access tasks, issued by the central processing unit, of the external data reading sub-unit, the external data write-back sub-unit, the internal data reading sub-unit, and the internal data write-back sub-unit.

In deciding, based on the completion status of the cache access task T0 on which the cache access task T1 depends, whether to allow a response to the access request q1, the cache access controller 142 in the task manager 140 may be configured as follows: if the cache access task T0 on which the cache access task T1 depends has been completed, decide to allow a response to the access request q1; if the cache access task T0 has not been completed, compare the pointer of the cache region that the access request q1 requests to access with the current pointer of the cache access task T0; if the pointer comparison passes, decide to allow a response to the access request q1; if the pointer comparison fails, decide not to allow a response to the access request q1.
Referring to FIG. 4, FIG. 4 illustrates an example of the cache access task queue 141 and the cache access controller 142 included in the task manager 140.

The cache access task queue 141 may include a first sub-queue (e.g., an eidma task queue), a second sub-queue (e.g., an eodma task queue), a third sub-queue (e.g., an idma task queue), and a fourth sub-queue (e.g., an odma task queue). The first sub-queue is used to store the cache access tasks of the external data reading sub-unit issued by the central processing unit 110; the second sub-queue is used to store the cache access tasks of the external data write-back sub-unit issued by the central processing unit 110; the third sub-queue is used to store the cache access tasks of the internal data reading sub-unit issued by the central processing unit 110; and the fourth sub-queue is used to store the cache access tasks of the internal data write-back sub-unit issued by the central processing unit 110.
It can be understood that if the cache access task queue is further divided into sub-queues, so that different sub-queues store the cache access tasks to be executed by different cache access sub-units, it becomes easier to manage cache access tasks by category, which helps simplify the reading of cache access tasks and thereby improves the efficiency with which they are read.
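The per-sub-unit queueing described above can be sketched as four FIFOs keyed by sub-unit name. An illustrative model only: the routing function and its error handling are invented here, and the hardware queues need not behave like Python deques in any other respect.

```python
from collections import deque

# One FIFO per cache access sub-unit, mirroring the four sub-queues above.
task_queues = {unit: deque() for unit in ("eidma", "eodma", "idma", "odma")}

def enqueue(unit: str, task: str) -> None:
    """Route a task issued by the CPU into the sub-queue of its sub-unit."""
    if unit not in task_queues:
        raise ValueError(f"unknown cache access sub-unit: {unit}")
    task_queues[unit].append(task)

enqueue("eidma", "eidma_task1")
enqueue("idma", "idma_task1")
next_eidma_task = task_queues["eidma"].popleft()  # tasks leave in FIFO order
```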
The cache access controller 142 may include a first sub-controller (e.g., an eidma cache access controller), a second sub-controller (e.g., an eodma cache access controller), a third sub-controller (e.g., an idma cache access controller), and a fourth sub-controller (e.g., an odma cache access controller). The first sub-controller is the cache access controller of the external data reading sub-unit; the second sub-controller is the cache access controller of the external data write-back sub-unit; the third sub-controller is the cache access controller of the internal data reading sub-unit; and the fourth sub-controller is the cache access controller of the internal data write-back sub-unit.
It can be seen that if each cache access sub-unit is provided with a corresponding cache access sub-controller, each cache access sub-unit can be controlled independently, which further improves the coordinated management and control of the data access units.
The following describes some workflows of the hardware system with reference to the accompanying drawings.

Referring to FIG. 5, some workflows of the hardware system 100 are described below with reference to FIG. 5. Using the cooperative execution of data access tasks as an example, the following describes how components such as the central processing unit and the task manager work together to complete data accesses to the data buffer.
501. The central processing unit 110 issues tasks to the task manager 140, the eodma 132, the eidma 131, the idma 133, and the odma 134 through the task bus.

Tasks that access the data buffer 120 are stored into their respective buffer task queues in the task manager 140. In one embodiment, only tasks that operate on the data buffer 120 and have dependency relationships with one another enter the buffer task queues in the task manager 140.
A task in a buffer task queue may indicate the following information: the buffer ID (slice identifier) of the buffer slice to be accessed, the start address within the accessed buffer slice, the length of the accessed data, and the task on which the current task depends.

In addition, if the task on which the current task depends reads the data memory, the task may further indicate which unit the input data of the current task comes from (which may be called the data source unit). If the task on which the current task depends writes the data memory, the task may further indicate which unit the output data of the current task goes to (which may be called the data target unit).
Referring to FIG. 6-A and FIG. 6-B, FIG. 6-A and FIG. 6-B illustrate two data formats of a task. A task may include a buffer ID field indicating the buffer ID of the buffer slice to be accessed, an address field indicating the start address within the accessed buffer slice, a length field indicating the length of the accessed data, a depended-task field recording the task on which the current task depends, and a data source unit field recording the identifier of the data source unit or a data target unit field recording the data target unit.
Suppose the central processing unit issues the following tasks through task_bus. eidma_task1: read data001 from the external memory into buf0 of the data buffer 120. idma_task1: read the data001 that eidma_task1 wrote into buf0 and feed it to the operation unit for computation. odma_task1: store the result data002, obtained by the operation unit computing on data001, into buf1 of the data buffer 120. eodma_task1: read the result data002 that odma_task1 stored in buf1 and store it into the external memory.
Referring to FIG. 7-A, FIG. 7-A illustrates one possible state of the tasks buffered in the sub-queues that the task manager uses to store the cache access tasks of the eodma, eidma, idma, and odma.

The eodma task queue stores the cache access tasks of the eodma, the eidma task queue stores the cache access tasks of the eidma, the idma task queue stores the cache access tasks of the idma, and the odma task queue stores the cache access tasks of the odma.
502. The eidma sends a memory request (task execution request) for the task eidma_task1 to the task manager.

503. The task manager receives the memory request for the task eidma_task1; the eidma cache access controller in the task manager reads the task eidma_task1 from the eidma task queue and decides whether to respond to the memory request for the task eidma_task1.
If the task on which eidma_task1 depends has been completed, the memory request for eidma_task1 can be responded to. If the task on which eidma_task1 depends has not been completed, the memory request for eidma_task1 is not responded to. If the current request cannot be responded to, the data memory controller may apply back-pressure to the unit that initiated the request. Back-pressure here means notifying the requesting unit to extend the tolerance time for which it waits for the corresponding response. For example, if the normal tolerance time for waiting for the response is 2 seconds, back-pressure may notify the requesting unit to extend the tolerance time to 5 seconds or some other duration of no less than 2 seconds.

Here, assume that the task on which eidma_task1 depends has been completed. The eidma cache access controller in the task manager can then notify the data memory controller to respond to the memory request of eidma_task1, for example by means of an xxx_buf_rdy instruction.
504. Following the instructions of eidma_task1, the eidma 131 reads the to-be-processed data001 from the external memory 160 into buf0 of the data buffer 120.

505. The task manager 140 notifies the idma (for example, via an xxx_buf_rdy instruction) to read the data001 from the data buffer 120.

506. After receiving the notification from the task manager 140, the idma 133 reads the data001 needed for computation to the operation unit 150. Accordingly, the operation unit 150 computes on data001 and obtains the computation result data002.

507. The odma 134 stores the computation result data002, obtained by the operation unit 150 computing on data001, back into buf1 of the data buffer 120.

508. The task manager 140 may notify the eodma 132 to read the data002 from buf1 of the data buffer 120 and then store it back into the external memory 160.
509. When the task manager 140 notifies the eodma 132 to read the data002 from the data buffer 120 and store it back into the external memory 160, the eodma 132 reads the data002 from the data buffer 120 and then stores it back into the external memory 160.
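Steps 501 to 509 above can be condensed into a toy software model of the data path. This is purely illustrative: the memories are plain dictionaries, and the compute function is a hypothetical stand-in for whatever the operation unit 150 actually does.

```python
external_memory = {"data001": b"\x01\x02\x03\x04"}  # initial input data
data_buffer = {"buf0": None, "buf1": None}          # two buffer slices

def compute(payload: bytes) -> bytes:
    # Hypothetical stand-in for the operation unit 150.
    return bytes(b ^ 0xFF for b in payload)

# 504: eidma reads data001 from the external memory into buf0
data_buffer["buf0"] = external_memory["data001"]
# 505-506: idma feeds buf0 to the operation unit, which produces data002
data002 = compute(data_buffer["buf0"])
# 507: odma stores the result back into buf1
data_buffer["buf1"] = data002
# 508-509: eodma stores buf1 back into the external memory
external_memory["data002"] = data_buffer["buf1"]
```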
It can be seen that, in the above example flow, the central processing unit delivers tasks to the eidma, idma, odma, eodma, and task manager through the task bus (task_bus). The eidma, idma, odma, and eodma then work according to the issued tasks, while the task manager dynamically monitors each cache access unit (eidma, idma, odma, and eodma) and cooperatively manages the task execution of each access unit according to the issued tasks.

The above example, for ease of description, gives each of the eidma, idma, odma, and eodma only one task. Of course, the central processing unit may also issue more tasks to them. For example, the tasks that the central processing unit issues to each unit through task_bus may also be as follows.
eidma_task1: read the initial data data001 stored in the external memory into buf0 of the data memory.

idma_task1: read the data001 that eidma_task1 wrote into buf0 and feed it to the process element for computation.

odma_task1: store the result data002, obtained by the process element computing on data001, into buf1 of the data memory.

idma_task2: read the data002 that odma_task1 stored in buf1 and feed it to the process element for computation.

odma_task2: store the result data003, obtained by the process element computing on data002, into buf0 of the data memory.

idma_task3: read the data003 that odma_task2 stored in buf0 and feed it to the process element for computation.

odma_task3: store the result data004, obtained by the process element computing on data003, into buf1 of the data memory.

eodma_task1: read the data004 that odma_task3 stored in buf1 and store it into the external memory.
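The eight tasks above form a dependency chain that ping-pongs between buf0 and buf1. A sketch of that chain (task names from the list above; the tuple layout and the checking function are invented for illustration):

```python
# (task, buffer slice it touches, task it directly follows), per the list above
chain = [
    ("eidma_task1", "buf0", None),
    ("idma_task1",  "buf0", "eidma_task1"),
    ("odma_task1",  "buf1", "idma_task1"),
    ("idma_task2",  "buf1", "odma_task1"),
    ("odma_task2",  "buf0", "idma_task2"),
    ("idma_task3",  "buf0", "odma_task2"),
    ("odma_task3",  "buf1", "idma_task3"),
    ("eodma_task1", "buf1", "odma_task3"),
]

def buffers_ping_pong(tasks) -> bool:
    """Each write/read pair shares a slice; successive pairs alternate slices."""
    pair_bufs = [tasks[i][1] for i in range(0, len(tasks), 2)]
    return all(a != b for a, b in zip(pair_bufs, pair_bufs[1:]))

alternates = buffers_ping_pong(chain)
```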
Referring to FIG. 7-B, FIG. 7-B illustrates another state of the tasks buffered in the sub-queues that the task manager uses to store the cache access tasks of the eodma, eidma, idma, and odma.

Referring to FIG. 8, FIG. 8 illustrates the execution relationships among these tasks; the dashed arrows indicate the order in which the tasks are executed.
In the example shown in FIG. 8, eidma_task1 is executed first; it instructs writing data001 into buf0. idma_task1, which depends on eidma_task1, is executed after eidma_task1; it instructs reading data001 from buf0 and feeding it to the computing unit for computation.
odma_task1 is executed after idma_task1; it instructs writing data002, produced by computing data001, into buf1. idma_task2, which depends on odma_task1, is executed after odma_task1; it instructs reading data002 from buf1 for the computing unit.
odma_task2 is executed after idma_task2; it instructs writing data003, produced by computing data002, into buf0. idma_task3, which depends on odma_task2, is executed after odma_task2; it instructs reading data003 from buf0 for the computing unit.
odma_task3 is executed after idma_task3; it instructs writing data004, produced by computing data003, into buf1. eodma_task1, which depends on odma_task3, is executed after odma_task3; it instructs reading data004 from buf1 and storing the read data004 into the external memory.
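The ping-pong (double-buffer) pipeline described above can be sketched as a short simulation. This is an illustrative sketch only: the function names, the `compute` placeholder, and the integer stand-ins for data001–data004 are assumptions; the patent describes hardware behaviour, not a software API.

```python
# Minimal simulation of the eidma -> (idma -> odma)* -> eodma ping-pong
# pipeline. buf0/buf1 alternate as source and destination each round.

def compute(data):
    """Stand-in for the process element: derive the next data block."""
    return data + 1  # e.g. data001 -> data002


def run_pipeline(initial, steps):
    bufs = {0: None, 1: None}            # buf0 / buf1 of the data memory
    bufs[0] = initial                    # eidma_task1: external memory -> buf0
    trace = [("eidma_task1", "write", 0)]
    src = 0
    data = initial
    for i in range(1, steps + 1):
        data = compute(bufs[src])        # idma_task{i}: read buf{src} -> PE
        dst = 1 - src                    # ping-pong: alternate buffers
        bufs[dst] = data                 # odma_task{i}: PE result -> buf{dst}
        trace.append((f"idma_task{i}", "read", src))
        trace.append((f"odma_task{i}", "write", dst))
        src = dst
    trace.append(("eodma_task1", "read", src))  # final result -> external memory
    return data, trace
```

Running `run_pipeline(1, 3)` reproduces the access order of the example: idma_task1 reads buf0, odma_task1 writes buf1, idma_task2 reads buf1, odma_task2 writes buf0, idma_task3 reads buf0, odma_task3 writes buf1, and eodma_task1 finally reads buf1.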
It can be understood that the tasks issued by the central processor 110 are varied and are not limited to the examples above. The execution of other tasks may follow the example execution processes of the foregoing embodiments and is not repeated here.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

  1. A hardware system, characterized in that it comprises: a central processing unit, a data buffer, a cache access unit and a task manager;
    wherein the central processing unit is connected to the task manager and the cache access unit;
    wherein the data buffer is connected to the task manager and the cache access unit;
    wherein the central processing unit is configured to issue at least one cache access task to the task manager and the cache access unit;
    the cache access unit is configured to execute the at least one cache access task so as to access the data buffer;
    the task manager is configured to monitor, based on the at least one cache access task, the execution of the at least one cache access task by the cache access unit.
  2. The system according to claim 1, wherein the at least one cache access task comprises a cache access task T1 and a cache access task T0, the cache access task T0 being a cache access task on which the cache access task T1 depends;
    the task manager is specifically configured to, after receiving from the cache access unit an access request q1 for the cache access task T1, decide, based on the completion status of the cache access task T0, whether responding to the access request q1 is allowed;
    if it is decided that responding to the access request q1 is allowed, the task manager sends to the cache access unit an access response aq1 for responding to the access request q1, the access response aq1 indicating that the cache access unit is allowed to execute the cache access task T1 on the data buffer.
  3. The system according to claim 2, wherein, in deciding, based on the completion status of the cache access task T0, whether responding to the access request q1 is allowed,
    the task manager is specifically configured to: when the cache access task T0 has been completed, decide that responding to the access request q1 is allowed; when the cache access task T0 has not been completed, compare the pointer of the cache region that the cache access task T1 requests to access with the current pointer of the cache access task T0; if the pointer comparison passes, decide that responding to the access request q1 is allowed; if the pointer comparison fails, decide that responding to the access request q1 is not allowed.
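The gating decision of claim 3 can be sketched as follows. This is a minimal sketch under the assumption that "the pointer comparison passes" means the requested region lies below T0's current (already-produced) pointer; the function and parameter names are illustrative, not from the patent.

```python
# Sketch of the claim-3 decision: grant access request q1 for task T1 if its
# dependency T0 has finished, or if T0's current pointer has already advanced
# past the region T1 requests (assumed semantics of "pointer comparison").

def allow_access(t0_done, t1_request_ptr, t0_current_ptr):
    """Return True if the task manager may respond to access request q1."""
    if t0_done:
        return True                      # T0 finished: no hazard remains
    # T0 still running: grant only data T0 has already produced.
    return t1_request_ptr < t0_current_ptr
```

This lets a consumer task start reading the portion of a buffer its producer has already written, instead of waiting for the producer to finish entirely.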
  4. The system according to claim 3, wherein the data buffer comprises a plurality of cache slices, and the cache access task T1 comprises the following fields:
    a first field for indicating the slice identifier of the cache slice accessed, a second field for indicating the start address within the cache slice accessed, a third field for indicating the length of the data accessed, and a field for indicating the cache access task T0 on which the cache access task T1 depends.
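The four fields of claim 4 map naturally onto a task descriptor. The field names and widths below are hypothetical; the claim specifies only what each field indicates, not its encoding.

```python
# Hypothetical in-memory form of a cache access task descriptor carrying the
# four fields enumerated in claim 4.
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheAccessTask:
    slice_id: int                 # first field: cache slice accessed
    start_addr: int               # second field: start address in the slice
    length: int                   # third field: length of the data accessed
    depends_on: Optional[int]     # task id of the dependency T0, if any
```

For example, a task reading 128 units from offset 0x40 of slice 1, gated on task 0, would be `CacheAccessTask(1, 0x40, 128, 0)`.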
  5. The system according to claim 3 or 4, wherein the access request q1 comprises the following fields: the first field and a fourth field for indicating the unit identifier of the cache access unit that issued the access request q1;
    or,
    the access request q1 comprises the following fields: the second field, the third field and the fourth field;
    or, the access request q1 comprises the following fields: the second field, the third field and a fifth field for indicating the task identifier of the cache access task T1.
  6. The system according to any one of claims 1 to 5, wherein the cache access unit comprises the following access sub-units: an external data read sub-unit, an external data write-back sub-unit, an internal data read sub-unit and an internal data write-back sub-unit;
    the hardware system further comprises an external memory and an arithmetic unit;
    wherein the external data read sub-unit and the external data write-back sub-unit are connected between the external memory and the data buffer; the internal data read sub-unit and the internal data write-back sub-unit are connected between the arithmetic unit and the data buffer;
    wherein the external data read sub-unit is configured to store data read from the external memory into the data buffer;
    the external data write-back sub-unit is configured to store data read from the data buffer into the external memory;
    wherein the internal data read sub-unit is configured to provide data read from the data buffer to the arithmetic unit for operation;
    the internal data write-back sub-unit is configured to store result data obtained by the operation of the arithmetic unit into the data buffer.
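The four sub-units of claim 6 each move data in one direction. The table below summarizes this; the short names eidma/eodma/idma/odma follow the examples in the description and are an assumed mapping onto the claim's descriptive terms.

```python
# Illustrative direction table for the four DMA sub-units of claim 6:
# each entry maps a sub-unit to its (source, destination) pair.
SUBUNITS = {
    "eidma": ("external memory", "data buffer"),    # external data read
    "eodma": ("data buffer", "external memory"),    # external data write-back
    "idma":  ("data buffer", "arithmetic unit"),    # internal data read
    "odma":  ("arithmetic unit", "data buffer"),    # internal data write-back
}
```

Note the symmetry: the e-prefixed pair bridges the external memory and the data buffer, while the other pair bridges the data buffer and the arithmetic unit, so the data buffer sits at the center of every transfer.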
  7. The system according to claim 6, wherein the central processing unit is connected through a bus to the task manager, the external data read sub-unit, the external data write-back sub-unit, the internal data read sub-unit and the internal data write-back sub-unit;
    wherein the central processing unit is specifically configured to issue, through the bus, the cache access task of the external data read sub-unit to the external data read sub-unit, the cache access task of the external data write-back sub-unit to the external data write-back sub-unit, the cache access task of the internal data read sub-unit to the internal data read sub-unit, and the cache access task of the internal data write-back sub-unit to the internal data write-back sub-unit; and to issue to the task manager the cache access tasks of the external data read sub-unit, the external data write-back sub-unit, the internal data read sub-unit and the internal data write-back sub-unit.
  8. The system according to any one of claims 5 to 7, characterized in that
    the task manager comprises: a cache access controller and a cache access task queue;
    wherein the cache access task queue is configured to store the cache access tasks, issued by the central processing unit, of the external data read sub-unit, the external data write-back sub-unit, the internal data read sub-unit and the internal data write-back sub-unit;
    wherein, in deciding whether responding to the access request q1 is allowed based on the completion status of the cache access task T0 on which the cache access task T1 depends, the cache access controller is configured to:
    when the cache access task T0 on which the cache access task T1 depends has been completed, decide that responding to the access request q1 is allowed;
    when the cache access task T0 has not been completed, compare the pointer of the cache region that the cache access task T1 requests to access with the current pointer of the cache access task T0; if the pointer comparison passes, decide that responding to the access request q1 is allowed; if the pointer comparison fails, decide that responding to the access request q1 is not allowed.
  9. The system according to claim 8, wherein the cache access task queue comprises a first sub-queue, a second sub-queue, a third sub-queue and a fourth sub-queue:
    wherein the first sub-queue is configured to store the cache access tasks of the external data read sub-unit issued by the central processing unit;
    the second sub-queue is configured to store the cache access tasks of the external data write-back sub-unit issued by the central processing unit;
    the third sub-queue is configured to store the cache access tasks of the internal data read sub-unit issued by the central processing unit;
    the fourth sub-queue is configured to store the cache access tasks of the internal data write-back sub-unit issued by the central processing unit;
    wherein the cache access controller comprises: a first sub-controller, a second sub-controller, a third sub-controller and a fourth sub-controller:
    the first sub-controller is the cache access controller of the external data read sub-unit; the second sub-controller is the cache access controller of the external data write-back sub-unit; the third sub-controller is the cache access controller of the internal data read sub-unit; and the fourth sub-controller is the cache access controller of the internal data write-back sub-unit.
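The one-to-one pairing of sub-queues and sub-controllers in claim 9 amounts to a per-sub-unit dispatch table. The sketch below summarizes that pairing; the string keys and the `route` helper are illustrative names, since the claim defines hardware structures rather than a software interface.

```python
# Hypothetical dispatch table: each access sub-unit owns its own task
# sub-queue and its own cache access sub-controller, as enumerated in claim 9.
ROUTING = {
    "external data read":       ("first sub-queue",  "first sub-controller"),
    "external data write-back": ("second sub-queue", "second sub-controller"),
    "internal data read":       ("third sub-queue",  "third sub-controller"),
    "internal data write-back": ("fourth sub-queue", "fourth sub-controller"),
}


def route(subunit):
    """Return the (sub-queue, sub-controller) pair handling a sub-unit's tasks."""
    return ROUTING[subunit]
```

Dedicating a queue and controller to each sub-unit lets the four task streams be monitored independently, so one sub-unit's stalled task does not block the dependency checks of the other three.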
  10. An electronic device, characterized in that it comprises: a housing and a hardware system accommodated in the housing, the hardware system being the hardware system according to any one of claims 1 to 9.
PCT/CN2018/124854 2018-09-11 2018-12-28 Hardware system and electronic device WO2020052171A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811056568.X 2018-09-11
CN201811056568.XA CN110888675B (en) 2018-09-11 2018-09-11 Hardware system and electronic device

Publications (1)

Publication Number Publication Date
WO2020052171A1 true WO2020052171A1 (en) 2020-03-19

Family

ID=69745493

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124854 WO2020052171A1 (en) 2018-09-11 2018-12-28 Hardware system and electronic device

Country Status (2)

Country Link
CN (1) CN110888675B (en)
WO (1) WO2020052171A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101997777A (en) * 2010-11-16 2011-03-30 福建星网锐捷网络有限公司 Interruption processing method, device and network equipment
CN102073481A (en) * 2011-01-14 2011-05-25 上海交通大学 Multi-kernel DSP reconfigurable special integrated circuit system
CN105511964A (en) * 2015-11-30 2016-04-20 华为技术有限公司 I/O request processing method and device
US20160147532A1 (en) * 2014-11-24 2016-05-26 Junghi Min Method for handling interrupts

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
US5943500A (en) * 1996-07-19 1999-08-24 Compaq Computer Corporation Long latency interrupt handling and input/output write posting
EP1330725B1 (en) * 2000-09-29 2012-03-21 Alacritech, Inc. Intelligent network storage interface system and devices
JP4390694B2 (en) * 2004-12-24 2009-12-24 富士通株式会社 DMA circuit and disk array device using the same
US8458380B2 (en) * 2008-03-26 2013-06-04 Qualcomm Incorporated Off-line task list architecture utilizing tightly coupled memory system
EP2691883A1 (en) * 2011-03-28 2014-02-05 Citrix Systems Inc. Systems and methods of utf-8 pattern matching
US9804971B2 (en) * 2012-01-17 2017-10-31 International Business Machines Corporation Cache management of track removal in a cache for storage
CN102929714B (en) * 2012-10-19 2015-05-13 国电南京自动化股份有限公司 uC/OS-II-based hardware task manager
CN103902364B (en) * 2012-12-25 2018-10-30 腾讯科技(深圳)有限公司 A kind of physical resource management method, apparatus and intelligent terminal
US9519591B2 (en) * 2013-06-22 2016-12-13 Microsoft Technology Licensing, Llc Latch-free, log-structured storage for multiple access methods
CN105677455A (en) * 2014-11-21 2016-06-15 深圳市中兴微电子技术有限公司 Device scheduling method and task administrator
US10713210B2 (en) * 2015-10-13 2020-07-14 Microsoft Technology Licensing, Llc Distributed self-directed lock-free RDMA-based B-tree key-value manager
US10235171B2 (en) * 2016-12-27 2019-03-19 Intel Corporation Method and apparatus to efficiently handle allocation of memory ordering buffers in a multi-strand out-of-order loop processor


Also Published As

Publication number Publication date
CN110888675A (en) 2020-03-17
CN110888675B (en) 2021-04-06


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18933683; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18933683; Country of ref document: EP; Kind code of ref document: A1)