CN112513809A - Processor, task response method, movable platform and camera - Google Patents

Publication number
CN112513809A
Authority
CN
China
Prior art keywords
processor
special
module
response
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980050197.0A
Other languages
Chinese (zh)
Inventor
雍振强
董岚
杨富强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd
Publication of CN112513809A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30105 Register structure
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

A processor (300), a task response method, a movable platform, a camera, and a computer-readable storage medium. The processor (300) comprises a bus interface (306), a control register (304), and a controller (302). The bus interface (306) can be coupled to a plurality of dedicated modules (308, 310, 312, 314) external to the processor (300) and receives a plurality of task completion requests relating to those dedicated modules (308, 310, 312, 314). The control register (304) stores a plurality of response modes in which the processor (300) responds to the dedicated modules (308, 310, 312, 314), the response modes being different from one another. The processor (300) responds to the task completion requests of the dedicated modules (308, 310, 312, 314) according to the corresponding response modes in the control register (304). The processor, the task response method, the movable platform, the camera, and the computer-readable storage medium can improve system efficiency.

Description

Processor, task response method, movable platform and camera
Technical Field
The present application relates to the field of computer technologies, and in particular, to a processor, a task response method, a movable platform, a camera, and a computer-readable storage medium.
Background
As technology develops, the set of instructions or functions that a Central Processing Unit (CPU) can execute remains limited. Relying solely on the CPU to process certain specific tasks is inefficient. Therefore, to improve the execution efficiency of the CPU and extend system functionality, a coprocessor unit or an accelerated processing unit with a specific function (hereinafter referred to as a dedicated module) is added around the CPU, so that the CPU and the dedicated module operate in parallel and improve the efficiency of the whole system. However, the interaction between the CPU and the dedicated module may interrupt the task being processed by the CPU, thereby reducing system efficiency.
Disclosure of Invention
Embodiments of the present invention provide a processor, a task response method, a movable platform, a camera, and a computer-readable storage medium.
According to a first aspect of embodiments of the present application, there is provided a processor including a bus interface and a control register. The bus interface can be coupled to a plurality of dedicated modules outside the processor and is used for receiving a plurality of task completion requests relating to the dedicated modules. The control register is used for storing a plurality of response modes in which the processor responds to the dedicated modules, the response modes being different from one another. The processor is coupled to the control register and responds to the task completion requests of the dedicated modules according to the corresponding response modes in the control register.
According to a second aspect of embodiments of the present application, there is provided a task response method, including: receiving a plurality of task completion requests corresponding to a plurality of dedicated modules that are coupled to a bus interface and located outside a processor; acquiring a plurality of response modes in which the processor responds to the dedicated modules, and storing the response modes in a control register, wherein the response modes are different from one another; and responding to the task completion requests of the dedicated modules according to the corresponding response modes in the control register.
According to a third aspect of embodiments of the present application, there is provided a computer-readable storage medium having computer instructions stored thereon. When the computer instructions are executed by one or more processors, the one or more processors perform acts comprising: receiving a plurality of task completion requests corresponding to a plurality of dedicated modules that are coupled to a bus interface and located outside the processor; acquiring a plurality of response modes in which the processor responds to the dedicated modules, and storing the response modes in a control register, wherein the response modes are different from one another; and responding to the task completion requests according to the corresponding response modes in the control register.
According to a fourth aspect of embodiments of the present application, there is provided a movable platform. The movable platform comprises the processor mentioned in the first aspect of the embodiments of the present application.
According to a fifth aspect of embodiments of the present application, there is provided a camera. The camera comprises the processor mentioned in the first aspect of the embodiments of the present application.
According to the processor, the task response method, the movable platform, the camera and the computer readable storage medium provided by the embodiment of the invention, the system efficiency can be improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a diagram of a processor and application specific modules according to an embodiment of the invention;
FIG. 2 is a flow diagram of an interrupt response scheme of the processor shown in FIG. 1;
FIG. 3 is a schematic diagram of a processor and a plurality of application specific modules according to another embodiment of the present invention;
FIG. 4 is a schematic diagram of the control register configuration shown in FIG. 1;
FIG. 5 is a schematic diagram of a query instruction of the processor shown in FIG. 1;
FIG. 6a is a flow diagram of the processor shown in FIG. 1 responding to a specialized module in a query mode according to one embodiment of the invention;
FIG. 6b is a flow diagram of the processor shown in FIG. 1 responding to a specialized module in a query mode according to one embodiment of the invention;
FIG. 7a is a schematic diagram of pseudo code for the processor shown in FIG. 1 using a query instruction in accordance with one embodiment of the present invention;
FIG. 7b is a schematic diagram of pseudo code for the processor shown in FIG. 1 using a query instruction according to another embodiment of the invention;
FIG. 8 is a schematic diagram of a detection unit in the processor shown in FIG. 1;
FIG. 9 is a schematic diagram of a detection module in the processor shown in FIG. 1;
FIG. 10 is a flowchart of a task response method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As technology develops, the number of instructions or functions that a Central Processing Unit (CPU) can execute remains limited, and relying solely on the CPU to handle certain tasks is inefficient. Therefore, to improve the execution efficiency of the CPU and extend system functionality, common systems today add a coprocessor unit or an accelerated processing unit with a specific function (hereinafter referred to as a dedicated module) around the CPU, so that the CPU and the dedicated module work in parallel and improve the efficiency of the whole system.
Please refer to FIG. 1. FIG. 1 is a diagram of a processor and dedicated modules according to an embodiment of the invention. A processor 106 (e.g., a CPU) exchanges data with the memory 102 via the system bus 104. In addition, the processor 106 exchanges data with various dedicated modules via an internal bus 110. That is, structurally, these dedicated modules are connected to the processor through the internal bus 110 to receive instructions or configuration signals from the processor. These dedicated modules can also send completion requests to the processor after completing certain tasks, and they are connected to each other through the internal bus 110. The dedicated modules may be a graphics processing module (GPU) 120 with a graphics processing function, a vector processing module (vector processing unit) 112, a floating point processing module (FPU) 114, a Direct Memory Access (DMA) module 108, or a similar dedicated processing unit capable of handling a special computing task, such as an Artificial Intelligence (AI) accelerator 118 or a Fast Fourier Transform (FFT) module 116. These dedicated modules may extend the processing functionality of the processor through an extended instruction set or by providing configuration registers. However, the present invention is not limited thereto, and may include dedicated modules that realize other functions.
Take data transfer from the processor 106 to the memory 102 as an example. A DMA 108 is added between the processor 106 and the memory 102 as a dedicated module. The processor 106 sends configuration information to the DMA 108 to set the configuration registers in the DMA 108, informing the DMA 108 of information such as the address in the memory 102 from which data needs to be transferred and the size of the data to be transferred. After the configuration is completed, the DMA 108 transfers data to the memory 102 via the system bus 104 in accordance with the configuration information from the processor 106. The processor 106 does not intervene while the DMA 108 is transferring data and can therefore process other tasks. When the DMA 108 finishes the data transfer task, it issues an interrupt request to the processor 106. Upon receiving the interrupt request, the processor 106 stops the other tasks it is processing in order to respond to the request. In the interrupt service function corresponding to the interrupt request, the processor 106 processes the memory data carried back by the DMA 108, and then continues to execute the interrupted task with the result and state of the current operation.
The above-described manner in which the processor 106 responds to the DMA108 after the DMA108 task is completed is referred to as an interrupt response manner. Please refer to fig. 2. FIG. 2 is a flow chart of an interrupt response scheme of the processor shown in FIG. 1. In the interrupt response mode, as shown in step S202, the processor (e.g., CPU) configures a dedicated module (e.g., DMA) and transmits a configuration signal or other instruction regarding the dedicated module to the dedicated module. At this point, the special purpose module receives the configuration signal from the processor and configures the configuration register in the special purpose module, and the special purpose module receives other instructions to perform operations related to these instructions, as shown in step S222. In one embodiment, the processor sends a configuration signal to a special purpose module (e.g., a DMA) for configuring a configuration register in the special purpose module.
In step S204, the processor performs other tasks. Meanwhile, as shown in step S224, the dedicated module starts operating once its configuration register has been configured, that is, it begins performing the operations or processes associated with the configuration signal or other instructions. After completing those operations or processes, the dedicated module issues a task completion request to the processor, as shown in step S226.
In step S206, although the processor is executing other tasks, once the processor receives a task completion request issued by the application-specific module, the other tasks being executed by the processor are interrupted. At this point, the processor enters an interrupt service routine.
After the interrupt service routine has been executed, the processor resumes the other tasks that were interrupted, as shown in step S208.
After the interrupted tasks have been completed, the processor enters an idle state or executes a new task, as shown in step S210.
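The flow of FIG. 2 can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the function name `run_interrupt_flow`, the step counts, and the log strings are hypothetical and do not come from the application; time is modeled as discrete steps of the processor's other task.

```python
def run_interrupt_flow(main_task_steps, module_busy_steps):
    """Return an event log for the interrupt response mode (hypothetical model).

    main_task_steps: number of steps in the processor's other task.
    module_busy_steps: number of steps the dedicated module needs to finish.
    """
    log = [
        "S202: processor configures dedicated module",
        "S222: module sets its configuration register",
        "S204: processor starts other task",
    ]
    remaining = module_busy_steps  # steps until the module finishes
    serviced = False
    for step in range(main_task_steps):
        if remaining == 0 and not serviced:
            # The completion request preempts the task that is running.
            log.append("S226: module issues task completion request")
            log.append(f"S206: main task interrupted before step {step}, ISR runs")
            log.append("S208: processor resumes interrupted task")
            serviced = True
        log.append(f"task step {step}")
        if remaining > 0:
            remaining -= 1  # S224: module works in parallel with the processor
    # A request arriving after the main task ends is not modeled here.
    log.append("S210: processor idle or starts a new task")
    return log


if __name__ == "__main__":
    for event in run_interrupt_flow(main_task_steps=4, module_busy_steps=2):
        print(event)
```

In this model the other task is always preempted as soon as the module finishes, whatever step it has reached.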
It should be noted that there are some disadvantages when the processor interacts with the application-specific module in an interrupt-responsive manner, as follows:
First, when the dedicated module completes its task and issues an interrupt request, the processor must respond to that request no matter what it is doing, which means the task currently being processed must be interrupted. To some extent, this interaction and response pattern therefore interferes with the progress of the task being processed by the processor. Taking data transfer by DMA as an example, if each transfer is small but the number of transfers is large, the task being processed by the processor is interrupted again and again, which reduces system efficiency.
Second, when the dedicated module completes its task, interrupting the processor is pointless if the task currently processed by the processor does not depend on the result produced by the dedicated module. Taking DMA as an example, if the task being processed by the processor does not need the data transferred by the DMA, there is no benefit in the processor responding to the DMA interrupt request at that moment. Moreover, if the task being processed by the processor has strict real-time requirements, responding to the interrupt request of the dedicated module degrades the real-time performance of that task.
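The first disadvantage can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the per-interrupt cost, and the cycle counts are hypothetical figures chosen for the example, not measurements from the application.

```python
def interrupt_overhead_fraction(num_interrupts, cycles_per_interrupt, useful_cycles):
    """Fraction of processor cycles spent entering, servicing and leaving interrupts.

    All three inputs are hypothetical figures supplied by the caller.
    """
    overhead = num_interrupts * cycles_per_interrupt
    return overhead / (overhead + useful_cycles)


if __name__ == "__main__":
    # Many small DMA transfers: 1000 interrupts of 200 cycles each
    # against 800000 cycles of useful work on the interrupted task.
    print(interrupt_overhead_fraction(1000, 200, 800_000))
```

With these assumed numbers one fifth of all cycles goes to interrupt handling, which is the kind of loss the passage above describes.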
Based on the above problems, embodiments of the present application provide a processor, a task response method, a movable platform, a camera, and a computer-readable storage medium, which can improve system efficiency. The processor, the task response method, and the computer-readable storage medium are applicable to any system having an acceleration unit or a co-processing unit. For example, the processor and the task response method may be applied to a networked device such as a mobile phone, a computer, a tablet, or a PDA (Personal Digital Assistant). Alternatively, they may be applied to devices such as a movable platform (e.g., an unmanned aerial vehicle, an unmanned vehicle, or an unmanned ship), a camera, a sweeping robot, or a smart speaker.
According to an embodiment of the present invention, there is provided a processor including a bus interface and a control register. The bus interface can be coupled to a plurality of dedicated modules outside the processor and is used for receiving a plurality of task completion requests relating to the dedicated modules. The control register is used for storing a plurality of response modes in which the processor responds to the dedicated modules, the response modes being different from one another. The processor responds to the task completion requests of the dedicated modules according to the corresponding response modes in the control register.
According to another embodiment of the present invention, a movable platform is provided. The movable platform comprises the processor.
According to another embodiment of the present invention, a camera is provided. The camera comprises the processor.
It should be noted that the present invention is not limited to this. The processor may further include a controller, coupled to the control register, for responding to the task completion requests of the dedicated modules according to the corresponding response modes in the control register. Please refer to FIG. 3. FIG. 3 is a schematic diagram of a processor and a plurality of dedicated modules according to another embodiment of the present invention. Processor 300 includes a controller 302, a control register 304, and a bus interface 306. The bus interface 306 can be coupled to a plurality of dedicated modules outside the processor 300 and is configured to receive a plurality of task completion requests relating to the dedicated modules. The control register 304 stores a plurality of response modes in which the processor 300 responds to the dedicated modules, and these response modes are different from one another. The controller 302 is coupled to the control register 304 and is configured to respond to the task completion requests of the dedicated modules according to the corresponding response modes in the control register 304.
In FIG. 3, a processor 300 (e.g., a CPU) is connected to N Special Function Units (SFUs) via an internal bus. The system numbers the N dedicated modules and ensures that no two dedicated modules share a number. As shown in FIG. 3, the N dedicated modules are dedicated modules SFU1 308, SFU2 310, SFU3 312, ..., SFUN 314. As can be seen from FIG. 3, this embodiment requires no change to the way the internal bus is connected. In another embodiment, the system may classify the N dedicated modules, for example according to their functions and uses, and number them according to the classification result. For example, if there are dedicated modules with two different functions, the modules with different functions are numbered with different codes. If there are a plurality of dedicated modules A1, A2 and A3 having function A and a plurality of dedicated modules B1, B2 and B3 having function B, dedicated modules A1, A2 and A3 are encoded as 000, 001 and 010, and dedicated modules B1, B2 and B3 are encoded as 100, 101 and 110. That is, the dedicated modules A1, A2 and A3 and the dedicated modules B1, B2 and B3 are encoded with 3 bits. When the highest bit of the number of a dedicated module is "0", the module has function A; when the highest bit is "1", the module has function B. Alternatively, the highest bit "1" may indicate function A and the highest bit "0" may indicate function B.
That is, dedicated modules with different functions can be distinguished by the value of the highest bit of their numbers. For example, if there are a plurality of graphics processing modules and a plurality of artificial intelligence accelerators, the highest bit of each graphics processing module's number is encoded as "0" and the highest bit of each artificial intelligence accelerator's number is encoded as "1". However, the present invention is not limited thereto. Other coding schemes that can classify the dedicated modules by their codes also fall within the protection scope of the invention.
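The classified numbering can be sketched as follows, using the codes 000 to 110 from the example. The helper names `encode_module` and `decode_function`, the 1-based index, and the chosen polarity of the highest bit are assumptions made for illustration; the text explicitly allows the opposite polarity and other coding schemes.

```python
# Assumed polarity: highest bit 0 means function A, 1 means function B.
CLASS_BIT = {"A": 0, "B": 1}


def encode_module(function, index, width=3):
    """Return the binary code of the index-th (1-based) module of a function class."""
    low_bits = width - 1  # bits left for numbering modules inside one class
    if not 1 <= index <= 2 ** low_bits:
        raise ValueError("index does not fit in the low-order bits")
    value = (CLASS_BIT[function] << low_bits) | (index - 1)
    return format(value, f"0{width}b")


def decode_function(code):
    """Recover the function class from the highest bit of a code string."""
    return "A" if code[0] == "0" else "B"


if __name__ == "__main__":
    for function in ("A", "B"):
        for index in (1, 2, 3):
            code = encode_module(function, index)
            print(f"{function}{index} -> {code} (function {decode_function(code)})")
```

Reading only the highest bit is enough to tell which kind of module raised a request, without looking up the full number.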
According to one embodiment of the invention, the processor responds to the N dedicated modules in a plurality of response modes, for example an interrupt mode and a query mode. In the query mode, the processor (e.g., a CPU) checks whether a dedicated module has completed its task by executing a query instruction. If the dedicated module has not completed the task, the processor enters a suspended state and waits; if the dedicated module is found to have completed the task, the processor proceeds to execute the subsequent instructions. Because the point at which the processor executes the query is determined by the algorithm or system requirements, it can be ensured that, when the query is executed, the processor is either idle or urgently needs the calculation result of the dedicated module.
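A minimal model of the query mode might look like the sketch below. The class `DedicatedModule`, the function `query`, and the cycle-based timing are hypothetical; in particular, the state the processor enters while the module is unfinished is modeled here as a simple wait loop, which is an assumption about the behavior rather than something the text specifies.

```python
class DedicatedModule:
    """Hypothetical dedicated module that needs a fixed number of cycles."""

    def __init__(self, busy_cycles):
        self.busy_cycles = busy_cycles

    def tick(self):
        """Advance the module by one cycle of work."""
        if self.busy_cycles > 0:
            self.busy_cycles -= 1

    @property
    def done(self):
        return self.busy_cycles == 0


def query(module):
    """Model of a query instruction: wait until the module is done.

    Returns the number of cycles the processor spent waiting.
    """
    waited = 0
    while not module.done:
        module.tick()  # the module keeps working while the processor waits
        waited += 1
    return waited


if __name__ == "__main__":
    print(query(DedicatedModule(busy_cycles=3)))  # module still busy: processor waits
    print(query(DedicatedModule(busy_cycles=0)))  # module already done: no waiting
```

Because the program decides where the query is placed, the wait happens only at a point where the result is actually needed.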
In the processor 300, a Control Register (CR) 304 is added, and the bit width of the control register 304 is N bits. The bits of the control register 304 correspond one-to-one to the N dedicated modules. That is, each bit of the control register 304 corresponds to one of the dedicated modules SFU1 308, SFU2 310, SFU3 312, ..., SFUN 314.
In one embodiment, the control register 304 includes a first bit. The first bit stores a first response mode in which the processor 300 responds to a first dedicated module among the plurality of dedicated modules (e.g., the N dedicated modules SFU1 308, SFU2 310, SFU3 312, ..., SFUN 314). When the first bit is "0", the first response mode is the interrupt mode; when the first bit is "1", the first response mode is the query mode. Alternatively, when the first bit is "1", the first response mode is the interrupt mode; when the first bit is "0", the first response mode is the query mode. That is, the two response modes are represented by different bit values.
According to an embodiment of the invention, each bit of control register 304 may be configured as a logic 1 or a logic 0. When a bit of the control register 304 is configured to be logic 1, the response mode of the processor 300 responding to the corresponding special-purpose module is the query mode. That is, this bit is now configured in query mode. When a bit of the control register 304 is configured to be logic 0, the processor 300 responds to the corresponding application-specific module in an interrupt manner. That is, this bit is now configured in interrupt mode.
In another embodiment, when a bit of the control register 304 is configured to be logic 0, the processor 300 responds to the corresponding dedicated module in the query mode. That is, this bit is configured in query mode. When a bit of the control register 304 is configured to be logic 1, the processor 300 responds to the corresponding dedicated module in the interrupt mode. That is, this bit is configured in interrupt mode.
According to an embodiment of the present invention, the interrupt mode and the query mode may coexist among the N dedicated modules. When configuring the control register 304, the bit corresponding to each dedicated module must be set to the appropriate logic value. The processor 300 then determines the response mode for each of the N dedicated modules from the logic value of the corresponding bit in the control register 304. In one embodiment, a bit configured as logic 1 means the processor responds to the corresponding dedicated module in the query mode, and a bit configured as logic 0 means the processor responds in the interrupt mode. If the processor 300 responds to dedicated modules SFU1 308 and SFU3 312 in the query mode and to dedicated modules SFU2 310 and SFUN 314 in the interrupt mode, the bits corresponding to SFU1 308 and SFU3 312 are configured as logic 1, and the bits corresponding to SFU2 310 and SFUN 314 are configured as logic 0. In another embodiment, a bit configured as logic 0 means the processor responds to the corresponding dedicated module in the query mode, and a bit configured as logic 1 means the processor responds in the interrupt mode.
In that case, if the processor 300 responds to dedicated modules SFU1 308 and SFU3 312 in the query mode and to dedicated modules SFU2 310 and SFUN 314 in the interrupt mode, the bits corresponding to SFU1 308 and SFU3 312 are configured as logic 0, and the bits corresponding to SFU2 310 and SFUN 314 are configured as logic 1.
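Configuring the control register can be sketched as packing one bit per dedicated module. The function name `build_control_register`, the string labels, and the placement of module k at bit k-1 are assumptions for illustration; the text only requires a one-to-one correspondence between bits and modules. The `query_bit` parameter covers both polarities described above.

```python
def build_control_register(modes, query_bit=1):
    """Pack per-module response modes into an integer register value.

    modes: list where modes[i] is "query" or "interrupt" for module i+1
           (assumed mapping: module k uses bit k-1).
    query_bit: logic value that encodes the query mode (1 or 0).
    """
    value = 0
    for i, mode in enumerate(modes):
        if mode not in ("query", "interrupt"):
            raise ValueError(f"unknown response mode: {mode!r}")
        bit = query_bit if mode == "query" else 1 - query_bit
        value |= bit << i
    return value


if __name__ == "__main__":
    # Four modules: the first and third use the query mode, the others the interrupt mode.
    modes = ["query", "interrupt", "query", "interrupt"]
    print(format(build_control_register(modes, query_bit=1), "04b"))
    print(format(build_control_register(modes, query_bit=0), "04b"))
```

Switching the polarity simply inverts every bit, so the same mode assignment yields complementary register values.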
Referring to FIG. 4, FIG. 4 is a schematic diagram of the control register configuration shown in FIG. 1. As shown in FIG. 4, dedicated modules SFU1 420, SFU2 422 and SFU4 426 correspond to bits 402, 404 and 408 in the control register, respectively, and the logic values of bits 402, 404 and 408 are all 1. Dedicated modules SFU3 424 and SFU5 428 correspond to bits 406 and 410 in the control register, respectively, and the logic values of bits 406 and 410 are both 0. If logic 1 represents the query mode and logic 0 represents the interrupt mode, the processor responds to dedicated modules SFU1 420, SFU2 422 and SFU4 426 in the query mode and to dedicated modules SFU3 424 and SFU5 428 in the interrupt mode. That is, the processor interacts with SFU1 420, SFU2 422 and SFU4 426 by query and with SFU3 424 and SFU5 428 by interrupt. Conversely, if logic 0 represents the query mode and logic 1 represents the interrupt mode, the processor responds to dedicated modules SFU1 420, SFU2 422 and SFU4 426 in the interrupt mode and to dedicated modules SFU3 424 and SFU5 428 in the query mode.
That is, the processor interacts with SFU1 420, SFU2 422 and SFU4 426 by interrupt and with SFU3 424 and SFU5 428 by query. Therefore, configuring through the control register the mode in which the CPU responds to each of the N dedicated modules is the precondition and basis for giving the system a flexible task response scheme.
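The configuration of FIG. 4 can be decoded under both polarities with a short sketch. The dictionary `FIG4_BITS` copies the bit values given in the text (bits 402, 404, 406, 408, 410 for SFU1 to SFU5); the function name `decode_modes` and the string labels are hypothetical.

```python
# Bit values from the FIG. 4 description, keyed by dedicated module.
FIG4_BITS = {"SFU1": 1, "SFU2": 1, "SFU3": 0, "SFU4": 1, "SFU5": 0}


def decode_modes(bits, query_bit=1):
    """Map each module's control bit to its response mode.

    query_bit: logic value that encodes the query mode (1 or 0).
    """
    return {
        name: ("query" if bit == query_bit else "interrupt")
        for name, bit in bits.items()
    }


if __name__ == "__main__":
    print(decode_modes(FIG4_BITS, query_bit=1))
    print(decode_modes(FIG4_BITS, query_bit=0))
```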
It should be noted that the control register may include not only the bits indicating the response mode in which the processor responds to each dedicated module, but also other bits that implement further functions of that response. For example, the control register may include a plurality of bits that store a response time of the processor for at least one of the task completion requests. When the processor responds to a dedicated module in the interrupt mode, then after the processor receives at least one task completion request from that dedicated module through the bus interface, the controller in the processor delays its response to the request, and the delay time is the response time.
In one embodiment, the response time is a preset time. In another embodiment, the response time may be adjusted in real time based on the task completion status and the task priority of the processor. For example, if the task currently being processed by the processor has the highest priority, then even if the processor responds to a given dedicated module in the interrupt mode, the controller does not respond immediately after receiving a task completion request from that module but delays the response. The delay time is set according to the system's estimate of the completion time of the task currently being processed, so it can be adjusted in real time as that estimate changes. This embodiment gives the processor greater flexibility: the highest-priority task is processed in time, while the processor still responds to the dedicated module efficiently. Alternatively, to simplify the calculation, the response time may be preset, for example based on the maximum time required to complete a task. The above embodiments are only for explaining the present invention and are not intended to limit it.
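The choice of delay can be sketched as a small decision function. The name `response_delay`, its parameters, and the time units are hypothetical. Returning 0 when the current task is not the highest-priority one is an assumption drawn from the ordinary behavior of the interrupt mode, since the text only describes the highest-priority case.

```python
def response_delay(is_highest_priority, estimated_remaining, preset_delay=None):
    """Delay before the controller answers a task completion request in interrupt mode.

    is_highest_priority: True if the task being processed has the highest priority.
    estimated_remaining: system estimate of the time the current task still needs.
    preset_delay: fixed response time; if given, it replaces the estimate.
    """
    if not is_highest_priority:
        return 0  # assumed: ordinary interrupt behavior, respond immediately
    if preset_delay is not None:
        return preset_delay  # simplified variant with a preset response time
    return estimated_remaining  # variant adjusted in real time


if __name__ == "__main__":
    print(response_delay(True, estimated_remaining=120))
    print(response_delay(True, estimated_remaining=120, preset_delay=500))
    print(response_delay(False, estimated_remaining=120))
```

The preset variant trades accuracy for simplicity: it may wait longer than necessary, but it needs no running estimate of the current task.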
In another embodiment, if the task currently being processed by the processor is about to be completed, then even if the processor responds to a special module in the interrupt mode, the controller in the processor does not respond immediately after receiving a task completion request from that special module. Instead, the controller delays responding to the task completion request by the response time, which may be preset. However, if the task currently being processed by the processor has only just begun, and the processor responds to the special module in the interrupt mode, the controller responds immediately after receiving the task completion request from that special module. In this case, both the delay time and the response time are 0. In this embodiment, the point in time at which the processor responds to the special module may be decided according to how far the task currently being processed has progressed. Both the task currently processed by the processor and the task completion request of the special module are thereby taken into account.
According to a further embodiment of the invention, the control register comprises other bits that store a response condition of the processor for the plurality of task completion requests. When the processor responds to a special module in the interrupt mode, the controller responds to the task completion request of the special module only if the response condition is met. The response condition is that the processor has received a task completion request from the special module through the bus interface and the processor has finished processing the currently executed task. This embodiment ensures that the task currently processed by the processor is not disturbed by the task completion request of the special module. It applies to any situation in which the task currently being processed by the processor does not depend on the task completed by the special module.
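As an illustration of how the mode bit, the response time bits, and the response condition might share one control register, consider the sketch below. The source does not specify bit positions or field widths, so the layout (`MODE_BIT`, `TIME_SHIFT`, `TIME_MASK`, `COND_BIT`) and the function names are assumptions made only for this example.

```python
# Hypothetical field layout; the source does not fix bit positions or widths.
MODE_BIT = 0      # 0 = interrupt mode, 1 = query mode (one possible polarity)
TIME_SHIFT = 1    # start of an assumed 8-bit response time field
TIME_MASK = 0xFF
COND_BIT = 9      # 1 = respond only after the current task has finished


def pack_control(query_mode: bool, response_time: int, wait_for_task: bool) -> int:
    """Build the control register value for one special module."""
    if not 0 <= response_time <= TIME_MASK:
        raise ValueError("response_time does not fit the assumed 8-bit field")
    return ((int(query_mode) << MODE_BIT)
            | (response_time << TIME_SHIFT)
            | (int(wait_for_task) << COND_BIT))


def should_respond(control: int, request_received: bool,
                   current_task_done: bool) -> bool:
    """Evaluate the response condition for one special module."""
    if (control >> MODE_BIT) & 1:
        # Query mode: the request is only recorded until a query instruction runs.
        return False
    if not request_received:
        return False
    if (control >> COND_BIT) & 1:
        # Interrupt mode with condition: also wait for the current task to finish.
        return current_task_done
    return True
```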
In one embodiment, if the processor responds to the special module in the query mode, the controller issues at least one query instruction for the special module to determine whether the special module has issued a task completion request. According to an embodiment of the present invention, the at least one query instruction includes at least one of an instruction type, an encoded number of the special module to be queried, a first address of the special modules to be queried, a count of the special modules to be queried, and a scalable flag. The scalable flag indicates whether the size of the at least one query instruction is scalable. In one embodiment, the scalable flag indicates whether the size of the query instruction may vary according to the number of special modules to which the query instruction corresponds.
In one embodiment, assuming that the number of special modules is N, each special module number needs to be encoded using log2(N) bits (rounded up to a whole number of bits). For example, if the number N of special modules is 32, each special module number is encoded using 5 bits. For a CPU with an instruction encoding width of 32 bits, one instruction can therefore accommodate one or more special module numbers. According to embodiments of the present invention, three applicable encoding modes for the query instruction are provided. Please refer to FIG. 5. FIG. 5 is a diagram illustrating query instructions of the processor shown in FIG. 1. In the encoding mode 500, the query instruction contains only one special module number num_i to be queried. In the encoding mode 502, the query instruction encodes several special module numbers at a time (e.g., special module numbers num_i, num_j, num_k, num_p, and num_q). Unlike the encoding mode 500, the encoding mode 502 makes full use of the reserved space ("reserved") in the instruction encoding. In this example, assuming that the query instruction itself requires 7 bits to encode the instruction type (denoted "opcode" in FIG. 5), 25 bits of encoding space remain. Thus, in this embodiment, one query instruction may encode the numbers of up to 5 special modules. In the encoding mode 504, a first address (denoted "start_id" in FIG. 5) of the special modules to be queried and the count (denoted "length" in FIG. 5) of special modules to be queried are defined. The special module numbers (e.g., special module addresses) queried in the encoding mode 504 are therefore start_id, start_id+1, start_id+2, ..., start_id+length-1.
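The arithmetic behind the three encoding modes can be checked with a short sketch. The 32-bit word, the 7-bit opcode, and the 5-bit module numbers come from the example above; the opcode values, the exact field positions, and the rule of padding unused slots in mode 502 by repeating the last number are assumptions introduced for illustration only.

```python
WORD_BITS = 32
OPCODE_BITS = 7
NUM_MODULES = 32
ID_BITS = (NUM_MODULES - 1).bit_length()         # 5 bits for 32 modules
MAX_IDS = (WORD_BITS - OPCODE_BITS) // ID_BITS   # 5 numbers fit in 25 bits
ID_MASK = (1 << ID_BITS) - 1

# Hypothetical opcode values, one per encoding mode.
OP_SINGLE, OP_LIST, OP_RANGE = 0x01, 0x02, 0x03


def _check(num):
    if not 0 <= num < NUM_MODULES:
        raise ValueError("module number out of range")
    return num


def encode_single(num):                          # encoding mode 500
    return OP_SINGLE | (_check(num) << OPCODE_BITS)


def encode_list(nums):                           # encoding mode 502
    if not 1 <= len(nums) <= MAX_IDS:
        raise ValueError("need between 1 and MAX_IDS module numbers")
    # Assumption: unused slots repeat the last number, which is harmless.
    padded = list(nums) + [nums[-1]] * (MAX_IDS - len(nums))
    word = OP_LIST
    for slot, num in enumerate(padded):
        word |= _check(num) << (OPCODE_BITS + slot * ID_BITS)
    return word


def encode_range(start_id, length):              # encoding mode 504
    if length < 1 or start_id < 0 or start_id + length > NUM_MODULES:
        raise ValueError("range outside the module numbers")
    return OP_RANGE | (start_id << OPCODE_BITS) | (length << (OPCODE_BITS + ID_BITS))


def decode(word):
    """Return the list of module numbers a query instruction refers to."""
    opcode = word & ((1 << OPCODE_BITS) - 1)
    body = word >> OPCODE_BITS
    if opcode == OP_SINGLE:
        return [body & ID_MASK]
    if opcode == OP_LIST:
        slots = [(body >> (i * ID_BITS)) & ID_MASK for i in range(MAX_IDS)]
        return list(dict.fromkeys(slots))        # drop padding duplicates, keep order
    if opcode == OP_RANGE:
        start_id, length = body & ID_MASK, body >> ID_BITS
        return list(range(start_id, start_id + length))
    raise ValueError("not a query instruction")
```

For instance, `decode(encode_range(4, 3))` yields the module numbers 4, 5, and 6, matching the start_id and length description of mode 504.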
It should be noted that the above three encoding modes of the query instruction may be used together in one embodiment; the processor (for example, a CPU) decodes the query instruction and determines the numbers of the special modules to be queried according to the instruction type of the query instruction. However, the present invention is not limited thereto: only one of the three encoding modes, or any combination of them, may be used to increase the flexibility of the query instruction.
Please refer to FIG. 6a and FIG. 6b. FIG. 6a is a flowchart of the processor shown in FIG. 1 responding to a special module in the query mode according to an embodiment of the invention. FIG. 6b is a flowchart of the processor shown in FIG. 1 responding to a special module in the query mode according to another embodiment of the invention. In the query mode, the processor determines by dynamic scheduling when to insert a query instruction, according to the requirements of the algorithm, the idle state of the CPU, and the progress of the algorithm's execution. For example, the controller issues the at least one query instruction when a task being executed by the processor requires the processing result of a task executed by the first special module. Alternatively, the controller issues the at least one query instruction when the processor is in an idle state.
The processor may perform other tasks after assigning tasks to the plurality of application specific modules. In fig. 6a, a processor (e.g., CPU) configures a dedicated module (e.g., DMA or other dedicated module) and sends a configuration signal or other instruction regarding the dedicated module to the dedicated module, as shown in step S602. At this time, the special purpose module receives the configuration signal from the processor and configures the configuration register in the special purpose module, and the special purpose module receives other instructions to perform operations related to these instructions, as shown in step S620. In one embodiment, the processor sends a configuration signal to a special purpose module (e.g., a DMA) for configuring a configuration register in the special purpose module.
In step S604, the processor performs other tasks. Meanwhile, as shown in step S622, the special module starts operating once its configuration register has been configured. For example, the special module begins performing the operations or processing associated with the configuration signal or other instructions. After completing the operations or processing associated with the configuration signal or other instructions sent by the processor, the special module issues a task completion request to the processor, as shown in step S624.
In step S606, the processor continues to execute other tasks until the processor completes the executing task and enters an idle state.
After entering the idle state, the processor executes a query instruction, as shown in step S608. In one embodiment, when the processor detects a completion request issued by the application specific processing module, the processor processes the task completed by the application specific module.
After the task completed by the special module is processed, the processor executes a new task, as shown in step S610. In another embodiment, the processor enters an idle state after processing the task completed by the special module.
In fig. 6b, the processor (e.g., CPU) configures the dedicated module (e.g., DMA) and sends a configuration signal or other instruction regarding the dedicated module to the dedicated module, as shown in step S632. At this time, the special purpose module receives the configuration signal from the processor and configures the configuration register in the special purpose module, and the special purpose module receives other instructions to perform operations related to these instructions, as shown in step S650. In one embodiment, the processor sends a configuration signal to a special purpose module (e.g., a DMA) for configuring a configuration register in the special purpose module.
In step S634, the processor performs other tasks. Meanwhile, as shown in step S652, the special module starts operating once its configuration register has been configured. For example, the special module begins performing the operations or processing associated with the configuration signal or other instructions.
As shown in step S636, if the processor finds that the executing task needs the calculation result of the special module, the processor executes the query instruction, suspends the execution of the currently executing task, and waits for the task completion request of the special module. After completing the operations or processes associated with the configuration signal or other instructions issued by the processor, the application specific module issues a task complete request to the processor, as shown in step S654.
In step S638, after receiving the task completion request issued by the dedicated module, the processor continues to execute the other tasks in step S634 (i.e., the tasks whose execution is suspended in step S636). In one embodiment, the processor enters an idle state after it has performed the above tasks.
The processor executes the new task as shown in step S640.
It should be noted that, in the above embodiments, the special module is one of a graphics processing module, a vector calculation module, a floating point processing module, a direct memory access module, an artificial intelligence accelerator, and a fast Fourier transform module. Other special modules that interact with the processor may also be used. In FIG. 6a, the special module has already completed its task when the query instruction is executed, so the processor (e.g., CPU) does not enter a suspended state. In FIG. 6b, however, the algorithm shows that the processor currently needs the calculation result of the special module for its other tasks, so the processor executes the query instruction and waits for the task completion request of the special module. Unlike in the interrupt mode, in the query mode the processor executes the query instruction only when it has finished its other tasks and entered an idle state, or when it urgently needs the calculation result of the special module. The query mode can therefore improve the execution efficiency of the system.
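The difference between the two query-mode flows can be made concrete with a small tick-based simulation. It is only a sketch under simplifying assumptions: one processor task, one special module, and abstract time ticks. The function `simulate` and its parameters are hypothetical, and the case in which the processor becomes idle before the module has finished is an extension not shown in FIG. 6a.

```python
def simulate(cpu_ticks, module_ticks, need_result_at=None):
    """Return (stall_ticks, finish_tick) for one CPU task and one module task.

    cpu_ticks      -- ticks of other work the processor has to do
    module_ticks   -- tick at which the special module issues its completion request
    need_result_at -- amount of work after which the processor needs the module's
                      result (None: query only when idle, as in the FIG. 6a flow)
    """
    tick = work_done = stall = 0
    while work_done < cpu_ticks:
        module_done = tick >= module_ticks
        blocked = (need_result_at is not None
                   and work_done >= need_result_at
                   and not module_done)
        if blocked:
            stall += 1        # FIG. 6b flow: the query instruction waits for the request
        else:
            work_done += 1    # the processor keeps executing its own task
        tick += 1
    # The processor is idle now and executes the query instruction.
    if tick < module_ticks:   # assumed extension: the module is still busy
        stall += module_ticks - tick
        tick = module_ticks
    return stall, tick
```

In the first flow, `simulate(10, 4)` gives no stall, because the module finished long before the processor became idle. In the second flow, `simulate(10, 6, need_result_at=3)` stalls for three ticks, which is the waiting time that a well-chosen query point tries to minimize.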
In one embodiment, when the processor and the first special module cooperate to complete the same task and the processor responds to the first special module in the query mode, the point in time at which the processor will need the processing result of the first special module is estimated, and the point in time at which the controller issues at least one query instruction for the first special module is determined from that estimate, so as to reduce the waiting time of the processor. This also improves the computational efficiency of the processor.
The processor may further comprise a process state register for marking a process state of a process handled by said processor. Further, the controller dynamically adjusts the response mode of the processor responding to the plurality of special modules according to the marked process state of the process state register, and sends the adjusted response mode to the control register. In this embodiment, the processor is capable of adjusting the response of the processor to the plurality of application specific modules in real time.
In one embodiment, the value of the flag status register is set to "1" if the task currently being processed by the processor is the highest-priority task, or if it has a higher priority than the plurality of special modules. When the value of the flag status register is "1", the processor does not respond to any special module. The same scheme may also be implemented with the opposite logic value: the value of the flag status register is set to "0" under the same conditions, and the processor then does not respond to any special module.
In another embodiment, the value of the flag status register is set to "1" if the task currently being processed by the processor is the lowest-priority task, or if it has a lower priority than the plurality of special modules. The processor then responds to a special module immediately whenever it receives a task completion request from that special module. The same scheme may also be implemented with the opposite logic value: the value of the flag status register is set to "0" under the same conditions, and the processor then responds to a special module immediately whenever it receives a task completion request from that special module.
Please refer to FIG. 7a and FIG. 7b. FIG. 7a is a schematic diagram of pseudo code in which the processor shown in FIG. 1 uses query instructions according to one embodiment of the invention. FIG. 7b is a schematic diagram of pseudo code in which the processor shown in FIG. 1 uses query instructions according to another embodiment of the invention. Multiple query instructions may be used either non-consecutively or consecutively. As shown in FIG. 7a, pseudo code 700 illustrates query instructions that are executed non-consecutively or individually. Specifically, in code lines 702 to 706, the special modules i to k are first configured. In code line 708, the processor performs other tasks. In code line 710, the processor executes query instruction 1. After code line 710 is executed, the processor executes some other code (not shown). In code line 712, the processor executes query instruction 2. After code line 712 is executed, the processor executes some other code (not shown). In code line 714, the processor executes other instructions.
As shown in FIG. 7b, pseudo code 720 illustrates a plurality of query instructions that are executed consecutively. Specifically, in code lines 722 to 726, the special modules i to k are first configured. In code line 728, the processor performs other tasks. In code line 730, the processor executes query instruction 1. After executing code line 730, the processor executes code line 732, in which it executes query instruction 2. After code line 732 is executed, the processor executes some other code (not shown). In code line 734, the processor executes other instructions. Query instruction 1 and query instruction 2 are thus two query instructions executed consecutively. In one embodiment, because the processor uses a plurality of query instructions consecutively, the processor can execute the subsequent instructions only after all the special modules specified by the query instructions have reported their task completion requests.
According to an embodiment of the invention, the same query instruction may be used for multiple special modules. Such a query instruction need not indicate the ID (e.g., the special module number), address information, or other attribute information of a special module. Instead, it queries whether any of the plurality of special modules has issued a task completion request, and if any special module has done so, the processor responds to that task completion request. This reduces the number of instruction types and saves instruction resources and the space for storing query instructions. Furthermore, the same query instruction may be used for a plurality of special modules of the same type. In this case, identification information for the special modules of that type, such as their numbers or attribute flags, needs to be added to the query instruction. When a plurality of special modules of the same type process certain tasks in parallel, only one instruction needs to be issued, which improves instruction utilization. In another embodiment, multiple query instructions may be used so that different query instructions apply to different types of special modules. However, the present invention is not limited thereto. In another embodiment, the uses of query instructions described above may also be combined to make the query instructions more flexible.
When the processor executes the query instruction, the numbers of the special modules to be queried are recorded in the pipeline, and independent registers in the processor record whether each special module to be queried has issued a task completion request. For a processor (e.g., a CPU) with a 3-stage pipeline architecture, the pipeline may be divided into instruction fetch, decode, and execution stages. After the query instruction reaches the execution stage, the numbers of the special modules to be queried are recorded at the execution stage. Different encoding modes of the query instruction require different methods of recording the special modules.
According to an embodiment of the invention, the processor further comprises one or more detection units; the number of detection units may be determined by the hardware design. According to one embodiment of the invention, the processor includes a detection module that comprises at least one detection unit. The detection unit comprises a comparator and a data register. The data register stores a first special module number corresponding to a first special module of the plurality of special modules. When the detection unit receives a second task completion request, the detection unit determines the second special module number of the second task completion request and compares it, through the comparator, with the first special module number in the data register to determine the special module to which the second task completion request corresponds. The detection unit further comprises a result register, in which the detection unit stores the comparison results. When the controller issues a query instruction, the controller responds to the second task completion request according to the value in the result register.
The detection module further comprises an AND gate, wherein the AND gate is used for receiving a plurality of first comparison results from a plurality of first detection units, and the AND gate performs AND operation on the plurality of first comparison results to determine whether the plurality of special modules respectively send out corresponding task completion requests.
Please refer to FIG. 8. FIG. 8 is a schematic diagram of a detection unit in the processor shown in FIG. 1. As shown in FIG. 8, the processor further includes a detection unit 800. The detection unit 800 includes a comparator 802 (e.g., a comparison circuit), a data register 804, and a result register 806. After a special module issues a task completion request, the comparator 802 in the detection unit 800 detects whether the special module number corresponding to the task completion request is consistent with the special module number num stored in the data register 804 of the detection unit 800. If the special module number stored in the data register 804 is consistent with the special module number corresponding to the task completion request, the detection result is valid and is recorded in the result register 806. If the special module number stored in the data register 804 is not consistent with the special module number corresponding to the task completion request, the detection result is invalid.
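The behavior of one detection unit can be modeled in a few lines. This is a software analogy of the comparator, data register, and result register described above, written as a sketch; the class and method names are hypothetical, and a real implementation would be a hardware circuit.

```python
class DetectionUnit:
    """Software model of a detection unit: comparator + data register + result register."""

    def __init__(self, module_number):
        self.data_register = module_number   # number of the special module to be queried
        self.result_register = False         # becomes True once a matching request is seen

    def on_completion_request(self, module_number):
        """Compare an incoming request's module number with the data register."""
        if module_number == self.data_register:   # comparator: numbers are consistent
            self.result_register = True            # valid detection result is recorded
        return self.result_register
```

A request from another module leaves the result register unchanged, so `DetectionUnit(3).on_completion_request(5)` returns `False`.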
According to another embodiment of the present invention, the processor further comprises a flag register which, when the processor responds to a first special module of the plurality of special modules in the query mode, stores a record of whether the first special module has issued the first task completion request. For example, the processor includes a flag register A corresponding to the special module SFU0. If it is detected that the special module SFU0 has issued a task completion request, the value of flag register A is set to "1"; otherwise, the value of flag register A is "0". Alternatively, if it is detected that the special module SFU0 has issued a task completion request, the value of flag register A is set to "0"; otherwise, the value of flag register A is "1". Flag register A may be the result register 806. However, the flag register may also be another register capable of storing a record of whether the special module has issued a task completion request.
According to an embodiment of the present invention, if a plurality of response modes of the processor responding to the plurality of special purpose modules are query modes, the controller sends a query instruction about the plurality of special purpose modules to determine whether the plurality of special purpose modules send out a plurality of task completion requests. If all task completion requests have been issued, the processor executes tasks associated with the plurality of specialized processing modules.
Please refer to FIG. 9. FIG. 9 is a schematic diagram of a detection module in the processor shown in FIG. 1. The processor further comprises a detection module 900 for detecting whether a plurality of special processing modules (e.g., special processing modules SFU0, SFU1, ..., SFUN) have issued task completion requests. The detection module 900 includes detection units 906 to 914. Each of the detection units 906 to 914 includes a comparator (e.g., a comparison circuit), a data register, and a result register. The data register stores the number of the special module to be queried, and the result register stores the comparison result of the comparator. For example, the detection unit 906 includes a comparator 904, a data register 916, and a result register 920. The data register 916 stores the special module number num_i. The detection units 908 to 914 have a configuration similar to that of the detection unit 906 and, for brevity, are not described further.
After any one of the special modules SFU0, SFU1, SFU2, ..., and SFUN issues a task completion request, the comparator in each of the detection units 906 to 914 detects whether the special module number corresponding to the task completion request is consistent with the special module number stored in the data register of that detection unit. If the special module number stored in the data register of at least one of the detection units 906 to 914 is consistent with the special module number corresponding to the task completion request, the detection result is valid and is recorded in the corresponding result register. If none of the special module numbers stored in the data registers of the detection units 906 to 914 is consistent with the special module number corresponding to the task completion request, the detection result is invalid. When the result registers of all the detection units are valid, all the special modules to be queried in the detection module have completed their calculations. At this point, the output of AND gate 902 is valid: the processor outputs a query completion signal, the query instruction leaves the execution stage, and the processor may continue to execute subsequent instructions.
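The interplay of several detection units and the AND gate can be sketched as below. The model keeps one data register and one result register per detection unit and treats the AND gate as a logical AND over all result registers; the class `DetectionModule` and its method names are hypothetical and stand in for the hardware only by analogy.

```python
class DetectionModule:
    """Software model of a detection module with several units and an AND gate."""

    def __init__(self, module_numbers):
        # One data register and one result register per detection unit.
        self.data_registers = list(module_numbers)
        self.result_registers = [False] * len(self.data_registers)

    def on_completion_request(self, module_number):
        """Every unit compares the incoming number with its own data register."""
        for i, stored in enumerate(self.data_registers):
            if stored == module_number:
                self.result_registers[i] = True

    def query_complete(self):
        """AND gate: valid only when every result register is valid."""
        return all(self.result_registers)


module = DetectionModule([0, 2, 5])   # query modules 0, 2 and 5
module.on_completion_request(2)
module.on_completion_request(7)       # module 7 is not being queried and is ignored
pending = not module.query_complete() # modules 0 and 5 have not reported yet
```

Only after modules 0 and 5 have also reported does `query_complete()` become true, at which point the query instruction would leave the execution stage.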
Please refer to FIG. 10. FIG. 10 is a flowchart of a task response method according to an embodiment of the present invention. The task response method can be applied to interactions between a processor and special modules. A special module may be a graphics processing module (GPU), a vector processing module (Vector Processing Unit), a floating point processing module (FPU), a direct memory access module (DMA), or a similar special processing unit capable of handling a specific computing task, such as an artificial intelligence (AI) accelerator or a fast Fourier transform (FFT) module. Structurally, the special modules are connected to the processor by an internal bus and are interconnected with one another by the internal bus. These special modules receive instructions or configuration signals from the CPU and can also send completion requests to the CPU after certain tasks are completed.
As shown in fig. 10, the task response method includes steps S1002 to S1006.
In step S1002, a plurality of task completion requests corresponding to a plurality of application specific modules, which are coupled to the bus interface and located outside the processor, are received;
in step S1004, a plurality of response modes of the processor responding to the plurality of special modules are acquired, and the plurality of response modes are stored in the control register; wherein the plurality of response modes are different; and
in step S1006, a plurality of task completion requests of the plurality of application specific modules are responded according to the corresponding response modes in the control register.
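Steps S1002 to S1006 can be summarized in a small software model. It is a sketch under the assumption that interrupt-mode requests are served at once and query-mode requests are only recorded until a query is made; the class `TaskResponder`, its methods, and the module names used in the example are hypothetical.

```python
INTERRUPT, QUERY = "interrupt", "query"


class TaskResponder:
    """Software model of the task response method (steps S1002 to S1006)."""

    def __init__(self, response_modes):
        # S1004: the response mode of each special module is kept in a control register.
        self.control_register = dict(response_modes)
        self.completed = set()   # completion records for query-mode modules

    def receive(self, module_ids):
        """S1002 and S1006: receive completion requests and respond per mode.

        Returns the modules that are served immediately (interrupt mode)."""
        served = []
        for module_id in module_ids:
            if self.control_register[module_id] == INTERRUPT:
                served.append(module_id)
            else:
                self.completed.add(module_id)   # remembered until a query is made
        return served

    def query(self, module_ids):
        """True when every queried module has reported completion."""
        return all(module_id in self.completed for module_id in module_ids)


responder = TaskResponder({"dma": INTERRUPT, "fft": QUERY, "gpu": QUERY})
served_now = responder.receive(["dma", "fft"])   # only "dma" is served at once
```

Keeping the mode per module in one table mirrors the idea that different special modules can be handled in different response modes at the same time.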
In one embodiment, the plurality of response modes include an interrupt mode and a query mode; and the task response method comprises the following steps: when the response mode of the processor responding to the first special module in the plurality of special modules is the query mode, a record about whether the first special module sends out the first task completion request is stored.
In one embodiment, a response time corresponding to a processor with respect to at least one task completion request of a plurality of task completion requests is stored using a plurality of bits; the response time is preset time; and when the first response mode of the processor responding to the first special module is the interrupt mode, after receiving at least one task completion request from the first special module through the bus interface, delaying the response of the at least one task completion request, wherein the delay time is response time.
In one embodiment, a plurality of bits are utilized to store response conditions corresponding to a processor with respect to a plurality of task completion requests; the response condition is that the processor receives a task completion request of a first special module from the first special module in the plurality of special modules through the bus interface and the processor finishes processing a currently executed task; and when the first response mode of the processor responding to the first special module is the interrupt mode, if the response condition is met, responding to the task completion request of the first special module.
In one embodiment, a first bit is used to store a first response mode corresponding to the processor responding to a first special module in the plurality of special modules; and wherein, when the first bit is 0, the first response mode is an interrupt mode; when the first bit is 1, the second response mode is a query mode; or when the first bit is 1, the first response mode is an interrupt mode; when the first bit is 0, the second response mode is the query mode.
In one embodiment, if the response mode of the processor responding to the first application specific module is the query mode, at least one query instruction related to the first application specific module is sent to determine whether the first application specific module has issued at least one first task completion request.
In one embodiment, when a task being executed by a processor requires a processing result of a task executed by a first application-specific module, the processor (e.g., a controller in the processor) issues at least one query instruction; or when the processor is in an idle state, the processor (for example, a controller in the processor) sends out at least one inquiry instruction.
In one embodiment, the at least one query instruction includes at least one of an instruction type, an encoded special module number, a first address of the special modules, a count of the special modules, and a scalable flag; the scalable flag indicates whether the size of the at least one query instruction is scalable.
In one embodiment, a first special module number corresponding to a plurality of special modules is stored through a data register; and when the detection unit receives the second task completion request, determining a second special module number of the second task request through the detection unit, and comparing the second special module number with the first special module number in the data register through the comparator to determine a special module corresponding to the second task completion request.
In one embodiment, a plurality of comparison results are stored in a result register; and responding to the second task request according to values in the plurality of result registers when a processor (e.g., a controller in the processor) sends a query instruction.
In one embodiment, a plurality of comparison results are received, and the comparison results are subjected to and operation to determine whether the plurality of dedicated modules respectively send out corresponding task completion requests.
In one embodiment, if the plurality of response modes of the processor responding to the plurality of special modules are query modes, sending query instructions about the plurality of special modules to determine whether the plurality of special processing modules send out a plurality of task completion requests; if all task completion requests have been issued, tasks associated with the plurality of specialized processing modules are executed.
In one embodiment, when the processor and a first application specific module of the plurality of application specific modules cooperate to complete the same task, if a response mode of the processor responding to the first application specific module is a query mode, a time point at which the processor needs to utilize a processing result of the first application specific module is estimated, and a time point at which the processor (for example, a controller in the processor) sends at least one query instruction related to the first application specific module is determined according to the time point, so as to reduce the waiting time of the processor.
In one embodiment, the process state of a process processed by a processor is marked; and according to the marked process state of the process state register, dynamically adjusting the response mode of the processor responding to the plurality of special modules, and storing the adjusted response mode.
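The dynamic adjustment can be illustrated as follows. This passage does not say which process states map to which response modes, so the policy here is purely an assumption: while a process is marked critical (a hypothetical state), the selected modules are switched to the query mode so that they do not interrupt it. The encoding of 0 for the interrupt mode and 1 for the query mode is one of the two encodings recited in claim 5.

```python
INTERRUPT, QUERY = 0, 1  # per-module bit values in the control register


def adjust_control_register(control_register: int,
                            process_is_critical: bool,
                            module_bits: list) -> int:
    """Return the control register value after adjusting the given bits.

    Assumed policy (not from the application): critical process ->
    query mode; otherwise -> interrupt mode.
    """
    for bit in module_bits:
        if process_is_critical:
            control_register |= (1 << bit)    # set bit: QUERY
        else:
            control_register &= ~(1 << bit)   # clear bit: INTERRUPT
    return control_register
```

The returned value stands for the adjusted response modes that are then stored.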
There is also provided, in an embodiment of the present application, a computer-readable storage medium storing a computer program, the computer program including program instructions which, when executed by the processor, cause the processor to perform the following actions:
receiving a plurality of task completion requests corresponding to a plurality of special modules which are coupled with the bus interface and are positioned outside the processor;
acquiring a plurality of response modes of the processor responding to the plurality of special modules, and storing the plurality of response modes in a control register; wherein the plurality of response modes are different; and
responding to the plurality of task completion requests according to the corresponding response modes in the control register.
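The three actions above can be sketched as a small software model. This is an illustration under stated assumptions, not the implementation of this application: the class and method names are hypothetical, one control register bit per module is assumed with 0 meaning the interrupt mode and 1 meaning the query mode (one of the encodings in claim 5), and the flag register of claim 2 is modeled as an integer bit mask.

```python
INTERRUPT, QUERY = 0, 1


class Processor:
    def __init__(self, control_register: int):
        self.control_register = control_register  # one mode bit per module
        self.flag_register = 0                    # pending query-mode requests
        self.handled = []                         # modules responded to, in order

    def mode_of(self, module: int) -> int:
        return (self.control_register >> module) & 1

    def on_task_completion(self, module: int) -> None:
        """Receive a task completion request from a special module."""
        if self.mode_of(module) == INTERRUPT:
            self.handled.append(module)           # respond immediately
        else:
            self.flag_register |= 1 << module     # record it for a later query

    def query(self, module: int) -> bool:
        """Query instruction: respond if the module has reported completion."""
        pending = bool(self.flag_register & (1 << module))
        if pending:
            self.flag_register &= ~(1 << module)
            self.handled.append(module)
        return pending


cpu = Processor(control_register=0b10)  # module 0: interrupt, module 1: query
cpu.on_task_completion(0)
cpu.on_task_completion(1)
```

After these two requests, module 0 has already been responded to, while the request from module 1 stays recorded in the flag register until `cpu.query(1)` is called.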
In embodiments of the present application, the memory is used for storing a computer program and may be configured to store various other data to support operations on the device on which it is located. The processor may execute the computer program stored in the memory to implement the corresponding control logic. The memory may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
In the embodiments of the present application, the processor may be any hardware processing device that can execute the above-described method logic. Optionally, the processor may be a central processing unit (CPU), a graphics processing unit (GPU), or a micro control unit (MCU); a programmable device such as a field-programmable gate array (FPGA), a programmable array logic device (PAL), a generic array logic device (GAL), or a complex programmable logic device (CPLD); or an Advanced RISC Machine (ARM) processor or a system on chip (SoC), but is not limited thereto.
It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied in the medium.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (33)

1. A processor, comprising:
a bus interface capable of being coupled to a plurality of special modules external to the processor, for receiving a plurality of task completion requests with respect to the plurality of special modules; and
a control register for storing a plurality of response modes of the processor in response to the plurality of special modules, wherein the plurality of response modes are different;
wherein the processor responds to the plurality of task completion requests of the special modules according to the corresponding response modes in the control register.
2. The processor of claim 1,
the plurality of response modes comprise an interruption mode and a query mode; and
the processor further includes a flag register for storing a record of whether a first special module among the plurality of special modules has issued a first task completion request when the response mode of the processor in response to the first special module is the query mode.
3. The processor of claim 1,
the control register comprises a plurality of bits; wherein a plurality of the bits are used to store a response time corresponding to the processor with respect to at least one of a plurality of the task completion requests; and
when the first response mode of the processor responding to the first special module is an interrupt mode, after the processor receives the at least one task completion request from the first special module through the bus interface, the processor delays responding to the at least one task completion request, wherein the delay time is the response time.
4. The processor of claim 1,
the control register comprises a plurality of bits; wherein a plurality of the bits are used to store response conditions corresponding to a plurality of the task completion requests by the processor; and
when a first response mode of the processor responding to the first special module is an interrupt mode, if the response condition is met, the processor responds to the task completion request of the first special module;
wherein the response condition is that the processor receives a task completion request of a first special module from the first special module of the plurality of special modules through the bus interface and the processor finishes processing a currently executed task.
5. The processor of claim 1,
the control register includes a first bit for storing a first response mode corresponding to the processor responding to a first one of the plurality of special purpose modules; and
when the first bit is 0, the first response mode is an interrupt mode, and when the first bit is 1, the first response mode is a query mode; or
when the first bit is 1, the first response mode is an interrupt mode, and when the first bit is 0, the first response mode is a query mode.
6. The processor of claim 1,
if the response mode of the processor responding to the first special module is the query mode, the processor sends at least one query instruction related to the first special module to determine whether the first special module sends out a first task completion request.
7. The processor of claim 6,
when the task being executed by the processor needs the processing result of the task executed by the first special module, the processor sends out the at least one query instruction; or
when the processor is in an idle state, the processor sends out the at least one query instruction.
8. The processor of claim 6,
the at least one query instruction comprises at least one of an instruction type, a code of a special module number, a special module initial address, a special module number and a scalable flag; wherein the scalable flag is used to indicate whether the size of the at least one query instruction is scalable.
9. The processor of claim 1, further comprising:
a detection module comprising a detection unit, wherein the detection unit comprises a comparator and a data register, and the data register is used for storing a first special module number corresponding to a first special module in the special modules; and
when the detection unit receives a second task completion request, the detection unit determines a second special module number of the second task completion request, and compares the second special module number with the first special module number in the data register through the comparator to determine a special module corresponding to the second task completion request.
10. The processor of claim 9,
the detection unit further comprises a result register; and
the detection unit stores the comparison result in the result register, and
when the processor sends a query instruction, the processor responds to the second task completion request according to the value in the result register.
11. The processor of claim 9,
the detection module further comprises an AND gate, the AND gate is used for receiving a plurality of first comparison results from a plurality of first detection units, and the AND gate performs AND operation on the plurality of first comparison results to determine whether the plurality of special modules respectively send out corresponding task completion requests.
12. The processor of claim 1,
if the response mode of the processor responding to the special modules is the query mode, the processor sends query instructions about the special modules to determine whether the special processing modules send out task completion requests;
if all task completion requests have been issued, the processor executes tasks associated with the plurality of specialized processing modules.
13. The processor of claim 1,
when the processor and a first special module in the special modules cooperate to complete the same task, if the response mode of the processor responding to the first special module is the query mode, the time point of the processor needing to utilize the processing result of the first special module is estimated, and the time point of the processor sending at least one query instruction about the first special module is determined according to the time point, so that the waiting time of the processor is reduced.
14. The processor of claim 1, further comprising:
a process state register for marking a process state of a process processed by the processor; and
according to the process state marked in the process state register, the processor dynamically adjusts the response mode of the processor responding to the plurality of special modules and sends the adjusted response mode to the control register.
15. The processor of claim 1,
each of the special modules is one of a graphics processing module, a vector calculation module, a floating point processing module, a direct memory access module, an artificial intelligence accelerator and a fast Fourier transform module.
16. A task response method, comprising:
receiving a plurality of task completion requests corresponding to a plurality of special modules which are coupled with the bus interface and are positioned outside the processor;
acquiring a plurality of response modes of the processor responding to the special modules, and storing the response modes in a control register; wherein a plurality of the response modes are different; and
responding to the plurality of task completion requests of the plurality of special modules according to the corresponding response modes in the control register.
17. A task response method according to claim 16,
the plurality of response modes comprise an interruption mode and a query mode; and
the task response method comprises the following steps:
when the response mode of the processor responding to the first special module in the special modules is the query mode, the processor stores the record of whether the first special module sends out the first task completion request.
18. A task response method according to claim 16,
storing, with a plurality of bits, a response time corresponding to the processor with respect to at least one of the plurality of task completion requests, wherein the response time is a preset time; and
when the first response mode of the processor responding to the first special module is an interrupt mode, after the at least one task completion request is received from the first special module through the bus interface, the processor delays responding to the at least one task completion request, wherein the delay time is the response time.
19. A task response method according to claim 16,
storing, with a plurality of bits, response conditions corresponding to the processor with respect to a plurality of the task completion requests, wherein the response condition is that the processor receives a task completion request of a first special module from the first special module in the special modules through the bus interface and the processor finishes processing a currently executed task; and
when the first response mode of the processor responding to the first special module is an interrupt mode, responding to the task completion request of the first special module if the response condition is met.
20. A task response method according to claim 16,
storing, with a first bit, a first response mode corresponding to the processor responding to a first one of the plurality of specialized modules; and
when the first bit is 0, the first response mode is an interrupt mode, and when the first bit is 1, the first response mode is a query mode; or
when the first bit is 1, the first response mode is an interrupt mode, and when the first bit is 0, the first response mode is a query mode.
21. A task response method according to claim 16,
if the response mode of the processor responding to the first special module is the query mode, at least one query instruction related to the first special module is sent to determine whether the first special module sends out at least one first task completion request.
22. A task response method according to claim 21,
when the task being executed by the processor needs the processing result of the task executed by the first special module, the processor sends out the at least one query instruction; or
when the processor is in an idle state, the processor sends out the at least one query instruction.
23. A task response method according to claim 21,
the at least one query instruction comprises at least one of an instruction type, a code of a special module number, a special module initial address, a special module number and a scalable flag; wherein the scalable flag is used to indicate whether the size of the at least one query instruction is scalable.
24. A task response method according to claim 16,
storing, by a data register, a first special module number corresponding to a plurality of the special modules; and
when the detection unit receives a second task completion request, the detection unit determines a second special module number of the second task completion request, and the comparator compares the second special module number with the first special module number in the data register to determine a special module corresponding to the second task completion request.
25. A task response method according to claim 24,
storing a plurality of comparison results in result registers; and
when the processor sends a query instruction, responding to the second task completion request according to the values in the result registers.
26. A task response method according to claim 25,
receiving a plurality of comparison results, and performing an AND operation on the plurality of comparison results to determine whether the plurality of special modules have respectively sent their corresponding task completion requests.
27. A task response method according to claim 16,
if the response mode of the processor responding to the special modules is a query mode, sending query instructions about the special modules to determine whether the special processing modules send out a plurality of task completion requests;
and if all task completion requests are sent, executing tasks related to the special processing modules.
28. A task response method according to claim 16,
when the processor and a first special module in the special modules cooperate to complete the same task, if the response mode of the processor responding to the first special module is the query mode, the time point of the processor needing to utilize the processing result of the first special module is estimated, and the time point of the processor sending at least one query instruction about the first special module is determined according to the time point, so that the waiting time of the processor is reduced.
29. A task response method according to claim 16,
marking, in a process state register, a process state of a process processed by the processor; and
according to the process state marked in the process state register, dynamically adjusting the response mode of the processor responding to the plurality of special modules, and storing the adjusted response mode.
30. A task response method according to claim 16,
each of the special modules is one of a graphics processing module, a vector calculation module, a floating point processing module, a direct memory access module, an artificial intelligence accelerator and a fast Fourier transform module.
31. A computer-readable storage medium storing computer instructions, wherein when the computer instructions are executed by one or more processors, the one or more processors perform acts comprising:
receiving a plurality of task completion requests corresponding to a plurality of special modules which are coupled with the bus interface and are positioned outside the processor;
acquiring a plurality of response modes of the processor responding to the special modules, and storing the response modes in a control register; wherein a plurality of the response modes are different; and
responding to the plurality of task completion requests according to the corresponding response modes in the control register.
32. A movable platform comprising a processor according to any one of claims 1-15.
33. A camera comprising a processor according to any one of claims 1-15.
CN201980050197.0A 2019-12-27 2019-12-27 Processor, task response method, movable platform and camera Pending CN112513809A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/129100 WO2021128249A1 (en) 2019-12-27 2019-12-27 Processor, task response method, movable platform, and camera

Publications (1)

Publication Number Publication Date
CN112513809A true CN112513809A (en) 2021-03-16

Family

ID=74924085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980050197.0A Pending CN112513809A (en) 2019-12-27 2019-12-27 Processor, task response method, movable platform and camera

Country Status (2)

Country Link
CN (1) CN112513809A (en)
WO (1) WO2021128249A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5796984A (en) * 1996-01-26 1998-08-18 Dell Usa, L.P. Operating system independent apparatus and method for eliminating peripheral device functions
CN101221540A (en) * 2007-01-09 2008-07-16 国际商业机器公司 Reducing memory access latency for hypervisor- or supervisor-initiated memory access requests
CN104951412A (en) * 2015-06-06 2015-09-30 华为技术有限公司 Storage device capable of being accessed through memory bus

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6829697B1 (en) * 2000-09-06 2004-12-07 International Business Machines Corporation Multiple logical interfaces to a shared coprocessor resource
CN101567078B (en) * 2009-03-27 2011-06-22 西安交通大学 Dual-bus visual processing chip architecture
CN101980149B (en) * 2010-10-15 2013-09-18 无锡中星微电子有限公司 Main processor and coprocessor communication system and communication method
CN102141904B (en) * 2011-03-31 2014-02-12 杭州中天微系统有限公司 Data processor supporting interrupt shielding instruction
CN103019835A (en) * 2011-09-26 2013-04-03 同方股份有限公司 System and method for optimizing interruption resources in multi-core processor
CN205899270U (en) * 2016-06-23 2017-01-18 陕西宝成航空仪表有限责任公司 Two redundant ARINC429 bus interface systems of high reliability
US10120829B2 (en) * 2016-11-23 2018-11-06 Infineon Technologies Austria Ag Bus device with programmable address

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孙康;沈海斌;王继民;潘雪增;: "基于映像寄存器构建的实时操作系统内核", 清华大学学报(自然科学版), no. 2, 15 October 2007 (2007-10-15) *

Also Published As

Publication number Publication date
WO2021128249A1 (en) 2021-07-01

Similar Documents

Publication Publication Date Title
JP6776696B2 (en) Parallel information processing equipment, information processing methods, and programs
CN103975314B (en) Across multiple memory areas strong in order, the auto-sequencing of device and mutual exclusion affairs
JP5085178B2 (en) DMA controller and DMA transfer method
CN102906726A (en) Co-processing accelerating method, device and system
US7689734B2 (en) Method for toggling non-adjacent channel identifiers during DMA double buffering operations
US6154832A (en) Processor employing multiple register sets to eliminate interrupts
US20190065075A1 (en) Method to improve mixed workload performance on storage devices that use cached operations
US8244947B2 (en) Methods and apparatus for resource sharing in a programmable interrupt controller
WO2018000765A1 (en) Co-processor, data reading method, processor system and storage medium
CN102622274A (en) Computer device and interrupt task allocation method thereof
CN112513809A (en) Processor, task response method, movable platform and camera
EP4035016A1 (en) Processor and interrupt controller therein
WO2023151460A1 (en) Data processing method and apparatus, chip, and medium
WO2022160703A1 (en) Pooling method, and chip, device and storage medium
CN101303676A (en) Electronic system with direct memory access and method thereof
WO2021179222A1 (en) Scheduling device, scheduling method, accelerating system and unmanned aerial vehicle
CN107408061B (en) Parallel processing system (PPS)
CN105718993A (en) Cell array calculation system and communication method therein
CN103942165A (en) Data processing method, system and IO adapter based on multiprocessor
CN117931555B (en) Method and device for simulating SCSI equipment fault under kernel mode
CN111915014B (en) Processing method and device of artificial intelligent instruction, board card, main board and electronic equipment
CN110929857B (en) Data processing method and device of neural network
US20120246444A1 (en) Reconfigurable processor, apparatus, and method for converting code
US10534707B2 (en) Semiconductor device including plurality of bus masters and control device and program used in the semiconductor device
JPH08137703A (en) Task switching device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination