CN115756612A - Computing device, operating method and machine-readable storage medium - Google Patents

Info

Publication number
CN115756612A
Authority
CN
China
Prior art keywords
instruction
descriptor
address
address information
circuitry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211486231.9A
Other languages
Chinese (zh)
Inventor
请求不公布姓名
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Biren Intelligent Technology Co Ltd
Original Assignee
Shanghai Biren Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Biren Intelligent Technology Co Ltd
Priority to CN202211486231.9A
Publication of CN115756612A
Legal status: Pending

Abstract

The invention provides a computing device, an operating method thereof, and a machine-readable storage medium. The computing device includes an Execution Unit (EU) circuit and an Instruction Scheduler (IS) circuit. The IS circuit may issue instructions to the EU circuit for execution. The IS circuit includes a Descriptor Address Register (DAR). The DAR stores the address information of a resource descriptor. When the current instruction being processed by the IS circuit uses the resource descriptor, the IS circuit fetches the address information from the DAR local to the IS circuit, without triggering the EU circuit to read the address information of the resource descriptor from a scalar register file external to the IS circuit. After obtaining the address information of the resource descriptor, the IS circuit issues the current instruction using the resource descriptor to the EU circuit for execution based on the address information.

Description

Computing device, operating method and machine-readable storage medium
Technical Field
The present invention relates to electronic devices, and more particularly, to a computing device, an operating method, and a machine-readable storage medium.
Background
In general, a resource descriptor describes information (e.g., size) about a resource (e.g., an image or a texture). Instructions may access resources in memory based on resource descriptors, and an instruction may use one or more resource descriptors. For example, a load instruction, a store instruction, or an atomic instruction uses one resource descriptor, whereas a texture instruction typically uses two resource descriptors. A bindless resource descriptor is typically not pre-bound to any on-chip addressable memory. Bindless resource descriptors are typically fetched dynamically by a resource access instruction. For example, a memory access instruction must retrieve a resource descriptor before performing a memory access on the resource.
A Scalar Register File (SRF) pre-stores either the 64-bit full memory address of the resource descriptor (the address of the target resource in memory) or the 32-bit offset of the resource descriptor within a descriptor set. When an instruction uses a bindless resource descriptor to access memory, an Instruction Scheduler (IS) obtains the full memory address of the bindless resource descriptor from the SRF according to the SRF address provided by the instruction. In the first step of obtaining the resource descriptor address/offset, the instruction scheduler issues an additional hidden instruction to the Execution Unit (EU) to read the value (the address/offset of the resource descriptor) in the SRF. In the second step, the execution unit executes the hidden instruction issued by the instruction scheduler to read the address/offset of the resource descriptor from the SRF. In the third step, the execution unit passes the value read from the SRF (the address/offset of the resource descriptor) back to the instruction scheduler. If the value returned to the instruction scheduler is the offset of the resource descriptor within the descriptor set, the instruction scheduler may use the identifier (ID) of the descriptor set (or the base memory address of the descriptor set) and the offset returned by the execution unit to calculate the full memory address of the resource descriptor. Whether the instruction scheduler calculates the full memory address of the resource descriptor or the execution unit directly returns it to the instruction scheduler, once the full memory address of the resource descriptor has been determined, the instruction, when executed by the execution unit, may use the resource descriptor to access memory.
The problem is that the instruction scheduler must issue additional hidden instructions to the execution unit, which causes pipeline latency in the execution unit. Hidden instructions are additional instructions that are dynamically generated by the instruction scheduler rather than instructions of the instruction set. Hidden instructions are not in the source code of the program and therefore cannot be optimized during compilation of the source code. Furthermore, a hidden instruction is forced for every descriptor, so descriptors cannot be reused within the instruction scheduler, which places a burden on the hardware.
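The three-step prior-art flow described above can be sketched as a toy software model. This is only an illustration under stated assumptions: the class names `ExecutionUnit` and `PriorArtScheduler`, the method names, and the dictionary-based register file are hypothetical and do not come from the patent. The sketch shows that every descriptor use puts one extra hidden read into the execution unit pipeline.

```python
class ExecutionUnit:
    """Toy EU (hypothetical model) that logs every instruction it receives."""

    def __init__(self, srf):
        self.srf = dict(srf)   # scalar register file: index -> value
        self.log = []          # instructions seen by the EU pipeline

    def hidden_read(self, srf_index):
        # Steps 2 and 3: execute the hidden instruction and return the SRF value.
        self.log.append(("hidden_read", srf_index))
        return self.srf[srf_index]

    def execute(self, opcode, descriptor_address):
        self.log.append((opcode, descriptor_address))


class PriorArtScheduler:
    """Toy scheduler (hypothetical model) without a local descriptor register."""

    def __init__(self, eu, set_bases):
        self.eu = eu
        self.set_bases = dict(set_bases)  # descriptor set ID -> base address

    def issue(self, opcode, srf_index, set_id=None):
        # Step 1: a hidden instruction is issued for every descriptor use.
        value = self.eu.hidden_read(srf_index)
        # An offset is added to the descriptor set base; a full address is used as is.
        address = value if set_id is None else self.set_bases[set_id] + value
        self.eu.execute(opcode, address)
        return address
```

In this model, two instructions that use the same descriptor still cost two hidden reads, which is the overhead the background section describes.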
Disclosure of Invention
The invention provides a computing device, an operating method thereof and a machine-readable storage medium, which are used for more efficiently obtaining address information of a resource descriptor.
In an embodiment according to the invention, the computing device includes Execution Unit (EU) circuitry and Instruction Scheduler (IS) circuitry. The instruction scheduler circuit is coupled to the execution unit circuit and is used for transmitting instructions to the execution unit circuit for execution. The instruction scheduler circuitry includes Descriptor Address Registers (DARs). The descriptor address register is used for storing first address information of a first resource descriptor (resource descriptor). In the event that a current instruction being processed by the instruction scheduler circuitry uses the first resource descriptor, the instruction scheduler circuitry fetches first address information of the first resource descriptor from a descriptor address register stored locally to the instruction scheduler circuitry without triggering the execution unit circuitry to fetch the first address information of the first resource descriptor from a Scalar Register File (SRF) external to the instruction scheduler circuitry. After obtaining the first address information of the first resource descriptor, the instruction scheduler circuitry issues a current instruction using the first resource descriptor to the execution unit circuitry for execution based on the first address information.
In an embodiment according to the invention, the method of operation comprises: storing, by a descriptor address register of an instruction scheduler circuit of a computing device, first address information of a first resource descriptor; in the case where a current instruction processed by the instruction scheduler circuitry uses the first resource descriptor, fetching first address information of a descriptor address register stored locally to the instruction scheduler circuitry without triggering execution unit circuitry of the computing device to read the first address information of the first resource descriptor from a scalar register file external to the instruction scheduler circuitry; and after obtaining the first address information of the first resource descriptor, transmitting, by the instruction scheduler circuitry, a current instruction using the first resource descriptor to the execution unit circuitry for execution based on the first address information.
In an embodiment consistent with the invention, the machine-readable storage medium is to store non-transitory machine-readable instructions. The non-transitory machine-readable instructions, when executed by a computer, may implement a method of operation of the computing device.
Based on the above, the instruction scheduler circuitry is provided with a local descriptor address register. The instruction scheduler circuitry can directly access the address information stored in the local descriptor address register, thereby obtaining the address information of the resource descriptor more efficiently. Depending on the actual design and application context, in some embodiments the address information may include the full memory address of the resource descriptor or the offset of the resource descriptor within the descriptor set. Accordingly, the instruction scheduler circuitry does not need to issue additional hidden instructions to the execution unit circuitry, which avoids the execution unit pipeline latency caused by hidden instructions. When the address information of the second resource descriptor used by the following instruction is the same as the address information of the resource descriptor used by the current instruction, the instruction scheduler circuitry may reuse the address information stored in the descriptor address register. Thus, the instruction scheduler circuitry can make more efficient use of the address information stored in its local descriptor address register.
Drawings
Fig. 1 is a schematic block diagram of a computing device according to an embodiment of the invention.
Fig. 2 is a flow chart illustrating a method of operating a computing device according to an embodiment of the invention.
Description of the reference numerals
100: computing device
110: Instruction Scheduler (IS) circuit
111: Descriptor Address Register (DAR)
112: Base Address Register (BAR)
120: Execution Unit (EU) circuit
130: Scalar Register File (SRF)
S210, S220, S230: steps
Detailed Description
Reference will now be made in detail to exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings and the description to refer to the same or like parts.
The term "coupled" as used throughout this specification, including the claims, may refer to any direct or indirect means of connection. For example, if a first device is coupled (or connected) to a second device, this should be interpreted to mean that the first device may be directly connected to the second device, or indirectly connected to the second device through other devices or some means of connection. The terms "first," "second," and the like, as used throughout this specification, including the claims, are used to name elements and do not limit the number of elements or the order in which they are listed. Components/parts/steps that use the same reference numerals or the same terms in different embodiments may refer to one another's descriptions.
Fig. 1 is a schematic block diagram of a computing device 100 according to an embodiment of the invention. The computing device 100 shown in Fig. 1 includes an Instruction Scheduler (IS) circuit 110, an Execution Unit (EU) circuit 120, and a Scalar Register File (SRF) 130. The EU circuit 120 is coupled to the IS circuit 110 and the scalar register file 130. The IS circuit 110 may issue instructions to the EU circuit 120 for execution. In some embodiments, the IS circuit 110 and/or the EU circuit 120 may be implemented as hardware circuits, depending on design requirements. In other embodiments, the IS circuit 110 and/or the EU circuit 120 may be implemented in firmware, software, or a combination thereof. In still other embodiments, the IS circuit 110 and/or the EU circuit 120 may be implemented in a combination of hardware, firmware, and software.
In terms of hardware, the IS circuit 110 and/or the EU circuit 120 may be implemented as logic circuits on an integrated circuit. For example, the functions of the IS circuit 110 and/or the EU circuit 120 may be implemented in various logic blocks, modules, and circuits of one or more controllers, microcontrollers, microprocessors, Application-Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), and/or other processing units. The IS circuit 110 and/or the EU circuit 120 may be implemented as hardware circuits, such as various logic blocks, modules, and circuits in an integrated circuit, using a hardware description language (e.g., Verilog HDL or VHDL) or other suitable programming languages.
In terms of software and/or firmware, the functions of the IS circuit 110 and/or the EU circuit 120 may be implemented as program code. For example, the IS circuit 110 and/or the EU circuit 120 may be implemented in a general-purpose programming language (e.g., C, C++, or assembly language) or other suitable programming languages. The program code may be recorded/stored in a non-transitory machine-readable storage medium. In some embodiments, the machine-readable storage medium includes, for example, a semiconductor memory and/or a storage device. The semiconductor memory includes a memory card, a Read-Only Memory (ROM), a flash memory, a programmable logic circuit, or other semiconductor memories. The storage device includes a tape, a disk, a Hard Disk Drive (HDD), a Solid-State Drive (SSD), or other storage devices. An electronic device (e.g., a Central Processing Unit (CPU), a controller, a microcontroller, or a microprocessor) may read and execute the program code from the machine-readable storage medium to implement the functions of the IS circuit 110 and/or the EU circuit 120. Alternatively, the program code may be provided to the electronic device via any transmission medium, such as a communication network or broadcast waves. The communication network is, for example, the Internet, a wired communication network, a wireless communication network, or other communication media.
Fig. 2 is a flow chart illustrating a method of operating a computing device according to an embodiment of the invention. In some embodiments, the method of operation of the computing device shown in fig. 2 may be implemented in firmware or software (i.e., a program). For example, operations associated with the method of operation of the computing device shown in fig. 2 may be implemented as non-transitory machine-readable instructions (programming code or program) that may be stored on a machine-readable storage medium. The non-transitory machine readable instructions, when executed by a computer, may implement a method of operation of the computing device shown in fig. 2. In other embodiments, the method of operation of the computing device shown in FIG. 2 may be implemented in hardware, for example, in computing device 100 shown in FIG. 1.
Please refer to Fig. 1 and Fig. 2. The IS circuit 110 includes a Descriptor Address Register (DAR) 111 and a Base Address Register (BAR) 112. The base register 112 may store the base memory address of a descriptor set, which may be, for example, 64 bits wide. Each descriptor set corresponds to one base register 112. For example, when 4 or 8 descriptor sets are supported, 4 or 8 base registers 112 are required, respectively. The specific number of base registers 112 may be determined according to the actual design. The descriptor set identifier (descriptor set ID) is an immediate value embedded in the current instruction. The identifier selects a base register 112 and thereby determines the base memory address of the corresponding descriptor set. The number of identifiers is determined by the current instruction; for example, when 2 resource descriptors are required, the current instruction carries 2 corresponding identifiers. The descriptor address register 111 may store address information of a resource descriptor. Depending on the actual design and application context, in some embodiments the address information may include the full memory address of the resource descriptor or the offset of the resource descriptor within the descriptor set. The full memory address of a resource descriptor is the sum of the base memory address of the descriptor set and the offset of the resource descriptor within the descriptor set. The specific number of descriptor address registers 111 may be determined according to the actual design. For example, the descriptor address register 111 may include one 32-bit register dar0 (not shown; other bit widths are possible) to store the offset of a resource descriptor within the descriptor set.
Alternatively, descriptor address register 111 may include two 32-bit registers dar0 and dar1 (not shown, or other number of registers). When the address information is an offset, the offset of the resource descriptor may be stored in the register dar0. When the address information is the full memory address of the resource descriptor, the full memory address of the resource descriptor can be stored in the registers dar0 and dar1. Alternatively, descriptor address register 111 may include four 32-bit registers, dar0, dar1, dar2 and dar3 (not shown, or other number of registers). When a resource descriptor is used by an instruction, the offset of the resource descriptor can be stored in register dar0, or the full memory address of the resource descriptor can be stored in registers dar0 and dar1. When two resource descriptors are used by an instruction, the full memory address (or offset) of one resource descriptor can be stored in registers dar0 and dar1, and the full memory address (or offset) of the other resource descriptor can be stored in registers dar2 and dar3.
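The use of two 32-bit registers for one 64-bit full memory address can be illustrated with a minimal sketch. The patent does not state which register holds which half, so the sketch assumes that dar0 holds the low 32 bits and dar1 the high 32 bits; the function names `split_address` and `join_address` are hypothetical.

```python
MASK32 = 0xFFFF_FFFF


def split_address(address):
    """Split a 64-bit address into (dar0, dar1); assumes dar0 = low half, dar1 = high half."""
    return address & MASK32, (address >> 32) & MASK32


def join_address(dar0, dar1):
    """Rebuild the 64-bit address from the two 32-bit register values."""
    return ((dar1 & MASK32) << 32) | (dar0 & MASK32)
```

When the address information is only a 32-bit offset, a single register (dar0) is enough and no join is needed.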
For example, but not limited thereto, each warp may have four 32-bit registers dar0, dar1, dar2, and dar3, and the IS circuit 110 may directly read the registers dar0 to dar3 of the descriptor address register 111. In general, a thread group may include multiple warps, which may also be referred to as thread bundles. A load instruction or a store instruction may use the registers dar0 and dar1 of the descriptor address register 111 to store the offset or address of a descriptor. A texture instruction may use the registers dar0 and dar1 of the descriptor address register 111 to store the offset or address of a texture descriptor, and the registers dar2 and dar3 of the descriptor address register 111 to store the offset or address of a sampler descriptor.
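The mapping from instruction type to register pairs described above can be sketched as a small lookup table. The table name `DAR_SLOTS`, the function `descriptor_addresses`, and the low/high ordering within each pair are assumptions made for illustration; the sketch also assumes that each pair holds a full 64-bit address rather than an offset.

```python
# Register pairs (low, high) read by each instruction type; ordering is an assumption.
DAR_SLOTS = {
    "load": [(0, 1)],
    "store": [(0, 1)],
    "texture": [(0, 1), (2, 3)],  # texture descriptor, then sampler descriptor
}


def descriptor_addresses(opcode, dar):
    """Return the 64-bit descriptor addresses that `opcode` reads from dar0 to dar3."""
    return [(dar[high] << 32) | dar[low] for low, high in DAR_SLOTS[opcode]]
```

With `dar = [0x10, 0x1, 0x20, 0x2]`, a load reads one address from dar0/dar1, while a texture instruction reads a second address from dar2/dar3 as well.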
It is assumed here that the descriptor address register 111 stores the first address information of the first resource descriptor (step S210). When the current instruction processed by the IS circuit 110 uses the first resource descriptor, the IS circuit 110 may fetch the first address information from the descriptor address register 111 local to the IS circuit 110 (step S220), without triggering the EU circuit 120 to read the first address information of the first resource descriptor from the scalar register file 130 external to the IS circuit 110. After retrieving the first address information of the first resource descriptor, the IS circuit 110 may issue the current instruction using the first resource descriptor to the EU circuit 120 for execution based on the first address information (step S230).
When the address information of the second resource descriptor used by the following instruction is the same as the first address information of the first resource descriptor used by the current instruction, that is, when the following instruction uses the same resource descriptor as the current instruction, the IS circuit 110 may reuse the first address information stored in the descriptor address register 111 to access the resource in the memory (not shown), without triggering the EU circuit 120 to read the address information of the second resource descriptor from the scalar register file 130 and return it to the IS circuit 110. After obtaining the address information of the second resource descriptor (i.e., the first address information), the IS circuit 110 may issue the following instruction using the second resource descriptor to the EU circuit 120 for execution based on the first address information (step S230).
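Steps S210 to S230 and the reuse case can be sketched as a toy model. The class and method names below (`ExecutionUnit`, `InstructionScheduler`, `store_address_info`, `issue`) are hypothetical, and a single Python attribute stands in for the descriptor address register; this is an illustration of the described flow, not the patented circuit.

```python
class ExecutionUnit:
    """Toy EU (hypothetical model) that logs the instructions it executes."""

    def __init__(self):
        self.log = []

    def execute(self, opcode, address_info):
        self.log.append((opcode, address_info))


class InstructionScheduler:
    """Toy IS (hypothetical model) with a local descriptor address register."""

    def __init__(self, eu):
        self.eu = eu
        self.dar = None  # stands in for descriptor address register 111

    def store_address_info(self, address_info):
        # S210: the DAR holds the address information of the resource descriptor.
        self.dar = address_info

    def issue(self, opcode):
        # S220: read the local DAR; no hidden SRF read is sent to the EU.
        address_info = self.dar
        # S230: issue the instruction to the EU based on that address information.
        self.eu.execute(opcode, address_info)
        return address_info
```

A following instruction that uses the same descriptor simply calls `issue` again, so the EU log contains only the two real instructions.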
Based on the actual design and application context, in some embodiments, the address information may include the full memory address of the resource descriptor, or the offset of the resource descriptor in the descriptor set. If the address information includes an offset of the resource descriptor in the descriptor set, the IS circuit 110 may select the corresponding base register 112 according to the identifier of the descriptor set carried in the current instruction, and then use the base memory address of the descriptor set stored in the base register 112 and the offset stored in the descriptor address register 111 to calculate the complete memory address of the first resource descriptor. The IS circuitry 110 may issue the full memory address and the current instruction using the first resource descriptor to the EU circuitry 120 for execution. If the address information includes the full memory address of the first resource descriptor in the memory, the IS circuit 110 may issue the full memory address stored in the descriptor address register 111 and the current instruction using the first resource descriptor to the EU circuit 120 for execution.
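The two cases above (offset versus full memory address) can be sketched as one function. The name `resolve_descriptor_address`, its parameters, and the use of a Python list for the base registers are assumptions for illustration only.

```python
def resolve_descriptor_address(dar_value, is_offset, set_id, base_registers):
    """Return the full memory address of a resource descriptor.

    dar_value      -- value held in the descriptor address register
    is_offset      -- True if dar_value is an offset within a descriptor set
    set_id         -- descriptor set ID carried as an immediate in the instruction
    base_registers -- base memory address of each descriptor set, indexed by ID
    """
    if is_offset:
        # Select the base register by descriptor set ID, then add the offset.
        return base_registers[set_id] + dar_value
    # The DAR already holds the full memory address.
    return dar_value
```

For example, with four descriptor sets and an offset of 0x40 in set 1, the result is the base address of set 1 plus 0x40.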
To summarize the above, the descriptor address register 111 may be added to the IS circuit 110. The IS circuit 110 can directly access the address information stored in the local descriptor address register 111, so that the address information of the resource descriptor (e.g., the full memory address of the resource descriptor or the offset of the resource descriptor within the descriptor set) can be obtained more efficiently. Accordingly, the IS circuit 110 avoids issuing an additional hidden instruction to the EU circuit 120, thereby avoiding the pipeline latency of the EU circuit 120 caused by the hidden instruction and reducing the burden on the hardware. When the address information of the second resource descriptor used by the following instruction is identical to the first address information of the first resource descriptor used by the current instruction, the IS circuit 110 may reuse the address information stored in the descriptor address register 111. Accordingly, the IS circuit 110 can make more efficient use of the address information stored in the descriptor address register 111 local to the IS circuit 110.
In contrast, prior art instruction schedulers must issue additional hidden instructions to the execution unit for each bindless resource descriptor (bindless resource descriptor) to trigger the execution unit to read the value (address/offset of the resource descriptor) within the scalar register file. The problem is that hidden instructions issued by prior art instruction schedulers can cause execution unit pipeline delays. Furthermore, the hidden instruction is an additional instruction that is dynamically generated by the instruction scheduler, rather than an instruction of the instruction set (instruction set). Hidden instructions are not in the source code of the program and therefore cannot be optimized during compilation (compilation) of the source code of the program. Furthermore, the prior art forces the issue/execution of hidden instructions for each descriptor, thereby ignoring the reuse of descriptors within the instruction scheduler and placing a burden on the hardware.
Please refer to Fig. 1 for an embodiment of the present invention. This embodiment may add a new move instruction SMOVD to the instruction set to move a value (either the full memory address of the resource descriptor or the offset of the resource descriptor within the descriptor set) from the scalar register file 130 to the descriptor address register 111. For example, depending on the actual operating scenario, the move instruction SMOVD may take the form of pseudo code [1] below. In pseudo code [1], dest represents the descriptor address register 111, and source may represent the scalar register file 130, a constant memory (RAM), a constant register, or an immediate value. Programs or shaders may update the descriptor address register 111 with the move instruction SMOVD.
smovd dest, source; pseudo code [1]
The move instruction SMOVD is an instruction of the instruction set. Because the move instruction SMOVD appears in the program source code, it can be optimized during compilation of the program source code. When the second address information of the second resource descriptor used by the following instruction is different from the first address information of the first resource descriptor used by the current instruction, that is, when the following instruction does not use the same resource descriptor as the current instruction, the IS circuit 110 may issue the move instruction SMOVD of the instruction set to the EU circuit 120. Based on the move instruction SMOVD, the EU circuit 120 can move the second address information of the second resource descriptor from the scalar register file 130 external to the IS circuit 110 to the descriptor address register 111 local to the IS circuit 110. Accordingly, when the IS circuit 110 processes the following instruction using the second resource descriptor, the IS circuit 110 may access the second address information stored in the descriptor address register 111. After retrieving the second address information of the second resource descriptor, the IS circuit 110 may issue the following instruction using the second resource descriptor to the EU circuit 120 for execution based on the second address information.
In contrast, prior art instruction schedulers issue additional hidden instructions to the execution unit for each bindless resource descriptor to trigger the execution unit to read the value (the address/offset of the resource descriptor) in the scalar register file. The problem is that the hidden instructions are extra instructions that are dynamically generated by the instruction scheduler, not instructions of the instruction set. The hidden instructions are not in the program source code and therefore cannot be optimized during compilation of the program source code.
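The effect of the move instruction can be sketched with a toy machine model. The `Machine` class, the tuple encoding of the source operand (`("srf", index)` or `("imm", value)`), and the `load` helper are hypothetical; only the `smovd dest, source` form comes from pseudo code [1]. The sketch shows that SMOVD is needed only when the descriptor changes, and that later instructions reuse the value already in the descriptor address register.

```python
class Machine:
    """Toy model (hypothetical) of an SRF, four DAR registers, and an EU log."""

    def __init__(self, srf_size=8):
        self.srf = [0] * srf_size   # scalar register file
        self.dar = [0, 0, 0, 0]     # dar0 to dar3
        self.eu_log = []            # opcodes issued to the EU

    def smovd(self, dest, source):
        # source is ("srf", index) or ("imm", value); this encoding is an assumption.
        kind, operand = source
        value = self.srf[operand] if kind == "srf" else operand
        self.dar[dest] = value
        self.eu_log.append("smovd")

    def load(self, dar_index=0):
        # The load takes its descriptor address information from the DAR.
        self.eu_log.append("load")
        return self.dar[dar_index]
```

In a sequence such as `smovd dar0, s5; load; load; smovd dar0, s6; load`, the EU sees two SMOVD instructions for three loads, and both SMOVD instructions are visible to the compiler.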
Please refer to fig. 1 for an embodiment of the present invention. In some practical operating scenarios, the address information of the resource descriptors generated by scalar operations (scalar operations) may be stored directly in descriptor address registers 111 local to IS circuitry 110. In other practical operational scenarios, address information for resource descriptors generated by scalar operations may be deposited in scalar register file 130 external to IS circuitry 110 for use by IS circuitry 110.
For example, prior to IS circuitry 110 processing the current instruction, IS circuitry 110 processes the scalar operation instruction to generate the first address information for the first resource descriptor. At this time, when the IS circuit 110 needs to reuse the first address information, the IS circuit 110 may directly store the first address information in the descriptor address register 111 local to the IS circuit 110, instead of storing the first address information in the scalar register file 130 external to the IS circuit 110 through the EU circuit 120. The IS circuitry 110 may access the first address information stored in the descriptor address register 111 while the IS circuitry 110 IS processing the current instruction using the first resource descriptor. Accordingly, the IS circuitry 110 may issue a current instruction using the first resource descriptor to the EU circuitry 120 for execution based on the first address information. When the IS circuit 110 does not need to reuse the first address information, the IS circuit 110 may store the first address information in the scalar register file 130 outside the IS circuit 110 through the EU circuit 120.
For another example, in the case that the second address information of the second resource descriptor used by the following instruction IS different from the first address information of the first resource descriptor used by the current instruction, that IS, when the resource descriptor used by the following instruction IS not the resource descriptor used by the current instruction, the IS circuit 110 processes the scalar operation instruction to generate the second address information of the second resource descriptor. At this time, IS circuitry 110 may directly store the second address information in descriptor address registers 111 local to IS circuitry 110 without storing the second address information in scalar register file 130 external to IS circuitry 110 (e.g., when IS circuitry 110 need not reuse the second address information). Accordingly, when the IS circuit 110 processes the subsequent instruction using the second resource descriptor, the IS circuit 110 may access the second address information stored in the descriptor address register 111. After retrieving the second address information for a second resource descriptor, IS circuitry 110 may issue the next instruction using a second resource descriptor to EU circuitry 120 for execution based on the second address information.
As another example, after the scalar operation instruction processed by IS circuitry 110 generates the first address information of the first resource descriptor, IS circuitry 110 may store the first address information in a scalar register file 130 external to IS circuitry 110. Before the IS circuit 110 processes the current instruction using the first resource descriptor, the IS circuit 110 may issue a shift instruction SMOVD of the instruction set to the EU circuit 120 to shift the first address information of the first resource descriptor from the scalar register file 130 to the descriptor address register 111 of the IS circuit 110. The IS circuitry 110 may access the first address information stored in the descriptor address register 111 while the IS circuitry 110 IS processing the current instruction using the first resource descriptor. Accordingly, the IS circuitry 110 may issue a current instruction using the first resource descriptor to the EU circuitry 120 for execution based on the first address information.
Depending on the actual design, in some embodiments the descriptor address register 111 may be added to the address space of the scalar register file 130. A program or shader may update the descriptor address register 111 by specifying a target address in the address space of the scalar register file 130. For example, a first interval of the address space may correspond to the scalar register file 130 external to the IS circuit 110, while a second interval of the address space may correspond to the descriptor address register 111 local to the IS circuit 110. After a scalar operation instruction processed by the IS circuit 110 generates the first address information of the first resource descriptor, the IS circuit 110 may use the address space to selectively store the first address information in either the scalar register file 130 or the descriptor address register 111. As a specific example of adding the descriptor address register 111 to the address space of the scalar register file 130, assume that the address space of the scalar register file 130 is a 7-bit address space, in which each of the addresses 0-63 (the first interval of the address space) represents a conventional scalar register in the scalar register file 130, and each of the addresses 64-67 (the second interval of the address space) represents one of the registers dar0 to dar3 in the descriptor address register 111. The IS circuit 110 may then write the result of a scalar operation to the descriptor address register 111 by specifying a destination address in the address space of the scalar register file 130.
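The example address map above can be expressed as a small decode function. The sketch below assumes the 7-bit layout given in the text (0-63 for scalar registers, 64-67 for the four DAR entries); the function name `decode_register_address` and the choice to reject addresses 68-127 as unassigned are assumptions, since the text does not say how those addresses are treated.

```python
ADDRESS_BITS = 7            # 7-bit address space, per the example in the text
SRF_RANGE = range(0, 64)    # first interval: conventional scalar registers
DAR_RANGE = range(64, 68)   # second interval: the four DAR entries


def decode_register_address(addr):
    """Map a destination address to ("srf", index) or ("dar", index)."""
    if not 0 <= addr < (1 << ADDRESS_BITS):
        raise ValueError(f"address {addr} does not fit in {ADDRESS_BITS} bits")
    if addr in SRF_RANGE:
        return ("srf", addr)
    if addr in DAR_RANGE:
        return ("dar", addr - DAR_RANGE.start)
    # Assumption: addresses 68-127 are unassigned in this sketch.
    raise ValueError(f"address {addr} is not assigned to a register")
```

With this mapping, a scalar operation whose destination address is 65 would write its result to the second DAR entry rather than to the scalar register file.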
To summarize, the IS circuit 110 may directly access the descriptor address register 111 local to the IS circuit 110, and thus the IS circuit 110 may avoid the hidden instructions that cause EU pipeline delays. The complexity of the IS circuit 110 is also reduced, because the IS circuit 110 does not need to prepare and issue hidden instructions. In the case where the address information of the second resource descriptor used by the subsequent instruction is identical to the address information of the resource descriptor used by the current instruction, the IS circuit 110 can reuse the address information stored in the descriptor address register 111, that is, reuse a resource descriptor having the same offset (or address), which was not possible in the prior art. The IS circuit 110 may achieve higher performance when a series of instructions use the same resource descriptor.
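The benefit of reuse can be illustrated by counting how often address information must be set up for a sequence of instructions. The sketch below rests on two simplifying assumptions that are not stated in this form in the text: the baseline needs one hidden scalar register file read per descriptor-using instruction, and the DAR-based scheme uses a single DAR entry that is refilled only when the descriptor changes. The function name and the offsets are hypothetical.

```python
def count_address_setups(descriptor_offsets, use_dar):
    """Count address-setup operations for a sequence of descriptor offsets."""
    setups = 0
    dar_value = None  # content of the single modelled DAR entry
    for offset in descriptor_offsets:
        if not use_dar:
            # Assumed baseline: a hidden read for every instruction.
            setups += 1
        elif dar_value != offset:
            # DAR scheme: refill only when a different descriptor is needed.
            setups += 1
            dar_value = offset
    return setups


# Three instructions share one descriptor; a fourth uses a different one.
offsets = [0x40, 0x40, 0x40, 0x80]
baseline = count_address_setups(offsets, use_dar=False)
with_dar = count_address_setups(offsets, use_dar=True)
```

For this sequence the baseline performs four setups and the DAR scheme two, and the gap widens as more consecutive instructions share a descriptor.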
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (17)

1. A computing device, the computing device comprising:
execution unit circuitry; and
instruction scheduler circuitry, coupled to the execution unit circuitry, to issue instructions for execution by the execution unit circuitry, wherein the instruction scheduler circuitry includes a descriptor address register to store first address information of a first resource descriptor,
in the event that a current instruction being processed by the instruction scheduler circuitry uses the first resource descriptor, the instruction scheduler circuitry fetches the first address information held in the descriptor address register local to the instruction scheduler circuitry without triggering the execution unit circuitry to read the first address information of the first resource descriptor from a scalar register file external to the instruction scheduler circuitry, and
after fetching the first address information of the first resource descriptor, the instruction scheduler circuitry issues the current instruction using the first resource descriptor to the execution unit circuitry for execution based on the first address information.
2. The computing device of claim 1, wherein, in the event that address information of a second resource descriptor used by a following instruction is identical to the first address information of the first resource descriptor used by the current instruction, the instruction scheduler circuitry reuses the first address information held in the descriptor address register without triggering the execution unit circuitry to read the first address information from the scalar register file to the instruction scheduler circuitry.
3. The computing device of claim 1, wherein:
in the case where second address information of a second resource descriptor used by a following instruction differs from the first address information of the first resource descriptor used by the current instruction, either the instruction scheduler circuitry issues a move instruction of an instruction set to the execution unit circuitry to move the second address information of the second resource descriptor from the scalar register file external to the instruction scheduler circuitry to the descriptor address register local to the instruction scheduler circuitry, or the instruction scheduler circuitry processes a scalar operation instruction to generate the second address information of the second resource descriptor and directly stores the second address information in the descriptor address register local to the instruction scheduler circuitry without storing the second address information in the scalar register file external to the instruction scheduler circuitry;
while processing the following instruction using the second resource descriptor, the instruction scheduler circuitry fetches the second address information held in the descriptor address register; and
after fetching the second address information of the second resource descriptor, the instruction scheduler circuitry issues the following instruction using the second resource descriptor to the execution unit circuitry for execution based on the second address information.
4. The computing device of claim 1, wherein, before processing the current instruction, the instruction scheduler circuitry processes a scalar operation instruction to generate the first address information of the first resource descriptor, and the instruction scheduler circuitry directly deposits the first address information in the descriptor address register local to the instruction scheduler circuitry without depositing the first address information in the scalar register file external to the instruction scheduler circuitry.
5. The computing device of claim 1, wherein:
after a scalar operation instruction processed by the instruction scheduler circuitry generates the first address information for the first resource descriptor, the instruction scheduler circuitry deposits the first address information in the scalar register file external to the instruction scheduler circuitry; and
before the instruction scheduler circuitry processes the current instruction, the instruction scheduler circuitry issues a move instruction of an instruction set to the execution unit circuitry to move the first address information of the first resource descriptor from the scalar register file to the descriptor address register of the instruction scheduler circuitry.
6. The computing device of claim 1, wherein the descriptor address register is added to an address space of the scalar register file, wherein a first interval of the address space corresponds to the scalar register file external to the instruction scheduler circuitry, wherein a second interval of the address space corresponds to the descriptor address register local to the instruction scheduler circuitry, and wherein after a scalar operation instruction processed by the instruction scheduler circuitry generates the first address information of the first resource descriptor, the instruction scheduler circuitry uses the address space to selectively deposit the first address information in one of the scalar register file and the descriptor address register.
7. The computing device of claim 1, wherein the first address information comprises an offset of the first resource descriptor within a descriptor set, and wherein the instruction scheduler circuitry further comprises:
a base register to store a base memory address of the descriptor set,
wherein said instruction scheduler circuitry uses said base memory address stored in said base register and said offset stored in said descriptor address register to calculate a full memory address of said first resource descriptor, and said instruction scheduler circuitry issues said full memory address and said current instruction using said first resource descriptor to said execution unit circuitry for execution.
8. The computing device of claim 1, wherein the first address information comprises a full memory address of the first resource descriptor in memory, and wherein the instruction scheduler circuitry issues the full memory address held in the descriptor address register and the current instruction using the first resource descriptor to the execution unit circuitry for execution.
9. A method of operation of a computing device, the method of operation comprising:
depositing, by a descriptor address register of instruction scheduler circuitry of the computing device, first address information of a first resource descriptor;
in the event that a current instruction processed by the instruction scheduler circuitry uses the first resource descriptor, fetching the first address information held in the descriptor address register local to the instruction scheduler circuitry without triggering execution unit circuitry of the computing device to read the first address information of the first resource descriptor from a scalar register file external to the instruction scheduler circuitry; and
after fetching the first address information of the first resource descriptor, issuing, by the instruction scheduler circuitry, the current instruction using the first resource descriptor to the execution unit circuitry for execution based on the first address information.
10. The method of operation of claim 9, further comprising:
in the event that address information of a second resource descriptor used by a following instruction is the same as the first address information of the first resource descriptor used by the current instruction, reusing the first address information held in the descriptor address register without triggering the execution unit circuitry to read the first address information from the scalar register file to the instruction scheduler circuitry.
11. The method of operation of claim 9, further comprising:
in the case where second address information of a second resource descriptor used by a following instruction differs from the first address information of the first resource descriptor used by the current instruction, either issuing, by the instruction scheduler circuitry, a move instruction of an instruction set to the execution unit circuitry to move the second address information of the second resource descriptor from the scalar register file external to the instruction scheduler circuitry to the descriptor address register local to the instruction scheduler circuitry, or processing, by the instruction scheduler circuitry, a scalar operation instruction to generate the second address information of the second resource descriptor and directly storing the second address information in the descriptor address register local to the instruction scheduler circuitry without storing the second address information in the scalar register file external to the instruction scheduler circuitry;
fetching the second address information held in the descriptor address register while the instruction scheduler circuitry is processing the following instruction using the second resource descriptor; and
after fetching the second address information of the second resource descriptor, issuing, by the instruction scheduler circuitry, the following instruction using the second resource descriptor to the execution unit circuitry for execution based on the second address information.
12. The method of operation of claim 9, further comprising:
processing scalar operation instructions by said instruction scheduler circuitry to generate said first address information of said first resource descriptor prior to said instruction scheduler circuitry processing said current instruction; and
directly depositing the first address information in the descriptor address register local to the instruction scheduler circuitry without depositing the first address information in the scalar register file external to the instruction scheduler circuitry.
13. The method of operation of claim 9, further comprising:
after a scalar operation instruction processed by the instruction scheduler circuitry generates the first address information for the first resource descriptor, depositing the first address information in the scalar register file external to the instruction scheduler circuitry; and
before the instruction scheduler circuitry processes the current instruction, issuing, by the instruction scheduler circuitry, a move instruction of an instruction set to the execution unit circuitry to move the first address information of the first resource descriptor from the scalar register file to the descriptor address register of the instruction scheduler circuitry.
14. The method of operation of claim 9, wherein the descriptor address register is added to an address space of the scalar register file, a first interval of the address space corresponds to the scalar register file external to the instruction scheduler circuitry, and a second interval of the address space corresponds to the descriptor address register local to the instruction scheduler circuitry, the method of operation further comprising:
after a scalar operation instruction processed by the instruction scheduler circuitry generates the first address information for the first resource descriptor, using the address space to selectively deposit the first address information in one of the scalar register file and the descriptor address register.
15. The method of operation of claim 9 wherein the first address information comprises an offset of the first resource descriptor within a descriptor set, and further comprising:
depositing, by a base register of the instruction scheduler circuitry, a base memory address of the descriptor set;
using said base memory address stored in said base register and said offset stored in said descriptor address register to calculate a full memory address for said first resource descriptor; and
issuing, by the instruction scheduler circuitry, the full memory address and the current instruction using the first resource descriptor to the execution unit circuitry for execution.
16. The method of operation of claim 9, wherein the first address information comprises a full memory address of the first resource descriptor in memory, and wherein the method of operation further comprises:
issuing, by the instruction scheduler circuitry, the full memory address held in the descriptor address register and the current instruction using the first resource descriptor to the execution unit circuitry for execution.
17. A machine-readable storage medium storing non-transitory machine-readable instructions which, when executed by a computer, implement the method of operation of the computing device of any of claims 9-16.
CN202211486231.9A 2022-11-24 2022-11-24 Computing device, operating method and machine-readable storage medium Pending CN115756612A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211486231.9A CN115756612A (en) 2022-11-24 2022-11-24 Computing device, operating method and machine-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211486231.9A CN115756612A (en) 2022-11-24 2022-11-24 Computing device, operating method and machine-readable storage medium

Publications (1)

Publication Number Publication Date
CN115756612A true CN115756612A (en) 2023-03-07

Family

ID=85337586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211486231.9A Pending CN115756612A (en) 2022-11-24 2022-11-24 Computing device, operating method and machine-readable storage medium

Country Status (1)

Country Link
CN (1) CN115756612A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 201100 room 1302, 13 / F, building 16, No. 2388, Chenhang highway, Minhang District, Shanghai

Applicant after: Shanghai Bi Ren Technology Co.,Ltd.

Address before: 201100 room 1302, 13 / F, building 16, No. 2388, Chenhang highway, Minhang District, Shanghai

Applicant before: Shanghai Bilin Intelligent Technology Co.,Ltd.

Country or region before: China