CN106537331A - Instruction processing method and device - Google Patents


Info

Publication number
CN106537331A
Authority
CN
China
Prior art keywords
instruction
key
instructions
executable
long delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201580001167.2A
Other languages
Chinese (zh)
Other versions
CN106537331B (en
Inventor
蔡卫光
顾雄礼
方磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN106537331A publication Critical patent/CN106537331A/en
Application granted granted Critical
Publication of CN106537331B publication Critical patent/CN106537331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

An embodiment of the present invention provides an instruction processing method and device, and relates to the field of computers. The method comprises: determining at least one key instruction among executable instructions, the key instruction being a long-delay instruction or an instruction located on a key instruction chain, the key instruction chain comprising N levels of instructions, wherein the i-th-level instruction of the N levels is a producer instruction of the (i+1)-th-level instruction, a producer instruction means that the destination register of the i-th-level instruction is a source register of the (i+1)-th-level instruction, and the N-th-level instruction of the key instruction chain is a long-delay instruction, where N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N; and preferentially executing the at least one key instruction among the executable instructions. The present invention advances the execution time of long-delay instructions and thereby improves the processing performance of a processor.

Description

Instruction processing method and device
Technical field
The present invention relates to the computer field, and in particular to an instruction processing method and device.
Background
In the computer field, for a device to implement a given function, an application must be installed on the device, and to run the application, the instructions that the application contains must be processed.
At present, instructions may be processed as follows: the processor of the device fetches instructions from the instruction cache and obtains multiple instructions in instruction order, where the instruction order is the order in which the application runs these instructions. The processor decodes the instructions and executes the decoded instructions in instruction order. If, while executing the decoded instructions, the processor reaches a long-delay instruction, that is, an instruction that causes a long-delay operation, it can, while the long-delay instruction is executing, execute instructions that come after the long-delay instruction and do not depend on it. Based on the destination register numbers of the instructions, the processor then commits the execution results to the corresponding destination registers in instruction order, which completes the processing of the instructions.
While the long-delay instruction is executing, if all instructions that come after it and do not depend on it have finished but the long-delay instruction itself has not, the processor must still wait for the long-delay instruction to complete before it can handle other operations, which reduces the processing performance of the processor.
Summary of the invention
To improve the processing performance of a processor, embodiments of the present invention provide an instruction processing method and device. The technical solutions are as follows:
In a first aspect, an instruction processing method is provided, the method including:
determining a key instruction in at least one executable instruction, where the key instruction is a long-delay instruction or an instruction on a key instruction chain, the key instruction chain contains N levels of instructions, the i-th-level instruction among the N levels is a producer instruction of the (i+1)-th-level instruction, a producer instruction means that the destination register of the i-th-level instruction is a source register of the (i+1)-th-level instruction, the N-th-level instruction on the key instruction chain is a long-delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N; and
preferentially executing the key instruction in the at least one executable instruction.
With reference to the first aspect, in a first possible implementation of the first aspect, before the determining a key instruction in at least one executable instruction, the method further includes:
fetching multiple to-be-executed instructions from an instruction cache; and
determining the at least one executable instruction from the multiple to-be-executed instructions.
With reference to the first aspect, in a second possible implementation of the first aspect, the determining a key instruction in at least one executable instruction includes:
for any executable instruction in the at least one executable instruction, judging whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long-delay instruction; and
when the key instruction flag of the executable instruction is valid or the executable instruction is a long-delay instruction, determining that the executable instruction is a key instruction.
With reference to any one of the first aspect to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes:
when the key instruction has a producer instruction and the producer instruction of the key instruction has not been marked as a key instruction, setting the key instruction flag of the producer instruction of the key instruction to valid.
With reference to any one of the first aspect to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the method further includes:
when the key instruction flag of the key instruction is invalid and the key instruction is a long-delay instruction, setting the key instruction flag of the long-delay instruction to valid.
With reference to the third possible implementation or the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the method further includes:
each time the key instruction flag of an instruction is set to valid, writing the key instruction flag of that instruction into the instruction cache.
With reference to the second possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the judging whether the executable instruction is a long-delay instruction includes:
decoding the executable instruction to obtain the instruction type of the executable instruction;
judging whether a stored key type library contains the instruction type of the executable instruction; and
if the key type library contains the instruction type of the executable instruction, determining that the executable instruction is a long-delay instruction.
With reference to the second possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the judging whether the executable instruction is a long-delay instruction includes:
judging whether a stored key address library contains the instruction address of the executable instruction; and
if the key address library contains the instruction address of the executable instruction, determining that the executable instruction is a long-delay instruction.
With reference to any one of the first aspect to the seventh possible implementation of the first aspect, in an eighth possible implementation of the first aspect, after the fetching multiple to-be-executed instructions from an instruction cache, the method further includes:
allocating a storage space to each of the multiple to-be-executed instructions, where each storage space is used to store the execution result of the corresponding to-be-executed instruction.
Correspondingly, the method further includes:
each time the execution result of a to-be-executed instruction is obtained, writing the execution result into the corresponding storage space; and
after all of the multiple to-be-executed instructions have finished executing, writing, in the instruction order of the multiple instructions, the execution result held in the storage space of each to-be-executed instruction into the destination register of that instruction.
In a second aspect, a computer-readable medium is provided, including computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the method described in any one of the first aspect to the eighth possible implementation of the first aspect.
In a third aspect, an instruction processing device is provided, the device including: a processor, a memory, a bus, and a communication interface.
The memory is used to store computer-executable instructions, and the processor is connected to the memory through the bus. When the device is running, the processor executes the computer-executable instructions stored in the memory, so that the instruction processing device performs the method described in any one of the first aspect to the eighth possible implementation of the first aspect.
In a fourth aspect, an instruction processing device is provided, the device including:
a determining module, configured to determine a key instruction in at least one executable instruction, where the key instruction is a long-delay instruction or an instruction on a key instruction chain, the key instruction chain contains N levels of instructions, the i-th-level instruction among the N levels is a producer instruction of the (i+1)-th-level instruction, a producer instruction means that the destination register of the i-th-level instruction is a source register of the (i+1)-th-level instruction, the N-th-level instruction on the key instruction chain is a long-delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N; and
an execution module, configured to preferentially execute the key instruction in the at least one executable instruction.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the device further includes a fetch module.
The fetch module is configured to fetch multiple to-be-executed instructions from an instruction cache.
The determining module is further configured to determine the at least one executable instruction from the multiple to-be-executed instructions.
With reference to the fourth aspect, in a second possible implementation of the fourth aspect, the determining module being configured to determine a key instruction in at least one executable instruction includes:
a judging unit, configured to: for any executable instruction in the at least one executable instruction, judge whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long-delay instruction; and, when the key instruction flag of the executable instruction is valid or the executable instruction is a long-delay instruction, determine that the executable instruction is a key instruction.
With reference to any one of the fourth aspect to the second possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, the device further includes:
a first setting module, configured to: when the key instruction has a producer instruction and the producer instruction of the key instruction has not been marked as a key instruction, set the key instruction flag of the producer instruction of the key instruction to valid.
With reference to any one of the fourth aspect to the third possible implementation of the fourth aspect, in a fourth possible implementation of the fourth aspect, the device further includes:
a second setting module, configured to: when the key instruction flag of the key instruction is invalid and the key instruction is a long-delay instruction, set the key instruction flag of the long-delay instruction to valid.
With reference to the third possible implementation or the fourth possible implementation of the fourth aspect, in a fifth possible implementation of the fourth aspect, the device further includes:
a first writing module, configured to: each time the key instruction flag of an instruction is set to valid, write the key instruction flag of that instruction into the instruction cache.
With reference to the second possible implementation of the fourth aspect, in a sixth possible implementation of the fourth aspect, the determining module being configured to judge whether the executable instruction is a long-delay instruction includes:
the determining module is configured to decode the executable instruction to obtain the instruction type of the executable instruction;
to judge whether a stored key type library contains the instruction type of the executable instruction; and
if the key type library contains the instruction type of the executable instruction, to determine that the executable instruction is a long-delay instruction.
With reference to the second possible implementation of the fourth aspect, in a seventh possible implementation of the fourth aspect, the determining module being configured to judge whether the executable instruction is a long-delay instruction includes:
the determining module is configured to judge whether a stored key address library contains the instruction address of the executable instruction; and
if the key address library contains the instruction address of the executable instruction, to determine that the executable instruction is a long-delay instruction.
With reference to any one of the fourth aspect to the seventh possible implementation of the fourth aspect, in an eighth possible implementation of the fourth aspect, the device further includes:
an allocation module, configured to allocate a storage space to each of the multiple to-be-executed instructions, where each storage space is used to store the execution result of the corresponding to-be-executed instruction.
Correspondingly, the device further includes:
a second writing module, configured to: each time the execution result of a to-be-executed instruction is obtained, write the execution result into the corresponding storage space; and, after all of the multiple to-be-executed instructions have finished executing, write, in the instruction order of the multiple instructions, the execution result held in the storage space of each to-be-executed instruction into the destination register of that instruction.
The technical solutions provided in the embodiments of the present invention have the following beneficial effects: the key instruction in at least one executable instruction is determined, and when the at least one executable instruction is executed, the key instruction can be executed preferentially. Because key instructions include long-delay instructions, executing the key instruction preferentially advances the execution time of the long-delay instruction and ensures that enough independent instructions remain available to execute while the long-delay instruction is executing. This reduces the waiting time of the processor and thereby improves its processing performance.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed to describe the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an instruction processing method provided in an embodiment of the present invention.
Fig. 2 is a flowchart of another instruction processing method provided in an embodiment of the present invention.
Fig. 3 is a schematic diagram of an instruction processing procedure provided in an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of an instruction processing device provided in an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of another instruction processing device provided in an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of yet another instruction processing device provided in an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present invention are explained in detail, their application scenario is introduced. While a device runs an application, the processor of the device needs to process the multiple instructions that the application contains. During instruction processing, some instructions cause long-delay operations, for example divide instructions and transcendental function instructions. When the processor encounters such a long-delay operation, it may have to wait for the operation to complete before it can perform other operations, which reduces the performance of the processor. The embodiments of the present invention therefore provide an instruction processing method to improve the performance of the processor.
Fig. 1 is a flowchart of an instruction processing method provided in an embodiment of the present invention. Referring to Fig. 1, the method includes:
Step 101: Determine a key instruction in at least one executable instruction, where the key instruction is a long-delay instruction or an instruction on a key instruction chain, the key instruction chain contains N levels of instructions, the i-th-level instruction among the N levels is a producer instruction of the (i+1)-th-level instruction, a producer instruction means that the destination register of the i-th-level instruction is a source register of the (i+1)-th-level instruction, the N-th-level instruction on the key instruction chain is a long-delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N.
Step 102: Preferentially execute the key instruction in the at least one executable instruction.
In this embodiment of the present invention, the key instruction in at least one executable instruction is determined, and when the at least one executable instruction is executed, the key instruction can be executed preferentially. Because key instructions include long-delay instructions, executing the key instruction preferentially advances the execution time of the long-delay instruction and ensures that enough independent instructions remain available to execute while the long-delay instruction is executing. This reduces the waiting time of the processor and thereby improves its processing performance.
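As a minimal sketch of step 102, the following Python model orders a set of executable instructions so that key instructions issue first. The `ReadyInstr` class, its `is_key` field, and the choice of which instructions are key are illustrative assumptions, not part of the described hardware; the sketch also assumes that program order is kept within each group, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class ReadyInstr:
    name: str
    is_key: bool  # hypothetical: result of the key-instruction determination in step 101


def issue_order(executable):
    """Return the executable instructions with key instructions first.

    sorted() is stable, so the original (program) order is preserved
    inside the key group and inside the non-key group.
    """
    return sorted(executable, key=lambda ins: not ins.is_key)


# Hypothetical example: LSR and LOAD are assumed to be key instructions.
ready = [ReadyInstr("ADD", False), ReadyInstr("LSR", True), ReadyInstr("LOAD", True)]
issued = [ins.name for ins in issue_order(ready)]
```

Here `issued` lists the two key instructions ahead of `ADD`, which mirrors the idea of advancing the execution time of the long-delay instruction and its producers.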
If the destination register of instruction 1 is a source register of instruction 2, that is, instruction 2 uses the result computed by instruction 1, then instruction 1 is a producer instruction of instruction 2 and instruction 2 is a consumer instruction of instruction 1. This means that instruction 1 and instruction 2 have a logical execution order: instruction 1 must be executed before instruction 2.
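The producer/consumer relation can be expressed as a simple register comparison. The sketch below is illustrative only; the `Instr` class and the register names are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str    # destination register (hypothetical naming)
    srcs: tuple  # source registers


def is_producer(a, b):
    """True if instruction a produces a value that instruction b consumes."""
    return a.dest in b.srcs


instr1 = Instr("instruction 1", dest="r3", srcs=("r1", "r2"))
instr2 = Instr("instruction 2", dest="r5", srcs=("r3", "r4"))
```

With these definitions, `is_producer(instr1, instr2)` holds because `r3` is written by instruction 1 and read by instruction 2, while the reverse relation does not hold.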
Optionally, before the key instruction in the at least one executable instruction is determined, the method further includes:
fetching multiple to-be-executed instructions from an instruction cache; and
determining the at least one executable instruction from the multiple to-be-executed instructions.
Optionally, determining the key instruction in the at least one executable instruction includes:
for any executable instruction in the at least one executable instruction, judging whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long-delay instruction; and
when the key instruction flag of the executable instruction is valid or the executable instruction is a long-delay instruction, determining that the executable instruction is a key instruction.
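The determination above reduces to a logical OR of two conditions. A minimal sketch, with a hypothetical function name and boolean inputs standing in for the flag lookup and the long-delay check:

```python
def is_key_instruction(key_flag_valid: bool, is_long_delay: bool) -> bool:
    """An executable instruction is key if its flag is valid or it is long-delay."""
    return key_flag_valid or is_long_delay


# Enumerate all four input combinations for illustration.
decisions = {
    (flag, long_delay): is_key_instruction(flag, long_delay)
    for flag in (False, True)
    for long_delay in (False, True)
}
```

Only the combination in which the flag is invalid and the instruction is not a long-delay instruction yields a non-key instruction.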
Optionally, the method further includes:
when the key instruction has a producer instruction and the producer instruction of the key instruction has not been marked as a key instruction, setting the key instruction flag of the producer instruction of the key instruction to valid.
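The flag propagation described above can be sketched as follows. This is a simplified software model under stated assumptions: the `Instr` class, the `key_flag` field, and the idea of scanning a list of earlier instructions (`window`) are hypothetical, and the sketch treats any earlier instruction that writes one of the key instruction's source registers as a producer.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str
    srcs: tuple
    key_flag: bool = False  # hypothetical per-instruction key instruction flag


def mark_producers(key_instr, window):
    """Set the key flag of every not-yet-marked producer of key_instr.

    Returns the names of the instructions whose flag was newly set.
    """
    newly_marked = []
    for ins in window:
        if ins is key_instr:
            continue
        if ins.dest in key_instr.srcs and not ins.key_flag:
            ins.key_flag = True
            newly_marked.append(ins.name)
    return newly_marked


add = Instr("ADD", dest="r1", srcs=("r2", "r3"))
sub = Instr("SUB", dest="r4", srcs=("r5", "r6"))
load = Instr("LOAD", dest="r7", srcs=("r1",))  # assumed long-delay instruction

first_pass = mark_producers(load, [add, sub, load])
second_pass = mark_producers(load, [add, sub, load])
```

The first pass marks `ADD`, the producer of the assumed long-delay `LOAD`; the second pass marks nothing, because the producer is already flagged. Repeating the same step with `ADD` as the key instruction would extend the chain one level further back.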
Optionally, the method further includes:
when the key instruction flag of the key instruction is invalid and the key instruction is a long-delay instruction, setting the key instruction flag of the long-delay instruction to valid.
Optionally, the method further includes:
each time the key instruction flag of an instruction is set to valid, writing the key instruction flag of that instruction into the instruction cache.
Optionally, judging whether the executable instruction is a long-delay instruction includes:
decoding the executable instruction to obtain the instruction type of the executable instruction;
judging whether a stored key type library contains the instruction type of the executable instruction; and
if the key type library contains the instruction type of the executable instruction, determining that the executable instruction is a long-delay instruction.
Optionally, judging whether the executable instruction is a long-delay instruction includes:
judging whether a stored key address library contains the instruction address of the executable instruction; and
if the key address library contains the instruction address of the executable instruction, determining that the executable instruction is a long-delay instruction.
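Both long-delay checks are membership tests against a stored collection. The sketch below models the key type library and the key address library as Python sets; the library contents, the textual instruction format, and the function names are assumptions made for illustration (divide and transcendental function instructions are the examples of long-delay operations given in the description).

```python
# Hypothetical library contents, chosen only for the example.
KEY_TYPES = {"DIV", "SIN"}
KEY_ADDRESSES = {0x4010}


def decode_type(instr_text):
    """Toy decoder: the instruction type is the first token of the text."""
    return instr_text.split()[0].upper()


def is_long_delay_by_type(instr_text, key_types=KEY_TYPES):
    """Type-based check: decode, then look the type up in the key type library."""
    return decode_type(instr_text) in key_types


def is_long_delay_by_address(instr_address, key_addresses=KEY_ADDRESSES):
    """Address-based check: look the instruction address up in the key address library."""
    return instr_address in key_addresses
```

For example, `is_long_delay_by_type("div r1, r2, r3")` is true with the assumed library, while an `add` instruction is not classified as long-delay. The address-based variant needs no decoding, which is one plausible reason to offer it as an alternative.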
Optionally, after the multiple to-be-executed instructions are fetched from the instruction cache, the method further includes:
allocating a storage space to each of the multiple to-be-executed instructions, where each storage space is used to store the execution result of the corresponding to-be-executed instruction.
Correspondingly, the method further includes:
each time the execution result of a to-be-executed instruction is obtained, writing the execution result into the corresponding storage space; and
after all of the multiple to-be-executed instructions have finished executing, writing, in the instruction order of the multiple instructions, the execution result held in the storage space of each to-be-executed instruction into the destination register of that instruction.
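The following sketch illustrates why results are buffered per instruction and then written back in instruction order. It is a simplified model under stated assumptions: the function name, the representation of completions as `(index, result)` pairs, and the register names are all hypothetical.

```python
def commit_in_order(dests, completions):
    """Buffer out-of-order results, then write them back in program order.

    dests[i]     -- destination register of the i-th instruction in program order
    completions  -- (index, result) pairs in the order execution actually finished
    """
    slots = [None] * len(dests)        # one storage space per instruction
    for index, result in completions:  # results arrive in any order
        slots[index] = result

    registers = {}
    for index, dest in enumerate(dests):  # write back strictly in program order
        registers[dest] = slots[index]
    return registers


# Instructions 0 and 2 both write r1; instruction 2 finishes first.
dests = ["r1", "r2", "r1"]
completions = [(2, 30), (0, 10), (1, 20)]
regs = commit_in_order(dests, completions)
```

Although the last instruction finished first, `r1` ends up holding its result, because write-back follows instruction order rather than completion order.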
All of the optional technical solutions above can be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
Fig. 2 is a flowchart of an instruction processing method provided in an embodiment of the present invention. Referring to Fig. 2, the method includes:
Step 201: Fetch multiple to-be-executed instructions from the instruction cache according to the instruction order.
In this embodiment of the present invention, when the processor processes instructions, it can fetch multiple to-be-executed instructions from the instruction cache according to the instruction order, where the instruction order is the order in which the application runs the instructions.
For example, as shown in Table 1 below, the instruction cache contains 7 instructions: an ADD instruction, a SUB instruction, an AND instruction, an XOR instruction, an LSL instruction, an LSR instruction, and a LOAD instruction, and the application runs these 7 instructions in the order ADD, SUB, AND, XOR, LSL, LSR, LOAD. The processor therefore fetches, according to the instruction order, the to-be-executed instructions ADD, SUB, AND, XOR, LSL, LSR, and LOAD from the instruction cache.
Table 1
In this embodiment of the present invention, the processor can fetch the to-be-executed instructions from the instruction cache one by one according to the instruction order; this is not specifically limited in the embodiments of the present invention.
Optionally, the processor may also fetch the to-be-executed instructions from the instruction cache without following the instruction order, that is, out of order; this is not specifically limited in the embodiments of the present invention.
Step 202: Determine at least one executable instruction from the multiple to-be-executed instructions.
For any one of the multiple to-be-executed instructions, the processor can judge whether the operands of that instruction are ready. If the operands are ready, the instruction is determined to be an executable instruction and can be executed immediately, without waiting for the instructions before it to complete. If the operands are not ready, the instruction is determined to be a non-executable instruction and can be executed only once its operands are ready. The operands of a to-be-executed instruction are the data stored in its source registers.
Alternatively, for any one of the multiple to-be-executed instructions, the processor can judge whether the source registers of that instruction are valid. If the source registers are valid, the instruction is determined to be an executable instruction and can be executed immediately, without waiting for the instructions before it to complete. If the source registers are invalid, the instruction is determined to be a non-executable instruction and can be executed only once its source registers are valid.
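The readiness test in step 202 can be sketched as a filter over the to-be-executed instructions. The tuple representation, the register names, and the set of valid registers below are assumptions introduced for the example.

```python
def executable_instructions(pending, valid_regs):
    """Return the names of instructions whose source registers are all valid.

    pending     -- list of (name, source_registers) in program order
    valid_regs  -- set of registers that currently hold valid data
    """
    return [name for name, srcs in pending if all(s in valid_regs for s in srcs)]


pending = [
    ("ADD", ("r1", "r2")),
    ("SUB", ("r3", "r9")),  # r9 is not valid yet, so SUB must wait
    ("AND", ()),            # no register operands: trivially ready
]
valid = {"r1", "r2", "r3"}
ready_names = executable_instructions(pending, valid)
```

In this example `ADD` and `AND` are executable, while `SUB` stays pending until `r9` becomes valid.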
Wherein, for the execution time advance for instructing long delay, it that is to say, preferential executive chairman's delay instruction, improve the process performance of processor, can for it is the plurality of it is instructions to be performed in it is each it is instructions to be performed distribute a memory space respectively, each memory space is respectively used to each implementing result instructions to be performed of storage.Correspondingly, when often obtaining an implementing result instructions to be performed, the implementing result instructions to be performed can be write in corresponding memory space.And then be set to effectively, that is to say by the corresponding memory space instructions to be performed, the destination register instructions to be performed is set to effectively.
Wherein, for it is the plurality of it is instructions to be performed in it is each it is instructions to be performed respectively distribute a memory space when, the processor can for it is the plurality of it is instructions to be performed in it is each it is instructions to be performed respectively distribute one storage numbering, the storage numbering be the corresponding numbering of memory space.And because the plurality of memory space includes a pointer, the initial position of the pointer points to first memory space, therefore, during to the plurality of one storage numbering of distribution respectively instructions to be performed, the storage numbering that the pointer can be pointed to is distributed to the plurality of instructions to be performed, and a storage numbering is often distributed, the pointer moves down one, so as to obtain the plurality of instructions to be performed Storage numbering.It that is to say, in embodiments of the present invention, the plurality of one storage of distribution respectively instructions to be performed can be numbered according to instruction sequences.
For example, the initial position of the pointer points to the first storage space, so number 1 of the first storage space is allocated to the ADD instruction. The pointer then moves down by one, that is, it points to the second storage space, so number 2 of the second storage space is allocated to the SUB instruction. Similarly, number 3 of the third storage space is allocated to the AND instruction, number 4 of the fourth storage space to the XOR instruction, number 5 of the fifth storage space to the LSL instruction, number 6 of the sixth storage space to the LSR instruction, and number 7 of the seventh storage space to the LOAD instruction.
It should be noted that the storage number may be a ROB (Re-Order Buffer) number and each storage space may be an entry in the ROB; the embodiments of the present invention do not specifically limit this.
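The pointer-based allocation of storage numbers can be illustrated with a small model. This is a sketch under stated assumptions: the class name `ReorderBuffer`, its fields, the buffer size of 8, and the error raised when the buffer is full are hypothetical; only the 1-based numbering and the instruction sequence follow the example above.

```python
class ReorderBuffer:
    """Toy model of the storage spaces; numbering is 1-based as in the example."""

    def __init__(self, size):
        self.results = [None] * size   # one slot per storage space
        self.valid = [False] * size    # becomes True once a result is written
        self.pointer = 0               # next free slot (0-based index)

    def allocate(self):
        if self.pointer >= len(self.results):
            raise RuntimeError("no free storage space")  # assumed behaviour
        number = self.pointer + 1      # storage number given to the instruction
        self.pointer += 1              # the pointer moves down by one
        return number

    def write(self, number, result):
        self.results[number - 1] = result
        self.valid[number - 1] = True  # storage space (destination) is now valid

rob = ReorderBuffer(size=8)
program = ["ADD", "SUB", "AND", "XOR", "LSL", "LSR", "LOAD"]
numbers = {name: rob.allocate() for name in program}
print(numbers)
```

Because numbers are handed out in fetch order, the storage number also records the program order that is needed later when results are written back.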
Step 203: Determine the key instruction among the at least one executable instruction. A key instruction is a long delay instruction or an instruction on a key instruction chain. A key instruction chain comprises N orders of instructions, where the i-th order instruction among the N orders is the producer instruction of the (i+1)-th order instruction, meaning that the destination register of the i-th order instruction is a source register of the (i+1)-th order instruction; the N-th order instruction on the key instruction chain is a long delay instruction; N is a positive integer greater than 1; and i is a positive integer greater than 0 and less than N.
It should be understood that a given order on a key instruction chain may contain one key instruction or more than one. For example, if an addition instruction is a key instruction and has N source registers, the addition instruction may have N producer instructions, so the order preceding the addition instruction contains N key instructions.
Specifically, for any executable instruction among the at least one executable instruction, it is judged whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long delay instruction. When the key instruction flag of the executable instruction is valid, or the executable instruction is a long delay instruction, the executable instruction is determined to be a key instruction. The key instructions among the at least one executable instruction are determined in this way.
In the embodiments of the present invention, each time a to-be-executed instruction is executed, if that instruction has a producer instruction, the instruction and its producer instruction form an instruction chain. If the instruction is a key instruction and its producer instruction has not been marked as a key instruction, the key instruction flag of the producer instruction is set to valid; that is, after a key instruction is executed, its producer instruction may be marked as a key instruction. If the key instruction flag of the key instruction is invalid and the key instruction is a long delay instruction, the key instruction flag of the long delay instruction is set to valid; that is, the long delay instruction is marked as a key instruction. Afterwards, when the next to-be-executed instruction is executed, if its producer instruction is the last-order instruction on the instruction chain, the next to-be-executed instruction is appended to the end of the instruction chain, and so on until all of the to-be-executed instructions have been traversed. In addition, when an instruction chain includes a long delay instruction, the instruction chain is referred to as a key instruction chain.
It should be noted that the to-be-executed instructions on the same instruction chain stand in a producer/consumer relationship. That is, when the destination register of the i-th order instruction on the chain is a source register of the (i+1)-th order instruction, the i-th order instruction is called the producer instruction of the (i+1)-th order instruction, and the (i+1)-th order instruction is called the consumer instruction of the i-th order instruction.
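The way key instruction flags propagate backwards from a long delay instruction to its producers can be sketched as below. The sketch makes several assumptions that are not in the embodiments: the four-instruction program and its register names are invented, the producer of a source register is taken to be the nearest earlier instruction that writes it, and instructions are simply walked in program order rather than in their actual execution order.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    name: str
    srcs: tuple
    dst: str
    long_delay: bool = False
    key_flag: bool = False   # key instruction flag; True means valid

def producers_of(i, program):
    """Indices of the nearest earlier instructions that write each source of program[i]."""
    found = []
    for src in program[i].srcs:
        for j in range(i - 1, -1, -1):
            if program[j].dst == src:
                found.append(j)
                break
    return found

def mark_after_execute(i, program):
    instr = program[i]
    if instr.long_delay and not instr.key_flag:
        instr.key_flag = True              # mark the long delay instruction itself
    if instr.key_flag:
        for j in producers_of(i, program):
            if not program[j].key_flag:
                program[j].key_flag = True  # mark each unmarked producer

def run_pass(program):
    for i in range(len(program)):
        mark_after_execute(i, program)
    return [ins.key_flag for ins in program]

program = [
    Instr("ADD", ("r1", "r2"), "r3"),
    Instr("SUB", ("r3", "r4"), "r5"),
    Instr("LOAD", ("r5",), "r6", long_delay=True),
    Instr("AND", ("r1", "r2"), "r7"),
]
flags_first = run_pass(program)    # first time this code is executed
flags_second = run_pass(program)   # same code executed again with the stored flags
print(flags_first, flags_second)
```

In this model the first pass marks LOAD and its direct producer SUB, and the second pass extends the marking one order further to ADD, while the independent AND instruction is never marked. This is one way to read why persisting the flags between fetches is useful.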
Further, the key instruction flag identifies whether a to-be-executed instruction is a key instruction or a non-key instruction. That is, when the key instruction flag of a to-be-executed instruction is valid, the instruction can be determined to be a key instruction, and when the flag is invalid, the instruction can be determined to be a non-key instruction. Setting the key instruction flag to valid may be done by setting the flag to a first value, and when the key instruction flag is invalid it may be set to a second value. The first value and the second value may be set in advance; for example, the first value may be 1 and the second value may be 0. The embodiments of the present invention do not specifically limit this.
In addition, each time the key instruction flag of a to-be-executed instruction is set to valid, the flag may be stored in the instruction buffer. In this way, the next time the to-be-executed instruction is taken out of the instruction buffer, it can be recognized directly as a key instruction, which improves the efficiency of instruction processing.
In the embodiments of the present invention, whether an executable instruction is a long delay instruction may be judged in either of the following two ways:
First way: decode the executable instruction to obtain its instruction type, and judge whether the stored key type library includes the instruction type of the executable instruction. If the key type library includes the instruction type, the executable instruction is determined to be a long delay instruction. If the key type library does not include the instruction type, the executable instruction is determined not to be a long delay instruction.
For example, the executable instruction is a LOAD instruction. Decoding it yields the instruction type "memory access instruction". The instruction type can then be compared with the instruction types contained in the key type library shown in Table 2 below. Since the key type library includes this instruction type, the executable instruction is determined to be a long delay instruction.
Table 2
Instruction type
Division instruction
Memory access instruction
Transcendental function instruction
Encryption/decryption instruction
……
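The first way amounts to a decode step followed by a membership test against the key type library. A minimal sketch follows, assuming a hypothetical `DECODE` table that maps opcodes to instruction types; the type names are illustrative labels for the rows of Table 2, and real decoding is left to the related art.

```python
# Key type library modelled as a set; entries mirror the rows of Table 2.
KEY_TYPES = {"divide", "memory_access", "transcendental", "crypto"}

# Hypothetical decode table: opcode -> instruction type.
DECODE = {
    "LOAD": "memory_access",
    "DIV": "divide",
    "ADD": "add",
    "XOR": "xor",
}

def is_long_delay_by_type(opcode, key_types=KEY_TYPES):
    instr_type = DECODE[opcode]        # decode to obtain the instruction type
    return instr_type in key_types     # long delay if the library lists the type

print(is_long_delay_by_type("LOAD"))
print(is_long_delay_by_type("ADD"))
```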
Optionally, in the embodiments of the present invention, the key type library may simply contain instruction types, and whether an executable instruction is a long delay instruction is judged by the above method. In practice, other approaches may also be used. For example, the processor decodes the executable instruction to obtain its instruction type. Then, based on the instruction type, it obtains the corresponding flag value from the stored key type library. If the flag value obtained is the first value, the executable instruction is determined to be a long delay instruction; if the flag value obtained is the second value, the executable instruction is determined not to be a long delay instruction. Here the key type library may contain the correspondence between instruction types and flag values. Alternatively, the instruction type corresponding to each flag value in the key type library may be agreed in advance, in which case the key type library may contain only flag values. The embodiments of the present invention do not specifically limit this.
It should be noted that, for the process by which the processor decodes an executable instruction, reference may be made to the related art; it is not described in detail in the embodiments of the present invention.
For ease of description, the embodiments of the present invention are illustrated using the case in which the key type library contains the correspondence between instruction types and flag values. For example, the first value is 1 and the executable instruction is a LOAD instruction. Decoding the executable instruction yields the instruction type "memory access instruction". Based on this instruction type, the corresponding flag value obtained from the stored correspondence between instruction types and flag values shown in Table 3 below is 1, so the executable instruction is determined to be a long delay instruction.
Table 3
Instruction type                        Flag value
Addition instruction                    0
Subtraction instruction                 0
Logical AND instruction                 0
XOR instruction                         0
Shift instruction                       0
Left shift instruction                  0
Memory access instruction               1
Division instruction                    1
Encryption/decryption instruction       0
Transcendental function instruction     1
……                                      ……
Second way: judge whether the stored key address library includes the instruction address of the executable instruction. If the key address library includes the instruction address, the executable instruction is determined to be a long delay instruction. If the key address library does not include the instruction address, the executable instruction is determined not to be a long delay instruction.
For example, the executable instruction is a LOAD instruction whose instruction address is 0x1f00_0340. The processor compares the instruction address of the LOAD instruction with the key address library shown in Table 4 below and determines that the key address library includes that address; the executable instruction is therefore determined to be a long delay instruction.
Table 4
Instruction address
0x1f00_0000
0x1f00_0200
0x1f00_0280
0x1f00_0340
……
It should be noted that the instruction address of an instruction is the storage address of that instruction; the embodiments of the present invention do not specifically limit this. In addition, the key type library and the key address library are configured in advance and may also be modified dynamically; the embodiments of the present invention do not specifically limit this.
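The second way can be sketched as a lookup in a set of addresses. The addresses below are those of Table 4; the function name and the use of a mutable set to model dynamic modification of the library are assumptions made for illustration.

```python
# Key address library populated with the addresses listed in Table 4.
key_addresses = {0x1F00_0000, 0x1F00_0200, 0x1F00_0280, 0x1F00_0340}

def is_long_delay_by_address(address, library=key_addresses):
    """Long delay if the instruction's storage address is in the library."""
    return address in library

hit = is_long_delay_by_address(0x1F00_0340)          # address of the LOAD example
miss_before = is_long_delay_by_address(0x1F00_0344)  # hypothetical unlisted address

# The library may be modified dynamically, e.g. by adding a new address.
key_addresses.add(0x1F00_0344)
hit_after = is_long_delay_by_address(0x1F00_0344)
print(hit, miss_before, hit_after)
```

Unlike the type-based check, this lookup needs no decoding, since only the address of the instruction is consulted.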
Step 204: Preferentially execute the key instruction among the at least one executable instruction.
When the processor preferentially executes the key instructions among the at least one executable instruction and reaches a long delay instruction, it may execute independent instructions while the long delay instruction is being executed, so that the plurality of instructions are executed and their execution results are obtained. The independent instructions are the instructions among the plurality of instructions other than those included in the key instruction chain on which the long delay instruction is located.
Because the key instructions may include long delay instructions, the key instructions among the at least one executable instruction are executed preferentially when the plurality of instructions are executed. This ensures that enough independent instructions are available for execution while a long delay instruction is being executed, which reduces the waiting time of the processor and thereby improves its processing performance.
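One simple way to model preferential execution is to order the currently executable instructions so that key instructions come first and program order is otherwise preserved. The dictionary layout and the `issue_order` name are hypothetical, and an actual processor would perform this selection in its issue logic rather than by sorting.

```python
def issue_order(executable):
    """Key instructions first; Python's sort is stable, so ties keep program order."""
    return sorted(executable, key=lambda instr: 0 if instr["key"] else 1)

# Executable instructions in program order; LOAD is a key instruction.
ready = [
    {"name": "ADD", "key": False},
    {"name": "LOAD", "key": True},
    {"name": "AND", "key": False},
]
order = [instr["name"] for instr in issue_order(ready)]
print(order)
```

With LOAD issued first, ADD and AND are the independent instructions that can fill the time while the load is outstanding.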
Further, in the embodiments of the present invention, each time the execution result of a to-be-executed instruction is obtained, the result may be written into the corresponding storage space.
For example, when the execution result of the AND instruction is obtained as 01110110, the result may be stored in the third storage space in the ROB; when the execution result of the LSL instruction is obtained as 11101100, the result may be stored in the fifth storage space in the ROB. Similarly, the execution result of the LOAD instruction is stored in the seventh storage space in the ROB, that of the ADD instruction in the first storage space, that of the SUB instruction in the second storage space, that of the XOR instruction in the fourth storage space, and that of the LSR instruction in the sixth storage space.
Step 205: Based on the destination register numbers of the plurality of instructions, write the execution results of the plurality of instructions into the corresponding destination registers according to the instruction order.
Because the execution result of each instruction is stored in a storage space included in the processor, after the plurality of to-be-executed instructions have finished executing, the processor may, according to the instruction order of the to-be-executed instructions, write the execution result in the storage space corresponding to each to-be-executed instruction into the destination register corresponding to that instruction, one after another. Specifically, the processor may obtain the execution result of each to-be-executed instruction from the corresponding storage space and, based on the destination register numbers of the to-be-executed instructions, write the obtained execution results into the corresponding destination registers in turn.
It should be noted that, when the processor writes the execution results of the to-be-executed instructions into the corresponding destination registers according to the instruction order, if the current to-be-executed instruction has not yet obtained its execution result, the results of later instructions cannot be written to their destination registers ahead of it, even if those later instructions have already obtained their results. Only after the current to-be-executed instruction has obtained its execution result and that result has been written into the corresponding destination register can the execution results of the instructions after it be committed to their destination registers.
In addition, after the processor has written the execution result of a to-be-executed instruction into the corresponding destination register, it may release the storage number corresponding to that instruction and release the corresponding storage space.
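The in-order write-back rule can be sketched with a head index that advances only past entries whose results are available. The entry layout, register names, and result values are assumptions for the example, and releasing storage numbers is not modelled.

```python
def commit_in_order(rob, head, registers):
    """Write results to destination registers in program order, stopping at the
    first entry whose result is not yet available. Returns the new head index."""
    while head < len(rob) and rob[head]["done"]:
        entry = rob[head]
        registers[entry["dst"]] = entry["result"]
        head += 1
    return head

# Three entries in program order; the second has not finished executing.
rob = [
    {"dst": "r3", "result": 5, "done": True},
    {"dst": "r5", "result": None, "done": False},
    {"dst": "r7", "result": 9, "done": True},
]
registers = {}
head = commit_in_order(rob, 0, registers)
after_first = dict(registers)          # only r3 has been written so far

rob[1].update(result=2, done=True)     # the second instruction completes
head = commit_in_order(rob, head, registers)
print(after_first, registers, head)
```

The third entry finished early but is held back until the second entry completes, which matches the rule that a later result cannot be written across an unfinished earlier instruction.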
In summary, the instruction processing procedure can be represented by the process shown in Fig. 3. That is, the processor first takes multiple to-be-executed instructions out of the instruction buffer. It then decodes the to-be-executed instructions and, in the decode stage, identifies the long delay instructions among them based on configuration information, the configuration information including the key type library or the key address library. Next, it renames each to-be-executed instruction, that is, allocates a storage space to each to-be-executed instruction, and then begins executing the to-be-executed instructions and stores their execution results in the corresponding storage spaces. Afterwards, the processor may also store each key instruction flag in the instruction buffer, which facilitates the next fetch and improves instruction processing efficiency.
In the embodiments of the present invention, multiple to-be-executed instructions can be taken out of the instruction buffer according to the instruction order, and at least one executable instruction can be obtained from them. When the at least one executable instruction is executed, the key instructions among them can be executed preferentially. Because the key instructions include long delay instructions, preferentially executing the key instructions brings forward the execution time of the long delay instructions and ensures that enough independent instructions are available for execution while a long delay instruction is being executed. This reduces the waiting time of the processor and thereby improves its processing performance.
An embodiment of the present invention provides a computer-readable medium including computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the instruction processing method described above.
Referring to Fig. 4, an embodiment of the present invention provides an instruction processing device, the device including: a processor 401, a memory 402, a bus 403, and a communication interface 404.
The memory 402 is used to store computer-executable instructions 4021, and the processor 401 is connected to the memory 402 through the bus 403. When the data storage device runs, the processor 401 executes the computer-executable instructions stored in the memory 402, so that the instruction processing device performs the instruction processing method described above.
Fig. 5 is a schematic structural diagram of an instruction processing device provided by an embodiment of the present invention. Referring to Fig. 5, the device includes:
A determining module 501, configured to determine the key instruction among at least one executable instruction, where a key instruction is a long delay instruction or an instruction on a key instruction chain, the key instruction chain comprises N orders of instructions, the i-th order instruction among the N orders is the producer instruction of the (i+1)-th order instruction, meaning that the destination register of the i-th order instruction is a source register of the (i+1)-th order instruction, the N-th order instruction on the key instruction chain is a long delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N;
An execution module 502, configured to preferentially execute the key instruction among the at least one executable instruction.
Optionally, referring to Fig. 6, the device further includes:
An instruction fetch module 503, configured to take multiple to-be-executed instructions out of the instruction buffer;
The determining module 501 is further configured to determine at least one executable instruction from the multiple to-be-executed instructions.
Optionally, the determining module 501 being configured to determine the key instruction among the at least one executable instruction includes:
For any executable instruction among the at least one executable instruction, the determining module 501 is configured to judge whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long delay instruction; and, when the key instruction flag of the executable instruction is valid or the executable instruction is a long delay instruction, is further configured to determine that the executable instruction is a key instruction.
Optionally, the device further includes:
A first setting module, configured to set the key instruction flag of the producer instruction of the key instruction to valid when the key instruction has a producer instruction and the producer instruction has not been marked as a key instruction.
Optionally, the device further includes:
A second setting module, configured to set the key instruction flag of the long delay instruction to valid when the key instruction flag of the key instruction is invalid and the key instruction is a long delay instruction.
Optionally, the device further includes:
A first writing module, configured to write the key instruction flag into the instruction buffer each time the key instruction flag of an instruction is set to valid.
Optionally, the determining module 501 being configured to judge whether the executable instruction is a long delay instruction includes:
The determining module 501 is configured to decode the executable instruction to obtain the instruction type of the executable instruction;
And to judge whether the stored key type library includes the instruction type of the executable instruction;
And, if the key type library includes the instruction type of the executable instruction, is further configured to determine that the executable instruction is a long delay instruction.
Optionally, the determining module 501 being configured to judge whether the executable instruction is a long delay instruction includes:
The determining module 501 is configured to judge whether the stored key address library includes the instruction address of the executable instruction;
And, if the key address library includes the instruction address of the executable instruction, to determine that the executable instruction is a long delay instruction.
Optionally, the device further includes:
An allocation module, configured to allocate a storage space to each of the multiple to-be-executed instructions, each storage space being used to store the execution result of the corresponding to-be-executed instruction;
Accordingly, the device further includes:
A second writing module, configured to write the execution result of a to-be-executed instruction into the corresponding storage space each time such an execution result is obtained; and, after the multiple to-be-executed instructions have finished executing, to write the execution result in the storage space corresponding to each to-be-executed instruction into the destination register corresponding to that instruction, one after another, according to the instruction order of the multiple to-be-executed instructions.
In the embodiments of the present invention, the key instructions among at least one executable instruction are determined, and when the at least one executable instruction is executed, the key instructions among them can be executed preferentially. Because the key instructions include long delay instructions, preferentially executing the key instructions brings forward the execution time of the long delay instructions and ensures that enough independent instructions are available for execution while a long delay instruction is being executed. This reduces the waiting time of the processor and thereby improves its processing performance.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (20)

  1. An instruction processing method, characterized in that the method comprises:
    determining a key instruction among at least one executable instruction, wherein the key instruction is a long delay instruction or an instruction on a key instruction chain, the key instruction chain comprises N orders of instructions, the i-th order instruction among the N orders of instructions is a producer instruction of the (i+1)-th order instruction, the producer instruction meaning that the destination register of the i-th order instruction is a source register of the (i+1)-th order instruction, the N-th order instruction on the key instruction chain is a long delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N; and
    preferentially executing the key instruction among the at least one executable instruction.
  2. The method according to claim 1, characterized in that, before the determining a key instruction among at least one executable instruction, the method further comprises:
    taking multiple to-be-executed instructions out of an instruction buffer; and
    determining the at least one executable instruction from the multiple to-be-executed instructions.
  3. The method according to claim 1, characterized in that the determining a key instruction among at least one executable instruction comprises:
    for any executable instruction among the at least one executable instruction, judging whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long delay instruction; and
    when the key instruction flag of the executable instruction is valid or the executable instruction is a long delay instruction, determining that the executable instruction is a key instruction.
  4. The method according to any one of claims 1-3, characterized in that the method further comprises:
    when the key instruction has a producer instruction and the producer instruction of the key instruction has not been marked as a key instruction, setting the key instruction flag of the producer instruction of the key instruction to valid.
  5. The method according to any one of claims 1-4, characterized in that the method further comprises:
    when the key instruction flag of the key instruction is invalid and the key instruction is a long delay instruction, setting the key instruction flag of the long delay instruction to valid.
  6. The method according to claim 4 or 5, characterized in that the method further comprises:
    each time the key instruction flag of an instruction is set to valid, writing the key instruction flag into the instruction buffer.
  7. The method according to claim 3, characterized in that the judging whether the executable instruction is a long delay instruction comprises:
    decoding the executable instruction to obtain the instruction type of the executable instruction;
    judging whether a stored key type library includes the instruction type of the executable instruction; and
    if the key type library includes the instruction type of the executable instruction, determining that the executable instruction is a long delay instruction.
  8. The method according to claim 3, characterized in that the judging whether the executable instruction is a long delay instruction comprises:
    judging whether a stored key address library includes the instruction address of the executable instruction; and
    if the key address library includes the instruction address of the executable instruction, determining that the executable instruction is a long delay instruction.
  9. The method according to any one of claims 1-8, characterized in that, after the taking multiple to-be-executed instructions out of the instruction buffer, the method further comprises:
    allocating a storage space to each of the multiple to-be-executed instructions, each storage space being used to store the execution result of the corresponding to-be-executed instruction;
    accordingly, the method further comprises:
    each time the execution result of a to-be-executed instruction is obtained, writing the execution result into the corresponding storage space; and
    after the multiple to-be-executed instructions have finished executing, writing, according to the instruction order of the multiple to-be-executed instructions, the execution result in the storage space corresponding to each to-be-executed instruction into the destination register corresponding to that instruction, one after another.
  10. A computer-readable medium, characterized by comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the method according to any one of claims 1-9.
  11. An instruction processing device, characterized in that the device comprises: a processor, a memory, a bus, and a communication interface;
    the memory is used to store computer-executable instructions, the processor is connected to the memory through the bus, and when the data storage device runs, the processor executes the computer-executable instructions stored in the memory, so that the instruction processing device performs the method according to any one of claims 1-9.
  12. An instruction processing device, characterized in that the device comprises:
    a determining module, configured to determine a key instruction among at least one executable instruction, wherein the key instruction is a long delay instruction or an instruction on a key instruction chain, the key instruction chain comprises N orders of instructions, the i-th order instruction among the N orders of instructions is a producer instruction of the (i+1)-th order instruction, the producer instruction meaning that the destination register of the i-th order instruction is a source register of the (i+1)-th order instruction, the N-th order instruction on the key instruction chain is a long delay instruction, N is a positive integer greater than 1, and i is a positive integer greater than 0 and less than N; and
    an execution module, configured to preferentially execute the key instruction among the at least one executable instruction.
  13. The device according to claim 12, characterized in that the device further comprises an instruction fetch module,
    the instruction fetch module being configured to take multiple to-be-executed instructions out of an instruction buffer; and
    the determining module being further configured to determine the at least one executable instruction from the multiple to-be-executed instructions.
  14. The device according to claim 12, characterized in that the determining module being configured to determine a key instruction among at least one executable instruction comprises:
    for any executable instruction among the at least one executable instruction, the determining module is configured to judge whether the key instruction flag of the executable instruction is valid or whether the executable instruction is a long delay instruction; and, when the key instruction flag of the executable instruction is valid or the executable instruction is a long delay instruction, is further configured to determine that the executable instruction is a key instruction.
  15. The device according to any one of claims 12-14, characterized in that the device further comprises:
    a first setting module, configured to, when the key instruction has a producer instruction and that producer instruction has not been marked as a key instruction, set the key instruction mark of the producer instruction of the key instruction to valid.
  16. The device according to any one of claims 12-15, characterized in that the device further comprises:
    a second setting module, configured to, when the key instruction mark of the key instruction is invalid and the key instruction is a long-delay instruction, set the key instruction mark of the long-delay instruction to valid.
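The two mark-setting rules of claims 15 and 16 can be sketched together. This is an illustration under stated assumptions, not the patented circuit: each instruction is given at most one `producer` link, and `update_marks` is a hypothetical helper called on an instruction that has already been determined to be key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    name: str
    long_delay: bool = False
    key_mark: bool = False                # True means the mark is valid
    producer: Optional["Instr"] = None    # simplification: one producer only


def update_marks(key_ins):
    """Apply both setting rules to an instruction already found to be key."""
    # Claim 16: a long-delay key instruction whose mark is invalid gets marked.
    if not key_ins.key_mark and key_ins.long_delay:
        key_ins.key_mark = True
    # Claim 15: an unmarked producer of a key instruction gets marked.
    prod = key_ins.producer
    if prod is not None and not prod.key_mark:
        prod.key_mark = True


# Chain a -> b -> load, where load is the long-delay instruction.
a = Instr("a")
b = Instr("b", producer=a)
load = Instr("load", long_delay=True, producer=b)

update_marks(load)                        # first encounter of the chain
after_first = (a.key_mark, b.key_mark, load.key_mark)
update_marks(b)                           # b is now key through its mark
after_second = (a.key_mark, b.key_mark, load.key_mark)
print(after_first, after_second)
```

In this sketch the marks move one level up the chain each time a key instruction is processed, so a chain of N levels becomes fully marked only after it has been encountered several times.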
  17. The device according to claim 15 or 16, characterized in that the device further comprises:
    a first writing module, configured to, each time the key instruction mark of an instruction is set to valid, write the mark of that key instruction into the instruction buffer.
  18. The device according to claim 14, characterized in that the determining module being configured to judge whether the executable instruction is a long-delay instruction comprises:
    the determining module is configured to decode the executable instruction to obtain the instruction type of the executable instruction;
    to judge whether a stored key type library contains the instruction type of the executable instruction;
    and, if the key type library contains the instruction type of the executable instruction, to determine that the executable instruction is a long-delay instruction.
  19. The device according to claim 14, characterized in that the determining module being configured to judge whether the executable instruction is a long-delay instruction comprises:
    the determining module is configured to judge whether a stored key address library contains the instruction address of the executable instruction;
    and, if the key address library contains the instruction address of the executable instruction, to determine that the executable instruction is a long-delay instruction.
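Claims 18 and 19 describe two alternative lookups for the long-delay test: one by decoded instruction type, one by instruction address. The sketch below models both libraries as Python sets. The library contents, the text-based `decode` function and the example addresses are all invented for illustration; the patent does not specify them.

```python
# Hypothetical library contents, chosen only for this example.
KEY_TYPES = {"load", "div"}
KEY_ADDRESSES = {0x4008}


def decode(raw):
    """Toy decoder: treat the first token of the text as the instruction type."""
    return raw.split()[0]


def is_long_delay_by_type(raw):
    """Claim 18: decode, then look the type up in the key type library."""
    return decode(raw) in KEY_TYPES


def is_long_delay_by_address(address):
    """Claim 19: look the instruction address up in the key address library."""
    return address in KEY_ADDRESSES


by_type = [is_long_delay_by_type(s) for s in ("load r1, 0(r2)", "add r3, r1, r2")]
by_addr = [is_long_delay_by_address(a) for a in (0x4008, 0x400C)]
print(by_type, by_addr)
```

The address-based variant needs no decoding, at the cost of having to populate the address library beforehand.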
  20. The device according to any one of claims 12-19, characterized in that the device further comprises:
    an allocating module, configured to allocate one storage space to each of the multiple to-be-executed instructions, each storage space being used to store the execution result of the corresponding to-be-executed instruction;
    correspondingly, the device further comprises:
    a second writing module, configured to, each time the execution result of a to-be-executed instruction is obtained, write that execution result into the corresponding storage space; and, after the multiple to-be-executed instructions have all finished executing, to write, following the instruction order of the multiple to-be-executed instructions, the execution result held in the storage space corresponding to each to-be-executed instruction into the destination register corresponding to that instruction.
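Claim 20 separates out-of-order result production from in-order register write-back. A minimal sketch of that idea, assuming each instruction is reduced to a `(destination register, result value)` pair and that `exec_order` (a hypothetical parameter) lists the indices in the order their results become available:

```python
def run_out_of_order(program, exec_order):
    """Buffer results per instruction, then write registers in program order.

    program:    list of (dest_reg, value) pairs in program order.
    exec_order: indices into program, in the order the results are obtained.
    """
    slots = [None] * len(program)      # one storage space per instruction
    for idx in exec_order:
        _, value = program[idx]
        slots[idx] = value             # each result goes to its own slot

    regs, writes = {}, []
    for idx, (dest, _) in enumerate(program):  # write-back in program order
        regs[dest] = slots[idx]
        writes.append(dest)
    return regs, writes


# Two instructions write r1; the later one happens to execute first.
program = [("r1", 10), ("r2", 20), ("r1", 30)]
regs, writes = run_out_of_order(program, exec_order=[2, 0, 1])
print(regs, writes)
```

Although the third instruction executes first, `r1` ends up holding its value because the registers are written in program order, which is the property the per-instruction storage spaces are meant to preserve.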
CN201580001167.2A 2015-06-19 2015-06-19 Command processing method and equipment Active CN106537331B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/081954 WO2016201699A1 (en) 2015-06-19 2015-06-19 Instruction processing method and device

Publications (2)

Publication Number Publication Date
CN106537331A true CN106537331A (en) 2017-03-22
CN106537331B CN106537331B (en) 2019-07-09

Family

ID=57544746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580001167.2A Active CN106537331B (en) 2015-06-19 2015-06-19 Command processing method and equipment

Country Status (2)

Country Link
CN (1) CN106537331B (en)
WO (1) WO2016201699A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112214241A (en) * 2020-09-23 2021-01-12 上海赛昉科技有限公司 Method and system for distributed instruction execution unit
CN114461278A (en) * 2022-04-13 2022-05-10 海光信息技术股份有限公司 Method for operating instruction scheduling queue, operating device and electronic device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5475823A (en) * 1992-03-25 1995-12-12 Hewlett-Packard Company Memory processor that prevents errors when load instructions are moved in the execution sequence
US6385715B1 (en) * 1996-11-13 2002-05-07 Intel Corporation Multi-threading for a processor utilizing a replay queue
CN1504873A (en) * 2002-12-05 2004-06-16 International Business Machines Corp Multithreading recycle and dispatch system and method thereof
CN1550978A (en) * 2003-05-08 2004-12-01 International Business Machines Corp Method and system for implementing queue instruction in multi-threaded microprocessor
CN1804792A (en) * 2004-12-16 2006-07-19 英特尔公司 Technology of permitting storage transmitting during long wait-time instruction execution
CN101263452A (en) * 2004-03-31 2008-09-10 太阳微系统公司 Method and structure for explicit software control of execution of a thread including a helper subthread
US20100169611A1 (en) * 2008-12-30 2010-07-01 Chou Yuan C Branch misprediction recovery mechanism for microprocessors
CN102799419A (en) * 2012-09-05 2012-11-28 无锡江南计算技术研究所 Register writing conflict detection method and device, and processor
US20150046684A1 (en) * 2013-08-07 2015-02-12 Nvidia Corporation Technique for grouping instructions into independent strands

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7552318B2 (en) * 2004-12-17 2009-06-23 International Business Machines Corporation Branch lookahead prefetch for microprocessors

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112214241A (en) * 2020-09-23 2021-01-12 上海赛昉科技有限公司 Method and system for distributed instruction execution unit
CN112214241B (en) * 2020-09-23 2023-11-24 上海赛昉科技有限公司 Method and system for distributed instruction execution unit
CN114461278A (en) * 2022-04-13 2022-05-10 海光信息技术股份有限公司 Method for operating instruction scheduling queue, operating device and electronic device
CN114461278B (en) * 2022-04-13 2022-06-21 海光信息技术股份有限公司 Method for operating instruction scheduling queue, operating device and electronic device

Also Published As

Publication number Publication date
CN106537331B (en) 2019-07-09
WO2016201699A1 (en) 2016-12-22

Similar Documents

Publication Publication Date Title
US10552163B2 (en) Method and apparatus for efficient scheduling for asymmetrical execution units
KR102123633B1 (en) Matrix computing device and method
KR100292300B1 (en) System and method for register renaming
US20150286504A1 (en) Scheduling and execution of tasks
WO2016140756A1 (en) Register renaming in multi-core block-based instruction set architecture
JP2004234642A (en) Layout of integrated structure for instruction execution unit
CN109408214A (en) A kind of method for parallel processing of data, device, electronic equipment and readable medium
US9959123B2 (en) Speculative load data in byte-write capable register file and history buffer for a multi-slice microprocessor
US9626281B2 (en) Call stack display with program flow indication
US20200042320A1 (en) Parallel dispatching of multi-operation instructions in a multi-slice computer processor
US20160110201A1 (en) Flexible instruction execution in a processor pipeline
US20140047221A1 (en) Fusing flag-producing and flag-consuming instructions in instruction processing circuits, and related processor systems, methods, and computer-readable media
CN106537331A (en) Instruction processing method and device
US10282207B2 (en) Multi-slice processor issue of a dependent instruction in an issue queue based on issue of a producer instruction
US11042502B2 (en) Vector processing core shared by a plurality of scalar processing cores for scheduling and executing vector instructions
US20200278865A1 (en) Hazard Mitigation for Lightweight Processor Cores
CN110515656B (en) CASP instruction execution method, microprocessor and computer equipment
CN114237705A (en) Verification method, verification device, electronic equipment and computer-readable storage medium
US10956159B2 (en) Method and processor for implementing an instruction including encoding a stopbit in the instruction to indicate whether the instruction is executable in parallel with a current instruction, and recording medium therefor
KR20230124598A (en) Compressed Command Packets for High Throughput and Low Overhead Kernel Initiation
CN103077069B (en) The method and device that instruction resolves
CN102722341A (en) Device for controlling speculative execution of storing and loading unit
US10754817B2 (en) Information processing apparatus and information processing method for process order in reconfigurable circuit
US20180232205A1 (en) Apparatus and method for recursive processing
US20160162379A1 (en) Sata receiver equalization margin determination/setting method and apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant