US10853081B2 - Processor and pipelining method - Google Patents

Processor and pipelining method

Info

Publication number
US10853081B2
Authority
US
United States
Prior art keywords
instruction
thread
branch
subsequent
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/201,296
Other versions
US20190163494A1 (en)
Inventor
Kazuhiro Mima
Hitomi SHISHIDO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanken Electric Co Ltd
Original Assignee
Sanken Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sanken Electric Co Ltd filed Critical Sanken Electric Co Ltd
Assigned to SANKEN ELECTRIC CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MIMA, KAZUHIRO; SHISHIDO, HITOMI
Publication of US20190163494A1
Application granted
Publication of US10853081B2
Legal status: Active (current)
Expiration: adjusted

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F 9/3867 Concurrent instruction execution using instruction pipelines
    • G06F 9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F 9/3851 Instruction issuing from multiple instruction streams, e.g. multistreaming
    • G06F 9/3802 Instruction prefetching
    • G06F 9/3804 Instruction prefetching for branches, e.g. hedging, branch folding

Definitions

  • the disclosure relates to a processor and a pipelining method that perform pipelining and execute a branch instruction.
  • a processor performs pipelining to improve processing speed.
  • Pipelining is processing to fetch, decode, and execute instructions in such a manner that at the same time that a fetched instruction A starts to be decoded, an instruction B starts to be fetched.
  • a plurality of instructions are processed concurrently.
  • the plurality of instructions are stored in an instruction memory constructed separately from the processor, and are linked to given addresses.
  • FIG. 5 illustrates a block diagram of multithreading.
  • the multithreading illustrated in FIG. 5 involves a plurality of threads TH0 to TH2 linked to a plurality of instructions' addresses, and each of the threads TH0 to TH2 has its own program counter and general purpose registers.
  • An instruction of a thread having an execution right is fetched (FE), decoded (DE), and executed.
  • FIG. 6 illustrates how a conventional processor that performs pipelining drops an instruction of a thread when there is a thread switch involved.
  • in FIG. 6, FE denotes fetch; DE, decode; EX, execute; and WB1 and WB2, register write-back of data.
  • FIG. 6 illustrates an example of pipelining where, at the same time that a fetched instruction C1 starts to be decoded, a branch instruction JMP starts to be fetched.
  • the example illustrated in FIG. 6 has threads TH1 and TH2, and involves a thread switch.
  • when the branch instruction JMP of the thread TH2 is executed, an instruction C2 of the thread TH1 and an instruction Ci of the thread TH2, which are subsequent to the branch instruction in the pipeline, are dropped (indicated by “x” and “drop” in FIG. 6).
  • as a conventional technique, Japanese Patent Application Publication No. 2011-100454 (Patent Literature 1) describes a processor that uses a branch mis-prediction buffer.
  • this processor stores the instruction that was not predicted by the branch prediction in order to avoid the penalty incurred by a branch prediction failure; as a result, the processor's processing speed can be improved.
  • a processor may be provided that performs pipelining which processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction.
  • the processor includes: a pipeline processor including a fetch part that fetches the instruction of the thread having an execution right, and a computation execution part that executes the instruction fetched by the fetch part; and a branch controller that determines whether to drop an instruction subsequent to the branch instruction within the pipeline processor based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
  • the branch controller may not drop, and may continue to execute, an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
  • the branch controller may drop an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
  • a pipelining method for processing a plurality of threads and executing instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction.
  • the method may include: performing pipelining that fetches the instruction of the thread having an execution right and executes the instruction fetched; and performing branch control to determine whether to drop an instruction subsequent to the branch instruction within the pipelining based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
  • the branch control may not drop, and may continue to execute, an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
  • the branch control may drop an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
  • FIG. 1 is a block diagram illustrating the configuration of a processor according to one or more embodiments
  • FIG. 2 is a diagram illustrating how the processor according to Embodiment 1 performing pipelining drops an instruction of a thread when there is a thread switch involved;
  • FIG. 3 is a diagram illustrating how the processor according to Embodiment 1 performing pipelining drops an instruction of a thread when there is no thread switch involved;
  • FIG. 4 is a block diagram illustrating pipelining by the processor according to one or more embodiments
  • FIG. 5 is a block diagram illustrating multithreading
  • FIG. 6 is a diagram illustrating how a processor of a related art drops an instruction of a thread when there is a thread switch involved.
  • FIG. 1 is a diagram illustrating the configuration of a processor according to Embodiment 1.
  • the processor according to Embodiment 1 performs pipelining that processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to the thread numbers of the threads and including a branch instruction.
  • the pipelining includes instruction fetch (F) 1 , decode (D)/execute (E) 2 , and write-back (W) 3 .
  • F: instruction fetch; D: decode; E: execute; W: write-back.
  • the instruction fetch (F) 1 is related to a thread number (TH) 11 , a program counter (PC) 12 , an instruction memory 13 , and a branch controller 14 .
  • the thread number (TH) 11 is a thread number of a fetched instruction.
  • the program counter (PC) 12 holds the address on the instruction memory 13 of the instruction currently being processed.
  • the instruction memory 13 stores a sequence of instructions (C 1 to C 3 ) belonging to each thread (e.g., TH 1 ) corresponding to the address of the program counter (PC) 12 .
  • the branch controller 14 compares a thread number (TH) 11 and a thread number received from decode (D)/execute (E) 2 , and outputs a comparison signal to an instruction decoder 21 . Characteristic processing performed by the branch controller 14 will be described later.
  • the decode (D)/execute (E) 2 corresponds to the computation executer of the invention, and includes the instruction decoder 21 , a register reader 22 , a computation controller 23 , the branch processor 24 , an adder 25 , and a data retriever 26 .
  • the instruction decoder 21 decodes an instruction stored in a register 15 , and outputs the decode result to the computation controller 23 .
  • the register reader 22 reads a flag set to 0 or 1 and stored in a flag register (not shown), and outputs the flag to the branch processor 24 via the computation controller 23 .
  • the computation controller 23 executes an instruction decoded by the instruction decoder 21 , and outputs the execution result to a register writer 31 and the branch processor 24 .
  • the branch processor 24 handles the branch instruction as follows: branch processing is performed if the flag is set to 1 indicating that the branch instruction is “taken”; and branch processing is not performed if the flag is set to 0 indicating that the branch instruction is “not taken”.
  • the branch controller 14 checks whether the thread number TH_F of an instruction fetched by the instruction fetch (F) 1 is the same as the thread number TH_E of an instruction executed by the decode (D)/execute (E) 2 and thereby determines whether to drop the subsequent instruction.
  • the former thread number is the thread number (TH) 11
  • the latter thread number is the thread number of a branch instruction executed by the branch processor 24 .
  • the branch controller 14 determines whether to drop the subsequent instruction based on the thread number of the thread where a branch instruction is executed and the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
  • the branch controller 14 does not drop the subsequent instruction and continues execution of the subsequent instruction if the thread number of the thread where a branch instruction is executed is different from the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
  • the branch controller 14 drops the subsequent instruction if the thread number of the thread where a branch instruction is executed is the same as the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
  • the adder 25 changes the address in the register 15 based on the address from the branch processor 24 , and outputs the change result to the program counter (PC) 12 .
  • the data retriever 26 retrieves data on an instruction decoded by the instruction decoder 21 , and outputs the data to the register writer 31 .
  • the write-back (W) 3 has the register writer 31 that writes data into a register.
  • an operation of the processor of Embodiment 1 thus configured and a pipelining method are described in detail with reference to the drawings.
  • EX in FIGS. 2 and 4 may correspond to the computation executer according to one or more embodiments.
  • first, a branch instruction (JMP) of a thread TH2 is fetched (FE), decoded (DE), and executed (EX).
  • when the branch instruction (JMP) of the thread TH2 is executed (EX), the branch processor 24 performs branch processing if the flag is set to 1 indicating that the branch instruction is “taken”.
  • in this example, the thread number TH2 of the thread where the branch instruction is executed is different from the thread number TH1 of the instruction C2 subsequent to the branch instruction within the pipeline processor.
  • the branch controller 14 therefore enables the Valid signal for each stage of pipelining, thereby allowing the subsequent instruction C2 to continue executing instead of being dropped.
  • by contrast, because the thread number TH2 of the thread where the branch instruction is executed is the same as the thread number TH2 of the subsequent instruction Ci, the branch controller 14 disables the Valid signal for each stage of pipelining, thereby causing the subsequent instruction Ci to be dropped.
  • first, a branch instruction (JMP) of a thread TH1 is fetched (FE), decoded (DE), and executed (EX).
  • when the branch instruction (JMP) of the thread TH1 is executed (EX), the branch processor 24 performs branch processing if the flag is set to 1 indicating that the branch instruction is “taken”. In this event, the branch controller 14 drops the instruction C2 subsequent to the branch instruction within the pipeline processor because the thread number TH1 of the thread where the branch instruction is executed is the same as the thread number TH1 of the subsequent instruction C2.
  • next, the branch controller 14 drops the instruction C3 subsequent to the branch instruction within the pipeline processor because the thread number TH1 of the thread where the branch instruction is executed is the same as the thread number TH1 of the subsequent instruction C3.
  • the branch controller determines whether to drop the subsequent instruction based on the thread number of the thread where the branch instruction is executed and on the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
  • thus, the instruction subsequent to the branch instruction can be kept from being dropped and can continue to be executed, based on the thread number of the subsequent instruction. This helps avoid time loss and improves processing speed.
  • the embodiment described above can provide a processor and a pipelining method capable of helping avoid time loss and improving processing speed.

Abstract

A processor is disclosed that performs pipelining which processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction. The processor may include a pipeline processor, which includes a fetch part that fetches the instruction of the thread having an execution right, and a computation execution part that executes the instruction fetched by the fetch part. The processor may include a branch controller that determines whether to drop an instruction subsequent to the branch instruction within the pipeline processor based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.

Description

CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority based on 35 USC 119 from prior Japanese Patent Application No. 2017-229010 filed on Nov. 29, 2017, entitled “PROCESSOR AND PIPELINING METHOD”, the entire contents of which are incorporated herein by reference.
BACKGROUND
The disclosure relates to a processor and a pipelining method that perform pipelining and execute a branch instruction.
A processor performs pipelining to improve processing speed. Pipelining is processing to fetch, decode, and execute instructions in such a manner that at the same time that a fetched instruction A starts to be decoded, an instruction B starts to be fetched.
In other words, in pipelining, a plurality of instructions are processed concurrently. The plurality of instructions are stored in an instruction memory constructed separately from the processor, and are linked to given addresses.
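As a rough illustration of this overlap (a sketch for explanation only, not part of the patent), the snippet below steps three instructions A, B, and C through a three-stage fetch/decode/execute pipeline and prints which instruction occupies which stage in each cycle; the stage names FE/DE/EX follow the figures.
```python
# Illustrative sketch only (not the patent's hardware): three instructions flow
# through a fetch (FE) / decode (DE) / execute (EX) pipeline, one stage per
# cycle, so in cycle 1 instruction A is decoded while instruction B is fetched.
instructions = ["A", "B", "C"]
stages = ["FE", "DE", "EX"]

for cycle in range(len(instructions) + len(stages) - 1):
    active = []
    for stage_idx, stage in enumerate(stages):
        instr_idx = cycle - stage_idx          # which instruction sits in this stage
        if 0 <= instr_idx < len(instructions):
            active.append(f"{stage}:{instructions[instr_idx]}")
    print(f"cycle {cycle}:", "  ".join(active))
```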
FIG. 5 illustrates a block diagram of multithreading. The multithreading illustrated in FIG. 5 involves a plurality of threads TH0 to TH2 linked to a plurality of instructions' addresses, and each of the threads TH0 to TH2 has its own program counter and general purpose registers. An instruction of a thread having an execution right is fetched (FE), decoded (DE), and executed.
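A minimal data-structure sketch of this arrangement (names such as ThreadContext are assumptions for illustration, not taken from the patent) gives each thread its own program counter and general-purpose registers:
```python
# Minimal sketch (assumed names, not from the patent): each thread TH0..TH2
# keeps its own program counter and general-purpose registers, so granting the
# execution right to another thread needs no save/restore of shared state.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ThreadContext:
    thread_id: int                                              # TH0, TH1, TH2, ...
    pc: int = 0                                                 # per-thread program counter
    regs: List[int] = field(default_factory=lambda: [0] * 16)   # per-thread registers

threads = [ThreadContext(thread_id=n) for n in range(3)]        # TH0 to TH2
```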
FIG. 6 illustrates how a conventional processor that performs pipelining drops an instruction of a thread when there is a thread switch involved. In FIG. 6, FE denotes fetch; DE, decode; EX, execute; and WB1 and WB2, register write-back of data. FIG. 6 illustrates an example of pipelining where at the same time that a fetched instruction C1 starts to be decoded, a branch instruction JMP starts to be fetched.
The example illustrated in FIG. 6 has threads TH1 and TH2, and involves a thread switch. When the branch instruction JMP of the thread TH2 is executed, an instruction C2 of the thread TH1 and an instruction Ci of the thread TH2, which are subsequent to the branch instruction in the pipeline, are dropped (indicated by “x” and “drop” in FIG. 6).
As a conventional technique, there is known a processor that uses a branch mis-prediction buffer, which is described in Japanese Patent Application Publication No. 2011-100454 (Patent Literature 1). This processor executes pipelining on instructions stored in an instruction memory, and upon detection of a branch instruction, predicts the outcome of the branch instruction, and requests subsequent instructions from the instruction memory based on the predicted outcome.
This processor stores an instruction which was not predicted by the branch prediction, in order to avoid a penalty incurred by a branch prediction failure. As a result, the processor's processing speed can be improved.
SUMMARY
In accordance with one or more embodiments, a processor may be provided that performs pipelining which processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction. The processor includes: a pipeline processor including a fetch part that fetches the instruction of the thread having an execution right, and a computation execution part that executes the instruction fetched by the fetch part; and a branch controller that determines whether to drop an instruction subsequent to the branch instruction within the pipeline processor based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
Further in accordance with one or more embodiments, in the processor the branch controller may not drop, and may continue to execute, an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
Further in accordance with one or more embodiments, in the processor the branch controller may drop an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
In accordance with one or more embodiments, a pipelining method is provided for processing a plurality of threads and executing instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction. The method may include: performing pipelining that fetches the instruction of the thread having an execution right and executes the instruction fetched; and performing branch control to determine whether to drop an instruction subsequent to the branch instruction within the pipelining based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
Further in accordance with one or more embodiments, in the method the branch control may not drop, and may continue to execute, an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
Further in accordance with one or more embodiments, in the method the branch control may drop an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
Those skilled in the art will recognize additional features and advantages upon reading the following detailed description, and upon viewing the accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
The accompanying drawings, which are incorporated herein and constitute part of this specification, illustrate exemplary embodiments of the invention, and together with the general description given above and the detailed description given below, serve to explain the features of the invention.
FIG. 1 is a block diagram illustrating the configuration of a processor according to one or more embodiments;
FIG. 2 is a diagram illustrating how the processor according to Embodiment 1 performing pipelining drops an instruction of a thread when there is a thread switch involved;
FIG. 3 is a diagram illustrating how the processor according to Embodiment 1 performing pipelining drops an instruction of a thread when there is no thread switch involved;
FIG. 4 is a block diagram illustrating pipelining by the processor according to one or more embodiments;
FIG. 5 is a block diagram illustrating multithreading; and
FIG. 6 is a diagram illustrating how a processor of a related art drops an instruction of a thread when there is a thread switch involved.
DETAILED DESCRIPTION
Embodiments are described with reference to drawings, in which the same constituents are designated by the same reference numerals and duplicate explanation concerning the same constituents may be omitted for brevity and ease of explanation. The drawings are illustrative and exemplary in nature and provided to facilitate understanding of the illustrated embodiments and may not be exhaustive or limiting. Dimensions or proportions in the drawings are not intended to impose restrictions on the disclosed embodiments. For this reason, specific dimensions and the like should be interpreted with the accompanying descriptions taken into consideration. In addition, the drawings include parts whose dimensional relationship and ratios are different from one drawing to another.
In pipelining, a plurality of instructions are processed concurrently. Although the following embodiment describes multithreading for processing instructions belonging to multiple threads, the present invention is also applicable to approaches other than multithreading.
First Embodiment
FIG. 1 is a diagram illustrating the configuration of a processor according to Embodiment 1. The processor according to Embodiment 1 performs pipelining that processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to the thread numbers of the threads and including a branch instruction. The pipelining includes instruction fetch (F) 1, decode (D)/execute (E) 2, and write-back (W) 3. For example, at the same time that a fetched instruction A starts to be decoded, an instruction B starts to be fetched.
The instruction fetch (F) 1 is related to a thread number (TH) 11, a program counter (PC) 12, an instruction memory 13, and a branch controller 14. The thread number (TH) 11 is a thread number of a fetched instruction. The program counter (PC) 12 holds the address on the instruction memory 13 of the instruction currently being processed.
The instruction memory 13 stores a sequence of instructions (C1 to C3) belonging to each thread (e.g., TH1) corresponding to the address of the program counter (PC) 12.
The branch controller 14 compares a thread number (TH) 11 and a thread number received from decode (D)/execute (E) 2, and outputs a comparison signal to an instruction decoder 21. Characteristic processing performed by the branch controller 14 will be described later.
The decode (D)/execute (E) 2 corresponds to the computation executer of the invention, and includes the instruction decoder 21, a register reader 22, a computation controller 23, the branch processor 24, an adder 25, and a data retriever 26.
The instruction decoder 21 decodes an instruction stored in a register 15, and outputs the decode result to the computation controller 23. The register reader 22 reads a flag set to 0 or 1 and stored in a flag register (not shown), and outputs the flag to the branch processor 24 via the computation controller 23.
The computation controller 23 executes an instruction decoded by the instruction decoder 21, and outputs the execution result to a register writer 31 and the branch processor 24.
When an instruction decoded by the instruction decoder 21 is a branch instruction, the branch processor 24 handles the branch instruction as follows: branch processing is performed if the flag is set to 1 indicating that the branch instruction is “taken”; and branch processing is not performed if the flag is set to 0 indicating that the branch instruction is “not taken”.
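As a sketch of this decision (the function and its name below are hypothetical, not the patent's circuitry), branch processing amounts to redirecting the program counter only when the decoded instruction is a branch and the flag read from the flag register is 1:
```python
# Hypothetical sketch of the branch processor 24's decision: redirect the PC to
# the branch target only for a branch whose flag is 1 ("taken"); otherwise the
# PC simply advances to the next sequential instruction ("not taken").
def next_pc(is_branch: bool, flag: int, branch_target: int, sequential_pc: int) -> int:
    if is_branch and flag == 1:      # branch "taken": perform branch processing
        return branch_target
    return sequential_pc             # "not taken" or not a branch: fall through
```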
When the branch processor 24 performs branch processing, the branch controller 14 checks whether the thread number TH_F of an instruction fetched by the instruction fetch (F) 1 is the same as the thread number TH_E of an instruction executed by the decode (D)/execute (E) 2 and thereby determines whether to drop the subsequent instruction. The former thread number is the thread number (TH) 11, and the latter thread number is the thread number of a branch instruction executed by the branch processor 24.
Specifically, the branch controller 14 determines whether to drop the subsequent instruction based on the thread number of the thread where a branch instruction is executed and the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
The branch controller 14 does not drop the subsequent instruction and continues execution of the subsequent instruction if the thread number of the thread where a branch instruction is executed is different from the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
The branch controller 14 drops the subsequent instruction if the thread number of the thread where a branch instruction is executed is the same as the thread number of the instruction subsequent to the branch instruction within the pipeline processor.
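A compact way to express this check (a sketch with an assumed function name, not the patent's logic circuit) is a comparison of the two thread numbers, TH_F of the fetched subsequent instruction and TH_E of the executing branch:
```python
# Sketch of the branch controller 14's rule (function name assumed): a younger
# instruction keeps its Valid signal, and thus keeps executing, only when it
# belongs to a different thread than the branch being executed; an instruction
# of the same thread has its Valid signal disabled and is dropped.
def keep_subsequent_instruction(th_f: int, th_e: int) -> bool:
    """TH_F: thread number of the fetched (subsequent) instruction.
    TH_E: thread number of the thread executing the branch instruction."""
    return th_f != th_e              # True -> keep (Valid enabled); False -> drop
```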
The adder 25 changes the address in the register 15 based on the address from the branch processor 24, and outputs the change result to the program counter (PC) 12.
The data retriever 26 retrieves data on an instruction decoded by the instruction decoder 21, and outputs the data to the register writer 31.
The write-back (W) 3 has the register writer 31 that writes data into a register.
Next, an operation of the processor of Embodiment 1 thus configured and a pipelining method are described in detail with reference to the drawings. Note that “EX” in FIGS. 2 and 4 may correspond to the computation executer according to one or more embodiments.
First, with reference to FIGS. 2 and 4, a description is given of how the processor performing pipelining drops an instruction of a thread when there is a thread switch involved.
First, a branch instruction (JMP) of a thread TH2 is fetched (FE) to be decoded (DE) and executed (EX). Next, at the same time that the branch instruction of the thread TH2 is decoded (DE), an instruction C2 of a thread TH1 is fetched (FE) to be decoded (DE) and executed (EX).
When the branch instruction (JMP) of the thread TH2 is executed (EX), the branch processor 24 performs branch processing if the flag is set to 1 indicating that the branch instruction is “taken”. In this example, the thread number TH2 of the thread where the branch instruction is executed is different from the thread number TH1 of the instruction C2 subsequent to the branch instruction within the pipeline processor. Thus, the branch controller 14 enables the Valid signal for each stage of pipelining, thereby allowing the subsequent instruction C2 to continue executing instead of being dropped.
As a result, no time loss is incurred by flushing an instruction of a different thread from the pipeline, which would require that instruction to be fetched again. This helps avoid time loss and improves processing speed. The five-stage pipeline configuration illustrated in FIGS. 2 and 4 is different from the three-stage pipeline configuration illustrated in FIG. 1; nonetheless, the invention is applicable no matter how many pipeline stages there are.
Then, when the branch instruction of the thread TH2 is executed (EX), the thread number TH2 of the thread where the branch instruction is executed is the same as the thread number TH2 of an instruction Ci subsequent to the branch instruction within the pipeline processor. Thus, the branch controller 14 disables the Valid signal for each stage of pipelining, thereby allowing the subsequent instruction Ci to be dropped.
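The FIG. 2 outcome can be reproduced with the same thread-number comparison (the values below are purely illustrative):
```python
# Worked example mirroring FIG. 2 (illustrative values): a taken branch executes
# in thread TH2 while TH1's C2 and TH2's Ci are younger instructions in flight.
branch_thread = 2                            # TH_E: thread of the executing JMP
in_flight = [("C2", 1), ("Ci", 2)]           # (instruction, TH_F) behind the branch

for name, th_f in in_flight:
    action = "keep" if th_f != branch_thread else "drop"
    print(f"{name} (TH{th_f}): {action}")    # prints: C2 (TH1): keep, Ci (TH2): drop
```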
Next, with reference to FIG. 3, a description is given of how the processor drops an instruction when there is no thread switch involved.
First, a branch instruction (JMP) of a thread TH1 is fetched (FE) to be decoded (DE) and executed (EX). Next, at the same time that the branch instruction of the thread TH1 is decoded (DE), an instruction C2 of the thread TH1 is fetched (FE) to be decoded (DE) and executed (EX).
When the branch instruction (JMP) from the thread TH1 is executed (EX), the branch processor 24 performs branch processing if the flag is set to 1 indicating that the branch instruction is “taken”. In this event, the branch controller 14 drops an instruction C2 subsequent to the branch instruction within the pipeline processor because the thread number TH1 of the thread where the branch instruction is executed is the same as the thread number TH1 of the subsequent instruction C2.
Next, the branch controller 14 drops an instruction C3 subsequent to the branch instruction within the pipeline processor because the thread number TH1 of the thread where the branch instruction is executed is the same as the thread number TH1 of the subsequent instruction C3.
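Running the same check for the FIG. 3 case, where no thread switch is involved, drops both younger instructions (again, illustrative values only):
```python
# Worked example mirroring FIG. 3 (illustrative values): the branch executes in
# TH1 and the younger instructions C2 and C3 also belong to TH1, so both drop.
branch_thread = 1
for name, th_f in [("C2", 1), ("C3", 1)]:
    print(f"{name} (TH{th_f}):", "keep" if th_f != branch_thread else "drop")
```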
Note that the determination of whether to drop the subsequent instruction may be made using the address in the program counter PC.
According to the processor and pipelining method of the embodiment described above, the branch controller determines whether to drop the subsequent instruction based on the thread number of the thread where the branch instruction is executed and on the thread number of the instruction subsequent to the branch instruction within the pipeline processor. Thus, the instruction subsequent to the branch instruction can be kept from being dropped and can continue to be executed, based on the thread number of the subsequent instruction. This helps avoid time loss and improves processing speed.
In conventional techniques illustrated in FIG. 6 and Patent Literature 1, instructions in the pipeline are dropped upon detection of a branch instruction. Thereafter, in order to execute the instructions thus dropped at the same time as the execution of the branch instruction, the dropped instructions have to be retrieved again (re-fetch of the instruction C2 from the thread TH1 in FIG. 6).
As a result, time loss is incurred by the drop of the instruction, hindering improvement in the processor's processing speed. In this regard, the embodiment described above can provide a processor and a pipelining method capable of helping avoid time loss and improving processing speed.
The invention includes other embodiments in addition to the above-described embodiments without departing from the spirit of the invention. The embodiments are to be considered in all respects as illustrative, and not restrictive. The scope of the invention is indicated by the appended claims rather than by the foregoing description. Hence, all configurations including the meaning and range within equivalent arrangements of the claims are intended to be embraced in the invention.

Claims (6)

The invention claimed is:
1. A processor that performs pipelining which processes a plurality of threads and executes instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction, the processor comprising:
a pipeline processor comprising
a fetch part that fetches the instruction of the thread having an execution right, and
a computation execution part that executes the instruction fetched by the fetch part; and
a branch controller that determines whether to drop an instruction subsequent to the branch instruction within the pipeline processor based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
2. The processor according to claim 1, wherein
the branch controller does not drop and continues to execute an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
3. The processor according to claim 1, wherein
the branch controller drops an instruction subsequent to the branch instruction within the pipeline processor when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
4. A pipelining method of processing a plurality of threads and executing instructions in concurrent processing, the instructions corresponding to thread numbers of the threads and including a branch instruction, the method comprising:
performing pipelining that fetches the instruction of the thread having an execution right and executes the instruction fetched; and
performing branch control to determine whether to drop an instruction subsequent to the branch instruction within the pipelining based on the thread number of the thread where the branch instruction is executed and on the thread number of the subsequent instruction.
5. The method according to claim 4, wherein
the branch control does not drop and continues to execute an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed is different from the thread number of the subsequent instruction.
6. The method according to claim 4, wherein
the branch control drops an instruction subsequent to the branch instruction within the pipelining when the thread number of the thread where the branch instruction is executed and the thread number of the subsequent instruction are the same.
US16/201,296 2017-11-29 2018-11-27 Processor and pipelining method Active 2039-02-06 US10853081B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017229010A JP2019101543A (en) 2017-11-29 2017-11-29 Processor and pipeline processing method
JP2017-229010 2017-11-29

Publications (2)

Publication Number Publication Date
US20190163494A1 US20190163494A1 (en) 2019-05-30
US10853081B2 true US10853081B2 (en) 2020-12-01

Family

ID=66632343

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/201,296 Active 2039-02-06 US10853081B2 (en) 2017-11-29 2018-11-27 Processor and pipelining method

Country Status (2)

Country Link
US (1) US10853081B2 (en)
JP (1) JP2019101543A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114258516A (en) 2019-09-12 2022-03-29 三垦电气株式会社 Processor and event processing method
CN114253821B (en) * 2022-03-01 2022-05-27 西安芯瞳半导体技术有限公司 Method and device for analyzing GPU performance and computer storage medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04373023A (en) 1991-06-21 1992-12-25 Fujitsu Ltd Data processor
JP3641327B2 (en) 1996-10-18 2005-04-20 株式会社ルネサステクノロジ Data processor and data processing system
US6256728B1 (en) * 1997-11-17 2001-07-03 Advanced Micro Devices, Inc. Processor configured to selectively cancel instructions from its pipeline responsive to a predicted-taken short forward branch instruction
US6694425B1 (en) * 2000-05-04 2004-02-17 International Business Machines Corporation Selective flush of shared and other pipeline stages in a multithread processor
US7013383B2 (en) * 2003-06-24 2006-03-14 Via-Cyrix, Inc. Apparatus and method for managing a processor pipeline in response to exceptions
US7254700B2 (en) * 2005-02-11 2007-08-07 International Business Machines Corporation Fencing off instruction buffer until re-circulation of rejected preceding and branch instructions to avoid mispredict flush
JP2008299729A (en) 2007-06-01 2008-12-11 Digital Electronics Corp Processor
US20090106495A1 (en) * 2007-10-23 2009-04-23 Sun Microsystems, Inc. Fast inter-strand data communication for processors with write-through l1 caches
US9489207B2 (en) * 2009-04-14 2016-11-08 International Business Machines Corporation Processor and method for partially flushing a dispatched instruction group including a mispredicted branch
JP2011100454A (en) 2009-11-04 2011-05-19 Ceva Dsp Ltd System and method for using branch mis-prediction buffer
US20120290820A1 (en) * 2011-05-13 2012-11-15 Oracle International Corporation Suppression of control transfer instructions on incorrect speculative execution paths
US20170075692A1 (en) * 2015-09-11 2017-03-16 Qualcomm Incorporated Selective flushing of instructions in an instruction pipeline in a processor back to an execution-resolved target address, in response to a precise interrupt
US20180095766A1 (en) * 2016-10-05 2018-04-05 Centipede Semi Ltd. Flushing in a parallelized processor

Also Published As

Publication number Publication date
JP2019101543A (en) 2019-06-24
US20190163494A1 (en) 2019-05-30

Legal Events

Date Code Title Description
AS Assignment

Owner name: SANKEN ELECTRIC CO., LTD., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MIMA, KAZUHIRO;SHISHIDO, HITOMI;REEL/FRAME:047593/0518

Effective date: 20181122

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCF Information on status: patent grant

Free format text: PATENTED CASE