WO2008038373A1

WO2008038373A1 - Processor for increasing the branching prediction speed

Info

Publication number: WO2008038373A1
Application number: PCT/JP2006/319339
Authority: WO
Inventors: Shinichiro Tago
Original assignee: Fujitsu Limited
Priority date: 2006-09-28
Filing date: 2006-09-28
Publication date: 2008-04-03

Abstract

In a processor that prefetches instructions by branch prediction, branch prediction information including a tag indicating an instruction address and task identification information is written to a storage area used for branch prediction.

Description

Specification

Processor that speeds up branch prediction

Technical field

The present invention relates to a processor that enables high-speed branch prediction by suppressing the frequency of BTB (Branch Target Buffer) clearing and the frequency of BTB entry caused by task switching.

Background art

[0002] Branch instructions are a factor that degrades instruction supply performance in a processor that performs pipeline processing. A processor normally executes instructions at consecutive addresses in sequence. However, when a branch instruction is executed and the branch is taken, the branch destination address must be calculated according to the procedure specified by the branch instruction, and execution must continue from that address.

[0003] A processor that performs pipeline processing prefetches instructions at consecutive addresses so that the pipeline runs smoothly. If the branch destination address is calculated and the instruction at that address is read only after the branch has been determined, the prefetched instructions that will not be executed remain in the pipeline, and the supply of the branch destination instruction to be executed next is delayed.

[0004] It is well known to use branch prediction and a BTB (Branch Target Buffer) to avoid this problem. Branch prediction is a technique that predicts the order of instruction execution and prefetches instructions accordingly, preventing the prefetching of unnecessary instructions and reducing pipeline disruption. The BTB is a device that holds the address of a branch destination instruction, calculated when the branch was previously executed, in association with the address of the branch instruction, so that when a branch is predicted the branch destination address can be supplied and instruction prefetching can start earlier.

[0005] When a processor operates with dynamic address translation, a plurality of physical addresses may be assigned to one logical address. In particular, in a multitask system, multiple tasks share a logical address space, for example when multiple tasks A, B, ... in the main memory 1 are mapped into the virtual memory space 2 as shown in FIG. 1.

[0006] Normally, the BTB is accessed with logical addresses. For this reason, if the address input to the BTB lies in a shared space, the data read from the BTB may be information on a branch instruction of another task. In that case, the instruction stored in the BTB as the predicted branch destination may be prefetched even though the fetched instruction is not a branch instruction. Conventionally, to avoid erroneously reading a branch instruction of another task, methods such as clearing the BTB every time the task is switched, or clearing the BTB once a BTB read error has occurred, have been used (for example, Patent Document 1). Patent Document 1: JP-A-8-249181
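The aliasing problem and the conventional remedy described above can be illustrated with a minimal sketch. The class `AddressOnlyBTB` and its methods are hypothetical names chosen for this illustration, and the dictionary stands in for the BTB storage; none of this is taken from the patent itself.

```python
class AddressOnlyBTB:
    """Toy BTB keyed only by logical instruction address (no task ID)."""

    def __init__(self):
        # logical branch address -> branch destination address
        self.entries = {}

    def register(self, branch_addr, target_addr):
        self.entries[branch_addr] = target_addr

    def lookup(self, fetch_addr):
        # Returns the stored branch destination, or None on a miss.
        return self.entries.get(fetch_addr)

    def clear(self):
        self.entries.clear()


btb = AddressOnlyBTB()
btb.register(0, 30)          # task A: branch at logical address 0 jumps to 30

# Task B runs next and fetches from the same logical address 0.
aliased = btb.lookup(0)      # task A's entry is returned to task B

# Conventional remedy: clear the whole BTB on the task switch.
btb.clear()
after_clear = btb.lookup(0)  # task A's prediction is gone when task A resumes
```

Without a task identifier in the entry, the lookup cannot tell the two tasks apart, and clearing discards predictions that would still be correct for the task that registered them.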

Disclosure of the invention

Problems to be solved by the invention

[0007] However, in the conventional methods described above, when task switching occurs frequently in a multitask system, the BTB is cleared more often, and branch prediction itself cannot be performed for much of the time.

[0008] The present invention has been made in view of this problem, and an object of the present invention is to provide a processor capable of high-speed branch prediction by suppressing the frequency of BTB clearing and the frequency of BTB entry caused by task switching.

Means for solving the problem

In order to solve the above problems, the present invention provides a processor that prefetches instructions by branch prediction, the processor comprising branch prediction information writing means for writing branch prediction information, including a tag indicating an instruction address and task identification information, to a storage area used for branch prediction.

The invention's effect

[0010] According to the present invention, high-speed branch prediction is enabled by suppressing the frequency of BTB clearing and the frequency of entry to the BTB due to task switching.

Brief Description of Drawings

FIG. 1 is a diagram illustrating an example of a multitask system.

FIG. 2 is a block diagram of a processor according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a first configuration example of branch prediction information entered in the BTB.

FIG. 4 is a diagram showing a second configuration example of the branch prediction information entered in the BTB.

FIG. 5 is a flowchart for explaining a read operation from a BTB.

FIG. 6 is a flowchart for explaining the operation of updating the BTB.

Explanation of symbols

11 Instruction cache controller

12 Instruction fetch controller

13 Instruction decoder

14 Instruction execution controller

15 Arithmetic execution unit

16 Load store execution unit

17 Branch execution unit

18 Branch prediction unit

19 BTB (Branch Target Buffer)

20 Current task ID register

100 Processor

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The present invention is realized in a multitask system as shown in FIG. 1.

FIG. 2 is a block diagram of a processor according to an embodiment of the present invention. In FIG. 2, the processor 100 includes an instruction cache controller 11, an instruction fetch controller 12, an instruction decoder 13, an instruction execution controller 14, an operation execution unit 15, a load store execution unit 16, a branch execution unit 17, a branch prediction unit 18, and a BTB (Branch Target Buffer) 19. A memory device such as a RAM (Random Access Memory) is used as the BTB 19.

The instruction cache controller 11 reads an instruction from the cache memory in response to an instruction fetch request from the instruction fetch controller 12. The instruction fetch controller 12 obtains an instruction from the instruction cache controller 11 by issuing an instruction fetch request with an address to the instruction cache controller 11. At the same time, the instruction fetch controller 12 issues a read request for the branch prediction information entered in the BTB 19, and if the information read is the branch prediction information corresponding to the fetched instruction (BTB hit), the instruction fetch controller 12 requests a branch prediction for the instruction from the branch prediction unit 18.

The instruction fetch controller 12 adds the processing result related to branch prediction to the instruction acquired from the instruction cache controller 11 and sends the result to the instruction decoder 13. The instruction decoder 13 decodes the instruction and sends it to the instruction execution controller 14.

The instruction execution controller 14 sends the instruction decoded by the instruction decoder 13 to any one of the operation execution unit 15, the load store execution unit 16, or the branch execution unit 17 according to the type of instruction.

[0019] The branch execution unit 17 executes a branch instruction and determines whether or not to branch. If the branch execution unit 17 determines to branch, branch prediction information is entered into the BTB 19, and the instruction fetch controller 12 fetches the instruction at the branch destination address. On the other hand, if the branch execution unit 17 determines not to branch, the instruction fetch controller 12 fetches the instruction at the sequential next address without branching.

The configuration of the branch prediction information entered in the BTB 19 will be described with reference to FIGS. 3 and 4. FIG. 3 is a diagram illustrating a first configuration example of the branch prediction information entered in the BTB. In FIG. 3, when a branch instruction is executed for the first time, branch prediction information 5 is registered in the BTB 19. Tag comparison data 6a is generated from the current task ID, stored in the register 20 by the OS (Operating System), and the address tag based on the address of the branch instruction.

[0021] Branch prediction information 5 including a Valid flag having a value "1", tag comparison data 6a based on the current task ID and address tag, and a branch destination address is registered in the BTB 19. The value of the Valid flag of branch prediction information 5 registered in BTB19 is then updated as necessary.

[0022] When the branch prediction information 5 is read from the BTB 19, tag comparison data 6b is generated from the current task ID stored in the register 20 and the address tag based on the address of the branch instruction fetched from the cache. If the Valid flag of the branch prediction information 5 read from the BTB 19 is "1" and the data consisting of the task ID and address tag of the branch prediction information 5 matches the tag comparison data 6b, a BTB hit is determined.

[0023] In this embodiment, if the data consisting of the task ID and address tag of the branch prediction information 5 does not match the tag comparison data 6b, the mismatch is not judged to be a cache miss; the Valid flag is left at "1" and the branch prediction information 5 remains in the BTB 19. The branch prediction information 5 can therefore be used again after a later task switch.

[0024] That is, in FIG. 1, the branch prediction information 5 stored in the BTB 19 by task A becomes available again when execution switches from task A to task B and then back to task A. Processing efficiency improves because the branch prediction information does not have to be entered again.
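As a rough illustration of the first configuration example, the sketch below models an entry holding a Valid flag, a task ID, an address tag, and a branch destination address. The direct-mapped organization, the 16-entry size, the 4-byte instruction width, and the names `Entry`, `split`, `register`, and `is_hit` are all assumptions made for this example, not details given in the specification.

```python
from dataclasses import dataclass

INDEX_BITS = 4   # assumed: 16-entry direct-mapped BTB
WORD_SHIFT = 2   # assumed: 4-byte instructions


@dataclass
class Entry:
    valid: int = 0
    task_id: int = 0
    addr_tag: int = 0
    target: int = 0


def split(addr):
    """Split an instruction address into (BTB index, address tag)."""
    word = addr >> WORD_SHIFT
    return word & ((1 << INDEX_BITS) - 1), word >> INDEX_BITS


btb = [Entry() for _ in range(1 << INDEX_BITS)]


def register(task_id, branch_addr, target):
    """Write an entry whose tag data combines the task ID and address tag."""
    index, tag = split(branch_addr)
    btb[index] = Entry(1, task_id, tag, target)


def is_hit(task_id, fetch_addr):
    """Hit only if Valid is 1 and both the task ID and address tag match."""
    index, tag = split(fetch_addr)
    e = btb[index]
    return e.valid == 1 and (e.task_id, e.addr_tag) == (task_id, tag)


register(task_id=0, branch_addr=0, target=30)  # task A registers its branch
hit_a = is_hit(0, 0)          # task A: hit
hit_b = is_hit(1, 0)          # task B at the same logical address: no hit
hit_a_again = is_hit(0, 0)    # task A resumes: the entry is still usable
```

The lookup for task B only reads the entry; nothing clears the Valid flag, which is why task A still hits after switching back.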

In order to use the storage area of the BTB 19 effectively, the branch prediction information 5 shown in FIG. 3 can be configured as shown in FIG. 4. FIG. 4 is a diagram showing a second configuration example of the branch prediction information entered in the BTB. In the second configuration example shown in FIG. 4, tag comparison data 6-2a is generated by performing a predetermined operation on a part of the address tag and the current task ID stored in the register 20. The tag comparison data 6-2a has a shorter bit length than the tag comparison data 6a, based on the task ID and address tag, shown in FIG. 3.

[0026] The branch prediction information 5-2, consisting of the Valid flag with the value "1", the tag comparison data 6-2a based on the current task ID and address tag, and the branch destination address, is registered in the BTB 19. The value of the Valid flag in the branch prediction information 5-2 registered in the BTB 19 is then updated as necessary.

[0027] When the branch prediction information 5-2 is read from the BTB 19, tag comparison data 6-2b is generated from the current task ID stored in the register 20 and the address tag based on the address of the branch instruction fetched from the cache. If the Valid flag of the branch prediction information 5-2 read from the BTB 19 is "1" and the data based on the task ID and address tag of the branch prediction information 5-2 matches the tag comparison data 6-2b, the information is determined to be the branch prediction information 5-2 corresponding to the instruction of the current task (BTB hit).

In this embodiment, if the data based on the task ID and address tag of the branch prediction information 5-2 does not match the tag comparison data 6-2b, the mismatch is not judged to be a cache miss; the Valid flag is left at "1" and the branch prediction information 5-2 remains in the BTB 19. The branch prediction information 5-2 can therefore be used again after a later task switch.

[0029] The operation performed at the time of registration and reading is, for example, an EOR (exclusive OR) operation. In this case, if the amount of information is small, the EOR operation may be performed on the upper address of the branch instruction and the task ID. Alternatively, the operation performed at the time of registration and reading may be one that sets the task ID in the upper address of the branch instruction.

[0030] With the branch prediction information 5-2 configured as described above, the task ID can be included without increasing the storage area of the BTB 19. Moreover, the branch prediction information 5-2 can be prevented from being used erroneously after a task switch.
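One way to picture the EOR variant of the second configuration example is the following sketch, in which the task ID is XORed into the upper bits of the address tag so that the stored tag is no wider than the address tag alone. The bit widths `TAG_BITS` and `TASK_ID_BITS` and the function name `folded_tag` are assumptions for illustration; the specification does not give concrete widths.

```python
TAG_BITS = 8       # assumed width of the stored address tag
TASK_ID_BITS = 3   # assumed width of the task ID


def folded_tag(addr_tag, task_id):
    """XOR the task ID into the upper TASK_ID_BITS bits of the address tag."""
    if not 0 <= addr_tag < (1 << TAG_BITS):
        raise ValueError("address tag out of range")
    if not 0 <= task_id < (1 << TASK_ID_BITS):
        raise ValueError("task ID out of range")
    return addr_tag ^ (task_id << (TAG_BITS - TASK_ID_BITS))


# Registration by task 0, then lookups from task 0 and task 1 at the same address.
stored = folded_tag(0x05, task_id=0)
probe_same_task = folded_tag(0x05, task_id=0)
probe_other_task = folded_tag(0x05, task_id=1)

matches_same_task = stored == probe_same_task
matches_other_task = stored == probe_other_task
```

The same function is applied at registration and at reading, so an entry matches only when the comparison value built from the current task ID and address equals the stored value.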

[0031] As described with reference to FIGS. 3 and 4, even when the task ID does not match, the branch prediction information 5 or 5-2 stored in the BTB 19 is not invalidated. When returning to this task, branch prediction information 5 or 5-2 can be used without registering again. Therefore, it is possible to eliminate the waste of processing, and the processing related to branch prediction can be performed at high speed.

Next, the operations of reading branch prediction information from the BTB 19 and of updating the BTB 19 will be described with reference to FIGS. 5 and 6. The description is based on the second configuration example shown in FIG. 4, but the same processing is possible with the first configuration example shown in FIG. 3.

FIG. 5 is a flowchart for explaining the operation of reading branch prediction information from the BTB. In FIG. 5, an instruction fetch request is sent from the instruction fetch controller 12 to the instruction cache controller 11 to acquire an instruction (step S11).

[0034] Next, a read request is made from the instruction fetch controller 12 to the BTB 19 (step S12). The instruction fetch controller 12 generates tag comparison data 6-2b from the current task ID and the address from which the instruction was fetched (step S13).

[0035] The instruction fetch controller 12 determines whether or not the Valid flag of the branch prediction information 5-2 read from the BTB 19 is "1" and the address tag matches the tag comparison data 6-2b (step S14). If the Valid flag of the branch prediction information 5-2 read from the BTB 19 is not "1", or the address tag does not match the tag comparison data 6-2b (BTB miss), the branch prediction is "not-taken", indicating that the branch condition is predicted not to be satisfied. The instruction fetch controller 12 causes the instruction cache controller 11 to fetch the instruction at the sequential next address (step S14-2), and the process returns to step S12.

[0036] On the other hand, if in step S14 the Valid flag of the branch prediction information 5-2 read from the BTB 19 is "1" and the address tag matches the tag comparison data 6-2b (BTB hit), the instruction fetch controller 12 obtains the branch prediction direction from the branch prediction unit 18 by sending it the address of the fetched instruction, and determines whether or not the branch prediction direction indicates "taken", meaning that the branch condition is predicted to be satisfied (step S15).

[0037] In step S15, if the branch prediction direction indicates "not-taken", meaning that the branch condition is predicted not to be satisfied, the instruction fetch controller 12 causes the instruction cache controller 11 to fetch the instruction at the sequential next address (step S16), and the process returns to step S12.

[0038] On the other hand, when the branch prediction direction indicates "taken", meaning that the branch condition is predicted to be satisfied, the instruction fetch controller 12 causes the instruction cache controller 11 to fetch the instruction at the branch destination address of the branch prediction information 5-2 read from the BTB 19 (step S16), and the process returns to step S12.
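The decision made on each pass through the read flow of FIG. 5 can be sketched as a single function. The function name `next_fetch`, the dictionary layout of an entry, the callables standing in for the branch prediction unit 18, and the 4-byte instruction size are assumptions for this illustration.

```python
INSTR_BYTES = 4  # assumed instruction size (consecutive instructions at 0 and 4)


def next_fetch(entry, comparison_tag, predicts_taken, fetch_addr):
    """Return (next fetch address, reason) for one pass through the read flow.

    entry          -- dict with 'valid', 'tag', 'target', or None if nothing was read
    comparison_tag -- tag comparison data built from the current task ID and address
    predicts_taken -- callable standing in for the branch prediction unit
    """
    # BTB hit requires Valid == 1 and a matching tag.
    hit = entry is not None and entry["valid"] == 1 and entry["tag"] == comparison_tag
    if not hit:
        # BTB miss (including a task ID mismatch): treated as not-taken.
        return fetch_addr + INSTR_BYTES, "btb-miss"
    if not predicts_taken(fetch_addr):
        # BTB hit, but the predicted direction is not-taken.
        return fetch_addr + INSTR_BYTES, "not-taken"
    # BTB hit and predicted taken: fetch from the stored branch destination.
    return entry["target"], "taken"


def always_taken(addr):
    return True


def never_taken(addr):
    return False


entry = {"valid": 1, "tag": 0, "target": 30}
miss = next_fetch(entry, 1, always_taken, 0)       # tag mismatch
not_taken = next_fetch(entry, 0, never_taken, 0)   # hit, predicted not-taken
taken = next_fetch(entry, 0, always_taken, 0)      # hit, predicted taken
```

Note that the function never modifies the entry: a mismatch only changes which address is fetched next.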

FIG. 6 is a flowchart for explaining the operation of updating the BTB. In FIG. 6, the instruction execution controller 14 instructs the operation execution unit 15, the load store execution unit 16, or the branch execution unit 17 to execute, depending on the type of instruction sent from the instruction decoder 13 (step S21). It is then determined whether or not the instruction to be executed hit the BTB 19 at the time of instruction fetch (step S22).

[0040] If the BTB 19 was hit, it is determined whether the instruction to be executed is a branch instruction (step S23). If it is a branch instruction, the BTB is not updated (step S24), and the BTB update operation ends. On the other hand, if the instruction to be executed is other than a branch instruction, the branch prediction information 5-2 of the BTB 19 that was hit is invalidated by writing "0" to its Valid flag (step S25), and the BTB update operation ends.

[0041] On the other hand, if the instruction to be executed did not hit the BTB 19 in step S22, it is determined whether the instruction to be executed is a branch instruction (step S26). The cases in which the instruction to be executed does not hit the BTB 19 include a task ID mismatch. If it is a branch instruction, the branch instruction executed by the branch execution unit 17 is registered in the BTB 19 (step S27).

[0042] The operation of registering in the BTB 19 is described in steps S272, S274, and S276.

First, tag comparison data 6a is generated from the current task ID and the branch instruction address (step S272), and the branch destination address is calculated (step S274). Then, the branch prediction information 5-2 including the Valid flag with the value "1", the tag comparison data 6a, and the branch destination address is written to the BTB 19 (step S276), and the operation of registering in the BTB 19 ends. The operation of updating the BTB 19 also ends.

If it is determined that the instruction to be executed in step S26 is other than a branch instruction, BTB update is not performed (step S28), and the BTB update operation is terminated. In other words, clearing of BTB19 is suppressed.

In FIG. 6, steps S23, S24, and S25 are processing when the current task ID of the instruction to be executed matches the task ID of the branch prediction information 5-2 read from BTB19.

On the other hand, steps S26, S27, and S28 are executed when the current task ID of the instruction to be executed does not match the task ID of the branch prediction information 5-2 read from the BTB 19, or when the address of the fetched instruction does not match the address of the branch prediction information 5-2. Step S27 is the registration process to the BTB 19 that is performed when a certain branch instruction is executed for the first time.

[0046] In step S28, the BTB is not updated even if the task IDs do not match. Therefore, the branch prediction information 5-2 read from the BTB 19 in step S14 of FIG. 5 is left in the BTB 19 with its Valid flag still "1". When execution returns to the original task, the branch prediction information 5-2 can be used without being entered again.
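The four outcomes of the update flow of FIG. 6 can be summarized in a short sketch. The function name `update_btb` and the dictionary representation of an entry are hypothetical, and the step numbers in the comments follow the description of steps S22 to S28.

```python
def update_btb(entry, hit_at_fetch, is_branch, new_entry=None):
    """Return the BTB entry after the update flow for one executed instruction.

    entry        -- the entry read at fetch time (dict with 'valid', 'tag', 'target')
    hit_at_fetch -- whether the instruction hit the BTB when it was fetched (S22)
    is_branch    -- whether the executed instruction is a branch (S23 / S26)
    new_entry    -- entry to register if a branch missed the BTB (S27)
    """
    if hit_at_fetch:
        if is_branch:
            return entry                   # S24: no update
        return {**entry, "valid": 0}       # S25: invalidate the hit entry
    if is_branch:
        return new_entry                   # S27: register the branch
    return entry                           # S28: no update, entry is left in place


old = {"valid": 1, "tag": 0, "target": 30}
fresh = {"valid": 1, "tag": 1, "target": 64}

hit_branch = update_btb(old, hit_at_fetch=True, is_branch=True)
hit_non_branch = update_btb(old, hit_at_fetch=True, is_branch=False)
miss_branch = update_btb(old, hit_at_fetch=False, is_branch=True, new_entry=fresh)
miss_non_branch = update_btb(old, hit_at_fetch=False, is_branch=False)
```

The last case is the one that suppresses clearing: a non-branch instruction of another task that misses because of a task ID mismatch leaves the existing entry valid.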

An example of operation in a multitask system to which the operation according to the present invention described above is applied will be described with reference to FIG. 1.

[0048] Suppose that task A, identified by task ID "0", executes instruction a at address 0 and branches to instruction j at address 30. Since instruction a is a branch instruction, the Valid flag "1", the tag comparison data 6a including the current task ID, and the branch destination address "30" of instruction j are set in the branch prediction information 5 of the 0th entry of the BTB 19.

In the case of branch prediction information 5-2 using EOR operation, the value of EOR operation between the upper address “0” of the branch instruction and the current task ID “0” is “0”.

[0050] When an interrupt occurs and task B, identified by task ID "1", is executed, instruction b1 at address 0 is fetched. The branch destination address "30" of the branch prediction information 5 entered at address 0 of the BTB 19 is then read. Comparing the task ID "0" in the branch prediction information 5 with the current task ID "1" (FIG. 3) yields a task ID mismatch.

[0051] When the EOR operation is used, the value "1" is obtained by the EOR operation between the upper address "0" of the current instruction and the current task ID "1". Comparing this value with the value "0" in the upper address of the address tag of the branch prediction information 5-2 read from the BTB 19 (FIG. 4) yields a task ID mismatch.

[0052] Because of the task ID mismatch, the branch prediction is regarded as "not-taken", and instruction b2 at address 4, which follows address 0, is fetched (step S14-2 in FIG. 5). Since instructions b1 and b2 are arithmetic instructions, the arithmetic execution unit 15 then executes them. Because the executed instructions b1 and b2 are not branch instructions, the BTB is not updated (step S28 in FIG. 6).

[0053] When an interrupt occurs again and execution of task A, identified by task ID "0", resumes, instruction a at address 0 is fetched. The branch destination address "30" of the branch prediction information 5 entered at address 0 of the BTB 19 is then read. Comparing the task ID "0" of the branch prediction information 5 entered at address 0 of the BTB 19 with the current task ID "0" (FIG. 3) yields a task ID match.

[0054] When the EOR operation is used, the value "0" is obtained by the EOR operation between the upper address "0" of the current instruction and the current task ID "0". Comparing this value with the value "0" in the upper address of the address tag of the branch prediction information 5-2 read from the BTB 19 (FIG. 4) yields a task ID match.

[0055] Branch prediction is then performed and indicates the branch direction, so the instruction at the predicted branch destination address 30 is fetched and instruction j is obtained (step S16 in FIG. 5). The branch is then taken by instruction a, the branch prediction proves correct, and instruction j is executed. Thus, even if a task ID mismatch occurs, the branch prediction information 5 entered in the BTB 19 is not cleared, so processing is efficient after switching back to the original task. In addition, erroneous reading of the branch prediction information 5 when the task IDs do not match can be prevented.
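The task A, task B, task A sequence in this example can be traced with a small sketch that uses the EOR variant and the values from the text (upper address 0, task IDs 0 and 1, branch destination 30). The helper names `comparison_value`, `register`, and `lookup`, and the dictionary used as the BTB, are assumptions for this illustration.

```python
def comparison_value(upper_addr, task_id):
    """EOR of the upper address and the task ID, as in the second configuration."""
    return upper_addr ^ task_id


btb = {}  # BTB index -> {"valid", "tag", "target"}


def register(index, upper_addr, task_id, target):
    btb[index] = {"valid": 1, "tag": comparison_value(upper_addr, task_id), "target": target}


def lookup(index, upper_addr, task_id):
    """Return the branch destination on a BTB hit, or None; never modifies the BTB."""
    entry = btb.get(index)
    if entry and entry["valid"] == 1 and entry["tag"] == comparison_value(upper_addr, task_id):
        return entry["target"]
    return None


# Task A (ID 0) executes branch instruction a at address 0, branching to address 30.
register(index=0, upper_addr=0, task_id=0, target=30)

trace = [
    ("B", comparison_value(0, 1), lookup(0, 0, 1)),  # task B fetches b1 at address 0
    ("A", comparison_value(0, 0), lookup(0, 0, 0)),  # task A resumes and fetches a
]
```

In the trace, task B computes the value 1, which differs from the stored 0, so it falls through to the sequential address; task A computes 0 and obtains the destination 30 without registering again.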

[0057] The present invention is not limited to the specifically disclosed embodiments, and various modifications and changes can be made without departing from the scope of the claims.

Claims

[1] A processor that prefetches instructions by branch prediction, the processor comprising branch prediction information writing means for writing branch prediction information including a tag indicating an instruction address and task identification information in a storage area used for branch prediction.

[2] The processor according to claim 1, further comprising clear suppression means for suppressing clearing of the storage area when the task identification information read from the branch prediction information stored in the storage area does not match the current task identification information.

[3] The processor according to claim 2, wherein the clear suppression means is executed when the instruction to be executed is an instruction other than a branch instruction.

[4] The processor according to claim 1, wherein the branch prediction information writing means is executed when the task identification information read from the branch prediction information stored in the storage area does not match the current task identification information and the instruction to be executed is a branch instruction.

[5] The processor according to claim 1, further comprising branch prediction information invalidating means for invalidating the branch prediction information when the task identification information read from the branch prediction information stored in the storage area matches the current task identification information and the instruction to be executed is an instruction other than a branch instruction.

[6] The processor according to claim 1, further comprising update inhibiting means for inhibiting update of the storage area when the task identification information read from the branch prediction information stored in the storage area matches the current task identification information and the instruction to be executed is a branch instruction.

[7] The processor according to claim 1, further comprising tag configuration means for configuring the tag with an instruction address and task identification information.

[8] The processor according to claim 1, further comprising tag generation means for generating the tag by performing an operation on an instruction address and task identification information.

[9] The processor according to claim 8, wherein the tag generation means performs the operation on an upper address part of the instruction address and the task identification information.

[10] The processor according to claim 8, wherein the operation is an exclusive OR operation.

[11] The processor according to claim 1, further comprising: reading means for reading the branch prediction information from the storage area; task matching judgment means for judging whether or not the current task identification information matches the task identification information of the branch prediction information; and non-satisfaction determining means for determining that the branch condition is not satisfied when the task identification information does not match.

[12] The processor according to claim 11, further comprising comparison tag forming means for forming a comparison tag with the instruction address and the current task identification information.

[13] The processor according to claim 11, further comprising comparison tag generation means for generating the comparison tag by performing an operation on the instruction address and the current task identification information.

[14] The processor according to claim 13, wherein the comparison tag generation means performs an exclusive OR operation on the upper address part of the instruction address and the task identification information.

[15] A method executed in a processor that prefetches instructions by branch prediction, the method comprising: a branch prediction information writing procedure for writing branch prediction information including a tag indicating an instruction address and task identification information in a storage area used for branch prediction; and a clear suppression procedure for suppressing clearing of the storage area when the task identification information read from the branch prediction information stored in the storage area does not match the current task identification information.