WO2005013129A1 - Information processing device, instruction processing control device, instruction processing control method, instruction processing control program, and computer-readable recording medium containing the program - Google Patents

Publication number
WO2005013129A1
Authority
WO
WIPO (PCT)
Prior art keywords
priority
instruction
request
unit
information processing
Prior art date
Application number
PCT/JP2003/009635
Other languages
English (en)
Japanese (ja)
Inventor
Mariko Sakamoto
Original Assignee
Fujitsu Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Limited filed Critical Fujitsu Limited
Priority to PCT/JP2003/009635 priority Critical patent/WO2005013129A1/fr
Publication of WO2005013129A1 publication Critical patent/WO2005013129A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming

Definitions

  • the present invention relates to an information processing device, an instruction processing control device, an instruction processing control method, an instruction processing control program, and a computer-readable recording medium storing the instruction processing control program.
  • the present invention relates to a technique for realizing an improvement in throughput by controlling an instruction output to a decoder in an information processing apparatus that executes two or more processes in parallel.
  • FIG. 12 is a block diagram showing a functional configuration of an information processing device adopting a general multi-thread method.
  • The information processing device shown in FIG. 12 is composed of an instruction address generator 1, a primary instruction cache memory 2, an instruction buffer 3, a selector 4, a decoder 5, an arithmetic unit 6, a register 7, a primary data cache memory 8, a secondary cache memory 9, and a buffer 10. It is configured to execute in parallel a plurality of instructions based on a plurality of processes.
  • the processing refers to a program or an instruction flow generated from a certain program (one or more instructions corresponding to the processing flow of instruction fetch and decode).
  • This information processing device is communicably connected to an external memory (main storage unit) 11 via a network or the like.
  • The instruction address generation unit 1 generates an address in the instruction area of the external memory 11 in which the instruction word group of an instruction resulting from a generated process is stored. Address generation is performed independently for each process.
  • The primary instruction cache memory 2 stores a copy of part of the instruction area on the external memory 11 and outputs, from that copy, the instruction corresponding to the instruction address generated by the instruction address generation unit 1 to the instruction buffer 3 at the subsequent stage.
  • The primary data cache memory 8 stores a copy of part of the data area on the external memory 11, and outputs data required for the operation in the operation unit 6 in response to a request from the operation unit 6.
  • The secondary cache memory 9 stores a copy of part of the instruction area and the data area on the external memory 11. If an instruction corresponding to the instruction address generated by the instruction address generation unit 1 does not exist in the primary instruction cache memory 2, or if the data required for the operation in the operation unit 6 does not exist in the primary data cache memory 8, the secondary cache memory 9 is searched.
  • The instruction buffer 3 temporarily stores instruction words fetched from the primary instruction cache memory 2, the secondary cache memory 9, or the external memory 11 by the instruction address generation unit 1, and issues instructions to the decoder 5 in accordance with an instruction from the selector 4.
  • The instruction buffer 3 is configured to temporarily store eight instructions (instruction word groups). To maximize the benefits of the multi-thread system, the instruction buffer 3 is used for multiple processes, and instructions for these processes are fetched independently of one another.
  • The selector 4 controls the issuing of instructions from the instruction buffer 3 to the decoder 5: it selects at least one of the instructions stored in the instruction buffer 3 and outputs the selected instruction to the decoder 5. In the device shown in FIG. 12, four decoders 5 are provided, so the selector 4 is configured to issue a maximum of four instructions to the decoder 5 at the same time. To maximize the advantages of the multi-thread method, when a plurality of instructions are stored in the instruction buffer 3, the selector 4 normally selects and issues as many instructions as possible, that is, up to the maximum number that can be issued simultaneously (four in the device shown in FIG. 12), regardless of which process each instruction is based on.
  • Four decoders 5 are provided: D1, D2, D3 and D4. These decoders D1, D2, D3 and D4 each receive one of the instructions issued simultaneously from the instruction buffer 3 and decode the instructions in parallel.
  • Each decoder 5 converts the received instruction into a format that allows the operation unit (instruction processing unit) 6 to actually recognize the contents of the instruction, and the operation resources for the opcode and the operands are secured.
  • the arithmetic unit 6 is configured to include a plurality of arithmetic units in order to process a plurality of instructions in parallel.
  • the arithmetic unit 6 sequentially executes arithmetic processing based on the instructions decoded by the decoder 5, and outputs a result of the arithmetic processing to a subsequent stage.
  • the buffer 10 functions as an interface to the external memory 11.
  • When the instruction words related to an instruction for a generated process are stored in neither the primary instruction cache memory 2 nor the secondary cache memory 9 and cannot be fetched from these cache memories (hereinafter referred to as an instruction cache miss), or when the data used by an instruction executed by the arithmetic unit 6 is stored in neither the primary data cache memory 8 nor the secondary cache memory 9, the external memory 11 is referred to via the buffer 10.
  • The instruction word group and data are then fetched into the primary instruction cache memory 2, the primary data cache memory 8, and the secondary cache memory 9 via the buffer 10.
  • An information processing device that employs the multi-thread method processes a plurality of processes in parallel, but the unit of the execution environment that executes one process is called a workload.
  • The workload is the usage status of each function of the information processing device (e.g., instruction buffer, cache memory, selector, decoder, arithmetic unit, register, buffer, etc.) used when a certain process is executed.
  • each function used by multiple processes is shared between workloads.
  • In the multi-thread method, a plurality of instructions based on a plurality of processes are decoded in parallel and executed in parallel.
  • This reduces the idle time of the operation resources (that is, the multiple computing units). Such idle time arises from cache misses (the term used when no distinction is made between instruction cache misses and data cache misses), where instruction execution in the operation unit 6 is stopped during the long latency of fetching data from the external memory 11 and no processing is performed using the operation resources of the operation unit 6, or from re-fetching instructions when a branch prediction fails.
  • As a result, the total time for performing a plurality of processes is reduced, and the throughput can be improved.
  • However, even if the user of the information processing apparatus wishes to preferentially execute a specific process while executing a plurality of processes in parallel, the workload characteristics of each process cannot be manipulated, so a specific process cannot be prioritized and the user's intention cannot be reflected.
  • the present invention has been made in view of such a situation.
  • the present invention provides an information processing apparatus that can effectively use arithmetic resources according to an actual instruction execution situation.
  • the purpose is to achieve a significant improvement in throughput by more reliably reducing the idle time of computational resources.
  • Patent Document 1
  • An information processing device of the present invention is an information processing device that executes two or more processes in parallel, and comprises:
  • an instruction address generation unit that generates, for each process, an instruction address for executing the two or more processes;
  • an instruction cache memory that temporarily stores a copy of a part of the instruction area on the main memory and outputs, from the copy, an instruction corresponding to the instruction address generated by the instruction address generation unit;
  • an instruction buffer for holding a plurality of instructions output from the instruction cache memory;
  • a selector for selecting and outputting at least one of the instructions held in the instruction buffer;
  • a decoder for decoding an instruction from the selector, and an operation unit for executing an operation in accordance with a result of decoding by the decoder;
  • a data cache memory that temporarily stores a copy of a part of the data area on the main memory and outputs data necessary for the operation in the operation unit in response to a request from the operation unit;
  • a priority calculation unit that calculates / changes the priority of each process so as to reduce the idle time of operation resources in the operation unit; and
  • a control unit that refers to the priority of each process calculated / changed by the priority calculation unit and controls the operation of the selector so as to preferentially select an instruction for the process with the higher priority and output the selected instruction to the decoder.
  • It is preferable that the information processing apparatus further includes a request management unit that manages, for each of the processes, information related to requests transmitted from the information processing apparatus to the outside in accordance with the execution of the two or more processes, and that the priority calculation unit calculates / changes the priority of each process based on the information on the requests managed by the request management unit.
  • It is preferable that the priority calculation unit calculates / changes the priority of the process that is the source of a request according to the number of requests that are managed by the request management unit and are waiting for a response from the outside.
  • the priority calculation unit changes the priority of the process that is the source of the request to be lower than the current level.
  • It is preferable that, upon receiving a response to a request transmitted along with the execution of any one of the two or more processes, the priority calculation unit calculates / changes the priority of the process that is the source of the request according to the history of the request managed by the request management unit.
  • If the request is the oldest of all the requests that are managed by the request management unit and are waiting for a response, it is preferable that the priority calculation unit changes the priority of the process that is the source of the request to be higher than the current level.
  • It is preferable that the priority calculation unit calculates / changes the priority of the process that is the source of the request according to the type of the request managed by the request management unit.
  • the priority calculation unit sets the priority of the process that is the source of the request to be lower than the current level. It is preferable to change it to be higher.
  • It is preferable that the priority calculation unit calculates / changes the priority of the process that is the transmission source of the request based on the type of the request managed by the request management unit.
  • In this case, it is preferable that the priority calculation unit changes the priority of the process that is the source of the request so as to be lower than the current level.
  • It is preferable that a sampling unit for sampling the usage status of the computation resources for each of the processes is further included, and that the priority calculation unit calculates / changes the priority of each process according to the usage status of the computation resources sampled by the sampling unit.
  • the priority calculation unit may It is preferable to change the priority so as to be lower than the current state.
  • It is preferable that the information processing apparatus further includes a request management unit that manages, for each of the processes, information related to requests transmitted from the information processing apparatus to the outside in accordance with the execution of the two or more processes, and a collecting unit that collects the usage status of the computation resources for each of the processes, and that the priority calculation unit calculates / changes the priority of each process according to the information on the requests managed by the request management unit and the usage status of the computation resources collected by the collecting unit.
  • It is preferable that the priority calculation unit calculates / changes the priority of the process that is the source of a request according to the number of requests that are managed by the request management unit and are waiting for a response from the outside, and according to the usage status of the computation resources collected by the collecting unit for the process that is the source of the request.
  • the priority calculation unit changes the priority of the process that is the transmission source of the request so as to be lower than the current level. Further, when the number of the requests exceeds a predetermined value, it is preferable that the priority calculation unit changes the priority of the process that is the source of the request to be lower than the current level.
  • the reference value (the initial value of the priority) may be set according to the characteristics of the workload as one or more execution environments used when executing the two or more processes in the information processing device, or according to a user's request.
  • the execution status of a workload as one or more execution environments used when executing the two or more processes in the information processing device, and a reference value (an initial value of the priority) according to the execution status.
  • An instruction processing control device of the present invention is provided in an information processing apparatus that includes an instruction address generation unit, an instruction cache memory, an instruction buffer, a selector, a decoder, an operation unit, and a data cache memory and that executes two or more processes in parallel, and controls the execution state of each instruction for the two or more processes.
  • The instruction processing control device comprises: a priority calculation unit that calculates / changes the priority of each process so as to reduce the idle time of operation resources in the operation unit; and a control unit that refers to the priority of each process calculated / changed by the priority calculation unit and controls the operation of the selector so as to preferentially select an instruction for the process with the higher priority and output it to the decoder.
  • An instruction processing control method of the present invention controls the execution state of each instruction for two or more processes in an information processing apparatus that executes the two or more processes described above in parallel.
  • The method comprises: a priority calculating step of calculating / changing the priority of each process so as to reduce the idle time of operation resources in the operation unit; and a control step of controlling the operation of the selector so as to, with reference to the priority of each process calculated / changed in the priority calculating step, preferentially select an instruction for the high-priority process and output the instruction to the decoder.
  • An instruction processing control program of the present invention causes a computer to execute a function of controlling the execution state of each instruction for two or more processes in an information processing apparatus that executes the two or more processes in parallel.
  • The program causes the computer to function as a priority calculation unit that calculates / changes the priority of each process so as to reduce the idle time of operation resources in the operation unit, and as a control unit that refers to the priority of each process calculated / changed by the priority calculation unit and controls the operation of the selector so as to preferentially select an instruction for the process with the higher priority and output the instruction to the decoder.
  • A computer-readable recording medium of the present invention stores an instruction processing control program for causing a computer to execute the function of controlling the execution state of each instruction for the two or more processes in an information processing apparatus that executes the above-described two or more processes in parallel.
  • The instruction processing control program causes the computer to function as a priority calculation unit that calculates / changes the priority of each process so as to reduce the idle time of operation resources in the operation unit, and as a control unit that refers to the priority of each process calculated / changed by the priority calculation unit and controls the operation of the selector so as to preferentially select an instruction for the process with the higher priority and output it to the decoder.
  • FIG. 1 is a block diagram illustrating a functional configuration of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 2 is a diagram showing a request management table in the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 3 is a diagram showing a priority storage table in the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 4 is a flowchart for explaining a procedure of calculating a priority variable at a first timing in the priority calculation unit of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 5 is a flowchart for explaining a procedure of calculating a priority variable at a second timing in the priority calculation unit of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 6 is a flowchart for explaining a procedure of calculating a priority variable at a third timing in the priority calculation unit of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 7 is a flowchart illustrating a procedure of calculating a priority variable at a fourth timing in the priority calculation unit of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 8 is a flowchart for explaining the procedure of the instruction control method of the information processing apparatus according to the first embodiment of the present invention.
  • FIG. 9 is a block diagram showing a functional configuration of the information processing apparatus according to the second embodiment of the present invention.
  • FIG. 10 is a block diagram showing a functional configuration of an information processing apparatus according to the third embodiment of the present invention.
  • FIG. 11 is a diagram showing a table in the information processing device according to the third embodiment of the present invention.
  • FIG. 12 is a block diagram showing a functional configuration of an information processing apparatus employing a general multi-thread method.
  • FIG. 1 is a block diagram showing a functional configuration of the information processing apparatus according to the first embodiment.
  • the information processing apparatus according to the first embodiment of the present invention has the same configuration as that shown in FIG.
  • Like an information processing device employing a general multi-thread system, it includes an instruction address generator 1, a primary instruction cache memory 2, an instruction buffer 3, a selector 4, a decoder 5, an arithmetic unit 6, registers 7, a primary data cache memory 8, and a secondary cache memory 9.
  • In addition, a request management unit 12, a priority calculation unit 13, a control unit 14, and a sampling unit 15 are further provided.
  • the request management unit 12 functions as an interface to the external memory 11.
  • The external memory 11 is referred to via the request management unit 12, and instructions and data are fetched.
  • The request management unit 12 manages the requests transmitted in order to refer to the external memory 11 for an instruction or data on which a cache miss has occurred, and has a request management table 12a that temporarily stores information related to the transmitted requests.
  • In the request management table 12a, the request management unit 12 holds, in association with one another, the request ID of a request sent to the outside (i.e., the external memory 11), the process ID of the process that is the source of the request, a response waiting flag of the request (a flag indicating the presence / absence of a response to the request), and the time when the entry of the request was created in the request management table 12a.
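The request management table described above can be pictured as a small keyed store. The following is a minimal sketch under stated assumptions, not the patent's implementation: the class and method names (`RequestEntry`, `RequestManagementTable`, `send`, `respond`, `waiting_count`, `oldest_waiting`) and the integer timestamps are hypothetical. Only the four fields (request ID, process ID, response waiting flag, entry creation time) come from the text, along with the statement elsewhere in the description that an entry is invalidated when its response arrives.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class RequestEntry:
    request_id: int   # ID of the request sent to the external memory
    process_id: str   # process that is the source of the request
    waiting: bool     # response waiting flag
    created_at: int   # time the entry was created in the table


class RequestManagementTable:
    """Hypothetical model of request management table 12a."""

    def __init__(self):
        self._entries = {}
        self._ids = count(1)

    def send(self, process_id, now):
        """Register a request sent to the outside and return its ID."""
        request_id = next(self._ids)
        self._entries[request_id] = RequestEntry(request_id, process_id, True, now)
        return request_id

    def respond(self, request_id):
        """Invalidate (remove) the entry when the response arrives."""
        return self._entries.pop(request_id)

    def waiting_count(self, process_id):
        """Number of requests of a process still waiting for a response."""
        return sum(
            1 for e in self._entries.values()
            if e.process_id == process_id and e.waiting
        )

    def oldest_waiting(self, process_id):
        """Oldest waiting request of a process, or None if there is none."""
        candidates = [
            e for e in self._entries.values()
            if e.process_id == process_id and e.waiting
        ]
        return min(candidates, key=lambda e: e.created_at) if candidates else None
```

A counter maintained on send and receive would serve equally well for the waiting count; scanning the table is simply the shortest way to show the idea.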
  • The priority calculation unit 13 calculates and changes the priority of each process (thread, strand) executed by the information processing apparatus, based on the information managed by the request management unit 12 and the usage status of the computing resources in the arithmetic unit 6 (the result gathered by the sampling unit 15).
  • It has a calculating unit 13a that calculates, according to the situation related to data cache misses, a new priority variable from the priority variable set in advance for each process (here, processes A and B; hereinafter referred to as the priority initial value), and a priority storage table 13b that stores the priority variable for each process calculated by the calculating unit 13a.
  • Here, the situation related to data cache misses refers to the following four timings: the timing at which a read (reference) request is sent to the outside via the request management unit 12 (hereinafter also referred to as the first timing); the timing at which the request management unit 12 receives a response to the request (hereinafter also referred to as the second timing); the timing at which the response to the request is entered into the primary data cache memory 8 (hereinafter also referred to as the third timing); and the timing at which a data cache miss occurs (hereinafter also referred to as the fourth timing).
  • The calculating unit 13a of the priority calculating unit 13 is configured to calculate the priority variable of the process that is the transmission source of the request at each of the first to fourth timings, according to the procedures described below with reference to the figures.
  • the priority variables [P (A), P (B)] are stored for each process ID (here, processes A and B).
  • the priority variables calculated at the above-mentioned first to fourth timings are changed (rewritten) each time, and the latest calculated priority variables are always stored as the priority of each process.
  • The control unit 14 refers to the priority variables of the respective processes stored in the priority storage table 13b of the priority calculation unit 13 and, based on those priority variables, controls the operation of the selector 4 so that an instruction for the process with the higher priority is preferentially selected and output to the decoder 5.
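As an illustration of this priority-driven selection, here is a minimal sketch, assuming that a smaller priority variable means a higher priority. That reading is inferred from the later steps of the description, where adding a positive correction number is said to lower the priority; the text here does not state it directly. The names `Instruction` and `select_instructions`, and the choice to fill leftover decoder slots with instructions from lower-priority processes, are hypothetical.

```python
from collections import namedtuple

# Hypothetical record for an instruction held in the instruction buffer.
Instruction = namedtuple("Instruction", ["process_id", "word"])


def select_instructions(buffer, priority, max_issue=4):
    """Pick up to max_issue instructions, preferring the higher-priority process.

    priority maps a process ID to its priority variable. Assumption: a smaller
    value means a higher priority. sorted() is stable, so instructions of the
    same process keep their original buffer order.
    """
    ordered = sorted(buffer, key=lambda ins: priority[ins.process_id])
    return ordered[:max_issue]
```

With eight buffer entries and four decoders, as in the device of FIG. 12, a call such as `select_instructions(buffer, {"A": 5, "B": 2})` would issue the instructions of process B first and use any remaining slots for process A.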
  • The sampling unit 15 samples the usage status of the computing resources in the arithmetic unit 6 for each process. The calculation of the priority variable at the first timing in the priority calculation unit 13, described later, is based on the usage status of the computing resources sampled by the sampling unit 15.
  • the usage status of the processing resources is, for example, the entry usage rate of the instruction window, the usage rate of the renaming register, the entry usage rate of the fetch port, and the entry usage rate of the reservation station.
  • the function as the instruction processing control device of the present invention is realized by at least the priority calculation unit 13 and the control unit 14 among the functions of the information processing device.
  • First, it is determined whether or not a request to read the external memory 11 has been sent from the request management unit 12 in accordance with the occurrence of a cache miss on data used when executing an instruction from process A (step S10). Whether or not this request has been sent (first timing) is determined by the calculation unit 13a of the priority calculation unit 13 based on the response waiting flag in the request management table 12a of the request management unit 12.
  • If the request has not been sent (NO route of step S10), the priority variable P(A) of process A is not changed and remains as stored in the priority storage table 13b (step S11).
  • If the request has been transmitted (YES route of step S10; first timing), then for process A, which transmitted the request, it is determined whether the number W of requests waiting for a response from the outside (the external memory 11), that is, the requests of process A for which an entry had already been made in the request management table 12a before this request (a value that may be managed by incrementing / decrementing a counter when requests are sent and responses are received), is equal to or more than the predetermined number X1 and smaller than the predetermined number X2 (step S12).
  • X1 and X2 are appropriately set natural numbers, and the number of requests W is calculated by the calculation unit 13a of the priority calculation unit 13 based on the request management table 12a of the request management unit 12.
  • If the number of requests W is not within the range of the predetermined number X1 or more and less than X2 (NO route of step S12), it is determined whether the number of requests W is the predetermined number X2 or more (step S13).
  • If the number of requests W is equal to or more than the predetermined number X2 (YES route of step S13), the preset correction number i2 (> 0) is added to the current priority variable P(A), and the priority variable P(A) of process A is changed (step S14).
  • If the number of requests W is in the range of the predetermined number X1 or more and less than X2 (YES route of step S12), it is determined whether other instructions of process A, which is the source of the request, currently use the computing resources of the arithmetic unit 6 at a predetermined value Y (usage rate) or more (step S15).
  • The calculation unit 13a of the priority calculation unit 13 makes this determination using the usage status of the computing resources of the arithmetic unit 6 for process A gathered by the sampling unit 15 (the various usage rates described above). If the usage rate of the computing resources of the arithmetic unit 6 for process A is smaller than Y (NO route of step S15), the priority variable P(A) of process A is not changed, and the priority stored in the priority storage table 13b remains (step S11).
  • Conversely, if the usage rate of the computing resources of the arithmetic unit 6 for process A is equal to or more than Y (YES route of step S15), the preset correction number i1 (> 0) is added to the current priority variable P(A), and the priority variable P(A) of process A is changed (step S16).
  • the priority variable P (A) calculated in this way is changed in the priority storage table 13b of the priority calculation unit 13 each time.
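The first-timing procedure (steps S12 to S16) reduces to a short decision function. The following is a minimal sketch, not the patent's implementation: the function name and parameter names are hypothetical, and the branch for W smaller than X1 is an assumption, because the text does not say what happens on the NO route of step S13 (the sketch leaves the priority variable unchanged there). Adding a correction number makes the priority variable larger, which the description treats as lowering the priority.

```python
def update_at_first_timing(p, w, usage, x1, x2, y, i1, i2):
    """Return the new priority variable of the requesting process.

    p      current priority variable P(A)
    w      number W of requests of the process waiting for a response
    usage  usage rate of the computing resources by the process
    x1, x2 predetermined request-count thresholds (x1 < x2)
    y      predetermined usage-rate threshold Y
    i1, i2 positive correction numbers
    """
    if x1 <= w < x2:                 # step S12, YES route
        if usage >= y:               # step S15, YES route
            return p + i1            # step S16: lower the priority
        return p                     # step S11: unchanged
    if w >= x2:                      # step S13, YES route
        return p + i2                # step S14: lower the priority
    return p                         # w < x1: assumed unchanged (not stated in the text)
```

For instance, with X1 = 2, X2 = 4, Y = 0.5, i1 = 1 and i2 = 3, a process holding five outstanding requests has its variable raised by 3, while a process holding three outstanding requests has it raised by 1 only if its usage rate is at least 0.5.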
  • the use status of the instruction window entry, the renaming register, the fetch port entry, and the reservation station entry are used as the operation resources of the operation unit 6 as described above.
  • the priority calculation unit 13 uses one of the above four usage rates as the usage status of the computing resources used in the processing at the first timing, and the reference value corresponding to the computing resource is used.
  • the condition for judging the usage status of the computational resource may be set by employing all of the above four computing resources, or the condition may be satisfied if at least one of the four computing resources clears such a determination condition. You may comprise.
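The alternatives just described (judging on a single usage rate, requiring all four to clear their reference values, or accepting at least one) can be sketched as one helper. This is a minimal sketch under stated assumptions: the function name, the `mode` parameter, and the dictionary keys are hypothetical, and the per-resource reference values are illustrative inputs rather than values given in the text.

```python
def usage_condition_met(usage, reference, mode="any"):
    """Judge the usage status of computing resources against reference values.

    usage      maps a resource name to its sampled usage rate for one process
    reference  maps each resource to be judged to its reference value;
               pass a single entry to judge on one resource only
    mode       "all": every judged resource must reach its reference value
               "any": at least one judged resource must reach it
    """
    checks = [usage[name] >= reference[name] for name in reference]
    return all(checks) if mode == "all" else any(checks)
```

Passing only `{"reservation_station": 0.5}` as `reference` corresponds to judging on one of the four usage rates with the reference value for that resource.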
  • the request management Regarding the process that is the source of the request managed in the request management table 12a of Part 1 2, according to the number W of requests for which entries have been made in the request management table 12a, the request The priority variable P (A) of the process which is the transmission source is calculated, and the priority variable of the process already stored in the priority storage table 13 b is changed. Note that this change The priority is lower than the priority.
  • step S14 By performing such a change (step S14), execution of processing other than the processing in which many data cache misses occur is prioritized, and the free time for the processing resources of the processing unit 6 is reduced. Can be suppressed.
  • The priority calculation unit 13 also calculates the priority variable of the process that is the source of the request according to the usage status (usage rate) of the computing resources of the arithmetic unit 6 collected by the collection unit 15, and changes the priority variable of that process already stored in the priority storage table 13b. This change is likewise made so that the priority becomes lower than the current priority.
  • By making such a change (step S16), execution of processes other than the process with a high usage rate is given priority, the balance of use of the computing resources of the arithmetic unit 6 among the plurality of processes (here, processes A and B) can be maintained, and the plurality of processes can be executed efficiently.
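  • As a minimal sketch of the usage-rate check at the first timing (steps S15 and S16), the following Python fragment compares the four usage rates with their reference values. The function names, the dictionary keys, the 0.75 reference values, and the `mode` switch are assumptions made for illustration; the text above only states that i1 is positive and that the condition may require either all four resources or at least one of them. A larger priority variable is treated as a lower priority.

```python
# Names of the four computing resources whose usage rates are collected.
RESOURCES = ("instruction_window", "renaming_register",
             "fetch_port", "reservation_station")


def usage_condition_met(usage, reference, mode="any"):
    """Step S15: compare each usage rate with its reference value.

    mode="all" requires all four resources to reach their reference values;
    mode="any" is satisfied when at least one of them does.
    """
    hits = [usage[name] >= reference[name] for name in RESOURCES]
    return all(hits) if mode == "all" else any(hits)


def update_for_usage(p, usage, reference, i1=1, mode="any"):
    """Steps S15/S16: add i1 (> 0) when the usage condition is met.

    A larger priority variable means a lower priority.
    """
    if usage_condition_met(usage, reference, mode):
        return p + i1  # step S16: lower the priority of the heavy user
    return p           # NO route: the stored priority is left as it is


# Hypothetical reference values and usage rates for process A.
reference = {name: 0.75 for name in RESOURCES}
usage_a = {"instruction_window": 0.9, "renaming_register": 0.4,
           "fetch_port": 0.5, "reservation_station": 0.3}
new_p_a = update_for_usage(5, usage_a, reference)
```

  • With these illustrative values only the instruction window reaches its reference value, so the priority variable is raised under `mode="any"` and left unchanged under `mode="all"`.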
  • Next, the calculation procedure of the priority variable at the second timing in the priority calculation unit 13 will be described with reference to the flowchart (steps S20 to S22) shown in FIG. 5.
  • First, it is determined whether the received response is due to a data cache miss that occurred in process A and corresponds to the oldest of all requests, due to data cache misses that occurred in process A, that are waiting for an external response (step S20).
  • The timing at which the request management unit 12 receives the response from the external memory 11 (second timing) is the timing at which the corresponding entry in the request management table 12a of the request management unit 12 is invalidated, and the priority calculation unit 13 can recognize it by being sent a signal at this time.
  • The history of the request (whether or not it is the oldest) is determined by the calculation unit 13a of the priority calculation unit 13 based on the time at which the entry was created in the request management table 12a.
  • If the response received from the outside is due to a data cache miss that occurred in process A and corresponds to the oldest of all the requests waiting for a response to a data cache miss caused by process A (YES route in step S20), the preset correction number i3 (> 0) is subtracted from the current priority variable P(A), and the priority variable P(A) of process A is changed (step S21). Conversely, if the response received from the outside is not due to a data cache miss that occurred in process A, or is not the oldest of all the requests waiting for a response to a data cache miss caused by process A (NO route in step S20), the priority variable P(A) of process A is not changed, and the current priority stored in the priority storage table 13b remains as it is (step S22).
  • In this way, the calculation unit 13a of the priority calculation unit 13 calculates the priority variable of the process that is the source of the request according to the history of the request managed by the request management unit 12, and changes the priority variable of that process already stored in the priority storage table 13b. This change is made so that the priority becomes higher than the current priority.
  • By making such a change (step S21), the process that is the source of the oldest instruction among the instructions stalled by data cache misses is executed with priority, and that process is executed smoothly.
  • Here, the priority is increased only when the response received from the outside corresponds to the oldest of all the requests waiting for a response due to data cache misses of the process. However, the priority may also be increased when the request is not the oldest but is old to some extent.
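  • The decision at the second timing (steps S20 to S22) can be sketched as follows. This is an illustration under stated assumptions, not the patented circuit: the `Request` record, its field names, and the function name are hypothetical, and entry creation times are modelled as plain integers. A smaller priority variable is treated as a higher priority.

```python
from dataclasses import dataclass


@dataclass
class Request:
    process: str     # process whose data cache miss produced the request
    created_at: int  # time the entry was created in the request management table


def update_at_second_timing(p, process, responded, outstanding, i3=1):
    """Steps S20 to S22 for `process`.

    `outstanding` holds every valid entry of the request management table,
    including the entry `responded` whose response has just arrived.
    """
    own = [r for r in outstanding if r.process == process]
    is_oldest = (responded.process == process
                 and all(responded.created_at <= r.created_at for r in own))
    if is_oldest:
        return p - i3  # step S21: raise the priority of the process
    return p           # step S22: keep the stored priority


entries = [Request("A", 10), Request("B", 5), Request("A", 20)]
p_a = update_at_second_timing(8, "A", entries[0], entries)
```

  • Relaxing `is_oldest` to accept requests older than some age limit would correspond to the variation mentioned above, in which a request that is old to some extent also raises the priority.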
  • If it is determined that this response has been entered in the primary data cache memory 8 (YES route of step S30; third timing), the preset correction number i4 (> 0) is subtracted from the current priority variable P(A), and the priority variable P(A) of process A is changed (step S31). Otherwise, the priority variable P(A) of process A is not changed, and the priority stored in the priority storage table 13b remains as it is (step S32).
  • In this way, when the response to a request managed by the request management unit 12 has been entered in the primary data cache memory 8, the priority calculation unit 13 calculates the priority variable of the process that is the source of the request and changes the priority variable of that process already stored in the priority storage table 13b. This change is made so that the priority becomes higher than the current priority.
  • By making such a change (step S31), the priority is set high at the point when the instruction stalled by the data cache miss can be executed, so that instructions belonging to the process that issued the stalled instruction are preferentially sent to the decoder 5 and executed smoothly.
  • First, it is determined whether or not a data cache miss has occurred during the execution of process A (step S40). If a data cache miss has occurred (YES route in step S40; fourth timing), a preset correction number i5 (> 0) is added to the current priority variable P(A), and the priority variable P(A) of process A is changed (step S41). Conversely, if no data cache miss has occurred (NO route of step S40), the priority variable P(A) of process A is not changed, and the current priority stored in the priority storage table 13b remains as it is (step S42).
  • In this way, the priority calculation unit 13 calculates the priority variable of the process that is the source of the data cache miss and changes the priority variable of that process already stored in the priority storage table 13b. This change is made so that the priority becomes lower than the current priority.
  • By making such a change (step S41), execution of processes other than the process in which the data cache miss occurred is given priority, and the generation of free time in the computing resources of the arithmetic unit 6 can be suppressed.
  • The priority variables are calculated/changed independently at the first to fourth timings described above, and the result is stored in the priority storage table 13b of the priority calculation unit 13 each time.
  • The priority variable of process B is calculated/changed in the same way as that of process A.
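  • The third and fourth timings can be pictured as two independent event handlers that update a shared table. The class below is a toy stand-in for the priority storage table 13b, written only to illustrate the direction of each change; the class name, method names, and initial values are assumptions, and a smaller value is treated as a higher priority.

```python
class PriorityStorageTable:
    """Toy stand-in for the priority storage table 13b."""

    def __init__(self, initial):
        # One priority variable per process, e.g. {"A": 10, "B": 10}.
        self.p = dict(initial)

    def on_response_entered_in_cache(self, process, i4=1):
        # Third timing (step S31): subtract i4 (> 0), raising the priority.
        self.p[process] -= i4

    def on_data_cache_miss(self, process, i5=1):
        # Fourth timing (step S41): add i5 (> 0), lowering the priority.
        self.p[process] += i5


table_13b = PriorityStorageTable({"A": 10, "B": 10})
table_13b.on_data_cache_miss("A")            # process A misses twice ...
table_13b.on_data_cache_miss("A")
table_13b.on_response_entered_in_cache("A")  # ... then one response arrives
```

  • Because each handler touches only the entry of the process concerned, the same code serves process B and any further processes.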
  • Instruction processing control is performed by a priority calculation step S50, in which the priority calculation unit 13 calculates the priority of each process, and a control step S51, in which the control unit 14 refers to the priority of each process calculated/changed in the priority calculation step S50 and controls the operation of the selector 4 so as to preferentially select the instruction of the process with the higher priority and output it to the decoder 5.
  • In the priority calculation step S50, the priority of each process is calculated/changed in the priority calculation unit 13 by the procedure described above with reference to the flowcharts.
  • The priority calculation step S50 is designed so as not to become a bottleneck for the clock cycle of the arithmetic unit.
  • In the control step S51, the control unit 14 controls the selector 4: it compares the priority variables of processes A and B stored in the priority storage table 13b of the priority calculation unit 13, selects from the instructions held in the instruction buffer 3 the instructions of the process with the higher priority first, and outputs at most four instructions to the decoder 5.
  • If there is no instruction of the highest-priority process in the instruction buffer 3, an instruction belonging to the process with the next-highest priority is selected. If the number of instructions to be sent from the instruction buffer 3 to the decoder 5 is smaller than the number of decoders (here, four), instructions of each process are selected one after another in descending order of priority and sent to the decoder 5. In this case, instructions of multiple processes are decoded in one cycle.
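  • A minimal sketch of this selection rule, assuming the instruction buffer 3 can be modelled as one queue per process and that a smaller priority variable means a higher priority. The function name and data layout are hypothetical; only the width of four decoders and the descending-priority fill order come from the description above.

```python
def select_instructions(buffer, priority, width=4):
    """Pick up to `width` instructions for the decoder in one cycle.

    buffer:   dict mapping process -> list of pending instructions (oldest first)
    priority: dict mapping process -> priority variable (smaller = higher priority)
    Returns a list of (process, instruction) pairs.
    """
    selected = []
    # Visit processes from highest to lowest priority.
    for process in sorted(buffer, key=lambda proc: priority[proc]):
        for instr in buffer[process]:
            if len(selected) == width:
                return selected
            selected.append((process, instr))
    return selected


buffer_3 = {"A": ["a0", "a1"], "B": ["b0", "b1", "b2"]}
priority = {"A": 12, "B": 9}  # B currently has the higher priority
issued = select_instructions(buffer_3, priority)
```

  • Here process B supplies three instructions and the remaining decoder slot is filled from process A, so instructions of both processes are decoded in the same cycle.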
  • The control unit 14 is configured to operate, and switch processes, once every predetermined period rather than in every cycle in which the selector 4 outputs instructions to the decoder 5. This is because, if the control unit 14 were activated every time the selector 4 outputs instructions to the decoder 5 and processes were switched based on the priority each time, the cost of switching processes would increase and would instead hinder the improvement of throughput.
  • Alternatively, the control unit 14 may be configured to compare the priority variables of processes A and B stored in the priority storage table 13b of the priority calculation unit 13 in every cycle in which the selector 4 outputs instructions to the decoder 5 and, if the difference between the priority variables of process A and process B is equal to or larger than a preset threshold value, to preferentially select the instructions of the process with the higher priority and output them to the decoder 5. This also suppresses the cost increase associated with process switching described above, and processes can be switched efficiently, so that the throughput can be reliably improved.
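  • The threshold variant can be sketched as a small decision function. The description only says that the higher-priority process is preferred when the difference reaches the threshold; keeping the currently preferred process when the difference is smaller is an assumption of this sketch, as are the function and parameter names. A smaller priority variable means a higher priority.

```python
def choose_process(current, priority, threshold):
    """Return the process whose instructions the selector should prefer.

    The preference moves away from `current` only when another process has a
    priority variable that is smaller by at least `threshold`.
    """
    best = min(priority, key=priority.get)  # process with the highest priority
    if best != current and priority[current] - priority[best] >= threshold:
        return best
    return current  # assumed behaviour: small differences do not cause a switch


preferred = choose_process("A", {"A": 10, "B": 6}, threshold=3)
```

  • A larger threshold makes switching rarer, which is the trade-off between responsiveness and switching cost described above.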
  • As described above, the priority of each process is calculated/changed according to the actual instruction execution status, and the instructions held in the instruction buffer 3 are output to the decoder 5 based on that priority and executed in the arithmetic unit 6. As a result, the computing resources of the arithmetic unit 6 are used efficiently by the plurality of processes, the generation of free time in the computing resources is reliably suppressed, and the throughput is greatly improved.
  • In the embodiment described above, the priority calculation unit 13 is configured to calculate the priority variables at the first to fourth timings; however, it may instead be configured to perform the calculation at only some of these timings.
  • FIG. 9 is a block diagram illustrating a functional configuration of the information processing apparatus according to the second embodiment. Note that, in FIG. 9, the same reference numerals as those described above indicate the same or almost the same portions, and thus detailed description thereof will be omitted.
  • The information processing apparatus according to the second embodiment of the present invention has the same configuration as the information processing apparatus of the first embodiment described above, and further includes operation means 16, monitoring means 18, and display means 20.
  • the operation means 16 is for manually operating the characteristic of the workload as the execution environment of a certain process or the value (also referred to as a reference value) for determining the priority of a certain process.
  • The monitoring means 18 monitors the execution status of the workload of each process executed by the information processing apparatus. Specifically, it monitors the usage status of each function of the information processing apparatus, such as the usage status of the computing resources and of the cache memory by each process.
  • the display means 20 displays the execution status of the workload of each process monitored by the monitor means 18 on a monitor or the like connected to the information processing apparatus.
  • Specifically, the value (reference value) for determining the priority of a specific process is a value that the calculation unit 13a of the priority calculation unit 13 applies when calculating the priority of that process.
  • The operation means 16 is a man-machine interface consisting of hardware such as a keyboard and a mouse, and software for reflecting information (numerical values and the like) entered by operating the hardware in the values used by the calculation unit 13a.
  • the priority of the process A can be increased.
  • the priority initial value of the process A is made smaller than the priority initial values of the other processes B. As a result, the priority variable of the process A can be relatively reduced, and the priority of the process A can be increased.
  • the priority calculation unit 13 increases the values of the correction values i3 and i4 used for calculating the priority variable of the process A.
  • the priority of the process A can be made higher than that of the other process B.
  • When the reference value is changed by the operation means 16 as in (1) to (5) above, the user can freely set the amount of change.
  • In this way, the operation means 16 allows the reference values used for calculating/changing the priority in the priority calculation unit 13 to be manipulated so that the priority of a specific process becomes higher, so the specific process can be executed with priority and the user's intention can be reflected.
  • Alternatively, by applying to the reference values of process B the reverse of operations (1) to (5) above, the priority of process B may be relatively lowered so that process A is executed with priority.
  • The execution status of the workload of each process is monitored by the monitoring means 18, and the monitoring result is displayed on a monitor or the like by the display means 20, so the user can visually check the workload execution status. Therefore, after recognizing the execution status of the workload of each process, the user can use the operation means 16 to manipulate the reference values used for calculating/changing the priority in the priority calculation unit 13, and a specific process can be executed preferentially according to the workload status of each process.
  • FIG. 10 is a block diagram illustrating a functional configuration of the information processing apparatus according to the third embodiment.
  • In FIG. 10, the same reference numerals as those described above indicate the same or almost the same parts, and thus detailed description thereof will be omitted.
  • The information processing apparatus according to the third embodiment has the same configuration as the information processing apparatus of the first embodiment described above, and further includes a table 17 that holds execution statuses of a workload as an execution environment together with the corresponding reference values, monitoring means 18, and changing means 19 that, on receiving the monitoring result from the monitoring means 18, refers to the table 17 and dynamically changes the reference value of the process to a value according to the workload execution status monitored by the monitoring means 18.
  • The table 17 is configured to hold specific workload execution statuses and the reference values corresponding to them, according to the user's intention and the like. As an example, as shown in FIG. 11, it holds the workload execution statuses of Case 1 to Case 3 and the reference values corresponding to each.
  • Case 1 in table 17 associates the workload execution status in which the hit rates in the primary instruction cache memory 2 and the primary data cache memory 8 (that is, the rate at which the instructions of the process are found in the primary instruction cache memory 2 and the rate at which the data used by the instructions is found in the primary data cache memory 8) are 97% or more with the value 1 of the reference value X2 (see steps S12 and S13 in FIG. 4) used by the calculation unit 13a of the priority calculation unit 13 for calculating/changing the priority variable at the first timing.
  • Case 2 in table 17 associates the workload execution status in which the number of data fetch requests issued when executing the instructions of the process is 300000 or less with the value 1 of the reference value X2 (see steps S12 and S13 in FIG. 4) used by the calculation unit 13a of the priority calculation unit 13 for calculating/changing the priority variable at the first timing.
  • Case 3 in table 17 associates the workload execution status in which a cache miss caused by the process occurs at least once in 500 instructions and the execution time of the process in the arithmetic unit 6 is 70% or less of the execution time of the other process executed in parallel with it, with the following changes: the correction value i1 used by the calculation unit 13a of the priority calculation unit 13 for calculating/changing the priority variable at the first timing (see step S16 in FIG. 4) is decremented by 1, and the correction value i3 used for calculating/changing the priority variable at the second timing (see step S21 in FIG. 5) or the correction value i4 used for calculating/changing the priority variable at the third timing (see step S31 in FIG. 6) is increased.
  • The monitoring means 18 monitors the execution status of the workload of each process. Specifically, the monitoring means 18 monitors the usage status of each function of the information processing apparatus, such as the usage status of the computing resources by each process and the usage status of the cache memory.
  • The control unit 14 controls the selector 4, based on the priority of each process calculated/changed by the priority calculation unit 13, so that the instructions of the process with the higher priority are preferentially selected and output from the instruction buffer 3 to the decoder 5.
  • Meanwhile, the monitoring means 18 constantly monitors the workload execution status of each process, and the changing means 19 receives the monitoring result from the monitoring means 18 and refers to the table 17. If the monitoring result corresponds to a workload execution status held in the table 17, the changing means 19 changes the reference values in the priority calculation unit 13 to the reference values held in the table 17 for that workload execution status.
  • For example, if the monitoring result of the workload of process A by the monitoring means 18 corresponds to the workload execution status shown in Case 1 of table 17, the changing means 19 sets the reference value X2 of process A in the priority calculation unit 13 to 1. As a result, in a situation where cache misses hardly occur, the priority of process A can be set lower when a cache miss does occur in process A (see step S14 in FIG. 4), and a good execution balance with process B can be maintained.
  • Similarly, if the monitoring result corresponds to the workload execution status shown in Case 2 of table 17, the changing means 19 sets the reference value X2 of process A in the priority calculation unit 13 to 1. As a result, the priority of process A can be set lower (see step S14 in FIG. 4), and the execution balance (the balance of sharing of the computing resources) can be maintained well.
  • If the monitoring result of the workload of process A by the monitoring means 18 corresponds to the workload execution status shown in Case 3 of table 17, the priority can be set higher for process A, in which cache misses occur frequently and whose share of the computing resources is small compared with the other process (here, process B) (see step S16 in FIG. 4 and step S21 in FIG. 5 or step S31 in FIG. 6). This makes it possible to raise the priority of process A, which uses fewer of the computing resources of the arithmetic unit 6 than the other process B and is not being executed smoothly, so that the execution balance can be kept good.
  • In this way, the changing means 19 dynamically changes the reference values of the priority calculation unit 13 based on the workload execution statuses held in the table 17 and the reference values corresponding to them, so the priority of the process is changed dynamically according to the workload execution status. As a result, a plurality of processes can be executed in a manner suited to the workload execution status, and the user's intention can be reflected according to the workload execution status.
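  • The lookup performed by the changing means 19 against table 17 can be sketched as below, using the three example cases. Several points are assumptions of this sketch rather than statements of the description: the `Workload` field names, checking the cases in order and applying only the first match, choosing i3 rather than i4 in Case 3, and the size of the increase (`bump`).

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Monitoring result for one process (field names are hypothetical)."""
    icache_hit_rate: float        # hit rate in the primary instruction cache
    dcache_hit_rate: float        # hit rate in the primary data cache
    fetch_requests: int           # number of data fetch requests
    instructions_per_miss: float  # average instructions between cache misses
    exec_time_ratio: float        # execution time relative to the other process


def apply_table_17(w, ref, bump=1):
    """Return a copy of the reference values `ref` changed per the matching case."""
    ref = dict(ref)
    if min(w.icache_hit_rate, w.dcache_hit_rate) >= 0.97:    # Case 1
        ref["X2"] = 1
    elif w.fetch_requests <= 300_000:                        # Case 2
        ref["X2"] = 1
    elif w.instructions_per_miss <= 500 and w.exec_time_ratio <= 0.70:  # Case 3
        ref["i1"] -= 1     # lower the priority less at the first timing
        ref["i3"] += bump  # raise it more at the second timing (amount assumed)
    return ref


reference_values = {"X2": 4, "i1": 2, "i3": 1, "i4": 1}
struggling = Workload(0.90, 0.80, 1_000_000, 200, 0.5)
changed = apply_table_17(struggling, reference_values)
```

  • Returning a changed copy keeps the original reference values available, which is convenient if the workload later stops matching any case.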
  • The workload execution statuses, the contents of the corresponding reference values, and the specific numerical values held in the table 17 in this embodiment are shown as examples, and the invention is not limited to them.
  • The workload execution statuses and the corresponding reference values held in the table 17 may be changed according to the user's intention, the execution status of the processes in the information processing apparatus, and the like.
  • the present invention is not limited to this.
  • the present invention is applied similarly to the above-described embodiments even when three or more processes are executed in parallel, and the same operation and effect as the above-described embodiments can be obtained.
  • The types of request handled by the request management unit 12 may include types other than data cache misses. In that case, individual reference values may be prepared for each type of request. Furthermore, a maximum value and a minimum value of the priority variable of the process may be set according to the type of request, and the apparatus may be configured to stop increasing/decreasing the priority variable (the calculation/change of the priority variable) when these values would be exceeded.
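  • One way to picture the per-type limits is a guard around every update of the priority variable. The request type names and the numeric limits below are hypothetical, and skipping an update that would leave the allowed range is only one reading of "stop increasing/decreasing"; clamping to the limit would be another.

```python
# Hypothetical (minimum, maximum) limits of the priority variable per request type.
LIMITS = {
    "data_cache_miss": (0, 15),
    "other": (0, 7),
}


def adjust(p, delta, request_type, limits=LIMITS):
    """Apply `delta` to priority variable `p` unless the result would leave
    the range allowed for `request_type`; in that case `p` is kept as is."""
    lowest, highest = limits[request_type]
    candidate = p + delta
    if lowest <= candidate <= highest:
        return candidate
    return p  # the increase/decrease is stopped


p = adjust(14, +1, "data_cache_miss")  # still inside the range
p = adjust(p, +1, "data_cache_miss")   # would exceed the maximum, so unchanged
```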
  • The functions of the request management unit 12, the priority calculation unit 13 (calculation unit 13a), the control unit 14, the collection unit 15, the operation means 16, the monitoring means 18, and the changing means 19 described above are realized by a computer (including a CPU, an information processing apparatus, and various terminals) executing a predetermined application program (instruction processing control program).
  • the program is provided in a form recorded on a computer-readable recording medium such as a flexible disk, a CD-ROM, a CD-R, a CD-RW, and a DVD.
  • the computer reads the instruction processing control program from the recording medium, transfers it to the internal storage device or the external storage device, stores it, and uses it.
  • Alternatively, the program may be recorded on a storage device (recording medium) such as a magnetic disk, an optical disk, or a magneto-optical disk, and provided to the computer from the storage device via a communication line.
  • Here, the computer is a concept including hardware and an OS (operating system), and means hardware that operates under the control of the OS. When an OS is not required and the application program operates the hardware by itself, the hardware itself corresponds to the computer.
  • the hardware has at least a microprocessor such as a CPU and a means for reading a computer program recorded on a recording medium.
  • The application program serving as the instruction processing control program includes program code for causing a computer as described above to realize the functions of the request management unit 12, the priority calculation unit 13, the control unit 14, the collection unit 15, the operation means 16, the monitoring means 18, and the changing means 19.
  • Some of these functions may be realized by the OS instead of the application program.
  • In addition to the media listed above, various computer-readable media can also be used as the recording medium, such as an IC card, a ROM cartridge, a magnetic tape, a magnetic disk, a computer's internal storage (memory such as RAM (random access memory) and ROM (read-only memory)), an external storage device, and printed material on which codes such as bar codes are printed.
  • As described above, according to the present invention, the priority of each process is calculated/changed so that the computing resources can be used effectively according to the actual instruction execution status, and the processes are executed based on this priority. It is therefore possible to reduce the idle time of the computing resources more reliably and to achieve a significant improvement in throughput.
  • the present invention is suitable for use in an information processing device such as a computer employing a multi-thread system, and its usefulness is considered to be extremely high.


Abstract

The present invention relates to an information processing device that executes two or more processes in parallel, in which computing resources are used efficiently according to the actual instruction execution status and the idle time of the computing resources is reliably reduced so as to achieve a significant improvement in throughput. The information processing device comprises a priority calculation unit (13) for calculating/changing the priority of each process so as to reduce the idle time of the computing resources in an arithmetic unit (6), and a control unit (14) for referring to the priority of each process calculated/changed by the priority calculation unit (13) and controlling the operation of a selector (4) so as to preferentially select an instruction of higher priority and output it to a decoder (5).
PCT/JP2003/009635 2003-07-30 2003-07-30 Dispositif de traitement d'information, dispositif de commande de traitement d'instruction, procede de commande de traitement d'instruction, programme de commande de traitement d'instruction et support d'enregistrement lisible par ordinateur contenant ce programme WO2005013129A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2003/009635 WO2005013129A1 (fr) 2003-07-30 2003-07-30 Dispositif de traitement d'information, dispositif de commande de traitement d'instruction, procede de commande de traitement d'instruction, programme de commande de traitement d'instruction et support d'enregistrement lisible par ordinateur contenant ce programme

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2003/009635 WO2005013129A1 (fr) 2003-07-30 2003-07-30 Dispositif de traitement d'information, dispositif de commande de traitement d'instruction, procede de commande de traitement d'instruction, programme de commande de traitement d'instruction et support d'enregistrement lisible par ordinateur contenant ce programme

Publications (1)

Publication Number Publication Date
WO2005013129A1 true WO2005013129A1 (fr) 2005-02-10

Family

ID=34113451

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2003/009635 WO2005013129A1 (fr) 2003-07-30 2003-07-30 Dispositif de traitement d'information, dispositif de commande de traitement d'instruction, procede de commande de traitement d'instruction, programme de commande de traitement d'instruction et support d'enregistrement lisible par ordinateur contenant ce programme

Country Status (1)

Country Link
WO (1) WO2005013129A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905783A (zh) * 2012-12-25 2014-07-02 杭州海康威视数字技术股份有限公司 对视频流进行解码显示的方法及设备

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Agarwal A. et al., APRIL: A Processor Architecture for Multiprocessing In: Proceedings of the 17th Annual Symposium on Computer Architecture, 1990, pages 104 - 114 *
HIRATA S. et al., "Taju Seigyo Flow Kiko o Sonaeta Shigen Kyoyugata Processor Architecture", Information Processing Society of Japan Kenkyu Hokoku, Vol. 92, No. 48 (92-ARC-94), JP, Information Processing Society of Japan, 1992, pages 9 - 16 *

Similar Documents

Publication Publication Date Title
US8468324B2 (en) Dual thread processor
US7051329B1 (en) Method and apparatus for managing resources in a multithreaded processor
JP5329563B2 (ja) マルチスレッド・プロセッサのための共有割込みコントローラ
US5742782A (en) Processing apparatus for executing a plurality of VLIW threads in parallel
US8516024B2 (en) Establishing thread priority in a processor or the like
US8386753B2 (en) Completion arbitration for more than two threads based on resource limitations
JP5413853B2 (ja) マルチスレッド型プロセッサのためのスレッドデエンファシス方法及びデバイス
US5727227A (en) Interrupt coprocessor configured to process interrupts in a computer system
CN101501636A (zh) 用于基于动态可改变延迟来执行处理器指令的方法和设备
JP2004518183A (ja) マルチスレッド・システムにおける命令のフェッチとディスパッチ
US8261049B1 (en) Determinative branch prediction indexing
JP5104861B2 (ja) 演算処理装置
EP2159691B1 (fr) Contrôleur d'achèvement d'instruction multifile simultané
EP1811375A1 (fr) Processeur
WO2022066559A1 (fr) Processeur à multiples pipelines d'extraction et de décodage
JP2007517322A (ja) プロセッサにおける同時物理スレッド数からの論理スレッド数のデカップリング
JP5155655B2 (ja) マイクロプロセッサ出力ポート、及び、そこから提供された命令の制御
JP2020091751A (ja) 演算処理装置および演算処理装置の制御方法
WO2005013129A1 (fr) Dispositif de traitement d'information, dispositif de commande de traitement d'instruction, procede de commande de traitement d'instruction, programme de commande de traitement d'instruction et support d'enregistrement lisible par ordinateur contenant ce programme
JP2004295195A (ja) 命令発行方法及び装置、中央演算装置、命令発行プログラム及びそれを記憶したコンピュータ読み取り可能な記憶媒体
JP4631442B2 (ja) プロセッサ
US11907126B2 (en) Processor with multiple op cache pipelines
KR100953986B1 (ko) 우선순위 기반 실행을 이용한 캐시미스 대기시간 활용 방법및 장치
US20040128488A1 (en) Strand switching algorithm to avoid strand starvation
JP2001356904A (ja) プロセッサ

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP