WO2007057831A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2007057831A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
register
operation result
data processing
written
Prior art date
Application number
PCT/IB2006/054213
Other languages
English (en)
Inventor
Jean-Paul C. F. H. Smeets
David E. Leane
Willem E. H. Kloosterhuis
Original Assignee
Nxp B.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nxp B.V.
Publication of WO2007057831A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824 Operand accessing
    • G06F9/3826 Bypassing or forwarding of data results, e.g. locally between pipeline stages or within a pipeline stage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824 Operand accessing
    • G06F9/383 Operand prefetching
    • G06F9/3832 Value prediction for operands; operand history buffers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3861 Recovery, e.g. branch miss-prediction, exception handling

Definitions

  • the invention relates to a data processing method and apparatus, and in particular to a data processing method and apparatus having power reduction in register based instruction sets.
  • the invention further relates to a device, such as a mobile phone, PDA or the like, comprising such a data processing apparatus.
  • registers are used as fast local storage elements. Instructions operate on these registers by reading the operands from the registers and writing the result to a register.
  • the provision of local registers means that fewer accesses to the main system memory are required to retrieve operands and to store the results of ALU operations. Accessing main system memory is considerably slower than accessing data in local registers.
  • Fig. 1 shows an example of a simplified data path in a typical pipeline data processing system 1.
  • the data path comprises a register file 3, an Arithmetic Logic Unit (ALU) 5, and an ALU input register 7.
  • the instruction reads its operands from the register file 3 (i.e. it reads registers r0 and r1).
  • the instruction is executed during cycle t2 by the ALU 5.
  • the result is then stored in the register file 3 (i.e. in register r2) during cycle t3.
  • the second instruction cannot read (RR) its operands in cycle t2 because the first instruction is still calculating the result (EXE).
  • the first instruction writes the result to the register r2 in cycle t3. Since the write (WR) takes place early in t3, this means that the second instruction can only validly read (RR) its operands asynchronously from register r2 (and register r3) in cycle t3, thus causing a delay in the data processing.
  • a known method of reducing this delay is to use a bypass network as shown in
  • the circuit shown in Fig. 5 comprises a register file 3 connected to an ALU 5, via an ALU input register 7, and having a feedback path 9 for storing results generated by the ALU 5 in the register file 3.
  • a bypass circuit comprising multiplexers 11 and 13 is also provided for effectively bypassing the register file 3.
  • the multiplexers 11 and 13 enable the results of the ALU 5 to be forwarded directly to the input register 7 of the ALU 5, effectively bypassing the register file 3.
  • the bypass circuit determines if the source register is the same as the destination register (in this case r2 and r2). If the source register matches the destination register, the bypass circuit is adapted to allow the ALU input register 7 to use the data from the multiplexers 11, 13, while the data is also being written to the register file 3.
  • the bypass circuit is adapted to pass an operation result of the first instruction to the second instruction as a source operand while the operation result is being written to the register file.
  • the second instruction can receive its source operand from the bypass circuit or the register file.
  • Fig. 6 shows how the second instruction is now able to read (RR) the operand data in cycle t2, i.e. via the bypass circuit, thus allowing instructions to be executed back-to-back.
  • the provision of the bypass circuit enables the delay described in Figs. 3 and 4 to be removed, since the execution (EXE) of instruction 2 can now take place in cycle t3. In other words, as the result is generated by the first instruction, it is forwarded in the same cycle to the ALU input register 7, and is thus available for execution by the second instruction.
  • the bypass circuit enables efficient utilization of the ALU-generated data in a pipeline processor.
  • the applicants have recognized that a pipeline processor having a bypass circuit has the disadvantage of utilizing more power.
  • the aim of the invention is to provide a data processing method and apparatus that enables power to be reduced in a data processing apparatus having a bypass circuit.
  • the invention provides a data processing apparatus for processing first and second instructions in a pipelined manner.
  • the data processing apparatus comprises a bypass circuit adapted to pass an operation result of the first instruction to the second instruction as a source operand while the operation result is being written to a register.
  • the second instruction can receive its source operand from the bypass circuit or the register.
  • the data processing apparatus further comprises write control means for selectively preventing the operation result of the first instruction from being written to the register.
  • the write control means is adapted to determine when the operation result of the first instruction is only used once by another instruction and the timing of the first and second instructions is such that the source operand for the second instruction is provided by the bypass network, and for preventing the operation result from being written to the register accordingly.
  • the invention has the advantage of reducing power consumption, since a register write operation is avoided in certain circumstances.
  • a method of processing first and second instructions in a pipelined manner comprises the steps of providing a bypass circuit for passing an operation result of the first instruction to the second instruction as a source operand while the operation result is being written to a register, wherein the second instruction can receive its source operand from the bypass circuit or the register.
  • the method also comprises the step of selectively preventing the operation result of the first instruction from being written to the register.
  • the invention also provides an instruction for a data processing apparatus that processes first and second instructions in a pipelined manner, the data processing apparatus having a bypass circuit for passing an operation result of a first instruction to a second instruction as a source operand while the operation result is being written to a register, wherein the second instruction can receive its source operand from the bypass circuit or the register.
  • the instruction comprises a data bit for enabling the data processing apparatus to selectively prevent the operation result of the first instruction from being written to the register.
  • Fig. 1 shows an illustration of a simple data path in a pipeline processor.
  • Fig. 2 shows the timing sequence of the operation of Fig. 1.
  • Fig. 3 shows the operation of the simple data path of Fig. 1 when performing first and second instructions.
  • Fig. 4 shows the timing sequence relating to Fig. 3.
  • Fig. 5 shows an illustration of a simple data path in a pipeline processor having a bypass circuit.
  • Fig. 6 shows the timing sequence relating to the operation of Fig. 5.
  • Fig. 7 shows a data processing apparatus according to the present invention.
  • the data processing apparatus comprises a register file 3 connected to an ALU 5 via an ALU input register 7, and having a feedback path 9 for storing results generated by the ALU 5 in the register file 3.
  • a bypass circuit, for example comprising multiplexers 11 and 13, is provided for effectively bypassing the register file 3.
  • the multiplexers 11 and 13 enable the results of the ALU 5 to be forwarded directly to the input register 7 of the ALU 5, thus effectively bypassing the register file 3.
  • a write control means 15 is provided, and is adapted to determine when the operation result of a first instruction is only used once by a second instruction, and the timing of the first and second instructions is such that a source operand for the second instruction is provided by the bypass network.
  • when the write control means 15 determines this to be the case, it is further adapted to prevent the operation result of the first instruction from being written to the register file 3. In this case the register contents will not be used again.
  • the write control means 15 is adapted to selectively prevent the operation result from the first instruction from being written to the register file 3.
  • the invention has the advantage of saving power by omitting the write operation to the register file 3 in this special circumstance. The saving follows from the write control means 15 determining, for each result, whether or not the write operation to the register file 3 can be omitted.
  • this is achieved by including information in every instruction which enables the write control means 15 to determine if the write operation to the register file 3 can be prevented or omitted.
  • a data bit is included in every instruction, which indicates if the write to the register file 3 may be omitted, or if the write to the register file 3 must be carried out.
  • if this data bit is set, the instruction can omit the write to the register file 3.
  • if this data bit is not set, the hardware must write to the register file 3.
  • other implementations are also possible for selectively preventing a write operation from being performed.
  • the invention relies on the condition that the result of a first instruction is only used once by a second instruction, and the timing of the first and second instructions is such that an operand for the second instruction is provided by the bypass network.
  • when this condition is not met, the write control means 15 is adapted to ignore the data bit and always write to the register file 3, to allow the second instruction to read its operands from the register file 3 at a later time. In other words, the write control means 15 effectively overrides the decision that would otherwise prevent the operation result from being written to the register file 3.
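
The write decision described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the circuit of Fig. 7: the names `may_skip_write`, `write_enable`, `run` and `disturbed_after` are hypothetical, every instruction is assumed to be an addition, and "timing disturbed" is modelled simply as the next instruction being unable to take its operand from the bypass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str
    srcs: tuple
    may_skip_write: bool = False  # hypothetical name for the per-instruction data bit


def write_enable(instr, consumer_uses_bypass):
    """Return True if the result must be written to the register file."""
    if not instr.may_skip_write:
        return True  # bit not set: the hardware must write
    # Bit set: omit the write only if the consumer really gets the value
    # from the bypass; otherwise the bit is ignored and the write happens.
    return not consumer_uses_bypass


def run(program, regs, disturbed_after=frozenset()):
    """Execute add-instructions in order; return (registers, register-file writes)."""
    regs = dict(regs)
    writes = 0
    forwarded = None  # (register name, value) available on the bypass
    for i, ins in enumerate(program):
        operands = [
            forwarded[1] if forwarded and forwarded[0] == s else regs[s]
            for s in ins.srcs
        ]
        result = sum(operands)
        back_to_back = i not in disturbed_after  # assumed timing model
        if write_enable(ins, consumer_uses_bypass=back_to_back):
            regs[ins.dest] = result
            writes += 1
        forwarded = (ins.dest, result) if back_to_back else None
    return regs, writes


regs0 = {"r0": 1, "r1": 2, "r2": 0, "r3": 10, "r4": 0}
prog = [
    Instr("r2", ("r0", "r1"), may_skip_write=True),  # r2 is used only once
    Instr("r4", ("r2", "r3")),                       # r4 is an assumed destination
]
print(run(prog, regs0)[1])                       # 1 write: r2 is never stored
print(run(prog, regs0, disturbed_after={0})[1])  # 2 writes: the bit is overridden
```

In both runs the second instruction computes the same value for `r4`; only the number of register-file writes differs, which is where the power saving comes from.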

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The data processing apparatus comprises a register file (3) connected to an ALU (5) via an ALU input register (7). A feedback path (9) is provided for storing results generated by the ALU (5) in the register file (3). A bypass circuit comprising, for example, multiplexers (11 and 13) is used to effectively bypass the register file (3), allowing the results of the ALU (5) to be forwarded directly to the input register (7) of the ALU (5), so that the register file (3) can be bypassed when the result of a first instruction is used as an operand of a second instruction during the following processing cycle. A bypass control means (15) is provided for determining whether the result of a first instruction is used only once by a second instruction and the timing of the first and second instructions is such that an operand for the second instruction is provided by the bypass network. When the bypass control means (15) determines this to be the case, it is also able to prevent the result of the first instruction from being written to the register file. In this case, the contents will not be used again. The invention has the advantage of saving power by omitting this write to the register file.
PCT/IB2006/054213 2005-11-15 2006-11-13 Data processing method and apparatus WO2007057831A1 (fr)
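
The effect of the bypass on instruction timing can be illustrated with the three stages used in the description (RR, EXE, WR). The following Python sketch is a simplified model under assumptions that go beyond the source: one instruction issues per cycle, a value written to the register file can be read in the same cycle as the write, a bypassed value can be picked up in the cycle in which it is computed, and `r4` is a made-up destination for the second instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str
    srcs: tuple


def schedule(program, bypass):
    """Return the (RR, EXE, WR) cycle numbers of each instruction."""
    ready = {}     # register -> first cycle in which a reader may obtain it
    timeline = []
    prev_rr = 0
    for ins in program:
        rr = prev_rr + 1                      # at most one instruction per cycle
        for src in ins.srcs:
            rr = max(rr, ready.get(src, 1))   # wait for a pending producer
        exe, wr = rr + 1, rr + 2
        # With the bypass the result is forwarded in the cycle it is computed;
        # without it, readers must wait for the write to the register file.
        ready[ins.dest] = exe if bypass else wr
        timeline.append((rr, exe, wr))
        prev_rr = rr
    return timeline


program = [Instr("r2", ("r0", "r1")), Instr("r4", ("r2", "r3"))]
print(schedule(program, bypass=False))  # [(1, 2, 3), (3, 4, 5)]
print(schedule(program, bypass=True))   # [(1, 2, 3), (2, 3, 4)]
```

Without the bypass the second instruction reads its operands in cycle 3; with the bypass it reads in cycle 2 and executes in cycle 3, matching the back-to-back execution the description attributes to Fig. 6.
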

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05110767.0 2005-11-15
EP05110767 2005-11-15

Publications (1)

Publication Number Publication Date
WO2007057831A1 (fr) 2007-05-24

Family

ID=37772554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/054213 WO2007057831A1 (fr) 2005-11-15 2006-11-13 Data processing method and apparatus

Country Status (2)

Country Link
TW (1) TW200811710A (fr)
WO (1) WO2007057831A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7730284B2 (en) * 2003-03-19 2010-06-01 Koninklijke Philips Electronics N.V. Pipelined instruction processor with data bypassing and disabling circuit
CN104216681A (zh) * 2013-05-31 2014-12-17 华为技术有限公司 CPU instruction processing method and processor
WO2016130275A1 (fr) * 2015-02-09 2016-08-18 Qualcomm Incorporated Reservation station having an instruction with selective use of a special register as a source operand according to instruction bits
US11150909B2 (en) 2015-12-11 2021-10-19 International Business Machines Corporation Energy efficient source operand issue

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001061469A2 (fr) * 2000-02-16 2001-08-23 Koninklijke Philips Electronics N.V. Apparatus and method for reducing register write traffic in processors with exception routines
WO2001061478A2 (fr) * 2000-02-16 2001-08-23 Koninklijke Philips Electronics N.V. System and method for reducing write traffic in a processor
EP1199629A1 (fr) * 2000-10-17 2002-04-24 STMicroelectronics S.r.l. Processor architecture with variable-stage pipeline

Also Published As

Publication number Publication date
TW200811710A (en) 2008-03-01

Similar Documents

Publication Publication Date Title
US6745336B1 (en) System and method of operand value based processor optimization by detecting a condition of pre-determined number of bits and selectively disabling pre-determined bit-fields by clock gating
US6205543B1 (en) Efficient handling of a large register file for context switching
US8386754B2 (en) Renaming wide register source operand with plural short register source operands for select instructions to detect dependency fast with existing mechanism
US11663006B2 (en) Hardware apparatuses and methods to switch shadow stack pointers
US5764943A (en) Data path circuitry for processor having multiple instruction pipelines
US20160055004A1 (en) Method and apparatus for non-speculative fetch and execution of control-dependent blocks
KR101048234B1 (ko) Method and system for combining multiple register units within a microprocessor
US10678541B2 (en) Processors having fully-connected interconnects shared by vector conflict instructions and permute instructions
US9141386B2 (en) Vector logical reduction operation implemented using swizzling on a semiconductor chip
JP2003523573A (ja) System and method for reducing write traffic in a processor
KR20100032441A (ko) Method and system for expanding a conditional instruction into an unconditional instruction and a select instruction
TW201203103A (en) Operand size control
JPH09311786A (ja) Data processing device
US9317285B2 (en) Instruction set architecture mode dependent sub-size access of register with associated status indication
KR20070026434A (ko) Apparatus and method for control processing in a dual-path processor
TW201510861A (zh) Instruction pairs, processors, methods, and systems for enforcing instruction order
US10579378B2 (en) Instructions for manipulating a multi-bit predicate register for predicating instruction sequences
KR101077425B1 (ko) Efficient interrupt return address save mechanism
WO2007057831A1 (fr) Data processing method and apparatus
US20070220235A1 (en) Instruction subgraph identification for a configurable accelerator
JP2009169767A (ja) Pipelined processor
US12086595B2 (en) Apparatuses, methods, and systems for instructions for downconverting a tile row and interleaving with a register
US6625634B1 (en) Efficient implementation of multiprecision arithmetic
US11544065B2 (en) Bit width reconfiguration using a shadow-latch configured register file
US6851044B1 (en) System and method for eliminating write backs with buffer for exception processing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06821407

Country of ref document: EP

Kind code of ref document: A1