US20180226970A1 - Shift operation circuit and shift operation method - Google Patents

Shift operation circuit and shift operation method

Info

Publication number
US20180226970A1
Authority
US
United States
Prior art keywords
shift
data
circuit
bit
shift amount
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/877,765
Other versions
US10056906B1 (en)
Inventor
Tomoharu Miyadai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Assigned to FUJITSU LIMITED. Assignment of assignors interest (see document for details). Assignors: MIYADAI, TOMOHARU
Publication of US20180226970A1
Application granted
Publication of US10056906B1
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03K PULSE TECHNIQUE
    • H03K19/00 Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
    • H03K19/0175 Coupling arrangements; Interface arrangements
    • H03K19/017509 Interface arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F5/00 Methods or arrangements for data conversion without changing the order or content of the data handled
    • G06F5/01 Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
    • G06F5/015 Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising having at least two separately controlled shifting levels, e.g. using shifting matrices

Definitions

  • the embodiments discussed herein relate to a shift operation circuit and a shift operation method.
  • SIMD Single Instruction Multiple Data
  • CPU Central Processing Unit
  • Such a processor includes a plurality of arithmetic units, such as an adder, a logical unit, and a shifter. It causes the plurality of arithmetic units to operate in a coupled manner when an instruction indicates a scalar mode, and causes them to operate independently from each other when an instruction indicates a vector mode (for example, see Patent Document 1).
  • a processor includes a pair of Arithmetic Logic Units (ALU) and a pair of shifters coupled to each other via a shift data selecting circuit.
  • ALU Arithmetic Logic Units
  • the processor causes the shifters to operate in a coupled manner as well as causes the ALUs to operate in a coupled manner.
  • the processor causes the shifters to operate independently from each other as well as causes the ALUs to operate independently from each other (for example, see Patent Document 2).
  • Patent Document 1 Japanese Laid-open Patent Publication No. H8-50575
  • Patent Document 2 Japanese Laid-open Patent Publication No. 2009-15555
  • a shift operation circuit for executing digit alignment of a significand includes a plurality of shift circuits that respectively shift a plurality of sets of data divided when executing a SIMD instruction.
  • a shift operation circuit includes: a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping, each of the plurality of shift circuits being configured to receive corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data and to receive a corresponding shift amount signal that is one of a plurality of shift amount signals, each of the plurality of shift circuits being configured to output the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus; a shift control circuit configured to receive, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and to output, as the corresponding shift amount signal, the received plurality of shift amount signals to each of the plurality of shift circuits, and the shift
  • a shift operation method for a shift operation circuit including a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping includes: receiving, by each of the plurality of shift circuits, corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data; receiving, by each of the plurality of shift circuits, a corresponding shift amount signal that is one of a plurality of shift amount signals; outputting, by each of the plurality of shift circuits, the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus; receiving, by a shift control circuit included in the shift operation circuit, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and outputting, as the corresponding shift amount
  • FIG. 1 is a diagram illustrating a shift operation circuit according to an embodiment;
  • FIG. 2 is a diagram illustrating an example of an operation processing apparatus on which the shift operation circuit that is illustrated in FIG. 1 is mounted;
  • FIG. 3 is a diagram illustrating an example of a floating-point adder that is illustrated in FIG. 2 ;
  • FIG. 4 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 1 ;
  • FIG. 5 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 1 ;
  • FIG. 6 is a diagram illustrating an example of an operation in a normal mode of the shift operation circuit, which is illustrated in FIG. 1 ;
  • FIG. 7 is a diagram illustrating an example of an operation in a SIMD mode of the shift operation circuit, which is illustrated in FIG. 1 ;
  • FIG. 8 is a diagram illustrating an example of a case in which parity predictors that predict parity bits are built in the shift operation circuit, which is illustrated in FIG. 1 ;
  • FIG. 9 is a diagram illustrating an example of allocation of data and parity bits in the shift operation circuit, which is illustrated in FIG. 8 ;
  • FIG. 10 is a diagram illustrating a shift operation circuit as another example;
  • FIG. 11 is a diagram illustrating a shift operation circuit according to another embodiment;
  • FIG. 12 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 11 ;
  • FIG. 13 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 11 ;
  • FIG. 14 is a diagram illustrating an example of an operation in the normal mode of the shift operation circuit, which is illustrated in FIG. 11 ;
  • FIG. 15 is a diagram illustrating an example of an operation in the SIMD mode of the shift operation circuit, which is illustrated in FIG. 11 ;
  • FIG. 16 is a diagram illustrating a shift operation circuit according to another embodiment;
  • FIG. 17 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 16 ;
  • FIG. 18 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 16 ;
  • FIG. 19 is a diagram illustrating an example of an operation in the normal mode of the shift operation circuit, which is illustrated in FIG. 16 ;
  • FIG. 20 is a diagram illustrating an example of an operation in the SIMD mode of the shift operation circuit, which is illustrated in FIG. 16 ;
  • FIG. 21 is a diagram illustrating an example of a shift operation of the shift operation circuit, which is illustrated in FIG. 16 .
  • FIG. 1 illustrates a shift operation circuit 100 according to an embodiment.
  • the shift operation circuit 100 includes shift control circuits 10 and 11 , shift circuits 20 a and 20 b, buffer circuits 30 and 31 , and a bit selecting circuit 40 .
  • the shift control circuit 10 receives a 7-bit shift amount signal SAH[ 6 : 0 ] that represents a shift amount of the shift circuit 20 a, and changes logical values of the shift amount signal SAH[ 6 : 0 ] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SAH 1 [ 6 : 0 ].
  • the mode signal SIMD is set to the logical value 1 during a SIMD mode in which an operation processing apparatus 200 executes a SIMD operation based on a SIMD instruction
  • the mode signal SIMD is set to the logical value 0 during a normal mode in which the operation processing apparatus 200 executes a single operation based on a normal instruction.
  • the normal mode is an example of a first mode
  • the SIMD mode is an example of a second mode.
  • the shift control circuit 11 receives a 7-bit shift amount signal SAL[ 6 : 0 ] that represents a shift amount of the shift circuit 20 b, and changes logical values of the shift amount signal SAL[ 6 : 0 ] in accordance with the mode signal SIMD, and outputs the changed signal as a shift amount signal SAL 1 [ 6 : 0 ].
  • the shift control circuits 10 and 11 may be provided, on the shift operation circuit 100 , as one shift control circuit.
  • the shift amount signals SAH[ 6 : 0 ], SAL[ 6 : 0 ], SAH 1 [ 6 : 0 ], and SAL 1 [ 6 : 0 ] may also be referred to as the shift amount signals SAH, SAL, SAH 1 , and SAL 1 by omitting the bit numbers.
  • In the normal mode, the shift amount signals SAH[ 6 : 0 ] and SAL[ 6 : 0 ] are set to values equal to each other.
  • In the SIMD mode, the shift amount signals SAH[ 6 : 0 ] and SAL[ 6 : 0 ] are set independently from each other.
  • In the SIMD mode, the most significant bit SAH 1 [ 6 ] of the shift amount signal SAH 1 is set to the logical value 0, and the most significant bit SAL 1 [ 6 ] of the shift amount signal SAL 1 is set to the logical value 1.
  • the shift amount signal SAH 1 of which the most significant bit SAH 1 [ 6 ] is set to the logical value 0, represents one of “0” to “63”.
  • the shift amount signal SAL 1 of which the most significant bit SAL 1 [ 6 ] is set to the logical value 1, represents one of “64” to “127”.
  • In the SIMD mode, the shift amount signals SAH[ 6 : 0 ] and SAL[ 6 : 0 ] are thus converted into shift amount signals SAH 1 [ 6 : 0 ] and SAL 1 [ 6 : 0 ] that represent shift ranges whose bit numbers do not overlap on the internal buses RH[ 191 : 33 ] and RL[ 159 : 1 ].
  • An example of the shift control circuits 10 and 11 is illustrated in FIG. 4 .
  • the shift circuit 20 a receives 32-bit divided data D[ 63 : 32 ] obtained by dividing 64-bit input data D[ 63 : 0 ] and receives the shift amount signal SAH 1 [ 6 : 0 ].
  • the shift circuit 20 a outputs, in the internal bus RH[ 191 : 33 ], the divided data D[ 63 : 32 ] to a range shifted from a reference bit position RH[ 191 ] by the shift amount represented by the shift amount signal SAH 1 [ 6 : 0 ].
  • the input data D[ 63 : 0 ] may also be referred to as the data D[ 63 : 0 ]
  • the divided data D[ 63 : 32 ] may also be referred to as the data D[ 63 : 32 ].
  • the data transmitted to the internal bus RH[ 191 : 33 ] may be also referred to as the data RH[ 191 : 33 ].
  • each of the shift circuits 20 a and 20 b may receive the corresponding divided data, which is one of a plurality of sets of divided data obtained by dividing input data D[ 63 : 0 ], and receive the corresponding shift amount signal, which is one of a plurality of shift amount signals, and each of the shift circuits 20 a and 20 b may output the corresponding divided data to a range shifted, based on the shift amount represented by the corresponding shift amount signal, from a reference bit position in the corresponding internal bus.
  • the shift circuit 20 a shifts, in accordance with the value of the shift amount signal SAH 1 , the bits of the data D[ 63 : 32 ] from the high-order side to the low-order side, and outputs the shifted data as 159-bit data RH[ 191 : 33 ]. That is, the shift circuit 20 a shifts the data D[ 63 : 32 ] to the right by the value of the shift amount signal SAH 1 (which is a value from 0 bits to 127 bits).
  • the shift circuit 20 a includes a function to set 127 bits to the logical value 0 except for 32 bits output as the data D[ 63 : 32 ], within the 159-bit data RH[ 191 : 33 ].
  • the shift circuit 20 b receives 32-bit divided data D[ 31 : 0 ] obtained by dividing the 64-bit input data D[ 63 : 0 ] and receives the shift amount signal SAL 1 [ 6 : 0 ].
  • the shift circuit 20 b outputs, in the internal bus RL[ 159 : 1 ], the divided data D[ 31 : 0 ] to a range shifted from a reference bit position RL[ 159 ] by the shift amount represented by the shift amount signal SAL 1 [ 6 : 0 ].
  • the divided data D[ 31 : 0 ] may also be referred to as the data D[ 31 : 0 ]
  • the data transmitted to the internal bus RL[ 159 : 1 ] may be also referred to as the data RL[ 159 : 1 ].
  • the shift circuit 20 b shifts, in accordance with the value of the shift amount signal SAL 1 , the bits of the data D[ 31 : 0 ] from the high-order side to the low-order side, and outputs the shifted data as 159-bit data RL[ 159 : 1 ]. That is, the shift circuit 20 b shifts the data D[ 31 : 0 ] to the right by the value of the shift amount signal SAL 1 (which is a value from 0 bits to 127 bits).
  • the shift circuit 20 b includes a function to set 127 bits to the logical value 0 except for 32 bits output as the data D[ 31 : 0 ] within the 159-bit data RL[ 159 : 1 ].
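  • The behavior of a single shift circuit described above can be sketched in software. This is an illustrative Python model and not the circuit of the embodiment: the function name `shift_circuit`, the parameter `ref_bit`, and the convention that bit k of a Python integer stands for bus bit number k are assumptions made for this example.

```python
DATA_WIDTH = 32   # width of one set of divided data
MAX_SHIFT = 127   # largest value of a 7-bit shift amount signal


def shift_circuit(data, amount, ref_bit):
    """Model one shift circuit (hypothetical helper, not from the source).

    Bit k of the returned integer stands for bus bit number k. The most
    significant data bit lands at bus bit (ref_bit - amount); every bus bit
    outside the 32 occupied positions is 0.
    """
    if not 0 <= data < (1 << DATA_WIDTH):
        raise ValueError("data must fit in 32 bits")
    if not 0 <= amount <= MAX_SHIFT:
        raise ValueError("shift amount must be in 0..127")
    lsb = ref_bit - amount - (DATA_WIDTH - 1)  # bus bit that receives data bit 0
    return data << lsb


# Shift circuit 20a uses RH[191] as its reference, 20b uses RL[159].
rh = shift_circuit(0xFFFFFFFF, 0, ref_bit=191)    # occupies RH[191:160]
rl = shift_circuit(0xFFFFFFFF, 127, ref_bit=159)  # occupies RL[32:1]
```

  • Because both circuits are modeled by the same function and differ only in the reference bit, the sketch mirrors the statement that the two shift circuits can share one design.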
  • bit numbers of the bits RH[ 159 : 33 ] of the internal bus RH[ 191 : 33 ] coupled to the shift circuit 20 a and the bit numbers of the bits RL[ 159 : 33 ] of the internal bus RL[ 159 : 1 ] coupled to the shift circuit 20 b overlap with each other.
  • a part of the bit numbers of the internal buses RH[ 191 : 33 ] and RL[ 159 : 1 ] overlap.
  • the reference bit position RL[ 159 ] in the shift circuit 20 b is allocated by shifting the bit width of the divided data D[ 63 : 32 ] with respect to the reference bit position RH[ 191 ] in the shift circuit 20 a.
  • the divided data D[ 63 : 32 ] and D[ 31 : 0 ] supplied to the shift circuits 20 a and 20 b, which are different from each other, can be output, as continuous data D[ 63 : 0 ], to the output bus R[ 191 : 1 ].
  • the data transmitted to the output bus R[ 191 : 1 ] may be also referred to as the data R[ 191 : 1 ].
  • the shift circuits 20 a and 20 b are circuits equal to each other and have common circuit data (macro data). Hence, for example, design data of the shift circuit 20 a can be used in the shift circuit 20 b. Therefore, it is possible to reduce a designing period of the shift circuits 20 a and 20 b relative to a case of independently designing the shift circuits 20 a and 20 b.
  • the buffer circuit 30 outputs, as data R[ 191 : 160 ], the high-order 32-bit data RH[ 191 : 160 ] within the data RH[ 191 : 33 ] output from the shift circuit 20 a. That is, the buffer circuit 30 outputs, to the output bus R[ 191 : 160 ], the data RH[ 191 : 160 ] output by the bits RH[ 191 : 160 ] whose bit numbers do not overlap with the internal bus RL[ 159 : 1 ] in the internal bus RH[ 191 : 33 ].
  • the buffer circuit 31 outputs, as data R[ 32 : 1 ], the low-order 32-bit data RL[ 32 : 1 ] within the data RL[ 159 : 1 ] output from the shift circuit 20 b.
  • the buffer circuit 31 outputs, to the output bus R[ 32 : 1 ], the data RL[ 32 : 1 ] output by the bits RL[ 32 : 1 ] whose bit numbers do not overlap with the internal bus RH[ 191 : 33 ] in the internal bus RL[ 159 : 1 ].
  • the bit selecting circuit 40 selects valid bits from the data RH[ 159 : 33 ], output from the shift circuit 20 a, and the data RL[ 159 : 33 ], output from the shift circuit 20 b, and outputs the selected bits to the output bus R[ 159 : 33 ].
  • the valid bits are 32 bits at a minimum and 64 bits at a maximum.
  • the data D[ 63 : 0 ], RH[ 191 : 33 ], RL[ 159 : 1 ], and R[ 191 : 1 ] may also be referred to as the data D, RH, RL, and R by omitting the bit numbers.
  • In the normal mode, the shift operation circuit 100 shifts the input data D[ 63 : 0 ] to the right by the value of the shift amount signals SAH and SAL (the same logical value), and outputs the shifted data as any 64 bits of the data R[ 191 : 1 ].
  • In the SIMD mode, the shift operation circuit 100 shifts the input data D[ 63 : 32 ] to the right by the value of the shift amount signal SAH and outputs the shifted data as any 32 bits of the data R[ 191 : 97 ].
  • In the SIMD mode, the shift operation circuit 100 shifts the input data D[ 31 : 0 ] to the right by the value of the shift amount signal SAL and outputs the shifted data as any 32 bits of the data R[ 95 : 1 ].
  • An example of the operation of the shift operation circuit 100 in the normal mode is illustrated in FIG. 6
  • an example of the operation of the shift operation circuit 100 in the SIMD mode is illustrated in FIG. 7 .
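  • The overall input-to-output behavior of the shift operation circuit 100 in both modes can be summarized by a compact behavioral sketch. It is a simplified Python model under stated assumptions, not the embodiment itself: the function name `shift_operation` is hypothetical, bit k of the returned integer stands for R[k], and in the normal mode the caller is assumed to pass equal values for `sah` and `sal`.

```python
MASK32 = 0xFFFFFFFF


def shift_operation(d, sah, sal, simd):
    """Return the value on the output bus R (bit k of the int is R[k])."""
    d_high = (d >> 32) & MASK32  # divided data D[63:32]
    d_low = d & MASK32           # divided data D[31:0]
    if simd:
        sah1 = sah & 0x3F            # MSB forced to 0: shift of 0..63
        sal1 = (sal & 0x3F) | 0x40   # MSB forced to 1: shift of 64..127
    else:
        sah1 = sah & 0x7F            # passed through unchanged
        sal1 = sal & 0x7F
    rh = d_high << (191 - sah1 - 31)  # internal bus RH, reference RH[191]
    rl = d_low << (159 - sal1 - 31)   # internal bus RL, reference RL[159]
    # The occupied ranges never overlap, so OR-ing the buses merges them.
    return rh | rl


d = 0x1234567890ABCDEF
normal = shift_operation(d, 0x19, 0x19, simd=0)  # one 64-bit shift by 25
simd = shift_operation(d, 3, 5, simd=1)          # two independent 32-bit shifts
```

  • In the normal mode the two halves land next to each other, so the result equals the 64-bit input placed as one contiguous field. In the SIMD mode the high half stays within R[191:97] and the low half within R[95:1].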
  • FIG. 2 illustrates an example of the operation processing apparatus 200 on which the shift operation circuit 100 that is illustrated in FIG. 1 is mounted.
  • the operation processing apparatus 200 includes an instruction cache 50 , an instruction buffer 52 , a decoding unit 54 , a reservation station unit 56 , and an operation executing unit 58 .
  • the operation processing apparatus 200 may be a processor such as a CPU, and FIG. 2 illustrates a part of a processor core mounted on the processor.
  • the instruction cache 50 is a secondary cache (second level cache) or a primary instruction cache (first level cache) that stores an instruction transmitted from a main memory or the like.
  • the instruction buffer 52 sequentially holds an instruction transmitted from the instruction cache and sequentially outputs, to the decoding unit 54 , the held instruction.
  • the decoding unit 54 decodes the instruction transmitted from the instruction buffer 52 , and inputs, in the reservation station unit 56 , an instruction code, a register number, and the like included in the decoded instruction.
  • the reservation station unit 56 includes a Reservation Station for Execution (RSE) including a plurality of entries that hold operation instructions. Further, the reservation station unit 56 includes a Reservation Station for Address (RSA) including a plurality of entries that hold memory access instructions such as a load instruction and a store instruction.
  • RSE Reservation Station for Execution
  • RSA Reservation Station for Address
  • the Reservation Station for Execution determines a dependence relationship between the operation instructions held in the entries, and selects, based on the determined dependence relationship, an executable operation instruction from the operation instructions held in the entries.
  • the Reservation Station for Execution inputs the selected operation instruction into the operation executing unit 58 .
  • the Reservation Station for Address determines a dependence relationship between the memory access instructions held in the entries, and selects, based on the determined dependence relationship, an executable load instruction or store instruction from the memory access instructions held in the entries.
  • the Reservation Station for Address inputs the selected load instruction or store instruction into the operation executing unit 58 .
  • the operation executing unit 58 includes a fixed-point operation unit 60 , a floating-point operation unit 62 , a logical operation unit 64 , an address operation unit 66 , and a register unit 68 .
  • the fixed-point operation unit 60 includes an adder ADD that executes addition or subtraction of fixed-point numbers and a multiplier MUL that executes multiplication or division of fixed-point numbers.
  • the floating-point operation unit 62 includes an adder FADD that executes addition or subtraction of floating-point numbers and a multiplier FMUL that executes multiplication or division of floating-point numbers. Further, the floating-point operation unit 62 includes a multiplier/adder FMA that executes multiplication and addition of floating-point numbers.
  • the shift operation circuit 100 that is illustrated in FIG. 1 is mounted on the adder FADD for floating-point numbers. Note that the shift operation circuit 100 may be mounted on the multiplier/adder FMA for floating-point numbers.
  • the adder FADD, the multiplier FMUL, and the multiplier/adder FMA include a function to execute a SIMD operation.
  • In the SIMD operation, a plurality of operations are executed in parallel based on a single instruction, so a plurality of sets of data are stored in a divided manner in each of a first operand and a second operand of a SIMD instruction.
  • the logical operation unit 64 includes a logical conjunction operator AND that executes an AND logical operation, and a logical disjunction operator OR that executes an OR logical operation, and a shift operator that executes a shift operation.
  • the address operation unit 66 calculates an access address based on a memory access instruction input from the reservation station RSA and outputs the calculated access address to a data cache or the like not illustrated.
  • the register unit 68 has a plurality of universal registers designated by an instruction and a plurality of registers (update buffers) that temporarily hold operation results and the like.
  • each register is 64 bits.
  • FIG. 3 illustrates an example of the floating-point adder FADD, which is illustrated in FIG. 2 .
  • the floating-point adder FADD includes a comparator CMP, a switch SW, a subtractor SUB 1 , a right shifter RSFT, an adder ADD 1 , a leading zero predictor RZP, a normalization shifter NRMSFT, and an adder ADD 2 .
  • the shift operation circuit 100 , which is illustrated in FIG. 1 , may be mounted as the right shifter RSFT on the adder FADD for floating-point numbers.
  • the floating-point adder FADD which is illustrated in FIG. 3 , adds a 64-bit operand OP 1 , which includes an exponent EXP 1 and a significand FRC 1 , and a 64-bit operand OP 2 , which includes an exponent EXP 2 and a significand FRC 2 , and outputs the exponent EXP and the significand FRC that indicate the addition result.
  • the operands OP 1 and OP 2 and the addition result are held in universal registers of the register unit 68 , which is illustrated in FIG. 2 .
  • a 64-bit floating-point number has a 1-bit sign part, an 11-bit exponent part, and a 52-bit significand part.
  • the sign bit part (sign bit) is omitted.
  • the normalized most significant bit is omitted as a hidden bit in a floating point number, but the output of the switch SW is supplemented with the hidden bit.
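  • The 1/11/52-bit layout and the hidden bit can be illustrated with a short Python sketch. It assumes that the 64-bit format is the IEEE 754 binary64 layout used by Python floats on common platforms, which the text does not state explicitly, and the function name `decompose_double` is hypothetical.

```python
import struct


def decompose_double(x):
    """Split a 64-bit float into (sign, exponent, significand).

    The significand is returned with the hidden bit restored for
    normalized numbers, as described for the output of the switch SW.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                  # 1-bit sign part
    exponent = (bits >> 52) & 0x7FF    # 11-bit exponent part (biased)
    fraction = bits & ((1 << 52) - 1)  # 52-bit significand part
    if exponent != 0:
        significand = fraction | (1 << 52)  # supplement the hidden bit
    else:
        significand = fraction              # zero or subnormal: no hidden bit
    return sign, exponent, significand


sign, exponent, significand = decompose_double(-2.5)
```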
  • the comparator CMP compares the magnitude of the exponent EXP 1 with the magnitude of the exponent EXP 2 .
  • Depending on the result of the comparison, the comparator CMP outputs, to the switch SW, a switch control signal SWC that either switches or does not switch the exponents EXP 1 and EXP 2 .
  • the subtractor SUB 1 obtains a difference between the exponents EXP 1 and EXP 2 output from the switch SW and outputs, to the right shifter RSFT and the adder ADD 2 , a difference signal DIF that represents the obtained difference.
  • the value of the difference signal DIF is supplied, to the right shifter RSFT, as the shift amount signals SAH[ 6 : 0 ] and SAL[ 6 : 0 ] that are illustrated in FIG. 1 .
  • the right shifter RSFT shifts the significand (one of FRC 1 or FRC 2 ) having a smaller value out of the operands OP 1 and OP 2 to the right by the value of the difference signal DIF and outputs it to the adder ADD 1 and the leading zero predictor RZP.
  • the significand supplied from the switch SW to the right shifter RSFT is included in data D[ 63 : 0 ] that is illustrated in FIG. 1 .
  • the right shifter RSFT (that is, the shift operation circuit 100 ) includes the shift circuits 20 a and 20 b that shift data D[ 63 : 32 ] and D[ 31 : 0 ] independently, as illustrated in FIG. 1 .
  • An example of the operation of the shift operation circuit 100 in the SIMD mode for executing the SIMD operation is illustrated in FIG. 7 .
  • the adder ADD 1 adds the digit-matched significands FRC 1 and FRC 2 , and outputs the addition result to the normalization shifter NRMSFT.
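  • The digit alignment that the comparator CMP, the switch SW, the subtractor SUB 1 and the right shifter RSFT perform before this addition can be sketched as follows. This is a simplified Python illustration with several assumptions: the function name `align_significands` is hypothetical, the operand with the smaller exponent is assumed to be the one that is shifted, and bits shifted out on the right are simply discarded here, whereas the circuit keeps them on a wider output bus.

```python
def align_significands(exp1, frc1, exp2, frc2):
    """Align two significands so that both refer to the larger exponent.

    Returns (exponent, significand_a, shifted_significand_b, dif).
    """
    # Comparator CMP and switch SW: put the larger exponent first (assumed).
    if exp1 < exp2:
        exp1, frc1, exp2, frc2 = exp2, frc2, exp1, frc1
    dif = exp1 - exp2      # subtractor SUB1: difference signal DIF
    shifted = frc2 >> dif  # right shifter RSFT: shift right by DIF
    return exp1, frc1, shifted, dif


exp, a, b, dif = align_significands(3, 0b1000, 5, 0b1100)
total = a + b  # adder ADD1 then adds the digit-matched significands
```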
  • the leading zero predictor RZP predicts the number of “0” bits that precede the first “1” on the high-order side of the addition result of the adder ADD 1 . Then, the leading zero predictor RZP outputs, to the normalization shifter NRMSFT and the adder ADD 2 , the predicted number as a shift amount.
  • the adder ADD 2 adds the value of the difference signal DIF from the subtractor SUB 1 and the value of the shift amount, and outputs the addition result as an exponent EXP.
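  • The quantity that the leading zero predictor supplies can be illustrated by counting leading zeros directly. Note the substitution: this Python sketch computes the exact count from a finished sum, whereas the predictor in the text estimates it alongside the addition. The function names, the `width` parameter, and the assumption that normalization is a left shift by that count are illustrative only.

```python
def leading_zeros(value, width):
    """Number of 0 bits above the first 1 in a width-bit value."""
    if not 0 <= value < (1 << width):
        raise ValueError("value must fit in the given width")
    return width - value.bit_length()  # bit_length() is 0 for value 0


def normalize(value, width):
    """Shift left so the first 1 reaches the top bit (assumed behavior)."""
    shift = leading_zeros(value, width)
    return (value << shift) & ((1 << width) - 1), shift


normalized, shift = normalize(0b00010110, 8)
```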
  • the floating-point adder FADD adds 32-bit floating-point numbers included in the respective operands OP 1 and OP 2 to each other, and also adds other 32-bit floating-point numbers included in the respective operands OP 1 and OP 2 to each other. That is, the operation processing apparatus 200 has an SIMD operation function to independently add two pairs of floating-point data included in the operands OP 1 and OP 2 .
  • each element of the floating-point adder FADD is switched to the function of adding two pairs of floating-point data, but the details of the circuit are omitted. Note that in a case where the shift operation circuit 100 is mounted in the multiplier/adder FMA illustrated in FIG. 2 , the shift operation circuit 100 is also mounted as a right shifter RSFT in the adder of the multiplier/adder FMA similarly to FIG. 3 .
  • FIG. 4 illustrates an example of the shift control circuits 10 and 11 , which are illustrated in FIG. 1 .
  • the shift control circuit 10 includes an and-circuit AND that receives the mode signal SIMD via an inverter IV, and a plurality of buffers BUF that output a shift amount signal SAH[ 5 : 0 ] as a shift amount signal SAH 1 [ 5 : 0 ].
  • In the normal mode (the mode signal SIMD is 0), the and-circuit AND outputs the most significant bit SAH[ 6 ] of the shift amount signal SAH as the shift amount signal SAH 1 [ 6 ].
  • In the SIMD mode (the mode signal SIMD is 1), the and-circuit AND sets the shift amount signal SAH 1 [ 6 ] to “0”.
  • Hence, in the SIMD mode, the shift control circuit 10 outputs, in accordance with the shift amount signal SAH[ 5 : 0 ], the shift amount signal SAH 1 [ 6 : 0 ] that represents a shift amount from 0 bits to 63 bits.
  • the shift control circuit 11 includes an or-circuit OR that receives the mode signal SIMD, and a plurality of buffers BUF that output a shift amount signal SAL[ 5 : 0 ] as a shift amount signal SAL 1 [ 5 : 0 ].
  • In the normal mode (the mode signal SIMD is 0), the or-circuit OR outputs the most significant bit SAL[ 6 ] of the shift amount signal SAL as the shift amount signal SAL 1 [ 6 ].
  • In the SIMD mode (the mode signal SIMD is 1), the or-circuit OR sets the shift amount signal SAL 1 [ 6 ] to “1”.
  • Hence, in the SIMD mode, the shift control circuit 11 outputs, in accordance with the shift amount signal SAL[ 5 : 0 ], the shift amount signal SAL 1 [ 6 : 0 ] that represents a shift amount from 64 bits to 127 bits.
  • the most significant bits SAH 1 [ 6 ] and SAL 1 [ 6 ] of the respective shift amount signals SAH 1 [ 6 : 0 ] and SAL 1 [ 6 : 0 ] output to the shift circuits 20 a and 20 b are set, by the and-circuit AND and the or-circuit OR, to logical values different from each other.
  • Hence, even when the shift amount signal SAH[ 6 : 0 ] and the shift amount signal SAL[ 6 : 0 ] are set independently from each other in the SIMD mode, the data D[ 63 : 32 ] and the data D[ 31 : 0 ] can be prevented from colliding with each other.
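  • The gate-level behavior of the two shift control circuits can be written as two small functions. This is a minimal Python sketch of the logic described for FIG. 4; the function names are hypothetical, and `simd` is assumed to be the integer 0 or 1.

```python
def shift_control_10(sah, simd):
    """SAH[6:0] -> SAH1[6:0]: the MSB is ANDed with the inverted SIMD signal."""
    msb = ((sah >> 6) & 1) & (simd ^ 1)  # inverter IV followed by and-circuit AND
    return (msb << 6) | (sah & 0x3F)     # buffers BUF pass bits [5:0] unchanged


def shift_control_11(sal, simd):
    """SAL[6:0] -> SAL1[6:0]: the MSB is ORed with the SIMD signal."""
    msb = ((sal >> 6) & 1) | simd        # or-circuit OR
    return (msb << 6) | (sal & 0x3F)     # buffers BUF pass bits [5:0] unchanged


# In the SIMD mode the two outputs always fall in disjoint ranges.
sah1 = shift_control_10(0x7F, simd=1)  # forced into 0..63
sal1 = shift_control_11(0x00, simd=1)  # forced into 64..127
```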
  • FIG. 5 illustrates an example of the buffer circuits 30 and 31 and the bit selecting circuit 40 , which are illustrated in FIG. 1 .
  • the buffer circuit 30 includes a plurality of buffers BUF that output data RH[ 191 : 160 ] as data R[ 191 : 160 ].
  • the buffer circuit 31 includes a plurality of buffers BUF that output data RL[ 32 : 1 ] as data R[ 32 : 1 ].
  • the bit selecting circuit 40 includes a plurality of or-circuits OR that output, as data R, an or-logic of each bit of 127-bit data RH[ 159 : 33 ] and RL[ 159 : 33 ] of which the bit numbers overlap with each other. That is, for each bit of the data R[ 159 : 33 ], the logical value 1 is set in a case where either the respective bit of the data RH[ 159 : 33 ] or the respective bit of data RL[ 159 : 33 ] is the logical value 1.
  • the shift circuit 20 a includes a function, at the internal bus RH, to set 127 bits to the logical value 0 except for 32 valid bits output as the data D[ 63 : 32 ].
  • the shift circuit 20 b includes a function, at the internal bus RL, to set 127 bits to the logical value 0 except for 32 valid bits output as the data D[ 31 : 0 ]. Further, as illustrated in FIG. 6 and FIG. 7 , valid data D is not simultaneously output by data RH and RL having the same bit numbers among the data RH[ 159 : 33 ] and RL[ 159 : 33 ]. Hence, the logical value 0 is necessarily supplied to one of the two inputs of each or-circuit OR of the bit selecting circuit 40 .
  • the bit selecting circuit 40 can select valid data and output the selected data to the output bus R[ 159 : 33 ] without using a control signal.
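  • How the buffer circuits and the bit selecting circuit combine the two internal buses into the output bus can be sketched as follows. The Python model is illustrative: the names `field` and `merge_outputs` are hypothetical, and bit k of each integer is assumed to stand for bus bit number k.

```python
def field(high, low):
    """Mask with ones at bit numbers low..high (inclusive)."""
    return ((1 << (high - low + 1)) - 1) << low


HIGH_ONLY = field(191, 160)  # bits driven only by RH
OVERLAP = field(159, 33)     # bit numbers shared by RH and RL
LOW_ONLY = field(32, 1)      # bits driven only by RL


def merge_outputs(rh, rl):
    """Combine internal buses RH and RL into the output bus R."""
    r_high = rh & HIGH_ONLY     # buffer circuit 30
    r_mid = (rh | rl) & OVERLAP  # bit selecting circuit 40: per-bit OR
    r_low = rl & LOW_ONLY       # buffer circuit 31
    return r_high | r_mid | r_low


# Both halves shifted by 25: RH occupies [166:135] and RL occupies [134:103].
r = merge_outputs(0xFFFFFFFF << 135, 0xFFFFFFFF << 103)
```

  • No select signal appears in the model, because the unused bits of each bus are already 0 and the per-bit OR therefore passes whichever input carries valid data.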
  • the shift circuit 20 a shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the position of each bit of the data D[ 63 : 32 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RH[ 191 : 33 ].
  • the shift circuit 20 b shifts, in accordance with the shift amount signal SAL 1 [ 6 : 0 ], the position of each bit of the data D[ 31 : 0 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RL[ 159 : 1 ].
  • the bits used as the references by the shift circuits 20 a and 20 b differ by 32 bits. Therefore, the bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RL that is output by the shift circuit 20 b differ by 32 bits. Further, in the normal mode, the value of the shift amount signal SAH 1 [ 6 : 0 ] and the value of the shift amount signal SAL 1 [ 6 : 0 ] are equal to each other. Hence, in a shift operation of the data D[ 63 : 0 ], the shift operation circuit 100 can output the data D[ 63 : 32 ] and D[ 31 : 0 ] as the data R without causing the bit numbers of the data RH and the data RL to overlap.
  • the shift operation circuit 100 can output the data D[ 63 : 0 ] as the unified 64-bit data R without blank bit numbers in the data RH and the data RL.
  • the output bus R[ 191 : 1 ] illustrated within brackets at the lower part of FIG. 6 indicates an example of bit positions at which data D[ 63 : 0 ] appears in accordance with the shift amount signals SAH 1 and SAL 1 .
  • the sign “h” at the end of the numerical value of the shift amount signals SAH 1 and SAL 1 indicates that the numerical value is a hex number. Note that the value of the shift amount signal SAH 1 is the same as the value of the shift amount signal SAH supplied to the shift operation circuit 100 , and the value of the shift amount signal SAL 1 is the same as the value of the shift amount signal SAL supplied to the shift operation circuit 100 .
  • the shift circuit 20 a sets each bit of the data RH[ 159 : 33 ], where the data D[ 63 : 32 ] does not appear, to “0”.
  • the shift circuit 20 b sets each bit of the data RL[ 127 : 1 ], where the data D[ 31 : 0 ] does not appear, to “0”.
  • the bit selecting circuit 40 sets each bit of the data R[ 127 : 33 ] to “0”.
  • the buffer circuit 31 sets each bit of the data R[ 32 : 1 ] to “0”.
  • the shift circuit 20 a sets each bit of the data RH[ 191 : 167 ] and RH[ 134 : 33 ], where the data D[ 63 : 32 ] does not appear, to “0”.
  • the shift circuit 20 b sets each bit of the data RL[ 159 : 135 ] and RL[ 102 : 1 ], where the data D[ 31 : 0 ] does not appear, to “0”.
  • the buffer circuit 30 sets each bit of the data R[ 191 : 167 ] to “0”.
  • the bit selecting circuit 40 sets each bit of the data R[ 102 : 33 ] to “0”.
  • the buffer circuit 31 sets each bit of the data R[ 32 : 1 ] to “0”.
  • the shift circuit 20 a sets each bit of the data RH[ 191 : 82 ] and RH[ 49 : 33 ], where the data D[ 63 : 32 ] does not appear, to “0”.
  • the shift circuit 20 b sets each bit of the data RL[ 159 : 50 ] and RL[ 17 : 1 ], where the data D[ 31 : 0 ] does not appear, to “0”.
  • the buffer circuit 30 sets each bit of the data R[ 191 : 160 ] to “0”.
  • the bit selecting circuit 40 sets each bit of the data R[ 159 : 82 ] to “0”.
  • the buffer circuit 31 sets each bit of the data R[ 17 : 1 ] to “0”.
  • the shift circuit 20 a sets each bit of the data RH[ 191 : 65 ], where the data D[ 63 : 32 ] does not appear, to “0”.
  • the shift circuit 20 b sets each bit of the data RL[ 159 : 33 ], where the data D[ 31 : 0 ] does not appear, to “0”.
  • the buffer circuit 30 sets each bit of the data R[ 191 : 160 ] to “0”.
  • the bit selecting circuit 40 sets each bit of the data R[ 159 : 65 ] to “0”.
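  • The normal-mode behavior described above can be sketched as a small behavioral model. This is only an illustration, not the circuit itself: the function names `lane`, `shift_normal`, and `field` are hypothetical, and a Python integer stands in for the output bus, with integer bit i modeling R[ i ] (bit 0 is unused). Each shift circuit is modeled as placing its 32-bit half of D at an offset fixed by its reference bit (RH[ 191 ] or RL[ 159 ]), and the bit selecting circuit 40 is modeled as a plain OR.

```python
def lane(chunk: int, width: int, top: int, shift: int) -> int:
    """Place a `width`-bit chunk so that its MSB lands on bus bit `top - shift`.

    Bit i of the returned integer models bus bit i; every other bit is 0,
    which mirrors the shift circuits setting non-data bits to "0".
    """
    assert 0 <= chunk < (1 << width)
    assert 0 <= shift <= 127
    return chunk << (top - width + 1 - shift)


def shift_normal(d: int, sa: int) -> int:
    """Normal mode: both lanes receive the same shift amount `sa`."""
    rh = lane((d >> 32) & 0xFFFFFFFF, 32, 191, sa)  # shift circuit 20a
    rl = lane(d & 0xFFFFFFFF, 32, 159, sa)          # shift circuit 20b
    # Buffer circuits 30 and 31 pass bits through; circuit 40 ORs the overlap.
    return rh | rl


def field(r: int, hi: int, lo: int) -> int:
    """Extract R[hi:lo] from the modeled bus."""
    return (r >> lo) & ((1 << (hi - lo + 1)) - 1)


d = 0x0123456789ABCDEF
r = shift_normal(d, 25)
print(hex(field(r, 166, 103)))  # D[63:0] after a 25-bit right shift
```

  • With a shift of 25 bits the model leaves R[ 191 : 167 ] and R[ 102 : 1 ] at zero, which matches the zero ranges listed above for that case. A shift of 127 bits places D[ 63 : 0 ] at R[ 64 : 1 ].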
  • the shift circuit 20 a shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the position of each bit of the data D[ 63 : 32 ] in a range of from 0 bits to 63 bits, and outputs the shifted data as the data RH[ 191 : 97 ].
  • the shift circuit 20 b shifts, in accordance with the shift amount signal SAL 1 [ 6 : 0 ], the position of each bit of the data D[ 31 : 0 ] in a range of from 0 bits to 63 bits, and outputs the shifted data as the data RL[ 95 : 1 ]. That is, in the SIMD mode, the bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RL that is output by the shift circuit 20 b do not overlap.
  • the data D[ 63 : 32 ] is output to a range of data R[ 191 : 97 ] and the data D[ 31 : 0 ] is output to a range of data R[ 95 : 1 ].
  • even when the shift amount signal SAH[ 6 : 0 ] and the shift amount signal SAL[ 6 : 0 ], which are illustrated in FIG. 1 , are set independently from each other, it is possible to prevent the data D[ 63 : 32 ] and the data D[ 31 : 0 ] from colliding.
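  • The SIMD-mode separation can be checked with simple range arithmetic. The sketch below assumes that SAH 1 [ 6 ] is forced to 0 and SAL 1 [ 6 ] is forced to 1 during the SIMD mode, which is consistent with the 0-to-63-bit shift ranges and the output ranges RH[ 191 : 97 ] and RL[ 95 : 1 ] stated above. The function name `simd_windows` is hypothetical.

```python
def simd_windows(sah: int, sal: int):
    """Return the R bit ranges (hi, lo) occupied by each half of D in SIMD mode."""
    sah1 = sah & 0x3F           # assumed: SAH1[6] = 0, shift of 0..63 bits
    sal1 = 0x40 | (sal & 0x3F)  # assumed: SAL1[6] = 1, shift of 64..127 bits
    high_half = (191 - sah1, 160 - sah1)  # where D[63:32] appears
    low_half = (159 - sal1, 128 - sal1)   # where D[31:0] appears
    return high_half, low_half


# Exhaustive check: the two halves never share an R bit number.
no_overlap = all(
    simd_windows(a, b)[0][1] > simd_windows(a, b)[1][0]
    for a in range(64)
    for b in range(64)
)
print(no_overlap)
```

  • Under these assumptions the lowest bit used by D[ 63 : 32 ] is R[ 97 ] and the highest bit used by D[ 31 : 0 ] is R[ 95 ], so independent shift amounts cannot make the two halves collide.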
  • FIG. 8 illustrates an example of a case in which parity predictors that predict parity bits are built in the shift operation circuit 100 , which is illustrated in FIG. 1 .
  • the same reference numerals are given to elements the same as the elements in FIG. 1 and their detailed descriptions are omitted as appropriate.
  • a shift operation circuit 100 P that includes parity predictors PPa and PPb includes shift circuits 20 Pa and 20 Pb instead of the shift circuits 20 a and 20 b, which are illustrated in FIG. 1 .
  • the shift operation circuit 100 P includes buffer circuits 30 P and 31 P instead of the buffer circuits 30 and 31 , which are illustrated in FIG. 1 .
  • the shift operation circuit 100 P includes a bit selecting circuit 40 P instead of the bit selecting circuit 40 , which is illustrated in FIG. 1 .
  • the shift operation circuit 100 P receives data D[ 63 : 0 ] and parity bits DP[ 15 : 0 ], and detects an error in the data D[ 63 : 0 ].
  • Each bit of the parity bits DP[ 15 : 0 ] is appended per 4 bits of the data D[ 63 : 0 ].
  • the parity predictors PPa and PPb are respectively mounted within the shift circuits 20 Pa and 20 Pb.
  • Each of the parity predictors PPa and PPb includes an exclusive OR circuit that calculates a parity bit DP for each 4-bit data D.
  • the parity bits DP are calculated, in the shift circuits 20 Pa and 20 Pb, for a plurality of respective stages for sequentially shifting data.
  • the respective parity predictors PPa and PPb account for several tens of percent of the sizes of the respective shift circuits 20 Pa and 20 Pb.
  • the shift circuit 20 Pa shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the bit positions of the data D[ 63 : 32 ] to output data RH[ 191 : 33 ] and parity bits RPH[ 47 : 8 ]. Each bit of the parity bits RPH[ 47 : 8 ] is added per 4 bits of the data RH[ 191 : 33 ].
  • the shift circuit 20 Pb shifts, in accordance with the shift amount signal SAL 1 [ 6 : 0 ], the bit positions of the data D[ 31 : 0 ] to output data RL[ 159 : 1 ] and parity bits RPL[ 39 : 0 ]. Each bit of the parity bits RPL[ 39 : 0 ] is added per 4 bits of the data RL[ 159 : 1 ].
  • By generating parity bits every time the shift circuit 20 Pa shifts the data D[ 63 : 32 ], the parity predictor PPa of the shift circuit 20 Pa outputs the parity bits RPH[ 47 : 8 ] together with the output of the data RH[ 191 : 33 ]. Similarly, by generating parity bits every time the shift circuit 20 Pb shifts the data D[ 31 : 0 ], the parity predictor PPb of the shift circuit 20 Pb outputs the parity bits RPL[ 39 : 0 ] together with the output of the data RL[ 159 : 1 ].
  • the parity predictor PPa can predict the parity bits RPH[ 47 : 8 ] without using the data RH[ 191 : 33 ], and the parity predictor PPb can predict the parity bits RPL[ 39 : 0 ] without using the data RL[ 159 : 1 ].
  • the buffer circuit 30 P outputs data RH[ 191 : 160 ] as data R[ 191 : 160 ], and outputs parity bits RPH[ 47 : 40 ] corresponding to the data RH[ 191 : 160 ] as parity bits RP[ 47 : 40 ].
  • the buffer circuit 31 P outputs data RL[ 32 : 1 ] as data R[ 32 : 1 ], and outputs parity bits RPL[ 7 : 0 ] corresponding to the data RL[ 32 : 1 ] as parity bits RP[ 7 : 0 ].
  • the bit selecting circuit 40 P selects valid bits from the data RH[ 159 : 33 ] and the data RL[ 159 : 33 ], and outputs the selected bits as data R[ 159 : 33 ]. Further, the bit selecting circuit 40 P selects valid bits from parity bits RPH[ 39 : 8 ] and parity bits RPL[ 39 : 8 ], and outputs the selected bits as parity bits RP[ 39 : 8 ]. Note that each bit of the parity bits RP[ 47 : 0 ] is added per 4 bits of the data R[ 191 : 1 ].
  • the parity bits RP[ 47 : 0 ], which are output together with the data R[ 191 : 1 ], are used, in a circuit to which the data R[ 191 : 1 ] is supplied, to detect an error in the data R[ 191 : 1 ].
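  • The per-4-bit parity scheme can be illustrated with a short sketch. It assumes even parity (the XOR of the four data bits) and assumes that parity bit k covers data bits [ 4k+3 : 4k ]; the text states only that an exclusive OR circuit calculates one parity bit for each 4-bit data, so the polarity and the bit mapping here are illustrative. The function names are hypothetical.

```python
def nibble_parity(data: int, nbits: int) -> int:
    """Return one parity bit per 4-bit group of `data` (assumed even parity)."""
    parity = 0
    for k in range((nbits + 3) // 4):
        nib = (data >> (4 * k)) & 0xF
        # XOR of the four bits in the group.
        p = (nib ^ (nib >> 1) ^ (nib >> 2) ^ (nib >> 3)) & 1
        parity |= p << k
    return parity


def has_error(data: int, parity: int, nbits: int) -> bool:
    """Detect a mismatch between received data and its accompanying parity bits."""
    return nibble_parity(data, nbits) != parity


d = 0x0123456789ABCDEF
dp = nibble_parity(d, 64)          # 16 parity bits, like DP[15:0]
print(has_error(d, dp, 64))        # intact data
print(has_error(d ^ 0x10, dp, 64)) # one flipped data bit
```

  • A single flipped bit changes the parity of exactly one 4-bit group, so it is detected. Two flipped bits inside the same group cancel out and are not detected by this scheme.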
  • FIG. 9 illustrates an example of allocation of data and parity bits in the shift operation circuit 100 P, which is illustrated in FIG. 8 .
  • Each bit of the parity bits DP[ 15 : 0 ] is added per 4 bits of the data D[ 63 : 0 ].
  • Each bit of the parity bits RPH[ 47 : 8 ] is added per 4 bits of the data RH[ 191 : 33 ].
  • Each bit of the parity bits RPL[ 39 : 0 ] is added per 4 bits of the data RL[ 159 : 1 ].
  • Each bit of the parity bits RP[ 47 : 0 ] is added per 4 bits of the data R[ 191 : 1 ].
  • a wiring length of a signal for transmitting parity bits can be shortened in the shift circuits 20 Pa and 20 Pb relative to a case in which parity bits are not inserted between data bits.
  • FIG. 10 illustrates a shift operation circuit 102 as another example. Detailed descriptions of elements and functions of the shift operation circuit 102 similar to those of the shift operation circuit 100 , which is illustrated in FIG. 1 , are omitted as appropriate.
  • the shift operation circuit 102 includes shift circuits 28 and 29 , a buffer circuit 39 , and a selector circuit 49 . Note that in a case where parity predictors are built in the shift operation circuit 102 , parity bits DP, RPH, RPL, and RP that are indicated in the brackets are appended. In the following, a case will be described in which the shift operation circuit 102 does not include parity predictors and parity bits DP, RPH, RPL, and RP are not appended.
  • the shift circuit 28 shifts, in accordance with the value of a shift amount signal SAH[ 6 : 0 ], the bits of 64-bit data D[ 63 : 0 ] from the high-order side to the low-order side, and outputs the shifted data as 191-bit data RH[ 191 : 1 ]. That is, the shift circuit 28 shifts the data D[ 63 : 0 ] to the right by the value of the shift amount signal SAH (which is a value from 0 bits to 127 bits).
  • the shift circuit 28 shifts, in accordance with the value of a shift amount signal SAH[ 5 : 0 ], the bits of 32-bit data D[ 63 : 32 ] from the high-order side to the low-order side, and outputs the shifted data as 95-bit data RH[ 191 : 97 ]. That is, the shift circuit 28 shifts the data D[ 63 : 32 ] to the right by the value of the shift amount signal SAH (which is a value from 0 bits to 63 bits).
  • the buffer circuit 39 outputs, as data R[ 191 : 96 ], the high-order 96-bit data RH[ 191 : 96 ] within the data RH[ 191 : 1 ] output from the shift circuit 28 .
  • the data D[ 31 : 0 ] is supplied to the shift circuits 28 and 29 in an overlapped manner. Because the shift circuit 28 does not shift the data D[ 31 : 0 ] during the SIMD mode, the shift circuit 28 has a wasted circuit that does not operate during the SIMD mode. Further, because the respective shift circuits 28 and 29 are designed independently, the designing period is longer than that of the shift circuits 20 a and 20 b, which are illustrated in FIG. 1 .
  • the shift circuit 28 operates upon receiving the 64-bit data D[ 63 : 0 ], and the shift circuit 29 operates upon receiving the 32-bit data D[ 31 : 0 ]. Therefore, the total number of bits of the input data D is 96 bits. This number is larger than the total number of bits of the data D input to the shift circuits 20 a and 20 b (64 bits), which are illustrated in FIG. 1 , by 32 bits.
  • the total number of bits input to the shift circuits 28 and 29 is 120 bits, and is larger by 40 bits than the total number of bits input to the shift circuits 20 Pa and 20 Pb of the shift operation circuit 100 P (80 bits), which is illustrated in FIG. 8 .
  • a circuit scale of a shift circuit exponentially increases depending on the number of bits of input data.
  • the circuit scale of the shift operation circuit 102 which is illustrated in FIG. 10
  • the circuit scale of the shift operation circuit 102 is further larger than the circuit scale of the shift operation circuit 100 P, which is illustrated in FIG. 8 .
  • data D[ 63 : 32 ] and data D[ 31 : 0 ] are respectively supplied to the shift circuits 20 a and 20 b without being overlapped.
  • the data D[ 63 : 32 ] and D[ 31 : 0 ] whose bits do not overlap can be respectively supplied to the shift circuits 20 a and 20 b and the shift operation in the normal mode and the shift operation in the SIMD mode can be executed.
  • the total number of bits of data D supplied to the shift circuits 20 a and 20 b can be reduced relative to a case in which bits are supplied to a plurality of other shift circuits in an overlapped manner.
  • the total number of bits of data D supplied to the shift circuits 20 a and 20 b (64 bits) can be two-thirds of the total number of bits of data D supplied to the shift circuits 28 and 29 of the shift operation circuit 102 (96 bits), which is illustrated in FIG. 10 .
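  • The input bit counts quoted above follow from the lane widths and, when parity is used, one parity bit per 4 data bits. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def total_input_bits(lane_widths, with_parity=False):
    """Sum the data bits fed to all shift circuits, plus 1 parity bit per 4 data bits."""
    data_bits = sum(lane_widths)
    parity_bits = data_bits // 4 if with_parity else 0
    return data_bits + parity_bits


split = total_input_bits([32, 32])         # shift circuits 20a and 20b
overlapped = total_input_bits([64, 32])    # shift circuits 28 and 29
split_p = total_input_bits([32, 32], with_parity=True)
overlapped_p = total_input_bits([64, 32], with_parity=True)
print(split, overlapped, split_p, overlapped_p)
```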
  • it is possible to reduce the circuit scale of the shift circuits 20 a and 20 b relative to the circuit scale of the shift circuits 28 and 29 , and it is possible to reduce the circuit size of the shift operation circuit 100 .
  • the designing period can be reduced relative to a case of independently designing both the shift circuits 28 and 29 , which are illustrated in FIG. 10 .
  • the reference bit positions RL[ 159 ] and RH[ 191 ] are allocated by shifting the bit width of the divided data D[ 63 : 32 ], and thereby the divided data D[ 63 : 32 ] and D[ 31 : 0 ] can be output, to the output bus R[ 191 : 1 ], as continuous data D[ 63 : 0 ]. In other words, in the normal mode, it is possible to prevent the data D[ 63 : 32 ] and D[ 31 : 0 ] from collision.
  • the bit selecting circuit 40 can select valid data D and output the selected data to the output bus R[ 159 : 33 ] without using a control signal.
  • the high-order bit SAH 1 [ 6 ] of the shift amount signal SAH 1 [ 6 : 0 ] and the high-order bit SAL 1 [ 6 ] of the shift amount signal SAL 1 [ 6 : 0 ] are set to logical values opposite to each other.
  • even when the shift amount signal SAH[ 6 : 0 ] and the shift amount signal SAL[ 6 : 0 ] are set independently from each other, it is possible to prevent the data D[ 63 : 32 ] and the data D[ 31 : 0 ] from colliding.
  • FIG. 11 illustrates a shift operation circuit 104 according to another embodiment.
  • the shift operation circuit 104 includes shift control circuits 10 , 13 , and 14 , shift circuits 20 a, 22 a, and 22 b, buffer circuits 30 and 32 , and a bit selecting circuit 42 .
  • the shift operation circuit 104 can be mounted on the adder FADD or the multiplier/adder FMA for floating-point numbers of the operation processing apparatus 200 , which is illustrated in FIG. 2 .
  • parity bits DP, RPH, RPLH, RPL, and RP that are indicated in the brackets are appended.
  • the shift operation circuit 104 does not include parity predictors and parity bits DP, RPH, RPLH, RPL, and RP are not appended.
  • a circuit configuration and functions of the shift control circuit 10 of FIG. 11 are the same as the circuit configuration and the functions of the shift control circuit 10 that is illustrated in FIG. 1 .
  • the shift control circuit 13 changes logical values of a shift amount signal SALH[ 6 : 0 ] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SALH 1 [ 6 : 0 ].
  • the shift control circuit 13 operates in a manner similar to that of the shift control circuit 11 , which is illustrated in FIG. 1 , except the 2 high-order bits SALH 1 [ 6 : 5 ] of a shift amount signal SALH 1 [ 6 : 0 ] are set to “10” during the SIMD mode.
  • the shift control circuit 14 operates in a manner similar to that of the shift control circuit 11 , which is illustrated in FIG. 1 , except the 2 high-order bits SAL 1 [ 6 : 5 ] of a shift amount signal SAL 1 [ 6 : 0 ] are set to “11” during the SIMD mode.
  • the shift amount signals SAH[ 6 : 0 ], SALH[ 6 : 0 ], and SAL[ 6 : 0 ] are set to values equal to each other.
  • in the SIMD mode (SIMD = “1”), the shift amount signals SAH[ 6 : 0 ], SALH[ 6 : 0 ], and SAL[ 6 : 0 ] are set independently from each other.
  • a circuit configuration and functions of the shift circuit 20 a of FIG. 11 are the same as the circuit configuration and the functions of the shift circuit 20 a that is illustrated in FIG. 1 .
  • the shift circuit 22 a shifts, in accordance with the value of a shift amount signal SALH 1 , the bits of 16-bit data D[ 31 : 16 ] within the 64-bit data D[ 63 : 0 ] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RLH[ 159 : 17 ]. That is, the shift circuit 22 a shifts the data D[ 31 : 16 ] to the right by the value of the shift amount signal SALH 1 (which is a value from 0 bits to 127 bits).
  • the data transmitted to the internal bus RLH[ 159 : 17 ] may be also referred to as the data RLH[ 159 : 17 ].
  • the shift circuit 22 a includes a function to set 127 bits to “0” except for 16 bits output as the data D[ 31 : 16 ] within the 143-bit data RLH[ 159 : 17 ].
  • the shift circuit 22 b shifts, in accordance with the value of a shift amount signal SAL 1 , the bits of 16-bit data D[ 15 : 0 ] within the 64-bit data D[ 63 : 0 ] from the high-order side to the low-order side, and outputs the shifted data as 143-bit data RL[ 143 : 1 ]. That is, the shift circuit 22 b shifts the data D[ 15 : 0 ] to the right by the value of the shift amount signal SAL 1 (which is a value from 0 bits to 127 bits).
  • the shift circuit 22 b includes a function to set 127 bits to “0” except for 16 bits output as the data D[ 15 : 0 ] within the 143-bit data RL[ 143 : 1 ]. Note that because the shift circuits 22 a and 22 b are circuits equal to each other and have common circuit data (macro data), it is possible to reduce a designing period of the shift circuits 22 a and 22 b relative to a case of independently designing the shift circuits 22 a and 22 b.
  • a circuit configuration and functions of the buffer circuit 30 of FIG. 11 are the same as the circuit configuration and the functions of the buffer circuit 30 that is illustrated in FIG. 1 .
  • the buffer circuit 32 outputs, as data R[ 16 : 1 ], the low-order 16-bit data RL[ 16 : 1 ] within the data RL[ 143 : 1 ] output from the shift circuit 22 b.
  • the bit selecting circuit 42 receives the data RH[ 159 : 33 ], output from the shift circuit 20 a, the data RLH[ 159 : 17 ], output from the shift circuit 22 a, and the data RL[ 143 : 17 ], output from the shift circuit 22 b.
  • the bit selecting circuit 42 selects valid bits from the data RH[ 159 : 33 ], the data RLH[ 159 : 17 ], and the data RL[ 143 : 17 ], and outputs the selected bits as data R[ 159 : 17 ].
  • the valid bits are 32 bits at a minimum and 64 bits at a maximum.
  • FIG. 12 illustrates an example of the shift control circuits 10 , 13 , and 14 , which are illustrated in FIG. 11 .
  • a circuit configuration and functions of the shift control circuit 10 of FIG. 12 are the same as the circuit configuration and the functions of the shift control circuit 10 that is illustrated in FIG. 4 . That is, during the SIMD mode, the shift control circuit 10 outputs, in accordance with the shift amount signal SAH[ 5 : 0 ], the shift amount signal SAH 1 [ 6 : 0 ] that represents a shift amount of from 0 bits to 63 bits.
  • the shift control circuit 13 includes an or-circuit OR that receives a mode signal SIMD, and an and-circuit AND that receives the mode signal SIMD via an inverter IV. Further, the shift control circuit 13 includes a plurality of buffers BUF that output a shift amount signal SALH[ 4 : 0 ] as a shift amount signal SALH 1 [ 4 : 0 ]. Outputs of the or-circuit OR and the and-circuit AND (SALH 1 [ 6 : 5 ]) are set to “10” during the SIMD mode.
  • the shift control circuit 13 outputs, in accordance with the shift amount signal SALH[ 4 : 0 ], the shift amount signal SALH 1 [ 6 : 0 ] that represents a shift amount of from 64 bits to 95 bits.
  • the shift control circuit 14 includes or-circuits OR 1 and OR 2 that receive the mode signal SIMD, and the shift control circuit 14 includes a plurality of buffers BUF that output a shift amount signal SAL[ 4 : 0 ] as a shift amount signal SAL 1 [ 4 : 0 ].
  • Outputs of the or-circuits OR 1 and OR 2 (SAL 1 [ 6 : 5 ]) are set to “11” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 14 outputs, in accordance with the shift amount signal SAL[ 4 : 0 ], the shift amount signal SAL 1 [ 6 : 0 ] that represents a shift amount of from 96 bits to 127 bits.
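  • The gating performed by the shift control circuits 10 , 13 , and 14 can be sketched as follows. The text states which gates receive the mode signal SIMD; it is an assumption of this sketch that the other input of each gate is the corresponding bit of the incoming shift amount signal, and that the shift control circuit 10 masks bit 6 with the inverted mode signal. The function names are hypothetical, and `simd` is 0 or 1.

```python
def control_10(sah: int, simd: int) -> int:
    """SAH1: bit 6 is ANDed with NOT SIMD (assumed); bits 5..0 pass through."""
    bit6 = ((sah >> 6) & 1) & (1 - simd)
    return (bit6 << 6) | (sah & 0x3F)


def control_13(salh: int, simd: int) -> int:
    """SALH1: bit 6 is ORed with SIMD, bit 5 is ANDed with NOT SIMD."""
    bit6 = ((salh >> 6) & 1) | simd
    bit5 = ((salh >> 5) & 1) & (1 - simd)
    return (bit6 << 6) | (bit5 << 5) | (salh & 0x1F)


def control_14(sal: int, simd: int) -> int:
    """SAL1: bits 6 and 5 are both ORed with SIMD."""
    bit6 = ((sal >> 6) & 1) | simd
    bit5 = ((sal >> 5) & 1) | simd
    return (bit6 << 6) | (bit5 << 5) | (sal & 0x1F)


print(hex(control_10(0x3F, 1)), hex(control_13(0x1F, 1)), hex(control_14(0x1F, 1)))
```

  • In the normal mode (`simd = 0`) all three functions return their input unchanged. In the SIMD mode they confine the shift amount to 0 to 63 bits, 64 to 95 bits, and 96 to 127 bits, respectively.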
  • FIG. 13 illustrates an example of the buffer circuits 30 and 32 and the bit selecting circuit 42 , which are illustrated in FIG. 11 .
  • a circuit configuration and functions of the buffer circuit 30 of FIG. 13 are the same as the circuit configuration and the functions of the buffer circuit 30 that is illustrated in FIG. 5 .
  • the buffer circuit 32 includes a plurality of buffers BUF that output data RL[ 16 : 1 ] as data R[ 16 : 1 ].
  • the bit selecting circuit 42 includes a plurality of or-circuits OR each of which has two input units and performs OR logic on each bit of data RH and RLH corresponding to data R[ 159 : 144 ]. Further, the bit selecting circuit 42 includes a plurality of or-circuits OR each of which has three input units and performs OR logic on each bit of data RH, RLH, and RL corresponding to data R[ 143 : 33 ]. Furthermore, the bit selecting circuit 42 includes a plurality of or-circuits OR each of which has two input units and performs OR logic on each bit of data RLH and RL corresponding to data R[ 32 : 17 ].
  • the logical value 1 is set in a case where the respective bit of the data RH[ 159 : 33 ], the respective bit of the data RLH[ 159 : 17 ], or the respective bit of the data RL[ 143 : 17 ] is the logical value 1.
  • Each of the shift circuits 20 a, 22 a, and 22 b, which are illustrated in FIG. 11 , includes a function to set bits to the logical value 0 except for valid bits. Further, as illustrated in FIG. 14 and FIG. 15 , the data D[ 63 : 0 ] is not simultaneously output to the internal buses RH, RLH, and RL having the same bit numbers. Hence, valid data D is not simultaneously supplied to the plurality of input units of each or-circuit OR of the bit selecting circuit 42 .
  • the bit selecting circuit 42 can select valid data and output the selected data to the output bus R[ 159 : 17 ] without using a control signal.
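  • The number of inputs of each or-circuit follows directly from which internal buses cover a given bit number of R. The sketch below derives those groups from the bus ranges received by the bit selecting circuit 42 ; the names `BUSES_42`, `fan_in`, and `group_by_inputs` are hypothetical.

```python
# Bus ranges (high bit, low bit) received by the bit selecting circuit 42.
BUSES_42 = {"RH": (159, 33), "RLH": (159, 17), "RL": (143, 17)}


def fan_in(buses, bit):
    """List the buses that carry a bit with this bit number."""
    return [name for name, (hi, lo) in buses.items() if lo <= bit <= hi]


def group_by_inputs(buses, hi, lo):
    """Collapse R[hi:lo] into runs of consecutive bits fed by the same buses."""
    runs = []
    for bit in range(hi, lo - 1, -1):
        names = fan_in(buses, bit)
        if runs and runs[-1][2] == names:
            runs[-1][1] = bit          # extend the current run downward
        else:
            runs.append([bit, bit, names])
    return [(h, l, n) for h, l, n in runs]


for high, low, names in group_by_inputs(BUSES_42, 159, 17):
    print(f"R[{high}:{low}] <- OR of {', '.join(names)}")
```

  • Because every shift circuit drives “0” on bits that carry no data, a plain OR per bit is enough to pick the valid bit, which is why no control signal is needed.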
  • the operation of the shift circuit 20 a is the same as the operation in FIG. 6 .
  • the shift circuit 22 a shifts, in accordance with the shift amount signal SALH 1 [ 6 : 0 ], the position of each bit of the data D[ 31 : 16 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RLH[ 159 : 17 ].
  • the shift circuit 22 b shifts, in accordance with the shift amount signal SAL 1 [ 6 : 0 ], the position of each bit of the data D[ 15 : 0 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RL[ 143 : 1 ].
  • the bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RLH that is output by the shift circuit 22 a differ by 32 bits.
  • the bit range of the data RLH that is output by the shift circuit 22 a and the bit range of the data RL that is output by the shift circuit 22 b differ by 16 bits.
  • the shift amount signals SAH[ 6 : 0 ], SALH[ 6 : 0 ], and SAL[ 6 : 0 ] are set to values equal to each other.
  • the shift operation circuit 104 can output the data D[ 63 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] as the data R without causing the bit numbers of the data RH, RLH, and RL to overlap with each other. Further, the shift operation circuit 104 can output the data R without making blank bit numbers in the data RH, RLH, and RL.
  • the data R[ 191 : 1 ] illustrated within brackets at the lower part of FIG. 14 indicates an example of bit positions at which data D[ 63 : 0 ] appears in accordance with the shift amount signals SAH 1 , SALH 1 , and SAL 1 .
  • the bit positions at which the data D[ 63 : 0 ] appears are similar to those in FIG. 6 .
  • the most significant bit SAH 1 [ 6 ] of the shift amount signal SAH 1 [ 6 : 0 ] is fixed to “0”, and the high-order bits SALH 1 [ 6 : 5 ] of the shift amount signal SALH 1 [ 6 : 0 ] are fixed to “10”. Further, the high-order bits SAL 1 [ 6 : 5 ] of the shift amount signal SAL 1 [ 6 : 0 ] are fixed to “11”. That is, in the SIMD mode, the 2 high-order bits of the shift amount signals SAH 1 , SALH 1 , and SAL 1 are set to logical values different from each other.
  • the shift circuit 20 a operates in a manner similar to that in FIG. 7 . That is, with reference to the bit RH[ 191 ], the shift circuit 20 a shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the position of each bit of the data D[ 63 : 32 ] in a range of from 0 bits to 63 bits, and outputs the shifted data as the data RH[ 191 : 97 ].
  • the shift circuit 22 a shifts, in accordance with the shift amount signal SALH 1 [ 6 : 0 ], the position of each bit of the data D[ 31 : 16 ] in a range of from 64 bits to 95 bits, and outputs the shifted data as the data RLH[ 95 : 49 ].
  • the shift circuit 22 b shifts, in accordance with the shift amount signal SAL 1 [ 6 : 0 ], the position of each bit of the 16-bit data D[ 15 : 0 ] in a range of from 96 bits to 127 bits, and outputs the shifted data as the data RL[ 47 : 1 ].
  • the bit range of the data RH that is output by the shift circuit 20 a, the bit range of the data RLH that is output by the shift circuit 22 a, and the bit range of the data RL that is output by the shift circuit 22 b do not overlap.
  • the shift amount signal SAH 1 is also set to “3Fh” (right shift by 63 bits), and the data D[ 63 : 32 ] is output as data R[ 128 : 97 ].
  • the shift amount signal SALH is “1Fh”
  • the shift amount signal SALH 1 is set to “5Fh” (right shift by 95 bits)
  • the data D[ 31 : 16 ] is output as data R[ 64 : 49 ].
  • the shift amount signal SAL is “1Fh”
  • the shift amount signal SAL 1 is set to “7Fh” (right shift by 127 bits), and the data D[ 15 : 0 ] is output as data R[ 16 : 1 ].
  • the data D[ 63 : 32 ] is output to a range of data R[ 191 : 97 ]
  • the data D[ 31 : 16 ] is output to a range of data R[ 95 : 49 ]
  • the data D[ 15 : 0 ] is output to a range of data R[ 47 : 1 ].
  • even when the shift amount signals SAH[ 6 : 0 ], SALH[ 6 : 0 ], and SAL[ 6 : 0 ], which are illustrated in FIG. 11 , are set independently from each other, it is possible to prevent the data D[ 63 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] from colliding.
  • the reference bit positions of the internal buses RH, RLH, and RL, which are respectively coupled to the three shift circuits 20 a, 22 a, and 22 b, are shifted by the bit width of divided data D, and thereby data D can be prevented from collision in the normal mode.
  • the bit selecting circuit 42 can select valid data D without using a control signal.
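  • The three-lane arrangement of the shift operation circuit 104 can be sketched as a behavioral model. As before, a Python integer stands in for the output bus (integer bit i models R[ i ]), and the names `lane`, `shift_104`, and `field` are hypothetical. The reference positions RH[ 191 ], RLH[ 159 ], and RL[ 143 ] are taken from the bus ranges given above.

```python
def lane(chunk: int, width: int, top: int, shift: int) -> int:
    """Place a `width`-bit chunk so that its MSB lands on bus bit `top - shift`."""
    assert 0 <= chunk < (1 << width) and 0 <= shift <= 127
    return chunk << (top - width + 1 - shift)


def shift_104(d: int, sah1: int, salh1: int, sal1: int) -> int:
    rh = lane((d >> 32) & 0xFFFFFFFF, 32, 191, sah1)  # shift circuit 20a
    rlh = lane((d >> 16) & 0xFFFF, 16, 159, salh1)    # shift circuit 22a
    rl = lane(d & 0xFFFF, 16, 143, sal1)              # shift circuit 22b
    return rh | rlh | rl  # buffers pass through; circuit 42 ORs the overlap


def field(r: int, hi: int, lo: int) -> int:
    """Extract R[hi:lo] from the modeled bus."""
    return (r >> lo) & ((1 << (hi - lo + 1)) - 1)


d = 0x0123456789ABCDEF
r = shift_104(d, 0x3F, 0x5F, 0x7F)  # SIMD-mode example shift amounts
print(hex(field(r, 128, 97)), hex(field(r, 64, 49)), hex(field(r, 16, 1)))
```

  • With the shift amounts 3Fh, 5Fh, and 7Fh the three parts of D land on R[ 128 : 97 ], R[ 64 : 49 ], and R[ 16 : 1 ], as in the example above. With three equal shift amounts the model reproduces an ordinary 64-bit right shift.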
  • FIG. 16 illustrates a shift operation circuit 106 according to another embodiment.
  • the shift operation circuit 106 includes shift control circuits 15 , 16 , 13 , and 14 , shift circuits 22 c, 22 d, 22 a, and 22 b, buffer circuits 33 and 32 , and a bit selecting circuit 44 .
  • the shift operation circuit 106 can be mounted on the adder FADD or the multiplier/adder FMA for floating-point numbers of the operation processing apparatus 200 , which is illustrated in FIG. 2 .
  • data (operands) divided into four are used to execute the operation in parallel.
  • parity bits DP, RPH, RPHH, RPLH, RPL, and RP that are indicated in the brackets are appended.
  • the shift operation circuit 106 does not include parity predictors and parity bits DP, RPH, RPHH, RPLH, RPL, and RP are not appended.
  • the shift control circuit 15 changes logical values of a shift amount signal SAH[ 6 : 0 ] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SAH 1 [ 6 : 0 ].
  • the shift control circuit 16 changes logical values of a shift amount signal SAHH[ 6 : 0 ] in accordance with the mode signal SIMD, and outputs the changed signal as a shift amount signal SAHH 1 [ 6 : 0 ].
  • a circuit configuration and functions of the shift control circuit 13 of FIG. 16 are the same as the circuit configuration and the functions of the shift control circuit 13 that is illustrated in FIG. 11 .
  • a circuit configuration and functions of the shift control circuit 14 of FIG. 16 are the same as the circuit configuration and the functions of the shift control circuit 14 that is illustrated in FIG. 11 .
  • the shift circuits 22 c, 22 d, 22 a, and 22 b have circuit configurations the same as those of the shift circuits 22 a and 22 b, which are illustrated in FIG. 11 .
  • An operation of the shift circuit 22 a of FIG. 16 is the same as the operation of the shift circuit 22 a that is illustrated in FIG. 11 .
  • an operation of the shift circuit 22 b of FIG. 16 is the same as the operation of the shift circuit 22 b that is illustrated in FIG. 11 .
  • the shift circuit 22 c shifts, in accordance with the value of a shift amount signal SAH 1 , the bits of 16-bit data D[ 63 : 48 ] within the 64-bit data D[ 63 : 0 ] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RH[ 191 : 49 ]. That is, the shift circuit 22 c shifts the data D[ 63 : 48 ] to the right by the value of the shift amount signal SAH 1 (which is a value from 0 bits to 127 bits).
  • the shift circuit 22 d shifts, in accordance with the value of a shift amount signal SAHH 1 , the bits of 16-bit data D[ 47 : 32 ] within the 64-bit data D[ 63 : 0 ] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RHH[ 175 : 33 ]. That is, the shift circuit 22 d shifts the data D[ 47 : 32 ] to the right by the value of the shift amount signal SAHH 1 (which is a value from 0 bits to 127 bits).
  • data transmitted to the internal bus RHH[ 175 : 33 ] may be also referred to as the data RHH[ 175 : 33 ].
  • a circuit configuration and functions of the buffer circuit 32 of FIG. 16 are the same as the circuit configuration and the functions of the buffer circuit 32 that is illustrated in FIG. 11 .
  • the buffer circuit 33 has a circuit configuration the same as that of the buffer circuit 32 .
  • the buffer circuit 33 outputs, as data R[ 191 : 176 ], the high-order 16-bit data RH[ 191 : 176 ] within the data RH[ 191 : 49 ] output from the shift circuit 22 c.
  • the bit selecting circuit 44 receives the data RH[ 175 : 49 ], output from the shift circuit 22 c, and the data RHH[ 175 : 33 ], output from the shift circuit 22 d. Further, the bit selecting circuit 44 receives the data RLH[ 159 : 17 ], output from the shift circuit 22 a, and the data RL[ 143 : 17 ], output from the shift circuit 22 b. The bit selecting circuit 44 selects valid bits from the data RH[ 175 : 49 ], the data RHH[ 175 : 33 ], the data RLH[ 159 : 17 ], and the data RL[ 143 : 17 ], and outputs the selected bits as data R[ 175 : 17 ]. Within the data R[ 175 : 17 ], the valid bits are 48 bits at a minimum and 64 bits at a maximum.
  • FIG. 17 illustrates an example of the shift control circuits 15 , 16 , 13 , and 14 , which are illustrated in FIG. 16 .
  • a circuit configuration and functions of the shift control circuit 13 of FIG. 17 are the same as the circuit configuration and the functions of the shift control circuit 13 that is illustrated in FIG. 11 .
  • a circuit configuration and functions of the shift control circuit 14 of FIG. 17 are the same as the circuit configuration and the functions of the shift control circuit 14 that is illustrated in FIG. 11 .
  • the shift control circuit 15 includes and-circuits AND 1 and AND 2 that receive a mode signal SIMD via an inverter IV, and a plurality of buffers BUF that output a shift amount signal SAH[ 4 : 0 ] as a shift amount signal SAH 1 [ 4 : 0 ].
  • Outputs of the and-circuits AND 1 and AND 2 (SAH 1 [ 6 : 5 ]) are set to “00” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 15 outputs, in accordance with the shift amount signal SAH[ 4 : 0 ], the shift amount signal SAH 1 [ 6 : 0 ] that represents a shift amount of from 0 bits to 31 bits.
  • the shift control circuit 16 includes an and-circuit AND that receives the mode signal SIMD via an inverter and an or-circuit OR that receives the mode signal SIMD. Further, the shift control circuit 16 includes a plurality of buffers BUF that output a shift amount signal SAHH[ 4 : 0 ] as a shift amount signal SAHH 1 [ 4 : 0 ]. Outputs of the and-circuit AND and the or-circuit OR (SAHH 1 [ 6 : 5 ]) are set to “01” during the SIMD mode.
  • the shift control circuit 16 outputs, in accordance with the shift amount signal SAHH[ 4 : 0 ], the shift amount signal SAHH 1 [ 6 : 0 ] that represents a shift amount of from 32 bits to 63 bits.
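  • In the SIMD mode, each of the four shift control circuits fixes the 2 high-order bits of its shift amount signal to a distinct value and passes the 5 low-order bits through. The resulting shift ranges can be tabulated with a short sketch; the names `PREFIX`, `simd_shift`, and `simd_range` are hypothetical.

```python
# 2 high-order bits fixed by each shift control circuit during the SIMD mode.
PREFIX = {"SAH1": 0b00, "SAHH1": 0b01, "SALH1": 0b10, "SAL1": 0b11}


def simd_shift(name: str, sa: int) -> int:
    """Combine the fixed prefix [6:5] with the pass-through bits [4:0]."""
    return (PREFIX[name] << 5) | (sa & 0x1F)


def simd_range(name: str):
    """Smallest and largest shift amount the signal can represent in SIMD mode."""
    return simd_shift(name, 0x00), simd_shift(name, 0x1F)


for name in PREFIX:
    low, high = simd_range(name)
    print(f"{name}: {low} to {high} bits")
```

  • Since 5 pass-through bits span 32 values, each signal covers a 32-bit window: the prefix “01” of SAHH 1 gives 32 to 63 bits, and the four windows together cover 0 to 127 bits without overlap.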
  • FIG. 18 illustrates an example of the buffer circuits 33 and 32 and the bit selecting circuit 44 , which are illustrated in FIG. 16 .
  • a circuit configuration and functions of the buffer circuit 32 of FIG. 18 are the same as the circuit configuration and the functions of the buffer circuit 32 that is illustrated in FIG. 13 .
  • the buffer circuit 33 includes a plurality of buffers BUF that output data RH[ 191 : 176 ] as data R[ 191 : 176 ].
  • the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has two input units to operate an OR logic of each bit of data RH and RHH corresponding to data R[ 175 : 160 ]. Further, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has three input units to operate an OR logic of each bit of data RH, RHH, and RLH corresponding to data R[ 159 : 144 ]. Furthermore, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has four input units to operate an OR logic of each bit of data RH, RHH, RLH, and RL corresponding to data R[ 143 : 49 ].
  • the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has three input units to operate an OR logic of each bit of data RHH, RLH, and RL corresponding to data R[ 48 : 33 ]. Furthermore, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has two input units to operate an OR logic of each bit of data RLH and RL corresponding to data R[ 32 : 17 ].
  • the logical value 1 is set in a case where the respective bit of the data RH[ 175 : 49 ], the respective bit of the data RHH[ 175 : 33 ], the respective bit of the data RLH[ 159 : 17 ], or the respective bit of the data RL[ 143 : 17 ] is the logical value 1.
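  • As an illustrative check (not part of the described circuit), the number of inputs of each or-circuit can be derived by counting which internal buses contain each output bit number. The sketch below assumes the full bus extents RH[ 191 : 49 ], RHH[ 175 : 33 ], RLH[ 159 : 17 ], and RL[ 143 : 1 ]; the lower bound of RL is inferred from the R[ 16 : 1 ] output range, and the names `BUSES`, `buses_driving`, and `fan_in_ranges` are hypothetical.

```python
# Bit ranges (high, low) of the four internal buses.
# RL is assumed to extend down to bit 1 (inferred, see the lead-in).
BUSES = {
    "RH": (191, 49),
    "RHH": (175, 33),
    "RLH": (159, 17),
    "RL": (143, 1),
}


def buses_driving(bit):
    """Return the names of the internal buses that contain this bit number."""
    return [name for name, (hi, lo) in BUSES.items() if lo <= bit <= hi]


def fan_in_ranges(hi, lo):
    """Group bit numbers hi..lo into runs that are fed by the same set of buses."""
    runs = []
    for bit in range(hi, lo - 1, -1):
        names = buses_driving(bit)
        if runs and runs[-1][2] == names:
            runs[-1][1] = bit  # extend the current run downwards
        else:
            runs.append([bit, bit, names])
    return [(h, l, n) for h, l, n in runs]


if __name__ == "__main__":
    for h, l, names in fan_in_ranges(175, 17):
        print(f"R[{h}:{l}]: {len(names)}-input OR of {', '.join(names)}")
```

  • Under these assumptions the runs come out as two, three, four, three, and two inputs over R[ 175 : 17 ], which agrees with the or-circuit grouping described for the bit selecting circuit 44 .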
  • Each of the shift circuits 22 c, 22 d, 22 a, and 22 b, which are illustrated in FIG. 16 , includes a function to set bits to the logical value 0 except for valid bits. Further, as illustrated in FIG. 19 and FIG. 20 , the data D[ 63 : 0 ] is not simultaneously output to the internal buses RH, RHH, RLH, and RL having same bit numbers. Hence, valid data D is not simultaneously supplied to a plurality of input units of each or-circuit OR of the bit selecting circuit 44 .
  • the bit selecting circuit 44 can select valid data and output the selected data to the output bus R[ 175 : 17 ] without using a control signal.
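  • The following is a minimal behavioral sketch, under stated assumptions, of why a plain OR suffices in the normal mode. Each bus is modeled as a Python integer whose bit i stands for bus bit number i, each shift circuit drives 0 outside its 16 valid bits, and the reference bit positions 191, 175, 159, and 143 are taken from the operation examples. The function names are hypothetical and the model ignores timing and gate-level structure.

```python
WIDTH = 16  # bit width of each divided data lane
# Reference bit positions of the internal buses RH, RHH, RLH, and RL.
REFS = (191, 175, 159, 143)


def shift_circuit(lane, ref, shift):
    """Model one shift circuit: the MSB of the lane lands on bit (ref - shift)
    and every other bus bit is 0. Bit i of the result models bus bit i."""
    lsb = ref - shift - (WIDTH - 1)
    return (lane & 0xFFFF) << lsb


def bit_select(buses):
    """Model the bit selection as a plain OR, with no control signal."""
    out = 0
    for bus in buses:
        # Sanity check of the model: no bit is driven to 1 by two buses.
        assert (out & bus) == 0, "two buses drive the same bit"
        out |= bus
    return out


def shift_normal(d64, shift):
    """Normal mode: all four shift circuits receive the same shift amount."""
    lanes = [(d64 >> pos) & 0xFFFF for pos in (48, 32, 16, 0)]
    return bit_select(shift_circuit(l, r, shift) for l, r in zip(lanes, REFS))
```

  • In this model, `shift_normal(d, s)` equals `d << (128 - s)` for every shift amount from 0 to 127, that is, the four lanes reassemble into one contiguous 64-bit field on the output bus.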
  • the shift circuit 22 c shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the position of each bit of the data D[ 63 : 48 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RH[ 191 : 49 ].
  • the shift circuit 22 d shifts, in accordance with the shift amount signal SAHH 1 [ 6 : 0 ], the position of each bit of the data D[ 47 : 32 ] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RHH[ 175 : 33 ].
  • Operations of the shift circuits 22 a and 22 b are the same as those in FIG. 14 .
  • the bit range of the data RH that is output by the shift circuit 22 c and the bit range of the data RHH that is output by the shift circuit 22 d differ by 16 bits.
  • the bit range of the data RHH that is output by the shift circuit 22 d and the bit range of the data RLH that is output by the shift circuit 22 a differ by 16 bits.
  • the bit range of the data RLH that is output by the shift circuit 22 a and the bit range of the data RL that is output by the shift circuit 22 b differ by 16 bits.
  • the shift amount signals SAH[ 6 : 0 ], SAHH[ 6 : 0 ], SALH[ 6 : 0 ], and SAL[ 6 : 0 ] are set to values equal to each other.
  • the shift operation circuit 106 can output the data D[ 63 : 48 ], D[ 47 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] as the data R without causing the bit numbers of the data RH, RHH, RLH, and RL to overlap with each other. Further, the shift operation circuit 106 can output the data R without making blank bit numbers of data RH, RHH, RLH, and RL.
  • the high-order bits SAH 1 [ 6 : 5 ] of the shift amount signal SAH 1 [ 6 : 0 ] are fixed to “00”, and the high-order bits SAHH 1 [ 6 : 5 ] of the shift amount signal SAHH 1 [ 6 : 0 ] are fixed to “01”. Further, the high-order bits SALH 1 [ 6 : 5 ] of the shift amount signal SALH 1 [ 6 : 0 ] are fixed to “10”, and the high-order bits SAL 1 [ 6 : 5 ] of the shift amount signal SAL 1 [ 6 : 0 ] are fixed to “11”. That is, in the SIMD mode, the 2 high-order bits of the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are set to logical values different from each other.
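  • The conversion performed by the four shift control circuits can be sketched as follows. This is an illustrative model only: the function name `convert_shift_amounts` is hypothetical, and the normal-mode branch assumes that the 7-bit shift amounts are passed through unchanged.

```python
def convert_shift_amounts(simd, sah, sahh, salh, sal):
    """Return (SAH1, SAHH1, SALH1, SAL1) as 7-bit integers.

    In the SIMD mode the two high-order bits are fixed to 00, 01, 10, and 11
    and only the low-order five bits of each input are used.
    In the normal mode the inputs are assumed to pass through unchanged.
    """
    inputs = (sah, sahh, salh, sal)
    if not simd:
        return tuple(s & 0x7F for s in inputs)
    prefixes = (0b00, 0b01, 0b10, 0b11)
    return tuple((p << 5) | (s & 0x1F) for p, s in zip(prefixes, inputs))
```

  • Because each lane receives a distinct 2-bit prefix, the four converted shift amounts fall into the disjoint ranges 0 to 31, 32 to 63, 64 to 95, and 96 to 127.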
  • the shift circuit 22 c shifts, in accordance with the shift amount signal SAH 1 [ 6 : 0 ], the position of each bit of the data D[ 63 : 48 ] in a range of from 0 bits to 31 bits, and outputs the shifted data as the data RH[ 191 : 145 ].
  • the shift circuit 22 d shifts, in accordance with the shift amount signal SAHH 1 [ 6 : 0 ], the position of each bit of the data D[ 47 : 32 ] in a range of from 32 bits to 63 bits, and outputs the shifted data as the data RHH[ 143 : 97 ].
  • Operations of the shift circuits 22 a and 22 b of FIG. 20 are the same as the operations of the shift circuits 22 a and 22 b that are illustrated in FIG. 15 .
  • the data D[ 63 : 48 ] is output to a range of data R[ 191 : 145 ], and the data D[ 47 : 32 ] is output to a range of data R[ 143 : 97 ].
  • the data D[ 31 : 16 ] is output to a range of data R[ 95 : 49 ], and the data D[ 15 : 0 ] is output to a range of data R[ 47 : 1 ]. That is, the bit ranges of the data RH, RHH, RLH, and RL that are output by the shift circuits 22 c, 22 d, 22 a, and 22 b do not overlap with each other.
  • FIG. 21 illustrates an example of a shift operation of the shift operation circuit 106 , which is illustrated in FIG. 16 .
  • the shift amount signals SAH, SAHH, SALH, and SAL are “00h”.
  • the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are also set to “00h” (right shift by 0 bits).
  • the data D[ 63 : 48 ], D[ 47 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] are output as R[ 191 : 128 ].
  • the shift amount signals SAH, SAHH, SALH, and SAL are “19h”.
  • the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are also set to “19h” (right shift by 25 bits).
  • the data D[ 63 : 48 ], D[ 47 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] are output as R[ 166 : 103 ].
  • the shift amount signals SAH, SAHH, SALH, and SAL are “6Eh” (right shift by 110 bits).
  • the data D[ 63 : 48 ], D[ 47 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] are output as R[ 81 : 18 ].
  • the shift amount signals SAH, SAHH, SALH, and SAL are “7Fh”.
  • the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are also set to “7Fh” (right shift by 127 bits).
  • the data D[ 63 : 48 ], D[ 47 : 32 ], D[ 31 : 16 ], and D[ 15 : 0 ] are output as R[ 64 : 1 ].
  • the shift amount signals SAH, SAHH, SALH, and SAL are set to “00h”, “1Fh”, “00h”, and “1Fh”.
  • the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are set to “00h”, “3Fh”, “40h”, and “7Fh”.
  • the data D[ 63 : 48 ] is output as R[ 191 : 176 ].
  • the data D[ 47 : 32 ] is output as R[ 112 : 97 ].
  • the data D[ 31 : 16 ] is output as R[ 95 : 80 ].
  • the data D[ 15 : 0 ] is output as R[ 16 : 1 ].
  • the shift amount signals SAH, SAHH, SALH, and SAL are set to “1Fh”, “00h”, “1Fh”, and “00h”.
  • the shift amount signals SAH 1 , SAHH 1 , SALH 1 , and SAL 1 are set to “1Fh”, “20h”, “5Fh”, and “60h”.
  • the data D[ 63 : 48 ] is output as R[ 160 : 145 ].
  • the data D[ 47 : 32 ] is output as R[ 143 : 128 ].
  • the data D[ 31 : 16 ] is output as R[ 64 : 49 ].
  • the data D[ 15 : 0 ] is output as R[ 47 : 32 ].
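  • The output ranges listed for FIG. 21 can be reproduced with simple arithmetic: each 16-bit lane occupies the bits from (reference bit position minus converted shift amount) down to 15 bits below that. The sketch below is illustrative only; the reference positions 191, 175, 159, and 143 are taken from the operation examples, and the function name `output_ranges` is hypothetical.

```python
WIDTH = 16  # bit width of each divided data lane
# Reference bit positions of the internal buses RH, RHH, RLH, and RL.
REFS = (191, 175, 159, 143)


def output_ranges(sa1):
    """Map the converted shift amounts (SAH1, SAHH1, SALH1, SAL1) to the
    (high, low) output bit range occupied by each 16-bit lane."""
    return [(ref - s, ref - s - (WIDTH - 1)) for ref, s in zip(REFS, sa1)]


if __name__ == "__main__":
    cases = {
        "normal, 00h": (0x00,) * 4,
        "normal, 19h": (0x19,) * 4,
        "normal, 6Eh": (0x6E,) * 4,
        "normal, 7Fh": (0x7F,) * 4,
        "SIMD, 00h/3Fh/40h/7Fh": (0x00, 0x3F, 0x40, 0x7F),
        "SIMD, 1Fh/20h/5Fh/60h": (0x1F, 0x20, 0x5F, 0x60),
    }
    for label, sa1 in cases.items():
        ranges = ", ".join(f"R[{h}:{l}]" for h, l in output_ranges(sa1))
        print(f"{label}: {ranges}")
```

  • In the normal-mode cases the four ranges are contiguous (for example R[ 166 : 151 ] through R[ 118 : 103 ] for “19h”), while in the SIMD-mode cases each lane stays inside its own quarter of the output bus.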
  • the operation processing apparatus 200 in which the shift operation circuit 106 is mounted on an arithmetic unit, can execute both a SIMD operation for 32-bit data (divided into two) and a SIMD operation for 16-bit data (divided into four).
  • the shift control circuits 15 and 16 set the most significant bits SAH 1 [ 6 ] and SAHH 1 [ 6 ] of shift amount signals SAH 1 [ 6 : 0 ] and SAHH 1 [ 6 : 0 ] to the logical value 0.
  • the shift control circuits 13 and 14 set the most significant bits SALH 1 [ 6 ] and SAL 1 [ 6 ] of shift amount signals SALH 1 [ 6 : 0 ] and SAL 1 [ 6 : 0 ] to the logical value 1. Thereby, the shift operation circuit 106 operates in a manner similar to that in FIG. 7 .
  • a second SIMD mode for executing a four-divisional SIMD operation is the same as that in FIG. 20 .
  • the reference bit positions of the internal buses RH, RHH, RLH, and RL which are respectively coupled to the four shift circuits 22 c, 22 d, 22 a, and 22 b, are shifted by the bit width of divided data D, and thereby data D can be prevented from collision in the normal mode.
  • the bit selecting circuit 44 can select valid data D without using a control signal.
  • the shift operation circuit 106 can execute a two-divisional SIMD operation or a four-divisional SIMD operation.

Abstract

A shift operation circuit includes: shift circuits respectively coupled to internal buses whose bit numbers partially overlap, each shift circuit receiving one of sets of divided data obtained by dividing input data and one of shift amount signals and outputting the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus; a shift control circuit configured to output, during a first mode, shift amount signals whose shift amounts are common to the shift circuits, and configured to convert, during a second mode, a shift amount signal for each shift circuit, into a shift amount signal representing a shift range whose bit numbers do not overlap in the internal buses; and a bit selecting circuit configured to select valid divided data from bits whose bit numbers overlap in the internal buses.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-019576 filed on Feb. 6, 2017, the entire contents of which are incorporated herein by reference.
  • FIELD
  • The embodiments discussed herein relate to a shift operation circuit and a shift operation method.
  • BACKGROUND
  • In recent years, a method called a Single Instruction Multiple Data (SIMD) operation has been proposed for operating on multiple sets of data in parallel based on a single instruction in order to efficiently process data that is used for image processing or the like by using a processor, such as a Central Processing Unit (CPU).
  • Such a processor includes a plurality of arithmetic units, such as an adder, a logical unit, and a shifter, and causes the plurality of arithmetic units to operate in a coupled manner when an instruction indicates a scalar mode, and causes the plurality of arithmetic units to operate independently from each other when an instruction indicates a vector mode (for example, see Patent Document 1). Further, such a processor includes a pair of Arithmetic Logic Units (ALU) and a pair of shifters coupled to each other via a shift data selecting circuit. Then, in a mode for causing the ALUs to operate on a non-divided basis, the processor causes the shifters to operate in a coupled manner as well as causes the ALUs to operate in a coupled manner. In a mode for causing the ALUs to operate on a divided basis, the processor causes the shifters to operate independently from each other as well as causes the ALUs to operate independently from each other (for example, see Patent Document 2).
  • RELATED-ART DOCUMENTS
  • Patent Documents
  • [Patent Document 1] Japanese Laid-open Patent Publication No. H8-50575
  • [Patent Document 2] Japanese Laid-open Patent Publication No. 2009-15555
  • Here, when a SIMD function is mounted on an arithmetic unit such as a floating-point adder, a function of each element of the arithmetic unit is switched between a case in which a normal instruction other than a SIMD instruction is executed and a case in which a SIMD instruction is executed. For example, in a floating-point adder or a floating-point multiplier/adder, a shift operation circuit for executing digit alignment of a significand includes a plurality of shift circuits that respectively shift a plurality of sets of data divided when executing a SIMD instruction. When bits of data supplied to a plurality of shift circuits overlap, the circuit scale of the shift circuits increases relative to a case in which bits do not overlap. However, a method for supplying data to a plurality of shift circuits without causing bits to overlap has not been proposed.
  • SUMMARY
  • According to an aspect of the embodiments, a shift operation circuit includes: a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping, each of the plurality of shift circuits being configured to receive corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data and to receive a corresponding shift amount signal that is one of a plurality of shift amount signals, each of the plurality of shift circuits being configured to output the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus; a shift control circuit configured to receive, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and to output, as the corresponding shift amount signal, the received plurality of shift amount signals to each of the plurality of shift circuits, and the shift control circuit being configured to receive, during a second mode, a shift amount signal for each of the plurality of shift circuits, convert the received shift amount signal into a corresponding shift amount signal that represents a shift range whose bit numbers do not overlap in the plurality of internal buses, and to output the corresponding shift amount signal to each of the plurality of shift circuits; and a bit selecting circuit configured to select valid corresponding divided data from bits whose bit numbers overlap in the plurality of internal buses and configured to output the selected corresponding divided data to an output bus.
  • According to another aspect of the embodiments, a shift operation method for a shift operation circuit including a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping includes: receiving, by each of the plurality of shift circuits, corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data; receiving, by each of the plurality of shift circuits, a corresponding shift amount signal that is one of a plurality of shift amount signals; outputting, by each of the plurality of shift circuits, the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus; receiving, by a shift control circuit included in the shift operation circuit, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and outputting, as the corresponding shift amount signal, the received plurality of shift amount signals to each of the plurality of shift circuits; receiving, by the shift control circuit, during a second mode, a shift amount signal for each of the plurality of shift circuits, converting the received shift amount signal into a corresponding shift amount signal that represents a shift range whose bit numbers do not overlap in the plurality of internal buses, and outputting the corresponding shift amount signal to each of the plurality of shift circuits; selecting, by a bit selecting circuit included in the shift operation circuit, valid corresponding divided data from bits whose bit numbers overlap in the plurality of internal buses; and outputting, by the bit selecting circuit, the selected corresponding divided data to an output bus.
  • The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a diagram illustrating a shift operation circuit according to an embodiment;
  • FIG. 2 is a diagram illustrating an example of an operation processing apparatus on which the shift operation circuit that is illustrated in FIG. 1 is mounted;
  • FIG. 3 is a diagram illustrating an example of a floating-point adder that is illustrated in FIG. 2;
  • FIG. 4 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 1;
  • FIG. 5 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 1;
  • FIG. 6 is a diagram illustrating an example of an operation in a normal mode of the shift operation circuit, which is illustrated in FIG. 1;
  • FIG. 7 is a diagram illustrating an example of an operation in a SIMD mode of the shift operation circuit, which is illustrated in FIG. 1;
  • FIG. 8 is a diagram illustrating an example of a case in which parity predictors that predict parity bits are built in the shift operation circuit, which is illustrated in FIG. 1;
  • FIG. 9 is a diagram illustrating an example of allocation of data and parity bits in the shift operation circuit, which is illustrated in FIG. 8;
  • FIG. 10 is a diagram illustrating a shift operation circuit as another example;
  • FIG. 11 is a diagram illustrating a shift operation circuit according to another embodiment;
  • FIG. 12 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 11;
  • FIG. 13 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 11;
  • FIG. 14 is a diagram illustrating an example of an operation in the normal mode of the shift operation circuit, which is illustrated in FIG. 11;
  • FIG. 15 is a diagram illustrating an example of an operation in the SIMD mode of the shift operation circuit, which is illustrated in FIG. 11;
  • FIG. 16 is a diagram illustrating a shift operation circuit according to another embodiment;
  • FIG. 17 is a diagram illustrating an example of shift control circuits, which are illustrated in FIG. 16;
  • FIG. 18 is a diagram illustrating an example of buffer circuits and a bit selecting circuit, which are illustrated in FIG. 16;
  • FIG. 19 is a diagram illustrating an example of an operation in the normal mode of the shift operation circuit, which is illustrated in FIG. 16;
  • FIG. 20 is a diagram illustrating an example of an operation in the SIMD mode of the shift operation circuit, which is illustrated in FIG. 16; and
  • FIG. 21 is a diagram illustrating an example of a shift operation of the shift operation circuit, which is illustrated in FIG. 16.
  • DESCRIPTION OF EMBODIMENT
  • In the following, embodiments will be described with reference to the accompanying drawings. It is an object in one aspect of the invention to reduce the circuit size of a shift operation circuit relative to a conventional one.
  • FIG. 1 illustrates a shift operation circuit 100 according to an embodiment. The shift operation circuit 100 includes shift control circuits 10 and 11, shift circuits 20 a and 20 b, buffer circuits 30 and 31, and a bit selecting circuit 40.
  • The shift control circuit 10 receives a 7-bit shift amount signal SAH[6:0] that represents a shift amount of the shift circuit 20 a, and changes logical values of the shift amount signal SAH[6:0] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SAH1[6:0]. Note that the mode signal SIMD is set to the logical value 1 during a SIMD mode in which an operation processing apparatus 200 executes a SIMD operation based on a SIMD instruction, and the mode signal SIMD is set to the logical value 0 during a normal mode in which the operation processing apparatus 200 executes a single operation based on a normal instruction. The normal mode is an example of a first mode, and the SIMD mode is an example of a second mode.
  • The shift control circuit 11 receives a 7-bit shift amount signal SAL[6:0] that represents a shift amount of the shift circuit 20 b, and changes logical values of the shift amount signal SAL[6:0] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SAL1[6:0]. Note that the shift control circuits 10 and 11 may be provided, on the shift operation circuit 100, as one shift control circuit. In the following, the shift amount signals SAH[6:0], SAL[6:0], SAH1[6:0], and SAL1[6:0] may also be referred to as the shift amount signals SAH, SAL, SAH1, and SAL1 by omitting the bit numbers.
  • In a case where the mode signal SIMD represents the normal mode, the shift amount signals SAH[6:0] and SAL[6:0] are set to values equal to each other. In a case where the mode signal SIMD represents the SIMD mode, the shift amount signals SAH[6:0] and SAL[6:0] are set independently from each other.
  • For example, during the SIMD mode, the most significant bit SAH1[6] of the shift amount signal SAH1 is set to the logical value 0, and the most significant bit SAL1[6] of the shift amount signal SAL1 is set to the logical value 1. The shift amount signal SAH1, of which the most significant bit SAH1[6] is set to the logical value 0, represents one of “0” to “63”, and the shift amount signal SAL1, of which the most significant bit SAL1[6] is set to the logical value 1, represents one of “64” to “127”. In other words, during the SIMD mode, the shift amount signals SAH[6:0] and SAL[6:0] are converted, at the internal buses RH[191:33] and RL[159:1], into shift amount signals SAH1[6:0] and SAL1[6:0] that represent shift ranges of which the bit numbers do not overlap. An example of the shift control circuits 10 and 11 is illustrated in FIG. 4.
  • The shift circuit 20 a receives 32-bit divided data D[63:32] obtained by dividing 64-bit input data D[63:0] and receives the shift amount signal SAH1[6:0]. The shift circuit 20 a outputs, in the internal bus RH[191:33], the divided data D[63:32] to a range shifted from a reference bit position RH[191] by the shift amount represented by the shift amount signal SAH1[6:0]. In the following, the input data D[63:0] may also be referred to as the data D[63:0], and the divided data D[63:32] may also be referred to as the data D[63:32]. Further, the data transmitted to the internal bus RH[191:33] may be also referred to as the data RH[191:33]. In other words, each of the shift circuits 20 a and 20 b may receive the corresponding divided data, which is one of a plurality of sets of divided data obtained by dividing input data D[63:0], and receive the corresponding shift amount signal, which is one of a plurality of shift amount signals, and each of the shift circuits 20 a and 20 b may output the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus.
  • The shift circuit 20 a shifts, in accordance with the value of the shift amount signal SAH1, the bits of the data D[63:32] from the high-order side to the low-order side, and outputs the shifted data as 159-bit data RH[191:33]. That is, the shift circuit 20 a shifts the data D[63:32] to the right by the value of the shift amount signal SAH1 (which is a value from 0 bits to 127 bits). The shift circuit 20 a includes a function to set 127 bits to the logical value 0 except for 32 bits output as the data D[63:32], within the 159-bit data RH[191:33].
  • The shift circuit 20 b receives 32-bit divided data D[31:0] obtained by dividing the 64-bit input data D[63:0] and receives the shift amount signal SAL1[6:0]. The shift circuit 20 b outputs, in the internal bus RL[159:1], the divided data D[31:0] to a range shifted from a reference bit position RL[159] by the shift amount represented by the shift amount signal SAL1[6:0]. In the following, the divided data D[31:0] may also be referred to as the data D[31:0], and the data transmitted to the internal bus RL[159:1] may be also referred to as the data RL[159:1].
  • The shift circuit 20 b shifts, in accordance with the value of the shift amount signal SAL1, the bits of the data D[31:0] from the high-order side to the low-order side, and outputs the shifted data as 159-bit data RL[159:1]. That is, the shift circuit 20 b shifts the data D[31:0] to the right by the value of the shift amount signal SAL1 (which is a value from 0 bits to 127 bits). The shift circuit 20 b includes a function to set 127 bits to the logical value 0 except for 32 bits output as the data D[31:0] within the 159-bit data RL[159:1].
  • The bit numbers of the bits RH[159:33] of the internal bus RH[191:33] coupled to the shift circuit 20 a and the bit numbers of the bits RL[159:33] of the internal bus RL[159:1] coupled to the shift circuit 20 b overlap with each other. In other words, a part of the bit numbers of the internal buses RH[191:33] and RL[159:1] overlap. Further, the reference bit position RL[159] of the shift circuit 20 b is offset from the reference bit position RH[191] of the shift circuit 20 a by the bit width (32 bits) of the divided data D[63:32]. Thereby, as described in the following with reference to FIG. 6, in the normal mode, the divided data D[63:32] and D[31:0] supplied to the shift circuits 20 a and 20 b, which are different from each other, can be output, as continuous data D[63:0], to the output bus R[191:1]. In the following, the data transmitted to the output bus R[191:1] may be also referred to as the data R[191:1].
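  • The bus extents quoted above follow from the reference bit position, the 32-bit width of the divided data, and the maximum shift amount of 127 bits. The following sketch works through that arithmetic; the helper names `bus_range` and `overlap` are hypothetical and serve only as an illustration.

```python
MAX_SHIFT = 127  # largest value of a 7-bit shift amount signal


def bus_range(ref, width, max_shift=MAX_SHIFT):
    """Return (high, low): the highest and lowest bit numbers that a shift
    circuit with the given reference bit position and data width can drive."""
    return ref, ref - max_shift - (width - 1)


def overlap(a, b):
    """Return the (high, low) bit range shared by two buses, or None."""
    hi, lo = min(a[0], b[0]), max(a[1], b[1])
    return (hi, lo) if hi >= lo else None


if __name__ == "__main__":
    rh = bus_range(191, 32)  # bus of the shift circuit 20 a
    rl = bus_range(159, 32)  # bus of the shift circuit 20 b
    print(rh, rl, overlap(rh, rl))
```

  • With these inputs the sketch yields (191, 33) for RH, (159, 1) for RL, and (159, 33) for the shared bit numbers.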
  • The shift circuits 20 a and 20 b are circuits equal to each other and have common circuit data (macro data). Hence, for example, design data of the shift circuit 20 a can be used in the shift circuit 20 b. Therefore, it is possible to reduce a designing period of the shift circuits 20 a and 20 b relative to a case of independently designing the shift circuits 20 a and 20 b.
  • The buffer circuit 30 outputs, as data R[191:160], the high-order 32-bit data RH[191:160] within the data RH[191:33] output from the shift circuit 20 a. That is, the buffer circuit 30 outputs, to the output bus R[191:160], the data RH[191:160] output by the bits RH[191:160] whose bit numbers do not overlap with the internal bus RL[159:1] in the internal bus RH[191:33].
  • The buffer circuit 31 outputs, as data R[32:1], the low-order 32-bit data RL[32:1] within the data RL[159:1] output from the shift circuit 20 b. The buffer circuit 31 outputs, to the output bus R[32:1], the data RL[32:1] output by the bits RL[32:1] whose bit numbers do not overlap with the internal bus RH[191:33] in the internal bus RL[159:1].
  • The bit selecting circuit 40 selects valid bits from the data RH[159:33], output from the shift circuit 20 a, and the data RL[159:33], output from the shift circuit 20 b, and outputs the selected bits to the output bus R[159:33]. Within the data R[159:33], the valid bits are 32 bits at a minimum and 64 bits at a maximum. In the following, the data D[63:0], RH[191:33], RL[159:1], and R[191:1] may also be referred to as the data D, RH, RL, and R by omitting the bit numbers.
  • During the normal mode, the shift operation circuit 100 shifts the input data D[63:0] to the right by the value of the shift amount signals SAH and SAL (the same logical value), and outputs the shifted data as any 64 bits of the data R[191:1]. In contrast, during the SIMD mode, the shift operation circuit 100 shifts the input data D[63:32] to the right by the value of the shift amount signal SAH and outputs the shifted data as any 32 bits of the data R[191:97]. Further, during the SIMD mode, the shift operation circuit 100 shifts the input data D[31:0] to the right by the value of the shift amount signal SAL and outputs the shifted data as any 32 bits of the data R[95:1]. An example of the operation of the shift operation circuit 100 in the normal mode is illustrated in FIG. 6, and an example of the operation of the shift operation circuit 100 in the SIMD mode is illustrated in FIG. 7.
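  • As a rough end-to-end illustration, the two-lane behavior of the shift operation circuit 100 can be modeled in a few lines. This is a sketch under assumptions, not the circuit itself: the output bus is a Python integer whose bit i stands for R[i], the buffer circuits and the bit selecting circuit 40 are collapsed into a single OR (by analogy with the or-circuit based selection, since undriven bits are 0), the low-order six bits of SAH and SAL are assumed to pass through unchanged in the SIMD mode, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shift_operation_100(d64, sah, sal, simd):
    """Return the output bus R as an int whose bit i models R[i]."""
    if simd:
        sah1 = sah & 0x3F            # SAH1[6] forced to 0: shift of 0..63
        sal1 = 0x40 | (sal & 0x3F)   # SAL1[6] forced to 1: shift of 64..127
    else:
        # Normal mode: shift amounts assumed to pass through unchanged.
        sah1, sal1 = sah & 0x7F, sal & 0x7F
    hi = (d64 >> 32) & MASK32        # divided data D[63:32]
    lo = d64 & MASK32                # divided data D[31:0]
    rh = hi << (191 - sah1 - 31)     # MSB of D[63:32] lands on RH[191 - SAH1]
    rl = lo << (159 - sal1 - 31)     # MSB of D[31:0] lands on RL[159 - SAL1]
    return rh | rl                   # buffers and bit selection modeled as OR
```

  • In this model a normal-mode call with equal shift amounts s returns `d64 << (128 - s)`, and in the SIMD mode the high lane never drops below bit 97 while the low lane never rises above bit 95, so the two lanes cannot collide.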
  • FIG. 2 illustrates an example of the operation processing apparatus 200 on which the shift operation circuit 100 that is illustrated in FIG. 1 is mounted. The operation processing apparatus 200 includes an instruction cache 50, an instruction buffer 52, a decoding unit 54, a reservation station unit 56, and an operation executing unit 58. The operation processing apparatus 200 may be a processor such as a CPU, and FIG. 2 illustrates a part of a processor core mounted on the processor.
  • For example, the instruction cache 50 is a secondary cache (second level cache) or a primary instruction cache (first level cache) that stores an instruction transmitted from a main memory or the like. The instruction buffer 52 sequentially holds an instruction transmitted from the instruction cache 50 and sequentially outputs, to the decoding unit 54, the held instruction. The decoding unit 54 decodes the instruction transmitted from the instruction buffer 52, and inputs, in the reservation station unit 56, an instruction code, a register number, and the like included in the decoded instruction.
  • The reservation station unit 56 includes a Reservation Station for Execution (RSE) including a plurality of entries that hold operation instructions. Further, the reservation station unit 56 includes a Reservation Station for Address (RSA) including a plurality of entries that hold memory access instructions such as a load instruction and a store instruction.
  • The Reservation Station for Execution (RSE) determines a dependence relationship between the operation instructions held in the entries, and selects, based on the determined dependence relationship, an executable operation instruction from the operation instructions held in the entries. The Reservation Station for Execution (RSE) inputs the selected operation instruction into the operation executing unit 58. The Reservation Station for Address (RSA) determines a dependence relationship between the memory access instructions held in the entries, and selects, based on the determined dependence relationship, an executable load instruction or store instruction from the memory access instructions held in the entries. The Reservation Station for Address (RSA) inputs the selected load instruction or store instruction into the operation executing unit 58.
  • The operation executing unit 58 includes a fixed-point operation unit 60, a floating-point operation unit 62, a logical operation unit 64, an address operation unit 66, and a register unit 68. The fixed-point operation unit 60 includes an adder ADD that executes addition or subtraction of fixed-point numbers and a multiplier MUL that executes multiplication or division of fixed-point numbers. The floating-point operation unit 62 includes an adder FADD that executes addition or subtraction of floating-point numbers and a multiplier FMUL that executes multiplication or division of floating-point numbers. Further, the floating-point operation unit 62 includes a multiplier/adder FMA that executes multiplication and addition of floating-point numbers. The shift operation circuit 100 that is illustrated in FIG. 1 is mounted on the adder FADD for floating-point numbers. Note that the shift operation circuit 100 may be mounted on the multiplier/adder FMA for floating-point numbers.
  • For example, the adder FADD, the multiplier FMUL, and the multiplier/adder FMA include a function to execute a SIMD operation. In the SIMD operation, because a plurality of operations are executed in parallel based on a single instruction, a plurality of sets of data are respectively stored in a first operand and a second operand of a SIMD instruction in a divided manner.
  • The logical operation unit 64 includes a logical conjunction operator AND that executes an AND logical operation, and a logical disjunction operator OR that executes an OR logical operation, and a shift operator that executes a shift operation. The address operation unit 66 calculates an access address based on a memory access instruction input from the reservation station RSA and outputs the calculated access address to a data cache or the like not illustrated.
  • The register unit 68 has a plurality of universal registers designated by an instruction and a plurality of registers (update buffers) that temporarily hold operation results and the like. For example, each register is 64 bits.
  • FIG. 3 illustrates an example of the floating-point adder FADD, which is illustrated in FIG. 2. The floating-point adder FADD includes a comparator CMP, a switch SW, a subtractor SUB1, a right shifter RSFT, an adder ADD1, a leading zero predictor RZP, a normalization shifter NRMSFT, and an adder ADD2. For example, the shift operation circuit 100, which is illustrated in FIG. 1, may be mounted as the right shifter RSFT on the adder FADD for floating-point numbers.
  • In the following, an operation of the floating-point adder FADD in a normal mode will be described. The floating-point adder FADD, which is illustrated in FIG. 3, adds a 64-bit operand OP1, which includes an exponent EXP1 and a significand FRC1, and a 64-bit operand OP2, which includes an exponent EXP2 and a significand FRC2, and outputs the exponent EXP and the significand FRC that indicate the addition result. The operands OP1 and OP2 and the addition result are held in universal registers of the register unit 68, which is illustrated in FIG. 2.
  • For example, in the IEEE (Institute of Electrical and Electronics Engineers) 754 (Standard for Floating-point Arithmetic), a 64-bit floating-point number has a 1-bit sign part, an 11-bit exponent part, and a 52-bit significand part. In FIG. 2, the sign part (sign bit) is omitted. Further, in the IEEE 754, the normalized most significant bit is omitted as a hidden bit in a floating-point number, but the output of the switch SW is supplemented with the hidden bit.
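  • The field layout described above can be illustrated with a minimal Python sketch. The function name `decode_double` and the choice to restore the hidden bit only for a nonzero exponent field are illustrative assumptions, not part of the described circuit.

```python
import struct

def decode_double(x):
    """Split a 64-bit IEEE 754 value into sign, exponent, fraction, and significand.

    Hypothetical helper for illustration only.
    """
    # Reinterpret the double as a 64-bit unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                      # 1-bit sign part
    exponent = (bits >> 52) & 0x7FF        # 11-bit exponent part
    fraction = bits & ((1 << 52) - 1)      # 52-bit stored significand part
    # For a normalized number the hidden bit is restored above the fraction.
    significand = fraction | (1 << 52) if exponent != 0 else fraction
    return sign, exponent, fraction, significand
```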
  • The comparator CMP compares the magnitude of the exponent EXP1 with the magnitude of the exponent EXP2. When the exponent EXP2 is larger than the exponent EXP1, the comparator CMP outputs, to the switch SW, a switch control signal SWC for switching the exponents EXP1 and EXP2. When the exponent EXP1 is equal to or larger than the exponent EXP2, the comparator CMP outputs, to the switch SW, a switch control signal SWC for not switching the exponents EXP1 and EXP2. The subtractor SUB1 obtains a difference between the exponents EXP1 and EXP2 output from the switch SW and outputs, to the right shifter RSFT and the adder ADD2, a difference signal DIF that represents the obtained difference. Here, in a normal mode, the value of the difference signal DIF is supplied, to the right shifter RSFT, as shift amount signals SAH[6:0] and SAL[6:0] that are illustrated in FIG. 1.
  • The right shifter RSFT shifts the significand (one of FRC1 or FRC2) of whichever of the operands OP1 and OP2 has the smaller exponent to the right by the value of the difference signal DIF and outputs it to the adder ADD1 and the leading zero predictor RZP. The significand supplied from the switch SW to the right shifter RSFT is included in data D[63:0] that is illustrated in FIG. 1. By the operation of the right shifter RSFT, the digit positions of one of the significands FRC1 and FRC2 are matched to those of the other, and the significands FRC1 and FRC2 whose digits are matched are added by the adder ADD1. Note that in order to execute a SIMD operation, the right shifter RSFT (that is, the shift operation circuit 100) includes the shift circuits 20 a and 20 b that shift data D[63:32] and D[31:0] independently, as illustrated in FIG. 1. An example of the operation of the shift operation circuit 100 in the SIMD mode for executing the SIMD operation is illustrated in FIG. 7.
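  • The compare, switch, subtract, shift, and add steps described above can be sketched in Python as follows. This is a simplified model under stated assumptions: significands are plain integers that already include the hidden bit, bits shifted out to the right are simply dropped (the described circuit instead keeps them on a wider output bus), and the function name `align_and_add` is hypothetical.

```python
def align_and_add(exp1, frc1, exp2, frc2):
    """Align two significands by the exponent difference and add them.

    Simplified illustration; no rounding and no normalization.
    """
    # CMP + SW: make operand 1 the one with the larger (or equal) exponent.
    if exp2 > exp1:
        exp1, frc1, exp2, frc2 = exp2, frc2, exp1, frc1
    dif = exp1 - exp2        # SUB1: difference signal DIF
    aligned = frc2 >> dif    # RSFT: right shift of the smaller-exponent significand
    total = frc1 + aligned   # ADD1: sum of the digit-matched significands
    return exp1, total
```

  • For instance, `align_and_add(3, 0b1000, 1, 0b1100)` models 8·2^3 + 12·2^1 = 88 and yields exponent 3 with significand 0b1011, that is, 11·2^3.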
  • The adder ADD1 adds the digit-matched significands FRC1 and FRC2, and outputs the addition result to the normalization shifter NRMSFT. Using the digit-matched significands FRC1 and FRC2, the leading zero predictor RZP predicts the number of “0” bits that precede the first “1” on the high-order bit side of the addition result by the adder ADD1. Then, the leading zero predictor RZP outputs, to the normalization shifter NRMSFT and the adder ADD2, the predicted number as a shift amount.
  • The normalization shifter NRMSFT bit-shifts, based on the shift amount predicted by the leading zero predictor RZP, the addition result (significand) by the adder ADD1, and thereby sets “1”, which first appears on the high-order bit side of the addition result, to a hidden bit. Then, the normalization shifter NRMSFT outputs a significand FRC having the correct hidden bit. The adder ADD2 adds the value of the difference signal DIF from the subtractor SUB1 and the value of the shift amount, and outputs the addition result as an exponent EXP.
  • Note that during the SIMD mode, the floating-point adder FADD adds 32-bit floating-point numbers included in the respective operands OP1 and OP2 to each other, and also adds other 32-bit floating-point numbers included in the respective operands OP1 and OP2 to each other. That is, the operation processing apparatus 200 has an SIMD operation function to independently add two pairs of floating-point data included in the operands OP1 and OP2. During the SIMD mode, each element of the floating-point adder FADD is switched to the function of adding two pairs of floating-point data, but the details of the circuit are omitted. Note that in a case where the shift operation circuit 100 is mounted in the multiplier/adder FMA illustrated in FIG. 2, the shift operation circuit 100 is also mounted as a right shifter RSFT in the adder of the multiplier/adder FMA similarly to FIG. 3.
  • FIG. 4 illustrates an example of the shift control circuits 10 and 11, which are illustrated in FIG. 1. The shift control circuit 10 includes an and-circuit AND that receives a mode signal SIMD via an inverter IV, and the shift control circuit 10 includes a plurality of buffers BUF that output a shift amount signal SAH[5:0] as a shift amount signal SAH1[5:0]. During the normal mode (SIMD=“0”), the and-circuit AND outputs the most significant bit SAH[6] of the shift amount signal SAH as the shift amount signal SAH1[6]. During the SIMD mode (SIMD=“1”), the and-circuit AND sets the shift amount signal SAH1[6] to “0”. That is, during the SIMD mode, the shift control circuit 10 outputs, in accordance with the shift amount signal SAH[5:0], the shift amount signal SAH1[6:0] that represents a shift amount of from 0 bits to 63 bits.
  • The shift control circuit 11 includes an or-circuit OR that receives the mode signal SIMD, and the shift control circuit 11 includes a plurality of buffers BUF that output a shift amount signal SAL[5:0] as a shift amount signal SAL1[5:0]. During the normal mode, the or-circuit OR outputs the most significant bit SAL[6] of the shift amount signal SAL as the shift amount signal SAL1[6]. During the SIMD mode (SIMD=“1”), the or-circuit OR sets the shift amount signal SAL1[6] to “1”. That is, during the SIMD mode, the shift control circuit 11 outputs, in accordance with the shift amount signal SAL[5:0], the shift amount signal SAL1[6:0] that represents a shift amount of from 64 bits to 127 bits.
  • The most significant bits SAH1[6] and SAL1[6] of the respective shift amount signals SAH1[6:0] and SAL1[6:0] output to the shift circuits 20 a and 20 b are set, by the and-circuit AND and the or-circuit OR, to logical values different from each other. Thus, as will be described with reference to FIG. 7, even when the shift amount signal SAH[6:0] and the shift amount signal SAL[6:0] are set independently from each other in the SIMD mode, it is possible to prevent the data D[63:32] and the data D[31:0] from collision.
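  • The gating of the most significant shift amount bit can be modeled in a few lines of Python. This is a behavioral sketch only; the function names are hypothetical, and the signals are represented as plain integers.

```python
def shift_control_10(sah, simd):
    """Model of shift control circuit 10: SAH1[6] = SAH[6] AND (NOT SIMD)."""
    msb = ((sah >> 6) & 1) & (simd ^ 1)   # and-circuit fed through the inverter
    return (msb << 6) | (sah & 0x3F)      # SAH[5:0] passes through the buffers

def shift_control_11(sal, simd):
    """Model of shift control circuit 11: SAL1[6] = SAL[6] OR SIMD."""
    msb = ((sal >> 6) & 1) | simd         # or-circuit
    return (msb << 6) | (sal & 0x3F)      # SAL[5:0] passes through the buffers
```

  • In this model, with SIMD=“1” the output of `shift_control_10` always lies in the range 0 to 63 and the output of `shift_control_11` always lies in the range 64 to 127, whatever the inputs are.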
  • FIG. 5 illustrates an example of the buffer circuits 30 and 31 and the bit selecting circuit 40, which are illustrated in FIG. 1. The buffer circuit 30 includes a plurality of buffers BUF that output data RH[191:160] as data R[191:160]. The buffer circuit 31 includes a plurality of buffers BUF that output data RL[32:1] as data R[32:1].
  • The bit selecting circuit 40 includes a plurality of or-circuits OR that output, as data R, an or-logic of each bit of 127-bit data RH[159:33] and RL[159:33] of which the bit numbers overlap with each other. That is, for each bit of the data R[159:33], the logical value 1 is set in a case where either the respective bit of the data RH[159:33] or the respective bit of data RL[159:33] is the logical value 1.
  • The shift circuit 20 a includes a function, at the internal bus RH, to set 127 bits to the logical value 0 except for 32 valid bits output as the data D[63:32]. The shift circuit 20 b includes a function, at the internal bus RL, to set 127 bits to the logical value 0 except for 32 valid bits output as the data D[31:0]. Further, as illustrated in FIG. 6 and FIG. 7, valid data D is not simultaneously output by data RH and RL having same bit numbers among the data RH[159:33] and RL[159:33]. Hence, the logical value 0 is necessarily supplied to one of two input units of each or-circuit OR of the bit selecting circuit 40. Therefore, by receiving, through the or-circuits OR, the respective bits of the data RH[159:33] and RL[159:33] of which the bit numbers overlap with each other, the bit selecting circuit 40 can select valid data and output the selected data to the output bus R[159:33] without using a control signal.
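  • The combination of the buffer circuits 30 and 31 with the or-based bit selecting circuit 40 can be sketched as below. As an assumption of this model, each bus is a Python integer in which integer bit i stands for bus bit number i (bit 0 is unused); the names `merge_buses`, `MASK_HI`, `MASK_MID`, and `MASK_LO` are hypothetical.

```python
MASK_HI = ((1 << 32) - 1) << 160    # bit numbers 191..160 (buffer circuit 30)
MASK_MID = ((1 << 127) - 1) << 33   # bit numbers 159..33 (bit selecting circuit 40)
MASK_LO = ((1 << 32) - 1) << 1      # bit numbers 32..1 (buffer circuit 31)

def merge_buses(rh, rl):
    """Form the output bus R[191:1] from the internal buses RH and RL."""
    r_hi = rh & MASK_HI             # RH[191:160] is buffered to R[191:160]
    r_mid = (rh | rl) & MASK_MID    # bitwise OR of the overlapping bit numbers
    r_lo = rl & MASK_LO             # RL[32:1] is buffered to R[32:1]
    return r_hi | r_mid | r_lo
```

  • Because the invalid bits of each internal bus are 0, the OR needs no select signal: whichever bus carries valid data at a given bit number determines the output bit.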
  • FIG. 6 illustrates an example of an operation in the normal mode (SIMD=“0”) of the shift operation circuit 100, which is illustrated in FIG. 1. That is, FIG. 6 illustrates an example of a shift operation method by the shift operation circuit 100. In FIG. 6, 7-bit “*”, which indicates the shift amount signals SAH1[6:0] and SAL1[6:0], indicates that the value of each bit is set to “0” or “1”.
  • With reference to the bit RH[191], the shift circuit 20 a shifts, in accordance with the shift amount signal SAH1[6:0], the position of each bit of the data D[63:32] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RH[191:33]. With reference to the bit RL[159], the shift circuit 20 b shifts, in accordance with the shift amount signal SAL1[6:0], the position of each bit of the data D[31:0] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RL[159:1].
  • The bits used as the references by the shift circuits 20 a and 20 b differ by 32 bits. Therefore, the bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RL that is output by the shift circuit 20 b differ by 32 bits. Further, in the normal mode, the value of the shift amount signal SAH1[6:0] and the value of the shift amount signal SAL1[6:0] are equal to each other. Hence, in a shift operation of the data D[63:0], the shift operation circuit 100 can output the data D[63:32] and D[31:0] as the data R without causing the bit numbers of the data RH and the data RL to overlap. That is, it is possible to prevent the data D[63:32] and D[31:0] from collision. Further, the shift operation circuit 100 can output the data D[63:0] as the unified 64-bit data R without blank bit numbers in the data RH and the data RL.
  • The output bus R[191:1] illustrated within brackets at the lower part of FIG. 6 indicates an example of bit positions at which data D[63:0] appears in accordance with the shift amount signals SAH1 and SAL1. The sign “h” at the end of the numerical value of the shift amount signals SAH1 and SAL1 indicates that the numerical value is a hex number. Note that the value of the shift amount signal SAH1 is the same as the value of the shift amount signal SAH supplied to the shift operation circuit 100, and the value of the shift amount signal SAL1 is the same as the value of the shift amount signal SAL supplied to the shift operation circuit 100.
  • In a case where the shift amount signals SAH1 and SAL1 are “00h” (right shift by 0 bits), the data D[63:32] is output as data R[191:160], and the data D[31:0] is output as data R[159:128]. The shift circuit 20 a sets each bit of the data RH[159:33], where the data D[63:32] does not appear, to “0”. The shift circuit 20 b sets each bit of the data RL[127:1], where the data D[31:0] does not appear, to “0”. Hence, the bit selecting circuit 40 sets each bit of the data R[127:33] to “0”. The buffer circuit 31 sets each bit of the data R[32:1] to “0”.
  • In a case where the shift amount signals SAH1 and SAL1 are “19h” (right shift by 25 bits), the data D[63:32] is output as data R[166:135], and the data D[31:0] is output as data R[134:103]. The shift circuit 20 a sets each bit of the data RH[191:167] and RH[134:33], where the data D[63:32] does not appear, to “0”. The shift circuit 20 b sets each bit of the data RL[159:135] and RL[102:1], where the data D[31:0] does not appear, to “0”. Hence, the buffer circuit 30 sets each bit of the data R[191:167] to “0”. The bit selecting circuit 40 sets each bit of the data R[102:33] to “0”. The buffer circuit 31 sets each bit of the data R[32:1] to “0”.
  • In a case where the shift amount signals SAH1 and SAL1 are “6Eh” (right shift by 110 bits), the data D[63:32] is output as data R[81:50], and the data D[31:0] is output as data R[49:18]. The shift circuit 20 a sets each bit of the data RH[191:82] and RH[49:33], where the data D[63:32] does not appear, to “0”. The shift circuit 20 b sets each bit of the data RL[159:50] and RL[17:1], where the data D[31:0] does not appear, to “0”. Hence, the buffer circuit 30 sets each bit of the data R[191:160] to “0”. The bit selecting circuit 40 sets each bit of the data R[159:82] to “0”. The buffer circuit 31 sets each bit of the data R[17:1] to “0”.
  • In a case where the shift amount signals SAH1 and SAL1 are “7Fh” (right shift by 127 bits), the data D[63:32] is output as data R[64:33], and the data D[31:0] is output as data R[32:1]. The shift circuit 20 a sets each bit of the data RH[191:65], where the data D[63:32] does not appear, to “0”. The shift circuit 20 b sets each bit of the data RL[159:33], where the data D[31:0] does not appear, to “0”. Hence, the buffer circuit 30 sets each bit of the data R[191:160] to “0”. The bit selecting circuit 40 sets each bit of the data R[159:65] to “0”.
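  • The bit positions in the four cases above can be reproduced with a short behavioral model. The sketch assumes that each bus is a Python integer in which integer bit i stands for bus bit number i (bit 0 is unused); the function names `shift_normal` and `occupied` are hypothetical.

```python
def shift_normal(d, shift):
    """Model the normal mode, in which both shift circuits get the same shift amount."""
    d_hi = (d >> 32) & 0xFFFFFFFF   # D[63:32], input of shift circuit 20a
    d_lo = d & 0xFFFFFFFF           # D[31:0], input of shift circuit 20b
    rh = d_hi << (160 - shift)      # D[63] lands on RH[191 - shift]
    rl = d_lo << (128 - shift)      # D[31] lands on RL[159 - shift]
    return rh, rl

def occupied(bus):
    """Return (highest, lowest) bit numbers of the set bits of a nonzero bus value."""
    return bus.bit_length() - 1, (bus & -bus).bit_length() - 1
```

  • With all-ones input data, `occupied` reports exactly the ranges listed above, for example bit numbers 166 to 135 on RH and 134 to 103 on RL for a shift amount of “19h”. In this model the two halves never share a bit number, and together they equal the 64-bit data shifted as one unit.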
  • FIG. 7 illustrates an example of an operation in the SIMD mode (SIMD=“1”) of the shift operation circuit 100, which is illustrated in FIG. 1. That is, FIG. 7 illustrates another example of the shift operation method by the shift operation circuit 100. Detailed descriptions for an operation of FIG. 7 similar to that of FIG. 6 are omitted as appropriate. Note that in the SIMD mode, the shift amount signal SAH1[6:0] and the shift amount signal SAL1[6:0], illustrated in FIG. 1, are set independently from each other.
  • In the SIMD mode, the most significant bit SAH1[6] of the shift amount signal SAH1 is fixed to “0”, and the most significant bit SAL1[6] of the shift amount signal SAL1 is fixed to “1”. That is, in the SIMD mode, a predetermined number of bits of the shift amount signals SAH1 and SAL1 are set to logical values different from each other. Thus, with reference to the bit RH[191], the shift circuit 20 a shifts, in accordance with the shift amount signal SAH1[6:0], the position of each bit of the data D[63:32] in a range of from 0 bits to 63 bits, and outputs the shifted data as the data RH[191:97]. With reference to the bit RL[159], the shift circuit 20 b shifts, in accordance with the shift amount signal SAL1[6:0], the position of each bit of the data D[31:0] in a range of from 64 bits to 127 bits, and outputs the shifted data as the data RL[95:1]. That is, in the SIMD mode, the bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RL that is output by the shift circuit 20 b do not overlap.
  • With respect to the uppermost case within the brackets at the lower part of FIG. 7, when the shift amount signal SAH is “00h”, the shift amount signal SAH1 is also set to “00h” (right shift by 0 bits), and the data D[63:32] is output as data R[191:160]. When the shift amount signal SAL is “25h”, the shift amount signal SAL1 is set to “65h” (right shift by 101 bits), and the data D[31:0] is output as data R[58:27].
  • With respect to the central case within the brackets at the lower part of FIG. 7, when the shift amount signal SAH is “3Fh”, the shift amount signal SAH1 is also set to “3Fh” (right shift by 63 bits), and the data D[63:32] is output as data R[128:97]. When the shift amount signal SAL is “00h”, the shift amount signal SAL1 is set to “40h” (right shift by 64 bits), and the data D[31:0] is output as data R[95:64].
  • With respect to the lowermost case within the brackets at the lower part of FIG. 7, when the shift amount signal SAH is “10h”, the shift amount signal SAH1 is also set to “10h” (right shift by 16 bits), and the data D[63:32] is output as data R[175:144]. When the shift amount signal SAL is “3Fh”, the shift amount signal SAL1 is set to “7Fh” (right shift by 127 bits), and the data D[31:0] is output as data R[32:1].
  • In this way, in the SIMD mode, the data D[63:32] is output to a range of data R[191:97] and the data D[31:0] is output to a range of data R[95:1]. Hence, even when the shift amount signal SAH[6:0] and the shift amount signal SAL[6:0], which are illustrated in FIG. 1, are set independently from each other, it is possible to prevent the data D[63:32] and the data D[31:0] from collision.
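  • The three SIMD-mode cases above follow from simple arithmetic on the reference bits RH[191] and RL[159], as the following Python sketch shows. The function name `simd_ranges` is hypothetical, and the model only computes bit-number ranges rather than moving data.

```python
def simd_ranges(sah, sal):
    """Return (SAH1, SAL1, range of D[63:32] on R, range of D[31:0] on R) in SIMD mode."""
    sah1 = sah & 0x3F                   # SAH1[6] is fixed to 0: shift of 0..63 bits
    sal1 = (sal & 0x3F) | 0x40          # SAL1[6] is fixed to 1: shift of 64..127 bits
    hi = (191 - sah1, 160 - sah1)       # D[63] lands on bit 191 - SAH1
    lo = (159 - sal1, 128 - sal1)       # D[31] lands on bit 159 - SAL1
    return sah1, sal1, hi, lo
```

  • Over all 64 × 64 input combinations, the upper range stays at or above bit number 97 and the lower range stays at or below bit number 95 in this model, which is why the two data never collide.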
  • FIG. 8 illustrates an example of a case in which parity predictors that predict parity bits are built in the shift operation circuit 100, which is illustrated in FIG. 1. In FIG. 8, the same reference numerals are given to elements the same as the elements in FIG. 1 and their detailed descriptions are omitted as appropriate.
  • A shift operation circuit 100P that includes parity predictors PPa and PPb includes shift circuits 20Pa and 20Pb instead of the shift circuits 20 a and 20 b, which are illustrated in FIG. 1. Further, the shift operation circuit 100P includes buffer circuits 30P and 31P instead of the buffer circuits 30 and 31, which are illustrated in FIG. 1. Further, the shift operation circuit 100P includes a bit selecting circuit 40P instead of the bit selecting circuit 40, which is illustrated in FIG. 1. The shift operation circuit 100P receives data D[63:0] and parity bits DP[15:0], and detects an error in the data D[63:0]. Each bit of the parity bits DP[15:0] is appended per 4 bits of the data D[63:0].
  • The parity predictors PPa and PPb are respectively mounted within the shift circuits 20Pa and 20Pb. Each of the parity predictors PPa and PPb includes an exclusive OR circuit that calculates a parity bit DP for each 4-bit data D. The parity bits DP are calculated, in the shift circuits 20Pa and 20Pb, for a plurality of respective stages for sequentially shifting data. Hence, the respective parity predictors PPa and PPb account for several tens of percent of the sizes of the respective shift circuits 20Pa and 20Pb.
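  • A parity bit per 4 bits of data can be computed with an exclusive OR over each group, as in the minimal sketch below. Two points are assumptions of this sketch and are not stated in the description: the parity polarity (the bit is 1 when the group holds an odd number of 1s) and the mapping of parity bit i to data bits 4i+3 down to 4i. The function name `nibble_parity` is hypothetical, and the sketch computes parity directly from data rather than predicting it stage by stage as the parity predictors do.

```python
def nibble_parity(d, width=64):
    """Return one parity bit per 4-bit group of a width-bit value."""
    dp = 0
    for i in range(width // 4):
        nibble = (d >> (4 * i)) & 0xF
        # XOR of the four bits equals the low bit of the population count.
        parity = bin(nibble).count("1") & 1
        dp |= parity << i
    return dp
```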
  • The shift circuit 20Pa shifts, in accordance with the shift amount signal SAH1[6:0], the bit positions of the data D[63:32] to output data RH[191:33] and parity bits RPH[47:8]. Each bit of the parity bits RPH[47:8] is added per 4 bits of the data RH[191:33]. The shift circuit 20Pb shifts, in accordance with the shift amount signal SAL1[6:0], the bit positions of the data D[31:0] to output data RL[159:1] and parity bits RPL[39:0]. Each bit of the parity bits RPL[39:0] is added per 4 bits of the data RL[159:1].
  • By generating parity bits every time the shift circuit 20Pa shifts the data D[63:32], the parity predictor PPa of the shift circuit 20Pa outputs the parity bits RPH[47:8] together with the output of the data RH[191:33]. Similarly, by generating parity bits every time the shift circuit 20Pb shifts the data D[31:0], the parity predictor PPb of the shift circuit 20Pb outputs the parity bits RPL[39:0] together with the output of the data RL[159:1]. That is, the parity predictor PPa can predict the parity bits RPH[47:8] without using the data RH[191:33], and the parity predictor PPb can predict the parity bits RPL[39:0] without using the data RL[159:1].
  • The buffer circuit 30P outputs data RH[191:160] as data R[191:160], and outputs parity bits RPH[47:40] corresponding to the data RH[191:160] as parity bits RP[47:40]. The buffer circuit 31P outputs data RL[32:1] as data R[32:1], and outputs parity bits RPL[7:0] corresponding to the data RL[32:1] as parity bits RP[7:0].
  • The bit selecting circuit 40P selects valid bits from the data RH[159:33] and the data RL[159:33], and outputs the selected bits as data R[159:33]. Further, the bit selecting circuit 40P selects valid bits from parity bits RPH[39:8] and parity bits RPL[39:8], and outputs the selected bits as parity bits RP[39:8]. Note that each bit of the parity bits RP[47:0] is added per 4 bits of the data R[191:1]. The parity bits RP[47:0], which are output together with the data R[191:1], are used, in a circuit to which the data R[191:1] is supplied, to detect an error in the data R[191:1].
  • FIG. 9 illustrates an example of allocation of data and parity bits in the shift operation circuit 100P, which is illustrated in FIG. 8. Each bit of the parity bits DP[15:0] is added per 4 bits of the data D[63:0]. Each bit of the parity bits RPH[47:8] is added per 4 bits of the data RH[191:33]. Each bit of the parity bits RPL[39:0] is added per 4 bits of the data RL[159:1].
  • Each bit of the parity bits RP[47:0] is added per 4 bits of the data R[191:1]. By inserting parity bits between data bits, a wiring length of a signal for transmitting parity bits can be shortened in the shift circuits 20Pa and 20Pb relative to a case in which parity bits are not inserted between data bits.
  • FIG. 10 illustrates a shift operation circuit 102 as another example. Detailed descriptions of elements and functions of the shift operation circuit 102 similar to those of the shift operation circuit 100, which is illustrated in FIG. 1, are omitted as appropriate. The shift operation circuit 102 includes shift circuits 28 and 29, a buffer circuit 39, and a selector circuit 49. Note that in a case where parity predictors are built in the shift operation circuit 102, parity bits DP, RPH, RPL, and RP that are indicated in the brackets are appended. In the following, a case will be described in which the shift operation circuit 102 does not include parity predictors and parity bits DP, RPH, RPL, and RP are not appended.
  • During the normal mode (SIMD=“0”), the shift circuit 28 shifts, in accordance with the value of a shift amount signal SAH[6:0], the bits of 64-bit data D[63:0] from the high-order side to the low-order side, and outputs the shifted data as 191-bit data RH[191:1]. That is, the shift circuit 28 shifts the data D[63:0] to the right by the value of the shift amount signal SAH (which is a value from 0 bits to 127 bits). Further, during the SIMD mode (SIMD=“1”), the shift circuit 28 shifts, in accordance with the value of a shift amount signal SAH[5:0], the bits of 32-bit data D[63:32] from the high-order side to the low-order side, and outputs the shifted data as 95-bit data RH[191:97]. That is, the shift circuit 28 shifts the data D[63:32] to the right by the value of the shift amount signal SAH (which is a value from 0 bits to 63 bits).
  • The shift circuit 29 operates only during the SIMD mode (SIMD=“1”), and shifts, in accordance with the value of a shift amount signal SAL[5:0], the bits of 32-bit data D[31:0] from the high-order side to the low-order side, and outputs the shifted data as 95-bit data RL[95:1]. That is, the shift circuit 29 shifts the data D[31:0] to the right by the value of the shift amount signal SAL (which is a value from 0 bits to 63 bits).
  • The buffer circuit 39 outputs, as data R[191:96], the high-order 96-bit data RH[191:96] within the data RH[191:1] output from the shift circuit 28. The selector circuit 49 selects data RH[95:1] during the normal mode (SIMD=“0”), selects data RL[95:1] during the SIMD mode (SIMD=“1”), and outputs the selected data as data R[95:1].
  • In the shift operation circuit 102 illustrated in FIG. 10, the data D[31:0] is supplied to the shift circuits 28 and 29 in an overlapped manner. Because the shift circuit 28 does not shift the data D[31:0] during the SIMD mode, the shift circuit 28 has a wasted circuit that does not operate during the SIMD mode. Further, because the respective shift circuits 28 and 29 are designed independently, the designing period is longer than that of the shift circuits 20 a and 20 b, which are illustrated in FIG. 1.
  • Furthermore, the shift circuit 28 operates upon receiving the 64-bit data D[63:0], and the shift circuit 29 operates upon receiving the 32-bit data D[31:0]. Therefore, the total number of bits of the input data D is 96 bits. This number is larger than the total number of bits of the data D input to the shift circuits 20 a and 20 b (64 bits), which are illustrated in FIG. 1, by 32 bits.
  • In a case where the shift operation circuit 102 includes parity predictors, the total number of bits input to the shift circuits 28 and 29 is 120 bits, and is larger by 40 bits than the total number of bits input to the shift circuits 20Pa and 20Pb of the shift operation circuit 100P (80 bits), which is illustrated in FIG. 8.
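  • The input bit counts compared above can be checked with simple arithmetic, sketched here in Python. The function name `input_bits` is hypothetical; the parity count assumes one parity bit per 4 data bits, as described for the parity bits DP.

```python
def input_bits(data_widths, parity_group=None):
    """Total number of bits fed to a set of shift circuits.

    data_widths: data width of each shift circuit input.
    parity_group: number of data bits covered by one parity bit, or None if no parity.
    """
    total = sum(data_widths)
    if parity_group:
        total += sum(w // parity_group for w in data_widths)
    return total
```

  • For example, `input_bits([64, 32])` gives 96 for the shift circuits 28 and 29, and `input_bits([32, 32])` gives 64 for the shift circuits 20 a and 20 b; with `parity_group=4` the totals become 120 and 80.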
  • For example, a circuit scale of a shift circuit exponentially increases depending on the number of bits of input data. Hence, the circuit scale of the shift operation circuit 102, which is illustrated in FIG. 10, is larger than the circuit scale of the shift operation circuit 100, which is illustrated in FIG. 1. In a case where the shift operation circuit 102 includes parity predictors, the circuit scale of the shift operation circuit 102 is even larger than the circuit scale of the shift operation circuit 100P, which is illustrated in FIG. 8. In other words, according to the shift operation circuit 100 illustrated in FIG. 1, data D[63:32] and data D[31:0] are respectively supplied to the shift circuits 20 a and 20 b without being overlapped. Hence, it is possible to reduce the size of the shift operation circuit 100 relative to the size of the shift operation circuit 102, in which part of data D[63:0] is supplied to the plurality of shift circuits 28 and 29 in an overlapped manner.
  • As described above, according to an embodiment illustrated in FIG. 1 to FIG. 10, the data D[63:32] and D[31:0] whose bits do not overlap can be respectively supplied to the shift circuits 20 a and 20 b and the shift operation in the normal mode and the shift operation in the SIMD mode can be executed. Thereby, the total number of bits of data D supplied to the shift circuits 20 a and 20 b can be reduced relative to a case in which bits are supplied to a plurality of other shift circuits in an overlapped manner. For example, the total number of bits of data D supplied to the shift circuits 20 a and 20 b (64 bits) can be two-thirds of the total number of bits of data D supplied to the shift circuits 28 and 29 of the shift operation circuit 102 (96 bits), which is illustrated in FIG. 10. As a result, it is possible to reduce the circuit scale of the shift circuits 20 a and 20 b relative to the circuit scale of the shift circuits 28 and 29, and it is possible to reduce the circuit size of the shift operation circuit 100. Further, because the shift circuits 20 a and 20 b are circuits equal to each other, the designing period can be reduced relative to a case of independently designing both the shift circuits 28 and 29, which are illustrated in FIG. 10.
  • The reference bit positions RL[159] and RH[191] are allocated by shifting the bit width of the divided data D[63:32], and thereby the divided data D[63:32] and D[31:0] can be output, to the output bus R[191:1], as continuous data D[63:0]. In other words, in the normal mode, it is possible to prevent the data D[63:32] and D[31:0] from collision.
  • By receiving, at the or-circuits OR, the respective bits of the data RH[159:33] and RL[159:33] of which the bit numbers overlap with each other, the bit selecting circuit 40 can select valid data D and output the selected data to the output bus R[159:33] without using a control signal.
  • In the SIMD mode, the high-order bit SAH1[6] of the shift amount signal SAH1[6:0] and the high-order bit SAL1[6] of the shift amount signal SAL1[6:0] are set to logical values opposite to each other. Thus, even when the shift amount signal SAH[6:0] and the shift amount signal SAL[6:0] are set independently from each other, it is possible to prevent the data D[63:32] and the data D[31:0] from collision.
  • FIG. 11 illustrates a shift operation circuit 104 according to another embodiment. In FIG. 11, the same reference numerals are given to elements the same as or similar to the elements in FIG. 1 and their detailed descriptions are omitted as appropriate. The shift operation circuit 104 according to the embodiment includes shift control circuits 10, 13, and 14, shift circuits 20 a, 22 a, and 22 b, buffer circuits 30 and 32, and a bit selecting circuit 42. Similar to the shift operation circuit 100 that is illustrated in FIG. 1, the shift operation circuit 104 can be mounted on the adder FADD or the multiplier/adder FMA for floating-point numbers of the operation processing apparatus 200, which is illustrated in FIG. 2.
  • Note that in a case where parity predictors are built in the shift operation circuit 104, parity bits DP, RPH, RPLH, RPL, and RP that are indicated in the brackets are appended. In the following, a case will be described in which the shift operation circuit 104 does not include parity predictors and parity bits DP, RPH, RPLH, RPL, and RP are not appended.
  • A circuit configuration and functions of the shift control circuit 10 of FIG. 11 are the same as the circuit configuration and the functions of the shift control circuit 10 that is illustrated in FIG. 1. The shift control circuit 13 changes logical values of a shift amount signal SALH[6:0] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SALH1[6:0]. The shift control circuit 13 operates in a manner similar to that of the shift control circuit 11, which is illustrated in FIG. 1, except that the 2 high-order bits SALH1[6:5] of a shift amount signal SALH1[6:0] are set to “10” during the SIMD mode.
  • The shift control circuit 14 operates in a manner similar to that of the shift control circuit 11, which is illustrated in FIG. 1, except that the 2 high-order bits SAL1[6:5] of a shift amount signal SAL1[6:0] are set to “11” during the SIMD mode. In the normal mode (SIMD=“0”), the shift amount signals SAH[6:0], SALH[6:0], and SAL[6:0] are set to values equal to each other. In the SIMD mode (SIMD=“1”), the shift amount signals SAH[6:0], SALH[6:0], and SAL[6:0] are set independently from each other.
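  • The behavior stated for the shift control circuits 13 and 14 can be modeled as follows. This sketch covers only what is described above: the two high-order bits are forced during the SIMD mode. That the low-order five bits pass through unchanged, and that all seven bits pass through in the normal mode, are assumptions made by analogy with the shift control circuit 11; the function names are hypothetical.

```python
def shift_control_13(salh, simd):
    """Model of shift control circuit 13: SALH1[6:5] is forced to "10" in SIMD mode."""
    if simd:
        return 0b1000000 | (salh & 0x1F)   # shift amount of 64..95 bits
    return salh & 0x7F                     # normal mode: assumed pass-through

def shift_control_14(sal, simd):
    """Model of shift control circuit 14: SAL1[6:5] is forced to "11" in SIMD mode."""
    if simd:
        return 0b1100000 | (sal & 0x1F)    # shift amount of 96..127 bits
    return sal & 0x7F                      # normal mode: assumed pass-through
```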
  • A circuit configuration and functions of the shift circuit 20 a of FIG. 11 are the same as the circuit configuration and the functions of the shift circuit 20 a that is illustrated in FIG. 1. The shift circuit 22 a shifts, in accordance with the value of a shift amount signal SALH1, the bits of 16-bit data D[31:16] within the 64-bit data D[63:0] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RLH[159:17]. That is, the shift circuit 22 a shifts the data D[31:16] to the right by the value of the shift amount signal SALH1 (which is a value from 0 bits to 127 bits). In the following, the data transmitted to the internal bus RLH[159:17] may be also referred to as the data RLH[159:17]. The shift circuit 22 a includes a function to set 127 bits to “0” except for 16 bits output as the data D[31:16] within the 143-bit data RLH[159:17].
  • The shift circuit 22 b shifts, in accordance with the value of a shift amount signal SAL1, the bits of 16-bit data D[15:0] within the 64-bit data D[63:0] from the high-order side to the low-order side, and outputs the shifted data as 143-bit data RL[143:1]. That is, the shift circuit 22 b shifts the data D[15:0] to the right by the value of the shift amount signal SAL1 (which is a value from 0 bits to 127 bits). The shift circuit 22 b includes a function to set 127 bits to “0” except for 16 bits output as the data D[15:0] within the 143-bit data RL[143:1]. Note that because the shift circuits 22 a and 22 b are circuits equal to each other and have common circuit data (macro data), it is possible to reduce a designing period of the shift circuits 22 a and 22 b relative to a case of independently designing the shift circuits 22 a and 22 b.
  • A circuit configuration and functions of the buffer circuit 30 of FIG. 11 are the same as the circuit configuration and the functions of the buffer circuit 30 that is illustrated in FIG. 1. The buffer circuit 32 outputs, as data R[16:1], the low-order 16-bit data R[16:1] within the data RL[143:1] output from the shift circuit 22 b.
  • The bit selecting circuit 42 receives the data RH[159:33], output from the shift circuit 20 a, the data RLH[159:17], output from the shift circuit 22 a, and the data RL[143:17], output from the shift circuit 22 b. The bit selecting circuit 42 selects valid bits from the data RH[159:33], the data RLH[159:17], and the data RL[143:17], and outputs the selected bits as data R[159:17]. Within the data R[159:17], the valid bits are 32 bits at a minimum and 64 bits at a maximum.
  • FIG. 12 illustrates an example of the shift control circuits 10, 13, and 14, which are illustrated in FIG. 11. A circuit configuration and functions of the shift control circuit 10 of FIG. 12 are the same as the circuit configuration and the functions of the shift control circuit 10 that is illustrated in FIG. 4. That is, during the SIMD mode, the shift control circuit 10 outputs, in accordance with the shift amount signal SAH[5:0], the shift amount signal SAH1[6:0] that represents a shift amount of from 0 bits to 63 bits.
  • The shift control circuit 13 includes an or-circuit OR that receives a mode signal SIMD, and an and-circuit AND that receives the mode signal SIMD via an inverter IV. Further, the shift control circuit 13 includes a plurality of buffers BUF that output a shift amount signal SALH[4:0] as a shift amount signal SALH1[4:0]. Outputs of the or-circuit OR and the and-circuit AND (SALH1[6:5]) are set to “10” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 13 outputs, in accordance with the shift amount signal SALH[4:0], the shift amount signal SALH1[6:0] that represents a shift amount of from 64 bits to 95 bits.
  • The shift control circuit 14 includes or-circuits OR1 and OR2 that receive the mode signal SIMD, and the shift control circuit 14 includes a plurality of buffers BUF that output a shift amount signal SAL[4:0] as a shift amount signal SAL1[4:0]. Outputs of the or-circuits OR1 and OR2 (SAL1[6:5]) are set to “11” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 14 outputs, in accordance with the shift amount signal SAL[4:0], the shift amount signal SAL1[6:0] that represents a shift amount of from 96 bits to 127 bits.
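  • The gate-level behaviour of the shift control circuits 10, 13, and 14 described above can be sketched as follows. This is a minimal behavioural model, not the circuit of FIG. 12 itself: the function name is hypothetical, and it is an assumption that each or-circuit and and-circuit combines the mode signal SIMD with the corresponding input bit (SALH[6], SALH[5], SAL[6], SAL[5]) and that the shift control circuit 10 masks SAH[6] with the inverted mode signal.

```python
def shift_control_fig12(sah, salh, sal, simd):
    """Behavioural sketch of shift control circuits 10, 13 and 14.

    Inputs are 7-bit shift amounts and a 1-bit mode flag; the return value
    is the tuple (SAH1, SALH1, SAL1). The second input of every gate is an
    assumption: it is taken to be the matching bit of the input signal.
    """
    simd = 1 if simd else 0
    nsimd = simd ^ 1          # output of the inverter IV
    low5 = 0x1F

    def bit(value, n):
        return (value >> n) & 1

    # Circuit 10: SAH1[6] is forced to 0 in SIMD mode, SAH1[5:0] is buffered.
    sah1 = ((bit(sah, 6) & nsimd) << 6) | (sah & 0x3F)

    # Circuit 13: OR on bit 6, AND with inverted SIMD on bit 5 -> "10".
    salh1 = (((bit(salh, 6) | simd) << 6)
             | ((bit(salh, 5) & nsimd) << 5)
             | (salh & low5))

    # Circuit 14: OR1 and OR2 on bits 6 and 5 -> "11".
    sal1 = (((bit(sal, 6) | simd) << 6)
            | ((bit(sal, 5) | simd) << 5)
            | (sal & low5))

    return sah1, salh1, sal1
```

  • In this sketch, a call with SIMD=1 maps the 5-bit inputs SALH and SAL into the ranges 64 to 95 and 96 to 127, while a call with SIMD=0 returns the three inputs unchanged.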
  • FIG. 13 illustrates an example of the buffer circuits 30 and 32 and the bit selecting circuit 42, which are illustrated in FIG. 11. A circuit configuration and functions of the buffer circuit 30 of FIG. 13 are the same as the circuit configuration and the functions of the buffer circuit 30 that is illustrated in FIG. 5. The buffer circuit 32 includes a plurality of buffers BUF that output data RL[16:1] as data R[16:1].
  • The bit selecting circuit 42 includes a plurality of or-circuits OR each of which has two input units to perform OR logic on each bit of data RH and RLH corresponding to data R[159:144]. Further, the bit selecting circuit 42 includes a plurality of or-circuits OR each of which has three input units to perform OR logic on each bit of data RH, RLH, and RL corresponding to data R[143:33]. Furthermore, the bit selecting circuit 42 includes a plurality of or-circuits OR each of which has two input units to perform OR logic on each bit of data RLH and RL corresponding to data R[32:17]. That is, for each bit of the data R[159:17], the logical value 1 is set in a case where the respective bit of the data RH[159:33], the respective bit of the data RLH[159:17], or the respective bit of the data RL[143:17] is the logical value 1.
  • Each of the shift circuits 20 a, 22 a, and 22 b, which are illustrated in FIG. 11, includes a function to set bits to the logical value 0 except for valid bits. Further, as illustrated in FIG. 14 and FIG. 15, the data D[63:0] is not simultaneously output to the internal buses RH, RLH, and RL having same bit numbers. Hence, valid data D is not simultaneously supplied to the plurality of input units of each or-circuit OR of the bit selecting circuit 42. Therefore, by receiving, through the or-circuits OR, the respective bits of the data RH, RLH, and RL of which the bit numbers overlap with each other, the bit selecting circuit 42 can select valid data and output the selected data to the output bus R[159:17] without using a control signal.
  • FIG. 14 illustrates an example of an operation in the normal mode (SIMD=“0”) of the shift operation circuit 104, which is illustrated in FIG. 11. That is, FIG. 14 illustrates an example of a shift operation method by the shift operation circuit 104. Detailed descriptions for an operation of FIG. 14 similar to that of FIG. 6 are omitted as appropriate.
  • The operation of the shift circuit 20 a is the same as the operation in FIG. 6. With reference to the bit RLH[159], the shift circuit 22 a shifts, in accordance with the shift amount signal SALH1[6:0], the position of each bit of the data D[31:16] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RLH[159:17]. With reference to the bit RL[143], the shift circuit 22 b shifts, in accordance with the shift amount signal SAL1[6:0], the position of each bit of the data D[15:0] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RL[143:1].
  • The bit range of the data RH that is output by the shift circuit 20 a and the bit range of the data RLH that is output by the shift circuit 22 a differ by 32 bits. The bit range of the data RLH that is output by the shift circuit 22 a and the bit range of the data RL that is output by the shift circuit 22 b differ by 16 bits. Further, in the normal mode (SIMD=“0”), the shift amount signals SAH[6:0], SALH[6:0], and SAL[6:0] are set to values equal to each other. Hence, in a shift operation of the data D[63:0], the shift operation circuit 104 can output the data D[63:32], D[31:16], and D[15:0] as the data R without causing the bit numbers of the data RH, RLH, and RL to overlap with each other. Further, the shift operation circuit 104 can output the data R without making blank bit numbers in data RH, RLH, and RL.
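  • The normal-mode argument above (reference bits offset by the width of the divided data, one common shift amount, and an OR merge) can be checked with a small behavioural model. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, a bus is modelled as a Python integer in which bus bit number n is stored at integer bit n, and a single OR of the three buses stands in for the bit selecting circuit 42 together with the buffer circuits 30 and 32.

```python
MASK64 = (1 << 64) - 1

def place(chunk, width, ref_bit, shift):
    """Put `chunk` on a bus so that its MSB sits at bit number ref_bit - shift.

    Bus bit number n is stored at Python integer bit n (bit 0 is unused),
    and every other bus bit is 0, as the shift circuits guarantee.
    """
    lsb = ref_bit - shift - width + 1
    return (chunk & ((1 << width) - 1)) << lsb

def normal_mode_fig14(d, shift):
    """Model of FIG. 14: one common 7-bit shift amount, three shift circuits."""
    rh = place(d >> 32, 32, 191, shift)               # shift circuit 20a, D[63:32]
    rlh = place((d >> 16) & 0xFFFF, 16, 159, shift)   # shift circuit 22a, D[31:16]
    rl = place(d & 0xFFFF, 16, 143, shift)            # shift circuit 22b, D[15:0]
    # Valid data never meets on the same bit number, so a plain OR selects it.
    assert (rh & rlh) == 0 and (rh & rl) == 0 and (rlh & rl) == 0
    return rh | rlh | rl
```

  • In this model the merged result for a shift amount s equals the 64-bit input moved as one contiguous block to R[191-s:128-s], which is the "no overlap, no blank bit numbers" property stated above.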
  • The data R[191:1] illustrated within brackets at the lower part of FIG. 14 indicates an example of bit positions at which data D[63:0] appears in accordance with the shift amount signals SAH1, SALH1, and SAL1. The bit positions at which the data D[63:0] appears are similar to those in FIG. 6.
  • FIG. 15 illustrates an example of an operation in the SIMD mode (SIMD=“1”) of the shift operation circuit 104, which is illustrated in FIG. 11. That is, FIG. 15 illustrates another example of the shift operation method by the shift operation circuit 104. Detailed descriptions for an operation of FIG. 15 similar to that of FIG. 7 are omitted as appropriate. Note that in the SIMD mode, the shift amount signals SAH1[6:0], SALH1[6:0], and SAL1[6:0], which are illustrated in FIG. 11, are set independently from each other.
  • In the SIMD mode, the most significant bit SAH1[6] of the shift amount signal SAH1[6:0] is fixed to “0”, and the high-order bits SALH1[6:5] of the shift amount signal SALH1[6:0] are fixed to “10”. Further, the high-order bits SAL1[6:5] of the shift amount signal SAL1[6:0] are fixed to “11”. That is, in the SIMD mode, the 2 high-order bits of the shift amount signals SAH1, SALH1, and SAL1 are set to logical values different from each other.
  • The shift circuit 20 a operates in a manner similar to that in FIG. 7. That is, with reference to the bit RH[191], the shift circuit 20 a shifts, in accordance with the shift amount signal SAH1[6:0], the position of each bit of the data D[63:32] in a range of from 0 bits to 63 bits, and outputs the shifted data as the data RH[191:97].
  • With reference to the bit RLH[159], the shift circuit 22 a shifts, in accordance with the shift amount signal SALH1[6:0], the position of each bit of the data D[31:16] in a range of from 64 bits to 95 bits, and outputs the shifted data as the data RLH[95:49]. With reference to the bit RL[143], the shift circuit 22 b shifts, in accordance with the shift amount signal SAL1[6:0], the position of each bit of the 16-bit data D[15:0] in a range of from 96 bits to 127 bits, and outputs the shifted data as the data RL[47:1]. That is, in the SIMD mode, the bit range of the data RH that is output by the shift circuit 20 a, the bit range of the data RLH that is output by the shift circuit 22 a, and the bit range of the data RL that is output by the shift circuit 22 b do not overlap.
  • With respect to the upper case within the brackets at the lower part of FIG. 15, when the shift amount signal SAH is “00h”, the shift amount signal SAH1 is also set to “00h” (right shift by 0 bits), and the data D[63:32] is output as data R[191:160]. When the shift amount signal SALH is “00h”, the shift amount signal SALH1 is set to “40h” (right shift by 64 bits), and the data D[31:16] is output as data R[95:80]. When the shift amount signal SAL is “00h”, the shift amount signal SAL1 is set to “60h” (right shift by 96 bits), and the data D[15:0] is output as data R[47:32].
  • With respect to the lower case within the brackets at the lower part of FIG. 15, when the shift amount signal SAH is “3Fh”, the shift amount signal SAH1 is also set to “3Fh” (right shift by 63 bits), and the data D[63:32] is output as data R[128:97]. When the shift amount signal SALH is “1Fh”, the shift amount signal SALH1 is set to “5Fh” (right shift by 95 bits), and the data D[31:16] is output as data R[64:49]. When the shift amount signal SAL is “1Fh”, the shift amount signal SAL1 is set to “7Fh” (right shift by 127 bits), and the data D[15:0] is output as data R[16:1].
  • In this way, in the SIMD mode, the data D[63:32] is output to a range of data R[191:97], the data D[31:16] is output to a range of data R[95:49], and the data D[15:0] is output to a range of data R[47:1]. Hence, even when the shift amount signals SAH[6:0], SALH[6:0], and SAL[6:0], which are illustrated in FIG. 11, are set independently from each other, it is possible to prevent the data D[63:32], D[31:16], and D[15:0] from collision.
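  • The SIMD-mode bit positions quoted for FIG. 15 follow from simple arithmetic: the most significant bit of each divided data lands at (reference bit - effective shift amount). A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the effective shift amounts are formed directly from the fixed high-order bits described above rather than from a gate-level model.

```python
def simd_ranges_fig15(sah, salh, sal):
    """Return [(msb, lsb), ...] of the R bits holding D[63:32], D[31:16], D[15:0]."""
    lanes = [
        # (reference bit, data width, effective shift amount in SIMD mode)
        (191, 32, sah & 0x3F),            # SAH1  = 0b0xxxxxx ->  0..63
        (159, 16, 0x40 | (salh & 0x1F)),  # SALH1 = 0b10xxxxx -> 64..95
        (143, 16, 0x60 | (sal & 0x1F)),   # SAL1  = 0b11xxxxx -> 96..127
    ]
    return [(ref - sa, ref - sa - width + 1) for ref, width, sa in lanes]
```

  • Sweeping every legal input through this helper keeps the three results inside R[191:97], R[95:49], and R[47:1] respectively, so the three windows cannot meet regardless of how the shift amounts are chosen.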
  • As described above, it is also possible to obtain, from the embodiment illustrated in FIG. 11 to FIG. 15, effects similar to those of the embodiments illustrated in FIG. 1 to FIG. 10. For example, it is possible to reduce the circuit size of the shift operation circuit 104 relative to the circuit size of another shift operation circuit including a plurality of shift circuits to which bits are supplied in an overlapped manner.
  • The reference bit positions of the internal buses RH, RLH, and RL, which are respectively coupled to the three shift circuits 20 a, 22 a, and 22 b, are shifted by the bit width of divided data D, and thereby data D can be prevented from collision in the normal mode. By receiving, through the or-circuits OR each of which has two or three input units, the respective bits of the data RH, RLH, and RL of which the bit numbers overlap with each other, the bit selecting circuit 42 can select valid data D without using a control signal. In the SIMD mode, by making logical values of the high-order two bits of the shift amount signals SAH1, SALH1, and SAL1 different from each other, it is possible to prevent the data D[63:32], the data D[31:16], and the data D[15:0] from collision.
  • FIG. 16 illustrates a shift operation circuit 106 according to another embodiment. In FIG. 16, the same reference numerals are given to elements the same as or similar to the elements in FIG. 1 and FIG. 11 and their detailed descriptions are omitted as appropriate. The shift operation circuit 106 according to the embodiment includes shift control circuits 15, 16, 13, and 14, shift circuits 22 c, 22 d, 22 a, and 22 b, buffer circuits 33 and 32, and a bit selecting circuit 44.
  • Similar to the shift operation circuit 100 that is illustrated in FIG. 1, the shift operation circuit 106 can be mounted on the adder FADD or the multiplier/adder FMA for floating-point numbers of the operation processing apparatus 200, which is illustrated in FIG. 2. In this case, in a SIMD operation, data (operands) divided into four are used to execute the operation in parallel.
  • Note that in a case where parity predictors are built in the shift operation circuit 106, parity bits DP, RPH, RPHH, RPLH, RPL, and RP that are indicated in the brackets are appended. In the following, a case will be described in which the shift operation circuit 106 does not include parity predictors and the parity bits DP, RPH, RPHH, RPLH, RPL, and RP are not appended.
  • The shift control circuit 15 changes logical values of a shift amount signal SAH[6:0] in accordance with a mode signal SIMD, and outputs the changed signal as a shift amount signal SAH1[6:0]. The shift control circuit 16 changes logical values of a shift amount signal SAHH[6:0] in accordance with the mode signal SIMD, and outputs the changed signal as a shift amount signal SAHH1[6:0]. A circuit configuration and functions of the shift control circuit 13 of FIG. 16 are the same as the circuit configuration and the functions of the shift control circuit 13 that is illustrated in FIG. 11, and a circuit configuration and functions of the shift control circuit 14 of FIG. 16 are the same as the circuit configuration and the functions of the shift control circuit 14 that is illustrated in FIG. 11.
  • The shift circuits 22 c, 22 d, 22 a, and 22 b have circuit configurations the same as those of the shift circuits 22 a and 22 b, which are illustrated in FIG. 11. Hence, it is possible to have common circuit data (macro data) in the shift circuits 22 c, 22 d, 22 a, and 22 b, and it is possible to reduce the designing period of the shift circuits 22 c, 22 d, 22 a, and 22 b relative to a case of independently designing the shift circuits 22 c, 22 d, 22 a, and 22 b. An operation of the shift circuit 22 a of FIG. 16 is the same as the operation of the shift circuit 22 a that is illustrated in FIG. 11, and an operation of the shift circuit 22 b of FIG. 16 is the same as the operation of the shift circuit 22 b that is illustrated in FIG. 11.
  • The shift circuit 22 c shifts, in accordance with the value of a shift amount signal SAH1, the bits of 16-bit data D[63:48] within the 64-bit data D[63:0] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RH[191:49]. That is, the shift circuit 22 c shifts the data D[63:48] to the right by the value of the shift amount signal SAH1 (which is a value from 0 bits to 127 bits).
  • The shift circuit 22 d shifts, in accordance with the value of a shift amount signal SAHH1, the bits of 16-bit data D[47:32] within the 64-bit data D[63:0] from the high-order side to the low-order side, and outputs the shifted data to the 143-bit internal bus RHH[175:33]. That is, the shift circuit 22 d shifts the data D[47:32] to the right by the value of the shift amount signal SAHH1 (which is a value from 0 bits to 127 bits). In the following, data transmitted to the internal bus RHH[175:33] may be also referred to as the data RHH[175:33].
  • A circuit configuration and functions of the buffer circuit 32 of FIG. 16 are the same as the circuit configuration and the functions of the buffer circuit 32 that is illustrated in FIG. 11. The buffer circuit 33 has a circuit configuration the same as that of the buffer circuit 32. The buffer circuit 33 outputs, as data R[191:176], the high-order 16-bit data RH[191:176] within the data RH[191:49] output from the shift circuit 22 c.
  • The bit selecting circuit 44 receives the data RH[175:49], output from the shift circuit 22 c, and the data RHH[175:33], output from the shift circuit 22 d. Further, the bit selecting circuit 44 receives the data RLH[159:17], output from the shift circuit 22 a, and the data RL[143:17], output from the shift circuit 22 b. The bit selecting circuit 44 selects valid bits from the data RH[175:49], the data RHH[175:33], the data RLH[159:17], and the data RL[143:17], and outputs the selected bits as data R[175:17]. Within the data R[175:17], the valid bits are 48 bits at a minimum and 64 bits at a maximum.
  • FIG. 17 illustrates an example of the shift control circuits 15, 16, 13, and 14, which are illustrated in FIG. 16. A circuit configuration and functions of the shift control circuit 13 of FIG. 17 are the same as the circuit configuration and the functions of the shift control circuit 13 that is illustrated in FIG. 11, and a circuit configuration and functions of the shift control circuit 14 of FIG. 17 are the same as the circuit configuration and the functions of the shift control circuit 14 that is illustrated in FIG. 11.
  • The shift control circuit 15 includes and-circuits AND1 and AND2 that receive a mode signal SIMD via an inverter IV, and a plurality of buffers BUF that output a shift amount signal SAH[4:0] as a shift amount signal SAH1[4:0]. Outputs of the and-circuits AND1 and AND2 (SAH1[6:5]) are set to “00” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 15 outputs, in accordance with the shift amount signal SAH[4:0], the shift amount signal SAH1[6:0] that represents a shift amount of from 0 bits to 31 bits.
  • The shift control circuit 16 includes an and-circuit AND that receives the mode signal SIMD via an inverter and an or-circuit OR that receives the mode signal SIMD. Further, the shift control circuit 16 includes a plurality of buffers BUF that output a shift amount signal SAHH[4:0] as a shift amount signal SAHH1[4:0]. Outputs of the and-circuit AND and the or-circuit OR (SAHH1[6:5]) are set to “01” during the SIMD mode. That is, during the SIMD mode, the shift control circuit 16 outputs, in accordance with the shift amount signal SAHH[4:0], the shift amount signal SAHH1[6:0] that represents a shift amount of from 32 bits to 63 bits.
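  • Taken together, the four shift control circuits 15, 16, 13, and 14 amount to stamping a fixed 2-bit pattern into bits [6:5] of each shift amount during the SIMD mode and passing the signal through otherwise. A compact sketch of that behaviour follows; the function and helper names are hypothetical, and the gates are abstracted into a per-circuit constant instead of being modelled one by one.

```python
def shift_control_fig17(sah, sahh, salh, sal, simd):
    """Sketch of shift control circuits 15, 16, 13 and 14 (7-bit signals)."""
    def lane(sa, high_bits):
        if not simd:
            return sa & 0x7F                     # normal mode: all 7 bits pass through
        return (high_bits << 5) | (sa & 0x1F)    # SIMD mode: fixed [6:5], buffered [4:0]

    return (
        lane(sah, 0b00),   # circuit 15: AND1/AND2 force "00" ->  0..31
        lane(sahh, 0b01),  # circuit 16: AND/OR force "01"    -> 32..63
        lane(salh, 0b10),  # circuit 13: OR/AND force "10"    -> 64..95
        lane(sal, 0b11),   # circuit 14: OR1/OR2 force "11"   -> 96..127
    )
```

  • Because the four patterns “00”, “01”, “10”, and “11” differ, the four effective shift amounts always fall into four disjoint 32-value ranges in the SIMD mode.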
  • FIG. 18 illustrates an example of the buffer circuits 33 and 32 and the bit selecting circuit 44, which are illustrated in FIG. 16. A circuit configuration and functions of the buffer circuit 32 of FIG. 18 are the same as the circuit configuration and the functions of the buffer circuit 32 that is illustrated in FIG. 13. The buffer circuit 33 includes a plurality of buffers BUF that output data RH[191:176] as data R[191:176].
  • The bit selecting circuit 44 includes a plurality of or-circuits OR each of which has two input units to perform OR logic on each bit of data RH and RHH corresponding to data R[175:160]. Further, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has three input units to perform OR logic on each bit of data RH, RHH, and RLH corresponding to data R[159:144]. Furthermore, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has four input units to perform OR logic on each bit of data RH, RHH, RLH, and RL corresponding to data R[143:49].
  • Further, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has three input units to perform OR logic on each bit of data RHH, RLH, and RL corresponding to data R[48:33]. Furthermore, the bit selecting circuit 44 includes a plurality of or-circuits OR each of which has two input units to perform OR logic on each bit of data RLH and RL corresponding to data R[32:17]. That is, for each bit of the data R[175:17], the logical value 1 is set in a case where the respective bit of the data RH[175:49], the respective bit of the data RHH[175:33], the respective bit of the data RLH[159:17], or the respective bit of the data RL[143:17] is the logical value 1.
  • Each of the shift circuits 22 c, 22 d, 22 a, and 22 b, which are illustrated in FIG. 16, includes a function to set bits to the logical value 0 except for valid bits. Further, as illustrated in FIG. 19 and FIG. 20, the data D[63:0] is not simultaneously output to the internal buses RH, RHH, RLH, and RL having same bit numbers. Hence, valid data D is not simultaneously supplied to a plurality of input units of each or-circuit OR of the bit selecting circuit 44. Therefore, by receiving, through the or-circuits OR, the respective bits of the data RH, RHH, RLH, and RL of which the bit numbers overlap with each other, the bit selecting circuit 44 can select valid data and output the selected data to the output bus R[175:17] without using a control signal.
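  • The number of input units of each or-circuit in the bit selecting circuit 44 follows directly from which internal buses cover a given bit number. The sketch below derives the grouping from the bus ranges stated for FIG. 16 (RH[191:49], RHH[175:33], RLH[159:17], RL[143:1]); the helper name and the dictionary constant are hypothetical and serve only as an illustration.

```python
from itertools import groupby

def or_fan_in(bus_ranges, out_hi, out_lo):
    """Group output bit numbers by the set of internal buses that cover them.

    bus_ranges maps a bus name to its (msb, lsb) bit numbers. Returns a list
    of (msb, lsb, names) runs, walking from out_hi down to out_lo.
    """
    def covering(n):
        return tuple(name for name, (hi, lo) in bus_ranges.items() if lo <= n <= hi)

    runs = []
    for names, bits in groupby(range(out_hi, out_lo - 1, -1), key=covering):
        bits = list(bits)
        runs.append((bits[0], bits[-1], names))
    return runs

# Internal bus ranges of the four-way embodiment (FIG. 16).
FIG18_BUSES = {"RH": (191, 49), "RHH": (175, 33), "RLH": (159, 17), "RL": (143, 1)}
```

  • Calling or_fan_in(FIG18_BUSES, 175, 17) yields five runs with two, three, four, three, and two covering buses, matching the or-circuit groups listed above. The same helper applied to the three bus ranges of FIG. 11 gives the two-, three-, and two-input groups of the bit selecting circuit 42.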
  • FIG. 19 illustrates an example of an operation in the normal mode (SIMD=“0”) of the shift operation circuit 106, which is illustrated in FIG. 16. That is, FIG. 19 illustrates an example of a shift operation method by the shift operation circuit 106. Detailed descriptions for an operation of FIG. 19 similar to that of FIG. 6 and FIG. 14 are omitted as appropriate.
  • With reference to the bit RH[191], the shift circuit 22 c shifts, in accordance with the shift amount signal SAH1[6:0], the position of each bit of the data D[63:48] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RH[191:49]. With reference to the bit RHH[175], the shift circuit 22 d shifts, in accordance with the shift amount signal SAHH1[6:0], the position of each bit of the data D[47:32] in a range of from 0 bits to 127 bits, and outputs the shifted data as the data RHH[175:33]. Operations of the shift circuits 22 a and 22 b are the same as those in FIG. 14.
  • The bit range of the data RH that is output by the shift circuit 22 c and the bit range of the data RHH that is output by the shift circuit 22 d differ by 16 bits. The bit range of the data RHH that is output by the shift circuit 22 d and the bit range of the data RLH that is output by the shift circuit 22 a differ by 16 bits. The bit range of the data RLH that is output by the shift circuit 22 a and the bit range of the data RL that is output by the shift circuit 22 b differ by 16 bits. Further, in the normal mode (SIMD=“0”), the shift amount signals SAH[6:0], SAHH[6:0], SALH[6:0], and SAL[6:0] are set to values equal to each other. Hence, in a shift operation, the shift operation circuit 106 can output the data D[63:48], D[47:32], D[31:16], and D[15:0] as the data R without causing the bit numbers of the data RH, RHH, RLH, and RL to overlap with each other. Further, the shift operation circuit 106 can output the data R without making blank bit numbers of data RH, RHH, RLH, and RL.
  • FIG. 20 illustrates an example of an operation in the SIMD mode (SIMD=“1”) of the shift operation circuit 106, which is illustrated in FIG. 16. That is, FIG. 20 illustrates another example of the shift operation method by the shift operation circuit 106. Detailed descriptions for an operation of FIG. 20 similar to that of FIG. 7 and FIG. 15 are omitted as appropriate. Note that in the SIMD mode, the shift amount signals SAH1[6:0], SAHH1[6:0], SALH1[6:0], and SAL1[6:0], which are illustrated in FIG. 16, are set independently from each other.
  • In the SIMD mode, the high-order bits SAH1[6:5] of the shift amount signal SAH1[6:0] are fixed to “00”, and the high-order bits SAHH1[6:5] of the shift amount signal SAHH1[6:0] are fixed to “01”. Further, the high-order bits SALH1[6:5] of the shift amount signal SALH1[6:0] are fixed to “10”, and the high-order bits SAL1[6:5] of the shift amount signal SAL1[6:0] are fixed to “11”. That is, in the SIMD mode, the 2 high-order bits of the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are set to logical values different from each other.
  • With reference to the bit RH[191], the shift circuit 22 c shifts, in accordance with the shift amount signal SAH1[6:0], the position of each bit of the data D[63:48] in a range of from 0 bits to 31 bits, and outputs the shifted data as the data RH[191:145]. With reference to the bit RHH[175], the shift circuit 22 d shifts, in accordance with the shift amount signal SAHH1[6:0], the position of each bit of the data D[47:32] in a range of from 32 bits to 63 bits, and outputs the shifted data as the data RHH[143:97]. Operations of the shift circuits 22 a and 22 b of FIG. 20 are the same as the operations of the shift circuits 22 a and 22 b that are illustrated in FIG. 15.
  • As illustrated in FIG. 20, in the SIMD mode, the data D[63:48] is output to a range of data R[191:145], and the data D[47:32] is output to a range of data R[143:97]. The data D[31:16] is output to a range of data R[95:49], and the data D[15:0] is output to a range of data R[47:1]. That is, the bit ranges of the data RH, RHH, RLH, and RL that are output by the shift circuits 22 c, 22 d, 22 a, and 22 b do not overlap with each other. Hence, even when the shift amount signals SAH, SAHH, SALH, and SAL are set independently from each other, it is possible to prevent the data D[63:48], D[47:32], D[31:16], and D[15:0] from collision.
  • FIG. 21 illustrates an example of a shift operation of the shift operation circuit 106, which is illustrated in FIG. 16. In the normal mode, for example, when the shift amount signals SAH, SAHH, SALH, and SAL are “00h”, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are also set to “00h” (right shift by 0 bits). In this case, the data D[63:48], D[47:32], D[31:16], and D[15:0] are output as R[191:128]. When the shift amount signals SAH, SAHH, SALH, and SAL are “19h”, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are also set to “19h” (right shift by 25 bits). In this case, the data D[63:48], D[47:32], D[31:16], and D[15:0] are output as R[166:103].
  • When the shift amount signals SAH, SAHH, SALH, and SAL are “6Eh”, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are also set to “6Eh” (right shift by 110 bits). In this case, the data D[63:48], D[47:32], D[31:16], and D[15:0] are output as R[81:18]. When the shift amount signals SAH, SAHH, SALH, and SAL are “7Fh”, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are also set to “7Fh” (right shift by 127 bits). In this case, the data D[63:48], D[47:32], D[31:16], and D[15:0] are output as R[64:1].
  • Conversely, in the SIMD mode, for example, the shift amount signals SAH, SAHH, SALH, and SAL are set to “00h”, “1Fh”, “00h”, and “1Fh”. In this case, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are set to “00h”, “3Fh”, “40h”, and “7Fh”. In this case, the data D[63:48] is output as R[191:176], and the data D[47:32] is output as R[112:97]. In this case, the data D[31:16] is output as R[95:80], and the data D[15:0] is output as R[16:1].
  • Further, in the SIMD mode, for example, the shift amount signals SAH, SAHH, SALH, and SAL are set to “1Fh”, “00h”, “1Fh”, and “00h”. In this case, the shift amount signals SAH1, SAHH1, SALH1, and SAL1 are set to “1Fh”, “20h”, “5Fh”, and “60h”. In this case, the data D[63:48] is output as R[160:145], and the data D[47:32] is output as R[143:128]. In this case, the data D[31:16] is output as R[64:49], and the data D[15:0] is output as R[47:32].
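  • The FIG. 21 examples above can be reproduced with a bit-level behavioural model of the four-way circuit. The following is a sketch under stated assumptions, not the circuit itself: the names are hypothetical, the output bus is modelled as a Python integer in which bus bit number n is stored at integer bit n, and one OR per lane stands in for the buffer circuits 33 and 32 and the bit selecting circuit 44.

```python
REFS = (191, 175, 159, 143)  # reference bits of buses RH, RHH, RLH, RL

def shift_operation_106(d, shifts, simd):
    """Return R[191:1] as an int (bus bit n stored at integer bit n).

    d is the 64-bit input, shifts is (SAH, SAHH, SALH, SAL).
    """
    r = 0
    for lane, (ref, sa) in enumerate(zip(REFS, shifts)):
        chunk = (d >> (48 - 16 * lane)) & 0xFFFF   # D[63:48], D[47:32], D[31:16], D[15:0]
        # SIMD mode fixes bits [6:5] to the lane number; normal mode passes 7 bits.
        sa1 = (lane << 5) | (sa & 0x1F) if simd else sa & 0x7F
        lsb = ref - sa1 - 15                       # chunk occupies R[ref - sa1 : lsb]
        assert (r & (0xFFFF << lsb)) == 0          # 16-bit windows never collide
        r |= chunk << lsb                          # OR merge of the valid bits
    return r

def field(r, msb, lsb):
    """Read back R[msb:lsb] from the modelled bus."""
    return (r >> lsb) & ((1 << (msb - lsb + 1)) - 1)
```

  • For instance, with the SIMD-mode shift amounts “1Fh”, “00h”, “1Fh”, and “00h”, reading back field(r, 160, 145), field(r, 143, 128), field(r, 64, 49), and field(r, 47, 32) from the model returns the four 16-bit parts of the input in order, in agreement with the positions stated above.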
  • Note that by making the mode signal SIMD illustrated in FIG. 16 into 2 bits, the operation processing apparatus 200, in which the shift operation circuit 106 is mounted on an arithmetic unit, can execute both a SIMD operation for 32-bit data (divided into two) and a SIMD operation for 16-bit data (divided into four). In this case, during a first SIMD mode for executing a two-divisional SIMD operation, the shift control circuits 15 and 16 set the most significant bits SAH1[6] and SAHH1[6] of shift amount signals SAH1[6:0] and SAHH1[6:0] to the logical value 0. The shift control circuits 13 and 14 set the most significant bits SALH1[6] and SAL1[6] of shift amount signals SALH1[6:0] and SAL1[6:0] to the logical value 1. Thereby, the shift operation circuit 106 operates in a manner similar to that in FIG. 7. A second SIMD mode for executing a four-divisional SIMD operation is the same as that in FIG. 20.
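  • The 2-bit mode signal described above can be sketched as a three-way selection applied to each shift amount. The encoding of the mode values is not given in the text, so the constants below are hypothetical, as are the function name and the assumption that bits [5:0] are simply buffered in the two-divisional mode.

```python
NORMAL, SIMD2, SIMD4 = 0, 1, 2  # hypothetical encoding of the 2-bit mode signal

def shift_control_two_bit_mode(shifts, mode):
    """shifts = (SAH, SAHH, SALH, SAL); returns (SAH1, SAHH1, SALH1, SAL1)."""
    out = []
    for lane, sa in enumerate(shifts):
        if mode == NORMAL:
            out.append(sa & 0x7F)                  # all 7 bits pass through
        elif mode == SIMD2:
            msb = 0 if lane < 2 else 1             # bit 6: 0 for SAH1/SAHH1, 1 for SALH1/SAL1
            out.append((msb << 6) | (sa & 0x3F))   # assumed: bits [5:0] are buffered
        elif mode == SIMD4:
            out.append((lane << 5) | (sa & 0x1F))  # bits [6:5] = 00, 01, 10, 11
        else:
            raise ValueError("unknown mode")
    return tuple(out)
```

  • In the two-divisional mode of this sketch, the two upper lanes share the range 0 to 63 and the two lower lanes share the range 64 to 127, so each pair behaves as one 32-bit lane when it is driven with a common shift amount.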
  • As described above, it is also possible to obtain, from the embodiment illustrated in FIG. 16 to FIG. 21, effects similar to those of the embodiments illustrated in FIG. 1 to FIG. 15. For example, it is possible to reduce the circuit size of the shift operation circuit 106 relative to the circuit size of another shift operation circuit including a plurality of shift circuits to which bits are supplied in an overlapped manner.
  • The reference bit positions of the internal buses RH, RHH, RLH, and RL, which are respectively coupled to the four shift circuits 22 c, 22 d, 22 a, and 22 b, are shifted by the bit width of divided data D, and thereby data D can be prevented from collision in the normal mode. By receiving, through the or-circuits OR each of which has two input units, three input units or four input units, the respective bits of the data RH, RHH, RLH, and RL of which the bit numbers overlap with each other, the bit selecting circuit 44 can select valid data D without using a control signal. In the SIMD mode, by making logical values of the high-order two bits of the shift amount signals SAH1, SAHH1, SALH1, and SAL1 different from each other, it is possible to prevent the data D[63:48], D[47:32], D[31:16], and D[15:0] from collision. Further, according to the embodiment illustrated in FIG. 16 to FIG. 21, by making the mode signal SIMD into 2 bits, the shift operation circuit 106 can execute a two-divisional SIMD operation or a four-divisional SIMD operation.
  • All examples and conditional language provided herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventors to further the art, and are not to be construed as limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.

Claims (6)

What is claimed is:
1. A shift operation circuit comprising:
a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping, each of the plurality of shift circuits being configured to receive corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data and to receive a corresponding shift amount signal that is one of a plurality of shift amount signals, each of the plurality of shift circuits being configured to output the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus;
a shift control circuit configured to receive, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and to output, as the corresponding shift amount signal, the received plurality of shift amount signals to each of the plurality of shift circuits, and the shift control circuit being configured to receive, during a second mode, a shift amount signal for each of the plurality of shift circuits, convert the received shift amount signal into a corresponding shift amount signal that represents a shift range whose bit numbers do not overlap in the plurality of internal buses, and to output the corresponding shift amount signal to each of the plurality of shift circuits; and
a bit selecting circuit configured to select valid corresponding divided data from bits whose bit numbers overlap in the plurality of internal buses and configured to output the selected corresponding divided data to an output bus.
2. The shift operation circuit according to claim 1, wherein the reference bit position in each of the plurality of respective internal buses is allocated by shifting a bit width of the corresponding divided data.
3. The shift operation circuit according to claim 1,
wherein each of the plurality of shift circuits includes a function to set one or more bits, which do not output the corresponding divided data in the corresponding internal bus, to a logical value 0, and
wherein the bit selecting circuit includes a plurality of or-circuits having input units coupled to bits whose bit numbers overlap in the plurality of internal buses.
4. The shift operation circuit according to claim 1, wherein during the second mode, the shift control circuit sets a predetermined number of high-order bits in the corresponding shift amount signal output to each of the plurality of shift circuits to logical values different from each other.
5. The shift operation circuit according to claim 1, further comprising:
a buffer circuit configured to output, to the output bus, corresponding divided data output to bits whose bit numbers do not overlap in the plurality of internal buses.
6. A shift operation method for a shift operation circuit including a plurality of shift circuits each of which is coupled to a corresponding internal bus that is one of a plurality of internal buses having a bit width greater than a bit width of input data, a part of bit numbers of the plurality of internal buses overlapping, the shift operation method comprising:
receiving, by each of the plurality of shift circuits, corresponding divided data that is one of a plurality of sets of divided data obtained by dividing the input data;
receiving, by each of the plurality of shift circuits, a corresponding shift amount signal that is one of a plurality of shift amount signals;
outputting, by each of the plurality of shift circuits, the corresponding divided data to a range shifted based on a shift amount represented by the corresponding shift amount signal from a reference bit position in the corresponding internal bus;
receiving, by a shift control circuit included in the shift operation circuit, during a first mode, each of a plurality of shift amount signals whose shift amounts are common and outputting, as the corresponding shift amount signal, the received plurality of shift amount signals to each of the plurality of shift circuits;
receiving, by the shift control circuit, during a second mode, a shift amount signal for each of the plurality of shift circuits, converting the received shift amount signal into a corresponding shift amount signal that represents a shift range whose bit numbers do not overlap in the plurality of internal buses, and outputting the corresponding shift amount signal to each of the plurality of shift circuits;
selecting, by a bit selecting circuit included in the shift operation circuit, valid corresponding divided data from bits whose bit numbers overlap in the plurality of internal buses; and
outputting, by the bit selecting circuit, the selected corresponding divided data to an output bus.
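
The data flow recited in the claims above can be illustrated with a small behavioral sketch. It is not the claimed circuit or any embodiment from the specification; it is a software model under stated assumptions. The function names (`shift_lane`, `shift_operation`), the two-lane, 4-bit-per-lane configuration, the left-shift direction, and the truncation of each internal bus to the output width are all hypothetical choices made for illustration. In particular, the mask applied in the second mode is only a stand-in for the shift amount conversion of the shift control circuit, whose actual encoding is not reproduced here. The sketch shows each lane placing its divided data at an offset from a per-lane reference bit position on a zero-filled bus (as in claims 2 and 3), and an OR merge acting as the bit selecting circuit.

```python
def shift_lane(data, ref, shift, bus_width):
    """Model one shift circuit: place `data` at bit (ref + shift) on a
    zero-filled internal bus that is `bus_width` bits wide."""
    return (data << (ref + shift)) & ((1 << bus_width) - 1)


def shift_operation(x, shifts, lanes=2, lane_width=4, per_lane_mode=False):
    """Behavioral sketch of the two modes (assumed left shift).

    per_lane_mode=False: first mode, all entries of `shifts` are the same
                         value and the whole word is shifted.
    per_lane_mode=True:  second mode, each lane is shifted independently.
    """
    width = lanes * lane_width
    lane_mask = (1 << lane_width) - 1
    out = 0
    for i in range(lanes):
        part = (x >> (i * lane_width)) & lane_mask  # divided data for lane i
        ref = i * lane_width                        # reference bit position
        bus = shift_lane(part, ref, shifts[i], width)
        if per_lane_mode:
            # Assumption: keep only lane i's own bit window so that the
            # ranges driven by different lanes never overlap.
            bus &= lane_mask << ref
        # Unused bits are 0, so OR-ing the buses selects the valid data.
        out |= bus
    return out


# First mode: a common shift amount behaves like a whole-word shift.
whole = shift_operation(0xB6, [2, 2])
# Second mode: lane 0 shifted by 1, lane 1 shifted by 2, independently.
split = shift_operation(0xB6, [1, 2], per_lane_mode=True)
```

With a common shift amount, the OR of the lane outputs equals the whole word shifted and truncated to the output width; with per-lane amounts, each 4-bit lane is shifted on its own and bits leaving a lane are discarded rather than spilling into the neighboring lane.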
US15/877,765 2017-02-06 2018-01-23 Shift operation circuit and shift operation method Active US10056906B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017-019576 2017-02-06
JP2017019576A JP6733569B2 (en) 2017-02-06 2017-02-06 Shift operation circuit and shift operation method

Publications (2)

Publication Number Publication Date
US20180226970A1 (en) 2018-08-09
US10056906B1 (en) 2018-08-21

Family

ID=63038015

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/877,765 Active US10056906B1 (en) 2017-02-06 2018-01-23 Shift operation circuit and shift operation method

Country Status (2)

Country Link
US (1) US10056906B1 (en)
JP (1) JP6733569B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230004393A1 (en) * 2021-06-26 2023-01-05 Intel Corporation Apparatus and method for vector packed signed/unsigned shift, round, and saturate

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113131948B (en) * 2020-01-10 2024-07-16 瑞昱半导体股份有限公司 Data shift operation device and method with multiple modes

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4490809A (en) * 1979-08-31 1984-12-25 Fujitsu Limited Multichip data shifting system
US6810475B1 (en) * 1998-10-06 2004-10-26 Texas Instruments Incorporated Processor with pipeline conflict resolution using distributed arbitration and shadow registers
US20110004643A1 (en) * 2009-07-01 2011-01-06 Fujitsu Limited Shift calculator
US20120254271A1 (en) * 2011-03-29 2012-10-04 Fujitsu Limited Arithmetic operation circuit and method of converting binary number
US20130173994A1 (en) * 2011-12-30 2013-07-04 Lsi Corporation Variable Barrel Shifter
US20150261498A1 (en) * 2014-03-14 2015-09-17 Arm Limited Data processing apparatus and method for performing a shift function on a binary number
US20180067721A1 (en) * 2016-09-07 2018-03-08 Arm Limited Floating point addition with early shifting

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3112676B2 (en) * 1987-12-17 2000-11-27 日本電気株式会社 Shift operation circuit
JPH04361325A (en) * 1991-06-07 1992-12-14 Sony Corp Barrel shifter device
EP0681236B1 (en) 1994-05-05 2000-11-22 Conexant Systems, Inc. Space vector data path
US7685212B2 (en) * 2001-10-29 2010-03-23 Intel Corporation Fast full search motion estimation with SIMD merge instruction
JP4690362B2 (en) 2007-07-04 2011-06-01 株式会社リコー SIMD type microprocessor and data transfer method for SIMD type microprocessor

Also Published As

Publication number Publication date
US10056906B1 (en) 2018-08-21
JP2018128727A (en) 2018-08-16
JP6733569B2 (en) 2020-08-05

Similar Documents

Publication Publication Date Title
US8386755B2 (en) Non-atomic scheduling of micro-operations to perform round instruction
US9778911B2 (en) Reducing power consumption in a fused multiply-add (FMA) unit of a processor
US6490607B1 (en) Shared FP and SIMD 3D multiplier
US6397239B2 (en) Floating point addition pipeline including extreme value, comparison and accumulate functions
JP4953644B2 (en) System and method for a floating point unit providing feedback prior to normalization and rounding
US8214417B2 (en) Subnormal number handling in floating point adder without detection of subnormal numbers before exponent subtraction
US8468191B2 (en) Method and system for multi-precision computation
US8577948B2 (en) Split path multiply accumulate unit
US5940311A (en) Immediate floating-point operand reformatting in a microprocessor
US8838664B2 (en) Methods and apparatus for compressing partial products during a fused multiply-and-accumulate (FMAC) operation on operands having a packed-single-precision format
US8046400B2 (en) Apparatus and method for optimizing the performance of x87 floating point addition instructions in a microprocessor
Bruguera et al. Floating-point fused multiply-add: reduced latency for floating-point addition
CN111767516A (en) System and method for performing floating point addition with selected rounding
US10056906B1 (en) Shift operation circuit and shift operation method
Boersma et al. The POWER7 binary floating-point unit
US20200133633A1 (en) Arithmetic processing apparatus and controlling method therefor
US10838718B2 (en) Processing device, arithmetic unit, and control method of processing device
US20220326911A1 (en) Product-sum calculation device and product-sum calculation method
US20190317766A1 (en) Apparatuses for integrating arithmetic with logic operations
US10248417B2 (en) Methods and apparatuses for calculating FP (full precision) and PP (partial precision) values
US8041927B2 (en) Processor apparatus and method of processing multiple data by single instructions
JP2007226489A (en) Multiplier and arithmetic unit
US9280316B2 (en) Fast normalization in a mixed precision floating-point unit
EP3118737B1 (en) Arithmetic processing device and method of controlling arithmetic processing device
EP1089166A2 (en) An integer instruction set architecture and implementation

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: FUJITSU LIMITED, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MIYADAI, TOMOHARU;REEL/FRAME:045169/0416

Effective date: 20171226

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4