WO1991010186A1 - Systeme d'operations haute vitesse - Google Patents
- Publication number
- WO1991010186A1 (PCT/JP1990/001686)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- operations
- arithmetic
- old
- new
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/724—Finite field arithmetic
Definitions
- The present invention relates to a high-speed operation method, and more particularly to a high-speed operation method capable of performing operations such as floating-point operations at high speed based on an entirely new encoding principle.
- FIG. 22 is a conceptual diagram showing the configuration of a conventional operation method.
- The memory 3 and the arithmetic units 4 and 5 are connected to the bus 2, which is connected to the CPU 1.
- The first computing unit 4 performs a binary operation on the data X and Y read from the memory 3 under the control of the CPU 1, and stores the result Z in the memory 3 again.
- the second computing unit 5 performs a unary operation on the data X read from the memory 3, and stores the result Y in the memory 3. That is, in many cases, the conventional operation method directly performs an operation on the data read from the memory 3.
- Even with optimal logic design, the number of gate stages of the critical path is O(log n) and the number of elements is of polynomial order in n.
- O(log n) indicates that it is of the order of log n (the same applies hereinafter).
- A multiplier using redundant binary representation (Naofumi Takagi, "High-speed multiplier for VLSI using redundant binary adder", IEICE Trans. (D), J66-D, 6, pp. 683-690 (Showa 58-06)) and parallel counter type multipliers with O(log n) gate stages and O(n^2) elements have already been devised.
- The fixed-point multiplier using the redundant binary representation has been incorporated as the mantissa multiplier of a floating-point multiplier and implemented as an LSI (H. Edamatsu et al.: "A 33 MFLOPS Floating-Point Processor Using Redundant Binary Representation", ISSCC 88, pp. 152-153).
- Such approaches to the optimal logic design of arithmetic units have been applied not only to multipliers but also to adders/subtractors and the like, but no significant further improvement is expected.
- The premise of the logic design of a computing unit is that the function of the computing unit has already been determined for a given number representation (for example, a binary number in 2's complement notation).
- The problem is how to implement that input/output relationship (truth table) using logic circuits.
- A fixed-point multiplier using redundant binary representation realizes an efficient circuit by using redundant binary representation as the internal representation of the logic circuit, but such approaches are also considered to have reached their limit.
- Redundant binary multiplication is performed via a binary-to-redundant-binary converter, and a plurality of multiplication results are accumulated in redundant binary form.
- A signal processor has also been developed in which the final product-sum result is obtained at high speed by passing the accumulated value through a redundant-binary-to-binary converter (T. Enomoto et al.: "200 MHz 16-bit BiCMOS Signal Processor", ISSCC '89 Digest, THPM 12.8, Feb. 1989).
- Fixed-point data is encoded as a set of remainders or in redundant binary representation, a series of operations is performed on the encoded data, and the result is decoded into fixed-point data.
- The operations are limited to addition, subtraction, and multiplication, and the method applies only to fixed-point data, not to floating-point data.
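The prior-art flow described above (encode fixed-point data as a set of remainders, operate on the codes, decode once at the end) can be sketched as follows. This is a minimal illustration, not taken from the patent: the moduli `(7, 11, 13)` and all function names are assumptions chosen for the example, and decoding uses the Chinese remainder theorem.

```python
from math import prod

MODULI = (7, 11, 13)   # hypothetical pairwise-coprime moduli (assumption)
M = prod(MODULI)       # representable range is 0 .. M-1 (here M = 1001)

def encode(x):
    """Encode a non-negative integer as its tuple of remainders."""
    return tuple(x % m for m in MODULI)

def add(a, b):
    """Component-wise addition; no carries cross between components."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def mul(a, b):
    """Component-wise multiplication."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def decode(r):
    """Recover the integer from its remainders (Chinese remainder theorem)."""
    total = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        total += ri * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

def product_sum(xs, ys):
    """Product-sum computed entirely on encoded data, decoded once."""
    acc = encode(0)
    for x, y in zip(xs, ys):
        acc = add(acc, mul(encode(x), encode(y)))
    return decode(acc)
```

The result is only correct while the true value stays below `M`, and only addition, subtraction, and multiplication map component-wise, which matches the limitation noted above.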
- encoding is effective for high-speed operations, and in particular, redundant encoding is also effective for high-speed design of arithmetic units.
- Some studies have been conducted on such encoding of data (for example, Yasuura, Takagi, Yajima, "High-speed parallel algorithms using redundant coding", IEICE (D), J70-D, 3, pp. 525-533 (Showa 62-03); hereinafter Reference 1).
- In that reference, redundant coding and local computability are defined in a general form. The definitions are as follows. Redundant coding:
- Let Ω be a finite set.
- Let Σ be a finite set of symbols used for encoding. Σ^n represents the set of all sequences over Σ of length n.
- The elements of Ω are encoded as sequences in Σ^n.
- only isometric codes are considered. Also assume that ⁇ ⁇
- A mapping φ is called an encoding of length n for Ω over Σ if the following two conditions are satisfied.
- Since the encoding is defined as a mapping from the code space Σ^n to the union of the original set Ω and {⊥}, redundant encodings can be constructed. An encoding is said to be redundant when one or more elements of Ω have more than one code. FIG. 23 shows the mapping from the code space Σ^n onto the union of the original set Ω and {⊥}.
- Let F = (f_1, f_2, …, f_m) be a mapping from Σ^n to Σ^m, and let f_i be the partial function for each element of the output of F.
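A redundant encoding in this sense can be illustrated with a small signed-digit code. The sketch below is an assumption-laden example, not the patent's or the reference's construction: the symbol set `SIGMA`, the code length `N`, the original set `OMEGA`, and the use of `None` for code words that represent no element are all choices made here for illustration.

```python
from itertools import product

SIGMA = (-1, 0, 1)    # symbol set: signed binary digits (assumption)
N = 3                 # code length (assumption)
OMEGA = range(0, 5)   # original set {0, 1, 2, 3, 4} (assumption)

def phi(code):
    """Map a code word to the element it represents, or None if it represents none."""
    value = sum(d * 2**i for i, d in enumerate(reversed(code)))
    return value if value in OMEGA else None

def codes_of(x):
    """All code words that represent the element x."""
    return [c for c in product(SIGMA, repeat=N) if phi(c) == x]

def is_redundant():
    """True if at least one element of OMEGA has more than one code word."""
    return any(len(codes_of(x)) > 1 for x in OMEGA)
```

Here the element 1 has three code words, `(0, 0, 1)`, `(0, 1, -1)`, and `(1, -1, -1)`, while a code word such as `(-1, 0, 0)` maps to no element of the original set.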
- One or more arbitrary operations are defined, together with data belonging to an arbitrary finite set that is closed under these operations (hereinafter referred to as data of the old operation system).
- For a calculation system such as a computer that computes on data of the old operation system (this calculation system is hereinafter referred to as the old operation system), it may be difficult to perform the computation as it is in the old operation system.
- In that case, the data is first converted by an encoder to data of a new operation system that satisfies certain conditions, a series of operations is executed by an operation unit of the new operation system, and the obtained result is converted back to data of the old operation system by a decoder.
- The data on the set Ω is first converted to data on the finite set Ω' by the encoder Φ: Ω → Ω'. The operations A'_p: Ω' × Ω' → Ω' and B'_q: Ω' → Ω', which correspond to the operations A_p and B_q, are then executed, and the result obtained as data on Ω' is converted by the decoder Ψ: Ω' → Ω. In this configuration, the result on Ω that was originally to be computed is obtained.
- The combination (Ω', A'_p, B'_q, Φ, Ψ) of the finite set Ω', the operations A'_p and B'_q, the encoder Φ, and the decoder Ψ is called the new operation system.
- the new operation system (Ω', A'_p, B'_q, Φ, Ψ)
- the old operation system (Ω, A_p, B_q)
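The encode, operate, decode structure can be illustrated with a deliberately simple toy pair of systems. Everything in this sketch is an assumption made for illustration and does not come from the patent: the old system is multiplication and squaring modulo 7 on {1, …, 6}, the new system is addition and doubling modulo 6 on exponents of the primitive root 3, and the names `phi`, `psi`, `A_new`, and `B_new` are hypothetical.

```python
P = 7   # hypothetical modulus; the old set is {1, ..., 6}
G = 3   # 3 is a primitive root modulo 7

OLD_SET = range(1, P)
NEW_SET = range(0, P - 1)

def A(x, y):
    """Old binary operation: multiplication modulo 7."""
    return (x * y) % P

def B(x):
    """Old unary operation: squaring modulo 7."""
    return (x * x) % P

PSI_TABLE = [pow(G, k, P) for k in NEW_SET]           # decoder table
PHI_TABLE = {x: k for k, x in enumerate(PSI_TABLE)}   # encoder table

def phi(x):
    """Encoder: old data -> new data (discrete logarithm base 3)."""
    return PHI_TABLE[x]

def psi(k):
    """Decoder: new data -> old data."""
    return PSI_TABLE[k]

def A_new(a, b):
    """New binary operation: addition modulo 6."""
    return (a + b) % (P - 1)

def B_new(a):
    """New unary operation: doubling modulo 6."""
    return (2 * a) % (P - 1)

def chain_product(xs):
    """Encode each input once, operate only in the new system, decode once."""
    acc = phi(1)
    for x in xs:
        acc = A_new(acc, phi(x))
    return psi(acc)
```

In this toy case the encoding is a bijection, so it shows only the data flow; the freedom the text describes comes from choosing among many admissible new systems, including redundant ones.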
- ⁇ ([X]) ⁇ ⁇ (3) For all X, X that is, and for all ⁇ ⁇ that is ⁇ ⁇ 1, 2,.
- Equation (2) above shows the relationship at the time of encoding.
- Equation (3) shows the relationship at the time of decoding.
- Equation (4) shows the relationship between the new and old operations and the code for a binary operation.
- Equation (5) shows the relationship between the new and old operations and the code for a unary operation.
- [X] contains at least one element.
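The equations themselves did not survive extraction, so the block below is only one plausible reading of conditions (2) to (5), written to agree with the descriptions of what each equation expresses. The symbols Ω, Ω', Φ, Ψ and the interpretation of [X] as the set of codes in Ω' that represent X are assumptions; the patent's exact formulation may differ.

```latex
\begin{align}
\Phi(X) &\in [X] \subseteq \Omega'
  && \text{for all } X \in \Omega \tag{2}\\
\Psi(X') &= X
  && \text{for all } X' \in [X] \tag{3}\\
A'_p(X', Y') &\in [A_p(X, Y)]
  && \text{for all } X' \in [X],\ Y' \in [Y] \tag{4}\\
B'_q(X') &\in [B_q(X)]
  && \text{for all } X' \in [X] \tag{5}
\end{align}
```

Under this reading, any chain of new operations applied to encoded inputs stays inside the code set of the correct old result, so a single decoding at the end recovers that result.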
- The mapping in the present invention is a surjection onto the data set Ω of the old arithmetic system, whereas the mapping in Reference 1 is not a surjection onto Ω but a surjection onto Ω ∪ {⊥}.
- The degree of freedom of selection in the present invention is qualitatively different from, and remarkably larger than, that described in Reference 1. Reference 1 states that if the encoding is redundant, F is not uniquely determined: since there are multiple candidates satisfying equation (1), there is a degree of freedom in selecting the F that defines *, and an F that can be calculated at high speed is chosen using this degree of freedom.
- It is clear from this that Reference 1 uses only the degree of freedom in determining the arithmetic unit that arises from redundancy, for a given redundant code.
- FIG. 1 is a block diagram showing the relationship between a preferred encoder, decoder, arithmetic unit, and CPU according to the present invention.
- FIG. 2 is a diagram showing the relationship between the mapping of the present invention, its domain, and its range.
- FIGS. 3 to 5 are explanatory diagrams of the operation rules of the binary operations A and B and the unary operation C, which are specific examples of the old operation system.
- FIGS. 6 to 8 are diagrams showing circuit examples of the arithmetic units A, B, and C of the old arithmetic system.
- FIG. 9 is a diagram showing the codes of the new arithmetic system in the embodiment.
- FIGS. 10 to 12 are diagrams showing the operation rules of the binary operations A' and B' and the unary operation C' of the new operation system in the embodiment, corresponding to A, B, and C of the old operation system.
- FIG. 13 is a diagram showing the correspondence of the encoder in the embodiment.
- FIG. 14 is a diagram showing the correspondence of the decoder in the embodiment.
- FIGS. 15 to 19 are circuit diagrams of the binary operations A' and B', the unary operation C', the encoder, and the decoder of the new arithmetic system in the embodiment.
- FIG. 20 shows the number of elements and the delay time of each gate.
- FIG. 21 shows a comparison of the number of elements and the delay time between the specific example of the old arithmetic system and the embodiment of the new operation system.
- FIG. 22 shows the configuration of a typical conventional operation method.
- FIG. 23 is a diagram showing the mapping, disclosed in Reference 1, from the code space to the union of the original set and one additional element.
- FIG. 1 is a block diagram showing the relationship between a preferred encoder, decoder, arithmetic unit (new arithmetic system), and CPU according to the present invention. It is drawn in a form corresponding to the conventional operation method (old operation system). The same numbers, #1 for binary operations and #2 for unary operations, are used to indicate the correspondence between the new and old arithmetic systems. In the conventional calculation system of the old arithmetic system shown in Fig. 22, the data in the memory is sent to the arithmetic unit under the control of CPU 1 to obtain the operation result.
- In the present invention, data of the old operation system in the memory, for example X and Y, is first sent to the encoder under the control of CPU 1, converted to data of the new operation system, and stored in the memory.
- the data of the new operation system in the memory is sent from the memory to the operation unit of the new operation system under the control of the CPU 1, and the operation result of the new operation system coming out of the operation unit is stored in the memory.
- After a series of desired operations (for example, product-sum operations, vector operations, or matrix operations) has been carried out in this way, the result data of the new operation system is obtained in the memory.
- This data is sent to the decoder under the control of CPU 1, which yields the desired result in the old arithmetic system.
- The operation of the new operation system used for the iterative operation of FIG. 1 can generally achieve a much higher speed than the operation of the old operation system in FIG. 22.
- An example of the present invention will be described below.
- Ω = {X_0, X_1, X_2, …, X_7}.
- The rules of the operations are shown in FIGS. 3 to 5, respectively.
- Figures 3 and 4 show the rules of operation for the binary operations A and B.
- Figure 5 shows the rules of operation for the unary operation C.
- Ω is coded as follows with sequences of three Boolean values {0, 1}.
- The binary operations A and B can each be expressed by three Boolean functions of six variables, and the circuit can be realized as a combinational circuit with 6 inputs and 3 outputs.
- The unary operation C can be expressed by three Boolean functions of three variables, and the circuit can be realized as a combinational circuit with three inputs and three outputs.
- The operation system (Ω, A, B, C) given by such logical expressions is defined as the old operation system.
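How a binary operation on eight 3-bit-coded elements turns into three Boolean functions of six variables can be sketched as follows. The operation tables of FIGS. 3 to 5 are not reproduced in this text, so addition modulo 8 is used here purely as a hypothetical stand-in; the function names are likewise assumptions.

```python
from itertools import product

BITS = 3

def to_bits(v):
    """Integer -> tuple of bits, most significant bit first."""
    return tuple((v >> i) & 1 for i in reversed(range(BITS)))

def from_bits(bits):
    """Tuple of bits (most significant first) -> integer."""
    v = 0
    for b in bits:
        v = (v << 1) | b
    return v

def op(x, y):
    """Hypothetical stand-in for a binary operation on 8 elements."""
    return (x + y) % 8

def truth_tables(binary_op):
    """One truth table per output bit, each keyed by the 6 input bits."""
    tables = [dict() for _ in range(BITS)]
    for inputs in product((0, 1), repeat=2 * BITS):
        x, y = from_bits(inputs[:BITS]), from_bits(inputs[BITS:])
        out = to_bits(binary_op(x, y))
        for k in range(BITS):
            tables[k][inputs] = out[k]
    return tables

def depends_on(table, var):
    """True if flipping input variable `var` can change the output bit."""
    for inputs, out in table.items():
        flipped = list(inputs)
        flipped[var] ^= 1
        if table[tuple(flipped)] != out:
            return True
    return False

def support(table):
    """Indices of the input variables the output bit really depends on."""
    return [v for v in range(2 * BITS) if depends_on(table, v)]
```

For this stand-in, the most significant output bit depends on all six inputs while the least significant one depends on only two, which is the kind of variable dependency that a different choice of code can change.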
- Figures 6 to 8 show the operations A, B, and C realized as combinational circuits using two-input AND, two-input EXOR, and NOT gates, respectively. All of them are circuits of the old arithmetic system.
- Figure 9 shows the codes of one of the many new operation systems (Ω', A', B', C', Φ, Ψ) satisfying equations (2) to (5) (in this case (2^3)! of them).
- It is assumed that Ω', like Ω, is coded as follows with sequences of three Boolean values {0, 1}.
- The operation rules are shown in FIGS. 10 to 12, respectively. The correspondence table of the encoder Φ for the code shown in Fig. 9 (which becomes a truth table when translated into Boolean values) is shown in Fig. 13, and the correspondence table of the decoder Ψ is shown in Fig. 14.
- A' and B' can each be represented by three Boolean functions of six variables, and C' by three Boolean functions of three variables.
- Fig. 15 to Fig. 19 show the operations A', B', and C', the encoder Φ, and the decoder Ψ realized as combinational circuits using two-input AND, two-input EXOR, and NOT gates, respectively. To compare implementing the circuits of this new arithmetic system (Ω', A', B', C', Φ, Ψ) with implementing the circuits of the old arithmetic system shown in Figs. 6, 7, and 8, the number of transistor elements (hereinafter simply referred to as the number of elements) and the delay time of the critical path (hereinafter simply referred to as the delay time) of each circuit are calculated from the number of elements and the delay time of the AND, EXOR, and NOT gates.
- Figure 20 shows the number of elements in each gate and the delay time, the number of NOT elements as ⁇ , and the delay time as ⁇ .
- Figure 21 shows the number of elements and the delay time of each circuit in the old and new arithmetic systems based on Fig. 20 and each circuit diagram.
- The arithmetic units A', B', and C' in the new arithmetic system have greatly reduced both the number of elements and the delay time compared to the arithmetic units A, B, and C in the old arithmetic system.
- The total number of elements in the old arithmetic system is 101, whereas the total for the arithmetic units, encoder, and decoder in the new arithmetic system is 44, which is less than half.
- The cycle time ratio is the same as that of the Neumann type, but the effect of reducing the number of elements becomes more noticeable.
- the ratio of the number of elements in the system base and the number of elements in the system is sufficiently large.
- The ratio of the number of elements in the arithmetic units approaches 101:21. That is, the total number of elements is about 1/5.
- The code is selected according to a criterion that minimizes the variable dependency of each arithmetic unit of the new arithmetic system, but the same applies when this criterion is replaced with another criterion.
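Selecting a code by minimizing variable dependency can be sketched on a deliberately small case. The example below is hypothetical and much smaller than the embodiment: it uses four elements coded with two bits, a unary operation `C` defined here as increment modulo 4, and a simple cost that counts how many (output bit, input bit) pairs have a real dependency. It searches only bijective recodings (4! = 24 of them).

```python
from itertools import permutations

BITS = 2
N = 1 << BITS   # four elements, each coded with two bits

def C(x):
    """Hypothetical old unary operation: increment modulo 4."""
    return (x + 1) % N

def recoded(code):
    """Table of the new operation C' on code words: C'(code[x]) = code[C(x)]."""
    table = [0] * N
    for x in range(N):
        table[code[x]] = code[C(x)]
    return table

def dependency_cost(table):
    """Count (output bit, input bit) pairs where the output depends on the input."""
    cost = 0
    for out_bit in range(BITS):
        for in_bit in range(BITS):
            if any(((table[w] ^ table[w ^ (1 << in_bit)]) >> out_bit) & 1
                   for w in range(N)):
                cost += 1
    return cost

def best_recoding():
    """Exhaustively search all bijective recodings for the lowest cost."""
    return min(permutations(range(N)),
               key=lambda code: dependency_cost(recoded(code)))
```

With the natural binary code the cost is 3 (one output bit needs both inputs), while a Gray-style code such as `(0, 1, 3, 2)` brings it down to 2, where each output bit depends on a single input bit. Swapping `dependency_cost` for another measure, such as gate count or delay, would correspond to replacing the criterion.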
Landscapes
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Complex Calculations (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP34411989A JPH03201114A (ja) | 1989-12-28 | 1989-12-28 | High-speed operation system |
JP1/344119 | 1989-12-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1991010186A1 (fr) | 1991-07-11 |
Family
ID=18366783
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP1990/001686 WO1991010186A1 (fr) | 1989-12-28 | 1990-12-25 | Systeme d'operations haute vitesse |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP0507946A4 (ja) |
JP (1) | JPH03201114A (ja) |
CA (1) | CA2072254A1 (ja) |
WO (1) | WO1991010186A1 (ja) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004068364A1 (ja) * | 2003-01-27 | 2004-08-12 | Mathematec Kabushiki Kaisha | Arithmetic processing device, arithmetic processing device design method, and logic circuit design method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS51105245A (ja) * | 1975-03-13 | 1976-09-17 | Nippon Musical Instruments Mfg | |
JPS53148234A (en) * | 1977-05-30 | 1978-12-23 | Fujitsu Ltd | High-speed multiplication and division system between image data |
JPS5469039A (en) * | 1977-11-14 | 1979-06-02 | Hitachi Denshi Ltd | Multiplier/divider |
JPS5547540A (en) * | 1978-09-28 | 1980-04-04 | Tech Res & Dev Inst Of Japan Def Agency | Anti-logarithm adding circuit of logarithm |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4336468A (en) * | 1979-11-15 | 1982-06-22 | The Regents Of The University Of California | Simplified combinational logic circuits and method of designing same |
-
1989
- 1989-12-28 JP JP34411989A patent/JPH03201114A/ja active Pending
-
1990
- 1990-12-25 EP EP19910900925 patent/EP0507946A4/en not_active Withdrawn
- 1990-12-25 WO PCT/JP1990/001686 patent/WO1991010186A1/ja not_active Application Discontinuation
- 1990-12-25 CA CA 2072254 patent/CA2072254A1/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See also references of EP0507946A4 * |
Also Published As
Publication number | Publication date |
---|---|
CA2072254A1 (en) | 1991-06-29 |
EP0507946A1 (en) | 1992-10-14 |
JPH03201114A (ja) | 1991-09-03 |
EP0507946A4 (en) | 1993-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kang et al. | Digit-pipelined direct digital frequency synthesis based on differential CORDIC | |
Sasao et al. | Numerical function generators using LUT cascades | |
Fabricius | Modern digital design and switching theory | |
US5007009A (en) | Non-recovery parallel divider circuit | |
Vinod et al. | A MEMORYLESS REVERSE CONVERTER FOR THE 4-MODULI SUPERSET {2n-1, 2n, 2n+ 1, 2n+ 1-1} | |
Molahosseini et al. | New arithmetic residue to binary converters | |
Meehan et al. | An universal input and output RNS converter | |
Chandra | A novel method for scalable VLSI implementation of hyperbolic tangent function | |
Villalba et al. | Radix-2 multioperand and multiformat streaming online addition | |
WO1991010186A1 (fr) | Systeme d'operations haute vitesse | |
Valls et al. | Efficient mapping of CORDIC algorithms on FPGA | |
Persson et al. | Forward and reverse converters and moduli set selection in signed-digit residue number systems | |
Chren Jr | Low delay-power product CMOS design using one-hot residue coding | |
Vinod et al. | A Memoryless Reverse Converter for the 4-Moduli Superset {2 [sup n]-1, 2 [sup n], 2 [sup n]+ 1, 2 [sup n+ 1]-1}. | |
Daumas et al. | Further reducing the redundancy of a notation over a minimally redundant digit set | |
Astola et al. | New digit-serial implementations of stack filters | |
Alfredsson | VLSI architectures and arithmetic operations with application to the Fermat number transform | |
Dawid et al. | High speed bit-level pipelined architectures for redundant CORDIC implementation | |
Aoki et al. | High-radix parallel VLSI dividers without using quotient digit selection tables | |
JPH063578B2 (ja) | 演算処理装置 | |
RU2751802C1 (ru) | Умножитель по модулю | |
Arnold et al. | Under-and overflow detection in the residue logarithmic number system | |
RU2739338C1 (ru) | Вычислительное устройство | |
Conway et al. | New one-hot RNS structures for high-speed signal processing | |
Bashagha et al. | A new high radix non-restoring divider architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2072254 Country of ref document: CA |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1991900925 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 1991900925 Country of ref document: EP |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1991900925 Country of ref document: EP |