US7779229B2 - Method and arrangement for bringing together data on parallel data paths - Google Patents
- Publication number
- US7779229B2 (application US 10/505,028)
- Authority
- US
- United States
- Prior art keywords
- data
- arrangement
- processing units
- parallel
- linkage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
Definitions
- the invention relates to data flow in parallel data processing arrangements.
- the invention relates to systems and methods for splitting data for parallel processing and recombining processed split data.
- a characteristic of common parallel processor architecture is the provision of a plurality of processing units, by which parallel processing of data can be accomplished.
- Such an architecture and an associated processing method are described, for example, in German Letters of Disclosure DE 198 35 216.
- This German Letters of Disclosure describes data in a data memory being split into data groups with a plurality of elements and stored under one and the same address. Each element of a data group is assigned to a processing unit. All data elements are simultaneously read out of the data memory in parallel and distributed as input data to one or more processing units, where they are processed in parallel under clock control.
- the parallel processing units are connected together via a communication unit.
- a processing unit comprises at least one process unit and one storage unit, arranged in a strip. Each strip in the processing unit is generally adjacent to at least one additional strip of like structure.
- SIMD: Single Instruction, Multiple Data
- the respective data elements are processed in the parallel data paths (i.e. strips) as described above.
- the partial results may be written in the group memory as corresponding data elements or as data groups.
- the partial results of the strips have been calculated with the aid of a program over a plurality of clock cycles in order to obtain the desired intermediate result. If this global intermediate result is required for subsequent calculations of the algorithm, calculation of the end result is delayed.
- inventions for achieving high data processing speeds.
- inventive methods and processor arrangements are configured so that input, intermediate and/or output data of a variety of processing units can be linked via at least one section wise combinatorial operation, which is not a clock-controlled operation.
- the combinatorial linking operation may be deployed in either logical and/or arithmetic operations.
- All possible linkages of data from a variety of processing units and parallel data paths can be obtained in accordance with the principles of the present invention.
- the combinatorial linking operation may involve a redundant numeric representation in at least one partial step.
- arithmetic operations such as addition or subtraction
- carries at all positions of the data can be conducted or performed simultaneously and used for the next partial step.
- the carry vector can propagate almost as rapidly as the sum vector within a partial step. Any delay due to a “ripple” effect can occur only in the last partial step operation in which the sum and carry vectors are brought together.
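The redundant (sum vector plus carry vector) representation described above can be illustrated at word level. The following is a minimal sketch, not the patented circuit: the function names `carry_save_step` and `carry_save_total` are hypothetical, and Python integers stand in for the hardware data words. Each step compresses three operands into a sum vector and a carry vector without any carry moving between bit positions; carries ripple only in the final addition that brings the two vectors together.

```python
def carry_save_step(a: int, b: int, c: int) -> tuple[int, int]:
    """Compress three non-negative integers into a (sum, carry) pair.

    Every bit position is handled independently, so no carry ripples
    inside this step.
    """
    sum_vector = a ^ b ^ c
    # Majority of the three bits at each position, weighted one position up.
    carry_vector = ((a & b) | (a & c) | (b & c)) << 1
    return sum_vector, carry_vector


def carry_save_total(values: list[int]) -> int:
    """Add all values, rippling carries only in the final addition."""
    sum_vector, carry_vector = 0, 0
    for value in values:
        sum_vector, carry_vector = carry_save_step(sum_vector, carry_vector, value)
    # The only step with carry propagation: merge the two vectors.
    return sum_vector + carry_vector
```

The invariant is that the sum vector and carry vector together always equal the sum of the operands consumed so far, which is why the intermediate steps never need to resolve carries.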
- a single data element or alternatively a data group is produced as the result of linkage of the local data in accordance with the principles of the present invention.
- Any desired data sources from the various strips can be linked together and the result fed to any desired data sinks of the processor arrangement.
- the result of the linking operations can be fed back to a processing unit, which enables, for example, recursive algorithms to be performed more rapidly.
- a single one can be selected for bringing together data.
- a plurality of algorithms and/or complex algorithms with a plurality of various assemblies of local data can be converted in a processor arrangement.
- the inventive processor arrangements include at least a section wise combinatorial linkage arrangement, which is not a clock-controlled linkage arrangement.
- the combinatorial linkage arrangement can link together data from a variety of strips, and in particular can link together input, intermediate and/or output data of a variety of processing units.
- the inventive processor arrangements permit the assembly of local data from a variety of strips required for certain algorithms to be performed more rapidly than in the prior art. Delays that occur in conventional parallel data processing are avoided.
- the linkage arrangement may comprise an addition network, subtraction network and/or a network for minimum/maximum formation.
- Such networks are capable of ascertaining the carry at a position of the data resulting in a specific step of the performance of the operation in the logic arrangement, independently of the results of preceding positions or steps.
- the linkage arrangement may be designed so that carries occurring in all or almost all partial steps are not used for the calculation of subsequent positions. Thus in only part of the linkage network in which a sum vector and a carry vector are brought together does a known delay occur.
- an index which represents or indicates the strip with an extreme data value, in addition to the extreme data value itself.
- a processor arrangement for linkage of a wide variety of data within a data path or strip with data from other data paths or strips may include a plurality of selectable linkage arrangements of various types.
- the selection of appropriate linkage arrangements of various types may be program controlled.
- By suitable selection of the linking arrangements, it may be possible to link the same data variably, logically or arithmetically.
- a deployed linking arrangement may be configured so that its output may be connected with any desired registers of the processor arrangement.
- the linking arrangement output may, for example, be connected to a register of a processing unit, or alternatively with a global register in which a data group is capable of being filed or stored.
- At least one input register of the linkage arrangement may be assigned a control mechanism or switch which can be operated to separate the input register, and hence its data, from the linkage arrangement. Since the linkage arrangement operates at least section wise combinatorially in a manner that is not clock-controlled, changes at the data input can in this way be prevented from automatically impacting the linkage arrangement when linkage is not necessary and/or when not all input data are yet present at the given moment.
- FIG. 1 is a schematic representation of processor architecture in accordance with the principles of the present invention.
- FIG. 2 is a schematic representation of another processor architecture in accordance with the principles of the present invention.
- FIG. 3 is a schematic representation of an exemplary linkage arrangement for the linkage of data from a variety of parallel data paths, in accordance with the principles of the present invention.
- Parallel processing methods and arrangements are provided for improving the speed of data processing.
- the parallel processing arrangements are configured with data linking arrangements so that processed data from individual data processing strips or processing units can be linked or brought together without requiring a great expenditure of time.
- FIG. 1 shows a schematic representation of an exemplary processor arrangement designed in accordance with the principles of the present invention.
- the processor arrangement includes a plurality of parallel data processing strips ( 0 ). Further the processor arrangement includes a group memory 1 , in which data groups are capable of being stored under one address, where a single data group has a plurality of data elements.
- Processing units 2, each with an input register R 0 . . . R N and an output register RR 0 . . . RR N , are arranged in a strip structure.
- the registers may be designed as a register set, which includes a plurality of input and output registers.
- a global linkage arrangement 5 is inserted after the output registers RR 0 to RR N .
- Global linkage arrangement 5 is designed to be a combinatorial addition network in the input step.
- Global linkage arrangement 5 is not clock controlled.
- a global communication unit 3 may be disposed between processing units 2 and group memory 1 . Data from group memory 1 may be fed to the respective processing units 2 via communication unit 3 . Additionally or alternatively, it is possible that a data group or at least one element of the data group may be connectable directly with the assigned processing units bypassing the communication unit.
- a data group is simultaneously read out of the data memory in parallel and distributed to a plurality of processing units 2 for processing in parallel.
- the processing units 2 in each instance include at least one process unit and one arithmetic logic unit (not shown).
- At least one input linkage logic and at least one output linkage logic with which the registers of a register set are connectable within a data path may additionally be arranged between a processing unit and the assigned input register set and output register set.
- each element of a data group from the group memory 1 may be sent either directly to the assigned processing unit or via communication unit 3 to be distributed to other processing units.
- the sent data reach the respective processing units 2 via input registers R 0 . . . R N .
- the data results of processing units 2 are written to the respective output registers RR 0 . . . RR N .
- These result data may in turn be written directly in the group memory 1 or be distributed by means of the communication unit 3 .
- a local linkage arrangement 4 is disposed between adjacent processing units 2 .
- Local linkage arrangement 4 may be utilized to link data from two adjacent processing units 2 in a combinatorial manner that is not clock-controlled. The linkage results may be written back to either one of the two processing units.
- the two data elements are XOR-linked via a combinatorial network, which is not clock-controlled. Accordingly, no additional clock cycle is necessary for ascertainment of the result, and the processing unit in which the result is further processed experiences no internal delay.
- all output registers RR 0 . . . RR N are connected with the global linkage arrangement 5 , in which the individual output data of all (N+1) processing units 2 are added.
- the addition network of the linkage arrangement 5 is represented in FIG. 3 for a four-strip processor arrangement where, for the sake of simplicity of representation, four-bit data words are added in the linkage arrangement.
- FIG. 3 shows operation of the linkage arrangement in rows S 1 -S 3 .
- The bits of the data words are labeled Dij, where the index i identifies the data word (i.e., the strip) and j identifies the data word position.
- the individual bits of three data words D 0 -D 2 from the registers RR 0 , RR 1 and RR 2 are added by means of four full adders VA.
- the results of each adder are given in a second step to an assigned full adder VA at row S 2 .
- the carries C for the next position are passed to the assigned full adders VA in the following row (e.g., row S 2 ).
- the respective bit of the fourth data word D 3 from the register RR 3 of the fourth strip is also shown as being present in the four full adders VA at the second row S 2 . Since, in the first and second steps at rows S 1 and S 2 of the linkage arrangement, the carry C is not transferred to the full adders of the subsequent data word position, all calculations in both steps S 1 and S 2 can be performed simultaneously and immediately with the data fed to the inputs of the full adders. Only in the last step at row S 3 , which includes a half adder HA and three subsequent full adders VA, are the carries C of a lower data word position sent or transferred on to the full adder of the subsequent position. In this manner, the linkage arrangement shown in FIG.
- a 6-bit data word G (G 0 -G 5 ) is produced in a last step of the linking arrangement, by bringing together a carry and a sum vector.
- the higher positions of the result word G are filled with zeros for the formation of a data group (not shown) and may be fed via the control means 7 into global communication unit 3 , from which the calculated data group is either stored in the group memory or distributed to the processing means.
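The three rows S1 to S3 described above can be followed bit by bit in a short simulation. This is a sketch under stated assumptions: FIG. 3 itself is not reproduced in the text, so the exact routing of the carries (in particular, where the top carries of rows S1 and S2 enter row S3) is an assumption chosen to be consistent with the description of one half adder and three subsequent full adders. The names `full_adder`, `half_adder`, `bits` and `link_four_strips` are hypothetical.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def bits(word: int, width: int = 4) -> list[int]:
    """Split a word into bits, least significant position first."""
    return [(word >> j) & 1 for j in range(width)]


def link_four_strips(d0: int, d1: int, d2: int, d3: int) -> int:
    """Add four 4-bit words in three rows, loosely following S1 to S3."""
    b0, b1, b2, b3 = bits(d0), bits(d1), bits(d2), bits(d3)

    # Row S1: one full adder per position adds D0j, D1j and D2j.
    s1, c1 = zip(*(full_adder(b0[j], b1[j], b2[j]) for j in range(4)))

    # Row S2: each adder takes the S1 sum at its position, the S1 carry
    # from the position below, and D3j. No carry moves within the row.
    s2, c2 = zip(*(
        full_adder(s1[j], c1[j - 1] if j > 0 else 0, b3[j]) for j in range(4)
    ))

    # Row S3: merge sum and carry vectors; only here do carries ripple.
    g = [s2[0]]                                  # G0 needs no adder
    g1, carry = half_adder(s2[1], c2[0])         # half adder HA
    g.append(g1)
    g2, carry = full_adder(s2[2], c2[1], carry)  # full adder VA
    g.append(g2)
    g3, carry = full_adder(s2[3], c2[2], carry)  # full adder VA
    g.append(g3)
    g4, g5 = full_adder(c1[3], c2[3], carry)     # top carries meet here (assumed)
    g.extend([g4, g5])

    # Assemble the 6-bit result word G0..G5.
    return sum(bit << j for j, bit in enumerate(g))
```

Four 4-bit words sum to at most 60, which is why six result bits G0 to G5 suffice.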
- each input of the global linkage arrangement 5 may have a controllable gate 6 in the form of a latch by which a change in the output registers RR 1 . . . RR N can be fed into the global linkage arrangement 5 .
- a change in an output register of a processing unit is precluded from automatically causing global linkage arrangement 5 to link data, which is always associated with the consumption of energy. In this way, the linkage of data can be moved to such time at which the data brought together in the linkage arrangement are required or such time after all input data are present.
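The effect of the controllable gates can be modelled in software. The following is a simplified behavioural sketch, not a circuit description: the class `GatedAdderNetwork` and its members are hypothetical, all gates share one control signal here (the text allows a gate per input), and an `evaluations` counter stands in for the energy-consuming switching activity of the combinatorial network.

```python
class GatedAdderNetwork:
    """Behavioural model of an adder network behind controllable gates."""

    def __init__(self, strips: int) -> None:
        self.registers = [0] * strips   # output registers RR of the strips
        self.latched = [0] * strips     # values the network actually sees
        self.gates_open = False
        self.evaluations = 0            # proxy for switching activity

    def write_register(self, strip: int, value: int) -> None:
        """A processing unit updates its output register."""
        self.registers[strip] = value
        if self.gates_open:
            self._propagate()           # open gates let the change through

    def open_gates(self) -> None:
        """Let the current register contents reach the network."""
        self.gates_open = True
        self._propagate()

    def close_gates(self) -> None:
        """Isolate the network from further register changes."""
        self.gates_open = False

    def _propagate(self) -> None:
        self.latched = list(self.registers)
        self.evaluations += 1

    @property
    def output(self) -> int:
        """Result of the combinatorial addition over the latched inputs."""
        return sum(self.latched)
```

With the gates closed, the strips can update their registers one after another without the network reacting; opening the gates once all data are present triggers a single evaluation.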
- combinatorial linkage arrangement may include the provision of an additional XOR-linkage as a subtraction network or may comprise a shift arrangement or an inverter.
- a processor arrangement includes a linkage arrangement designed for maximum formation over a plurality of strips.
- the arrangement has a plurality of calculation steps, in each of which the data of two strips are subtracted from one another. If the result is negative, the subtrahend is sent on to the next calculation step. If the result is positive, the minuend is sent on to the next calculation step. At the same time, an index is transmitted with it, indicating in which of the strips considered thus far the extreme lies. In a maximum formation over 8 strips, an index of 3 bits and 7 calculation steps are thus required. These may be processed cascade-like, but may alternatively be designed for at least partially parallel processing.
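The cascade-like variant of the maximum formation can be sketched as a chain of subtract-and-select steps. This is an illustrative sketch only: the function name `maximum_with_index` is hypothetical, and the handling of a zero difference (the minuend is kept) is an assumption, since the text only covers negative and positive results.

```python
def maximum_with_index(strip_values: list[int]) -> tuple[int, int, int]:
    """Return (maximum, strip index, number of calculation steps)."""
    best_value, best_index = strip_values[0], 0
    steps = 0
    for index in range(1, len(strip_values)):
        # One calculation step: minuend minus subtrahend.
        difference = best_value - strip_values[index]
        steps += 1
        if difference < 0:
            # Negative result: the subtrahend is larger and is sent on,
            # together with the index of its strip.
            best_value, best_index = strip_values[index], index
        # Otherwise the minuend and its index are sent on unchanged
        # (a zero difference is treated like a positive one: assumption).
    return best_value, best_index, steps
```

For eight strips the loop performs seven steps, and the strip index never exceeds 7, so it fits in three bits, in line with the figures given above.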
- FIG. 2 shows a schematic representation of another exemplary processor arrangement designed in accordance with the principles of the present invention.
- a global linkage arrangement 8 is connected with the input registers R 0 . . . R N .
- the linkage arrangement 8 includes two separate and independent logic arrangements, either of which may be selected by means of the control line 11 .
- the first logic arrangement produces a data group that is fed back via the global data feedback 10 into the global communication unit 3 .
- a data element is produced which is fed back via the local data feedback 9 into the input register R 1 of the processing unit 2 of the second strip.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Mathematical Physics (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Multi Processors (AREA)
- Advance Control (AREA)
- Image Processing (AREA)
- Bus Control (AREA)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE10206830 | 2002-02-18 | ||
DE10206830.5 | 2002-02-18 | ||
DE10206830A DE10206830B4 (de) | 2002-02-18 | 2002-02-18 | Verfahren und Anordnung zur Zusammenführung von Daten aus parallelen Datenpfaden |
PCT/DE2003/000417 WO2003071431A2 (de) | 2002-02-18 | 2003-02-12 | Verfahren und anordnung zur zusammenführung von daten auf parallelen datenpfaden |
Publications (2)
Publication Number | Publication Date |
---|---|
US20060090060A1 (en) | 2006-04-27 |
US7779229B2 (en) | 2010-08-17 |
Family
ID=27674728
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/505,028 Active 2024-11-20 US7779229B2 (en) | 2002-02-18 | 2003-02-12 | Method and arrangement for bringing together data on parallel data paths |
Country Status (6)
Country | Link |
---|---|
US (1) | US7779229B2 (ja) |
EP (1) | EP1476806B1 (ja) |
JP (1) | JP2005518048A (ja) |
AU (1) | AU2003210141A1 (ja) |
DE (2) | DE10206830B4 (ja) |
WO (1) | WO2003071431A2 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140040604A1 (en) * | 2011-12-30 | 2014-02-06 | Elmoustapha Ould-Ahmed-Vall | Packed rotate processors, methods, systems, and instructions |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9418043B2 (en) * | 2014-03-07 | 2016-08-16 | Sony Corporation | Data speculation for array processors |
JP2017174291A (ja) | 2016-03-25 | 2017-09-28 | ルネサスエレクトロニクス株式会社 | 画像処理装置、画像処理方法、及び自動車制御装置 |
Citations (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4507748A (en) * | 1982-08-02 | 1985-03-26 | International Telephone And Telegraph Corporation | Associative processor with variable length fast multiply capability |
US5129092A (en) * | 1987-06-01 | 1992-07-07 | Applied Intelligent Systems,Inc. | Linear chain of parallel processors and method of using same |
US5197140A (en) * | 1989-11-17 | 1993-03-23 | Texas Instruments Incorporated | Sliced addressing multi-processor and method of operation |
US5317753A (en) * | 1990-04-20 | 1994-05-31 | Siemens Aktiengesellschaft | Coordinate rotation digital computer processor (cordic processor) for vector rotations in carry-save architecture |
JPH06290262A (ja) | 1993-03-31 | 1994-10-18 | Sony Corp | 画像コーデック用プロセッサ |
US5371896A (en) * | 1989-11-17 | 1994-12-06 | Texas Instruments Incorporated | Multi-processor having control over synchronization of processors in mind mode and method of operation |
US5423051A (en) * | 1992-09-24 | 1995-06-06 | International Business Machines Corporation | Execution unit with an integrated vector operation capability |
US5471593A (en) * | 1989-12-11 | 1995-11-28 | Branigin; Michael H. | Computer processor with an efficient means of executing many instructions simultaneously |
JPH0830577A (ja) | 1994-07-15 | 1996-02-02 | Mitsubishi Electric Corp | Simdプロセッサ |
US5535151A (en) * | 1993-11-19 | 1996-07-09 | Sony Corporation | Electronic processor for performing multiplication |
US5758195A (en) * | 1989-11-17 | 1998-05-26 | Texas Instruments Incorporated | Register to memory data transfers with field extraction and zero/sign extension based upon size and mode data corresponding to employed address register |
US5765012A (en) * | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Controller for a SIMD/MIMD array having an instruction sequencer utilizing a canned routine library |
US5765015A (en) * | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Slide network for an array processor |
US5822606A (en) | 1996-01-11 | 1998-10-13 | Morton; Steven G. | DSP having a plurality of like processors controlled in parallel by an instruction word, and a control processor also controlled by the instruction word |
US5973705A (en) * | 1997-04-24 | 1999-10-26 | International Business Machines Corporation | Geometry pipeline implemented on a SIMD machine |
DE19835216A1 (de) | 1998-08-05 | 2000-02-17 | Univ Dresden Tech | Verfahren und Prozessoranordnung zur parallelen Datenverarbeitung |
US6044448A (en) * | 1997-12-16 | 2000-03-28 | S3 Incorporated | Processor having multiple datapath instances |
US6260088B1 (en) * | 1989-11-17 | 2001-07-10 | Texas Instruments Incorporated | Single integrated circuit embodying a risc processor and a digital signal processor |
US20010010073A1 (en) * | 1998-09-30 | 2001-07-26 | Intel Corporation. | Non-stalling circular counterflow pipeline processor with reorder buffer |
US6308252B1 (en) * | 1999-02-04 | 2001-10-23 | Kabushiki Kaisha Toshiba | Processor method and apparatus for performing single operand operation and multiple parallel operand operation |
US6401194B1 (en) * | 1997-01-28 | 2002-06-04 | Samsung Electronics Co., Ltd. | Execution unit for processing a data stream independently and in parallel |
US6404439B1 (en) * | 1997-03-11 | 2002-06-11 | Sony Corporation | SIMD control parallel processor with simplified configuration |
US6557097B1 (en) * | 1998-10-06 | 2003-04-29 | Texas Instruments Incorporated | Linear vector computation |
US6625722B1 (en) * | 1997-05-23 | 2003-09-23 | Aspex Technology Limited | Processor controller for accelerating instruction issuing rate |
US6636828B1 (en) * | 1998-05-11 | 2003-10-21 | Nec Electronics Corp. | Symbolic calculation system, symbolic calculation method and parallel circuit simulation system |
US6665790B1 (en) * | 2000-02-29 | 2003-12-16 | International Business Machines Corporation | Vector register file with arbitrary vector addressing |
US6675268B1 (en) * | 2000-12-11 | 2004-01-06 | Lsi Logic Corporation | Method and apparatus for handling transfers of data volumes between controllers in a storage environment having multiple paths to the data volumes |
US20040015931A1 (en) * | 2001-04-13 | 2004-01-22 | Bops, Inc. | Methods and apparatus for automated generation of abbreviated instruction set and configurable processor architecture |
US6732253B1 (en) * | 2000-11-13 | 2004-05-04 | Chipwrights Design, Inc. | Loop handling for single instruction multiple datapath processor architectures |
US6763450B1 (en) * | 1999-10-08 | 2004-07-13 | Texas Instruments Incorporated | Processor |
US6839831B2 (en) * | 2000-02-09 | 2005-01-04 | Texas Instruments Incorporated | Data processing apparatus with register file bypass |
US6968445B2 (en) * | 2001-12-20 | 2005-11-22 | Sandbridge Technologies, Inc. | Multithreaded processor with efficient processing for convergence device applications |
US7197625B1 (en) * | 1997-10-09 | 2007-03-27 | Mips Technologies, Inc. | Alignment and ordering of vector elements for single instruction multiple data processing |
US7394052B2 (en) * | 2001-07-30 | 2008-07-01 | Nippon Telegraph And Telephone Corporation | Parallel processing logic circuit for sensor signal processing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6088782A (en) * | 1997-07-10 | 2000-07-11 | Motorola Inc. | Method and apparatus for moving data in a parallel processor using source and destination vector registers |
-
2002
- 2002-02-18 DE DE10206830A patent/DE10206830B4/de not_active Expired - Fee Related
-
2003
- 2003-02-12 US US10/505,028 patent/US7779229B2/en active Active
- 2003-02-12 AU AU2003210141A patent/AU2003210141A1/en not_active Abandoned
- 2003-02-12 EP EP03742488A patent/EP1476806B1/de not_active Expired - Lifetime
- 2003-02-12 JP JP2003570257A patent/JP2005518048A/ja not_active Ceased
- 2003-02-12 DE DE50307765T patent/DE50307765D1/de not_active Expired - Lifetime
- 2003-02-12 WO PCT/DE2003/000417 patent/WO2003071431A2/de active IP Right Grant
Patent Citations (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4507748A (en) * | 1982-08-02 | 1985-03-26 | International Telephone And Telegraph Corporation | Associative processor with variable length fast multiply capability |
US5129092A (en) * | 1987-06-01 | 1992-07-07 | Applied Intelligent Systems,Inc. | Linear chain of parallel processors and method of using same |
US5197140A (en) * | 1989-11-17 | 1993-03-23 | Texas Instruments Incorporated | Sliced addressing multi-processor and method of operation |
US6260088B1 (en) * | 1989-11-17 | 2001-07-10 | Texas Instruments Incorporated | Single integrated circuit embodying a risc processor and a digital signal processor |
US5371896A (en) * | 1989-11-17 | 1994-12-06 | Texas Instruments Incorporated | Multi-processor having control over synchronization of processors in mind mode and method of operation |
US6038584A (en) * | 1989-11-17 | 2000-03-14 | Texas Instruments Incorporated | Synchronized MIMD multi-processing system and method of operation |
US5758195A (en) * | 1989-11-17 | 1998-05-26 | Texas Instruments Incorporated | Register to memory data transfers with field extraction and zero/sign extension based upon size and mode data corresponding to employed address register |
US5471593A (en) * | 1989-12-11 | 1995-11-28 | Branigin; Michael H. | Computer processor with an efficient means of executing many instructions simultaneously |
US5317753A (en) * | 1990-04-20 | 1994-05-31 | Siemens Aktiengesellschaft | Coordinate rotation digital computer processor (cordic processor) for vector rotations in carry-save architecture |
US5765012A (en) * | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Controller for a SIMD/MIMD array having an instruction sequencer utilizing a canned routine library |
US5765015A (en) * | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Slide network for an array processor |
US5423051A (en) * | 1992-09-24 | 1995-06-06 | International Business Machines Corporation | Execution unit with an integrated vector operation capability |
JPH06290262A (ja) | 1993-03-31 | 1994-10-18 | Sony Corp | 画像コーデック用プロセッサ |
US5535151A (en) * | 1993-11-19 | 1996-07-09 | Sony Corporation | Electronic processor for performing multiplication |
JPH0830577A (ja) | 1994-07-15 | 1996-02-02 | Mitsubishi Electric Corp | Simdプロセッサ |
US5822606A (en) | 1996-01-11 | 1998-10-13 | Morton; Steven G. | DSP having a plurality of like processors controlled in parallel by an instruction word, and a control processor also controlled by the instruction word |
US6401194B1 (en) * | 1997-01-28 | 2002-06-04 | Samsung Electronics Co., Ltd. | Execution unit for processing a data stream independently and in parallel |
US6404439B1 (en) * | 1997-03-11 | 2002-06-11 | Sony Corporation | SIMD control parallel processor with simplified configuration |
US5973705A (en) * | 1997-04-24 | 1999-10-26 | International Business Machines Corporation | Geometry pipeline implemented on a SIMD machine |
US6625722B1 (en) * | 1997-05-23 | 2003-09-23 | Aspex Technology Limited | Processor controller for accelerating instruction issuing rate |
US7197625B1 (en) * | 1997-10-09 | 2007-03-27 | Mips Technologies, Inc. | Alignment and ordering of vector elements for single instruction multiple data processing |
US6044448A (en) * | 1997-12-16 | 2000-03-28 | S3 Incorporated | Processor having multiple datapath instances |
US6636828B1 (en) * | 1998-05-11 | 2003-10-21 | Nec Electronics Corp. | Symbolic calculation system, symbolic calculation method and parallel circuit simulation system |
DE19835216A1 (de) | 1998-08-05 | 2000-02-17 | Univ Dresden Tech | Verfahren und Prozessoranordnung zur parallelen Datenverarbeitung |
US20010010073A1 (en) * | 1998-09-30 | 2001-07-26 | Intel Corporation. | Non-stalling circular counterflow pipeline processor with reorder buffer |
US6557097B1 (en) * | 1998-10-06 | 2003-04-29 | Texas Instruments Incorporated | Linear vector computation |
US6308252B1 (en) * | 1999-02-04 | 2001-10-23 | Kabushiki Kaisha Toshiba | Processor method and apparatus for performing single operand operation and multiple parallel operand operation |
US6763450B1 (en) * | 1999-10-08 | 2004-07-13 | Texas Instruments Incorporated | Processor |
US6839831B2 (en) * | 2000-02-09 | 2005-01-04 | Texas Instruments Incorporated | Data processing apparatus with register file bypass |
US6665790B1 (en) * | 2000-02-29 | 2003-12-16 | International Business Machines Corporation | Vector register file with arbitrary vector addressing |
US6732253B1 (en) * | 2000-11-13 | 2004-05-04 | Chipwrights Design, Inc. | Loop handling for single instruction multiple datapath processor architectures |
US6675268B1 (en) * | 2000-12-11 | 2004-01-06 | Lsi Logic Corporation | Method and apparatus for handling transfers of data volumes between controllers in a storage environment having multiple paths to the data volumes |
US20040015931A1 (en) * | 2001-04-13 | 2004-01-22 | Bops, Inc. | Methods and apparatus for automated generation of abbreviated instruction set and configurable processor architecture |
US7394052B2 (en) * | 2001-07-30 | 2008-07-01 | Nippon Telegraph And Telephone Corporation | Parallel processing logic circuit for sensor signal processing |
US6968445B2 (en) * | 2001-12-20 | 2005-11-22 | Sandbridge Technologies, Inc. | Multithreaded processor with efficient processing for convergence device applications |
Non-Patent Citations (4)
Title |
---|
Drescher et al., An Architectural Study of a Digital Signal Processor for Block Codes, 1998, IEEE, pp. 3129-3132. * |
Goodman et al., An Energy-Efficient Reconfigurable Public-Key Cryptography Processor, IEEE Journal of Solid-State Circuits, Nov. 2001, pp. 1808-1820. * |
Goodman et al., An Energy-Efficient IEEE 1363-based Reconfigurable Public-Key Cryptography Processor, 2001, ISSCC 2001, pp. 1-4. * |
Satoh et al., A High-Speed Small RSA Encryption LSI with Low Power Dissipation, 1998, IBM, pp. 174-187. * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140040604A1 (en) * | 2011-12-30 | 2014-02-06 | Elmoustapha Ould-Ahmed-Vall | Packed rotate processors, methods, systems, and instructions |
US9864602B2 (en) * | 2011-12-30 | 2018-01-09 | Intel Corporation | Packed rotate processors, methods, systems, and instructions |
US10324718B2 (en) | 2011-12-30 | 2019-06-18 | Intel Corporation | Packed rotate processors, methods, systems, and instructions |
Also Published As
Publication number | Publication date |
---|---|
DE50307765D1 (de) | 2007-09-06 |
WO2003071431A3 (de) | 2004-04-08 |
US20060090060A1 (en) | 2006-04-27 |
EP1476806B1 (de) | 2007-07-25 |
AU2003210141A1 (en) | 2003-09-09 |
JP2005518048A (ja) | 2005-06-16 |
WO2003071431A2 (de) | 2003-08-28 |
EP1476806A2 (de) | 2004-11-17 |
DE10206830B4 (de) | 2004-10-14 |
DE10206830A1 (de) | 2003-09-04 |
AU2003210141A8 (en) | 2003-09-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5546545A (en) | Rotating priority selection logic circuit | |
US5257218A (en) | Parallel carry and carry propagation generator apparatus for use with carry-look-ahead adders | |
US7350054B2 (en) | Processor having array of processing elements whose individual operations and mutual connections are variable | |
JP7241470B2 (ja) | Array sorting method for vector processor | |
JPH06266750A (ja) | Logic system | |
US5721809A (en) | Maximum value selector | |
US7460666B2 (en) | Combinational circuit, encryption circuit, method for constructing the same and program | |
WO2004017223A2 (en) | Programmable pipeline fabric utilizing partially global configuration buses | |
EP0769738B1 (en) | Logic circuit with carry selection technique | |
US7779229B2 (en) | Method and arrangement for bringing together data on parallel data paths | |
US5764550A (en) | Arithmetic logic unit with improved critical path performance | |
JPH0926949A (ja) | Data-driven information processing apparatus | |
US20030120694A1 (en) | Method and apparatus for use in booth-encoded multiplication | |
US7313586B2 (en) | Adder-subtracter circuit | |
US4942549A (en) | Recursive adder for calculating the sum of two operands | |
US7565387B1 (en) | Systems and methods for configuring a programmable logic device to perform a computation using carry chains | |
US4958353A (en) | Device for calculating the parity bits of a sum of two numbers | |
US6892215B2 (en) | Fast parallel cascaded array modular multiplier | |
Zhabin et al. | Methods of on-line computation acceleration in systems with direct connection between units | |
US7007059B1 (en) | Fast pipelined adder/subtractor using increment/decrement function with reduced register utilization | |
JPS622328B2 (ja) | ||
US5689451A (en) | Device for calculating parity bits associated with a sum of two numbers | |
JPH04364525A (ja) | Parallel arithmetic unit | |
Abdelguerfi | Special function unit for statistical aggregation functions | |
USRE37335E1 (en) | Ripple carry logic and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: PHILIPS COMMUNICATIONS NETWORK, GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DRESCHER, WOLFRAM;REEL/FRAME:016676/0361 Effective date: 20050101 |
AS | Assignment | Owner name: PHILIPS SEMICONDUCTORS DRESDEN AG, GERMANY Free format text: CORRECTION TO THE ASSIGNEE;ASSIGNOR:DRESCHER, WOLFRAM;REEL/FRAME:017214/0470 Effective date: 20050110 |
AS | Assignment | Owner name: PHILIPS SEMICONDUCTORS DRESDEN AG, GERMANY Free format text: CHANGE OF NAME;ASSIGNOR:SYSTEMONIC AG;REEL/FRAME:021523/0877 Effective date: 20030303 Owner name: NXP SEMICONDUCTORS GERMANY GMBH, GERMANY Free format text: MERGER;ASSIGNOR:PHILIPS SEMICONDUCTORS DRESDEN AG;REEL/FRAME:021523/0925 Effective date: 20061127 |
AS | Assignment | Owner name: NXP B.V., NETHERLANDS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NXP SEMICONDUCTORS GERMANY GMBH;REEL/FRAME:021531/0289 Effective date: 20080723 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: NXP B.V., NETHERLANDS Free format text: CONFIRMATORY ASSIGNMENT;ASSIGNOR:NXP SEMICONDUCTORS GERMANY GMBH;REEL/FRAME:026924/0803 Effective date: 20110919 |
AS | Assignment | Owner name: CALLAHAN CELLULAR L.L.C., DELAWARE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NXP B.V.;REEL/FRAME:027265/0798 Effective date: 20110926 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FPAY | Fee payment | Year of fee payment: 4 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552) Year of fee payment: 8 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |