TW200604941A - Processor having parallel vector multiply and reduce operations with sequential semantics
- Publication number
- TW200604941A
- Authority
- TW
- Taiwan
- Prior art keywords
- accumulator
- processor
- unit
- vector multiply
- reduce operations
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
- G06F2207/3808—Details concerning the type of numbers or the way they are handled
- G06F2207/3828—Multigauge devices, i.e. capable of handling packed numbers without unpacking them
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
- G06F2207/386—Special constructional features
- G06F2207/388—Skewing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/499—Denomination or exception handling, e.g. rounding or overflow
- G06F7/49905—Exception handling
- G06F7/4991—Overflow or underflow
- G06F7/49921—Saturation, i.e. clipping the result to a minimum or maximum value
Landscapes
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Advance Control (AREA)
- Image Processing (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US56019804P | 2004-04-07 | 2004-04-07 | |
US10/841,261 US7593978B2 (en) | 2003-05-09 | 2004-05-07 | Processor reduction unit for accumulation of multiple operands with or without saturation |
US11/096,921 US7797363B2 (en) | 2004-04-07 | 2005-04-01 | Processor having parallel vector multiply and reduce operations with sequential semantics |
Publications (1)
Publication Number | Publication Date |
---|---|
TW200604941A (en) | 2006-02-01 |
Family
ID=35150606
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW094111014A TW200604941A (en) | 2004-04-07 | 2005-04-07 | Processor having parallel vector multiply and reduce operations with sequential semantics |
Country Status (6)
Country | Link |
---|---|
US (1) | US7797363B2 (zh) |
EP (1) | EP1735694A4 (zh) |
JP (1) | JP2007533009A (zh) |
KR (1) | KR20060133086A (zh) |
TW (1) | TW200604941A (zh) |
WO (1) | WO2005101190A2 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118245017A (zh) * | 2023-11-02 | 2024-06-25 | 芯立嘉集成电路(杭州)有限公司 | In-memory binary floating-point multiplication device and operating method thereof |
Families Citing this family (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8074051B2 (en) * | 2004-04-07 | 2011-12-06 | Aspen Acquisition Corporation | Multithreaded processor with multiple concurrent pipelines per thread |
TW200625097A (en) * | 2004-11-17 | 2006-07-16 | Sandbridge Technologies Inc | Data file storing multiple data types with controlled data access |
US8266198B2 (en) * | 2006-02-09 | 2012-09-11 | Altera Corporation | Specialized processing block for programmable logic device |
WO2008070250A2 (en) * | 2006-09-26 | 2008-06-12 | Sandbridge Technologies Inc. | Software implementation of matrix inversion in a wireless communication system |
KR101545357B1 (ko) * | 2006-11-10 | 2015-08-18 | Qualcomm Incorporated | Method and system for parallelization of pipelined computer processing |
US8386553B1 (en) | 2006-12-05 | 2013-02-26 | Altera Corporation | Large multiplier for programmable logic device |
US7930336B2 (en) | 2006-12-05 | 2011-04-19 | Altera Corporation | Large multiplier for programmable logic device |
US8316215B2 (en) * | 2007-03-08 | 2012-11-20 | Nec Corporation | Vector processor with plural arithmetic units for processing a vector data string divided into plural register banks accessed by read pointers starting at different positions |
KR101190937B1 (ko) * | 2007-05-17 | 2012-10-12 | Fujitsu Limited | Arithmetic unit, processor, and processor architecture |
US8239438B2 (en) * | 2007-08-17 | 2012-08-07 | International Business Machines Corporation | Method and apparatus for implementing a multiple operand vector floating point summation to scalar function |
WO2009061547A1 (en) * | 2007-11-05 | 2009-05-14 | Sandbridge Technologies, Inc. | Method of encoding register instruction fields |
US8239439B2 (en) * | 2007-12-13 | 2012-08-07 | International Business Machines Corporation | Method and apparatus implementing a minimal area consumption multiple addend floating point summation function in a vector microprocessor |
WO2009097444A1 (en) * | 2008-01-30 | 2009-08-06 | Sandbridge Technologies, Inc. | Method for enabling multi-processor synchronization |
WO2009105332A1 (en) * | 2008-02-18 | 2009-08-27 | Sandbridge Technologies, Inc. | Method to accelerate null-terminated string operations |
US8959137B1 (en) | 2008-02-20 | 2015-02-17 | Altera Corporation | Implementing large multipliers in a programmable integrated circuit device |
WO2009114691A2 (en) * | 2008-03-13 | 2009-09-17 | Sandbridge Technologies, Inc. | Method for achieving power savings by disabling a valid array |
US8244789B1 (en) | 2008-03-14 | 2012-08-14 | Altera Corporation | Normalization of floating point operations in a programmable integrated circuit device |
JP2011530744A (ja) | 2008-08-06 | 2011-12-22 | Aspen Acquisition Corporation | Stoppable and restartable DMA engine |
US9335997B2 (en) | 2008-08-15 | 2016-05-10 | Apple Inc. | Processing vectors using a wrapping rotate previous instruction in the macroscalar architecture |
US9342304B2 (en) | 2008-08-15 | 2016-05-17 | Apple Inc. | Processing vectors using wrapping increment and decrement instructions in the macroscalar architecture |
US9335980B2 (en) | 2008-08-15 | 2016-05-10 | Apple Inc. | Processing vectors using wrapping propagate instructions in the macroscalar architecture |
US8539205B2 (en) * | 2008-08-15 | 2013-09-17 | Apple Inc. | Processing vectors using wrapping multiply and divide instructions in the macroscalar architecture |
US8886696B1 (en) | 2009-03-03 | 2014-11-11 | Altera Corporation | Digital signal processing circuitry with redundancy and ability to support larger multipliers |
US8447954B2 (en) * | 2009-09-04 | 2013-05-21 | International Business Machines Corporation | Parallel pipelined vector reduction in a data processing system |
US8862650B2 (en) | 2010-06-25 | 2014-10-14 | Altera Corporation | Calculation of trigonometric functions in an integrated circuit device |
US9600278B1 (en) | 2011-05-09 | 2017-03-21 | Altera Corporation | Programmable device using fixed and configurable logic to implement recursive trees |
US8949298B1 (en) | 2011-09-16 | 2015-02-03 | Altera Corporation | Computing floating-point polynomials in an integrated circuit device |
US9053045B1 (en) | 2011-09-16 | 2015-06-09 | Altera Corporation | Computing floating-point polynomials in an integrated circuit device |
US9389860B2 (en) | 2012-04-02 | 2016-07-12 | Apple Inc. | Prediction optimizations for Macroscalar vector partitioning loops |
US9098332B1 (en) | 2012-06-01 | 2015-08-04 | Altera Corporation | Specialized processing block with fixed- and floating-point structures |
US8996600B1 (en) | 2012-08-03 | 2015-03-31 | Altera Corporation | Specialized processing block for implementing floating-point multiplier with subnormal operation support |
US9207909B1 (en) | 2012-11-26 | 2015-12-08 | Altera Corporation | Polynomial calculations optimized for programmable integrated circuit device structures |
US9189200B1 (en) | 2013-03-14 | 2015-11-17 | Altera Corporation | Multiple-precision processing block in a programmable integrated circuit device |
US9348589B2 (en) | 2013-03-19 | 2016-05-24 | Apple Inc. | Enhanced predicate registers having predicates corresponding to element widths |
US9817663B2 (en) | 2013-03-19 | 2017-11-14 | Apple Inc. | Enhanced Macroscalar predicate operations |
US9348795B1 (en) | 2013-07-03 | 2016-05-24 | Altera Corporation | Programmable device using fixed and configurable logic to implement floating-point rounding |
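KR101418686B1 (ko) * | 2013-08-02 | 2014-07-10 | Kongju National University Industry-University Cooperation Foundation | Parallel multiplication method with subquadratic space complexity using a type 4 Gaussian normal basis over a finite field, and apparatus therefor |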
US20150052330A1 (en) * | 2013-08-14 | 2015-02-19 | Qualcomm Incorporated | Vector arithmetic reduction |
US9379687B1 (en) | 2014-01-14 | 2016-06-28 | Altera Corporation | Pipelined systolic finite impulse response filter |
US9355061B2 (en) * | 2014-01-28 | 2016-05-31 | Arm Limited | Data processing apparatus and method for performing scan operations |
US9916130B2 (en) * | 2014-11-03 | 2018-03-13 | Arm Limited | Apparatus and method for vector processing |
US9684488B2 (en) | 2015-03-26 | 2017-06-20 | Altera Corporation | Combined adder and pre-adder for high-radix multiplier circuit |
US20160313995A1 (en) * | 2015-04-24 | 2016-10-27 | Optimum Semiconductor Technologies, Inc. | Computer processor with indirect only branching |
US10108581B1 (en) * | 2017-04-03 | 2018-10-23 | Google Llc | Vector reduction processor |
US10942706B2 (en) | 2017-05-05 | 2021-03-09 | Intel Corporation | Implementation of floating-point trigonometric functions in an integrated circuit device |
US11409692B2 (en) * | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
WO2019066797A1 (en) * | 2017-09-27 | 2019-04-04 | Intel Corporation | Instructions for vector multiplication of unsigned words with rounding |
WO2019066796A1 (en) * | 2017-09-27 | 2019-04-04 | Intel Corporation | Instructions for vector multiplication of signed words with rounding |
US20190102199A1 (en) * | 2017-09-30 | 2019-04-04 | Intel Corporation | Methods and systems for executing vectorized pythagorean tuple instructions |
US12061910B2 (en) * | 2019-12-05 | 2024-08-13 | International Business Machines Corporation | Dispatching multiply and accumulate operations based on accumulator register index number |
US11119772B2 (en) * | 2019-12-06 | 2021-09-14 | International Business Machines Corporation | Check pointing of accumulator register results in a microprocessor |
GB2601466A (en) * | 2020-02-10 | 2022-06-08 | Xmos Ltd | Rotating accumulator |
US20240004647A1 (en) * | 2022-07-01 | 2024-01-04 | Andes Technology Corporation | Vector processor with vector and element reduction method |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5889689A (en) | 1997-09-08 | 1999-03-30 | Lucent Technologies Inc. | Hierarchical carry-select, three-input saturation |
US5991785A (en) | 1997-11-13 | 1999-11-23 | Lucent Technologies Inc. | Determining an extremum value and its index in an array using a dual-accumulation processor |
JP2000322235A (ja) | 1999-05-07 | 2000-11-24 | Sony Corp | Information processing device |
US6526430B1 (en) | 1999-10-04 | 2003-02-25 | Texas Instruments Incorporated | Reconfigurable SIMD coprocessor architecture for sum of absolute differences and symmetric filtering (scalable MAC engine for image processing) |
US6968445B2 (en) | 2001-12-20 | 2005-11-22 | Sandbridge Technologies, Inc. | Multithreaded processor with efficient processing for convergence device applications |
GB2389433B (en) | 2002-06-08 | 2005-08-31 | Motorola Inc | Bit exactness support in dual-mac architecture |
US6842848B2 (en) | 2002-10-11 | 2005-01-11 | Sandbridge Technologies, Inc. | Method and apparatus for token triggered multithreading |
US6904511B2 (en) | 2002-10-11 | 2005-06-07 | Sandbridge Technologies, Inc. | Method and apparatus for register file port reduction in a multithreaded processor |
US6925643B2 (en) | 2002-10-11 | 2005-08-02 | Sandbridge Technologies, Inc. | Method and apparatus for thread-based memory access in a multithreaded processor |
CN1820246A (zh) | 2003-05-09 | 2006-08-16 | Sandbridge Technologies, Inc. | Processor reduction unit for accumulation of multiple operands with or without saturation |
2005
- 2005-04-01 US US11/096,921 patent/US7797363B2/en active Active
- 2005-04-07 KR KR1020067023224A patent/KR20060133086A/ko not_active Application Discontinuation
- 2005-04-07 WO PCT/US2005/011976 patent/WO2005101190A2/en active Application Filing
- 2005-04-07 JP JP2007507533A patent/JP2007533009A/ja active Pending
- 2005-04-07 TW TW094111014A patent/TW200604941A/zh unknown
- 2005-04-07 EP EP05734929A patent/EP1735694A4/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
KR20060133086A (ko) | 2006-12-22 |
US7797363B2 (en) | 2010-09-14 |
EP1735694A2 (en) | 2006-12-27 |
WO2005101190A3 (en) | 2007-03-22 |
US20060041610A1 (en) | 2006-02-23 |
JP2007533009A (ja) | 2007-11-15 |
WO2005101190A2 (en) | 2005-10-27 |
EP1735694A4 (en) | 2008-10-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
TW200604941A (en) | Processor having parallel vector multiply and reduce operations with sequential semantics | |
WO2004103056A3 (en) | Processor reduction unit for accumulation of multiple operands with or without saturation | |
TW200500940A (en) | Simd integer multiply high with round and shift | |
IN266871B (zh) | ||
USD713269S1 (en) | Wrist watch case | |
TW200607290A (en) | Facilitating access to input/output resources via an I/O partition shared by multiple consumer partitions | |
WO2009134927A3 (en) | Business software application system and method | |
ATE484789T1 (de) | Multiplier | |
WO2009120981A3 (en) | Vector instructions to enable efficient synchronization and parallel reduction operations | |
WO2007018467A8 (en) | Programmable digital signal processor having a clustered simd microarchitecture including a complex short multiplier and an independent vector load unit | |
TW200604990A (en) | Image signal processing device | |
WO2008118805A3 (en) | Processor with adaptive multi-shader | |
IES20080198A2 (en) | A processor | |
WO2012009252A3 (en) | Dynamic enabling and disabling of simd units in a graphics processor | |
WO2011156192A3 (en) | Calculator with dynamic computation environment | |
MX343892B (es) | Dispositivo de computo configurado con una red de tablas. | |
IL182962A0 (en) | A method for generating a composite image | |
WO2009132154A3 (en) | Server-controlled user interface | |
TW200734894A (en) | Virtual tree searcher using parallel tree search method | |
TW200509612A (en) | Data packet arithmetic logic devices and methods | |
USD766781S1 (en) | Ambulation aid | |
DE602007013456D1 (zh) | ||
WO2007055986A3 (en) | Input/query methods and apparatuses | |
EP1812928A4 (en) | VIDEO PROCESSING | |
WO2013111144A3 (en) | System for inserting services in a software application |