TWI402766B - Graphics processor - Google Patents
Graphics processor
- Publication number
- TWI402766B
- Authority
- TW
- Taiwan
- Prior art keywords
- precision
- operand
- operations
- dfma
- path
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/14—Digital output to display device ; Cooperation and interconnection of the display device with other functional units
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/30105—Register structure
- G06F9/30112—Register structure comprising data of variable length
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/30123—Organisation of register space, e.g. banked or distributed register file according to context, e.g. thread buffers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3888—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple threads [SIMT] in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Multimedia (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Computing Systems (AREA)
- Human Computer Interaction (AREA)
- Image Processing (AREA)
- Image Generation (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/952,858 US8106914B2 (en) | 2007-12-07 | 2007-12-07 | Fused multiply-add functional unit |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW200937341A (en) | 2009-09-01 |
| TWI402766B (zh) | 2013-07-21 |
Family
ID=40230776
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW097147390A TWI402766B (zh) | 2007-12-07 | 2008-12-05 | Graphics processor |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US8106914B2 (en) |
| JP (2) | JP2009140491A (en) |
| KR (1) | KR101009095B1 (en) |
| CN (1) | CN101452571B (en) |
| DE (1) | DE102008059371B9 (en) |
| GB (1) | GB2455401B (en) |
| TW (1) | TWI402766B (en) |
Families Citing this family (72)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8190669B1 (en) | 2004-10-20 | 2012-05-29 | Nvidia Corporation | Multipurpose arithmetic functional unit |
| US8037119B1 (en) | 2006-02-21 | 2011-10-11 | Nvidia Corporation | Multipurpose functional unit with single-precision and double-precision operations |
| US8051123B1 (en) | 2006-12-15 | 2011-11-01 | Nvidia Corporation | Multipurpose functional unit with double-precision and filtering operations |
| US8106914B2 (en) | 2007-12-07 | 2012-01-31 | Nvidia Corporation | Fused multiply-add functional unit |
| US8289333B2 (en) * | 2008-03-04 | 2012-10-16 | Apple Inc. | Multi-context graphics processing |
| US8477143B2 (en) | 2008-03-04 | 2013-07-02 | Apple Inc. | Buffers for display acceleration |
| US8633936B2 (en) * | 2008-04-21 | 2014-01-21 | Qualcomm Incorporated | Programmable streaming processor with mixed precision instruction execution |
| US8239441B2 (en) * | 2008-05-15 | 2012-08-07 | Oracle America, Inc. | Leading zero estimation modification for unfused rounding catastrophic cancellation |
| US8495121B2 (en) * | 2008-11-20 | 2013-07-23 | Advanced Micro Devices, Inc. | Arithmetic processing device and methods thereof |
| US20100125621A1 (en) * | 2008-11-20 | 2010-05-20 | Advanced Micro Devices, Inc. | Arithmetic processing device and methods thereof |
| KR101511273B1 (ko) * | 2008-12-29 | 2015-04-10 | 삼성전자주식회사 | 3D graphics rendering method and system using a multi-core processor |
| US8803897B2 (en) * | 2009-09-03 | 2014-08-12 | Advanced Micro Devices, Inc. | Internal, processing-unit memory for general-purpose use |
| US8990282B2 (en) * | 2009-09-21 | 2015-03-24 | Arm Limited | Apparatus and method for performing fused multiply add floating point operation |
| US8745111B2 (en) | 2010-11-16 | 2014-06-03 | Apple Inc. | Methods and apparatuses for converting floating point representations |
| KR101735677B1 (ko) | 2010-11-17 | 2017-05-16 | 삼성전자주식회사 | Floating-point compound arithmetic apparatus and operation method thereof |
| US8752064B2 (en) * | 2010-12-14 | 2014-06-10 | Advanced Micro Devices, Inc. | Optimizing communication of system call requests |
| US8965945B2 (en) * | 2011-02-17 | 2015-02-24 | Arm Limited | Apparatus and method for performing floating point addition |
| DE102011108754A1 (de) * | 2011-07-28 | 2013-01-31 | Khs Gmbh | Inspection unit |
| CN102750663A (zh) * | 2011-08-26 | 2012-10-24 | 新奥特(北京)视频技术有限公司 | Method, device, and system for GPU-based geographic information data processing |
| US9792087B2 (en) | 2012-04-20 | 2017-10-17 | Futurewei Technologies, Inc. | System and method for a floating-point format for digital signal processors |
| US9110713B2 (en) | 2012-08-30 | 2015-08-18 | Qualcomm Incorporated | Microarchitecture for floating point fused multiply-add with exponent scaling |
| US9152382B2 (en) * | 2012-10-31 | 2015-10-06 | Intel Corporation | Reducing power consumption in a fused multiply-add (FMA) unit responsive to input data values |
| US9665973B2 (en) * | 2012-11-20 | 2017-05-30 | Intel Corporation | Depth buffering |
| US9019284B2 (en) | 2012-12-20 | 2015-04-28 | Nvidia Corporation | Input output connector for accessing graphics fixed function units in a software-defined pipeline and a method of operating a pipeline |
| US9123128B2 (en) | 2012-12-21 | 2015-09-01 | Nvidia Corporation | Graphics processing unit employing a standard processing unit and a method of constructing a graphics processing unit |
| US9317251B2 (en) | 2012-12-31 | 2016-04-19 | Nvidia Corporation | Efficient correction of normalizer shift amount errors in fused multiply add operations |
| GB2511314A (en) | 2013-02-27 | 2014-09-03 | Ibm | Fast fused-multiply-add pipeline |
| US9389871B2 (en) | 2013-03-15 | 2016-07-12 | Intel Corporation | Combined floating point multiplier adder with intermediate rounding logic |
| US9465578B2 (en) * | 2013-12-13 | 2016-10-11 | Nvidia Corporation | Logic circuitry configurable to perform 32-bit or dual 16-bit floating-point operations |
| US10297001B2 (en) * | 2014-12-26 | 2019-05-21 | Intel Corporation | Reduced power implementation of computer instructions |
| KR102276910B1 (ko) | 2015-01-06 | 2021-07-13 | 삼성전자주식회사 | Tessellation apparatus and method |
| US11847427B2 (en) | 2015-04-04 | 2023-12-19 | Texas Instruments Incorporated | Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor |
| US9817791B2 (en) | 2015-04-04 | 2017-11-14 | Texas Instruments Incorporated | Low energy accelerator processor architecture with short parallel instruction word |
| US9952865B2 (en) | 2015-04-04 | 2018-04-24 | Texas Instruments Incorporated | Low energy accelerator processor architecture with short parallel instruction word and non-orthogonal register data file |
| US10152310B2 (en) * | 2015-05-27 | 2018-12-11 | Nvidia Corporation | Fusing a sequence of operations through subdividing |
| US10503474B2 (en) | 2015-12-31 | 2019-12-10 | Texas Instruments Incorporated | Methods and instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition |
| US10387988B2 (en) * | 2016-02-26 | 2019-08-20 | Google Llc | Compiler techniques for mapping program code to a high performance, power efficient, programmable image processing hardware platform |
| US10282169B2 (en) | 2016-04-06 | 2019-05-07 | Apple Inc. | Floating-point multiply-add with down-conversion |
| US10157059B2 (en) * | 2016-09-29 | 2018-12-18 | Intel Corporation | Instruction and logic for early underflow detection and rounder bypass |
| US10401412B2 (en) | 2016-12-16 | 2019-09-03 | Texas Instruments Incorporated | Line fault signature analysis |
| US10275391B2 (en) | 2017-01-23 | 2019-04-30 | International Business Machines Corporation | Combining of several execution units to compute a single wide scalar result |
| GB2560766B (en) | 2017-03-24 | 2019-04-03 | Imagination Tech Ltd | Floating point to fixed point conversion |
| US10409614B2 (en) | 2017-04-24 | 2019-09-10 | Intel Corporation | Instructions having support for floating point and integer data types in the same register |
| US10489877B2 (en) | 2017-04-24 | 2019-11-26 | Intel Corporation | Compute optimization mechanism |
| US10417731B2 (en) | 2017-04-24 | 2019-09-17 | Intel Corporation | Compute optimization mechanism for deep neural networks |
| US10417734B2 (en) | 2017-04-24 | 2019-09-17 | Intel Corporation | Compute optimization mechanism for deep neural networks |
| US10726514B2 (en) * | 2017-04-28 | 2020-07-28 | Intel Corporation | Compute optimizations for low precision machine learning operations |
| US10474458B2 (en) | 2017-04-28 | 2019-11-12 | Intel Corporation | Instructions and logic to perform floating-point and integer operations for machine learning |
| CN108595369B (zh) * | 2018-04-28 | 2020-08-25 | 天津芯海创科技有限公司 | Apparatus and method for parallel computation of arithmetic expressions |
| US10635439B2 (en) * | 2018-06-13 | 2020-04-28 | Samsung Electronics Co., Ltd. | Efficient interface and transport mechanism for binding bindless shader programs to run-time specified graphics pipeline configurations and objects |
| CN108958705B (zh) * | 2018-06-26 | 2021-11-12 | 飞腾信息技术有限公司 | Floating-point fused multiply-adder supporting mixed data types and application method thereof |
| US11138009B2 (en) * | 2018-08-10 | 2021-10-05 | Nvidia Corporation | Robust, efficient multiprocessor-coprocessor interface |
| US11093579B2 (en) * | 2018-09-05 | 2021-08-17 | Intel Corporation | FP16-S7E8 mixed precision for deep learning and other algorithms |
| US11455766B2 (en) * | 2018-09-18 | 2022-09-27 | Advanced Micro Devices, Inc. | Variable precision computing system |
| JP7115211B2 (ja) | 2018-10-18 | 2022-08-09 | 富士通株式会社 | Arithmetic processing device and control method for arithmetic processing device |
| US11934342B2 (en) | 2019-03-15 | 2024-03-19 | Intel Corporation | Assistance for hardware prefetch in cache access |
| ES3041900T3 (en) | 2019-03-15 | 2025-11-17 | Intel Corp | Architecture for block sparse operations on a systolic array |
| DE112020001258T5 (de) | 2019-03-15 | 2021-12-23 | Intel Corporation | Graphics processors and graphics processing units with dot product accumulate instructions for a hybrid floating-point format |
| EP3938893B1 (en) | 2019-03-15 | 2025-10-15 | Intel Corporation | Systems and methods for cache optimization |
| US11016765B2 (en) * | 2019-04-29 | 2021-05-25 | Micron Technology, Inc. | Bit string operations using a computing tile |
| US10990389B2 (en) * | 2019-04-29 | 2021-04-27 | Micron Technology, Inc. | Bit string operations using a computing tile |
| US11861761B2 (en) | 2019-11-15 | 2024-01-02 | Intel Corporation | Graphics processing unit processing and caching improvements |
| US11663746B2 (en) | 2019-11-15 | 2023-05-30 | Intel Corporation | Systolic arithmetic on sparse data |
| US11907713B2 (en) * | 2019-12-28 | 2024-02-20 | Intel Corporation | Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator |
| US12020349B2 (en) * | 2020-05-01 | 2024-06-25 | Samsung Electronics Co., Ltd. | Methods and apparatus for efficient blending in a graphics pipeline |
| CN111610955B (zh) * | 2020-06-28 | 2022-06-03 | 中国人民解放军国防科技大学 | Data saturation and packing processing component, chip, and device |
| US11386034B2 (en) * | 2020-10-30 | 2022-07-12 | Xilinx, Inc. | High throughput circuit architecture for hardware acceleration |
| WO2022109115A1 (en) * | 2020-11-19 | 2022-05-27 | Google Llc | Systolic array cells with output post-processing |
| US20230129750A1 (en) * | 2021-10-27 | 2023-04-27 | International Business Machines Corporation | Performing a floating-point multiply-add operation in a computer implemented environment |
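| KR102839211B1 (ko) * | 2022-09-05 | 2025-07-28 | 리벨리온 주식회사 | Neural processing device, processing element included therein, and method for operating on various formats in a neural processing device |
| CN117908827A (zh) * | 2022-10-19 | 2024-04-19 | 华为技术有限公司 | Floating-point data precision conversion method and apparatus |
| TWI882937B (zh) * | 2024-12-04 | 2025-05-01 | 國立陽明交通大學 | Circuit architecture for switching between inverse square root and reciprocal operations |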
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4972362A (en) * | 1988-06-17 | 1990-11-20 | Bipolar Integrated Technology, Inc. | Method and apparatus for implementing binary multiplication using booth type multiplication |
| US5487022A (en) * | 1994-03-08 | 1996-01-23 | Texas Instruments Incorporated | Normalization method for floating point numbers |
| US6061781A (en) * | 1998-07-01 | 2000-05-09 | Ip First Llc | Concurrent execution of divide microinstructions in floating point unit and overflow detection microinstructions in integer unit for integer divide |
| US20050235134A1 (en) * | 2002-08-07 | 2005-10-20 | Mmagix Technology Limited | Apparatus, method and system for a synchronicity independent, resource delegating, power and instruction optimizing processor |
Family Cites Families (52)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5241638A (en) * | 1985-08-12 | 1993-08-31 | Ceridian Corporation | Dual cache memory |
| JPS6297060A (ja) * | 1985-10-23 | 1987-05-06 | Mitsubishi Electric Corp | Digital signal processor |
| US4893268A (en) * | 1988-04-15 | 1990-01-09 | Motorola, Inc. | Circuit and method for accumulating partial products of a single, double or mixed precision multiplication |
| US5287511A (en) * | 1988-07-11 | 1994-02-15 | Star Semiconductor Corporation | Architectures and methods for dividing processing tasks into tasks for a programmable real time signal processor and tasks for a decision making microprocessor interfacing therewith |
| US4969118A (en) * | 1989-01-13 | 1990-11-06 | International Business Machines Corporation | Floating point unit for calculating A=XY+Z having simultaneous multiply and add |
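| JPH0378083A (ja) * | 1989-08-21 | 1991-04-03 | Hitachi Ltd | Double-precision arithmetic method and multiply-accumulate arithmetic device |
| JPH03100723A (ja) * | 1989-09-13 | 1991-04-25 | Fujitsu Ltd | Processing method for precision conversion instructions |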
| US5241636A (en) * | 1990-02-14 | 1993-08-31 | Intel Corporation | Method for parallel instruction execution in a computer |
| US5068816A (en) * | 1990-02-16 | 1991-11-26 | Noetzel Andrew S | Interpolating memory function evaluation |
| EP0474297B1 (en) * | 1990-09-05 | 1998-06-10 | Koninklijke Philips Electronics N.V. | Very long instruction word machine for efficient execution of programs with conditional branches |
| JPH0612229A (ja) * | 1992-06-10 | 1994-01-21 | Nec Corp | Multiply-accumulate circuit |
| DE69329260T2 (de) * | 1992-06-25 | 2001-02-22 | Canon K.K., Tokio/Tokyo | Apparatus for multiplying integers of many digits |
| JPH0659862A (ja) * | 1992-08-05 | 1994-03-04 | Fujitsu Ltd | Multiplier |
| US5581778A (en) * | 1992-08-05 | 1996-12-03 | David Sarnoff Research Center | Advanced massively parallel computer using a field of the instruction to selectively enable the profiling counter to increase its value in response to the system clock |
| EP0622727A1 (en) * | 1993-04-29 | 1994-11-02 | International Business Machines Corporation | System for optimizing argument reduction |
| EP0645699A1 (en) * | 1993-09-29 | 1995-03-29 | International Business Machines Corporation | Fast multiply-add instruction sequence in a pipeline floating-point processor |
| US5673407A (en) * | 1994-03-08 | 1997-09-30 | Texas Instruments Incorporated | Data processor having capability to perform both floating point operations and memory access in response to a single instruction |
| US5553015A (en) | 1994-04-15 | 1996-09-03 | International Business Machines Corporation | Efficient floating point overflow and underflow detection system |
| US5734874A (en) * | 1994-04-29 | 1998-03-31 | Sun Microsystems, Inc. | Central processing unit with integrated graphics functions |
| JP3493064B2 (ja) | 1994-09-14 | 2004-02-03 | 株式会社東芝 | Barrel shifter |
| US5548545A (en) * | 1995-01-19 | 1996-08-20 | Exponential Technology, Inc. | Floating point exception prediction for compound operations and variable precision using an intermediate exponent bus |
| US5701405A (en) * | 1995-06-21 | 1997-12-23 | Apple Computer, Inc. | Method and apparatus for directly evaluating a parameter interpolation function used in rendering images in a graphics system that uses screen partitioning |
| US5778247A (en) | 1996-03-06 | 1998-07-07 | Sun Microsystems, Inc. | Multi-pipeline microprocessor with data precision mode indicator |
| JP3790307B2 (ja) * | 1996-10-16 | 2006-06-28 | 株式会社ルネサステクノロジ | Data processor and data processing system |
| US6490607B1 (en) * | 1998-01-28 | 2002-12-03 | Advanced Micro Devices, Inc. | Shared FP and SIMD 3D multiplier |
| JP2000081966A (ja) * | 1998-07-09 | 2000-03-21 | Matsushita Electric Ind Co Ltd | Arithmetic device |
| JP3600026B2 (ja) * | 1998-08-12 | 2004-12-08 | 株式会社東芝 | Floating-point arithmetic unit |
| US6317133B1 (en) * | 1998-09-18 | 2001-11-13 | Ati Technologies, Inc. | Graphics processor with variable performance characteristics |
| US6480872B1 (en) * | 1999-01-21 | 2002-11-12 | Sandcraft, Inc. | Floating-point and integer multiply-add and multiply-accumulate |
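| JP2000293494A (ja) * | 1999-04-09 | 2000-10-20 | Fuji Xerox Co Ltd | Parallel computing device and parallel computing method |
| JP2001236206A (ja) * | 1999-10-01 | 2001-08-31 | Hitachi Ltd | Method for loading data and method for storing the same, method for loading data words and method for storing the same, and method for comparing floating-point numbers |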
| US6198488B1 (en) * | 1999-12-06 | 2001-03-06 | Nvidia | Transform, lighting and rasterization system embodied on a single semiconductor platform |
| US6807620B1 (en) * | 2000-02-11 | 2004-10-19 | Sony Computer Entertainment Inc. | Game system with graphics processor |
| US6557022B1 (en) * | 2000-02-26 | 2003-04-29 | Qualcomm, Incorporated | Digital signal processor with coupled multiply-accumulate units |
| US6912557B1 (en) * | 2000-06-09 | 2005-06-28 | Cirrus Logic, Inc. | Math coprocessor |
| JP2002008060A (ja) * | 2000-06-23 | 2002-01-11 | Hitachi Ltd | Data processing method, recording medium, and data processing device |
| US6976043B2 (en) * | 2001-07-30 | 2005-12-13 | Ati Technologies Inc. | Technique for approximating functions based on lagrange polynomials |
| JP3845009B2 (ja) * | 2001-12-28 | 2006-11-15 | 富士通株式会社 | Multiply-accumulate arithmetic device and multiply-accumulate arithmetic method |
| JP2003223316A (ja) | 2002-01-31 | 2003-08-08 | Matsushita Electric Ind Co Ltd | Arithmetic processing device |
| US8549501B2 (en) * | 2004-06-07 | 2013-10-01 | International Business Machines Corporation | Framework for generating mixed-mode operations in loop-level simdization |
| US7437538B1 (en) * | 2004-06-30 | 2008-10-14 | Sun Microsystems, Inc. | Apparatus and method for reducing execution latency of floating point operations having special case operands |
| US7640285B1 (en) * | 2004-10-20 | 2009-12-29 | Nvidia Corporation | Multipurpose arithmetic functional unit |
| WO2006053173A2 (en) * | 2004-11-10 | 2006-05-18 | Nvidia Corporation | Multipurpose multiply-add functional unit |
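| KR20060044124A (ko) * | 2004-11-11 | 2006-05-16 | 삼성전자주식회사 | Graphics system and memory device for 3D graphics acceleration |
| JP4571903B2 (ja) * | 2005-12-02 | 2010-10-27 | 富士通株式会社 | Arithmetic processing device, information processing device, and arithmetic processing method |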
| US7728841B1 (en) | 2005-12-19 | 2010-06-01 | Nvidia Corporation | Coherent shader output for multiple targets |
| US7747842B1 (en) * | 2005-12-19 | 2010-06-29 | Nvidia Corporation | Configurable output buffer ganging for a parallel processor |
| US7484076B1 (en) * | 2006-09-18 | 2009-01-27 | Nvidia Corporation | Executing an SIMD instruction requiring P operations on an execution unit that performs Q operations at a time (Q<P) |
| US7617384B1 (en) * | 2006-11-06 | 2009-11-10 | Nvidia Corporation | Structured programming control flow using a disable mask in a SIMD architecture |
| JP4954799B2 (ja) | 2007-06-05 | 2012-06-20 | 日本発條株式会社 | Shock absorbing device |
| US8775777B2 (en) * | 2007-08-15 | 2014-07-08 | Nvidia Corporation | Techniques for sourcing immediate values from a VLIW |
| US8106914B2 (en) | 2007-12-07 | 2012-01-31 | Nvidia Corporation | Fused multiply-add functional unit |
-
2007
- 2007-12-07 US US11/952,858 patent/US8106914B2/en active Active
-
2008
- 2008-11-25 GB GB0821495A patent/GB2455401B/en active Active
- 2008-11-27 JP JP2008302713A patent/JP2009140491A/ja active Pending
- 2008-11-28 DE DE102008059371A patent/DE102008059371B9/de active Active
- 2008-12-04 CN CN2008101825044A patent/CN101452571B/zh not_active Expired - Fee Related
- 2008-12-05 TW TW097147390A patent/TWI402766B/zh active
- 2008-12-08 KR KR1020080124099A patent/KR101009095B1/ko active Active
-
2011
- 2011-09-30 JP JP2011217575A patent/JP2012084142A/ja active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| TW200937341A (en) | 2009-09-01 |
| CN101452571B (zh) | 2012-04-25 |
| DE102008059371B4 (de) | 2012-03-08 |
| JP2009140491A (ja) | 2009-06-25 |
| GB2455401B (en) | 2010-05-05 |
| US20090150654A1 (en) | 2009-06-11 |
| DE102008059371A1 (de) | 2009-06-25 |
| US8106914B2 (en) | 2012-01-31 |
| KR101009095B1 (ko) | 2011-01-18 |
| KR20090060207A (ko) | 2009-06-11 |
| GB2455401A (en) | 2009-06-10 |
| GB0821495D0 (en) | 2008-12-31 |
| CN101452571A (zh) | 2009-06-10 |
| JP2012084142A (ja) | 2012-04-26 |
| DE102008059371B9 (de) | 2012-06-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| TWI402766B (zh) | | Graphics processor |
| US11797303B2 (en) | | Generalized acceleration of matrix multiply accumulate operations |
| US12321743B2 (en) | | Generalized acceleration of matrix multiply accumulate operations |
| US7225323B2 (en) | | Multi-purpose floating point and integer multiply-add functional unit with multiplication-comparison test addition and exponent pipelines |
| US7428566B2 (en) | | Multipurpose functional unit with multiply-add and format conversion pipeline |
| US20060101244A1 (en) | | Multipurpose functional unit with combined integer and floating-point multiply-add pipeline |
| US8037119B1 (en) | | Multipurpose functional unit with single-precision and double-precision operations |
| US7640285B1 (en) | | Multipurpose arithmetic functional unit |
| US8051123B1 (en) | | Multipurpose functional unit with double-precision and filtering operations |
| US8190669B1 (en) | | Multipurpose arithmetic functional unit |
| KR100911786B1 (ko) | | Multipurpose multiply-add functional unit |
| US7240184B2 (en) | | Multipurpose functional unit with multiplication pipeline, addition pipeline, addition pipeline and logical test pipeline capable of performing integer multiply-add operations |