JP2012084142A5 - Google Patents

Publication number
JP2012084142A5
Authority
JP
Japan
Prior art keywords
operand
mantissa
path
dfma
product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2011217575A
Other languages
English (en)
Japanese (ja)
Other versions
JP2012084142A (ja)
Filing date
Publication date
Priority claimed from US11/952,858 (US8106914B2)
Application filed
Publication of JP2012084142A
Publication of JP2012084142A5
Legal status: Pending

JP2011217575A 2007-12-07 2011-09-30 Fused multiply-add functional unit Pending JP2012084142A (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/952,858 2007-12-07
US11/952,858 US8106914B2 (en) 2007-12-07 2007-12-07 Fused multiply-add functional unit

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
JP2008302713A Division JP2009140491A (ja) 2007-12-07 2008-11-27 Fused multiply-add functional unit

Publications (2)

Publication Number Publication Date
JP2012084142A JP2012084142A (ja) 2012-04-26
JP2012084142A5 JP2012084142A5 2013-05-30

Family

ID=40230776

Family Applications (2)

Application Number Title Priority Date Filing Date
JP2008302713A Pending JP2009140491A (ja) 2007-12-07 2008-11-27 Fused multiply-add functional unit
JP2011217575A Pending JP2012084142A (ja) 2007-12-07 2011-09-30 Fused multiply-add functional unit

Family Applications Before (1)

Application Number Title Priority Date Filing Date
JP2008302713A Pending JP2009140491A (ja) 2007-12-07 2008-11-27 Fused multiply-add functional unit

Country Status (7)

Country Link
US (1) US8106914B2
JP (2) JP2009140491A
KR (1) KR101009095B1
CN (1) CN101452571B
DE (1) DE102008059371B9
GB (1) GB2455401B
TW (1) TWI402766B

Families Citing this family (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8190669B1 (en) 2004-10-20 2012-05-29 Nvidia Corporation Multipurpose arithmetic functional unit
US8037119B1 (en) 2006-02-21 2011-10-11 Nvidia Corporation Multipurpose functional unit with single-precision and double-precision operations
US8051123B1 (en) 2006-12-15 2011-11-01 Nvidia Corporation Multipurpose functional unit with double-precision and filtering operations
US8106914B2 (en) 2007-12-07 2012-01-31 Nvidia Corporation Fused multiply-add functional unit
US8289333B2 (en) * 2008-03-04 2012-10-16 Apple Inc. Multi-context graphics processing
US8477143B2 (en) 2008-03-04 2013-07-02 Apple Inc. Buffers for display acceleration
US8633936B2 (en) * 2008-04-21 2014-01-21 Qualcomm Incorporated Programmable streaming processor with mixed precision instruction execution
US8239441B2 (en) * 2008-05-15 2012-08-07 Oracle America, Inc. Leading zero estimation modification for unfused rounding catastrophic cancellation
US8495121B2 (en) * 2008-11-20 2013-07-23 Advanced Micro Devices, Inc. Arithmetic processing device and methods thereof
US20100125621A1 (en) * 2008-11-20 2010-05-20 Advanced Micro Devices, Inc. Arithmetic processing device and methods thereof
KR101511273B1 (ko) * 2008-12-29 2015-04-10 삼성전자주식회사 Method and system for 3D graphics rendering using a multi-core processor
US8803897B2 (en) * 2009-09-03 2014-08-12 Advanced Micro Devices, Inc. Internal, processing-unit memory for general-purpose use
US8990282B2 (en) * 2009-09-21 2015-03-24 Arm Limited Apparatus and method for performing fused multiply add floating point operation
US8745111B2 (en) 2010-11-16 2014-06-03 Apple Inc. Methods and apparatuses for converting floating point representations
KR101735677B1 (ko) 2010-11-17 2017-05-16 삼성전자주식회사 Floating-point compound arithmetic apparatus and arithmetic method thereof
US8752064B2 (en) * 2010-12-14 2014-06-10 Advanced Micro Devices, Inc. Optimizing communication of system call requests
US8965945B2 (en) * 2011-02-17 2015-02-24 Arm Limited Apparatus and method for performing floating point addition
DE102011108754A1 (de) * 2011-07-28 2013-01-31 Khs Gmbh Inspection unit
CN102750663A (zh) * 2011-08-26 2012-10-24 新奥特(北京)视频技术有限公司 GPU-based method, device, and system for processing geographic information data
US9792087B2 (en) 2012-04-20 2017-10-17 Futurewei Technologies, Inc. System and method for a floating-point format for digital signal processors
US9110713B2 (en) 2012-08-30 2015-08-18 Qualcomm Incorporated Microarchitecture for floating point fused multiply-add with exponent scaling
US9152382B2 (en) * 2012-10-31 2015-10-06 Intel Corporation Reducing power consumption in a fused multiply-add (FMA) unit responsive to input data values
US9665973B2 (en) * 2012-11-20 2017-05-30 Intel Corporation Depth buffering
US9019284B2 (en) 2012-12-20 2015-04-28 Nvidia Corporation Input output connector for accessing graphics fixed function units in a software-defined pipeline and a method of operating a pipeline
US9123128B2 (en) 2012-12-21 2015-09-01 Nvidia Corporation Graphics processing unit employing a standard processing unit and a method of constructing a graphics processing unit
US9317251B2 (en) 2012-12-31 2016-04-19 Nvidia Corporation Efficient correction of normalizer shift amount errors in fused multiply add operations
GB2511314A (en) 2013-02-27 2014-09-03 Ibm Fast fused-multiply-add pipeline
US9389871B2 (en) 2013-03-15 2016-07-12 Intel Corporation Combined floating point multiplier adder with intermediate rounding logic
US9465578B2 (en) * 2013-12-13 2016-10-11 Nvidia Corporation Logic circuitry configurable to perform 32-bit or dual 16-bit floating-point operations
US10297001B2 (en) * 2014-12-26 2019-05-21 Intel Corporation Reduced power implementation of computer instructions
KR102276910B1 (ko) 2015-01-06 2021-07-13 삼성전자주식회사 Tessellation apparatus and method
US11847427B2 (en) 2015-04-04 2023-12-19 Texas Instruments Incorporated Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor
US9817791B2 (en) 2015-04-04 2017-11-14 Texas Instruments Incorporated Low energy accelerator processor architecture with short parallel instruction word
US9952865B2 (en) 2015-04-04 2018-04-24 Texas Instruments Incorporated Low energy accelerator processor architecture with short parallel instruction word and non-orthogonal register data file
US10152310B2 (en) * 2015-05-27 2018-12-11 Nvidia Corporation Fusing a sequence of operations through subdividing
US10503474B2 (en) 2015-12-31 2019-12-10 Texas Instruments Incorporated Methods and instructions for 32-bit arithmetic support using 16-bit multiply and 32-bit addition
US10387988B2 (en) * 2016-02-26 2019-08-20 Google Llc Compiler techniques for mapping program code to a high performance, power efficient, programmable image processing hardware platform
US10282169B2 (en) 2016-04-06 2019-05-07 Apple Inc. Floating-point multiply-add with down-conversion
US10157059B2 (en) * 2016-09-29 2018-12-18 Intel Corporation Instruction and logic for early underflow detection and rounder bypass
US10401412B2 (en) 2016-12-16 2019-09-03 Texas Instruments Incorporated Line fault signature analysis
US10275391B2 (en) 2017-01-23 2019-04-30 International Business Machines Corporation Combining of several execution units to compute a single wide scalar result
GB2560766B (en) 2017-03-24 2019-04-03 Imagination Tech Ltd Floating point to fixed point conversion
US10409614B2 (en) 2017-04-24 2019-09-10 Intel Corporation Instructions having support for floating point and integer data types in the same register
US10489877B2 (en) 2017-04-24 2019-11-26 Intel Corporation Compute optimization mechanism
US10417731B2 (en) 2017-04-24 2019-09-17 Intel Corporation Compute optimization mechanism for deep neural networks
US10417734B2 (en) 2017-04-24 2019-09-17 Intel Corporation Compute optimization mechanism for deep neural networks
US10726514B2 (en) * 2017-04-28 2020-07-28 Intel Corporation Compute optimizations for low precision machine learning operations
US10474458B2 (en) 2017-04-28 2019-11-12 Intel Corporation Instructions and logic to perform floating-point and integer operations for machine learning
CN108595369B (zh) * 2018-04-28 2020-08-25 天津芯海创科技有限公司 Apparatus and method for parallel computation of arithmetic expressions
US10635439B2 (en) * 2018-06-13 2020-04-28 Samsung Electronics Co., Ltd. Efficient interface and transport mechanism for binding bindless shader programs to run-time specified graphics pipeline configurations and objects
CN108958705B (zh) * 2018-06-26 2021-11-12 飞腾信息技术有限公司 Floating-point fused multiply-adder supporting mixed data types and method of using the same
US11138009B2 (en) * 2018-08-10 2021-10-05 Nvidia Corporation Robust, efficient multiprocessor-coprocessor interface
US11093579B2 (en) * 2018-09-05 2021-08-17 Intel Corporation FP16-S7E8 mixed precision for deep learning and other algorithms
US11455766B2 (en) * 2018-09-18 2022-09-27 Advanced Micro Devices, Inc. Variable precision computing system
JP7115211B2 (ja) 2018-10-18 2022-08-09 富士通株式会社 Arithmetic processing device and method for controlling an arithmetic processing device
US11934342B2 (en) 2019-03-15 2024-03-19 Intel Corporation Assistance for hardware prefetch in cache access
ES3041900T3 (en) 2019-03-15 2025-11-17 Intel Corp Architecture for block sparse operations on a systolic array
DE112020001258T5 (de) 2019-03-15 2021-12-23 Intel Corporation Graphics processors and graphics processing units having dot-product accumulate instructions for a hybrid floating-point format
EP3938893B1 (en) 2019-03-15 2025-10-15 Intel Corporation Systems and methods for cache optimization
US11016765B2 (en) * 2019-04-29 2021-05-25 Micron Technology, Inc. Bit string operations using a computing tile
US10990389B2 (en) * 2019-04-29 2021-04-27 Micron Technology, Inc. Bit string operations using a computing tile
US11861761B2 (en) 2019-11-15 2024-01-02 Intel Corporation Graphics processing unit processing and caching improvements
US11663746B2 (en) 2019-11-15 2023-05-30 Intel Corporation Systolic arithmetic on sparse data
US11907713B2 (en) * 2019-12-28 2024-02-20 Intel Corporation Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator
US12020349B2 (en) * 2020-05-01 2024-06-25 Samsung Electronics Co., Ltd. Methods and apparatus for efficient blending in a graphics pipeline
CN111610955B (zh) * 2020-06-28 2022-06-03 中国人民解放军国防科技大学 Data saturation-and-packing processing component, chip, and device
US11386034B2 (en) * 2020-10-30 2022-07-12 Xilinx, Inc. High throughput circuit architecture for hardware acceleration
WO2022109115A1 (en) * 2020-11-19 2022-05-27 Google Llc Systolic array cells with output post-processing
US20230129750A1 (en) * 2021-10-27 2023-04-27 International Business Machines Corporation Performing a floating-point multiply-add operation in a computer implemented environment
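KR102839211B1 (ko) * 2022-09-05 2025-07-28 리벨리온 주식회사 Neural processing device, processing element included therein, and method for operating on various formats in a neural processing device
CN117908827A (zh) * 2022-10-19 2024-04-19 华为技术有限公司 Method and apparatus for floating-point data precision conversion
TWI882937B (zh) * 2024-12-04 2025-05-01 國立陽明交通大學 Circuit architecture suitable for switching between inverse square root and reciprocal operations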

Family Cites Families (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5241638A (en) * 1985-08-12 1993-08-31 Ceridian Corporation Dual cache memory
JPS6297060A (ja) * 1985-10-23 1987-05-06 Mitsubishi Electric Corp Digital signal processor
US4893268A (en) * 1988-04-15 1990-01-09 Motorola, Inc. Circuit and method for accumulating partial products of a single, double or mixed precision multiplication
US4972362A (en) * 1988-06-17 1990-11-20 Bipolar Integrated Technology, Inc. Method and apparatus for implementing binary multiplication using booth type multiplication
US5287511A (en) * 1988-07-11 1994-02-15 Star Semiconductor Corporation Architectures and methods for dividing processing tasks into tasks for a programmable real time signal processor and tasks for a decision making microprocessor interfacing therewith
US4969118A (en) * 1989-01-13 1990-11-06 International Business Machines Corporation Floating point unit for calculating A=XY+Z having simultaneous multiply and add
JPH0378083A (ja) * 1989-08-21 1991-04-03 Hitachi Ltd Double-precision arithmetic method and multiply-add device
JPH03100723A (ja) * 1989-09-13 1991-04-25 Fujitsu Ltd Processing method for precision conversion instructions
US5241636A (en) * 1990-02-14 1993-08-31 Intel Corporation Method for parallel instruction execution in a computer
US5068816A (en) * 1990-02-16 1991-11-26 Noetzel Andrew S Interpolating memory function evaluation
EP0474297B1 (en) * 1990-09-05 1998-06-10 Koninklijke Philips Electronics N.V. Very long instruction word machine for efficient execution of programs with conditional branches
JPH0612229A (ja) * 1992-06-10 1994-01-21 Nec Corp Multiply-accumulate circuit
DE69329260T2 (de) * 1992-06-25 2001-02-22 Canon K.K., Tokio/Tokyo Apparatus for multiplying integers with many digits
JPH0659862A (ja) * 1992-08-05 1994-03-04 Fujitsu Ltd Multiplier
US5581778A (en) * 1992-08-05 1996-12-03 David Sarnoff Research Center Advanced massively parallel computer using a field of the instruction to selectively enable the profiling counter to increase its value in response to the system clock
EP0622727A1 (en) * 1993-04-29 1994-11-02 International Business Machines Corporation System for optimizing argument reduction
EP0645699A1 (en) * 1993-09-29 1995-03-29 International Business Machines Corporation Fast multiply-add instruction sequence in a pipeline floating-point processor
US5487022A (en) * 1994-03-08 1996-01-23 Texas Instruments Incorporated Normalization method for floating point numbers
US5673407A (en) * 1994-03-08 1997-09-30 Texas Instruments Incorporated Data processor having capability to perform both floating point operations and memory access in response to a single instruction
US5553015A (en) 1994-04-15 1996-09-03 International Business Machines Corporation Efficient floating point overflow and underflow detection system
US5734874A (en) * 1994-04-29 1998-03-31 Sun Microsystems, Inc. Central processing unit with integrated graphics functions
JP3493064B2 (ja) 1994-09-14 2004-02-03 株式会社東芝 Barrel shifter
US5548545A (en) * 1995-01-19 1996-08-20 Exponential Technology, Inc. Floating point exception prediction for compound operations and variable precision using an intermediate exponent bus
US5701405A (en) * 1995-06-21 1997-12-23 Apple Computer, Inc. Method and apparatus for directly evaluating a parameter interpolation function used in rendering images in a graphics system that uses screen partitioning
US5778247A (en) 1996-03-06 1998-07-07 Sun Microsystems, Inc. Multi-pipeline microprocessor with data precision mode indicator
JP3790307B2 (ja) * 1996-10-16 2006-06-28 株式会社ルネサステクノロジ Data processor and data processing system
US6490607B1 (en) * 1998-01-28 2002-12-03 Advanced Micro Devices, Inc. Shared FP and SIMD 3D multiplier
US6061781A (en) * 1998-07-01 2000-05-09 Ip First Llc Concurrent execution of divide microinstructions in floating point unit and overflow detection microinstructions in integer unit for integer divide
JP2000081966A (ja) * 1998-07-09 2000-03-21 Matsushita Electric Ind Co Ltd Arithmetic device
JP3600026B2 (ja) * 1998-08-12 2004-12-08 株式会社東芝 Floating-point arithmetic unit
US6317133B1 (en) * 1998-09-18 2001-11-13 Ati Technologies, Inc. Graphics processor with variable performance characteristics
US6480872B1 (en) * 1999-01-21 2002-11-12 Sandcraft, Inc. Floating-point and integer multiply-add and multiply-accumulate
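JP2000293494A (ja) * 1999-04-09 2000-10-20 Fuji Xerox Co Ltd Parallel computing device and parallel computing method
JP2001236206A (ja) * 1999-10-01 2001-08-31 Hitachi Ltd Method for loading data and method for storing same, method for loading data words and method for storing same, and method for comparing floating-point numbers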
US6198488B1 (en) * 1999-12-06 2001-03-06 Nvidia Transform, lighting and rasterization system embodied on a single semiconductor platform
US6807620B1 (en) * 2000-02-11 2004-10-19 Sony Computer Entertainment Inc. Game system with graphics processor
US6557022B1 (en) * 2000-02-26 2003-04-29 Qualcomm, Incorporated Digital signal processor with coupled multiply-accumulate units
US6912557B1 (en) * 2000-06-09 2005-06-28 Cirrus Logic, Inc. Math coprocessor
JP2002008060A (ja) * 2000-06-23 2002-01-11 Hitachi Ltd Data processing method, recording medium, and data processing device
US6976043B2 (en) * 2001-07-30 2005-12-13 Ati Technologies Inc. Technique for approximating functions based on lagrange polynomials
JP3845009B2 (ja) * 2001-12-28 2006-11-15 富士通株式会社 Multiply-add device and multiply-add method
JP2003223316A (ja) 2002-01-31 2003-08-08 Matsushita Electric Ind Co Ltd Arithmetic processing device
AU2003250575A1 (en) * 2002-08-07 2004-02-25 Mmagix Technology Limited Apparatus, method and system for a synchronicity independent, resource delegating, power and instruction optimizing processor
US8549501B2 (en) * 2004-06-07 2013-10-01 International Business Machines Corporation Framework for generating mixed-mode operations in loop-level simdization
US7437538B1 (en) * 2004-06-30 2008-10-14 Sun Microsystems, Inc. Apparatus and method for reducing execution latency of floating point operations having special case operands
US7640285B1 (en) * 2004-10-20 2009-12-29 Nvidia Corporation Multipurpose arithmetic functional unit
WO2006053173A2 (en) * 2004-11-10 2006-05-18 Nvidia Corporation Multipurpose multiply-add functional unit
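KR20060044124A (ko) * 2004-11-11 2006-05-16 삼성전자주식회사 Graphics system and memory device for 3D graphics acceleration
JP4571903B2 (ja) * 2005-12-02 2010-10-27 富士通株式会社 Arithmetic processing device, information processing device, and arithmetic processing method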
US7728841B1 (en) 2005-12-19 2010-06-01 Nvidia Corporation Coherent shader output for multiple targets
US7747842B1 (en) * 2005-12-19 2010-06-29 Nvidia Corporation Configurable output buffer ganging for a parallel processor
US7484076B1 (en) * 2006-09-18 2009-01-27 Nvidia Corporation Executing an SIMD instruction requiring P operations on an execution unit that performs Q operations at a time (Q<P)
US7617384B1 (en) * 2006-11-06 2009-11-10 Nvidia Corporation Structured programming control flow using a disable mask in a SIMD architecture
JP4954799B2 (ja) 2007-06-05 2012-06-20 日本発條株式会社 Shock absorbing device
US8775777B2 (en) * 2007-08-15 2014-07-08 Nvidia Corporation Techniques for sourcing immediate values from a VLIW
US8106914B2 (en) 2007-12-07 2012-01-31 Nvidia Corporation Fused multiply-add functional unit

Similar Documents

Publication Publication Date Title
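JP2012084142A5
JP6495220B2 (ja) Floating-point processor with reduced power requirements for selectable subprecision
CN108287681B (zh) Single-precision floating-point fused dot-product arithmetic device
CN104520807B (zh) Microarchitecture for floating-point fused multiply-add with exponent scaling
CN104111816B (zh) Multifunctional SIMD-structure floating-point fused multiply-add arithmetic device in GPDSP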
US6697832B1 (en) Floating-point processor with improved intermediate result handling
US8606840B2 (en) Apparatus and method for floating-point fused multiply add
US9696964B2 (en) Multiply adder
GB2497469B (en) Multiply add functional unit capable of executing scale, round, getexp, round, getmant, reduce, range and class instructions
JP5640081B2 (ja) Integer multiply and multiply-add operations with saturation
CN101133389A (zh) Multipurpose multiply-add functional unit
Patil et al. Out of order floating point coprocessor for RISC V ISA
US20100125621A1 (en) Arithmetic processing device and methods thereof
CN104991757A (zh) Floating-point processing method and floating-point processor
Quinnell et al. Bridge floating-point fused multiply-add design
Brunie et al. A mixed-precision fused multiply and add
Tsen et al. A combined decimal and binary floating-point multiplier
US8244783B2 (en) Normalizer shift prediction for log estimate instructions
JP2010218197A (ja) Floating-point multiply-add device, floating-point multiply-add method, and program for floating-point multiply-add
Liu et al. A multi-functional floating point multiplier
Nasiri et al. Modified fused multiply-accumulate chained unit
Dhanabal et al. Implementation of low power and area efficient floating-point fused multiply-add unit
Mangalath et al. An efficient universal multi-mode floating point multiplier using Vedic mathematics
Rao Implementation of the IEEE standard binary floating‐point arithmetic unit
Chittaluri Implementation of area efficient IEEE-754 double precision floating point arithmetic unit using Verilog