CN111213125B - Efficient direct convolution using SIMD instructions - Google Patents

Efficient direct convolution using SIMD instructions

Info

Publication number
CN111213125B
Authority
CN
China
Prior art keywords
vector
vectors
data
instruction
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880066852.7A
Other languages
English (en)
Chinese (zh)
Other versions
CN111213125A (zh)
Inventor
J. R. Diamond
A. P. Patel
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Priority to CN202311376759.5A (CN119556989A)
Publication of CN111213125A
Application granted
Publication of CN111213125B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885: Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887: Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/15: Correlation function computation including computation of convolution operations
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001: Arithmetic instructions
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032: Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036: Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Neurology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Advance Control (AREA)
  • Complex Calculations (AREA)
CN201880066852.7A 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions Active CN111213125B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311376759.5A CN119556989A (zh) 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201762556274P 2017-09-08 2017-09-08
US62/556,274 2017-09-08
US15/941,975 US11803377B2 (en) 2017-09-08 2018-03-30 Efficient direct convolution using SIMD instructions
US15/941,975 2018-03-30
PCT/US2018/049666 WO2019051027A1 (en) 2018-09-06 Efficient direct convolution using SIMD instructions

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202311376759.5A Division CN119556989A (zh) 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions

Publications (2)

Publication Number Publication Date
CN111213125A CN111213125A (zh) 2020-05-29
CN111213125B CN111213125B (zh) 2023-11-07

Family

ID=65631104

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201880066852.7A Active CN111213125B (zh) 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions
CN202311376759.5A Pending CN119556989A (zh) 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202311376759.5A Pending CN119556989A (zh) 2017-09-08 2018-09-06 Efficient direct convolution using SIMD instructions

Country Status (5)

Country Link
US (2) US11803377B2 (en)
EP (1) EP3676700B1 (en)
JP (2) JP7335231B2 (en)
CN (2) CN111213125B (en)
WO (1) WO2019051027A1 (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10747844B2 (en) * 2017-12-12 2020-08-18 Tesla, Inc. Systems and methods for converting a matrix input to a vectorized input for a matrix processor
US10565285B2 (en) * 2017-12-18 2020-02-18 International Business Machines Corporation Processor and memory transparent convolutional lowering and auto zero padding for deep neural network implementations
US12099912B2 (en) 2018-06-22 2024-09-24 Samsung Electronics Co., Ltd. Neural processor
CN111813447B (zh) * 2019-04-12 2022-11-08 Hangzhou C-SKY Microsystems Co., Ltd. Processing method and processing device for a data splicing instruction
US11671111B2 (en) 2019-04-17 2023-06-06 Samsung Electronics Co., Ltd. Hardware channel-parallel data compression/decompression
US11211944B2 (en) 2019-04-17 2021-12-28 Samsung Electronics Co., Ltd. Mixed-precision compression with random access
US11880760B2 (en) 2019-05-01 2024-01-23 Samsung Electronics Co., Ltd. Mixed-precision NPU tile with depth-wise convolution
US12182577B2 (en) 2019-05-01 2024-12-31 Samsung Electronics Co., Ltd. Neural-processing unit tile for shuffling queued nibbles for multiplication with non-zero weight nibbles
US20210049474A1 (en) * 2019-08-13 2021-02-18 Samsung Electronics Co., Ltd. Neural network method and apparatus
US11726950B2 (en) * 2019-09-28 2023-08-15 Intel Corporation Compute near memory convolution accelerator
US11475283B2 (en) * 2019-10-24 2022-10-18 Apple Inc. Multi dimensional convolution in neural network processor
US12112141B2 (en) 2019-12-12 2024-10-08 Samsung Electronics Co., Ltd. Accelerating 2D convolutional layer mapping on a dot product architecture
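CN111178505B (zh) * 2019-12-23 2023-04-07 Fujian Star-net eVideo Information System Co., Ltd. Acceleration method for convolutional neural network and computer-readable storage medium
CN111797985B (zh) * 2020-07-22 2022-11-22 Harbin Institute of Technology GPU-based memory access optimization method for convolution operations
KR102860334B1 (ko) * 2020-08-14 2025-09-16 Samsung Electronics Co., Ltd. Method and apparatus for processing convolution operations based on redundancy reduction
CN112633505B (zh) * 2020-12-24 2022-05-27 Suzhou Inspur Intelligent Technology Co., Ltd. RISC-V-based artificial intelligence inference method and system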
US12182570B2 (en) 2021-06-25 2024-12-31 Intel Corporation Apparatuses, methods, and systems for a packed data convolution instruction with shift control and width control
US12443412B2 (en) 2022-01-30 2025-10-14 Simplex Micro, Inc. Method and apparatus for a scalable microprocessor with time counter
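CN114443143B (zh) * 2022-01-30 2025-01-07 Shanghai Zhenliang Intelligent Technology Co., Ltd. Instruction processing method, apparatus, chip, electronic device, and storage medium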
US12190116B2 (en) 2022-04-05 2025-01-07 Simplex Micro, Inc. Microprocessor with time count based instruction execution and replay
US12169716B2 (en) 2022-04-20 2024-12-17 Simplex Micro, Inc. Microprocessor with a time counter for statically dispatching extended instructions
US12141580B2 (en) 2022-04-20 2024-11-12 Simplex Micro, Inc. Microprocessor with non-cacheable memory load prediction
US12288065B2 (en) 2022-04-29 2025-04-29 Simplex Micro, Inc. Microprocessor with odd and even register sets
US12124849B2 (en) * 2022-07-13 2024-10-22 Simplex Micro, Inc. Vector processor with extended vector registers
US12147812B2 (en) 2022-07-13 2024-11-19 Simplex Micro, Inc. Out-of-order execution of loop instructions in a microprocessor
US12282772B2 (en) 2022-07-13 2025-04-22 Simplex Micro, Inc. Vector processor with vector data buffer
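CN117313803B (zh) * 2023-11-28 2024-02-02 SpacemiT (Hangzhou) Technology Co., Ltd. Sliding-window 2D convolution computation method based on RISC-V vector processor architecture
CN119536744B (zh) * 2025-01-23 2025-06-17 Shandong Inspur Science Research Institute Co., Ltd. Automatic code vectorization optimization method, device, and medium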

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03189868A (ja) * 1989-12-20 1991-08-19 Akira Iwata Data processing processor
EP0681236A1 (en) * 1994-05-05 1995-11-08 Rockwell International Corporation Space vector data path
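CN1175731A (zh) * 1996-08-19 1998-03-11 Samsung Electronics Co., Ltd. Apparatus and method for efficient context saving and restoring in a multitasking computing system environment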
US5801975A (en) * 1996-12-02 1998-09-01 Compaq Computer Corporation And Advanced Micro Devices, Inc. Computer modified to perform inverse discrete cosine transform operations on a one-dimensional matrix of numbers within a minimal number of instruction cycles
US5909572A (en) * 1996-12-02 1999-06-01 Compaq Computer Corp. System and method for conditionally moving an operand from a source register to a destination register
US5933650A (en) * 1997-10-09 1999-08-03 Mips Technologies, Inc. Alignment and ordering of vector elements for single instruction multiple data processing
GB0226732D0 (en) * 2002-11-15 2002-12-24 Imagination Tech Ltd A configurable processor architecture
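CN1522401A (zh) * 2001-10-29 2004-08-18 Method and apparatus for parallel shift right merge of data
CN1577257A (zh) * 2003-06-30 2005-02-09 Intel Corporation SIMD integer multiply high with round and shift
CN101923534A (zh) * 2009-06-10 2010-12-22 Xin'aote (Beijing) Video Technology Co., Ltd. Method for convolving video and audio signals with a symmetric convolution kernel using the SSE instruction set
CN102495721A (zh) * 2011-12-02 2012-06-13 Nanjing University SIMD vector processor supporting FFT acceleration
CN104025033A (zh) * 2011-12-30 2014-09-03 Intel Corporation SIMD variable shift and rotate using control manipulation
CN104969215A (zh) * 2013-03-13 2015-10-07 Qualcomm Incorporated Vector processing engines having programmable data path configurations for providing multi-mode radix-2X butterfly vector processing circuits, and related vector processors, systems, and methods
CN105723333A (zh) * 2013-11-15 2016-06-29 Qualcomm Incorporated Vector processing engine with merging circuitry between execution units and vector data memory, and related methods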
EP3093757A2 (en) * 2015-05-11 2016-11-16 Ceva D.S.P. Ltd. Multi-dimensional sliding window operation for a vector processor
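CN106940815A (zh) * 2017-02-13 2017-07-11 Xi'an Jiaotong University Programmable convolutional neural network coprocessor IP core
CN106991473A (zh) * 2017-03-30 2017-07-28 National University of Defense Technology (PLA) SIMD-based average pooling parallel processing method for vector processors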

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6489868A (en) 1987-09-30 1989-04-05 Sony Corp Video signal processing circuit
US5734874A (en) * 1994-04-29 1998-03-31 Sun Microsystems, Inc. Central processing unit with integrated graphics functions
US7085795B2 (en) 2001-10-29 2006-08-01 Intel Corporation Apparatus and method for efficient filtering and convolution of content data
US6115812A (en) * 1998-04-01 2000-09-05 Intel Corporation Method and apparatus for efficient vertical SIMD computations
EP2241968B1 (en) * 1998-08-24 2012-06-27 MicroUnity Systems Engineering, Inc. System with wide operand architecture, and method
US7725521B2 (en) * 2001-10-29 2010-05-25 Intel Corporation Method and apparatus for computing matrix transformations
US6954841B2 (en) * 2002-06-26 2005-10-11 International Business Machines Corporation Viterbi decoding for SIMD vector processors with indirect vector element access
US7409415B2 (en) * 2002-12-20 2008-08-05 Texas Instruments Incorporated Processor system with efficient shift operations including EXTRACT operation
GB2409063B (en) 2003-12-09 2006-07-12 Advanced Risc Mach Ltd Vector by scalar operations
GB2409065B (en) * 2003-12-09 2006-10-25 Advanced Risc Mach Ltd Multiplexing operations in SIMD processing
US7328230B2 (en) * 2004-03-26 2008-02-05 Intel Corporation SIMD four-data element average instruction
US7315937B2 (en) * 2004-10-01 2008-01-01 Mips Technologies, Inc. Microprocessor instructions for efficient bit stream extractions
US7933405B2 (en) * 2005-04-08 2011-04-26 Icera Inc. Data access and permute unit
US7623732B1 (en) 2005-04-26 2009-11-24 Mercury Computer Systems, Inc. Method and apparatus for digital image filtering with discrete filter kernels using graphics hardware
US7529918B2 (en) * 2006-07-21 2009-05-05 Broadcom Corporation System and method for efficiently performing bit-field extraction and bit-field combination operations in a processor
US20080071851A1 (en) * 2006-09-20 2008-03-20 Ronen Zohar Instruction and logic for performing a dot-product operation
US8255884B2 (en) 2008-06-06 2012-08-28 International Business Machines Corporation Optimized scalar promotion with load and splat SIMD instructions
US20100180100A1 (en) 2009-01-13 2010-07-15 Mavrix Technology, Inc. Matrix microprocessor and method of operation
US8732437B2 (en) * 2010-01-26 2014-05-20 Oracle America, Inc. Low-overhead misalignment and reformatting support for SIMD
US9363068B2 (en) 2010-08-03 2016-06-07 Intel Corporation Vector processor having instruction set with sliding window non-linear convolutional function
US20120185670A1 (en) * 2011-01-14 2012-07-19 Toll Bret L Scalar integer instructions capable of execution with three registers
US20120254589A1 (en) * 2011-04-01 2012-10-04 Jesus Corbal San Adrian System, apparatus, and method for aligning registers
KR102207599B1 (ko) 2011-10-27 2021-01-26 Intel Corporation Block-based crest factor reduction
US9946540B2 (en) * 2011-12-23 2018-04-17 Intel Corporation Apparatus and method of improved permute instructions with multiple granularities
US9477999B2 (en) * 2013-09-20 2016-10-25 The Board Of Trustees Of The Leland Stanford Junior University Low power programmable image processor
US9442731B2 (en) * 2014-03-13 2016-09-13 Intel Corporation Packed two source inter-element shift merge processors, methods, systems, and instructions
US9582726B2 (en) * 2015-06-24 2017-02-28 Qualcomm Incorporated Systems and methods for image processing in a deep convolution network
US10459731B2 (en) * 2015-07-20 2019-10-29 Qualcomm Incorporated Sliding window operation
GB2540939B (en) * 2015-07-31 2019-01-23 Advanced Risc Mach Ltd An apparatus and method for performing a splice operation
US20170357894A1 (en) * 2016-06-10 2017-12-14 Apple Inc. Data packing for convolution of artificial neural networks
US10282204B2 (en) * 2016-07-02 2019-05-07 Intel Corporation Systems, apparatuses, and methods for strided load
US10824938B2 (en) * 2017-04-24 2020-11-03 Intel Corporation Specialized fixed function hardware for efficient convolution
JP6958027B2 (ja) * 2017-07-03 2021-11-02 Fujitsu Limited Arithmetic processing device and control method for arithmetic processing device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
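Research on the MCC-SIMD data-parallel convolution computation method; Zhang Facun, Zhao Xiaohong, Wang Zhong, Shen Xubang; Computer Engineering (No. 09); 34-36 *
Research on the application of SIMD technology in digital image processing (in English); Xin Mingrui, Gao Deyuan, Tong Fenghui; Microelectronics & Computer (No. 11); 164-168 *
Research on image convolution processor architecture based on SIMD technology; Tong Fenghui, Fan Xiaoya, Wang Danghui, Xin Mingrui; Microelectronics & Computer (No. 03); 13-16+20 *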

Also Published As

Publication number Publication date
EP3676700B1 (en) 2022-12-28
CN119556989A (zh) 2025-03-04
JP7652507B2 (ja) 2025-03-27
JP7335231B2 (ja) 2023-08-29
US20190079764A1 (en) 2019-03-14
WO2019051027A1 (en) 2019-03-14
US11803377B2 (en) 2023-10-31
CN111213125A (zh) 2020-05-29
US20240012644A1 (en) 2024-01-11
JP2023160833A (ja) 2023-11-02
JP2020533691A (ja) 2020-11-19
EP3676700A1 (en) 2020-07-08

Similar Documents

Publication Publication Date Title
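CN111213125B (zh) Efficient direct convolution using SIMD instructions
CN107408037B (zh) Monolithic vector processor configured to operate on variable-length vectors
TWI528276B (zh) Techniques for performing a multiply-multiply-accumulate instruction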
US11630997B2 (en) Method and apparatus with bit-serial data processing of a neural network
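CN112069459A (zh) Accelerator for sparse-dense matrix multiplication
CN104603746B (zh) Vector move instruction controlled by read and write masks
CN114341802B (zh) Methods for performing processing-in-memory operations, and related memory devices and systems
CN107533460B (zh) Packed finite impulse response (FIR) filter processors, methods, systems, and instructions
JP7385009B2 (ja) Compression assist instructions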
US20200265106A1 (en) Two-dimensional multi-layer convolution for deep learning
US9436465B2 (en) Moving average processing in processor and processor
KR20230109791A (ko) Packed data alignment plus compute instructions, processors, methods, and systems
US20240111530A1 (en) Matrix multiplication unit with flexible precision operations
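CN110235099A (zh) Apparatus and method for processing input operand values
CN114090954A (zh) Integer matrix multiplication kernel optimization method based on FT-2000+
CN112434255A (zh) Vector-matrix operation and data processing method, multiplier, and processor chip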
WO2024251385A1 (en) Indexed vector permutation, vector comparison, and/or population count operations
GB2523805A (en) Data processing apparatus and method for performing vector scan operation
US12493577B2 (en) Digital signal processor (DSP) and electronic device using the same
US20250258648A1 (en) Apparatus and method with in-register computing
WO2024250758A1 (zh) Complex-number data processing method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant