TW202316325A - Parallel depth-wise processing architectures for neural networks - Google Patents

Parallel depth-wise processing architectures for neural networks

Info

Publication number
TW202316325A
Authority
TW
Taiwan
Prior art keywords
output
processing
input
circuit
depth
Prior art date
Application number
TW111131684A
Other languages
English (en)
Chinese (zh)
Inventor
Mustafa Badaroglu
Zhongze Wang
Francois Ibrahim Atallah
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Incorporated
Publication of TW202316325A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional [3D] objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Neurology (AREA)
  • Multimedia (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Multi Processors (AREA)
TW111131684A 2021-09-02 2022-08-23 Parallel depth-wise processing architectures for neural networks TW202316325A (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/465,550 2021-09-02
US17/465,550 US20230065725A1 (en) 2021-09-02 2021-09-02 Parallel depth-wise processing architectures for neural networks

Publications (1)

Publication Number Publication Date
TW202316325A (zh) 2023-04-16

Family

ID=83506452

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111131684A TW202316325A (zh) 2021-09-02 2022-08-23 Parallel depth-wise processing architectures for neural networks

Country Status (7)

Country Link
US (1) US20230065725A1
EP (1) EP4396726B1
JP (1) JP2024537610A
KR (1) KR20240058084A
CN (1) CN117897708A
TW (1) TW202316325A
WO (1) WO2023034696A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI884084B (zh) * 2024-09-16 2025-05-11 National Yang Ming Chiao Tung University Compute-in-memory (CIM) device and method for training its scaling factors

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12340304B2 (en) * 2021-08-10 2025-06-24 Qualcomm Incorporated Partial sum management and reconfigurable systolic flow architectures for in-memory computation
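KR102706004B1 (ko) * 2021-12-21 2024-09-12 Nextchip Co., Ltd. Image processing method for controlling a vehicle and electronic device performing the method
KR20230123864A (ko) * 2022-02-17 2023-08-24 MakinaRocks Co., Ltd. Artificial intelligence-based semiconductor design method
KR20250041871A (ko) * 2023-09-19 2025-03-26 DeepX Co., Ltd. Technique for lowering the power of a neural processing unit using a variable frequency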

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7196708B2 (en) * 2004-03-31 2007-03-27 Sony Corporation Parallel vector processing
US20050226337A1 (en) * 2004-03-31 2005-10-13 Mikhail Dorojevets 2D block processing architecture
JP2010192983A (ja) * 2009-02-16 2010-09-02 Renesas Electronics Corp Filter processing device and semiconductor device
US8583720B2 (en) * 2010-02-10 2013-11-12 L3 Communications Integrated Systems, L.P. Reconfigurable networked processing elements partial differential equations system
US20180164866A1 (en) * 2016-12-13 2018-06-14 Qualcomm Incorporated Low-power architecture for sparse neural network
KR102642853B1 (ko) * 2017-01-05 2024-03-05 Electronics and Telecommunications Research Institute Convolution circuit, application processor including the same, and operating method thereof
US11507429B2 (en) * 2017-09-14 2022-11-22 Electronics And Telecommunications Research Institute Neural network accelerator including bidirectional processing element array
US10872290B2 (en) * 2017-09-21 2020-12-22 Raytheon Company Neural network processor with direct memory access and hardware acceleration circuits
US10768856B1 (en) * 2018-03-12 2020-09-08 Amazon Technologies, Inc. Memory access for multiple circuit components
US11475306B2 (en) * 2018-03-22 2022-10-18 Amazon Technologies, Inc. Processing for multiple input data sets
US12175356B2 (en) * 2018-05-15 2024-12-24 Mitsubishi Electric Corporation Arithmetic device
US11347916B1 (en) * 2019-06-28 2022-05-31 Amazon Technologies, Inc. Increasing positive clock skew for systolic array critical path
US20210012186A1 (en) * 2019-07-11 2021-01-14 Facebook Technologies, Llc Systems and methods for pipelined parallelism to accelerate distributed processing
US11842169B1 (en) * 2019-09-25 2023-12-12 Amazon Technologies, Inc. Systolic multiply delayed accumulate processor architecture
US11816446B2 (en) * 2019-11-27 2023-11-14 Amazon Technologies, Inc. Systolic array component combining multiple integer and floating-point data types
US11422773B1 (en) * 2020-06-29 2022-08-23 Amazon Technologies, Inc. Multiple busses within a systolic array processing element
US11308026B1 (en) * 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple busses interleaved in a systolic array
US11308027B1 (en) * 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple accumulate busses in a systolic array
US20220383081A1 (en) * 2021-05-28 2022-12-01 Meta Platforms Technologies, Llc Bandwidth-aware flexible-scheduling machine learning accelerator
US11880682B2 (en) * 2021-06-30 2024-01-23 Amazon Technologies, Inc. Systolic array with efficient input reduction and extended array performance
US11494627B1 (en) * 2021-07-08 2022-11-08 Hong Kong Applied Science and Technology Research Institute Company Limited Dynamic tile parallel neural network accelerator

Also Published As

Publication number Publication date
WO2023034696A1 (en) 2023-03-09
JP2024537610A (ja) 2024-10-16
KR20240058084A (ko) 2024-05-03
US20230065725A1 (en) 2023-03-02
EP4396726A1 (en) 2024-07-10
EP4396726B1 (en) 2026-04-22
CN117897708A (zh) 2024-04-16

Similar Documents

Publication Publication Date Title
TW202316325A (zh) Parallel depth-wise processing architectures for neural networks
US12223288B2 (en) Neural network processing unit including approximate multiplier and system on chip including the same
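TWI759361B (zh) Architecture, method, computer-readable medium, and apparatus for sparse neural network acceleration
CN111758107B (zh) System and method for hardware-based pooling
JP7321372B2 (ja) Method, apparatus, and computer program for compression of neural network models by micro-structured weight pruning and weight unification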
US11915118B2 (en) Method and apparatus for processing computation of zero value in processing of layers in neural network
EP4384899B1 (en) Partial sum management and reconfigurable systolic flow architectures for in-memory computation
US20230025068A1 (en) Hybrid machine learning architecture with neural processing unit and compute-in-memory processing elements
CN108805262A (zh) System and method for systolic array design from a high-level program
US10936943B2 (en) Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices
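CN108268931A (zh) Data processing method, apparatus, and system
TW202324210A (zh) Compute-in-memory (CIM) architecture and dataflow supporting depth-wise convolutional neural networks (CNN)
WO2023279069A1 (en) Compute in memory architecture and dataflows for depth-wise separable convolution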
Guo et al. BOOST: Block minifloat-based on-device CNN training accelerator with transfer learning
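CN116781484B (zh) Data processing method, apparatus, computer device, and storage medium
WO2022163861A1 (ja) Neural network generation device, neural network computing device, edge device, neural network control method, and software generation program
CN115699022A (zh) Structured convolutions and associated acceleration
JP7642919B2 (ja) Activation buffer architecture for data reuse in a neural network accelerator
CN116090518A (zh) Feature map processing method and apparatus based on a systolic operation array, and storage medium
TW202324205A (zh) Compute-in-memory architecture for phased depth-wise convolution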
Chen et al. M2M: Learning to Enhance Low-Light Image from Model to Mobile FPGA
WO2023004374A1 (en) Hybrid machine learning architecture with neural processing unit and compute-in-memory processing elements
TW202611728A (zh) Auto-indexing mechanism for reduction-based processing-in-memory architecture