CN117897708A - Parallel depth-wise processing architectures for neural networks - Google Patents

Parallel depth-wise processing architectures for neural networks

Publication number
CN117897708A
Authority
CN
China
Prior art keywords
output
processing
circuit
input
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280058042.3A
Other languages
English (en)
Chinese (zh)
Inventor
M. Badaroglu
Z. Wang
F. I. Atallah
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN117897708A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional [3D] objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Neurology (AREA)
  • Multimedia (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Multi Processors (AREA)
CN202280058042.3A 2021-09-02 2022-08-22 Parallel depth-wise processing architectures for neural networks Pending CN117897708A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/465,550 2021-09-02
US17/465,550 US20230065725A1 (en) 2021-09-02 2021-09-02 Parallel depth-wise processing architectures for neural networks
PCT/US2022/075255 WO2023034696A1 (en) 2021-09-02 2022-08-22 Parallel depth-wise processing architectures for neural networks

Publications (1)

Publication Number Publication Date
CN117897708A (zh) 2024-04-16

Family

ID=83506452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280058042.3A Pending CN117897708A (zh) 2021-09-02 2022-08-22 Parallel depth-wise processing architectures for neural networks

Country Status (7)

Country Link
US (1) US20230065725A1
EP (1) EP4396726B1
JP (1) JP2024537610A
KR (1) KR20240058084A
CN (1) CN117897708A
TW (1) TW202316325A
WO (1) WO2023034696A1

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12340304B2 (en) * 2021-08-10 2025-06-24 Qualcomm Incorporated Partial sum management and reconfigurable systolic flow architectures for in-memory computation
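KR102706004B1 (ko) * 2021-12-21 2024-09-12 Nextchip Co., Ltd. Image processing method for controlling a vehicle and electronic device performing the method
KR20230123864A (ko) * 2022-02-17 2023-08-24 MakinaRocks Co., Ltd. Artificial intelligence-based semiconductor design method
KR20250041871A (ko) * 2023-09-19 2025-03-26 DeepX Co., Ltd. Technique for lowering the power of a neural processing unit using a variable frequency
TWI884084B (zh) * 2024-09-16 2025-05-11 National Yang Ming Chiao Tung University Compute-in-memory (CIM) device and training method for its scaling factor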

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7196708B2 (en) * 2004-03-31 2007-03-27 Sony Corporation Parallel vector processing
US20050226337A1 (en) * 2004-03-31 2005-10-13 Mikhail Dorojevets 2D block processing architecture
JP2010192983A (ja) * 2009-02-16 2010-09-02 Renesas Electronics Corp Filter processing device and semiconductor device
US8583720B2 (en) * 2010-02-10 2013-11-12 L3 Communications Integrated Systems, L.P. Reconfigurable networked processing elements partial differential equations system
US20180164866A1 (en) * 2016-12-13 2018-06-14 Qualcomm Incorporated Low-power architecture for sparse neural network
KR102642853B1 (ko) * 2017-01-05 2024-03-05 Electronics And Telecommunications Research Institute Convolution circuit, application processor including the same, and operating method thereof
US11507429B2 (en) * 2017-09-14 2022-11-22 Electronics And Telecommunications Research Institute Neural network accelerator including bidirectional processing element array
US10872290B2 (en) * 2017-09-21 2020-12-22 Raytheon Company Neural network processor with direct memory access and hardware acceleration circuits
US10768856B1 (en) * 2018-03-12 2020-09-08 Amazon Technologies, Inc. Memory access for multiple circuit components
US11475306B2 (en) * 2018-03-22 2022-10-18 Amazon Technologies, Inc. Processing for multiple input data sets
US12175356B2 (en) * 2018-05-15 2024-12-24 Mitsubishi Electric Corporation Arithmetic device
US11347916B1 (en) * 2019-06-28 2022-05-31 Amazon Technologies, Inc. Increasing positive clock skew for systolic array critical path
US20210012186A1 (en) * 2019-07-11 2021-01-14 Facebook Technologies, Llc Systems and methods for pipelined parallelism to accelerate distributed processing
US11842169B1 (en) * 2019-09-25 2023-12-12 Amazon Technologies, Inc. Systolic multiply delayed accumulate processor architecture
US11816446B2 (en) * 2019-11-27 2023-11-14 Amazon Technologies, Inc. Systolic array component combining multiple integer and floating-point data types
US11422773B1 (en) * 2020-06-29 2022-08-23 Amazon Technologies, Inc. Multiple busses within a systolic array processing element
US11308026B1 (en) * 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple busses interleaved in a systolic array
US11308027B1 (en) * 2020-06-29 2022-04-19 Amazon Technologies, Inc. Multiple accumulate busses in a systolic array
US20220383081A1 (en) * 2021-05-28 2022-12-01 Meta Platforms Technologies, Llc Bandwidth-aware flexible-scheduling machine learning accelerator
US11880682B2 (en) * 2021-06-30 2024-01-23 Amazon Technologies, Inc. Systolic array with efficient input reduction and extended array performance
US11494627B1 (en) * 2021-07-08 2022-11-08 Hong Kong Applied Science and Technology Research Institute Company Limited Dynamic tile parallel neural network accelerator

Also Published As

Publication number Publication date
WO2023034696A1 (en) 2023-03-09
JP2024537610A (ja) 2024-10-16
TW202316325A (zh) 2023-04-16
KR20240058084A (ko) 2024-05-03
US20230065725A1 (en) 2023-03-02
EP4396726A1 (en) 2024-07-10
EP4396726B1 (en) 2026-04-22

Similar Documents

Publication Publication Date Title
CN117897708A (zh) Parallel depth-wise processing architectures for neural networks
EP3496008B1 (en) Method and apparatus for processing convolution operation in neural network
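TWI759361B (zh) Architecture, method, computer-readable medium, and apparatus for sparse neural network acceleration
JP7007488B2 (ja) Hardware-based pooling system and method
CN108268931B (zh) Data processing method, apparatus, and system
JP2022037022A (ja) Performing kernel striding in hardware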
EP3674987A1 (en) Method and apparatus for processing convolution operation in neural network
US20180181864A1 (en) Sparsified Training of Convolutional Neural Networks
CN110050267A (zh) System and method for data management
WO2017116924A1 (en) Neural network training performance optimization framework
EP4384899B1 (en) Partial sum management and reconfigurable systolic flow architectures for in-memory computation
US10936943B2 (en) Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices
US20230025068A1 (en) Hybrid machine learning architecture with neural processing unit and compute-in-memory processing elements
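KR20220020816A (ko) Depth-first convolution in deep neural networks
CN116075821A (zh) Tabular convolution and acceleration
CN117413280A (zh) Convolution with kernel expansion and tensor accumulation
CN109447239B (zh) ARM-based embedded convolutional neural network acceleration method
JP7642919B2 (ja) Activation buffer architecture for data reuse in a neural network accelerator
CN115699022A (zh) Structured convolutions and associated acceleration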
US12585923B2 (en) Desparsified convolution for sparse activations
TWI850463B (zh) Method, processing system, and computer-readable medium for pointwise convolution
WO2023004374A1 (en) Hybrid machine learning architecture with neural processing unit and compute-in-memory processing elements

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination