KR20240149907A - Adaptive tensor compute kernel for sparse neural network - Google Patents

Adaptive tensor compute kernel for sparse neural network

Info

Publication number
KR20240149907A
Authority
KR
South Korea
Prior art keywords
filters
array
tensor
ifm
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020247028942A
Other languages
English (en)
Korean (ko)
Inventor
Xiaoqian Zhang
Enxu Yan
Zhibin Xiao
Original Assignee
Moffett International Co., Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Moffett International Co., Limited
Publication of KR20240149907A
Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Complex Calculations (AREA)
  • Image Processing (AREA)
KR1020247028942A 2022-02-16 2023-02-13 Adaptive tensor compute kernel for sparse neural network Pending KR20240149907A (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/673,490 US20230259758A1 (en) 2022-02-16 2022-02-16 Adaptive tensor compute kernel for sparse neural network
US17/673,490 2022-02-16
PCT/CN2023/075661 WO2023155748A1 (en) 2022-02-16 2023-02-13 Adaptive tensor compute kernel for sparse neural network

Publications (1)

Publication Number Publication Date
KR20240149907A (ko) 2024-10-15

Family

ID=87558678

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020247028942A Pending KR20240149907A (ko) 2022-02-16 2023-02-13 Adaptive tensor compute kernel for sparse neural network

Country Status (7)

Country Link
US (1) US20230259758A1
EP (1) EP4479887A4
JP (1) JP2025505291A
KR (1) KR20240149907A
CN (1) CN118715527A
TW (1) TWI857493B
WO (1) WO2023155748A1

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
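CN116261736B (zh) * 2020-06-12 2024-08-16 Moffett International Co., Limited Method and system for dual-sparse convolution processing and parallelization
CN112925644B (zh) * 2021-02-26 2024-08-13 Beijing Xiaomi Pinecone Electronics Co., Ltd. Deep learning operator optimization method, apparatus, device, and storage medium
CN116662330A (zh) * 2022-02-21 2023-08-29 ZTE Corporation Data processing method, forwarding chip, storage medium, and program product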
US12567122B1 (en) 2022-04-19 2026-03-03 Nvidia Corporation Application programming interface to modify tensor dimensions
US20230140173A1 (en) * 2022-08-19 2023-05-04 Arnab Raha Deep neural network (dnn) accelerators with heterogeneous tiling
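TWI873681B (zh) * 2023-06-14 2025-02-21 Wistron Corporation Object detection method, machine learning method, and electronic device
CN117707791B (zh) * 2024-02-02 2024-05-14 Beijing Biren Technology Development Co., Ltd. Method, device, and storage medium for performing attention operations
CN118152713B (zh) * 2024-05-10 2024-08-06 Beijing Biren Technology Development Co., Ltd. Data processing method, apparatus, electronic device, and computer-readable storage medium
TWI884041B (zh) * 2024-07-19 2025-05-11 National Tsing Hua University Software-hardware co-operation method based on a mixed-precision algorithm and an in-memory computing accelerator, system thereof, and non-transitory computer-readable storage medium
CN121233886B (zh) * 2025-12-01 2026-03-20 Shanghai Biren Technology Co., Ltd. Convolution computation method, electronic device, storage medium, and program product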

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10169298B1 (en) * 2017-05-11 2019-01-01 NovuMind Limited Native tensor processor, using outer product unit
US11475351B2 (en) * 2017-11-15 2022-10-18 Uatc, Llc Systems and methods for object detection, tracking, and motion prediction
US11443176B2 (en) * 2018-05-17 2022-09-13 International Business Machines Corporation Acceleration of convolutional neural networks on analog arrays
CN112888459B (zh) * 2018-06-01 2023-05-23 Grail, Inc. Convolutional neural network system and data classification method
US11429850B2 (en) * 2018-07-19 2022-08-30 Xilinx, Inc. Performing consecutive mac operations on a set of data using different kernels in a MAC circuit
US11481934B2 (en) * 2018-10-10 2022-10-25 New York University System, method, and computer-accessible medium for generating magnetic resonance imaging-based anatomically guided positron emission tomography reconstruction images with a convolutional neural network
EP3654247B1 (en) * 2018-11-15 2025-01-01 IMEC vzw Convolution engine for neural networks
US10878173B2 (en) * 2018-11-29 2020-12-29 Adobe Inc. Object recognition and tagging based on fusion deep learning models
US11604958B2 (en) * 2019-03-13 2023-03-14 Samsung Electronics Co., Ltd. Method and apparatus for processing computation of zero value in processing of layers in neural network
WO2021071930A1 (en) * 2019-10-07 2021-04-15 Google Llc Redistributing tensor elements between machine learning computing units
US12554962B2 (en) * 2019-12-24 2026-02-17 Intel Corporation Configurable processor element arrays for implementing convolutional neural networks
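CN115456161A (zh) * 2020-03-27 2022-12-09 Huawei Technologies Co., Ltd. Data processing method and data processing system
KR102914873B1 (ko) * 2020-12-14 2026-01-16 Samsung Electronics Co., Ltd. NPU device performing convolution operation based on the number of channels and operating method thereof
KR102602584B1 (ko) * 2021-04-14 2023-11-16 Electronics and Telecommunications Research Institute Artificial intelligence semiconductor processor and operating method of artificial intelligence semiconductor processor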
US20230195419A1 (en) * 2021-12-17 2023-06-22 Arm Limited System and Method for Accelerating Neural Networks

Also Published As

Publication number Publication date
CN118715527A (zh) 2024-09-27
EP4479887A4 (en) 2026-01-07
EP4479887A1 (en) 2024-12-25
TW202343310A (zh) 2023-11-01
WO2023155748A1 (en) 2023-08-24
TWI857493B (zh) 2024-10-01
US20230259758A1 (en) 2023-08-17
JP2025505291A (ja) 2025-02-21

Similar Documents

Publication Publication Date Title
KR20240149907A (ko) Adaptive tensor compute kernel for sparse neural network
CN114503125B (zh) Structured pruning method, system, and computer-readable medium
Lu et al. SpWA: An efficient sparse winograd convolutional neural networks accelerator on FPGAs
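CN111831254B (zh) Image processing acceleration method, image processing model storage method, and corresponding apparatus
CN108765247B (zh) Image processing method, apparatus, storage medium, and device
CN114026569B (zh) Dilated convolution using systolic arrays
CN112801279B (zh) Superpixel methods for convolutional neural networks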
EP3526665B1 (en) Sorting for data-parallel computing devices
Pınar et al. Fast optimal load balancing algorithms for 1D partitioning
Peterka et al. A configurable algorithm for parallel image-compositing applications
CN106846235B (zh) Convolution optimization method and system accelerated using NVIDIA Kepler GPU assembly instructions
US20240303837A1 (en) Method and apparatus with convolution neural network processing
US20220188613A1 (en) Sgcnax: a scalable graph convolutional neural network accelerator with workload balancing
US9965343B2 (en) System and method for determining concurrency factors for dispatch size of parallel processor kernels
CN117112145B (zh) Training model allocation method, apparatus, computer device, and storage medium
Zlateski et al. ZNNi: maximizing the inference throughput of 3D convolutional networks on CPUs and GPUs
US20240273163A1 (en) Accelerator for sparse matrix multiplication in neural networks
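CN112668708A (zh) Convolution operation device for improving data utilization
CN118410214B (zh) Sparse-matrix-based meteorological data processing method, device, and medium
KR102372869B1 (ko) Matrix operator and matrix operation method for artificial neural networks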
US20200250842A1 (en) Method and apparatus with convolution neural network processing
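CN118193914A (zh) LU decomposition method, apparatus, device, and storage medium for distributed platforms
CN113900808A (zh) MPI parallel data structure based on arbitrary polyhedral unstructured grids
CN120226017A (zh) Vector operation acceleration with convolution computation unit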
Guo et al. Fused DSConv: Optimizing sparse CNN inference for execution on edge devices

Legal Events

Date Code Title Description
PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

D16 Fast track examination requested

Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D16-EXM-PA0302 (AS PROVIDED BY THE NATIONAL OFFICE)

PA0302 Request for accelerated examination

St.27 status event code: A-1-2-D10-D16-exm-PA0302