CN115315688B - Processing data stream modification to reduce power effects during parallel processing - Google Patents

Processing data stream modification to reduce power effects during parallel processing

Info

Publication number
CN115315688B
Authority
CN
China
Prior art keywords
data
blocks
processing
density
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202180023936.4A
Other languages
English (en)
Chinese (zh)
Other versions
CN115315688A (zh)
Inventor
H·J·朴
R·G·霍夫曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Priority to CN202610004365.4A (CN121807567A)
Publication of CN115315688A
Application granted
Publication of CN115315688B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5094 Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Power Sources (AREA)
  • Complex Calculations (AREA)
CN202180023936.4A 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing Active CN115315688B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202610004365.4A CN121807567A (zh) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/834,986 US11507423B2 (en) 2020-03-30 2020-03-30 Processing data stream modification to reduce power effects during parallel processing
US16/834,986 2020-03-30
PCT/US2021/070327 WO2021203125A1 (en) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202610004365.4A Division CN121807567A (zh) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Publications (2)

Publication Number Publication Date
CN115315688A (zh) 2022-11-08
CN115315688B (zh) 2025-12-23

Family

ID=75640050

Family Applications (2)

Application Number Title Priority Date Filing Date
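CN202180023936.4A Active CN115315688B (zh) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing
CN202610004365.4A Pending CN121807567A (zh) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing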

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202610004365.4A Pending CN121807567A (zh) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Country Status (9)

Country Link
US (2) US11507423B2
EP (1) EP4127928A1
JP (2) JP7750847B2
KR (1) KR20220154698A
CN (2) CN115315688B
BR (1) BR112022018896A2
PH (1) PH12022551885A1
TW (1) TWI876016B
WO (1) WO2021203125A1

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11513799B2 (en) * 2019-11-04 2022-11-29 Apple Inc. Chained buffers in neural network processor
US11507423B2 (en) 2020-03-30 2022-11-22 Qualcomm Incorporated Processing data stream modification to reduce power effects during parallel processing
US11972348B2 (en) 2020-10-30 2024-04-30 Apple Inc. Texture unit circuit in neural network processor
US12277494B2 (en) 2020-11-19 2025-04-15 Apple Inc. Multi-dimensional tensor support extension in neural network processor
JP7806459B2 (ja) * 2021-11-22 2026-01-27 Fujitsu Ltd Control program, information processing apparatus, and control method
US20240152407A1 (en) * 2022-11-04 2024-05-09 Nvidia Corporation Generating sparse neural networks
TWI845081B (zh) * 2022-12-21 2024-06-11 National Cheng Kung University Graphics processor
US12236241B2 (en) * 2023-02-24 2025-02-25 Arm Limited Data processing apparatus with selectively delayed transmission of operands
CN120234822B (zh) * 2025-05-30 2025-08-08 苏州元脑智能科技有限公司 Data processing method, electronic device, storage medium, and product

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6128416A (en) * 1993-09-10 2000-10-03 Olympus Optical Co., Ltd. Image composing technique for optimally composing a single image from a plurality of digital images
US6775417B2 (en) * 1997-10-02 2004-08-10 S3 Graphics Co., Ltd. Fixed-rate block-based image compression with inferred pixel values
US20020151040A1 (en) * 2000-02-18 2002-10-17 Matthew O' Keefe Apparatus and methods for parallel processing of microvolume liquid reactions
JP3392798B2 (ja) * 2000-02-22 2003-03-31 Riso Kagaku Corp Image attribute discrimination method and apparatus
US6788302B1 (en) 2000-08-03 2004-09-07 International Business Machines Corporation Partitioning and load balancing graphical shape data for parallel applications
US6934760B1 (en) 2001-02-04 2005-08-23 Cisco Technology, Inc. Method and apparatus for resequencing of packets into an original ordering using multiple resequencing components
JP2002300402A (ja) * 2001-03-30 2002-10-11 Fuji Photo Film Co Ltd Image processing apparatus, method, and recording medium
US20070019778A1 (en) * 2005-07-22 2007-01-25 Clouse Melvin E Voxel histogram analysis for measurement of plaque
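JP2010111099A (ja) * 2008-11-10 2010-05-20 Canon Inc Image processing apparatus and control method thereof
JPWO2010073922A1 (ja) * 2008-12-25 2012-06-14 NEC Corp Error correction encoding apparatus, decoding apparatus, encoding method, decoding method, and programs therefor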
US8521782B2 (en) 2011-07-20 2013-08-27 Salesforce.Com, Inc. Methods and systems for processing large graphs using density-based processes using map-reduce
CN107133018B (zh) * 2011-12-22 2020-12-22 Intel Corp Instructions to perform Groestl hashing
JP6089949B2 (ja) * 2013-05-14 2017-03-08 Ricoh Co Ltd SIMD processor
US9721204B2 (en) * 2013-10-28 2017-08-01 Qualcomm Incorporated Evaluation of a system including separable sub-systems over a multidimensional range
GB2521151B (en) * 2013-12-10 2021-06-02 Advanced Risc Mach Ltd Configurable thread ordering for a data processing apparatus
KR101599133B1 (ko) * 2014-06-09 2016-03-15 주식회사 엔지스테크널러지 Method and system for providing map data of a navigation device
JP6523428B2 (ja) * 2015-03-04 2019-05-29 Olympus Corp Image processing apparatus
US9774508B1 (en) 2015-12-02 2017-09-26 Color Genomics, Inc. Communication generation using sparse indicators and sensor data
EP4557209A3 (en) * 2015-06-10 2025-08-13 Mobileye Vision Technologies Ltd. Image processor and methods for processing an image
JP6832155B2 (ja) * 2016-12-28 2021-02-24 Sony Semiconductor Solutions Corp Image processing apparatus, image processing method, and image processing system
US10672175B2 (en) * 2017-04-17 2020-06-02 Intel Corporation Order independent asynchronous compute and streaming for graphics
EP3392804A1 (en) * 2017-04-18 2018-10-24 Koninklijke Philips N.V. Device and method for modelling a composition of an object of interest
US10943171B2 (en) 2017-09-01 2021-03-09 Facebook, Inc. Sparse neural network training optimization
US10803096B2 (en) * 2017-09-28 2020-10-13 Here Global B.V. Parallelized clustering of geospatial data
US11561833B1 (en) * 2018-06-28 2023-01-24 Amazon Technologies, Inc. Allocation and placement of resources for network computation
JP2022500755A (ja) * 2018-09-11 2022-01-04 Huawei Technologies Co Ltd Heterogeneous scheduling for sequential compute DAG
US11288436B2 (en) * 2020-01-30 2022-03-29 Taiwan Semiconductor Manufacturing Co., Ltd. Method of analyzing and detecting critical cells
US11507423B2 (en) 2020-03-30 2022-11-22 Qualcomm Incorporated Processing data stream modification to reduce power effects during parallel processing

Also Published As

Publication number Publication date
EP4127928A1 (en) 2023-02-08
US20230078991A1 (en) 2023-03-16
WO2021203125A1 (en) 2021-10-07
TWI876016B (zh) 2025-03-11
KR20220154698A (ko) 2022-11-22
CN115315688A (zh) 2022-11-08
CN121807567A (zh) 2026-04-07
JP7750847B2 (ja) 2025-10-07
US11983567B2 (en) 2024-05-14
JP2026015712A (ja) 2026-01-30
BR112022018896A2 (pt) 2022-11-08
TW202143032A (zh) 2021-11-16
US20210303359A1 (en) 2021-09-30
US11507423B2 (en) 2022-11-22
PH12022551885A1 (en) 2023-11-20
JP2023519665A (ja) 2023-05-12

Similar Documents

Publication Publication Date Title
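CN115315688B (zh) Processing data stream modification to reduce power effects during parallel processing
KR101959376B1 (ko) Systems and methods for a multi-core optimized recurrent neural network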
Zhang et al. Fast linear interpolation
Breß et al. Efficient co-processor utilization in database query processing
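TWI844116B (zh) Exploiting data sparsity at a machine-learning hardware accelerator
JP7834166B2 (ja) Hardware accelerator-optimized group convolution-based neural network model
KR20220162727A (ko) Feature reordering based on sparsity for improved memory compression transfers during machine learning jobs
KR20220161339A (ko) Feature reordering based on similarity for improved memory compression transfers during machine learning jobs
JP2026001179A (ja) Pattern-based cache block compression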
Venieris et al. unzipFPGA: Enhancing FPGA-based CNN engines with on-the-fly weights generation
Wang et al. A novel parallel algorithm for sparse tensor matrix chain multiplication via TCU-acceleration
Venieris et al. Nawq-sr: A hybrid-precision npu engine for efficient on-device super-resolution
US20250124700A1 (en) Neural network architecture for implementing group convolutions
Raghunandan et al. A parallel implementation of FastBit radix sort using MPI and CUDA
Jeon et al. Tinymem: Boosting multi-dnn inference on tiny ai accelerators with weight memory virtualization
Li et al. S-LGCN: Software-hardware co-design for accelerating LightGCN
Morishima et al. Performance evaluations of graph database using cuda and openmp compatible libraries
Ali et al. Cross-layer CNN approximations for hardware implementation
Fèvre et al. Optimization of SpGEMM with Risc-V vector instructions
CN119856181A (zh) De-sparsified convolution for sparse tensors
US20260004151A1 (en) Reducing power consumption of neural network accelerator through weight reordering
Wirawan et al. High performance protein sequence database scanning on the Cell Broadband Engine
US20260003935A1 (en) High performance execution of state space models on neural network accelerators
Zlateski et al. Znni-maximizing the inference throughput of 3d convolutional networks on multi-core cpus and gpus
Momeni et al. A parallel clustering algorithm for placement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant