KR20220154698A - Processing data stream modification to reduce power effects during parallel processing - Google Patents

Processing data stream modification to reduce power effects during parallel processing

Publication number
KR20220154698A
Authority
KR
South Korea
Prior art keywords
data
processing
blocks
sub
density
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020227032661A
Other languages
English (en)
Korean (ko)
Inventor
Hee Jun Park
Richard Gerard Hofmann
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Incorporated
Publication of KR20220154698A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5094 Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Power Sources (AREA)
  • Complex Calculations (AREA)
KR1020227032661A 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing Pending KR20220154698A (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/834,986 US11507423B2 (en) 2020-03-30 2020-03-30 Processing data stream modification to reduce power effects during parallel processing
US16/834,986 2020-03-30
PCT/US2021/070327 WO2021203125A1 (en) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Publications (1)

Publication Number Publication Date
KR20220154698A (ko) 2022-11-22

Family

ID=75640050

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020227032661A Pending KR20220154698A (ko) 2020-03-30 2021-03-29 Processing data stream modification to reduce power effects during parallel processing

Country Status (9)

Country Link
US (2) US11507423B2
EP (1) EP4127928A1
JP (2) JP7750847B2
KR (1) KR20220154698A
CN (2) CN115315688B
BR (1) BR112022018896A2
PH (1) PH12022551885A1
TW (1) TWI876016B
WO (1) WO2021203125A1

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11513799B2 (en) * 2019-11-04 2022-11-29 Apple Inc. Chained buffers in neural network processor
US11507423B2 (en) 2020-03-30 2022-11-22 Qualcomm Incorporated Processing data stream modification to reduce power effects during parallel processing
US11972348B2 (en) 2020-10-30 2024-04-30 Apple Inc. Texture unit circuit in neural network processor
US12277494B2 (en) 2020-11-19 2025-04-15 Apple Inc. Multi-dimensional tensor support extension in neural network processor
JP7806459B2 (ja) * 2021-11-22 2026-01-27 Fujitsu Limited Control program, information processing apparatus, and control method
US20240152407A1 (en) * 2022-11-04 2024-05-09 Nvidia Corporation Generating sparse neural networks
TWI845081B (zh) * 2022-12-21 2024-06-11 National Cheng Kung University Graphics processor
US12236241B2 (en) * 2023-02-24 2025-02-25 Arm Limited Data processing apparatus with selectively delayed transmission of operands
CN120234822B (zh) * 2025-05-30 2025-08-08 苏州元脑智能科技有限公司 Data processing method, electronic device, storage medium, and product

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6128416A (en) * 1993-09-10 2000-10-03 Olympus Optical Co., Ltd. Image composing technique for optimally composing a single image from a plurality of digital images
US6775417B2 (en) * 1997-10-02 2004-08-10 S3 Graphics Co., Ltd. Fixed-rate block-based image compression with inferred pixel values
US20020151040A1 (en) * 2000-02-18 2002-10-17 Matthew O' Keefe Apparatus and methods for parallel processing of microvolume liquid reactions
JP3392798B2 (ja) * 2000-02-22 2003-03-31 Riso Kagaku Corporation Image attribute discrimination method and apparatus
US6788302B1 (en) 2000-08-03 2004-09-07 International Business Machines Corporation Partitioning and load balancing graphical shape data for parallel applications
US6934760B1 (en) 2001-02-04 2005-08-23 Cisco Technology, Inc. Method and apparatus for resequencing of packets into an original ordering using multiple resequencing components
JP2002300402A (ja) * 2001-03-30 2002-10-11 Fuji Photo Film Co Ltd Image processing apparatus, method, and recording medium
US20070019778A1 (en) * 2005-07-22 2007-01-25 Clouse Melvin E Voxel histogram analysis for measurement of plaque
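JP2010111099A (ja) * 2008-11-10 2010-05-20 Canon Inc Image processing apparatus and control method therefor
JPWO2010073922A1 (ja) * 2008-12-25 2012-06-14 NEC Corporation Error correction encoding apparatus, decoding apparatus, encoding method, decoding method, and programs therefor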
US8521782B2 (en) 2011-07-20 2013-08-27 Salesforce.Com, Inc. Methods and systems for processing large graphs using density-based processes using map-reduce
CN107133018B (zh) * 2011-12-22 2020-12-22 Intel Corporation Instructions to perform Groestl hashing
JP6089949B2 (ja) * 2013-05-14 2017-03-08 Ricoh Co., Ltd. SIMD processor
US9721204B2 (en) * 2013-10-28 2017-08-01 Qualcomm Incorporated Evaluation of a system including separable sub-systems over a multidimensional range
GB2521151B (en) * 2013-12-10 2021-06-02 Advanced Risc Mach Ltd Configurable thread ordering for a data processing apparatus
KR101599133B1 (ko) * 2014-06-09 2016-03-15 Engis Technologies, Inc. Method and system for providing map data of a navigation device
JP6523428B2 (ja) * 2015-03-04 2019-05-29 Olympus Corporation Image processing apparatus
US9774508B1 (en) 2015-12-02 2017-09-26 Color Genomics, Inc. Communication generation using sparse indicators and sensor data
EP4557209A3 (en) * 2015-06-10 2025-08-13 Mobileye Vision Technologies Ltd. Image processor and methods for processing an image
JP6832155B2 (ja) * 2016-12-28 2021-02-24 Sony Semiconductor Solutions Corporation Image processing apparatus, image processing method, and image processing system
US10672175B2 (en) * 2017-04-17 2020-06-02 Intel Corporation Order independent asynchronous compute and streaming for graphics
EP3392804A1 (en) * 2017-04-18 2018-10-24 Koninklijke Philips N.V. Device and method for modelling a composition of an object of interest
US10943171B2 (en) 2017-09-01 2021-03-09 Facebook, Inc. Sparse neural network training optimization
US10803096B2 (en) * 2017-09-28 2020-10-13 Here Global B.V. Parallelized clustering of geospatial data
US11561833B1 (en) * 2018-06-28 2023-01-24 Amazon Technologies, Inc. Allocation and placement of resources for network computation
JP2022500755A (ja) * 2018-09-11 2022-01-04 Huawei Technologies Co., Ltd. Heterogeneous scheduling for sequential computation DAGs
US11288436B2 (en) * 2020-01-30 2022-03-29 Taiwan Semiconductor Manufacturing Co., Ltd. Method of analyzing and detecting critical cells
US11507423B2 (en) 2020-03-30 2022-11-22 Qualcomm Incorporated Processing data stream modification to reduce power effects during parallel processing

Also Published As

Publication number Publication date
EP4127928A1 (en) 2023-02-08
US20230078991A1 (en) 2023-03-16
CN115315688B (zh) 2025-12-23
WO2021203125A1 (en) 2021-10-07
TWI876016B (zh) 2025-03-11
CN115315688A (zh) 2022-11-08
CN121807567A (zh) 2026-04-07
JP7750847B2 (ja) 2025-10-07
US11983567B2 (en) 2024-05-14
JP2026015712A (ja) 2026-01-30
BR112022018896A2 (pt) 2022-11-08
TW202143032A (zh) 2021-11-16
US20210303359A1 (en) 2021-09-30
US11507423B2 (en) 2022-11-22
PH12022551885A1 (en) 2023-11-20
JP2023519665A (ja) 2023-05-12

Similar Documents

Publication Publication Date Title
US11983567B2 (en) Processing data stream modification to reduce power effects during parallel processing
Lu et al. Optimizing depthwise separable convolution operations on GPUs
Shen et al. Escher: A CNN accelerator with flexible buffering to minimize off-chip transfer
Cho et al. MEC: Memory-efficient convolution for deep neural network
Hill et al. DeftNN: Addressing bottlenecks for DNN execution on GPUs via synapse vector elimination and near-compute data fission
JP2022070955A (ja) Scheduling neural network processing
CN114503125A (zh) Structured pruning method, system, and computer-readable medium
Breß et al. Efficient co-processor utilization in database query processing
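KR20210023401A (ko) Neural network computation method and system including the same
KR20220161339A (ko) Feature reordering based on similarity for improved memory compression transfers during machine learning jobs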
US20240169463A1 (en) Mixture-of-experts layer with dynamic gating
Cojean et al. Resource aggregation for task-based Cholesky factorization on top of modern architectures
CN114428936A (zh) Allocating processing threads for matrix-matrix multiplication
US20240160894A1 (en) Mixture-of-experts layer with switchable parallel modes
Jeon et al. Tinymem: Boosting multi-dnn inference on tiny ai accelerators with weight memory virtualization
Erdem et al. Runtime design space exploration and mapping of DCNNs for the ultra-low-power Orlando SoC
Lv et al. A survey of graph pre-processing methods: from algorithmic to hardware perspectives
Ali et al. Cross-layer CNN approximations for hardware implementation
JP7837470B2 (ja) Reducing memory bank conflicts in a hardware accelerator
US20240160906A1 (en) Collective communication phases at mixture-of-experts layer
Villarrubia et al. Balanced segmentation of CNNs for multi-TPU inference
Chen et al. High throughput and low bandwidth demand: Accelerating CNN inference Block-by-block on FPGAs
Cao et al. LSSM-SpMM: A long-row splitting and short-row merging approach for parallel SpMM on PEZY-SC3s
Aher et al. Accelerate the execution of graph processing using GPU
Tang et al. MSA2: An Efficient Sparsity-Aware Accelerator for Matrix Multiplication with Multi-core Systolic Arrays

Legal Events

Date Code Title Description
PA0105 International application
    Patent event date: 20220920
    Patent event code: PA01051R01D
    Comment text: International Patent Application
PG1501 Laying open of application
A201 Request for examination
PA0201 Request for examination
    Patent event code: PA02012R01D
    Patent event date: 20240314
    Comment text: Request for Examination of Application