IN2015DN02742A - Google Patents

Info

Publication number
IN2015DN02742A
Authority
IN
India
Prior art keywords
tlb
gpu
gpus
cpu
task
Prior art date
Application number
Inventor
Misel-Myrto Papadopoulou
Lisa R. Hsu
Andrew G. Kegel
Nuwan S. Jayasena
Bradford M. Beckmann
Steven K. Reinhardt
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Publication of IN2015DN02742A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485 Task life-cycle, e.g. stopping, restarting, resuming execution
    • G06F9/4856 Task life-cycle, e.g. stopping, restarting, resuming execution resumption being on a different machine, e.g. task migration, virtual machine migration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65 Details of virtual memory and virtual address translation
    • G06F2212/654 Look-ahead translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Methods and apparatuses are provided for avoiding cold translation lookaside buffer (TLB) misses in a computer system. A typical system is configured as a heterogeneous computing system having at least one central processing unit (CPU) and one or more graphics processing units (GPUs) that share a common memory address space. Each processing unit (CPU and GPU) has an independent TLB. When a task is offloaded from a particular CPU to a particular GPU, translation information is sent along with the task assignment. The translation information allows the GPU to load the address translation data into the TLB associated with the one or more GPUs prior to executing the task. Preloading the TLB of the GPUs reduces or avoids the cold TLB misses that would otherwise occur when the offloaded task begins executing.
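The preloading idea in the abstract can be illustrated with a toy software model. The sketch below is not the patented implementation: the `TLB` class, the `offload` function, the 4 KiB page size, the LRU replacement policy, and the dictionary standing in for a shared page table are all assumptions made for illustration. It only shows how sending translation information with a task assignment removes the first-touch (cold) misses on the receiving unit's TLB.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for this toy model


class TLB:
    """Tiny fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table  # shared virtual-page -> physical-frame map
        self.entries = OrderedDict()
        self.misses = 0

    def preload(self, translations):
        """Install translations that were sent along with a task assignment."""
        for vpn, pfn in translations.items():
            self._install(vpn, pfn)

    def translate(self, vaddr):
        """Translate a virtual address, counting a miss on first touch."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # refresh LRU position on a hit
        else:
            self.misses += 1  # miss: fall back to the shared page table
            self._install(vpn, self.page_table[vpn])
        return self.entries[vpn] * PAGE_SIZE + offset

    def _install(self, vpn, pfn):
        self.entries[vpn] = pfn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def offload(task_addrs, gpu_tlb, translation_info=None):
    """Run a task's memory accesses on the GPU side, optionally preloading its TLB."""
    if translation_info:
        gpu_tlb.preload(translation_info)
    return [gpu_tlb.translate(addr) for addr in task_addrs]


# Hypothetical shared page table and a task touching virtual pages 0, 1 and 2.
page_table = {vpn: vpn + 100 for vpn in range(8)}
task = [0x0000, 0x1008, 0x2010, 0x1008]

cold = TLB(4, page_table)
offload(task, cold)  # no translation information sent

warm = TLB(4, page_table)
info = {vpn: page_table[vpn] for vpn in (0, 1, 2)}
offload(task, warm, info)  # translation information sent with the task

print(cold.misses, warm.misses)  # prints: 3 0
```

In this model both runs produce the same physical addresses; the only difference is that the preloaded TLB takes no misses on the task's first accesses to each page.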
IN2742DEN2015 2012-10-05 2013-09-20 IN2015DN02742A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/645,685 US20140101405A1 (en) 2012-10-05 2012-10-05 Reducing cold TLB misses in a heterogeneous computing system
PCT/US2013/060826 WO2014055264A1 (en) 2012-10-05 2013-09-20 Reducing cold TLB misses in a heterogeneous computing system

Publications (1)

Publication Number Publication Date
IN2015DN02742A (en) 2015-09-04

Family

ID=49305166

Family Applications (1)

Application Number Title Priority Date Filing Date
IN2742DEN2015 IN2015DN02742A (en) 2012-10-05 2013-09-20

Country Status (7)

Country Link
US (1) US20140101405A1 (en)
EP (1) EP2904498A1 (en)
JP (1) JP2015530683A (en)
KR (1) KR20150066526A (en)
CN (1) CN104704476A (en)
IN (1) IN2015DN02742A (en)
WO (1) WO2014055264A1 (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140208758A1 (en) 2011-12-30 2014-07-31 Clearsign Combustion Corporation Gas turbine with extended turbine blade stream adhesion
US9170954B2 (en) * 2012-12-10 2015-10-27 International Business Machines Corporation Translation management instructions for updating address translation data structures in remote processing nodes
US9235512B2 (en) * 2013-01-18 2016-01-12 Nvidia Corporation System, method, and computer program product for graphics processing unit (GPU) demand paging
US10437591B2 (en) * 2013-02-26 2019-10-08 Qualcomm Incorporated Executing an operating system on processors having different instruction set architectures
US9396089B2 (en) 2014-05-30 2016-07-19 Apple Inc. Activity tracing diagnostic systems and methods
US9619012B2 (en) * 2014-05-30 2017-04-11 Apple Inc. Power level control using power assertion requests
CN104035819B (en) * 2014-06-27 2017-02-15 清华大学深圳研究生院 Scientific workflow scheduling method and device
GB2546343A (en) 2016-01-15 2017-07-19 Stmicroelectronics (Grenoble2) Sas Apparatus and methods implementing dispatch mechanisms for offloading executable functions
CN105786717B (en) * 2016-03-22 2018-11-16 华中科技大学 The DRAM-NVM stratification isomery memory pool access method and system of software-hardware synergism management
DE102016219202A1 (en) * 2016-10-04 2018-04-05 Robert Bosch Gmbh Method and device for protecting a working memory
CN109213698B (en) * 2018-08-23 2020-10-27 贵州华芯通半导体技术有限公司 VIVT cache access method, arbitration unit and processor
CN111274166B (en) * 2018-12-04 2022-09-20 展讯通信(上海)有限公司 TLB pre-filling and locking method and device
KR102147912B1 (en) 2019-08-13 2020-08-25 삼성전자주식회사 Processor chip and control methods thereof
US11816037B2 (en) * 2019-12-12 2023-11-14 Advanced Micro Devices, Inc. Enhanced page information co-processor
CN111338988B (en) * 2020-02-20 2022-06-14 西安芯瞳半导体技术有限公司 Memory access method and device, computer equipment and storage medium
US11861403B2 (en) * 2020-10-15 2024-01-02 Nxp Usa, Inc. Method and system for accelerator thread management

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4481573A (en) * 1980-11-17 1984-11-06 Hitachi, Ltd. Shared virtual address translation unit for a multiprocessor system
US5893144A (en) * 1995-12-22 1999-04-06 Sun Microsystems, Inc. Hybrid NUMA COMA caching system and methods for selecting between the caching modes
US6208543B1 (en) * 1999-05-18 2001-03-27 Advanced Micro Devices, Inc. Translation lookaside buffer (TLB) including fast hit signal generation circuitry
US6851038B1 (en) * 2000-05-26 2005-02-01 Koninklijke Philips Electronics N.V. Background fetching of translation lookaside buffer (TLB) entries
US6668308B2 (en) * 2000-06-10 2003-12-23 Hewlett-Packard Development Company, L.P. Scalable architecture based on single-chip multiprocessing
JP3594082B2 (en) * 2001-08-07 2004-11-24 日本電気株式会社 Data transfer method between virtual addresses
US6891543B2 (en) * 2002-05-08 2005-05-10 Intel Corporation Method and system for optimally sharing memory between a host processor and graphics processor
EP1391820A3 (en) * 2002-07-31 2007-12-19 Texas Instruments Incorporated Concurrent task execution in a multi-processor, single operating system environment
US7321958B2 (en) * 2003-10-30 2008-01-22 International Business Machines Corporation System and method for sharing memory by heterogeneous processors
US7386669B2 (en) * 2005-03-31 2008-06-10 International Business Machines Corporation System and method of improving task switching and page translation performance utilizing a multilevel translation lookaside buffer
US20070083870A1 (en) * 2005-07-29 2007-04-12 Tomochika Kanakogi Methods and apparatus for task sharing among a plurality of processors
US7917723B2 (en) * 2005-12-01 2011-03-29 Microsoft Corporation Address translation table synchronization
US20080028181A1 (en) * 2006-07-31 2008-01-31 Nvidia Corporation Dedicated mechanism for page mapping in a gpu
US8140822B2 (en) * 2007-04-16 2012-03-20 International Business Machines Corporation System and method for maintaining page tables used during a logical partition migration
US7941631B2 (en) * 2007-12-28 2011-05-10 Intel Corporation Providing metadata in a translation lookaside buffer (TLB)
US8451281B2 (en) * 2009-06-23 2013-05-28 Intel Corporation Shared virtual memory between a host and discrete graphics device in a computing system
US8397049B2 (en) * 2009-07-13 2013-03-12 Apple Inc. TLB prefetching
US8285969B2 (en) * 2009-09-02 2012-10-09 International Business Machines Corporation Reducing broadcasts in multiprocessors
US8615637B2 (en) * 2009-09-10 2013-12-24 Advanced Micro Devices, Inc. Systems and methods for processing memory requests in a multi-processor system using a probe engine
US20110161620A1 (en) * 2009-12-29 2011-06-30 Advanced Micro Devices, Inc. Systems and methods implementing shared page tables for sharing memory resources managed by a main operating system with accelerator devices
US8341357B2 (en) * 2010-03-16 2012-12-25 Oracle America, Inc. Pre-fetching for a sibling cache
US9128849B2 (en) * 2010-04-13 2015-09-08 Apple Inc. Coherent memory scheme for heterogeneous processors
US9471532B2 (en) * 2011-02-11 2016-10-18 Microsoft Technology Licensing, Llc Remote core operations in a multi-core computer
KR20120129695A (en) * 2011-05-20 2012-11-28 삼성전자주식회사 Method of operating memory management unit and apparatus of the same
US10185566B2 (en) * 2012-04-27 2019-01-22 Intel Corporation Migrating tasks between asymmetric computing elements of a multi-core processor
US9235529B2 (en) * 2012-08-02 2016-01-12 Oracle International Corporation Using broadcast-based TLB sharing to reduce address-translation latency in a shared-memory system with optical interconnect

Also Published As

Publication number Publication date
US20140101405A1 (en) 2014-04-10
CN104704476A (en) 2015-06-10
EP2904498A1 (en) 2015-08-12
KR20150066526A (en) 2015-06-16
WO2014055264A1 (en) 2014-04-10
JP2015530683A (en) 2015-10-15

Similar Documents

Publication Publication Date Title
IN2015DN02742A (en)
GB2476360B (en) Sharing virtual memory-based multi-version data between the heterogenous processors of a computer platform
GB2541038A (en) Apparatus and method for managing virtual graphics processor unit
WO2015081308A3 (en) Dynamic i/o virtualization
WO2014031495A3 (en) Translation look-aside buffer with prefetching
GB2513789A (en) System and method to reduce memory usage by optimally placing VMS in a virtualized data center
WO2015108708A3 (en) Unified memory systems and methods
IN2013MN00405A (en)
MX2012012534A (en) Subbuffer objects.
GB2513496A (en) Memory address translation-based data encryption
MY162612A (en) Apparatus and method for handling access operations issued to local cache structures within a data processing apparatus
BR112015011219A2 (en) high dynamic range image processing
EP2498183A3 (en) Protecting guest virtual machine memory
EP2660752A3 (en) Memory protection circuit, processing unit, and memory protection method
GB2494331A (en) Hardware assist thread
GB2514501A (en) Adaptive cache promotions in a two level caching System
BR112015001988A2 (en) multiple sets of attribute fields within a single page table entry
GB2520856A (en) Enabling Virtualization of a processor resource
WO2011163407A3 (en) Region based technique for accurately predicting memory accesses
GB2518785A (en) Concurrent control for a page miss handler
BR112012007445A2 (en) shared face training data
PH12017550126A1 (en) Bulk allocation of instruction blocks to a processor instruction window
TW200943224A (en) Multi-buffer support for off-screen surfaces in a graphics processing system
GB2517371A (en) Virtual machine exclusive caching
BR112018002466A2 (en) hardware-applied content protection for graphics processing units