US20140043333A1 - Application load times by caching shader binaries in a persistent storage - Google Patents

Info

Publication number
US20140043333A1
Authority
US
United States
Prior art keywords
shader
key
binary
memory
graphics processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/731,785
Inventor
Swaminathan Narayanan
Nicholas Haemel
Jeffrey A. Bolz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nvidia Corp
Original Assignee
Nvidia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nvidia Corp
Priority to US13/731,785
Publication of US20140043333A1
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06T15/80 Shading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/324 Power saving characterised by the action undertaken by lowering clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206 Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3215 Monitoring of peripheral devices
    • G06F1/3218 Monitoring of peripheral devices of display devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/325 Power saving in peripheral device
    • G06F1/3265 Power saving in display device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3419 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
    • G06F11/3423 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time where the assessed time is active or idle time
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention provides a solution to the increasing challenges inherent in compiling and linking shader source code to produce shader binaries at runtime.
  • Various embodiments of the present disclosure provide an exemplary persistent cache memory that stores shader binaries and associated keys.
  • A cache is searched for a key associated with a shader binary for the selected shader instruction set. If the key is found in the memory, the corresponding shader binary is sent to the graphics processor for execution; otherwise, the shader instruction set is sent to a compiler for compiling and linking to create a shader binary for execution by the graphics processor and storage in the memory.
  • FIG. 1 illustrates an exemplary graphics rendering system comprising a shader instruction set 102 (hereafter referred to as a shader), a graphics driver module 104, a compiler 106, a cache 108, and a graphics processor 110.
  • The shader 102 may comprise shader source code written in a high-level programming language, such as the C programming language or other similar languages.
  • A plurality of shaders 102 may be selected together (e.g., as a program object).
  • The cache 108 is a persistent memory that retains stored shader binaries and associated keys between runtime sessions, so that previously compiled and linked shaders are immediately available (in the cache) the next time they need to be executed.
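  • The persistence property described above can be sketched as follows. This is only an illustration, not the disclosed implementation: the class name `PersistentShaderCache`, the JSON file layout, and the string keys are all assumptions made for the example.

```python
import json
import os
import tempfile


class PersistentShaderCache:
    """Key -> shader binary map that is written to disk after every store."""

    def __init__(self, path: str):
        self.path = path
        self.entries = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                # Binaries are hex-encoded because JSON cannot hold raw bytes.
                self.entries = {k: bytes.fromhex(v) for k, v in json.load(f).items()}

    def get(self, key: str):
        return self.entries.get(key)

    def put(self, key: str, binary: bytes) -> None:
        self.entries[key] = binary
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump({k: v.hex() for k, v in self.entries.items()}, f)


# First "runtime session": compile once and store the result.
cache_path = os.path.join(tempfile.mkdtemp(), "shader_cache.json")
session1 = PersistentShaderCache(cache_path)
session1.put("key-1", b"\x01\x02\x03")

# Second session: a fresh object reads the same file and finds the binary.
session2 = PersistentShaderCache(cache_path)
```

  • Rewriting the whole file on every store keeps the sketch short; a real cache would more likely append entries or batch its writes.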
  • An exemplary cache 108 for a mobile computing system may be approximately 2-8 Mbytes in size, while an exemplary cache 108 for a desktop computing system may be approximately 64 Mbytes. Other cache sizes are also possible and are within the scope of this disclosure.
  • Some mobile applications may compile a significant number of shaders during a single runtime of the application.
  • Each shader binary for a mobile computing system may be 1-2 Kbytes in size. This may cause the persistent cache 108 to fill up.
  • One solution is to track the usage of shader binaries and to delete any old entries based on a least recently used (LRU) cache storage algorithm.
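  • A minimal sketch of the LRU approach, assuming an in-memory map with a byte budget. The class name `LRUShaderCache` and the capacity value are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class LRUShaderCache:
    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        binary = self.entries.get(key)
        if binary is not None:
            self.entries.move_to_end(key)  # mark as most recently used
        return binary

    def put(self, key, binary: bytes) -> None:
        if key in self.entries:
            self.used_bytes -= len(self.entries.pop(key))
        self.entries[key] = binary
        self.used_bytes += len(binary)
        # Evict least recently used entries until the cache fits its budget.
        while self.used_bytes > self.capacity_bytes and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = LRUShaderCache(capacity_bytes=4096)
cache.put("a", bytes(2048))
cache.put("b", bytes(2048))
cache.get("a")               # "a" is now more recently used than "b"
cache.put("c", bytes(2048))  # over budget, so "b" is evicted
```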
  • Another solution may be to use a coarse ring buffer scheme using a pair of persistent files. For example, when one of the persistent files is filled up, the other persistent file may be truncated and new entries appended to that persistent file.
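  • The two-file scheme might look roughly like the sketch below. Python lists stand in for the two persistent files, and the class name `TwoFileRingCache` and the size limit are assumptions; the source does not specify a file format or lookup order.

```python
class TwoFileRingCache:
    """Two append-only stores; when the active one fills, wipe the other and switch."""

    def __init__(self, max_bytes_per_file: int):
        self.max_bytes = max_bytes_per_file
        self.files = [[], []]   # each list stands in for one persistent file
        self.sizes = [0, 0]
        self.active = 0

    def put(self, key, binary: bytes) -> None:
        if self.sizes[self.active] + len(binary) > self.max_bytes:
            other = 1 - self.active
            self.files[other].clear()   # "truncate" the other file
            self.sizes[other] = 0
            self.active = other         # new entries are appended there
        self.files[self.active].append((key, binary))
        self.sizes[self.active] += len(binary)

    def get(self, key):
        # Check the active file first, newest entries first.
        for index in (self.active, 1 - self.active):
            for stored_key, binary in reversed(self.files[index]):
                if stored_key == key:
                    return binary
        return None


ring = TwoFileRingCache(max_bytes_per_file=2048)
for name in ("a", "b", "c", "d", "e"):
    ring.put(name, bytes(1024))
```

  • The scheme is coarse because a whole file of entries is discarded at once, but it needs no per-entry usage tracking.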
  • Shader binaries stored in a persistent cache may be compressed (e.g., with RLE compression) beforehand.
  • The shader binary is compressed immediately before placement into the cache and decompressed immediately after retrieval from the cache.
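  • As an illustration of the compress-on-store, decompress-on-load idea, the following sketch implements a simple byte-level run-length encoding. The (run length, byte value) pair format is an assumption; the source names RLE only as an example and does not describe an encoding.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as (run_length, byte_value) pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decompress(packed: bytes) -> bytes:
    """Expand (run_length, byte_value) pairs back into the original bytes."""
    out = bytearray()
    for j in range(0, len(packed), 2):
        out += bytes([packed[j + 1]]) * packed[j]
    return bytes(out)


binary = b"\x00" * 300 + b"\x7f\x7f\x01"
packed = rle_compress(binary)        # compressed just before storing
restored = rle_decompress(packed)    # decompressed just after loading
```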
  • A selected shader (or a plurality of shaders) 102 is received by the graphics driver 104.
  • The graphics driver 104 searches the cache 108 for a shader binary that is a compiled and linked version of the selected shader 102, ready to be executed by the graphics processor 110.
  • If the shader binary is not found in the cache 108, the graphics driver 104 forwards the shader 102 to the compiler 106 for compiling and linking to produce the desired shader binary.
  • This shader binary is returned to the graphics driver 104, which then forwards it both to the graphics processor 110 for execution and to the cache 108 for persistent storage.
  • The cache 108 may be searched for a desired shader binary by searching the cache 108 for a key that is associated with the desired shader binary.
  • A shader binary paired with an associated key is stored in the cache 108.
  • A key is computed that is associated with the desired shader binary and is used to search for a matching key stored in the cache 108.
  • The computed key is used to search the cache 108 for a matching key paired with the desired shader binary.
  • FIG. 2 illustrates an exemplary functional block diagram of a graphics rendering system operable to produce a shader binary for execution by a graphics processor and the computation of a key associated with the shader binary.
  • An exemplary graphics driver 202 provides shader arguments to a compiler 204 and a compute key block 206.
  • The compiler 204 receives the shader arguments and shader text, which it uses to compile and link the shader to produce a shader binary.
  • The compute key function block 206 receives the shader arguments and shader text, along with a graphics driver version, graphics processor type/version, and a compiler driver version, to produce a key that is associated with the produced shader binary.
  • A shader binary is based upon the corresponding shader arguments and shader text that are also dependent upon the current graphics driver version, compiler version, and graphics processor type/version.
  • A cache stores shader binaries and associates each shader binary with a key.
  • The key size and hash function may be chosen such that the probability of collisions is kept extremely low.
  • A 64-bit key may suffice, as the number of shaders that a typical application may compile for execution is at most in the tens of thousands.
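  • The claim that 64 bits suffice can be checked with the standard birthday (union) bound, which says the chance of any two of n uniformly random keys colliding is at most n(n-1)/2 divided by the number of possible keys. The figure of 50,000 shaders below is an assumed stand-in for "tens of thousands".

```python
def collision_probability(num_keys: int, key_bits: int = 64) -> float:
    """Union-bound estimate: p is at most n(n-1) / (2 * 2**key_bits)."""
    return num_keys * (num_keys - 1) / (2.0 * 2.0 ** key_bits)


p_64 = collision_probability(50_000)        # tens of thousands of shaders
p_32 = collision_probability(50_000, 32)    # same count with a 32-bit key
```

  • Under this bound a 64-bit key gives a collision chance below one in ten billion for 50,000 shaders, while a 32-bit key would make collisions likely.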
  • An exemplary key is computed using a hash function on a source shader string and the shader arguments that are also passed to the shader compiler. These arguments are computed internally by the graphics driver using the graphics driver's current state. The same shader instruction set may be compiled using different compiler arguments, resulting in multiple key/value pairs being added to the cache as graphics driver states change.
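  • A hedged sketch of such a key computation follows. The source does not name a hash function, so the use of BLAKE2b truncated to 64 bits is an assumption, as are the function name `compute_shader_key` and the example argument strings.

```python
import hashlib


def compute_shader_key(shader_text: str, shader_args: tuple) -> int:
    """Return a 64-bit key derived from the shader source and compiler arguments."""
    h = hashlib.blake2b(digest_size=8)  # 8 bytes = 64 bits
    for field in (shader_text, *shader_args):
        encoded = str(field).encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(encoded).to_bytes(4, "little"))
        h.update(encoded)
    return int.from_bytes(h.digest(), "little")


source = "void main() { gl_FragColor = vec4(1.0); }"
key_a = compute_shader_key(source, ("opt=2", "fog=off"))
key_b = compute_shader_key(source, ("opt=2", "fog=on"))   # same text, new driver state
```

  • Because the arguments are part of the hashed input, the same shader text compiled under two driver states yields two distinct cache entries, as the paragraph above describes.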
  • The cache 210 may also contain a global key that is computed at runtime based on a current graphics driver version, current compiler version, and other hardware related states.
  • The global key may be computed at graphics driver startup and compared to a global key previously stored in the cache 210. If there is a mismatch between the previously stored global key and the new global key, the cache 210 is out of date and all stored shader binaries are invalidated. When stored shader binaries are invalidated, a shader selected for execution will need to be compiled, even if a copy of the shader binary is stored in the cache 210 (in other words, the stored shader binary is invalid).
  • Such global keys may be used to ensure that only the latest updated shader binaries are used by the application.
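  • The global-key check at driver startup could be sketched as below. The dictionary layout, the function names, the version strings, and the use of SHA-256 are all illustrative assumptions; `gpu_type` stands in for the "other hardware related states" mentioned above.

```python
import hashlib


def compute_global_key(driver_version: str, compiler_version: str, gpu_type: str) -> str:
    """Hash the software and hardware identity into one global key."""
    joined = "\x00".join((driver_version, compiler_version, gpu_type))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def validate_cache(cache: dict, current_global_key: str) -> dict:
    """Return the cache unchanged if its global key matches; otherwise an empty one."""
    if cache.get("global_key") != current_global_key:
        return {"global_key": current_global_key, "entries": {}}
    return cache


stored = {"global_key": compute_global_key("1.0", "7.2", "gpu-a"),
          "entries": {"k1": b"\x01"}}
same = validate_cache(stored, compute_global_key("1.0", "7.2", "gpu-a"))
upgraded = validate_cache(stored, compute_global_key("1.1", "7.2", "gpu-a"))
```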
  • The shader binary and associated key are forwarded together to the cache 210 for storage.
  • The cache 210 is a persistent memory that retains its saved contents from previous runtime sessions.
  • The shader binary may be forwarded to a graphics processor 110 for execution.
  • A compute checksum function block 208 computes a checksum that is also stored in the cache 210 along with the shader binary and key.
  • A checksum may be used to ensure that a shader binary stored in a persistent cache 210 is uncorrupted.
  • Comparing a computed key to a key stored in the cache 210 may be used to ensure that the previously stored shader binary associated with the stored key is still valid (that there have not been software or hardware changes), while a checksum is used to ensure that a stored shader binary has not been corrupted due to copy errors, etc.
  • A key is used to ensure that a desired shader binary selected is the correct one and is up to date, while the checksum is used to ensure that there are no errors in the cached shader binary.
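  • The division of labor between key and checksum can be illustrated with a small sketch. CRC-32 from Python's `zlib` module is used here as an assumed checksum; the source does not say which checksum the compute checksum block 208 uses, and the helper names are hypothetical.

```python
import zlib


def store_entry(cache: dict, key: int, binary: bytes) -> None:
    """Store the binary together with a checksum of its bytes."""
    cache[key] = (binary, zlib.crc32(binary))


def load_entry(cache: dict, key: int):
    """Return the binary, or None if the key is missing or the checksum fails."""
    entry = cache.get(key)
    if entry is None:
        return None
    binary, checksum = entry
    return binary if zlib.crc32(binary) == checksum else None


cache = {}
store_entry(cache, 1, b"good microcode")
store_entry(cache, 2, b"soon corrupted")
# Simulate on-disk corruption: the bytes change but the stored checksum does not.
cache[2] = (b"soon corrupteX", cache[2][1])
```

  • A caller that receives `None` treats the entry as a miss and recompiles, whether the cause was an unknown key or a corrupted binary.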
  • FIG. 3A illustrates an exemplary graphics rendering system 300.
  • The graphics rendering system 300 illustrated in FIG. 3A comprises a shader 302, a graphics driver 304, a compiler 306, a cache 308, and a graphics processor 310.
  • A shader 302 is selected by a graphics driver 304 for execution by the graphics processor 310.
  • Before the shader 302 is executed, it must be compiled and linked by the compiler 306 to produce a shader binary.
  • The desired shader binary may be retrieved from the cache 308 and passed to the graphics processor 310 for execution.
  • When the compiler 306 compiles and links a shader to produce a shader binary, the paired shader binary and corresponding key are stored in the cache 308 for later retrieval.
  • GLSL shaders used in applications are compiled to produce an intermediate compilation using ARB assembly code, which is itself compiled to produce an executable using shader microcode or binary.
  • The ARB assembly and the shader microcode or binary are stored in the cache 308.
  • An OpenGL graphics rendering API supports user-supplied ARB assembly programs and fixed-function shading, and these are all cached in the cache 308 for later retrieval.
  • The desired shader binary (e.g., an ARB assembly and shader binary used to produce the desired executable) retrieved during a current runtime session was stored in the cache 308 during a previous runtime session.
  • A shader is passed to the compiler 306 and an ARB assembly is returned to the graphics driver 304, after which the graphics driver 304 passes the ARB assembly to the compiler 306 to produce shader microcode, which is executed by the graphics processor 310.
  • The ARB assembly and shader binaries are compressed to minimize the required footprint in the cache 308 and to possibly further reduce load times.
  • RLE compression is used.
  • A checksum may also be used to evaluate cached shader binaries to prevent execution problems should the stored shader binary be corrupted (e.g., due to abnormal process termination or due to improper file locking).
  • FIG. 3B illustrates an exemplary functional block diagram of the graphics rendering system 300 illustrated in FIG. 3A.
  • A shader 302 is selected by a graphics driver 304 for execution by a graphics processor 310.
  • The shader 302 is selected by the graphics driver 304 during a first phase 304-A.
  • The shader 302 is passed to a frontend compiler 306-A for compiling.
  • The frontend compiler 306-A returns an ARB assembly to the graphics driver 304 during the first phase 304-A.
  • The ARB assembly is passed from the first phase 304-A to the second phase 304-B and forwarded to the backend compiler 306-B for further compiling.
  • The backend compiler 306-B returns a shader microcode for execution by the graphics processor 310.
  • The backend compiler 306-B may produce a shader microcode or binary.
  • The first and second graphics driver phases 304-A, 304-B are performed by the graphics driver 304.
  • The frontend compiler functionality (306-A) and the backend compiler functionality (306-B) are implemented with a single compiler 306.
  • The ARB assembly and shader microcode and associated keys may be stored in the cache 308 for persistent storage. In other words, separate unique keys for the ARB assembly code portion and for the shader microcode are created and stored.
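  • The two-stage arrangement, with a separate key for each stage, might be sketched as follows. The two compile functions are placeholders that do not produce real ARB assembly or microcode, and every name here (`key_for`, `cached`, `build`) is hypothetical.

```python
import hashlib

cache = {}
calls = {"frontend": 0, "backend": 0}


def key_for(stage: str, text: str) -> str:
    """Separate key per stage, so assembly and microcode never share an entry."""
    return hashlib.sha256(f"{stage}\x00{text}".encode("utf-8")).hexdigest()


def frontend_compile(glsl: str) -> str:
    calls["frontend"] += 1
    return f"!!ARBfp1.0 # from: {glsl}"      # placeholder, not real ARB assembly


def backend_compile(arb_asm: str) -> bytes:
    calls["backend"] += 1
    return arb_asm.encode("utf-8")[::-1]     # placeholder "microcode"


def cached(stage, text, compile_fn):
    key = key_for(stage, text)
    if key not in cache:
        cache[key] = compile_fn(text)
    return cache[key]


def build(glsl: str) -> bytes:
    arb_asm = cached("arb", glsl, frontend_compile)            # first driver phase
    return cached("microcode", arb_asm, backend_compile)       # second driver phase


first = build("void main() {}")
second = build("void main() {}")   # both stages are served from the cache
```

  • Keying the second stage on the assembly text means a user-supplied ARB assembly program could enter at that stage and still benefit from the microcode cache.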
  • FIG. 4 illustrates an exemplary flow diagram illustrating a process for compiling a shader 102 for execution by a graphics processor 110 in accordance with an embodiment of the present invention.
  • A shader 102 is selected for execution by a graphics processor 110.
  • A single shader is selected for execution.
  • A plurality of shaders are selected for execution together as a program object.
  • A key is computed for the selected shader.
  • A key is computed based upon shader arguments, shader text, current graphics driver version, and a current compiler driver version. The computed key is therefore associated with the desired shader binary.
  • The computed key is compared with keys stored in a cache 108. As discussed herein, during step 406, the computed key is compared to the stored keys to determine if a key associated with the desired shader binary is stored in the cache 108. If the desired shader binary is in the cache 108, the associated key will match the computed key.
  • In step 408 of FIG. 4, a determination is made as to whether or not the desired key is found in the cache 108. In other words, is there a match between the computed key (that is based upon the desired shader binary) and a key previously stored in the cache 108? If a key in the cache 108 matches the computed key, the process continues to step 410 of FIG. 4. However, if the desired key is not found in the cache 108, the process continues to step 416 of FIG. 4.
  • In step 410 of FIG. 4, a shader binary associated with the stored key that matches the computed key is retrieved from the cache 108 and a checksum is performed on the retrieved shader binary.
  • In step 412 of FIG. 4, a determination is made as to whether the checksum passes. If the checksum passes, the process continues to step 414 of FIG. 4. If the checksum does not pass, the process continues to step 416 of FIG. 4.
  • In step 414 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
  • In step 416 of FIG. 4, the selected shader or plurality of shaders is compiled and linked to produce a shader binary.
  • In step 418 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
  • In step 420 of FIG. 4, the shader binary and associated key and checksum are stored in the cache 108. As discussed herein, after the shader binary and the associated key and checksum are stored in the cache 108, the shader binary will be available for execution by the graphics processor 110 during future runtime sessions without having to compile and link the shader again.
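  • The flow of steps 406 through 420 can be condensed into one function, sketched below under several assumptions: the cache is a plain dictionary, CRC-32 stands in for the unspecified checksum, and `fake_compile` and `fake_key` are hypothetical stand-ins for the compiler and the key computation. For brevity the sketch stores the binary (step 420) before returning it for execution (step 418).

```python
import zlib


def get_shader_binary(shader_text, shader_args, cache, compile_fn, compute_key):
    key = compute_key(shader_text, shader_args)        # compute the key
    entry = cache.get(key)                             # steps 406/408: search the cache
    if entry is not None:
        binary, checksum = entry                       # step 410: retrieve the binary
        if zlib.crc32(binary) == checksum:             # step 412: verify the checksum
            return binary, "hit"                       # step 414: execute cached binary
    binary = compile_fn(shader_text, shader_args)      # step 416: compile and link
    cache[key] = (binary, zlib.crc32(binary))          # step 420: store binary and checksum
    return binary, "compiled"                          # step 418: execute new binary


compiles = []


def fake_compile(text, args):
    compiles.append(text)
    return f"{text}|{args}".encode("utf-8")


def fake_key(text, args):
    return (text, args)


cache = {}
r1 = get_shader_binary("shader A", ("opt",), cache, fake_compile, fake_key)
r2 = get_shader_binary("shader A", ("opt",), cache, fake_compile, fake_key)
```

  • A failed checksum falls through to the compile path, so a corrupted entry is simply overwritten by a freshly compiled binary.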

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Power Sources (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A method for compiling a shader for execution by a graphics processor. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.

Description

    CROSS-REFERENCES TO RELATED APPLICATIONS
  • This application claims the benefit of Provisional Application No. 61/585,620, filed on Jan. 11, 2012, titled “GRAPHICS PROCESSOR CLOCK SCALING, APPLICATION LOAD TIME IMPROVEMENTS, AND DYNAMICALLY ADJUSTING RESOLUTION OF RENDER BUFFER TO IMPROVE AND STABILIZE FRAME TIMES OF A GRAPHICS PROCESSOR,” by Swaminathan Narayanan, et al., which is herein incorporated by reference.
  • TECHNICAL FIELD
  • Embodiments of the present disclosure relate generally to the field of graphics processing and more specifically to the field of improved shader binary caching and execution for efficient graphics processing.
  • BACKGROUND
  • High level graphics languages (e.g., OpenGL and DirectX) allow applications to specify the execution of particular shaders. Shaders are instruction sets that define how certain pieces of geometry or fragments are processed by a graphics processor. These shader instruction sets can be quite long and detailed in what they do, and often execute millions of times. In order to enable execution of these shaders on graphics processors, a compiler is employed. An exemplary compiler takes an instruction set written with a programming language (e.g., C programming language and other similar programming languages) and compiles the instruction set into a shader binary code or microcode that can be executed on the graphics processor.
  • One or more of these shader instruction sets may be compiled together to form an entire execution pipeline or program object. An exemplary process by which an application selects shaders and compiles the selected shader sources into binaries and links them together into a program object may require an unbounded amount of time. An exemplary compiler may require an extensive amount of optimization time while compiling and linking intermediate results and/or a final output.
  • Shader compilation times on mobile hardware can take a significant portion of frame render time. For example, on an application, such as a video game, running at 60 Hz, the frame render time is roughly 16 ms. When the application compiles a handful of shaders, which take on average 3-5 ms to compile on current mobile hardware, the shader compilation time can easily exceed the frame time and unfortunately cause visible stuttering on the screen.
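  • The arithmetic behind this example can be checked with a short sketch. The function names are hypothetical, and treating "a handful" as four shaders is an assumption made only for illustration.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def compile_overrun_ms(num_shaders: int, ms_per_shader: float,
                       refresh_hz: float = 60.0) -> float:
    """Milliseconds by which compiling alone exceeds one frame (negative if it fits)."""
    return num_shaders * ms_per_shader - frame_budget_ms(refresh_hz)


budget = frame_budget_ms(60.0)           # about 16.7 ms at 60 Hz
overrun = compile_overrun_ms(4, 5.0)     # 4 shaders at 5 ms each: 20 ms of compiling
```

  • Four shaders at 5 ms each already overrun the 60 Hz frame budget before any rendering work is done, which is the stutter the paragraph describes.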
  • An application attempting to compile shaders during runtime risks frame hitches or a gap of time when rendering stops or slows as shaders are compiled and programs linked together. Such visible stuttering or frame rate hitches are undesirable. Applications may attempt to get around this by compiling in-between runtimes, such that the results of the compile or link are not required for immediate execution. Despite such timing efforts, there are states or contexts that may change in 3D graphics, requiring one or more shaders to be recompiled during runtime. So even if an application attempts to compile and link all required shaders ahead of time (such as in-between levels of a video game) it is still possible for the application to require shader recompiling during runtime.
  • It would also be difficult for an application vendor to supply a shader binary library that contains all of the possible binaries that would need to be stored so as to avoid compiling. Such an exemplary effort may result in a binary library containing hundreds of thousands of possible shader binaries to supply shader binaries for every possible configuration and shader combination possible. Even so, should unexpected changes occur, the stored binaries would then be out of date. Shader binaries will need to be recreated whenever the graphics hardware, application, or graphics drivers change so that the recompiled shader binaries are updated. In other words, a large binary library will not provide the required flexibility to update the executable shader binaries to reflect any changes to hardware and software.
  • SUMMARY OF THE INVENTION
  • Embodiments of the present invention provide solutions to the challenges inherent in efficiently compiling shaders during runtime. According to one embodiment of the present invention, a method for compiling a shader for execution by a graphics processor is disclosed. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
  • In a computer system according to one embodiment of the present invention, the computer system comprises a processor, a graphics processor, and a memory. The memory is operable to store instructions that, when executed, perform a method for compiling a shader for execution by a graphics processor. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
  • In a computer system according to one embodiment of the present invention, the computer system comprises a compiler, a memory, a graphics driver module, and a graphics processor. The compiler is operable to compile and link shader source code to create a shader binary. The memory is operable to store a plurality of shader binaries. Each shader binary is paired with an associated key. The graphics driver module is operable to select one or more shaders for execution by the graphics processor and to compute a key for a selected shader, and is further operable to search the memory for a copy of the computed key. A shader binary is passed from the memory for execution by the graphics processor if the copy of the computed key is located in the memory. Otherwise, the compiler is operable to compile and link the shader to create a shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention will be better understood from the following detailed description, taken in conjunction with the accompanying drawing figures in which like reference characters designate like elements and in which:
  • FIG. 1 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries in accordance with an embodiment of the present invention;
  • FIG. 2 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries and corresponding keys in accordance with an embodiment of the present invention;
  • FIG. 3A illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention;
  • FIG. 3B illustrates an exemplary functional block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention; and
  • FIG. 4 illustrates an exemplary flow diagram illustrating steps of a computer implemented method for compiling a shader for execution by a graphics processor in accordance with an embodiment of the present invention.
  • DETAILED DESCRIPTION
  • Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments of the present invention. The drawings showing embodiments of the invention are semi-diagrammatic and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing Figures. Similarly, although the views in the drawings for the ease of description generally show similar orientations, this depiction in the Figures is arbitrary for the most part. Generally, the invention can be operated in any orientation.
  • Notation and Nomenclature:
  • Some portions of the detailed descriptions, which follow, are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
  • It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “accessing” or “executing” or “storing” or “rendering” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. When a component appears in several embodiments, the use of the same reference numeral signifies that the component is the same component as illustrated in the original embodiment.
  • The present invention provides a solution to the increasing challenges inherent in compiling and linking shader source code to produce shader binaries at runtime. Various embodiments of the present disclosure provide an exemplary persistent cache memory that stores shader binaries and associated keys. As discussed in detail below, when a graphics driver selects a shader instruction set for execution by a graphics processor, a cache is searched for a key associated with a shader binary for the selected shader instruction set. If the key is found in the cache, then the corresponding shader binary is sent to the graphics processor for execution; otherwise, the shader instruction set is sent to a compiler for compiling and linking to create a shader binary for execution by the graphics processor and storage in the cache.
  • Improving Application Load Times by Caching Shader Binaries in a Persistent Store:
  • FIG. 1 illustrates an exemplary graphics rendering system comprising a shader instruction set 102 (hereafter referred to as a shader), a graphics driver module 104, a compiler 106, a cache 108, and a graphics processor 110. As discussed herein, an exemplary shader 102 may comprise shader source code written in a high-level programming language, such as the C programming language or other similar languages. In another embodiment, a plurality of shaders 102 (e.g., a program object) may be compiled, linked and stored in the cache 108. In one exemplary embodiment, the cache 108 is a persistent memory that retains stored shader binaries and associated keys between runtime sessions so that previously compiled and linked shaders are immediately available (in the cache) the next time they need to be executed. In one embodiment, an exemplary cache 108 for a mobile computing system may be approximately 2-8 Mbytes in size, while an exemplary cache 108 for a desktop computing system may be approximately 64 Mbytes. Other cache sizes are also possible and are within the scope of this disclosure.
  • As discussed herein, some mobile applications, such as a WebGL compatible browser, may compile a significant number of shaders during a single runtime of the application. In one embodiment, each shader binary for a mobile computing system may be 1-2 Kbytes in size. This may cause the persistent cache 108 to fill up. One solution is to track the usage of shader binaries and to delete any old entries based on a least recently used (LRU) cache storage algorithm. Another solution may be to use a coarse ring buffer scheme using a pair of persistent files. For example, when one of the persistent files is filled up, the other persistent file may be truncated and new entries appended to that persistent file.
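  • By way of non-limiting illustration, the coarse ring buffer scheme described above may be sketched in Python as follows. The class name TwoSegmentCache, its method names, and the use of in-memory dictionaries as stand-ins for the pair of persistent files are assumptions made for this sketch and do not appear in the disclosure.

```python
class TwoSegmentCache:
    """Coarse ring buffer over two segments.

    Each dict is a hypothetical stand-in for one append-only persistent file.
    """

    def __init__(self, segment_capacity_bytes):
        self.capacity = segment_capacity_bytes
        self.segments = [{}, {}]
        self.sizes = [0, 0]
        self.active = 0  # index of the segment that receives new entries

    def put(self, key, binary):
        if self.sizes[self.active] + len(binary) > self.capacity:
            # The active segment is full: truncate the other segment
            # and append new entries to it from now on.
            self.active = 1 - self.active
            self.segments[self.active] = {}
            self.sizes[self.active] = 0
        self.segments[self.active][key] = binary
        self.sizes[self.active] += len(binary)

    def get(self, key):
        # Search the newer segment first, then the older one.
        for index in (self.active, 1 - self.active):
            if key in self.segments[index]:
                return self.segments[index][key]
        return None
```

  • In this sketch, the oldest entries are discarded a whole segment at a time, which avoids per-entry usage tracking at the cost of coarser eviction than an LRU policy.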
  • In one embodiment, shader binaries stored in a persistent cache may be compressed (e.g., RLE compression) beforehand. In one embodiment, the shader binary is compressed immediately before placement into the cache and decompressed immediately after removal from the cache.
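  • As one possible illustration of the compression step, a byte-oriented run-length encoding may be sketched as follows. The (count, value) pair layout and the function names rle_compress and rle_decompress are assumptions of this sketch; the disclosure does not specify a particular RLE format.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as (run_length, byte_value) pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decompress(blob: bytes) -> bytes:
    """Invert rle_compress by expanding each (run_length, byte_value) pair."""
    out = bytearray()
    for i in range(0, len(blob), 2):
        out += bytes([blob[i + 1]]) * blob[i]
    return bytes(out)
```

  • A binary would be passed through rle_compress immediately before placement into the cache and through rle_decompress immediately after retrieval. This simple format only shrinks data that contains long runs of identical bytes.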
  • As illustrated in FIG. 1, a selected shader (or a plurality of shaders) 102 is received by the graphics driver 104. After receiving the shader 102, the graphics driver 104 searches the cache 108 for a shader binary that is a compiled and linked version of the selected shader 102, ready to be executed by the graphics processor. As illustrated in FIG. 1, if there is not a matching shader binary in the cache 108, the graphics driver 104 forwards the shader 102 to the compiler 106 for compiling and linking to produce the desired shader binary. This shader binary is forwarded to the graphics driver 104, where it is then forwarded to both the graphics processor 110 for execution and to the cache 108 for persistent storage. In one exemplary embodiment, the cache 108 may be searched for a desired shader binary by searching the cache 108 for a key that is associated with the desired shader binary. As discussed in detail below, a shader binary paired with an associated key is stored in the cache 108. As also discussed herein, when a shader is selected for execution, a key is computed that is associated with the desired shader binary and is used to search for a matching key stored in the cache 108. In other words, the computed key is used to search the cache 108 for a matching key paired with the desired shader binary.
  • FIG. 2 illustrates an exemplary functional block diagram of a graphics rendering system operable to produce a shader binary for execution by a graphics processor and the computation of a key associated with the shader binary. As illustrated in FIG. 2, an exemplary graphics driver 202 provides shader arguments to a compiler 204 and a compute key function block 206. The compiler 204 receives the shader arguments and shader text that are used by the compiler 204 to compile and link the shader to produce a shader binary. As also illustrated in FIG. 2, the compute key function block 206 receives the shader arguments and shader text along with a graphics driver version, graphics processor type/version, and a compiler driver version to produce a key that is associated with the produced shader binary. In other words, a shader binary is based upon the corresponding shader arguments and shader text that are also dependent upon the current graphics driver version, compiler version, and graphics processor type/version.
  • In one embodiment, a cache stores shader binaries and associates each shader binary with a key. In one embodiment, the key size and hash function used to produce a key may be chosen such that the probability of collisions is kept extremely low. In one embodiment, a 64-bit key may suffice, as the number of possible shaders that a typical application may compile for execution is at most in the tens of thousands. In one embodiment, an exemplary key is computed using a hash function on a source shader string and the shader arguments that are also passed to the shader compiler. These arguments are computed internally by the graphics driver using the graphics driver's current state. The same shader instruction set may be compiled using different compiler arguments, resulting in multiple key/value pairs being added to the cache as graphics driver states change.
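  • For illustration only, a 64-bit key computed by hashing the source shader string together with the shader arguments may be sketched as follows, along with a birthday-bound estimate of the collision probability. The choice of BLAKE2b as the hash function, the length-prefixed encoding, and the function names are assumptions of this sketch; the disclosure does not name a specific hash function.

```python
import hashlib


def compute_shader_key(shader_text: str, shader_args: tuple) -> int:
    """Return a 64-bit key derived from the shader text and its arguments."""
    h = hashlib.blake2b(digest_size=8)  # 8 bytes = 64-bit digest
    for part in (shader_text, *map(str, shader_args)):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(4, "little"))
        h.update(data)
    return int.from_bytes(h.digest(), "little")


def collision_probability(num_shaders: int, key_bits: int = 64) -> float:
    """Birthday-bound approximation: p is roughly n(n-1) / 2^(bits+1)."""
    return num_shaders * (num_shaders - 1) / 2 ** (key_bits + 1)
```

  • Under this approximation, an application that compiles tens of thousands of shaders has a collision probability far below one in a billion with a 64-bit key. Changing any argument yields a different key, so the same shader text compiled under different driver states occupies separate cache entries.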
  • In one embodiment, as discussed herein, the cache 210 may also contain a global key that is computed at runtime based on a current graphics driver version, current compiler version, and other hardware related states. The global key may be computed at graphics driver startup and compared to a global key previously stored in the cache 210. If there is a mismatch between the previously stored global key and the new global key, the cache 210 is out of date and all stored shader binaries are invalidated. When stored shader binaries are invalidated, a shader selected for execution will need to be compiled, even if a copy of the shader binary is stored in the cache 210 (in other words, the stored shader binary is invalid). Such global keys may be used to ensure that only the latest updated shader binaries are used by the application.
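  • The global key comparison performed at graphics driver startup may be sketched, purely as an example, as follows. The dictionary layout of the cache, the use of SHA-256, and the function names compute_global_key and validate_cache are hypothetical and chosen only for this sketch.

```python
import hashlib


def compute_global_key(driver_version: str, compiler_version: str, gpu_type: str) -> str:
    """Combine version and hardware identifiers into a single global key."""
    text = "\0".join((driver_version, compiler_version, gpu_type))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def validate_cache(cache: dict, current_global_key: str) -> bool:
    """Return True if stored entries are still usable.

    On a mismatch, all stored shader binaries are invalidated and the
    new global key is recorded.
    """
    if cache.get("global_key") == current_global_key:
        return True
    cache["entries"] = {}
    cache["global_key"] = current_global_key
    return False
```

  • After an invalidation, every lookup misses, so each selected shader is recompiled and stored again under the new global key.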
  • As illustrated in FIG. 2, the shader binary and associated key are forwarded together to the cache 210 for storage. In one embodiment, the cache 210 is a persistent memory that retains its saved contents from previous runtime sessions. As illustrated in FIG. 1, the shader binary may be forwarded to a graphics processor 110 for execution. To ensure that the shader binary has not been corrupted, a compute checksum function block 208 computes a checksum that is also stored in the cache 210 along with the shader binary and key.
  • In one embodiment, a checksum may be used to ensure that a shader binary stored in a persistent cache 210 is uncorrupted. As noted herein, comparing a computed key to a key stored in the cache 210 may be used to ensure that the previously stored shader binary associated with the stored key is still valid (that there have not been software or hardware changes) while a checksum is used to ensure that a stored shader binary has not been corrupted due to copy errors, etc. In other words, a key is used to ensure that a desired shader binary selected is the correct one and is up to date, while the checksum is used to ensure that there are no errors in the cached shader binary.
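  • The distinct roles of the key and the checksum may be illustrated with the following sketch, in which the key selects the correct entry and the checksum detects corruption of the stored bytes. The use of CRC-32 and the record layout are assumptions of this sketch; the disclosure does not specify a checksum algorithm.

```python
import zlib


def make_record(key: int, binary: bytes) -> dict:
    """Bundle a shader binary with its key and a CRC-32 checksum."""
    return {"key": key, "binary": binary, "checksum": zlib.crc32(binary)}


def load_binary(record: dict, computed_key: int):
    """Return the stored binary, or None if the record must not be used."""
    if record["key"] != computed_key:
        return None  # wrong or out-of-date entry
    if zlib.crc32(record["binary"]) != record["checksum"]:
        return None  # stored bytes were corrupted
    return record["binary"]
```

  • A return value of None in either case signals that the shader should be compiled again rather than executed from the cache.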
  • FIG. 3A illustrates an exemplary graphics rendering system 300. The graphics rendering system 300 illustrated in FIG. 3A comprises a shader 302, a graphics driver 304, a compiler 306, a cache 308, and a graphics processor 310. In one embodiment, a shader 302 is selected by a graphics driver 304 for execution by the graphics processor 310. As noted herein, before the shader is executed, the shader must be compiled and linked by the compiler to produce a shader binary. In one embodiment, rather than compiling and linking the shader to produce the required shader binary, the desired shader binary may be retrieved from the cache 308 and passed to the graphics processor 310 for execution. As discussed herein, when the compiler 306 compiles and links a shader to produce a shader binary, the paired shader binary and corresponding key are stored in the cache 308 for later retrieval.
  • In one embodiment, GLSL shaders used in applications are compiled to produce an intermediate compilation using ARB assembly code, which is compiled itself to produce an executable using shader microcode or binary. In one embodiment, the ARB assembly and the shader microcode or binary are stored in the cache 308. In one exemplary environment, an OpenGL graphics rendering API supports user-supplied ARB assembly programs and fixed-function shading, and these are all cached in the cache 308 for later retrieval.
  • In one embodiment, the desired shader binary (e.g., an ARB assembly and shader binary used to produce the desired executable) retrieved during a current runtime session was stored in the cache 308 during a previous runtime session. As illustrated in FIGS. 3A and 3B, a shader is passed to the compiler 306 and an ARB assembly is returned to the graphics driver 304, after which the graphics driver 304 passes the ARB assembly to the compiler 306 to produce a shader microcode which is executed by the graphics processor 310. In one embodiment, the ARB assembly and shader binaries are compressed to minimize a required footprint in the cache 308 and to possibly further reduce load times. In one embodiment, RLE compression is used. As discussed herein, a checksum may also be used to evaluate cached shader binaries to prevent execution problems should the stored shader binary be corrupted (e.g., due to abnormal process termination or due to improper file locking).
  • FIG. 3B illustrates an exemplary functional block diagram of the graphics rendering system 300 illustrated in FIG. 3A. In FIG. 3B, a shader 302 is selected by a graphics driver 304 for execution by a graphics processor 310. As illustrated in FIG. 3B, the shader 302 is selected by the graphics driver 304 during a first phase 304-A. During the first phase 304-A, the shader 302 is passed to a frontend compiler 306-A for compiling. The frontend compiler 306-A returns an ARB assembly to the graphics driver 304 during the first phase 304-A. The ARB assembly is passed from the first phase 304-A to the second phase 304-B and forwarded to the backend compiler 306-B for further compiling. As illustrated in FIG. 3B, the backend compiler 306-B returns a shader microcode for execution by the graphics processor 310. In one embodiment, the backend compiler 306-B may produce a shader microcode or binary.
  • In one embodiment, as illustrated in FIGS. 3A and 3B, the first and second graphics driver phases 304-A, 304-B are performed by the graphics driver 304. In one embodiment, the frontend compiler functionality (306-A) and the backend compiler functionality (306-B) are implemented with a single compiler 306. As further illustrated in FIG. 3B, when the ARB assembly and the shader microcode are produced, the ARB assembly and shader microcode and associated keys may be stored in the cache 308 for persistent storage. In other words, separate unique keys for the ARB assembly code portion and for the shader microcode are created and stored.
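  • The two-phase arrangement, in which the ARB assembly and the shader microcode are cached under separate keys, may be sketched as follows. The frontend and backend callables stand in for the frontend compiler 306-A and the backend compiler 306-B. Deriving the microcode key from the ARB assembly, the use of SHA-256, and all function names are assumptions of this sketch rather than details taken from the disclosure.

```python
import hashlib


def _stage_key(stage: str, payload: bytes) -> str:
    """Hypothetical per-stage key: the stage name keeps the two key spaces apart."""
    return hashlib.sha256(stage.encode("utf-8") + b"\0" + payload).hexdigest()


def build_microcode(shader_text: str, frontend, backend, cache: dict) -> bytes:
    """Produce shader microcode, reusing cached results of either phase."""
    # First phase: shader source -> ARB assembly.
    asm_key = _stage_key("arb-assembly", shader_text.encode("utf-8"))
    assembly = cache.get(asm_key)
    if assembly is None:
        assembly = frontend(shader_text)
        cache[asm_key] = assembly

    # Second phase: ARB assembly -> shader microcode.
    ucode_key = _stage_key("microcode", assembly)
    microcode = cache.get(ucode_key)
    if microcode is None:
        microcode = backend(assembly)
        cache[ucode_key] = microcode
    return microcode
```

  • One consequence of keying the second phase on the assembly is that a user-supplied ARB assembly program can hit the microcode cache directly, without passing through the first phase.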
  • FIG. 4 illustrates an exemplary flow diagram illustrating a process for compiling a shader 102 for execution by a graphics processor 110 in accordance with an embodiment of the present invention. In step 402 of FIG. 4, a shader 102 is selected for execution by a graphics processor 110. In one embodiment, a single shader is selected for execution. In another embodiment a plurality of shaders are selected for execution together as a program object.
  • In step 404 of FIG. 4, a key is computed for the selected shader. In one embodiment, as discussed herein, a key is computed based upon shader arguments, shader text, current graphics driver version, and a current compiler driver version. The computed key is therefore associated with the desired shader binary. In step 406 of FIG. 4, the computed key is compared with keys stored in a cache 108. As discussed herein, during step 406, the computed key is compared to the stored keys to determine if a key associated with the desired shader binary is stored in the cache 108. If the desired shader binary is in the cache 108, the associated key will match the computed key.
  • In step 408 of FIG. 4, a determination is made as to whether or not the desired key is found in the cache 108. In other words, is there a match between the computed key (that is based upon the desired shader binary) and a key previously stored in the cache 108? If a key in the cache 108 matches the computed key, the process continues to step 410 of FIG. 4. However, if the desired key is not found in the cache 108, the process continues to step 416 of FIG. 4.
  • In step 410 of FIG. 4, a shader binary associated with the stored key that matches the computed key is retrieved from the cache 108 and a checksum is performed on the retrieved shader binary. In step 412 of FIG. 4, a determination is made as to whether the checksum passes. If the checksum passes, the process continues to step 414 of FIG. 4. If the checksum does not pass, the process continues to step 416 of FIG. 4. In step 414 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
  • In step 416 of FIG. 4, the selected shader or plurality of shaders is compiled and linked to produce a shader binary. In step 418 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution. Lastly, in step 420 of FIG. 4, the shader binary and associated key and checksum are stored in the cache 108. As discussed herein, after the shader binary and the associated key and checksum are stored in the cache 108, the shader binary will be available for execution by the graphics processor 110 during future runtime sessions without having to compile and link the shader again.
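  • Steps 402 through 420 of FIG. 4 may be summarized, as a minimal and non-limiting sketch, by the following Python function. The function name get_shader_binary, the compile_and_link callable, the dictionary used as the cache, and the choice of SHA-256 and CRC-32 are all assumptions made for illustration. The returned tag only indicates which path was taken.

```python
import hashlib
import zlib


def get_shader_binary(shader_text, shader_args, versions, cache, compile_and_link):
    """Return (binary, source), where source is "cache" or "compiled"."""
    # Step 404: compute a key from shader text, arguments, and version strings.
    material = "\0".join((shader_text, *shader_args, *versions)).encode("utf-8")
    key = hashlib.sha256(material).hexdigest()

    # Steps 406-408: search the cache for a matching key.
    entry = cache.get(key)
    if entry is not None:
        binary, checksum = entry
        # Steps 410-412: verify the checksum of the retrieved binary.
        if zlib.crc32(binary) == checksum:
            return binary, "cache"  # step 414: ready for the graphics processor

    # Step 416: no usable entry, so compile and link the shader.
    binary = compile_and_link(shader_text, shader_args)
    # Step 420: store the binary with its checksum under the computed key.
    cache[key] = (binary, zlib.crc32(binary))
    # Step 418: the caller passes the returned binary to the graphics processor.
    return binary, "compiled"
```

  • In this sketch, the binary is stored before it is returned to the caller, whereas FIG. 4 passes the binary to the graphics processor in step 418 before storing it in step 420; the resulting cache contents are the same. A failed checksum falls through to the compile path, and the new entry overwrites the corrupted one.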
  • Although certain preferred embodiments and methods have been disclosed herein, it will be apparent from the foregoing disclosure to those skilled in the art that variations and modifications of such embodiments and methods may be made without departing from the spirit and scope of the invention. It is intended that the invention shall be limited only to the extent required by the appended claims and the rules and principles of applicable law.

Claims (20)

What is claimed is:
1. A method for compiling a shader for execution by a graphics processor, the method comprising:
selecting a shader for execution;
computing a computed key for the selected shader;
searching a memory for a copy of the computed key; and
passing a shader binary stored in the memory to the graphics processor for execution if the copy of the computed key is located in the memory, otherwise compiling the shader to produce a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
2. The method of claim 1, wherein compiling the shader to produce a shader binary comprises computing an associated key to be stored with the shader binary in the memory.
3. The method of claim 1 further comprising:
verifying a checksum associated with the shader binary before execution by the graphics processor; and
recompiling the shader to produce a replacement shader binary for execution by the graphics processor and storing the shader binary in the memory if the verification fails.
4. The method of claim 1 further comprising:
generating a first global key;
comparing the first global key to a second global key previously stored in the memory; and
invalidating all binaries stored in the memory if the first global key mismatches the second global key.
5. The method of claim 2, wherein a key is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
6. The method of claim 1, wherein a shader binary is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
7. The method of claim 1, wherein the storing the shader binary in the memory comprises storing an associated key and an associated checksum with the shader binary in the memory.
8. A computer system comprising:
a processor;
a graphics processor;
and a memory, wherein the memory is operable to store instructions, that when executed by the processor perform a method for compiling a shader for execution by a graphics processor, the method comprising:
selecting a shader for execution;
computing a computed key for the selected shader;
searching a memory for a copy of the computed key; and
passing a shader binary stored in the memory to the graphics processor for execution if the copy of the computed key is located in the memory, otherwise compiling the shader to produce a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
9. The computer system of claim 8, wherein compiling the shader to produce a shader binary comprises computing an associated key to be stored with the shader binary in the memory.
10. The computer system of claim 8, wherein the method further comprises:
verifying a checksum associated with the shader binary before execution by the graphics processor; and
recompiling the shader to produce a replacement shader binary for execution by the graphics processor and storing the shader binary in the memory if the verification fails.
11. The computer system of claim 8, wherein the method further comprises:
generating a first global key;
comparing the first global key to a second global key previously stored in the memory; and
invalidating all binaries stored in the memory if the first global key mismatches the second global key.
12. The computer system of claim 9, wherein a key is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
13. The computer system of claim 8, wherein a shader binary is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
14. The computer system of claim 8, wherein the storing the shader binary in the memory comprises storing an associated key and an associated checksum with the shader binary in the memory.
15. A computer system comprising:
a compiler operable to compile and link shader source code to create a shader binary;
a memory operable to store a plurality of shader binaries, wherein each shader binary is paired with an associated key; and
a graphics driver module operable to select one or more shaders for execution by a graphics processor and to compute a computed key for a selected shader, and is further operable to search the memory for a copy of the computed key and pass a shader binary from the memory for execution by the graphics processor if the copy of the computed key is located in the memory, otherwise the compiler is operable to compile and link the shader to create a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
16. The computer system of claim 15, wherein the computed key associated with the shader binary is also stored in the memory.
17. The computer system of claim 15, wherein the graphics driver is further operable to:
verify a checksum associated with the shader binary before execution by the graphics processor; and
recompile the shader source code to produce a replacement shader binary for execution by the graphics processor and storage in the memory if the verification fails.
18. The computer system of claim 15, wherein the graphics driver is further operable to compute a first global key, compare the first global key to a second global key previously stored in the memory, and invalidate all shader binaries stored in the memory if the first global key mismatches the second global key.
19. The computer system of claim 16, wherein a key is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
20. The computer system of claim 15, wherein a shader binary is based on at least one of:
shader arguments;
shader text;
graphics driver version;
graphics processor type/version; and
compiler version.
US13/731,785 2012-01-11 2012-12-31 Application load times by caching shader binaries in a persistent storage Abandoned US20140043333A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/731,785 US20140043333A1 (en) 2012-01-11 2012-12-31 Application load times by caching shader binaries in a persistent storage

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261585620P 2012-01-11 2012-01-11
US13/731,785 US20140043333A1 (en) 2012-01-11 2012-12-31 Application load times by caching shader binaries in a persistent storage

Publications (1)

Publication Number Publication Date
US20140043333A1 true US20140043333A1 (en) 2014-02-13

Family

ID=48744794

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/712,839 Active 2035-01-02 US9773344B2 (en) 2012-01-11 2012-12-12 Graphics processor clock scaling based on idle time
US13/731,785 Abandoned US20140043333A1 (en) 2012-01-11 2012-12-31 Application load times by caching shader binaries in a persistent storage

Country Status (1)

Country Link
US (2) US9773344B2 (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130169642A1 (en) * 2011-12-29 2013-07-04 Qualcomm Incorporated Packing multiple shader programs onto a graphics processor
US20140047470A1 (en) * 2012-08-08 2014-02-13 Intel Corporation Securing Content from Malicious Instructions
KR20170055392A (en) * 2015-11-11 2017-05-19 삼성전자주식회사 Method and apparatus for processing graphics command
US9766649B2 (en) 2013-07-22 2017-09-19 Nvidia Corporation Closed loop dynamic voltage and frequency scaling
US9786026B2 (en) 2015-06-15 2017-10-10 Microsoft Technology Licensing, Llc Asynchronous translation of computer program resources in graphics processing unit emulation
US9881351B2 (en) 2015-06-15 2018-01-30 Microsoft Technology Licensing, Llc Remote translation, aggregation and distribution of computer program resources in graphics processing unit emulation
US9912322B2 (en) 2013-07-03 2018-03-06 Nvidia Corporation Clock generation circuit that tracks critical path across process, voltage and temperature variation
US9939883B2 (en) 2012-12-27 2018-04-10 Nvidia Corporation Supply-voltage control for device power management
US10002401B2 (en) 2015-11-11 2018-06-19 Samsung Electronics Co., Ltd. Method and apparatus for efficient processing of graphics commands
US10068370B2 (en) 2014-09-12 2018-09-04 Microsoft Technology Licensing, Llc Render-time linking of shaders
US20190232164A1 (en) * 2018-01-26 2019-08-01 Valve Corporation Distributing shaders between client machines for precaching
WO2019199848A1 (en) * 2018-04-10 2019-10-17 Google Llc Memory management in gaming rendering
US10466763B2 (en) 2013-12-02 2019-11-05 Nvidia Corporation Dynamic voltage-frequency scaling to limit power transients
US10898812B2 (en) 2018-04-02 2021-01-26 Google Llc Methods, devices, and systems for interactive cloud gaming
US11077364B2 (en) 2018-04-02 2021-08-03 Google Llc Resolution-based scaling of real-time interactive graphics
US11140207B2 (en) 2017-12-21 2021-10-05 Google Llc Network impairment simulation framework for verification of real time interactive media streaming systems
US11169733B2 (en) 2017-10-26 2021-11-09 Hewlett-Packard Development Company, L.P. Asset processing from persistent memory
US11305186B2 (en) 2016-05-19 2022-04-19 Google Llc Methods and systems for facilitating participation in a game session
US20220126203A1 (en) * 2020-10-25 2022-04-28 Meta Platforms, Inc. Systems and methods for distributing compiled shaders
US11369873B2 (en) 2018-03-22 2022-06-28 Google Llc Methods and systems for rendering and encoding content for online interactive gaming sessions
US11662051B2 (en) 2018-11-16 2023-05-30 Google Llc Shadow tracking of real-time interactive simulations for complex system analysis
US11684849B2 (en) 2017-10-10 2023-06-27 Google Llc Distributed sample-based game profiling with game metadata and metrics and gaming API platform supporting third-party content
US11872476B2 (en) 2018-04-02 2024-01-16 Google Llc Input device for an electronic system

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9449359B2 (en) 2012-09-13 2016-09-20 Ati Technologies Ulc Rendering settings in a multi-graphics processing unit system
US9201487B2 (en) * 2013-03-05 2015-12-01 Intel Corporation Reducing power consumption during graphics rendering
KR102164099B1 (en) 2014-03-28 2020-10-12 삼성전자 주식회사 System on chip, method thereof, and device including the same
US10009944B2 (en) * 2015-08-26 2018-06-26 International Business Machines Corporation Controlling wireless connection of a device to a wireless access point
US10331195B2 (en) 2016-06-06 2019-06-25 Qualcomm Incorporated Power and performance aware memory-controller voting mechanism
US10459760B2 (en) * 2016-07-08 2019-10-29 Sap Se Optimizing job execution in parallel processing with improved job scheduling using job currency hints
US10319065B2 (en) 2017-04-13 2019-06-11 Microsoft Technology Licensing, Llc Intra-frame real-time frequency control
US11209886B2 (en) 2019-09-16 2021-12-28 Microsoft Technology Licensing, Llc Clock frequency adjustment for workload changes in integrated circuit devices
US11714564B2 (en) * 2020-01-06 2023-08-01 Arm Limited Systems and methods of power management

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6272649B1 (en) * 1998-09-28 2001-08-07 Apple Computer, Inc. Method and system for ensuring cache file integrity
US6275919B1 (en) * 1998-10-15 2001-08-14 Creative Technology Ltd. Memory storage and retrieval with multiple hashing functions
US20030004921A1 (en) * 2001-06-28 2003-01-02 Schroeder Jacob J. Parallel lookups that keep order
US7015909B1 (en) * 2002-03-19 2006-03-21 Aechelon Technology, Inc. Efficient use of user-defined shaders to implement graphics operations
US7839410B1 (en) * 2006-07-28 2010-11-23 Nvidia Corporation Parameter buffer objects for shader parameters in a graphics library
US20120306877A1 (en) * 2011-06-01 2012-12-06 Apple Inc. Run-Time Optimized Shader Program

Family Cites Families (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5142690A (en) 1990-03-20 1992-08-25 Scientific-Atlanta, Inc. Cable television radio frequency data processor
US5396635A (en) 1990-06-01 1995-03-07 Vadem Corporation Power conservation apparatus having multiple power reduction levels dependent upon the activity of the computer system
US5551033A (en) 1991-05-17 1996-08-27 Zenith Data Systems Corporation Apparatus for maintaining one interrupt mask register in conformity with another in a manner invisible to an executing program
JPH06236284A (en) 1991-10-21 1994-08-23 Intel Corp Method for preservation and restoration of computer-system processing state and computer system
GB2264794B (en) 1992-03-06 1995-09-20 Intel Corp Method and apparatus for automatic power management in a high integration floppy disk controller
US5402492A (en) 1993-06-18 1995-03-28 Ast Research, Inc. Security system for a stand-alone computer
US5524249A (en) 1994-01-27 1996-06-04 Compaq Computer Corporation Video subsystem power management apparatus and method
US5557777A (en) 1994-09-30 1996-09-17 Apple Computer, Inc. Method and apparatus for system recovery from power loss
US5752050A (en) 1994-10-04 1998-05-12 Intel Corporation Method and apparatus for managing power consumption of external devices for personal computers using a power management coordinator
KR0135904B1 (en) 1994-12-30 1998-06-15 김광호 Power management system
JP3520611B2 (en) 1995-07-06 2004-04-19 株式会社日立製作所 Processor control method
US5889529A (en) 1996-03-22 1999-03-30 Silicon Graphics, Inc. System and method for generating and displaying complex graphic images at a constant frame rate
US5951689A (en) 1996-12-31 1999-09-14 Vlsi Technology, Inc. Microprocessor power control system
US6549240B1 (en) 1997-09-26 2003-04-15 Sarnoff Corporation Format and frame rate conversion for display of 24Hz source video
JPH11161385A (en) 1997-11-28 1999-06-18 Toshiba Corp Computer system and its system state control method
US20020126751A1 (en) 1998-05-22 2002-09-12 Christoph E. Scheurich Maintaining a frame rate in a digital imaging system
US6178523B1 (en) 1998-06-12 2001-01-23 Philips Consumer Communications Lp Battery-operated device with power failure recovery
US6347370B1 (en) 1998-12-30 2002-02-12 Intel Corporation Method and system for pre-loading system resume operation data on suspend operation
US6523128B1 (en) 1999-08-31 2003-02-18 Intel Corporation Controlling power for a sleeping state of a computer to prevent overloading of the stand-by power rails by selectively asserting a control signal
US6760850B1 (en) 2000-07-31 2004-07-06 Hewlett-Packard Development Company, L.P. Method and apparatus executing power on self test code to enable a wakeup device for a computer system responsive to detecting an AC power source
US6804763B1 (en) 2000-10-17 2004-10-12 Igt High performance battery backed ram interface
US6694451B2 (en) 2000-12-07 2004-02-17 Hewlett-Packard Development Company, L.P. Method for redundant suspend to RAM
US6542240B2 (en) 2001-03-30 2003-04-01 Alcan International Limited Method of identifying defective roll on a strip processing line
US7058834B2 (en) 2001-04-26 2006-06-06 Paul Richard Woods Scan-based state save and restore method and system for inactive state power reduction
TW501037B (en) 2001-05-01 2002-09-01 Benq Corp Interactive update method for parameter data
US6990594B2 (en) 2001-05-02 2006-01-24 Portalplayer, Inc. Dynamic power management of devices in computer system by selecting clock generator output based on a current state and programmable policies
JP4974202B2 (en) 2001-09-19 2012-07-11 ルネサスエレクトロニクス株式会社 Semiconductor integrated circuit
US20030156639A1 (en) 2002-02-19 2003-08-21 Jui Liang Frame rate control system and method
US6950951B2 (en) 2002-04-30 2005-09-27 Arm Limited Power control signalling
US7100013B1 (en) 2002-08-30 2006-08-29 Nvidia Corporation Method and apparatus for partial memory power shutoff
US6901298B1 (en) 2002-09-30 2005-05-31 Rockwell Automation Technologies, Inc. Saving and restoring controller state and context in an open operating system
US7043649B2 (en) 2002-11-20 2006-05-09 Portalplayer, Inc. System clock power management for chips with multiple processing modules
CN100416573C (en) 2003-05-07 2008-09-03 睦塞德技术公司 Power managers for an integrated circuit
US7174472B2 (en) 2003-05-20 2007-02-06 Arm Limited Low overhead integrated circuit power down and restart
US7428644B2 (en) 2003-06-20 2008-09-23 Micron Technology, Inc. System and method for selective memory module power management
US7076735B2 (en) 2003-07-21 2006-07-11 Landmark Graphics Corporation System and method for network transmission of graphical data through a distributed application
US7091967B2 (en) 2003-09-01 2006-08-15 Realtek Semiconductor Corp. Apparatus and method for image frame synchronization
US7426647B2 (en) 2003-09-18 2008-09-16 Vulcan Portals Inc. Low power media player for an electronic device
JP4789494B2 (en) 2004-05-19 2011-10-12 株式会社ソニー・コンピュータエンタテインメント Image frame processing method, apparatus, rendering processor, and moving image display method
US7401240B2 (en) 2004-06-03 2008-07-15 International Business Machines Corporation Method for dynamically managing power in microprocessor chips according to present processing demands
US7529958B2 (en) 2004-11-15 2009-05-05 Charles Roth Programmable power transition counter
US7659746B2 (en) 2005-02-14 2010-02-09 Qualcomm, Incorporated Distributed supply current switch circuits for enabling individual power domains
US7434072B2 (en) 2005-04-25 2008-10-07 Arm Limited Integrated circuit power management control
US8102398B2 (en) * 2006-03-03 2012-01-24 Ati Technologies Ulc Dynamically controlled power reduction method and circuit for a graphics processor
US7414550B1 (en) 2006-06-30 2008-08-19 Nvidia Corporation Methods and systems for sample rate conversion and sample clock synchronization
US7739533B2 (en) 2006-09-22 2010-06-15 Agere Systems Inc. Systems and methods for operational power management
US9209792B1 (en) 2007-08-15 2015-12-08 Nvidia Corporation Clock selection system and method
US8327173B2 (en) 2007-12-17 2012-12-04 Nvidia Corporation Integrated circuit device core power down independent of peripheral device operation
GB2455744B (en) 2007-12-19 2012-03-14 Advanced Risc Mach Ltd Hardware driven processor state storage prior to entering a low power mode
US8370663B2 (en) * 2008-02-11 2013-02-05 Nvidia Corporation Power management with dynamic frequency adjustments
US9411390B2 (en) 2008-02-11 2016-08-09 Nvidia Corporation Integrated circuit device having power domains and partitions based on use case power optimization
US9423846B2 (en) 2008-04-10 2016-08-23 Nvidia Corporation Powered ring to maintain IO state independent of the core of an integrated circuit device
JP4742174B1 (en) 2010-04-20 2011-08-10 株式会社ソニー・コンピュータエンタテインメント 3D video playback method and 3D video playback device
US8484498B2 (en) * 2010-08-26 2013-07-09 Advanced Micro Devices Method and apparatus for demand-based control of processing node performance
US9171350B2 (en) 2010-10-28 2015-10-27 Nvidia Corporation Adaptive resolution DGPU rendering to provide constant framerate with free IGPU scale up
US8694811B2 (en) * 2010-10-29 2014-04-08 Texas Instruments Incorporated Power management for digital devices
US9007362B2 (en) 2011-01-14 2015-04-14 Brian Mark Shuster Adaptable generation of virtual environment frames
US9839844B2 (en) 2011-03-01 2017-12-12 Disney Enterprises, Inc. Sprite strip renderer
US8650423B2 (en) * 2011-10-12 2014-02-11 Qualcomm Incorporated Dynamic voltage and clock scaling control based on running average, variant and trend
US9547602B2 (en) 2013-03-14 2017-01-17 Nvidia Corporation Translation lookaside buffer entry systems and methods
GB2547170B (en) 2014-12-05 2020-07-22 Piolax Inc Retaining device

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9530245B2 (en) * 2011-12-29 2016-12-27 Qualcomm Incorporated Packing multiple shader programs onto a graphics processor
US20130169642A1 (en) * 2011-12-29 2013-07-04 Qualcomm Incorporated Packing multiple shader programs onto a graphics processor
US20140047470A1 (en) * 2012-08-08 2014-02-13 Intel Corporation Securing Content from Malicious Instructions
US9646153B2 (en) * 2012-08-08 2017-05-09 Intel Corporation Securing content from malicious instructions
US10386916B2 (en) 2012-12-27 2019-08-20 Nvidia Corporation Supply-voltage control for device power management
US9939883B2 (en) 2012-12-27 2018-04-10 Nvidia Corporation Supply-voltage control for device power management
US9912322B2 (en) 2013-07-03 2018-03-06 Nvidia Corporation Clock generation circuit that tracks critical path across process, voltage and temperature variation
US9766649B2 (en) 2013-07-22 2017-09-19 Nvidia Corporation Closed loop dynamic voltage and frequency scaling
US10466763B2 (en) 2013-12-02 2019-11-05 Nvidia Corporation Dynamic voltage-frequency scaling to limit power transients
US10068370B2 (en) 2014-09-12 2018-09-04 Microsoft Technology Licensing, Llc Render-time linking of shaders
US9786026B2 (en) 2015-06-15 2017-10-10 Microsoft Technology Licensing, Llc Asynchronous translation of computer program resources in graphics processing unit emulation
US9881351B2 (en) 2015-06-15 2018-01-30 Microsoft Technology Licensing, Llc Remote translation, aggregation and distribution of computer program resources in graphics processing unit emulation
KR20170055392A (en) * 2015-11-11 2017-05-19 삼성전자주식회사 Method and apparatus for processing graphics command
US10002401B2 (en) 2015-11-11 2018-06-19 Samsung Electronics Co., Ltd. Method and apparatus for efficient processing of graphics commands
KR102254119B1 (en) * 2015-11-11 2021-05-20 삼성전자주식회사 Method and apparatus for processing graphics command
US11305186B2 (en) 2016-05-19 2022-04-19 Google Llc Methods and systems for facilitating participation in a game session
US11684849B2 (en) 2017-10-10 2023-06-27 Google Llc Distributed sample-based game profiling with game metadata and metrics and gaming API platform supporting third-party content
US11169733B2 (en) 2017-10-26 2021-11-09 Hewlett-Packard Development Company, L.P. Asset processing from persistent memory
US11140207B2 (en) 2017-12-21 2021-10-05 Google Llc Network impairment simulation framework for verification of real time interactive media streaming systems
KR20200115557A (en) * 2018-01-26 2020-10-07 밸브 코포레이션 Distributing shaders among client machines for precaching
US10668378B2 (en) * 2018-01-26 2020-06-02 Valve Corporation Distributing shaders between client machines for precaching
WO2019147974A3 (en) * 2018-01-26 2020-04-16 Valve Corporation Distributing shaders between client machines for precaching
KR102600025B1 (en) * 2018-01-26 2023-11-07 밸브 코포레이션 Distributing shaders between client machines for precaching
US20190232164A1 (en) * 2018-01-26 2019-08-01 Valve Corporation Distributing shaders between client machines for precaching
US11369873B2 (en) 2018-03-22 2022-06-28 Google Llc Methods and systems for rendering and encoding content for online interactive gaming sessions
US11872476B2 (en) 2018-04-02 2024-01-16 Google Llc Input device for an electronic system
US11077364B2 (en) 2018-04-02 2021-08-03 Google Llc Resolution-based scaling of real-time interactive graphics
US10898812B2 (en) 2018-04-02 2021-01-26 Google Llc Methods, devices, and systems for interactive cloud gaming
US11110348B2 (en) * 2018-04-10 2021-09-07 Google Llc Memory management in gaming rendering
EP4141781A1 (en) * 2018-04-10 2023-03-01 Google LLC Memory management in gaming rendering
WO2019199848A1 (en) * 2018-04-10 2019-10-17 Google Llc Memory management in gaming rendering
US11813521B2 (en) * 2018-04-10 2023-11-14 Google Llc Memory management in gaming rendering
US20210213354A1 (en) * 2018-04-10 2021-07-15 Google Llc Memory management in gaming rendering
US11662051B2 (en) 2018-11-16 2023-05-30 Google Llc Shadow tracking of real-time interactive simulations for complex system analysis
US20220126203A1 (en) * 2020-10-25 2022-04-28 Meta Platforms, Inc. Systems and methods for distributing compiled shaders

Also Published As

Publication number Publication date
US9773344B2 (en) 2017-09-26
US20130179711A1 (en) 2013-07-11

Similar Documents

Publication Publication Date Title
US20140043333A1 (en) Application load times by caching shader binaries in a persistent storage
US8935683B2 (en) Inline function linking
US10223528B2 (en) Technologies for deterministic code flow integrity protection
US20170161040A1 (en) Arranging Binary Code Based on Call Graph Partitioning
US7730463B2 (en) Efficient generation of SIMD code in presence of multi-threading and other false sharing conditions and in machines having memory protection support
US8561045B2 (en) Constructing runtime state for inlined code
US8615735B2 (en) System and method for blurring instructions and data via binary obfuscation
US20140082597A1 (en) Unifying static and dynamic compiler optimizations in source-code bases
US20110321002A1 (en) Rewriting Branch Instructions Using Branch Stubs
US9607160B2 (en) Method and apparatus for providing string encryption and decryption in program files
US6975325B2 (en) Method and apparatus for graphics processing using state and shader management
US9280490B2 (en) Secure computing
US20160154746A1 (en) Secure computing
US9952843B2 (en) Partial program specialization at runtime
US8943480B2 (en) Setting breakpoints in optimized instructions
WO2012154606A1 (en) Efficient conditional flow control compilation
Sabanal Hiding behind ART
US5854928A (en) Use of run-time code generation to create speculation recovery code in a computer system
US11379195B2 (en) Memory ordering annotations for binary emulation
US20180349156A1 (en) Techniques for performing dynamic linking
US10521206B2 (en) Supporting compiler variable instrumentation for uninitialized memory references
Besnard et al. A framework for automatic and parameterizable memoization
US11860996B1 (en) Security concepts for web frameworks
US11615014B2 (en) Using relocatable debugging information entries to save compile time
US20230418950A1 (en) Methods, Devices, and Systems for Control Flow Integrity

Legal Events

Date Code Title Description
AS Assignment

Owner name: CELLCO PARTNERSHIP D/B/A VERIZON WIRELESS, NEW JER

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:POLEHN, DONNA L;CHANG, PATRICIA R;REEL/FRAME:029838/0517

Effective date: 20121227

Owner name: VERIZON PATENT AND LICENSING INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AKSU, ARDA;KAKADIA, DEEPAK;MACIAS, JOHN F;AND OTHERS;SIGNING DATES FROM 20121221 TO 20121231;REEL/FRAME:029838/0462

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION