US20140043333A1 - Application load times by caching shader binaries in a persistent storage - Google Patents
Application load times by caching shader binaries in a persistent storage
- Publication number
- US20140043333A1 (US13/731,785)
- Authority
- US
- United States
- Prior art keywords
- shader
- key
- binary
- memory
- graphics processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/50—Lighting effects
- G06T15/80—Shading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3215—Monitoring of peripheral devices
- G06F1/3218—Monitoring of peripheral devices of display devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/325—Power saving in peripheral device
- G06F1/3265—Power saving in display device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G06F11/3419—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
- G06F11/3423—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time where the assessed time is active or idle time
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- The present invention provides a solution to the increasing challenges inherent in compiling and linking shader source code to produce shader binaries at runtime.
- Various embodiments of the present disclosure provide an exemplary persistent cache memory that stores shader binaries and associated keys.
- A cache is searched for a key associated with a shader binary for the selected shader instruction set. If the key is found in the cache, the corresponding shader binary is sent to the graphics processor for execution. Otherwise, the shader instruction set is sent to a compiler, which compiles and links it to create a shader binary that is executed by the graphics processor and stored in the cache.
- FIG. 1 illustrates an exemplary graphics rendering system comprising a shader instruction set 102 (hereafter referred to as a shader), a graphics driver module 104 , a compiler 106 , a cache 108 , and a graphics processor 110 .
- A shader 102 may comprise shader source code written in a high-level programming language, such as the C programming language or other similar languages.
- the cache 108 is a persistent memory that retains stored shader binaries and associated keys between runtime sessions so that previously compiled and linked shaders are immediately available (in the cache) the next time they need to be executed.
- an exemplary cache 108 for a mobile computing system may be approximately 2-8 Mbytes in size, while an exemplary cache 108 for a desktop computing system may be approximately 64 Mbytes. Other cache sizes are also possible and are within the scope of this disclosure.
- some mobile applications may compile a significant number of shaders during a single runtime of the application.
- each shader binary for a mobile computing system may be 1-2 Kbytes in size. This may cause the persistent cache 108 to fill up.
- One solution is to track the usage of shader binaries and to delete any old entries based on a least recently used (LRU) cache storage algorithm.
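The LRU policy described above can be sketched as follows. This is a minimal in-memory illustration, not the patent's implementation: the class name `LruShaderCache`, the byte budget, and the 2 KB binaries are assumptions chosen for the example.

```python
from collections import OrderedDict


class LruShaderCache:
    """Hypothetical byte-budgeted cache that evicts least recently used entries."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> shader binary, oldest first

    def get(self, key: str):
        binary = self.entries.get(key)
        if binary is not None:
            self.entries.move_to_end(key)  # mark as most recently used
        return binary

    def put(self, key: str, binary: bytes) -> None:
        if key in self.entries:
            self.used_bytes -= len(self.entries.pop(key))
        self.entries[key] = binary
        self.used_bytes += len(binary)
        # Evict from the least recently used end until the budget is met.
        while self.used_bytes > self.capacity_bytes and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)


lru = LruShaderCache(capacity_bytes=4096)
lru.put("a", bytes(2048))
lru.put("b", bytes(2048))
lru.get("a")               # "a" becomes the most recently used entry
lru.put("c", bytes(2048))  # over budget, so "b" is evicted
```

Tracking recency costs one reordering per lookup; a persistent version would also need to write that ordering to storage.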
- Another solution may be to use a coarse ring buffer scheme using a pair of persistent files. For example, when one of the persistent files is filled up, the other persistent file may be truncated and new entries appended to that persistent file.
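A coarse two-file ring of this kind might look like the sketch below. The file names, the `max_file_bytes` limit, and the in-memory index are assumptions made for illustration; a real driver would also have to persist the index or rebuild it by scanning the files.

```python
import tempfile
from pathlib import Path


class TwoFileRingCache:
    """Hypothetical coarse ring buffer built from a pair of persistent files."""

    def __init__(self, directory: Path, max_file_bytes: int) -> None:
        self.paths = [directory / "shaders_a.bin", directory / "shaders_b.bin"]
        for path in self.paths:
            path.write_bytes(b"")
        self.max_file_bytes = max_file_bytes
        self.active = 0
        self.index = [{}, {}]  # per file: key -> (offset, length)

    def put(self, key: str, binary: bytes) -> None:
        if self.paths[self.active].stat().st_size + len(binary) > self.max_file_bytes:
            # The active file is full: truncate the other file and append there.
            self.active = 1 - self.active
            self.paths[self.active].write_bytes(b"")
            self.index[self.active].clear()
        path = self.paths[self.active]
        offset = path.stat().st_size
        with path.open("ab") as handle:
            handle.write(binary)
        self.index[self.active][key] = (offset, len(binary))

    def get(self, key: str):
        for slot in (self.active, 1 - self.active):
            location = self.index[slot].get(key)
            if location is not None:
                offset, length = location
                with self.paths[slot].open("rb") as handle:
                    handle.seek(offset)
                    return handle.read(length)
        return None


ring = TwoFileRingCache(Path(tempfile.mkdtemp()), max_file_bytes=4)
ring.put("k1", b"aaa")  # goes to the first file
ring.put("k2", b"bbb")  # first file is full, so the second file is used
ring.put("k3", b"ccc")  # second file is full, so the first is truncated, dropping k1
```

Compared with per-entry LRU tracking, this scheme discards a whole file of entries at once, which is cheaper to maintain but less precise about what is kept.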
- shader binaries stored in a persistent cache may be compressed (e.g., RLE compression) beforehand.
- the shader binary is compressed immediately before placement into the cache and decompressed immediately after removal from the cache.
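As an illustration of compressing a binary on the way into the cache and decompressing it on the way out, here is a minimal run-length encoder. The text names RLE only as an example; the (count, value) byte-pair layout and the function names are assumptions.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run_length, value) byte pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (run_length, value) byte pairs back into the original bytes."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        out += bytes([encoded[j + 1]]) * encoded[j]
    return bytes(out)


sample_binary = b"\x00" * 300 + b"\x01\x02"  # stand-in for a shader binary
compressed = rle_encode(sample_binary)       # done just before storing in the cache
restored = rle_decode(compressed)            # done just after reading from the cache
```

RLE only helps when the binary contains long runs of identical bytes; data without such runs grows to twice its size under this layout.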
- a selected shader (or a plurality of shaders) 102 is received by the graphics driver 104 .
- the graphics driver 104 searches the cache 108 for a shader binary that is a compiled and linked version of the selected shader 102 , ready to be executed by the graphics processor.
- If the shader binary is not found in the cache 108, the graphics driver 104 forwards the shader 102 to the compiler 106 for compiling and linking to produce the desired shader binary.
- This shader binary is forwarded to the graphics driver 104 , where it is then forwarded to both the graphics processor 110 for execution and to the cache 108 for persistent storage.
- the cache 108 may be searched for a desired shader binary by searching the cache 108 for a key that is associated with the desired shader binary.
- a shader binary paired with an associated key is stored in the cache 108 .
- a key is computed that is associated with the desired shader binary and is used to search for a matching key stored in the cache 108 .
- the computed key is used to search the cache 108 for a matching key paired with the desired shader binary.
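The lookup described for FIG. 1 can be sketched as a small key-indexed store that survives between sessions. Everything named here is hypothetical: `fake_compile` stands in for the compiler 106, one file per key stands in for the cache 108, and a truncated SHA-256 digest stands in for the key.

```python
import hashlib
import tempfile
from pathlib import Path


def fake_compile(source: str) -> bytes:
    """Stand-in for the real compile-and-link step (hypothetical)."""
    return b"BIN:" + source.encode("utf-8")


class PersistentShaderCache:
    """Hypothetical persistent cache: one file per key in a directory."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory
        self.directory.mkdir(parents=True, exist_ok=True)
        self.compile_count = 0

    def get_binary(self, source: str) -> bytes:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
        path = self.directory / f"{key}.bin"
        if path.exists():
            return path.read_bytes()  # key found: reuse the stored binary
        binary = fake_compile(source)  # key not found: compile and store
        self.compile_count += 1
        path.write_bytes(binary)
        return binary


cache_dir = Path(tempfile.mkdtemp())
session1 = PersistentShaderCache(cache_dir)
first = session1.get_binary("void main() {}")
session2 = PersistentShaderCache(cache_dir)  # simulates a later runtime session
second = session2.get_binary("void main() {}")
```

The second session finds the key on disk and never calls the compiler, which is the load-time saving the persistent cache is meant to provide.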
- FIG. 2 illustrates an exemplary functional block diagram of a graphics rendering system operable to produce a shader binary for execution by a graphics processor and the computation of a key associated with the shader binary.
- An exemplary graphics driver 202 provides shader arguments to a compiler 204 and a compute key block 206.
- the compiler 204 receives the shader arguments and shader text that are used by the compiler 204 to compile and link the shader to produce a shader binary.
- the compute key function block 206 receives the shader arguments and shader text along with a graphics driver version, graphics processor type/version, and a compiler driver version to produce a key that is associated with the produced shader binary.
- a shader binary is based upon the corresponding shader arguments and shader text that are also dependent upon the current graphics driver version, compiler version, and graphics processor type/version.
- a cache stores shader binaries and associates each shader binary with a key.
- The key size and the hash function used to produce a key may be chosen such that the probability of collisions is kept extremely low.
- A 64-bit key may suffice, as the number of possible shaders that a typical application may compile for execution is at most in the tens of thousands.
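A rough check of the 64-bit claim, assuming the hash behaves like a uniform random function: with n keys there are n(n-1)/2 pairs, and each pair collides with probability 2^-64, which gives an upper bound on the chance of any collision. The figure of 50,000 shaders is an assumption standing in for "tens of thousands".

```python
def collision_probability(num_keys: int, key_bits: int = 64) -> float:
    """Union-bound estimate of the chance that any two keys collide."""
    pairs = num_keys * (num_keys - 1) // 2
    return pairs / 2 ** key_bits


p = collision_probability(50_000)
print(f"{p:.2e}")
```

Under these assumptions the estimate is on the order of 10^-10, which is consistent with the statement that collisions are extremely unlikely at this scale.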
- an exemplary key is computed using a hash function on a source shader string and the shader arguments that are also passed to the shader compiler. These arguments are computed internally by the graphics driver using the graphics driver's current state. The same shader instruction set may be compiled using different compiler arguments, resulting in multiple key/value pairs being added to the cache as graphic driver states change.
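One way to compute such a key is sketched below, assuming the inputs are available as strings. The function name, the argument strings, and the choice of BLAKE2b with an 8-byte digest are illustrative assumptions; the text does not name a specific hash function.

```python
import hashlib


def compute_key(shader_text, shader_args, driver_version, compiler_version, gpu_type):
    """Return a 64-bit key derived from everything that affects the compiled binary."""
    hasher = hashlib.blake2b(digest_size=8)  # 8 bytes gives a 64-bit key
    fields = (shader_text, *shader_args, driver_version, compiler_version, gpu_type)
    for field in fields:
        data = field.encode("utf-8")
        # Length-prefix each field so adjacent fields cannot blur together.
        hasher.update(len(data).to_bytes(4, "little"))
        hasher.update(data)
    return int.from_bytes(hasher.digest(), "little")


source = "void main() { gl_FragColor = vec4(1.0); }"
key_a = compute_key(source, ["opt=1"], "drv-1.0", "cc-2.0", "gpu-x")
key_b = compute_key(source, ["opt=2"], "drv-1.0", "cc-2.0", "gpu-x")
key_a_again = compute_key(source, ["opt=1"], "drv-1.0", "cc-2.0", "gpu-x")
```

The same source with different compiler arguments yields a different key, so each driver state gets its own key/value pair in the cache.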
- the cache 210 may also contain a global key that is computed at runtime based on a current graphics driver version, current compiler version, and other hardware related states.
- the global key may be computed at graphics driver startup and compared to a global key previously stored in the cache 210 . If there is a mismatch between the previously stored global key and the new global key, the cache 210 is out of date and all stored shader binaries are invalidated. When stored shader binaries are invalidated, a shader selected for execution will need to be compiled, even if a copy of the shader binary is stored in the cache 210 (in other words, the stored shader binary is invalid).
- Such global keys may be used to ensure that only the latest updated shader binaries are used by the application.
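The startup check can be sketched as follows. The dictionary layout of the stored cache and the version strings are assumptions made for the example.

```python
import hashlib


def compute_global_key(driver_version: str, compiler_version: str, hardware_id: str) -> str:
    """Hash the software and hardware state that every cached binary depends on."""
    joined = "\x00".join((driver_version, compiler_version, hardware_id))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def open_cache(stored: dict, current_global_key: str) -> dict:
    """Keep the stored entries only if the global key still matches."""
    if stored.get("global_key") != current_global_key:
        # Mismatch: the cache is out of date, so every stored binary is invalidated.
        return {"global_key": current_global_key, "entries": {}}
    return stored


old_key = compute_global_key("drv-1.0", "cc-2.0", "gpu-x")
on_disk = {"global_key": old_key, "entries": {"k1": b"binary"}}

same_setup = open_cache(on_disk, compute_global_key("drv-1.0", "cc-2.0", "gpu-x"))
after_update = open_cache(on_disk, compute_global_key("drv-1.1", "cc-2.0", "gpu-x"))
```

A single comparison at startup avoids checking each stored entry individually against the current driver and compiler versions.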
- the shader binary and associated key are forwarded together to the cache 210 for storage.
- the cache 210 is a persistent memory that retains its saved contents from previous runtime sessions.
- the shader binary may be forwarded to a graphics processor 110 for execution.
- a compute checksum function block 208 computes a checksum that is also stored in the cache 210 along with the shader binary and key.
- a checksum may be used to ensure that a shader binary stored in a persistent cache 210 is uncorrupted.
- comparing a computed key to a key stored in the cache 210 may be used to ensure that the previously stored shader binary associated with the stored key is still valid (that there have not been software or hardware changes) while a checksum is used to ensure that a stored shader binary has not been corrupted due to copy errors, etc.
- a key is used to ensure that a desired shader binary selected is the correct one and is up to date, while the checksum is used to ensure that there are no errors in the cached shader binary.
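The division of labor between key and checksum can be illustrated with a short sketch. CRC-32 from Python's `zlib` is an assumed choice; the text does not say which checksum is used, and the `store` and `load` helpers are hypothetical.

```python
import zlib


def store(cache: dict, key: str, binary: bytes) -> None:
    """Store the binary together with a checksum computed at write time."""
    cache[key] = (binary, zlib.crc32(binary))


def load(cache: dict, key: str):
    """Return the binary only if the key matches and the checksum still passes."""
    entry = cache.get(key)
    if entry is None:
        return None  # no matching key: wrong or outdated shader
    binary, checksum = entry
    if zlib.crc32(binary) != checksum:
        del cache[key]  # corrupted entry: drop it so the shader is recompiled
        return None
    return binary


checked_cache = {}
store(checked_cache, "k1", b"shader-binary")
good = load(checked_cache, "k1")

# Simulate corruption of the stored bytes, e.g. from an interrupted write.
checked_cache["k1"] = (b"shader-binarX", checked_cache["k1"][1])
bad = load(checked_cache, "k1")
```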
- FIG. 3A illustrates an exemplary graphics rendering system 300 .
- The graphics rendering system 300 illustrated in FIG. 3A comprises a shader 302, a graphics driver 304, a compiler 306, a cache 308, and a graphics processor 310.
- A shader 302 is selected by the graphics driver 304 for execution by the graphics processor 310.
- Before the shader is executed, it must be compiled and linked by the compiler 306 to produce a shader binary.
- Alternatively, the desired shader binary may be retrieved from the cache 308 and passed to the graphics processor 310 for execution.
- When the compiler 306 compiles and links a shader to produce a shader binary, the paired shader binary and corresponding key are stored in the cache 308 for later retrieval.
- GLSL shaders used in applications are compiled to produce an intermediate compilation using ARB assembly code, which is compiled itself to produce an executable using shader microcode or binary.
- ARB assembly and the shader microcode or binary are stored in the cache 308 .
- an OpenGL graphics rendering API supports user-supplied ARB assembly programs and fixed-function shading, and these are all cached in the cache 308 for later retrieval.
- the desired shader binary (e.g., an ARB assembly and shader binary used to produce the desired executable) retrieved during a current runtime session was stored in the cache 308 during a previous runtime session.
- A shader is passed to the compiler 306 and an ARB assembly is returned to the graphics driver 304, after which the graphics driver 304 passes the ARB assembly to the compiler 306 to produce shader microcode, which is executed by the graphics processor 310.
- the ARB assembly and shader binaries are compressed to minimize a required footprint in the cache 308 and to possibly further reduce load times.
- RLE compression is used.
- a checksum may also be used to evaluate cached shader binaries to prevent execution problems should the stored shader binary be corrupted (e.g., due to abnormal process termination or due to improper file locking).
- FIG. 3B illustrates an exemplary functional block diagram of the graphics rendering system 300 illustrated in FIG. 3A .
- a shader 302 is selected by a graphics driver 304 for execution by a graphics processor 310 .
- the shader 302 is selected by the graphics driver 304 during a first phase 304 -A.
- the shader 302 is passed to a frontend compiler 306 -A for compiling.
- the frontend compiler 306 -A returns an ARB assembly to the graphics driver 304 during the first phase 304 -A.
- the ARB assembly is passed from the first phase 304 -A to the second phase 304 -B and forwarded to the backend compiler 306 -B for further compiling.
- the backend compiler 306 -B returns a shader microcode for execution by the graphics processor 310 .
- the backend compiler 306 -B may produce a shader microcode or binary.
- the first and second graphics driver phases 304 -A, 304 -B are performed by the graphics driver 304 .
- the frontend compiler functionality ( 306 -A) and the backend compiler functionality ( 306 -B) are implemented with a single compiler 306 .
- the ARB assembly and shader microcode and associated keys may be stored in the cache 308 for persistent storage. In other words, separate unique keys for the ARB assembly code portion and for the shader microcode are created and stored.
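The two-stage arrangement with separate keys can be sketched as below. The `frontend_compile` and `backend_compile` functions are placeholders for compilers 306-A and 306-B, and the key derivation is an assumption.

```python
import hashlib

calls = {"frontend": 0, "backend": 0}


def key_of(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def frontend_compile(glsl: str) -> str:
    """Placeholder for the frontend compiler: shader source to ARB assembly."""
    calls["frontend"] += 1
    return "ARB(" + glsl + ")"


def backend_compile(arb_assembly: str) -> bytes:
    """Placeholder for the backend compiler: ARB assembly to microcode."""
    calls["backend"] += 1
    return b"UCODE:" + arb_assembly.encode("utf-8")


def get_microcode(glsl: str, asm_cache: dict, ucode_cache: dict) -> bytes:
    asm_key = key_of("glsl:" + glsl)  # key for the intermediate ARB assembly
    arb_assembly = asm_cache.get(asm_key)
    if arb_assembly is None:
        arb_assembly = frontend_compile(glsl)
        asm_cache[asm_key] = arb_assembly
    ucode_key = key_of("asm:" + arb_assembly)  # separate key for the microcode
    microcode = ucode_cache.get(ucode_key)
    if microcode is None:
        microcode = backend_compile(arb_assembly)
        ucode_cache[ucode_key] = microcode
    return microcode


asm_cache, ucode_cache = {}, {}
first_ucode = get_microcode("void main() {}", asm_cache, ucode_cache)
second_ucode = get_microcode("void main() {}", asm_cache, ucode_cache)
```

Keying the stages separately means a user-supplied ARB assembly program can reuse cached microcode even when no GLSL source was involved.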
- FIG. 4 illustrates an exemplary flow diagram illustrating a process for compiling a shader 102 for execution by a graphics processor 110 in accordance with an embodiment of the present invention.
- a shader 102 is selected for execution by a graphics processor 110 .
- a single shader is selected for execution.
- a plurality of shaders are selected for execution together as a program object.
- a key is computed for the selected shader.
- a key is computed based upon shader arguments, shader text, current graphics driver version, and a current compiler driver version. The computed key is therefore associated with the desired shader binary.
- the computed key is compared with keys stored in a cache 108 . As discussed herein, during step 406 , the computed key is compared to the stored keys to determine if a key associated with the desired shader binary is stored in the cache 108 . If the desired shader binary is in the cache 108 , the associated key will match the computed key.
- In step 408 of FIG. 4, a determination is made as to whether or not the desired key is found in the cache 108. In other words, is there a match between the computed key (that is based upon the desired shader binary) and a key previously stored in the cache 108? If a key in the cache 108 matches the computed key, the process continues to step 410 of FIG. 4. However, if the desired key is not found in the cache 108, the process continues to step 416 of FIG. 4.
- In step 410 of FIG. 4, a shader binary associated with the stored key that matches the computed key is retrieved from the cache 108 and a checksum is performed on the retrieved shader binary.
- In step 412 of FIG. 4, a determination is made as to whether the checksum passes. If the checksum passes, the process continues to step 414 of FIG. 4. If the checksum does not pass, the process continues to step 416 of FIG. 4.
- In step 414 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
- In step 416 of FIG. 4, the selected shader or plurality of shaders is compiled and linked to produce a shader binary.
- In step 418 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
- In step 420 of FIG. 4, the shader binary and associated key and checksum are stored in the cache 108. As discussed herein, after the shader binary and the associated key and checksum are stored in the cache 108, the shader binary will be available for execution by the graphics processor 110 during future runtime sessions without having to compile and link the shader again.
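The flow of FIG. 4 from step 406 onward can be traced in a short sketch. The dictionary cache, the `executed` list standing in for the graphics processor 110, the SHA-256 key, and the CRC-32 checksum are all assumptions made for illustration.

```python
import hashlib
import zlib


def run_shader(source: str, cache: dict, executed: list, compile_fn) -> str:
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()  # computed key
    entry = cache.get(key)                          # steps 406 and 408: search for the key
    if entry is not None:
        binary, checksum = entry                    # step 410: retrieve the stored binary
        if zlib.crc32(binary) == checksum:          # step 412: does the checksum pass?
            executed.append(binary)                 # step 414: execute the cached binary
            return "hit"
    binary = compile_fn(source)                     # step 416: compile and link
    executed.append(binary)                         # step 418: execute the new binary
    cache[key] = (binary, zlib.crc32(binary))       # step 420: store binary, key, checksum
    return "miss"


def toy_compile(source: str) -> bytes:
    return b"BIN:" + source.encode("utf-8")  # hypothetical compiler


flow_cache, executed = {}, []
r1 = run_shader("void main() {}", flow_cache, executed, toy_compile)
r2 = run_shader("void main() {}", flow_cache, executed, toy_compile)

# Corrupt the stored bytes: the checksum fails and the flow falls through to step 416.
only_key = next(iter(flow_cache))
flow_cache[only_key] = (b"garbage", flow_cache[only_key][1])
r3 = run_shader("void main() {}", flow_cache, executed, toy_compile)
```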
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Power Sources (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Description
- This application claims the benefit of Provisional Application No. 61/585,620, filed on Jan. 11, 2012, titled “GRAPHICS PROCESSOR CLOCK SCALING, APPLICATION LOAD TIME IMPROVEMENTS, AND DYNAMICALLY ADJUSTING RESOLUTION OF RENDER BUFFER TO IMPROVE AND STABILIZE FRAME TIMES OF A GRAPHICS PROCESSOR,” by Swaminathan Narayanan, et al., which is herein incorporated by reference.
- Embodiments of the present disclosure relate generally to the field of graphics processing and more specifically to the field of improved shader binary caching and execution for efficient graphics processing.
- High level graphics languages (e.g., OpenGL and DirectX) allow applications to specify the execution of particular shaders. Shaders are instruction sets that define how certain pieces of geometry or fragments are processed by a graphics processor. These shader instruction sets can be quite long and detailed in what they do, and often execute millions of times. In order to enable execution of these shaders on graphics processors, a compiler is employed. An exemplary compiler takes an instruction set written with a programming language (e.g., C programming language and other similar programming languages) and compiles the instruction set into a shader binary code or microcode that can be executed on the graphics processor.
- One or more of these shader instruction sets may be compiled together to form an entire execution pipeline or program object. An exemplary process by which an application selects shaders and compiles the selected shader sources into binaries and links them together into a program object may require an unbounded amount of time. An exemplary compiler may require an extensive amount of optimization time while compiling and linking intermediate results and/or a final output.
- Shader compilation times on mobile hardware can take a significant portion of frame render time. For example, on an application, such as a video game, running at 60 Hz, the frame render time is roughly 16 ms. When the application compiles a handful of shaders, which take on average 3-5 ms to compile on current mobile hardware, the shader compilation time can easily exceed the frame time and unfortunately cause visible stuttering on the screen.
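The arithmetic behind this example can be made explicit. The sketch below counts how many back-to-back compiles exceed one 60 Hz frame, ignoring the rendering work itself (a simplifying assumption that makes the estimate optimistic).

```python
import math

frame_budget_ms = 1000 / 60  # about 16.7 ms per frame at 60 Hz


def shaders_to_exceed_budget(compile_ms: float) -> int:
    """Smallest number of back-to-back compiles whose total time exceeds one frame."""
    return math.floor(frame_budget_ms / compile_ms) + 1


at_3_ms = shaders_to_exceed_budget(3.0)  # 6 shaders take 18 ms
at_5_ms = shaders_to_exceed_budget(5.0)  # 4 shaders take 20 ms
```

So between four and six shaders compiled within a single frame are enough to miss the frame deadline, even before any rendering work is counted.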
- An application attempting to compile shaders during runtime risks frame hitches or a gap of time when rendering stops or slows as shaders are compiled and programs linked together. Such visible stuttering or frame rate hitches are undesirable. Applications may attempt to get around this by compiling in-between runtimes, such that the results of the compile or link are not required for immediate execution. Despite such timing efforts, there are states or contexts that may change in 3D graphics, requiring one or more shaders to be recompiled during runtime. So even if an application attempts to compile and link all required shaders ahead of time (such as in-between levels of a video game) it is still possible for the application to require shader recompiling during runtime.
- It would also be difficult for an application vendor to supply a shader binary library that contains all of the possible binaries that would need to be stored so as to avoid compiling. Such an exemplary effort may result in a binary library containing hundreds of thousands of possible shader binaries to supply shader binaries for every possible configuration and shader combination possible. Even so, should unexpected changes occur, the stored binaries would then be out of date. Shader binaries will need to be recreated whenever the graphics hardware, application, or graphics drivers change so that the recompiled shader binaries are updated. In other words, a large binary library will not provide the required flexibility to update the executable shader binaries to reflect any changes to hardware and software.
- Embodiments of the present invention provide solutions to the challenges inherent in efficiently compiling shaders during runtime. According to one embodiment of the present invention, a method for compiling a shader for execution by a graphics processor is disclosed. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
- In a computer system according to one embodiment of the present invention, the computer system comprises a processor, a graphics processor, and a memory. The memory is operable to store instructions that, when executed, perform a method for compiling a shader for execution by a graphics processor. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
- In a computer system according to one embodiment of the present invention, the computer system comprises a compiler, a memory, a graphics driver module, and a graphics processor. The compiler is operable to compile and link shader source code to create a shader binary. The memory is operable to store a plurality of shader binaries. Each shader binary is paired with an associated key. The graphics driver module is operable to select one or more shaders for execution by the graphics processor and to compute a key for a selected shader, and is further operable to search the memory for a copy of the computed key. A shader binary is passed from the memory for execution by the graphics processor if the copy of the computed key is located in the memory. Otherwise, the compiler is operable to compile and link the shader to create a shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.
- The present invention will be better understood from the following detailed description, taken in conjunction with the accompanying drawing figures in which like reference characters designate like elements and in which:
- FIG. 1 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries in accordance with an embodiment of the present invention;
- FIG. 2 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries and corresponding keys in accordance with an embodiment of the present invention;
- FIG. 3A illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention;
- FIG. 3B illustrates an exemplary functional block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention; and
- FIG. 4 illustrates an exemplary flow diagram illustrating steps of a computer implemented method for compiling a shader for execution by a graphics processor in accordance with an embodiment of the present invention.
- Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments of the present invention. The drawings showing embodiments of the invention are semi-diagrammatic and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing Figures. Similarly, although the views in the drawings for the ease of description generally show similar orientations, this depiction in the Figures is arbitrary for the most part. Generally, the invention can be operated in any orientation.
- Some portions of the detailed descriptions, which follow, are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
- It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “accessing” or “executing” or “storing” or “rendering” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. When a component appears in several embodiments, the use of the same reference numeral signifies that the component is the same component as illustrated in the original embodiment.
- The present invention provides a solution to the increasing challenges inherent in compiling and linking shader source code to produce shader binaries at runtime. Various embodiments of the present disclosure provide an exemplary persistent cache memory that stores shader binaries and associated keys. As discussed in detail below, when a graphics driver selects a shader instruction set for execution by a graphics processor, a cache is searched for a key associated with a shader binary for the selected shader instruction set. If the key is found in the memory, the corresponding shader binary is sent to the graphics processor for execution; otherwise, the shader instruction set is sent to a compiler for compiling and linking to create a shader binary for execution by the graphics processor and storage in the memory.
- FIG. 1 illustrates an exemplary graphics rendering system comprising a shader instruction set 102 (hereafter referred to as a shader), a graphics driver module 104, a compiler 106, a cache 108, and a graphics processor 110. As discussed herein, an exemplary shader 102 may comprise shader source code written with a high-level programming language, such as the C programming language or other similar languages. In another embodiment, a plurality of shaders 102 (e.g., a program object) may be compiled, linked and stored in the cache 108. In one exemplary embodiment, the cache 108 is a persistent memory that retains stored shader binaries and associated keys between runtime sessions so that previously compiled and linked shaders are immediately available (in the cache) the next time they need to be executed. In one embodiment, an exemplary cache 108 for a mobile computing system may be approximately 2-8 Mbytes in size, while an exemplary cache 108 for a desktop computing system may be approximately 64 Mbytes. Other cache sizes are also possible and are within the scope of this disclosure.
- As discussed herein, some mobile applications, such as a WebGL compatible browser, may compile a significant number of shaders during a single runtime of the application. In one embodiment, each shader binary for a mobile computing system may be 1-2 Kbytes in size. This may cause the persistent cache 108 to fill up. One solution is to track the usage of shader binaries and to delete any old entries based on a least recently used (LRU) cache storage algorithm. Another solution may be to use a coarse ring buffer scheme using a pair of persistent files. For example, when one of the persistent files is filled up, the other persistent file may be truncated and new entries appended to that persistent file.
- In one embodiment, shader binaries stored in a persistent cache may be compressed (e.g., RLE compression) beforehand. In one embodiment, the shader binary is compressed immediately before placement into the cache and decompressed immediately after removal from the cache.
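The LRU option described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the class name `LruShaderCache`, its methods, and the byte-budget accounting are hypothetical, and an in-memory `OrderedDict` stands in for the persistent store.

```python
from collections import OrderedDict


class LruShaderCache:
    """Size-bounded key -> shader-binary map with least-recently-used eviction."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, key: int):
        """Return the cached binary, or None on a miss; a hit refreshes recency."""
        binary = self._entries.get(key)
        if binary is not None:
            self._entries.move_to_end(key)
        return binary

    def put(self, key: int, binary: bytes) -> None:
        """Insert or replace an entry, evicting the oldest entries if over budget."""
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = binary
        self.used_bytes += len(binary)
        while self.used_bytes > self.capacity_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)

    def __contains__(self, key: int) -> bool:
        return key in self._entries
```

With a budget of a few Mbytes and binaries of 1-2 Kbytes, a cache of this shape would hold on the order of a few thousand entries before the oldest ones start to be evicted.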
- As illustrated in FIG. 1, a selected shader (or a plurality of shaders) 102 is received by the graphics driver 104. After receiving the shader 102, the graphics driver 104 searches the cache 108 for a shader binary that is a compiled and linked version of the selected shader 102, ready to be executed by the graphics processor. As illustrated in FIG. 1, if there is not a matching shader binary in the cache 108, the graphics driver 104 forwards the shader 102 to the compiler 106 for compiling and linking to produce the desired shader binary. This shader binary is forwarded to the graphics driver 104, where it is then forwarded to both the graphics processor 110 for execution and to the cache 108 for persistent storage. In one exemplary embodiment, the cache 108 may be searched for a desired shader binary by searching the cache 108 for a key that is associated with the desired shader binary. As discussed in detail below, a shader binary paired with an associated key is stored in the cache 108. As also discussed herein, when a shader is selected for execution, a key is computed that is associated with the desired shader binary and is used to search for a matching key stored in the cache 108. In other words, the computed key is used to search the cache 108 for a matching key paired with the desired shader binary.
- FIG. 2 illustrates an exemplary functional block diagram of a graphics rendering system operable to produce a shader binary for execution by a graphics processor and the computation of a key associated with the shader binary. As illustrated in FIG. 2, an exemplary graphics driver 202 provides shader arguments to a compiler 204 and a compute key block 206. The compiler 204 receives the shader arguments and shader text that are used by the compiler 204 to compile and link the shader to produce a shader binary. As also illustrated in FIG. 2, the compute key function block 206 receives the shader arguments and shader text along with a graphics driver version, graphics processor type/version, and a compiler driver version to produce a key that is associated with the produced shader binary. In other words, a shader binary is based upon the corresponding shader arguments and shader text that are also dependent upon the current graphics driver version, compiler version, and graphics processor type/version.
- In one embodiment, a cache stores shader binaries and associates each shader binary with a key. In one embodiment, the key size and the hash function used to produce a key may be chosen such that the probability of collisions is kept extremely low. In one embodiment, a 64-bit key may suffice, because the number of possible shaders that a typical application may compile for execution is at most in the tens of thousands. In one embodiment, an exemplary key is computed using a hash function on a source shader string and the shader arguments that are also passed to the shader compiler. These arguments are computed internally by the graphics driver using the graphics driver's current state. The same shader instruction set may be compiled using different compiler arguments, resulting in multiple key/value pairs being added to the cache as graphics driver states change.
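The key computation of the compute key block can be sketched as below. The description does not name a hash function, so the use of BLAKE2b truncated to 64 bits is an assumption, as are the function name `compute_key`, its parameter names, and the length-prefixed field encoding.

```python
import hashlib


def compute_key(shader_text: str, shader_args: tuple, driver_version: str,
                gpu_type: str, compiler_version: str) -> int:
    """Hash everything that influences the compiled binary into a 64-bit key."""
    h = hashlib.blake2b(digest_size=8)  # 8 bytes = 64 bits (assumed hash choice)
    fields = [shader_text, repr(shader_args), driver_version, gpu_type, compiler_version]
    for field in fields:
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return int.from_bytes(h.digest(), "little")
```

Under the usual birthday approximation, the chance of any collision among n uniformly distributed 64-bit keys is roughly n²/2⁶⁵, which for n = 50,000 is about 7 × 10⁻¹¹. That is consistent with the statement that a 64-bit key may suffice for tens of thousands of shaders.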
- In one embodiment, as discussed herein, the cache 210 may also contain a global key that is computed at runtime based on a current graphics driver version, current compiler version, and other hardware related states. The global key may be computed at graphics driver startup and compared to a global key previously stored in the cache 210. If there is a mismatch between the previously stored global key and the new global key, the cache 210 is out of date and all stored shader binaries are invalidated. When stored shader binaries are invalidated, a shader selected for execution will need to be compiled, even if a copy of the shader binary is stored in the cache 210 (in other words, the stored shader binary is invalid). Such global keys may be used to ensure that only the latest updated shader binaries are used by the application.
- As illustrated in FIG. 2, the shader binary and associated key are forwarded together to the cache 210 for storage. In one embodiment, the cache 210 is a persistent memory that retains its saved contents from previous runtime sessions. As illustrated in FIG. 1, the shader binary may be forwarded to a graphics processor 110 for execution. To ensure that the shader binary has not been corrupted, a compute checksum function block 208 computes a checksum that is also stored in the cache 210 along with the shader binary and key.
- In one embodiment, a checksum may be used to ensure that a shader binary stored in a persistent cache 210 is uncorrupted. As noted herein, comparing a computed key to a key stored in the cache 210 may be used to ensure that the previously stored shader binary associated with the stored key is still valid (that there have not been software or hardware changes) while a checksum is used to ensure that a stored shader binary has not been corrupted due to copy errors, etc. In other words, a key is used to ensure that a desired shader binary selected is the correct one and is up to date, while the checksum is used to ensure that there are no errors in the cached shader binary.
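The two safeguards above, a global key checked at driver startup and a per-entry checksum checked on retrieval, can be sketched together. Everything named here is hypothetical: the `PersistentCache` record is an in-memory stand-in for the persistent store, the global key is reduced to the driver and compiler versions (the description also mentions other hardware related states), CRC-32 is an assumed checksum, and dropping a corrupted entry on detection is a design choice of this sketch.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class PersistentCache:
    """In-memory stand-in for the persistent cache: a global key plus entries."""
    global_key: str = ""
    entries: dict = field(default_factory=dict)  # key -> (binary, checksum)


def open_cache(cache: PersistentCache, driver_version: str, compiler_version: str) -> None:
    """At driver startup, invalidate every entry if the global key changed."""
    current = f"{driver_version}|{compiler_version}"
    if cache.global_key != current:
        cache.entries.clear()
        cache.global_key = current


def store(cache: PersistentCache, key: int, binary: bytes) -> None:
    """Store the binary together with a checksum of its bytes."""
    cache.entries[key] = (binary, zlib.crc32(binary))


def load(cache: PersistentCache, key: int):
    """Return the binary only if the key is present and the checksum verifies."""
    entry = cache.entries.get(key)
    if entry is None:
        return None  # miss: the caller must compile
    binary, checksum = entry
    if zlib.crc32(binary) != checksum:
        del cache.entries[key]  # corrupted: drop it and fall back to compiling
        return None
    return binary
```

The split mirrors the roles described above: the keys answer whether a stored binary is the right one and still current, while the checksum answers whether its bytes survived storage intact.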
- FIG. 3A illustrates an exemplary graphics rendering system 300. The graphics rendering system 300 illustrated in FIG. 3A comprises a shader 302, a graphics driver 304, a compiler 306, a cache 308, and a graphics processor 310. In one embodiment, a shader 302 is selected by a graphics driver 304 for execution by the graphics processor 310. As noted herein, before the shader is executed, the shader must be compiled and linked by the compiler to produce a shader binary. In one embodiment, rather than compiling and linking the shader to produce the required shader binary, the desired shader binary may be retrieved from the cache 308 and passed to the graphics processor 310 for execution. As discussed herein, when the compiler 306 compiles and links a shader to produce a shader binary, the paired shader binary and corresponding key are stored in the cache 308 for later retrieval.
- In one embodiment, GLSL shaders used in applications are compiled to produce an intermediate compilation using ARB assembly code, which is itself compiled to produce an executable using shader microcode or binary. In one embodiment, the ARB assembly and the shader microcode or binary are stored in the cache 308. In one exemplary environment, an OpenGL graphics rendering API supports user-supplied ARB assembly programs and fixed-function shading, and these are all cached in the cache 308 for later retrieval.
- In one embodiment, the desired shader binary (e.g., an ARB assembly and shader binary used to produce the desired executable) retrieved during a current runtime session was stored in the cache 308 during a previous runtime session. As illustrated in FIGS. 3A and 3B, a shader is passed to the compiler 306 and an ARB assembly is returned to the graphics driver 304, after which the graphics driver 304 passes the ARB assembly to the compiler 306 to produce shader microcode, which is executed by the graphics processor 310. In one embodiment, the ARB assembly and shader binaries are compressed to minimize a required footprint in the cache 308 and to possibly further reduce load times. In one embodiment, RLE compression is used. As discussed herein, a checksum may also be used to evaluate cached shader binaries to prevent execution problems should the stored shader binary be corrupted (e.g., due to abnormal process termination or due to improper file locking).
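The description names RLE compression without specifying a variant. The sketch below assumes the simplest byte-oriented form, in which the output is a sequence of (run length, byte value) pairs with runs capped at 255; the function names are hypothetical.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as (run_length, byte_value) pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decompress(blob: bytes) -> bytes:
    """Invert rle_compress by expanding each (run_length, byte_value) pair."""
    out = bytearray()
    for j in range(0, len(blob), 2):
        out += bytes([blob[j + 1]]) * blob[j]
    return bytes(out)
```

In this variant, input with no repeated bytes doubles in size, so the scheme only pays off when the stored data contains long runs, such as zero padding.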
- FIG. 3B illustrates an exemplary functional block diagram of the graphics rendering system 300 illustrated in FIG. 3A. In FIG. 3B, a shader 302 is selected by a graphics driver 304 for execution by a graphics processor 310. As illustrated in FIG. 3B, the shader 302 is selected by the graphics driver 304 during a first phase 304-A. During the first phase 304-A, the shader 302 is passed to a frontend compiler 306-A for compiling. The frontend compiler 306-A returns an ARB assembly to the graphics driver 304 during the first phase 304-A. The ARB assembly is passed from the first phase 304-A to the second phase 304-B and forwarded to the backend compiler 306-B for further compiling. As illustrated in FIG. 3B, the backend compiler 306-B returns a shader microcode for execution by the graphics processor 310. In one embodiment, the backend compiler 306-B may produce a shader microcode or binary.
- In one embodiment, as illustrated in FIGS. 3A and 3B, the first and second graphics driver phases 304-A, 304-B are performed by the graphics driver 304. In one embodiment, the frontend compiler functionality (306-A) and the backend compiler functionality (306-B) are implemented with a single compiler 306. As further illustrated in FIG. 3B, when the ARB assembly and the shader microcode are produced, the ARB assembly and shader microcode and associated keys may be stored in the cache 308 for persistent storage. In other words, separate unique keys for the ARB assembly code portion and for the shader microcode are created and stored.
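The two-phase arrangement, with a separate key for each compilation product, can be sketched as follows. The description does not say how the two keys are derived, so keying the ARB assembly by the shader source and keying the microcode by the ARB assembly is an assumption of this sketch. `frontend_compile` and `backend_compile` are hypothetical stand-ins that only fabricate placeholder output, and plain dictionaries stand in for the persistent cache.

```python
import hashlib

arb_cache: dict = {}        # key of GLSL source + args -> ARB assembly text
microcode_cache: dict = {}  # key of ARB assembly + args -> shader microcode
compile_calls = {"frontend": 0, "backend": 0}


def _key(stage: str, text: str, args: str) -> str:
    """Derive a per-stage key; the stage tag keeps the two key spaces distinct."""
    return hashlib.sha256(f"{stage}\0{args}\0{text}".encode("utf-8")).hexdigest()


def frontend_compile(glsl: str) -> str:
    """Hypothetical stand-in for the frontend compiler (GLSL -> ARB assembly)."""
    compile_calls["frontend"] += 1
    return f"!!ARBfp1.0\n# from: {glsl}\nEND"


def backend_compile(arb_asm: str) -> bytes:
    """Hypothetical stand-in for the backend compiler (ARB assembly -> microcode)."""
    compile_calls["backend"] += 1
    return arb_asm.encode("utf-8")[::-1]


def get_microcode(glsl: str, args: str = "") -> bytes:
    """Run both phases, consulting the cache before each compile step."""
    k1 = _key("arb", glsl, args)
    arb_asm = arb_cache.get(k1)
    if arb_asm is None:
        arb_asm = arb_cache[k1] = frontend_compile(glsl)
    k2 = _key("ucode", arb_asm, args)
    microcode = microcode_cache.get(k2)
    if microcode is None:
        microcode = microcode_cache[k2] = backend_compile(arb_asm)
    return microcode
```

One consequence of keying the second stage by the ARB assembly is that user-supplied ARB assembly programs could enter the pipeline at the second phase and reuse the same microcode cache.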
- FIG. 4 illustrates an exemplary flow diagram illustrating a process for compiling a shader 102 for execution by a graphics processor 110 in accordance with an embodiment of the present invention. In step 402 of FIG. 4, a shader 102 is selected for execution by a graphics processor 110. In one embodiment, a single shader is selected for execution. In another embodiment, a plurality of shaders are selected for execution together as a program object.
- In step 404 of FIG. 4, a key is computed for the selected shader. In one embodiment, as discussed herein, a key is computed based upon shader arguments, shader text, current graphics driver version, and a current compiler driver version. The computed key is therefore associated with the desired shader binary. In step 406 of FIG. 4, the computed key is compared with keys stored in a cache 108. As discussed herein, during step 406, the computed key is compared to the stored keys to determine if a key associated with the desired shader binary is stored in the cache 108. If the desired shader binary is in the cache 108, the associated key will match the computed key.
- In step 408 of FIG. 4, a determination is made as to whether or not the desired key is found in the cache 108. In other words, is there a match between the computed key (that is based upon the desired shader binary) and a key previously stored in the cache 108? If a key in the cache 108 matches the computed key, the process continues to step 410 of FIG. 4. However, if the desired key is not found in the cache 108, the process continues to step 416 of FIG. 4.
- In step 410 of FIG. 4, a shader binary associated with the stored key that matches the computed key is retrieved from the cache 108 and a checksum is performed on the retrieved shader binary. In step 412 of FIG. 4, a determination is made as to whether the checksum passes. If the checksum passes, the process continues to step 414 of FIG. 4. If the checksum does not pass, the process continues to step 416 of FIG. 4. In step 414 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.
- In step 416 of FIG. 4, the selected shader or plurality of shaders is compiled and linked to produce a shader binary. In step 418 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution. Lastly, in step 420 of FIG. 4, the shader binary and associated key and checksum are stored in the cache 108. As discussed herein, after the shader binary and the associated key and checksum are stored in the cache 108, the shader binary will be available for execution by the graphics processor 110 during future runtime sessions without having to compile and link the shader again.
- Although certain preferred embodiments and methods have been disclosed herein, it will be apparent from the foregoing disclosure to those skilled in the art that variations and modifications of such embodiments and methods may be made without departing from the spirit and scope of the invention. It is intended that the invention shall be limited only to the extent required by the appended claims and the rules and principles of applicable law.
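The flow of FIG. 4 (steps 404 through 420) can be sketched as a single function. This is an illustration under stated assumptions: `run_shader` and its parameters are hypothetical names, the key function, compiler, and graphics processor are passed in as callables, any dict-like mapping stands in for the persistent cache 108, and CRC-32 is an assumed checksum.

```python
import zlib


def run_shader(shader_text, shader_args, cache, compute_key, compile_and_link, execute):
    """Follow steps 404-420: look up by key, verify the checksum, else compile."""
    key = compute_key(shader_text, shader_args)          # step 404
    entry = cache.get(key)                               # steps 406-408
    if entry is not None:
        binary, checksum = entry                         # step 410
        if zlib.crc32(binary) == checksum:               # step 412
            execute(binary)                              # step 414
            return "hit"
    binary = compile_and_link(shader_text, shader_args)  # step 416
    execute(binary)                                      # step 418
    cache[key] = (binary, zlib.crc32(binary))            # step 420
    return "compiled"
```

Both the key miss at step 408 and the checksum failure at step 412 fall through to the same compile path, and the store at step 420 overwrites any corrupted entry under the same key.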
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/731,785 US20140043333A1 (en) | 2012-01-11 | 2012-12-31 | Application load times by caching shader binaries in a persistent storage |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261585620P | 2012-01-11 | 2012-01-11 | |
US13/731,785 US20140043333A1 (en) | 2012-01-11 | 2012-12-31 | Application load times by caching shader binaries in a persistent storage |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140043333A1 (en) | 2014-02-13 |
Family
ID=48744794
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/712,839 Active 2035-01-02 US9773344B2 (en) | 2012-01-11 | 2012-12-12 | Graphics processor clock scaling based on idle time |
US13/731,785 Abandoned US20140043333A1 (en) | 2012-01-11 | 2012-12-31 | Application load times by caching shader binaries in a persistent storage |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/712,839 Active 2035-01-02 US9773344B2 (en) | 2012-01-11 | 2012-12-12 | Graphics processor clock scaling based on idle time |
Country Status (1)
Country | Link |
---|---|
US (2) | US9773344B2 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6272649B1 (en) * | 1998-09-28 | 2001-08-07 | Apple Computer, Inc. | Method and system for ensuring cache file integrity |
US6275919B1 (en) * | 1998-10-15 | 2001-08-14 | Creative Technology Ltd. | Memory storage and retrieval with multiple hashing functions |
US20030004921A1 (en) * | 2001-06-28 | 2003-01-02 | Schroeder Jacob J. | Parallel lookups that keep order |
US7015909B1 (en) * | 2002-03-19 | 2006-03-21 | Aechelon Technology, Inc. | Efficient use of user-defined shaders to implement graphics operations |
US7839410B1 (en) * | 2006-07-28 | 2010-11-23 | Nvidia Corporation | Parameter buffer objects for shader parameters in a graphics library |
US20120306877A1 (en) * | 2011-06-01 | 2012-12-06 | Apple Inc. | Run-Time Optimized Shader Program |
2012
- 2012-12-12: US application 13/712,839 (US9773344B2), Active
- 2012-12-31: US application 13/731,785 (US20140043333A1), Abandoned
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6272649B1 (en) * | 1998-09-28 | 2001-08-07 | Apple Computer, Inc. | Method and system for ensuring cache file integrity |
US6275919B1 (en) * | 1998-10-15 | 2001-08-14 | Creative Technology Ltd. | Memory storage and retrieval with multiple hashing functions |
US20030004921A1 (en) * | 2001-06-28 | 2003-01-02 | Schroeder Jacob J. | Parallel lookups that keep order |
US7015909B1 (en) * | 2002-03-19 | 2006-03-21 | Aechelon Technology, Inc. | Efficient use of user-defined shaders to implement graphics operations |
US7839410B1 (en) * | 2006-07-28 | 2010-11-23 | Nvidia Corporation | Parameter buffer objects for shader parameters in a graphics library |
US20120306877A1 (en) * | 2011-06-01 | 2012-12-06 | Apple Inc. | Run-Time Optimized Shader Program |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9530245B2 (en) * | 2011-12-29 | 2016-12-27 | Qualcomm Incorporated | Packing multiple shader programs onto a graphics processor |
US20130169642A1 (en) * | 2011-12-29 | 2013-07-04 | Qualcomm Incorporated | Packing multiple shader programs onto a graphics processor |
US20140047470A1 (en) * | 2012-08-08 | 2014-02-13 | Intel Corporation | Securing Content from Malicious Instructions |
US9646153B2 (en) * | 2012-08-08 | 2017-05-09 | Intel Corporation | Securing content from malicious instructions |
US10386916B2 (en) | 2012-12-27 | 2019-08-20 | Nvidia Corporation | Supply-voltage control for device power management |
US9939883B2 (en) | 2012-12-27 | 2018-04-10 | Nvidia Corporation | Supply-voltage control for device power management |
US9912322B2 (en) | 2013-07-03 | 2018-03-06 | Nvidia Corporation | Clock generation circuit that tracks critical path across process, voltage and temperature variation |
US9766649B2 (en) | 2013-07-22 | 2017-09-19 | Nvidia Corporation | Closed loop dynamic voltage and frequency scaling |
US10466763B2 (en) | 2013-12-02 | 2019-11-05 | Nvidia Corporation | Dynamic voltage-frequency scaling to limit power transients |
US10068370B2 (en) | 2014-09-12 | 2018-09-04 | Microsoft Technology Licensing, Llc | Render-time linking of shaders |
US9786026B2 (en) | 2015-06-15 | 2017-10-10 | Microsoft Technology Licensing, Llc | Asynchronous translation of computer program resources in graphics processing unit emulation |
US9881351B2 (en) | 2015-06-15 | 2018-01-30 | Microsoft Technology Licensing, Llc | Remote translation, aggregation and distribution of computer program resources in graphics processing unit emulation |
KR20170055392A (en) * | 2015-11-11 | 2017-05-19 | 삼성전자주식회사 | Method and apparatus for processing graphics command |
US10002401B2 (en) | 2015-11-11 | 2018-06-19 | Samsung Electronics Co., Ltd. | Method and apparatus for efficient processing of graphics commands |
KR102254119B1 (en) * | 2015-11-11 | 2021-05-20 | 삼성전자주식회사 | Method and apparatus for processing graphics command |
US11305186B2 (en) | 2016-05-19 | 2022-04-19 | Google Llc | Methods and systems for facilitating participation in a game session |
US11684849B2 (en) | 2017-10-10 | 2023-06-27 | Google Llc | Distributed sample-based game profiling with game metadata and metrics and gaming API platform supporting third-party content |
US11169733B2 (en) | 2017-10-26 | 2021-11-09 | Hewlett-Packard Development Company, L.P. | Asset processing from persistent memory |
US11140207B2 (en) | 2017-12-21 | 2021-10-05 | Google Llc | Network impairment simulation framework for verification of real time interactive media streaming systems |
KR20200115557A (en) * | 2018-01-26 | 2020-10-07 | 밸브 코포레이션 | Distributing shaders among client machines for precaching |
US10668378B2 (en) * | 2018-01-26 | 2020-06-02 | Valve Corporation | Distributing shaders between client machines for precaching |
WO2019147974A3 (en) * | 2018-01-26 | 2020-04-16 | Valve Corporation | Distributing shaders between client machines for precaching |
KR102600025B1 (en) * | 2018-01-26 | 2023-11-07 | 밸브 코포레이션 | Distributing shaders between client machines for precaching |
US20190232164A1 (en) * | 2018-01-26 | 2019-08-01 | Valve Corporation | Distributing shaders between client machines for precaching |
US11369873B2 (en) | 2018-03-22 | 2022-06-28 | Google Llc | Methods and systems for rendering and encoding content for online interactive gaming sessions |
US11872476B2 (en) | 2018-04-02 | 2024-01-16 | Google Llc | Input device for an electronic system |
US11077364B2 (en) | 2018-04-02 | 2021-08-03 | Google Llc | Resolution-based scaling of real-time interactive graphics |
US10898812B2 (en) | 2018-04-02 | 2021-01-26 | Google Llc | Methods, devices, and systems for interactive cloud gaming |
US11110348B2 (en) * | 2018-04-10 | 2021-09-07 | Google Llc | Memory management in gaming rendering |
EP4141781A1 (en) * | 2018-04-10 | 2023-03-01 | Google LLC | Memory management in gaming rendering |
WO2019199848A1 (en) * | 2018-04-10 | 2019-10-17 | Google Llc | Memory management in gaming rendering |
US11813521B2 (en) * | 2018-04-10 | 2023-11-14 | Google Llc | Memory management in gaming rendering |
US20210213354A1 (en) * | 2018-04-10 | 2021-07-15 | Google Llc | Memory management in gaming rendering |
US11662051B2 (en) | 2018-11-16 | 2023-05-30 | Google Llc | Shadow tracking of real-time interactive simulations for complex system analysis |
US20220126203A1 (en) * | 2020-10-25 | 2022-04-28 | Meta Platforms, Inc. | Systems and methods for distributing compiled shaders |
Also Published As
Publication number | Publication date |
---|---|
US9773344B2 (en) | 2017-09-26 |
US20130179711A1 (en) | 2013-07-11 |
Similar Documents
Publication | Title |
---|---|
US20140043333A1 (en) | Application load times by caching shader binaries in a persistent storage |
US8935683B2 (en) | Inline function linking |
US10223528B2 (en) | Technologies for deterministic code flow integrity protection |
US20170161040A1 (en) | Arranging Binary Code Based on Call Graph Partitioning |
US7730463B2 (en) | Efficient generation of SIMD code in presence of multi-threading and other false sharing conditions and in machines having memory protection support |
US8561045B2 (en) | Constructing runtime state for inlined code |
US8615735B2 (en) | System and method for blurring instructions and data via binary obfuscation |
US20140082597A1 (en) | Unifying static and dynamic compiler optimizations in source-code bases |
US20110321002A1 (en) | Rewriting Branch Instructions Using Branch Stubs |
US9607160B2 (en) | Method and apparatus for providing string encryption and decryption in program files |
US6975325B2 (en) | Method and apparatus for graphics processing using state and shader management |
US9280490B2 (en) | Secure computing |
US20160154746A1 (en) | Secure computing |
US9952843B2 (en) | Partial program specialization at runtime |
US8943480B2 (en) | Setting breakpoints in optimized instructions |
WO2012154606A1 (en) | Efficient conditional flow control compilation |
Sabanal | Hiding behind ART |
US5854928A (en) | Use of run-time code generation to create speculation recovery code in a computer system |
US11379195B2 (en) | Memory ordering annotations for binary emulation |
US20180349156A1 (en) | Techniques for performing dynamic linking |
US10521206B2 (en) | Supporting compiler variable instrumentation for uninitialized memory references |
Besnard et al. | A framework for automatic and parameterizable memoization |
US11860996B1 (en) | Security concepts for web frameworks |
US11615014B2 (en) | Using relocatable debugging information entries to save compile time |
US20230418950A1 (en) | Methods, Devices, and Systems for Control Flow Integrity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: CELLCO PARTNERSHIP D/B/A VERIZON WIRELESS, NEW JER Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:POLEHN, DONNA L;CHANG, PATRICIA R;REEL/FRAME:029838/0517 Effective date: 20121227 Owner name: VERIZON PATENT AND LICENSING INC., NEW JERSEY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AKSU, ARDA;KAKADIA, DEEPAK;MACIAS, JOHN F;AND OTHERS;SIGNING DATES FROM 20121221 TO 20121231;REEL/FRAME:029838/0462 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |