WO2021109105A1 - Synchronization between graphical processing units and display processing units - Google Patents

Synchronization between graphical processing units and display processing units

Info

Publication number
WO2021109105A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
buffer
frame
processing unit
scanning
Application number
PCT/CN2019/123547
Other languages
English (en)
Inventor
Andrew Evan GRUBER
Yongjun XU
Bo Du
Nan Zhang
Xiaokai WEN
Original Assignee
Qualcomm Incorporated
Application filed by Qualcomm Incorporated
Priority to PCT/CN2019/123547
Publication of WO2021109105A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/003 Details of a display terminal, the details relating to the control arrangement of the display terminal and to the interfaces thereto
    • G09G5/12 Synchronisation between the display unit and other units, e.g. other display units, video-disc players
    • G09G5/36 Control arrangements or circuits characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/363 Graphics controllers
    • G09G2330/00 Aspects of power supply; Aspects of display protection and defect management
    • G09G2330/02 Details of power systems and of start or stop of display operation
    • G09G2330/021 Power management, e.g. power saving

Definitions

  • the following relates generally to multimedia communications and more specifically to synchronization between graphical processing units (GPUs) and display processing units (DPUs) .
  • Multimedia systems are widely deployed to provide various types of multimedia communication content such as voice, video, packet data, messaging, broadcast, and so on. These multimedia systems may be capable of processing, storage, generation, manipulation and rendition of multimedia information. Examples of multimedia systems include wireless communications systems, entertainment systems, information systems, virtual reality systems, model and simulation systems, and so on. These systems may employ a combination of hardware and software technologies (e.g., central processing units (CPUs) , graphical processing units (GPUs) , display processing units (DPUs) ) to support processing, storage, generation, manipulation and rendition of multimedia information, for example, such as capture devices, storage devices, communication networks, computer systems, and display devices. As demand for multimedia communication efficiency increases, some multimedia systems may fail to provide satisfactory multimedia operations for multimedia communications, and thereby may be unable to support high reliability or low latency multimedia operations, among other examples.
  • the described techniques may configure a device to support synchronization between rendering hardware (e.g., a graphical processing unit (GPU)) and display hardware (e.g., a display processing unit (DPU)) of the device.
  • the described techniques may be used to configure the device to reduce a vertical synchronization (VSYNC) delay time for multimedia applications, such as gaming applications running on the device (e.g., a mobile platform) based on the synchronization between the GPU and the DPU.
  • the described techniques may be used to configure the rendering hardware (e.g., the GPU) with a counter to track a read pointer of the display hardware (e.g., the DPU) to simulate a scanline position of the display hardware.
  • the counter may be a hardware counter configured with the rendering hardware. In some examples, the counter may be configurable based on a resolution of the display hardware or a refresh rate of the display hardware, or both.
  • the counter may be synched with the DPU during each refresh cycle according to the configuration.
  • the rendering hardware may perform bin rendering after detecting that the scanline position of the display hardware (e.g., the DPU) satisfies a threshold (e.g., a threshold distance from a rendering area) based on the read pointer, or suspend the bin rendering until the scanline of the display hardware (e.g., the DPU HW) satisfies the threshold (e.g., satisfies the threshold distance from the rendering area) based on the read pointer.
  • the device may decrease a rendering latency to a delay period of one VSYNC, as well as mitigate tearing issues for the multimedia applications.
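The counter-and-threshold mechanism summarized above can be sketched in software (a minimal illustrative model, not the patented hardware; the `ScanlineCounter` class, its methods, and `may_render_bin` are names invented for this sketch):

```python
class ScanlineCounter:
    """Software model of a GPU-side counter that tracks the DPU's read
    pointer to simulate the display hardware's current scanline."""

    def __init__(self, height_lines, refresh_rate_hz):
        # The counter is parameterized by the display's resolution and
        # refresh rate: one frame period divided by the line count gives
        # the time the DPU spends on each scanline.
        self.height_lines = height_lines
        self.line_period_s = 1.0 / (refresh_rate_hz * height_lines)
        self.scanline = 0

    def sync(self):
        """Re-align with the DPU at the start of each refresh cycle."""
        self.scanline = 0

    def advance(self, elapsed_s):
        """Advance the simulated scanline by time elapsed since sync();
        round() guards against floating-point error at line boundaries."""
        self.scanline = round(elapsed_s / self.line_period_s) % self.height_lines


def may_render_bin(counter, bin_top_line, threshold_lines):
    """The GPU renders a bin only after the simulated DPU scanline is at
    least `threshold_lines` past the top of the bin's region, so writes
    never overtake the scan-out read (mitigating tearing)."""
    return counter.scanline - bin_top_line >= threshold_lines
```

For a 1080-line panel at 60 Hz, half a frame period after `sync()` the simulated scanline sits at line 540, so a bin starting at line 0 clears a 100-line threshold while a bin starting at line 500 does not.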
  • the method may include scanning one or more pixels using a DPU of a device, where the one or more pixels are associated with a portion of a first frame in a buffer, tracking a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and rendering one or more pixels of a bin of a set of bins using a GPU of the device based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the apparatus may include a processor, memory coupled with the processor, and instructions stored in the memory.
  • the instructions may be executable by the processor to cause the apparatus to scan one or more pixels using a DPU of the apparatus, where the one or more pixels are associated with a portion of a first frame in a buffer, track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and render one or more pixels of a bin of a set of bins using a GPU of the apparatus based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the apparatus may include means for scanning one or more pixels using a DPU of the apparatus, where the one or more pixels are associated with a portion of a first frame in a buffer, tracking a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and rendering one or more pixels of a bin of a set of bins using a GPU of the apparatus based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • a non-transitory computer-readable medium storing code is described.
  • the code may include instructions executable by a processor to scan one or more pixels using a DPU of a device, where the one or more pixels are associated with a portion of a first frame in a buffer, track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and render one or more pixels of a bin of a set of bins using a GPU of the device based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the portion of the second frame in the buffer associated with the bin may be a region from which the scanning of the one or more pixels associated with the portion of the first frame in the buffer has already occurred.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for tracking, by the GPU, the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the counter.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining a resolution associated with the DPU, and configuring a parameter of the counter based on the resolution associated with the DPU.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining a refresh rate associated with the DPU, and configuring a parameter of the counter based on the refresh rate associated with the DPU.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for synchronizing, based on a refresh cycle of the DPU, the counter of the GPU with the pointer that indicates the position of the scanning, by the DPU, of the one or more pixels associated with the portion of the first frame in the buffer.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region, and resetting the counter based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region, where rendering the one or more pixels of the bin of the set of bins using the GPU of the device is based on the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for activating the GPU based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions where the scanning of the one or more pixels associated with the portion of the first frame in the buffer and the rendering of the one or more pixels of the bin of the set of bins using the GPU of the device correspond to a single VSYNC delay period.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region, and refraining from rendering one or more pixels of a second bin of the set of bins using the GPU of the device based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region, where the one or more pixels of the second bin may be associated with a second portion of the second frame in the buffer.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for deactivating the GPU based on the refraining.
  • one or more of the GPU and the DPU may be operating in a single buffer mode.
  • Some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for identifying an application running on the device, comparing the application to a set of applications configured for the single buffer mode, and applying the single buffer mode to the application based on the comparing.
  • In some examples of the method, apparatuses, and non-transitory computer-readable medium described herein, the scanning of the one or more pixels associated with the portion of the first frame in the buffer and the rendering of the one or more pixels of the bin of the set of bins using the GPU of the device may be based on the single buffer mode.
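The application check described in the bullets above might look like the following (a hypothetical sketch; the allow-list contents and function name are invented, and a real implementation would live in the platform's display stack):

```python
# Hypothetical set of applications configured for single buffer mode.
SINGLE_BUFFER_APPS = {"com.example.game_a", "com.example.game_b"}

def select_buffer_mode(app_id, configured_apps=SINGLE_BUFFER_APPS):
    """Identify the running application, compare it to the set of
    applications configured for single buffer mode, and apply that mode
    only on a match; all other applications keep double buffering."""
    return "single" if app_id in configured_apps else "double"
```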
  • FIG. 1 illustrates an example of a multimedia system for multimedia communications that supports synchronization between graphical processing units (GPUs) and display processing units (DPUs) in accordance with aspects of the present disclosure.
  • FIG. 2 shows an example of a block diagram that supports rendering in accordance with aspects of the present disclosure.
  • FIG. 3A shows an example of a block diagram that supports rendering using a single buffer mode in accordance with aspects of the present disclosure.
  • FIG. 3B illustrates an example of a timing diagram that supports rendering using a single buffer mode in accordance with aspects of the present disclosure.
  • FIG. 4 illustrates an example of a timing diagram related to frame latency in accordance with aspects of the present disclosure.
  • FIG. 5 illustrates an example of a timing diagram that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIG. 6A shows an example of a block diagram that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIG. 6B shows a flowchart illustrating a method that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIG. 7 shows a flowchart illustrating a method that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIGs. 8 and 9 show block diagrams of devices that support synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIG. 10 shows a block diagram of a multimedia manager that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIGs. 11 and 12 show diagrams of systems including a device that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • FIGs. 13 through 17 show flowcharts illustrating methods that support synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • Multimedia systems may include multiple devices, which may provide various types of multimedia communication content such as voice, video, packet data, and so on.
  • the devices may be capable of processing, storage, generation, manipulation and rendition of multimedia information.
  • multimedia systems include wireless communications systems, entertainment systems, information systems, gaming systems, virtual reality systems, model and simulation systems, and so on.
  • the devices may employ a combination of hardware and software technologies (e.g., central processing units (CPUs) , graphical processing units (GPUs) , display processing units (DPUs) ) to support processing, storage, generation, manipulation and rendition of multimedia information.
  • the devices may support gaming applications via a mobile platform of the devices.
  • the gaming applications may relate to non-virtual reality systems.
  • the gaming applications may relate to virtual reality systems.
  • Some examples of virtual reality systems may support a fully immersive virtual reality experience, a non-immersive virtual reality experience, or a collaborative virtual reality experience.
  • a quality of the gaming applications running via the mobile platform of the devices, as well as user experience, may be affected by a rendering-to-display latency. That is, for the gaming applications running on the devices via the mobile platform (e.g., a mobile operating system) , rendering and displaying of multimedia-related information (e.g., graphics data) associated with the gaming applications using the mobile operating systems may introduce a VSYNC delay. For example, for devices configured with an Android operating system (OS) , rendering and displaying operations of the Android OS may introduce a VSYNC delay time of at least three VSYNC frames for gaming applications.
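The cost of that delay is easy to quantify: each VSYNC frame of delay contributes one refresh period of render-to-display latency. A back-of-the-envelope helper (not from the patent text) makes the comparison concrete:

```python
def vsync_latency_ms(refresh_rate_hz, vsync_frames):
    """Latency, in milliseconds, contributed by waiting `vsync_frames`
    VSYNC periods at the given display refresh rate."""
    return 1000.0 * vsync_frames / refresh_rate_hz

# At 60 Hz, a three-VSYNC pipeline adds 50 ms of latency, while the
# single-VSYNC delay targeted by the described techniques adds about 16.7 ms.
```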
  • some devices may lack a synchronization mechanism associated with a single buffer mode, and gaming applications running on the mobile operating systems may be unable to use the single buffer mode directly.
  • some virtual reality applications may support synchronization associated with a single buffer mode to reduce a render-to-display latency. These virtual reality applications are configured with the synchronization in a virtual reality framework or a software development kit (SDK) , and such synchronization associated with a single buffer mode may be unable to be extended to gaming applications (e.g., non-virtual reality applications, non-virtual reality gaming applications) .
  • Various aspects of the described techniques relate to configuring the devices to support synchronization between a rendering hardware (e.g., a GPU) and a display hardware (e.g., a DPU) of the devices to support high reliability or low latency rendering-to-displaying operations for multimedia applications (e.g., gaming applications) running on a mobile platform of the devices.
  • the described techniques may be used to configure the devices to reduce a VSYNC delay time for multimedia applications, such as gaming applications running on the devices (e.g., a mobile platform) based on the synchronization between the GPU and the DPU.
  • the described techniques may be used to configure the rendering hardware (e.g., the GPU) with a counter to track a read pointer of the display hardware (e.g., the DPU) to simulate a scanline position of the display hardware.
  • the counter may be a hardware counter configured with the rendering hardware.
  • the counter may be configurable based on a resolution of the display hardware or a refresh rate of the display hardware, or both.
  • the counter may be synched with the DPU during each refresh cycle according to the configuration.
  • the rendering hardware may perform bin rendering after detecting that the scanline position of the display hardware (e.g., the DPU) satisfies a threshold (e.g., a threshold distance from a rendering area) based on the read pointer, or suspend the bin rendering until the scanline of the display hardware (e.g., the DPU HW) satisfies the threshold (e.g., satisfies the threshold distance from the rendering area) based on the read pointer.
  • the devices may decrease a rendering latency to a delay period of one VSYNC, as well as mitigate tearing issues for the multimedia applications.
  • the techniques employed by the devices may provide benefits and enhancements to the operation of the devices.
  • operations performed by the devices may provide improvements to rendering-to-displaying operations for multimedia applications (e.g., gaming applications) in multimedia systems.
  • for example, configuring the devices with a counter (e.g., a hardware counter in a GPU) to track a read pointer of a display hardware (e.g., a DPU) may support improvements to power consumption, rendering multimedia information (e.g., frames), and displaying the multimedia information, and, in some examples, may promote enhanced efficiency for rendering-to-displaying operations, among other benefits.
  • aspects of the disclosure are initially described in the context of multimedia systems. Aspects of the disclosure are then illustrated by and described with references to timing diagrams and process flows that relate to synchronization between GPUs and DPUs. Aspects of the disclosure are further illustrated by and described with reference to apparatus diagrams, system diagrams, and flowcharts that relate to synchronization between GPUs and DPUs.
  • FIG. 1 illustrates an example of a multimedia system 100 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the multimedia system 100 may include devices 105, a server 110, and a database 115.
  • while the multimedia system 100 illustrates two devices 105, a single server 110, a single database 115, and a single network 120, the present disclosure applies to any multimedia system architecture having one or more devices 105, servers 110, databases 115, and networks 120.
  • the devices 105, the server 110, and the database 115 may communicate with each other and exchange information between GPUs and DPUs, such as multimedia packets (e.g., audio packets, voice packets, video packets) , multimedia data, or multimedia control information, via network 120 using communications links 125.
  • a device 105 may be a cellular phone, a smartphone, a personal digital assistant (PDA), a wireless communication device, a handheld device, a tablet computer, a laptop computer, a cordless phone, or a display device (e.g., a monitor), among other examples that support various types of communication and functional features related to multimedia (e.g., transmitting, receiving, broadcasting, streaming, sinking, capturing, storing, rendering, displaying, and recording multimedia data (e.g., graphics data)).
  • a device 105 may, additionally or alternatively, be referred to by those skilled in the art as a user equipment (UE) , a user device, a smartphone, a Bluetooth device, a Wi-Fi device, a mobile station, a subscriber station, a mobile unit, a subscriber unit, a wireless unit, a remote unit, a mobile device, a wireless device, a wireless communications device, a remote device, an access terminal, a mobile terminal, a wireless terminal, a remote terminal, a handset, a user agent, a mobile client, a client, or some other suitable terminology.
  • the devices 105 may also be able to communicate directly with another device (e.g., using a peer-to-peer (P2P) or device-to-device (D2D) protocol) .
  • a device 105 may be able to receive from, or transmit to, another device 105 a variety of information, such as instructions or commands (e.g., multimedia-related information).
  • the devices 105 may, in some examples, include an application 130, a multimedia manager 135, a central processing unit (CPU) 140, a GPU 145, and a DPU 150. While the multimedia system 100 illustrates the devices 105 including both the application 130 and the multimedia manager 135, the application 130 and the multimedia manager 135 may be optional features for the devices 105.
  • the application 130 may be a multimedia-based application that can receive (e.g., download, stream, broadcast) multimedia data from the server 110, the database 115, or another device 105, or transmit (e.g., upload) multimedia data to the server 110, the database 115, or another device 105 using communications links 125.
  • the multimedia manager 135 may be part of a general-purpose processor, a digital signal processor (DSP), an image signal processor (ISP), the CPU 140, the GPU 145, the DPU 150, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a discrete gate or transistor logic component, a discrete hardware component, other programmable logic device, or any combination thereof designed to perform the functions described in the present disclosure.
  • the multimedia manager 135 may process multimedia (e.g., image data, video data, audio data) from and write multimedia data to a local memory of the device 105 or to the database 115.
  • the multimedia manager 135 may also be configured to provide multimedia enhancements, multimedia restoration, multimedia analysis, multimedia compression, multimedia streaming, and multimedia synthesis, among other functionality.
  • the multimedia manager 135 may perform white balancing, cropping, scaling (e.g., multimedia compression) , adjusting a resolution, multimedia stitching, color processing, multimedia filtering, spatial multimedia filtering, artifact removal, frame rate adjustments, multimedia encoding, multimedia decoding, and multimedia filtering.
  • the multimedia manager 135 may process multimedia data to support synchronization between the GPU 145 and the DPU 150 associated with the devices 105, according to the techniques described herein.
  • aspects of the multimedia system 100 may support synchronization between the GPU 145 and the DPU 150 of the device 105. In other examples, aspects of the multimedia system 100 may support synchronization between a GPU 145 of the device 105 and a DPU 150 of another device 105. In some examples, the multimedia system 100 may be a virtual reality system that supports synchronization between GPUs 145 and DPUs 150 in accordance with aspects of the present disclosure.
  • a device 105 may be a rendering device capable of communicating one or more frames to another device 105 to provide a virtual reality experience.
  • a frame may be a stereoscopic three dimensional (3D) visualization that is transmitted to the other device 105 for presentation.
  • the device 105 may include rendering hardware (e.g., a GPU 145) .
  • the device 105 may include an image stream producer capable of producing (e.g., using the GPU 145) graphic buffers for consumption.
  • the device 105 may be configured to run an application programming interface (API) for rendering multidimensional graphics (e.g., 2D and 3D computer graphics) , for example, such as multidimensional graphics used for video games.
  • the device 105 may run APIs such as OpenGL ES, Canvas 2D, and mediaserver video decoders, for example.
  • the other device 105 may be, for example, a display device.
  • the other device 105 may include display hardware (e.g., a DPU 150) supportive of processing and providing a multidimensional representation (e.g., a stereoscopic 3D visualization) of graphics (e.g., images, video images) generated by the rendering hardware (e.g., the GPU 145) of the device 105.
  • the other device 105 may include, for example, a display screen or display panel.
  • the other device 105 may be a head-mounted display (HMD) . As an HMD, the other device 105 may be worn by a user.
  • the other device 105 may be configured with one or more sensors to sense a position of the user and an environment surrounding the HMD to generate information when the user is wearing the HMD.
  • the information may include movement information, orientation information, angle information, etc. regarding the other device 105.
  • the other device 105 may be configured with a microphone for capturing audio and one or more speakers for broadcasting audio.
  • the other device 105 may also be configured with a set of lenses and a display screen for the user to view and be part of the virtual reality experience.
  • a rendering hardware of the device 105 may be configured with a counter (e.g., a hardware counter, a software counter) to track a read pointer of a display hardware of the other device 105 (or a read pointer of a display hardware of the device 105) , for example, to simulate a scanline position of the display hardware.
  • the GPU 145 of the device 105 may be configured with a counter (e.g., a hardware counter, a software counter) to track a read pointer of the DPU 150 of the other device 105 (or the DPU 150 of the device 105) to simulate a scanline position of the DPU 150.
  • the counter may be configurable based on a resolution of the DPU 150 or a refresh rate of the DPU 150, or both. As such, the counter may be synched with the DPU 150 during each refresh cycle. For example, examples described herein support providing a hardware link between the GPU 145 and the DPU 150, such that the counter may be reset when the DPU 150 begins scanning (e.g., displaying) a portion of a frame.
  • a device 105 may scan pixels using a DPU 150, where the pixels are associated with a portion of a first frame in a buffer.
  • the device 105 may track a pointer that indicates a position associated with the scanning of the pixels.
  • the device 105 (using a GPU 145 of the device 105) may render pixels of a bin associated with a portion of a second frame in the buffer, based on the tracking.
  • the device 105 may perform or pause bin rendering based on a scanline position of the DPU 150 with respect to a render area.
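Putting the pieces together, the perform-or-pause decision per bin could be sketched as below (an illustrative model only; `render_bin` stands in for whatever actually draws a bin, and all names are invented for this example):

```python
def render_frame_bins(bin_tops, scanline_pos, threshold_lines, render_bin):
    """Walk a frame's bins in scan order. A bin is rendered only once the
    DPU scanline has moved at least `threshold_lines` past its top line;
    otherwise rendering is suspended and the bin is reported as pending."""
    rendered, pending = [], []
    for bin_top in bin_tops:
        if scanline_pos - bin_top >= threshold_lines:
            render_bin(bin_top)       # safe: scan-out has cleared this region
            rendered.append(bin_top)
        else:
            pending.append(bin_top)   # suspend until the scanline advances
    return rendered, pending
```

With a 1080-line frame split into four 270-line bins and the scanline at line 400, only the first two bins clear a 100-line threshold; the rest wait for a later pass.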
  • the server 110 may be a data server, a cloud server, a server associated with a multimedia subscription provider, proxy server, web server, application server, communications server, home server, mobile server, or any combination thereof.
  • the server 110 may in some cases include a multimedia distribution platform 155.
  • the multimedia distribution platform 155 may allow the devices 105 to discover, browse, share, and download multimedia via network 120 using communications links 125, and therefore provide a digital distribution of the multimedia from the multimedia distribution platform 155.
  • a digital distribution may be a form of delivering media content such as audio, video, images, without the use of physical media but over online delivery mediums, such as the Internet.
  • the devices 105 may upload or download multimedia-related applications for streaming, downloading, uploading, processing, or enhancing multimedia (e.g., images, audio, video) .
  • the server 110 may also transmit to the devices 105 a variety of information, such as instructions or commands (e.g., multimedia-related information) to download multimedia-related applications on the device 105.
  • the database 115 may store a variety of information, such as instructions or commands (e.g., multimedia-related information) .
  • the database 115 may store multimedia 160.
  • the device 105 may support synchronization between GPUs and DPUs associated with the devices 105 for the multimedia 160.
  • the device 105 may retrieve the stored data from the database 115 via the network 120 using communication links 125.
  • the database 115 may be a relational database (e.g., a relational database management system (RDBMS) or a Structured Query Language (SQL) database) , a non-relational database, a network database, an object-oriented database, or other type of database, that stores the variety of information, such as instructions or commands (e.g., multimedia-related information) .
  • the network 120 may provide encryption, access authorization, tracking, Internet Protocol (IP) connectivity, and other access, computation, modification, and functions.
  • Examples of network 120 may include any combination of cloud networks, local area networks (LAN) , wide area networks (WAN) , virtual private networks (VPN) , wireless networks (using 802.11, for example) , cellular networks (using third generation (3G) , fourth generation (4G) , long-term evolution (LTE) , or new radio (NR) systems (e.g., fifth generation (5G) ) ) , etc.
  • Network 120 may include the Internet.
  • the communications links 125 shown in the multimedia system 100 may include uplink transmissions from the device 105 to the server 110 and the database 115, and downlink transmissions from the server 110 and the database 115 to the device 105.
  • the communication links 125 may transmit bidirectional communications and unidirectional communications.
  • the communication links 125 may be a wired connection or a wireless connection, or both.
  • the communications links 125 may include one or more connections, including but not limited to, Wi-Fi, Bluetooth, Bluetooth low-energy (BLE) , cellular, Z-WAVE, 802.11, peer-to-peer, LAN, wireless local area network (WLAN) , Ethernet, FireWire, fiber optic, and other connection types related to wireless communication systems.
  • the described methods, systems, devices, and apparatuses provide techniques which may support synchronization between GPUs and DPUs, among other advantages.
  • supported techniques may include features for frame rendering by a GPU based on tracking a scanning position of a DPU, which may reduce rendering latency to a delay period of one VSYNC frame and minimize or eliminate tearing issues.
  • the improved techniques provide for switching a GPU to a sleep state or to processing other tasks when a scanline of the DPU is outside a safe region, which may achieve improved power savings and reduce overall processing time.
  • FIG. 2 shows an example of a block diagram 200 that supports rendering in accordance with aspects of the present disclosure.
  • the block diagram 200 may include a device 205, a buffer queue component 220, and a device 235.
  • the device 205 may be a rendering device and the device 235 may be a display device.
  • the device 205 may be configured with rendering hardware (e.g., a GPU 210) , and the device 235 may be configured with display hardware (e.g., a DPU 240) supportive of processing and providing a multidimensional representation (e.g., 2D, 3D visualization) of graphics (e.g., images, video images) generated by the rendering hardware (e.g., the GPU 210) of the device 205.
  • the GPU 210 may be referred to as an image stream producer.
  • the buffer queue component 220 may be configured with one or more buffers 225 of a buffer pool.
  • the buffer queue component 220 may use a buffer queue mechanism (e.g., a frame buffer) or a data structure capable of combining the one or more buffers 225 and, in some cases, passing or allocating the one or more buffers 225 between different operations or processes.
  • the buffer queue component 220 may mediate a constant cycle of the one or more buffers 225 (e.g., a dequeued buffer 225-a, a queued buffer 225-b, an acquired buffer 225-c, and an available buffer 225-d) from the device 205 to the device 235 (e.g., via a dequeueBuffer instruction 230-a, a queueBuffer instruction 230-b, an acquireBuffer instruction 230-c, and a releaseBuffer instruction 230-d) .
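The producer/consumer buffer cycle mediated by the buffer queue component 220 can be sketched in a few lines. The class and method names below are assumptions chosen to mirror the four instructions named above, not an actual BufferQueue implementation:

```python
# Minimal sketch of the buffer cycle: the producer (GPU 210) dequeues a
# free buffer, renders into it, and queues it; the consumer (DPU 240)
# acquires a queued buffer, displays it, and releases it back to the pool.

from collections import deque

class BufferQueue:
    def __init__(self, num_buffers):
        self.free = deque(range(num_buffers))   # available buffers (225-d)
        self.queued = deque()                   # queued buffers (225-b)

    # Producer (image stream producer) side
    def dequeue_buffer(self):
        return self.free.popleft()              # dequeued buffer (225-a)

    def queue_buffer(self, buf):
        self.queued.append(buf)                 # hand off for composition

    # Consumer (image stream consumer) side
    def acquire_buffer(self):
        return self.queued.popleft()            # acquired buffer (225-c)

    def release_buffer(self, buf):
        self.free.append(buf)                   # buffer returns to the pool

q = BufferQueue(num_buffers=2)   # double buffer mode
b = q.dequeue_buffer()           # GPU renders a frame into b
q.queue_buffer(b)                # queue the rendered frame for display
shown = q.acquire_buffer()       # DPU consumes the frame
q.release_buffer(shown)          # buffer becomes available again
```

With two (or three) buffers in this cycle, each handoff waits for a VSYNC boundary, which is the source of the multi-frame latency discussed below.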
  • the device 235 may include an image stream consumer capable of consuming (e.g., using the DPU 240) image streams generated by an image stream producer (e.g., the GPU 210) of the device 205.
  • the device 235 may run image stream consumers such as SurfaceFlinger, other OpenGL ES applications, or non-GL applications, for example.
  • One or more of the device 205 and the device 235 may use a BufferQueue mechanism (e.g., configured with an Android OS) or a data structure of the buffer queue component 220 to share the one or more buffers 225 between application rendering (e.g., at an image producer) and Surface Flinger (SF) , Hardware Composer (HWC) composition (e.g., at an image consumer) , and use a sync framework (e.g., Fence mechanism) to sync between rendering (e.g., at the image producer side) and composition (e.g., at the image consumer side) .
  • the sync framework may provide for synchronization between image producers and image consumers (e.g., generic synchronization) while providing for drivers for hardware synchronization between hardware blocks (e.g., between the GPU 210 and the DPU 240) .
  • an application may create multiple buffers 225 for rendering content (e.g., frame content) of a current frame to one of the buffers 225, queue the content to BufferQueue for display, and dequeue a different buffer 225 from BufferQueue for rendering a next frame.
  • Some applications may create two buffers 225 for GPU rendering (i.e., double buffer mode) or three buffers 225 for GPU rendering (i.e., triple buffer mode) .
  • applying the BufferQueue mechanism and a fence mechanism for multi-application, multi-window use cases may introduce a delay of at least three VSYNC frames (e.g., lag of at least three frames) from the application rendering to application display (e.g., from the rendering hardware to the display hardware) .
  • a delay may negatively impact one or more of the device 205 and the device 235 performance and user experience when executing applications (e.g., gaming applications) via one or more mobile operating systems (e.g., Android OS) .
  • some operating systems, such as mobile operating systems (for example, Android N) , may support a single buffer mode within a framework of the mobile operating system to reduce latency from application rendering to application display. These operating systems, however, may be unable to support synchronization associated with the single buffer mode. For example, these operating systems may instead use a default double buffer mode or a triple buffer mode, rather than a single buffer mode, for rendering-to-displaying operations for applications (e.g., gaming applications running via a mobile platform) . Some gaming applications have incorporated timewarp and single buffer technology to reduce motion-to-photon latency (i.e., the amount of time for user movement to be fully reflected on a display screen) . Use of the default double buffer mode or a triple buffer mode may introduce latency for the rendering-to-displaying operations that may negatively impact user experience.
  • FIG. 3A shows an example of a block diagram 300 that supports rendering using a single buffer mode in accordance with aspects of the present disclosure.
  • the block diagram 300 may include a GPU 305, a single buffer mode (SBM) buffer 315, and a DPU 330.
  • a device as described herein may be configured with one or more of the GPU 305, the SBM buffer 315, and the DPU 330 to support rendering multimedia information (e.g., frames, graphics data) using a single buffer mode.
  • the GPU 305 when rendering an application, may write data 310 associated with rendering the application to the SBM buffer 315.
  • the DPU 330 may read, from the SBM buffer 315, corresponding data 325 associated with displaying the application.
  • a line 320 may indicate a split (e.g., a half-line) of the SBM buffer 315 associated with rendering the application and writing the data 310 versus displaying the application and reading the data 325.
  • for a virtual reality application, a virtual reality framework may control timing associated with rendering to prevent or minimize tearing issues.
  • the virtual reality application or the virtual reality framework, or both may monitor the DPU 330 display scanline position.
  • when the DPU 330 fetches data from the bottom half of a frame (e.g., a current frame) according to a display scanline of the DPU 330, the virtual reality application or the virtual reality framework, or both, may begin to render the top half of a frame (e.g., a next frame) with the GPU 305.
  • the DPU 330 fetches data from a top half of a frame (e.g., a current frame) according to a display scanline of the DPU 330, the virtual reality application or the virtual reality framework, or both may begin to render the bottom half of the frame (e.g., the current frame) with the GPU 305.
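The half-frame scheduling rule described above reduces to a simple comparison against the panel midpoint. The following sketch uses assumed names to illustrate the rule; it is not taken from the disclosure:

```python
# While the DPU's scanline is in the bottom half of the current frame,
# the GPU may safely render the top half of the next frame into the
# single buffer, and vice versa, avoiding a read/write collision.

def half_to_render(scanline, panel_height):
    """Return which half of the single buffer the GPU may render."""
    if scanline >= panel_height // 2:
        return "top"      # DPU is reading the bottom half -> render the top
    return "bottom"       # DPU is reading the top half -> render the bottom

print(half_to_render(800, 1080))   # DPU scanning the bottom half
print(half_to_render(100, 1080))   # DPU scanning the top half
```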
  • FIG. 3B illustrates an example of a timing diagram 301 that supports rendering using a single buffer mode in accordance with aspects of the present disclosure.
  • the timing diagram 301 illustrates latency associated with virtual reality rendering and displaying techniques, such as time warping (e.g., asynchronous time warping) , or reprojection.
  • Time warping may be a technique in virtual reality that warps a rendered image before sending the rendered image to a display hardware, so as to correct for head movement which may have occurred after the rendering.
  • the timing diagram 301 illustrates a frame 335-a (e.g., a frame N-1) , a frame 335-b (e.g., a frame N) , a signal 340 (e.g., display 0 VSYNC) , a signal 345 (e.g., associated with time warping) , a signal 350 (e.g., display 0) , and latency 355 between the signal 345 and the signal 350.
  • the signal 345 may include time warping 346-a associated with a left image and time warping 346-b associated with a right image.
  • the time warping techniques may reduce latency and increase or maintain frame rate. In some other examples, the time warping techniques may reduce judder resulting from missed frames, for example, frames which have not been completely rendered prior to display. In other examples, the time warping techniques may include modifying a rendered image with updated positional information obtained from sensors of a head-mounted display (HMD) , and displaying the modified image.
  • the signal 350 may include data 351-a corresponding to the left image and data 351-b corresponding to the right image. Each frame 335 may have a period T.
  • FIG. 4 illustrates an example of a timing diagram 400 related to frame latency in accordance with aspects of the present disclosure.
  • the timing diagram 400 may relate to an application running on a device as described herein.
  • the timing diagram 400 may correspond to a gaming application running on a device using a mobile operating system.
  • rendering-to-displaying operations for the gaming application using the mobile operating system may have a VSYNC delay period affecting a quality of the gaming application.
  • the timing diagram 400 shows a VSYNC delay period of at least three VSYNC frames for a gaming application running on a device using an Android OS.
  • the device using the Android OS may support a double buffer mode.
  • the timing diagram 400 illustrates queueing, rendering, and displaying of frames A and B according to the double buffer mode for rendering and displaying multimedia-related information (e.g., graphics data) associated with the gaming application.
  • the timing diagram 400 illustrates the queueing of the frames A and B to a queue buffer by a CPU 405 (e.g., at time points T1, T2, and T3) , rendering of the frames A and B by a GPU 410, compositing of the frames A and B by a software component or a hardware component, or both (415) , and displaying of the frames A and B by a DPU 420.
  • a game frame latency 425 is equal to a delay period of three VSYNC frames.
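The latency figures above can be checked with a quick calculation. The function name and the 60 Hz refresh rate are assumptions for illustration:

```python
# With a VSYNC period T = 1/refresh_rate, a three-VSYNC pipeline delay
# at 60 Hz is 50 ms, versus about 16.7 ms for the one-VSYNC delay that
# the synchronization techniques described herein target.

def pipeline_latency_ms(vsync_frames, refresh_hz):
    """Rendering-to-display latency for a pipeline of `vsync_frames` stages."""
    return vsync_frames * 1000.0 / refresh_hz

print(pipeline_latency_ms(3, 60))   # double/triple buffer path: 50.0 ms
print(pipeline_latency_ms(1, 60))   # synchronized single buffer path
```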
  • FIG. 5 illustrates an example of a timing diagram 500 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the timing diagram 500 may implement aspects of the multimedia system 100.
  • the timing diagram 500 may correspond to a synchronization scheme for synchronizing a GPU and a DPU (e.g., a DPU driver) in a single buffer mode.
  • the synchronization scheme may support, for example, a hardware synchronization scheme between a GPU and a DPU of a device as described herein.
  • the synchronization scheme may support, for example, a hardware synchronization scheme between a GPU of a device and a DPU of another device as described herein.
  • frame processing by the CPU is indicated by 505, frame processing by the GPU is indicated by 510, and frame processing by the DPU is indicated by 515.
  • the GPU may include a hardware block configured to track a read pointer of the DPU.
  • the hardware block may automatically track the read pointer of the DPU.
  • the hardware block may include a counter (e.g., a hardware counter) capable of simulating a scanline position (e.g., a read pointer) of the DPU, and the hardware block may be configurable based on a panel resolution and refresh rate associated with the DPU (e.g., with a display panel of the DPU) .
  • Aspects of the synchronization scheme described herein may include synching the counter with the DPU at refresh cycles (e.g., at every refresh cycle) associated with refreshing the display screen or display panel of the DPU.
  • the GPU and the DPU of the device may be coupled via a hardware link.
  • the synchronization scheme may include resetting the counter (e.g., via the hardware link between the GPU and the DPU) based on the DPU beginning a scanning operation for scanning a frame.
  • the synchronization schemes described herein may include applying a single buffer mode to applications (e.g., game applications) included in a list of applications (e.g., a whitelist of applications) .
  • the synchronization schemes may include assigning a higher execution priority to applications (e.g., game applications) included in the whitelist compared to applications not included in the whitelist.
  • the GPU may detect the scanline of the DPU using the counter of the GPU, and pause bin rendering associated with a current bin until the scanline of the DPU satisfies a threshold region associated with the rendering (e.g., until the scanline is greater than a threshold distance from a rendering area associated with the rendering) .
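The pause condition above can be expressed as a single comparison. The function name and threshold semantics below are assumptions made for illustration:

```python
# The GPU renders a bin of frame N into the single buffer only when the
# DPU scanline, still scanning out frame N-1, has already passed the
# bin's render area by a safety margin, so rendering cannot cause tearing.

def render_area_is_safe(scanline, render_area_bottom, threshold_lines):
    """True when the scanline is more than `threshold_lines` past the
    bottom of the bin's render area."""
    return scanline - render_area_bottom > threshold_lines

print(render_area_is_safe(600, 400, 100))   # scanline well past the bin
print(render_area_is_safe(450, 400, 100))   # too close: pause bin rendering
```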
  • aspects of the multimedia system 100 may include hardware synchronization for minimizing the rendering-to-display latency to a delay period of one VSYNC frame, which may minimize or eliminate the tearing issue associated with synchronization schemes by some devices.
  • synchronization between the GPU and the DPU may be achieved with a rendering-to-display latency of one VSYNC frame or less, without a double buffer or fence sync mechanism.
  • the synchronization schemes may include synchronizing the GPU and the DPU of the device with a tearing effect (TE) in a single buffer mode.
  • the DPU may initiate a scan (e.g., display) of a frame N-1 (where N may be an integer) , for example, at a portion of the frame N-1.
  • the DPU may initiate the scan from a top left portion (e.g., in a downward right direction, as indicated by 516) of the frame N-1.
  • the CPU may prepare a frame N for rendering, and the GPU may render (e.g., continue rendering, for example, using bin rendering) the bottom of frame N-1.
  • Preparing the frame N for rendering may include, for example, pre-rendering the frame N or preparing instructions associated with rendering the frame N.
  • rendering may include bin rendering pixels of the frame N based on the bins 512.
  • the GPU may render (e.g., continue rendering) one or more pixels associated with bins 511 determined for rendering a frame N-1.
  • the GPU may render pixels of each of bins 511-t through 511-v associated with rendering the frame N-1, as indicated at 520.
  • Time point T1 may correspond to, for example, the start of a refresh cycle of the multimedia system 100 (e.g., a refresh cycle associated with the DPU) .
  • the CPU may have completed preparing the frame N for rendering (e.g., completed pre-rendering of the frame N and compositing the pre-rendered components to a frame buffer) .
  • the CPU may pass a rendering instruction (associated with rendering the frame N) to the GPU via a queue buffer.
  • the DPU may scan (e.g., continue to scan) the frame N-1, as indicated by 516, and the GPU may initiate rendering (e.g., bin rendering) of a next frame (e.g., a frame N) .
  • the GPU may have completed rendering of the frame N-1, and the GPU may initiate rendering (e.g., bin rendering) of one or more pixels associated with bins 512 determined for rendering the frame N.
  • the GPU may render pixels of each of bins 512-a through 512-c determined for rendering the frame N, as indicated at 525.
  • the portion of the frame N associated with a bin 512 (e.g., bins 512-a through 512-c) is a region from which the scanning of pixels associated with the portion of the frame N-1 (e.g., associated with bins 511) has occurred.
  • the GPU may be able to render pixels of frames (e.g., the frames N-1 through N+4) at a rate faster than the DPU is able to scan (e.g., display) the frames.
  • the GPU may switch to processing other tasks or enter a sleep state until a scan pointer 517 of the DPU satisfies a threshold distance from a render area (e.g., is greater than the threshold distance from the render area) associated with rendering the pixels of the frame N.
  • the GPU may have completed rendering of pixels associated with bins 512-a through 512-l for rendering the frame N, as indicated at 530.
  • the GPU may determine that the scan pointer 517 of the DPU fails to satisfy a threshold distance from a render area 518 (e.g., by tracking the scan pointer 517 and determining that the scan pointer 517 of the DPU is less than the threshold distance from the render area 518) .
  • the GPU may pause rendering (e.g., refrain from rendering) remaining pixels of the frame N, and in some examples, switch to processing other tasks or enter a sleep state until the device or the GPU has determined the scan pointer of the DPU satisfies the threshold distance from the render area 518 (e.g., until the scan pointer 517 of the DPU is greater than the threshold distance from the render area 518) associated with rendering the pixels of the frame N.
  • the device may activate the GPU from a sleep state based on determining the scan pointer of the DPU satisfies the threshold distance from the render area 518.
  • the DPU may initiate a scan (e.g., display) of the frame N, for example, at a portion of the frame N.
  • the DPU may initiate the scan from a top left portion (e.g., in a downward right direction, as indicated by 516) of the frame N.
  • the CPU may prepare a frame N+1 for rendering, and the GPU may render (e.g., continue rendering, for example, using bin rendering) the bottom of frame N.
  • the GPU may render (e.g., continue rendering) one or more pixels associated with remaining bins 512 for rendering the frame N (e.g., may render pixels of each of bins 512-t through 512-v of the frame N) , as indicated at 535.
  • Time point T4 may correspond to, for example, the start of a subsequent refresh cycle of the multimedia system 100 (e.g., a subsequent refresh cycle associated with the GPU and the DPU) .
  • the delay between scanning pixels associated with a portion of the frame N-1 in the buffer (e.g., using the DPU) and rendering the pixels of bins 511 using the GPU of the device corresponds to a single VSYNC delay period.
  • aspects of the techniques may sync the counter of the GPU with the DPU (e.g., by configuring the counter to track a read pointer of the DPU) , which may reduce the rendering-to-display latency to a delay period of one VSYNC frame or less, thereby preventing or minimizing tearing issues associated with gaming applications.
  • the GPU (e.g., via the counter) may determine the rendering-to-display latency exceeds a threshold (e.g., greater than one VSYNC frame) , and in some examples, determine that a tearing issue may result.
  • the GPU may pause rendering (e.g., refrain from rendering) until the scanning by the DPU has caught up to the rendering by the GPU (e.g., based on threshold distance as described herein) .
  • FIG. 6A shows an example of a block diagram 600 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the block diagram 600 may implement aspects of the multimedia system 100 as described with reference to FIG. 1.
  • the block diagram 600 may be of a device, which may be an example of aspects of a device as described herein.
  • the device may include a GPU 605 and a DPU 615.
  • the GPU 605 and the DPU 615 may be part of separate devices.
  • the GPU 605 and the DPU 615 may be in communication with one another, for example, via one or more links or buses, such as a hardware link 625.
  • the GPU 605 may be configured to track (e.g., automatically track) a read pointer of the DPU 615. For example, the GPU 605 may predict a position of a scanline 616 of the DPU 615.
  • the GPU 605 may include a hardware counter 610 configured to predict the position of the scanline 616, for example, based on a resolution (e.g., panel resolution) and refresh rate of the DPU 615. In some examples, the GPU 605 may determine the resolution and the refresh rate.
  • the GPU 605 may configure a parameter of the hardware counter 610 based on the resolution, the refresh rate, or both.
  • the GPU 605 (e.g., the hardware counter 610) may simulate the scanline 616 (e.g., predict changes in the position of the scanline 616) .
  • the hardware counter 610 may synch with the DPU 615 at one or more refresh cycles of the DPU 615 (e.g., at every refresh cycle) associated with refreshing the display screen or display panel of the DPU 615.
  • Aspects of the hardware synchronization schemes described herein may include resetting the hardware counter 610 (e.g., via the hardware link 625 between the GPU 605 and the DPU 615) , for example, when the DPU 615 is scanning (e.g., displaying) frames generated by the GPU 605.
  • the DPU 615 may scan (e.g., display) a frame 620 beginning from a top left position of the frame 620, for example, as indicated by one or more scanlines 616.
  • the GPU 605 may reset the hardware counter 610 based on scanning by the DPU 615 (e.g., based on position of a scanline 616) .
  • the GPU 605 may detect a scanline 616 of the DPU 615 (e.g., the hardware counter 610 may detect a position, or changes in position, of the scanline 616) , and pause bin rendering until the scanline 616 satisfies a threshold distance from a render area of the frame 620 (e.g., until the scanline 616 is at a position greater than a threshold distance from the render area) .
  • the GPU 605 may reset the hardware counter 610 based on the scanline 616 satisfying the threshold distance.
  • FIG. 6B shows a flowchart illustrating a method 601 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the method 601 may implement aspects of the multimedia system 100 described with reference to FIG. 1.
  • the operations of method 601 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the method 601 may support synchronization between the GPU 605 and the DPU 615 described with reference to FIG. 6A.
  • the operations of the method 601 may, in some examples, be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • a device may initiate a binning pass associated with a frame N, where N may be an integer.
  • the GPU of the device may initiate or perform the binning pass.
  • the binning pass may include processing an entire image (e.g., the frame N) and sorting rasterized primitives (such as triangles) into tile-sized areas that may be referred to as bins.
  • the GPU may process a command stream for the entire image (e.g., the frame N) and assign the rasterized primitives of the image to bins.
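A binning pass of the kind described above can be sketched as follows. The function name, tile size, and bounding-box representation of primitives are assumptions for illustration:

```python
# Rasterized primitives (here, triangles given by their screen-space
# bounding boxes (x0, y0, x1, y1)) are sorted into tile-sized bins;
# each bin records the indices of the primitives that touch it.

def binning_pass(triangles, tile=256):
    """Map (tile_x, tile_y) -> list of triangle indices touching that tile."""
    bins = {}
    for i, (x0, y0, x1, y1) in enumerate(triangles):
        for ty in range(y0 // tile, y1 // tile + 1):
            for tx in range(x0 // tile, x1 // tile + 1):
                bins.setdefault((tx, ty), []).append(i)
    return bins

# One triangle spanning two horizontal tiles, one inside a single tile.
tris = [(10, 10, 300, 100), (600, 600, 700, 650)]
print(binning_pass(tris))
```

The rendering pass then visits each bin in turn, which is what allows the per-bin gating against the DPU scanline described in this method.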
  • the device may initiate rendering (e.g., bin rendering) of the frame N.
  • a GPU of the device may have completed rendering of a previous frame (e.g., a frame N-1) , and the GPU may initiate rendering (e.g., bin rendering) of one or more pixels associated with bins determined based on the binning pass.
  • the GPU may render pixels of one or more bins included in a render area.
  • the GPU may initiate bin rendering at a first bin determined based on the binning pass.
  • the device may determine whether the render area is behind a scanline position (e.g., a scan pointer) of a DPU of the device.
  • the GPU of the device may determine whether a distance between the render area (e.g., inclusive of a current bin, for example, the first bin) and a scan pointer of the DPU of the device (or a DPU of another device) satisfies a threshold distance (e.g., is greater than the threshold distance) .
  • if the device (e.g., the GPU of the device) determines at 640 that the render area is behind the scanline position of the DPU, then at 645, the device (e.g., the GPU) may render the current bin (e.g., the first bin) and perform a resolve pass for the current bin (e.g., write or forward results of the rendering from the memory of the GPU to a buffer in a system memory associated with the device) . In some examples, after rendering the current bin (e.g., the first bin) , the device may return to 640.
  • the device may determine, for a next bin, whether a render area (inclusive of the next bin) is behind the scanline position (e.g., the scan pointer) of the DPU.
  • otherwise, if the device (e.g., the GPU of the device) determines at 640 that the render area is not behind the scanline position of the DPU, the device may pause rendering.
  • the device (e.g., the GPU) may refrain from rendering pixels (or remaining pixels) of the frame N, and in some examples, switch to processing other tasks or enter a sleep state.
  • the device may pause rendering until the device has determined that the render area is behind the scanline position of the DPU.
  • the device may pause rendering until the device has determined a scanline position of the DPU satisfies the threshold distance from a render area (e.g., the render area inclusive of the current bin, or a render area inclusive of a next bin) .
  • the device may return to 640, where the device may again determine whether a scanline position of the DPU satisfies the threshold distance from a render area (e.g., the render area inclusive of the current bin, or a render area inclusive of a next bin) .
  • the device may activate the GPU based on determining the scanline position of the DPU satisfies the threshold distance from a render area.
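The loop of method 601 can be sketched under assumed names as follows: for each bin, rendering proceeds only when the bin's render area is behind the DPU scanline (the decision at 640), and otherwise the GPU pauses and re-checks (the render and resolve pass at 645):

```python
# Simplified render loop: `bins` lists (bin_id, render_area_bottom) in
# scan order; `dpu_scanline` returns the current simulated scanline;
# `pause` models sleeping or switching to other tasks while waiting.

def render_frame(bins, dpu_scanline, threshold, pause):
    rendered = []
    for bin_id, area_bottom in bins:
        # 640: wait until the render area is behind the scanline by margin
        while dpu_scanline() - area_bottom <= threshold:
            pause()                      # sleep state or other GPU tasks
        rendered.append(bin_id)          # 645: render bin + resolve pass
    return rendered

# Simulated DPU whose scanline advances each time it is polled.
scan = iter([100, 300, 600, 900, 1200])
state = {"pos": 0}
def scanline():
    state["pos"] = next(scan, state["pos"] + 300)
    return state["pos"]

out = render_frame([("bin0", 0), ("bin1", 256)], scanline, 100, pause=lambda: None)
print(out)   # prints ['bin0', 'bin1']
```

In this trace, bin0 is initially too close to the scanline (100 - 0 <= 100), so the GPU pauses once before rendering it; bin1 is already safe when reached.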
  • FIG. 7 shows a flowchart illustrating a method 700 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the method 700 may implement aspects of the multimedia system 100 described with reference to FIG. 1.
  • the operations of method 700 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of the method 700 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • a device may apply a single buffer mode to one or more applications (e.g., one or more game applications) that are part of a list of applications (e.g., a whitelist of applications) , to reduce complexity associated with graphics library rendering at an application-side, a software development kit (SDK) side, or an engine side, for example.
  • the operations of the method 700 may be implemented by a framework or graphics library wrapper associated with a single buffer mode.
  • the framework or graphics library wrapper may be implemented with a single buffer mode, for example, on a device as described herein.
  • the framework or graphics library wrapper may be implemented, for example, at a GPU of a device.
  • the framework or graphics library wrapper may be implemented at an application side, an SDK side, or an engine side.
  • the device may identify an application for rendering.
  • the application may be, for example, a game application associated with one or more of an SDK or a game engine.
  • the device may initiate rendering the application.
  • the device may utilize an EGL interface (e.g., Android EGL) to create rendering surfaces associated with a frame or image.
  • the EGL interface may be configured for graphics context management, surface creation and buffer creation, and rendering synchronization, for example.
  • the device may determine or check whether the application is included in a list.
  • the list may include, for example, a list of applications having a relatively high execution priority among applications executable by the device.
  • the list may include a set of applications configured for a single buffer mode.
  • the device may compare the application to the applications included in the list.
  • the device may access the list from a memory stored or coupled to the device or, for example, from a server or a database.
  • if the device determines the application is included in the list, the device may proceed to 720.
  • the device may thus apply a check for graphics intensive applications (e.g., graphics intensive game applications) where GPU rendering delay (e.g., rendering-to-display latency) is greater than a delay period of one VSYNC frame.
  • the check may be applied to identify GPU driver upgrades which may minimize or mitigate possible stability issues associated with rendering an identified application.
  • the device may enable a single buffer mode.
  • the device may utilize EGL API parameters (e.g., hack EGL API parameters of eglCreateWindowSurface () ) to enable the single buffer mode.
  • the device may apply the single buffer mode to the application. Otherwise, if the device determines the application is not included in the list, the device may proceed to 725.
  • the device may access a vendor EGL associated with rendering the application. In some examples, the device may render the application utilizing the vendor EGL. In some examples, if the device has enabled the single buffer mode, the device may render the application utilizing the vendor EGL and the single buffer mode.
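The list check and mode selection described above can be sketched as follows. This is a hypothetical illustration only; the function name, the package identifiers, and the returned mode strings are assumptions, not part of any vendor EGL implementation:

```python
# Hypothetical sketch of the single-buffer-mode check (all names assumed).
ALLOWLIST = {"com.example.game_a", "com.example.game_b"}  # apps vetted for single buffer mode

def select_buffer_mode(package_name, allowlist=ALLOWLIST):
    """Return the buffer mode to request when creating the EGL window surface.

    Applications on the allowlist are known to tolerate single buffer
    rendering (e.g., graphics intensive games whose rendering-to-display
    latency exceeds one VSYNC period); all other applications keep the
    default mode and fall through to the vendor EGL unchanged.
    """
    if package_name in allowlist:
        return "single"   # e.g., pass single-buffer surface attributes
    return "default"
```

In a real implementation the result would steer the attributes passed to eglCreateWindowSurface (), as described at 720 above.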
  • FIG. 8 shows a block diagram 800 of a device 805 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the device 805 may be an example of aspects of a device as described herein.
  • the device 805 may include a receiver 810, a multimedia manager 815, and a transmitter 820.
  • the device 805 may also include a processor. Each of these components may be in communication with one another (e.g., via one or more buses) .
  • the receiver 810 may receive information such as packets, user data, or control information associated with various information channels (e.g., control channels, data channels, and information related to synchronization between GPUs and DPUs, etc. ) . Information may be passed on to other components of the device 805.
  • the receiver 810 may be an example of aspects of the transceiver 1120 described with reference to FIG. 11.
  • the receiver 810 may utilize a single antenna or a set of antennas.
  • the multimedia manager 815 may scan one or more pixels using a DPU of the device 805, where the one or more pixels are associated with a portion of a first frame in a buffer, track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and render one or more pixels of a bin of a set of bins using a GPU of the device 805 based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the multimedia manager 815 may be an example of aspects of the multimedia manager 1110 described herein.
  • the multimedia manager 815 as described herein may be implemented to realize one or more potential advantages.
  • One implementation may allow the device 805 to provide techniques which may support synchronization between GPUs and DPUs, among other advantages.
  • the device 805 may include features for frame rendering by a GPU based on tracking a scanning position of a DPU, which may reduce rendering latency to a delay period of one VSYNC frame and minimize or eliminate tearing issues.
  • the device 805 may include features for switching a GPU to a sleep state or to processing other tasks when a scanline of the DPU is outside a safe region, which may achieve improved power savings and reduce overall processing time.
  • aspects of the techniques described herein support the synchronization between GPUs and DPUs completely at the GPU hardware and driver level for a single buffer mode, which is transparent from an upper layer application, SDK, or engine. Accordingly, the techniques significantly reduce rendering complexity at the application side, SDK side, or engine side.
  • the multimedia manager 815 may be implemented in hardware, code (e.g., software or firmware) executed by a processor, or any combination thereof. If implemented in code executed by a processor, the functions of the multimedia manager 815, or its sub-components may be executed by a general-purpose processor, a CPU, a DSP, a GPU, a DPU, an application-specific integrated circuit (ASIC) , a FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described in the present disclosure.
  • the multimedia manager 815 may be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations by one or more physical components.
  • the multimedia manager 815, or its sub-components may be a separate and distinct component in accordance with various aspects of the present disclosure.
  • the multimedia manager 815, or its sub-components may be combined with one or more other hardware components, including but not limited to an input/output (I/O) component, a transceiver, a network server, another computing device, one or more other components described in the present disclosure, or a combination thereof in accordance with various aspects of the present disclosure.
  • the transmitter 820 may transmit signals generated by other components of the device 805.
  • the transmitter 820 may be collocated with a receiver 810 in a transceiver component.
  • the transmitter 820 may be an example of aspects of the transceiver 1120 described with reference to FIG. 11.
  • the transmitter 820 may utilize a single antenna or a set of antennas.
  • FIG. 9 shows a block diagram 900 of a device 905 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the device 905 may be an example of aspects of a device 805 or a device 105 as described herein.
  • the device 905 may include a receiver 910, a multimedia manager 915, and a transmitter 935.
  • the device 905 may also include a processor. Each of these components may be in communication with one another (e.g., via one or more buses) .
  • the receiver 910 may receive information such as packets, user data, or control information associated with various information channels (e.g., control channels, data channels, and information related to synchronization between GPUs and DPUs, etc. ) . Information may be passed on to other components of the device 905.
  • the receiver 910 may be an example of aspects of the transceiver 1120 described with reference to FIG. 11.
  • the receiver 910 may utilize a single antenna or a set of antennas.
  • the multimedia manager 915 may be an example of aspects of the multimedia manager 815 as described herein.
  • the multimedia manager 915 may include a scan component 920, a track component 925, and a render component 930.
  • the multimedia manager 915 may be an example of aspects of the multimedia manager 1110 described herein.
  • the scan component 920 may scan one or more pixels using a DPU of the device 905, where the one or more pixels are associated with a portion of a first frame in a buffer.
  • the track component 925 may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the render component 930 may render one or more pixels of a bin of a set of bins using a GPU of the device 905 based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the transmitter 935 may transmit signals generated by other components of the device 905.
  • the transmitter 935 may be collocated with a receiver 910 in a transceiver component.
  • the transmitter 935 may be an example of aspects of the transceiver 1120 described with reference to FIG. 11.
  • the transmitter 935 may utilize a single antenna or a set of antennas.
  • FIG. 10 shows a block diagram 1000 of a multimedia manager 1005 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the multimedia manager 1005 may be an example of aspects of a multimedia manager 815, a multimedia manager 915, or a multimedia manager 1110 described herein.
  • the multimedia manager 1005 may include a scan component 1010, a track component 1015, a render component 1020, a parameter component 1025, a synchronization component 1030, an activation component 1035, a deactivation component 1040, and a mode component 1045. Each of these components may communicate, directly or indirectly, with one another (e.g., via one or more buses) .
  • the scan component 1010 may scan one or more pixels using a DPU of a device, where the one or more pixels are associated with a portion of a first frame in a buffer. In some examples, the scan component 1010 may scan the one or more pixels associated with the portion of the first frame in the buffer and render one or more pixels of a bin of a set of bins using a GPU of the device based in part on a single VSYNC delay period. In some examples, one or more of the GPU and the DPU are operating in a single buffer mode.
  • the track component 1015 may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the pointer corresponds to a counter of the GPU.
  • the track component 1015 may track, by the GPU, the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the counter.
  • the track component 1015 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region.
  • the track component 1015 may reset the counter based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the render component 1020 may render the one or more pixels of the bin of the set of bins using the GPU of the device based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer. In some examples, the render component 1020 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region. In some examples, the render component 1020 may render the one or more pixels of the bin of the set of bins using the GPU of the device based on the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the render component 1020 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region. In some examples, the render component 1020 may refrain from rendering one or more pixels of a second bin of the set of bins using the GPU of the device based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region. The one or more pixels of the second bin are associated with a second portion of the second frame in the buffer. In some examples, the portion of the second frame in the buffer associated with the bin is a region from which the scanning of the one or more pixels associated with the portion of the first frame in the buffer has occurred.
  • the parameter component 1025 may determine a resolution associated with the DPU. In some examples, the parameter component 1025 may configure a parameter of the counter based on the resolution associated with the DPU. In some examples, the parameter component 1025 may determine a refresh rate associated with the DPU. In some examples, the parameter component 1025 may configure a parameter of the counter based on the refresh rate associated with the DPU.
  • the synchronization component 1030 may synchronize, based on a refresh cycle of the DPU, the counter of the GPU with the pointer that indicates the position of the scanning, by the DPU, of the one or more pixels associated with the portion of the first frame in the buffer.
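A minimal sketch of such a GPU-side counter follows (hypothetical; real hardware would increment the counter in silicon). Per the parameter and synchronization components above, its parameters derive from the panel's resolution and refresh rate, and it is re-aligned to the DPU pointer once per refresh cycle:

```python
class ScanlineCounter:
    """Estimates the DPU scan position from elapsed time (sketch only).

    A display with `height` lines refreshing at `refresh_hz` advances
    roughly one line every 1 / (height * refresh_hz) seconds (blanking
    intervals are ignored in this simplified model).
    """
    def __init__(self, height, refresh_hz):
        self.height = height
        self.line_period = 1.0 / (height * refresh_hz)
        self.value = 0

    def advance(self, elapsed_s):
        """Advance by the number of lines scanned during elapsed_s."""
        self.value = (self.value + int(round(elapsed_s / self.line_period))) % self.height

    def sync(self, dpu_pointer):
        """Re-align with the true DPU pointer, e.g., once per refresh cycle."""
        self.value = dpu_pointer

# A 1080-line, 60 Hz panel scans 1080 * 60 = 64800 lines per second, so
# after 1/120 s the counter sits near line 540 (half a frame).
```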
  • the activation component 1035 may activate the GPU based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the deactivation component 1040 may deactivate the GPU based on the refraining.
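The activation/deactivation behavior in the two bullets above can be sketched as a simple decision function (names and the half-open region convention are assumptions): the GPU is kept active only while the scan position lies within the threshold (safe) region, and is otherwise put to sleep or switched to other tasks to save power.

```python
def gpu_power_action(scan_row, safe_start, safe_end, gpu_active):
    """Decide whether to activate or deactivate the GPU for bin rendering.

    The GPU renders only while the DPU scanline lies inside the safe
    region [safe_start, safe_end); outside that region, rendering would
    risk tearing, so the GPU may be deactivated instead.
    """
    in_safe_region = safe_start <= scan_row < safe_end
    if in_safe_region and not gpu_active:
        return "activate"
    if not in_safe_region and gpu_active:
        return "deactivate"
    return "no_change"
```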
  • the mode component 1045 may identify an application running on the device. In some examples, the mode component 1045 may compare the application to a set of applications configured for the single buffer mode. In some examples, the mode component 1045 may apply the single buffer mode to the application based on the comparing. In some examples, the scanning of the one or more pixels associated with the portion of the first frame in the buffer and the rendering of the one or more pixels of the bin of the set of bins using the GPU of the device may be based on the single buffer mode.
  • FIG. 11 shows a diagram of a system 1100 including a device 1105 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the device 1105 may be an example of or include the components of device 805, device 905, or a device as described herein.
  • the device 1105 may include components for bi-directional voice and data communications including components for transmitting and receiving communications, including a multimedia manager 1110, an I/O controller 1115, a transceiver 1120, an antenna 1125, memory 1130, a processor 1140, and a coding manager 1150. These components may be in electronic communication via one or more buses (e.g., bus 1145) .
  • the multimedia manager 1110 may scan one or more pixels using a DPU of device 1105, where the one or more pixels are associated with a portion of a first frame in a buffer, track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer, and render one or more pixels of a bin of a set of bins using a GPU of the device 1105 based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the I/O controller 1115 may manage input and output signals for the device 1105.
  • the I/O controller 1115 may also manage peripherals not integrated into the device 1105.
  • the I/O controller 1115 may represent a physical connection or port to an external peripheral.
  • the I/O controller 1115 may utilize an operating system such as iOS, ANDROID, MS-DOS, MS-WINDOWS, OS/2, UNIX, LINUX, or another known operating system.
  • the I/O controller 1115 may represent or interact with a modem, a keyboard, a mouse, a touchscreen, or a similar device.
  • the I/O controller 1115 may be implemented as part of a processor.
  • a user may interact with the device 1105 via the I/O controller 1115 or via hardware components controlled by the I/O controller 1115.
  • the transceiver 1120 may communicate bi-directionally, via one or more antennas, wired, or wireless links as described herein.
  • the transceiver 1120 may represent a wireless transceiver and may communicate bi-directionally with another wireless transceiver.
  • the transceiver 1120 may also include a modem to modulate the packets and provide the modulated packets to the antennas for transmission, and to demodulate packets received from the antennas.
  • the device 1105 may include a single antenna 1125. However, in some examples, the device 1105 may have more than one antenna 1125, which may be capable of concurrently transmitting or receiving multiple wireless transmissions.
  • the memory 1130 may include RAM and read-only memory (ROM) .
  • the memory 1130 may store computer-readable, computer-executable code 1135 including instructions that, when executed, cause the processor to perform various functions described herein.
  • the memory 1130 may contain, among other things, a BIOS which may control basic hardware or software operation such as the interaction with peripheral components or devices.
  • the code 1135 may include instructions to implement aspects of the present disclosure, including instructions to support multimedia communications.
  • the code 1135 may be stored in a non-transitory computer-readable medium such as system memory or other type of memory.
  • the code 1135 may not be directly executable by the processor 1140 but may cause a computer (e.g., when compiled and executed) to perform functions described herein.
  • the processor 1140 may include an intelligent hardware device, (e.g., a general-purpose processor, a DSP, a CPU, a GPU, a DPU, a microcontroller, an ASIC, an FPGA, a programmable logic device, a discrete gate or transistor logic component, a discrete hardware component, or any combination thereof) .
  • the processor 1140 may be configured to operate a memory array using a memory controller. In other cases, a memory controller may be integrated into the processor 1140.
  • the processor 1140 may be configured to execute computer-readable instructions stored in a memory (e.g., the memory 1130) to cause the device 1105 to perform various functions (e.g., functions or tasks supporting synchronization between GPUs and DPUs) .
  • FIG. 12 shows a diagram of a system 1200 including a device 1205 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the device 1205 may be an example of or include the components of device 805, device 905, device 1105, or a device as described herein.
  • the device 1205 may include components for bi-directional multimedia communications including components for transmitting and receiving multimedia communications, including a user interface unit 1210, a CPU 1215, a CPU memory 1220, a GPU driver 1225, a GPU 1230, a GPU memory 1235, a DPU 1240, a system memory 1245, and a display 1250. These components may be in electronic communication via one or more buses.
  • the CPU 1215 may include, but is not limited to, a DSP, general purpose microprocessor, an ASIC, an FPGA, or other equivalent integrated or discrete logic circuitry. Although the CPU 1215 and the GPU 1230 are illustrated as separate units in the example of FIG. 12, in some examples, the CPU 1215 and the GPU 1230 may be integrated into a single unit.
  • the CPU 1215 may execute one or more software applications. Examples of the software applications may include operating systems, word processors, web browsers, e-mail applications, spreadsheets, video games, audio and video capture applications, playback or editing applications, or other such applications that initiate generation of multimedia data (e.g., audio data, video data, or a combination thereof) to be outputted via the display 1250.
  • the CPU 1215 may include the CPU memory 1220.
  • the CPU memory 1220 may represent on-chip storage or memory used in executing machine or object code.
  • the CPU memory 1220 may include one or more volatile or non-volatile memories or storage devices, such as flash memory, a magnetic data media, an optical storage media, etc.
  • the CPU 1215 may be configured to read values from or write values to the CPU memory 1220 more quickly than reading values from or writing values to the system memory 1245, which may be accessed, e.g., over a system bus.
  • the CPU memory 1220 may be a cache memory.
  • the GPU 1230 may represent one or more dedicated processors for performing graphical operations.
  • the GPU 1230 may be a dedicated hardware unit having fixed function and programmable components for rendering graphics and executing GPU applications.
  • the GPU 1230 may also include a DSP, a general purpose microprocessor, an ASIC, an FPGA, or other equivalent integrated or discrete logic circuitry.
  • the GPU 1230 may be built with a highly-parallel structure that provides more efficient processing of complex graphic-related operations than the CPU 1215.
  • the GPU 1230 may include a number of processing elements that are configured to operate on multiple vertices or pixels in a parallel manner.
  • the highly parallel nature of the GPU 1230 may allow the GPU 1230 to generate graphic images (e.g., graphical user interfaces and two-dimensional or three-dimensional graphics scenes) for output at the display 1250 more quickly than the CPU 1215.
  • the GPU 1230 may, in some examples, be integrated into a motherboard of the device 1205. In other examples, the GPU 1230 may be present on a graphics card that is installed in a port in the motherboard of the device 1205 or may be otherwise incorporated within a peripheral device configured to interoperate with the device 1205.
  • the GPU 1230 may include the GPU memory 1235.
  • the GPU memory 1235 may represent on-chip storage or memory used in executing machine or object code.
  • the GPU memory 1235 may include one or more volatile or non-volatile memories or storage devices, such as flash memory, a magnetic data media, an optical storage media, etc.
  • the GPU 1230 may be able to read values from or write values to the GPU memory 1235 more quickly than reading values from or writing values to the system memory 1245, which may be accessed, e.g., over a system bus. That is, the GPU 1230 may read data from and write data to the GPU memory 1235 without using the system bus to access off-chip memory. This operation may allow the GPU 1230 to operate in a more efficient manner by reducing a load for the GPU 1230 to read and write data via the system bus, which may experience heavy bus traffic.
  • the GPU memory 1235 may be a cache memory.
  • the display 1250 may be configured as a unit capable of displaying video, images, text or any other type of data for consumption by a viewer.
  • the display 1250 may include a liquid-crystal display (LCD) , a light emitting diode (LED) display, an organic LED (OLED) display, an active-matrix OLED (AMOLED) display, or the like.
  • the DPU 1240 may include a display buffer, which may be configured as a memory or storage device dedicated to storing data for presentation of imagery, such as computer-generated graphics, still images, video frames, or the like for the display 1250.
  • the display buffer may represent a two-dimensional buffer that includes a plurality of storage locations.
  • the number of storage locations within the display buffer may, in some examples, correspond to the number of pixels to be displayed on the display 1250.
  • the display buffer of the DPU may include 640 x 480 storage locations storing pixel color and intensity information, such as red, green, and blue pixel values, or other color values.
  • the display buffer may store the final pixel values for each of the pixels processed by the GPU 1230 and the DPU 1240.
  • the display 1250 may retrieve the final pixel values from the display buffer and display the final image based on the pixel values stored in the display buffer.
  • the display buffer may be a cache memory.
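The display buffer described above can be modeled as a two-dimensional array with one storage location per displayed pixel (a sketch, assuming 8-bit RGB color values; the helper names are illustrative):

```python
def make_display_buffer(width, height, clear_color=(0, 0, 0)):
    """Allocate width x height storage locations, each holding an (R, G, B) value."""
    return [[clear_color for _ in range(width)] for _ in range(height)]

def write_pixel(buf, x, y, rgb):
    """Store the final pixel value produced by the GPU/DPU at location (x, y)."""
    buf[y][x] = rgb

# A 640 x 480 buffer, as in the example above, holds 307200 storage
# locations, from which the display retrieves the final image.
buf = make_display_buffer(640, 480)
```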
  • the user interface unit 1210 may be configured as a unit with which a user may interact or otherwise interface to communicate with other units of the device 1205, such as the CPU 1215.
  • Examples of the user interface unit 1210 include, but are not limited to, a trackball, a mouse, a keyboard, and other types of input devices.
  • the user interface unit 1210 may also be, or include, a touch screen and the touch screen may be incorporated as part of the display 1250.
  • the system memory 1245 may include one or more computer-readable storage media. Examples of the system memory 1245 include, but are not limited to, a RAM, static RAM (SRAM) , dynamic RAM (DRAM) , a ROM, an electrically erasable programmable read-only memory (EEPROM) , a compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disc storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer or a processor.
  • the system memory 1245 may store program components and instructions that are accessible for execution by the CPU 1215. Additionally, the system memory 1245 may store user applications and application surface data associated with the applications.
  • the system memory 1245 may, in some examples, store information for use by and information generated by other components of the device 1205.
  • the system memory 1245 may act as a device memory for the GPU 1230 and may store data to be operated on by the GPU 1230, as well as data resulting from operations performed by the GPU 1230.
  • the system memory 1245 may include instructions that cause the CPU 1215 or the GPU 1230 to perform the functions attributed to the CPU 1215 or the GPU 1230 in aspects of the present disclosure.
  • the system memory 1245 may, in some examples, be considered as a non-transitory storage medium.
  • the term “non-transitory” should not be interpreted to mean that the system memory 1245 is non-movable.
  • the system memory 1245 may be removed from the device 1205 and moved to another device.
  • a system memory substantially similar to the system memory 1245 may be inserted into the device 1205.
  • a non-transitory storage medium may store data that can, over time, change (e.g., in RAM) .
  • the system memory 1245 may store the GPU driver 1225 and compiler, a GPU program, and a locally-compiled GPU program.
  • the GPU driver 1225 may represent a computer program or executable code that provides an interface to access the GPU 1230.
  • the CPU 1215 may execute the GPU driver 1225 or portions thereof to interface with the GPU 1230 and, for this reason, the GPU driver 1225 is shown in the example of FIG. 12 within the CPU 1215.
  • the GPU driver 1225 may be accessible to programs or other executables executed by the CPU 1215, including the GPU program stored in the system memory 1245.
  • the CPU 1215 may provide graphics commands and graphics data to the GPU 1230 for rendering to the display 1250 (e.g., via the GPU driver 1225) .
  • the GPU program may include code written in a high level (HL) programming language, e.g., using an application programming interface (API) .
  • APIs include Open Graphics Library ( “OpenGL” ) , DirectX, Render-Man, WebGL, or any other public or proprietary standard graphics API.
  • the instructions may also conform to so-called heterogeneous computing libraries, such as Open-Computing Language ( “OpenCL” ) , DirectCompute, etc.
  • an API includes a predetermined, standardized set of commands that are executed by associated hardware. API commands allow a user to instruct hardware components of the GPU 1230 to execute commands without user knowledge as to the specifics of the hardware components.
  • the CPU 1215 may issue one or more rendering commands to the GPU 1230 (e.g., through the GPU driver 1225) to cause the GPU 1230 to perform some or all of the rendering of the graphics data.
  • the graphics data to be rendered may include a list of graphics primitives (e.g., points, lines, triangles, quadrilaterals, etc. ) .
  • the compiler may receive the GPU program from the CPU 1215 when executing HL code that includes the GPU program. That is, a software application being executed by the CPU 1215 may invoke the GPU driver 1225 (e.g., via a graphics API) to issue one or more commands to the GPU 1230 for rendering one or more graphics primitives into displayable graphics images.
  • the compiler may compile the GPU program to generate the locally-compiled GPU program that conforms to a low-level (LL) programming language.
  • the compiler may then output the locally-compiled GPU program that includes the LL instructions.
  • the LL instructions may be provided to the GPU 1230 in the form of a list of drawing primitives (e.g., triangles, rectangles, etc. ) .
  • the LL instructions may include vertex specifications that specify one or more vertices associated with the primitives to be rendered.
  • the vertex specifications may include positional coordinates for each vertex and, in some instances, other attributes associated with the vertex, such as color coordinates, normal vectors, and texture coordinates.
  • the primitive definitions may include primitive type information, scaling information, rotation information, and the like.
  • the GPU driver 1225 may formulate one or more commands that specify one or more operations for the GPU 1230 to perform in order to render the primitive.
  • when the GPU 1230 receives a command from the CPU 1215, it may decode the command, configure one or more processing elements to perform the specified operation, and output the rendered data to the DPU 1240.
  • the GPU 1230 may receive the locally-compiled GPU program, and then, in some instances, the GPU 1230 renders one or more images and outputs the rendered images to the DPU 1240. For example, the GPU 1230 may generate a number of primitives to be displayed at the display 1250. Primitives may include one or more of a line (including curves, splines, etc. ) , a point, a circle, an ellipse, a polygon (e.g., a triangle) , or any other two-dimensional primitive. The term “primitive” may also refer to three-dimensional primitives, such as cubes, cylinders, spheres, cones, pyramids, tori, or the like.
  • the term “primitive” refers to any basic geometric shape or element capable of being rendered by the GPU 1230 for display as an image (or frame in the context of video data) via the display 1250.
  • the GPU 1230 may transform primitives and other attributes (e.g., that define a color, texture, lighting, camera configuration, or other aspect) of the primitives into a so-called “world space” by applying one or more model transforms (which may also be specified in the state data) .
  • the GPU 1230 may apply a view transform for the active camera (which again may also be specified in the state data defining the camera) to transform the coordinates of the primitives and lights into the camera or eye space.
  • the GPU 1230 may also perform vertex shading to render the appearance of the primitives in view of any active lights.
  • the GPU 1230 may perform vertex shading in one or more of the above model, world, or view space.
  • the GPU 1230 may perform projections to project the image into a canonical view volume. After transforming the model from the eye space to the canonical view volume, the GPU 1230 may perform clipping to remove any primitives that do not at least partially reside within the canonical view volume. That is, the GPU 1230 may remove any primitives that are not within the frame of the camera. The GPU 1230 may then map the coordinates of the primitives from the view volume to the screen space, effectively reducing the three-dimensional coordinates of the primitives to the two-dimensional coordinates of the screen. Given the transformed and projected vertices defining the primitives with their associated shading data, the GPU 1230 may then rasterize the primitives.
  • rasterization may refer to the task of taking an image described in a vector graphics format and converting it to a raster image (e.g., a pixelated image) for output on a video display or for storage in a bitmap file format.
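The final mapping step of the pipeline sketched above (view volume to screen space) can be illustrated with the standard viewport transform (a sketch; the function name is illustrative, and this assumes normalized device coordinates in [-1, 1]):

```python
def ndc_to_screen(x_ndc, y_ndc, width, height):
    """Map normalized device coordinates in [-1, 1] to pixel coordinates.

    After the model, view, and projection transforms reduce a vertex to
    two-dimensional normalized device coordinates, this viewport step
    places it on the screen raster for rasterization.
    """
    x_screen = (x_ndc + 1.0) * 0.5 * width
    y_screen = (1.0 - y_ndc) * 0.5 * height  # flip: NDC y points up, raster y points down
    return x_screen, y_screen

# The NDC origin lands at the center of a 640 x 480 screen: (320.0, 240.0).
```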
  • the GPU 1230 may include a dedicated fast bin buffer (e.g., a fast memory buffer, such as general memory (GMEM) , which may be referred to by the GPU memory 1235) .
  • a rendering surface may be divided into bins. In some cases, the bin size is determined by format (e.g., pixel color and depth information) and render target resolution divided by the total amount of GMEM. The number of bins may vary based on the device 1205 hardware, target resolution size, and target display format.
  • a rendering pass may draw (e.g., render, write, etc. ) pixels into GMEM (e.g., with a high bandwidth that matches the capabilities of the GPU) .
  • the GPU 1230 may then resolve the GMEM (e.g., burst write blended pixel values from the GMEM, as a single layer, to the display buffer or a frame buffer in the system memory 1245) . Such may be referred to as bin-based or tile-based rendering. When all bins are complete, the driver may swap buffers and start the binning process again for a next frame.
  • the GPU 1230 may implement a tile-based architecture that renders an image or rendering target by breaking the image into multiple portions, referred to as tiles or bins.
  • the bins may be sized based on the size of the GPU memory 1235 (e.g., which may alternatively be referred to herein as GMEM or a cache) , the resolution of the display 1250, the color or Z precision of the render target, etc.
  • the GPU 1230 may perform a binning pass and one or more rendering passes. For example, with respect to the binning pass, the GPU 1230 may process an entire image and sort rasterized primitives into bins.
  • the device 1205 may be configured to synchronize the GPU 1230 and the DPU 1240 for rendering and display, via a processor (e.g., the GPU 1230, the DPU 1240) of the device 1205, content (e.g., frames) associated with an application.
  • synchronizing the rendering and display of content (e.g., frames) associated with an application may reduce a processing load on the processor (e.g., the GPU 1230), as well as decrease power consumption when rendering content (e.g., frames).
  • the device 1205 may scan one or more pixels using the DPU 1240 of the device 1205.
  • the one or more pixels may be associated with a portion of a first frame in a buffer.
  • the device 1205 may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the pointer corresponds to a counter of the GPU 1230.
  • the device 1205 may track, by the GPU 1230, the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the counter.
  • the device 1205 may thus render one or more pixels of a bin of a plurality of bins using the GPU 1230 of the device 1205 based on the tracking.
  • the one or more pixels of the bin of the plurality of bins may be associated with a portion of a second frame in the buffer.
  • the portion of the second frame in the buffer associated with the bin may be a region from which the scanning of the one or more pixels associated with the portion of the first frame in the buffer has previously occurred.
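The scheme above, in which the GPU renders a bin of the next frame only into a region the DPU's scan has already left behind, can be sketched as a simple row comparison against the tracked scan position. This is an illustrative single-buffer ("race the beam") sketch; the row-granular bins and function names are assumptions, not the claimed implementation.

```python
# Hedged sketch: a bin of the second frame may safely overwrite rows of
# the shared buffer that scanout of the first frame has already read.

def safe_to_render(bin_start_row, bin_end_row, scan_row):
    """A bin may be rendered once scanout has moved past its last row."""
    return bin_end_row <= scan_row

def renderable_bins(bins, scan_row):
    """Bins (start, end) whose region scanout has already passed."""
    return [b for b in bins if safe_to_render(*b, scan_row)]

# Four 270-row bins of a 1080-row frame; scanout is at row 600, so the
# first two bins may be re-rendered for the next frame without tearing.
bins = [(0, 270), (270, 540), (540, 810), (810, 1080)]
ready = renderable_bins(bins, scan_row=600)
```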
  • the device 1205 may determine a resolution associated with the DPU 1240, and configure a parameter of the counter based on the resolution associated with the DPU 1240. In some other examples, the device 1205 may determine a refresh rate associated with the DPU 1240, and configure a parameter of the counter based on the refresh rate associated with the DPU 1240. The device 1205 may synchronize, based on a refresh cycle of the DPU 1240, the counter of the GPU 1230 with the pointer that indicates the position of the scanning, by the DPU 1240, of the one or more pixels associated with the portion of the first frame in the buffer.
  • the device 1205 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU 1240 of the device 1205 satisfies a threshold region, and reset the counter based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region. In some examples, the device 1205 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU 1240 of the device 1205 satisfies a threshold region, and render the one or more pixels of the bin of the plurality of bins using the GPU 1230 of the device 1205 based on the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region. In some examples, the device 1205 may activate the GPU 1230 based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the device 1205 may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU 1240 of the device 1205 satisfies a threshold region, and refrain from rendering one or more pixels of a second bin of the plurality of bins using the GPU 1230 of the device 1205 based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the one or more pixels of the second bin are associated with a second portion of the second frame in the buffer.
  • the device 1205 may deactivate the GPU 1230 based on the refraining.
  • one or more of the GPU 1230 and the DPU 1240 may operate in a single buffer mode.
  • scanning of the one or more pixels associated with the portion of the first frame in the buffer and rendering the one or more pixels of the bin of the plurality of bins using the GPU 1230 of the device 1205 corresponds to a single VSYNC delay period.
  • the device 1205 may identify an application running on the device 1205, compare the application to a set of applications configured for the single buffer mode, and apply the single buffer mode to the application based on the comparing.
  • scanning of the one or more pixels associated with the portion of the first frame in the buffer and rendering the one or more pixels of the bin of the plurality of bins using the GPU 1230 of the device 1205 may be based on the single buffer mode.
  • FIG. 13 shows a flowchart illustrating a method 1300 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the operations of method 1300 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of method 1300 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • the device may scan one or more pixels using a DPU of the device, where the one or more pixels are associated with a portion of a first frame in a buffer.
  • the operations of 1305 may be performed according to the methods described herein. In some examples, aspects of the operations of 1305 may be performed by a scan component as described with reference to FIGs. 8 through 11.
  • the device may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the operations of 1310 may be performed according to the methods described herein. In some examples, aspects of the operations of 1310 may be performed by a track component as described with reference to FIGs. 8 through 11.
  • the device may render one or more pixels of a bin of a set of bins using a GPU of the device based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the operations of 1315 may be performed according to the methods described herein. In some examples, aspects of the operations of 1315 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • FIG. 14 shows a flowchart illustrating a method 1400 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the operations of method 1400 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of method 1400 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • the device may scan one or more pixels using a DPU of the device, where the one or more pixels are associated with a portion of a first frame in a buffer.
  • the operations of 1405 may be performed according to the methods described herein. In some examples, aspects of the operations of 1405 may be performed by a scan component as described with reference to FIGs. 8 through 11.
  • the device may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the operations of 1410 may be performed according to the methods described herein. In some examples, aspects of the operations of 1410 may be performed by a track component as described with reference to FIGs. 8 through 11.
  • the device may track, by a GPU of the device, the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using a counter.
  • the operations of 1415 may be performed according to the methods described herein. In some examples, aspects of the operations of 1415 may be performed by a track component as described with reference to FIGs. 8 through 11.
  • the device may render one or more pixels of a bin of a set of bins using the GPU of the device based on the tracking, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the operations of 1420 may be performed according to the methods described herein. In some examples, aspects of the operations of 1420 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • FIG. 15 shows a flowchart illustrating a method 1500 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the operations of method 1500 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of method 1500 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • the device may scan one or more pixels using a DPU of the device, where the one or more pixels are associated with a portion of a first frame in a buffer.
  • the operations of 1505 may be performed according to the methods described herein. In some examples, aspects of the operations of 1505 may be performed by a scan component as described with reference to FIGs. 8 through 11.
  • the device may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the operations of 1510 may be performed according to the methods described herein. In some examples, aspects of the operations of 1510 may be performed by a track component as described with reference to FIGs. 8 through 11.
  • the device may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region.
  • the operations of 1515 may be performed according to the methods described herein. In some examples, aspects of the operations of 1515 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • the device may render one or more pixels of a bin of a set of bins using a GPU of the device based on the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region.
  • the operations of 1520 may be performed according to the methods described herein. In some examples, aspects of the operations of 1520 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • FIG. 16 shows a flowchart illustrating a method 1600 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the operations of method 1600 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of method 1600 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • the device may scan one or more pixels using a DPU of the device, where the one or more pixels are associated with a portion of a first frame in a buffer.
  • the operations of 1605 may be performed according to the methods described herein. In some examples, aspects of the operations of 1605 may be performed by a scan component as described with reference to FIGs. 8 through 11.
  • the device may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer.
  • the operations of 1610 may be performed according to the methods described herein. In some examples, aspects of the operations of 1610 may be performed by a track component as described with reference to FIGs. 8 through 11.
  • the device may determine that the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer using the DPU of the device satisfies a threshold region.
  • the operations of 1615 may be performed according to the methods described herein. In some examples, aspects of the operations of 1615 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • the device may refrain from rendering one or more pixels of a second bin of a set of bins using a GPU of the device based on the position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer satisfying the threshold region, where the one or more pixels of the second bin are associated with a second portion of a second frame in the buffer.
  • the operations of 1620 may be performed according to the methods described herein. In some examples, aspects of the operations of 1620 may be performed by a render component as described with reference to FIGs. 8 through 11.
  • FIG. 17 shows a flowchart illustrating a method 1700 that supports synchronization between GPUs and DPUs in accordance with aspects of the present disclosure.
  • the operations of method 1700 may be implemented by a device or its components (e.g., a GPU, a DPU) as described herein.
  • the operations of method 1700 may be performed by a multimedia manager as described with reference to FIGs. 8 through 11.
  • a device may execute a set of instructions to control the functional elements of the device to perform the functions described herein. Additionally or alternatively, a device may perform aspects of the functions described herein using special-purpose hardware.
  • the device may identify an application running on the device.
  • the operations of 1705 may be performed according to the methods described herein. In some examples, aspects of the operations of 1705 may be performed by a mode component as described with reference to FIGs. 8 through 11.
  • the device may compare the application to a set of applications configured for a single buffer mode.
  • the operations of 1710 may be performed according to the methods described herein. In some examples, aspects of the operations of 1710 may be performed by a mode component as described with reference to FIGs. 8 through 11.
  • the device may apply the single buffer mode to the application based on the comparing.
  • the operations of 1715 may be performed according to the methods described herein. In some examples, aspects of the operations of 1715 may be performed by a mode component as described with reference to FIGs. 8 through 11.
  • the device may scan one or more pixels associated with a portion of a first frame in a buffer and render one or more pixels of a bin of a set of bins using a GPU of the device based on the single buffer mode, where the one or more pixels of the bin of the set of bins are associated with a portion of a second frame in the buffer.
  • the operations of 1720 may be performed according to the methods described herein. In some examples, aspects of the operations of 1720 may be performed by a scan component and render component as described with reference to FIGs. 8 through 11.
  • Information and signals described herein may be represented using any of a variety of different technologies and techniques.
  • data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
  • a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration) .
  • the functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope of the disclosure and appended claims. For example, due to the nature of software, functions described herein may be implemented using software executed by a processor, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.
  • Computer-readable media includes both non-transitory computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • a non-transitory storage medium may be any available medium that may be accessed by a general-purpose or special purpose computer.
  • non-transitory computer-readable media may include RAM, ROM, electrically erasable programmable ROM (EEPROM) , flash memory, compact disk (CD) ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other non-transitory medium that may be used to carry or store desired program code means in the form of instructions or data structures and that may be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor.
  • any connection is properly termed a computer-readable medium.
  • the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL) , or wireless technologies such as infrared, radio, and microwave
  • the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of computer-readable medium.
  • Disk and disc, as used herein, include CD, laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Multimedia (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

Described are methods, systems, and devices for multimedia communications. A device may scan one or more pixels using a display processing unit (DPU) of the device. The one or more pixels may be associated with a portion of a first frame in a buffer. The device may track a pointer that indicates a position of the scanning of the one or more pixels associated with the portion of the first frame in the buffer. In some examples, the pointer may correspond to a counter of a GPU. The device may render one or more pixels of a bin of a set of bins using the GPU of the device based on the tracking. The one or more pixels of the bin of the set of bins may be associated with a portion of a second frame in the buffer.
PCT/CN2019/123547 2019-12-06 2019-12-06 Synchronization between graphics processing units and display processing units WO2021109105A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/123547 WO2021109105A1 (fr) 2019-12-06 2019-12-06 Synchronization between graphics processing units and display processing units

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/123547 WO2021109105A1 (fr) 2019-12-06 2019-12-06 Synchronization between graphics processing units and display processing units

Publications (1)

Publication Number Publication Date
WO2021109105A1 true WO2021109105A1 (fr) 2021-06-10

Family

ID=76221368

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/123547 WO2021109105A1 (fr) 2019-12-06 2019-12-06 Synchronization between graphics processing units and display processing units

Country Status (1)

Country Link
WO (1) WO2021109105A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115391124A (zh) * 2022-10-27 2022-11-25 瀚博半导体(上海)有限公司 Method and apparatus for graphics chip power consumption testing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102270415A (zh) * 2010-06-04 2011-12-07 刘舸 Built-in system type TFT-LCD liquid crystal display module
US20150187256A1 (en) * 2014-01-02 2015-07-02 Nvidia Corporation Preventing fetch of occluded pixels for display processing and scan-out
CN108630139A (zh) * 2018-05-08 2018-10-09 京东方科技集团股份有限公司 Image display processing method and apparatus, display apparatus, and storage medium
CN109036295A (zh) * 2018-08-09 2018-12-18 京东方科技集团股份有限公司 Image display processing method and apparatus, display apparatus, and storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115391124A (zh) * 2022-10-27 2022-11-25 瀚博半导体(上海)有限公司 Method and apparatus for graphics chip power consumption testing
CN115391124B (zh) * 2022-10-27 2023-03-21 瀚博半导体(上海)有限公司 Method and apparatus for graphics chip power consumption testing

Similar Documents

Publication Publication Date Title
CN111033570B (zh) Rendering an image from computer graphics using two rendering computing devices
US10049426B2 (en) Draw call visibility stream
US9619916B2 (en) Method for transmitting digital scene description data and transmitter and receiver scene processing device
US10776997B2 (en) Rendering an image from computer graphics using two rendering computing devices
KR102562877B1 (ko) Methods and apparatuses for distribution of application computations
KR102381945B1 (ko) Graphics processing apparatus and method of performing a graphics pipeline in the graphics processing apparatus
US11468629B2 (en) Methods and apparatus for handling occlusions in split rendering
US11631212B2 (en) Methods and apparatus for efficient multi-view rasterization
KR20230130756A (ko) Error concealment in split rendering using a shading atlas
WO2021109105A1 (fr) Synchronization between graphics processing units and display processing units
US20210099756A1 (en) Low-cost video segmentation
US20210200255A1 (en) Higher graphics processing unit clocks for low power consuming operations
US20210280156A1 (en) Dynamic refresh rate adjustment
US10409359B2 (en) Dynamic bin ordering for load synchronization
KR20230079374A (ko) Method and apparatus for display panel FPS switching
US11321804B1 (en) Techniques for flexible rendering operations
US20190220411A1 (en) Efficient partitioning for binning layouts
US20210385365A1 (en) Display notch mitigation for cameras and projectors
US20230041630A1 (en) Techniques for phase detection autofocus
KR20220164484A (ko) Rendering using shadow information
US11600002B2 (en) Bin filtering
US20210183007A1 (en) Display hardware enhancement for inline overlay caching
US11381730B2 (en) Feature-based image autofocus
US11094032B2 (en) Out of order wave slot release for a terminated wave
WO2023164792A1 (fr) Checkerboard mask optimization in occlusion culling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19955013

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19955013

Country of ref document: EP

Kind code of ref document: A1