CN114998087A - Rendering method and device - Google Patents

Rendering method and device

Info

Publication number
CN114998087A
CN114998087A (application No. CN202111552336.5A)
Authority
CN
China
Prior art keywords
resolution
processor
image
rendering
image data
Prior art date
Legal status
Granted
Application number
CN202111552336.5A
Other languages
Chinese (zh)
Other versions
CN114998087B (en)
Inventor
秦园
林淦
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202310459132.XA (published as CN116739879A)
Publication of CN114998087A
Application granted
Publication of CN114998087B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/4451 User profiles; Roaming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present application provides a rendering method and a rendering apparatus. A first processor receives a rendering command issued by an application program, the rendering command instructing a second processor to render a first image based on a first resolution. The first processor sends a rendering instruction to the second processor, the rendering instruction instructing the second processor to render the first image. The second processor generates image data of the first image at a second resolution based on the rendering instruction, the second resolution being not greater than the first resolution. The second processor writes the image data of the first image at the second resolution into a shared memory, where both the first processor and the second processor have permission to access the shared memory. The second processor reads image data of the first image at a third resolution from the shared memory, the third resolution being greater than the second resolution. The second processor then generates the first image based on the image data of the first image at the third resolution.

Description

Rendering method and device
The present application claims priority to Chinese patent application No. 202111364414.9, entitled "Rendering method and apparatus", filed with the China National Intellectual Property Administration on November 17, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a rendering method and apparatus.
Background
With the development of display technology, image resolutions keep increasing, for example from 720P to 1080P, and from 1080P to 2K. Here, P denotes the total number of rows of pixels, so 720P means an image with 720 rows of pixels; K denotes the total number of columns of pixels, so 2K means an image with roughly 2000 columns of pixels. When an electronic device renders a high-resolution or ultra-high-resolution image, rendering consumes excessive computing resources, which increases the power consumption of the electronic device, causes severe heating, and may even cause stuttering.
Disclosure of Invention
The rendering method and rendering apparatus provided by the present application address the problems of increased power consumption, severe device heating, and stuttering that arise when an electronic device renders an image.
To achieve this, the technical solutions of the present application are as follows:
In a first aspect, the present application provides a rendering method applied to an electronic device that runs an application program and includes a first processor and a second processor. The method includes: the first processor receives a rendering command issued by the application program, the rendering command instructing the second processor to render a first image based on a first resolution; the first processor sends a rendering instruction to the second processor, the rendering instruction instructing the second processor to render the first image; the second processor generates image data of the first image at a second resolution based on the rendering instruction, the second resolution being not greater than the first resolution; the second processor writes the image data of the first image at the second resolution into a shared memory, where both the first processor and the second processor have permission to access the shared memory; the second processor reads image data of the first image at a third resolution from the shared memory, the third resolution being greater than the second resolution; and the second processor generates the first image based on the image data of the first image at the third resolution.
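The first-aspect flow can be sketched in simplified form below. This is a toy simulation, not the patented implementation; names such as `SharedMemory`, `render_at`, and `second_processor_flow` are illustrative, and image data is modeled as a dict tagged with its resolution.

```python
# Toy simulation of the first-aspect flow. All names are hypothetical.

class SharedMemory:
    """A region that both the first and second processors may access."""
    def __init__(self):
        self.slots = {}

    def write(self, key, data):
        self.slots[key] = data

    def read(self, key):
        return self.slots[key]

def render_at(resolution):
    # Stand-in for rasterization on the second processor (e.g. a GPU).
    w, h = resolution
    return {"resolution": resolution, "pixels": w * h}

def second_processor_flow(shm, first_res, second_res, third_res):
    # The second resolution is not greater than the first resolution.
    assert second_res[0] <= first_res[0] and second_res[1] <= first_res[1]
    # The third resolution is greater than the second resolution.
    assert third_res[0] > second_res[0] and third_res[1] > second_res[1]
    # Step 1: generate image data at the second resolution and share it.
    shm.write("second_res", render_at(second_res))
    # Step 2: elsewhere, a super-resolution step would populate "third_res";
    # we fake that side effect here so the read below succeeds.
    shm.write("third_res", render_at(third_res))
    # Step 3: read the third-resolution data back and draw the first image.
    return shm.read("third_res")

shm = SharedMemory()
frame = second_processor_flow(shm, (2560, 1440), (1280, 720), (2560, 1440))
```

The key point the sketch illustrates is that the second processor only ever rasterizes at the lower second resolution; the third-resolution data it finally draws from arrives through the shared memory.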
After the second processor writes the generated image data of the first image at the second resolution into the shared memory, it can read the image data of the first image at the third resolution from the shared memory. The second processor therefore does not need to generate the third-resolution image data itself, which saves its computing power, shortens the time spent on rendering, reduces power consumption, and improves rendering smoothness, mitigating severe device heating and stuttering. Moreover, because the third resolution is greater than the second resolution, the second processor obtains image data of relatively high resolution and can draw the first image from it, improving the image quality of the first image. Since both the first processor and the second processor have permission to access the shared memory, data can be shared between them through the shared memory with zero copies, improving processing efficiency.
Optionally, if the rendering mode of the application program is forward rendering, the rendering instruction corresponds to a first frame buffer, and the number of drawing instructions executed by the first frame buffer is greater than a preset threshold; if the rendering mode of the application program is deferred rendering, the rendering instruction corresponds to every frame buffer issued by the application program except the last one. When the application program uses different rendering modes, the frame buffers targeted by the rendering instructions that the first processor sends to the second processor differ, and so do the occasions on which the first processor sends those instructions. This achieves the goal of controlling the sending of rendering instructions based on the frame buffers used in different rendering modes.
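The frame-buffer selection described above might look like the following sketch (hypothetical helper, not from the patent; `framebuffers` pairs each buffer with its draw-instruction count, in submission order):

```python
def instruction_targets(render_mode, framebuffers, threshold):
    """Return the frame buffers to which rendering instructions apply.

    framebuffers: list of (name, draw_instruction_count) in submission order.
    """
    if render_mode == "forward":
        # Forward rendering: buffers whose executed draw-instruction count
        # exceeds the preset threshold.
        return [name for name, count in framebuffers if count > threshold]
    if render_mode == "deferred":
        # Deferred rendering: every frame buffer except the last one issued.
        return [name for name, _ in framebuffers[:-1]]
    raise ValueError(f"unknown rendering mode: {render_mode}")

fbs = [("gbuffer", 120), ("shadow", 30), ("post", 5)]
```

For the example list, forward rendering with a threshold of 50 selects only `gbuffer`, while deferred rendering selects everything but `post`.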
Optionally, the first frame buffer is a frame buffer with the largest number of executed drawing instructions in all frame buffers.
Optionally, before the first processor sends the rendering instruction to the second processor, the method further includes: the first processor acquires the rendering mode of the application program from the configuration file of the application program.
Optionally, the rendering instruction is operable to instruct the second processor to render the first image based on a second resolution, the second resolution being less than the first resolution. Because the rendering instruction sent by the first processor carries the second resolution of the first image, the resolution of the image data generated by the second processor is specified explicitly, preventing the second processor from generating image data that does not match the resolution required by the application program and improving the accuracy of the first image. Furthermore, since the second resolution is less than the first resolution, and a lower resolution means a smaller amount of data, specifying a second resolution below the first resolution reduces the amount of data the second processor must process and lowers the power consumption of the electronic device, mitigating severe heating.
Optionally, the second resolution is less than the first resolution, and the third resolution is equal to or greater than the first resolution. When the second resolution is less than the first resolution, the amount of data the second processor must handle while generating the image data of the first image at the second resolution is reduced. If the third resolution equals the first resolution, the second processor can read image data corresponding to the first resolution and draw the first image at the first resolution, ensuring that the drawn first image meets the image quality required by the application program. If the third resolution is greater than the first resolution, the second processor can read image data at a resolution above the first resolution, so the drawn first image exceeds the image quality required by the application program, improving the image quality of the first image.
Optionally, the second resolution is equal to the first resolution. In this case, because the third resolution is greater than the second resolution, the third resolution is also greater than the first resolution. When the application program requires the first image to be rendered at the first resolution, the second processor may instead render it based on the third resolution, so the rendered first image exceeds the image quality required by the application program, improving the image quality of the first image.
Optionally, the electronic device further includes a third processor, the image data of the first image at the third resolution is generated by the third processor, and the third processor has permission to access the shared memory. Data sharing among the first, second, and third processors can thus be realized through the shared memory with zero copies, improving processing efficiency. After the second processor generates the image data of the first image at the second resolution, the third processor generates the image data at the third resolution, actively invoking the computing power of the third processor to meet the computing demands of high-resolution and/or ultra-high-resolution images while reducing the power consumption of the electronic device and mitigating severe heating. The third processor also shares the computational load of the second processor and shortens the time spent on rendering, improving rendering smoothness and alleviating stuttering.
Optionally, after the second processor writes the image data of the first image at the second resolution into the shared memory, and before the second processor reads the image data of the first image at the third resolution from the shared memory, the method further includes: the third processor reads the image data of the first image under the second resolution from the shared memory; the third processor generates image data of the first image at a third resolution based on the image data of the first image at the second resolution; and the third processor writes the image data of the first image under the third resolution into the shared memory.
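The third processor's read-upscale-write pass described above can be sketched as follows. A toy nearest-neighbor 2x upscaler stands in for the super-resolution step; all names are illustrative, and the shared memory is modeled as a plain dict:

```python
def third_processor_pass(shm, upscale):
    """Read the second-resolution data, super-resolve it, and write the
    third-resolution result back into the same shared memory."""
    low = shm["second_res"]
    high = upscale(low)
    shm["third_res"] = high
    return high

def nearest_2x(image):
    # Toy stand-in for the super-resolution model: duplicate each pixel
    # horizontally and each row vertically (nearest-neighbor 2x).
    return [[p for p in row for _ in (0, 1)] for row in image for _ in (0, 1)]

shm = {"second_res": [[1, 2], [3, 4]]}
third_processor_pass(shm, nearest_2x)
```

After the pass, `shm` holds both resolutions of the image, and the second processor can read the `third_res` entry without having computed it.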
Optionally, after the second processor writes the image data of the first image at the second resolution into the shared memory, and before the third processor reads that image data from the shared memory, the method further includes: the first processor sends a first notification to the third processor, the first notification instructing the third processor to read the image data of the first image at the second resolution from the shared memory. Because the first processor monitors the image-data reads and writes of the second processor, it can promptly trigger the third processor to read the second-resolution image data once the second processor has written it, improving efficiency.
Optionally, after the third processor writes the image data of the first image at the third resolution into the shared memory, and before the second processor reads that image data from the shared memory, the method further includes: the first processor sends a second notification to the second processor, the second notification instructing the second processor to read the image data of the first image at the third resolution from the shared memory. Because the first processor monitors the image-data reads and writes of the third processor, it can promptly trigger the second processor to read the third-resolution image data once the third processor has written it, improving efficiency.
Optionally, the electronic device further includes a third processor, and image data of the first image at the third resolution is generated by the third processor; the third processor operates an artificial intelligence super-resolution model, and the third processor performs super-resolution rendering on the image data of the first image at the second resolution by using the artificial intelligence super-resolution model to generate the image data of the first image at the third resolution.
Optionally, before the third processor performs super-resolution rendering on the image data of the first image at the second resolution by using the artificial intelligence super-resolution model, the method further includes: the first processor sends the first resolution and the second resolution to the third processor; the third processor determines a super-resolution factor for the artificial intelligence super-resolution model based on the first resolution and the second resolution, and the model performs super-resolution rendering on the image data of the first image at the second resolution based on that factor. The artificial intelligence super-resolution model may support at least one super-resolution factor, each of which performs a different resolution conversion: given image data of the same input resolution, different factors produce output image data of different resolutions. To ensure that the resolution of the model's output is not less than the first resolution, the first processor sends the first and second resolutions to the third processor, and the third processor selects a super-resolution factor that is not less than the ratio between the first resolution and the second resolution.
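The factor selection above can be sketched as picking the smallest supported factor that still reaches the target resolution (a hypothetical helper, not the patented algorithm):

```python
def choose_super_resolution_factor(first_res, second_res, factors):
    """Pick the smallest supported factor that is not less than the ratio
    between the first (target) and second (rendered) resolutions, so the
    model's output resolution is not less than the first resolution."""
    needed = max(first_res[0] / second_res[0], first_res[1] / second_res[1])
    usable = sorted(f for f in factors if f >= needed)
    if not usable:
        raise ValueError("no supported factor reaches the first resolution")
    return usable[0]
```

Choosing the smallest sufficient factor keeps the model's workload (and the output it must write back to shared memory) as small as possible while still meeting the resolution requirement.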
Optionally, before the third processor performs super-resolution rendering on the image data of the first image at the second resolution by using the artificial intelligence super-resolution model, the method further includes: the first processor initializes the artificial intelligence super-resolution model, where the initialization determines whether the model can run and whether it can run normally. The initialization includes runtime detection, model loading, model compiling, and memory configuration: runtime detection determines whether the model can run, while model loading, model compiling, and memory configuration determine whether it can run normally.
Optionally, the electronic device further includes a third processor, and the image data of the first image at the third resolution is generated by the third processor. Before the first processor sends the rendering instruction to the second processor, the method further includes: the first processor allocates a shared memory from a hardware buffer, the third processor also having permission to access the shared memory; the first processor sends the pointer address of the shared memory to the third processor and the second processor, and the third processor and the second processor read and write image data in the shared memory based on that pointer address. Sharing image data between the second and third processors through shared memory achieves zero copies of the shared data and improves processing efficiency. Because the first processor distributes the pointer address of the shared memory after allocating it, the accuracy of data reading and writing is improved.
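The allocation and pointer distribution might be modeled as below (a toy sketch; the "pointer address" is just a byte offset into a simulated hardware buffer, and all names are hypothetical):

```python
class HardwareBuffer:
    """Toy hardware buffer from which the first processor carves a shared
    region; a 'pointer address' is modeled as a starting byte offset."""
    def __init__(self, size):
        self.size = size
        self.next_free = 0

    def allocate(self, nbytes):
        if self.next_free + nbytes > self.size:
            raise MemoryError("hardware buffer exhausted")
        addr = self.next_free
        self.next_free += nbytes
        return addr

buf = HardwareBuffer(1 << 20)
shared_addr = buf.allocate(1280 * 720)  # one byte per pixel, illustrative
# The first processor distributes the same pointer address to both the
# second and third processors, which then read and write at that address.
pointer_table = {"second_processor": shared_addr, "third_processor": shared_addr}
```

Because both processors receive the identical address, reads and writes land in the same region and no copy between per-processor buffers is ever needed.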
Optionally, before the first processor sends the rendering instruction to the second processor, the method further includes: the first processor reduces a resolution of the first image from a first resolution to a second resolution.
Optionally, the electronic device further includes a third processor, and the image data of the first image at the third resolution is generated by the third processor. The third processor has a super-resolution factor indicating the difference between the second resolution and the third resolution, and the third resolution is the same as the first resolution. The first processor reducing the resolution of the first image from the first resolution to the second resolution includes: the first processor reduces the first resolution to the second resolution based on the super-resolution factor. In this embodiment, the artificial intelligence super-resolution model may support at least one super-resolution factor, each performing a different resolution conversion: given image data of the same input resolution, different factors produce output image data of different resolutions. The first processor therefore reduces the resolution of the first image based on the super-resolution factor, so that the reduction matches one of the model's factors and the subsequent resolution increase by the model remains accurate.
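Matching the downscale to the model's factor guarantees that upscaling restores the first resolution exactly, as this sketch shows (hypothetical helper name):

```python
def reduce_for_super_resolution(first_res, factor):
    """The first processor lowers the first resolution by the model's
    super-resolution factor so that the later upscale restores it exactly."""
    w, h = first_res
    if w % factor or h % factor:
        raise ValueError("factor must divide the first resolution evenly")
    return (w // factor, h // factor)

second_res = reduce_for_super_resolution((2560, 1440), 2)
```

Scaling `second_res` back up by the same factor of 2 yields (2560, 1440) again, i.e. the third resolution equals the first resolution as the embodiment requires.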
Optionally, the third processor is a neural network processor or a digital signal processor.
In a second aspect, the present application provides a rendering method applied to a second processor of an electronic device, where the electronic device runs an application program and further includes a first processor, and the application program issues a rendering command to the first processor, the rendering command instructing the second processor to render a first image based on a first resolution. The method includes: the second processor receives a rendering instruction sent by the first processor, the rendering instruction instructing the second processor to render the first image; the second processor generates image data of the first image at a second resolution based on the rendering instruction, the second resolution being not greater than the first resolution; the second processor writes the image data of the first image at the second resolution into a shared memory, where both the first processor and the second processor have permission to access the shared memory; the second processor reads image data of the first image at a third resolution from the shared memory, the third resolution being greater than the second resolution; and the second processor generates the first image based on the image data of the first image at the third resolution.
In this embodiment, after the second processor writes the generated image data of the first image at the second resolution into the shared memory, it may read the image data of the first image at the third resolution from the shared memory without having to generate that data itself, saving the computing power of the second processor, shortening the time spent on rendering, reducing power consumption, improving rendering smoothness, and mitigating severe device heating and stuttering. Because the third resolution is greater than the second resolution, the second processor obtains image data of relatively high resolution and can draw the first image from it, improving the image quality of the first image. Since both the first processor and the second processor have permission to access the shared memory, data can be shared between them through the shared memory with zero copies, improving processing efficiency.
Optionally, after the second processor writes the image data of the first image at the second resolution into the shared memory, before the second processor reads the image data of the first image at the third resolution from the shared memory, the method further includes: the second processor receives a first notification sent by the first processor, the first notification is sent after the image data of the first image at the third resolution is successfully written into the shared memory, the image data of the first image at the third resolution is written by a third processor in the electronic device, and the first notification is used for instructing the second processor to read the image data of the first image at the third resolution from the shared memory.
Optionally, before the second processor receives the rendering instruction sent by the first processor, the method further includes: the second processor receives a notification sent by the first processor, the notification carrying the pointer address of the shared memory, which improves the accuracy of data reading and writing.
In a third aspect, the present application provides an electronic device, comprising: a first processor, a second processor, and a memory; wherein the memory is configured to store one or more computer program codes comprising computer instructions that, when executed by the first processor and the second processor, cause the first processor and the second processor to perform the rendering method.
In a fourth aspect, the present application provides a chip system, where the chip system includes a program code, and when the program code runs on an electronic device, a first processor and a second processor in the electronic device are caused to execute the rendering method.
In a fifth aspect, the present application provides a processor, which is a second processor, the second processor comprising a processing unit and a memory; wherein the memory is configured to store one or more computer program codes, the computer program codes comprising computer instructions, which when executed by the second processor, cause the second processor to perform the rendering method.
In a sixth aspect, the present application provides a computer storage medium comprising computer instructions that, when run on an electronic device, cause a second processor in the electronic device to perform the rendering method described above.
In a seventh aspect, the present application provides a rendering method applied to rendering processing of a first image by an electronic device, where the electronic device runs an application program and includes a first processor, a second processor, and a third processor. The method includes: the first processor receives a rendering command issued by the application program, the rendering command indicating that a first image having a second resolution is to be rendered; the third processor obtains first image data, the first image data being image data of the first image at a first resolution; the third processor performs super-resolution rendering on the first image data to obtain second image data, the second image data being image data of the first image at the second resolution, where the first resolution is less than the second resolution; and the second processor generates the first image having the second resolution based on the second image data.
Optionally, the method further comprises: the first processor determines a resolution of the first image as a first resolution in response to the rendering command, and instructs the third processor to obtain the first image data.
Optionally, the method further comprises: the first processor determines the resolution of the first image to be a second resolution in response to the rendering command; the first processor reduces the resolution of the first image from the second resolution to the first resolution; the first processor instructs the third processor to obtain the first image data.
Optionally, the method further includes: the first processor selects, from a scaling range, a scaling factor related to the super-division factor of the third processor, the super-division factor indicating the multiple relationship between the first resolution and the second resolution. The first processor reducing the resolution of the first image from the second resolution to the first resolution includes: the first processor reduces the resolution of the first image using the scaling factor so that the resolution drops from the second resolution to the first resolution, where the product of the scaling factor and the super-division factor equals 1.
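The product-equals-1 relation pins the scaling factor to the reciprocal of the super-division factor, as this sketch shows (hypothetical helper; the scaling range is modeled as a (low, high) pair):

```python
def pick_scaling_factor(super_division_factor, scaling_range):
    """Select the scaling factor whose product with the super-division
    factor equals 1, rejecting it if it falls outside the permitted range."""
    lo, hi = scaling_range
    scaling = 1.0 / super_division_factor
    if not lo <= scaling <= hi:
        raise ValueError("required scaling factor outside permitted range")
    return scaling
```

For example, a super-division factor of 2 forces a scaling factor of 0.5: the downscale halves each dimension and the later super-resolution doubles it back, so the round trip is lossless in resolution.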
Optionally, an AI super-resolution model runs in the third processor, and the third processor performing super-resolution rendering on the first image data to obtain the second image data includes: the third processor performs super-resolution rendering on the first image data using the AI super-resolution model.
Optionally, the method further includes: the first processor initializes the AI super-resolution model, where the initialization determines whether the model can run and whether it can run normally. The initialization includes runtime detection, model loading, model compiling, and memory configuration: runtime detection determines whether the AI super-resolution model can run, while model loading, model compiling, and memory configuration determine whether it can run normally.
Optionally, the electronic device invokes one or more frame buffers when performing rendering processing on the first image, and the method further comprises: the first processor instructs the third processor to obtain the first image data so as to perform rendering processing on the currently processed frame buffer after determining that the resolution of the first image is the second resolution, that the rendering mode of the first image is deferred rendering, and that the currently processed frame buffer is not the last frame buffer.
Optionally, the method further comprises: the first processor instructs the second processor to obtain fourth image data so as to perform rendering processing on the currently processed frame buffer after determining that the resolution of the first image is the second resolution, that the rendering mode of the first image is deferred rendering, and that the currently processed frame buffer is the last frame buffer, wherein the fourth image data is the image data of the first image at the second resolution.
Optionally, the electronic device invokes one or more frame buffers when performing the rendering process on the first image, and the method further includes: the first processor determines a main frame buffer in a process of performing rendering processing on the first image; the main frame buffer is the frame buffer with the largest number of executed rendering operations in the rendering process of the first image by the electronic equipment; and when the first processor determines that the resolution of the first image is the second resolution, the rendering mode of the first image is forward rendering and the currently processed frame buffer is the main frame buffer, the first processor instructs the third processor to obtain the first image data so as to perform rendering processing on the currently processed frame buffer.
Optionally, the method further comprises: when the first processor determines that the resolution of the first image is the second resolution, the rendering mode of the first image is forward rendering and the currently processed frame buffer is not the main frame buffer, the first processor instructs the second processor to obtain fourth image data to perform rendering processing on the currently processed frame buffer, wherein the fourth image data is the image data of the first image with the second resolution.
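The four optional branches above reduce to a small decision table. The sketch below is illustrative Python with invented names, and it assumes the first image's resolution has already been determined to be the second resolution: under deferred rendering every FB except the last one goes to the third processor, while under forward rendering only the main FB does; all other cases stay on the second processor (GPU):

```python
# Illustrative dispatch (hypothetical names) for the claimed branches.

def pick_processor(rendering_mode, is_last_fb=False, is_main_fb=False):
    if rendering_mode == "deferred":
        # Last FB is rendered by the GPU from second-resolution data.
        return "second_processor" if is_last_fb else "third_processor"
    if rendering_mode == "forward":
        # Only the main FB is handed to the NPU/DSP for super-resolution.
        return "third_processor" if is_main_fb else "second_processor"
    raise ValueError("unknown rendering mode: %s" % rendering_mode)

assert pick_processor("deferred", is_last_fb=False) == "third_processor"
assert pick_processor("forward", is_main_fb=True) == "third_processor"
```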
Optionally, the first processor determining the main frame buffer during the rendering process performed on the first image comprises: the first processor determines the main frame buffer based on the frame buffer with the largest number of rendering operations executed during the rendering process of the second image; the rendering process of the second image precedes the rendering process of the first image.
Optionally, before determining the main frame buffer during the rendering process performed on the first image, the method further comprises: determining the number of drawing instructions executed on each frame buffer during the rendering process of the second image; and determining the frame buffer on which the number of executed drawing instructions is the largest as the main frame buffer.
Optionally, the second image is an image of a frame previous to the first image.
Optionally, the first processor is a central processing unit, the second processor is a graphics processor, and the third processor is a neural network processor or a digital signal processor.
It should be appreciated that the description of technical features, solutions, benefits, or similar language in this application does not imply that all of the features and advantages may be realized in any single embodiment. Rather, it is to be understood that the description of a feature or advantage is intended to include the specific features, aspects or advantages in at least one embodiment. Therefore, the descriptions of technical features, technical solutions or advantages in the present specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions and advantages described in the present embodiments may also be combined in any suitable manner. One skilled in the relevant art will recognize that an embodiment may be practiced without one or more of the specific features, aspects, or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
Drawings
FIG. 1 is a schematic diagram of a rendered image provided in the present application;
FIG. 2 is a schematic diagram of another rendered image provided in the present application;
FIG. 3 is a schematic diagram of an interface for a resolution enhancement setting provided in the present application;
FIG. 4 is a schematic diagram of yet another rendered image provided in the present application;
FIG. 5 is a schematic diagram of yet another rendered image provided in the present application;
FIG. 6 is a schematic diagram of determining the number of drawcalls in different frame buffers provided in the present application;
FIG. 7 is a schematic diagram of another determination of the number of drawcalls in different frame buffers provided in the present application;
FIG. 8 is a schematic diagram of yet another rendered image provided in the present application;
FIG. 9 is a schematic diagram of a rendering method provided in the present application;
FIG. 10 is a flow chart of a rendering method provided in the present application;
FIG. 11 is a signaling diagram of a rendering method provided in the present application;
FIG. 12 is a schematic diagram of a memory access provided in the present application;
FIG. 13 is another signaling diagram of a rendering method provided in the present application;
FIG. 14 is a schematic diagram of another memory access provided in the present application;
FIG. 15 is a signaling diagram of another rendering method provided in the present application.
Detailed Description
The terms "first," "second," and "third," etc. in the description and claims of the present application and the description of the drawings are used for distinguishing between different objects and not for limiting a particular order.
In the embodiments of the present application, the words "exemplary" or "such as" are used herein to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.
While a user uses the electronic device, the electronic device may display frames of images to the user through the display screen. Taking a video stream as an example, one video stream may include multiple frames of images, and the electronic device may display each frame of image on the display screen in sequence so as to display the video stream. Image display can be triggered by an application program in the electronic device: the application program may send rendering commands for different images to the electronic device, and the electronic device renders the images in response to the rendering commands and displays the images based on their rendering results.
In some implementations, each frame of image corresponds to a plurality of Frame Buffers (FBs), each FB being used to store the rendering results of a portion of the elements of the image, e.g., to store image data of that portion of elements, which is used to draw the corresponding elements. For example, if the image includes elements such as a person and a tree, image data of the person and image data of the tree may be stored in FBs, and the electronic device may perform rendering based on the image data stored in the FBs. In this embodiment, an optional structure of the electronic device and the FB-based rendering process are shown in fig. 1; the electronic device may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), an internal memory and a display screen, where the internal memory may also be referred to as the memory.
In fig. 1, after an application (such as a game application or a video application) installed in the electronic device is started, the application may display images through the display screen. In the process of displaying an image, the application issues a rendering command, and the CPU may intercept the rendering command issued by the application. In response to the rendering command, the CPU creates corresponding FBs in the memory for the rendering processing of the ith frame image; for example, in fig. 1 the CPU may create 3 FBs for the ith frame image, denoted FB0, FB1 and FB2 respectively. The CPU may issue rendering instructions to the GPU according to the rendering command, and the GPU may respond to the rendering instructions by performing the corresponding rendering. As an example, the GPU performs rendering processing on FB1 and FB2 for the ith frame image in response to the rendering instructions. After the rendering processing is completed, FB1 and FB2 store the image data of the ith frame image; for example, FB1 may store the image data of one part of the elements of the ith frame image (referred to as image data 1), and FB2 may store the image data of another part of the elements (referred to as image data 2). When the ith frame image is displayed, the GPU fuses (or draws) image data 1 and image data 2 into FB0, so that FB0 stores the complete image data of the ith frame image. The GPU reads the image data of the ith frame image from FB0 and displays the ith frame image on the display screen based on that image data.
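The FB flow of fig. 1 can be modeled roughly as follows. This is an illustrative Python sketch with invented names; real compositing naturally operates on pixel buffers rather than element labels:

```python
# Rough model of fig. 1: FB1 and FB2 each hold image data for part of the
# elements of a frame, which is fused into FB0 to form the complete image data.

def compose_into_fb0(frame_buffers):
    """Fuse the partial rendering results of FB1, FB2, ... into FB0."""
    fb0 = []
    for name in sorted(frame_buffers):
        fb0.extend(frame_buffers[name])
    return fb0

# Image data 1 (person) lives in FB1, image data 2 (tree) in FB2.
assert compose_into_fb0({"FB1": ["person"], "FB2": ["tree"]}) == ["person", "tree"]
```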
However, with the development of display technology, the screen refresh rate and the image resolution of display screens have advanced to higher levels; for example, the screen refresh rate of the display screen has advanced from 60Hz (Hertz) to 90Hz, and from 90Hz to 120Hz, while the image resolution has developed from 720P (1280 × 720) to 1080P (1920 × 1080), and from 1080P to 2k (2560 × 1440). Rendering on electronic devices is thus developing towards smoother operation, more complex scenes and higher image quality, which places higher demands on the computing power of electronic devices. Even if the electronic device uses a chip with strong computing power, rendering with only the CPU and the GPU can still incur excessive computing overhead, and this rendering mode can hardly meet the rapidly growing demand for computing power. When rendering is performed with only the CPU and the GPU, excessive computing resources are consumed, which increases the power consumption of the electronic device, causes severe heat generation, and may even cause stuttering during operation.
To solve the above problems, the present application provides a rendering method in which a processor other than the CPU and the GPU is configured in the electronic device, and control of the rendering process is adjusted from the CPU and the GPU alone to the CPU, the GPU and the other processor together. For example, the computing power of the configured processor may be greater than that of the GPU while its energy consumption is less than that of the GPU, and its control capability may be weaker than that of the CPU and the GPU. When the CPU, the GPU and the other processor control the rendering process, the other processor can generate high-resolution images and/or ultra-high-resolution images; its computing power is actively invoked to complete super-resolution rendering, meeting the computing-power requirement of high-resolution and/or ultra-high-resolution images, reducing the power consumption of the electronic device and alleviating severe heat generation. Moreover, the other processor can share the computation load of the GPU and shorten the time spent in the rendering process, thereby improving rendering smoothness and solving the problem of stuttering. Super-resolution rendering is used to raise the resolution of image data; the increase can be from low resolution to high or ultra-high resolution, or from high resolution to ultra-high resolution.
In some implementations, a Neural-network Processing Unit (NPU) is configured in the electronic device, the CPU, the GPU and the NPU control the rendering process, and the NPU can perform super-resolution rendering, so that the electronic device obtains image data of a high-resolution image or image data of an ultra-high-resolution image with the assistance of the NPU.
In some implementations, a Digital Signal Processor (DSP) is configured in the electronic device, the CPU, the GPU and the DSP control the rendering process, and the DSP can perform super-resolution rendering.
In some implementations, the NPU and the DSP are both configured in the electronic device, and the CPU, the GPU, the NPU and the DSP control the rendering process; the NPU and the DSP may each perform super-resolution rendering, and their super-resolution rendering may correspond to different resolutions. In one example, the NPU may correspond to the boost from low resolution to high resolution, and the DSP to the boost from low resolution to ultra-high resolution; in another example, the NPU may correspond to the boost from low resolution to ultra-high resolution, and the DSP to the boost from low resolution to high resolution; in yet another example, the NPU may correspond to the boost from low resolution to high resolution, and the DSP to the boost from high resolution to ultra-high resolution.
In some implementations, the NPU and the DSP may cooperate to complete super-resolution rendering; for example, the NPU and the DSP may jointly handle the boost from low resolution to high resolution, or jointly handle the boost from low resolution to ultra-high resolution, where cooperation means that the NPU and the DSP both participate in the super-resolution rendering process of one frame of image.
The low resolution can be 540P (960 × 540) or 720P, the high resolution can be 1080P, and the ultra-high resolution can be 2k or even above 2k. When the electronic device displays images through the display screen, the GPU can generate image data of 540P images and image data of 720P images, and the NPU can perform the boost from 540P to 1080P or from 720P to 1080P, so as to generate image data of 1080P images using the NPU; moreover, the NPU can also generate image data of images with a resolution of 2k or even above 2k.
In this embodiment, one usage scenario of the rendering method is that the electronic device can raise the resolution of the image required by the application program. When the resolution of the image required by the application program is low resolution, the electronic device can raise it to high resolution or ultra-high resolution so as to automatically improve the picture quality of the application program; or, when the resolution of the image required by the application program is high resolution, the electronic device raises it to ultra-high resolution, which can also automatically improve the picture quality of the application program.
As shown in fig. 2, the resolution of the image required by the application program is low resolution, and the electronic device may perform the rendering method to raise the low resolution of the image to high resolution, so as to automatically change the resolution of the image, thereby raising the picture quality. The processing process of the CPU, the NPU and the GPU comprises the following steps: a CPU intercepts a rendering command issued by an application program; the CPU responds to the rendering command and creates a corresponding FB for the rendering processing of the image in the memory; the CPU sends a rendering instruction to the GPU after obtaining that the resolution of the image is low resolution based on the rendering command; the GPU generates low-resolution image data based on the rendering instruction and sends the low-resolution image data to the NPU; the NPU carries out super-resolution rendering processing on the low-resolution image data to obtain high-resolution image data, and sends the high-resolution image data to the GPU; after receiving the high-resolution image data, the GPU performs User Interface (UI) rendering and post-processing based on the high-resolution image data to complete drawing of one frame of image. For example, the GPU performs rendering processing on the FB pointed by the rendering command based on the high-resolution image data to obtain a rendering result, which may be image data of a part of elements in the image. And after the rendering results of all the FBs are obtained, the GPU displays the drawn image on a display screen based on the rendering results.
In this embodiment, the electronic device may provide a setting interface, where the setting interface has a setting option of "resolution up", and the setting option is used to provide resolution selection. FIG. 3 is a schematic diagram of a setup interface, where in FIG. 3 the setup options provide a plurality of resolution choices, e.g., 1080P, 2k, greater than 2k, from which a user may select one; when the resolution of the application program is less than the resolution selected by the user, the electronic device executes the rendering method shown in fig. 2. Fig. 3 is only an example, and the setting options may also take other forms, as shown in the dashed box in fig. 3, and the present embodiment does not limit the setting options.
In addition, a whitelist can be set in the electronic device, and the whitelist records the application programs whose resolution is to be raised. After intercepting a rendering command issued by an application program, the CPU determines whether the application program is in the whitelist. If the application program is in the whitelist, the CPU can trigger, through the GPU, the NPU to raise the resolution; the NPU sends the image data with the raised resolution to the GPU, and the GPU performs rendering processing based on that image data. If the application program is not in the whitelist, the CPU triggers the GPU to perform rendering processing, and the resolution of the image data is the same as the resolution issued by the application program.
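The whitelist check can be sketched in a couple of lines. The package names and helper below are invented for illustration and do not appear in the patent:

```python
# Minimal sketch of the whitelist gate described above.

RESOLUTION_WHITELIST = {"com.example.game1", "com.example.videoapp"}

def should_raise_resolution(package_name):
    """True if the app that issued the rendering command is whitelisted."""
    return package_name in RESOLUTION_WHITELIST

assert should_raise_resolution("com.example.game1")
assert not should_raise_resolution("com.example.browser")
```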
Another usage scenario corresponding to the rendering method is as follows: the picture quality output by the electronic device is the same as the picture quality required by the application program, but the electronic device can reduce the resolution and then improve the resolution. As shown in fig. 4, the resolution of the image required by the application program may be high resolution, and the processing procedure of the CPU, NPU and GPU includes: a CPU intercepts a rendering command issued by an application program; the CPU responds to the rendering command and creates a corresponding FB for the rendering processing of the image in the memory; after the CPU extracts the resolution of the image from the rendering command to be high resolution, the CPU reduces the resolution of the image, and then the CPU can send a rendering instruction to the GPU; the GPU generates low-resolution image data based on the rendering instruction and sends the low-resolution image data to the NPU; the NPU carries out super-resolution rendering processing on the low-resolution image data to obtain high-resolution image data, and sends the high-resolution image data to the GPU; after receiving the high-resolution image data, the GPU performs UI rendering and post-processing based on the high-resolution image data to complete the drawing of one frame of image, the rendering result of the image is stored in the FB, and the GPU displays the drawn image on a display screen.
After the CPU reduces the resolution of the image, the NPU can restore the image data with high resolution through super-resolution rendering processing, and the GPU can restore the image with high resolution by utilizing the image data with high resolution, so that the image quality of the image drawn by the GPU is the same as that of the image required by an application program, the requirement of the application program on the image quality is met, and the rendering accuracy is improved. And the CPU reduces the data volume input into the NPU after reducing the resolution of the image, thereby improving the rendering accuracy and accelerating the rendering speed.
For example, when the resolution of the image is 1080P or above, the CPU may reduce the resolution of the image and then use the NPU to perform super-resolution rendering on the reduced-resolution image data, so that the rendering result after super-resolution rendering corresponds to the resolution before reduction. How much the resolution is reduced can be determined by the super-resolution capability of the NPU: if the capability is 2x super-resolution, the resolution of the 1080P image is reduced by a factor of 0.5. Note that the 2x and 0.5x factors apply to the side lengths of the resolution, not to the number of pixels of the image.
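The side-length point can be checked with a little arithmetic; the helper name below is illustrative:

```python
# Worked check: the 2x / 0.5x factors apply to the side lengths, not to the
# pixel count, so a 2x super-resolution step quadruples the number of pixels.

def pixel_count(width, height):
    return width * height

low_w, low_h = int(1920 * 0.5), int(1080 * 0.5)   # 1080P reduced per side
assert (low_w, low_h) == (960, 540)

# Pixel count falls to a quarter, not a half:
assert pixel_count(low_w, low_h) * 4 == pixel_count(1920, 1080)
```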
For the rendering method shown in fig. 4, the CPU may determine whether to reduce the resolution according to the resolution of the image required by the application program, for example, the resolution of the image required by the application program is 1080P or above 1080P, the CPU reduces the resolution of the image, and then the NPU performs super-resolution rendering on the image data with low resolution; if the resolution of the image required by the application program is less than 1080P, the CPU triggers the GPU to perform rendering processing under the condition that the resolution of the image is kept unchanged. For example, in the rendering method shown in fig. 4, the user clicks the game application (game 1 in fig. 4) on the desktop, the resolution of the image required by the game application is high resolution, and the electronic device executes the rendering method shown in fig. 4 after the game application is started.
In addition, when the resolution of the image required by the application program is high resolution, after the NPU receives the low-resolution image data sent by the GPU, the NPU may perform super-resolution rendering processing on the low-resolution image data to obtain ultra-high-resolution image data; the GPU performs drawing based on the ultra-high-resolution image data, and the picture quality of the image drawn by the GPU is higher than that of the image required by the application program. For example, if the resolution of the image required by the application program is 1080P, the CPU reduces it to 540P, the GPU generates 540P image data, the NPU feeds back 2k image data to the GPU, and the GPU can draw an image with a resolution of 2k, whose picture quality is higher than that of a 1080P image, thereby improving the picture quality of the application program.
Another usage scenario corresponding to the rendering method is as follows: in the running process of the application program, a user can adjust the picture quality of the application program, and the user can adjust the picture quality from low resolution to high resolution or ultrahigh resolution and also can adjust the picture quality from high resolution or ultrahigh resolution to low resolution. Fig. 5 shows that the user adjusts the picture quality from low resolution to high resolution, and the corresponding processing procedure after the resolution adjustment also changes, and the processing procedure is as follows:
1) a CPU intercepts a rendering command issued by an application program; and the CPU responds to the rendering command and creates a corresponding FB for the rendering processing of the image in the memory.
2) And after the CPU obtains that the resolution of the image is low resolution based on the rendering command, the CPU sends a rendering instruction to the GPU. And the GPU responds to the rendering instruction, performs rendering processing on the FBs to obtain a rendering result, and displays the image on the display screen based on the rendering result after the GPU completes the rendering processing of all the FBs.
3) After the user adjusts the picture quality of the application program to high resolution, the CPU may obtain, based on the rendering command, that the resolution of the image is high resolution, reduce the resolution of the image to low resolution, and then send a rendering instruction to the GPU; the GPU generates low-resolution image data based on the rendering instruction and sends the low-resolution image data to the NPU; the NPU performs super-resolution rendering processing on the low-resolution image data to obtain high-resolution image data, and sends the high-resolution image data to the GPU; after receiving the high-resolution image data, the GPU performs UI rendering and post-processing based on the high-resolution image data to complete the drawing of one frame of image, the rendering result of the image is stored in the FB, and the GPU displays the drawn image on the display screen.
In this embodiment, the CPU may specify an FB for super-resolution rendering. When processing the specified FB, the GPU may perform super-resolution rendering using the NPU and/or the DSP, e.g., use the NPU and/or the DSP to complete the boost from low resolution to high and/or ultra-high resolution, while other FBs are rendered by the GPU, which need not distinguish the resolution of the image.
The FBs on which super-resolution rendering is performed may differ depending on the rendering mode. Rendering modes include deferred rendering and forward rendering. In deferred rendering, the color information (color attribute) records not only color data but also the normal vector data and depth data of each pixel; up-sampled normal vector data has a high error probability in illumination calculation, and the correlations between the FBs other than FB0 are large (for example, the rendering result of one FB is bound in the next FB), so if super-resolution rendering were performed on only one FB, the error probability would be high when other FBs use that FB's rendering result. Therefore, in deferred rendering, super-resolution rendering can be performed on all FBs except FB0 among the FBs corresponding to each frame of image; that is, in deferred rendering, the FBs on which super-resolution rendering is performed are the FBs other than FB0 among all FBs corresponding to one frame of image. For example, when all FBs corresponding to one frame of image include FB0, FB1 and FB2, the FBs for super-resolution rendering include FB1 and FB2, and FB0 is the last FB among all FBs issued by the application program.
In forward rendering, the correlation between FBs is small, and super-resolution rendering can be performed on at least one of all the FBs; the CPU can determine the FB for super-resolution rendering according to the number of rendering operations. In one example, the CPU may determine the FBs whose number of rendering operations is greater than a preset number as the FBs for super-resolution rendering. In another example, the CPU may determine the FB with the largest number of rendering operations as the FB for super-resolution rendering, where the FB with the largest number of rendering operations may be the FB on which the GPU executes the largest number of drawcalls (drawing-related rendering instructions). The more drawcalls the GPU executes on an FB, the more rendering operations the GPU performs on that FB, the more computing resources are consumed when the GPU renders the FB, and the higher the resulting power consumption and heat generation; therefore, the FB with the largest number of rendering operations may be determined as the FB for super-resolution rendering. The FB with the largest number of rendering operations is the FB on which the largest number of drawcalls is executed. For convenience of explanation, the FB on which super-resolution rendering is performed in forward rendering is referred to as the main FB, and determining the FB with the largest number of drawcalls is explained below with an example.
In an embodiment, the CPU may be configured to receive a rendering command from an application program and issue a corresponding rendering instruction to the GPU according to the rendering command, so that the GPU performs the corresponding rendering according to the rendering instruction. As an example, the rendering command may include a glBindFrameBuffer() function, and one or more of glDrawElement, glDrawArray, glDrawElementInstanced and glDrawArrayInstanced. Correspondingly, the rendering instruction may also include a glBindFrameBuffer() function, and one or more of glDrawElement, glDrawArray, glDrawElementInstanced and glDrawArrayInstanced. The glBindFrameBuffer() function can be used to indicate the currently bound FB and bind the corresponding rendering operation to that FB. For example, glBindFrameBuffer(1) may indicate that the currently bound frame buffer is FB1, and the drawcalls that the GPU executes on FB1 include glDrawElement, glDrawArray, glDrawElementInstanced and glDrawArrayInstanced.
For example, the Nth frame image may use FB0, FB1 and FB2 as frame buffers. With reference to fig. 6, consider an example of a rendering command corresponding to FB1. The rendering command issued by the application program may include a glBindFrameBuffer(1) function, so as to bind the current rendering operation to FB1. After FB1 is bound, the rendering operation on FB1 can be indicated through a glDrawElement instruction. In this example, 1 glDrawElement instruction may correspond to 1 drawcall. In different implementations, more than one glDrawElement instruction may be executed on FB1, and correspondingly more than one drawcall may be executed on FB1.
When executing glBindFrameBuffer(1) according to the rendering command issued by the application program, the CPU may initialize counter 1. For example, by configuring a corresponding count flag bit for FB1 in the memory, initializing counter 1 can set the value of the flag bit to 0. Each subsequent time glDrawElement is executed on FB1, counter 1 counts up by 1, e.g., count1++ is executed. For example, after executing glDrawElement1-1, the CPU may execute count1++ on counter 1, so that the value of the flag bit storing the FB1 drawcall number changes from 0 to 1; at this point, the number of drawcalls executed on FB1 is 1. By analogy, the CPU can determine that the current count of counter 1 (e.g., the count may be A) is the number of drawcalls executed on FB1 in the rendering of the Nth frame image.
Similarly, for FB2, the GPU may bind FB2 via the glBindFrameBuffer(2) function. After that, the GPU indicates rendering operations on FB2 through instructions such as glDrawElement, glDrawArray, glDrawElementInstanced and glDrawArrayInstanced. Similar to FB1, the CPU can also initialize counter 2 when invoking glBindFrameBuffer(2) according to the rendering command issued by the application program, e.g., initialize count2 to 0. Each subsequent time glDrawElement is executed on FB2, counter 2 counts up by 1, e.g., count2++ is executed. After the rendering processing of the image on FB2 is completed, the CPU can determine that the number of drawcalls executed on FB2 in the rendering of the Nth frame image is the current count of counter 2 (e.g., the count may be B), as shown in fig. 7. After the rendering processing on FB1 and FB2 is completed, the value of the flag bit in the memory storing the FB1 drawcall number may be A, and the flag bit storing the FB2 drawcall number may be B.
In this example, the CPU may select the FB corresponding to the larger of the counts A and B as the main FB. For example, when A is larger than B, the CPU can determine that more Drawcalls are executed on FB1, and determine FB1 to be the main FB. Conversely, when A is smaller than B, the CPU can determine that more Drawcalls are executed on FB2, and determine FB2 to be the main FB.
In other examples, the CPU may determine an FB on which the number of executed Drawcalls is greater than a preset threshold to be the main FB; the CPU may likewise select, by counting, the FBs on which the number of executed Drawcalls exceeds the preset threshold, which will not be described in detail.
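The counting scheme described above can be sketched as follows. This is an illustrative simulation only, assuming one counter per FB that is reset on binding and incremented per draw call; the class and method names are hypothetical and merely model the glBindFrameBuffer/glDrawElement interception, not the actual driver code.

```python
from collections import defaultdict

class DrawcallCounter:
    """Toy model of the per-FB Drawcall counters maintained by the CPU."""
    def __init__(self):
        self.counts = defaultdict(int)     # per-FB Drawcall count fields
        self.current_fb = None

    def bind_framebuffer(self, fb_id):
        # Models intercepting glBindFrameBuffer(fb_id): initialize the counter.
        self.current_fb = fb_id
        self.counts[fb_id] = 0             # e.g. count1 = 0

    def draw_element(self):
        # Models intercepting one glDrawElement, i.e. one Drawcall: count up by 1.
        self.counts[self.current_fb] += 1  # e.g. count1++

    def main_fb(self):
        # The FB with the largest Drawcall count is selected as the main FB.
        return max(self.counts, key=self.counts.get)
```

For instance, binding FB1 and drawing three times, then binding FB2 and drawing once, leaves counts A = 3 and B = 1, so FB1 is selected as the main FB.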
Correspondingly, the rendering process performed by the CPU, the GPU and the NPU is as shown in fig. 8. The CPU triggers a rendering flow according to a rendering command issued by an application program, and the CPU creates the FBs. The application program in fig. 8 indicates rendering of a high-resolution image. After receiving the rendering command, the CPU judges whether the rendering mode is forward rendering or delayed rendering. If the rendering mode is forward rendering, the CPU determines whether the FB currently executing the rendering operation is FB1 (FB1 being the main FB); if it is FB1, the CPU reduces the resolution of the image and sends a rendering instruction to the GPU to instruct the GPU to generate low-resolution image data; the GPU sends the low-resolution image data to the NPU, and the NPU completes the super-resolution rendering. If the CPU determines that the rendering mode is delayed rendering, the CPU determines whether the FB currently performing the rendering operation is FB0; if the FB currently performing the rendering operation is FB1 or FB2, the CPU likewise reduces the resolution of the image and instructs the GPU to generate low-resolution image data and send it to the NPU, and the NPU completes the super-resolution rendering. After receiving the high-resolution image data sent by the NPU, the GPU performs UI rendering and post-processing based on the high-resolution image data to finish the drawing of one frame of image; the rendering result of the image is stored in the FB, and the GPU displays the drawn image on a display screen.
The two GPUs in fig. 8 may be the same GPU; they are drawn separately to illustrate that different processing is applied to different FBs under different rendering modes.
In some embodiments, under delayed rendering, when the FB currently performing the rendering operation is FB1 or FB2, the CPU reduces the resolution of the image, the GPU generates the low-resolution image data corresponding to FB1 or FB2, and the NPU then performs super-resolution rendering on the low-resolution image data corresponding to the two FBs. In other embodiments, under delayed rendering, the FB preceding FB0 corresponds to the image data of one frame of image; the GPU may generate the low-resolution image data of the FB preceding FB0, and the NPU performs super-resolution rendering on that low-resolution image data. That is, in delayed rendering, either the GPU generates low-resolution image data whenever the FB is other than FB0 and the NPU performs super-resolution rendering on it, or the GPU generates low-resolution image data only for the FB preceding FB0 and the NPU performs super-resolution rendering on that data.
The CPU can reduce the resolution of the image according to the super-resolution multiple of the NPU. The super-resolution multiple indicates the factor by which the NPU can raise the resolution: e.g., if the super-resolution multiple of the NPU is 2, the CPU can reduce the resolution of the image by a factor of 2; if the super-resolution multiple of the NPU is 4, the CPU may reduce the resolution of the image by a factor of 4. For example, if the super-resolution multiple of the NPU is 2, the CPU may reduce the resolution of the image from 1080P to 540P, and the NPU converts 540P to 1080P, completing super-resolution rendering from 540P to 1080P. If the super-resolution multiple of the NPU is 4, the CPU can reduce the resolution of the image from 2K to 540P, and the NPU converts 540P into 2K, completing super-resolution rendering from 540P to 2K; the CPU may also reduce the resolution of the image from 1080P to 270P, and the NPU accordingly converts 270P to 1080P.
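The resolution conversion above can be illustrated with a minimal stand-in for the NPU's super-resolution step. The sketch below uses a nearest-neighbour upscale purely to show the shape change from low resolution to high resolution; the patent's AI super-resolution model is a learned network, not this, and the function name is hypothetical.

```python
import numpy as np

def upscale_nearest(img: np.ndarray, factor: int) -> np.ndarray:
    # Repeat rows and columns `factor` times: (H, W, C) -> (H*f, W*f, C).
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

low = np.zeros((540, 960, 3), dtype=np.uint8)   # a 540P RGB frame
high = upscale_nearest(low, 2)                  # 2x super-resolution multiple
# high has the 1080P shape (1080, 1920, 3)
```

With a multiple of 4 the same 540P frame would come out at the 2160-line shape, matching the 540P-to-2K example above.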
In this embodiment, a schematic diagram corresponding to the rendering method is shown in fig. 9: an application program issues a rendering command for each frame of image; the CPU intercepts the rendering command issued by the application program and determines, based on the rendering command, that the resolution of the image corresponding to the current rendering command is high resolution; the CPU may then reduce the resolution of the image to a low resolution and send a rendering instruction to the GPU, and the GPU generates low-resolution image data based on the rendering instruction; the NPU completes the conversion from low resolution to high resolution based on an Artificial Intelligence (AI) super-resolution model to obtain high-resolution image data; the GPU performs UI rendering and post-processing based on the high-resolution image data and displays the image on a display screen.
Fig. 10 is a flowchart of the rendering method. In the rendering method shown in fig. 10, an AI super-resolution model runs in the NPU, and super-resolution rendering is completed using the AI super-resolution model; the method may include the following steps:
(1) after the application starts, determine FB1 to be the main FB, FB1 being the FB on which the GPU executes the largest number of Drawcalls;
(2) Initializing the AI super-resolution model, wherein the initialization comprises runtime inspection, model loading, model compiling and memory configuration;
(3) when the rendering mode is forward rendering, execute (4) when the currently processed FB is FB1; when the rendering mode is delayed rendering, execute (4) when the currently processed FB is an FB other than FB0;
(4) reducing the resolution of the image, the resolution of the image being reduced from a high resolution to a low resolution;
(5) generating low resolution image data;
(6) performing super-resolution rendering on the low-resolution image data by using an AI super-resolution model to obtain high-resolution image data;
(7) performing UI rendering and post-processing based on the high-resolution image data to finish drawing of a frame of image;
(8) displaying an image on a display screen;
(9) during the running of the application program, data can be collected to obtain a data sample set; the data sample set is used to train the AI super-resolution model to obtain an updated AI super-resolution model, which is then converted, using Conversion tools, into a model file recognizable by the electronic equipment.
The following describes the detailed process of the rendering method of this embodiment, taking as an example the determination of FB1 as the main FB during the rendering of the N-1 th frame image, followed by the rendering of the Nth frame image after the main FB is determined. The process is shown in fig. 11 and may include the following steps:
S101, the CPU determines FB1 as the main FB during the rendering of the N-1 th frame image. The main FB is the FB on which the largest number of rendering operations is executed during rendering; the FB executing the largest number of rendering operations may be the FB executing the largest number of Drawcalls.
This embodiment may count all FBs in one frame image to obtain the number of Drawcalls executed on each FB in that frame image, and then determine the FB executing the largest number of Drawcalls in that frame image to be the main FB; the main FB of the images following that frame image may be the same as the main FB of that frame image. Illustratively, the currently rendered frame image is taken as the Nth frame image. During the rendering of the previous frame (e.g., the N-1 th frame image), the CPU may determine FB1 as the main FB according to the number of Drawcalls executed on the different FBs of the N-1 th frame image, and use the main FB of the N-1 th frame image as the main FB of the Nth frame image, the N+1 th frame image, the N+2 th frame image, and so on.
S102, intercepting a rendering command of the Nth frame of image sent by an application program by the CPU.
S103, acquiring that the initial resolution of the Nth frame of image is high resolution and the FB is processed currently by the CPU based on the rendering command.
The CPU may acquire the width and height of the Nth frame image based on the rendering command; at a given resolution, the width and height of the image are fixed. For example, when the resolution of an image is 720P, its width is 1280 and its height is 720; when the resolution of an image is 1080P, its width is 1920 and its height is 1080. The initial resolution of the Nth frame image can therefore be determined from its width and height. Generally, an image of 1920 × 1080 or more is regarded as high resolution, and this embodiment can determine that the initial resolution of the image is high resolution when the width and height of the image reach 1920 × 1080 or more. The CPU can also acquire the identifier of the currently processed FB, such as FB1, based on the rendering command. The initial resolution of the Nth frame image and the currently processed FB may be derived from different rendering commands.
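The high-resolution check described above reduces to a simple comparison against the 1920 × 1080 threshold used in this embodiment. The following is an illustrative sketch, not the patent's actual implementation; the function name is hypothetical.

```python
# Threshold from the embodiment: 1920x1080 and above counts as high resolution.
HIGH_RES_WIDTH, HIGH_RES_HEIGHT = 1920, 1080

def is_high_resolution(width: int, height: int) -> bool:
    # High resolution when both the width and the height reach the threshold.
    return width >= HIGH_RES_WIDTH and height >= HIGH_RES_HEIGHT
```

For example, a 1080P image (1920 × 1080) qualifies as high resolution, while a 720P image (1280 × 720) does not.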
S104, the CPU initializes the AI super-resolution model. Through the initialization operation, the CPU determines whether to run the AI super-resolution model in the NPU and, if so, ensures that the AI super-resolution model can run normally.
The initialization of the AI super-resolution model comprises runtime check, model loading, model compiling and memory configuration, wherein the runtime check is used for determining whether the AI super-resolution model in the NPU is operated, and the model loading, the model compiling and the memory configuration are used for ensuring that the AI super-resolution model can normally operate.
In some implementations, the runtime check includes checking whether an NPU is configured in the electronic device and checking a resolution of the application; if the NPU is configured, an AI super-resolution model is operated in the NPU; if the resolution of the application program is high resolution or ultrahigh resolution, operating an AI super-resolution model; and if the NPU is not configured in the electronic equipment or the resolution of the application program is low resolution, the AI super-resolution model is forbidden to run, the CPU issues a rendering instruction to the GPU, and the GPU responds to the rendering instruction to perform rendering.
Model loading converts the AI super-resolution model into a model file recognizable by the NPU and loads the model file into memory in the initialization stage; model compiling verifies that the model file can run successfully; memory configuration allocates memory for the AI super-resolution model, where the allocated memory is used to store the input data and output data of the AI super-resolution model. In this embodiment, the memory allocated for the AI super-resolution model may be CPU memory, memory managed by a neural processing API (e.g., an SNPE-managed buffer), or shared memory.
The CPU memory may be memory allocated for the CPU, into which data used during the CPU's operation is written; the AI super-resolution model occupies part of the CPU memory as its own memory. Data interaction between the GPU and the NPU can pass through the CPU memory: when the GPU draws an image, the data in the CPU memory is written into the GPU memory (the memory allocated for the GPU), and the data is then read from the GPU memory for drawing. The shared memory may be memory shared by the CPU, the GPU and the NPU, from which the GPU can directly read data for drawing.
The above describes the case where the CPU triggers AI super-resolution model initialization upon determining that the initial resolution of the Nth frame image is high resolution; this is one way of initializing the AI super-resolution model. The AI super-resolution model may also be initialized at other times. In one example, the CPU initializes the AI super-resolution model after monitoring that the application has started. Some application programs render images rarely and occupy little of the GPU, so the CPU can maintain a whitelist of application programs: after monitoring that an application has started, the CPU determines whether the application is in the whitelist, and initializes the AI super-resolution model if it is. In another example, the CPU initializes the AI super-resolution model when the rendering mode is forward rendering and the currently processed FB is FB1, or when the rendering mode is delayed rendering and the currently processed FB is not FB0.
S105, the CPU obtains the rendering mode from the configuration file of the application program.
Rendering modes include delayed rendering and forward rendering; the FBs on which super-resolution rendering is performed differ between the two modes. In delayed rendering, all FBs corresponding to each frame of image except FB0 can be super-resolution rendered. Upon determining that the FB currently executing the rendering command is not FB0, the CPU performs super-resolution rendering on that FB.
In forward rendering, the computational power consumed by the GPU during rendering is concentrated in the rendering of the main FB, and within the main FB rendering, the GPU's computation is concentrated in the Fragment Shaders (FS). It can therefore be inferred that in forward rendering the main FB, which may be the FB executing the largest number of Drawcalls, should be super-resolution rendered. In this embodiment, the CPU determined FB1 to be the main FB during the rendering of the N-1 th frame image, and the currently processed FB is FB1, so the CPU performs super-resolution rendering on FB1.
If the rendering mode is forward rendering and the currently processed FB is not the FB1, the CPU can send a rendering instruction to the GPU, and the GPU performs rendering processing on the current FB; if the rendering mode is delayed rendering and the currently processed FB is FB0, the CPU may send a rendering instruction to the GPU for rendering the current FB.
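The FB-selection logic described above, deciding which FBs take the reduced-resolution and super-resolution path, can be sketched as a single predicate. This is an illustrative sketch with hypothetical names; the mode strings and FB identifiers follow the embodiment's examples.

```python
def use_super_resolution(mode: str, current_fb: int, main_fb: int) -> bool:
    """Return True when the current FB should take the super-resolution path."""
    if mode == "forward":
        # Forward rendering: only the main FB (e.g. FB1) is super-resolved.
        return current_fb == main_fb
    if mode == "delayed":
        # Delayed rendering: every FB except FB0 is super-resolved.
        return current_fb != 0
    return False
```

FBs for which this predicate is false are rendered directly by the GPU, as described above.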
S106, if the rendering mode is forward rendering and the FB currently processed is FB1, the CPU reduces the initial resolution of the image to low resolution based on the super-division multiple; if the rendering mode is delayed rendering and FB is not FB0 (e.g., FB1 and FB2), the CPU reduces the initial resolution of the image to a low resolution based on the super-division multiple.
The CPU can reduce the resolution of the image by reducing the width and height of the image. The CPU may refer to a super-resolution multiple of the AI super-resolution model when reducing the resolution of the image, wherein the AI super-resolution model may perform a conversion of the image data from a low resolution to a high resolution, the high resolution being greater than the low resolution, for example, the high resolution may be one of 2 times, 3 times, and 4 times of the low resolution, i.e., the super-resolution multiple of the AI super-resolution model is one of 2 times, 3 times, and 4 times. Taking 540P as an example, a 1080P image is obtained after the 540P image is subjected to 2-time super-resolution rendering, and a 2k image is obtained after the 540P image is subjected to 4-time super-resolution rendering.
In this embodiment, the CPU may reduce the resolution of the image according to the super-resolution multiple of the AI super-resolution model. The CPU may reduce the resolution of the image by a scaling factor, which is applied by reducing the width and height of the image. Representing the image by its width and height as Res = w × h, the scaling factor r takes values r ∈ (0, 1), that is, between 0 and 1, and the reduced-resolution image is Res_L = (r × w) × (r × h); the scaling factor used is related to the super-resolution multiple of the AI super-resolution model. For example, if the initial resolution of an image is 1080P, its width and height are 1920 × 1080; with a super-resolution multiple of 2, the AI super-resolution model can complete the conversion from 540P to 1080P, and the width and height of a 540P image are 960 × 540, meaning the 1080P image is reduced by a factor of 2, corresponding to a scaling factor of 0.5. If the super-resolution multiple of the AI super-resolution model is 4, the model can complete the conversion from 270P to 1080P, and the 1080P image is reduced by a factor of 4, corresponding to a scaling factor of 0.25. The scaling factor refers to the scaling of the side lengths of the image: a scaling factor of 0.5 scales both the width and the height of the image by 0.5, and when the width and height are both scaled by 0.5, the number of pixels of the image is scaled by 0.25.
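The reduction above amounts to taking the scaling factor r as the reciprocal of the super-resolution multiple and applying it to the width and the height. A minimal sketch, with a hypothetical function name:

```python
def reduce_resolution(width: int, height: int, super_multiple: int):
    """Reduce width and height by r = 1/super_multiple (pixels scale by r*r)."""
    r = 1.0 / super_multiple           # e.g. multiple 2 -> scaling factor 0.5
    return int(width * r), int(height * r)

# 1080P (1920x1080) with a 2x multiple reduces to 960x540, i.e. 540P;
# with a 4x multiple it reduces to 480x270, i.e. 270P.
```

Note the pixel count scales by r², so a 2× multiple quarters the number of pixels the GPU has to shade.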
The super-resolution multiple indicates the super-resolution capability of the NPU, i.e., the factor by which the NPU raises the resolution; the CPU can determine the super-resolution multiple during the initialization of the AI super-resolution model. If the CPU does not determine the super-resolution multiple during initialization, the CPU may send the high resolution and the low resolution of the image, i.e., the resolution before the reduction and the resolution after the reduction, to the NPU, and the NPU selects a matching super-resolution multiple based on them.
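The NPU-side matching just described can be sketched as comparing the width and height ratios of the two resolutions against the multiples the model supports. This is a hypothetical helper, assuming a fixed set of supported multiples; it is not from the patent.

```python
def match_super_multiple(high_wh, low_wh, supported=(2, 3, 4)):
    """Pick the super-resolution multiple relating (w, h) before and after reduction."""
    w_ratio = high_wh[0] // low_wh[0]
    h_ratio = high_wh[1] // low_wh[1]
    if w_ratio == h_ratio and w_ratio in supported:
        return w_ratio
    raise ValueError("no supported super-resolution multiple matches")
```

For example, a 1920 × 1080 high resolution paired with a 960 × 540 low resolution matches the 2× multiple.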
S107, the CPU notifies the GPU to generate low-resolution image data. In some examples, the super-resolution multiple of the NPU may be fixed, and the resolution of the image required by each application program may also be fixed; for example, if the super-resolution multiple of the NPU is 2 and the resolution required by each application is 1080P, the CPU reduces the resolution to 540P. Because the super-resolution multiple and the image resolution are fixed, the GPU already knows that the resolution is reduced to 540P, so when the CPU notifies the GPU to generate low-resolution image data, the notification need not carry the low resolution.
In some examples, the super-resolution of the NPU may be a fixed multiple, but the resolution of the image required by each application in the CPU is not fixed, for example, the resolution of the image required by some applications is 1080P, and the resolution of the image required by some applications is 2k, so that the reduced low resolution of the CPU may be different, and the notification sent by the CPU to the GPU may carry the low resolution.
In some examples, the super-resolution multiple of the NPU may not be fixed, that is, the super-resolution multiple of the NPU is multiple, the CPU may select one of the super-resolution multiples to reduce the resolution, and the notification sent by the corresponding CPU to the GPU may carry the low resolution.
The CPU may further send a notification to the NPU carrying the resolutions of the Nth frame image before and after reduction, as in step S107'. Step S107' is optional and is executed, for example, when the super-resolution multiple of the NPU is not fixed.
S107', the CPU notifies the NPU of the high resolution and the low resolution of the Nth frame image. From the high resolution and the low resolution, the NPU can learn how the resolution of the Nth frame image was changed, so that when the NPU performs super-resolution rendering using the AI super-resolution model, the model can determine the super-resolution multiple and restore image data consistent with the high resolution of the Nth frame image. Of course, the NPU may also restore image data with a resolution higher than that of the Nth frame image.
S108, the GPU generates image data with low resolution, for example, the GPU generates RGB image data with low resolution, and the RGB image data comprise R channel data, G channel data and B channel data.
S109, the GPU sends the image data with low resolution to the NPU.
S110', NPU determines the super-resolution multiple of the AI super-resolution model.
And S110, the NPU performs super-resolution rendering on the low-resolution image data based on the super-resolution multiple by using the AI super-resolution model to obtain the high-resolution image data.
The super-resolution multiple of the AI super-resolution model can be used as the super-resolution multiple of the NPU. In some examples, the super-resolution factor of the AI super-resolution model is a fixed factor, omitting step S110'. In some examples, the super-resolution multiple of the AI super-resolution model may not be fixed, such as the super-resolution multiple of the AI super-resolution model includes 2 times and 4 times, and the NPU performs step S110 'to determine the currently adopted super-resolution multiple according to the high resolution and the low resolution of the nth frame image notified in step S107'.
In this embodiment, the AI super-resolution model may be obtained by training on historical image data, with low-resolution image data as input and high-resolution image data as output, and with high-resolution images rendered by the GPU used as references to validate the model during training. Low resolution and high resolution are relative terms; they need only satisfy the relationship that the high resolution is greater than the low resolution. For example, the low resolution is 540P and the high resolution is 1080P; or the low resolution is 720P and the high resolution is 2K.
The AI super-resolution model can be trained offline using a training framework such as TensorFlow or PyTorch. However, some electronic devices do not support such training frameworks; for example, the model formats supported on mobile phones, such as the Qualcomm SNPE framework and NNAPI (Android Neural Networks API), differ from TensorFlow and PyTorch: the Qualcomm SNPE framework supports the DLC format, and NNAPI supports the TensorFlow Lite format. If the AI super-resolution model is trained offline with a framework such as TensorFlow or PyTorch, the model in TensorFlow or PyTorch format can be converted into a format such as DLC or TensorFlow Lite for convenient use on devices such as mobile phones.
After the AI super-resolution model is trained, it can be updated to adapt to image changes. For example, the electronic device can obtain the super-resolution rendering effect of the AI super-resolution model and adjust the model's parameters according to that effect; the parameter adjustment may be offline or online. The device performing the parameter adjustment may be the electronic device that performs super-resolution rendering using the AI super-resolution model, or another electronic device that sends the adjusted parameters to it; for example, a computer adjusts the parameters of the AI super-resolution model and sends the adjusted parameters to a mobile phone. The electronic equipment can collect data samples during the rendering processes of the application program to obtain a data sample set, and adjust the AI super-resolution model using the samples in that set.
And S111, the NPU sends the high-resolution image data to the GPU.
S112, the GPU carries out rendering processing on the FB which is currently processed based on the image data with high resolution to obtain a rendering result, such as image data of elements in the image of the Nth frame.
And S113, after obtaining the rendering results of all the FBs, the GPU displays the Nth frame of image on the display screen based on the rendering results.
The rendering method can be used in any scenario in which the GPU renders images, for example in game application programs, home design application programs, modeling application programs, augmented reality application programs and virtual reality application programs. Performing super-resolution rendering with the AI super-resolution model running in the NPU lets the computing power of the NPU take over part of the GPU's computing load, so the power consumption of the electronic equipment is lower and the rendering time is shorter.
In this embodiment, the image data interaction between the GPU and the NPU may be completed using GPU memory and CPU memory. The GPU memory may be private memory of the GPU, which the CPU and the NPU are prohibited from accessing; the GPU, however, can read image data from the GPU memory by calling functions and the like, and the image data read by the GPU may be written into the CPU memory or used when the GPU draws an image. The GPU memory can therefore serve as both the interaction memory and the working memory of the GPU.
The CPU may designate a storage space (e.g., a first area) of the CPU memory as the NPU input memory and the NPU output memory: the NPU input memory is where the input data used by the NPU for super-resolution rendering is written, and the NPU output memory is where the output data of the NPU's completed super-resolution rendering is written. The NPU input memory and the NPU output memory may be the same area of the CPU memory, and the CPU may set the space occupied by the NPU in the CPU memory according to the data amounts of the NPU's input data and output data. Both the GPU and the NPU can access the CPU memory. The corresponding memory access process shown in fig. 12 may include the following steps:
S200, the CPU designates a first area in the CPU memory as the NPU input memory and the NPU output memory. The first area serves as the NPU input memory when data is input to the NPU and as the NPU output memory when data is output by the NPU; the size of the first area can be determined according to the data volume processed by the NPU, which is not described in detail here.
S201, the CPU informs the NPU of the pointer address of the first area, and the NPU can read image data from the NPU input memory based on the pointer address and write the image data into the NPU output memory based on the pointer address.
S202, the GPU writes the image data with low resolution into a GPU memory.
S203, the GPU informs the CPU that the image data with low resolution has been written into the GPU memory.
S204, the CPU notifies the GPU to write the low-resolution image data into the first area; the notification sent by the CPU carries the pointer address of the first area. Alternatively, the CPU may notify both the NPU and the GPU of the pointer address of the first area in step S201.
S205, the GPU reads the image data with low resolution from the GPU memory.
S206, the GPU writes the image data with low resolution into the first area based on the pointer address.
S207, the CPU informs the NPU to read the image data with low resolution from the first area.
S208, the NPU reads the image data of low resolution from the first area based on the pointer address.
S209, the NPU writes the high-resolution image data into the first area based on the pointer address.
After the NPU reads the image data with the low resolution, the image data with the low resolution is input into an AI super-resolution model operated in the NPU, the AI super-resolution model carries out super-resolution rendering on the image data with the low resolution, and the image data with the high resolution is output. The NPU then writes the high-resolution image data into the first area based on the pointer address.
S210, the CPU informs the GPU to read the image data with high resolution from the first area.
S211, the GPU reads the high-resolution image data from the first area based on the pointer address.
S212, the GPU writes the image data with high resolution into a GPU memory.
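The round trip through the first area in steps S200-S212 can be summarized with a toy simulation. The classes and functions below are illustrative only: real data movement uses pointer addresses into CPU memory, not Python objects, and the super-resolver here is a stand-in for the NPU's AI model.

```python
class FirstArea:
    """Models the shared first area of CPU memory (NPU input and output memory)."""
    def __init__(self):
        self.data = None

def gpu_to_npu_roundtrip(area: FirstArea, low_res_image, super_resolve):
    area.data = low_res_image          # S206: GPU writes low-res data to the area
    low = area.data                    # S208: NPU reads low-res data from the area
    area.data = super_resolve(low)     # S209: NPU writes the high-res result back
    return area.data                   # S211: GPU reads the high-res data

# e.g. with a stand-in super-resolver doubling the (width, height) pair:
# gpu_to_npu_roundtrip(FirstArea(), (960, 540), lambda wh: (wh[0] * 2, wh[1] * 2))
```

Because the same area serves as both input and output memory, the NPU's write in S209 simply overwrites the low-resolution data the GPU deposited in S206.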
In conjunction with the memory access flow shown in fig. 12 and the rendering method shown in fig. 11, fig. 13 shows another timing diagram of the rendering method, which may include the following steps:
s301, the CPU determines the FB1 as the main FB in the process of rendering the image of the (N-1) th frame.
S302, intercepting a rendering command of the Nth frame of image transmitted by an application program by the CPU.
S303, the CPU acquires that the initial resolution of the N frame image is high resolution and FB currently processed based on the rendering command.
S304, the CPU performs runtime inspection, model loading and model compiling on the AI super-resolution model.
S305, the CPU designates the first area in the CPU memory as an NPU input memory and an NPU output memory, completes the memory configuration of the AI super-resolution model, and completes the initialization of the AI super-resolution model through the steps S304 and S305.
S306, the CPU informs the NPU of the pointer address of the first area.
S307, the CPU obtains the rendering mode from the configuration file of the application program.
S308, if the rendering mode is forward rendering and the FB currently processed is FB1, the CPU reduces the initial resolution of the image to low resolution based on the super-division multiple; if the rendering mode is delayed rendering and FB is not FB0 (e.g., FB1 and FB2), the CPU reduces the initial resolution of the image to a low resolution based on the super-division multiple.
S309, the CPU informs the GPU to generate image data with low resolution.
S309', the CPU notifies the NPU of the high resolution and the low resolution of the nth frame image.
S310, the GPU generates low-resolution image data.
S311, the GPU writes the image data with the low resolution into a GPU memory.
S312, the GPU informs the CPU that the image data with low resolution has been written into the GPU memory.
S313, the CPU informs the GPU to write the image data with low resolution into the first area. The notification sent by the CPU carries the pointer address of the first area.
S314, the GPU reads the low-resolution image data from the GPU memory.
S315, the GPU writes the low-resolution image data into the first area based on the pointer address.
S316, the CPU notifies the NPU to read the low-resolution image data from the first area.
S317, the NPU reads the low-resolution image data from the first area based on the pointer address.
S318', the NPU determines the super-resolution factor of the AI super-resolution model.
S318, using the AI super-resolution model, the NPU performs super-resolution rendering on the low-resolution image data based on the super-resolution factor to obtain high-resolution image data.
S319, the NPU writes the high-resolution image data into the first area based on the pointer address.
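The upscaling in this step can be illustrated with a stand-in for the AI super-resolution model. In the patent, an AI model running on the NPU produces the high-resolution data; nearest-neighbour pixel repetition is used below purely to show the shapes and data flow, not as the actual algorithm.

```python
# Illustrative stand-in for step S318: upscaling low-resolution image data by the
# super-resolution factor. A real implementation runs an AI super-resolution model
# on the NPU; nearest-neighbour repetition here only demonstrates the data flow.
import numpy as np

def upscale(low_res: np.ndarray, factor: int) -> np.ndarray:
    """Repeat each pixel `factor` times along height and width (H x W x 3 layout)."""
    return low_res.repeat(factor, axis=0).repeat(factor, axis=1)

low = np.zeros((540, 960, 3), dtype=np.float32)   # low-resolution frame
high = upscale(low, 2)                            # 1080 x 1920 x 3 high-resolution frame
```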
S320, the CPU informs the GPU to read the image data with high resolution from the first area.
S321, the GPU reads the high-resolution image data from the first area.
And S322, writing the high-resolution image data into a GPU memory by the GPU.
S323, the GPU performs rendering processing on the currently processed FB based on the high-resolution image data to obtain a rendering result, for example, image data of elements in the N-th frame image.
S324, after obtaining the rendering results of all FBs, the GPU displays the N-th frame image on the display screen based on the rendering results.
If the CPU designates one area of the CPU memory as the NPU input memory and another area as the NPU output memory, the CPU sends the pointer address of the NPU input memory and the pointer address of the NPU output memory to the NPU; the NPU reads the low-resolution image data based on the pointer address of the NPU input memory and writes the high-resolution image data based on the pointer address of the NPU output memory. Similarly, the CPU sends both pointer addresses to the GPU; the GPU writes the low-resolution image data based on the pointer address of the NPU input memory and reads the high-resolution image data based on the pointer address of the NPU output memory.
However, regardless of whether the image data is low-resolution or high-resolution, the GPU and the NPU complete their interaction through the GPU memory and the CPU memory. This means the image data is copied back and forth between the GPU memory and the CPU memory, which lengthens the data-acquisition time of the GPU and the NPU and, in turn, the overall rendering time.
To address this problem, in this embodiment the GPU, the CPU, and the NPU may exchange data through a shared memory. The shared memory is a memory that the GPU, the CPU, and the NPU can all access. It may serve as an external cache of the GPU, storing at least the low-resolution image data that the GPU provides as input to the NPU, and as a processing cache of the NPU, storing at least the high-resolution image data output by the NPU.
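The shared-memory idea above can be sketched as a single named region that both a producer (standing in for the GPU) and a consumer (standing in for the NPU) access directly, so no copy between private memories is needed. Python's `multiprocessing.shared_memory` is used only as an analogy; on-device, the region would come from a hardware buffer, and the names here are illustrative.

```python
# Conceptual sketch of the shared memory: one region that the producer (GPU role)
# writes and the consumer (NPU role) reads in place, with no intermediate copy
# through a separate private memory.
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=16)  # the shared region
try:
    shm.buf[:4] = b"\x01\x02\x03\x04"                   # producer writes low-res data
    view = shared_memory.SharedMemory(name=shm.name)    # consumer attaches by name
    data = bytes(view.buf[:4])                          # consumer reads the same bytes
    view.close()
finally:
    shm.close()
    shm.unlink()
```

Attaching by name plays the role of the pointer address that the CPU distributes to the GPU and the NPU in the steps below.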
The memory access process of the GPU and NPU using the shared memory is shown in fig. 14, and may include the following steps:
S401, the CPU applies for a shared memory from a Hardware Buffer. The CPU may request the shared memory from a Hardware Buffer in Random Access Memory (RAM). In one application mode, the CPU applies to the Hardware Buffer for the storage space of a GPU SSBO (Shader Storage Buffer Object); after a Fragment Shader or a Compute Shader in the GPU produces the low-resolution image data, the data is written into the GPU SSBO. The high-resolution image data output by the NPU may also be written into this shared memory.
Since the Hardware Buffer serves as a memory, the CPU, the GPU, and the NPU impose certain requirements on its format when accessing it. If the shared memory is requested as storage space accessible to a Fragment Shader or a Compute Shader in the GPU, the format of the Hardware Buffer may be designated as AHARDWAREBUFFER_FORMAT_BLOB.
In another application mode, the CPU may apply for two shared memories from the Hardware Buffer: one stores the low-resolution image data produced by the GPU, as input data for the NPU; the other stores the output data of the NPU, such as the high-resolution image data obtained by super-resolution rendering, as input data for the GPU in UI rendering. The two shared memories use the same format, for example AHARDWAREBUFFER_FORMAT_BLOB. The requested size of a shared memory may be determined from the resolution of the image as width × height × bytes per pixel. For example, if the image data is of float type (4 bytes per value) with 3 channels, the requested size is width × height × 3 × 4, where width and height are determined by the image resolution; for a 1080P image, the requested size is 1920 × 1080 × 3 × 4.
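The sizing rule just described can be computed directly. The function name and defaults below are illustrative; the formula itself (width × height × 3 channels × 4 bytes for float data) follows the 1080P example in the text.

```python
# Sketch of the shared-memory sizing rule: width x height x channels x bytes per
# value. Defaults assume float (4-byte) RGB data, as in the 1080P example above.

def shared_buffer_bytes(width: int, height: int,
                        channels: int = 3, bytes_per_value: int = 4) -> int:
    return width * height * channels * bytes_per_value

size_1080p = shared_buffer_bytes(1920, 1080)  # 1920 * 1080 * 3 * 4 = 24883200 bytes
```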
S402, the CPU informs the GPU of the pointer address of the shared memory, and the shared memory is used as a storage memory of the image data with low resolution.
S403, the CPU notifies the NPU of the pointer address of the shared memory, so as to use the shared memory as a storage memory for high-resolution image data.
S404, the GPU writes the image data with low resolution into the shared memory based on the pointer address of the shared memory.
S405, the CPU informs the NPU to read the image data with low resolution from the shared memory.
S406, the NPU reads the low-resolution image data from the shared memory based on the pointer address of the shared memory.
S407, the NPU writes the high-resolution image data into the shared memory based on the pointer address of the shared memory.
S408, the CPU informs the GPU to read the image data with high resolution from the shared memory.
S409, the GPU reads the high-resolution image data from the shared memory based on the pointer address of the shared memory.
If the CPU applies for two shared memories from the Hardware Buffer, one stores the low-resolution image data produced by the GPU as input data for the NPU, and the other stores the output data of the NPU, such as the high-resolution image data obtained by super-resolution rendering, as input data for the GPU in UI rendering. The CPU sends the pointer addresses of both shared memories to the GPU and the NPU so that they can read and write image data in the shared memories.
With the memory access flow shown in fig. 14, the GPU and the NPU can share data through the shared memory, achieving zero-copy of shared data between the GPU and the NPU and improving processing efficiency. In conjunction with the memory access flow shown in fig. 14 and the rendering method shown in fig. 11, fig. 15 shows another timing diagram of the rendering method, which may include the following steps:
S501, during rendering of the (N-1)-th frame image, the CPU determines FB1 as the main FB.
S502, the CPU intercepts the rendering command of the N-th frame image issued by the application program.
S503, based on the rendering command, the CPU obtains the initial (high) resolution of the N-th frame image and the FB currently being processed.
S504, the CPU performs runtime checking, model loading, and model compiling for the AI super-resolution model.
S505, the CPU applies for a shared memory from the Hardware Buffer of the RAM; the shared memory serves as storage space for the low-resolution and high-resolution image data, and applying for it completes the memory configuration of the AI super-resolution model. Steps S504 and S505 together complete the initialization of the AI super-resolution model.
S506, the CPU informs the GPU and the NPU of the pointer address of the shared memory.
S507, the CPU obtains the rendering mode from the configuration file of the application program.
S508, if the rendering mode is forward rendering and the currently processed FB is FB1, the CPU reduces the initial resolution of the image to the low resolution based on the super-resolution factor; if the rendering mode is deferred rendering and the FB is not FB0 (e.g., FB1 or FB2), the CPU likewise reduces the initial resolution of the image to the low resolution based on the super-resolution factor.
S509, the CPU notifies the GPU to generate the low-resolution image data.
S509', the CPU notifies the NPU of the high resolution and the low resolution of the N-th frame image.
S510, the GPU generates low-resolution image data.
S511, the GPU writes the image data with low resolution into the shared memory based on the pointer address of the shared memory.
S512, the CPU informs the NPU to read the image data with low resolution from the shared memory.
S513, the NPU reads the low-resolution image data from the shared memory based on the pointer address of the shared memory.
S514', the NPU determines the super-resolution factor of the AI super-resolution model.
S514, using the AI super-resolution model, the NPU performs super-resolution rendering on the low-resolution image data based on the super-resolution factor to obtain high-resolution image data.
S515, the NPU writes the high-resolution image data into the shared memory based on the pointer address of the shared memory.
S516, the CPU informs the GPU to read the image data with high resolution from the shared memory.
S517, the GPU reads the high-resolution image data from the shared memory based on the pointer address of the shared memory.
S518, the GPU performs rendering processing on the currently processed FB based on the high-resolution image data to obtain a rendering result, for example, image data of elements in the N-th frame image.
S519, after obtaining the rendering results of all FBs, the GPU displays the N-th frame image on the display screen based on the rendering results.
Some embodiments of the present application provide an electronic device, comprising: a first processor, a second processor, and a memory; wherein the memory is configured to store one or more computer program codes comprising computer instructions that, when executed by the first processor and the second processor, cause the first processor and the second processor to perform the rendering method.
Some embodiments of the present application provide a chip system, which includes program code, and when the program code runs on an electronic device, causes a first processor and a second processor in the electronic device to execute the rendering method.
Some embodiments of the present application provide a processor, the processor being a second processor, the second processor comprising a processing unit and a memory; wherein the memory is configured to store one or more computer program codes, the computer program codes comprising computer instructions, which when executed by the second processor, cause the second processor to perform the rendering method.
Some embodiments of the present application provide a computer storage medium comprising computer instructions that, when run on an electronic device, cause a second processor in the electronic device to perform the rendering method described above.
The present embodiments also provide a control apparatus comprising one or more processors, a memory for storing one or more computer program codes comprising computer instructions, which when executed by the one or more processors, perform the above method. The control device may be an integrated circuit IC or may be a system on chip SOC. The integrated circuit may be a general-purpose integrated circuit, a field programmable gate array FPGA, or an application specific integrated circuit ASIC.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this embodiment, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, each functional unit in the embodiments of the present embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present embodiment may substantially or partially contribute to the prior art, or all or part of the technical solutions may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments. And the aforementioned storage medium includes: various media that can store program code, such as flash memory, removable hard drive, read-only memory, random-access memory, magnetic or optical disk, etc.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (25)

1. A rendering method is applied to an electronic device, the electronic device runs an application program, the electronic device comprises a first processor and a second processor, and the method comprises the following steps:
the first processor receives a rendering command sent by the application program, wherein the rendering command is used for instructing the second processor to render a first image based on a first resolution;
the first processor sending rendering instructions to the second processor, the rendering instructions for instructing the second processor to render the first image;
the second processor generating image data of the first image at a second resolution based on the rendering instructions, the second resolution not greater than the first resolution;
the second processor writes the image data of the first image under the second resolution into a shared memory, and the first processor and the second processor have the authority of accessing the shared memory;
the second processor reads image data of the first image at a third resolution from the shared memory, wherein the third resolution is greater than the second resolution;
the second processor generates the first image based on image data of the first image at the third resolution.
2. The method of claim 1, wherein if the rendering mode of the application is forward rendering, the rendering instruction corresponds to a first frame buffer, and the number of drawing instructions executed by the first frame buffer is greater than a preset threshold;
and if the rendering mode of the application is deferred rendering, the rendering instruction corresponds to a frame buffer other than the last frame buffer among all the frame buffers delivered by the application.
3. The method of claim 2, wherein the first frame buffer is the frame buffer, among all the frame buffers, in which the largest number of drawing instructions is executed.
4. The method of claim 2 or 3, wherein prior to the first processor sending rendering instructions to the second processor, the method further comprises:
and the first processor acquires the rendering mode of the application program from the configuration file of the application program.
5. The method of claim 1, wherein the rendering instructions are configured to direct the second processor to render the first image based on the second resolution, the second resolution being less than the first resolution.
6. The method according to any of claims 1 to 5, wherein the second resolution is less than the first resolution;
the third resolution is the same as the first resolution, or the third resolution is greater than the first resolution.
7. The method according to any of claims 1 to 4, wherein the second resolution is equal to the first resolution.
8. The method of any of claims 1-7, wherein the electronic device further comprises a third processor, wherein image data of the first image at the third resolution is generated by the third processor, and wherein the third processor has access to the shared memory.
9. The method of claim 8, wherein after the second processor writes the image data of the first image at the second resolution to the shared memory, and before the second processor reads the image data of the first image at the third resolution from the shared memory, the method further comprises:
the third processor reads the image data of the first image at the second resolution from the shared memory;
the third processor generating image data of the first image at the third resolution based on image data of the first image at the second resolution;
and the third processor writes the image data of the first image under the third resolution into the shared memory.
10. The method of claim 9, wherein after the second processor writes the image data of the first image at the second resolution to the shared memory and before the third processor reads the image data of the first image at the second resolution from the shared memory, the method further comprises:
the first processor sends a first notification to the third processor, where the first notification is used to instruct the third processor to read the image data of the first image at the second resolution from the shared memory.
11. The method of claim 9 or 10, wherein after the third processor writes the image data of the first image at the third resolution to the shared memory and before the second processor reads the image data of the first image at the third resolution from the shared memory, the method further comprises:
the first processor sends a second notification to the second processor, where the second notification is used to instruct the second processor to read the image data of the first image at the third resolution from the shared memory.
12. The method of any of claims 1-11, wherein the electronic device further comprises a third processor, image data of the first image at the third resolution being generated by the third processor;
an artificial intelligence super-resolution model runs in the third processor, and the third processor performs super-resolution rendering on the image data of the first image at the second resolution by using the artificial intelligence super-resolution model to generate the image data of the first image at the third resolution.
13. The method of claim 12, wherein the third processor, prior to super-resolution rendering the image data of the first image at the second resolution using the artificial intelligence super-resolution model, further comprises:
the first processor sending the first resolution and the second resolution to the third processor;
the third processor determines a super-resolution factor of the artificial intelligence super-resolution model based on the first resolution and the second resolution, and the artificial intelligence super-resolution model performs super-resolution rendering on the image data of the first image at the second resolution based on the super-resolution factor.
14. The method of claim 12 or 13, wherein before the third processor performs super-resolution rendering on the image data of the first image at the second resolution using the artificial intelligence super-resolution model, the method further comprises: the first processor initializing the artificial intelligence super-resolution model, the initialization being used to determine that the artificial intelligence super-resolution model can run and that it runs normally;
the initialization comprises runtime detection, model loading, model compiling, and memory configuration, wherein the runtime detection is used to determine that the artificial intelligence super-resolution model can run, and the model loading, model compiling, and memory configuration are used to determine that the artificial intelligence super-resolution model runs normally.
15. The method of any of claims 1-14, wherein the electronic device further comprises a third processor, image data of the first image at the third resolution being generated by the third processor;
before the first processor sends rendering instructions to the second processor, the method further comprises:
the first processor allocates the shared memory from a hardware buffer, and the third processor has the right to access the shared memory;
and the first processor sends pointer addresses of the shared memory to the third processor and the second processor, and the third processor and the second processor execute reading and writing of image data to the shared memory based on the pointer addresses.
16. The method of any of claims 1 to 15, wherein prior to the first processor sending the rendering instructions to the second processor, the method further comprises:
the first processor reduces a resolution of the first image from the first resolution to the second resolution.
17. The method of claim 16, wherein the electronic device further comprises a third processor, and image data of the first image at the third resolution is generated by the third processor; the third processor has a super-resolution factor, the super-resolution factor being used to indicate the difference between the second resolution and the third resolution;
the third resolution is the same as the first resolution, and the first processor reducing the resolution of the first image from the first resolution to the second resolution comprises: the first processor reducing the first resolution to the second resolution based on the super-resolution factor.
18. The method of any one of claims 8, 12, 15, 17, wherein the third processor is a neural network processor or a digital signal processor.
19. A rendering method applied to a second processor of an electronic device, the electronic device running an application program, the electronic device further including a first processor, the application program issuing rendering commands to the first processor, the rendering commands being used for instructing the second processor to render a first image based on a first resolution, the method comprising:
the second processor receives a rendering instruction sent by the first processor, wherein the rendering instruction is used for instructing the second processor to render the first image;
the second processor generating image data of the first image at a second resolution based on the rendering instructions, the second resolution not greater than the first resolution;
the second processor writes the image data of the first image under the second resolution into a shared memory, and the first processor and the second processor have the authority of accessing the shared memory;
the second processor reads image data of the first image at a third resolution from the shared memory, wherein the third resolution is greater than the second resolution;
the second processor generates the first image based on image data of the first image at the third resolution.
20. The method of claim 19, wherein after the second processor writes the image data of the first image at the second resolution to the shared memory and before the second processor reads the image data of the first image at the third resolution from the shared memory, the method further comprises:
the second processor receives a first notification sent by the first processor, the first notification is sent after image data of the first image at the third resolution is successfully written into the shared memory, the image data of the first image at the third resolution is written by a third processor in the electronic device, and the first notification is used for instructing the second processor to read the image data of the first image at the third resolution from the shared memory.
21. The method of claim 19 or 20, wherein before the second processor receives the rendering instructions sent by the first processor, the method further comprises:
and the second processor receives a notification sent by the first processor, wherein the notification carries an address pointer of the shared memory.
22. An electronic device, characterized in that the electronic device comprises: a first processor, a second processor, and a memory; wherein the memory is to store one or more computer program codes comprising computer instructions which, when executed by the first processor and the second processor, cause the first processor and the second processor to perform the rendering method of any of claims 1 to 18.
23. A chip system, characterized in that the chip system comprises program code which, when run on an electronic device, causes a first processor and a second processor in the electronic device to perform a rendering method according to any of claims 1 to 18.
24. A processor, wherein the processor is a second processor, wherein the second processor comprises a processing unit and a memory; wherein the memory is for storing one or more computer program codes comprising computer instructions which, when executed by the second processor, cause the second processor to perform the rendering method of any of claims 19 to 21.
25. A computer storage medium comprising computer instructions which, when run on an electronic device, cause a second processor in the electronic device to perform a rendering method according to any one of claims 19 to 21.
CN202111552336.5A 2021-11-17 2021-12-17 Rendering method and device Active CN114998087B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310459132.XA CN116739879A (en) 2021-11-17 2021-12-17 Rendering method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2021113644149 2021-11-17
CN202111364414 2021-11-17

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202310459132.XA Division CN116739879A (en) 2021-11-17 2021-12-17 Rendering method and device

Publications (2)

Publication Number Publication Date
CN114998087A true CN114998087A (en) 2022-09-02
CN114998087B CN114998087B (en) 2023-05-05

Family

ID=83018070

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202310459132.XA Pending CN116739879A (en) 2021-11-17 2021-12-17 Rendering method and device
CN202111552336.5A Active CN114998087B (en) 2021-11-17 2021-12-17 Rendering method and device

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202310459132.XA Pending CN116739879A (en) 2021-11-17 2021-12-17 Rendering method and device

Country Status (1)

Country Link
CN (2) CN116739879A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023087827A1 (en) * 2021-11-17 2023-05-25 荣耀终端有限公司 Rendering method and apparatus
CN116309763A (en) * 2023-02-17 2023-06-23 珠海视熙科技有限公司 TOF camera depth calculation method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105427236A (en) * 2015-12-18 2016-03-23 魅族科技(中国)有限公司 Method and device for image rendering
CN107533843A (en) * 2015-01-30 2018-01-02 Dts公司 System and method for capturing, encoding, being distributed and decoding immersion audio
CN109410141A (en) * 2018-10-26 2019-03-01 北京金山云网络技术有限公司 A kind of image processing method, device, electronic equipment and storage medium
CN110111258A (en) * 2019-05-14 2019-08-09 武汉高德红外股份有限公司 Infrared excess resolution reconstruction image method and system based on multi-core processor
CN110152291A (en) * 2018-12-13 2019-08-23 腾讯科技(深圳)有限公司 Rendering method, device, terminal and the storage medium of game picture
CN113298712A (en) * 2021-05-21 2021-08-24 安谋科技(中国)有限公司 Image processing method, electronic device and readable medium thereof


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A.V. PASCAL GROSSET 等: "TOD-Tree: Task-Overlapped Direct Send Tree Image Compositing for Hybrid MPI Parallelism and GPUs", 《IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS》 *
AYUB A.GUBRAN 等: "Emerald:graphics modeling for SoC systems", 《2019 ACM/IEEE 46TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE》 *
M ANGLADA 等: "Rendering Elimination:Early Discard of Redundant Tiles in the Graphics Pipeline", 《2019 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA)》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023087827A1 (en) * 2021-11-17 2023-05-25 Honor Device Co., Ltd. Rendering method and apparatus
CN116309763A (en) * 2023-02-17 2023-06-23 Zhuhai Shixi Technology Co., Ltd. TOF camera depth computation method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
CN116739879A (en) 2023-09-12
CN114998087B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
US10885607B2 (en) Storage for foveated rendering
US9024959B2 (en) Demand-paged textures
US20190130527A1 (en) Apparatus and method for non-uniform frame buffer rasterization
US9934551B2 (en) Split storage of anti-aliased samples
CN114998087B (en) Rendering method and device
US11908039B2 (en) Graphics rendering method and apparatus, and computer-readable storage medium
WO2017052781A1 (en) Efficient display processing with pre-fetching
US10198789B2 (en) Out-of-order cache returns
CN116185743B Dual-graphics-card comparison debugging method, device, and medium for the OpenGL interface
EP1721298A2 (en) Embedded system with 3d graphics core and local pixel buffer
CN116821040B (en) Display acceleration method, device and medium based on GPU direct memory access
CN110599564A (en) Image display method and device, computer equipment and storage medium
CN111080761A (en) Method and device for scheduling rendering tasks and computer storage medium
US10212406B2 (en) Image generation of a three-dimensional scene using multiple focal lengths
WO2023202367A1 (en) Graphics processing unit, system, apparatus, device, and method
US7382376B2 (en) System and method for effectively utilizing a memory device in a compressed domain
CN116137675A (en) Rendering method and device
US11636566B2 (en) Computing system and methods of operating the computing system
CN111460342A (en) Page rendering display method and device, electronic equipment and computer storage medium
EP4207047A1 (en) Rendering method and apparatus
US10678553B2 (en) Pro-active GPU hardware bootup
US20240070962A1 (en) Graphics processing method and system
WO2023202366A1 (en) Graphics processing unit and system, electronic apparatus and device, and graphics processing method
WO2023197284A1 (en) Saliency-based adaptive color enhancement
CN116894906A (en) Graphics rendering method and processor hardware architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant