CN116137675A - Rendering method and device

Rendering method and device

Info

Publication number
CN116137675A
CN116137675A CN202111552338.4A
Authority
CN
China
Prior art keywords
image
processor
resolution
rendering
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111552338.4A
Other languages
Chinese (zh)
Inventor
林淦
秦园
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Publication of CN116137675A
Legal status: Pending (current)


Classifications

    • H04N21/440281 Reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • G06T1/20 Processor architectures; processor configuration, e.g. pipelining
    • G06T1/60 Memory management
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N21/440263 Reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA

Abstract

The application provides a rendering method and a rendering device. A first processor receives a rendering command for a first image issued by an application program; a third processor performs rendering processing on the first image to obtain a processing result of the first image; and while the third processor renders the first image, a second processor draws a second image. Although the drawing of the first image is delayed, the drawing timing of the second processor can be advanced relative to waiting for the processing result of the first image, which improves smoothness. Moreover, the third processor can render the first image in parallel while the second processor draws the second image.

Description

Rendering method and device
The present application claims priority from Chinese Patent Application No. 202111362400.3, entitled "Rendering method and apparatus", filed with the Chinese Patent Office on November 17, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a rendering method and apparatus.
Background
With the development of display technology, image resolution has kept increasing, for example from 720P to 1080P and from 1080P to 2K, where P denotes the total number of rows of pixels (720P denotes 720 rows of pixels) and K denotes the total number of columns of pixels (for example, 2K denotes about 2000 columns of pixels). Rendering a high-resolution or ultra-high-resolution image consumes excessive computing power, so the computing power of an electronic device can hardly support the rendering requirement of such images, and the frame rate and fluency decrease. The frame rate is the frequency, in frames, at which images appear continuously on a display screen.
Disclosure of Invention
The rendering method and rendering device provided by the present application solve the problem that the frame rate and fluency decrease when an electronic device renders an image.
In order to achieve the above purpose, the present application adopts the following technical scheme:
In a first aspect, the present application provides a rendering method applied to rendering processing of a first image by an electronic device, where the electronic device runs an application program and includes a first processor, a second processor, and a third processor. The method includes: the first processor receives a rendering command for the first image, where the rendering command is issued by the application program; the third processor performs rendering processing on the first image to obtain a processing result of the first image; and the second processor draws a second image while the third processor renders the first image. After the first processor receives the rendering command for the first image issued by the application program, the third processor can perform rendering processing on the first image to obtain the processing result of the first image. Although the drawing of the first image is delayed, the drawing timing of the second processor can be advanced relative to waiting for the processing result of the first image, which improves smoothness. Moreover, the third processor can render the first image in parallel while the second processor draws the second image.
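For illustration only, the following is a minimal C++ sketch of the pipelining idea in the first aspect, expressed with std::async: while the "third processor" computes the processing result of frame N, the "second processor" draws frame N-1. All names here (ImageData, npuProcess, gpuDraw) are hypothetical stand-ins introduced for the sketch, not terms defined by this application.

```cpp
#include <future>
#include <iostream>
#include <optional>
#include <utility>
#include <vector>

using ImageData = std::vector<float>;   // stand-in for rendered pixel data

ImageData npuProcess(int frame) {       // "third processor": heavy rendering processing
    return ImageData(4, static_cast<float>(frame));
}

void gpuDraw(int frame, const ImageData&) {   // "second processor": draw and display
    std::cout << "displayed frame " << frame << "\n";
}

int main() {
    std::optional<std::pair<int, std::future<ImageData>>> pending;  // frame N-1 result
    for (int frame = 0; frame < 5; ++frame) {     // "first processor" receives commands
        // Kick off processing of frame N on the third processor ...
        std::future<ImageData> current =
            std::async(std::launch::async, npuProcess, frame);
        // ... and, in parallel, draw the previous frame on the second processor.
        if (pending) {
            gpuDraw(pending->first, pending->second.get());
        }
        pending = std::make_pair(frame, std::move(current));
    }
    if (pending) gpuDraw(pending->first, pending->second.get());   // flush last frame
    return 0;
}
```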
The second image may be the previous frame of the first image, the processing result of the first image may be image data of the first image, and the previous frame can be drawn while the third processor obtains the image data of the first image. Although the drawing of the first image is delayed, the drawing timing of the second processor can be advanced relative to waiting for the processing of the first image, improving smoothness. Moreover, when the second image is drawn, its processing result is already available, so the time spent waiting for the processing result of the second image is reduced or eliminated, the drawing efficiency of the second processor is improved, and the frame rate displayed on the display screen is increased.
Optionally, before the first processor receives the rendering command for the first image issued by the application program, the method further includes: the first processor determines a first frame buffer during processing of a third image, where the first frame buffer is the frame buffer, among all frame buffers issued by the application program, whose number of executed drawing instructions is greater than a preset threshold, the third image is the previous frame of the second image, and the second image is the previous frame of the first image; the second processor draws the third image to display the third image on the display screen; the first processor receives a rendering command for the second image issued by the application program; the third processor performs rendering processing on the second image to obtain a processing result of the second image; and while the third processor renders the second image, the second processor controls the display screen to continue displaying the third image. Although the display of the second image is delayed, the third image, which is related to the application program, can be displayed on the display screen, which reduces the occurrence of a black or blue screen and improves the user experience.
Optionally, the rendering command of the first image is used to instruct the second processor to render the first image based on a first resolution. Before the third processor performs rendering processing on the first image to obtain the processing result of the first image, the method further includes: the first processor sends a rendering instruction to the second processor, where the rendering instruction instructs the second processor to render the first image; the second processor generates image data of the first image at a second resolution based on the rendering instruction, where the second resolution is not greater than the first resolution; the second processor writes the image data of the first image at the second resolution into a first area of a first memory; and the third processor reads the image data of the first image at the second resolution from the first area. The third processor performing rendering processing on the first image to obtain the processing result of the first image includes: the third processor generates image data of the first image at a third resolution based on the image data of the first image at the second resolution, where the third resolution is greater than the second resolution. Because the third processor generates the image data of the first image at the third resolution after the second processor writes the image data at the second resolution into the first area of the first memory, the computing power of the second processor is saved, more resources are available to the second processor when drawing images, rendering smoothness is improved, and stalling is alleviated. Moreover, since the third resolution is greater than the second resolution, the third processor obtains image data at a relatively higher resolution, and when the second processor draws the first image it can do so based on that higher-resolution image data, improving the image quality of the first image.
Optionally, after the third processor generates the image data of the first image at the third resolution based on the image data of the first image at the second resolution, the method further includes: the third processor writes the image data of the first image at the third resolution into a second area of the first memory. The image data of the first image at the second resolution can be written into the first area of the first memory, and the image data at the third resolution can be written into the second area, so that the image data at the two resolutions are stored in separate areas and the possibility of data overwriting is reduced.
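As an illustrative sketch of the two-area layout described above (assumptions: a plain float buffer stands in for image data, and a nearest-neighbour upscale stands in for the artificial intelligence super-resolution model):

```cpp
#include <cstddef>
#include <vector>

struct FirstMemory {
    std::vector<float> area1;   // image data at the second (low) resolution
    std::vector<float> area2;   // image data at the third (high) resolution
};

// "Third processor" step: read from area 1, upscale, write the result into area 2.
void superResolve(FirstMemory& mem, std::size_t lowW, std::size_t lowH,
                  std::size_t factor) {
    const std::size_t highW = lowW * factor, highH = lowH * factor;
    mem.area2.assign(highW * highH, 0.0f);
    for (std::size_t y = 0; y < highH; ++y)
        for (std::size_t x = 0; x < highW; ++x)
            mem.area2[y * highW + x] = mem.area1[(y / factor) * lowW + (x / factor)];
}

int main() {
    FirstMemory mem;
    mem.area1.assign(960 * 540, 0.5f);   // "second processor" wrote 540p data here
    superResolve(mem, 960, 540, 2);      // area 2 now holds 1920x1080 data
    return 0;
}
```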
Optionally, after the second processor writes the image data of the first image at the second resolution into the first area of the first memory, the method further includes: the second processor writes the attachment resources of all frame buffers of the first image and the drawing instruction stream of the last frame buffer of the first image into a second memory, the second processor having the right to access the second memory. When the application program issues the rendering command of the first image, it also issues the attachment resources of all frame buffers of the first image and the drawing instruction stream of the last frame buffer of the first image, and this information is used when drawing the first image, so the second processor writes it into the second memory to ensure that the first image can be drawn accurately. The second processor has the right to access the second memory while the first processor and the third processor do not, which prevents the data in the second memory from being modified through the first or third processor and improves security.
Optionally, after the second processor writes the image data of the first image at the second resolution into the first area of the first memory and before the third processor reads that image data from the first area, the method further includes: the first processor sends a first notification to the third processor, where the first notification instructs the third processor to read the image data of the first image at the second resolution from the first area. The first processor can monitor the work of the second processor and, after the second processor writes the image data into the first area of the first memory, notify the third processor in time so that it can fetch the data promptly, improving efficiency.
Optionally, before the first processor sends the rendering instruction to the second processor, the method further includes: the first processor allocates a first memory from a hardware buffer, where the first memory includes a first area and a second area; and the first processor sends the pointer address of the first area and the pointer address of the second area to the third processor and the second processor. The first processor, the second processor, and the third processor all have the right to access the first memory; the third processor and the second processor read and write image data at the low resolution in the first area based on the pointer address of the first area, and read and write image data at the high resolution in the second area based on the pointer address of the second area. Because the first processor allocates the first memory from the hardware buffer, the second processor does not need to write the image data of the first image at the second resolution into another memory before writing it into the first memory; likewise, the third processor does not need to write the image data of the first image at the third resolution into another memory first. The second processor and the third processor can share data through the first memory, achieving zero copy of shared data between them and improving processing efficiency.
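A minimal sketch of how the first area and the second area might be carved out of one shared allocation and exposed through their pointer addresses so that both processors operate on the same memory without copies; the malloc-based buffer and the fixed RGBA8 sizes are assumptions for illustration, not the actual hardware buffer interface:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct SharedRegions {
    std::uint8_t* area1;      // pointer address of the first area (low resolution)
    std::size_t   area1Size;
    std::uint8_t* area2;      // pointer address of the second area (high resolution)
    std::size_t   area2Size;
};

SharedRegions allocateFirstMemory(std::size_t lowBytes, std::size_t highBytes) {
    auto* base = static_cast<std::uint8_t*>(std::malloc(lowBytes + highBytes));
    return SharedRegions{base, lowBytes, base + lowBytes, highBytes};
}

int main() {
    // e.g. a 960x540 low-resolution frame and a 1920x1080 high-resolution frame, RGBA8
    SharedRegions mem = allocateFirstMemory(960 * 540 * 4, 1920 * 1080 * 4);
    // The two pointer addresses would now be handed to the second and third
    // processors; both operate directly on this allocation (zero copy).
    std::free(mem.area1);     // area1 is the base of the whole allocation
    return 0;
}
```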
Optionally, before the second processor draws the second image, the method further includes: the first processor sends a second notification to the second processor, where the second notification instructs the second processor to read the image data of the second image at the third resolution from the second area.
Optionally, in the process that the third processor performs rendering processing on the first image, the second processor draws the second image includes: after the second processor writes the image data of the first image at the second resolution into the first area of the first memory, the second processor reads the image data of the second image at the third resolution from the second area; the second processor reads all frame-buffered accessory resources of the second image and the last frame-buffered drawing instruction stream of the second image from the second memory, and has the right to access the second memory; the second processor renders the second image based on the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and a rendering instruction stream of a last frame buffer of the second image. In this embodiment, the second memory may backup all the accessory resources of the frame buffer of the second image and the drawing instruction stream of the last frame buffer of the second image, so as to ensure that the second image can be drawn normally and ensure the accuracy of the second image.
Optionally, before the first processor sends the rendering instruction to the second processor, the method further includes: the first processor allocates a first memory from its own memory, where the first memory includes a first area and a second area; and the first processor sends the pointer address of the first area and the pointer address of the second area to the third processor. The first processor and the third processor have the right to access the first memory; the third processor reads image data at the low resolution from the first area based on the pointer address of the first area, and writes image data at the high resolution into the second area based on the pointer address of the second area.
Optionally, after the second processor generates the image data of the first image at the second resolution based on the rendering instruction and before the second processor writes that image data into the first area of the first memory, the method further includes: the second processor writes the image data of the first image at the second resolution into the second memory, the second processor having the right to access the second memory; the second processor sends a third notification to the first processor, where the third notification indicates that the image data of the first image at the second resolution has been successfully written into the second memory; in response to receiving the third notification, the first processor sends a fourth notification to the second processor, where the fourth notification instructs the second processor to write the image data of the first image at the second resolution into the first area, and the fourth notification carries the address pointer of the first area; and in response to receiving the fourth notification, the second processor reads the image data of the first image at the second resolution from the second memory.
In this embodiment, the first processor can monitor the read and write operations of the second processor on the image data, so that after the second processor writes the image data of the first image at the second resolution into the second memory, the second processor is promptly triggered to write that image data into the first memory, improving efficiency. Moreover, the fourth notification sent by the first processor carries the address pointer of the first area, and the second processor can write the image data of the first image at the second resolution into the first area based on that address pointer, improving writing accuracy.
Optionally, before the second processor draws the second image, the method further includes: the first processor sends a fifth notification to the second processor, where the fifth notification instructs the second processor to read the image data of the second image at the third resolution from the second area, and the fifth notification carries the address pointer of the second area; in response to the fifth notification, the second processor reads the image data of the second image at the third resolution from the second area; and the second processor writes the image data of the second image at the third resolution into the second memory, the second processor having the right to access the second memory.
Optionally, the second processor drawing the second image while the third processor performs rendering processing on the first image includes: the second processor reads the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and the drawing instruction stream of the last frame buffer of the second image from the second memory, the second processor having the right to access the second memory; and the second processor draws the second image based on the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and the drawing instruction stream of the last frame buffer of the second image. In this embodiment, the second memory can back up the attachment resources of all frame buffers of the second image and the drawing instruction stream of the last frame buffer of the second image, ensuring that the second image can be drawn normally and accurately.
Optionally, after the second processor reads the image data of the second image at the third resolution, the method further includes: the second processor sends a sixth notification to the first processor, where the sixth notification indicates that the second processor has finished reading the image data of the second image at the third resolution; and in response to the sixth notification, the first processor wakes up a first thread in the third processor, where the first thread can invoke an artificial intelligence super-resolution model to perform super-resolution rendering on the image data of the first image at the second resolution and generate the image data of the first image at the third resolution. After the second processor finishes reading the image data of the second image at the third resolution, it can send the sixth notification to the first processor; prompted by the sixth notification, the first processor wakes up the first thread in the third processor, which starts invoking the artificial intelligence super-resolution model to perform super-resolution rendering on the image data of the first image at the second resolution, so that the image data of the first image can be computed in parallel while the second image is drawn. The first thread may be an artificial intelligence thread running in the third processor; the artificial intelligence thread is created by the first processor, and the initialization of the artificial intelligence super-resolution model may be completed during its creation.
Optionally, before the first processor wakes up the first thread in the third processor, the method further includes: the first processor sends the first resolution of the first image and the second resolution of the first image to the third processor; the third processor determines a super-resolution multiple of the artificial intelligence super-resolution model based on the first resolution and the second resolution of the first image, and the artificial intelligence super-resolution model performs super-resolution rendering on the image data of the first image at the second resolution based on that multiple. The artificial intelligence super-resolution model can have at least one super-resolution multiple, and each multiple accomplishes a different resolution conversion: with image data of the same resolution as input, different multiples cause the model to output image data corresponding to different resolutions. The first processor sends the first resolution and the second resolution to the third processor so that the resolution corresponding to the image data output by the model is not less than the first resolution. The third processor selects a super-resolution multiple that is not smaller than the ratio between the first resolution and the second resolution, so that the resolution corresponding to the image data output by the artificial intelligence super-resolution model is not less than the first resolution.
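For illustration, a small sketch of the multiple-selection step: choose the smallest supported super-resolution multiple that is not smaller than the ratio between the first and second resolutions, so that the model's output is not less than the first resolution. The set of supported multiples is an assumed example.

```cpp
#include <optional>
#include <vector>

std::optional<int> pickSuperResolutionMultiple(int firstWidth, int secondWidth,
                                               const std::vector<int>& supported) {
    const double ratio = static_cast<double>(firstWidth) / secondWidth;
    std::optional<int> best;
    for (int m : supported)
        if (m >= ratio && (!best || m < *best)) best = m;   // smallest adequate multiple
    return best;                                            // empty if none is adequate
}

int main() {
    // 1080p requested, rendered at 540p, model supports 2x and 4x: picks 2.
    auto m = pickSuperResolutionMultiple(1920, 960, {2, 4});
    (void)m;
    return 0;
}
```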
Optionally, the method further includes: the first processor initializes the artificial intelligence super-resolution model, where the initialization is used to determine that the artificial intelligence super-resolution model is to be run and that it can run normally. The initialization includes runtime detection, model loading, model compiling, and memory configuration, and the runtime detection is used to determine that the artificial intelligence super-resolution model can run normally. The initialization of the artificial intelligence super-resolution model by the first processor may be completed during the initialization of the first thread, so that the first thread and the model are initialized in one initialization process.
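A hedged sketch of the initialization order listed above (runtime detection, model loading, model compiling, memory configuration); every function here is a hypothetical placeholder, since the application does not name a concrete inference runtime:

```cpp
#include <iostream>
#include <stdexcept>

bool runtimeAvailable()   { return true; }   // can the AI runtime be used at all?
void loadModel()          {}                 // read the super-resolution model file
void compileModel()       {}                 // compile it for the third processor
void configureMemory()    {}                 // pre-allocate input/output buffers

bool initSuperResolutionModel() {
    try {
        if (!runtimeAvailable()) return false;   // runtime detection
        loadModel();                             // model loading
        compileModel();                          // model compiling
        configureMemory();                       // memory configuration
        return true;                             // model is ready to be invoked
    } catch (const std::exception& e) {
        std::cerr << "init failed: " << e.what() << "\n";
        return false;
    }
}

int main() { return initSuperResolutionModel() ? 0 : 1; }
```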
Optionally, after the first processor wakes up the first thread in the third processor, the method further includes: the third processor monitors the running of the first thread while the first thread invokes the artificial intelligence super-resolution model to generate the image data of the first image at the third resolution; and after the third processor detects that the first thread has finished invoking the artificial intelligence super-resolution model, the third processor puts the first thread to sleep so as to switch it from the running state to the sleep state. Because the first thread finishes invoking the model after the model has generated the image data of the first image at the third resolution, the first thread is put to sleep after the rendering processing of the first image is completed, reducing resource occupation.
Optionally, putting the first thread to sleep after the third processor detects that the first thread has finished invoking the artificial intelligence super-resolution model includes: after detecting that the first thread has finished invoking the artificial intelligence super-resolution model, the third processor controls the first thread to switch from the running state to a run-complete state; and when the third processor detects that the duration for which the first thread stays in the run-complete state exceeds a preset duration, it puts the first thread to sleep so as to switch the first thread from the run-complete state to the sleep state. Exceeding the preset duration indicates that the first thread has performed no rendering processing for a period of time, and an idle first thread still occupies part of the resources while it runs; in this case the first thread is switched to the sleep state to reduce resource occupation.
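For illustration, a minimal sketch of the state handling described above: after the first thread finishes a model invocation it enters a run-complete state, and only when it has stayed idle longer than the preset duration is it put to sleep. The state names and the duration value are assumptions for the sketch.

```cpp
#include <chrono>

enum class ThreadState { Sleeping, Running, RunComplete };

struct AiThreadMonitor {
    ThreadState state = ThreadState::Sleeping;
    std::chrono::steady_clock::time_point finishedAt{};
    std::chrono::milliseconds idleLimit{500};          // "preset duration" (assumed)

    void onInvocationStart()  { state = ThreadState::Running; }
    void onInvocationEnd() {                            // model call finished
        state = ThreadState::RunComplete;
        finishedAt = std::chrono::steady_clock::now();
    }
    void poll() {                                       // called periodically by the monitor
        if (state == ThreadState::RunComplete &&
            std::chrono::steady_clock::now() - finishedAt > idleLimit)
            state = ThreadState::Sleeping;              // idle too long: sleep the thread
    }
};

int main() {
    AiThreadMonitor m;
    m.onInvocationStart();
    m.onInvocationEnd();
    m.poll();     // will sleep the thread once the idle limit has elapsed
    return 0;
}
```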
Optionally, when the third processor monitors the running of the first thread, the method further includes: the third processor detects that the first thread has made an error in invoking the artificial intelligence super-resolution model and forces the first thread to exit; and the third processor sends a notification to the second processor instructing the second processor to render the first image. Forcing the exit reduces the resources occupied by the first thread, and when the first thread errs in invoking the model, the second processor can be triggered in time to render the first image, which shortens the rendering time compared with waiting for the error to be resolved.
Optionally, the third processor detecting that the first thread has made an error in invoking the artificial intelligence super-resolution model and forcing the first thread to exit includes: the third processor detects that the rendering processing of the first image by the artificial intelligence super-resolution model has timed out and forces the first thread to exit.
Optionally, after the third processor detects that the first thread has made an error in invoking the artificial intelligence super-resolution model and forces the first thread to exit, the method further includes: after the first processor detects that the first thread has exited and that a preset condition is met, the first processor wakes up the first thread again. When the first processor detects that the first thread has exited and the preset condition is met, it wakes up the first thread again, so that the first thread invokes the artificial intelligence super-resolution model to perform rendering processing on other images, sharing part of the work of the second processor and improving the drawing efficiency of the second processor.
Optionally, before the first processor sends the rendering instruction to the second processor, the method further comprises: the first processor reduces the resolution of the first image from the first resolution to the second resolution.
Optionally, the third processor has a super-resolution multiple used to indicate the difference between the second resolution and the third resolution, and the third resolution is the same as the first resolution. The first processor reducing the resolution of the first image from the first resolution to the second resolution includes: the first processor reduces the first resolution to the second resolution based on the super-resolution multiple. The artificial intelligence super-resolution model can have at least one super-resolution multiple, and each multiple accomplishes a different resolution conversion: with image data of the same resolution as input, different multiples cause the model to output image data corresponding to different resolutions. The first processor sends the first resolution and the second resolution to the third processor so that the resolution corresponding to the image data output by the model is not less than the first resolution. The third processor selects a super-resolution multiple that is not smaller than the ratio between the first resolution and the second resolution, so that the resolution corresponding to the image data output by the artificial intelligence super-resolution model is not less than the first resolution.
Optionally, if the rendering mode of the application program is forward rendering, the rendering instruction corresponds to the first frame buffer, whose number of executed drawing instructions is greater than the preset threshold; if the rendering mode of the application program is deferred rendering, the rendering instruction corresponds to a frame buffer other than the last frame buffer among all frame buffers issued by the application program. When the application program uses different rendering modes, the frame buffer targeted by the rendering instruction that the first processor sends to the second processor differs, which means that the timing at which the first processor sends the rendering instruction also differs; the sending of the rendering instruction is therefore controlled based on the frame buffers used in the different rendering modes.
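A small illustrative sketch of this decision: under forward rendering the instruction targets the frame buffer whose executed draw-call count exceeds the preset threshold, while under deferred rendering it targets the frame buffers other than the last one issued by the application program. The data layout is an assumption for illustration.

```cpp
#include <cstddef>
#include <vector>

struct FrameBufferInfo { int id; int drawCallCount; };
enum class RenderMode { Forward, Deferred };

std::vector<int> frameBuffersToIntercept(RenderMode mode,
                                         const std::vector<FrameBufferInfo>& fbs,
                                         int threshold) {
    std::vector<int> targets;
    if (mode == RenderMode::Forward) {
        for (const auto& fb : fbs)                  // buffers above the draw-call threshold
            if (fb.drawCallCount > threshold) targets.push_back(fb.id);
    } else {                                        // deferred: all but the last frame buffer
        for (std::size_t i = 0; i + 1 < fbs.size(); ++i) targets.push_back(fbs[i].id);
    }
    return targets;
}

int main() {
    std::vector<FrameBufferInfo> fbs = {{1, 120}, {2, 35}, {0, 3}};
    auto forward  = frameBuffersToIntercept(RenderMode::Forward,  fbs, 100);  // {1}
    auto deferred = frameBuffersToIntercept(RenderMode::Deferred, fbs, 100);  // {1, 2}
    (void)forward; (void)deferred;
    return 0;
}
```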
Optionally, the first frame buffer is the frame buffer with the largest number of executed drawing instructions among all frame buffers.
Optionally, before the first processor sends the rendering instruction to the second processor, the method further comprises: the first processor acquires the rendering mode of the application program from the configuration file of the application program.
Optionally, the rendering instruction is used to instruct the second processor to render the first image based on the second resolution, where the second resolution is less than the first resolution. The rendering instruction sent by the first processor to the second processor carries the second resolution of the first image, which specifies the resolution of the image data to be generated by the second processor, prevents the second processor from generating image data that does not match the resolution required by the application program, and improves the accuracy of the first image. Moreover, because the second resolution is smaller than the first resolution and a smaller resolution means a smaller amount of data, specifying a second resolution smaller than the first resolution reduces the amount of data processed by the second processor and the power consumption of the electronic device, alleviating the problem of severe heating.
Optionally, the second resolution is less than the first resolution, and the third resolution is the same as the first resolution or greater than the first resolution. When the second resolution is less than the first resolution and the third resolution is the same as the first resolution: during processing by the second processor, the second resolution being less than the first resolution means that the amount of data processed by the second processor decreases when it generates the image data of the first image at the second resolution; meanwhile, because the third resolution is the same as the first resolution, the second processor can read image data corresponding to the first resolution and draw the first image at the first resolution, ensuring that the image quality of the drawn first image meets the quality required by the application program. If the third resolution is greater than the first resolution, the second processor can read image data at a resolution greater than the first resolution, and the image quality of the drawn first image is better than that required by the application program, improving the image quality of the first image.
Optionally, the second resolution is equal to the first resolution. In this embodiment, although the second resolution equals the first resolution, the third resolution is greater than the second resolution and therefore also greater than the first resolution. When the application program requests that the first image be rendered based on the first resolution, the second processor can draw the first image based on the third resolution, so the image quality of the drawn first image is better than that requested by the application program, improving the image quality of the first image.
Optionally, the third processor is a neural network processor or a digital signal processor.
In a second aspect, the present application provides a rendering method applied to a second processor of an electronic device, where the electronic device further includes a first processor and a third processor, an application program runs on the electronic device, and the application program sends a rendering command for a first image to the first processor. The method includes: the second processor draws a second image while the third processor performs rendering processing on the first image. Although the drawing of the first image is delayed, the drawing timing of the second processor can be advanced relative to waiting for the processing result of the first image, which improves smoothness, and the third processor can render the first image in parallel while the second processor draws the second image. The second image may be the previous frame of the first image, the processing result of the first image may be image data of the first image, and the previous frame can be drawn while the third processor obtains the image data of the first image. Moreover, when the second image is drawn, its processing result is already available, so the time spent waiting for the processing result of the second image is reduced or eliminated, the drawing efficiency of the second processor is improved, and the frame rate displayed on the display screen is increased.
Optionally, the electronic device further includes a display screen. The first processor determines a first frame buffer during processing of a third image, where the first frame buffer is the frame buffer, among all frame buffers issued by the application program, whose number of executed drawing instructions is greater than a preset threshold, the third image is the previous frame of the second image, and the second image is the previous frame of the first image. The application program sends a rendering command for the second image to the first processor, and before the second processor draws the second image, the method further includes: the second processor draws the third image to display the third image on the display screen; and while the third processor renders the second image, the second processor controls the display screen to continue displaying the third image. Although the display of the second image is delayed, the third image, which is related to the application program, can be displayed on the display screen, which reduces the occurrence of a black or blue screen and improves the user experience.
Optionally, the rendering command of the first image is used to instruct the second processor to render the first image based on a first resolution. Before the second processor draws the second image, the method further includes: the second processor receives a rendering instruction sent by the first processor, where the rendering instruction instructs the second processor to render the first image; the second processor generates image data of the first image at a second resolution based on the rendering instruction, where the second resolution is not greater than the first resolution; the second processor writes the image data of the first image at the second resolution into a first area of a first memory; and the second processor reads image data of the second image at a third resolution from a second area of the first memory, where the third resolution is greater than the second resolution and the image data of the second image at the third resolution is used to draw the second image.
Because the third processor generates the image data of the first image at the third resolution after the second processor writes the image data at the second resolution into the first area of the first memory, the computing power of the second processor is saved, more resources are available to the second processor when drawing images, rendering smoothness is improved, and stalling is alleviated. Moreover, since the third resolution is greater than the second resolution, the image data of the first image obtained by the third processor has a relatively higher resolution, and when the second processor draws the first image it can do so based on that higher-resolution image data, improving the image quality of the first image.
Optionally, after the second processor writes the image data of the first image at the second resolution into the first area of the first memory, the method further includes: the second processor writes the attachment resources of all frame buffers of the first image and the drawing instruction stream of the last frame buffer of the first image into a second memory, the second processor having the right to access the second memory. When the application program issues the rendering command of the first image, it also issues the attachment resources of all frame buffers of the first image and the drawing instruction stream of the last frame buffer of the first image, and this information is used when drawing the first image, so the second processor writes it into the second memory to ensure that the first image can be drawn accurately. The second processor has the right to access the second memory while the first processor and the third processor do not, which prevents the data in the second memory from being modified through the first or third processor and improves security.
Optionally, after the second processor reads the image data of the second image at the third resolution from the second area of the first memory, the second processor drawing the second image includes: the second processor reads the attachment resources of all frame buffers of the second image and the drawing instruction stream of the last frame buffer of the second image from the second memory, the second processor having the right to access the second memory; and the second processor draws the second image based on the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and the drawing instruction stream of the last frame buffer of the second image.
Optionally, after the second processor reads the image data of the second image at the third resolution from the second area of the first memory, the method further includes: the second processor writes the image data of the second image at the third resolution into the second memory, the second processor having the right to access the second memory. The second processor drawing the second image includes: the second processor reads the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and the drawing instruction stream of the last frame buffer of the second image from the second memory; and the second processor draws the second image based on the image data of the second image at the third resolution, the attachment resources of all frame buffers of the second image, and the drawing instruction stream of the last frame buffer of the second image.
In a third aspect, the present application provides an electronic device, including: a first processor, a second processor, a third processor, and a memory. The memory is configured to store one or more pieces of computer program code, where the computer program code includes computer instructions, and when the first processor, the second processor, and the third processor execute the computer instructions, they perform the rendering method described above.
In a fourth aspect, the present application provides a chip system comprising program code that, when run on an electronic device, causes a first processor, a second processor and a third processor in the electronic device to perform the above-described rendering method.
In a fifth aspect, the present application provides a processor, the processor being a second processor, where the second processor includes a processing unit and a memory; the memory is configured to store one or more pieces of computer program code, where the computer program code includes computer instructions that, when executed by the second processor, cause the second processor to perform the rendering method described above.
In a sixth aspect, the present application provides a computer storage medium comprising computer instructions which, when run on an electronic device, cause a second processor in the electronic device to perform the above-described rendering method.
It should be appreciated that the description of technical features, technical solutions, benefits, or similar language in this application does not imply that all of the features and advantages can be realized in any single embodiment. Rather, descriptions of features or advantages mean that at least one embodiment includes the particular technical feature, technical solution, or advantage. Therefore, descriptions of technical features, technical solutions, or advantages in this specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions, and advantages described in the embodiments may be combined in any suitable manner. Those skilled in the art will appreciate that an embodiment may be implemented without one or more of the particular technical features, technical solutions, or advantages of a particular embodiment. In other embodiments, additional technical features and advantages may be recognized that are not present in all embodiments.
Drawings
FIG. 1 is a schematic illustration of a rendered image provided herein;
FIG. 2 is a schematic illustration of another rendered image provided herein;
FIG. 3 is a schematic illustration of yet another rendered image provided herein;
FIG. 4 is a schematic diagram illustrating the determination of the number of Drawcall in a different frame buffer provided herein;
FIG. 5 is a diagram illustrating a determination of the number of Drawcall on a different frame buffer provided herein;
FIG. 6 is a schematic illustration of yet another rendered image provided herein;
FIG. 7 is a schematic diagram of a rendering method provided herein;
FIG. 8 is a signaling diagram of a rendering method provided in the present application;
FIG. 9 is a schematic diagram of a rendering flow of an N-th frame image provided in the present application;
FIG. 10 is a schematic diagram of an instruction stream backup provided herein;
FIG. 11 is a schematic diagram of a memory access provided herein;
FIGS. 12-1 and 12-2 are signaling diagrams of another rendering method provided herein;
FIG. 13 is a schematic diagram of another memory access provided herein;
FIGS. 14-1 and 14-2 are signaling diagrams of yet another rendering method provided herein;
FIG. 15 is a schematic diagram of the relationship between the four states of the AI thread provided in the present application.
Detailed Description
The terms "first", "second", "third", and the like in the description, claims, and drawings are used to distinguish between different objects and not to indicate a particular order.
In the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
During the use of the electronic device by the user, the electronic device may display a frame of image to the user via the display screen. Taking a video stream as an example, one video stream may include multiple frames of images, and the electronic device may sequentially display each frame of image on the display screen, so as to display the video stream on the display screen. The image display can be triggered by an application program in the electronic device, the application program can send rendering commands for different images to the electronic device, the electronic device responds to the rendering commands to render the images, and the image display is performed based on the rendering results of the images.
In some implementations, each frame of image corresponds to a plurality of frame buffers (FBs), each FB being used to store the rendering result of some elements of the image, for example the image data of those elements, which is used to draw the corresponding elements. For example, the image may include elements such as a person and a tree; the image data of the person and the image data of the tree may be stored in FBs, and the electronic device may draw based on the image data stored in the FBs. In this embodiment, an optional structure of the electronic device and the process of rendering based on FBs are shown in FIG. 1; the electronic device may include a central processing unit (CPU), a graphics processing unit (GPU), an internal memory (which may also be referred to as memory), and a display screen.
In FIG. 1, after an application program (such as a game application or a video application) installed in the electronic device is started, the application program may display images through the display screen. In the process of displaying an image, the application program issues a rendering command, and the CPU may intercept the rendering command issued by the application program. In response to the rendering command, the CPU creates corresponding FBs in the memory for the rendering processing of the i-th frame image; for example, in FIG. 1, the CPU may create 3 FBs for the i-th frame image, denoted FB0, FB1, and FB2. The CPU may issue a rendering instruction to the GPU according to the rendering command, and the GPU may execute the rendering corresponding to the rendering instruction in response to it. As one example, the GPU performs rendering processing on FB1 and FB2 of the i-th frame image in response to the rendering instruction. After the rendering processing is completed, image data of the i-th frame image is stored in FB1 and FB2, respectively; for example, FB1 may store the image data of some elements of the i-th frame image (referred to as image data 1), and FB2 may store the image data of other elements (referred to as image data 2). When displaying the i-th frame image, the GPU fuses (which may also be described as renders) image data 1 and image data 2 into FB0, so that FB0 stores the complete image data of the i-th frame image. The GPU reads the image data of the i-th frame image from FB0 and draws the i-th frame image on the display screen according to that image data.
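For illustration, a minimal sketch of the FB0/FB1/FB2 arrangement of FIG. 1: partial results are written into FB1 and FB2 and then fused into FB0, from which the frame is displayed. The additive fuse below is only a stand-in for the real composition pass, and the one-value-per-pixel buffers are an assumption for brevity.

```cpp
#include <cstddef>
#include <vector>

using FrameBuffer = std::vector<float>;   // one value per pixel, for brevity

void fuse(const FrameBuffer& fb1, const FrameBuffer& fb2, FrameBuffer& fb0) {
    for (std::size_t i = 0; i < fb0.size(); ++i)
        fb0[i] = fb1[i] + fb2[i];          // combine the partial element renders
}

int main() {
    const std::size_t pixels = 1920 * 1080;
    FrameBuffer fb1(pixels, 0.2f);         // e.g. image data 1 (some elements)
    FrameBuffer fb2(pixels, 0.3f);         // e.g. image data 2 (other elements)
    FrameBuffer fb0(pixels, 0.0f);         // complete frame, read for display
    fuse(fb1, fb2, fb0);
    return 0;
}
```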
A neural network processing unit (NPU) may be introduced into the electronic device, and whether the GPU or the NPU responds to a rendering instruction may depend on the resolution of the image currently being rendered. In one example, when the currently rendered image has a low resolution, the CPU triggers the GPU to execute the rendering instruction; when the currently rendered image has a high or ultra-high resolution, the NPU is introduced. Illustratively, the low resolution may be 540P (960×540) or 720P, the high resolution may be 1080P, and the ultra-high resolution may be 2K or even greater.
As shown in FIG. 2, the CPU intercepts a rendering command issued by an application program and, in response to the rendering command, creates corresponding FBs in the memory for the rendering processing of the image. After the CPU determines from the rendering command that the resolution of the image is low, the CPU sends a rendering instruction to the GPU. The GPU, in response to the rendering instruction, performs rendering processing on the FBs to obtain rendering results, and after finishing the rendering processing of all FBs, displays the image on the display screen based on the rendering results. After the user adjusts the picture setting of the application program to high resolution, the CPU can determine from the rendering command that the resolution of the image is high; the CPU reduces the resolution of the image to a low resolution and then sends a rendering instruction to the GPU. The GPU generates low-resolution image data based on the rendering instruction and sends it to the NPU. The NPU performs super-resolution rendering processing on the low-resolution image data to obtain high-resolution image data and sends the high-resolution image data to the GPU. After receiving the high-resolution image data, the GPU renders the user interface (UI) and performs post-processing based on the high-resolution image data to complete the drawing of a frame of image; for example, the GPU performs rendering processing on the FB targeted by the rendering command based on the high-resolution image data to obtain a rendering result, which may be the image data of some elements in the image. After the rendering results of all FBs are obtained, the GPU displays the drawn image on the display screen based on these rendering results.
Although, under this CPU-GPU-NPU framework, the NPU bears a certain computational load and the load of the CPU and the GPU is reduced, the rendering time of the NPU may still become too long. For example, when the NPU processes a high-resolution image or an ultra-high-resolution image, the data volume of the image is very large, so the rendering time of the NPU increases, thereby reducing the frame rate and the smoothness.
In view of the above, this embodiment provides a rendering method. When the application program triggers drawing of the N-th frame image, the GPU obtains the image data of the (N-1)-th frame image; because the image data of the (N-1)-th frame image has already been stored in the corresponding FB before the rendering instruction of the N-th frame image is received, the GPU may draw the (N-1)-th frame image based on the obtained image data and display it on the display screen, while the NPU starts processing the N-th frame image to obtain the image data of the N-th frame image.
If the application program then triggers drawing of the (N+1)-th frame image, the GPU may draw the N-th frame image based on the obtained image data of the N-th frame image and display it on the display screen, and the NPU starts to process the (N+1)-th frame image.
Although the GPU draws the N-th frame image under the rendering instruction of the (N+1)-th frame image, so that the drawing of the N-th frame image is delayed by one frame, the GPU no longer has to wait for the image data of the N-th frame image, its drawing time is advanced, and the smoothness is improved. Experiments prove that the rendering method provided by this application can reduce the influence on the frame rate and alleviate the problem of frame rate reduction. For example, for a game application program, when the GPU receives the drawing instruction of the (N+2)-th frame image, the GPU can draw the (N+1)-th frame image, whose image data has been written into the memory by the NPU in advance; the time the GPU spends waiting for the image data of the (N+1)-th frame image is therefore reduced, the drawing efficiency of the GPU is improved, the frame rate displayed on the display screen is increased, and the problem of the game frame rate dropping is mitigated.
In one example, before the NPU computes the image data, the CPU may first reduce the resolution of the image; the GPU generates low-resolution image data, and the NPU performs super-resolution rendering on the low-resolution image data to obtain high-resolution image data, where the image data after super-resolution rendering corresponds to the resolution before the reduction. Here, super-resolution rendering means resolution enhancement performed on the image data. For example, if the resolution of the image is 1080P or above, the CPU may reduce the resolution of the image to 540P, the image data generated by the GPU corresponds to 540P, and the NPU then performs super-resolution rendering on that image data, so that the image data generated by the NPU corresponds to 1080P. How much the resolution is reduced may depend on the super-division capability of the NPU; for example, if the super-division capability of the NPU is 2 times, the CPU may scale the width and height of a 1080P image by a factor of 0.5.
By reducing the resolution of the image, the CPU reduces the amount of data input into the NPU and thereby increases the processing speed of the NPU; because the processing result after super-resolution rendering corresponds to the resolution before the reduction, the NPU does not change the resolution of the image, and the rendering accuracy is improved. Meanwhile, reducing the resolution of the image also reduces the rendering time and the rendering load. In addition, while the NPU performs super-resolution rendering on the (N+1)-th frame image, the GPU can draw the N-th frame image, for which the NPU has already finished super-resolution rendering; the time the GPU spends waiting for the rendering result of the N-th frame image is reduced, and because the NPU and the GPU can operate in parallel, the frame rate is improved.
As shown in fig. 3, when the application program triggers drawing of the nth frame image, the CPU reduces the resolution of the nth frame image, instructs the GPU to generate low-resolution image data, and the GPU generates low-resolution image data and sends the low-resolution image data to the NPU. And the NPU performs super-resolution rendering on the low-resolution image data to obtain high-resolution image data. In the NPU super-resolution rendering process, the GPU reads the image data of the N-1 frame image under high resolution from the memory, then the GPU can perform UI rendering and post-processing based on the image data of the N-1 frame image under high resolution to complete drawing of the N-1 frame image, and the GPU displays the drawn N-1 frame image on a display screen. Image data of the N-1 frame image at high resolution may be generated by the NPU by means of super resolution rendering.
After the NPU finishes super-resolution rendering, writing the image data of the Nth frame image under high resolution into a memory, and when an application program triggers drawing of the (n+1) th frame image, the GPU can read the image data of the Nth frame image under high resolution from the memory to draw the Nth frame image. The NPU may perform super-resolution rendering on the low-resolution image data of the n+1st frame image.
In one example, the CPU may specify the FB on which super-resolution rendering is performed, and this FB may differ depending on the rendering manner. The rendering manner includes delayed rendering (deferred rendering) and forward rendering. In delayed rendering, the color information (color attachment) records not only color data but also the normal vector data and depth data of each pixel; when illumination calculation is performed, the error probability of up-sampled normal vector data is very large, and the FBs other than FB0 are strongly associated with one another (for example, the rendering result of one FB is bound in the next FB), so if super-resolution rendering were performed on only one FB, the probability of errors when other FBs use that FB's rendering result would be very large. Therefore, in delayed rendering, super-resolution rendering can be performed on every FB other than FB0 among all FBs corresponding to each frame of image; that is, in delayed rendering, the FBs on which super-resolution rendering is performed are all FBs of a frame image except FB0. For example, if all FBs corresponding to one frame image include FB0, FB1, and FB2, the FBs subjected to super-resolution rendering include FB1 and FB2. FB0 is the last FB among all FBs issued by the application.
In forward rendering, the association between FBs is weaker, and super-resolution rendering can be performed on at least one of all the FBs. Under forward rendering, the CPU may determine the FB for super-resolution rendering according to the number of rendering operations. In one example, the CPU may determine an FB whose number of rendering operations is greater than a preset number as an FB for super-resolution rendering; in another example, the CPU may determine the FB with the largest number of rendering operations as the FB for super-resolution rendering. The FB with the largest number of rendering operations may be the FB on which the GPU executes the largest number of draw calls (Drawcalls, i.e., drawing-related instructions): the larger the number of Drawcalls the GPU executes on an FB, the larger the number of rendering operations performed on that FB, the more computational resources are consumed when rendering it, and the higher the power consumption and heat generated; this embodiment may therefore determine the FB with the largest number of rendering operations, that is, the FB on which the largest number of Drawcalls is executed, as the FB for super-resolution rendering. For convenience of explanation, the FB on which super-resolution rendering is performed in forward rendering is referred to as the main FB; how the FB with the maximum number of Drawcalls is determined is explained below with an example.
In an embodiment, the CPU may be configured to receive a rendering command from the application program and issue a corresponding rendering instruction to one of the GPU, the NPU, and the DSP according to the rendering command, so that that processor performs the corresponding rendering according to the rendering instruction. As an example, the rendering command may include a glBindFramebuffer() function and one or more of glDrawElements, glDrawArrays, glDrawElementsInstanced, and glDrawArraysInstanced. Correspondingly, the rendering instruction may also include the glBindFramebuffer() function and one or more of glDrawElements, glDrawArrays, glDrawElementsInstanced, and glDrawArraysInstanced. The glBindFramebuffer() function may be used to indicate the currently bound FB, binding the corresponding rendering operations to that FB. For example, glBindFramebuffer(1) may indicate that the currently bound frame buffer is FB1, and the draw calls executed by the GPU on FB1 include glDrawElements, glDrawArrays, glDrawElementsInstanced, and glDrawArraysInstanced.
For example, the N-th frame image may use FB0, FB1, and FB2 as its frame buffers. Fig. 4 shows one example of the rendering command corresponding to FB1. The rendering command issued by the application program may include the glBindFramebuffer(1) function, thereby binding the current rendering operations to FB1. After FB1 is bound, rendering operations on FB1 may be indicated via glDrawElements instructions. In this example, 1 glDrawElements instruction corresponds to 1 Drawcall. In various implementations, multiple glDrawElements instructions may be executed on FB1, and correspondingly multiple Drawcalls may be executed on FB1.
When executing glBindFramebuffer(1) according to the rendering command issued by the application, the CPU may initialize counter 1. For example, a corresponding count field is configured for FB1 in the memory, and by initializing counter 1 the value of this field can be initialized to 0. Every time a glDrawElements instruction is subsequently executed on FB1, counter 1 is incremented by 1, for example by executing count1++. For example, after Drawcall 1-1 is executed, the CPU may execute count1++ on counter 1, so that the value of the field storing the number of Drawcalls of FB1 changes from 0 to 1; at this time the number of Drawcalls executed on FB1 is 1. By analogy, the CPU may determine that the number of Drawcalls executed on FB1 during the rendering of the N-th frame image is the current count of counter 1 (for example, the count may be A).
Similarly, for FB2, the GPU may bind FB2 through the glBindFramebuffer(2) function. Thereafter, the GPU may indicate rendering operations on FB2 through instructions such as glDrawElements, glDrawArrays, glDrawElementsInstanced, and glDrawArraysInstanced. As with FB1, the CPU can also initialize counter 2 when glBindFramebuffer(2) is invoked according to the rendering command issued by the application, for example initializing count2 = 0. Every time a glDrawElements instruction is subsequently executed on FB2, counter 2 is incremented by 1, for example by executing count2++. After completing the rendering processing of the image on FB2, the CPU may determine that the number of Drawcalls executed on FB2 during the rendering of the N-th frame image is the current count of counter 2 (for example, the count may be B), as described with reference to fig. 5. After the rendering processing of FB1 and FB2 is completed, the field in the memory storing the number of Drawcalls of FB1 may hold the value A, and the field storing the number of Drawcalls of FB2 may hold the value B.
In this example, the CPU may select the FB corresponding to the larger of counts A and B as the main FB. For example, when A is greater than B, the CPU may determine that more Drawcalls are executed on FB1 and determine FB1 as the main FB. Conversely, when A is smaller than B, the CPU may determine that more Drawcalls are executed on FB2 and determine FB2 as the main FB.
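The counter mechanism above can be pictured with the following minimal sketch. It is an illustration only, not the implementation of this embodiment: the hook names onBindFramebuffer()/onDrawCall() and the map-based counters are hypothetical stand-ins for wherever the CPU intercepts the glBindFramebuffer() and glDrawElements-type calls.

    #include <cstdint>
    #include <unordered_map>

    // Hypothetical per-frame Drawcall counters, keyed by FB identifier.
    static std::unordered_map<uint32_t, uint32_t> gDrawcallCount;
    static uint32_t gCurrentFb = 0;

    // Called when the intercepted command stream executes glBindFramebuffer(fb).
    void onBindFramebuffer(uint32_t fb) {
        gCurrentFb = fb;
        gDrawcallCount.emplace(fb, 0);   // ensure a counter exists for this FB (starts at 0)
    }

    // Called for each intercepted draw command (glDrawElements, glDrawArrays, ...).
    void onDrawCall() {
        ++gDrawcallCount[gCurrentFb];    // count++ for the currently bound FB
    }

    // At the end of the frame, the FB with the largest Drawcall count becomes the main FB.
    uint32_t selectMainFb() {
        uint32_t mainFb = 0, best = 0;
        for (const auto& [fb, count] : gDrawcallCount) {
            if (fb != 0 && count > best) { best = count; mainFb = fb; }  // FB0 (on-screen FB) skipped
        }
        return mainFb;
    }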
In other examples, the CPU may determine an FB on which the number of executed Drawcalls is greater than a preset threshold as the main FB; by means of the counters, the CPU may select the FBs whose number of executed Drawcalls exceeds the preset threshold, which is not described in detail here.
Correspondingly, the rendering process combined with the rendering mode is shown in fig. 6. In fig. 6, the application program instructs rendering of the N-th frame image; after receiving the rendering command, the CPU determines whether the rendering mode is forward rendering or delayed rendering. If the rendering mode is forward rendering, the CPU determines whether the FB currently performing the rendering operation is FB1; if it is FB1, the CPU reduces the resolution of the N-th frame image and instructs the GPU to generate low-resolution image data. The GPU sends the low-resolution image data to the NPU, and the NPU performs super-resolution rendering. During the NPU's super-resolution rendering, the GPU reads the image data of the (N-1)-th frame image at high resolution from the memory, then performs UI rendering and post-processing based on that image data to complete the drawing of the (N-1)-th frame image, and displays the drawn (N-1)-th frame image on the display screen.
After the NPU finishes the super-resolution rendering, the NPU writes the image data of the Nth frame of image under high resolution into the memory. When the application program triggers drawing of the (n+1) th frame image, the GPU can read image data of the (N) th frame image under high resolution from the memory to draw the (N) th frame image. The NPU may perform super-resolution rendering on the low-resolution image data of the n+1st frame image.
If the CPU determines that the rendering mode is delayed rendering, the CPU determines whether the FB currently executing the rendering operation is FB1 or FB2; if it is one of FB1 and FB2, the GPU renders and draws the (N-1)-th frame image while the NPU performs super-resolution rendering on the low-resolution image data of the N-th frame image.
In this embodiment, the CPU may create two threads, a rendering thread and an artificial intelligence (AI) thread. The AI thread may run on the NPU and is used to compute image data, for example to complete super-resolution rendering; the rendering thread is used to detect the rendering commands issued by the application program, perform UI rendering and post-processing with the image data, display images on the display screen, and so on.
Fig. 7 shows the process in which the rendering thread and the AI thread delay the drawing and compute the rendering result in parallel while an image is drawn. After the application program is started, the rendering thread monitors the instruction stream to intercept the rendering command and, in response to the rendering command, reduces the resolution of the currently rendered image. The rendering thread then triggers the AI thread to start the computation for the currently rendered image, so as to obtain its image data; meanwhile, the rendering thread may read the image data of the previous frame image, which has already been computed by the AI thread, and then perform UI rendering and post-processing to draw the previous frame image and display it on the display screen. The steps from monitoring the start of the application program to reducing the resolution of the currently rendered image are executed by the CPU, and the steps from reading the rendering result of the previous frame image to drawing the previous frame image are executed by the GPU.
Taking the currently rendered image as the N-th frame image (simply Frame N) as an example, the rendering thread reads from the memory the image data of the previous frame (Frame N-1) computed by the AI thread, sends the low-resolution image data of Frame N to the AI thread, and starts the AI thread's computation. The rendering thread may also cache the OpenGL/Vulkan instruction stream of Frame N and the attachment resources of all FBs of Frame N, where the OpenGL/Vulkan instruction stream of Frame N includes the drawing instruction stream of FB0. If the rendering command issued by the application is an OpenGL command, the rendering thread caches the OpenGL instruction stream of Frame N; if the rendering command issued by the application is a Vulkan command, the rendering thread caches the Vulkan instruction stream of Frame N.
Then, the rendering thread reads the drawing instruction stream of Frame N-1 and the attachment resources of all FBs of Frame N-1 from the cache, and completes the on-screen display based on the image data of Frame N-1 at high resolution, the drawing instruction stream of Frame N-1, and the attachment resources of all FBs of Frame N-1. In this way, the rendering thread draws Frame N-1 under the rendering command of Frame N, and the on-screen display is delayed by one frame; delayed drawing is thus realized with the rendering thread and the AI thread, the computation is performed in parallel with the drawing of the image, and the AI thread can also complete super-resolution rendering to accelerate the processing.
The flow of the rendering method shown in fig. 7 is shown in fig. 8. In the rendering method shown in fig. 8, the AI thread runs in the NPU, and the AI thread can complete super-resolution rendering by using an AI super-resolution model. Fig. 8 takes as an example that the application program issues a rendering command of the N-th frame image and that the previous frame of the N-th frame image is the (N-1)-th frame image. The rendering method shown in fig. 8 may include the following steps:
S101, the rendering thread intercepts a rendering command issued by the application program and determines the main FB in response to the rendering command. The main FB is the FB with the largest number of rendering operations in the rendering process, which may be the FB with the largest number of Drawcalls.
The number of Drawcalls executed on each FB is different. In this embodiment, all FBs in a frame image may be counted to obtain the number of Drawcalls executed on each of them, and the FB on which the largest number of Drawcalls is executed in that frame image may then be determined as the FB on which super-resolution rendering is performed (denoted the main FB); the main FB of the subsequent images may be the same as the main FB of that frame image. Illustratively, taking the currently rendered frame image as the N-th frame image, the CPU may determine the main FB of the (N-1)-th frame image according to the number of Drawcalls executed on the different FBs of the (N-1)-th frame image during the rendering of that previous frame, and use it as the main FB of the N-th frame image, the (N+1)-th frame image, the (N+2)-th frame image, and so on.
In this embodiment, the rendering thread may determine the FB for super-resolution rendering before intercepting the rendering command issued by the application program; for example, the rendering thread may determine the main FB according to the number of Drawcalls executed on the different FBs of the (N-1)-th frame image during its rendering process.
S102, the rendering thread creates the AI thread. The AI super-resolution model may be initialized in the process of creating and initializing the AI thread, and the initialization operation ensures that the AI super-resolution model can run normally.
The initialization of the AI super-resolution model comprises model loading, model compiling, and memory configuration, all of which serve to ensure that the AI super-resolution model can operate normally.
Model loading converts the AI super-resolution model into a model file that the AI thread can identify and loads the model file into memory in the initialization stage; model compiling verifies that the model file can run successfully. Memory configuration allocates memory for the AI super-resolution model, and the allocated memory is used to store the input data and output data of the model. In this embodiment, the memory allocated for the AI super-resolution model may be CPU memory, memory managed by a neural processing API (e.g., an SNPE ITensor buffer), or shared memory.
The CPU memory may be memory allocated to the CPU, into which the data used by the CPU at run time is written; the AI super-resolution model occupies part of the CPU memory as its own memory. In that case, data interaction between the rendering thread and the AI thread has to pass through the CPU memory, and when drawing an image the rendering thread needs to write the data in the CPU memory into the GPU memory (the memory allocated to the GPU) and then read it from the GPU memory for drawing. The shared memory may be memory shared by the rendering thread and the AI thread, from which both can read data directly.
The AI thread may be created when the rendering command of the N-th frame image is intercepted; it may also be created when the resolution of the N-th frame image is determined to be high resolution, after the start of the application program is detected, or when the currently processed FB is the FB for super-resolution rendering. For an application program, the CPU may also maintain a whitelist of application programs, determine whether the application program is in the whitelist after it is started, and create the AI thread if it is. After the creation and initialization of the AI thread are completed, the AI thread enters a sleep state and waits for the rendering thread to wake it up.
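A minimal sketch of the sleep/wake handshake between the two threads, assuming a simple condition-variable design; the initializeSuperResolutionModel() and runSuperResolution() calls are placeholders for the model initialization and the actual super-resolution computation, and are not an API defined by this embodiment.

    #include <condition_variable>
    #include <mutex>

    struct AiThreadContext {
        std::mutex m;
        std::condition_variable cv;
        bool hasWork = false;   // set by the rendering thread when frame N's low-res data is ready
        bool quit = false;
    };

    // AI thread body: initialize once, then sleep until woken by the rendering thread.
    void aiThreadMain(AiThreadContext& ctx) {
        // initializeSuperResolutionModel();  // hypothetical: load, compile, allocate memory
        std::unique_lock<std::mutex> lock(ctx.m);
        while (true) {
            ctx.cv.wait(lock, [&] { return ctx.hasWork || ctx.quit; });  // sleep state
            if (ctx.quit) break;
            ctx.hasWork = false;
            lock.unlock();
            // runSuperResolution();           // hypothetical: low-res -> high-res, write to memory
            lock.lock();
        }
    }

    // Rendering-thread side: wake the AI thread after storing frame N's low-resolution data.
    void wakeAiThread(AiThreadContext& ctx) {
        { std::lock_guard<std::mutex> g(ctx.m); ctx.hasWork = true; }
        ctx.cv.notify_one();
    }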
S103, based on the rendering command, the rendering thread obtains the initial resolution of the N-th frame image, which is high resolution, and the currently processed FB.
The rendering thread may obtain the width and height of the N-th frame image based on the rendering command; the width and height of an image at a given resolution are fixed, for example, when the resolution of the image is 720P, the width is 1280 and the height is 720, and when the resolution of the image is 1080P, the width is 1920 and the height is 1080, so the initial resolution of the N-th frame image can be determined from its width and height. Generally, images of 1920×1080 or more are considered high resolution, and when the width and height of the image satisfy 1920×1080 or more, the rendering thread can determine that the initial resolution of the image is high resolution. The rendering thread may also obtain the identifier of the currently processed FB, such as FB1, based on the rendering command. The initial resolution of the N-th frame image and the currently processed FB may be derived from different rendering commands.
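The width/height check described above can be written as a one-line helper; this is only an illustration with the threshold taken from the text, and the function name is hypothetical.

    #include <cstdint>

    // An image is treated as high resolution when width x height reaches 1920 x 1080.
    bool isHighResolution(uint32_t width, uint32_t height) {
        return static_cast<uint64_t>(width) * height >= 1920ull * 1080ull;
    }
    // Examples from the text: 1280 x 720 (720P) -> false; 1920 x 1080 (1080P) -> true.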
And S104, the rendering thread acquires a rendering mode from the configuration file of the application program.
The rendering modes include delayed rendering and forward rendering; the FBs on which super-resolution rendering is executed differ between the two modes. In delayed rendering, super-resolution rendering may be performed on the FBs other than FB0 among all FBs of each frame image; accordingly, the CPU performs super-resolution rendering on the FB of the currently executed rendering command when it determines that this FB is not FB0.
In forward rendering, the computational power consumption of the GPU during rendering is concentrated in the rendering of the main FB, and within the main FB rendering the GPU's computation is concentrated in the Fragment Shader (FS). It can therefore be inferred that, in forward rendering, super-resolution rendering should be performed on the main FB, that is, the FB on which the largest number of Drawcalls is executed. In this embodiment, if the CPU determines that FB1 is the main FB during the rendering of the (N-1)-th frame image, then when the currently processed FB is FB1, super-resolution rendering is performed on FB1.
S105, if the rendering mode is forward rendering and the currently processed FB is the main FB, the rendering thread reduces the resolution of the N frame image from high resolution to low resolution; if the rendering mode is delayed rendering and the currently processed FB is not FB0, the rendering thread reduces the resolution of the N frame image from high resolution to low resolution.
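The branch in S105 can be summarized by the following illustrative sketch; the enum and function names are assumptions introduced here for clarity, not names used by this embodiment.

    enum class RenderMode { Forward, Delayed };

    // Returns true when the rendering thread should reduce the N-th frame's resolution
    // so that the AI thread can later perform super-resolution rendering on this FB.
    bool shouldReduceResolution(RenderMode mode, int currentFb, int mainFb) {
        if (mode == RenderMode::Forward) {
            return currentFb == mainFb;  // forward rendering: only the main FB
        }
        return currentFb != 0;           // delayed rendering: every FB except FB0
    }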
The rendering thread may reduce the resolution of the N-th frame image by reducing its width and height. In reducing the resolution, the rendering thread may refer to the super-division multiple of the AI super-resolution model, which runs in the AI thread and can convert image data from a low resolution to a high resolution, the high resolution being greater than the low resolution; for example, the high resolution may be 2 times, 3 times, or 4 times the low resolution, i.e., the super-division multiple of the AI super-resolution model is one of 2, 3, and 4. Taking 540P as an example, a 1080P image is obtained after 2× super-resolution rendering of a 540P image, and a 2K image is obtained after 4× super-resolution rendering of a 540P image.
In this embodiment, the rendering thread may reduce the resolution of the N-th frame image according to the super-division multiple of the AI super-resolution model. The CPU may reduce the resolution of the image by a scaling factor, which can be achieved by reducing the width and height of the N-th frame image. For example, expressing the image in terms of width and height as Res = W × H, the scaling factor r takes a value r ∈ (0, 1), i.e., a value between 0 and 1, and the image after resolution reduction is Res_L = (r × W) × (r × H). The scaling factor used is related to the super-division multiple of the AI super-resolution model.
For example, if the super-division multiple of the NPU is 2, the CPU can reduce the resolution of the image by a factor of 2; if the super-division multiple of the NPU is 4, the CPU can reduce the resolution of the image by a factor of 4. For example, if the resolution of the image is 1080P, the image is 1920×1080 in terms of width and height; with a super-division multiple of 2, the AI super-resolution model can complete the conversion from 540P to 1080P, and the width and height of a 540P image are 960×540, which means the image is reduced by a factor of 2 and the corresponding scaling factor is 0.5. If the super-division multiple of the AI super-resolution model is 4, the model can complete the conversion from 270P to 1080P, the 1080P image is reduced by a factor of 4, and the corresponding scaling factor is 0.25. The scaling factor refers to the scaling of the side lengths of the image; for example, a scaling factor of 0.5 means that the width and the height of the image are both scaled by 0.5, in which case the number of pixels of the image is scaled by 0.25.
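A minimal sketch of the rule Res_L = (r × W) × (r × H), assuming the scaling factor r is simply the reciprocal of the super-division multiple applied to both width and height; the struct and function names are illustrative.

    #include <cstdint>

    struct Extent { uint32_t width; uint32_t height; };

    // superDivisionMultiple: 2 means 540P -> 1080P, 4 means 270P -> 1080P, etc.
    Extent reduceResolution(Extent highRes, uint32_t superDivisionMultiple) {
        const float r = 1.0f / static_cast<float>(superDivisionMultiple);  // e.g. 0.5 for 2x
        return { static_cast<uint32_t>(highRes.width  * r),
                 static_cast<uint32_t>(highRes.height * r) };
    }
    // reduceResolution({1920, 1080}, 2) -> {960, 540}; the pixel count shrinks by r*r = 0.25.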
And S106, the rendering thread generates image data of the Nth frame image under low resolution. The rendering thread runs in the GPU at this point. The rendering thread generates low resolution RGB image data including R-channel data, G-channel data, and B-channel data.
And S107, the rendering thread stores the image data of the Nth frame of image under low resolution into a memory, such as a shared memory which can be accessed by the rendering thread and the AI thread.
S108, the rendering thread backs up the resource data other than the image data of the N-th frame image and the OpenGL/Vulkan instruction stream required for drawing the N-th frame image. The resource data other than the image data may include the attachment resources of all FBs of the N-th frame image; the attachment resources of each FB include color resources, depth resources, and the like (which may also be referred to as the color, depth, and so on of the Frame Buffer Attachment). The OpenGL/Vulkan instruction stream required for drawing the N-th frame image may include the drawing instruction stream of FB0.
For example, fig. 9 shows the complete rendering flow of the N-th frame image. The resources Frame Buffer Attachment of the N-th frame image in fig. 9 are the attachment resources of all FBs of the N-th frame image, and the attachment resources of each FB include color resources, depth resources, and the like; for example, frame buffer N0 in fig. 9 represents FB0 of the N-th frame image, and attach color, depth represents the color resources and depth resources of each FB. These resources may be cached in a backup area. In addition to backing up the attachment resources of all FBs, the OpenGL/Vulkan instruction stream required for drawing the N-th frame image, such as the drawing instruction stream of FB0, is also backed up. The drawing instruction stream and the attachment resources may be backed up when the rendering thread stores the low-resolution image data of the N-th frame image into the memory, or when the rendering thread generates the low-resolution image data of the N-th frame image; this embodiment does not limit the timing.
The backup is performed because the rendering thread draws the (N-1)-th frame image in response to the rendering command of the N-th frame image issued by the application, and draws the N-th frame image in response to the rendering command of the (N+1)-th frame image. If the rendering thread did not back up the related information of the N-th frame image, it could not obtain that information when rendering the (N+1)-th frame image, so it might draw the N-th frame image incorrectly or fail to draw it at all.
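One way to picture the backup in S108 is the following sketch; the FrameBackup structure, its byte-vector fields, and backupAndSwap() are hypothetical illustrations of keeping frame N's instruction stream and FB attachments until the rendering command of frame N+1 arrives, not the data layout used by this embodiment.

    #include <cstdint>
    #include <vector>

    // Hypothetical backup of everything needed to draw a frame one rendering command later.
    struct FrameBackup {
        std::vector<uint8_t> fb0InstructionStream;  // cached OpenGL/Vulkan drawing instruction stream of FB0
        std::vector<uint8_t> fbAttachments;         // color/depth attachment resources of all FBs
    };

    static FrameBackup gBackup;  // written while handling frame N, consumed while handling frame N+1

    // Called while processing the rendering command of frame N:
    // keep frame N's data for later, and return frame N-1's data so it can be drawn now.
    FrameBackup backupAndSwap(FrameBackup&& frameN) {
        FrameBackup previous = std::move(gBackup);  // frame N-1, backed up during the previous command
        gBackup = std::move(frameN);                // frame N is kept until the command of frame N+1
        return previous;                            // caller performs UI rendering/post-processing with this
    }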
And S109, the rendering thread acquires the image data of the N-1 frame image under high resolution from the memory.
The image data of the (N-1)-th frame image at high resolution is obtained by the AI thread, for example by performing super-resolution rendering with the AI super-resolution model; it is computed in response to the rendering command of the (N-1)-th frame image, and the image data of the (N-1)-th frame image at high resolution is written into the memory. The initial resolution of the (N-1)-th frame image is high resolution, and the image data at high resolution can be restored through the AI super-resolution model, which improves the accuracy of the drawn (N-1)-th frame image.
When super-resolution rendering is performed on low-resolution image data with the AI super-resolution model, the input of the model is the low-resolution image data, and the super-resolution rendering is completed through the multi-layer network architecture of the model, yielding high-resolution image data. The AI super-resolution model is obtained by training on historical low-resolution images, with low-resolution images as input and high-resolution image data as output, and during training it is verified by taking high-resolution images rendered by the GPU as reference images. Low resolution and high resolution are relative terms, the high resolution being greater than the low resolution; for example, the low resolution is 540P and the high resolution is 1080P, or the low resolution is 720P and the high resolution is 2K.
In this embodiment, the AI super-resolution model may be trained offline using a training framework such as TensorFlow or PyTorch. However, some electronic devices do not support such training frameworks; for example, the model formats supported on mobile phones, such as the Qualcomm SNPE framework and NNAPI (Android Neural Networks API), differ from those of TensorFlow and PyTorch: the Qualcomm SNPE framework supports the DLC format, and NNAPI supports TensorFlow Lite. If the AI super-resolution model is trained offline with a framework such as TensorFlow or PyTorch, the model in TensorFlow or PyTorch format needs to be converted into a format such as DLC or TensorFlow Lite so that it can be used on devices such as mobile phones.
After the AI super-resolution model has been trained, it can be updated to adapt to image changes; for example, the electronic device may obtain the super-resolution rendering effect of the model and adjust the model's parameters according to that effect, where the parameter adjustment may be performed offline or online. The device performing the parameter adjustment may be the electronic device that uses the AI super-resolution model for super-resolution rendering, or another electronic device that sends the adjusted parameters to the former after completing the adjustment; for example, a computer adjusts the parameters of the AI super-resolution model and, after the adjustment is completed, sends them to a mobile phone for use. The electronic device can collect data samples during the rendering process of the application program to obtain a data sample set and adjust the AI super-resolution model using the samples in this set.
S110, the rendering thread wakes up the AI thread, so that the AI thread is switched from the dormant state to the running state.
And S111, the AI thread reads the image data of the Nth frame image under low resolution from the memory.
And S112, performing super-resolution rendering on the image data of the Nth frame image under the low resolution by the AI thread to obtain the image data of the Nth frame image under the high resolution, writing the image data of the Nth frame image under the high resolution into a memory, and switching the AI thread from an operation state to a dormant state.
In this embodiment, the AI thread may perform super-resolution rendering on the image data of the N-th frame image at low resolution using the AI super-resolution model. The AI super-resolution model corresponds to at least one super-division multiple. After the AI thread receives the image data of the N-th frame image at low resolution, the AI super-resolution model determines the corresponding super-division multiple according to the low resolution and the high resolution of the N-th frame image and completes the conversion of the image data from the low resolution to the high resolution; when the rendering thread wakes up the AI thread, the rendering thread notifies the AI thread of the low resolution and the high resolution of the N-th frame image.
In some examples, the rendering thread obtains the super-division multiple of the AI super-resolution model when initializing the model and reduces the resolution of the image based on that multiple; in that case, when the AI thread performs super-resolution rendering, the rendering thread may omit sending the low resolution and the high resolution of the N-th frame image to the AI thread. This is applicable to AI super-resolution models having a single super-division multiple.
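As a small illustration of how the AI thread might derive the super-division multiple from the low and high resolutions notified by the rendering thread (the rounding and the assumption that width and height scale by the same ratio are ours, not the embodiment's):

    #include <cstdint>

    // E.g. low 960x540 and high 1920x1080 give a multiple of 2; 480x270 -> 1920x1080 gives 4.
    uint32_t superDivisionMultiple(uint32_t lowWidth, uint32_t highWidth) {
        return (highWidth + lowWidth / 2) / lowWidth;   // rounded ratio of the widths
    }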
In some embodiments, under delayed rendering, when the currently processed FB is FB1 or FB2, the rendering thread reduces the resolution of the image and may generate low-resolution image data corresponding to FB1 and low-resolution image data corresponding to FB2, after which the AI thread performs super-resolution rendering on the low-resolution image data of both FBs. In other embodiments, under delayed rendering, the rendering thread may generate low-resolution image data only for the FB immediately preceding FB0, and the NPU performs super-resolution rendering on that data. In other words, under delayed rendering, when the FB is an FB other than FB0, the rendering thread may generate low-resolution image data for every FB other than FB0, with the AI thread then performing super-resolution rendering on each of them; or, when the FB is the FB immediately preceding FB0, the rendering thread may generate low-resolution image data only for that FB, with the AI thread then performing super-resolution rendering on it.
S113, the rendering thread obtains the OpenGL/Vulkan instruction stream required for drawing the (N-1)-th frame image that was backed up previously, responds to that instruction stream, and obtains the resource data related to the (N-1)-th frame image from the backup. It then performs UI rendering and post-processing based on the OpenGL/Vulkan instruction stream required for drawing the (N-1)-th frame image, the resource data related to the (N-1)-th frame image, and the image data of the (N-1)-th frame image at high resolution computed by the AI thread, so as to draw the (N-1)-th frame image.
The OpenGL/Vulkan instruction stream required for the (N-1)-th frame image and the resource data related to the (N-1)-th frame image were backed up in response to the rendering command of the (N-1)-th frame image, for example when the low-resolution image data of the (N-1)-th frame image was generated or after it was stored into the memory. When the application program issues the rendering command of the N-th frame image, the backup related to the (N-1)-th frame image is retrieved.
The rendering thread performs cache backup on an OpenGL/Vulkan instruction stream required for drawing the Nth frame image and resource data related to the Nth frame image, and is used when an application program issues a rendering command of the (n+1) th frame image, so that the Nth frame image is delayed to be rendered and displayed. Wherein the resource data associated with the nth frame image includes attachment resources for all FBs of the nth frame image.
As shown in fig. 10, while executing the instruction stream of the N-th frame image, the OpenGL/Vulkan instruction stream required for drawing the N-th frame image and the resource data related to the N-th frame image are backed up in the cache; the OpenGL/Vulkan instruction stream required for drawing the (N-1)-th frame image and the resource data related to the (N-1)-th frame image are read from the backup area, and in response to that instruction stream the (N-1)-th frame image is drawn on the display screen based on the resource data read from the backup area and the image data of the (N-1)-th frame image at high resolution read from the memory. The backup area may be the GPU memory, and the memory may be the shared memory.
This rendering manner can be used in various image-rendering scenarios, for example in application programs such as game applications, home design applications, modeling applications, augmented reality applications, and virtual reality applications. Rendering a frame of image with the rendering thread and the AI thread realizes delayed drawing and computes the rendering result in parallel while the image is drawn, so the computation does not affect the drawing performed by the rendering thread and the smoothness is improved. In addition, the AI thread can run on the NPU, whose computing power shares part of the GPU's computation load, so that the power consumption of the electronic device is lower and the rendering time is shorter.
One point should be noted: under forward rendering, super-resolution rendering is performed by the AI thread when the currently processed FB is the main FB; the image data of the N-th frame image at low resolution corresponds to the main FB, and the rendering result of the main FB can be obtained based on the high-resolution image data generated by the AI thread. When other FBs are processed under forward rendering, the AI thread is not triggered to perform super-resolution rendering; instead, the rendering thread performs rendering processing on those FBs to obtain their rendering results. If the rendering results of the other FBs are not backed up, then when drawing the (N-1)-th frame image the rendering thread uses the rendering result of the main FB of the (N-1)-th frame image together with the rendering results of the other FBs of the N-th frame image.
Under delayed rendering, every FB other than FB0 can trigger the AI thread to perform super-resolution rendering, so the rendering results of all FBs other than FB0 can be obtained based on the high-resolution image data generated by the AI thread. The rendering thread can therefore read from the memory the complete image data of the (N-1)-th frame image at high resolution that is required for drawing the (N-1)-th frame image, and when drawing the (N-1)-th frame image the rendering thread uses the rendering results of the FBs of the (N-1)-th frame image.
In this embodiment, the interaction of image data between the rendering thread and the AI thread occurs after the image resolution is reduced and when the rendering result of the previous frame image is read for drawing that frame. The AI thread may be executed by the NPU, and the image data interaction between the GPU and the NPU may be completed by means of the GPU memory and the CPU memory. The GPU memory may be private memory of the GPU: the CPU and the NPU are prohibited from accessing it, but the GPU may read image data from it by calling a function or the like, and the image data read by the GPU may be written into the CPU memory or used when the GPU draws an image. The GPU memory may therefore serve as both the interaction memory and the working memory of the GPU.
The CPU may designate one block of the CPU memory as the NPU input memory and another block as the NPU output memory. The NPU input memory is used to write the input data used by the NPU for super-resolution rendering, and the NPU output memory is used to write the output data of the NPU's super-resolution rendering; the CPU may set the size of the NPU input memory according to the amount of the NPU's input data and the size of the NPU output memory according to the amount of the NPU's output data. Both the GPU and the NPU can access the CPU memory. The corresponding memory access flow is shown in fig. 11 and may include the following steps:
S200, the CPU takes one storage area in the CPU memory as the NPU input memory and another storage area as the NPU output memory; the sizes of the two storage areas may be determined according to the amount of data processed by the NPU, which is not repeated here.
S201, the CPU sends pointer addresses of NPU input memories to the NPU, and sends pointer addresses of NPU output memories to the NPU.
S202, the GPU writes the image data of the Nth frame image under low resolution into a GPU memory.
S203, the GPU informs the CPU that the image data of the Nth frame image under low resolution is written into the GPU memory.
S204, the CPU informs the GPU to write the image data of the N-th frame image under the low resolution into the NPU input memory and read the image data of the N-1-th frame image under the high resolution from the NPU output memory. And in the notification process, the pointer address of the NPU input memory and the pointer address of the NPU output memory are sent to the GPU.
S205, the GPU reads the image data of the Nth frame image in the low resolution from the GPU memory.
S206, the GPU writes the image data of the Nth frame image under the low resolution into the NPU input memory based on the pointer address of the NPU input memory.
S207, the GPU reads the image data of the N-1 frame image under high resolution from the NPU output memory based on the pointer address of the NPU output memory.
S208, the GPU writes the image data of the N-1 frame image under high resolution into the GPU memory.
S209, the CPU wakes an AI thread, and informs the NPU to read the image data of the Nth frame of image under low resolution from the NPU input memory. After the AI thread is awakened, the AI thread can call the AI super-resolution model to perform super-resolution rendering on the image data of the Nth frame of image under low resolution. Before invoking the AI super-resolution model, the AI thread first acquires the image data of the Nth frame image under the low resolution, so one function of the CPU to wake up the AI thread can be to inform the NPU to read the image data of the Nth frame image under the low resolution from the NPU input memory, and the NPU reads the image data of the Nth frame image under the low resolution and provides the image data for the AI super-resolution model.
S210, the NPU reads image data of an N frame image under low resolution from the NPU input memory based on the pointer address of the NPU input memory.
S211, the NPU writes the image data of the N frame image under high resolution into the NPU output memory based on the pointer address of the NPU output memory. When the GPU draws the Nth frame of image, the GPU reads the image data of the Nth frame of image in high resolution from the NPU output memory and writes the image data into the GPU memory.
After the NPU reads the image data of the Nth frame image under the low resolution, the image data of the Nth frame image under the low resolution is input into an AI super-resolution model running in the NPU, the AI super-resolution model performs super-resolution rendering on the image data of the Nth frame image under the low resolution, and the image data of the Nth frame image under the high resolution is output. And the NPU writes the image data of the N frame image under high resolution into the NPU output memory based on the pointer address of the NPU output memory.
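The pointer-based exchange in steps S200–S211 can be pictured with the following simplified sketch. It is an assumption-laden illustration: the sizing of the regions, the copy helpers, and the commented npuSuperResolve() call stand in for the real transfers, which additionally pass through the GPU memory, driver calls, and the AI super-resolution model.

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Two CPU-memory regions designated by the CPU (S200) and shared by pointer address (S201/S204).
    struct NpuIoMemory {
        std::vector<uint8_t> input;   // NPU input memory: low-resolution image data written by the GPU
        std::vector<uint8_t> output;  // NPU output memory: high-resolution image data written by the NPU
        NpuIoMemory(size_t inputSize, size_t outputSize) : input(inputSize), output(outputSize) {}
    };

    // GPU side (S205-S208): copy frame N's low-res data in, and frame N-1's high-res data out.
    // Assumes lowResSize <= io.input.size() and highResSize <= io.output.size().
    void gpuExchange(NpuIoMemory& io,
                     const uint8_t* lowResFrameN, size_t lowResSize,
                     uint8_t* highResFrameNminus1, size_t highResSize) {
        std::memcpy(io.input.data(), lowResFrameN, lowResSize);          // S206
        std::memcpy(highResFrameNminus1, io.output.data(), highResSize); // S207
    }

    // NPU side (S210-S211): read the input region, super-resolve, write the output region.
    void npuExchange(NpuIoMemory& io) {
        // npuSuperResolve(io.input.data(), io.output.data());  // hypothetical model invocation
        (void)io;
    }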
Combining the memory access flow shown in fig. 11 with the rendering method described in fig. 8 yields another flow of the rendering method, shown in fig. 12-1 and fig. 12-2. Fig. 12-1 and fig. 12-2 show the processing of the GPU, the NPU, and the CPU after FB1 is determined to be the main FB in the (N-2)-th frame image and after the application program issues the rendering commands of the (N-1)-th frame image and the N-th frame image, where the GPU and the CPU run the rendering thread and the NPU runs the AI thread. The flow may include the following steps:
S301, the CPU determines that FB1 is the main FB in the rendering process of the N-2 frame image.
S302, the CPU intercepts a rendering command of an N-1 frame image issued by an application program.
S303, the CPU creates an AI thread, and in the process of creating and initializing the AI thread, the AI super-resolution model can be initialized, and the AI super-resolution model can be ensured to normally run through initialization operation.
S304, the CPU designates a first area in the CPU memory as an NPU input memory and designates a second area as an NPU output memory.
S305, the CPU informs the NPU of the pointer address of the first area and the pointer address of the second area.
S306, the CPU acquires the FB which is high in initial resolution and processed currently of the N-1 frame image based on the rendering command.
S307, the CPU acquires a rendering mode from the configuration file of the application program.
S308, if the rendering mode is forward rendering and FB is FB1, the CPU reduces the initial resolution of the image to low resolution based on the super-division multiple; if the rendering mode is delayed rendering and FB is not FB0 (e.g., FB1 and FB 2), the CPU reduces the initial resolution of the image to a lower resolution based on the super-division multiple.
S309, the CPU informs the GPU to generate the image data of the N-1 frame image under the low resolution.
S309', the CPU notifies the NPU of the high resolution and the low resolution of the N-1 st frame image.
S310, the GPU generates image data of the N-1 frame image under low resolution.
S311, the GPU writes the image data of the N-1 frame image under the low resolution into the GPU memory.
S312, the GPU informs the CPU that the image data of the N-1 frame image under low resolution is written into the GPU memory.
S313, the CPU informs the GPU to write the image data of the N-1 frame image in the low resolution into the first area and read the image data of the N-2 frame image in the high resolution from the second area. The notification sent by the CPU carries pointer addresses of the first area and the second area.
S314, the GPU reads the image data of the N-1 frame image under low resolution from the GPU memory.
S315, the GPU writes the image data of the N-1 frame image in the low resolution into the first area based on the pointer address of the first area.
S316, the GPU backs up, in the GPU memory, the FB0 drawing instruction stream of the N-1 frame image and the attachment resources of all FBs of the N-1 frame image.
And S317, the GPU reads the image data of the N-2 frame image in high resolution from the second area based on the pointer address of the second area.
S318, the GPU writes the image data of the N-2 frame image under high resolution into the GPU memory.
S319, the GPU reads the image data of the N-2 frame image under high resolution, the FB0 drawing instruction stream of the N-2 frame image, and the attachment resources of all FBs of the N-2 frame image from the GPU memory.
S320, the data of the N-2 frame image acquired by the GPU is empty, and the GPU maintains the content displayed on the display screen unchanged.
S321, the GPU informs the CPU of reading the image data of the N-2 frame image under high resolution.
S322, the CPU wakes an AI thread in the NPU, and informs the NPU to read the image data of the N-1 frame image in the low resolution from the first area. The NPU may invoke the AI super-resolution model with the AI thread.
S323, the NPU reads the image data of the N-1 frame image in the low resolution from the first area based on the pointer address of the first area.
S324', NPU determines the super-resolution of the AI super-resolution model.
S324, the NPU performs super-resolution rendering on the image data of the N-1 frame image under the low resolution based on the super-division multiple by utilizing the AI super-resolution model, and obtains the image data of the N-1 frame image under the high resolution.
And S325, the NPU writes the image data of the N-1 frame image in the second area under the high resolution based on the pointer address of the second area.
S326, the CPU intercepts a rendering command of the Nth frame image issued by the application program.
S327, the CPU acquires, based on the rendering command, FB with which the initial resolution of the nth frame image is high resolution and is currently processed.
S328, if the rendering mode is forward rendering and FB is FB1, the CPU reduces the initial resolution of the image to low resolution based on the super-division multiple; if the rendering mode is delayed rendering and FB is not FB0 (e.g., FB1 and FB 2), the CPU reduces the initial resolution of the image to a lower resolution based on the super-division multiple.
S329, the CPU notifies the GPU to generate image data of the nth frame image at low resolution.
S329', the CPU notifies the NPU of the high resolution and the low resolution of the nth frame image.
S330, the GPU generates image data of the Nth frame image under low resolution.
S331, the GPU writes the image data of the Nth frame image under the low resolution into the GPU memory.
S332, the GPU informs the CPU that the image data of the Nth frame image under low resolution is written into the GPU memory.
S333, the CPU notifies the GPU to write the image data of the nth frame image at the low resolution into the first area, and to read the image data of the N-1 frame image at the high resolution from the second area. The notification sent by the CPU carries pointer addresses of the first area and the second area.
S334, the GPU reads the image data of the Nth frame of image under low resolution from the GPU memory.
And S335, the GPU writes the image data of the Nth frame image in the low resolution into the first area based on the pointer address of the first area.
S336, the GPU backs up, in the GPU memory, the FB0 drawing instruction stream of the N frame image and the attachment resources of all FBs of the N frame image.
S337, the GPU reads the image data of the N-1 frame image in high resolution from the second area based on the pointer address of the second area.
S338, the GPU writes the image data of the N-1 frame image under high resolution into the GPU memory.
S339, the GPU reads the image data of the N-1 frame image under high resolution, the FB0 drawing instruction stream of the N-1 frame image, and the attachment resources of all FBs of the N-1 frame image from the GPU memory.
S340, the GPU draws the N-1 frame image based on the image data of the N-1 frame image under high resolution, the FB0 drawing instruction stream of the N-1 frame image, and the attachment resources of all FBs of the N-1 frame image.
S341, the GPU informs the CPU of reading the image data of the N-1 frame image under high resolution.
S342, the CPU wakes an AI thread in the NPU, and informs the NPU to read the image data of the Nth frame image in the low resolution from the first area. The NPU may invoke the AI super-resolution model with the AI thread.
S343, the NPU reads the image data of the Nth frame image in the low resolution from the first area based on the pointer address of the first area.
S344', the NPU determines the super-resolution of the AI super-resolution model.
And S344, the NPU performs super-resolution rendering on the image data of the Nth frame image under the low resolution based on the super-division multiple by utilizing the AI super-resolution model to obtain the image data of the Nth frame image under the high resolution.
And S345, the NPU writes the image data of the Nth frame image in the second area under the high resolution based on the pointer address of the second area.
As can be seen from fig. 12-1 and fig. 12-2, the CPU determines FB1 as the main FB during the rendering of the (N-2)-th frame image and implements the rendering method provided in this application after receiving the rendering command of the (N-1)-th frame image: the (N-2)-th frame image continues to be displayed in response to the rendering command of the (N-1)-th frame image, the (N-1)-th frame image is displayed in response to the rendering command of the N-th frame image, and the image data of the (N-1)-th frame image at high resolution has already been computed before the rendering command of the N-th frame image is responded to, so the rendering speed can be increased at the cost of delaying the display by one frame.
In the memory access process shown in fig. 11, for both the low-resolution image data and the high-resolution image data, the GPU and the NPU complete their interaction by means of the GPU memory and the CPU memory, so the image data are copied back and forth between the GPU memory and the CPU memory. This copying prolongs the time for the GPU and the NPU to acquire the data and therefore increases the rendering time.
In view of this problem, in this embodiment, the GPU, the CPU, and the NPU may perform data interaction in a shared memory manner, where the shared memory refers to a memory that can be accessed by the GPU, the CPU, and the NPU. The shared memory may serve as an external cache for the GPU, storing at least low resolution image data input by the GPU to the NPU, and the shared memory may serve as a processing cache for the NPU, storing at least high resolution image data output by the NPU.
The memory access flow of the GPU and the NPU using the shared memory is shown in fig. 13, and may include the following steps:
S401, the CPU applies for the shared memory from a Hardware Buffer area (Hardware Buffer). The shared memory includes a first area as a storage space for low-resolution image data and a second area as a storage space for high-resolution image data.
The CPU may request the shared memory from a Hardware Buffer of random access memory (random access memory, RAM). In one shared memory application mode, the CPU applies for the storage space of a GPU SSBO (Shader Storage Buffer Object) from the Hardware Buffer; the low-resolution image data is obtained from the Fragment Shader in the GPU or the Compute Shader in the GPU and then written into the GPU SSBO. The high-resolution image data output by the NPU may also be written to the shared memory.
When the Hardware Buffer is used as the shared memory, the CPU, the GPU and the NPU have certain requirements on the FORMAT of the Hardware Buffer when accessing it. For example, when the shared memory is applied for as a memory space accessible by the Fragment Shader in the GPU or the Compute Shader in the GPU, the FORMAT of the Hardware Buffer may be designated as AHARDWAREBUFFER_FORMAT_BLOB.
In another shared memory application mode, the CPU may apply for two shared memories from the Hardware Buffer, where one shared memory is used to store the low-resolution image data obtained by the GPU and serves as input data of the NPU; the other shared memory is used to store the output data of the NPU, such as the high-resolution image data obtained by super-resolution rendering, and serves as input data of the GPU in UI rendering. The FORMATs of the two shared memories applied for by the CPU are the same, for example, AHARDWAREBUFFER_FORMAT_BLOB. The size of the applied shared memory may be determined according to the resolution of the image, namely width × height × number of channels × bytes per value, where the width and the height are determined by the resolution of the image. For example, if the image data is of float type (a float variable occupies 4 bytes) and each pixel has 3 channels, the size of the shared memory for a 1080P image is 1920 × 1080 × 3 × 4 bytes. A sketch of such an allocation is given after this flow.
S402, the CPU informs the GPU of pointer addresses of the first area and the second area.
S403, the CPU informs the NPU of pointer addresses of the first area and the second area.
S404, the GPU writes the image data of the Nth frame image at the low resolution into the first area based on the pointer address of the first area. After the GPU completes the writing of the image data, the CPU can learn of this synchronously, so the GPU does not need to send a notification to the CPU.
S405, the GPU reads the image data of the N-1 frame image at the high resolution from the second area based on the pointer address of the second area.
S406, the CPU informs the NPU to read the image data of the Nth frame image in the low resolution from the first area.
S407, the NPU reads image data of the nth frame image at the low resolution from the first area based on the pointer address of the first area.
S408, the NPU writes the image data of the Nth frame image at the high resolution into the second area based on the pointer address of the second area. After the NPU completes the writing of the image data, the CPU can learn of this synchronously, so the NPU does not need to send a notification to the CPU.
If the shared memory uses a single area to store both the low-resolution image data and the high-resolution image data, the high-resolution image data is read first and the low-resolution image data is written afterwards. If the CPU applies for two shared memories from the Hardware Buffer, one shared memory is used to store the low-resolution image data obtained by the GPU and serves as the input data of the NPU, and the other shared memory is used to store the output data of the NPU, such as the high-resolution image data obtained by super-resolution rendering, which serves as the input data of the GPU in UI rendering; the CPU sends the pointer addresses of the two shared memories to the GPU and the NPU so that the GPU and the NPU can complete the reading and writing of image data in the shared memories. With the memory access flow shown in figure 13, the GPU and the NPU share data through the shared memory, zero copy of the shared data between the GPU and the NPU is achieved, and the processing efficiency is improved.
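As a concrete illustration of applying for such a shared memory, the following sketch allocates one area sized for a 1080P float image with 3 channels. It assumes the Hardware Buffer corresponds to the Android NDK AHardwareBuffer API; the usage flags and the function name are assumptions made for illustration, not details taken from this application.

```cpp
#include <android/hardware_buffer.h>
#include <cstdint>

// Size of one area for a 1080P image: width x height x channels x 4 bytes per float.
constexpr uint32_t kWidth    = 1920;
constexpr uint32_t kHeight   = 1080;
constexpr uint32_t kChannels = 3;
constexpr uint32_t kAreaSize = kWidth * kHeight * kChannels * 4u;

// Allocates one shared area (e.g. the first area for low-resolution data or the
// second area for high-resolution data) as a BLOB-format hardware buffer.
AHardwareBuffer* AllocateSharedArea() {
    AHardwareBuffer_Desc desc = {};
    desc.width  = kAreaSize;                        // BLOB buffers are sized in bytes
    desc.height = 1;
    desc.layers = 1;
    desc.format = AHARDWAREBUFFER_FORMAT_BLOB;      // format usable as an SSBO by shaders
    desc.usage  = AHARDWAREBUFFER_USAGE_GPU_DATA_BUFFER |
                  AHARDWAREBUFFER_USAGE_CPU_READ_RARELY |
                  AHARDWAREBUFFER_USAGE_CPU_WRITE_RARELY;

    AHardwareBuffer* buffer = nullptr;
    if (AHardwareBuffer_allocate(&desc, &buffer) != 0) {
        return nullptr;                             // allocation failed
    }
    return buffer;                                  // handle later shared with the GPU and the NPU
}
```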
Combining the memory access flow shown in fig. 13 with the rendering method shown in fig. 8 yields another flow of the rendering method, shown in fig. 14-1 and fig. 14-2. Fig. 14-1 and fig. 14-2 show the processing of the GPU, the NPU and the CPU after FB1 is determined to be the main FB in the N-2 frame image and after the application program issues the rendering commands of the N-1 frame image and the Nth frame image, where the GPU and the CPU run rendering threads and the NPU runs an AI thread. The difference from fig. 12-1 and fig. 12-2 above is that the image data is stored in the shared memory, while the FB0 drawing instruction stream and the accessory resources are backed up in the GPU memory. The flow may include the following steps:
S501, the CPU determines that FB1 is the main FB in the rendering process of the N-2 frame image.
S502, the CPU intercepts a rendering command of an N-1 frame image issued by an application program.
S503, the CPU creates an AI thread. In the process of creating and initializing the AI thread, the AI super-resolution model may be initialized; the initialization operation ensures that the AI super-resolution model can run normally.
S504, the CPU applies for a shared memory, wherein the shared memory comprises a first area and a second area, the first area is used as a storage space of low-resolution image data, and the second area is used as a storage space of high-resolution image data.
S505, the CPU informs the NPU and the GPU of pointer addresses of the first area and the second area.
S506, the CPU acquires, based on the rendering command, the initial resolution (high resolution) of the N-1 frame image and the FB currently being processed.
S507, the CPU obtains a rendering mode from the configuration file of the application program.
S508, if the rendering mode is forward rendering and the FB is FB1, the CPU reduces the initial resolution of the image to the low resolution based on the super-division multiple; if the rendering mode is delayed rendering and the FB is not FB0 (e.g., FB1 and FB2), the CPU reduces the initial resolution of the image to the low resolution based on the super-division multiple.
S509, the CPU informs the GPU to generate the image data of the N-1 frame image under the low resolution.
S509', the CPU notifies the NPU of the high resolution and the low resolution of the N-1 st frame image.
S510, the GPU generates image data of the N-1 frame image under low resolution.
S511, the GPU writes the image data of the N-1 frame image in the low resolution into the first area based on the pointer address of the first area.
S512, the GPU backs up the FB0 drawing instruction stream of the N-1 frame image and all FB accessory resources of the N-1 frame image in the GPU memory.
S513, the CPU informs the GPU to read the image data of the N-2 frame image in the high resolution from the second area.
S514, the GPU reads the image data of the N-2 frame image at high resolution from the second area based on the pointer address of the second area.
S515, the GPU reads the FB0 drawing instruction stream of the N-2 frame image and all FB accessory resources of the N-2 frame image from the GPU memory.
S516, the data of the N-2 frame image acquired by the GPU is empty, so the GPU keeps the content displayed on the display screen unchanged.
S517, the GPU informs the CPU that the reading of the image data of the N-2 frame image at high resolution is complete.
S518, the CPU wakes an AI thread in the NPU, and informs the NPU to read the image data of the N-1 frame image under low resolution from the shared memory. The NPU may invoke the AI super-resolution model with the AI thread.
S519, the NPU reads image data of the N-1 th frame image at the low resolution from the first area based on the pointer address of the first area.
S520', the NPU determines the super-division multiple of the AI super-resolution model.
S520, the NPU performs super-resolution rendering on the image data of the N-1 frame image under low resolution based on super-division multiple by using an AI super-resolution model to obtain the image data of the N-1 frame image under high resolution.
S521, the NPU writes the image data of the N-1 frame image in the second area under the high resolution based on the pointer address of the second area.
S522, the CPU intercepts a rendering command of an Nth frame image issued by an application program.
S523, the CPU acquires, based on the rendering command, the initial resolution (high resolution) of the Nth frame image and the FB currently being processed.
S524, if the rendering mode is forward rendering and the FB is FB1, the CPU reduces the initial resolution of the image to the low resolution based on the super-division multiple; if the rendering mode is delayed rendering and the FB is not FB0 (e.g., FB1 and FB2), the CPU reduces the initial resolution of the image to the low resolution based on the super-division multiple. A sketch of this reduction is given after this flow.
S525, the CPU informs the GPU to generate the image data of the Nth frame image at the low resolution.
S525', the CPU notifies the NPU of the high resolution and the low resolution of the nth frame image.
S526, the GPU generates image data of the Nth frame image under low resolution.
S527, the GPU writes the image data of the Nth frame image in the low resolution into the first area based on the pointer address of the first area.
S528, the GPU backs up the FB0 drawing instruction stream of the Nth frame image and all the FB accessory resources of the Nth frame image in the GPU memory.
S529, the CPU informs the GPU to read the image data of the N-1 frame image from the second area under the high resolution.
S530, the GPU reads the image data of the N-1 frame image at high resolution from the second area based on the pointer address of the second area.
S531, the GPU reads the FB0 drawing instruction stream of the N-1 frame image and all FB accessory resources of the N-1 frame image from the GPU memory.
S532, the GPU draws the N-1 frame image based on the image data of the N-1 frame image at high resolution, the FB0 drawing instruction stream of the N-1 frame image and all FB accessory resources of the N-1 frame image, so the display is switched from the N-2 frame image to the N-1 frame image, even though the rendering command issued by the application program is for the Nth frame image.
S533, the GPU informs the CPU that the reading of the image data of the N-1 frame image at high resolution is complete.
S534, the CPU wakes an AI thread in the NPU, and informs the NPU to read the image data of the Nth frame image under low resolution.
S535, the NPU reads image data of the nth frame image at the low resolution from the first area based on the pointer address of the first area.
S536', the NPU determines the super-division multiple of the AI super-resolution model.
S536, the NPU performs super-resolution rendering on the image data of the Nth frame image under the low resolution based on the super-division multiple by utilizing the AI super-resolution model, so as to obtain the image data of the Nth frame image under the high resolution.
S537, the NPU writes the image data of the Nth frame image at high resolution into the second area based on the pointer address of the second area.
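To make the relationship between the high resolution, the low resolution and the super-division multiple concrete, the following sketch gives one plausible reading of steps S508/S524 and S520'/S536'. It assumes the super-division multiple is applied per axis; the type and function names are hypothetical rather than taken from this application.

```cpp
#include <cstdint>

// Hypothetical resolution type used only for this illustration.
struct Resolution {
    uint32_t width;
    uint32_t height;
};

// S508/S524 (assumed reading): the CPU derives the low (render) resolution
// from the initial high resolution and the super-division multiple.
Resolution ReduceBySuperDivision(Resolution high, uint32_t superDivision) {
    return { high.width / superDivision, high.height / superDivision };
}

// S520'/S536' (assumed reading): the NPU recovers the super-division multiple
// from the high and low resolutions notified by the CPU (S509'/S525').
uint32_t SuperDivisionMultiple(Resolution high, Resolution low) {
    return high.width / low.width;  // e.g. 1920x1080 over 960x540 gives 2
}
```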
In addition, the AI thread provided in this embodiment has four states, i.e., a sleep state, a running state, a run-complete state, and a run-error state; the relationship between the four states is shown in fig. 15. When the AI thread is in the sleep state, the rendering thread may wake up the AI thread directly, so that the AI thread enters the running state. When the AI thread is in the running state, the rendering thread waits for the AI thread to finish running: if the AI thread runs overtime, it is switched to the run-error state and forced to exit; if the AI thread does not run overtime, it is switched to the run-complete state. When the AI thread is in the run-error state, the AI thread is forced to exit. After the AI thread is forced to exit, the rendering results of subsequent images are calculated by the rendering thread.
After the AI thread is switched to the run-complete state, the rendering thread continues to monitor the AI thread; if the time the AI thread stays in the run-complete state is longer than a preset duration, the AI thread is switched from the run-complete state to the sleep state. The preset duration indicates that the AI thread has not performed rendering processing for a period of time; although idle, the thread is still running and occupies part of the resources, so in this case the AI thread is switched to the sleep state to reduce the occupation of resources. The value of the preset duration is not limited in this embodiment. After the AI thread has exited for a period of time, or after the rendering thread has performed rendering processing on x frames of images, the rendering thread may wake the AI thread again and the AI thread continues the rendering processing; if the rendering processing of the AI thread still times out after it is woken up again, the rendering thread may disable the AI thread, or wait until the application program is started next time and then wake the AI thread again.
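The transitions above can be captured in a small state machine. The following sketch is one plausible formulation of the four states of fig. 15 and of the decisions made by the rendering thread; the enum and parameter names are hypothetical.

```cpp
#include <chrono>

// The four AI-thread states of fig. 15 (names are illustrative).
enum class AiThreadState {
    Sleep,        // the rendering thread may wake the AI thread directly
    Running,      // the AI thread is invoking the AI super-resolution model
    RunComplete,  // the invocation finished; the thread idles until put to sleep
    RunError      // the invocation timed out; the AI thread is forced to exit
};

// One monitoring step of the rendering thread over the AI thread.
AiThreadState NextState(AiThreadState state,
                        bool wakeRequested,
                        bool runTimedOut,
                        bool runFinished,
                        std::chrono::milliseconds idleTime,
                        std::chrono::milliseconds presetDuration) {
    switch (state) {
        case AiThreadState::Sleep:
            return wakeRequested ? AiThreadState::Running : state;
        case AiThreadState::Running:
            if (runTimedOut) return AiThreadState::RunError;    // forced exit follows
            if (runFinished) return AiThreadState::RunComplete;
            return state;
        case AiThreadState::RunComplete:
            // Idle longer than the preset duration: sleep to release resources.
            return (idleTime > presetDuration) ? AiThreadState::Sleep : state;
        case AiThreadState::RunError:
            return state;  // the AI thread exits; later frames use the rendering thread
    }
    return state;
}
```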
Some embodiments of the present application provide an electronic device, the electronic device including: the system comprises a first processor, a second processor, a third processor and a memory; the memory is used for storing one or more computer program codes, and the computer program codes comprise computer instructions, when the first processor, the second processor and the third processor execute the computer instructions, the first processor, the second processor and the third processor execute the rendering method.
Some embodiments of the present application provide a chip system including program code that, when run on an electronic device, causes a first processor, a second processor, and a third processor in the electronic device to perform the above-described rendering method.
Some embodiments of the present application provide a processor, the processor being a second processor, the second processor including a processing unit and a memory; wherein the memory is configured to store one or more computer program code comprising computer instructions that, when executed by the second processor, cause the second processor to perform the rendering method described above.
Some embodiments of the present application provide a computer storage medium comprising computer instructions that, when run on an electronic device, cause a second processor in the electronic device to perform the above-described rendering method.
The present embodiment also provides a control device comprising one or more processors, a memory for storing one or more computer program code comprising computer instructions which, when executed by the one or more processors, perform the above method. The control device may be an integrated circuit IC or a system on chip SOC. The integrated circuit can be a general-purpose integrated circuit, a field programmable gate array FPGA, or an application specific integrated circuit ASIC.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the modules or units is merely a logical functional division, and there may be other divisions in actual implementation, for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed between the parts may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present embodiment may be essentially or a part contributing to the prior art or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the method described in the respective embodiments. And the aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic or optical disk, and the like.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (40)

1. A rendering method applied to a rendering process of a first image by an electronic device, the electronic device running an application, the electronic device including a first processor, a second processor, and a third processor, the method comprising:
the first processor receives a rendering command for the first image issued by the application program;
the third processor performs rendering processing on the first image to obtain a processing result of the first image;
and in the process of rendering the first image by the third processor, the second processor draws a second image.
2. The method of claim 1, wherein the electronic device further comprises a display screen, and wherein prior to the first processor receiving the rendering command for the first image issued by the application, the method further comprises:
the first processor determines a first frame buffer in a third image processing process, wherein the first frame buffer is a frame buffer with the number of executed drawing instructions larger than a preset threshold value in all frame buffers issued by the application program, the third image is a previous frame image of the second image, and the second image is a previous frame image of the first image;
The second processor draws the third image to display the third image on the display screen;
the first processor receives a rendering command for the second image issued by the application program;
the third processor performs rendering processing on the second image to obtain a processing result of the second image;
and in the process of rendering the second image by the third processor, the second processor controls the display screen to continuously display the third image.
3. The method of claim 1 or 2, wherein the rendering command of the first image is used to instruct the second processor to render the first image based on a first resolution;
the third processor performs rendering processing on the first image, and before the processing result of the first image is obtained, the method further includes:
the first processor sends a rendering instruction to the second processor, wherein the rendering instruction is used for instructing the second processor to render the first image;
the second processor generates image data of the first image at a second resolution based on the rendering instructions, the second resolution being no greater than the first resolution;
The second processor writes image data of the first image at the second resolution into a first region of a first memory;
the third processor reads image data of the first image at the second resolution from the first region;
the third processor performs rendering processing on the first image, and the processing result of the first image is obtained, which includes: the third processor generates image data of the first image at the third resolution based on image data of the first image at the second resolution, the third resolution being greater than the second resolution.
4. A method according to claim 3, wherein the third processor generates image data of the first image at the third resolution based on image data of the first image at the second resolution, the method further comprising: the third processor writes image data of the first image at the third resolution to a second region of the first memory.
5. The method of claim 3 or 4, wherein the second processor writes the image data of the first image at the second resolution to a first region of a first memory, the method further comprising:
The second processor writes all frame-buffered accessory resources of the first image and a last frame-buffered drawing instruction stream of the first image into a second memory, the second processor having access to the second memory.
6. The method of any of claims 3 to 5, wherein the second processor writes the image data of the first image at the second resolution to a first region of a first memory, and wherein the third processor reads the image data of the first image at the second resolution from the first region, the method further comprising:
the first processor sends a first notification to the third processor, the first notification being for instructing the third processor to read image data of the first image at the second resolution from the first region.
7. The method of any of claims 3 to 6, wherein before the first processor sends rendering instructions to the second processor, the method further comprises:
the first processor allocates the first memory from a hardware buffer, wherein the first memory comprises the first area and the second area;
The first processor sends the pointer address of the first area and the pointer address of the second area to the third processor and the second processor, the first processor, the second processor and the third processor have the right to access the first memory, the third processor and the second processor perform reading and writing of image data at low resolution in the first area based on the pointer address of the first area, and perform reading and writing of image data at high resolution in the second area based on the pointer address of the second area.
8. The method of any of claims 3 to 7, wherein prior to the second processor rendering the second image, the method further comprises:
the first processor sends a second notification to the second processor, the second notification being for instructing the second processor to read image data of the second image from the second region at the third resolution.
9. The method of any one of claims 3 to 8, wherein the second processor rendering a second image during a rendering process of the first image by a third processor comprises:
After the second processor writes the image data of the first image at the second resolution into a first area of a first memory, the second processor reads the image data of the second image at the third resolution from the second area;
the second processor reads all frame-buffered accessory resources of the second image and the last frame-buffered drawing instruction stream of the second image from a second memory, and the second processor has the right to access the second memory;
the second processor renders the second image based on image data of the second image at the third resolution, attachment resources of all frame buffers of the second image, and a rendering instruction stream of a last frame buffer of the second image.
10. The method of claim 3 or 4, wherein before the first processor sends rendering instructions to the second processor, the method further comprises:
the first processor allocates the first memory from the memory of the first processor, wherein the first memory comprises the first area and the second area;
the first processor sends the pointer address of the first area and the pointer address of the second area to the third processor, the first processor and the third processor have the right to access the first memory, the third processor performs reading of image data at low resolution in the first area based on the pointer address of the first area, and performs writing of image data at high resolution in the second area based on the pointer address of the second area.
11. The method of claim 10, wherein the second processor, after generating image data of the first image at a second resolution based on the rendering instructions, writes the image data of the first image at the second resolution to a first region of a first memory, the method further comprising:
the second processor writes the image data of the first image under the second resolution into a second memory, and the second processor has the right to access the second memory;
the second processor sends a third notification to the first processor, wherein the third notification is used for indicating that the image data of the first image under the second resolution is successfully written into the second memory;
in response to receiving the third notification, the first processor sends a fourth notification to the second processor, the fourth notification being for instructing the second processor to write image data of the first image at the second resolution into the first region, the fourth notification carrying an address pointer for the first region;
in response to receiving the fourth notification, the second processor reads image data of the first image at a second resolution from the second memory.
12. The method of any one of claims 3, 4, 10 and 11, wherein prior to the second processor rendering the second image, the method further comprises:
the first processor sends a fifth notification to the second processor, wherein the fifth notification is used for indicating the second processor to read the image data of the second image at the third resolution from the second area, and the fifth notification carries an address pointer of the second area;
in response to the fifth notification, the second processor reads image data of the second image at the third resolution from the second region;
and the second processor writes the image data of the second image in the third resolution into a second memory, and the second processor has the right to access the second memory.
13. The method of any one of claims 3, 4, 10, 11, and 12, wherein the second processor rendering a second image during the rendering of the first image by a third processor comprises:
the second processor reads image data of the second image under the third resolution, accessory resources of all frame buffers of the second image and a drawing instruction stream of the last frame buffer of the second image from a second memory, and the second processor has the right of accessing the second memory;
The second processor renders the second image based on image data of the second image at the third resolution, attachment resources of all frame buffers of the second image, and a rendering instruction stream of a last frame buffer of the second image.
14. The method of claim 9 or 13, wherein the second processor reads the image data of the second image at the third resolution, the method further comprising:
the second processor sending a sixth notification to the first processor, the sixth notification being used for indicating that the second processor has completed the reading of the image data of the second image at the third resolution;
in response to the sixth notification, the first processor wakes a first thread in the third processor, the first thread may invoke an artificial intelligence super-resolution model for super-resolution rendering of image data of the first image at the second resolution, generating image data of the first image at the third resolution.
15. The method of claim 14, wherein the first processor wakes up a first thread in the third processor, the method further comprising:
The first processor sends a first resolution of the first image and a second resolution of the first image to the third processor;
the third processor determines a super-division multiple of the artificial intelligence super-resolution model based on the first resolution of the first image and the second resolution of the first image, and the artificial intelligence super-resolution model performs super-resolution rendering on the image data of the first image at the second resolution based on the super-division multiple.
16. The method according to claim 14 or 15, characterized in that the method further comprises: the first processor initializes the artificial intelligence super-resolution model, and the initialization is used for determining to run the artificial intelligence super-resolution model and determining the normal running of the artificial intelligence super-resolution model;
the initialization includes runtime detection, model loading, model compiling, and memory configuration, wherein the runtime detection is used for determining to run the artificial intelligence super-resolution model, and the model loading, model compiling, and memory configuration are used for determining the normal running of the artificial intelligence super-resolution model.
17. The method of any of claims 10 to 12, wherein after the first processor wakes up a first thread in the third processor, the method further comprises: the third processor monitors the running of the first thread in the process that the first thread calls the artificial intelligence super-resolution model to generate image data of the first image at the third resolution;
and after the third processor monitors that the first thread has finished the invocation of the artificial intelligence super-resolution model, the first thread is made dormant so as to switch the first thread from a running state to a dormant state, wherein the first thread finishes the invocation of the artificial intelligence super-resolution model after the artificial intelligence super-resolution model generates the image data of the first image at the third resolution.
18. The method of claim 17, wherein the third processor monitoring that the first thread has finished invoking the artificial intelligence super-resolution model comprises:
after the third processor monitors that the first thread has finished the invocation of the artificial intelligence super-resolution model, the first thread is controlled to be switched from the running state to the running finishing state;
and the third processor monitors that the duration of the first thread in the running finishing state is longer than a preset duration, and sleeps the first thread so as to switch the first thread from the running finishing state to a sleep state.
19. The method of claim 17 or 18, wherein the third processor monitors the running of the first thread, the method further comprising: the third processor monitors that an error occurs when the first thread calls the artificial intelligence super-resolution model, and forces the first thread to exit;
the third processor sends a notification to the second processor, the notification instructing the second processor to render the first image.
20. The method of claim 19, wherein the third processor monitoring that an error occurs when the first thread calls the artificial intelligence super-resolution model and forcing the first thread to exit comprises:
the third processor monitors that the rendering processing of the first image by the artificial intelligence super-resolution model has timed out, and forces the first thread to exit.
21. The method of claim 19 or 20, wherein the third processor monitors for an error in the invocation of the artificial intelligence super-resolution model by the first thread, and wherein after forcing the first thread to exit, the method further comprises: after the first processor monitors that the first thread has exited and a preset condition is met, the first processor wakes up the first thread again.
22. The method of any of claims 3 to 21, wherein prior to the first processor sending the rendering instructions to the second processor, the method further comprises:
the first processor reduces the resolution of the first image from the first resolution to the second resolution.
23. The method of claim 22, wherein the third processor has a super-division multiple for indicating a difference between the second resolution and the third resolution; the third resolution is the same as the first resolution;
the first processor reducing the resolution of the first image from the first resolution to the second resolution includes: the first processor reduces the first resolution to the second resolution based on the super-division multiple.
24. The method according to any one of claims 3 to 23, wherein if the rendering mode of the application program is forward rendering, the rendering instructions correspond to a first frame buffer, and the number of rendering instructions executed by the first frame buffer is greater than a preset threshold;
and if the rendering mode of the application program is delayed rendering, the rendering instruction corresponds to the frame buffer except the last frame buffer in all frame buffers issued by the application program.
25. The method of claim 24, wherein the first frame buffer is the frame buffer having the greatest number of rendering instructions executed among all the frame buffers.
26. The method of claim 24 or 25, wherein before the first processor sends rendering instructions to the second processor, the method further comprises:
the first processor obtains the rendering mode of the application program from the configuration file of the application program.
27. The method of any of claims 3 to 26, wherein the rendering instructions are to instruct the second processor to render the first image based on the second resolution, the second resolution being less than the first resolution.
28. The method of any one of claims 3 to 27, wherein the second resolution is less than the first resolution;
the third resolution is the same as the first resolution, or the third resolution is greater than the first resolution.
29. The method of any one of claims 3 to 28, wherein the second resolution is equal to the first resolution.
30. The method of any one of claims 1 to 29, wherein the third processor is a neural network processor or a digital signal processor.
31. A rendering method, characterized in that the method is applied to a second processor of an electronic device, wherein the electronic device further comprises a first processor and a third processor, the electronic device runs an application program, and the application program sends a rendering command for a first image to the first processor; the method comprises the following steps:
and in the process of rendering the first image by the third processor, the second processor draws a second image.
32. The method of claim 31, wherein the electronic device further comprises a display screen, wherein the first processor determines a first frame buffer during a third image processing process, the first frame buffer being a frame buffer in which the number of rendering instructions performed in all frame buffers issued by the application is greater than a preset threshold, the third image being a previous frame image of the second image, the second image being a previous frame image of the first image; the application program sends a rendering command for the second image to the first processor, and before the second processor draws the second image, the method further includes:
The second processor draws the third image to display the third image on the display screen;
and in the process of rendering the second image by the third processor, the second processor controls the display screen to continuously display the third image.
33. The method of claim 31 or 32, wherein the rendering command of the first image is for instructing the second processor to render the first image based on a first resolution; before the second processor renders the second image, the method further comprises:
the second processor receives a rendering instruction sent by the first processor, wherein the rendering instruction is used for indicating the second processor to render the first image;
the second processor generates image data of the first image at a second resolution based on the rendering instructions, the second resolution being no greater than the first resolution;
the second processor writes image data of the first image at the second resolution into a first region of a first memory;
the second processor reads image data of the second image at a third resolution from a second area of the first memory, the third resolution being greater than the second resolution, the image data of the second image at the third resolution being used to render the second image.
34. The method of claim 33, wherein the second processor writes the image data of the first image at the second resolution to a first region of a first memory, the method further comprising:
the second processor writes all frame-buffered accessory resources of the first image and a last frame-buffered drawing instruction stream of the first image into a second memory, the second processor having access to the second memory.
35. The method of claim 33 or 34, wherein the second processor, after reading the image data of the second image at the third resolution from the second region of the first memory, draws the second image comprises:
the second processor reads all frame-buffered accessory resources of the second image and the last frame-buffered drawing instruction stream of the second image from a second memory, and the second processor has the right to access the second memory;
the second processor renders the second image based on image data of the second image at the third resolution, attachment resources of all frame buffers of the second image, and a rendering instruction stream of a last frame buffer of the second image.
36. The method of claim 33 or 34, wherein the second processor reads the image data of the second image at a third resolution from a second region of the first memory, the method further comprising:
the second processor writes the image data of the second image under the third resolution into a second memory, and the second processor has the right of accessing the second memory;
the second processor rendering a second image includes: the second processor reads image data of the second image at the third resolution, accessory resources of all frame buffers of the second image and a drawing instruction stream of a last frame buffer of the second image from the second memory;
the second processor renders the second image based on image data of the second image at the third resolution, attachment resources of all frame buffers of the second image, and a rendering instruction stream of a last frame buffer of the second image.
37. An electronic device, the electronic device comprising: the system comprises a first processor, a second processor, a third processor and a memory; wherein the memory is for storing one or more computer program code comprising computer instructions which, when executed by the first processor, the second processor and the third processor, perform the rendering method of any one of claims 1 to 30.
38. A chip system comprising program code which, when run on an electronic device, causes a first processor, a second processor and a third processor in the electronic device to perform the rendering method of any one of claims 1 to 30.
39. A processor, wherein the processor is a second processor, the second processor comprising a processing unit and a memory; wherein the memory is for storing one or more computer program code comprising computer instructions which, when executed by the second processor, perform the rendering method of any one of claims 31 to 36.
40. A computer storage medium comprising computer instructions which, when run on an electronic device, cause a second processor in the electronic device to perform the rendering method of any one of claims 31 to 36.
CN202111552338.4A 2021-11-17 2021-12-17 Rendering method and device Pending CN116137675A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111362400 2021-11-17
CN2021113624003 2021-11-17

Publications (1)

Publication Number Publication Date
CN116137675A true CN116137675A (en) 2023-05-19

Family

ID=86333304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111552338.4A Pending CN116137675A (en) 2021-11-17 2021-12-17 Rendering method and device

Country Status (1)

Country Link
CN (1) CN116137675A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116681575A (en) * 2023-07-27 2023-09-01 南京砺算科技有限公司 Graphics processing unit, graphics rendering method, storage medium, and terminal device
CN116681575B (en) * 2023-07-27 2023-12-19 南京砺算科技有限公司 Graphics processing unit, graphics rendering method, storage medium, and terminal device

Similar Documents

Publication Publication Date Title
US7941645B1 (en) Isochronous pipelined processor with deterministic control
US6885374B2 (en) Apparatus, method and system with a graphics-rendering engine having a time allocator
US7173627B2 (en) Apparatus, method and system with a graphics-rendering engine having a graphics context manager
US6731293B2 (en) Image output device and image output control method
KR19980025110A (en) Data processor and graphics processor
US20020004860A1 (en) Faster image processing
US7760205B2 (en) Information processing apparatus for efficient image processing
CN114998087B (en) Rendering method and device
US20160260246A1 (en) Providing asynchronous display shader functionality on a shared shader core
CN116137675A (en) Rendering method and device
US20060061579A1 (en) Information processing apparatus for efficient image processing
JP3683657B2 (en) Graphics display device and graphics processor
US20140362094A1 (en) System, method, and computer program product for recovering from a memory underflow condition associated with generating video signals
WO2018058368A1 (en) System performance improvement method, system performance improvement device and display device
US20220261945A1 (en) Data processing system, data processing method, and computer program
WO2023087827A9 (en) Rendering method and apparatus
JP4137903B2 (en) Graphics display device and graphics processor
CN113674132B (en) Method for managing rendering back end by detecting display card capability switching window
US20060082580A1 (en) Method and apparatus for triggering frame updates
WO2021121142A1 (en) Method for controlling animation refresh request, device, computer apparatus, and storage medium
US7046227B2 (en) System and method for continuously tracing transfer rectangles for image data transfers
CN116991600B (en) Method, device, equipment and storage medium for processing graphic call instruction
JP3454113B2 (en) Graphics display
WO2023065100A1 (en) Power optimizations for sequential frame animation
WO2023230744A1 (en) Display driver thread run-time scheduling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination