WO2020237460A1 - Graphics processing method and device - Google Patents

Graphics processing method and device

Info

Publication number
WO2020237460A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual address
address space
virtual
space
format
Prior art date
Application number
PCT/CN2019/088565
Other languages
English (en)
French (fr)
Inventor
姚刚
陈平
汪明
吴刚
罗志强
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201980078870.1A (CN113168322A)
Priority to EP19931211.7A (EP3964949B1)
Priority to PCT/CN2019/088565 (WO2020237460A1)
Publication of WO2020237460A1
Priority to US17/534,462 (US20220083367A1)

Classifications

    • G06F12/1009 — Address translation using page tables, e.g. page table structures
    • G06F12/109 — Address translation for multiple virtual address spaces, e.g. segmentation
    • G06F9/45558 — Hypervisor-specific management and integration aspects
    • G06T1/20 — Processor architectures; Processor configuration, e.g. pipelining
    • G06T1/60 — Memory management
    • G06F12/1036 — Address translation using a translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
    • G06F12/1081 — Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G06F2009/45583 — Memory management, e.g. access or allocation
    • G06F2212/1024 — Latency reduction
    • G06F2212/151 — Emulated environment, e.g. virtual machine
    • G06F2212/401 — Compressed data
    • G06F2212/455 — Image or video data
    • G06F2212/651 — Multi-level translation tables
    • G06F2212/656 — Address space sharing
    • G06F2212/657 — Virtual address space management

Definitions

  • This application relates to the field of chip technology, in particular to a graphics processing method and device.
  • GPU Graphics Processing Unit
  • A GPU can be used to perform the complex mathematical and geometric calculations necessary for graphics rendering. Its basic working principle is to take a three-dimensional vertex model, apply the corresponding perspective transformations, sample the textures associated with those vertices, and write the rendered result to the frame buffer.
  • The buffer from which the GPU samples material is usually called the texture buffer.
  • A GPU can only sample texture buffers in certain formats, and can likewise only render to target buffers in certain formats. In some scenarios, the data format stored in a buffer is not natively supported by the GPU.
  • The current common approach is to explicitly convert data from a GPU non-native format into the GPU-native format and store it in an intermediate buffer for the GPU to sample.
  • Conversely, for output, the GPU first renders its results in the native format, and the native-format data is then explicitly converted into the non-native format and stored in an intermediate buffer.
  • This approach requires applying for an additional buffer, consumes memory, introduces extra latency, wastes bandwidth and power, and is costly.
  • The embodiments of the present application provide a graphics processing method and device, which can solve the problems of large memory consumption and high bandwidth cost when the GPU samples graphics data in a non-native format or needs to render into graphics data in a non-native format.
  • According to a first aspect, a graphics processing method is provided, including: obtaining a first virtual address to be accessed by a graphics processor (GPU), the first virtual address belonging to a first virtual address space; and obtaining a second virtual address according to the first virtual address, the second virtual address belonging to a second virtual address space.
  • The second virtual address space is different from the first virtual address space; the second virtual address space and the first virtual address space are mapped to the same first physical address space; the physical address mapped by the first virtual address corresponds to image data in a first format,
  • while the physical address mapped by the second virtual address corresponds to image data in a second format.
  • The embodiment of the present application constructs a second virtual address space, which is different from the first virtual address space.
  • This application can map an address in the first virtual address space to an address in the newly added second virtual address space, and the physical address mapped by the second virtual address in that space corresponds to image data in the second format,
  • which is different from the image data in the first format.
  • By mapping addresses into a newly added virtual address space, the embodiment realizes image format conversion without an image format processor: the GPU can access images in a non-native format directly. This avoids applying for an additional native-format GPU buffer, avoids multiple migrations between a format conversion processor and the buffer, reduces memory consumption, avoids delays, and saves bandwidth and power.
  • Obtaining the second virtual address according to the first virtual address specifically includes: translating the first virtual address into an intermediate virtual address, where the intermediate virtual address is a virtual address in the second virtual address space; and, within the second virtual address space, mapping the intermediate virtual address to the second virtual address.
  • The intermediate virtual address can be understood as an intermediate virtual address in the non-native format,
  • and the second virtual address as a second virtual address in the native format. That is, obtaining the second virtual address requires one address translation from the first virtual address to the intermediate virtual address, and one address mapping from the intermediate virtual address to the second virtual address.
  • The mapping from the intermediate virtual address to the second virtual address can be understood as a pixel-level address mapping completed inside the newly added second virtual address space, so that the GPU can access the image data corresponding to the physical address according to the second virtual address and obtain image data in the native second format.
  • The embodiment adds a second virtual address space between the existing first virtual address space and the first physical address space, and completes the pixel-level address mapping in this newly added space, so that
  • the address of an image in the non-native format is mapped to the address of an image in the native format that the GPU can access.
  • In other words, the embodiment splits the single address translation from first virtual address to first physical address in the prior art into two address translations and one pixel-level address mapping: translation from the first virtual address to the intermediate virtual address, pixel-level mapping from the intermediate virtual address to the second virtual address, and translation from the second virtual address to the first physical address. The GPU can thus read or render images in non-native formats without explicit format conversion, without applying for additional native-format GPU buffers, and without multiple migrations between a format conversion processor and the buffer, reducing memory consumption, avoiding delays, and saving bandwidth and power.
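As an illustrative sketch only (not taken from the application itself), the split described above — two page-granular translations bracketing a pixel-level remap — can be modeled as follows. The page size, the table contents, and all function names are assumptions for illustration:

```python
# Hypothetical model of the split translation pipeline: first VA ->
# intermediate VA (1st translation), intermediate VA -> second VA
# (pixel-level remap), second VA -> physical address (2nd translation).

PAGE = 0x1000  # assumed 4 KiB page size

# Assumed first mapping relationship: first virtual address space -> second space.
first_map = {0x0000_1000: 0x8000_0000}
# Assumed second mapping relationship: second virtual address space -> physical.
second_map = {0x8000_0000: 0x4000_0000}

def translate(table, va):
    """One page-granular address translation step: swap the page, keep the offset."""
    page, off = va & ~(PAGE - 1), va & (PAGE - 1)
    return table[page] | off

def pixel_remap(intermediate_va):
    """Placeholder for the pixel-level remap inside the second space.
    A real implementation would reorder addresses according to the image
    format; identity is used here only to show where the step sits."""
    return intermediate_va

def gpu_access(first_va):
    iva = translate(first_map, first_va)      # 1st address translation
    second_va = pixel_remap(iva)              # pixel-level address mapping
    return translate(second_map, second_va)   # 2nd translation -> physical

print(hex(gpu_access(0x0000_1234)))
```

Note that the GPU itself only issues the first virtual address; the two extra steps are transparent to it.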
  • Translating the first virtual address into an intermediate virtual address specifically includes: obtaining the intermediate virtual address corresponding to the first virtual address in the second virtual address space according to the first mapping relationship, where the first mapping relationship is the mapping relationship between the first virtual address space and the second virtual address space.
  • the first mapping relationship may be stored in the memory management unit MMU of the GPU.
  • the MMU may be integrated inside the GPU or located outside the GPU, which is not limited in this application.
  • The method further includes: obtaining the first physical address corresponding to the second virtual address in the first physical address space according to the second mapping relationship, where the second mapping relationship is the mapping relationship between the second virtual address space and the first physical address space.
  • That is, this application first translates the first virtual address into an intermediate virtual address in one address translation, then performs an address mapping that maps the intermediate virtual address to the second virtual address, and finally performs a second address translation that translates the second
  • virtual address into the first physical address.
  • The non-native-format intermediate virtual addresses before format mapping and the native-format second virtual addresses after format mapping both belong to the same continuous second virtual address space; only the arrangement of addresses has changed.
  • When the GPU accesses the real non-native-format physical addresses in the order of the native-format second virtual addresses, it accesses graphics data in the native format; that is, to the GPU the graphics appear already converted to the native format.
  • Through this virtualization method, the GPU sampling process needs no additional format conversion buffer, and the actual format conversion step is omitted.
  • The GPU thus samples graphics data in non-native formats at low cost.
  • Conversely, rendering graphics data in the order given by the non-native-format physical addresses is, from the GPU's point of view, rendering native-format graphics data in order. The virtualization method therefore also lets the GPU rendering process do without additional format conversion buffers and omit the actual format conversion step, enabling the GPU
  • to render native-format graphics data into non-native-format memory at low cost.
  • Before obtaining the second virtual address according to the first virtual address, the method further includes: obtaining a graphics processing request sent to the GPU, the graphics processing request including the first virtual address space and the first physical address space; and constructing the second virtual address space according to the first virtual address space and the first physical address space.
  • The embodiment can intercept the graphics processing request through a virtualization software agent and construct the second virtual address space from it. It should be understood that the embodiment maps the discrete first physical address space and the first virtual address space into a continuous second virtual address space, and then, in this virtual continuous space, converts non-native-format graphics data into native-format graphics data; the format conversion in the embodiment is realized by changing the address arrangement in a virtual space.
  • the method further includes: obtaining the first mapping relationship and the second mapping relationship according to the first virtual address space and the first physical address space.
  • Constructing the second virtual address space according to the first virtual address space and the first physical address space specifically includes: obtaining the size of the physical memory page (PP) corresponding to the first physical address space and the size of the virtual memory page (VP) corresponding to the first virtual address space; and mapping the first physical address space to a continuous virtual memory space to obtain the second virtual address space, where the size of the virtual physical memory page (VPP) corresponding to the second virtual address space is greater than both the size of the PP and the size of the VP.
  • The purpose is to cover the first virtual address space and the real first physical address space with the second virtual address space constructed by this application, so as to establish
  • the first mapping relationship between the first virtual address space and the second virtual address space and the second mapping relationship between the second virtual address space and the first physical address space.
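A minimal sketch of this construction step, under assumed page sizes (the application only requires that the VPP be larger than both the PP and the VP; all concrete numbers and names below are illustrative):

```python
# Sketch: lay discrete physical pages out contiguously in a second
# ("virtual physical") address space. Page sizes are assumptions.

PP = 4 * 1024          # physical memory page size (assumed)
VP = 4 * 1024          # virtual memory page size (assumed)
VPP = 2 * 1024 * 1024  # virtual physical page, larger than PP and VP

def build_second_space(phys_pages, base=0x1_0000_0000):
    """Map each discrete physical page to a slot in one contiguous region.
    A fuller implementation would carve the region into VPP-sized pages
    that cover both the first virtual and first physical address spaces."""
    assert VPP > PP and VPP > VP
    second_map = {}
    for i, ppage in enumerate(phys_pages):
        # contiguous second-space addresses, discrete physical pages
        second_map[base + i * PP] = ppage
    return second_map

m = build_second_space([0x4000_0000, 0x7777_0000, 0x1234_5000])
```

The returned table plays the role of the second mapping relationship; inverting it per page would give the first mapping relationship's target side.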
  • In one implementation, obtaining the second virtual address according to the first virtual address includes a filtering step:
  • the virtual address translated by the MMU may be the actual physical address of some other buffer rather than a virtual address in the second virtual address space, so a judgment/filtering step is required here.
  • That is, the intermediate virtual address produced by the MMU is not necessarily a virtual address covered by the first mapping relationship.
  • mapping the intermediate virtual address to the second virtual address in the second virtual address space specifically includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; and obtaining the second virtual address according to the pixel coordinates.
  • In this way, the native-format second virtual address can be obtained through the pixel coordinates, so that the image data at the real physical address can be accessed through the native-format second virtual address to obtain native-format image data.
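One concrete way such a pixel-level remap could work (the application does not fix a layout; the linear-to-4x4-tiled mapping, image width, and 4-byte pixels below are all assumptions for illustration):

```python
# Decode an intermediate address to pixel coordinates in a row-major
# (linear) layout, then re-encode those coordinates in a tiled layout.

WIDTH, BPP, TILE = 16, 4, 4   # image width in pixels, bytes/pixel, tile edge

def addr_to_xy(iva):
    """Intermediate virtual address -> pixel coordinates (linear layout)."""
    pix = iva // BPP
    return pix % WIDTH, pix // WIDTH

def xy_to_tiled_addr(x, y):
    """Pixel coordinates -> second virtual address (4x4 tiled layout)."""
    tiles_per_row = WIDTH // TILE
    tile_idx = (y // TILE) * tiles_per_row + (x // TILE)
    within = (y % TILE) * TILE + (x % TILE)
    return (tile_idx * TILE * TILE + within) * BPP

def remap(iva):
    return xy_to_tiled_addr(*addr_to_xy(iva))
```

The same two helpers, composed in the opposite order, would give the write-path mapping.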
  • When the image data in the first format is compressed data to be read by the GPU,
  • and the compressed data includes a plurality of compressed graphics blocks,
  • mapping the intermediate virtual address to the second virtual address in the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the intermediate virtual address; and calculating the second virtual address according to the compression offset information. The method further includes:
  • decompressing the target compressed graphics block after it is read.
  • The non-native-format image data stored in memory may be in a compressed format;
  • in that case the image data needs to be decompressed after it is read.
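The compressed-read path can be sketched as a per-block offset lookup; the block geometry, offset-table layout, and base address below are invented for illustration and are not specified by the application:

```python
# Pixel coordinates select a compressed graphics block; a per-block
# table of (offset, size) locates its compressed bytes; the second
# virtual address points at the block's compressed location.

BLOCK_W = BLOCK_H = 8   # assumed block geometry in pixels
# assumed compression offset information: block index -> (offset, size)
offsets = {0: (0, 40), 1: (40, 64), 2: (104, 32)}

def block_index(x, y, blocks_per_row=4):
    """Which compressed graphics block contains pixel (x, y)."""
    return (y // BLOCK_H) * blocks_per_row + (x // BLOCK_W)

def second_va_for(x, y, base=0x2000_0000):
    """Second virtual address of the target compressed block's data."""
    off, _size = offsets[block_index(x, y)]
    return base + off
```

After the read at that address, the target block would be decompressed before the data is handed to the GPU.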
  • When the image data in the first format is compressed data to be written by the GPU, mapping the intermediate virtual address to the second virtual address in the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining the address of the header data of the compressed data to be written according to the pixel coordinates; and obtaining the second virtual address according to the address of the header data. That is, when writing data, considering that compressed data is stored in memory, when storing native-format image data into non-native-format memory, the second virtual address of the compressed-format data should be obtained, and the data should be stored into memory at the physical address according to that second virtual address.
  • Alternatively, mapping the intermediate virtual address to the second virtual address in the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining the signature of the pixel corresponding to the pixel coordinates; and obtaining the second virtual address corresponding to that signature. If the image data in the first format is encrypted data to be read by the GPU, the method further includes: decrypting the read image data, and sending the decrypted image data to the GPU. In this case the image data stored in memory is in an encrypted format, so when performing the pixel-level address mapping, the second virtual address of the encrypted format must be obtained.
  • According to a second aspect, a graphics processing device is provided, the device including a graphics processor (GPU) and a hardware virtualization manager, wherein:
  • the GPU is used to obtain the first virtual address to be accessed, the first virtual address belonging to the first virtual address space;
  • the hardware virtualization manager is used to obtain the second virtual address according to the first virtual address, the second virtual address belonging to the second virtual address space; where the second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same first physical address space, the physical address mapped by the first virtual address corresponds to image data in the first format, and the physical address mapped by the second virtual address corresponds to image data in the second format.
  • The GPU includes a first memory management unit (MMU), and the hardware virtualization manager includes a format conversion processor; the first MMU is used to translate the first virtual address into an intermediate virtual address, the intermediate virtual address being a virtual address in the second virtual address space; the format conversion processor is used to map the intermediate virtual address to the second virtual address in the second virtual address space.
  • The first MMU is used to: obtain the intermediate virtual address corresponding to the first virtual address in the second virtual address space according to the first mapping relationship, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space.
  • The hardware virtualization manager includes a second MMU, and the second MMU is used to: obtain the first physical address corresponding to the second virtual address in the first physical address space according to the second mapping relationship, the second mapping relationship being the mapping relationship between the second virtual address space and the first physical address space.
  • The device further includes a central processing unit (CPU).
  • A virtualization software agent runs on the CPU and is used to obtain a graphics processing request sent to the GPU.
  • The graphics processing request includes the first virtual address space and the first physical address space; the agent constructs the second virtual address space according to the first virtual address space and the first physical address space.
  • the virtualization software agent is also used to obtain the first mapping relationship and the second mapping relationship according to the first virtual address space and the first physical address space.
  • The virtualization software agent is specifically used to: obtain the size of the physical memory page (PP) corresponding to the first physical address space and the size of the virtual memory page (VP) corresponding to the first virtual address space; and map the first physical
  • address space to a continuous virtual memory space to obtain the second virtual address space,
  • where the size of the virtual physical memory page (VPP) corresponding to the second virtual address space is greater than the size of the PP and the size of the VP.
  • In another implementation, the GPU includes a first MMU, and the hardware virtualization manager includes a snoop filter and a format conversion processor. The first MMU is used to translate the first virtual address into an intermediate virtual address; the snoop filter is used to determine whether the intermediate virtual address belongs to the second virtual address space and, when it does, to send the intermediate virtual address to the format conversion processor; the format conversion processor is used to map the intermediate virtual address to the second virtual address in the second virtual address space.
  • the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; and obtain the second virtual address according to the pixel coordinates.
  • When the image data in the first format is compressed data to be read by the GPU, and the compressed data includes a plurality of compressed graphics blocks,
  • the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the intermediate virtual address; and calculate the second virtual address according to the compression offset information. The format conversion processor is further used to decompress the read target compressed graphics block.
  • When the image data in the first format is compressed data to be written by the GPU, the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain the address of the header data of the compressed data to be written according to the pixel coordinates; and obtain the second virtual address according to the address of the header data.
  • Alternatively, the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain the signature of the pixel corresponding to the pixel coordinates; and obtain the second virtual address corresponding to that signature. If the image data
  • in the first format is encrypted data that the GPU needs to read, the format conversion processor is also used to decrypt the read image data and send the decrypted image data to the GPU.
  • The terms used in the first and second aspects above correspond to the terms in the third and fourth aspects introduced below as follows: the first virtual address is equivalent to the target virtual address in the virtual address space; the first virtual address space is equivalent to the virtual address space;
  • the second virtual address space is equivalent to the virtual physical address space;
  • the first physical address space is equivalent to the physical address space;
  • the intermediate virtual address is equivalent to the intermediate physical address; the second virtual address is equivalent to the target virtual physical address;
  • and the first physical address is equivalent to the target physical address.
  • When the intermediate physical address in the third and fourth aspects belongs to the virtual physical address space, that intermediate physical address is the first virtual physical address, and the first virtual physical address is then equivalent to the intermediate virtual address mentioned in the first and second aspects.
  • According to a third aspect, a graphics processing method is provided, including: a virtualization software agent constructs a virtual physical address space, the virtual physical address space being memory space outside both the virtual address space and the physical address space; the hardware virtualization manager maps, within the virtual physical
  • address space, the address of the non-native-format image data to be accessed, to obtain the target virtual physical address of the native-format image data; and the target physical address corresponding to the target virtual physical address is obtained, and the image data at the target physical address is accessed.
  • The physical address mapped by a virtual address in the virtual address space corresponds to image data in the first format,
  • while the physical address mapped by the target virtual physical address in the virtual physical address space corresponds to image data in the second format.
  • the embodiment of the application adds a virtual physical address space between the existing virtual address space and the physical address space, and completes the pixel-level address mapping in this newly added virtual physical address space, and converts the image in the non-native format
  • the address mapping is the address of the image in the local format that the GPU can access.
• In other words, the embodiment of the present application splits the single address mapping from a virtual address to a physical address in the prior art into two address mappings: virtual address to virtual physical address, and virtual physical address to physical address. As a result, the GPU can read or render images in a non-native format without explicit format conversion, no additional local GPU cache needs to be applied for, and multiple migrations of data between the format conversion processor and the cache are avoided, which reduces memory consumption, avoids delays, and saves bandwidth and power consumption.
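The two-level translation described above can be illustrated with a minimal sketch. This is not the claimed implementation; all page names and table contents below are hypothetical, chosen only to show the two-step lookup:

```python
# Illustrative sketch of the two address mappings described above.
# first_mapping:  virtual page -> virtual physical page (VPP), held by the GPU's MMU.
# second_mapping: VPP -> physical page (PP), held by the hardware virtualization manager.
first_mapping = {"VP1": "VPP1", "VP2": "VPP2", "VP3": "VPP3"}
second_mapping = {"VPP1": "PP7", "VPP2": "PP2", "VPP3": "PP5"}

def translate(virtual_page):
    """Translate a virtual page to a physical page via the intermediate VPP space."""
    vpp = first_mapping[virtual_page]   # first translation (GPU MMU)
    return second_mapping[vpp]          # second translation (second MMU)

assert translate("VP1") == "PP7"
```

The point of the split is that the format mapping can be inserted between the two tables without the GPU or the physical memory layout changing.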
• The embodiment of the present application first maps discrete physical addresses to a continuous virtual physical memory page (VPP) address space (the virtual physical address space), and then changes the arrangement of the addresses in this continuous virtual space, so that the physical address mapped by a changed virtual address corresponds to the image data in the second format. Specifically, the image data in the physical address space is accessed in the order of the changed virtual addresses to obtain the image data in the second format.
• The format conversion in the embodiment of this application is achieved by changing the address arrangement in the virtual space: the non-native-format VPP addresses before the format mapping and the native-format VPP addresses after the format mapping both belong to this continuous VPP address space; only their arrangement has changed.
• When the GPU accesses the real physical addresses of the non-native format in the order of the mapped virtual physical addresses, it accesses the graphics data as if it were in the native format; that is, the GPU reads or writes the image data in the order of the converted native format. Therefore, this application allows the GPU sampling process to require no additional format conversion buffer and omits the actual format conversion step, so that the GPU can sample non-native-format graphics data at low cost.
• Likewise, the order in which the graphics data is rendered according to the physical addresses of the non-native format is, for the GPU, the order of the graphics data in the native format. Therefore, this application allows the GPU rendering process to require no additional format conversion buffer and omits the actual format conversion step, so that the GPU can render native-format graphics data into non-native-format memory at low cost.
  • the method further includes: obtaining the target virtual address to be accessed by the graphics processor GPU
  • the memory management unit MMU of the GPU obtains the first virtual physical address in the virtual physical address space corresponding to the target virtual address according to the first mapping relationship, and the first mapping relationship is the mapping relationship between the virtual address space and the virtual physical address space
• That the hardware virtualization manager maps the address of the non-native-format image data to be accessed in the virtual physical address space to obtain the target virtual physical address of the native-format image data specifically includes: the hardware virtualization manager maps the first virtual physical address in the virtual physical address space to obtain the target virtual physical address.
• Obtaining the target physical address corresponding to the target virtual physical address specifically includes: the second MMU in the hardware virtualization manager obtains the target physical address corresponding to the target virtual physical address according to the second mapping relationship, the second mapping relationship being the mapping relationship between the virtual physical address space and the physical address space.
  • the method further includes: obtaining a graphics processing request sent to the GPU, the graphics processing request including the virtual address space and the physical address space of the non-native format image;
  • the virtualized software agent constructs the virtual physical address space, which specifically includes: the virtualized software agent constructs the virtual physical address space according to the virtual address space and the physical address space.
• When the virtualization software agent constructs the virtual physical address space according to the virtual address space and the physical address space, the first mapping relationship and the second mapping relationship are also obtained.
• That the virtualization software agent constructs a virtual physical address space specifically includes: the virtualization software agent obtains the size of the physical memory page PP and the size of the virtual memory page VP corresponding to the physical address space, and constructs the virtual physical address space according to the size of the PP and the size of the VP, where the size of the virtual physical memory page VPP corresponding to the virtual physical address space is greater than the size of the PP and the size of the VP.
• The hardware virtualization manager includes a filter and a format conversion processor. Before the hardware virtualization manager maps the address of the non-native-format image data to be accessed in the virtual physical address space, the method further includes: obtaining the target virtual address to be accessed by the graphics processor GPU; the memory management unit MMU of the GPU maps the target virtual address to an intermediate physical address; the filter determines whether the intermediate physical address belongs to the virtual physical address space; and when the intermediate physical address belongs to the virtual physical address space, the filter determines the intermediate physical address to be the first virtual physical address and sends it to the format conversion processor. The format conversion processor then performs a pixel-level format mapping of the first virtual physical address in the virtual physical address space to obtain the target virtual physical address.
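The filter's decision — forward the intermediate physical address to the format conversion processor only when it falls inside the virtual physical address space — amounts to a range check. A minimal sketch (the base address and size below are hypothetical, not values from the embodiment):

```python
# Hypothetical address range occupied by the constructed virtual physical address space.
VPP_BASE = 0x8000_0000
VPP_SIZE = 0x0100_0000  # 16 MiB

def filter_address(intermediate_physical_address):
    """Return True if the address belongs to the virtual physical address space,
    i.e. it should be treated as a first virtual physical address and sent to
    the format conversion processor; otherwise it is an ordinary physical address."""
    return VPP_BASE <= intermediate_physical_address < VPP_BASE + VPP_SIZE

assert filter_address(0x8000_1000) is True    # inside the VPP space
assert filter_address(0x4000_0000) is False   # ordinary physical address
```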
• That the hardware virtualization manager maps the first virtual physical address in the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address, and obtaining, according to the pixel coordinates, the target virtual physical address corresponding to the pixel coordinates.
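One concrete instance of such a coordinate-based mapping is converting between a linear (row-major) layout and a tiled layout. The sketch below — an illustration, not the claimed mapping, with illustrative image and tile dimensions — recovers pixel coordinates from an address offset and then recomputes the offset under the other layout:

```python
BYTES_PER_PIXEL = 4
WIDTH, TILE = 8, 4   # an 8-pixel-wide image laid out in 4x4 tiles (illustrative)

def offset_to_coords(linear_offset):
    """First-virtual-physical-address offset -> pixel coordinates (x, y)."""
    pixel = linear_offset // BYTES_PER_PIXEL
    return pixel % WIDTH, pixel // WIDTH

def coords_to_tiled_offset(x, y):
    """Pixel coordinates -> target offset in a 4x4-tiled layout."""
    tiles_per_row = WIDTH // TILE
    tile_index = (y // TILE) * tiles_per_row + (x // TILE)
    within = (y % TILE) * TILE + (x % TILE)
    return (tile_index * TILE * TILE + within) * BYTES_PER_PIXEL

x, y = offset_to_coords(36)   # linear offset 36 -> pixel 9 -> (x=1, y=1)
assert (x, y) == (1, 1)
assert coords_to_tiled_offset(x, y) == 20
```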
  • the non-native format image data to be accessed is the compressed data that the GPU needs to read.
  • the compressed data includes multiple compressed graphics blocks.
• That the hardware virtualization manager maps the first virtual physical address in the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the first virtual physical address; and calculating the target virtual physical address according to the compression offset information. The method further includes: decompressing the target compressed graphics block that is read.
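The compressed-read case above can be sketched as follows: the pixel coordinates select a compressed block, and a per-block offset table (here hypothetical — block size, image width, and offsets are invented for illustration, since compressed blocks have variable size) yields the target address:

```python
BLOCK_W = BLOCK_H = 16   # compressed graphics blocks of 16x16 pixels (hypothetical)
IMAGE_WIDTH_BLOCKS = 4   # image is 4 blocks wide (hypothetical)

# Hypothetical compression offset information: block index -> byte offset of
# the compressed block within the buffer.
compression_offsets = [0, 211, 390, 645, 800, 1002, 1180, 1399]

def target_address_for_pixel(x, y, base_address=0x9000_0000):
    """Map pixel coordinates to the start address of the compressed block
    that contains the pixel, using the compression offset information."""
    block_index = (y // BLOCK_H) * IMAGE_WIDTH_BLOCKS + (x // BLOCK_W)
    return base_address + compression_offsets[block_index]

# Pixel (20, 16) lies in block row 1, column 1 -> block index 5.
assert target_address_for_pixel(20, 16) == 0x9000_0000 + 1002
```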
• When the non-native-format image data to be accessed is compressed data to be written by the GPU, that the hardware virtualization manager maps the first virtual physical address in the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtaining the target virtual physical address according to the address of the header data.
• That the hardware virtualization manager maps the first virtual physical address in the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining the signature of the pixel corresponding to the pixel coordinates; and obtaining, according to the signature, the target virtual physical address corresponding to the signature. If the non-native-format image data to be accessed is encrypted data that the GPU needs to read, the method further includes: decrypting the read image data and sending the decrypted image data to the GPU.
• In a fourth aspect, a graphics processing device includes a graphics processor GPU, a central processing unit CPU, and a hardware virtualization manager. A virtualization software agent runs on the CPU. The virtualization software agent is used to construct a virtual physical address space, the virtual physical address space being a memory space outside both the virtual address space and the physical address space. The hardware virtualization manager is used to map the address, in the virtual physical address space, of the non-native-format image data to be accessed, to obtain the target virtual physical address of the native-format image data; the hardware virtualization manager is also used to obtain the target physical address corresponding to the target virtual physical address and to access the image data at the target physical address.
• The virtualization software agent is also used to obtain the target virtual address to be accessed by the graphics processor GPU. The GPU also includes a memory management unit MMU, which is used to obtain, according to the first mapping relationship, the first virtual physical address in the virtual physical address space corresponding to the target virtual address; the first mapping relationship is the mapping relationship between the virtual address space and the virtual physical address space. The hardware virtualization manager is used to map the first virtual physical address in the virtual physical address space to obtain the target virtual physical address.
• The hardware virtualization manager includes a second MMU. The second MMU is used to obtain the target physical address corresponding to the target virtual physical address according to the second mapping relationship; the second mapping relationship is the mapping relationship between the virtual physical address space and the physical address space.
• The virtualization software agent is also used to obtain the graphics processing request sent to the GPU, the graphics processing request including the virtual address space and the physical address space of the non-native-format image; the virtualization software agent is used to construct the virtual physical address space according to the virtual address space and the physical address space.
  • the virtualization software agent is also used to obtain the first mapping relationship and the second mapping relationship when the virtual physical address space is constructed according to the virtual address space and the physical address space.
• The virtualization software agent is used to obtain the size of the physical memory page PP and the size of the virtual memory page VP corresponding to the physical address space, and to construct the virtual physical address space according to the size of the PP and the size of the VP, where the size of the virtual physical memory page VPP corresponding to the virtual physical address space is greater than the size of the PP and the size of the VP.
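The size constraint above — a VPP strictly larger than both a PP and a VP — can be checked when the VPP space is constructed. A minimal sketch (the doubling rule and all sizes are illustrative assumptions, not the claimed construction):

```python
def construct_vpp_space(pp_size, vp_size, total_bytes):
    """Choose a VPP size strictly larger than both the PP and VP sizes,
    then return (vpp_size, number of VPPs covering `total_bytes`)."""
    vpp_size = 2 * max(pp_size, vp_size)   # e.g. 8 KiB VPPs over 4 KiB pages
    assert vpp_size > pp_size and vpp_size > vp_size
    return vpp_size, (total_bytes + vpp_size - 1) // vpp_size

vpp_size, count = construct_vpp_space(pp_size=4096, vp_size=4096,
                                      total_bytes=1 << 20)  # a 1 MiB buffer
assert (vpp_size, count) == (8192, 128)
```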
• The hardware virtualization manager includes a snoop filter and a format conversion processor. The snoop filter is used to obtain the target virtual address to be accessed by the graphics processor GPU; the MMU of the GPU is used to map the target virtual address to an intermediate physical address; the snoop filter is used to determine whether the intermediate physical address belongs to the virtual physical address space and, when the intermediate physical address belongs to the virtual physical address space, to determine the intermediate physical address to be the first virtual physical address and send it to the format conversion processor; and the format conversion processor is used to perform a pixel-level format mapping of the first virtual physical address in the virtual physical address space to obtain the target virtual physical address.
  • the hardware virtualization manager is used to obtain the pixel coordinates corresponding to the first virtual physical address; and to obtain the target virtual physical address corresponding to the pixel coordinates according to the pixel coordinates.
• The non-native-format image data to be accessed is compressed data that the GPU needs to read. The compressed data includes multiple compressed graphics blocks, and the hardware virtualization manager is used to: obtain the pixel coordinates corresponding to the first virtual physical address; obtain, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the first virtual physical address; and calculate the target virtual physical address according to the compression offset information. The hardware virtualization manager is also used to decompress the target compressed graphics block that is read.
• The non-native-format image data to be accessed is compressed data to be written by the GPU. The hardware virtualization manager is used to: obtain the pixel coordinates corresponding to the first virtual physical address; obtain, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtain the target virtual physical address according to the address of the header data, thereby mapping the first virtual physical address in the virtual physical address space to the target virtual physical address.
• The hardware virtualization manager is used to: obtain the pixel coordinates corresponding to the first virtual physical address; obtain the signature of the pixel corresponding to the pixel coordinates; and obtain, according to the signature, the target virtual physical address corresponding to the signature. The hardware virtualization manager is also used to decrypt the read image data and send the decrypted image data to the GPU.
• In a fifth aspect, a graphics processing method includes: obtaining a first virtual address to be accessed, the first virtual address belonging to a first virtual address space; and translating the first virtual address into an intermediate virtual address, the intermediate virtual address belonging to a second virtual address space and being mappable to a second virtual address in the second virtual address space. The second virtual address space is different from the first virtual address space, and the second virtual address space and the first virtual address space are mapped to the same first physical address space.
  • the physical address mapped by the first virtual address corresponds to the image data in the first format
  • the physical address mapped by the second virtual address corresponds to the image data in the second format.
• The embodiment of the present application can implement the pixel-level address mapping in this newly added second virtual address space, mapping the address of the image in the non-native first format to the address of the image in the native second format that the GPU can access, so that the GPU can read or render the image in the non-native format without explicit format conversion; no additional local GPU cache needs to be applied for, and multiple migrations of data between the format conversion processor and the cache are avoided, which reduces memory consumption, avoids delays, and saves bandwidth and power consumption.
• Translating the first virtual address into an intermediate virtual address specifically includes: obtaining, according to the first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space.
• Before obtaining the first virtual address to be accessed, the method further includes: receiving the first virtual address space and the second virtual address space, and establishing the first mapping relationship.
• In a sixth aspect, a GPU includes a transmission interface and a memory management unit MMU. The transmission interface is used to obtain a first virtual address to be accessed, the first virtual address belonging to the first virtual address space; the MMU is used to translate the first virtual address into an intermediate virtual address, the intermediate virtual address belonging to the second virtual address space and being mappable to a second virtual address in the second virtual address space. The second virtual address space is different from the first virtual address space, and the second virtual address space and the first virtual address space are mapped to the same first physical address space; the physical address mapped by the first virtual address corresponds to the image data in the first format, and the physical address mapped by the second virtual address corresponds to the image data in the second format.
• The MMU is used to obtain, according to the first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space; the first mapping relationship is the mapping relationship between the first virtual address space and the second virtual address space.
  • the MMU is used to: receive the first virtual address space and the second virtual address space; and establish the first mapping relationship.
• The order in which the graphics data is sampled according to the non-native-format PP addresses is, for the GPU, actually the address order of the native format. For the sampling process, the sampling order in which the GPU finally samples the graphics data therefore differs from the sampling order it would use if it sampled the graphics data directly according to the real physical addresses. The graphics data sampled in this changed order is graphics data in the native format that the GPU can recognize and process; consequently, the graphics data finally read from the non-native-format physical memory arrives in a graphics format that the GPU can recognize.
• When the GPU renders graphics data in the native format, according to this application it can first address the native-format graphics data by the virtual native-format physical address (VPP), and then map the VPP to the physical address of the non-native format.
• When the GPU writes graphics data in the native format into memory of the non-native format, the graphics data is finally written to memory according to the physical addresses of the non-native format; the graphics data written to memory is, for the GPU, graphics data in a non-native format that the GPU cannot recognize.
  • Figure 1 is a schematic diagram of a scene sampled and rendered to a non-native format in the prior art
  • FIG. 2 is a schematic diagram of the functional modules inside a GPU
• FIG. 2A is a schematic diagram of an address mapping relationship when sampling or rendering data in the prior art
• FIG. 2B is a schematic diagram comparing an address space structure provided by an embodiment of this application with an address space structure in the prior art
• FIG. 2C is a schematic diagram of an address mapping relationship provided by an embodiment of this application
• FIG. 2D is a schematic diagram of an address mapping relationship provided by an embodiment of this application
• FIG. 2E is a schematic diagram of an address mapping relationship provided by an embodiment of this application
• FIG. 2F is a schematic diagram of an address mapping relationship when sampling or rendering non-native-format data provided by an embodiment of this application
• FIG. 2G is a schematic diagram of a process of sampling or rendering data in a non-native format provided by an embodiment of this application
  • FIG. 3 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • FIG. 4 is a schematic structural diagram of a SoC provided by an embodiment of the application.
  • FIG. 5 is a schematic flowchart of a graphics processing method provided by an embodiment of the application.
  • FIG. 6 is a software and hardware architecture diagram for sampling graphics data in a non-local format provided by an embodiment of the application
  • FIG. 7 is a schematic flowchart of a graphics processing method provided by an embodiment of this application.
  • FIG. 8 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • FIG. 9 is a schematic structural diagram of a terminal device provided by an embodiment of the application.
• GPU: also known as display core, visual processor, or display chip, a microprocessor that specializes in image operations on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones). Illustratively, the purposes of the GPU include converting and driving the display information required by the computer system, providing line scan signals to the display, and controlling the correct display of the display. The GPU is an important component connecting the display and the personal computer motherboard, and one of the important devices for "human-machine dialogue".
• Virtual address: the logical address used by a program to access memory.
• Physical address: the address placed on the address bus. If a central processing unit (CPU) performs a read operation, the circuit can, according to the value of each bit of the physical address, read the data in the physical memory at the corresponding address onto the data bus for transmission. If the CPU performs a write operation, the circuit can, according to the value of each bit of the physical address, write the content on the data bus into the physical memory at the corresponding address.
• CPU: Central Processing Unit.
• MMU: Memory Management Unit. It is the control circuit in the CPU used to manage virtual memory and physical memory; it is also responsible for mapping virtual addresses to physical addresses and providing the hardware mechanism for memory access authorization in multi-user, multi-process operating systems.
  • the scene is used for post-processing and rendering the video stream captured by the surveillance camera 100, and the rendering result is output by the encoding processor 105 after secondary encoding.
  • the original video stream of the surveillance camera 100 is encoded in the encoder 101, and the encoding result is written into the non-native format buffer 106 in a non-GPU native format.
• The format output by the encoder 101 integrated on the system-on-chip (SoC) is usually a specific private format bound to the supplier, which contains various information such as transformation, compression, and intellectual property protection. The GPU 103 cannot directly sample this private format; the private format is converted into the GPU native format by the format conversion processor 102, and the converted graphics data is stored in the local GPU cache 107. In this way, the GPU 103 can sample graphics data from the local GPU cache 107 for GPU rendering, and the rendering result is written into the local GPU cache 108 in the GPU native format.
• The encoding processor 105 cannot directly accept the graphics data in the GPU native format in the local GPU cache 108, so another format conversion processor 104 is needed to read the graphics data in the local GPU cache 108 through the bus, convert the graphics data into a format acceptable to the encoder 105, and write the graphics data in the acceptable format into the non-native-format buffer 109 via the bus. In this way, every migration of data in this scenario involves moving the data between a cache and a processor, and additional cache space needs to be applied for, which consumes memory, causes delays, and wastes bandwidth and power.
  • this application can be used in the process of the GPU sampling or rendering graphics, and can sample or render graphics in a non-local format for the GPU at low cost.
  • Figure 2 shows a schematic diagram of the functional unit structure inside the GPU.
  • shader processors can be divided into at least three categories: vertex shaders, fragment shaders, and geometry shaders.
  • the vertex processor program contains instructions for operating on a vertex instance.
  • the pixel processor program contains instructions for processing a pixel, usually including sampling the material of the pixel from a texture sample buffer, and calculating the reflection of the light source on the pixel to obtain the final coloring result.
  • the geometry processor program is used to instruct the GPU to perform geometry processing.
• A unified shader architecture usually runs a unified instruction set. The instruction set includes arithmetic, memory load/store, and shift instructions similar to those of general-purpose scalar processors. Each instruction issued by the front end of these shader processors runs on multiple data instances, which is the usual single-instruction multiple-data (SIMD) structure.
  • These shader processors also need to communicate with fixed-function pipelines to complete graphics functions.
• The fixed-function pipeline includes a rasterizer and a texture mapper.
  • the rasterizer is used to calculate and generate the pixels corresponding to each shading fragment.
  • the texture mapper is used to calculate the address of the texture point (texel) to be finally taken after perspective transformation.
• Both the shader processors and the fixed-function pipeline access memory through virtual addresses. In memory management, a page is the smallest unit of address space; all the virtual addresses that can be used by an application are collectively called the virtual address space.
• The virtual address space is usually divided at a smaller granularity into virtual pages (Virtual Page, VP).
  • a virtual address space consists of a series of virtual addresses.
  • the virtual address space will be mapped to the real Double Data Rate (DDR) space, that is, the physical address space.
  • the physical address space is also divided into a series of physical memory pages (Physical Pages, PP).
  • the size of VP and PP can usually be the same, for example, it can be 4KB.
• Each process maintains a separate page table for all the virtual addresses it uses.
  • the page table is an array structure that stores the status of each VP, including whether it is mapped or cached.
• When a process executes and the value stored at a virtual address needs to be accessed, the CPU first finds the VP where the virtual address is located, then looks up the entry corresponding to the page number of that VP in the page table, and then uses the physical page number stored in that entry to obtain the physical address in the corresponding PP for the virtual address. This process can be called address translation from a virtual address to a physical address. Simply put, address translation refers to the process of finding a physical address from a virtual address on a cache hit.
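The lookup just described — split the virtual address into a VP number and an in-page offset, look the VP up in the page table, and rebuild the physical address from the PP number — can be sketched as follows (the page size and page-table contents are hypothetical, and page-fault handling is omitted):

```python
PAGE_SIZE = 4096  # 4 KiB pages (hypothetical)

# Hypothetical page table: virtual page number -> physical page number.
page_table = {0: 7, 1: 3, 2: 12}

def virtual_to_physical(virtual_address):
    """Split the virtual address into (VP number, offset), look the VP up in
    the page table, and rebuild the physical address from the PP number."""
    vpn = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE
    ppn = page_table[vpn]   # a missing entry would be a page fault (not handled here)
    return ppn * PAGE_SIZE + offset

assert virtual_to_physical(0x1234) == 3 * PAGE_SIZE + 0x234
```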
  • a graphics memory management unit (Memory Management Unit, MMU) is used to manage the mapping relationship between the virtual address space and the physical address space.
• The translation of a virtual address into a physical address is done by the MMU; that is, the physical address corresponding to the virtual address is obtained based on the mapping relationship between the virtual address space and the physical address space stored in the MMU.
• A texture buffer for storing graphics data is mapped to a set of scattered physical pages in the DDR space.
• When a pixel processor is rendering pixels, it sends a texture sampling instruction to the texture mapper, and the texture mapper sends the calculated virtual address of the texel to the bus interface unit (BIU); the MMU connected to the BIU finds the physical address corresponding to the virtual address.
• The rendering is performed at the granularity of a tile, and the intermediate results of the rendering can be stored in the render target buffer according to the physical address.
• In some instances, the system has an L2 cache (level 2 cache). If the currently sampled texel is not found in the L2 cache, the texel read operation reads the contents of the texture buffer through a bus operation.
  • FIG. 2A is a schematic diagram of using the virtual address to perform address translation in the MMU to obtain the physical address in the prior art, and then using the physical address to access data in the main memory.
• The difference from the prior art shown in (1) of FIG. 2B is that the present application adds a second virtual address space between the first virtual address space (equivalent to the virtual address space in the prior art) and the first physical address space (equivalent to the physical address space in the prior art).
• The second virtual address space can be divided into a series of virtual physical pages (Virtual Physical Page, VPP). The second virtual address space is a space different from the first virtual address space and the first physical address space, and the second virtual address space and the first virtual address space are mapped to the same first physical address space.
  • the physical address mapped by the first virtual address corresponds to the image data in the first format
  • the physical address mapped by the second virtual address corresponds to the image data in the second format.
• The method in the embodiment of this application involves a virtualization software agent and a hardware virtualization manager. The second virtual address space is constructed by the virtualization software agent. The image data in the first format is image data in a non-native format that the GPU cannot directly access, and the image data in the second format is image data in a native format that the GPU can access. The hardware virtualization manager can subsequently complete, in the constructed second virtual address space, the pixel-level address mapping between the native format and the non-native format for the data to be accessed.
• The embodiment of the present application splits the mapping relationship between the first virtual address in the first virtual address space and the first physical address in the first physical address space into a first mapping relationship and a second mapping relationship, where the first mapping relationship is the mapping relationship between the first virtual address and the second virtual address in the second virtual address space, and the second mapping relationship is the mapping relationship between the second virtual address and the first physical address. The first mapping relationship is stored in the first MMU of the GPU, and the second mapping relationship is stored in the second MMU of the hardware virtualization manager.
  • the first virtual address is translated into the second virtual address in the second virtual address space according to the first mapping relationship in the MMU, and then according to the second virtual address
  • the second mapping relationship in the MMU translates the second virtual address into the first physical address, that is, the embodiment of the present application implements access to the actual physical address through two address translations. Since in the embodiment of this application, a second virtual address space is newly added, and the pixel-level address mapping of the image format is completed in this second virtual address space, the GPU can access non-transitive addresses without explicit format conversion.
  • the image in the local format that is, does not require the format conversion processor 102 to convert the private format into a local format, and the format processor 104 converts the local format into a format acceptable to the encoder, and there is no need to apply for additional local GPU cache 107 and
  • the local GPU cache 108 avoids multiple migrations of data between the processor and the cache, reduces memory consumption, avoids delay, and saves bandwidth and power consumption.
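The split lookup described above can be sketched with plain dictionaries. The table entries below (VP1→VPP1, the VPP1→VPP4 permutation, and so on) follow the illustrative address sequences used in this embodiment, while the structure of the tables and the function name `translate` are our assumptions:

```python
# Two-stage translation sketch. Stage 1 (GPU's first MMU): first virtual
# address (VP) -> intermediate address in the second virtual address space.
# The pixel-level format mapping then permutes addresses inside that space,
# and stage 2 (second MMU) maps the result to a first physical address (PP).

first_mapping  = {"VP1": "VPP1", "VP2": "VPP2", "VP3": "VPP3"}     # in GPU MMU
pixel_remap    = {"VPP1": "VPP4", "VPP2": "VPP2", "VPP3": "VPP1"}  # format map
second_mapping = {"VPP1": "PP1", "VPP2": "PP2", "VPP4": "PP4"}     # in 2nd MMU

def translate(first_va: str) -> str:
    intermediate = first_mapping[first_va]   # first address translation
    second_va = pixel_remap[intermediate]    # pixel-level address mapping
    return second_mapping[second_va]         # second address translation

print([translate(va) for va in ("VP1", "VP2", "VP3")])  # ['PP4', 'PP2', 'PP1']
```

Note that the GPU itself only performs the first translation; the pixel-level remapping and the second translation happen in the hardware virtualization manager, outside the GPU.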
• Each access request accesses a first virtual address, and the address must first be translated according to the first mapping relationship.
• The above translation of the first virtual address into the second virtual address in the second virtual address space according to the first mapping relationship in the MMU of the GPU can be understood as follows. Suppose the order of the first virtual addresses accessed by the GPU is VP1-VP2-VP3-...; read in that order, the physical addresses PP1-PP2-PP3-... mapped from VP1-VP2-VP3-... correspond to image data in the first format.
• According to the first mapping relationship, the intermediate virtual addresses corresponding to the first virtual addresses can be obtained, so the multiple access requests sent by the GPU actually carry intermediate virtual addresses, sent in the order VPP1-VPP2-VPP3-.... The pixel-level address mapping of the image format is then completed in the second virtual address space, mapping the non-local-format intermediate virtual addresses to local-format second virtual addresses, for example:
  • VPP1 is mapped to VPP4
  • VPP2 is mapped to VPP2
  • VPP3 is mapped to VPP1
• The GPU can then obtain, according to the local-format second virtual addresses and the second mapping relationship, the first physical addresses where the non-local-format data is actually stored, and access the non-local-format graphics data accordingly.
• After the pixel-level address mapping, the order of the second virtual addresses is VPP4-VPP2-VPP1-..., and the physical addresses mapped from them correspond to image data in the second format.
• Translating the second virtual addresses into the first physical addresses according to the second mapping relationship in the second MMU can be understood as obtaining, from the second-virtual-address order VPP4-VPP2-VPP1-... and the second mapping relationship, the first physical addresses after the second address translation.
• The memory is then accessed in the first-physical-address order PP4-PP2-PP1-..., which corresponds to image data in the second format, that is, image data in the local format that the GPU can access. In this way, the GPU can sample local-format image data out of non-local-format image data.
• FIG. 2D is a schematic diagram of an exemplary image format read before address mapping provided in this embodiment of the application, and FIG. 2E is a schematic diagram of an exemplary image format read after address mapping.
• The second virtual address space is constructed according to the first physical address space and the first virtual address space; that is, a continuous virtual memory space is opened up according to the sizes of the first physical address space and the first virtual address space, and this continuous virtual memory space serves as the second virtual address space. A memory page of this space is a virtual physical memory page (VPP), a memory page of the first virtual address space (the virtual address space shown in FIG. 2D) is a virtual memory page (VP), and a memory page of the first physical address space (the physical address space shown in FIG. 2D) is a physical memory page (PP). It should be understood that the size of the constructed virtual physical address space (VPP) is larger than both the VP size and the PP size.
• In FIG. 2D, the order of accessing the pixels stored in the first physical address space is X1Y1 in PP1, X2Y2 in PP2, X3Y3 in PP3, X4Y4 in PP4, and X5Y5 in PP5; the image data read in this order is image data in the first format (format 1). The embodiment of the application then remaps addresses in the newly constructed second virtual address space (the virtual physical address space shown in FIG. 2D), changing the order in which addresses are accessed, which is equivalent to changing the arrangement of the image pixels, as shown in FIG. 2E:
  • VPP1 is mapped to VPP2
  • VPP2 is mapped to VPP4
  • VPP3 is mapped to VPP1
  • VPP4 is mapped to VPP5
• VPP5 is mapped to VPP3
• After the mapping, the order of accessing the pixels stored in the first physical address space becomes X2Y2 in PP2, X4Y4 in PP4, X1Y1 in PP1, X5Y5 in PP5, and X3Y3 in PP3. The pixel arrangement order of the read image data has therefore changed, and the image data read at this point is image data in the second format (format 2).
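The FIG. 2D/2E example can be replayed in a few lines. The mapping below uses the figure's values, with VPP5→VPP3 assumed so that the resulting order matches the X2Y2, X4Y4, X1Y1, X5Y5, X3Y3 sequence given above:

```python
# Remapping addresses in the second virtual address space changes the order
# in which the unchanged physical pages are visited, and hence the pixel
# arrangement of the data that is read.

vpp_remap = {1: 2, 2: 4, 3: 1, 4: 5, 5: 3}   # VPPi -> VPPj (FIG. 2E example)
pixels = {1: "X1Y1", 2: "X2Y2", 3: "X3Y3", 4: "X4Y4", 5: "X5Y5"}  # PPi holds

format1 = [pixels[i] for i in range(1, 6)]             # before the mapping
format2 = [pixels[vpp_remap[i]] for i in range(1, 6)]  # after the mapping

print(format2)  # ['X2Y2', 'X4Y4', 'X1Y1', 'X5Y5', 'X3Y3']
```

The physical pages themselves are never moved; only the visiting order changes, which is the whole point of the scheme.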
• In short, the embodiment of the application remaps addresses in the newly constructed second virtual address space and changes the order in which pixel addresses are read, which is equivalent to changing the pixel arrangement of the read image data. The image in the first format is an image in a non-local format that the GPU cannot access, and the image in the second format is an image in a local format that the GPU can directly access. Therefore, without an explicit format conversion, the GPU can access non-local-format image data and obtain image data in the local format.
• The second virtual address space proposed in the embodiment of the present application may be a segment of continuous addresses whose size is determined according to the size of the first physical address space and the size of the first virtual address space corresponding to the process. For example, if the size of the first virtual address space is 396 KB, and the first physical address space is divided into 100 discrete PPs of 4 KB each, then the size of the second virtual address space needs to be greater than 400 KB. Only then can the second virtual address space stand in for the first physical address space, so that the first mapping relationship between the first virtual address space and the second virtual address space, and the second mapping relationship between the second virtual address space and the first physical address space, can be established.
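The sizing in this example can be checked with a quick calculation; reading "greater than 400 KB" as "must exceed the larger of the two spaces" is our interpretation:

```python
# The second virtual address space must cover both the first virtual address
# space and the first physical address space, so it must be at least as large
# as the larger of the two.

KB = 1024
vp_space = 396 * KB       # first virtual address space (example from the text)
pp_space = 100 * 4 * KB   # 100 discrete 4 KB physical pages = 400 KB

required = max(vp_space, pp_space)
print(required // KB)  # 400 -> the second virtual address space must exceed 400 KB
```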
• The MMU of the GPU stores the mapping for the non-native-format buffer requested by the application. For example, suppose the lookup table in the MMU of an existing GPU includes a mapping between address 0x8000 of the first virtual address space and address 0x6453 of the first physical address space. This mapping can be split into a first mapping relationship between first virtual address 0x8000 and second virtual address 0x0, and a second mapping relationship between second virtual address 0x0 and first physical address 0x6453. The lookup table in the GPU's MMU is then reloaded so that it contains the first mapping relationship between 0x8000 and 0x0, while the second mapping relationship between 0x0 and 0x6453 is stored in the second MMU of the hardware virtualization manager. When the GPU accesses data, first virtual address 0x8000 is first translated to obtain second virtual address 0x0, and second virtual address 0x0 is then translated to obtain first physical address 0x6453, where the data actually resides.
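Splitting the 0x8000→0x6453 entry can be sketched as follows; assigning second virtual addresses consecutively from 0x0 is an assumption for illustration:

```python
# Split a direct VA -> PA lookup-table entry into the two tables used by
# the GPU's MMU (first mapping) and the second MMU (second mapping).

direct_table = {0x8000: 0x6453}   # original GPU MMU entry from the example

first_table = {}    # reloaded into the GPU's MMU:  VA -> second VA
second_table = {}   # configured in the second MMU: second VA -> PA
for vpp, (va, pa) in enumerate(direct_table.items()):
    first_table[va] = vpp    # 0x8000 -> 0x0
    second_table[vpp] = pa   # 0x0    -> 0x6453

# The two-step lookup reproduces the original translation.
assert second_table[first_table[0x8000]] == 0x6453
```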
  • this application adopts a virtualization method to enable the GPU to sample or render non-local format graphics data. This virtualization method does not require an offline explicit conversion stage, and can complete sampling and rendering of graphics in non-local formats online.
• The virtualization software agent can intercept the application's graphics application programming interface (API) calls to the GPU and can virtualize sampling buffers or render-target buffers that the GPU can directly access; these virtual buffers can be called the local-format buffers corresponding to the VPPs. In these virtual buffers, graphics data in the non-local format is converted into graphics data in the local format; the format conversion in the embodiment of the present application is realized by changing the address arrangement in the virtual space.
• Specifically, the virtualization software agent can construct the VPP address space (the second virtual address space) according to the VP address space (the first virtual address space) and the PP address space (the first physical address space) requested when the application program makes graphics API calls to the GPU. The PP addresses mapped from the VP address space correspond to image data in the first format, that is, image data in the non-local format.
• The first mapping relationship between the VP address space and the VPP address space and the second mapping relationship between the VPP address space and the PP address space are obtained according to the VP address space and the PP address space, wherein the first mapping relationship is stored in the MMU of the GPU, and the VPP address space is a continuous virtual address space.
• The hardware virtualization manager obtains the target VPP address to be accessed by the GPU and completes the pixel-level address mapping of the image data format in the VPP address space; the graphics data accessed according to the PP address mapped from the local-format target VPP address is image data in the second format, that is, graphics data in the local format that the GPU can access.
• The hardware virtualization manager obtains the target PP address according to the local-format target VPP address after the format mapping and the second mapping relationship stored in the second MMU, so that the GPU reads graphics data from the target PP address or writes graphics data to the target PP address.
• For the sampling process, although the graphics data is ultimately sampled according to non-native-format PP addresses, it is in fact sampled in the order of the local-format target VPP addresses, so the order in which the GPU finally samples the graphics data differs from the order in which it would sample according to the real physical addresses. Because the sampling order is changed, the sampled graphics data is, from the GPU's perspective, graphics data in the local format that the GPU can recognize and process; the graphics data finally read from memory according to the non-native-format PP addresses is thus in a graphics format the GPU can recognize.
• Similarly, when the GPU renders graphics data in the local format, it can first address the data using the virtual physical (VPP) addresses of this application and then obtain the non-native-format PP addresses from the VPP addresses. In this way, when the GPU writes local-format graphics data into non-native-format memory, the data finally written to memory is still written according to the non-native-format PP addresses, and the graphics data written to memory is, from the GPU's perspective, non-native-format data that the GPU cannot recognize. Thus the present application requires no explicit format conversion stage when sampling or rendering graphics in non-native formats.
• For example, the format conversion processor 104 in FIG. 1 needs to read the graphics data in the local GPU cache 108 through the bus and then convert it into a format acceptable to the encoder 105; in the present application, format-conversion buffers such as the local GPU cache 107 and the local GPU cache 108 in FIG. 1 are not needed.
• The sampling and rendering process of the present application can be shown in FIG. 2G. The GPU 103 can sample the non-local-format cache 106 through the intermediate layer constructed from the second virtual address space, and can likewise render data to the non-local-format cache 109 through that intermediate layer, which greatly reduces the processing delay and bandwidth consumption of the system, reduces system memory usage, and removes the cost of the format conversion processors.
• The virtualization software agent mentioned in the above process can be implemented in software; its program code can be stored in the memory of the terminal device and executed by the CPU. The hardware virtualization manager can be a combination of hardware and software: its hardware structure can be placed, together with the GPU, on the bus in the device, and the corresponding program code can be stored in the memory of the terminal device. Optionally, the virtualization software agent, the hardware virtualization manager, and the GPU are integrated on the same SoC.
• The embodiments of the present application can be used in the graphics processing of a terminal device capable of displaying graphics. The terminal device can be a mobile terminal or a fixed terminal: the mobile terminal can be a mobile phone, a tablet computer, or another mobile device with a display function, and the fixed terminal can be, for example, a personal computer or another device with a display function.
• The terminal device includes a display, a processor, a storage, a transceiver, and a bus, where the storage includes the memory mentioned above.
• The processor may include an SoC, on which a GPU, a hardware virtualization manager, a vector permutation unit (VPU), a CPU, an image signal processing (ISP) unit, a cache, dynamic random access memory (DRAM) controllers, buses, and the like can be arranged. The GPU, VPU, CPU, ISP, cache, and DRAM controllers can be coupled through connectors.
• Here, coupling refers to mutual connection in a specific manner, including direct connection or indirect connection through other devices, for example through various interfaces, transmission lines, or buses. These interfaces are usually electrical communication interfaces, but the possibility of mechanical interfaces or interfaces of another form is not ruled out, which is not limited in this embodiment.
  • the embodiments of the present application can be specifically applied to the SoC's graphics sampling and rendering process.
  • an embodiment of the present application provides a graphics processing method, which includes the following steps:
  • the terminal device obtains the first virtual address to be accessed by the graphics processor GPU, and the first virtual address belongs to the first virtual address space.
  • the physical address mapped by the first virtual address corresponds to the image data in the first format.
  • the terminal device obtains the second virtual address according to the first virtual address, and the second virtual address belongs to the second virtual address space.
• The second virtual address space is different from the first virtual address space; the second virtual address space and the first virtual address space are mapped to the same first physical address space, and the physical address mapped by the second virtual address corresponds to image data in the second format.
  • the embodiment of the present application reconstructs a second virtual address space, and the second virtual address space is different from the first virtual address space.
• This application can map an address in the first virtual address space to an address in the newly added second virtual address space, where the physical address mapped by the second virtual address corresponds to image data in the second format, which is different from the image data in the first format.
• The embodiment of the application thus realizes image format conversion by mapping addresses into a newly added virtual address space. Without an image format processor performing format conversion, the GPU can access images in the non-local format; this avoids requesting additional local GPU caches, avoids multiple data migrations between the format conversion processor and the caches, reduces memory consumption, avoids delay, and saves bandwidth and power consumption.
• The foregoing obtaining of the second virtual address according to the first virtual address may include: translating, by the first MMU in the GPU, the first virtual address into an intermediate virtual address, the intermediate virtual address being a virtual address in the second virtual address space; and mapping, by the hardware virtualization manager, the intermediate virtual address to the second virtual address in the second virtual address space.
• That is, obtaining the second virtual address requires an address translation from the first virtual address to the intermediate virtual address, followed by an address mapping from the intermediate virtual address to the second virtual address. The latter mapping can be understood as the pixel-level address mapping completed in the newly added second virtual address space, which allows the GPU to access the image data at the corresponding physical address according to the second virtual address and thereby obtain local-format image data in the second format.
• The foregoing translation of the first virtual address into an intermediate virtual address may include: obtaining, by the first MMU in the GPU according to the first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, where the first mapping relationship is the mapping relationship between the first virtual address space and the second virtual address space.
  • the first mapping relationship may be stored in the memory management unit MMU of the GPU.
  • the first MMU may be integrated inside the GPU or located outside the GPU, which is not limited in this application.
• The method may further include: obtaining, by the hardware virtualization manager according to the second mapping relationship, the first physical address corresponding to the second virtual address in the first physical address space, where the second mapping relationship is the mapping relationship between the second virtual address space and the first physical address space.
• When the GPU accesses the real non-local-format physical addresses in the order of the local-format second virtual addresses, what it accesses is graphics data in the local format; that is, the GPU reads or writes the image data in the order of the converted local-format graphics data. Through this virtualization method, the GPU's sampling process requires no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to sample graphics data in non-native formats in a low-cost manner.
• Similarly, although the graphics data is rendered according to non-native-format physical addresses, the rendering is performed, from the GPU's perspective, in the order of native-format graphics data. The virtualization method of this application therefore means the GPU's rendering process also requires no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to render native-format graphics data into non-native-format memory in a low-cost manner.
• The method may further include: obtaining, through the virtualization software agent, the graphics processing request sent to the GPU, where the graphics processing request includes the first virtual address space and the first physical address space, so that the second virtual address space can be constructed according to the first virtual address space and the first physical address space. That is, when the application program sends a graphics processing request to the GPU, the embodiment of the present application can intercept the request through the virtualization software agent and construct the second virtual address space from it.
• The embodiment of the present application maps the discrete first physical address space and the first virtual address space to a continuous second virtual address space and then, in this virtual continuous space, converts the non-local-format graphics data into local-format graphics data; the format conversion in the embodiment of the present application is realized by changing the address arrangement in a virtual space.
  • the first mapping relationship and the second mapping relationship can be obtained according to the first virtual address space and the first physical address space.
• Regarding how to construct the second virtual address space, this application provides a possible design: the virtualization software agent obtains the size of the physical memory page (PP) corresponding to the first physical address space and the size of the virtual memory page (VP) corresponding to the first virtual address space, and maps the first physical address space into a continuous virtual memory space to obtain the second virtual address space, where the size of the virtual physical memory page (VPP) corresponding to the second virtual address space is greater than both the PP size and the VP size. The purpose is to make the second virtual address space constructed by this application cover both the first virtual address space and the real first physical address space, so that the relationship between the first virtual address space and the second virtual address space can be established.
• The foregoing obtaining of the second virtual address according to the first virtual address may include: translating the first virtual address into an intermediate virtual address; determining whether the intermediate virtual address belongs to the second virtual address space; and, when it does, mapping the intermediate virtual address to the second virtual address in the second virtual address space. Because multiple mapping relationships are maintained in the MMU of the GPU, the address produced by the MMU's translation may be the actual physical address of some other buffer rather than a virtual address in the second virtual address space, so this filtering judgment is required. In other words, the intermediate virtual address obtained from the MMU is not necessarily an address covered by the first mapping relationship.
  • the foregoing mapping of the intermediate virtual address to the second virtual address in the second virtual address space may include: obtaining the pixel coordinates corresponding to the intermediate virtual address; and obtaining the second virtual address according to the pixel coordinates.
• In the pixel-level address mapping process, the local-format second virtual address can be obtained from the pixel coordinates, so that the image data at the real physical address can be accessed through the local-format second virtual address to obtain local-format image data.
• In another possible design, if the image data in the first format is compressed data to be read by the GPU, mapping the intermediate virtual address to the second virtual address in the second virtual address space may include: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the intermediate virtual address; and calculating the second virtual address according to the compression offset information. The method then further includes: decompressing the target compressed graphics block after it is read. That is, the non-local-format image data stored in the memory may be image data in a compressed format, and after being read it needs to be decompressed.
• Correspondingly, for a write, mapping the intermediate virtual address to the second virtual address in the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtaining the second virtual address according to the address of the header data. That is, when writing data, considering that compressed data is stored in the memory, storing local-format image data into non-native-format memory requires obtaining the second virtual address for the compressed-format data and storing the data into the memory at the corresponding physical address according to that second virtual address.
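As a loose illustration of the compressed-read design only, the sketch below derives a second virtual address from pixel coordinates through a per-block offset table; the block size, image width, offset values, and the name `second_va_for_pixel` are all hypothetical, not taken from the embodiment:

```python
# Hypothetical sketch: pixel coordinates -> target compressed block ->
# compression offset -> second virtual address, for compressed image data.

BLOCK_W, BLOCK_H = 16, 16   # assumed compressed-block dimensions
BLOCKS_PER_ROW = 8          # assumed image width of 128 pixels

# Assumed per-block byte offsets within the compressed buffer
# (blocks are variable-length, hence the lookup table).
block_offsets = {0: 0x000, 1: 0x1A0, 8: 0xC40}

def second_va_for_pixel(x: int, y: int, base_va: int = 0) -> int:
    block_id = (y // BLOCK_H) * BLOCKS_PER_ROW + (x // BLOCK_W)
    return base_va + block_offsets[block_id]   # the compression offset info

print(hex(second_va_for_pixel(20, 0)))  # pixel (20, 0) lies in block 1 -> 0x1a0
```

The decompression of the block after it is read is a separate step performed by the format conversion processor and is not shown here.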
• In another possible design, if the image data in the first format is encrypted data that needs to be read by the GPU, mapping the intermediate virtual address to the second virtual address in the second virtual address space may include: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining the signature corresponding to the pixel coordinates; and obtaining, according to the signature, the second virtual address corresponding to the signature. The method then further includes: decrypting the read image data and sending the decrypted image data to the GPU. This design considers that the image data stored in the memory is in an encrypted format, so the pixel-level address mapping must yield the second virtual address of the encrypted-format data.
• In summary, by adding a second virtual address space, the embodiment of the present application enables the GPU to directly access image data in non-native formats.
  • an embodiment of the present application provides a graphics processing method. Taking a sampling process as an example, the method includes:
  • the virtualization software agent intercepts the graphics processing request sent by the application to the GPU.
  • the graphics processing request includes the non-native format VP address space and the non-native format PP address space applied by the application.
  • FIG. 6 shows a diagram of the software and hardware architecture of sampling non-native format graphics data in this application.
• The application program can send a graphics processing request to the GPU. The graphics processing request carries the resources required for sampling, including the texture buffer requested by the application; the texture buffer includes the non-native-format VP address space and the non-native-format PP address space. The virtualization software agent can intercept the graphics processing request sent by the application to the GPU in order to parse it.
  • the virtualization software agent constructs a VPP address space according to the VP address space and the PP address space, and the VPP address space is a continuous virtual address space.
• Specifically, the virtualization software agent can parse the intercepted graphics processing request to obtain the non-native-format VP address space and the non-native-format PP address space it contains. Further, according to the mapping relationship between the VP address space and the PP address space, the agent constructs the VPP address space and obtains the first mapping relationship between the VP address space and the VPP address space and the second mapping relationship between the VPP address space and the PP address space. The size of the VPP address space can be calculated from the PP size and the VP size, which has been described above and is not repeated here.
  • the GPU can read the graphics data in the local format according to the VPP address. Correspondingly, the GPU can also write the rendered image in the local format into the VPP address.
• It should be noted that the VPP address obtained here is virtual, not a real physical address; nevertheless, the data the GPU reads from or writes to this virtual physical address is graphics data in the local format that the GPU can access. After the non-local PP addresses are converted to second virtual addresses, the pixel arrangement of the graphics data has changed, so the buffered image obtained according to the mapped VPP addresses is in the local format, and graphics data in this native format is directly accessible by the GPU.
• The native format refers to the GPU-native format: an image format that the GPU hardware itself intrinsically supports and on which the GPU can naturally perform read and write operations. Commonly used native formats are defined by graphics APIs with 3D specification interfaces, such as the Open Graphics Library (OpenGL), OpenGL for Embedded Systems (OpenGL ES), and Direct3D. Commonly used native formats include RGBA8888, RGBA16F, RGB10A2, and SRGB8_A8_ASTC_3x3x3.
• Non-native formats refer to formats on which the GPU cannot directly perform read and write operations. Non-native formats include all formats supported by non-graphics APIs; these formats are generally produced in application scenarios outside the graphics domain. For example, non-native formats include Y10U10V10LPacked, the Y10U10V10 compressed format, ICE, and Y10U10V10.
  • the method also includes:
  • the virtualization software agent sends the first mapping relationship to the GPU, and sends the second mapping relationship and the VPP address space to the hardware virtualization manager.
• The VPP addresses in the VPP address space are used to replace the PP addresses of the PP address space in the MMU of the GPU, and the first mapping relationship between the VP addresses of the VP address space and the VPP addresses of the VPP address space is established in the MMU of the GPU. It should be noted that in the embodiment of the application, neither the hardware structure nor the software program of the GPU is changed: the VPP address space stored in the MMU is not perceptible to the GPU, which receives it passively.
• In the prior art, when the GPU reads and writes data, it ultimately sends the real physical address to the memory. The MMU in this application does not store the real physical address; for each non-native-format VP address it stores the corresponding VPP address within the VPP address range. Therefore, when the GPU reads or writes data in a non-native format, what the GPU sends toward the memory is the non-native-format VPP address obtained by address translation of the non-native-format VP address.
  • the hardware virtualization manager may include a snoop filter, a format conversion processor, and a second MMU.
  • the snoop filter is used to determine whether the physical address of the graphics data read by the GPU is within the VPP address range.
• The format conversion processor is used to perform the pixel-level address mapping in the second virtual address space, converting the VPP address (the intermediate virtual address) sent when the GPU reads graphics data into the local-format target VPP address (the second virtual address), and to decompress or decrypt the read graphics data, and so on.
  • the second MMU stores a second mapping relationship, and the second mapping relationship is a mapping relationship between the VPP address space and the PP address space.
  • the second mapping relationship here may be that the virtualization software agent configures the second mapping relationship into the second MMU when constructing the VPP address space.
  • the implementation may be as follows: the virtualization software agent sends configuration information to the second MMU, where the configuration information includes the second mapping relationship.
  • sending the VPP address space and the second mapping relationship to the hardware virtualization manager may include: sending the VPP address space to the snoop filter, and sending the second mapping relationship to the second MMU.
  • after the VPP address space is sent to the snoop filter, the snoop filter stores the VPP address space.
  • the following describes the case where the GPU is sampling graphics data.
  • a piece of graphics data corresponds to a VP address and a PP address, and similarly, it also corresponds to a VPP address.
  • when the GPU wants to read graphics data, there are multiple mapping tables maintained in the GPU's MMU. If the graphics data in the local format is stored in the memory to be sampled by the GPU, and the real physical address is stored in the GPU's MMU, then the GPU sends the real physical address to the memory.
  • in that case, the snoop filter of the hardware virtualization manager detects that the real physical address is not in the VPP address space, and the hardware virtualization manager can discard the received real physical address.
  • that is, the snoop filter filters out physical addresses that are not in the VPP address space.
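The snoop filter's pass-through-or-forward decision can be pictured with a minimal sketch. This is illustrative only and assumes the VPP address space is a single contiguous range [base, base + size); the function names are hypothetical, not from the patent.

```python
def in_vpp_space(addr, vpp_base, vpp_size):
    """Return True if addr falls inside the virtual physical (VPP) range."""
    return vpp_base <= addr < vpp_base + vpp_size

def filter_access(addr, vpp_base, vpp_size):
    # Addresses outside the VPP range are real physical addresses and pass
    # through untouched; addresses inside the range are forwarded to the
    # format conversion processor for pixel-level mapping.
    if in_vpp_space(addr, vpp_base, vpp_size):
        return "forward-to-format-converter"
    return "pass-through"
```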
  • the hardware virtualization manager parses the access command of the GPU to obtain the intermediate virtual address carried in the access command.
  • the hardware virtualization manager determines whether the intermediate virtual address is in the VPP address space.
  • if the GPU samples graphics data in the local format, the intermediate virtual address carried in the GPU access command is the real PP address, and the snoop filter detects that this PP address is not within the VPP address range; if the GPU samples graphics data in a non-native format, the snoop filter detects that the intermediate virtual address carried in the GPU access command is in the virtual VPP address space of this application.
  • the hardware virtualization manager determines that the intermediate virtual address is a first VPP address in the VPP space.
  • specifically, the snoop filter determines that the intermediate virtual address is a VPP address in the VPP space.
  • the hardware virtualization manager performs format mapping on the first VPP address to obtain a target VPP address in a local format.
  • specifically, the snoop filter sends the first VPP address to the format conversion processor, so that the format conversion processor converts the first VPP address into the target VPP address.
  • the need to convert the first VPP address into the target VPP address arises from the pixel format stored in the memory; for example, the pixel format may be a compressed format or an encrypted format.
  • the target VPP address is the VPP address corresponding to the pixel format in the memory after address conversion.
  • the format conversion processor converting the first VPP address into the target VPP address can be adapted to multiple scenarios.
  • the embodiment of the present application uses the following three scenarios for illustration.
  • in one scenario, the format conversion processor obtains the pixel coordinates (x, y) corresponding to the first VPP address, and then obtains the target VPP address corresponding to the pixel coordinates (x, y).
  • an example of obtaining the target VPP address corresponding to the pixel coordinates (x, y) is as follows:
  • TileW and TileH represent the width and height of the pixel tile bound to the first VPP address of the non-native format, and WidthInTile represents the width of the image measured in tiles.
  • from the pixel coordinates (x, y) of the non-native-format first VPP address and the width and height of the pixel tile, the coordinates of the pixel tile (TileX, TileY) and the offset coordinates within the pixel tile (TileOffsetX, TileOffsetY) are calculated; finally, the target VPP address PixelAddress in the local format is calculated from the width and height of the pixel tile (TileW and TileH), the coordinates of the pixel tile (TileX, TileY), and the offset coordinates within the pixel tile (TileOffsetX, TileOffsetY).
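The tile-based calculation above can be sketched as follows. The exact layout is an assumption here: a common row-major tiled layout in which tiles of TileW×TileH pixels are stored one after another, WidthInTile tiles per row; the patent does not fix a specific formula.

```python
def tiled_pixel_address(x, y, tile_w, tile_h, width_in_tiles, bytes_per_pixel=1):
    """Map pixel (x, y) to a byte offset in a row-major tiled layout
    (hypothetical formula mirroring TileW/TileH/WidthInTile in the text)."""
    tile_x, tile_y = x // tile_w, y // tile_h       # which tile (TileX, TileY)
    off_x, off_y = x % tile_w, y % tile_h           # offset inside the tile
    tile_index = tile_y * width_in_tiles + tile_x   # tiles stored row-major
    pixel_index = tile_index * tile_w * tile_h + off_y * tile_w + off_x
    return pixel_index * bytes_per_pixel            # PixelAddress
```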
  • in another scenario, when the format conversion processor converts the first VPP address into the target VPP address, it can first obtain the pixel coordinates (x, y) corresponding to the first VPP address, obtain from the pixel coordinates (x, y) the index of the graphics block to be accessed, then use the index to obtain the header data of the corresponding graphics block stored in advance by the format conversion processor, read the compression offset information stored in the header data, and obtain the target VPP address corresponding to the graphics block according to the compression offset information.
  • the compression offset information of the header data can be understood as the address of the header data.
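A hedged sketch of this header-based lookup, assuming a simplified layout in which a per-block header table records each compressed block's offset from a base address; the real header format is vendor-specific, and the index formula here is a hypothetical row-major one.

```python
def compressed_target_address(x, y, tile_w, tile_h, width_in_tiles,
                              header_table, compressed_base):
    """Look up the target address of the compressed block holding pixel (x, y).

    header_table maps a block index to that block's compression offset
    (the 'address of the header data'); the layout is a simplified assumption.
    """
    index = (y // tile_h) * width_in_tiles + (x // tile_w)
    offset = header_table[index]   # compression offset read from the header
    return compressed_base + offset
```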
  • the texture samplers in some GPUs are protected by digital rights management. Therefore, not only ordinary graphics data but also additional signatures are encoded into the memory; that is, the graphics data stored in the memory is encrypted, and sampling the graphics data also requires multiple authentications to obtain the data.
  • in this scenario, when converting the first VPP address into the target VPP address, the format conversion processor can be controlled to obtain the pixel coordinates (x, y) corresponding to the first VPP address, obtain the signature of the pixel corresponding to the pixel coordinates (x, y), and then obtain, from the correspondence pre-stored in the format conversion processor, the target VPP address corresponding to the signature.
  • further, after the graphics data is read according to the physical address in the non-local format, when the graphics data is to be transmitted back to the GPU, since the graphics data stored in the memory is encrypted, the method also includes: the format conversion processor decrypts the read graphics data and sends the decrypted graphics data to the GPU.
  • the hardware virtualization manager obtains the target PP address (first physical address) according to the target VPP address and the second mapping relationship, so that the GPU reads the graphics data from the target PP address.
  • after the format conversion processor obtains the target VPP address, it can transmit the target VPP address to the second MMU. Since the second mapping relationship is stored in the second MMU, the second MMU can look up the non-native-format PP address corresponding to the target VPP address according to the second mapping relationship, and the second MMU then sends the non-native-format PP address to the memory to read the corresponding non-local-format graphics data from the memory.
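The second MMU's VPP-to-PP translation is an ordinary page-table lookup. Here is a minimal single-level sketch, assuming a flat table from VPP page number to PP page base (real MMUs use multi-level tables, and the page size is illustrative):

```python
PAGE_SIZE = 4096  # illustrative VPP page size

def translate_vpp_to_pp(vpp_addr, page_table):
    """Translate a VPP address to a PP address via the second mapping
    relationship; page_table maps VPP page number -> PP page base."""
    page_number = vpp_addr // PAGE_SIZE
    page_offset = vpp_addr % PAGE_SIZE
    pp_page_base = page_table[page_number]  # raises KeyError if unmapped
    return pp_page_base + page_offset
```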
  • in summary, what is stored in the MMU in the GPU is the VPP address range, i.e., the correspondence between the virtual physical address range and the VP address range. Since the way addresses are arranged reflects the order in which graphics data is sampled, and the arrangement of the second virtual addresses proposed in this application is the arrangement of the physical addresses corresponding to graphics data in the GPU native format, when the real non-native-format physical address is obtained from the second virtual address, the order of sampling graphics data according to that non-native-format physical address is the order of the graphics data in the native format after conversion. Therefore, the virtualization method of this application removes the need for an additional format conversion buffer in the GPU sampling process and also omits the actual format conversion, enabling the GPU to sample non-local-format graphics data at low cost.
  • this application also provides a graphics processing method, taking the process of enabling GPU rendering to a non-native format buffer as an example, as shown in Figure 7, the method includes:
  • the virtualization software agent intercepts the graphics processing request and graphics data sent by the application program to the GPU.
  • the graphics processing request includes the non-native format VP address space and the non-native format PP address space applied by the application program.
  • the implementation of step 701 is similar to that of step 501, except that when the virtualization software agent intercepts the graphics processing request, it also intercepts the graphics data to be written into the memory.
  • the graphics data is graphics data in the local format of the GPU, and the embodiment of this application writes this local-format graphics data into memory in the non-local format.
  • graphics data may also be sent directly to the GPU by the application program without being intercepted by the virtualization software agent.
  • the virtualization software agent constructs a VPP address space according to the VP address space and the PP address space, and the VPP address space is a continuous virtual address space.
  • the implementation of step 702 is similar to that of step 502. The difference is that the VPP address space is used to enable the GPU to render the graphics data in the local format according to the address order of the local format. Similarly, because the address order is different, the format of the graphics data rendered according to the addresses is also different.
  • that is, the graphics data stored in the memory is graphics data in a non-native format, and the application applies for a non-native-format VP address space and a non-native-format PP address space, but the GPU needs the graphics data in the local format; the GPU can first obtain the graphics data in the local format according to the local-format address order of the VPP addresses proposed in this application, so that the VPP addresses can be reverse-mapped to the non-native-format PP addresses.
  • the virtualization software agent sends the first mapping relationship to the GPU, and sends the second mapping relationship and the VPP address space to the hardware virtualization manager.
  • the implementation of step 703 is similar to that of step 503. The difference is that if the virtualization software agent also intercepts graphics data, it also needs to send the graphics data to the GPU, so that the GPU writes the intercepted graphics data into memory through the hardware virtualization manager. The snoop filter is used to determine whether the physical address of the graphics data rendered by the GPU is within the VPP address range. The format conversion processor is used to convert the first VPP address sent when the GPU renders the graphics data into the target VPP address, and to compress or encrypt the graphics data to be written.
  • the hardware virtualization manager parses the GPU access command to obtain the intermediate virtual address carried in the access command.
  • the hardware virtualization manager determines whether the intermediate virtual address is in the VPP address space.
  • for the implementation of step 705, refer to step 505 above.
  • the hardware virtualization manager determines that the intermediate virtual address is a first VPP address in the VPP space.
  • the hardware virtualization manager performs format mapping on the first VPP address to obtain a target VPP address in a local format.
  • specifically, the snoop filter sends the first VPP address and the graphics data received from the GPU to the format conversion processor, and controls the format conversion processor to convert the first VPP address into the target VPP address. Similar to step 507, the address conversion performed by the format conversion processor also takes into account the different pixel formats in the memory.
  • controlling the format conversion processor to convert the first VPP address to the target VPP address can also be applied to a variety of scenarios:
  • one scenario: refer to the description of case 1) in step 508.
  • another scenario is similar to case 2) in step 508, except that the graphics data to be rendered needs to be compressed during the address conversion process. Specifically, if the pixel format stored in the memory is a compressed format and the GPU needs to write graphics data into the memory, then when converting the first VPP address into the target VPP address, the format conversion processor can first obtain the pixel coordinates (x, y) corresponding to the first VPP address, calculate the index of the graphics block corresponding to the graphics data according to the pixel coordinates (x, y), obtain the header data of the graphics block according to the index, and compress the header data and the graphics data; the correspondence between the compressed header data and the index is stored in the format converter, so that this correspondence can be used in the subsequent sampling process. The target VPP address corresponding to the graphics block is then calculated according to the address of the header data.
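A rough illustration of this write-side flow, using zlib as a stand-in for whatever block compressor the memory format actually uses; the index formula and the header-store layout are assumptions for illustration.

```python
import zlib

def write_compressed_block(x, y, tile_w, tile_h, width_in_tiles,
                           pixel_block, header_store):
    """Compress a rendered block and record its header under the block
    index, mirroring the store-for-later-sampling step described above."""
    index = (y // tile_h) * width_in_tiles + (x // tile_w)
    compressed = zlib.compress(pixel_block)
    # Header keeps whatever later sampling needs; here just size and payload.
    header_store[index] = {"size": len(compressed), "data": compressed}
    return index
```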
  • another scenario is similar to case 3) in step 508, except that when converting the first VPP address into the target VPP address, if the pixel format stored in the memory is an encrypted format, the format conversion processor also needs to encrypt the graphics data to be written.
  • the specific encryption method may be implemented using a simple stream cipher, or a relatively complex private block cipher, which is not limited in this application.
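As a toy illustration of the "simple stream cipher" option, XOR with a repeating keystream is the classic example; this is only a sketch (a real deployment would use a proper cipher with a non-repeating keystream), and the function name is hypothetical.

```python
from itertools import cycle

def xor_stream(data, key):
    """Toy XOR stream cipher: the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Because XOR is its own inverse, applying `xor_stream` twice with the same key recovers the original graphics data.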
  • the hardware virtualization manager obtains the target PP address (first physical address) according to the target VPP address and the second mapping relationship, so that the GPU writes graphics data to the target PP address.
  • after the format conversion processor obtains the target VPP address, it sends the target VPP address and the compressed or encrypted graphics data to the second MMU; the second MMU looks up the non-local-format PP address corresponding to the target VPP address according to the stored second mapping relationship, and then sends the graphics data to the memory according to the found PP address, so as to write the graphics data into the non-local-format memory according to that PP address.
  • in summary, what is stored in the MMU in the GPU is a virtual physical address such as the VPP address, i.e., an address in the second virtual address space. Since the way addresses are arranged reflects the order in which graphics data is rendered, and the arrangement of the virtual physical addresses proposed in this application is the arrangement of the physical addresses corresponding to graphics data in the GPU native format, when the corresponding real rendering address, i.e., the non-local-format physical address, is obtained from the virtual physical address, the order of rendering graphics data according to that non-native-format physical address is, for the GPU, the order of the graphics data in the native format. Therefore, the virtualization method of this application removes the need for an additional format conversion buffer in the GPU rendering process and also omits the actual format conversion, so that the GPU can render local-format graphics data into non-local-format memory at low cost.
  • the terminal device includes a hardware structure and/or software module corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer-software-driven hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
  • the embodiment of the present application may divide the terminal device into functional modules according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 8 shows a possible structural schematic diagram of the terminal device involved in the above embodiment.
  • the terminal device may include a graphics processing device, and the graphics processing device may be used to execute the method steps corresponding to FIG. 5 and the method steps corresponding to FIG. 7.
  • the terminal device 80 includes: an acquisition unit 801, a transmission unit 802, and a determination unit 803.
  • the acquiring unit 801 is used to support the terminal device in performing processes 501, 502, 504, 507, and 508 in FIG. 5, and processes 701, 702, 704, 707, and 708 in FIG. 7;
  • the transmission unit 802 is used to perform process 503 in FIG. 5 and process 703 in FIG. 7;
  • the determining unit 803 is used to support the terminal device 80 in executing processes 505 and 506 in FIG. 5, and processes 705 and 706 in FIG. 7. All relevant content of each step involved in the above method embodiments can be found in the function descriptions of the corresponding functional modules, and details are not repeated here.
  • FIG. 9 shows a schematic diagram of a possible structure of the terminal device involved in the foregoing embodiment.
  • the terminal device 90 includes a processing module 902 and a communication module 903.
  • the processing module 902 is used to control and manage the actions of the terminal device.
  • for example, the processing module 902 is used to support the terminal device in executing processes 501-508 in FIG. 5, processes 701-708 in FIG. 7, and/or other processes of the techniques described herein.
  • the communication module 903 is used to support communication between the terminal device and other network entities.
  • the terminal device may further include a storage module 901 for storing program code and data of the terminal device.
  • the program code and data include the program code and data of the virtualization software agent and the hardware virtualization manager of the present application.
  • the processing module 902 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various exemplary logical blocks, modules, and circuits described in conjunction with the disclosure of this application.
  • the processor may also be a combination for realizing computing functions, for example, including a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the communication module 903 may be a transceiver, a transceiver circuit, or a communication interface.
  • the storage module 901 may be a memory.
  • the memory includes the program code and data of the virtualization software agent of the present application and the hardware virtualization manager.
  • the processor includes the hardware structure of the hardware virtualization manager of the present application.
  • the terminal device involved in the embodiment of the present application may be the terminal device shown in FIG. 3.


Abstract

This application discloses a graphics processing method and apparatus, relating to the field of chip technology, capable of solving the problem of high cost when a GPU samples, or needs to render to, graphics data in a non-native format. The method is: obtaining a first virtual address to be accessed by a graphics processing unit (GPU), the first virtual address belonging to a first virtual address space; and obtaining a second virtual address according to the first virtual address, the second virtual address belonging to a second virtual address space. The second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same physical address space, the physical address to which the first virtual address is mapped corresponds to image data in a first format, and the physical address to which the second virtual address is mapped corresponds to image data in a second format. The embodiments of this application are used for GPU sampling of, or rendering to, a non-native format.

Description

A graphics processing method and apparatus — Technical Field
This application relates to the field of chip technology, and in particular, to a graphics processing method and apparatus.
Background
A graphics processing unit (GPU) can be used to perform the complex mathematical and geometric computations required for graphics rendering. Its basic working principle is based on a three-dimensional vertex model: it performs the corresponding transformations according to the principles of perspective, samples the textures corresponding to these vertices, and writes the rendering result into a frame buffer. The buffer from which the GPU samples textures is usually called a texture buffer. A GPU can only support sampling texture buffers in certain formats; similarly, a GPU can only support render-target buffers in certain formats. In some scenarios, the data format stored in a buffer is often not natively supported by the GPU. When the GPU needs to sample such a data format, the currently common method is to explicitly convert the non-native GPU format into the GPU native format and store it in an intermediate buffer for the GPU to sample. Alternatively, when the GPU needs to render into such a data format, the GPU first renders into the native format according to the sampling result, and the native-format data is then explicitly converted into non-native-format data and stored in an intermediate buffer. This method requires applying for an additional buffer, consumes memory, introduces additional latency, wastes bandwidth and power, and has a high cost overhead.
Summary
The embodiments of this application provide a graphics processing method and apparatus, which can solve the problems of high memory consumption and high bandwidth cost when a GPU samples graphics data in a non-native format or needs to render graphics data into a non-native format.
According to a first aspect, a graphics processing method is provided, including: obtaining a first virtual address to be accessed by a graphics processing unit (GPU), the first virtual address belonging to a first virtual address space; and obtaining a second virtual address according to the first virtual address, the second virtual address belonging to a second virtual address space. The second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same first physical address space, the physical address to which the first virtual address is mapped corresponds to image data in a first format, and the physical address to which the second virtual address is mapped corresponds to image data in a second format.
It can be understood that the embodiments of this application construct a new second virtual address space that is distinct from the first virtual address space. This application can map an address in the first virtual address space to an address in this newly added second virtual address space; the physical address to which a second virtual address in the newly added second virtual address space is mapped corresponds to graphics data in the second format, which is distinct from image data in the first format. For example, if the image data in the first format is image data that the GPU cannot access directly, and the image data in the second format is image data that the GPU can access, the embodiments of this application realize image format conversion by mapping addresses into a newly added virtual address space, without requiring an image format processor to perform the conversion. The GPU can thus access images in a non-native format, avoiding applying for an additional local GPU buffer, avoiding multiple migrations between a format conversion processor and buffers, reducing memory consumption, avoiding latency, and saving bandwidth and power.
In a possible design, obtaining the second virtual address according to the first virtual address specifically includes: translating the first virtual address into an intermediate virtual address, the intermediate virtual address being a virtual address in the second virtual address space; and mapping the intermediate virtual address to the second virtual address within the second virtual address space. When the first virtual address is a non-native-format first virtual address, the intermediate virtual address can be understood as a non-native-format intermediate virtual address, and the second virtual address can be understood as a native-format second virtual address. That is, obtaining the second virtual address requires one address translation from the first virtual address to the intermediate virtual address, and one address mapping from the intermediate virtual address to the second virtual address; the latter can be understood as completing a pixel-level address mapping within the newly added second virtual address space, so that the GPU can access the image data corresponding to the physical address according to the second virtual address and obtain image data in the second, native, format.
The embodiments of this application add a second virtual address space between the existing first virtual address space and the first physical address space, and complete pixel-level address mapping within this newly added second virtual address space, mapping the addresses of non-native-format images to the addresses of native-format images that the GPU can access. Further, the embodiments of this application split the single prior-art address translation from the first virtual address to the first physical address into two address translations and one pixel-level address mapping: the translation from the first virtual address to the intermediate virtual address, the pixel-level address mapping from the intermediate virtual address to the second virtual address, and the translation from the second virtual address to the first physical address. The GPU can thus read or render non-native-format images without explicit format conversion, avoiding applying for an additional local GPU buffer, avoiding multiple migrations between a format conversion processor and buffers, reducing memory consumption, avoiding latency, and saving bandwidth and power.
In a possible design, translating the first virtual address into the intermediate virtual address specifically includes: obtaining, according to a first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space. The first mapping relationship may be stored in a memory management unit (MMU) of the GPU. In the embodiments of this application, the MMU may be integrated inside the GPU or located outside the GPU, which is not limited by this application.
In a possible design, after obtaining the second virtual address according to the first virtual address, the method further includes: obtaining, according to a second mapping relationship, the first physical address corresponding to the second virtual address in the first physical address space, the second mapping relationship being the mapping relationship between the second virtual address space and the first physical address space.
It can be understood that this application first performs an address translation from the first virtual address to the intermediate virtual address, then an address mapping from the intermediate virtual address to the second virtual address, and then another address translation from the second virtual address to the first physical address. Both the non-native-format intermediate virtual address before the format mapping and the native-format second virtual address after the format mapping belong to this contiguous second virtual address space; only the arrangement changes. When the GPU accesses the real non-native-format physical addresses in the order of the native-format second virtual addresses, what it accesses is native-format graphics data; that is, the GPU reads or writes image data in the order of the converted native-format graphics data. Therefore, through virtualization, this application makes the GPU sampling process require no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to sample non-native-format graphics data at low cost. Similarly, when the non-native-format physical address of the real rendering data is obtained according to the second virtual address space, the order in which graphics data is rendered according to that non-native-format physical address is, for the GPU, the order of native-format graphics data. Therefore, through virtualization, this application makes the GPU rendering process require no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to render native-format graphics data into non-native-format memory at low cost.
In a possible design, before obtaining the second virtual address according to the first virtual address, the method further includes: obtaining a graphics processing request sent to the GPU, the graphics processing request including the first virtual address space and the first physical address space; and constructing the second virtual address space according to the first virtual address space and the first physical address space.
That is, when an application sends a graphics processing request to the GPU, the embodiments of this application can intercept the graphics processing request through a virtualization software agent and construct the second virtual address space according to the request. It should be understood that the embodiments of this application first map the discrete first physical address space and the first virtual address space into a contiguous second virtual address space, and then, within this virtualized contiguous space, convert non-native-format graphics data into native-format graphics data; the format conversion in the embodiments of this application is achieved by changing the arrangement of addresses in the virtual space.
In a possible design, the method further includes: obtaining the first mapping relationship and the second mapping relationship according to the first virtual address space and the first physical address space.
In a possible design, constructing the second virtual address space according to the first virtual address space and the first physical address space specifically includes: obtaining the size of the physical memory pages (PP) corresponding to the first physical address space and the size of the virtual memory pages (VP) corresponding to the first virtual address space; and mapping the first physical address space into a contiguous virtual memory space to obtain the second virtual address space, where the size of the virtual physical memory pages (VPP) corresponding to the second virtual address space is larger than the size of the PP and the size of the VP.
The purpose of this is that the second virtual address space constructed by this application must cover the first virtual address space and the first physical address space of real physical addresses, in order to establish the first mapping relationship between the first virtual address space and the second virtual address space, and the second mapping relationship between the second virtual address space and the first physical address space.
In a possible design, obtaining the second virtual address according to the first virtual address includes:
translating the first virtual address into an intermediate virtual address; determining whether the intermediate virtual address belongs to the second virtual address space; and when the intermediate virtual address belongs to the second virtual address space, mapping the intermediate virtual address to the second virtual address within the second virtual address space. Because the MMU of the GPU maintains multiple mapping relationships, the virtual address translated by the MMU may be the actual physical address of another buffer rather than a virtual address in the second virtual address space, so determination and filtering are needed here. That is, the intermediate virtual address obtained by the MMU is not necessarily a virtual address in the first mapping relationship.
In a possible design, mapping the intermediate virtual address to the second virtual address within the second virtual address space specifically includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; and obtaining the second virtual address according to the pixel coordinates. This is one implementation of the pixel-level address mapping described above: the native-format second virtual address can be obtained through the pixel coordinates, so that the image data at the real physical address is accessed through the native-format second virtual address and native-format image data is obtained.
In a possible design, the image data in the first format is compressed data to be read by the GPU, the compressed data including multiple compressed graphics blocks, and mapping the intermediate virtual address to the second virtual address within the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the intermediate virtual address; and computing the second virtual address according to the compression offset information. The method further includes: decompressing the target compressed graphics block that is read. This takes into account that, during pixel-level address conversion, the non-native-format image data stored in the memory may be image data in a compressed format; correspondingly, when image data is sampled, the image data also needs to be decompressed.
In a possible design, the image data in the first format is compressed data to be written by the GPU, and mapping the intermediate virtual address to the second virtual address within the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtaining the second virtual address according to the address of the header data. That is, when writing data, considering that the memory stores compressed data, when storing native-format image data into non-native-format memory, the second virtual address of the compressed-format data must be obtained, so that the data can be placed into the memory at the physical address according to the second virtual address.
In a possible design, mapping the intermediate virtual address to the second virtual address within the second virtual address space includes: obtaining the pixel coordinates corresponding to the intermediate virtual address; obtaining the signature of the pixel corresponding to the pixel coordinates; and obtaining, according to the signature, the second virtual address corresponding to the signature. If the image data in the first format is encrypted data to be read by the GPU, the method further includes: decrypting the read image data, and sending the decrypted image data to the GPU. This case takes into account that the image data stored in the memory is in an encrypted format, so the second virtual address under the encrypted format must be obtained when performing the pixel-level address mapping.
According to a second aspect, a graphics processing apparatus is provided. The apparatus includes a graphics processing unit (GPU) and a hardware virtualization manager, where:
the GPU is configured to obtain a first virtual address to be accessed, the first virtual address belonging to a first virtual address space; the hardware virtualization manager is configured to obtain a second virtual address according to the first virtual address, the second virtual address belonging to a second virtual address space; the second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same first physical address space, the physical address to which the first virtual address is mapped corresponds to image data in a first format, and the physical address to which the second virtual address is mapped corresponds to image data in a second format.
In a possible design, the GPU includes a first memory management unit (MMU), and the hardware virtualization manager includes a format conversion processor; the first MMU is configured to translate the first virtual address into an intermediate virtual address, the intermediate virtual address being a virtual address in the second virtual address space; the format conversion processor is configured to map the intermediate virtual address to the second virtual address within the second virtual address space.
In a possible design, the first MMU is configured to: obtain, according to a first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space.
In a possible design, the hardware virtualization manager includes a second MMU, the second MMU being configured to: obtain, according to a second mapping relationship, the first physical address corresponding to the second virtual address in the first physical address space, the second mapping relationship being the mapping relationship between the second virtual address space and the first physical address space.
In a possible design, the apparatus further includes a central processing unit (CPU) on which a virtualization software agent runs, the virtualization software agent being configured to: obtain a graphics processing request sent to the GPU, the graphics processing request including the first virtual address space and the first physical address space; and construct the second virtual address space according to the first virtual address space and the first physical address space.
In a possible design, the virtualization software agent is further configured to: obtain the first mapping relationship and the second mapping relationship according to the first virtual address space and the first physical address space.
In a possible design, the virtualization software agent is specifically configured to: obtain the size of the physical memory pages (PP) corresponding to the first physical address space and the size of the virtual memory pages (VP) corresponding to the first virtual address space; and map the first physical address space into a contiguous virtual memory space to obtain the second virtual address space, where the size of the virtual physical memory pages (VPP) corresponding to the second virtual address space is larger than the size of the PP and the size of the VP.
In a possible design, the GPU includes the first MMU, and the hardware virtualization manager includes a snoop filter and the format conversion processor; the first MMU is configured to: translate the first virtual address into an intermediate virtual address; the snoop filter is configured to: determine whether the intermediate virtual address belongs to the second virtual address space, and when the intermediate virtual address belongs to the second virtual address space, send the intermediate virtual address to the format conversion processor; the format conversion processor is configured to: map the intermediate virtual address to the second virtual address within the second virtual address space.
In a possible design, the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; and obtain the second virtual address according to the pixel coordinates.
In a possible design, the image data in the first format is compressed data to be read by the GPU, the compressed data including multiple compressed graphics blocks; the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the intermediate virtual address; and compute the second virtual address according to the compression offset information; the format conversion processor is further configured to: decompress the target compressed graphics block that is read.
In a possible design, the image data in the first format is compressed data to be written by the GPU; the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtain the second virtual address according to the address of the header data.
In a possible design, the format conversion processor is specifically configured to: obtain the pixel coordinates corresponding to the intermediate virtual address; obtain the signature of the pixel corresponding to the pixel coordinates; and obtain, according to the signature, the second virtual address corresponding to the signature; if the image data in the first format is encrypted data to be read by the GPU, the format conversion processor is further configured to: decrypt the read image data and send the decrypted image data to the GPU.
The first virtual address mentioned in the first and second aspects above may correspond to the target virtual address in the virtual address space mentioned in the third and fourth aspects introduced below; the first virtual address space mentioned in the first and second aspects corresponds to the virtual address space mentioned in the third and fourth aspects; the second virtual address space mentioned in the first and second aspects corresponds to the virtual physical address space mentioned in the third and fourth aspects; the first physical address space mentioned in the first and second aspects corresponds to the physical address space mentioned in the third and fourth aspects; the intermediate virtual address mentioned in the first and second aspects corresponds to the intermediate physical address mentioned in the third and fourth aspects; the second virtual address mentioned in the first and second aspects corresponds to the target virtual physical address mentioned in the third and fourth aspects; the first physical address mentioned in the first and second aspects corresponds to the target physical address mentioned in the third and fourth aspects; when the intermediate physical address in the third and fourth aspects belongs to the second virtual address space, the intermediate physical address mentioned in the third and fourth aspects is the first virtual physical address, and the first virtual physical address is then equivalent to the intermediate virtual address mentioned in the first and second aspects.
According to a third aspect, a graphics processing method is provided, including: a virtualization software agent constructs a virtual physical address space, the virtual physical address space being memory space outside the virtual address space and the physical address space; a hardware virtualization manager maps, within the virtual physical address space, the address of the non-native-format image data to be accessed, obtaining the target virtual physical address of the native-format image data; and the target physical address corresponding to the target virtual physical address is obtained, and the image data at the target physical address is accessed. The physical address to which a virtual address in the virtual address space is mapped corresponds to image data in the first format, and the physical address to which the target virtual physical address in the virtual physical address space is mapped corresponds to image data in the second format.
The embodiments of this application add a virtual physical address space between the existing virtual address space and physical address space, and complete pixel-level address mapping within this newly added virtual physical address space, mapping the addresses of non-native-format images to the addresses of native-format images that the GPU can access. Further, the embodiments of this application split the single prior-art address mapping from virtual address to physical address into two address mappings: from virtual address to virtual physical address, and from virtual physical address to physical address. The GPU can thus read or render non-native-format images without explicit format conversion, avoiding applying for an additional local GPU buffer, avoiding multiple migrations between a format conversion processor and buffers, reducing memory consumption, avoiding latency, and saving bandwidth and power.
It should be understood that the embodiments of this application first map the discrete physical addresses into a contiguous virtual physical memory page (VPP) address space (the virtual physical address space), and then change the arrangement of the addresses within this virtualized contiguous space. The physical address mapped by the rearranged virtual address corresponds to image data in the second format; specifically, accessing the image data in the physical address space in the order of the rearranged virtual addresses yields image data in the second format. The format conversion in the embodiments of this application is achieved by changing the arrangement of addresses in the virtual space: both the non-native-format VPP addresses before the format mapping and the native-format VPP addresses after the format mapping belong to this contiguous VPP address space; only the arrangement changes. When the GPU accesses the real non-native-format physical addresses in the order of the mapped virtual physical addresses, what it accesses is native-format graphics data; that is, the GPU reads or writes image data in the order of the converted native-format graphics data. Therefore, through virtualization, this application makes the GPU sampling process require no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to sample non-native-format graphics data at low cost. Similarly, when the non-native-format physical address of the real rendering data is obtained according to the virtual physical address, the order in which graphics data is rendered according to that non-native-format physical address is, for the GPU, the order of native-format graphics data. Therefore, through virtualization, this application makes the GPU rendering process require no additional format conversion buffer and omits the actual format conversion process, enabling the GPU to render native-format graphics data into non-native-format memory at low cost.
In a possible design, before the hardware virtualization manager maps, within the virtual physical address space, the address of the non-native-format image data to be accessed, the method further includes: obtaining the target virtual address to be accessed by the graphics processing unit (GPU); and the memory management unit (MMU) of the GPU obtains, according to a first mapping relationship, the first virtual physical address in the virtual physical address space corresponding to the target virtual address, the first mapping relationship being the mapping relationship between the virtual address space and the virtual physical address space. The hardware virtualization manager mapping, within the virtual physical address space, the address of the non-native-format image data to be accessed to obtain the target virtual physical address of the native-format image data specifically includes: the hardware virtualization manager maps the first virtual physical address within the virtual physical address space to obtain the target virtual physical address.
In a possible design, obtaining the target physical address corresponding to the target virtual physical address specifically includes: a second MMU in the hardware virtualization manager obtains, according to a second mapping relationship, the target physical address corresponding to the target virtual physical address, the second mapping relationship being the mapping relationship between the virtual physical address space and the physical address space.
In a possible design, before the virtualization software agent constructs the virtual physical address space, the method further includes: obtaining a graphics processing request sent to the GPU, the graphics processing request including the virtual address space and the physical address space of the non-native-format image; and the virtualization software agent constructing the virtual physical address space specifically includes: the virtualization software agent constructs the virtual physical address space according to the virtual address space and the physical address space.
In a possible design, when the virtualization software agent constructs the virtual physical address space according to the virtual address space and the physical address space, the first mapping relationship and the second mapping relationship are also obtained.
In a possible design, the virtualization software agent constructing the virtual physical address space specifically includes:
obtaining the size of the physical memory pages (PP) corresponding to the physical address space and the size of the virtual memory pages (VP); and constructing the virtual physical address space according to the size of the PP and the size of the VP, where the size of the virtual physical memory pages (VPP) corresponding to the virtual physical address space is larger than the size of the PP and the size of the VP.
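The page-size constraint described above (VPP pages larger than both PP and VP pages) can be sketched as follows; the power-of-two sizing policy and the returned structure are assumptions for illustration, not from the application.

```python
def build_vpp_space(pp_page_size, vp_page_size, num_pp_pages, vpp_base):
    """Construct a contiguous VPP space whose page size exceeds both the
    PP and VP page sizes (the sizing policy here is an assumption)."""
    vpp_page_size = 1
    while vpp_page_size <= max(pp_page_size, vp_page_size):
        vpp_page_size *= 2  # smallest power of two strictly larger than both
    total_bytes = num_pp_pages * pp_page_size
    num_vpp_pages = -(-total_bytes // vpp_page_size)  # ceiling division
    return {"base": vpp_base, "page_size": vpp_page_size,
            "num_pages": num_vpp_pages}
```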
In a possible design, the hardware virtualization manager includes a filter and a format conversion processor. Before the hardware virtualization manager maps, within the virtual physical address space, the address of the non-native-format image data to be accessed, the method further includes: obtaining the target virtual address to be accessed by the GPU; the MMU of the GPU maps the target virtual address to an intermediate physical address; the filter determines whether the intermediate physical address belongs to the virtual physical address space; when the intermediate physical address belongs to the virtual physical address space, the filter determines the intermediate physical address to be the first virtual physical address and sends it to the format conversion processor; and the format conversion processor performs pixel-level format mapping on the first virtual physical address within the virtual physical address space to obtain the target virtual physical address.
In a possible design, the hardware virtualization manager mapping the first virtual physical address within the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; and obtaining, according to the pixel coordinates, the target virtual physical address corresponding to the pixel coordinates.
In a possible design, the non-native-format image data to be accessed is compressed data to be read by the GPU, the compressed data including multiple compressed graphics blocks, and the hardware virtualization manager mapping the first virtual physical address within the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the first virtual physical address; and computing the target virtual physical address according to the compression offset information. The method further includes: decompressing the target compressed graphics block that is read.
In a possible design, the non-native-format image data to be accessed is compressed data to be written by the GPU, and the hardware virtualization manager mapping the first virtual physical address within the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining, according to the pixel coordinates, the address of the header data of the compressed data to be written; and obtaining the target virtual physical address according to the address of the header data.
In a possible design, the hardware virtualization manager mapping the first virtual physical address within the virtual physical address space to obtain the target virtual physical address includes: obtaining the pixel coordinates corresponding to the first virtual physical address; obtaining the signature of the pixel corresponding to the pixel coordinates; and obtaining, according to the signature, the target virtual physical address corresponding to the signature. If the non-native-format image data to be accessed is encrypted data to be read by the GPU, the method further includes: decrypting the read image data, and sending the decrypted image data to the GPU.
According to a fourth aspect, a graphics processing apparatus is provided. The apparatus includes a graphics processing unit (GPU), a central processing unit (CPU), and a hardware virtualization manager, with a virtualization software agent running on the CPU, where: the virtualization software agent is configured to construct a virtual physical address space, the virtual physical address space being memory space outside the virtual address space and the physical address space; the hardware virtualization manager is configured to map, within the virtual physical address space, the address of the non-native-format image data to be accessed, obtaining the target virtual physical address of the native-format image data; and the hardware virtualization manager is further configured to obtain the target physical address corresponding to the target virtual physical address and access the image data at the target physical address.
In a possible design, the virtualization software agent is further configured to: obtain the target virtual address to be accessed by the GPU; the GPU further includes a memory management unit (MMU), the MMU being configured to obtain, according to a first mapping relationship, the first virtual physical address in the virtual physical address space corresponding to the target virtual address, the first mapping relationship being the mapping relationship between the virtual address space and the virtual physical address space; and the hardware virtualization manager is configured to map the first virtual physical address within the virtual physical address space to obtain the target virtual physical address.
In a possible design, the hardware virtualization manager includes a second MMU, the second MMU being configured to obtain, according to a second mapping relationship, the target physical address corresponding to the target virtual physical address, the second mapping relationship being the mapping relationship between the virtual physical address space and the physical address space.
In a possible design, the virtualization software agent is further configured to: obtain a graphics processing request sent to the GPU, the graphics processing request including the virtual address space and the physical address space of the non-native-format image; and the virtualization software agent is configured to: construct the virtual physical address space according to the virtual address space and the physical address space.
In a possible design, the virtualization software agent is further configured to, when constructing the virtual physical address space according to the virtual address space and the physical address space, also obtain the first mapping relationship and the second mapping relationship.
In a possible design, the virtualization software agent is configured to obtain the size of the physical memory pages (PP) corresponding to the physical address space and the size of the virtual memory pages (VP); and construct the virtual physical address space according to the size of the PP and the size of the VP, where the size of the virtual physical memory pages (VPP) corresponding to the virtual physical address space is larger than the size of the PP and the size of the VP.
In a possible design, the hardware virtualization manager includes a snoop filter and a format conversion processor; the snoop filter is configured to obtain the target virtual address to be accessed by the GPU; the MMU of the GPU is configured to map the target virtual address to an intermediate physical address; the snoop filter is configured to determine whether the intermediate physical address belongs to the virtual physical address space, and when the intermediate physical address belongs to the virtual physical address space, determine the intermediate physical address to be the first virtual physical address and send it to the format conversion processor; the format conversion processor is configured to perform pixel-level format mapping on the first virtual physical address within the virtual physical address space to obtain the target virtual physical address.
In a possible design, the hardware virtualization manager is configured to obtain the pixel coordinates corresponding to the first virtual physical address, and obtain, according to the pixel coordinates, the target virtual physical address corresponding to the pixel coordinates.
In a possible design, the non-native-format image data to be accessed is compressed data to be read by the GPU, the compressed data including multiple compressed graphics blocks; the hardware virtualization manager is configured to obtain the pixel coordinates corresponding to the first virtual physical address, obtain, according to the pixel coordinates, the compression offset information of the target compressed graphics block corresponding to the first virtual physical address, and compute the target virtual physical address according to the compression offset information; the hardware virtualization manager is further configured to decompress the target compressed graphics block that is read.
In a possible design, the non-native-format image data to be accessed is compressed data to be written by the GPU; the hardware virtualization manager is configured to obtain the pixel coordinates corresponding to the first virtual physical address, map the first virtual physical address within the virtual physical address space to obtain the target virtual physical address, obtain, according to the pixel coordinates, the address of the header data of the compressed data to be written, and obtain the target virtual physical address according to the address of the header data.
In a possible design, the hardware virtualization manager is configured to obtain the pixel coordinates corresponding to the first virtual physical address, obtain the signature of the pixel corresponding to the pixel coordinates, and obtain, according to the signature, the target virtual physical address corresponding to the signature; the hardware virtualization manager is further configured to decrypt the read image data and send the decrypted image data to the GPU.
According to a fifth aspect, a graphics processing method is provided, including: obtaining a first virtual address to be accessed, the first virtual address belonging to a first virtual address space; and translating the first virtual address into an intermediate virtual address, the intermediate virtual address belonging to a second virtual address space, the intermediate virtual address being mappable to a second virtual address within the second virtual address space; where the second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same first physical address space, the physical address to which the first virtual address is mapped corresponds to image data in a first format, and the physical address to which the second virtual address is mapped corresponds to image data in a second format.
Therefore, the embodiments of this application can complete pixel-level address mapping within this newly added second virtual address space, mapping the addresses of images in the first, non-native, format to the addresses of images in the second, native, format that the GPU can access. The GPU can thus read or render non-native-format images without explicit format conversion, avoiding applying for an additional local GPU buffer, avoiding multiple migrations between a format conversion processor and buffers, reducing memory consumption, avoiding latency, and saving bandwidth and power.
In a possible design, translating the first virtual address into the intermediate virtual address specifically includes: obtaining, according to a first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space.
In a possible design, before generating the first virtual address to be accessed, the method further includes: receiving the first virtual address space and the second virtual address space; and establishing the first mapping relationship.
According to a sixth aspect, a GPU is provided. The GPU includes a transmission interface and a memory management unit (MMU), where: the transmission interface is configured to obtain a first virtual address to be accessed, the first virtual address belonging to a first virtual address space; the MMU is configured to translate the first virtual address into an intermediate virtual address, the intermediate virtual address belonging to a second virtual address space, the intermediate virtual address being mappable to a second virtual address within the second virtual address space; where the second virtual address space is different from the first virtual address space, the second virtual address space and the first virtual address space are mapped to the same first physical address space, the physical address to which the first virtual address is mapped corresponds to image data in a first format, and the physical address to which the second virtual address is mapped corresponds to image data in a second format.
In a possible design, the MMU is configured to: obtain, according to a first mapping relationship, the intermediate virtual address corresponding to the first virtual address in the second virtual address space, the first mapping relationship being the mapping relationship between the first virtual address space and the second virtual address space.
In a possible design, the MMU is configured to: receive the first virtual address space and the second virtual address space; and establish the first mapping relationship.
Thus, for the GPU, the order of sampling graphics data according to the non-native-format PP addresses is actually the native-format address order. For the sampling process, this makes the order in which the GPU ultimately samples graphics data different from the order in which the GPU would sample graphics data according to the real physical addresses; with the changed sampling order, the sampled graphics data is, for the GPU, native-format graphics data that the GPU can recognize and process, so the graphics data finally read from the memory according to the non-native-format physical addresses is in a graphics format that the GPU can recognize. Similarly, for the rendering process, when the GPU renders native-format graphics data, it can first associate the native-format graphics data with the virtual native-format physical addresses (VPP) of this application, and then map the VPP to the non-native-format physical addresses. In this way, when the GPU writes native-format graphics data into non-native-format memory, the graphics data finally written into the memory is still written according to the non-native-format physical addresses, and the graphics data written into the memory is, for the GPU, non-native-format graphics data that the GPU cannot recognize.
附图说明
图1为一种现有技术采样和渲染到非本地格式的场景示意图;
图2为一种GPU内部的功能模块示意图;
图2A为一种现有技术采样或渲染数据时的地址映射关系的示意图;
图2B为本申请实施例提供的一种地址空间结构与现有技术的地址空间结构的对比示意图;
图2C为本申请实施例提供的一种地址映射关系示意图;
图2D为本申请实施例提供的一种地址映射关系示意图;
图2E为本申请实施例提供的一种地址映射关系示意图;
图2F为本申请实施例提供的一种采样或渲染非本地格式数据时的地址映射关系示意图;
图2G为本申请实施例提供的一种采样或渲染非本地格式数据时的过程示意图;
图3为本申请实施例提供的一种终端设备的结构示意图;
图4为本申请实施例提供的一种SoC的结构示意图;
图5为本申请实施例提供的一种图形处理方法的流程示意图;
图6为本申请实施例提供的一种采样非本地格式的图形数据的软硬件架构图;
图7为本申请实施例提供的一种图形处理方法的流程示意图;
图8为本申请实施例提供的一种终端设备的结构示意图;
图9为本申请实施例提供的一种终端设备的结构示意图。
具体实施方式
为了便于理解,示例地给出了部分与本申请相关概念的说明以供参考。如下所示:
GPU:又称显示核心、视觉处理器或显示芯片,是一种专门在个人电脑、工作站、游戏机和一些移动设备(如平板电脑、智能手机等)上进行图像运算工作的微处理器,示例性的,GPU的用途包括:将计算机系统所需要的显示信息进行转换驱动,并向显示器提供行扫描信号,控制显示器的正确显示等,GPU是连接显示器和个人电脑主板的重要元件,也是"人机对话"的重要设备之一。
虚拟地址(virtual address):程序访问存储器所使用的逻辑地址。
物理地址(physical address):放在寻址总线上的地址。如果中央处理单元(Central Processing Unit,CPU)进行读操作,电路可根据物理地址每位的值将相应地址的物理内存中的数据读取到数据总线中传输。如果CPU进行写操作,电路可根据物理地址每位的值在相应地址的物理内存中写入数据总线上的内容。
存储器映射管理单元(Memory Management Unit,MMU):是CPU中用来管理虚拟存储器和物理存储器的控制线路,同时也负责将虚拟地址映射为物理地址,以及提供硬件机制的内存访问授权,适用于多用户多进程操作系统。
一种典型的场景中,如图1所示,该场景用于后处理并渲染监控摄像头100捕获的视频流,渲染的结果由编码处理器105二次编码后输出。具体来讲,监控摄像头100的原始视频流在编码器101中完成编码,编码结果以非GPU本地格式写入非本地格式缓存106,其中,集成在系统级芯片(System on Chip,SoC)上的编码器101输出的通常是与供应商绑定的特定私有格式,包含有变换、压缩以及知识产权保护的各种信息,GPU103无法直接采样这种私有格式,可通过格式转换处理器102将私有格式转换成GPU本地格式,将转换后的GPU本地格式的图形数据存储于本地GPU缓存107中。这样,GPU103可以从本地GPU缓存107中采样图形数据进行GPU渲染,渲染的结果以GPU本地格式写入本地GPU缓存108中。但是,编码处理器105无法直接接受本地GPU缓存108中GPU本地格式的图形数据,还需要另一个格式转换处理器104通过总线读取本地GPU缓存108中的图形数据,再将该图形数据的格式转换成编码器105可接受的格式,将该可接受的格式的图形数据通过总线写入非本地格式缓存109。这样看来,该场景中数据的每一次迁移都包含该数据在缓存和处理器之间的迁移,需要申请额外的缓存空间,这样既消耗内存,也会产生延迟并浪费带宽和功耗,成本较高。
针对上述GPU采样或渲染非本地格式的图形数据的场景,本申请可以用于GPU对图形进行采样或渲染的过程中,能够以低成本开销采样或渲染对于GPU来说为非本地格式的图形。
这里首先对本申请中的GPU内部的功能单元进行介绍。如图2所示为GPU内部的功能单元结构示意图。
目前,GPU架构已经从固定功能流水线逐步演进到可编程的着色器(shader)处理器架构。参考图2,shader处理器可以至少分为三类:顶点着色器(vertex shader)、像素着色器(fragment shader)和几何着色器(geometry shader)。其中,顶点处理器的程序包含对一个顶点实例进行操作的指令。像素处理器程序包含对一个像素进行处理的指令,通常包括从材质采样缓冲区(texture sample buffer)采样该像素的材质,计算光源对该像素的反射,以得到最终的着色结果。几何处理器的程序用于指示GPU内部分工做几何处理。虽然不同类型的shader运行不同类型的程序,但在硬件结构上,通常是一个归一化的架构运行一个归一化的指令集。指令集包括和通用标量处理器类似的算术,存储器load/store和移位等。这些shader处理器前端每一条指令都运行在多个数据实例上,也就是通常的单一指令多数据结构。这些shader处理器还需要和固定功能流水线通信完成图形功能。该图形功能包括光栅化处理器(rasterizer)和材质映射器(texture mapper)。rasterizer用于计算生成每个着色片段对应的像素。texture mapper用于计算经过透视变换后最终要取的材质点(texel)的地址。shader处理器和固定流水线都会被映射到虚拟地址上。在内存管理时,页是地址空间的最小单位。一个应用程序所能使用的所有的虚拟地址称为虚拟地址空间。虚拟地址空间通常被划分为更小的粒度,虚拟内存页(Virtual page,VP)。一个虚拟地址空间由一系列虚拟地址组成。虚拟地址空间会被映射到真实的双倍数据速率(Double Data Rate,DDR)空间中,即物理地址空间。物理地址空间也会被划分成一系列物理内存页(Physical Page,PP)。VP和PP的大小通常可以是一样的,例如可以为4KB。对于进程来说,使用的都是虚拟地址。每个进程维护一个单独的页表。页表是一种数组结构,存放着各VP的状态,包括是否映射,是否缓存。进程执行时,当需要访问虚拟地址中存放的值时:CPU会先找到虚拟地址所在的VP,再根据页表,找出页表中VP的页号对应的值,再根据该值对应的物理页号,获取虚拟地址对应的PP中的物理地址,这一过程可以称为虚拟地址到物理地址的地址翻译。简单来说,地址翻译是指在缓存命中时,由虚拟地址找到物理地址的过程。
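上文描述的由VP经页表查到PP页号、再加页内偏移得到物理地址的过程,可以用如下Python片段示意(教科书式的简化模型,页表内容与页大小均为假设值,并非本申请的实际实现):

```python
# 示意:基于页表的虚拟地址到物理地址的地址翻译(页大小假设为4KB)
PAGE_SIZE = 4096
page_table = {0: 7, 1: 3}  # 假设的页表:VP页号 -> PP页号

def va_to_pa(va):
    vpn, offset = va // PAGE_SIZE, va % PAGE_SIZE  # 拆出虚拟页号与页内偏移
    return page_table[vpn] * PAGE_SIZE + offset    # 物理页基址 + 页内偏移

print(hex(va_to_pa(0x1010)))  # 0x3010:VP1内偏移0x10,落在PP3中
```

该片段仅用于说明"虚拟地址到物理地址的地址翻译"这一步,后文在此基础上引入第二级翻译。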
通常,图形的存储器管理单元(Memory Management Unit,MMU)用于管理虚拟地址空间和物理地址空间之间的映射关系。将虚拟地址翻译为物理地址是由MMU完成的,或者说基于存储在MMU中的虚拟地址空间和物理地址空间的映射关系得到虚拟地址对应的物理地址。例如,一个材质缓冲区(存储图形数据的区域)在texture mapper中映射到的是一片连续的虚拟地址空间,在DDR空间中映射的是一堆分散的物理页面。
以渲染过程举例来说,一个像素处理器在渲染像素时,发送材质采样指令给texture mapper,texture mapper将计算得到的texel的虚拟地址发送到总线接口单元(Bus Interface Unit,BIU),通过BIU上连接的MMU查找到与虚拟地址对应的物理地址。对于当前系统使用的tile based架构,渲染是以片状材料(tile)为粒度进行的,可根据物理地址将渲染的中间结果存入渲染目标缓冲区(render target buffer)。对于采样过程,在一些实例中,系统会存在L2缓存(level 2 cache),如果当前采样的texel在L2缓存中未查找到,该texel读取操作会通过总线操作读取材质缓冲区(texture buffer)中的内容。
上述过程中提到,MMU中管理有应用程序申请到的虚拟地址空间和物理地址空间之间的映射关系,也就是说,MMU中存储有将虚拟地址翻译为物理地址的映射关系表。该表的内容由操作系统进行管理。如图2A所示为现有技术利用虚拟地址在MMU中进行地址翻译得到物理地址后,利用物理地址去访问主存中的数据的示意图。而在本申请实施例中,参见图2B,与图2B中的(1)所示的现有技术不同的是,本申请在第一虚拟地址空间(相当于现有技术中的虚拟地址空间)和第一物理地址空间(相当于现有技术中的物理地址空间)之间添加一个第二虚拟地址空间,参见图2B中的(2),该第二虚拟地址空间可以被划分为一系列的虚拟物理内存页(Virtual Physical Page,VPP),该第二虚拟地址空间是区别于第一虚拟地址空间和第一物理地址空间的一个空间,第二虚拟地址空间和第一虚拟地址空间映射到相同的物理地址空间,第一虚拟地址所映射的物理地址对应第一格式的图像数据,第二虚拟地址所映射的物理地址对应第二格式的图像数据。本申请实施例的方法涉及一个虚拟化软件代理(virtualization software agent)和一个硬件虚拟化管理器(hardware virtualization hypervisor),该第二虚拟地址空间是虚拟化软件代理构建出来的,如果第一格式的图像数据为GPU不能直接访问的非本地格式的图像数据,第二格式的图像数据为GPU可以访问的本地格式的图像数据,后续硬件虚拟化管理器可以在这个构建出的第二虚拟地址空间中对待访问的数据完成本地格式和非本地格式之间像素级别的地址映射。对应的,本申请实施例将第一虚拟地址空间中的第一虚拟地址和第一物理地址空间中的第一物理地址的映射关系拆分成第一映射关系和第二映射关系,其中第一映射关系为第一虚拟地址和第二虚拟地址空间中的第二虚拟地址的映射关系,第二映射关系为第二虚拟地址和第一物理地址的映射关系,第一映射关系存储在GPU的第一MMU中,第二映射关系存储在硬件虚拟化管理器的第二MMU中。在执行进程时,当需要访问第一虚拟地址中存放的值时,先根据MMU中的第一映射关系将第一虚拟地址翻译成第二虚拟地址空间中的第二虚拟地址,再根据第二MMU中的第二映射关系将第二虚拟地址翻译成第一物理地址,也即,本申请实施例通过两次地址翻译实现对实际的物理地址的访问。由于本申请实施例中,新增了一个第二虚拟地址空间,并在这个第二虚拟地址空间中完成图像格式的像素级别的地址映射,GPU不需要进行显性的格式转换,就可以访问非 本地格式的图像,也即不需要格式转换处理器102将私有格式转换成本地格式,以及格式处理器104将本地格式转换成编码器可以接受的格式,也不需要申请额外的本地GPU缓存107和本地GPU缓存108,避免了数据在处理器和缓存之间的多次迁移,减少了内存消耗,避免延迟,并节省了带宽和功耗。
示例性的,参考图2C,以采样过程为例,若GPU依次发送多个访问请求,每个访问请求访问一个第一虚拟地址,在GPU发送访问请求之前,需要先根据第一映射关系进行地址映射,这时,上述根据GPU的MMU中的第一映射关系将第一虚拟地址翻译成第二虚拟地址空间中的第二虚拟地址可以按照如下举例理解:如果GPU访问的第一虚拟地址的顺序为VP1-VP2-VP3-…,按照第一虚拟地址的顺序为VP1-VP2-VP3-…所映射的物理地址PP1-PP2-PP3-…对应第一格式的图像数据。根据第一映射关系,可以得到第一虚拟地址对应的中间虚拟地址VPP1-VPP2-VPP3-…,那么GPU发送的多个访问请求中实际上分别携带一个中间虚拟地址,中间虚拟地址被发送的顺序就为VPP1-VPP2-VPP3-…。而后,要在这个虚拟的第二虚拟地址空间中完成图像格式的像素级别的地址映射,将非本地格式的中间虚拟地址映射为本地格式的第二虚拟地址,示例性的,VPP1被映射为VPP4,VPP2被映射为VPP2,VPP3被映射为VPP1,以便GPU按照本地格式的第二虚拟地址和第二映射关系获取到第二虚拟地址对应的实际访问非本地格式数据时用到的第一物理地址,这样,就实现了GPU按照本地格式的第二虚拟地址访问非本地格式的图形数据。按照图2C,进行像素级别的地址映射后的第二虚拟地址的顺序为VPP4-VPP2-VPP1-…,根据第二虚拟地址的顺序为VPP4-VPP2-VPP1-…所映射的物理地址对应第二格式的图像数据,上述根据第二MMU中的第二映射关系将第二虚拟地址翻译成第一物理地址,就可以理解为,根据第二虚拟地址的顺序VPP4-VPP2-VPP1-…以及第二映射关系得到第二次地址翻译后的第一物理地址,该第一物理地址的顺序为PP4-PP2-PP1-…,进而按照第一物理地址的顺序为PP4-PP2-PP1-…去访问内存中的图形数据,第一物理地址的顺序PP4-PP2-PP1-…对应第二格式的图像数据,如果第一格式的图像数据为GPU不能直接访问的非本地格式的图像数据,第二格式的图像数据为GPU可以访问的本地格式的图像数据,从而可使得GPU从非本地格式的图像数据中采样到本地格式的图形数据。
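上述"两次地址翻译加一次像素级映射"的流程,可以用如下Python片段示意(映射表按上文VP1-VP2-VP3的示例构造,均为示意性数据,并非本申请的实际实现):

```python
# 示意:第一虚拟地址 -> 中间虚拟地址 -> 第二虚拟地址 -> 第一物理地址
first_mapping = {"VP1": "VPP1", "VP2": "VPP2", "VP3": "VPP3"}   # 第一映射关系(GPU的第一MMU)
pixel_remap = {"VPP1": "VPP4", "VPP2": "VPP2", "VPP3": "VPP1"}  # 第二虚拟地址空间内的像素级格式映射
second_mapping = {"VPP4": "PP4", "VPP2": "PP2", "VPP1": "PP1"}  # 第二映射关系(第二MMU)

def translate(vp):
    intermediate = first_mapping[vp]        # 第一次地址翻译
    target_vpp = pixel_remap[intermediate]  # 在第二虚拟地址空间中完成格式映射
    return second_mapping[target_vpp]       # 第二次地址翻译

print([translate(v) for v in ("VP1", "VP2", "VP3")])  # ['PP4', 'PP2', 'PP1']
```

可以看到,按VP1-VP2-VP3的顺序访问,最终落到物理页的顺序变为PP4-PP2-PP1,与上文示例一致。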
如图2D所示,为本申请实施例提供的示例性的地址映射前读取到的图像格式的示意图,如图2E所示,为示例性的地址映射后读取到的图像格式的示意图。
根据第一物理地址空间和第一虚拟地址空间构建出第二虚拟地址空间,也即根据第一物理地址空间和第一虚拟地址空间的大小开辟一片连续的虚拟内存空间,并将该连续的虚拟内存空间作为第二虚拟地址空间,该空间的内存页为虚拟物理内存页VPP,第一虚拟地址空间(图2D中示意的虚拟地址空间)的内存页为虚拟内存页VP,第一物理地址空间(图2D中示意的物理地址空间)的内存页为物理内存页PP,应当理解,构建出的虚拟物理地址的VPP的大小要大于VP的大小和PP的大小。在进行地址映射之前,访问第一物理地址空间中存储的像素的顺序为PP1中的X1Y1,PP2中的X2Y2,PP3中的X3Y3,PP4中的X4Y4,PP5中的X5Y5,此时,读出的图像数据为第一格式的图像数据(格式1)。本申请实施例在新构建的第二虚拟地址空间(图2D中示意的虚拟物理地址空间)中对地址进行映射,改变了访问的地址顺序,相当于改变了图像像素的排布方式,如图2E所示,在第二虚拟地址空间(图2E中示意的虚拟物理地址空间)中进行地址映射之后,VPP1映射到VPP2,VPP2映射到VPP4,VPP3映射到VPP1,VPP4映射到VPP5,VPP5映射到VPP3,访问第一物理地址空间(图2E中示意的物理地址空间)中存储的像素的顺序变为PP2中的X2Y2,PP4中的X4Y4,PP1中的X1Y1,PP5中的X5Y5,PP3中的X3Y3,读出的图像数据的像素排布顺序发生了变化,此时读出的图像数据为第二格式的图像数据(格式2)。
本申请实施例在新构建的第二虚拟地址空间中对地址进行映射,改变了读取像素的地址的顺序,相当于改变了读出的图像数据的像素的排布方式,可选的,第一格式的图像为GPU不能访问的非本地格式的图像,第二格式的图像为GPU可以直接访问的本地格式的图像,因此可以不需要显性的格式转换,GPU就可以访问非本地格式的图像数据得到本地格式的图像数据。
示例性的,本申请实施例提出的第二虚拟地址空间可以是根据进程对应的第一物理地址空间的大小和第一虚拟地址空间的大小确定的一段连续的地址。举例来说,第一虚拟地址空间的大小为396KB,第一物理地址空间被划分为100个离散的PP,每个PP的大小是4KB,那么第二虚拟地址空间的大小需要大于400KB,这样第二虚拟地址空间才可以替换第一物理地址空间,建立第一虚拟地址空间和第二虚拟地址空间之间的第一映射关系,以及第二虚拟地址空间和第一物理地址空间之间的第二映射关系。
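按上例中的数值,第二虚拟地址空间所需的最小尺寸可做如下估算(仅为对上文示例数值的验证性示意):

```python
# 示意:第二虚拟地址空间需同时覆盖第一虚拟地址空间和第一物理地址空间
vp_space_size = 396 * 1024         # 第一虚拟地址空间:396KB
pp_count, pp_size = 100, 4 * 1024  # 100个离散物理页,每页4KB
pp_space_size = pp_count * pp_size

required = max(vp_space_size, pp_space_size)  # 下限,实际构建时应大于该值
print(required // 1024)  # 400 (KB)
```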
这样,在本申请提出的第二虚拟地址空间的基础上,如图2F所示,如果GPU采样或者渲染非本地格式的图形数据,GPU的MMU中存储的是应用程序申请到的非本地格式的VP地址范围(第一虚拟地址空间)与本申请提出的非本地格式的VPP地址范围(第二虚拟地址空间)之间的映射关系,也就是说,GPU的MMU中存储有将第一虚拟地址翻译为第二虚拟地址的第一映射关系查询表。例如现有的GPU的MMU中的查询表包括地址为第一虚拟地址空间0x8000与第一物理地址空间0x6453之间的映射,本申请中,GPU访问非本地格式的内存时,在获取到第一虚拟地址空间0x8000与第一物理地址空间0x6453的映射关系时,可以将该映射关系拆分成第一虚拟地址空间0x8000与第二虚拟地址空间0x0之间的第一映射关系,以及第二虚拟地址空间0x0与第一物理地址空间0x6453之间的第二映射关系,再重载GPU的MMU中的查询表,使得GPU的MMU中的查询表包括第一虚拟地址空间0x8000与第二虚拟地址空间0x0之间的第一映射关系,对应的,将第二虚拟地址空间0x0与第一物理地址空间0x6453之间的第二映射关系存储在虚拟化硬件处理器的第二MMU中。通过虚拟地址获取物理地址时,可以先对第一虚拟地址0x8000进行地址翻译得到第二虚拟地址0x0,再对第二虚拟地址0x0进行地址翻译得到真实的访问数据的第一物理地址0x6453。基于此,本申请采用一种虚拟化的方法可使得GPU采样或渲染非本地格式图形数据。这种虚拟化方法不需要离线显式的转换阶段,可以在线完成采样和渲染非本地格式的图形。虚拟化软件代理可以拦截到应用程序对GPU的图形应用程序接口(Application Programming Interface,API)调用,基于此调用,虚拟化软件代理可以虚拟出GPU可以直接访问的采样缓冲区或者渲染的目标缓冲区,这些虚拟出的缓冲区可以称为本地格式的VPP对应的缓冲区。在这个虚拟出来的VPP对应的缓冲区中将非本地格式的图形数据转换为本地格式的图形数据,本申请实施例的格式转换是通过在虚拟的空间中改变地址的排布方式实现的。一种示例中,虚拟化软件代理可根据应用程序对GPU进行图形API调用时申请到的VP地址空间(第一虚拟地址空间)和PP地址空间(第一物理地址空间)构建出VPP地址空间(第二虚拟地址空间),VP地址空间所映射的PP地址对应第一格式的图像数据,即非本地格式的图像数据。示例性的,根据VP地址空间和PP地址空间得到VP地址空间与VPP地址空间的第一映射关系,以及VPP地址空间与PP地址空间的第二映射关系,其中第一映射关系存储于GPU的MMU中,第二映射关系存储于硬件虚拟化管理器的第二MMU中,VPP地址空间为一片连续的虚拟地址空间,而后,硬件虚拟化管理器获取GPU要访问的目标VPP地址,并在VPP地址空间中完成图像数据格式的像素级别的地址映射,按照本地格式的目标VPP地址所映射的PP地址访问的图形数据为第二格式的图像数据,即为GPU能够访问的本地格式的图形数据。而后,硬件虚拟化管理器根据格式映射后的本地格式的目标VPP地址和存储在第二MMU中的第二映射关系得到目标PP地址,使得GPU从目标PP地址中读取图形数据或者向目标PP地址中写入图形数据。由于采样过程中得到的非本地格式的PP地址的排布方式是按照本地格式的目标VPP地址计算得到的,按照该非本地格式的PP地址采样图形数据的顺序实际是按照本地格式的目标VPP地址顺序采样的,那么对于采样过程来说,可使得GPU最终采样图形数据时的采样顺序与GPU实际按照真实的物理地址采样图形数据时的采样顺序不同,采样顺序改变时采样得到的图形数据对于GPU来说为GPU能够识别并处理的本地格式的图形数据,因此,最终按照非本地格式的PP地址从内存中读取的图形数据的顺序为GPU能够识别的图形格式。类似的,对于渲染过程来说,GPU渲染本地格式的图形数据时,可先按照本申请虚拟的物理地址VPP地址获取本地格式的图形数据,再根据VPP地址得到非本地格式的PP地址,这样,GPU要将本地格式的图形数据写入非本地格式的内存中时,最终写到内存中的图形数据还是按照非本地格式的PP地址写入的,写入到内存中的图形数据对于GPU来说为GPU不能识别的非本地格式的图形数据。这样一来,本申请在采样或渲染非本地格式图形时,不需要显式的格式转换阶段,例如不需要图1中的格式转换处理器102将私有格式转换成GPU本地格式的过程,以及不需要图1中的格式转换处理器104通过总线读取本地GPU缓存108中的图形数据,再将该图形数据的格式转换成编码器105可接受的格式的过程。另外,不需要格式转换的缓冲区,例如不需要图1中的本地GPU缓存107以及本地GPU缓存108。相比而言,本申请的采样和渲染过程可以如图2G
所示,通过虚拟化的方式,可以使得GPU103通过第二虚拟地址空间这一构建出的中间层去采样非本地格式缓存106,以及使得GPU103通过第二虚拟地址空间这一中间层渲染数据到非本地格式缓存109,极大地降低了系统的处理时延和带宽占用,降低了系统内存的使用量,并降低了格式转换处理器的成本。
上述过程中提到的虚拟化软件代理可以是以软件的形式实现,其对应的软件的程序代码可存储于终端设备的内存中,由CPU执行;硬件虚拟化管理器可以是以硬件和软件结合的方式实现,其硬件结构可以与GPU均设置在设备内的总线上,其对应的软件的程序代码可存储于终端设备的内存中。在一种可选的情况中,虚拟化软件代理、硬件虚拟化管理器和GPU集成在同一个SOC上。
本申请实施例可以用于可显示图形的终端设备处理图形的过程中,该终端设备可以为移动终端或不可移动的终端,例如移动终端可以为手机、平板电脑以及具有显示功能的其他移动设备等,不可移动的终端例如可以为个人电脑以及具有显示功能的其他备等。参考图3示出的终端设备的结构,该终端设备包括显示器、处理器、存储器、收发器以及总线,存储器包括上述内存。该处理器可包括SoC,在该SoC中,参考图4,可布局有GPU、硬件虚拟化管理器、向量排列单元(Vector Permutate Unit,VPU)、CPU、图像信号处理(Image Signal Processing,ISP)、缓存、动态随机存取存储器(Dynamic Random Access Memory,DRAM)控制器以及总线等,GPU、VPU、CPU、ISP、缓存以及DRAM控制器可通过连接器相耦合,应当理解,本申请的各个实施例中,耦合是指通过特定方式的相互联系,包括直接相连或者通过其他设备间接相连,例如可以通过各类接口、传输线或总线等相连,这些接口通常是电性通信接口,但是也不排除可能是机械接口或其它形式的接口,本实施例对此不做限定。本申请实施例具体可以应用于SoC对于图形采样和渲染的过程。
根据以上阐述,本申请实施例提供一种图形处理方法,该方法包括以下步骤:
1)终端设备获取图形处理器GPU待访问的第一虚拟地址,第一虚拟地址属于第一虚拟地址空间。
第一虚拟地址所映射的物理地址对应第一格式的图像数据。
2)终端设备根据第一虚拟地址得到第二虚拟地址,第二虚拟地址属于第二虚拟地址空间。
其中,第二虚拟地址空间不同于第一虚拟地址空间,第二虚拟地址空间和第一虚拟地址空间映射到相同的第一物理地址空间,第二虚拟地址所映射的物理地址对应第二格式的图像数据。
可以知道,本申请实施例重新构建了一个第二虚拟地址空间,该第二虚拟地址空间区别于第一虚拟地址空间。本申请可以将第一虚拟地址空间中的地址映射为该新增的第二虚拟地址空间中的一个地址,新增的第二虚拟地址空间中的第二虚拟地址所映射的物理地址对应第二格式的图形数据,该第二格式的图像数据区别于第一格式的图像数据,示例性的,如果第一格式的图像数据为GPU不能直接访问的图像数据,第二格式的图像数据为GPU可以访问的图像数据,本申请实施例通过将地址映射到一个新增的虚拟地址空间中实现了图像格式的转换,而无需图像格式处理器进行格式转换,GPU就可以访问非本地格式的图像,避免申请额外的本地GPU缓存,避免格式转换处理器和缓存之间的多次迁移,减少了内存消耗,避免延迟,并节省了带宽和功耗。
上述根据第一虚拟地址得到第二虚拟地址,可以包括:通过GPU中的第一MMU将第一虚拟地址翻译成中间虚拟地址,中间虚拟地址为第二虚拟地址空间中的一个虚拟地址;通过硬件虚拟化管理器在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址。在第一虚拟地址为非本地格式的第一虚拟地址时,中间虚拟地址可以理解为非本地格式的中间虚拟地址,第二虚拟地址可以理解为本地格式的第二虚拟地址。也就是说,在得到第二虚拟地址时,需要进行一次第一虚拟地址到中间虚拟地址的地址翻译,和一次中间虚拟地址到第二虚拟地址的地址映射,一次中间虚拟地址到第二虚拟地址的地址映射可以理解为在新增的第二虚拟地址空间中完成像素级别的地址映射,从而可以使得GPU根据第二虚拟地址访问物理地址对应的图像数据,得到本地格式这种第二格式的图像数据。
上述将第一虚拟地址翻译成中间虚拟地址,可以包括:GPU中的第一MMU可以根据第一映射关系得到第一虚拟地址在第二虚拟地址空间中对应的中间虚拟地址,第一映射关系为第一虚拟地址空间和第二虚拟地址空间的映射关系。第一映射关系可以存储在GPU的存储器管理单元MMU中。在本申请实施例中,第一MMU可以集成在GPU内部,也可以位于GPU外部,本申请不做限定。
在根据第一虚拟地址得到第二虚拟地址之后,该方法还可以包括:通过硬件虚拟化管理器根据第二映射关系得到第二虚拟地址在第一物理地址空间中对应的第一物理地址,第二映射关系为第二虚拟地址空间和第一物理地址空间的映射关系。可以理解,本申请先经过第一次地址翻译,将第一虚拟地址翻译为中间虚拟地址,再经过一次地址映射,即将中间虚拟地址映射为第二虚拟地址,再经过一次地址翻译,将第二虚拟地址翻译为第一物理地址。格式映射前的非本地格式的中间虚拟地址和格式映射后的本地格式的第二虚拟地址都属于这片连续的第二虚拟地址空间,只是排布方式发生了变化。在GPU根据该本地格式的第二虚拟地址的顺序访问非本地格式的真实的物理地址时,访问的是本地格式的图形数据,也即GPU是按照转换后的本地格式的图形数据的顺序去读取或写入图像数据的,因此,本申请这种通过虚拟化的方式可以使得GPU采样过程不需要额外的格式转换缓冲区,也省略了实际的进行格式转换的过程,实现了GPU以低成本方式采样非本地格式的图形数据。类似的,在根据该第二虚拟地址空间得到真实渲染数据时的非本地格式的物理地址时,根据该非本地格式的物理地址渲染图形数据的顺序对于GPU来说是按照本地格式的图形数据的顺序进行渲染的,因此,本申请这种通过虚拟化的方式可以使得GPU渲染过程不需要额外的格式转换缓冲区,也省略了实际的进行格式转换的过程,实现了GPU以低成本方式渲染本地格式的图形数据到非本地格式的内存。
由于本申请重新需要构建了一个第二虚拟地址空间,那么在根据第一虚拟地址得到第二虚拟地址之前,该方法还可以包括:通过虚拟化软件代理获取发送给GPU的图形处理请求,图形处理请求包括第一虚拟地址空间和第一物理地址空间,从而可以根据第一虚拟地址空间和第一物理地址空间构建第二虚拟地址空间。也就是说,应用程序在向GPU发送图形处理请求时,本申请实施例可以通过虚拟化软件代理拦截到该图形处理请求,从而根据该请求构建出第二虚拟地址空间。应当理解,本申请实施例将离散的第一物理地址空间和第一虚拟地址空间先映射到一片连续的第二虚拟地址空间中,然后在这个虚拟出来的连续空间中将非本地格式的图形数据转换为本地格式的图形数据,本申请实施例的格式转换是通过在虚拟的空间中改变地址的排布方式实现的。
这样在构建出第二虚拟地址空间后,就可以根据第一虚拟地址空间和第一物理地址空间得到第一映射关系和第二映射关系。
对于第二虚拟地址空间是如何得到的,本申请提供一种可能的设计可以为:虚拟化软件代理获取第一物理地址空间对应的物理内存页PP的大小以及第一虚拟地址空间对应的虚拟内存页VP的大小,将第一物理地址空间映射到连续的虚拟内存空间中,得到第二虚拟地址空间,第二虚拟地址空间对应的虚拟物理内存页VPP的大小大于PP的大小以及VP的大小。这样做的目的是,要将本申请构建的第二虚拟地址空间覆盖第一虚拟地址空间和真实物理地址的第一物理地址空间,才能建立第一虚拟地址空间和第二虚拟地址空间之间的第一映射关系,和第二虚拟地址空间与第一物理地址空间之间的第二映射关系。
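上述"将离散的PP映射到一片连续的VPP空间,并拆分出第一、第二映射关系"的构建过程,可用如下Python片段示意(页号与函数名均为假设的示意,并非本申请的实际实现):

```python
# 示意:由应用程序申请到的VP->PP映射,构建连续的VPP空间并拆分两级映射
def build_vpp_space(vp_to_pp):
    first_mapping, second_mapping = {}, {}
    for i, (vp, pp) in enumerate(sorted(vp_to_pp.items())):
        vpp = f"VPP{i}"           # VPP连续编号,构成一片连续的虚拟内存空间
        first_mapping[vp] = vpp   # 配置到GPU的第一MMU
        second_mapping[vpp] = pp  # 配置到硬件虚拟化管理器的第二MMU
    return first_mapping, second_mapping

first, second = build_vpp_space({"VP0": "PP7", "VP1": "PP3", "VP2": "PP9"})
print(second[first["VP1"]])  # PP3:两级映射复合后仍等价于原始的VP->PP映射
```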
上述根据第一虚拟地址得到第二虚拟地址,可以包括:将第一虚拟地址翻译成中间虚拟地址;判断中间虚拟地址是否属于第二虚拟地址空间;当中间虚拟地址属于第二虚拟地址空间时,在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址。因GPU的MMU中维护了多个映射关系,所以MMU中翻译出的虚拟地址有可能是其他缓存区的实际的物理地址,而不是第二虚拟地址空间中的虚拟地址,所以这里要进行判断过滤。也就是说,MMU获取的中间虚拟地址并不一定是第一映射关系中的虚拟地址。
上述在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址,可以包括:获取中间虚拟地址对应的像素坐标;根据像素坐标获取第二虚拟地址。这里即为上述进行像素级别的地址映射的一种实现过程,通过像素坐标可以得到本地格式的第二虚拟地址,从而通过本地格式的第二虚拟地址访问真实的物理地址的图像数据,得到本地格式的图像数据。
如果第一格式的图像数据为GPU需读取的压缩数据,压缩数据包括多个压缩图形块,在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址,可以包括:获取中间虚拟地址对应的像素坐标;根据像素坐标获取中间虚拟地址对应的目标压缩图形块的压缩偏移信息;根据压缩偏移信息计算得到第二虚拟地址;方法还包括:对读取的目标压缩图形块进行解压缩。这里是考虑到在进行像素级别的地址转换时,内存中存储的非本地格式的图像数据可能是压缩格式的图像数据。相应的,在采样得到图像数据时,还需要进行图像数据的解压缩。
如果第一格式的图像数据为GPU待写入的压缩数据,在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址,包括:获取中间虚拟地址对应的像素坐标;根据像素坐标获取待写入的压缩数据的头数据的地址;根据头数据的地址获取第二虚拟地址。即,写入数据时,考虑到内存中存放的是压缩数据,因此,在将本地格式的图像数据存放进非本地格式的内存中时,要得到压缩格式的数据的第二虚拟地址,以根据第二虚拟地址将数据存放入物理地址的内存中。
如果内存中存储的第一格式的图像数据为加密格式,那么在第二虚拟地址空间中将中间虚拟地址映射为第二虚拟地址,可以包括:获取中间虚拟地址对应的像素坐标;获取像素坐标对应的像素的签名;根据签名获取与签名对应的第二虚拟地址;若第一格式的图像数据为GPU需读取的加密数据,则方法还包括:对读取的图像数据进行解密,将解密后的图像数据发送给GPU。这种情况是考虑到,内存中存放的图像数据为加密格式,因此在进行像素级别的地址映射时,要得到加密格式下的第二虚拟地址。
通过以上说明,本申请实施例可以通过新增一个第二虚拟地址空间实现GPU直接访问非本地格式的图像数据。
下面以本申请如何对非本地格式的图形数据进行采样和渲染为例进行说明。参考图5,本申请实施例提供一种图形处理方法,以采样过程为例,该方法包括:
501、虚拟化软件代理拦截应用程序发送给GPU的图形处理请求,图形处理请求包括应用程序申请的非本地格式的VP地址空间和非本地格式的PP地址空间。
参考图6,图6示出了本申请采样非本地格式的图形数据的软硬件架构图。当某一应用程序指示GPU采样图形数据时,该应用程序可以向GPU发送图形处理请求,该图形处理请求中携带有采样所需的资源,该资源包括应用程序申请到的材质缓冲区。由于GPU采样的内存中存储的是对于GPU来说是非本地格式的图形数据,因此该材质缓冲区包括非本地格式的VP地址空间和非本地格式的PP地址空间,这时,虚拟软件代理可以拦截到应用程序发送给GPU的图形处理请求,以对图形处理请求进行解析。
502、虚拟化软件代理根据VP地址空间和PP地址空间构建出VPP地址空间,VPP地址空间为一片连续的虚拟地址空间。
示例性的,虚拟化软件代理可对拦截到的图形处理请求进行解析,以得到图形处理请求中的非本地格式的VP地址空间和非本地格式的PP地址空间,进一步的,根据VP地址空间和PP地址空间的映射关系构建出VPP地址空间,并得到VP地址空间与VPP地址空间的第一映射关系,以及VPP地址空间与PP地址空间的第二映射关系。这里可通过PP的大小和VP的大小计算得到VPP地址空间,在上文中已经阐述,这里不再赘述。GPU按照VPP地址可以读取本地格式的图形数据,对应的,GPU也可以将渲染之后的本地格式的图像写入VPP地址中。换句话说,第二虚拟地址空间是虚拟得到的,并不是真实存在的一段物理地址,GPU从这段虚拟出来的物理地址中读取或写入的数据为GPU可以访问的本地格式的图形数据。或者说,在将非本地格式的PP地址转换为第二虚拟地址时,图形数据的像素排布格式发生了变化,按照映射之后的VPP得到的缓冲的图像是以本地格式排布像素格式的,这种本地格式的图形数据是GPU可以直接访问的。
其中,本地格式是指GPU native format,是指GPU硬件本身支持(intrinsically support)的图像格式,GPU可以天然进行读写操作的格式。常用的本地格式由图形API定义。比如图形API为开放图形库(Open Graphics Library,OpenGL),OpenGL ES(OpenGL for Embedded Systems)以及3D规格界面的Direct3D时,常用的本地格式有:RGBA8888、RGBA16F、RGB10A2、SRGB8_A8_ASTC_3x3x3等。
非本地格式是指GPU不能直接进行读写操作的格式,示例性的,非本地格式包括所有非图形API支持的格式。这些格式一般都是由图形社区以外的应用场景产生。例如非本地格式包括Y10U10V10LPacked、Y10U10V10压缩格式、ICE以及Y10U10V10等。
由于内存中存储的是非本地格式的物理地址对应的图形数据,因此,下面的步骤还需要根据VPP地址空间对应得到真正采样数据时用到的非本地格式的PP地址。因此,该方法还包括:
503、虚拟化软件代理将第一映射关系发送给GPU,以及将第二映射关系和VPP地址空间发送给硬件虚拟化管理器。
VPP地址空间中的VPP地址用于替换GPU的MMU中PP地址空间中的PP地址,在GPU的MMU中建立VP地址空间中的VP地址与VPP地址空间中的VPP地址之间的第一映射关系。需要说明的是,本申请实施例中,GPU的硬件结构和软件程序并没有改动,VPP地址空间存储在GPU的MMU中,对于GPU来说是不感知的,是被动接收的,现有技术中GPU在读写数据时最终发送给内存的是真实的物理地址,而由于本申请中的MMU未存储真实的物理地址,存储的是非本地格式的VP地址与VPP地址范围中的VPP地址之间的映射关系,因此,GPU在读写非本地格式的数据时发送给内存的是将非本地格式的VP地址进行地址翻译得到的非本地格式的VPP地址。
参考图6,硬件虚拟化管理器可以包括探听过滤器(snoop filter)、格式转换处理器(format conversion processor)以及第二MMU。探听过滤器用于确定GPU读取的图形数据的物理地址是否在VPP地址范围内。格式转换处理器用于在第二虚拟空间进行像素级的地址映射,将GPU读取图形数据时发送的VPP地址(中间虚拟地址)转换为本地格式的目标VPP地址(第二虚拟地址),以及对要读取图形数据进行解压缩或解密等。第二MMU存储有第二映射关系,第二映射关系为VPP地址空间与PP地址空间的映射关系。这里的第二映射关系可以是虚拟化软件代理在构建VPP地址空间时,将第二映射关系配置到第二MMU中的。其实现方式可以为:虚拟化软件代理向第二MMU发送配置信息,该配置信息包括第二映射关系。
基于此,将VPP地址空间和第二映射关系发送给硬件虚拟化管理器可以包括:将VPP地址空间发送给探听过滤器,将第二映射关系发送给第二MMU。这样,探听过滤器存储有VPP地址空间后,GPU在采样图形数据时,每读取一个图形数据,该图形数据对应一个VP地址和一个PP地址,同样地,也对应一个VPP地址。GPU要读取图形数据时,GPU的MMU中维护有多个映射表,如果GPU要采样的内存中存储的是本地格式的图形数据,GPU的MMU中存储的是真实的物理地址,那么GPU发送给内存的就是真实的物理地址,硬件虚拟化管理器的探听过滤器探听到的真实的物理地址就不在VPP地址空间内,硬件虚拟化管理器可丢弃接收到的真实的物理地址。也就是说,探听过滤器会过滤掉不在VPP地址空间内的物理地址。
504、硬件虚拟化管理器解析GPU的访问命令,得到访问命令中携带的中间虚拟地址。
这里具体为硬件虚拟化管理器中的探听过滤器解析GPU的访问命令。
505、硬件虚拟化管理器确定中间虚拟地址是否在VPP地址空间内。
上述已经提到,GPU在采样本地格式的图形数据时,GPU的访问命令中携带的中间虚拟地址为采样数据时真实的PP地址,探听过滤器会探听到该PP地址不在VPP地址范围内;如果GPU采样非本地格式的图形数据时,探听过滤器会探听到GPU的访问命令中携带的中间虚拟地址在本申请虚拟的VPP地址空间内。
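探听过滤器的判断逻辑本质上是一次地址范围检查,可用如下Python片段示意(基址与空间大小均为假设的示例值):

```python
# 示意:判断中间虚拟地址是否落在VPP地址空间内
VPP_BASE, VPP_SIZE = 0x0, 400 * 1024  # 第二虚拟地址空间的起始地址与大小(示例值)

def in_vpp_space(addr):
    return VPP_BASE <= addr < VPP_BASE + VPP_SIZE

print(in_vpp_space(0x100))       # True:交给格式转换处理器做像素级映射
print(in_vpp_space(0x64530000))  # False:真实物理地址,被探听过滤器过滤掉
```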
506、若中间虚拟地址在VPP地址空间内,则硬件虚拟化管理器确定中间虚拟地址为VPP空间中的一个第一VPP地址。
也即,探听过滤器确定中间虚拟地址为VPP空间中的一个VPP地址。
507、硬件虚拟化管理器将第一VPP地址经过格式映射得到本地格式的目标VPP地址。
具体来说,探听过滤器将第一VPP地址发送给格式转换处理器;以便格式转换处理器将第一VPP地址转换为目标VPP地址。这里需要将第一VPP地址转换为目标VPP地址是考虑到内存中存储的像素格式有多种情况,例如该像素格式为压缩格式或加密格式。也就是说,目标VPP地址为经过地址转换后、与内存中的像素格式对应的VPP地址。
因此,格式转换处理器将第一VPP地址转换为目标VPP地址可以适应于多种场景,本申请实施例对该场景进行以下3种情况的举例说明。
1)格式转换处理器根据第一VPP地址获取与第一VPP地址对应的像素坐标(x,y),再根据像素坐标(x,y)获取与像素坐标(x,y)对应的目标VPP地址。
根据像素坐标(x,y)获取与像素坐标(x,y)对应的目标VPP地址的样例可以如下:
const uint TileW = 32;    // tile的宽
const uint TileH = 32;    // tile的高
const uint WidthInTile = (ImageWidth + TileW - 1) / TileW;  // 每行tile的个数
// Tiling Address Transform
uint TileX = x / TileW;          // 像素所在tile的横坐标
uint TileY = y / TileH;          // 像素所在tile的纵坐标
uint TileOffsetX = x % TileW;    // tile内的列偏移
uint TileOffsetY = y % TileH;    // tile内的行偏移
PixelAddress = (TileY * WidthInTile + TileX) * (TileW * TileH) + TileOffsetY * TileW + TileOffsetX;
采样和渲染是以tile为粒度进行时,TileW和TileH表示与非本地格式的第一VPP地址绑定的像素tile的宽和高,WidthInTile表示图像每行包含的tile个数,这样可以根据非本地格式的第一VPP地址的像素坐标(x,y)和像素tile的宽和高计算出像素tile的坐标(TileX,TileY),以及tile内的偏差坐标(TileOffsetX,TileOffsetY),最后根据像素tile的宽和高TileW和TileH、像素tile的坐标(TileX,TileY)、tile内的偏差坐标(TileOffsetX,TileOffsetY)计算出本地格式的目标VPP地址PixelAddress。
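上述tile地址变换可用如下可运行的Python等价实现示意(其中最后一项取tile内列偏移TileOffsetX;TileW、TileH沿用上文注释中的32,ImageWidth假设为128):

```python
# 示意:tile化的像素坐标到线性地址(PixelAddress)的变换
TILE_W, TILE_H, IMAGE_W = 32, 32, 128
WIDTH_IN_TILE = (IMAGE_W + TILE_W - 1) // TILE_W  # 每行tile个数:4

def pixel_address(x, y):
    tile_x, tile_y = x // TILE_W, y // TILE_H  # 像素所在tile的坐标
    off_x, off_y = x % TILE_W, y % TILE_H      # tile内偏移
    return (tile_y * WIDTH_IN_TILE + tile_x) * (TILE_W * TILE_H) + off_y * TILE_W + off_x

print(pixel_address(33, 1))  # 1057:第2个tile(索引1)内坐标(1,1)
```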
这种情况下,在根据最终获取的非本地格式的物理地址读取了图形数据之后,要将图形数据传输回给GPU时,可以从内存中将该数据直接反馈给GPU。
2)在内存中存储的图形数据的格式为压缩格式的情况下,为了在内存中的任何位置随机访问,这些压缩格式通常是基于块的,即每帧图形被分成不同的图形块,这些图形块可以是有损格式或无损格式。本申请实施例以基于图形块的无损压缩为例进行说明。
基于此,要实现格式转换处理器将第一VPP地址转换为目标VPP地址时,格式转换处理器可首先根据第一VPP地址获取与第一VPP地址对应的像素坐标(x,y),根据像素坐标(x,y)获取第一VPP地址要获取的图形块的索引,再根据索引获取格式转换处理器预先存储的与该索引对应的图形块的头数据,读取头数据中存储的头数据的压缩偏移信息,而后根据头数据的压缩偏移信息获取图形块对应的目标VPP地址。头数据的压缩偏移信息可以理解为该头数据的地址。
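按压缩图形块查找目标地址的过程可用如下Python片段示意(块大小沿用上文的tile设定,头数据表headers及其中的压缩偏移均为虚构的示例数据,并非本申请的实际实现):

```python
# 示意:像素坐标 -> 图形块索引 -> 头数据中的压缩偏移信息(即目标地址)
TILE_W, TILE_H, WIDTH_IN_TILE = 32, 32, 4
headers = {0: 0x000, 1: 0x220, 2: 0x470}  # 假设:块索引 -> 该块的压缩偏移信息

def block_offset(x, y):
    index = (y // TILE_H) * WIDTH_IN_TILE + (x // TILE_W)  # 像素坐标所属的块索引
    return headers[index]                                   # 查头数据得到压缩偏移

print(hex(block_offset(40, 0)))  # 0x220:坐标(40,0)落在索引为1的块
```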
3)一些GPU中的纹理采样器受数字版权管理的保护,因此,不仅有普通的图形数据,还有额外的签名被编码到内存中,即内存中存储的图形数据为加密数据,要采样图形数据时还需要多层认证才能获取数据,这种情况下,要实现格式转换处理器将第一VPP地址转换为目标VPP地址,首先可控制格式转换处理器根据第一VPP地址获取与第一VPP地址对应的像素坐标(x,y),再解码签名,即根据像素坐标(x,y)获取像素坐标(x,y)对应像素的签名,而后根据像素的签名获取格式转换处理器中预先存储的与该签名对应的目标VPP地址。
这种情况下,在根据非本地格式的物理地址读取了图形数据之后,要将该图形数据传输回给GPU时,由于内存中存储的图形数据为加密数据,因此该方法还包括:格式转换处理器对读取的图形数据进行解密,将解密后的图形数据发送给GPU。
508、硬件虚拟化管理器根据目标VPP地址和第二映射关系得到目标PP地址(第一物理地址),以使得GPU从目标PP地址中读取图形数据。
当格式转换处理器得到目标VPP地址后,可以将目标VPP地址传输给第二MMU。由于第二MMU中存储有第二映射关系,那么第二MMU就可以根据第二映射关系,查找与目标VPP地址对应的非本地格式的PP地址,第二MMU再将查找到的非本地格式的PP地址发送给内存,以从内存中读取该非本地格式的PP地址对应的非本地格式的图形数据。
因此,在本申请实施例中,GPU中的MMU中存储的为VPP地址范围这种虚拟的物理地址范围与VP地址范围的对应关系,由于地址的排布方式可反映采样图形数据时的排布方式,本申请提出的第二虚拟地址的排布方式为GPU本地格式的图形数据对应的物理地址的排布方式,因此,在根据该第二虚拟地址得到真实采样数据时的非本地格式的物理地址时,根据该非本地格式的物理地址采样图形数据的顺序对于GPU来说是转换后的本地格式的图形数据的顺序,因此,本申请这种通过虚拟化的方式可以使得GPU采样过程不需要额外的格式转换缓冲区,也省略了实际的进行格式转换的过程,实现了GPU以低成本方式采样非本地格式的图形数据。
与采样过程类似的,本申请还提供一种图形处理方法,以使能GPU渲染到非本地格式缓冲区的过程为例,如图7所示,该方法包括:
701、虚拟化软件代理拦截应用程序发送给GPU的图形处理请求和图形数据,图形处理请求包括应用程序申请的非本地格式的VP地址空间和非本地格式的PP地址空间。
步骤701的实现方式与步骤501的实现方式类似,不同的是,虚拟化软件代理在拦截图形处理请求时,还拦截到将要写入内存的图形数据。该图形数据为GPU本地格式的图形数据,本申请实施例是要将GPU本地格式的图形数据写入非本地格式的内存中。
可以理解的是,该图形数据也可以是应用程序直接发送给GPU而不被虚拟化软件代理拦截到。
702、虚拟化软件代理根据VP地址空间和PP地址空间构建出VPP地址空间,VPP地址空间为一片连续的虚拟地址空间。
该步骤702的实现方式与步骤502的实现方式类似。不同的是,该VPP地址空间用于使得GPU按照本地格式的地址顺序渲染本地格式的图形数据。同样地,由于地址顺序不同,按照该地址渲染得到的图形数据的格式也不同。这样一来,虽然内存中存的图形数据为非本地格式的图形数据,应用程序申请到的也是非本地格式的VP地址空间和非本地格式的PP地址空间,但是GPU要将本地格式的图形数据渲染到非本地格式的内存中去时,GPU可先按照本申请提出的VPP地址这种本地格式的地址顺序获取本地格式的图形数据,以便于后续将VPP地址反映射到非本地格式的PP地址,在将本地格式的图形数据存入内存中时,可以按照非本地格式的PP地址将本地格式的图形数据写入内存,使得内存中存入的最终为非本地格式的图形数据。
703、虚拟化软件代理将第一映射关系发送给GPU,以及将第二映射关系和VPP地址空间发送给硬件虚拟化管理器。
该步骤703的实现方式与步骤503的实现方式类似,不同的是,如果虚拟化软件代理还拦截到图形数据,那么虚拟化软件代理还需要将图形数据发送给GPU,以便GPU将拦截到的图形数据通过硬件虚拟化管理器写入内存。其中的探听过滤器用于确定GPU渲染的图形数据的物理地址是否在VPP地址范围内。格式转换处理器用于将GPU渲染图形数据时发送的第一VPP地址转换为目标VPP地址,以及对要写入的图形数据进行压缩或加密等。
704、硬件虚拟化管理器解析GPU的访问命令,得到访问命令中携带的中间虚拟地址。
705、硬件虚拟化管理器确定中间虚拟地址是否在VPP地址空间内。
步骤705的实现方式可以参考上述步骤505。
706、若中间虚拟地址在VPP地址空间内,则硬件虚拟化管理器确定中间虚拟地址为VPP空间中的一个第一VPP地址。
707、硬件虚拟化管理器将第一VPP地址经过格式映射得到本地格式的目标VPP地址。
具体来说,探听过滤器将第一VPP地址和从GPU接收到的图形数据发送给格式转换处理器;控制格式转换处理器将第一VPP地址转换为目标VPP地址。与步骤507类似的,格式转换处理器进行地址转换也是考虑到内存中的像素格式不同。
类似的,GPU渲染过程中,控制格式转换处理器将第一VPP地址转换为目标VPP地址也可以适用于多种场景:
一种场景可以参考步骤507中的1)情况中的说明。
再一种场景与步骤507中的2)情况类似,不同的是,地址转换过程中还需要对待渲染的图形数据进行压缩。具体来说,若内存中存储的像素格式为压缩格式,且GPU需向内存中写入图形数据,则要实现格式转换处理器将第一VPP地址转换为目标VPP地址时,可先控制格式转换处理器根据第一VPP地址获取与第一VPP地址对应的像素坐标(x,y);根据像素坐标(x,y)计算图形数据对应的图形块的索引,根据索引获取图形块的头数据,并将头数据和图形数据进行压缩;在格式转换处理器中存储压缩后的头数据与索引的对应关系,以便后续采样过程中使用到该对应关系。再根据头数据的地址计算得到图形块对应的目标VPP地址。
又一种场景与步骤507中的3)情况类似,不同的是,在将第一VPP地址转换为目标VPP地址时,如果内存中存的像素格式为加密格式,格式转换处理器还需要对待写入的图形数据进行加密。具体的加密方式可以采用简单的流密码实现或者采用相对复杂的分组密码的私有密码进行加密,本申请不做限定。
708、硬件虚拟化管理器根据目标VPP地址和第二映射关系得到目标PP地址(第一物理地址),以使得GPU向目标PP地址中写入图形数据。
当格式转换处理器得到目标VPP地址后,将目标VPP地址和压缩或加密后的图形数据发送给第二MMU;第二MMU根据存储的第二映射关系,查找与目标VPP地址对应的非本地格式的PP地址,第二MMU再根据查找到的PP地址将图形数据发送给内存,以根据PP地址将图形数据按照非本地格式写入内存。
因此,在本申请实施例中,GPU中的MMU中存储的为VPP地址这种虚拟的物理地址,即第二虚拟地址空间,由于地址的排布方式可反映渲染图形数据时的排布方式,本申请提出的虚拟的物理地址的排布方式为GPU本地格式的图形数据对应的物理地址的排布方式,因此,在根据该虚拟的物理地址得到对应的真实渲染数据时的非本地格式的物理地址时,根据该非本地格式的物理地址渲染图形数据的顺序对于GPU来说是按照本地格式的图形数据的顺序进行渲染的,因此,本申请这种通过虚拟化的方式可以使得GPU渲染过程不需要额外的格式转换缓冲区,也省略了实际的进行格式转换的过程,实现了GPU以低成本方式渲染本地格式的图形数据到非本地格式的内存。
上述主要从终端设备的角度对本申请实施例提供的方案进行了介绍。可以理解的是,终端设备为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对终端设备进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
在采用对应各个功能划分各个功能模块的情况下,图8示出了上述实施例中所涉及的终端设备的一种可能的结构示意图,该终端设备可以包括图形处理的装置,图形处理的装置可以用于执行图5对应的方法步骤和图7对应的方法步骤。终端设备80包括:获取单元801、传输单元802以及确定单元803。获取单元801用于支持终端设备执行图5中的过程501、502、504、507以及508,图7中的过程701、702、704、707以及708;传输单元802用于执行图5中的过程503,图7中的过程703;确定单元803用于支持终端设备80执行图5中的过程505和506,图7中的过程705和706。其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用集成的单元的情况下,图9示出了上述实施例中所涉及的终端设备的一种可能的结构示意图。终端设备90包括:处理模块902和通信模块903。处理模块902 用于对终端设备的动作进行控制管理,例如,处理模块902用于支持终端设备执行图5中的过程501-508,图7中的过程701-708,和/或用于本文所描述的技术的其它过程。通信模块903用于支持终端设备与其他网络实体的通信。终端设备还可以包括存储模块901,用于存储终端设备的程序代码和数据,该程序代码和数据包括本申请的虚拟化软件代理以及硬件虚拟化管理器的程序代码和数据。
其中,处理模块902可以是处理器或控制器,例如可以是中央处理器(Central Processing Unit,CPU),通用处理器,数字信号处理器(Digital Signal Processor,DSP),专用集成电路(Application-Specific Integrated Circuit,ASIC),现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,DSP和微处理器的组合等等。通信模块903可以是收发器、收发电路或通信接口等。存储模块901可以是存储器。存储器中包括本申请的虚拟化软件代理以及硬件虚拟化管理器的程序代码和数据。处理器包括本申请的硬件虚拟化管理器的硬件结构。
当处理模块902为处理器,通信模块903为收发器,存储模块901为存储器时,本申请实施例所涉及的终端设备可以为图3所示的终端设备。
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何在本发明揭露的技术范围内的变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以所述权利要求的保护范围为准。

Claims (28)

  1. 一种图形处理方法,其特征在于,包括:
    获取图形处理器GPU待访问的第一虚拟地址,所述第一虚拟地址属于第一虚拟地址空间;
    根据所述第一虚拟地址得到第二虚拟地址,所述第二虚拟地址属于第二虚拟地址空间;
    其中,所述第二虚拟地址空间不同于所述第一虚拟地址空间,所述第二虚拟地址空间和所述第一虚拟地址空间映射到相同的第一物理地址空间,所述第一虚拟地址所映射的物理地址对应第一格式的图像数据,所述第二虚拟地址所映射的物理地址对应第二格式的图像数据。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一虚拟地址得到第二虚拟地址,具体包括:
    将所述第一虚拟地址翻译成中间虚拟地址,所述中间虚拟地址为所述第二虚拟地址空间中的一个虚拟地址;
    在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址。
  3. 根据权利要求2所述的方法,其特征在于,所述将所述第一虚拟地址翻译成中间虚拟地址,具体包括:
    根据第一映射关系得到所述第一虚拟地址在所述第二虚拟地址空间中对应的所述中间虚拟地址,所述第一映射关系为所述第一虚拟地址空间和所述第二虚拟地址空间的映射关系。
  4. 根据权利要求3所述的方法,其特征在于,根据所述第一虚拟地址得到第二虚拟地址之后,所述方法还包括:
    根据第二映射关系得到所述第二虚拟地址在所述第一物理地址空间中对应的第一物理地址,所述第二映射关系为所述第二虚拟地址空间和所述第一物理地址空间的映射关系。
  5. 根据权利要求4所述的方法,其特征在于,在根据所述第一虚拟地址得到第二虚拟地址之前,所述方法还包括:
    获取发送给所述GPU的图形处理请求,所述图形处理请求包括所述第一虚拟地址空间和所述第一物理地址空间;
    根据所述第一虚拟地址空间和所述第一物理地址空间构建所述第二虚拟地址空间。
  6. 根据权利要求5所述的方法,其特征在于,所述方法还包括:
    根据所述第一虚拟地址空间和所述第一物理地址空间得到所述第一映射关系和所述第二映射关系。
  7. 根据权利要求5或6所述的方法,其特征在于,所述根据所述第一虚拟地址空间和所述第一物理地址空间构建所述第二虚拟地址空间,具体包括:
    获取所述第一物理地址空间对应的物理内存页PP的大小以及所述第一虚拟地址空间对应的虚拟内存页VP的大小;
    将所述第一物理地址空间映射到连续的虚拟内存空间中,得到所述第二虚拟地址空间,所述第二虚拟地址空间对应的虚拟物理内存页VPP的大小大于所述PP的大小以及所述VP的大小。
  8. 根据权利要求1所述的方法,其特征在于,所述根据所述第一虚拟地址得到第二虚拟地址,包括:
    将所述第一虚拟地址翻译成中间虚拟地址;
    判断所述中间虚拟地址是否属于所述第二虚拟地址空间;
    当所述中间虚拟地址属于所述第二虚拟地址空间时,在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址。
  9. 根据权利要求2至8任一项所述的方法,其特征在于,所述在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址,具体包括:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述第二虚拟地址。
  10. 根据权利要求2至8任一项所述的方法,其特征在于,
    所述第一格式的图像数据为所述GPU需读取的压缩数据,所述压缩数据包括多个压缩图形块,所述在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址,包括:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述中间虚拟地址对应的目标压缩图形块的压缩偏移信息;
    根据所述压缩偏移信息计算得到所述第二虚拟地址;
    所述方法还包括:
    对读取的所述目标压缩图形块进行解压缩。
  11. 根据权利要求2至8任一项所述的方法,其特征在于,所述第一格式的图像数据为所述GPU待写入的压缩数据,所述在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址,包括:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述待写入的压缩数据的头数据的地址;
    根据所述头数据的地址获取所述第二虚拟地址。
  12. 根据权利要求2至8任一项所述的方法,其特征在于,所述在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址,包括:
    获取所述中间虚拟地址对应的像素坐标;
    获取所述像素坐标对应的像素的签名;
    根据所述签名获取与所述签名对应的所述第二虚拟地址;
    若所述第一格式的图像数据为所述GPU需读取的加密数据,则所述方法还包括:
    对读取的图像数据进行解密,将解密后的图像数据发送给所述GPU。
  13. 一种图形处理装置,其特征在于,所述装置包括图形处理器GPU和硬件虚拟化管理器,其中:
    所述GPU,用于获取待访问的第一虚拟地址,所述第一虚拟地址属于第一虚拟地址空间;
    硬件虚拟化管理器,用于根据所述第一虚拟地址得到第二虚拟地址,所述第二虚拟地址属于第二虚拟地址空间;
    其中,所述第二虚拟地址空间不同于所述第一虚拟地址空间,所述第二虚拟地址空间和所述第一虚拟地址空间映射到相同的第一物理地址空间,所述第一虚拟地址所映射的物理地址对应第一格式的图像数据,所述第二虚拟地址所映射的物理地址对应第二格式的图像数据。
  14. 根据权利要求13所述的装置,其特征在于,所述GPU包括第一存储器管理单元MMU,所述硬件虚拟化管理器包括格式转换处理器;
    所述第一MMU,用于将所述第一虚拟地址翻译成中间虚拟地址,所述中间虚拟地址为所述第二虚拟地址空间中的一个虚拟地址;
    所述格式转换处理器,用于在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址。
  15. 根据权利要求14所述的装置,其特征在于,所述第一MMU用于:
    根据第一映射关系得到所述第一虚拟地址在所述第二虚拟地址空间中对应的所述中间虚拟地址,所述第一映射关系为所述第一虚拟地址空间和所述第二虚拟地址空间的映射关系。
  16. 根据权利要求15所述的装置,其特征在于,所述硬件虚拟化管理器包括第二MMU,所述第二MMU用于:
    根据第二映射关系得到所述第二虚拟地址在所述第一物理地址空间中对应的第一物理地址,所述第二映射关系为所述第二虚拟地址空间和所述第一物理地址空间的映射关系。
  17. 根据权利要求16所述的装置,其特征在于,所述装置还包括中央处理器CPU,所述CPU上运行有虚拟化软件代理,所述虚拟化软件代理,用于:
    获取发送给所述GPU的图形处理请求,所述图形处理请求包括所述第一虚拟地址空间和所述第一物理地址空间;
    根据所述第一虚拟地址空间和所述第一物理地址空间构建所述第二虚拟地址空间。
  18. 根据权利要求17所述的装置,其特征在于,所述虚拟化软件代理,还用于:
    根据所述第一虚拟地址空间和所述第一物理地址空间得到所述第一映射关系和所述第二映射关系。
  19. 根据权利要求17或18所述的装置,其特征在于,所述虚拟化软件代理,具体用于:
    获取所述第一物理地址空间对应的物理内存页PP的大小以及所述第一虚拟地址空间对应的虚拟内存页VP的大小;
    将所述第一物理地址空间映射到连续的虚拟内存空间中,得到所述第二虚拟地址空间,所述第二虚拟地址空间对应的虚拟物理内存页VPP的大小大于所述PP的大小以及所述VP的大小。
  20. 根据权利要求13所述的装置,其特征在于,所述GPU包括第一MMU,所述硬件虚拟化管理器包括探听过滤器和格式转换处理器;
    所述第一MMU,用于:将所述第一虚拟地址翻译成中间虚拟地址;
    所述探听过滤器,用于:
    判断所述中间虚拟地址是否属于所述第二虚拟地址空间;
    当所述中间虚拟地址属于所述第二虚拟地址空间时,将所述中间虚拟地址发送给所述格式转换处理器;
    所述格式转换处理器用于:在所述第二虚拟地址空间中将所述中间虚拟地址映射为所述第二虚拟地址。
  21. 根据权利要求14至20任一项所述的装置,其特征在于,所述格式转换处理器具体用于:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述第二虚拟地址。
  22. 根据权利要求14至20任一项所述的装置,其特征在于,所述第一格式的图像数据为所述GPU需读取的压缩数据,所述压缩数据包括多个压缩图形块;
    所述格式转换处理器具体用于:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述中间虚拟地址对应的目标压缩图形块的压缩偏移信息;
    根据所述压缩偏移信息计算得到所述第二虚拟地址;
    所述格式转换处理器还用于:
    对读取的所述目标压缩图形块进行解压缩。
  23. 根据权利要求14至20任一项所述的装置,其特征在于,所述第一格式的图像数据为所述GPU待写入的压缩数据;所述格式转换处理器用于:
    获取所述中间虚拟地址对应的像素坐标;
    根据所述像素坐标获取所述待写入的压缩数据的头数据的地址;
    根据所述头数据的地址获取所述第二虚拟地址。
  24. 根据权利要求14至20任一项所述的装置,其特征在于,所述格式转换处理器用于:
    获取所述中间虚拟地址对应的像素坐标;
    获取所述像素坐标对应的像素的签名;
    根据所述签名获取与所述签名对应的所述第二虚拟地址;
    若所述第一格式的图像数据为所述GPU需读取的加密数据,所述格式转换处理器还用于:
    对读取的图像数据进行解密,将解密后的图像数据发送给所述GPU。
  25. 一种图形处理方法,其特征在于,包括:
    获取待访问的第一虚拟地址,所述第一虚拟地址属于第一虚拟地址空间;
    将所述第一虚拟地址翻译成中间虚拟地址,所述中间虚拟地址属于第二虚拟地址空间,所述中间虚拟地址在所述第二虚拟地址空间中能够被映射为第二虚拟地址;
    其中,所述第二虚拟地址空间不同于所述第一虚拟地址空间,所述第二虚拟地址空间和所述第一虚拟地址空间映射到相同的第一物理地址空间,所述第一虚拟地址所映射的物理地址对应第一格式的图像数据,所述第二虚拟地址所映射的物理地址对应第二格式的图像数据。
  26. 根据权利要求25所述的方法,其特征在于,所述将所述第一虚拟地址翻译成中间虚拟地址,具体包括:
    根据第一映射关系得到所述第一虚拟地址在所述第二虚拟地址空间中对应的所述中间虚拟地址,所述第一映射关系为所述第一虚拟地址空间和所述第二虚拟地址空间的映射关系。
  27. 一种图形处理器GPU,其特征在于,所述GPU包括传输接口和存储器管理单元MMU,其中:
    所述传输接口,用于获取待访问的第一虚拟地址,所述第一虚拟地址属于第一虚拟地址空间;
    所述MMU,用于将所述第一虚拟地址翻译成中间虚拟地址,所述中间虚拟地址属于第二虚拟地址空间,所述中间虚拟地址在所述第二虚拟地址空间中能够被映射为第二虚拟地址;
    其中,所述第二虚拟地址空间不同于所述第一虚拟地址空间,所述第二虚拟地址空间和所述第一虚拟地址空间映射到相同的第一物理地址空间,所述第一虚拟地址所映射的物理地址对应第一格式的图像数据,所述第二虚拟地址所映射的物理地址对应第二格式的图像数据。
  28. 根据权利要求27所述的GPU,其特征在于,所述MMU,具体用于:
    根据第一映射关系得到所述第一虚拟地址在所述第二虚拟地址空间中对应的所述中间虚拟地址,所述第一映射关系为所述第一虚拟地址空间和所述第二虚拟地址空间的映射关系。
PCT/CN2019/088565 2019-05-27 2019-05-27 一种图形处理方法和装置 WO2020237460A1 (zh)

Publications (1)

Publication Number Publication Date
WO2020237460A1 true WO2020237460A1 (zh) 2020-12-03

Family

ID=73553553

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088565 WO2020237460A1 (zh) 2019-05-27 2019-05-27 一种图形处理方法和装置

Country Status (4)

Country Link
US (1) US20220083367A1 (zh)
EP (1) EP3964949B1 (zh)
CN (1) CN113168322A (zh)
WO (1) WO2020237460A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10380039B2 (en) * 2017-04-07 2019-08-13 Intel Corporation Apparatus and method for memory management in a graphics processing environment
KR20230138777A (ko) * 2022-03-24 2023-10-05 삼성전자주식회사 데이터 재구성가능한 스토리지 장치, 전자 시스템 및 그 동작방법
CN115454358B (zh) * 2022-11-09 2023-03-24 摩尔线程智能科技(北京)有限责任公司 数据的存储控制方法及其装置、图像处理系统
CN115456862B (zh) * 2022-11-09 2023-03-24 深流微智能科技(深圳)有限公司 一种用于图像处理器的访存处理方法及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101383040A (zh) * 2007-07-09 2009-03-11 株式会社东芝 用于处理图像的设备以及用于检测图像更新的方法
CN103440612A (zh) * 2013-08-27 2013-12-11 华为技术有限公司 一种gpu虚拟化中图像处理方法和装置
CN104102542A (zh) * 2013-04-10 2014-10-15 华为技术有限公司 一种网络数据包处理方法和装置
CN104731569A (zh) * 2013-12-23 2015-06-24 华为技术有限公司 一种数据处理方法及相关设备
CN109062833A (zh) * 2017-06-30 2018-12-21 创义达科技股份有限公司 计算系统操作方法、计算系统、车辆及计算机可读媒体

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8024547B2 (en) * 2007-05-01 2011-09-20 Vivante Corporation Virtual memory translation with pre-fetch prediction
US9378572B2 (en) * 2012-08-17 2016-06-28 Intel Corporation Shared virtual memory
US9886736B2 (en) * 2014-01-20 2018-02-06 Nvidia Corporation Selectively killing trapped multi-process service clients sharing the same hardware context
EP3056990B1 (en) * 2015-02-13 2019-01-30 Deutsche Telekom AG A mobile system and method thereof for secure multiplexing of gpu in an embedded system
US10013360B2 (en) * 2015-03-04 2018-07-03 Cavium, Inc. Managing reuse information with multiple translation stages
GB2545170B (en) * 2015-12-02 2020-01-08 Imagination Tech Ltd GPU virtualisation
CN108664523B (zh) * 2017-03-31 2021-08-13 华为技术有限公司 一种虚拟磁盘文件格式转换方法和装置
US10304421B2 (en) * 2017-04-07 2019-05-28 Intel Corporation Apparatus and method for remote display and content protection in a virtualized graphics processing environment
CN109727183B (zh) * 2018-12-11 2023-06-23 中国航空工业集团公司西安航空计算技术研究所 一种图形渲染缓冲区压缩表的调度方法及装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862724A (zh) * 2021-03-12 2021-05-28 上海壁仞智能科技有限公司 用于计算的方法、计算设备和计算机可读存储介质
CN112862724B (zh) * 2021-03-12 2022-09-09 上海壁仞智能科技有限公司 用于计算的方法、计算设备和计算机可读存储介质

Also Published As

Publication number Publication date
EP3964949A1 (en) 2022-03-09
US20220083367A1 (en) 2022-03-17
CN113168322A (zh) 2021-07-23
EP3964949A4 (en) 2022-05-18
EP3964949B1 (en) 2023-09-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931211

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019931211

Country of ref document: EP

Effective date: 20211130