WO2022183572A1 - Command submission method and apparatus, command reading method and apparatus, and electronic device - Google Patents

Command submission method and apparatus, command reading method and apparatus, and electronic device

Info

Publication number
WO2022183572A1
WO2022183572A1 (PCT/CN2021/087385, CN2021087385W)
Authority
WO
WIPO (PCT)
Prior art keywords
command
instruction
link
image processing
gpu
Prior art date
Application number
PCT/CN2021/087385
Other languages
English (en)
French (fr)
Inventor
何妍
Original Assignee
长沙景嘉微电子股份有限公司
长沙景美集成电路设计有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 长沙景嘉微电子股份有限公司 and 长沙景美集成电路设计有限公司
Publication of WO2022183572A1 publication Critical patent/WO2022183572A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3004Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30047Prefetch instructions; cache control instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3005Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30069Instruction skipping instructions, e.g. SKIP
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to image processing technology, and in particular, to a method and apparatus for submitting commands, a method and apparatus for reading commands, and electronic equipment.
  • CPU: central processing unit
  • GPU: Graphics Processing Unit
  • In the related art, the Ring Buffer mode is usually used to submit image processing commands. Before commands can be submitted in this mode, a portion of memory must be requested in advance to serve as a ring command queue, establishing a channel between the CPU and the GPU for submitting commands. The ring command queue maintains two pointers, one for reading and one for writing: the CPU writes commands into the ring command queue at the position of the write pointer, the GPU reads commands from the ring command queue at the position of the read pointer, and after each write or read completes, the corresponding pointer position is updated.
  • Because the ring command queue requires memory to be requested in advance, problems arise whether the requested memory is too large or too small. If it is too small, a large number of image processing commands awaiting GPU processing may accumulate in the ring command queue after some time, forcing the CPU to enter a waiting state until the ring command queue again has memory that can accommodate the commands the CPU submits. If it is too large, the ring command queue will in most cases be in a non-full state, wasting memory resources.
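The ring-buffer scheme described above can be sketched as follows. This is a minimal illustrative model, not code from the patent; the class and method names are assumptions.

```python
class RingCommandQueue:
    """Minimal sketch of the related-art Ring Buffer mode: a fixed-size
    queue with a read pointer (GPU side) and a write pointer (CPU side)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity  # the up-front memory reservation
        self.capacity = capacity
        self.read = 0   # position the GPU reads from
        self.write = 0  # position the CPU writes to
        self.count = 0

    def submit(self, command):
        """CPU side: returns False when the queue is full,
        i.e. the CPU would have to enter a waiting state."""
        if self.count == self.capacity:
            return False
        self.buf[self.write] = command
        self.write = (self.write + 1) % self.capacity
        self.count += 1
        return True

    def read_command(self):
        """GPU side: returns None when there is nothing to execute."""
        if self.count == 0:
            return None
        cmd = self.buf[self.read]
        self.read = (self.read + 1) % self.capacity
        self.count -= 1
        return cmd
```

The `submit` method returning `False` models the point at which the CPU must wait, and the fixed `capacity` is the advance memory reservation that the patent's link-instruction scheme avoids.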
  • Embodiments of the present application provide a command submission method and apparatus, a command reading method and apparatus, and an electronic device, to address the waste of memory resources in the related art.
  • a command submission method is provided, which is applied to a CPU of an electronic device, where the electronic device includes a command queue and a command buffer, and the method includes:
  • the link address of the second link instruction is configured as a wait instruction or the next first link instruction, so that the GPU executes the second link instruction and then jumps to the wait instruction or the next first link instruction.
  • a command reading method is provided, which is applied to a GPU of an electronic device, and the method includes:
  • a command submission apparatus which is applied to a CPU of an electronic device, the electronic device includes a command queue and a command buffer, and the apparatus includes:
  • an instruction insertion module for inserting the first link instruction into the command queue
  • a judgment module for judging whether there is an image processing command to be executed by the GPU
  • a command submission module configured to submit the image processing command to be executed by the GPU to the command buffer when there is an image processing command to be executed by the GPU;
  • the instruction insertion module is further configured to insert a second link instruction at the end of the command buffer;
  • an instruction configuration module, configured to configure the link address of the first link instruction as the address corresponding to the command buffer, and to configure the link address of the second link instruction as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction.
  • a command reading apparatus which is applied to a GPU of an electronic device, and the apparatus includes:
  • the command reading module is used to read the first link instruction in the command queue and then jump to the command buffer when there is an image processing command to be executed; to sequentially read and execute the image processing commands in the command buffer; and, after executing the image processing commands in the command buffer, to read the second link instruction and jump to the wait instruction or the next first link instruction.
  • an electronic device including a processor, a memory, and a bus, where the processor includes a CPU and a GPU, and the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the command submission method or the command reading method provided in the above embodiments is executed.
  • a storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, it performs the command submission method or the command reading method provided by the above embodiments.
  • the embodiment of the present application provides a command submission method and device, a command reading method and device, and an electronic device.
  • A first link instruction is first inserted into the command queue; when there is an image processing command to be executed by the GPU, that command is submitted to the command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; and the link address of the second link instruction is configured as the wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction.
  • In this way, all image processing commands to be executed by the GPU are submitted to the command buffer, and the command queue contains only link instructions. The link instructions enable the GPU to jump to the command buffer to read and execute the image processing commands there, so there is no need to request a portion of memory in advance to store the pending image processing commands, saving memory resources.
  • FIG. 1 is a schematic diagram of submitting an image processing command in the related art.
  • FIG. 2 is a schematic diagram of a Ring Buffer mode provided by an embodiment of the present application.
  • FIG. 3 is one of the flowcharts of the command submission method provided by the embodiment of the present application.
  • FIG. 4 is one of schematic diagrams of a command queue in an active state provided by an embodiment of the present application.
  • FIG. 5 is the second schematic diagram of a command queue in an active state provided by an embodiment of the present application.
  • FIG. 6 is the third schematic diagram of a command queue in an active state provided by an embodiment of the present application.
  • FIG. 7 is the second flow chart of the command submission method provided by the embodiment of the present application.
  • FIG. 8 is one of schematic diagrams of a command queue in an inactive state provided by an embodiment of the present application.
  • FIG. 9 is the second schematic diagram of a command queue in an inactive state provided by an embodiment of the present application.
  • FIG. 10 is a functional block diagram of a command submission device provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an electronic device provided by an embodiment of the present application.
  • each client program has a rendering context (Context) dedicated to itself, and the rendering context includes the client program's private command buffer.
  • each command buffer stores a number of image processing commands that need to be executed by the GPU (Graphics Processing Unit); the image processing commands in these command buffers are copied to the command queue, and the GPU reads and executes them from the command queue in order.
  • A well-designed command queue must be able to receive the GPU-bound commands submitted from each command buffer and hand them to the GPU for execution in first-in, first-out order. As shown in FIG. 1, FIG. 1 is a schematic diagram of submitting an image processing command in the related art.
  • In the related art, the Ring Buffer mode is usually used to submit commands, as shown in FIG. 2, which is a schematic diagram of the Ring Buffer mode provided by an embodiment of the present application.
  • In this command submission mode, a portion of the CPU's memory (or the GPU's video memory) must be requested in advance to serve as the ring command queue, which establishes a command submission channel between the CPU and the GPU. The ring command queue contains a read pointer and a write pointer: when the CPU writes an image processing command (that is, submits a command received from a client program to the ring command queue), it writes from the position of the write pointer; when the GPU reads an image processing command, it reads from the position of the read pointer; and the positions of the read and write pointers are updated after each write or read completes.
  • Although this mechanism allows image processing commands to be submitted to the command queue in order and executed by the GPU in order, the memory size that the ring command queue needs cannot be determined in advance. If the requested memory is small, then after some time the ring command queue will hold too many image processing commands awaiting GPU processing and have no memory left to accommodate further commands from the CPU; the CPU must then wait until enough space frees up in the ring command queue before it can add commands again. If the requested memory is too large, it may take a great many image processing commands to fill the ring command queue, so in most cases the queue will be in a non-full state, wasting memory resources.
  • the embodiments of the present application provide a command submission method and device, a command reading method and device, and an electronic device.
  • The first link instruction is first inserted into the command queue, and when there is an image processing command to be executed by the GPU, that command is submitted to the command buffer and a second link instruction is inserted at the end of the command buffer;
  • the link address of the first link command is configured as the address corresponding to the command buffer;
  • the link address of the second link instruction is configured as the wait instruction or the next first link instruction, so that the GPU can jump to the wait instruction or the next first link instruction after executing the second link instruction.
  • In this way, all image processing commands to be executed by the GPU are submitted to the command buffer, and the command queue contains only link instructions. The link instructions enable the GPU to jump to the command buffer to read and execute the image processing commands there, so there is no need to request a portion of memory in advance to store the pending image processing commands; no memory is wasted and the CPU never has to wait.
  • FIG. 3 is one of the flowcharts of a command submission method provided by an embodiment of the present application.
  • the command submission method is applied to the CPU of the electronic device, and the electronic device further includes a GPU, a command queue and a command buffer, and the method includes:
  • Step S11: insert the first link instruction into the command queue.
  • Step S12: judge whether there is an image processing command to be executed by the GPU.
  • Step S13: if there is an image processing command to be executed by the GPU, submit it to the command buffer, and insert a second link instruction at the end of the command buffer.
  • Step S14: configure the link address of the first link instruction as the address corresponding to the command buffer.
  • Step S15: configure the link address of the second link instruction as the wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction.
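Steps S11 to S15 can be sketched as follows. This is a simplified illustrative model of the CPU side; the dictionary-based instruction encoding and all names are assumptions, not taken from the patent.

```python
# Illustrative sketch of steps S11-S15: the command queue holds only
# link instructions; image processing commands go into a command buffer.

WAIT = ("wait",)  # placeholder wait instruction (hypothetical encoding)

def submit(command_queue, pending_commands):
    """CPU side: insert a first link instruction (S11); if there are
    commands to execute (S12), submit them to a command buffer ending
    in a second link instruction (S13) and configure both links (S14, S15)."""
    first_link = {"op": "link", "target": None}
    command_queue.append(first_link)                 # step S11
    if not pending_commands:                         # step S12
        first_link["target"] = first_link            # inactive: loop on itself
        return None
    second_link = {"op": "link", "target": WAIT}     # step S15 (default: wait)
    buffer = list(pending_commands) + [second_link]  # step S13
    first_link["target"] = buffer                    # step S14
    return buffer
```

Configuring `second_link` to point at the next first link instruction instead of `WAIT` would chain consecutive command buffers, matching the active-state behavior described in the text.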
  • the CPU needs to submit these image processing commands to the command buffer for the GPU to read and execute.
  • the CPU first needs to insert the first link instruction into the command queue, and at this time, both the queue head pointer and the queue tail pointer of the command queue point to the first link instruction.
  • the queue head pointer is used to indicate the position of the instruction being executed by the GPU
  • the queue tail pointer is used to indicate the position of the instruction submitted by the CPU.
  • When execution begins, the GPU reads the first link instruction at the position of the queue head pointer. The CPU then judges whether there is an image processing command to be executed by the GPU, that is, whether a client program has submitted a command that needs GPU processing. If there is such a command (this state can be defined as the active state), the image processing commands are submitted to the command buffer in order, and the second link instruction is inserted at the end of the command buffer.
  • FIG. 4 is one of schematic diagrams of a command queue in an active state provided by an embodiment of the present application.
  • Each time the GPU finishes an instruction, the queue head pointer is updated to the position of the next image processing command, so the GPU can process the image processing commands in order according to the position of the queue head pointer.
  • FIG. 5 is the second schematic diagram of the command queue in the active state provided by the embodiment of the present application.
  • After the GPU has executed the image processing commands in one command buffer, it jumps to the next first link instruction, and the CPU continues to judge whether there is still an image processing command to be executed by the GPU. If so, the image processing commands are submitted to a second command buffer, and a second link instruction is added at the end of that second command buffer. The link address of this added second link instruction is configured as a wait instruction or the next first link instruction; the link address of the corresponding first link instruction is then configured as the address of the second command buffer.
  • The wait instruction includes the time that the GPU needs to wait. For example, if the wait instruction indicates a waiting time of 30 milliseconds, the GPU will wait 30 milliseconds before executing the instruction that follows the wait instruction.
  • FIG. 6 is a third schematic diagram of a command queue in an active state provided by an embodiment of the present application.
  • In FIG. 6, the link address of the second link instruction is the wait instruction, the instruction following the wait instruction is a link instruction (the third link instruction in FIG. 6), and the link address of the third link instruction is again the wait instruction. That is, when there is no image processing command to be executed by the GPU, i.e. after the GPU has executed all the commands in the last command buffer, it executes the second link instruction at the end of that command buffer and jumps to the wait instruction, then executes the third link instruction, which jumps back to the wait instruction; the GPU thus enters a loop in which it waits for image processing commands to execute.
  • When a new image processing command arrives, the wait instruction is changed to a first link instruction, so that the GPU can break out of the wait loop and continue executing subsequent instructions.
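The wait loop of FIG. 6, and the way the CPU breaks the GPU out of it, can be modeled with a tiny interpreter. The tuple encoding of instructions is an assumption for illustration only.

```python
def run_gpu(queue, max_steps=20):
    """Tiny interpreter: 'link' jumps to its target index, any other
    instruction falls through to the next one. Returns the trace of
    executed opcodes, stopping after max_steps to expose the loop."""
    trace, pc = [], 0
    for _ in range(max_steps):
        if pc >= len(queue):
            break
        op, arg = queue[pc]
        trace.append(op)
        pc = arg if op == "link" else pc + 1
    return trace

# Inactive state: second link -> wait -> third link -> wait -> ...
queue = [("link", 1), ("wait", None), ("link", 1)]
loop_trace = run_gpu(queue, max_steps=5)

# The CPU breaks the loop by turning the wait instruction into a
# first link instruction pointing past the loop, at a newly
# submitted command (hypothetical).
queue[1] = ("link", 3)
queue.append(("cmd", None))
```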
  • After step S12, the method further includes:
  • Step S16: if there is no image processing command to be executed by the GPU, configure the link address of the first link instruction as the first link instruction itself, so that the GPU cyclically executes the first link instruction.
  • FIG. 8 is one of schematic diagrams of a command queue in an inactive state provided by an embodiment of the present application.
  • If the client program has not submitted an image processing command to the CPU, that is, there is no image processing command to be executed by the GPU (this state can be defined as the inactive state), the link address of the first link instruction needs to be configured as the address corresponding to the first link instruction itself. When the GPU executes, it then executes the first link instruction cyclically until there is an image processing command that needs GPU processing.
  • At that point, following steps S13 to S15, the link address of the first link instruction is configured as the command buffer, and the GPU jumps to the command buffer to read commands.
  • In some embodiments, the method further includes: if there is no image processing command to be executed by the GPU, inserting a wait instruction before the first link instruction and configuring the link address of the first link instruction as the wait instruction.
  • FIG. 9 is the second schematic diagram of a command queue in an inactive state provided by an embodiment of the present application.
  • A wait instruction can be inserted before the first link instruction, so that the instruction following the wait instruction is the first link instruction, and the link address of the first link instruction is configured as the wait instruction.
  • When the GPU executes, it first executes the wait instruction; after waiting for the specified period, it executes the next instruction (i.e., the first link instruction), which jumps back to the wait instruction. The GPU thus enters a wait loop until there is an image processing command that needs to be executed, at which point it jumps out of the loop.
  • the command queue further includes a queue tail pointer, and each time the CPU inserts a first link instruction, the position pointed to by the queue tail pointer is updated to the position where the inserted first link instruction is located. That is to say, every time the CPU submits an instruction, the position pointed to by the queue tail pointer will be updated to the position of the latest submitted instruction.
  • When the queue head pointer is initialized, it points to the first instruction (the wait instruction or the first link instruction) in the command queue. If the command queue is in the inactive state, the position of the queue head pointer does not need to be updated; if the command queue is in the active state, its position is updated every time the GPU executes an instruction.
  • When the queue tail pointer is initialized, it likewise points to the first instruction (the wait instruction or the first link instruction) in the command queue. If the command queue is in the inactive state, the position of the queue tail pointer does not need to be updated; if the command queue is in the active state, the queue tail pointer is updated to the position of each newly inserted first link instruction.
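The head and tail pointer rules above can be sketched as follows; the field and method names are illustrative assumptions.

```python
class CommandQueue:
    """Sketch of the queue pointer rules: the head tracks the instruction
    the GPU is executing, the tail tracks the most recently inserted
    first link instruction."""

    def __init__(self):
        self.instructions = []
        self.head = 0  # GPU side: position of instruction being executed
        self.tail = 0  # CPU side: position of latest inserted instruction

    def insert_first_link(self, link):
        self.instructions.append(link)
        self.tail = len(self.instructions) - 1  # tail follows each insert

    def gpu_step(self):
        """In the active state the head advances after each executed
        instruction; in the inactive state it stays where it is."""
        if self.head < len(self.instructions) - 1:
            self.head += 1
```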
  • an embodiment of the present application provides a command submission method.
  • The first link instruction is first inserted into the command queue; when there is an image processing command that needs to be executed by the GPU, the image processing command is submitted to the command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; the link address of the second link instruction is configured as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction; these steps repeat until all image processing commands that need to be executed by the GPU have been submitted to the command buffer.
  • the embodiment of the present application also provides a command reading method, which is applied to a GPU of an electronic device, and the electronic device further includes a CPU, a command queue, and a command buffer, and the method includes:
  • When there is an image processing command that needs to be executed, the GPU first reads the first link instruction in the command queue, then jumps to the command buffer and reads the image processing commands in the command buffer in turn, and finally reads the second link instruction at the end of the command buffer and jumps to the next first link instruction or the wait instruction.
  • If there is no image processing command that needs to be executed by the GPU, the GPU reads the first link instruction and then jumps back to the first link instruction, or jumps to the wait instruction.
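The GPU-side reading flow can be sketched as follows. The representation of link instructions and command buffers is an assumption; the traversal mirrors the described jumps: first link instruction into the command buffer, commands in order, then the second link instruction onward.

```python
def gpu_read(command_queue):
    """Sketch of the command reading method: follow each first link
    instruction into its command buffer, execute the image processing
    commands in order, then follow the second link instruction onward.
    Stops at a wait instruction or at a first link that loops on itself."""
    executed = []
    node = command_queue[0]
    while True:
        if node == "wait" or node.get("target") is node:
            break  # inactive state: wait, or the first link loops on itself
        target = node["target"]           # first link -> its command buffer
        *commands, second_link = target   # buffer ends with second link
        executed.extend(commands)         # read and execute in order
        node = second_link["target"]      # jump to wait or next first link
    return executed
```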
  • FIG. 10 is a functional block diagram of a command submission device 110 provided by an embodiment of the application, applied to a CPU of an electronic device, the electronic device further includes a GPU, a command queue and a command buffer, and the device includes:
  • the instruction insertion module 1101 is configured to insert the first link instruction into the command queue.
  • the judgment module 1102 is configured to judge whether there is an image processing command that needs to be executed by the GPU.
  • the command submission module 1103 is configured to submit the image processing command to be executed by the GPU to the command buffer when there is an image processing command that needs to be executed by the GPU.
  • the instruction inserting module 1101 is further configured to insert a second link instruction at the end of the command buffer.
  • An instruction configuration module 1104 configured to configure the link address of the first link instruction as an address corresponding to the command buffer, and configure the link address of the second link instruction as a waiting instruction or the next first link instruction, So that the GPU executes the second linking instruction and then jumps to the waiting instruction or the next first linking instruction.
  • In some embodiments, the apparatus further includes a pointer update module configured to, each time a first link instruction is inserted, update the position pointed to by the queue tail pointer to the position of the inserted first link instruction.
  • the instruction configuration module 1104 is further configured to:
  • the link address of the first link instruction is configured as the first link instruction, so that the GPU cyclically executes the first link instruction.
  • the instruction configuration module 1104 is further configured to:
  • a waiting command is inserted before the first linking command, and the linking address of the first linking command is configured as the waiting command.
  • An embodiment of the present application further provides a command reading apparatus, applied to a GPU of an electronic device; the electronic device further includes a CPU, a command queue, and a command buffer. The apparatus includes:
  • the command reading module is used to read the first link instruction in the command queue and then jump to the command buffer when there is an image processing command to be executed; to sequentially read and execute the image processing commands in the command buffer; and, after executing the image processing commands in the command buffer, to read the second link instruction and jump to the wait instruction or the next first link instruction.
  • The command reading module is further configured to, when there is no image processing command to be executed, read the first link instruction and then jump back to the first link instruction or jump to the wait instruction.
  • FIG. 11 is a schematic diagram of an electronic device 10 provided by an embodiment of the application.
  • the electronic device 10 includes a processor 11, a memory 12, and a bus 13, and the processor 11 includes a CPU and a GPU.
  • the memory 12 stores machine-readable instructions executable by the processor 11.
  • The processor 11 communicates with the memory 12 through the bus 13, and when the machine-readable instructions are executed by the processor 11, the command submission method or the command reading method provided by the embodiments of the present application is executed.
  • the embodiment of the present application further provides a storage medium, where a computer program is stored on the storage medium, and the computer program executes the command submission method or the command reading method provided by the embodiment of the present application when the computer program is run by the processor.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Facsimiles In General (AREA)

Abstract

Embodiments of the present application provide a command submission method and apparatus, a command reading method and apparatus, and an electronic device. The CPU first inserts a first link instruction into a command queue; if there are image processing commands to be executed by the GPU, those commands are submitted to a command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; and the link address of the second link instruction is configured as a wait instruction or the next first link instruction. In the present application, all image processing commands are submitted to command buffers, and the command queue stores only link instructions; the link instructions cause the GPU to jump to a command buffer to read and execute the image processing commands in it, so there is no need to request a block of memory in advance for storing the image processing commands to be executed, which saves memory resources.

Description

Command submission method and apparatus, command reading method and apparatus, and electronic device  Technical Field
The present application relates to image processing technology, and in particular to a command submission method and apparatus, a command reading method and apparatus, and an electronic device.
Background
In OpenGL (Open Graphics Library), the CPU (central processing unit) reserves a dedicated command buffer for each client program. The command buffer stores multiple image processing commands (for example, rendering commands) that need to be executed by the GPU (Graphics Processing Unit). These image processing commands are copied by the CPU from the command buffer into a command queue, and the GPU executes the image processing commands in the command queue in order.
At present, image processing commands are usually submitted using the ring-buffer scheme. Before commands can be submitted in this scheme, a block of memory must be requested in advance to serve as a ring command queue, establishing a command submission channel between the CPU and the GPU. The ring command queue maintains a read pointer and a write pointer: the CPU writes commands into the ring command queue at the position of the write pointer, the GPU reads commands from the ring command queue at the position of the read pointer, and the corresponding pointer position is updated after each write or read completes.
Because the ring command queue requires memory to be requested in advance, problems arise whether the requested memory is too large or too small. If the requested memory is small, a large number of image processing commands awaiting GPU processing may accumulate in the ring command queue after some time, and the CPU must then wait until the ring command queue again has room for the commands the CPU wants to submit. If the requested memory is too large, the ring command queue will in most cases be far from full, wasting memory resources.
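The ring-buffer scheme described in the background can be sketched in a few lines; the class and method names below are illustrative, not taken from any actual driver. The sketch makes visible why a fixed capacity forces the CPU to wait once the queue fills up:

```python
class RingCommandQueue:
    """Minimal model of the ring command queue described above (illustrative only)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.read = 0   # GPU reads at the read pointer
        self.write = 0  # CPU writes at the write pointer
        self.count = 0

    def submit(self, cmd):
        """CPU-side write; returns False when the queue is full (the CPU must wait)."""
        if self.count == len(self.buf):
            return False
        self.buf[self.write] = cmd
        self.write = (self.write + 1) % len(self.buf)
        self.count += 1
        return True

    def read_next(self):
        """GPU-side read; returns None when there is nothing to execute."""
        if self.count == 0:
            return None
        cmd = self.buf[self.read]
        self.read = (self.read + 1) % len(self.buf)
        self.count -= 1
        return cmd
```

With a capacity of 2, a third `submit` fails until the GPU consumes a command, which is exactly the CPU stall the background describes; sizing the capacity generously instead leaves the buffer mostly empty.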
Summary
Embodiments of the present application provide a command submission method and apparatus, a command reading method and apparatus, and an electronic device, to address the waste of memory resources found in the related art.
According to a first aspect of the embodiments of the present application, a command submission method is provided, applied to a CPU of an electronic device, the electronic device including a command queue and a command buffer, the method including:
inserting a first link instruction into the command queue;
determining whether there is an image processing command to be executed by a GPU;
if there is an image processing command to be executed by the GPU, submitting the image processing command to be executed by the GPU to the command buffer, and inserting a second link instruction at the end of the command buffer;
configuring a link address of the first link instruction as an address corresponding to the command buffer; and
configuring a link address of the second link instruction as a wait instruction or a next first link instruction, so that the GPU, after executing the second link instruction, jumps to the wait instruction or the next first link instruction.
According to a second aspect of the embodiments of the present application, a command reading method is provided, applied to a GPU of an electronic device, the method including:
when there is an image processing command to be executed, reading a first link instruction in a command queue and then jumping to the command buffer;
reading and executing the image processing commands in the command buffer in sequence; and
after the image processing commands in the command buffer have been executed, reading the second link instruction and then jumping to a wait instruction or a next first link instruction.
According to a third aspect of the embodiments of the present application, a command submission apparatus is provided, applied to a CPU of an electronic device, the electronic device including a command queue and a command buffer, the apparatus including:
an instruction insertion module configured to insert a first link instruction into the command queue;
a determination module configured to determine whether there is an image processing command to be executed by a GPU;
a command submission module configured to submit, when there is an image processing command to be executed by the GPU, the image processing command to be executed by the GPU to the command buffer;
the instruction insertion module being further configured to insert a second link instruction at the end of the command buffer; and
an instruction configuration module configured to configure a link address of the first link instruction as an address corresponding to the command buffer, and configure a link address of the second link instruction as a wait instruction or a next first link instruction, so that the GPU, after executing the second link instruction, jumps to the wait instruction or the next first link instruction.
According to a fourth aspect of the embodiments of the present application, a command reading apparatus is provided, applied to a GPU of an electronic device, the apparatus including:
a command reading module configured to: when there is an image processing command to be executed, read a first link instruction in the command queue and then jump to the command buffer; read and execute the image processing commands in the command buffer in sequence; and after the image processing commands in the command buffer have been executed, read the second link instruction and then jump to a wait instruction or a next first link instruction.
According to a fifth aspect of the embodiments of the present application, an electronic device is provided, including a processor, a memory, and a bus, the processor including a CPU and a GPU, and the memory storing machine-readable instructions executable by the processor. When the electronic device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the command submission method or the command reading method provided by the above embodiments.
According to a sixth aspect of the embodiments of the present application, a storage medium is provided, on which a computer program is stored; when the computer program is run by a processor, it performs the command submission method or the command reading method provided by the above embodiments.
Embodiments of the present application provide a command submission method and apparatus, a command reading method and apparatus, and an electronic device. In these embodiments, a first link instruction is first inserted into the command queue; when there is an image processing command to be executed by the GPU, the image processing command is submitted to the command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; and the link address of the second link instruction is configured as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU can jump to the wait instruction or the next first link instruction. In the embodiments of the present application, all image processing commands to be executed by the GPU are submitted to command buffers, and the command queue contains only link instructions. The link instructions cause the GPU to jump to a command buffer to read and execute the image processing commands in it, so there is no need to request a block of memory in advance for storing the image processing commands to be executed, which saves memory resources.
Brief Description of the Drawings
The accompanying drawings described here are provided to give a further understanding of the present application and constitute a part of it. The illustrative embodiments of the present application and their description are intended to explain the present application and do not constitute an undue limitation on it. In the drawings:
Fig. 1 is a schematic diagram of submitting image processing commands in the related art;
Fig. 2 is a schematic diagram of the ring-buffer scheme according to an embodiment of the present application;
Fig. 3 is a first flowchart of the command submission method according to an embodiment of the present application;
Fig. 4 is a first schematic diagram of the command queue in the active state according to an embodiment of the present application;
Fig. 5 is a second schematic diagram of the command queue in the active state according to an embodiment of the present application;
Fig. 6 is a third schematic diagram of the command queue in the active state according to an embodiment of the present application;
Fig. 7 is a second flowchart of the command submission method according to an embodiment of the present application;
Fig. 8 is a first schematic diagram of the command queue in the inactive state according to an embodiment of the present application;
Fig. 9 is a second schematic diagram of the command queue in the inactive state according to an embodiment of the present application;
Fig. 10 is a functional block diagram of the command submission apparatus according to an embodiment of the present application;
Fig. 11 is a schematic diagram of the electronic device according to an embodiment of the present application.
Reference numerals:
10 - electronic device; 11 - processor; 12 - memory; 13 - bus; 110 - command submission apparatus; 1101 - instruction insertion module; 1102 - determination module; 1103 - command submission module; 1104 - instruction configuration module; 1105 - loop module.
Detailed Description of the Embodiments
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not an exhaustive list of all of them. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.
In the course of developing the present application, the inventor found that in OpenGL (Open Graphics Library), each client program has its own rendering context. The rendering context includes a command buffer private to the client program, which stores multiple image processing commands to be executed by the GPU (Graphics Processing Unit). The image processing commands in these command buffers must be copied into a command queue, from which the GPU reads and executes them in order. A good command queue must be able both to receive the commands that each command buffer submits for GPU execution and to hand them over to the GPU for execution in first-in-first-out order. Fig. 1 is a schematic diagram of submitting image processing commands in the related art.
To ensure that image processing commands are submitted in order and executed by the GPU in order, the ring buffer scheme is currently the usual way to submit commands, as shown in Fig. 2, a schematic diagram of the ring-buffer scheme according to an embodiment of the present application. In this command submission scheme, a block of memory must be requested in advance from CPU memory (or GPU video memory) to serve as a ring command queue, establishing a command submission channel between the CPU and the GPU. The command queue maintains a read pointer and a write pointer. When the CPU writes image processing commands (that is, submits the commands provided by a client program into the ring command queue), it writes at the position of the write pointer; when the GPU reads image processing commands, it reads at the position of the read pointer; and after each write or read completes, the position of the corresponding pointer is updated.
Although this mechanism can submit image processing commands to the command queue in order and have the GPU execute them in order, the amount of memory the ring command queue needs cannot be determined in advance. If the requested memory is small, then after some time too many image processing commands awaiting GPU processing will accumulate in the ring command queue, leaving no memory to accommodate further commands from the CPU; the CPU must then wait until enough memory space becomes available in the ring command queue before it can add commands again. If the requested memory is too large, many image processing commands may be needed to fill the ring command queue completely, and in most cases the queue will be far from full, wasting memory resources.
To address the above problems, embodiments of the present application provide a command submission method and apparatus, a command reading method and apparatus, and an electronic device. In these embodiments, a first link instruction is first inserted into the command queue; when there is an image processing command to be executed by the GPU, the command is submitted to the command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; and the link address of the second link instruction is configured as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU can jump to the wait instruction or the next first link instruction. In the embodiments of the present application, all image processing commands to be executed by the GPU are submitted to command buffers, and the command queue contains only link instructions; the link instructions cause the GPU to jump to a command buffer to read and execute the image processing commands in it. There is no need to request a block of memory in advance for storing the image processing commands to be executed, so memory is never wasted and the CPU never has to wait.
Referring to Fig. 3, a first flowchart of the command submission method according to an embodiment of the present application. In this embodiment, the command submission method is applied to the CPU of an electronic device that further includes a GPU, a command queue, and a command buffer. The method includes:
Step S11: insert a first link instruction into the command queue.
Step S12: determine whether there is an image processing command to be executed by the GPU.
Step S13: if there is an image processing command to be executed by the GPU, submit it to the command buffer and insert a second link instruction at the end of the command buffer.
Step S14: configure the link address of the first link instruction as the address corresponding to the command buffer.
Step S15: configure the link address of the second link instruction as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction.
The above steps are repeated until all image processing commands to be executed by the GPU have been submitted to the command buffer.
In the above steps, all image processing commands to be executed by the GPU are submitted to the command buffer, and the command queue contains only link instructions. The link instructions cause the GPU to jump to the command buffer to read and execute the image processing commands in it. There is no need to request a block of memory in advance for storing the image processing commands to be executed; memory is never wasted, the CPU never has to wait, and memory resources are saved.
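As a rough illustration of steps S11 to S15, the CPU-side submission loop can be sketched as follows. The dict-based instruction encoding and all names here are hypothetical, chosen only to make the linking between the command queue and the command buffers visible:

```python
def submit_commands(command_queue, command_batches):
    """Sketch of steps S11-S15: the command queue holds only link instructions,
    while the image processing commands all go into per-batch command buffers."""
    first_link = {"op": "link", "target": None}
    command_queue.append(first_link)                 # S11: insert first link instruction
    for commands in command_batches:                 # S12: commands awaiting the GPU?
        buffer = list(commands)                      # S13: submit commands to a buffer
        second_link = {"op": "link", "target": None}
        buffer.append(second_link)                   # S13: second link at end of buffer
        first_link["target"] = buffer                # S14: first link -> command buffer
        next_first = {"op": "link", "target": None}  # S15: second link -> next first link
        command_queue.append(next_first)
        second_link["target"] = next_first
        first_link = next_first
    first_link["target"] = {"op": "wait"}            # no more work: link to a wait instruction
    return command_queue
```

Walking the resulting chain from the head of the queue visits every image processing command in submission order and ends at the wait instruction, without the queue itself ever holding a command.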
In this embodiment, after a client program submits image processing commands to be executed by the GPU, the CPU needs to submit those commands into a command buffer for the GPU to read and execute.
Optionally, when submitting image processing commands, the CPU first inserts a first link instruction into the command queue; at this point both the queue head pointer and the queue tail pointer point at this first link instruction. The queue head pointer indicates the position of the instruction the GPU is currently executing, and the queue tail pointer indicates the position of the instruction most recently submitted by the CPU.
The GPU therefore starts reading the first link instruction at the position of the queue head pointer. It is then determined whether there is an image processing command to be executed by the GPU, that is, whether a client program has submitted commands requiring GPU processing. If so (this state can be defined as the active state), the image processing commands are submitted to the command buffer in order, and a second link instruction is inserted at the end of the command buffer.
The link address of the first link instruction is then changed to the address corresponding to the command buffer, so that the GPU jumps to the command buffer after reading the first link instruction. The link address of the second link instruction is then configured as a wait instruction or the next first link instruction: once the GPU has processed all the image processing commands in the command buffer, it reads the second link instruction in order and then jumps to the wait instruction or the next first link instruction. At this point the position at which the CPU inserts instructions has advanced to the next first link instruction, so the queue tail pointer should be at the position of the next first link instruction, as shown in Fig. 4, a first schematic diagram of the command queue in the active state.
Optionally, in this embodiment, after the GPU finishes executing an image processing command, the queue head pointer is updated to the position of the next image processing command, so the GPU can process the image processing commands one by one according to the position of the queue head pointer.
As shown in Fig. 5, a second schematic diagram of the command queue in the active state: after the image processing commands of one command buffer have been executed, the GPU jumps to the next first link instruction, and it is again determined whether there are further image processing commands to be executed by the GPU. If so, the commands are submitted to a second command buffer, another second link instruction is added at the end of the second command buffer with its link address configured as a wait instruction or the next first link instruction, and the link address of the first link instruction is configured as the address of the second command buffer.
In other words, after executing the second link instruction at the end of the first command buffer, the GPU jumps to the next first link instruction and then to the second command buffer that it links to, where it continues reading and executing the image processing commands of the second command buffer. After that, the GPU reads the second link instruction at the end of the second command buffer and jumps to the wait instruction or next first link instruction that it links to.
The above steps are repeated until all pending image processing commands have been submitted to command buffers.
Optionally, in this embodiment, the wait instruction includes the length of time the GPU must wait. For example, if the wait instruction specifies a wait time of 30 milliseconds, the GPU executes the instruction following the wait instruction only after waiting 30 milliseconds.
As shown in Fig. 6, a third schematic diagram of the command queue in the active state: when the link address of the second link instruction is a wait instruction, the instruction following the wait instruction is a link instruction (the third link instruction in Fig. 6) whose link address is that wait instruction. That is, when there is no image processing command to be executed by the GPU (after the GPU has executed all commands of the last command buffer), the GPU executes the second link instruction at the end of that command buffer and jumps to the wait instruction; after the wait instruction completes, it executes the third link instruction in order, which jumps back to the wait instruction. The GPU thus enters a loop, waiting for image processing commands to appear.
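One way to picture the idle loop of Fig. 6 is with the same hypothetical encoding: a wait record carrying its wait time, followed by a third link record whose link address points back at the wait record. The names and structure are illustrative only:

```python
# Illustrative model of Fig. 6: a wait instruction followed by a third link
# instruction whose link address points back at the wait instruction.
wait_instr = {"op": "wait", "ms": 30, "next": None}
third_link = {"op": "link", "target": wait_instr}
wait_instr["next"] = third_link

def step(node, slept):
    """One GPU step through the idle loop; `slept` accumulates wait time in ms."""
    if node["op"] == "wait":
        slept.append(node["ms"])  # GPU pauses for the specified time
        return node["next"]       # then falls through to the next instruction
    return node["target"]         # link instruction: jump to its link address
```

Stepping through this structure alternates between waiting and jumping back, which is exactly the loop the GPU stays in until the CPU rewrites the wait instruction into a first link instruction.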
When an image processing command requiring GPU processing is detected, that is, when the determination finds that there are further image processing commands to be executed by the GPU, the wait instruction needs to be changed into a first link instruction so that the GPU can break out of the wait loop and continue executing the subsequent instructions.
Optionally, referring to Fig. 7, a second flowchart of the command submission method according to an embodiment of the present application. In this embodiment, after step S12, the method further includes:
Step S16: if there is no image processing command to be executed by the GPU, configure the link address of the first link instruction as the first link instruction itself, so that the GPU executes the first link instruction in a loop.
In the above step, as shown in Fig. 8, a first schematic diagram of the command queue in the inactive state: in this embodiment, if the client program has not submitted any image processing command to the CPU, that is, there is no image processing command to be executed by the GPU (this state can be defined as the inactive state), the link address of the first link instruction needs to be configured as its own address. During execution, the GPU then executes the first link instruction in a loop until an image processing command requiring GPU processing appears.
When there is an image processing command requiring GPU processing, the method described in steps S13 to S15 can be followed; the link address of the first link instruction is then configured as the command buffer, and the GPU jumps to the command buffer to read commands.
Optionally, in another implementation, after step S12 the method further includes: if there is no image processing command to be executed by the GPU, inserting a wait instruction before the first link instruction and configuring the link address of the first link instruction as the wait instruction.
In the above step, as shown in Fig. 9, a second schematic diagram of the command queue in the inactive state: in this embodiment, when there is no image processing command to be executed by the GPU, a wait instruction can be inserted before the first link instruction, so that the instruction following the wait instruction is the first link instruction, and the link address of the first link instruction is configured as the wait instruction.
During execution, the GPU first executes the wait instruction; after waiting for the specified period, it executes the next instruction (the first link instruction), which jumps back to the wait instruction. The GPU thus enters a wait loop and breaks out of it only when an image processing command to be executed by the GPU appears.
Optionally, in this embodiment, the command queue further includes a queue tail pointer. Each time the CPU inserts a first link instruction, the position pointed to by the queue tail pointer is updated to the position of the inserted first link instruction. In other words, whenever the CPU submits an instruction, the queue tail pointer is updated to the position of the most recently submitted instruction.
Optionally, in this embodiment, the queue head pointer initially points to the first instruction in the command queue (a wait instruction or a first link instruction). If the command queue is in the inactive state, the queue head pointer does not need to be updated; if the command queue is in the active state, the pointer position is updated each time the GPU finishes executing an instruction.
The queue tail pointer likewise initially points to the first instruction in the command queue (a wait instruction or a first link instruction). If the command queue is in the inactive state, the queue tail pointer does not need to be updated; if the command queue is in the active state, the queue tail pointer is updated to the position of each newly inserted first link instruction.
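The head and tail pointer rules above can be summarized in a small model; the class and method names are hypothetical, and only the pointer bookkeeping from the two preceding paragraphs is represented:

```python
class CommandQueuePointers:
    """Illustrative model of the queue head/tail pointer rules described above."""

    def __init__(self, first_instr):
        # Both pointers initially point at the first instruction in the queue
        # (a wait instruction or a first link instruction).
        self.items = [first_instr]
        self.head = 0  # position of the instruction the GPU is executing
        self.tail = 0  # position of the most recently submitted instruction

    def cpu_insert_first_link(self, instr):
        """Active state: each newly inserted first link instruction
        moves the tail pointer to its own position."""
        self.items.append(instr)
        self.tail = len(self.items) - 1

    def gpu_finish_instruction(self):
        """Active state: the head pointer advances each time the GPU
        finishes executing an instruction."""
        if self.head < len(self.items) - 1:
            self.head += 1
```

In the inactive state neither method is called, so both pointers simply stay at the first instruction, matching the behavior described above.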
In summary, embodiments of the present application provide a command submission method. In this embodiment, a first link instruction is first inserted into the command queue; when there are image processing commands to be executed by the GPU, they are submitted to the command buffer and a second link instruction is inserted at the end of the command buffer; the link address of the first link instruction is configured as the address corresponding to the command buffer; and the link address of the second link instruction is configured as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU can jump to the wait instruction or the next first link instruction. These steps are repeated until all image processing commands to be executed by the GPU have been submitted to the command buffer.
With the command queue provided by the above embodiments, all image processing commands to be executed by the GPU are submitted to command buffers, and the command queue contains only link instructions, which cause the GPU to jump to a command buffer and conveniently read commands from it for execution. Moreover, with this approach there is no need to request a block of memory in advance for storing the image processing commands to be executed; the queue can never be filled up by image processing commands (i.e., run out of memory), so the CPU never waits to submit commands, and no memory is wasted.
An embodiment of the present application further provides a command reading method, applied to the GPU of an electronic device that further includes a CPU, a command queue, and a command buffer. The method includes:
when there is an image processing command to be executed, reading the first link instruction in the command queue and then jumping to the command buffer; reading and executing the image processing commands in the command buffer in sequence; and after the image processing commands in the command buffer have been executed, reading the second link instruction and then jumping to a wait instruction or the next first link instruction.
In this embodiment, if there is an image processing command to be executed by the GPU, the GPU first reads the first link instruction in the command queue, then jumps to the command buffer and reads the image processing commands in it in sequence, and then reads the second link instruction at the end of the command buffer and jumps to the next first link instruction or to a wait instruction.
If there is no image processing command to be executed by the GPU, the GPU reads the first link instruction and then jumps back to that first link instruction or to a wait instruction.
The GPU's command reading method has been described in detail in the foregoing embodiments and is not repeated here.
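For concreteness, the GPU-side walk can be sketched as follows. The dict/list encoding of link and wait instructions is a hypothetical stand-in for real GPU command structures, matching the other sketches in this description:

```python
def gpu_read(first_link):
    """Follow link instructions from the command queue, executing (here:
    collecting) every image processing command found in a command buffer,
    and stop at a wait instruction or at a self-/unconfigured link."""
    executed = []
    node = first_link
    while node is not None and node.get("op") != "wait":
        target = node["target"]
        if target is None or target is node:
            break                         # inactive state: link points at itself
        if isinstance(target, list):      # first link instruction -> command buffer
            executed.extend(target[:-1])  # image processing commands, in order
            node = target[-1]             # second link instruction at the buffer end
        else:
            node = target                 # jump to a wait or next first link instruction
    return executed
```

A first link pointing at a buffer of two commands yields both commands in order and stops at the wait instruction; a first link pointing at itself (the inactive state) yields nothing and returns immediately instead of spinning.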
Referring to Fig. 10, a functional block diagram of the command submission apparatus 110 according to an embodiment of the present application, applied to the CPU of an electronic device that further includes a GPU, a command queue, and a command buffer. The apparatus includes:
an instruction insertion module 1101 configured to insert a first link instruction into the command queue;
a determination module 1102 configured to determine whether there is an image processing command to be executed by the GPU;
a command submission module 1103 configured to submit, when there is an image processing command to be executed by the GPU, the image processing command to the command buffer;
the instruction insertion module 1101 being further configured to insert a second link instruction at the end of the command buffer; and
an instruction configuration module 1104 configured to configure the link address of the first link instruction as the address corresponding to the command buffer, and configure the link address of the second link instruction as a wait instruction or the next first link instruction, so that after executing the second link instruction the GPU jumps to the wait instruction or the next first link instruction.
In an optional implementation, the apparatus further includes a pointer update module configured to update, each time a first link instruction is inserted, the position pointed to by the queue tail pointer to the position of the inserted first link instruction.
In an optional implementation, the instruction configuration module 1104 is further configured to:
when there is no image processing command to be executed by the GPU, configure the link address of the first link instruction as the first link instruction itself, so that the GPU executes the first link instruction in a loop.
In another optional implementation, the instruction configuration module 1104 is further configured to:
when there is no image processing command to be executed by the GPU, insert a wait instruction before the first link instruction and configure the link address of the first link instruction as the wait instruction.
Optionally, an embodiment of the present application further provides a command reading apparatus, applied to the GPU of an electronic device that further includes a CPU, the CPU including a command queue and a command buffer. The apparatus includes:
a command reading module configured to: when there is an image processing command to be executed, read the first link instruction in the command queue and then jump to the command buffer; read and execute the image processing commands in the command buffer in sequence; and after the image processing commands in the command buffer have been executed, read the second link instruction and then jump to a wait instruction or the next first link instruction.
In an optional implementation, the command reading module is further configured to, when there is no image processing command to be executed, read the first link instruction and then jump back to that first link instruction or to a wait instruction.
Referring to Fig. 11, a schematic diagram of the electronic device 10 according to an embodiment of the present application. In this embodiment, the electronic device 10 includes a processor 11, a memory 12, and a bus 13, the processor 11 including a CPU and a GPU. The memory 12 stores machine-readable instructions executable by the processor 11. When the electronic device 10 runs, the processor 11 communicates with the memory 12 through the bus 13, and the machine-readable instructions, when executed by the processor 11, perform the command submission method or the command reading method provided by the embodiments of the present application.
An embodiment of the present application further provides a storage medium on which a computer program is stored; when the computer program is run by a processor, it performs the command submission method or the command reading method provided by the embodiments of the present application.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operating steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to encompass them.

Claims (14)

  1. A command submission method, applied to a central processing unit (CPU) of an electronic device, the electronic device including a command queue and a command buffer, the method comprising:
    inserting a first link instruction into the command queue;
    determining whether there is an image processing command to be executed by a GPU;
    if there is an image processing command to be executed by the GPU, submitting the image processing command to be executed by the GPU to the command buffer, and inserting a second link instruction at the end of the command buffer;
    configuring a link address of the first link instruction as an address corresponding to the command buffer; and
    configuring a link address of the second link instruction as a wait instruction or a next first link instruction, so that the GPU, after executing the second link instruction, jumps to the wait instruction or the next first link instruction.
  2. The method according to claim 1, wherein after determining whether there is an image processing command to be executed by the GPU, the method further comprises:
    if there is no image processing command to be executed by the GPU, configuring the link address of the first link instruction as the first link instruction, so that the GPU executes the first link instruction in a loop.
  3. The method according to claim 1, wherein after determining whether there is an image processing command to be executed by the GPU, the method further comprises:
    if there is no image processing command to be executed by the GPU, inserting a wait instruction before the first link instruction, and configuring the link address of the first link instruction as the wait instruction.
  4. The method according to any one of claims 1 to 3, wherein the command queue further includes a queue tail pointer, and the method further comprises:
    each time a first link instruction is inserted, updating the position pointed to by the queue tail pointer to the position of the inserted first link instruction.
  5. A command reading method, applied to a graphics processing unit (GPU) of an electronic device, the method comprising:
    when there is an image processing command to be executed, reading a first link instruction in a command queue and then jumping to the command buffer;
    reading and executing the image processing commands in the command buffer in sequence; and
    after the image processing commands in the command buffer have been executed, reading the second link instruction and then jumping to a wait instruction or a next first link instruction.
  6. The method according to claim 5, further comprising:
    when there is no image processing command to be executed, reading the first link instruction and then jumping to the first link instruction or to a wait instruction.
  7. A command submission apparatus, applied to a CPU of an electronic device, the electronic device including a command queue and a command buffer, the apparatus comprising:
    an instruction insertion module configured to insert a first link instruction into the command queue;
    a determination module configured to determine whether there is an image processing command to be executed by a GPU;
    a command submission module configured to submit, when there is an image processing command to be executed by the GPU, the image processing command to be executed by the GPU to the command buffer;
    the instruction insertion module being further configured to insert a second link instruction at the end of the command buffer; and
    an instruction configuration module configured to configure a link address of the first link instruction as an address corresponding to the command buffer, and configure a link address of the second link instruction as a wait instruction or a next first link instruction, so that the GPU, after executing the second link instruction, jumps to the wait instruction or the next first link instruction.
  8. The apparatus according to claim 7, wherein the instruction configuration module is further configured to:
    when there is no image processing command to be executed by the GPU, configure the link address of the first link instruction as the first link instruction, so that the GPU executes the first link instruction in a loop.
  9. The apparatus according to claim 7, wherein the instruction configuration module is further configured to:
    when there is no image processing command to be executed by the GPU, insert a wait instruction before the first link instruction, and configure the link address of the first link instruction as the wait instruction.
  10. The apparatus according to any one of claims 7 to 9, wherein the command queue further includes a queue tail pointer, and the apparatus further comprises:
    a pointer update module configured to update, each time a first link instruction is inserted, the position pointed to by the queue tail pointer to the position of the inserted first link instruction.
  11. A command reading apparatus, applied to a GPU of an electronic device, the apparatus comprising:
    a command reading module configured to: when there is an image processing command to be executed, read a first link instruction in a command queue and then jump to a command buffer; read and execute the image processing commands in the command buffer in sequence; and after the image processing commands in the command buffer have been executed, read the second link instruction and then jump to a wait instruction or a next first link instruction.
  12. The command reading apparatus according to claim 11, wherein the command reading module is further configured to: when there is no image processing command to be executed, read the first link instruction and then jump to the first link instruction or to a wait instruction.
  13. An electronic device, comprising a processor, a memory, and a bus, the processor including a CPU and a GPU, the memory storing machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the command submission method or the command reading method according to any one of claims 1 to 6.
  14. A storage medium, on which a computer program is stored, wherein when the computer program is run by a processor, it performs the command submission method or the command reading method according to any one of claims 1 to 6.
PCT/CN2021/087385 2021-03-02 2021-04-15 Command submission method and apparatus, command reading method and apparatus, and electronic device WO2022183572A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110229669.8A CN113051071A (zh) 2021-03-02 2021-03-02 Command submission method and apparatus, command reading method and apparatus, and electronic device
CN202110229669.8 2021-03-02

Publications (1)

Publication Number Publication Date
WO2022183572A1 (zh)

Family

ID=76509750

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087385 WO2022183572A1 (zh) 2021-03-02 2021-04-15 Command submission method and apparatus, command reading method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN113051071A (zh)
WO (1) WO2022183572A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878521B (zh) * 2023-01-17 2023-07-21 北京象帝先计算技术有限公司 Command processing system, electronic apparatus, and electronic device
CN116841739B (zh) * 2023-06-30 2024-04-19 沐曦集成电路（杭州）有限公司 Data packet reuse system for heterogeneous computing platforms

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150186068A1 (en) * 2013-12-27 2015-07-02 Sandisk Technologies Inc. Command queuing using linked list queues
CN106462393A (zh) * 2014-05-30 2017-02-22 苹果公司 System and method for a unified application programming interface and model
US20170060749A1 (en) * 2015-08-31 2017-03-02 Sandisk Technologies Inc. Partial Memory Command Fetching
CN106687927A (zh) * 2014-09-12 2017-05-17 英特尔公司 Facilitating dynamic parallel scheduling of command packets for a graphics processing unit on a computing device
CN112114967A (zh) * 2020-09-16 2020-12-22 中国船舶重工集团公司第七0九研究所 GPU resource reservation method based on service priority

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2383648B1 (en) * 2010-04-28 2020-02-19 Telefonaktiebolaget LM Ericsson (publ) Technique for GPU command scheduling
US9947068B2 (en) * 2016-03-10 2018-04-17 Gamefly Israel Ltd. System and method for GPU scheduling


Also Published As

Publication number Publication date
CN113051071A (zh) 2021-06-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21928653

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 12/02/2024)