CN118035163A - Method, system and storage medium for processing data in real time by GPU - Google Patents

Info

Publication number
CN118035163A
Authority
CN
China
Prior art keywords
data
stored
processing
gpu
reference count
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410426309.0A
Other languages
Chinese (zh)
Other versions
CN118035163B (en)
Inventor
唐春有
莫潘良
郭一帆
周晓军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Icube Corp ltd
Original Assignee
Icube Corp ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Icube Corp ltd
Priority to CN202410426309.0A
Publication of CN118035163A
Application granted
Publication of CN118035163B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/167 Interprocessor communication using a common memory, e.g. mailbox
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30101 Special purpose registers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Information Transfer Systems (AREA)

Abstract

The invention provides a method, a system and a storage medium for processing data in real time by a GPU. The method comprises the following steps. Step 1: a reference count is added to the data read from the ring buffer to mark whether the data has been processed. Step 2: when the reference count reaches 0, an interrupt to the CPU is triggered automatically and, at the same time, the address of the data is written into a completion register. Step 3: the CPU receives the interrupt, reads the data address from the completion register, and fetches the data for subsequent processing. The beneficial effects of the invention are as follows: the reference count allows whichever data completes first to trigger an interrupt, and the completion register lets the CPU handle the completed data, so that data is processed in real time and the performance of the GPU is improved.

Description

Method, system and storage medium for processing data in real time by GPU
Technical Field
The invention relates to the technical field of data interaction between a graphics card and a processor in a computer, and in particular to a method, a system and a storage medium for processing data in real time by a GPU.
Background
At present, rendering and computation in a computer can interact through a ring buffer (ringbuf). The GPU reads data sequentially according to the positions of the read and write pointers of ringbuf, and the data read is distributed, according to its characteristics, to the rendering/computing units of the GPU for parallel processing. The data processed in parallel finishes at different times. If interrupts are triggered to the CPU strictly in the order in which the data was read from ringbuf, data that has already finished must wait for earlier data that has not, and the processing performance of the GPU suffers.
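The cost of in-order notification can be illustrated with a small calculation. The sketch below is not taken from the patent: the function names and the finish times are hypothetical, and it assumes that an in-order scheme can report a command only after every earlier command has finished.

```python
from itertools import accumulate


def in_order_notify_times(finish_times):
    """Notification time per command when interrupts follow ringbuf read order.

    A command cannot be reported before all earlier commands have finished,
    so its notification time is the running maximum of the finish times.
    """
    return list(accumulate(finish_times, max))


def completion_order_notify_times(finish_times):
    """Notification time per command when each one interrupts as soon as it finishes."""
    return list(finish_times)


# Hypothetical finish times (arbitrary units) for commands read in order 0, 1, 2.
finish = [5, 1, 3]
in_order = in_order_notify_times(finish)
on_completion = completion_order_notify_times(finish)

# Extra time each command waits under the in-order scheme.
extra_wait = [a - b for a, b in zip(in_order, on_completion)]
```

With these assumed numbers, the second and third commands finish early but are held back by the first, which is the delay the invention aims to remove.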
Disclosure of Invention
The invention provides a method for processing data in real time by a GPU, comprising the following steps:
Step 1: adding a reference count to the data read from the ring buffer to mark whether the data has been processed;
Step 2: when the reference count reaches 0, automatically triggering an interrupt to the CPU and, at the same time, writing the address of the data into a completion register;
Step 3: the CPU receives the interrupt, reads the data address from the completion register, and fetches the data for subsequent processing.
As a further improvement of the invention, in Step 1 the initial value of the reference count is N, where N is the number of core processing units to which the data is distributed; each time a core processing unit finishes processing the data, the reference count is decremented by 1, so that the reference count is 0 once all core processing units have completed the data processing.
As a further improvement of the invention, in Step 1, 1 <= N <= max_core, where max_core is the maximum number of cores of the GPU.
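The countdown can be sketched in software as follows. This is only an illustration under stated assumptions: the patent describes a hardware mechanism, while the sketch models the cores as Python threads. The names RefCountedCommand, core_done and run_command, the lock, and the default max_core of 8 are all hypothetical.

```python
import threading


class RefCountedCommand:
    """One command read from the ring buffer, shared by N core processing units."""

    def __init__(self, address, num_cores, max_core=8):
        # max_core=8 is an assumed value; the patent says it is fixed by the GPU.
        if not 1 <= num_cores <= max_core:
            raise ValueError("num_cores must satisfy 1 <= N <= max_core")
        self.address = address
        self.refcount = num_cores  # initial value N
        self._lock = threading.Lock()

    def core_done(self):
        """Decrement the count; return True only for the call that reaches 0."""
        with self._lock:
            if self.refcount == 0:
                raise RuntimeError("more completions than cores")
            self.refcount -= 1
            return self.refcount == 0


def run_command(cmd, num_cores):
    """Simulate num_cores cores finishing; collect the addresses that would interrupt."""
    interrupts = []

    def core_work():
        # Only the core that drops the count to 0 triggers the interrupt.
        if cmd.core_done():
            interrupts.append(cmd.address)

    threads = [threading.Thread(target=core_work) for _ in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return interrupts


cmd = RefCountedCommand(address=0x1000, num_cores=4)
fired = run_command(cmd, num_cores=4)
```

Whatever order the simulated cores finish in, exactly one interrupt is recorded for the command, and it is recorded only after the last core has finished.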
As a further improvement of the invention, the method further comprises:
An entry setting step: a command address is stored in each entry of the ring buffer, a flag is set for each entry, and whether data is to be stored in the entry is decided according to the value of the flag.
As a further improvement of the invention, in the entry setting step, the initial value of the flag is true, which indicates that the entry is free and data can be stored; when data is stored in the entry, the flag is set to false, which indicates that the entry is occupied and the data in it must not be overwritten; after the GPU finishes processing the data and generates an interrupt, the flag is set back to true in the interrupt handler, which indicates that new data can be stored; each time data arrives, the state of the flag is checked first: true means the data can be stored, and false means the existing data has not yet been processed, so the new data must wait until that processing completes before it is stored.
The invention also provides a system for processing data by the GPU in real time, which comprises: a memory, a processor and a computer program stored on said memory, said computer program being configured to implement the steps of the method of the invention when called by said processor.
The present invention also provides a computer-readable storage medium characterized in that: the computer readable storage medium stores a computer program configured to implement the steps of the method of the present invention when called by a processor.
The beneficial effects of the invention are as follows: the reference count allows whichever data completes first to trigger an interrupt, and the completion register lets the CPU handle the completed data, so that data is processed in real time and the performance of the GPU is improved.
Drawings
Fig. 1 is a schematic diagram of the principles of the present invention.
Detailed Description
As shown in Fig. 1, the invention discloses a method for processing data in real time by a GPU, comprising the following steps:
Step 1: adding a reference count (refcount) to the data read from the ring buffer (ringbuf) to mark whether the data has been processed;
Step 2: when the reference count reaches 0, automatically triggering an interrupt to the CPU and, at the same time, writing the address of the data into a completion register (done_reg);
Step 3: the CPU receives the interrupt, reads the data address from the completion register, and fetches the data for subsequent processing. The CPU therefore does not need to fetch data from ringbuf in sequence, which avoids the possibility of mistakenly handling a not-yet-completed command when walking the buffer in order.
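The three steps can be modelled as follows. This is a minimal sketch, not the patented hardware: GpuModel, submit, core_finished and cpu_interrupt_handler are hypothetical names, and the interrupt is assumed to be handled synchronously, before any other command can overwrite done_reg.

```python
class GpuModel:
    """Toy model of the refcount, done_reg and interrupt path."""

    def __init__(self, interrupt_handler):
        self.done_reg = None   # completion register: address of the finished data
        self.refcounts = {}    # command address -> remaining core count
        self._interrupt_handler = interrupt_handler

    def submit(self, address, num_cores):
        # Step 1: attach a reference count to the data read from ringbuf.
        self.refcounts[address] = num_cores

    def core_finished(self, address):
        self.refcounts[address] -= 1
        if self.refcounts[address] == 0:
            # Step 2: write the address to done_reg and raise the interrupt.
            self.done_reg = address
            self._interrupt_handler(self)


handled = []


def cpu_interrupt_handler(gpu):
    # Step 3: the CPU reads the address from done_reg and processes that data.
    handled.append(gpu.done_reg)


gpu = GpuModel(cpu_interrupt_handler)
gpu.submit(0xA000, num_cores=2)  # read first from ringbuf
gpu.submit(0xB000, num_cores=1)  # read second
gpu.core_finished(0xA000)        # command A still has one core running
gpu.core_finished(0xB000)        # command B completes first and interrupts
gpu.core_finished(0xA000)        # command A completes and interrupts
```

In this run the CPU handles 0xB000 before 0xA000 even though 0xA000 was read first, which is the completion-order behaviour described in Step 3.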
In Step 1, the GPU generally distributes the data, according to its characteristics, to N core processing units (cores) for parallel processing, where 1 <= N <= max_core and max_core is the maximum number of cores of the GPU, determined by the GPU itself. The initial value of the reference count is therefore N. Each time a core processing unit finishes processing the data, the reference count is decremented by 1, so the reference count is 0 once all core processing units have completed the data processing.
If the CPU fetches data from ringbuf out of order for processing while the read pointer of ringbuf still has to be updated, there is a risk that data which has not been completed is overwritten. This problem can be solved in software. One simple approach is as follows. Since ringbuf is a static ring data structure and each entry of ringbuf stores a command address, a flag can be set for each entry. The initial value of the flag is true, which indicates that the entry is free and data can be stored. When data is stored in the entry, the flag is set to false, which indicates that the entry is occupied and the data in it must not be overwritten. After the GPU finishes processing the data and generates an interrupt, the flag is set back to true in the interrupt handler to indicate that new data can be stored. Each time data arrives, the state of the flag is checked first: true means the data can be stored, and false means the existing data has not yet been processed, so the new data must wait until that processing completes before it is stored.
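The per-entry flag scheme can be sketched as below. The class FlaggedRing and its methods try_push and release are hypothetical names, and the sketch assumes the producer reports a blocked write by returning None, leaving the retry policy to the caller.

```python
class FlaggedRing:
    """Fixed-size ring whose entries each carry a free/occupied flag."""

    def __init__(self, size):
        self.free = [True] * size       # flag per entry: True = free
        self.commands = [None] * size   # command address stored per entry
        self.write_index = 0

    def try_push(self, command_address):
        """Store a command address if the next entry is free; return its slot or None."""
        slot = self.write_index
        if not self.free[slot]:
            return None                 # occupied: must not be overwritten
        self.commands[slot] = command_address
        self.free[slot] = False
        self.write_index = (slot + 1) % len(self.free)
        return slot

    def release(self, slot):
        """Called from the interrupt handler once the entry's data is processed."""
        self.free[slot] = True


ring = FlaggedRing(size=2)
s0 = ring.try_push(0x10)
s1 = ring.try_push(0x20)
ring.release(s1)                # the second command finishes first
blocked = ring.try_push(0x30)   # write index wrapped to slot 0, still occupied
kept = ring.commands[0]         # the unfinished command is untouched
ring.release(s0)                # the first command finishes
s2 = ring.try_push(0x30)        # the slot is now free and can be reused
```

Because the check is made per entry and not against a single read pointer, an entry released out of order never exposes an older, unfinished entry to being overwritten.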
In the invention, the reference count allows whichever data completes first to trigger an interrupt, and the completion register lets the CPU handle the completed data, so that data is processed in real time and the performance of the GPU is improved.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (7)

1. A method for processing data in real time by a GPU, comprising the steps of:
Step 1: adding a reference count to the data read from the ring buffer to mark whether the data has been processed;
Step 2: when the reference count reaches 0, automatically triggering an interrupt to the CPU and, at the same time, writing the address of the data into a completion register;
Step 3: the CPU receives the interrupt, reads the data address from the completion register, and fetches the data for subsequent processing.
2. The method of claim 1, wherein in Step 1 the initial value of the reference count is N, N being the number of core processing units, and each time a core processing unit finishes processing the data the reference count is decremented by 1, so that the reference count is 0 once all core processing units have completed the data processing.
3. The method according to claim 2, wherein in Step 1, 1 <= N <= max_core, max_core being the maximum number of cores of the GPU.
4. A method according to any one of claims 1 to 3, further comprising:
An entry setting step: a command address is stored in each entry of the ring buffer, a flag is set for each entry, and whether data is to be stored in the entry is decided according to the value of the flag.
5. The method of claim 4, wherein in the entry setting step, the initial value of the flag is true, true indicating that the entry is free and data can be stored; when data is stored in the entry, the flag is set to false, false indicating that the entry is occupied and the data in the entry cannot be overwritten; after the GPU finishes processing the data and generates an interrupt, the flag is set to true in the interrupt handler, indicating that new data can be stored; each time data arrives, the state of the flag is checked first, true indicating that the data can be stored and false indicating that the existing data has not yet been processed, so that the new data is stored only after that processing completes.
6. A system for GPU real-time processing of data, comprising: a memory, a processor and a computer program stored on the memory, the computer program being configured to implement the steps of the method of any one of claims 1-5 when called by the processor.
7. A computer-readable storage medium, characterized by: the computer readable storage medium stores a computer program configured to implement the steps of the method of any of claims 1-5 when called by a processor.
CN202410426309.0A 2024-04-10 2024-04-10 Method, system and storage medium for processing data in real time by GPU Active CN118035163B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410426309.0A CN118035163B (en) 2024-04-10 2024-04-10 Method, system and storage medium for processing data in real time by GPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410426309.0A CN118035163B (en) 2024-04-10 2024-04-10 Method, system and storage medium for processing data in real time by GPU

Publications (2)

Publication Number Publication Date
CN118035163A (en) 2024-05-14
CN118035163B CN118035163B (en) 2024-08-06

Family

ID=90989507

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410426309.0A Active CN118035163B (en) 2024-04-10 2024-04-10 Method, system and storage medium for processing data in real time by GPU

Country Status (1)

Country Link
CN (1) CN118035163B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1845087A (en) * 2006-05-18 2006-10-11 北京中星微电子有限公司 Interrupt handling method and interrupt handling apparatus
CN110415161A (en) * 2019-07-19 2019-11-05 龙芯中科技术有限公司 Graphic processing method, device, equipment and storage medium
CN111880916A (en) * 2020-07-27 2020-11-03 长沙景嘉微电子股份有限公司 Multi-drawing task processing method, device, terminal, medium and host in GPU
CN113051082A (en) * 2021-03-02 2021-06-29 长沙景嘉微电子股份有限公司 Software and hardware data synchronization method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN118035163B (en) 2024-08-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant