CN112581585B - TLM device of GPU command processing module based on SysML view and operation method - Google Patents
- Publication number
- CN112581585B (application CN202011542813.5A)
- Authority
- CN
- China
- Prior art keywords
- socket
- command
- unit
- fifo
- cache unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4027—Coupling between buses using bus bridges
- G06F13/4031—Coupling between buses using bus bridges with arbitration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/45—Caching of specific data in cache memory
- G06F2212/452—Instruction code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2213/00—Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F2213/0026—PCI express
Abstract
The invention relates to the technical field of computer hardware modeling, and in particular to a TLM device of a GPU command processing module based on a SysML view and an operation method. The device comprises a preprocessing unit, a reduced instruction set processor, an access bus arbitration unit, a register unit, a vertex cache unit, a vertex index cache unit, an instruction cache unit and a data cache unit. The preprocessing unit implements pre-decoding and distribution of all OpenGL commands and processing of some OpenGL commands; the reduced instruction set processor implements, for all commands other than graphics drawing commands, graphics command parsing, parameter setting, graphics function code issuing, push and pop operations on graphics attributes and matrix parameters, and the matrix operations and parameter normalization of some functions; the access bus arbitration unit implements access to the AXI bus. The TLM device and operation method of the GPU command processing module based on the SysML view convert complex, tedious and ambiguity-prone text into clear graphical form and enable preliminary verification of the GPU command processing module architecture.
Description
Technical Field
The invention belongs to the technical field of computer hardware modeling, relates to a TLM device of a GPU command processing module, and particularly relates to a TLM device of a GPU command processing module based on a SysML view and an operation method.
Background
SysML is a systems modeling language developed by the International Council on Systems Engineering (INCOSE) and the Object Management Group (OMG) by reusing and extending a subset of UML 2.0. SysML defines new modeling elements by extending UML 2.0 with new properties and constraints.
SysML (Systems Modeling Language) is an object-oriented graphical modeling language. Compared with UML (Unified Modeling Language), it adds new elements that make it better suited to systems engineering modeling; compared with other languages such as the SCADE modeling language, SysML can model not only software but also hardware and the entire system. In the traditional chip design process, as design complexity grows, the design code and the large volume of documentation become increasingly difficult to master. SysML presents the whole design from multiple perspectives using various diagrams, so that a large and complex system can be shown clearly and quickly; it provides modeling and visualization support for software and hardware development, shortens the design cycle, accelerates development, and assists system design and verification.
By using the SysML graphical modeling language to design a transaction-level model of the GPU command processing module, the algorithms and architecture of the command processing module can be verified early in chip design, which greatly shortens the project development cycle.
Disclosure of Invention
In view of the problems described in the background, the invention provides a TLM device of a GPU command processing module based on a SysML view and an operation method, which convert complex, tedious and ambiguity-prone text into clear graphical form and enable preliminary verification of the GPU command processing module architecture.
The technical scheme of the invention is as follows: a TLM device of a GPU command processing module based on a SysML view, characterized in that: the device comprises a preprocessing unit, a reduced instruction set processor, an access bus arbitration unit, a register unit, a vertex cache unit, a vertex index cache unit, an instruction cache unit and a data cache unit, wherein the preprocessing unit implements pre-decoding and distribution of all OpenGL commands and processing of some OpenGL commands; the reduced instruction set processor implements, for all commands other than graphics drawing commands, graphics command parsing, parameter setting, graphics function code issuing, push and pop operations on graphics attributes and matrix parameters, and the matrix operations and parameter normalization of some functions; and the access bus arbitration unit implements access to the AXI bus;
the preprocessing unit is connected with the reduced instruction set processor, the access bus arbitration unit and the register unit respectively;
the reduced instruction set processor is connected with the instruction cache unit and the data cache unit;
the preprocessing unit is connected with the vertex cache unit and the vertex index cache unit;
the preprocessing unit further comprises three FIFO units: graph_cmd_fifo (the OpenGL command FIFO), new_list_fifo (the display list loading FIFO) and call_list_fifo (the display list calling FIFO);
the TLM device also comprises a plurality of processes, methods and sockets;
the processes and methods implement command preprocessing, RISC processing, interrupt reporting, display list FIFO management, DMA processing from the host to the command processor, parameter initialization from the state parameter management unit to the command processor module, exception handling, and register management from the state parameter management unit to the command processor module;
the socket is used for realizing the interconnection communication function with an external module;
the vertex cache unit and the vertex index cache unit provide cache for vertex array commands;
the instruction cache unit and the data cache unit provide caches of instructions and data for the reduced instruction set processor.
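The connections and FIFOs listed above can be summarized in a small structural sketch. This is only an illustration in Python, not the TLM implementation: the unit and FIFO names come from the text, while the dictionary layout, the snake_case unit identifiers, the `neighbours` helper and the unbounded FIFO depth are assumptions.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field

# Connections as stated in the text; representing them as a dictionary
# is an assumption made for illustration.
CONNECTIONS = {
    "preprocessing_unit": [
        "risc_processor",
        "access_bus_arbitration_unit",
        "register_unit",
        "vertex_cache_unit",
        "vertex_index_cache_unit",
    ],
    "risc_processor": ["instruction_cache_unit", "data_cache_unit"],
}


@dataclass
class PreprocessingUnit:
    """Holds the three FIFOs named in the text (depths are not given, so unbounded)."""

    graph_cmd_fifo: deque = field(default_factory=deque)  # OpenGL command FIFO
    new_list_fifo: deque = field(default_factory=deque)   # display list loading FIFO
    call_list_fifo: deque = field(default_factory=deque)  # display list calling FIFO


def neighbours(unit: str) -> list[str]:
    """Return the units that `unit` is connected to (hypothetical helper)."""
    return CONNECTIONS.get(unit, [])
```

For example, `neighbours("risc_processor")` lists the instruction and data cache units, matching the statement that those caches serve the reduced instruction set processor.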
As a preferred option: the above processes include a pre_process_cthread process, a cmd_process_core_cthread process, a report_interrupt_method process, a list_fifo_message_cthread process, a hiu2cmd_dma_target_cthread process, a sgu2cmd_parametric_target_cthread process and a sgu2cmd_graph_reg_target_cthread process;
the above methods include an outlide_acceptance_target_method method;
the sockets include cmd2sgu_draw_initial_socket, sgu_2cmd_resource_status_target_socket, cmd2hiu_cfg_dma2c0_initial_socket, cmd2hiu_cfg_dma_c2s0_initial_socket, cmd2hiu_cfg_dma2c1_initial_socket, cmd2hiu_cfg_dma2s1_initial_socket, cmd2sgu_graph_functional_initial_socket, sgu_funcode_selectjsocket, wid2_sgu_initial_socket, vc2l2cache_initiator_socket, vic2l2cache_initiator_socket, cmd2axi0_initiator_socket, sgu2cmd_graph_reg_target_socket, cmd2hiu_excursions_irq_socket, usa2cmd_excursions_status_socket, geu2cmd_excursions_status_socket, jsu2cmd_excursions_status_socket, sgu2cmd_paraminit_target_socket, sgu2cmd_function_packet_destination_socket and hiu2cmd_dma_socket.
The operation method of the TLM device of the GPU command processing module based on the SysML view comprises the following steps:
1) Creating the CM packet object
1.1) Check the state of cm_pkt_fifo: if cm_pkt_fifo is empty, set pre_process_busy to 0 and return to step 1); if cm_pkt_fifo is not empty, set pre_process_busy to 1;
1.2) Read the graphics command interface FIFO to obtain the command packet header, set the execution unit (immittUnit), create a CM packet object, and store the command packet information in a linked list;
the execution unit is encoded as: 0, execution by the preprocessing unit; 1, execution by the RISC (reduced instruction set) processor;
2) Display list command loading
2.1) If newlist_flag is equal to 1 and the current command is glNewList, the operation is invalid and the simulation exits; if newlist_flag is equal to 1 and the current command is not glNewList, write the current command directly into the display list FIFO;
2.2) Check the display list mode: if the mode is GL_COMPILE, return directly to step 1) and continue checking the cm_pkt_fifo state; if the mode is GL_COMPILE_AND_EXECUTE, proceed to step 3);
3) Command packet classification
3.1) If the command packet is a glArrayElement/glDrawElements/glLoadFirmWare/glNewList/glEndList/glCallList command, enter the special command processing flow;
3.2) If the command packet is a glVertex/glMaterial/glNormal/glColor/glSecondaryColor/glTexCoord/glMultiTexCoord/glFogCoord/glEdgeFlag/glVertexBackup0/glVertexBackup1/glVertexAttrib graphics drawing command, enter the drawing command processing flow, in which commands are sent to the SGU_GDU unit at a uniform width of 160 bits;
3.3) If the command packet is any command other than those in 3.1) and 3.2), send it to the reduced instruction set processor for processing;
3.4) Return to step 1) and continue reading OpenGL commands.
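The three-way classification in step 3) can be sketched as a simple lookup. The sketch below is illustrative only: it uses the standard OpenGL spellings of the command family names, matches on the family name rather than on typed variants such as glVertex3f, and the `classify` function and its return labels are hypothetical.

```python
# Command families routed to the special command processing flow (step 3.1).
SPECIAL_COMMANDS = {
    "glArrayElement", "glDrawElements", "glLoadFirmWare",
    "glNewList", "glEndList", "glCallList",
}

# Command families routed to the drawing command processing flow (step 3.2).
DRAW_COMMANDS = {
    "glVertex", "glMaterial", "glNormal", "glColor", "glSecondaryColor",
    "glTexCoord", "glMultiTexCoord", "glFogCoord", "glEdgeFlag",
    "glVertexBackup0", "glVertexBackup1", "glVertexAttrib",
}


def classify(command: str) -> str:
    """Return 'special', 'draw' or 'risc' for a command family name."""
    if command in SPECIAL_COMMANDS:
        return "special"
    if command in DRAW_COMMANDS:
        return "draw"
    # Step 3.3: everything else goes to the reduced instruction set processor.
    return "risc"
```

Under these assumptions, `classify("glCallList")` selects the special flow, `classify("glColor")` the drawing flow, and any other command, for example glEnable, falls through to the reduced instruction set processor.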
As a preferred option: the processing performed by the reduced instruction set processor in step 3.3) comprises the following steps:
3.3.1) Check the risc_enable signal:
when risc_enable is false, set risc_core_busy to 0 and return to step 3.3.1);
when risc_enable is true, set risc_core_busy to 1 and proceed to step 3.3.2);
3.3.2) Parse the received command and carry out striping treatment; configure the parameters carried by the command into the state parameter registers, configure and start PCIe_DMA according to the data address and size carried by the command, and issue the function code in the graphics command to the lower-level pipeline units of the 3D engine;
3.3.3) Return to step 3.3.1) and continue checking the risc_enable signal.
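The enable/busy handshake in steps 3.3.1) to 3.3.3) can be sketched as a single polling pass. The signal names risc_enable and risc_core_busy come from the text; the `RiscCore` class, its `poll` method and the `processed` list are hypothetical stand-ins, and the actual command handling of step 3.3.2) is reduced here to recording the command.

```python
class RiscCore:
    """Minimal model of the risc_enable / risc_core_busy handshake."""

    def __init__(self):
        self.risc_enable = False
        self.risc_core_busy = 0
        self.processed = []  # stands in for the work done in step 3.3.2

    def poll(self, command):
        """One pass through 3.3.1 to 3.3.3; returns True if the command was taken."""
        if not self.risc_enable:
            # 3.3.1, enable false: mark idle and keep polling.
            self.risc_core_busy = 0
            return False
        # 3.3.1, enable true: mark busy and process the command (3.3.2).
        self.risc_core_busy = 1
        self.processed.append(command)
        return True
```

A caller would invoke `poll` repeatedly, which corresponds to the return to step 3.3.1) in the text.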
As a preferred option: the GPU command processing module further performs interrupt processing; when an interrupt occurs, the steps are as follows:
step 1: read the error status registers glGetError_reg1 to glGetError_reg3;
step 2: perform a logical OR of the glGetError_reg1 to glGetError_reg3 register values and pass the result to cmd_acceptance_irq_socket;
step 3: end and exit.
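The interrupt reporting steps reduce to an OR over three register values. The sketch below assumes the OR is taken bit by bit and that the interrupt line is asserted when any bit is set; the text only says "logic OR", so both the bitwise reading and the `report_interrupt` function are assumptions.

```python
def report_interrupt(reg1: int, reg2: int, reg3: int):
    """Combine the three error status registers and derive an interrupt flag.

    Returns (status, irq): status is the bitwise OR of the register values,
    irq is True when at least one error bit is set.
    """
    status = reg1 | reg2 | reg3
    return status, status != 0
```

With all three registers at zero no interrupt is raised; with values 0x1, 0x0 and 0x4 the combined status is 0x5 and the interrupt is asserted.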
The invention has the advantages that:
The TLM device of the GPU command processing module based on the SysML view presents the overall functional design of the GPU command processing module from two perspectives, the internal block diagram and the activity diagrams of the command processing module. It converts complex, tedious and ambiguity-prone text into clear graphical form, enables preliminary verification of the GPU command processing module architecture, provides modeling and visualization support for software and hardware development, shortens the design cycle, accelerates development, and assists system design and verification.
Drawings
FIG. 1 is an internal block diagram of a GPU command processing module according to the present invention.
FIG. 2 is an activity diagram of the preprocessing process of the GPU command processing module according to the present invention.
FIG. 3 is an activity diagram of the RISC core process of the GPU command processing module according to the present invention.
FIG. 4 is an activity diagram of a GPU command processing module reporting interrupts in accordance with the present invention.
Detailed Description
The technical scheme of the invention is clearly and completely described below with reference to the accompanying drawings and the specific embodiments. It is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments, and that all other embodiments obtained by a person skilled in the art without making creative efforts based on the embodiments in the present invention are within the protection scope of the present invention.
Referring to FIGS. 1-4, a TLM device of a GPU command processing module based on a SysML view comprises a preprocessing unit, a reduced instruction set processor, an access bus arbitration unit, a register unit, a vertex cache unit, a vertex index cache unit, an instruction cache unit and a data cache unit; the preprocessing unit implements pre-decoding and distribution of all OpenGL commands and processing of some OpenGL commands; the reduced instruction set processor implements graphics command parsing, parameter setting, graphics function code issuing, push and pop operations on graphics attributes and matrix parameters, and the matrix operations and parameter normalization of some functions; the access bus arbitration unit implements access to the AXI bus;
the preprocessing unit is connected with the reduced instruction set processor, the access bus arbitration unit and the register unit respectively;
the reduced instruction set processor is connected with the instruction cache unit and the data cache unit;
the preprocessing unit is connected with the vertex cache unit and the vertex index cache unit;
the preprocessing unit further comprises three FIFO units: graph_cmd_fifo (the OpenGL command FIFO), new_list_fifo (the display list loading FIFO) and call_list_fifo (the display list calling FIFO);
the TLM device also comprises a plurality of processes, methods and sockets;
the processes and methods implement command preprocessing, RISC processing, interrupt reporting, display list FIFO management, DMA processing from the host to the command processor, parameter initialization from the state parameter management unit to the command processor module, exception handling, and register management from the state parameter management unit to the command processor module;
the socket is used for realizing the interconnection communication function with an external module;
the vertex cache unit and the vertex index cache unit provide cache for vertex array commands;
the instruction cache unit and the data cache unit provide caches of instructions and data for the reduced instruction set processor.
As a preferred option: the above processes include a pre_process_cthread process, a cmd_process_core_cthread process, a report_interrupt_method process, a list_fifo_message_cthread process, a hiu2cmd_dma_target_cthread process, a sgu2cmd_parametric_target_cthread process and a sgu2cmd_graph_reg_target_cthread process;
the above methods include an outlide_acceptance_target_method method;
the sockets include cmd2sgu_draw_initial_socket, sgu_2cmd_resource_status_target_socket, cmd2hiu_cfg_dma2c0_initial_socket, cmd2hiu_cfg_dma_c2s0_initial_socket, cmd2hiu_cfg_dma2c1_initial_socket, cmd2hiu_cfg_dma2s1_initial_socket, cmd2sgu_graph_functional_initial_socket, sgu_funcode_selectjsocket, wid2_sgu_initial_socket, vc2l2cache_initiator_socket, vic2l2cache_initiator_socket, cmd2axi0_initiator_socket, sgu2cmd_graph_reg_target_socket, cmd2hiu_excursions_irq_socket, usa2cmd_excursions_status_socket, geu2cmd_excursions_status_socket, jsu2cmd_excursions_status_socket, sgu2cmd_paraminit_target_socket, sgu2cmd_function_packet_destination_socket and hiu2cmd_dma_socket.
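The role of the sockets, connecting the command processing module to external modules, can be illustrated with a minimal initiator/target pair. The socket and process names suggest a SystemC TLM-2.0 style model, although the text does not say so explicitly; the Python classes below are only an analogue of that pattern. The name `b_transport`, the `InitiatorSocket` and `TargetSocket` classes, the payload dictionary and the `axi0_target_socket` peer are all assumptions; only cmd2axi0_initiator_socket is a name taken from the text.

```python
class TargetSocket:
    """Receives transactions and hands them to a callback owned by the target module."""

    def __init__(self, name, callback):
        self.name = name
        self._callback = callback

    def b_transport(self, payload):
        return self._callback(payload)


class InitiatorSocket:
    """Forwards transactions to whichever target socket it has been bound to."""

    def __init__(self, name):
        self.name = name
        self._target = None

    def bind(self, target):
        self._target = target

    def b_transport(self, payload):
        if self._target is None:
            raise RuntimeError(f"{self.name} is not bound")
        return self._target.b_transport(payload)


# Hypothetical external module: a memory reached over the AXI bus.
memory = {}


def axi_write(payload):
    memory[payload["address"]] = payload["data"]
    return "OK"


cmd2axi0_initiator_socket = InitiatorSocket("cmd2axi0_initiator_socket")
axi0_target_socket = TargetSocket("axi0_target_socket", axi_write)  # hypothetical peer
cmd2axi0_initiator_socket.bind(axi0_target_socket)
status = cmd2axi0_initiator_socket.b_transport({"address": 0x1000, "data": 0xAB})
```

The point of the pattern is that the command processing module only sees its own initiator socket; which external module sits behind it is decided at bind time.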
A method of operating the TLM device of the GPU command processing module based on the SysML view comprises the following steps:
1) Creating the CM packet object
1.1) Check the state of cm_pkt_fifo: if cm_pkt_fifo is empty, set pre_process_busy to 0 and return to step 1); if cm_pkt_fifo is not empty, set pre_process_busy to 1;
1.2) Read the graphics command interface FIFO to obtain the command packet header, set the execution unit (immittUnit), create a CM packet object, and store the command packet information in a linked list;
the execution unit is encoded as: 0, execution by the preprocessing unit; 1, execution by the RISC (reduced instruction set) processor;
2) Display list command loading
2.1) If newlist_flag is equal to 1 and the current command is glNewList, the operation is invalid and the simulation exits; if newlist_flag is equal to 1 and the current command is not glNewList, write the current command directly into the display list FIFO;
2.2) Check the display list mode: if the mode is GL_COMPILE, return directly to step 1) and continue checking the cm_pkt_fifo state; if the mode is GL_COMPILE_AND_EXECUTE, proceed to step 3);
3) Command packet classification
3.1) If the command packet is a glArrayElement/glDrawElements/glLoadFirmWare/glNewList/glEndList/glCallList command, enter the special command processing flow;
3.2) If the command packet is a glVertex/glMaterial/glNormal/glColor/glSecondaryColor/glTexCoord/glMultiTexCoord/glFogCoord/glEdgeFlag/glVertexBackup0/glVertexBackup1/glVertexAttrib graphics drawing command, enter the drawing command processing flow, in which commands are sent to the SGU_GDU unit at a uniform width of 160 bits;
3.3) If the command packet is any command other than those in 3.1) and 3.2), send it to the reduced instruction set processor for processing;
3.4) Return to step 1) and continue reading OpenGL commands.
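Steps 1) and 2) of this flow, the busy flag and display list loading, can be sketched as one pass of a polling routine. The names cm_pkt_fifo, pre_process_busy and newlist_flag come from the text; the `Preprocessor` class, the `SimulationExit` exception, the plain list standing in for the linked list, and the behaviour when newlist_flag is 0 (the command simply continues to step 3) are assumptions, since the text does not spell them out.

```python
from collections import deque


class SimulationExit(Exception):
    """Raised where the text says the simulation is exited (hypothetical)."""


class Preprocessor:
    def __init__(self):
        self.cm_pkt_fifo = deque()
        self.display_list_fifo = deque()
        self.pre_process_busy = 0
        self.newlist_flag = 0
        self.list_mode = "GL_COMPILE_AND_EXECUTE"
        self.packets = []  # stands in for the linked list of CM packet objects

    def step(self):
        """Run steps 1) and 2) once; return the command to classify in step 3), or None."""
        if not self.cm_pkt_fifo:                 # 1.1: FIFO empty, stay idle
            self.pre_process_busy = 0
            return None
        self.pre_process_busy = 1
        command = self.cm_pkt_fifo.popleft()     # 1.2: read the packet
        self.packets.append(command)
        if self.newlist_flag == 1:               # 2.1: a display list is being loaded
            if command == "glNewList":
                raise SimulationExit("glNewList while a display list is already open")
            self.display_list_fifo.append(command)
            if self.list_mode == "GL_COMPILE":   # 2.2: compile only, do not execute
                return None
        return command
```

In GL_COMPILE mode a loaded command is recorded in the display list FIFO and not forwarded, while in GL_COMPILE_AND_EXECUTE mode it is both recorded and passed on to classification.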
As a preferred option: the processing performed by the reduced instruction set processor in step 3.3) comprises the following steps:
3.3.1) Check the risc_enable signal:
when risc_enable is false, set risc_core_busy to 0 and return to step 3.3.1);
when risc_enable is true, set risc_core_busy to 1 and proceed to step 3.3.2);
3.3.2) Parse the received command and carry out striping treatment; configure the parameters carried by the command into the state parameter registers, configure and start PCIe_DMA according to the data address and size carried by the command, and issue the function code in the graphics command to the lower-level pipeline units of the 3D engine;
3.3.3) Return to step 3.3.1) and continue checking the risc_enable signal.
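The three actions of step 3.3.2), writing parameters to the state registers, starting a DMA transfer and issuing the function code, can be sketched as follows. All type and field names here (`RiscCommand`, `EngineState`, `process_command` and their members) are hypothetical; the text does not define the command layout, and the DMA and pipeline are modelled simply as lists that record what was requested.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RiscCommand:
    """A parsed command; every field is optional because not all commands carry all parts."""

    params: Dict[str, int] = field(default_factory=dict)
    data_address: Optional[int] = None
    data_size: int = 0
    function_code: Optional[int] = None


@dataclass
class EngineState:
    state_param_regs: Dict[str, int] = field(default_factory=dict)
    dma_transfers: List[Tuple[int, int]] = field(default_factory=list)
    issued_function_codes: List[int] = field(default_factory=list)


def process_command(cmd: RiscCommand, state: EngineState) -> None:
    # Configure the parameters carried by the command into the state registers.
    state.state_param_regs.update(cmd.params)
    # Configure and start a DMA transfer when the command carries an address and size.
    if cmd.data_address is not None and cmd.data_size > 0:
        state.dma_transfers.append((cmd.data_address, cmd.data_size))
    # Issue the function code to the downstream pipeline units.
    if cmd.function_code is not None:
        state.issued_function_codes.append(cmd.function_code)
```

Keeping the three effects in separate containers makes it easy to check, in a testbench, which commands touched registers only and which also triggered DMA or pipeline activity.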
As a preferred option: the GPU command processing module further performs interrupt processing; when an interrupt occurs, the steps are as follows:
step 1: read the error status registers glGetError_reg1 to glGetError_reg3;
step 2: perform a logical OR of the glGetError_reg1 to glGetError_reg3 register values and pass the result to cmd_acceptance_irq_socket;
step 3: end and exit.
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (5)
1. A TLM device of a GPU command processing module based on a SysML view, characterized in that: the device comprises a preprocessing unit, a reduced instruction set processor, an access bus arbitration unit, a register unit, a vertex cache unit, a vertex index cache unit, an instruction cache unit and a data cache unit, wherein the preprocessing unit implements pre-decoding and distribution of all OpenGL commands and processing of some OpenGL commands; the reduced instruction set processor implements, for all commands other than graphics drawing commands, graphics command parsing, parameter setting, graphics function code issuing, push and pop operations on graphics attributes and matrix parameters, and the matrix operations and parameter normalization of some functions; and the access bus arbitration unit implements access to the AXI bus;
the preprocessing unit is connected with the reduced instruction set processor, the access bus arbitration unit and the register unit respectively;
the reduced instruction set processor is connected with the instruction cache unit and the data cache unit;
the preprocessing unit is connected with the vertex cache unit and the vertex index cache unit;
the preprocessing unit further comprises three FIFO units: graph_cmd_fifo (the OpenGL command FIFO), new_list_fifo (the display list loading FIFO) and call_list_fifo (the display list calling FIFO);
the TLM device also comprises a plurality of processes, methods and sockets;
the processes and methods implement command preprocessing, RISC processing, interrupt reporting, display list FIFO management, DMA processing from the host to the command processor, parameter initialization from the state parameter management unit to the command processor module, exception handling, and register management from the state parameter management unit to the command processor module;
the socket is used for realizing the interconnection communication function with an external module;
the vertex cache unit and the vertex index cache unit provide cache for vertex array commands;
the instruction cache unit and the data cache unit provide caches of instructions and data for the reduced instruction set processor.
2. The TLM device of the GPU command processing module based on the SysML view of claim 1, wherein: the processes include a pre_process_cthread process, a cmd_process_core_cthread process, a report_interrupt_method process, a list_fifo_message_cthread process, a hiu2cmd_dma_target_cthread process, a sgu2cmd_parametric_target_cthread process and a sgu2cmd_graph_reg_target_cthread process;
the methods include an outlide_acceptance_target_method method;
the sockets include cmd2sgu_draw_initial_socket, sgu_2cmd_resource_status_target_socket, cmd2hiu_cfg_dma2c0_initial_socket, cmd2hiu_cfg_dma_c2s0_initial_socket, cmd2hiu_cfg_dma2c1_initial_socket, cmd2hiu_cfg_dma2s1_initial_socket, cmd2sgu_graph_functional_initial_socket, sgu_Funcode_selectjsocket, a cmd2_plug socket, wid2_sgu_initial_socket, vc2l2cache_initiator_socket, vic2l2cache_initiator_socket, cmd2axi0_initiator_socket, sgu2cmd_graph_reg_target_socket, cmd2hiu_excursions_irq_socket, usa2cmd_excursions_status_socket, geu2cmd_excursions_status_socket, jsu2cmd_excursions_status_socket, sgu2cmd_paraminit_target_socket, sgu2cmd_function_packet_destination_socket and hiu2cmd_dma_socket.
3. A method of operating a TLM device based on a GPU command processing module for a sysplex view as defined in claim 1, wherein: the method comprises the following operation steps:
1) Creating CM package objects
1.1 Judging the cm_pkt_fifo state, if the cm_pkt_fifo state is empty, setting pre_process_busy to 0 and returning to the step 1); if cm_pkt_fifo is not empty, setting pre_process_busy to 1;
1.2 A read graphics command interface FIFO obtains a command packet header, sets an immittUnit, creates a CM packet object and stores command packet information into a linked list;
the execution unit is as follows: 0-a preprocessing unit; 1-a reduced instruction set processor execution;
2) Display list command loading
2.1 If newlist_flag is equal to 1 and the current command is glNewList, the operation is invalid, and the simulation is exited; if newlist_flag is equal to 1 and the current command is not glNewList, writing the current command into the display list FIFO directly;
2.2 Judging a display list mode, if the mode is COMPILE, directly returning to the step 1), and continuously judging the cm_pkt_fifo state; if the mode is GL_COMPILE_AND_EXECUTE, entering step 3);
3) Classifying command packets
3.1 If the command packet is a glArrayElement/glDrawElements/glLoadFirmWare/glNewList/glEndList/glCallList command, entering the special command processing flow;
3.2 If the command packet is a glVertex/glMaterial/glNormal/glColor/glSecondaryColor/glTexCoord/glMultiTexCoord/glFogCoord/glEdgeFlag/glVertex backup0/glVertex backup1/glVertex trie graphics drawing command, entering the drawing command processing flow and transmitting the command to the SGU_GDU unit in a uniform 160-bit-wide format;
3.3 If the command packet is any command other than those listed in steps 3.1 and 3.2, sending the command packet to the reduced instruction set processor for processing;
3.4 Returning to step 1) to continue reading OpenGL commands.
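The preprocessing flow of steps 1) to 3) can be sketched in plain C++ as below. This is a minimal illustration under stated assumptions, not the patented SystemC model: the `Route` and `ListMode` enums, the `Preprocessor` struct, and the `classify` helper are hypothetical names, commands are represented by their family name only, and the three garbled vertex-attribute names in step 3.2 are omitted. The sketch also assumes that a command goes straight to classification when newlist_flag is 0, which the claim does not state explicitly.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>

// Hypothetical routing results; these names are illustrative, not from the claims.
enum class Route { Idle, InvalidExit, RecordedOnly, Special, Draw, Risc };
enum class ListMode { None, Compile, CompileAndExecute };

// Step 3: classify a command by its OpenGL family name.
inline Route classify(const std::string& cmd) {
    static const std::set<std::string> special = {
        "glArrayElement", "glDrawElements", "glLoadFirmWare",
        "glNewList", "glEndList", "glCallList"};
    static const std::set<std::string> draw = {
        "glVertex", "glMaterial", "glNormal", "glColor", "glSecondaryColor",
        "glTexCoord", "glMultiTexCoord", "glFogCoord", "glEdgeFlag"};
    if (special.count(cmd)) return Route::Special;  // step 3.1
    if (draw.count(cmd)) return Route::Draw;        // step 3.2 (160-bit path to SGU_GDU)
    return Route::Risc;                             // step 3.3
}

struct Preprocessor {
    std::deque<std::string> cm_pkt_fifo;        // graphics command interface FIFO
    std::deque<std::string> display_list_fifo;  // display list FIFO
    bool pre_process_busy = false;
    bool newlist_flag = false;                  // set while a display list is open
    ListMode mode = ListMode::None;

    // One pass through steps 1) to 3); the caller loops, which models step 3.4.
    Route step() {
        if (cm_pkt_fifo.empty()) {              // step 1.1: nothing to read
            pre_process_busy = false;
            return Route::Idle;
        }
        pre_process_busy = true;
        const std::string cmd = cm_pkt_fifo.front();  // step 1.2, header only
        cm_pkt_fifo.pop_front();

        if (newlist_flag) {                     // step 2.1
            if (cmd == "glNewList") return Route::InvalidExit;  // nested list is invalid
            assert(mode != ListMode::None);     // assumption: an open list always has a mode
            display_list_fifo.push_back(cmd);
            if (mode == ListMode::Compile) return Route::RecordedOnly;  // step 2.2
        }
        return classify(cmd);                   // step 3
    }
};
```

Calling `step()` repeatedly drains the FIFO one command at a time. In compile-only mode a command is recorded but never classified, while in compile-and-execute mode it is both recorded and routed.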
4. A method of operating the TLM device of a GPU command processing module based on a SysML view as defined in claim 3, wherein the processing performed by the reduced instruction set processor in step 3.3 comprises the following steps:
3.3.1 Detecting the risc_enable signal:
when the risc_enable signal is false, setting risc_core_busy to 0 and returning to step 3.3.1;
when the risc_enable signal is true, setting risc_core_busy to 1 and proceeding to step 3.3.2;
3.3.2 Parsing the received command and splitting it; writing the parameters carried by the command to the state parameter registers, configuring and starting PCIe_DMA according to the data address and size carried by the command, and issuing the function code in the graphics command to the lower-level pipeline units of the 3D engine;
3.3.3 Returning to step 3.3.1 to continue detecting the risc_enable signal.
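One pass through steps 3.3.1 and 3.3.2 might look like the following sketch. It is an illustration under assumptions rather than the claimed implementation: the `RiscCommand` layout, the `RiscCore` struct, the 16-entry register file, and the `issued_func_codes` vector standing in for the lower-level pipeline are all hypothetical. Only `risc_enable` and `risc_core_busy` are names taken from the claim.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical command layout; field names are illustrative, not from the claims.
struct RiscCommand {
    std::uint32_t func_code;            // function code for the downstream 3D pipeline
    std::vector<std::uint32_t> params;  // values for the state parameter registers
    std::uint64_t data_addr;            // source address for the DMA transfer
    std::uint32_t data_size;            // transfer size in bytes; 0 means no transfer
};

constexpr std::size_t kNumStateRegs = 16;  // assumed register file size

struct RiscCore {
    bool risc_enable = false;
    bool risc_core_busy = false;
    std::array<std::uint32_t, kNumStateRegs> state_regs{};
    std::uint64_t dma_addr = 0;
    std::uint32_t dma_size = 0;
    bool dma_started = false;
    std::vector<std::uint32_t> issued_func_codes;  // stands in for the lower-level pipeline

    // One pass through steps 3.3.1 and 3.3.2; returns true if the command was processed.
    // The caller invokes poll() again, which models step 3.3.3.
    bool poll(const RiscCommand& cmd) {
        if (!risc_enable) {             // step 3.3.1, enable is false
            risc_core_busy = false;
            return false;
        }
        risc_core_busy = true;          // step 3.3.1, enable is true

        // Step 3.3.2: write the command parameters to the state registers.
        assert(cmd.params.size() <= kNumStateRegs);
        for (std::size_t i = 0; i < cmd.params.size(); ++i) {
            state_regs[i] = cmd.params[i];
        }
        // Configure and start the DMA only when the command carries data.
        if (cmd.data_size > 0) {
            dma_addr = cmd.data_addr;
            dma_size = cmd.data_size;
            dma_started = true;
        }
        // Issue the function code to the downstream pipeline.
        issued_func_codes.push_back(cmd.func_code);
        return true;
    }
};
```

Returning a flag from `poll` keeps the polling loop in the caller, so the same function serves both the idle case and the processing case.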
5. A method of operating the TLM device of a GPU command processing module based on a SysML view as defined in claim 4, wherein the GPU command processing module also performs interrupt processing, and when an interrupt occurs the steps are as follows:
Step 1: reading the error status registers glGetError_reg1 to glGetError_reg3;
Step 2: performing a logical OR of the glGetError_reg1 to glGetError_reg3 register values and writing the result to cmd_acceptance_irq_socket;
Step 3: exiting.
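The interrupt steps reduce to combining three register values and forwarding the result. The sketch below assumes that the "logic OR" is a bitwise OR over 32-bit registers, which the claim does not specify. `IrqSocket`, `combine_error_registers`, and `on_interrupt` are hypothetical names, with `IrqSocket` standing in for the interrupt socket rather than a real TLM socket.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the interrupt socket; it only records the last value written.
struct IrqSocket {
    std::uint32_t value = 0;
    bool written = false;
    void write(std::uint32_t v) {
        value = v;
        written = true;
    }
};

// Step 2, first half: OR the three error status registers together.
// Assumption: a bitwise OR, so each error bit survives in the combined word.
inline std::uint32_t combine_error_registers(const std::array<std::uint32_t, 3>& regs) {
    std::uint32_t result = 0;
    for (std::uint32_t r : regs) {
        result |= r;
    }
    return result;
}

// Steps 1 to 3: read the registers, forward the combined value, then return.
inline void on_interrupt(const std::array<std::uint32_t, 3>& error_regs, IrqSocket& irq) {
    irq.write(combine_error_registers(error_regs));
}

// Usage: two registers report errors, so the combined word carries both bits.
inline void example() {
    const std::array<std::uint32_t, 3> regs = {{0x1, 0x0, 0x4}};
    IrqSocket irq;
    on_interrupt(regs, irq);
    assert(irq.value == 0x5);
}
```

A bitwise OR lets the receiver tell which error bits were set. A boolean OR would only signal that some error occurred.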
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011542813.5A CN112581585B (en) | 2020-12-24 | 2020-12-24 | TLM device of GPU command processing module based on SysML view and operation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112581585A (en) | 2021-03-30 |
CN112581585B (en) | 2024-02-27 |
Family
ID=75139158
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202011542813.5A | Active | CN112581585B (en) | 2020-12-24 | 2020-12-24 | TLM device of GPU command processing module based on SysML view and operation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112581585B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002132489A (en) * | 2000-08-23 | 2002-05-10 | Nintendo Co Ltd | Graphics system |
CN103838669A (en) * | 2012-11-26 | 2014-06-04 | 辉达公司 | System, method, and computer program product for debugging graphics programs locally |
WO2015134941A1 (en) * | 2014-03-06 | 2015-09-11 | Graphite Systems, Inc. | Multiprocessor system with independent direct access to bulk solid state memory resources |
CN111028132A (en) * | 2019-11-21 | 2020-04-17 | 中国航空工业集团公司西安航空计算技术研究所 | SystemC-based GPU command processor unit hardware TLM microstructure |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7397797B2 (en) * | 2002-12-13 | 2008-07-08 | Nvidia Corporation | Method and apparatus for performing network processing functions |
US9569612B2 (en) * | 2013-03-14 | 2017-02-14 | Daniel Shawcross Wilkerson | Hard object: lightweight hardware enforcement of encapsulation, unforgeability, and transactionality |
US10970238B2 (en) * | 2019-04-19 | 2021-04-06 | Intel Corporation | Non-posted write transactions for a computer bus |
Non-Patent Citations (2)
Title |
---|
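Research on a design method for an OpenGL-based GPU command processor; Liu Hui; Tian Ze; Zhang Jun; Ma Chengcheng; Aeronautical Computing Technique (航空计算技术); 2020-05-25 (03); full text * |
SystemC-based hardware TLM modeling of a GPU parameter allocation unit; Jiang Liyun; Tian Ze; Wu Xiaocheng; Zhang Jun; Information & Communications (信息通信); 2020-02-15 (02); full text * |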
Also Published As
Publication number | Publication date |
---|---|
CN112581585A (en) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8884906B2 (en) | Offloading touch processing to a graphics processor | |
EP1835396A2 (en) | Address space emulation | |
WO2021244194A1 (en) | Register reading/writing method, chip, subsystem, register group, and terminal | |
CN111045964B (en) | PCIE interface-based high-speed transmission method, storage medium and terminal | |
JP7096213B2 (en) | Calculation method applied to artificial intelligence chip and artificial intelligence chip | |
CN111400986B (en) | Integrated circuit computing equipment and computing processing system | |
WO2020191549A1 (en) | Soc chip, method for determination of hotspot function and terminal device | |
CN110825435A (en) | Method and apparatus for processing data | |
CN109727186B (en) | SystemC-based GPU (graphics processing Unit) fragment coloring task scheduling method | |
JPS62226257A (en) | Arithmetic processor | |
KR900004291B1 (en) | A method and apparatus for coordinating execution of an instruction by a processor | |
EP3872629B1 (en) | Method and apparatus for executing instructions, device, and computer readable storage medium | |
CN112581585B (en) | TLM device of GPU command processing module based on SysML view and operation method | |
CN109840878A (en) | A SystemC-based GPU-oriented parameter management method | |
CN111028128A (en) | GPU (graphics processing Unit) -oriented vertex output control method and unit based on SystemC | |
CN109710398B (en) | GPU (graphics processing Unit) -oriented vertex coloring task scheduling method based on UML (unified modeling language) | |
US11392406B1 (en) | Alternative interrupt reporting channels for microcontroller access devices | |
US10019390B2 (en) | Using memory cache for a race free interrupt scheme without the use of “read clear” registers | |
CN111045665B (en) | UML-based GPU command processor | |
CN114327975A (en) | System on chip | |
CN111028132B (en) | SystemC-based GPU command processor unit hardware TLM microstructure | |
KR20230091861A (en) | High-throughput circuit architecture for hardware acceleration | |
CN112559139B (en) | SystemC-based multi-GPU transaction-level model device and operation method | |
CN112634422B (en) | TLM device of GPU output control module based on SysML view and operation method | |
US11829736B2 (en) | Method of optimizing register memory allocation for vector instructions and a system thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||