CN106709861B - Shader driver static reconstruction method - Google Patents

Shader driver static reconstruction method

Info

Publication number
CN106709861B
CN106709861B CN201611140690.6A CN201611140690A CN106709861B CN 106709861 B CN106709861 B CN 106709861B CN 201611140690 A CN201611140690 A CN 201611140690A CN 106709861 B CN106709861 B CN 106709861B
Authority
CN
China
Prior art keywords
module
optimization
atomic
shader
program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611140690.6A
Other languages
Chinese (zh)
Other versions
CN106709861A (en)
Inventor
田泽
马城城
刘晖
张琛
黎小玉
聂曌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-12-12
Filing date
2016-12-12
Publication date
2020-08-11

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention belongs to the field of computer graphics and particularly relates to a shader driver static reconstruction method, which comprises the following steps: the driver atomic segment division module (1) divides the driver code into atomic segments and sends them to the atomic segment program reconstruction module (3); the atomic segment program reconstruction module (3) extracts the corresponding atomic segments according to the configured function parameters, reconstructs them to generate the required software code, and sends the software code to the instruction optimization module (4); the instruction optimization module (4) optimizes the software code received from the atomic segment program reconstruction module (3) and then sends the driver to the machine code generation module (5); the machine code generation module (5) generates the corresponding machine code. By dividing the shader driver code into atomic segments and statically reconstructing the relevant code according to the requirements of the different scenarios configured by the user, the method eliminates unused code, optimizes the shader driver, and improves the runtime performance of the shader.

Description

Shader driver static reconstruction method
Technical Field
The invention belongs to the field of computer graphics and particularly relates to a shader driver static reconstruction method.
Background
The shader driver is a core part of the graphics processor, and its running efficiency directly determines the performance of the graphics processor. Most existing graphics processors are implemented as large-scale programmable shader arrays, and their drivers have not undergone module division or related optimization work, so the shader driver is complex and redundant and becomes a bottleneck for improving graphics processor performance.
Disclosure of Invention
The purpose of the invention is:
The invention provides a shader driver static reconstruction method that optimizes the shader driver and thereby improves the performance of the graphics processor.
The solution of the invention is:
A shader driver static reconstruction method, comprising:
step 1, the driver atomic segment division module (1) divides the driver code into the most basic atomic segments and sends the generated atomic segments to the atomic segment program reconstruction module (3);
step 2, in the user function configuration module (2), the user statically specifies the function parameters to be used;
step 3, the atomic segment program reconstruction module (3) extracts the corresponding atomic segments produced by the driver atomic segment division module (1) according to the function parameters configured in the user function configuration module (2), reconstructs them to generate the required software code, and sends the software code to the instruction optimization module (4);
step 4, the instruction optimization module (4) receives the software code sent by the atomic segment program reconstruction module (3), performs data hazard optimization and structural hazard optimization, and sends the optimized driver to the machine code generation module (5);
step 5, the machine code generation module (5) receives the driver from the instruction optimization module (4) and generates the corresponding machine code.
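Steps 1 to 3 can be illustrated with a minimal Python sketch. The patent does not specify how atomic segments or function parameters are represented, so the `AtomicSegment` record, the feature names (`core`, `fog`, `lighting`), and the instruction strings below are all hypothetical and serve only to show the idea of keeping the segments the user has configured and dropping the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicSegment:
    name: str      # hypothetical segment identifier
    feature: str   # hypothetical function parameter that requires this segment
    code: tuple    # instructions, kept as opaque strings in this sketch


def divide_driver(driver):
    """Step 1: turn (name, feature, instructions) triples into atomic segments."""
    return [AtomicSegment(name, feature, tuple(code)) for name, feature, code in driver]


def reconstruct(segments, enabled_features):
    """Step 3: keep only the segments whose feature was enabled in step 2,
    preserving the original segment order."""
    program = []
    for seg in segments:
        if seg.feature in enabled_features:
            program.extend(seg.code)
    return program


# Hypothetical driver, already annotated with the feature each part serves.
DRIVER = [
    ("setup", "core", ["load r0, ctx", "mov r1, r0"]),
    ("fog", "fog", ["load r2, fog_params", "mul r3, r2, r1"]),
    ("lighting", "lighting", ["load r4, light0", "add r5, r4, r1"]),
    ("emit", "core", ["store out, r1"]),
]

segments = divide_driver(DRIVER)
# Step 2: the user statically enables only the features the scene needs.
program = reconstruct(segments, enabled_features={"core", "lighting"})
```

With this configuration the fog segment never reaches the later stages, which is the sense in which static reconstruction removes unused code before optimization and machine code generation.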
In step 4, the data hazard optimization performed by the instruction optimization module (4) means that adjacent instructions are free of write-after-write, read-after-write, and write-after-read dependencies, so that under a multi-issue mechanism they can be executed at the same time.
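The three conditions named here (write after write, read after write, write after read) can be checked from the register sets of two neighbouring instructions. The following is a minimal sketch under assumed conventions: the `Instr` record and the register names are hypothetical, and the patent does not describe how the instruction optimization module performs this check.

```python
from typing import FrozenSet, NamedTuple


class Instr(NamedTuple):
    op: str
    writes: FrozenSet[str]  # registers written by the instruction
    reads: FrozenSet[str]   # registers read by the instruction


def hazards(first, second):
    """Return the data hazards that `second` has on the earlier `first`."""
    found = set()
    if first.writes & second.reads:
        found.add("RAW")  # read after write
    if first.reads & second.writes:
        found.add("WAR")  # write after read
    if first.writes & second.writes:
        found.add("WAW")  # write after write
    return found


def can_issue_together(first, second):
    """Two adjacent instructions may share an issue slot only if hazard-free."""
    return not hazards(first, second)


add = Instr("add r1, r2, r3", frozenset({"r1"}), frozenset({"r2", "r3"}))
mul = Instr("mul r4, r1, r5", frozenset({"r4"}), frozenset({"r1", "r5"}))
sub = Instr("sub r6, r7, r8", frozenset({"r6"}), frozenset({"r7", "r8"}))
mov = Instr("mov r2, r9", frozenset({"r2"}), frozenset({"r9"}))
```

Here `mul` reads `r1`, which `add` writes, so the pair cannot be issued in the same cycle, whereas `add` and `sub` touch disjoint registers and can.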
In step 4, the structural hazard optimization performed by the instruction optimization module (4) means that multiple adjacent instructions can be executed at the same time in different arithmetic units.
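The effect of placing neighbouring instructions on different arithmetic units can be sketched as follows. The unit names (`alu`, `mul`), the one-unit-of-each-kind machine, and the greedy in-order packing are assumptions made for illustration; the sketch deliberately ignores data dependencies and is not taken from the patent.

```python
def pack_bundles(instrs, units):
    """Greedy in-order packing of (op, unit) pairs into issue bundles.

    `units` maps a unit name to how many copies of it exist. A new bundle
    is started as soon as the next instruction needs a unit that is
    already fully occupied in the current bundle.
    """
    bundles, current, used = [], [], {}
    for op, unit in instrs:
        if used.get(unit, 0) >= units[unit]:
            bundles.append(current)
            current, used = [], {}
        current.append(op)
        used[unit] = used.get(unit, 0) + 1
    if current:
        bundles.append(current)
    return bundles


UNITS = {"alu": 1, "mul": 1}  # hypothetical machine: one ALU, one multiplier

# Two ALU operations in a row compete for the same unit.
original = [("add", "alu"), ("sub", "alu"), ("mul_a", "mul"), ("mul_b", "mul")]
# The same operations, interleaved so that neighbours use different units.
interleaved = [("add", "alu"), ("mul_a", "mul"), ("sub", "alu"), ("mul_b", "mul")]

original_bundles = pack_bundles(original, UNITS)
interleaved_bundles = pack_bundles(interleaved, UNITS)
```

In this toy case the original order needs three bundles and the interleaved order needs two, which illustrates why ordering adjacent instructions onto different units can shorten execution.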
The advantage of the invention is that the shader driver static reconstruction method divides the shader driver into atomic segments and extracts the corresponding atomic segments according to the function parameters statically configured by the user to complete the static reconstruction, which eliminates redundant code, optimizes the shader driver, and improves the performance of the graphics processor.
Drawings
FIG. 1 is a block diagram of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The technical solution of the present invention is further described in detail with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, a shader driver static reconstruction method according to an embodiment of the present invention includes:
step 1, the driver atomic segment division module (1) divides the driver code into the most basic atomic segments and sends the generated atomic segments to the atomic segment program reconstruction module (3);
step 2, in the user function configuration module (2), the user statically specifies the function parameters to be used;
step 3, the atomic segment program reconstruction module (3) extracts the corresponding atomic segments produced by the driver atomic segment division module (1) according to the function parameters configured in the user function configuration module (2), reconstructs them to generate the required software code, and sends the software code to the instruction optimization module (4);
step 4, the instruction optimization module (4) receives the software code sent by the atomic segment program reconstruction module (3), performs data hazard optimization and structural hazard optimization, and sends the optimized driver to the machine code generation module (5);
step 5, the machine code generation module (5) receives the driver from the instruction optimization module (4) and generates the corresponding machine code.
In step 4, the data hazard optimization performed by the instruction optimization module (4) means that adjacent instructions are free of write-after-write, read-after-write, and write-after-read dependencies, so that under a multi-issue mechanism they can be executed at the same time.
In step 4, the structural hazard optimization performed by the instruction optimization module (4) means that multiple adjacent instructions can be executed at the same time in different arithmetic units.
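One way to picture step 4 as a whole is a small list scheduler that reorders instructions so that each issue cycle contains only instructions with no register dependencies among them and no shared arithmetic unit. This is a sketch under stated assumptions, not the patented implementation: the `Instr` record, the unit names, the issue width of two, and the single-cycle latency are all hypothetical.

```python
from collections import namedtuple

# Hypothetical instruction record: text, arithmetic unit, registers written/read.
Instr = namedtuple("Instr", "op unit writes reads")


def conflicts(earlier, later):
    """True if `later` has a RAW, WAR, or WAW dependency on `earlier`."""
    return bool(
        earlier.writes & later.reads
        or earlier.reads & later.writes
        or earlier.writes & later.writes
    )


def schedule(instrs, issue_width=2):
    """Greedy list scheduling, assuming every instruction takes one cycle.

    An instruction joins the current cycle only if a slot is free, its unit
    is idle, and it does not conflict with any earlier instruction that is
    either still waiting or already placed in the same cycle.
    """
    remaining = list(instrs)
    cycles = []
    while remaining:
        bundle, busy_units, waiting = [], set(), []
        for ins in remaining:
            blocked = any(conflicts(prev, ins) for prev in waiting + bundle)
            if len(bundle) < issue_width and ins.unit not in busy_units and not blocked:
                bundle.append(ins)
                busy_units.add(ins.unit)
            else:
                waiting.append(ins)
        cycles.append([ins.op for ins in bundle])
        remaining = waiting
    return cycles


program = [
    Instr("load r1", "mem", {"r1"}, {"r0"}),
    Instr("add r2", "alu", {"r2"}, {"r1", "r3"}),  # needs r1 from the load
    Instr("mul r4", "mul", {"r4"}, {"r5", "r6"}),  # independent of the above
    Instr("sub r7", "alu", {"r7"}, {"r2", "r4"}),  # needs r2 and r4
]
cycles = schedule(program)
```

In this example the independent multiply is moved up next to the load, so the four instructions occupy three cycles instead of four. A real shader instruction set would also need latencies and memory ordering to be modelled, which this sketch omits.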
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (1)

1. A shader driver static reconstruction method, comprising:
step 1, the driver atomic segment division module (1) divides the driver code into the most basic atomic segments and sends the generated atomic segments to the atomic segment program reconstruction module (3);
step 2, in the user function configuration module (2), the user statically specifies the function parameters to be used;
step 3, the atomic segment program reconstruction module (3) extracts the corresponding atomic segments produced by the driver atomic segment division module (1) according to the function parameters configured in the user function configuration module (2), reconstructs them to generate the required software code, and sends the software code to the instruction optimization module (4);
step 4, the instruction optimization module (4) receives the software code sent by the atomic segment program reconstruction module (3), performs data hazard optimization and structural hazard optimization, and sends the optimized driver to the machine code generation module (5);
step 5, the machine code generation module (5) receives the driver from the instruction optimization module (4) and generates the corresponding machine code;
wherein, in step 4, the data hazard optimization performed by the instruction optimization module (4) means that adjacent instructions are free of write-after-write, read-after-write, and write-after-read dependencies, so that under a multi-issue mechanism they can be executed at the same time;
and, in step 4, the structural hazard optimization performed by the instruction optimization module (4) means that multiple adjacent instructions can be executed at the same time in different arithmetic units.
CN201611140690.6A 2016-12-12 2016-12-12 Shader driver static reconstruction method Active CN106709861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611140690.6A CN106709861B (en) 2016-12-12 2016-12-12 Shader driver static reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611140690.6A CN106709861B (en) 2016-12-12 2016-12-12 Shader driver static reconstruction method

Publications (2)

Publication Number Publication Date
CN106709861A (en) 2017-05-24
CN106709861B (en) 2020-08-11

Family

ID=58936875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611140690.6A Active CN106709861B (en) 2016-12-12 2016-12-12 Shader driver static reconstruction method

Country Status (1)

Country Link
CN (1) CN106709861B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800088B (en) * 2018-11-14 2023-06-20 西安翔腾微电子科技有限公司 Training-based GPU configuration management method and device, storage medium and GPU
CN109726816A (en) * 2018-12-12 2019-05-07 中国航空工业集团公司西安航空计算技术研究所 Assembly-level shader program chain optimization method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8595439B1 (en) * 2007-09-28 2013-11-26 The Mathworks, Inc. Optimization of cache configuration for application design
CN105549932A (en) * 2015-12-11 2016-05-04 中国航空工业集团公司西安航空计算技术研究所 Graphic processor host driver software structure
CN105574807A (en) * 2015-12-11 2016-05-11 中国航空工业集团公司西安航空计算技术研究所 Development platform of programmable shader embedded into graphics processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
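Design of the MIGPU-9 multi-core interactive graphics processor; 邓军勇 et al.; Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》); 2014-09-30; Vol. 26, No. 9; 1468-1478 *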

Also Published As

Publication number Publication date
CN106709861A (en) 2017-05-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant