CN106843993A - Method and system for reverse-parsing GPU instructions - Google Patents
Method and system for reverse-parsing GPU instructions
- Publication number
- CN106843993A (application CN201611215249.XA)
- Authority
- CN
- China
- Prior art keywords
- variables
- code
- instmap
- gpu
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/42—Syntactic analysis
- G06F8/427—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/53—Decompilation; Disassembly
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The invention proposes a method and system for reverse-parsing GPU instructions, relating to GPU microarchitecture, compiler code generation, and program optimization. The method compiles GPU instructions into a compiled file, disassembles that file into a disassembly file, and uses an assembly parser to represent the disassembly file as instMap variables, whose variable types include opcode, modifier, instruction, operand, and the corresponding operand type. The instMap variables are input to a decoding solver, which determines the variable type of each instMap variable and looks up the corresponding encoding from the opcodes or modifiers already determined. Once the instruction encodings have been cracked, a GPU assembler can be built with the help of the PTX documentation; auxiliary compilation features can be provided for GPU compilers to improve the efficiency of GPU programs; and a series of micro-benchmark programs can be designed and standardized to probe GPU microarchitecture characteristics and parameters.
Description
Technical field
The present invention relates to the technical fields of GPU microarchitecture, compiler code generation, and program optimization, and in particular to a method and system for reverse-parsing GPU instructions.
Background
For many years, GPU vendors have offered users only the high-level APIs wrapped by their drivers and have exposed as little as possible of the internal principles and details, such as the driver's software architecture, the GPU microarchitecture, and the instruction set. As a result, academic research on GPU architecture has lagged well behind industry and has stagnated for a long time. In the era when GPUs were used only for graphics acceleration, this conservative strategy was not a prominent problem in practice and was even reasonable to a degree. Early graphics APIs were tightly coupled to the hardware implementation: the API was a thin wrapper around hardware functions and itself exposed enough hardware detail, and early versions of the OpenGL interface were even called the assembly language of 3D graphics. In addition, very few developers called the graphics API directly. Games are built on rendering engines, so once the graphics card vendor had helped optimize the mainstream rendering engines, most games were guaranteed to run smoothly. Vendors could therefore improve and innovate on the microarchitecture freely, without maintaining backward compatibility at the lowest level; they only needed to stay backward compatible at the software-wrapped graphics API layer.
Nvidia, the GPU vendor in a monopoly position, has retained this habit of technical secrecy. It provides no assembler, does not support programming at the lowest assembly level, and does not disclose the hardware architecture features that can only be controlled at the assembly level. As a substitute, it provides the low-level interface PTX. Although PTX is an intermediate representation very close to assembly, its control over the hardware is weaker than that of assembly: for example, PTX cannot control register allocation and cannot precisely control instruction scheduling. The C-like interface above it has even weaker control over the hardware, so developers can only rely on compiler optimization to improve performance. However, "Daniel J Bernstein, Hsieh-Chung Chen, Chen-Mou Cheng, Tanja Lange, Ruben Niederhagen, Peter Schwabe, and Bo-Yin Yang. Usable assembly language for gpus: a success story. IACR Cryptology ePrint Archive, 2012:137, 2012." points out that the code generated by NVCC, the compiler Nvidia provides, is not efficient; for example, its register allocation produces a large number of bank conflicts. In fact, many of the parallel algorithm libraries released by Nvidia reach their intended efficiency only because they are built with an internal assembler and then hand-optimized in assembly. The problem is that, unlike 3D graphics with its small number of rendering engine developers, the GPGPU user base is broad and diverse. Nvidia has hand-optimized only a few algorithm libraries and offers official support only to a few large customers, so the many remaining users cannot extract maximum performance from expensive GPGPU hardware, which is a huge waste of computing resources. Worse still, Nvidia has not fully optimized many extremely widely used core algorithms. For single-precision matrix multiplication, for example, Nvidia's hand-optimized assembly version for the mainstream Kepler architecture reaches only 74% of theoretical peak, and in performance comparisons between third-party optimized single-precision matrix multiplication and SGEMM in Nvidia's cuBLAS, the third-party assembly-optimized code outperforms cuBLAS. These studies show that assembly-level optimization is highly valuable for exploiting GPU performance.
Some researchers have made scattered progress on GPU performance tuning and tooling, such as micro-benchmark programs ("Zhang, Yao, and John D. Owens. "A quantitative performance analysis model for GPU architectures." In 2011 IEEE 17th International Symposium on High Performance Computer Architecture, pp. 382-393. IEEE, 2011."; "Xinxin Mei, Kaiyong Zhao, Chengjian Liu, and Xiaowen Chu. Benchmarking the memory hierarchy of modern gpus. In Network and Parallel Computing, pages 144–156. Springer, 2014."; "Henry Wong, Misel-Myrto Papadopoulou, Maryam Sadooghi-Alvandi, and Andreas Moshovos. Demystifying gpu microarchitecture through microbenchmarking. In Performance Analysis of Systems & Software (ISPASS), 2010 IEEE International Symposium on, pages 235-246. IEEE, 2010."), assemblers, and GPU assembly-level optimization. However, each of these efforts concentrates on a single aspect, and none proposes a general technique that carries over to new generations of architectures, such as an instruction-cracking method with corresponding automation tools. Furthermore, there are no public assembly-level benchmarks for GPUs; most benchmarks currently in use are based on CUDA, which makes their results unreliable.
Summary of the invention
To address the shortcomings of the prior art, the present invention proposes a method and system for reverse-parsing GPU instructions.
The present invention proposes a method for reverse-parsing GPU instructions, comprising:
Step 1: compile GPU instructions to generate a compiled file, disassemble the compiled file to generate a disassembly file, and use an assembly parser to represent the disassembly file as instMap variables, where the variable types of the instMap variables include opcode, modifier, instruction, operand, and the corresponding operand type;
Step 2: input the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the corresponding encoding from the opcode or modifier already determined.
The decoding solver probes each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the 64-bit encoding with the disassembler. If the newly disassembled instruction has a different instruction name from the original instruction of that 64-bit encoding, the current bit of the encoding represents the opcode; based on these bits, the opcode is enumerated over its encoding space.
The instruction name and operand types in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the other bits, apart from the operand, are probed, the operand affected by the changed bit is returned, and <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited.
Modifier bits are detected by XORing the instructions in the instMap variables bit by bit and checking whether the modifier changes. After the modifier encoding space of each instruction is found, it is enumerated to find the names of all modifiers. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally this encoding is XORed with the opcode encoding of the instructions carrying that modifier to obtain the encoding of the modifier.
The encoding of an operand is obtained from the operand's name.
The present invention also proposes a system for reverse-parsing GPU instructions, characterized by comprising:
a variable generation module, which compiles GPU instructions to generate a compiled file, disassembles the compiled file to generate a disassembly file, and uses an assembly parser to represent the disassembly file as instMap variables, where the variable types of the instMap variables include opcode, modifier, operand, and the corresponding operand type;
an encoding lookup module, which inputs the instMap variables to a decoding solver; the decoding solver determines the variable type of each instMap variable and looks up the remaining encodings from the opcode or modifier already determined.
The decoding solver probes each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the 64-bit encoding with the disassembler. If the newly disassembled instruction has a different instruction name from the original instruction of that 64-bit encoding, the current bit of the encoding represents the opcode; based on these bits, the opcode is enumerated over its encoding space.
The instruction name and operand types in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the other bits, apart from the operand, are probed, the operand affected by the changed bit is returned, and <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited.
Modifier bits are detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and checking whether the modifier changes. After the modifier encoding space of each instruction is found, it is enumerated to find the names of all modifiers. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally this encoding is XORed with the opcode encoding of the instructions carrying that modifier to obtain the encoding of the modifier.
The encoding of an operand is obtained from the operand's name.
As the above scheme shows, the advantages of the present invention are as follows.
The invention effectively copes with the closed GPU technology ecosystem and the limits it places on compilation and program optimization:
1. Because NVIDIA does not publish instruction encodings, the method proposed by the invention works out the GPU instruction encodings using NVIDIA's existing toolchain. The basic instruction format is shown in Fig. 1: bits 63~54 hold the opcode, bits 42~23 a 20-bit immediate, bits 21~18 the predicate register, bits 17~10 the source register, and bits 9~2 the destination register. The meaning of bits 1~0 and of the specific fields depends on the instruction syntax;
2. once the instruction encodings have been cracked, a GPU assembler can be built with the help of the PTX documentation;
3. auxiliary compilation features can be provided for GPU compilers, improving the efficiency of GPU programs;
4. a series of micro-benchmark programs can be designed and standardized to probe GPU microarchitecture characteristics and parameters.
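The field layout listed in item 1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the field names (`opcode`, `imm20`, `predicate`, `src_reg`, `dst_reg`) and the helper functions are hypothetical, and only the bit ranges come from the text. Real instructions may use other fields, since the layout depends on the instruction syntax.

```python
# Bit ranges (high, low) of the basic format as listed in the text.
# The field names are hypothetical labels chosen for this sketch.
FIELDS = {
    "opcode": (63, 54),
    "imm20": (42, 23),
    "predicate": (21, 18),
    "src_reg": (17, 10),
    "dst_reg": (9, 2),
}


def extract_field(word, high, low):
    """Return the bits high..low (inclusive) of a 64-bit word."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode_fields(word):
    """Split a 64-bit encoding into the named fields."""
    return {name: extract_field(word, hi, lo) for name, (hi, lo) in FIELDS.items()}


def encode_fields(values):
    """Pack field values into a 64-bit word, rejecting values that do not fit."""
    word = 0
    for name, value in values.items():
        hi, lo = FIELDS[name]
        width = hi - lo + 1
        if value < 0 or value >> width:
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << lo
    return word
```

Packing a set of field values with `encode_fields` and passing the result to `decode_fields` returns the same values, which is a quick way to check that the ranges do not overlap.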
Brief description of the drawings
Fig. 1 is an example diagram of the instruction encoding format;
Fig. 2 is a flowchart of the instruction parsing algorithm;
Fig. 3 shows the operand solver algorithm (Algorithm 1);
Fig. 4 shows the opcode solver algorithm (Algorithm 2);
Fig. 5 shows the modifier solver algorithm (Algorithm 3).
Detailed description
The instruction parsing algorithm flow of the present invention is as follows.
Instruction decoding must produce the correspondence between 64-bit instruction encodings and assembly instructions. As shown in Fig. 2, the algorithm flow is as follows:
First, a PTX instruction generator automatically generates all instructions in the NVIDIA PTX documentation together with their modifier combinations. These PTX files are then compiled into cubin files with ptxas and disassembled with cuobjdump. Finally, an assembly parser represents the disassembly information as instMap variables, which serve as the input to the decoding solver. The instMap structure includes the opcode, the instruction, the modifiers, all operands, and the corresponding operand types, among others.
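The parsing step can be sketched in Python as follows. This is a minimal sketch, not the patent's parser: the `InstEntry` class, the regular expression, and the assumed line shape (`/*addr*/ MNEMONIC operands; /* 0x<16 hex digits> */`) are assumptions made for illustration, and real disassembler output may differ between toolkit versions.

```python
import re
from dataclasses import dataclass

# Assumed line shape: /*0008*/ LD.E.64 R2, [R6+0x20]; /* 0x0123456789abcdef */
LINE_RE = re.compile(
    r"/\*(?P<addr>[0-9a-fA-F]+)\*/\s+"
    r"(?P<text>[^;]+);\s*"
    r"/\*\s*0x(?P<enc>[0-9a-fA-F]{16})\s*\*/"
)


@dataclass
class InstEntry:
    """One hypothetical instMap entry: the parts of a disassembled instruction."""
    opcode: str
    modifiers: list
    operands: list
    operand_types: list
    encoding: int


def classify_operand(op):
    """Guess the operand type from its spelling (categories taken from the text)."""
    if re.fullmatch(r"R\d+", op):
        return "register"
    if re.fullmatch(r"P\d+", op):
        return "predicate"
    if re.fullmatch(r"[cC]\s*\[\w+\]\s*\[\w+\]", op):
        return "constant"
    if op.startswith("[") and op.endswith("]"):
        # A register inside the brackets is treated as a global-memory address.
        return "global" if re.search(r"R\d+", op) else "shared"
    for convert in (lambda s: int(s, 0), float):
        try:
            convert(op)
            return "immediate"
        except ValueError:
            pass
    return "unknown"


def parse_line(line):
    """Parse one disassembly line into an InstEntry, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    mnemonic, _, rest = m["text"].strip().partition(" ")
    opcode, *modifiers = mnemonic.split(".")
    operands = [o.strip() for o in rest.split(",") if o.strip()]
    types = [classify_operand(o) for o in operands]
    return InstEntry(opcode, modifiers, operands, types, int(m["enc"], 16))
```

A list of such entries, one per disassembled line, would then play the role of the instMap input to the solvers.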
An operand can be a register (R5), global memory ([R6+0x20]), constant memory (C[0x2][0x40]), shared memory ([0x50]), an immediate (0x9 and 1.5), or a predicate register (P3). We found that an operand's name is always tied to a number, so its encoding can be inferred from its name: for example, the binary encoding of register operand R5 is 101, and the encoding of immediate 0x9 is 1001. Opcodes and modifiers, by contrast, are mnemonics and cannot be converted to binary directly from their names, so they must be enumerated over their encoding spaces. We also found that modifiers are instruction-dependent: a modifier with the same name may be encoded differently in different instructions. For example, LD and LDG both have the type modifiers .32, .64, .128, .S16, .U16, .S8, and .U8, but the positions of the masks differ. Modifiers therefore have to be handled separately for each specific instruction.
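The two operand examples from the text can be reproduced with a few lines of Python. The function names are hypothetical, and the sketch covers only register names and integer immediates; floating-point immediates such as 1.5 and memory operands are left out.

```python
import re


def register_bits(name):
    """Binary string for the number in a register name such as R5 or P3."""
    m = re.fullmatch(r"[RP](\d+)", name)
    if m is None:
        raise ValueError(f"not a register name: {name!r}")
    return format(int(m.group(1)), "b")


def immediate_bits(text):
    """Binary string for an integer immediate such as 0x9."""
    return format(int(text, 0), "b")
```

With these helpers, `register_bits("R5")` gives `"101"` and `immediate_bits("0x9")` gives `"1001"`, matching the values in the text.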
The opcode solver algorithm of the present invention, shown in Fig. 3, is as follows.
The encodings of opcodes and modifiers cannot be determined from their names. Algorithm 1 shows the opcode solving process. PTX code is written according to the PTX pseudo-assembly documentation provided by NVIDIA, compiled into cubin with ptxas, and disassembled with cuobjdump; the resulting assembly code (the instMap variables) is the input to Algorithm 1 (line 1). The opcode solver works as follows. For each instruction line in the disassembly file, every bit of its 64-bit encoding is probed (line 9), mainly by flipping each bit of the instruction in turn (line 11) and then disassembling the result with the disassembler nvdisasm (line 13). If the newly disassembled instruction has a different instruction name from the original instruction (line 15), the bit represents the opcode and is stored in opBits. Once these bit positions are known, the opcode can be enumerated over its encoding space.
By writing different modifier combinations and verifying them, the modifier encodings are then obtained. However, because the PTX documentation provided by NVIDIA is incomplete, this alone cannot guarantee that all opcode and modifier encodings are found; with Algorithm 1 we can find all instructions.
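The bit-flipping idea of the opcode solver can be sketched in Python. The real procedure calls an external disassembler; here `toy_disassemble` and its opcode table are invented stand-ins so that the sketch runs on its own, and treating an undecodable word as "a different name" is an assumption of this sketch.

```python
# Hypothetical opcode table for a toy 64-bit format with the opcode in bits 63..54.
TOY_OPCODES = {0x100: "MOV", 0x101: "IADD", 0x102: "LD"}


def toy_disassemble(word):
    """Stand-in for a real disassembler: return the instruction name only."""
    return TOY_OPCODES.get((word >> 54) & 0x3FF, "<invalid>")


def find_opcode_bits(word, disassemble, width=64):
    """Flip each bit in turn; a bit belongs to the opcode if the name changes."""
    original = disassemble(word)
    op_bits = []
    for bit in range(width):
        flipped = word ^ (1 << bit)
        if disassemble(flipped) != original:
            op_bits.append(bit)
    return op_bits
```

For a toy word such as `(0x100 << 54) | (5 << 2)`, the function reports bits 54 to 63, because flipping any other bit leaves the name `MOV` unchanged.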
The operand solver algorithm of the present invention, shown in Fig. 4, is as follows.
Algorithm 2 is the solving procedure of the operand decoder. A dictionary (denoted the visited dictionary) is set up first. Its key is the name of the instruction containing the operand together with the operand types, and its value marks whether that operand has already been decoded. The input is the output of the opcode solver. The operands, their number, and their order depend on the types and on the specific instruction, so to mark whether a group of operands has been probed, the name of the specific instruction and the operand types (an array) are used as the key to query the visited dictionary. For each instruction in the disassembly file (line 4), the other bits, apart from the operand, are probed (line 8), again by flipping bits in the instruction encoding (line 10). wichChange returns which operand was modified, and the bits are then placed in the appropriate array. <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited (line 18).
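A minimal Python sketch of this procedure follows. It makes several assumptions that are not in the source: the instMap is reduced to `(encoding, name, operand_types)` tuples, the disassembler is a toy function with registers in bits 9..2 and 17..10, the bits skipped during probing are the opcode bits found earlier, and `which_change` is a hypothetical spelling of the helper that reports the changed operand.

```python
def toy_disassemble(word):
    """Stand-in disassembler: destination in bits 9..2, source in bits 17..10."""
    dst = (word >> 2) & 0xFF
    src = (word >> 10) & 0xFF
    return "MOV", [f"R{dst}", f"R{src}"]


def which_change(old_ops, new_ops):
    """Return the index of the first operand that differs, or None."""
    for i, (a, b) in enumerate(zip(old_ops, new_ops)):
        if a != b:
            return i
    return None


def solve_operand_bits(inst_map, disassemble, opcode_bits, visited=None):
    """Map (name, operand types) to one list of bit positions per operand."""
    visited = {} if visited is None else visited
    result = {}
    for word, name, operand_types in inst_map:
        key = (name, tuple(operand_types))
        if visited.get(key) == 1:
            continue  # this instruction shape has already been probed
        _, original_ops = disassemble(word)
        positions = [[] for _ in operand_types]
        for bit in range(64):
            if bit in opcode_bits:
                continue  # assumption: skip bits already attributed to the opcode
            _, new_ops = disassemble(word ^ (1 << bit))
            idx = which_change(original_ops, new_ops)
            if idx is not None:
                positions[idx].append(bit)
        result[key] = positions
        visited[key] = 1  # mark <instruction, operand types> as visited
    return result
```

Because of the visited dictionary, a second instruction with the same name and operand types is skipped rather than probed again.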
The modifier solver algorithm of the present invention, shown in Fig. 5, is as follows.
A modifier defines the specific behavior of an instruction. For example, LD has the type modifiers .U8, .S8, .U16, .32, .64, and .128, as well as cache-operation modifiers such as .CS (cache streaming) and .CG (cache at global level). Modifiers are more complex than opcodes: their positions span many bits and depend on the opcode. For example, the same type modifier occupies different positions in LD and LDG instructions. One solution is to XOR instructions bit by bit and observe whether the modifier changes (lines 6 to 13). After the modifier encoding space of each instruction is found, it is enumerated (line 15). The next step is to determine the encoding of each specific modifier, such as the encoding of .U8 or .S8; this corresponds to lines 20 to 29 of the code. First, the names of all modifiers are found (line 20). Then, for the name of a specific modifier, the intersection of all instructions carrying that modifier is found (lines 23-25). Finally, this encoding is XORed with the encoding of the instruction's opcode, so that only the encoding of the modifier remains.
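The XOR and intersection steps can be sketched in Python under simplifying assumptions. The sample encodings, the bit positions of the toy modifier fields, and the function names are all invented for illustration; the sketch also assumes that every sample is the same instruction with identical operands, so that only modifier bits differ between encodings.

```python
from functools import reduce
from itertools import combinations


def modifier_mask(encodings):
    """OR together the pairwise XORs: every bit that varies across the variants."""
    mask = 0
    for a, b in combinations(encodings, 2):
        mask |= a ^ b
    return mask


def modifier_code(samples, modifier, opcode_mask):
    """Intersect (AND) all encodings carrying the modifier, then XOR out the opcode.

    samples is a list of (encoding, set_of_modifier_names) pairs. The result is
    only reliable when the samples vary enough in their other modifiers.
    """
    carrying = [enc for enc, mods in samples if modifier in mods]
    if not carrying:
        raise KeyError(modifier)
    common = reduce(lambda a, b: a & b, carrying)
    return common ^ (common & opcode_mask)


# Toy data: opcode in bits 63..54, a type field in bits 52..50, a cache bit at 48.
OPCODE_MASK = 0x3FF << 54
LD = 0x102 << 54
SAMPLES = [
    (LD | (1 << 50), {"U8"}),
    (LD | (2 << 50), {"S8"}),
    (LD | (5 << 50), {"64"}),
    (LD | (1 << 50) | (1 << 48), {"U8", "CG"}),
    (LD | (2 << 50) | (1 << 48), {"S8", "CG"}),
]
```

On this toy data, `modifier_mask` recovers the varying bits 48 and 50 to 52, and `modifier_code(SAMPLES, "CG", OPCODE_MASK)` isolates bit 48 because the two CG samples share no type-field bits. With fewer or less varied samples, the intersection could keep bits that belong to other modifiers.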
The present invention also proposes a system for reverse-parsing GPU instructions, comprising:
a variable generation module, which compiles GPU instructions to generate a compiled file, disassembles the compiled file to generate a disassembly file, and uses an assembly parser to represent the disassembly file as instMap variables, where the variable types of the instMap variables include opcode, modifier, operand, and the corresponding operand type;
an encoding lookup module, which inputs the instMap variables to a decoding solver; the decoding solver determines the variable type of each instMap variable and looks up the remaining encodings from the opcode or modifier already determined.
The decoding solver probes each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the 64-bit encoding with the disassembler. If the newly disassembled instruction has a different instruction name from the original instruction of that 64-bit encoding, the current bit of the encoding represents the opcode; based on these bits, the opcode is enumerated over its encoding space.
The instruction name and operand types in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the other bits, apart from the operand, are probed, the operand affected by the changed bit is returned, and <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited.
Modifier bits are detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and checking whether the modifier changes. After the modifier encoding space of each instruction is found, it is enumerated to find the names of all modifiers. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally this encoding is XORed with the opcode encoding of the instructions carrying that modifier to obtain the encoding of the modifier.
The encoding of an operand is obtained from the operand's name.
Claims (10)
1. A method for reverse-parsing GPU instructions, characterized by comprising:
Step 1: compiling GPU instructions to generate a compiled file, disassembling the compiled file to generate a disassembly file, and using an assembly parser to represent the disassembly file as instMap variables, wherein the variable types of the instMap variables include opcode, modifier, instruction, operand, and the corresponding operand type;
Step 2: inputting the instMap variables to a decoding solver, wherein the decoding solver determines the variable type of each instMap variable and looks up the corresponding encoding from the opcode or modifier already determined.
2. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the decoding solver probes each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the 64-bit encoding with the disassembler; if the newly disassembled instruction has a different instruction name from the original instruction of that 64-bit encoding, the current bit of the encoding represents the opcode, and, based on these bits, the opcode is enumerated over its encoding space.
3. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the instruction name and operand types in the instMap variables are used as the key to query the visited dictionary; for each instruction in the instMap variables, the other bits, apart from the operand, are probed, the operand affected by the changed bit is returned, and <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited.
4. The method for reverse-parsing GPU instructions according to claim 1, characterized in that modifier bits are detected by XORing the instructions in the instMap variables bit by bit and checking whether the modifier changes; after the modifier encoding space of each instruction is found, it is enumerated to find the names of all modifiers; then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally this encoding is XORed with the opcode encoding of the instructions carrying that modifier to obtain the encoding of the modifier.
5. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the encoding of an operand is obtained from the operand's name.
6. A system for reverse-parsing GPU instructions, characterized by comprising:
a variable generation module, which compiles GPU instructions to generate a compiled file, disassembles the compiled file to generate a disassembly file, and uses an assembly parser to represent the disassembly file as instMap variables, wherein the variable types of the instMap variables include opcode, modifier, operand, and the corresponding operand type;
an encoding lookup module, which inputs the instMap variables to a decoding solver, wherein the decoding solver determines the variable type of each instMap variable and looks up the remaining encodings from the opcode or modifier already determined.
7. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the decoding solver probes each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the 64-bit encoding with the disassembler; if the newly disassembled instruction has a different instruction name from the original instruction of that 64-bit encoding, the current bit of the encoding represents the opcode, and, based on these bits, the opcode is enumerated over its encoding space.
8. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the instruction name and operand types in the instMap variables are used as the key to query the visited dictionary; for each instruction in the instMap variables, the other bits, apart from the operand, are probed, the operand affected by the changed bit is returned, and <instruction, operand type> is marked as 1 in the visited dictionary to indicate that it has been visited.
9. The system for reverse-parsing GPU instructions according to claim 6, characterized in that modifier bits are detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and checking whether the modifier changes; after the modifier encoding space of each instruction is found, it is enumerated to find the names of all modifiers; then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally this encoding is XORed with the opcode encoding of the instructions carrying that modifier to obtain the encoding of the modifier.
10. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the encoding of an operand is obtained from the operand's name.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611215249.XA CN106843993B (en) | 2016-12-26 | 2016-12-26 | A kind of method and system of resolving inversely GPU instruction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611215249.XA CN106843993B (en) | 2016-12-26 | 2016-12-26 | A kind of method and system of resolving inversely GPU instruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106843993A (en) | 2017-06-13 |
CN106843993B (en) | 2019-07-30 |
Family
ID=59136263
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611215249.XA Active CN106843993B (en) | 2016-12-26 | 2016-12-26 | A kind of method and system of resolving inversely GPU instruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106843993B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558735A (en) * | 2018-12-03 | 2019-04-02 | 杭州安恒信息技术股份有限公司 | A kind of rogue program sample clustering method and relevant apparatus based on machine learning |
CN109933327A (en) * | 2019-02-02 | 2019-06-25 | 中国科学院计算技术研究所 | OpenCL compiler method and system based on code fusion compiler framework |
CN110096309A (en) * | 2018-11-14 | 2019-08-06 | 上海寒武纪信息科技有限公司 | Operation method, device, computer equipment and storage medium |
CN110109657A (en) * | 2019-03-29 | 2019-08-09 | 南京佑驾科技有限公司 | A kind of GPU microcommand detection method |
CN110489130A (en) * | 2018-05-31 | 2019-11-22 | 北京数聚鑫云信息技术有限公司 | A kind of client-based business datum extracting method and device |
CN110716855A (en) * | 2019-08-23 | 2020-01-21 | 中国科学院信息工程研究所 | Processor instruction set testing method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100023709A1 (en) * | 2008-07-22 | 2010-01-28 | International Business Machines Corporation | Asymmetric double buffering of bitstream data in a multi-core processor |
CN103049304A (en) * | 2013-01-21 | 2013-04-17 | 中国人民解放军国防科学技术大学 | Method for accelerating operating speed of graphics processing unit (GPU) through dead code removal |
CN104156311A (en) * | 2014-08-05 | 2014-11-19 | 北京控制工程研究所 | Embedded type C language target code level unit testing method based on CPU simulator |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110489130A (en) * | 2018-05-31 | 2019-11-22 | 北京数聚鑫云信息技术有限公司 | A kind of client-based business datum extracting method and device |
CN110096309A (en) * | 2018-11-14 | 2019-08-06 | 上海寒武纪信息科技有限公司 | Operation method, device, computer equipment and storage medium |
CN109558735A (en) * | 2018-12-03 | 2019-04-02 | 杭州安恒信息技术股份有限公司 | A kind of rogue program sample clustering method and relevant apparatus based on machine learning |
CN109933327A (en) * | 2019-02-02 | 2019-06-25 | 中国科学院计算技术研究所 | OpenCL compiler method and system based on code fusion compiler framework |
CN109933327B (en) * | 2019-02-02 | 2021-01-08 | 中国科学院计算技术研究所 | OpenCL compiler design method and system based on code fusion compiling framework |
CN110109657A (en) * | 2019-03-29 | 2019-08-09 | 南京佑驾科技有限公司 | A kind of GPU microcommand detection method |
CN110716855A (en) * | 2019-08-23 | 2020-01-21 | 中国科学院信息工程研究所 | Processor instruction set testing method and device |
CN110716855B (en) * | 2019-08-23 | 2021-05-14 | 中国科学院信息工程研究所 | Processor instruction set testing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN106843993B (en) | 2019-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106843993B (en) | A kind of method and system of resolving inversely GPU instruction | |
Paulin et al. | Flexware: A flexible firmware development environment for embedded systems | |
CN108614960B (en) | JavaScript virtualization protection method based on front-end byte code technology | |
Colin et al. | A modular and retargetable framework for tree-based WCET analysis | |
CN1146788C (en) | Device and method used in instruction selection of multiplatform environment | |
CN108345937A (en) | Cycle is merged with library | |
CN103329132A (en) | Architecture optimizer | |
JP5846005B2 (en) | Program, code generation method, and information processing apparatus | |
JPH11249904A (en) | Compiling method | |
CN102609243B (en) | Emulating pointers | |
CN113722218A (en) | Software defect prediction model construction method based on compiler intermediate representation | |
CN103329097A (en) | Tool generator | |
CN110321116B (en) | Efficient optimization method for calculation cost constraint problem in compilation optimization | |
CN103235724A (en) | Atomic operation semantic description based integrated translation method for multisource binary codes | |
EP2984585A2 (en) | Binding of data source to compound control | |
CN110149801A (en) | System and method for carrying out data flow diagram conversion in the processing system | |
CN102880449B (en) | Method and system for scheduling delay slot in very-long instruction word structure | |
CN102722570B (en) | Artificial immunity intelligent optimization system facing geographical space optimization | |
Armstrong et al. | Dynamic algorithm selection using reinforcement learning | |
CN1932766A (en) | Semi-automatic parallel method of large serial program code quantity-oriented field | |
CN103270512A (en) | Intelligent architecture creator | |
CN105447285A (en) | Method for improving OpenCL hardware execution efficiency | |
CN100559344C (en) | A kind of disposal route of supporting with regular record variables access special register group | |
Darda et al. | Nonlinear production path and an alternative reserves estimate for South Asian natural gas | |
Topcuoglu et al. | Solving the register allocation problem for embedded systems using a hybrid evolutionary algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||