CN106843993A - A kind of method and system of resolving inversely GPU instructions - Google Patents


Info

Publication number
CN106843993A
Authority
CN
China
Prior art keywords
variables
code
instmap
gpu
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611215249.XA
Other languages
Chinese (zh)
Other versions
CN106843993B (en)
Inventor
谭光明
张秀霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese Academy Of Sciences State Owned Assets Management Co ltd
Institute of Computing Technology of CAS
Original Assignee
Chinese Academy Of Sciences State Owned Assets Management Co ltd
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese Academy Of Sciences State Owned Assets Management Co ltd, Institute of Computing Technology of CAS
Priority to CN201611215249.XA
Publication of CN106843993A
Application granted
Publication of CN106843993B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/53 Decompilation; Disassembly

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The present invention proposes a method and system for reverse-parsing GPU instructions, relating to the technical fields of GPU microarchitecture, compiler code generation, and program optimization. The method comprises: compiling GPU instructions to generate a compiled file; disassembling the compiled file to generate a disassembly file; and representing the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, instruction, operand, and corresponding operand type. The instMap variables are input to a decoding solver, which determines the variable type of each instMap variable and looks up the corresponding encoding through the opcodes or modifiers already determined. Once the instruction encodings have been cracked, the invention, in combination with the PTX documentation, makes it possible to construct a GPU assembler; to provide auxiliary compilation functions for GPU compilers and so improve the efficiency of GPU programs; and to design and standardize a series of microbenchmarks that probe GPU microarchitecture characteristics and parameters.

Description

A method and system for reverse-parsing GPU instructions
Technical field
The present invention relates to the technical fields of GPU microarchitecture, compiler code generation, and program optimization, and in particular to a method and system for reverse-parsing GPU instructions.
Background art
For many years, GPU vendors have offered users only the upper-layer APIs encapsulated by the driver and have exposed as little as possible of the internal principles and details, such as the driver's software architecture, the GPU's microarchitecture, and its instruction set. As a result, academic research on GPU architecture has lagged significantly behind industry and stagnated for a long time. In the era when GPUs were used only for graphics acceleration, this conservative strategy was not a prominent problem in practice and was even reasonable to a degree. Early graphics APIs were strongly tied to the hardware implementation: an API was a thin wrapper over hardware functions and itself exposed enough hardware detail, and early versions of the OpenGL interface were even called the assembly language of 3D graphics. In addition, very few developers call graphics APIs directly: games are built on rendering engines, so as long as the graphics-card vendor helps optimize the mainstream rendering engines, most games run smoothly. Vendors could therefore improve and innovate on the microarchitecture more freely, without maintaining backward compatibility at the lowest level; they only needed to remain backward compatible at the software-encapsulated graphics API layer.
Nvidia, the GPU vendor in a monopoly position, has kept up this habit of technical closure: it provides no assembler, does not support assembly programming at the lowest level, and does not disclose the hardware architecture features that can only be controlled at the assembly level. As a supplement it provides the low-level interface PTX. Although PTX is an intermediate representation very close to assembly, it offers less control over the hardware than assembly does; for example, PTX cannot control register allocation and cannot precisely control instruction scheduling. The C-like interface above it offers even weaker control over the hardware, so developers can only pin their hopes on compiler optimization to improve performance. However, "Daniel J Bernstein, Hsieh-Chung Chen, Chen-Mou Cheng, Tanja Lange, Ruben Niederhagen, Peter Schwabe, and Bo-Yin Yang. Usable assembly language for gpus: a success story. IACR Cryptology ePrint Archive, 2012:137, 2012." points out that the code generated by NVCC, the compiler Nvidia provides, is not efficient; for example, its register allocation causes a large number of bank conflicts. In fact, many of the parallel algorithm libraries released by Nvidia are built with an internal assembler and then hand-optimized in assembly to reach the desired efficiency. The problem is that, unlike 3D graphics with its small number of rendering-engine developers, the GPGPU user base is broad and diverse. Nvidia has hand-optimized only a few algorithm libraries in assembly and provides official support only to a few large customers, while the great majority of remaining users cannot squeeze maximum performance out of expensive GPGPU hardware, which is a huge waste of computing resources. Worse still, Nvidia has not fully optimized even many widely used core algorithms, such as single-precision floating-point matrix multiplication (single-precision matrix multiplication): Nvidia's hand-assembly-optimized version for the mainstream Kepler architecture reaches only 74% of theoretical peak. When a third party's optimized single-precision matrix multiplication is compared with SGEMM in NVIDIA's cuBLAS, the third party's assembly-optimized version outperforms cuBLAS. These studies show that assembly-level optimization is very valuable for exploiting the performance of GPUs.
Some researchers have made scattered progress on GPU performance tuning and tools, such as microbenchmarks ("Zhang, Yao, and John D. Owens. "A quantitative performance analysis model for GPU architectures." In 2011 IEEE 17th International Symposium on High Performance Computer Architecture, pp. 382-393. IEEE, 2011." "Xinxin Mei, Kaiyong Zhao, Chengjian Liu, and Xiaowen Chu. Benchmarking the memory hierarchy of modern gpus. In Network and Parallel Computing, pages 144–156. Springer, 2014." "Henry Wong, Misel-Myrto Papadopoulou, Maryam Sadooghi-Alvandi, and Andreas Moshovos. Demystifying gpu microarchitecture through microbenchmarking. In Performance Analysis of Systems & Software (ISPASS), 2010 IEEE International Symposium on, pages 235-246. IEEE, 2010."), assemblers, and assembly-level GPU optimization. However, each of these works concentrates on a single aspect, and none proposes a general technique that carries over to new generations of architectures, such as an instruction-cracking method with corresponding automation tools. Moreover, GPUs still lack open assembly-level benchmarks; most benchmarks currently in use are based on CUDA, which makes their test results unreliable.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a method and system for reverse-parsing GPU instructions.
The present invention proposes a method for reverse-parsing GPU instructions, comprising:
Step 1: compiling GPU instructions to generate a compiled file; disassembling the compiled file to generate a disassembly file; and representing the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, instruction, operand, and corresponding operand type;
Step 2: inputting the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the corresponding encoding through the opcodes or modifiers already determined.
The decoding solver tests each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the resulting 64-bit encoding with the disassembler. If the instruction name of the newly disassembled instruction differs from that of the original instruction of the 64-bit encoding, the current bit of the 64-bit encoding represents the opcode; based on the bits found in this way, the opcode is enumerated over its encoding space.
The name and operand types of an instruction in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the bits other than those of the operand are tested and the operand that was modified is returned; the entry <instruction, operand type> is then marked 1 in the visited dictionary, indicating that it has been visited.
The modifier is detected by XORing the instructions in the instMap variables bit by bit and observing whether the modifier changes. After the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space and the names of all modifiers are found. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally that encoding is XORed with the opcode encoding of the instructions carrying that modifier, yielding the encoding of the modifier.
The encoding of an operand is obtained from the name of the operand.
The present invention also proposes a system for reverse-parsing GPU instructions, characterized by comprising:
a variable generation module, configured to compile GPU instructions to generate a compiled file, disassemble the compiled file to generate a disassembly file, and represent the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, operand, and corresponding operand type;
an encoding lookup module, configured to input the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the remaining encodings through the opcodes or modifiers already determined.
The decoding solver tests each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the resulting 64-bit encoding with the disassembler. If the instruction name of the newly disassembled instruction differs from that of the original instruction of the 64-bit encoding, the current bit of the 64-bit encoding represents the opcode; based on the bits found in this way, the opcode is enumerated over its encoding space.
The name and operand types of an instruction in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the bits other than those of the operand are tested and the operand that was modified is returned; the entry <instruction, operand type> is then marked 1 in the visited dictionary, indicating that it has been visited.
The modifier is detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and observing whether the modifier changes. After the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space and the names of all modifiers are found. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally that encoding is XORed with the opcode encoding of the instructions carrying that modifier, yielding the encoding of the modifier.
The encoding of an operand is obtained from the name of the operand.
As can be seen from the above scheme, the advantages of the present invention are as follows:
The invention can effectively cope with the closed GPU technology ecosystem and the limits it places on compilation and program optimization:
1. Because NVIDIA does not publish its instruction encodings, the method proposed by the present invention works out the GPU instruction encodings on the basis of NVIDIA's existing toolchain. The basic instruction format is shown in Fig. 1: bits 63 to 54 represent the opcode, bits 42 to 23 a 20-bit immediate, bits 21 to 18 the predicate register, bits 17 to 10 the source register, and bits 9 to 2 the destination register; bits 1 to 0 form a final field, and the specific fields depend on the instruction syntax;
2. Once the instruction encodings have been cracked, a GPU assembler can be constructed in combination with the PTX documentation;
3. Auxiliary compilation functions can be provided for GPU compilers, improving the efficiency of GPU programs;
4. A series of microbenchmarks can be designed and standardized to probe GPU microarchitecture characteristics and parameters.
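The bit ranges in item 1 can be illustrated with a small field extractor. This is only a sketch: the ranges are the ones listed in the text, while the field names (`imm20`, `src_reg`, and so on), the helper functions, and the sample values are assumptions introduced here for illustration, not part of the patent.

```python
# Bit ranges (high, low) as listed in item 1; the field names are illustrative.
FIELDS = {
    "opcode": (63, 54),
    "imm20": (42, 23),
    "predicate": (21, 18),
    "src_reg": (17, 10),
    "dst_reg": (9, 2),
}


def extract_field(word: int, high: int, low: int) -> int:
    """Return bits high..low (inclusive) of a 64-bit instruction word."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode_fields(word: int) -> dict:
    """Split a 64-bit word into the fields listed in FIELDS."""
    return {name: extract_field(word, hi, lo) for name, (hi, lo) in FIELDS.items()}


def pack_fields(**values: int) -> int:
    """Inverse helper for the demo: place each value at its field position."""
    word = 0
    for name, value in values.items():
        hi, lo = FIELDS[name]
        if value >= 1 << (hi - lo + 1):
            raise ValueError(f"{name} does not fit in {hi - lo + 1} bits")
        word |= value << lo
    return word


# Made-up field values, not a real instruction encoding.
word = pack_fields(opcode=0x2A5, imm20=0x9, predicate=7, src_reg=5, dst_reg=6)
fields = decode_fields(word)
```

Packing and then decoding the same made-up values is a quick way to check that the ranges do not overlap.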
Brief description of the drawings
Fig. 1 is an example diagram of the instruction encoding format;
Fig. 2 is a flowchart of the instruction parsing algorithm;
Fig. 3 is a diagram of the operand solver algorithm (Algorithm 1);
Fig. 4 is a diagram of the opcode solver algorithm (Algorithm 2);
Fig. 5 is a diagram of the modifier solver algorithm (Algorithm 3).
Detailed description of the embodiments
The instruction parsing algorithm flow of the present invention is as follows:
Instruction decoding requires generating the correspondence between 64-bit instruction encodings and assembly instructions. As shown in Fig. 2, the algorithm flow is as follows:
First, a PTX instruction generator automatically generates all instructions in the NVIDIA PTX documentation together with their modifier combinations. These PTX files are then compiled into cubin files with ptxas and disassembled with cuobjdump. Finally, the disassembly information is represented by the assembly parser as instMap variables, which serve as the input to the decoding solver. The structure of instMap includes: opcode, instruction, modifiers, all operands, the corresponding operand types, and so on.
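As a rough sketch of the parsing step, the code below turns one disassembly line into an instMap-like record. The line layout (address comment, instruction text, then a 64-bit hexadecimal encoding in a comment), the `InstRecord` class, and all helper names are assumptions made for illustration; actual disassembler output varies between tool versions and architectures, and predicated instructions are not handled here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed line shape: /*addr*/  MNEMONIC.MOD operands;  /* 0x<16 hex digits> */
LINE_RE = re.compile(
    r"/\*(?P<addr>[0-9a-fA-F]+)\*/\s+"
    r"(?P<text>[^;]+);\s*"
    r"/\*\s*0x(?P<enc>[0-9a-fA-F]{16})\s*\*/"
)

# Operand kinds named in the description, matched by the shape of the operand text.
OPERAND_KINDS = [
    ("register", re.compile(r"R\d+")),
    ("predicate", re.compile(r"P\d+")),
    ("constant_memory", re.compile(r"[cC]\s*\[0x[0-9a-fA-F]+\]\s*\[0x[0-9a-fA-F]+\]")),
    ("global_memory", re.compile(r"\[R\d+(\+0x[0-9a-fA-F]+)?\]")),
    ("shared_memory", re.compile(r"\[0x[0-9a-fA-F]+\]")),
    ("immediate", re.compile(r"-?(0x[0-9a-fA-F]+|\d+(\.\d+)?)")),
]


@dataclass
class InstRecord:
    """Hypothetical instMap entry: one parsed disassembly line."""
    instruction: str
    opcode: str
    modifiers: List[str]
    operands: List[str]
    operand_types: List[str]
    encoding: int


def classify_operand(text: str) -> str:
    """Return the first operand kind whose pattern matches the whole text."""
    for kind, pattern in OPERAND_KINDS:
        if pattern.fullmatch(text):
            return kind
    return "unknown"


def parse_line(line: str) -> Optional[InstRecord]:
    """Parse one disassembly line, or return None if it has no encoding."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    head, _, rest = match.group("text").strip().partition(" ")
    opcode, *modifiers = head.split(".")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    return InstRecord(
        instruction=head,
        opcode=opcode,
        modifiers=modifiers,
        operands=operands,
        operand_types=[classify_operand(op) for op in operands],
        encoding=int(match.group("enc"), 16),
    )


# Made-up sample line; the encoding value is a placeholder.
sample = "        /*0008*/    LD.E.64 R2, [R6+0x20];    /* 0x0123456789abcdef */"
record = parse_line(sample)
```

Keeping the raw 64-bit encoding next to the parsed text is what lets the later solvers flip bits and compare the results.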
An operand can be a register (R5), global memory ([R6+0x20]), constant memory (C[0x2][0x40]), shared memory ([0x50]), an immediate (0x9 and 1.5), or a predicate register (P3). We found that the name of an operand is always tied to a number, so the encoding of an operand can be inferred from its name. For example, the binary encoding of register operand R5 is 101, and the encoding of immediate 0x9 is 1001. Opcodes and modifiers, by contrast, are mnemonics and cannot be converted to binary directly from their names, so they must be enumerated over their encoding spaces. We also found that modifiers are instruction-specific: a modifier with the same name may be encoded differently in different instructions. For example, the type modifiers of LD and LDG share the names .32, .64, .128, .S16, .U16, .S8, and .U8, but their mask positions differ. Modifiers therefore need to be handled separately for each specific instruction.
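The two examples in this paragraph (R5 and 0x9) can be reproduced with a few lines of code. The function name `operand_value` is hypothetical, and the sketch covers only register, predicate, and hexadecimal immediate names; floating-point immediates such as 1.5 and memory operands are left out.

```python
import re


def operand_value(name: str) -> int:
    """Return the integer embedded in an operand name, e.g. 'R5' -> 5, '0x9' -> 9."""
    reg = re.fullmatch(r"[RP](\d+)", name)
    if reg:
        return int(reg.group(1))
    if re.fullmatch(r"0x[0-9a-fA-F]+", name):
        return int(name, 16)
    raise ValueError(f"unsupported operand name: {name!r}")


r5_bits = format(operand_value("R5"), "b")    # binary digits of register number 5
imm_bits = format(operand_value("0x9"), "b")  # binary digits of immediate 0x9
```

The resulting digit strings match the values quoted in the text: 101 for R5 and 1001 for 0x9.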
The opcode solver algorithm of the present invention, shown in Fig. 3, is as follows:
The encodings of opcodes and modifiers cannot be determined from their names. Algorithm 1 shows the opcode solving process. Following the PTX (pseudo-assembly) documentation provided by NVIDIA, PTX code is written, compiled into cubin with ptxas, and then disassembled with cuobjdump; the resulting assembly code (the instMap variables) is the input to Algorithm 1 (line 1). The opcode solver works as follows. For each instruction line in the disassembly file, each bit of its 64-bit encoding is tested in turn (line 9), mainly by flipping that bit of the instruction (line 11) and then disassembling the result again with the disassembler nvdisasm (line 13). If the instruction name of the newly disassembled instruction differs from that of the original instruction (line 15), the bit represents the opcode and is stored in opBits. Once these bit positions are known, the opcode can be enumerated over its encoding space.
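The bit-flipping loop described here can be sketched as follows. The real procedure calls an external disassembler on each flipped encoding; in this sketch that step is replaced by a toy function, and `find_opcode_bits`, `toy_disassemble`, and the opcode table are all hypothetical names and values used only to make the example self-contained.

```python
from typing import Callable, List


def find_opcode_bits(encoding: int, disassemble: Callable[[int], str]) -> List[int]:
    """Return the bit positions whose flip changes the disassembled instruction name."""
    original_name = disassemble(encoding)
    op_bits = []
    for bit in range(64):
        flipped = encoding ^ (1 << bit)  # toggle exactly one bit
        if disassemble(flipped) != original_name:
            op_bits.append(bit)
    return op_bits


# Toy stand-in for the external disassembler: the name depends only on
# bits 63..54, and the table values are made up.
TOY_OPCODES = {0x101: "MOV", 0x202: "ADD"}


def toy_disassemble(word: int) -> str:
    return TOY_OPCODES.get(word >> 54, "UNKNOWN")


sample = (0x101 << 54) | (5 << 2)  # made-up encoding
op_bits = find_opcode_bits(sample, toy_disassemble)
```

With a real disassembler, a flipped encoding may be rejected outright rather than decoded under another name, so a practical version would also need to treat a failed disassembly as a distinct outcome.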
Modifier encodings are then obtained by writing different modifier combinations and verifying them. However, because the PTX documentation provided by NVIDIA is incomplete, this does not guarantee that every opcode encoding and modifier encoding will be found. Through Algorithm 1 we can find all instructions.
The operand solver algorithm of the present invention, shown in Fig. 4, is as follows:
Algorithm 2 is the solving procedure of the operand decoder. A dictionary (denoted the visited dictionary) is first created; its key is the name of the instruction containing the operand together with the operand types, and its value marks whether that operand has been decoded. The input is the input of the opcode solver. Because the operands and their number, order, and types depend on the specific instruction, we mark whether a group of operands has been tested, using the name of the specific instruction and the operand types (an array) as the key to query the visited dictionary. For each instruction in the disassembly file (line 4), the bits other than those of the operand are tested (line 8), again by flipping bits in the instruction encoding (line 10). wichChange returns which operand was modified, and these bits are then placed in the appropriate array. The entry <instruction, operand type> is set to 1 in the visited dictionary, indicating that it has been visited (line 18).
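A minimal sketch of this procedure is given below, under several assumptions: the disassembler is again a toy stand-in, opcode bits are assumed to be already known and are skipped, and a flipped bit is attributed to an operand only when exactly one operand changes and the instruction name stays the same. The names `which_change`, `solve_operand_bits`, and `toy_disassemble` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Set, Tuple

Decoded = Tuple[str, List[str]]          # (instruction name, operand strings)
Key = Tuple[str, Tuple[str, ...]]        # (instruction name, operand types)


def which_change(old: Sequence[str], new: Sequence[str]) -> Optional[int]:
    """Index of the single operand that differs, or None if zero or several differ."""
    if len(old) != len(new):
        return None
    changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
    return changed[0] if len(changed) == 1 else None


def solve_operand_bits(
    instructions: Iterable[Tuple[int, Sequence[str]]],
    op_bits: Set[int],
    disassemble: Callable[[int], Decoded],
) -> Dict[Key, List[List[int]]]:
    """For each unvisited <instruction, operand types> key, find each operand's bits."""
    visited: Dict[Key, int] = {}
    result: Dict[Key, List[List[int]]] = {}
    for encoding, operand_types in instructions:
        name, operands = disassemble(encoding)
        key = (name, tuple(operand_types))
        if visited.get(key):
            continue  # this instruction form was already solved
        positions: List[List[int]] = [[] for _ in operands]
        for bit in range(64):
            if bit in op_bits:
                continue  # assumption: opcode bits were found earlier
            new_name, new_operands = disassemble(encoding ^ (1 << bit))
            if new_name != name:
                continue
            index = which_change(operands, new_operands)
            if index is not None:
                positions[index].append(bit)
        result[key] = positions
        visited[key] = 1  # mark as visited
    return result


# Toy stand-in for the disassembler: destination in bits 9..2, source in 17..10.
def toy_disassemble(word: int) -> Decoded:
    name = {0x101: "MOV"}.get(word >> 54, "UNKNOWN")
    return name, [f"R{(word >> 2) & 0xFF}", f"R{(word >> 10) & 0xFF}"]


sample = (0x101 << 54) | (3 << 10) | (5 << 2)  # made-up encoding
solved = solve_operand_bits(
    [(sample, ["register", "register"])], set(range(54, 64)), toy_disassemble
)
```

Keying the dictionary on both the name and the operand types means that two forms of the same instruction, for example register and immediate variants, are solved separately.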
The modifier solver algorithm of the present invention, shown in Fig. 5, is as follows:
A modifier defines the specific behavior of an instruction. For example, LD has the type modifiers .U8, .S8, .U16, .32, .64, and .128, as well as the cache-operation modifiers .CS (cache streaming), .CG (cache at global level), and so on. Modifiers are more complex than opcodes: their positions span many bits and depend on the opcode. For example, the type modifiers of LD and LDG occupy different positions in the instruction. One solution is to XOR instructions bit by bit and observe whether the modifier changes (lines 6 to 13). After the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space (line 15). The next step is to determine the encoding of each specific modifier, such as .U8 or .S8, which corresponds to lines 20 to 29 of the code. First, the names of all modifiers are found (line 20). Then, for the name of a specific modifier, the intersection of all instructions carrying that modifier is found (lines 23 to 25). Finally, this encoding is XORed with the opcode encoding of the instruction, leaving only the encoding of the modifier.
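The XOR and intersection steps can be sketched as two small functions. This is one possible reading of the description rather than the patented algorithm itself: `modifier_mask` ORs together the XOR differences between variants of one instruction that differ only in their modifier, and `modifier_code` ANDs all encodings carrying a given modifier and then XORs away the opcode bits. The field positions in the demo (modifier at bit 44, register at bit 2) are made up.

```python
from functools import reduce
from typing import Iterable


def modifier_mask(variants: Iterable[int]) -> int:
    """Bits that vary across encodings differing only in their modifier."""
    variants = list(variants)
    base = variants[0]
    return reduce(lambda acc, enc: acc | (base ^ enc), variants[1:], 0)


def modifier_code(encodings_with_modifier: Iterable[int], opcode_encoding: int) -> int:
    """AND the encodings sharing a modifier, then XOR away the opcode bits."""
    common = reduce(lambda a, b: a & b, encodings_with_modifier)
    return common ^ opcode_encoding


# Toy layout for the demo only.
OPCODE = 0x101 << 54


def toy_encode(modifier: int, register: int) -> int:
    return OPCODE | (modifier << 44) | (register << 2)


# Same operands, three different modifier values: the varying bits form the mask.
mask = modifier_mask([toy_encode(0, 5), toy_encode(3, 5), toy_encode(4, 5)])

# Same modifier (3) with complementary operands: operand bits cancel in the AND.
code = modifier_code([toy_encode(3, 0x0F), toy_encode(3, 0xF0)], OPCODE)
```

The intersection isolates the modifier only if the sampled instructions vary enough in their operands; operand bits shared by every sample would otherwise survive the AND and be mistaken for modifier bits.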
The present invention also proposes a system for reverse-parsing GPU instructions, comprising:
a variable generation module, configured to compile GPU instructions to generate a compiled file, disassemble the compiled file to generate a disassembly file, and represent the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, operand, and corresponding operand type;
an encoding lookup module, configured to input the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the remaining encodings through the opcodes or modifiers already determined.
The decoding solver tests each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the resulting 64-bit encoding with the disassembler. If the instruction name of the newly disassembled instruction differs from that of the original instruction of the 64-bit encoding, the current bit of the 64-bit encoding represents the opcode; based on the bits found in this way, the opcode is enumerated over its encoding space.
The name and operand types of an instruction in the instMap variables are used as the key to query the visited dictionary. For each instruction in the instMap variables, the bits other than those of the operand are tested and the operand that was modified is returned; the entry <instruction, operand type> is then marked 1 in the visited dictionary, indicating that it has been visited.
The modifier is detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and observing whether the modifier changes. After the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space and the names of all modifiers are found. Then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found, and finally that encoding is XORed with the opcode encoding of the instructions carrying that modifier, yielding the encoding of the modifier.
The encoding of an operand is obtained from the name of the operand.

Claims (10)

1. A method for reverse-parsing GPU instructions, characterized by comprising:
Step 1: compiling GPU instructions to generate a compiled file; disassembling the compiled file to generate a disassembly file; and representing the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, instruction, operand, and corresponding operand type;
Step 2: inputting the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the corresponding encoding through the opcodes or modifiers already determined.
2. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the decoding solver tests each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the resulting 64-bit encoding with the disassembler; if the instruction name of the newly disassembled instruction differs from that of the original instruction of the 64-bit encoding, the current bit of the 64-bit encoding represents the opcode, and based on the bits found in this way the opcode is enumerated over its encoding space.
3. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the name and operand types of an instruction in the instMap variables are used as the key to query the visited dictionary; for each instruction in the instMap variables, the bits other than those of the operand are tested and the operand that was modified is returned; and the entry <instruction, operand type> is marked 1 in the visited dictionary, indicating that it has been visited.
4. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the modifier is detected by XORing the instructions in the instMap variables bit by bit and observing whether the modifier changes; after the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space and the names of all modifiers are found; then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found; and finally that encoding is XORed with the opcode encoding of the instructions carrying that modifier, yielding the encoding of the modifier.
5. The method for reverse-parsing GPU instructions according to claim 1, characterized in that the encoding of an operand is obtained from the name of the operand.
6. A system for reverse-parsing GPU instructions, characterized by comprising:
a variable generation module, configured to compile GPU instructions to generate a compiled file, disassemble the compiled file to generate a disassembly file, and represent the disassembly file, by means of an assembly parser, as instMap variables, where the variable types of the instMap variables include opcode, modifier, operand, and corresponding operand type;
an encoding lookup module, configured to input the instMap variables to a decoding solver, which determines the variable type of each instMap variable and looks up the remaining encodings through the opcodes or modifiers already determined.
7. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the decoding solver tests each bit of the 64-bit encoding of an instMap variable in turn and then disassembles the resulting 64-bit encoding with the disassembler; if the instruction name of the newly disassembled instruction differs from that of the original instruction of the 64-bit encoding, the current bit of the 64-bit encoding represents the opcode, and based on the bits found in this way the opcode is enumerated over its encoding space.
8. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the name and operand types of an instruction in the instMap variables are used as the key to query the visited dictionary; for each instruction in the instMap variables, the bits other than those of the operand are tested and the operand that was modified is returned; and the entry <instruction, operand type> is marked 1 in the visited dictionary, indicating that it has been visited.
9. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the modifier is detected by XORing the instructions corresponding to the encodings in the instMap variables bit by bit and observing whether the modifier changes; after the modifier encoding space of each instruction has been found, the modifiers are enumerated over that encoding space and the names of all modifiers are found; then, for the name of a given modifier, the intersection of all instructions carrying that modifier is found; and finally that encoding is XORed with the opcode encoding of the instructions carrying that modifier, yielding the encoding of the modifier.
10. The system for reverse-parsing GPU instructions according to claim 6, characterized in that the encoding of an operand is obtained from the name of the operand.
CN201611215249.XA 2016-12-26 2016-12-26 A kind of method and system of resolving inversely GPU instruction Active CN106843993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611215249.XA CN106843993B (en) 2016-12-26 2016-12-26 A kind of method and system of resolving inversely GPU instruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611215249.XA CN106843993B (en) 2016-12-26 2016-12-26 A kind of method and system of resolving inversely GPU instruction

Publications (2)

Publication Number Publication Date
CN106843993A (en) 2017-06-13
CN106843993B (en) 2019-07-30

Family

ID=59136263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611215249.XA Active CN106843993B (en) 2016-12-26 2016-12-26 A kind of method and system of resolving inversely GPU instruction

Country Status (1)

Country Link
CN (1) CN106843993B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558735A (en) * 2018-12-03 2019-04-02 杭州安恒信息技术股份有限公司 A kind of rogue program sample clustering method and relevant apparatus based on machine learning
CN109933327A (en) * 2019-02-02 2019-06-25 中国科学院计算技术研究所 OpenCL compiler method and system based on code fusion compiler framework
CN110096309A (en) * 2018-11-14 2019-08-06 上海寒武纪信息科技有限公司 Operation method, device, computer equipment and storage medium
CN110109657A (en) * 2019-03-29 2019-08-09 南京佑驾科技有限公司 A kind of GPU microcommand detection method
CN110489130A (en) * 2018-05-31 2019-11-22 北京数聚鑫云信息技术有限公司 A kind of client-based business datum extracting method and device
CN110716855A (en) * 2019-08-23 2020-01-21 中国科学院信息工程研究所 Processor instruction set testing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100023709A1 (en) * 2008-07-22 2010-01-28 International Business Machines Corporation Asymmetric double buffering of bitstream data in a multi-core processor
CN103049304A (en) * 2013-01-21 2013-04-17 中国人民解放军国防科学技术大学 Method for accelerating operating speed of graphics processing unit (GPU) through dead code removal
CN104156311A (en) * 2014-08-05 2014-11-19 北京控制工程研究所 Embedded type C language target code level unit testing method based on CPU simulator

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489130A (en) * 2018-05-31 2019-11-22 北京数聚鑫云信息技术有限公司 A kind of client-based business datum extracting method and device
CN110096309A (en) * 2018-11-14 2019-08-06 上海寒武纪信息科技有限公司 Operation method, device, computer equipment and storage medium
CN109558735A (en) * 2018-12-03 2019-04-02 杭州安恒信息技术股份有限公司 A kind of rogue program sample clustering method and relevant apparatus based on machine learning
CN109933327A (en) * 2019-02-02 2019-06-25 中国科学院计算技术研究所 OpenCL compiler method and system based on code fusion compiler framework
CN109933327B (en) * 2019-02-02 2021-01-08 中国科学院计算技术研究所 OpenCL compiler design method and system based on code fusion compiling framework
CN110109657A (en) * 2019-03-29 2019-08-09 南京佑驾科技有限公司 A kind of GPU microcommand detection method
CN110716855A (en) * 2019-08-23 2020-01-21 中国科学院信息工程研究所 Processor instruction set testing method and device
CN110716855B (en) * 2019-08-23 2021-05-14 中国科学院信息工程研究所 Processor instruction set testing method and device

Also Published As

Publication number Publication date
CN106843993B (en) 2019-07-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant