CN103049265B - A kind of processing method of flag bit in reverse anti-compiler - Google Patents

Info

Publication number
CN103049265B
Authority
CN
China
Prior art keywords
file
flag bit
statement
language
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210546092.4A
Other languages
Chinese (zh)
Other versions
CN103049265A (en
Inventor
刘金硕
郑稳
章喻龙
刘源
刘天晓
栗鹏
曾秋梅
邹斌
张智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201210546092.4A priority Critical patent/CN103049265B/en
Publication of CN103049265A publication Critical patent/CN103049265A/en
Application granted granted Critical
Publication of CN103049265B publication Critical patent/CN103049265B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Stored Programmes (AREA)

Abstract

The present invention relates to a method for processing flag bits in a decompiler. The first step connects a debugger to a computer and reads out the binary code of the target microprocessor. Most current microprocessors apply some form of protection: code can be read out and debugged this way before the security fuse is blown, but generally not afterwards. Methods of extracting binary code from a microprocessor are outside the scope of this document. The second step is disassembly, which uses the instruction format of the specific processor to turn the binary code into that processor's assembly code. The third step decompiles the assembly language into a high-level language. The invention gives decompilation a higher accuracy rate, and it is simple to operate and easy to understand.

Description

A method for processing flag bits in a decompiler
Technical field
The present invention relates to the processing of flag bits, and in particular to a method for processing flag bits in a decompiler.
Background
In today's society, science and technology advance rapidly, and embedded systems are widely used in mobile phones and all kinds of portable devices. Microprocessors of many kinds are also widely used in vehicles such as aircraft and automobiles, and even in military installations, deep-sea exploration, and space technology. The resulting problems of system maintainability, malicious code analysis, and system security and reliability, together with the reuse of legacy software, have driven research into software reverse engineering. Decompiling the code of microprocessors has therefore become imperative.
Whether reverse engineering is legal, and whether software can be protected against it, was the subject of legal disputes among software developers. These disputes were settled in the 1990s. Taking the United States, home of the software giants, as a reference: under US federal law, reverse engineering operations such as disassembly performed on copyrighted software are legal provided they are not used to develop a competing product or to obtain unlawful benefit [Pamela Samuelson 1990]. Reverse engineering is therefore legal as long as it is not carried out for economic gain or to compete.
Embedded microprocessors differ from the Intel and AMD CPUs common in personal computers. An embedded microprocessor typically uses its own assembly language to control the whole system; common examples are the Texas Instruments and Renesas families. When the disassembly output of a microprocessor is decompiled, flag bits always have to be handled. The main embedded microprocessor families at present include x86, Am186/88, ARM, MIPS, PowerPC, and 68K.
Summary of the invention
The technical problem addressed by the present invention is mainly solved by the following technical solution:
A method for processing flag bits in a decompiler, characterized by comprising the following steps:
Step 1: an initialization module reads in the input assembly language file;
Step 2: a flag bit identification module defines the flag bits that correspond to the specific format of the input file and stores them in flag bit file A;
Step 3: a flag processing module processes the assembly language that has been read in;
Step 4: after the whole assembly file has been processed as in step 3, the control and loop structures and the arrays are processed, which yields a high-level language file B that is comparatively easy to understand. File A is then added to the head of file B, producing high-level language code with more complete meaning.
The present invention covers the decompilation flow of an embedded microprocessor, which mainly comprises the following steps. The first step connects a debugger to a computer and reads out the binary code of the target microprocessor. Most current microprocessors apply some form of protection: code can be read out and debugged this way before the security fuse is blown, but generally not afterwards. Methods of extracting binary code from a microprocessor are outside the scope of this document. The second step is disassembly, which uses the instruction format of the specific processor to turn the binary code into that processor's assembly code. The third step decompiles the assembly language into a high-level language. The purpose of this patent is to provide a method for handling assembly language flag bits while the high-level language is generated.
Processing of flag bits in the decompiler: in general, the decompilation of an embedded microprocessor involves many steps. Overall, the assembly language is processed one step at a time, starting with one meaningful assembly statement. If, while that statement is processed, it turns out that it may affect the flag bits, four flag bits are generated and defined accordingly. During processing, a condition check is also added to each statement that can change a flag bit, and the flag bit is updated if necessary.
High-level language generation: decompiling an embedded microprocessor ultimately produces several files, which are then merged into a single file according to the corresponding rules. The flag bit definitions form one file; flag bits are normally defined as type int. The flag definition file, which can be named a.flag, is then concatenated with the other files produced during decompilation to generate the final high-level language code.
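The concatenation of the flag file with the decompiled body can be sketched as follows. This is a minimal illustration, not the patent's code: the helper names `flag_definitions` and `prepend_flags` are hypothetical, and the files are modelled as in-memory strings rather than files on disk.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: text of flag file A (a.flag) for a target whose
 * decompilation uses the N, Z and C flags, each defined as int. */
static const char *flag_definitions(void)
{
    return "int N=0;\nint Z=0;\nint C=0;\n";
}

/* Hypothetical helper: writes file A followed by file B into out.
 * Returns 0 on success, -1 if out is too small for the result. */
static int prepend_flags(const char *b_text, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", flag_definitions(), b_text);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling `prepend_flags("R15=R15+R14;\n", out, sizeof out)` leaves the three flag definitions at the head of the output, followed by the translated statement.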
In the above method for processing flag bits in a decompiler, step 3 specifically comprises: with reference to the actual meaning of the assembly statement and its effect on the flag bits, expressing the statement in suitable corresponding high-level language.
In the above method for processing flag bits in a decompiler, step 4 is divided into five sub-steps: array processing; switch statement processing; variable processing; loop control statement processing; and comprehensive processing.
The present invention therefore gives decompilation a higher accuracy rate, and it is simple to operate and easy to understand.
Description of the drawings
Fig. 1 is the general decompilation flowchart of an embedded microprocessor, which mainly comprises three stages: obtaining the machine code, disassembly, and decompilation.
Fig. 2 shows the flag bit processing for an embedded microprocessor, which is one step of decompilation; see Table 1 for the processing rules.
Fig. 3a is a schematic diagram of the input source file (C language) in the embedded microprocessor program example.
Fig. 3b is a schematic diagram of the assembly produced by compiling the C code in the embedded microprocessor program example.
Fig. 3c is a schematic diagram of the output high-level language in the embedded microprocessor program example.
Fig. 4 shows the form that a switch...case statement takes in assembly.
Fig. 5 is a schematic diagram of case statement processing.
Fig. 6 is a schematic diagram of the detailed case statement information.
Fig. 7 is a schematic diagram of the assembly form of a while statement.
Fig. 8 is a schematic diagram of the file data read in.
Detailed description
The technical solution of the present invention is described in further detail below through an embodiment and with reference to the accompanying drawings.
Embodiment:
1. Read in the input assembly language file. With reference to Fig. 2, the program starts and then reads in the input assembly language.
2. According to the specific format of the input file, for example MSP430 or M16C, define the corresponding flag bits and store them in flag bit file A. Taking MSP430 as an example (the overflow bit is not involved in this decompilation, so it is not processed), this gives:
int N=0;
int Z=0;
int C=0;
In Fig. 2, if the data read in may affect the flag bits, the statement is processed in two steps: the first step processes the statement itself, and the second step processes the flag bits that may be affected.
3. Read in and process one meaningful assembly statement at a time. For example, a bare MOV is not a meaningful assembly statement, whereas MOV A, B; is. With reference to Table 1, if the meaningful statement affects the flag bits, it is processed according to Table 1, for example:
add.w R14,R15
According to Table 1, this can be translated as follows (the overflow bit is not involved in this decompilation, so it is not processed):
int R14,R15;
R15=R15+R14;
if(R15<0)
N=1;
else
N=0;
if(R15==0)
Z=1;
else
Z=0;
Because embedded microprocessors differ in structure and in the word length used to store data, different tests are needed. In general the test is based on the storage word length: for example, if data are stored in 8 bits and the result of the operation lies between -255 and 255, there is no carry or borrow; otherwise there is a carry or borrow. Taking MSP430 as an example, the word length is 16 bits.
if(-32768<=R15 && R15<=32767)
C=0;
else
C=1;
Overflow is likewise judged from the storage format of the data. Table 1 shows which MSP430 statements affect the flag bits and which statements need the flag bits in order to be decompiled.
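For comparison with the range test above, the following sketch derives N, Z and C for a 16-bit addition using fixed-width types. It is an illustration under stated assumptions, not the translation the patent generates: the function name `add_w` is hypothetical, the registers are modelled as `uint16_t` values, and the flags follow the usual hardware convention for ADD (N is bit 15 of the result, Z marks a zero result, C marks a carry out of bit 15).

```c
#include <stdint.h>

/* Flag variables, named as in flag file A of the text. */
static int N = 0, Z = 0, C = 0;

/* Sketch: emulate "add.w src,dst" on 16-bit values and update N, Z, C.
 * The sum is formed in 32 bits so that the carry out of bit 15 stays visible. */
static uint16_t add_w(uint16_t src, uint16_t dst)
{
    uint32_t wide = (uint32_t)src + (uint32_t)dst;
    uint16_t result = (uint16_t)wide;   /* keep the low 16 bits */

    N = (result & 0x8000u) ? 1 : 0;     /* sign bit of the 16-bit result */
    Z = (result == 0) ? 1 : 0;          /* result is zero */
    C = (wide > 0xFFFFu) ? 1 : 0;       /* carry out of bit 15 */
    return result;
}
```

Using a wider intermediate type avoids depending on the width of `int`, which differs between the host compiler and a 16-bit target.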
4. Process the whole assembly file as in step 3 and then carry out the other necessary analyses, such as data structure analysis, to obtain a high-level language file B that is comparatively easy to understand. Adding file A to the head of file B produces high-level language code with more complete meaning. In Fig. 3, panel a shows the original C file, panel b shows the assembly file produced after compiling it for MSP430, and panel c shows the file after decompilation. This step is divided into the following sub-steps:
4.1. Array processing.
If a mova instruction is encountered while the assembly is processed, it indicates an array; the corresponding array can then be extracted with reference to the data structure defined for this purpose.
First, a structure must be defined to describe the details of an array in the assembly:
Here arrayInfo is the name of the structure, which contains four pieces of information. The first is the array name, name[length]: when an array is found while the assembly file is processed, the array is processed and its name is stored in this member. The next member, char iniAddress[length], stores the initial address of the array that was encountered.
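The structure definition itself is not reproduced in this text. A possible shape, given only as a sketch, is shown below. The members `name` and `iniAddress` come from the description; `elementCount` and `elementSize` are hypothetical placeholders for the two members the text does not name, and `LENGTH` and `array_info_init` are likewise assumptions.

```c
#include <stdio.h>

#define LENGTH 40  /* assumed buffer width; the text only calls it "length" */

struct arrayInfo {
    char name[LENGTH];        /* array name assigned during processing (from the text) */
    char iniAddress[LENGTH];  /* initial address of the array (from the text) */
    int  elementCount;        /* hypothetical: number of elements */
    int  elementSize;         /* hypothetical: element size in bytes */
};

/* Hypothetical helper: records a newly found array; unknown fields start at 0. */
static void array_info_init(struct arrayInfo *info, const char *name,
                            const char *address)
{
    snprintf(info->name, LENGTH, "%s", name);
    snprintf(info->iniAddress, LENGTH, "%s", address);
    info->elementCount = 0;
    info->elementSize = 0;
}
```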
4.2. switch statement processing.
After the first step, array processing, comes the second step, switch statement processing. Arrays are processed before switch statements because an array may appear inside a switch statement, so this order must be followed.
First consider the form that a switch...case statement takes in assembly language, shown in Fig. 4. With reference to this function, the processing can be implemented in several steps:
4.2.1. Define two file arrays:
char fileArray[Max][Length];// file storage array
char swQueue[switchMax][Length];//case array
The swQueue array stores the options of a switch statement; its first dimension is the maximum number of case branches in a switch...case statement. The macro definitions used here can also be found in Table 3-6. The file array can be filled with a do...while loop: each time a space is encountered in the file, the character string that follows is saved in a new row of the array. The specific code is as follows:
The code above calls the function IsKeyWord() in its condition. This function checks whether a character read from the file is meaningful; it is described as follows:
As can be seen, the function returns true if the input character is one that can occur in assembly language, and false otherwise.
In summary, a do...while loop controls the reading of file characters, one character at a time. When the character read is a space, the character array advances by one unit and reading continues. This is how the assembly file is read into the character array.
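The reading procedure described above can be sketched as follows. The original listing is not reproduced in this text, so this is an assumption-laden illustration: `is_keyword_char` is a hypothetical stand-in for IsKeyWord(), the accepted character set is a guess, `MAX_ROWS` and `ROW_LENGTH` stand in for the macros Max and Length, the input is a string rather than a file, and a `for` loop replaces the do...while loop.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ROWS   64  /* stands in for Max (assumed value) */
#define ROW_LENGTH 40  /* stands in for Length (assumed value) */

/* Hypothetical stand-in for IsKeyWord(): 1 if ch may occur in an assembly token. */
static int is_keyword_char(int ch)
{
    if (ch == '\0')
        return 0;
    return isalnum((unsigned char)ch) || strchr("#.:_@&+-()", ch) != NULL;
}

/* Splits text at whitespace into rows padded with '$'; returns the row count. */
static int split_tokens(const char *text, char fileArray[MAX_ROWS][ROW_LENGTH])
{
    int row = 0, col = 0;

    memset(fileArray, '$', (size_t)MAX_ROWS * ROW_LENGTH);
    for (const char *p = text; *p != '\0' && row < MAX_ROWS; ++p) {
        if (isspace((unsigned char)*p)) {
            if (col > 0) {          /* a token just ended: move to the next row */
                ++row;
                col = 0;
            }
        } else if (is_keyword_char(*p) && col < ROW_LENGTH) {
            fileArray[row][col++] = *p;
        }
    }
    return (col > 0) ? row + 1 : row;
}
```

For the input `"0A0BE8 MOV.W:Q #1 R1\n"` the sketch fills four rows, each padded with `$` to the row width.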
Assume that the file read in has only two lines, as shown in Fig. 8.
After the above processing, the fileArray array holds 8 items:
fileArray[0][Length]=
{'0','A','0','B','E','8','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[1][Length]=
{'M','O','V','.','W',':','Q','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[2][Length]=
{'#','1','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[3][Length]=
{'R','1','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[4][Length]=
{'M','O','V','.','W',':','Q','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[5][Length]=
{'#','2','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[6][Length]=
{'#','2','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
fileArray[7][Length]=
{'A','1','0','B','E','8','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$','$'};
A brief explanation of the data above: the array is initialized with the dollar sign "$" because, among the ASCII characters, this symbol has little practical meaning in C programming, which makes it well suited to initializing the array.
4.2.2. Define four character arrays:
char JSR_A[Length];
char DB[Length];
char JMP_B[Length];
char JMP_S[Length];
These arrays are matched against the data in the file character array fileArray, because these four assembly instructions indicate a switch...case statement. When the match succeeds, the assembly file is known to contain a switch...case statement.
4.2.3. The concrete implementation is described in pseudocode. Before that, we introduce a function we define, ArrayLength(), which may be used in the description below. As its name suggests, ArrayLength returns the length of an array, but it differs from an ordinary length function. See the definition:
This function is very simple: if the array contains characters other than "$", it returns the number of such characters; otherwise it returns 0.
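The behaviour described for ArrayLength can be sketched as below; the original definition is not reproduced in this text. The names `array_length` and `row_matches` and the value of `ROW_LENGTH` are assumptions, and the mnemonic used in the usage note is only an example, since the text does not give the exact strings held in JSR_A, DB, JMP_B and JMP_S.

```c
#include <string.h>

#define ROW_LENGTH 40  /* stands in for Length (assumed value) */

/* Counts the characters in a row that are not the '$' filler. */
static int array_length(const char row[ROW_LENGTH])
{
    int count = 0;
    for (int i = 0; i < ROW_LENGTH; ++i) {
        if (row[i] != '$')
            ++count;
    }
    return count;
}

/* Hypothetical helper: 1 if the row holds exactly the given mnemonic. */
static int row_matches(const char row[ROW_LENGTH], const char *mnemonic)
{
    size_t n = strlen(mnemonic);
    return n <= ROW_LENGTH
        && (int)n == array_length(row)
        && memcmp(row, mnemonic, n) == 0;
}
```

A row filled only with `$` has length 0, and a row holding `JMP.B` followed by filler matches `row_matches(row, "JMP.B")` but not `row_matches(row, "JMP")`.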
switch statement processing module: taking Fig. 5 as an example, the case statements are marked in detail with black boxes in Fig. 5.
From the figure, the following information can be extracted:
case 1, case 3, case 5, case 8, case 0x16.
With reference to the information that follows, shown in Fig. 6, it is clear that the case statements correspond one to one with JMP statements, so the case statements can be resolved easily. The switch...case statement of this example can therefore be translated as follows:
In fact switch(R0) need not change: in the assembly language, every switch statement begins with switch(R0).
Finally, all branches return.
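The translated listing is not reproduced in this text. As a rough sketch of what such output could look like, the generator below emits a switch(R0) skeleton for a list of case values such as 1, 3, 5, 8 and 0x16. The function name `emit_switch`, the hexadecimal label format and the placeholder comment in each branch are assumptions; the real branch bodies would come from the code at each JMP target.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a switch(R0) skeleton with one branch per case value into out.
 * Returns 0 on success, -1 if out is too small. */
static int emit_switch(const unsigned *cases, int count, char *out, size_t out_size)
{
    size_t used = 0;
    int n = snprintf(out, out_size, "switch(R0) {\n");

    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used += (size_t)n;

    for (int i = 0; i < count; ++i) {
        n = snprintf(out + used, out_size - used,
                     "case 0x%X: /* code at the matching JMP target */ break;\n",
                     cases[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, "}\n");
    if (n < 0 || (size_t)n >= out_size - used)
        return -1;
    return 0;
}
```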
4.3. Variable processing.
The main purpose of variable processing is to eliminate variables that were defined again in the previous steps. The idea is simple: each time a variable is read, all later definitions of it are eliminated, so processing is complete when the end of the file is reached.
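One way to picture this pass is the sketch below, which keeps the first occurrence of each declaration line and drops later repeats. It is an illustration only: the function name `remove_redefinitions` is hypothetical, the generated file is modelled as an array of statement strings, and a declaration is recognised simply by the prefix `int `.

```c
#include <string.h>

/* Removes repeated declaration lines in place, keeping the first occurrence.
 * Non-declaration statements are always kept. Returns the new line count. */
static int remove_redefinitions(const char **lines, int count)
{
    int kept = 0;

    for (int i = 0; i < count; ++i) {
        int duplicate = 0;

        if (strncmp(lines[i], "int ", 4) == 0) {
            for (int j = 0; j < kept; ++j) {
                if (strcmp(lines[j], lines[i]) == 0) {
                    duplicate = 1;
                    break;
                }
            }
        }
        if (!duplicate)
            lines[kept++] = lines[i];
    }
    return kept;
}
```

Translating two `add.w R14,R15` statements in a row would produce `int R14,R15;` twice; after this pass only the first declaration remains.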
4.4. while statement processing (loop control statement processing).
while statement processing is relatively complicated. Its call relation in the main program file is as follows:
The third function is the one that processes while statements. A return value of 1 from the third function means that a while statement was processed in that call; a return value of 0 means that no while statement was processed, and execution continues.
To process a while statement, first look at how a while statement is expressed in assembly:
The following code is the assembly language for a while statement:
As Fig. 7 shows, for a while statement the assembly inside the box is examined first.
It can be seen that the test requires the JMP, CMP, and JNZ instructions. When these instructions appear together and the addresses that follow them are related to one another, the code can be judged to be a while statement. It can therefore be translated as:
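The recognition rule can be sketched as follows. Fig. 7 is not reproduced in this text, so the layout assumed here is a guess at a common compiler pattern: a JMP to the loop condition, the loop body, then a CMP followed by a JNZ that jumps back to the first body instruction. The `struct Insn` representation and the function names are hypothetical.

```c
#include <string.h>

/* Hypothetical decoded instruction: address, mnemonic, jump target (0 if none). */
struct Insn {
    unsigned addr;
    const char *op;
    unsigned target;
};

/* Returns the index of the instruction at addr, or -1 if there is none. */
static int find_by_address(const struct Insn *code, int count, unsigned addr)
{
    for (int i = 0; i < count; ++i) {
        if (code[i].addr == addr)
            return i;
    }
    return -1;
}

/* Returns 1 if code[i] is a JMP that opens the assumed while pattern. */
static int is_while_at(const struct Insn *code, int count, int i)
{
    if (i < 0 || i + 1 >= count || strcmp(code[i].op, "JMP") != 0)
        return 0;

    int cmp = find_by_address(code, count, code[i].target);
    if (cmp <= i || cmp + 1 >= count)
        return 0;                       /* the condition must lie after the JMP */
    if (strcmp(code[cmp].op, "CMP") != 0)
        return 0;
    if (strcmp(code[cmp + 1].op, "JNZ") != 0)
        return 0;

    /* the JNZ must return to the instruction right after the opening JMP */
    return code[cmp + 1].target == code[i + 1].addr;
}
```

The address checks are what the text calls the addresses being related to one another: without them, any unrelated JMP, CMP and JNZ in the same file would be mistaken for a loop.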
4.5. if statement processing (comprehensive processing).
An if statement is identified mainly by checking whether a CMP is present and whether a conditional jump follows it; if so, the code is judged to be an if control statement. The code in the annex also contains the if control statements generated when the flag bits are processed.
An example of an if statement is as follows:
In general, an if statement also has a relatively fixed condition pattern:
a CMP followed by a statement such as JLT or JZ. It can then be translated as above.
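The pattern of a CMP followed by a conditional jump can be checked with a sketch like the one below. The function names are hypothetical, the instruction stream is modelled as an array of mnemonic strings, and the jump list is an assumption: it contains JLT and JZ from the text plus several conditional jump mnemonics commonly used for MSP430.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if op is one of the assumed conditional jump mnemonics. */
static int is_conditional_jump(const char *op)
{
    static const char *const jumps[] = {
        "JLT", "JZ", "JNZ", "JEQ", "JNE", "JGE", "JL", "JC", "JNC", "JN"
    };

    for (size_t i = 0; i < sizeof jumps / sizeof jumps[0]; ++i) {
        if (strcmp(op, jumps[i]) == 0)
            return 1;
    }
    return 0;
}

/* Returns 1 if ops[i] is a CMP directly followed by a conditional jump. */
static int is_if_at(const char *const *ops, int count, int i)
{
    return i >= 0 && i + 1 < count
        && strcmp(ops[i], "CMP") == 0
        && is_conditional_jump(ops[i + 1]);
}
```

Because a while loop also ends in CMP and JNZ, this check is only meaningful after the while pass of section 4.4 has already claimed its instructions.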
The following is Table 1: flag bit processing for an embedded microprocessor (taking the Texas Instruments MSP430 as an example; the two operands of a two-operand instruction are denoted RA and RB, and the operand of a single-operand instruction is denoted RC; V, N, Z, and C are the four flag bits, representing the overflow bit, negative bit, zero bit, and carry bit respectively)
In the status bit columns, "*" means affected, "-" means not affected, "0" means reset, and "1" means set. Instructions with .B are single-byte operations, and instructions with [.W] are two-byte operations (the suffix can be omitted).
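Table 1 itself is not reproduced in this text. The sketch below only shows how a table in the notation just described could be held in code. The rows are a small sample based on the editor's understanding of the MSP430 instruction set, not a copy of Table 1, and should be checked against the Texas Instruments family user's guide; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One row in the notation of the text: '*' affected, '-' not affected,
 * '0' reset, '1' set. Column order V, N, Z, C. */
struct FlagEffect {
    const char *mnemonic;
    char v, n, z, c;
};

/* Sample rows only; to be verified against the processor documentation. */
static const struct FlagEffect effects[] = {
    {"MOV",  '-', '-', '-', '-'},
    {"ADD",  '*', '*', '*', '*'},
    {"CMP",  '*', '*', '*', '*'},
    {"AND",  '0', '*', '*', '*'},
    {"BIS",  '-', '-', '-', '-'},
    {"CLRC", '-', '-', '-', '0'},
    {"SETC", '-', '-', '-', '1'},
};

/* Returns the row for a mnemonic, or NULL if it is not in the sample table. */
static const struct FlagEffect *lookup_effect(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof effects / sizeof effects[0]; ++i) {
        if (strcmp(effects[i].mnemonic, mnemonic) == 0)
            return &effects[i];
    }
    return NULL;
}

/* Returns 1 if the instruction changes any flag, 0 if none, -1 if unknown. */
static int touches_flags(const char *mnemonic)
{
    const struct FlagEffect *e = lookup_effect(mnemonic);

    if (e == NULL)
        return -1;
    return e->v != '-' || e->n != '-' || e->z != '-' || e->c != '-';
}
```

A decompiler pass could call `touches_flags` for each statement to decide whether the second of the two processing steps, the flag update, is needed.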
The specific embodiment described herein merely illustrates the spirit of the present invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (1)

1. A method for processing flag bits in a decompiler, characterized by comprising the following steps:
Step 1: an initialization module reads in the input assembly language file;
Step 2: a flag bit identification module defines the flag bits that correspond to the specific format of the input file and stores them in flag bit file A; MSP430 or M16C is selected, the corresponding flag bits are defined and stored in flag bit file A, and for MSP430 they are:
int N=0;
int Z=0;
int C=0;
If the input assembly language file read in may affect the flag bits, it is processed in two steps: the first step processes the corresponding statement, and the second step processes the flag bits that may be affected;
Step 3: a processing module processes the assembly language that has been read in; step 3 specifically comprises: with reference to the actual meaning of the assembly statement and its effect on the flag bits, expressing the statement in suitable corresponding high-level language;
Step 4: after the whole assembly file has been processed as in step 3, the control and loop structures and the arrays are processed to obtain high-level language file B, and file A is then added to the head of file B to produce high-level language code with more complete meaning;
Step 4 is divided into five sub-steps: array processing; switch statement processing; variable processing; loop control statement processing; and comprehensive processing.
CN201210546092.4A 2012-12-14 2012-12-14 A kind of processing method of flag bit in reverse anti-compiler Expired - Fee Related CN103049265B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210546092.4A CN103049265B (en) 2012-12-14 2012-12-14 A kind of processing method of flag bit in reverse anti-compiler

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210546092.4A CN103049265B (en) 2012-12-14 2012-12-14 A kind of processing method of flag bit in reverse anti-compiler

Publications (2)

Publication Number Publication Date
CN103049265A CN103049265A (en) 2013-04-17
CN103049265B true CN103049265B (en) 2016-12-28

Family

ID=48061917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210546092.4A Expired - Fee Related CN103049265B (en) 2012-12-14 2012-12-14 A kind of processing method of flag bit in reverse anti-compiler

Country Status (1)

Country Link
CN (1) CN103049265B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108958739B (en) * 2018-06-06 2020-11-10 北京大学 Method and system for recovering array data structure in binary decompilation
CN111935622B (en) * 2020-08-03 2022-02-11 深圳创维-Rgb电子有限公司 Debugging method, device, equipment and storage medium for electronic equipment with digital power amplifier

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101364253A (en) * 2007-08-06 2009-02-11 电子科技大学 Covert debug engine and method for anti-worm
CN101714118A (en) * 2009-11-20 2010-05-26 北京邮电大学 Detector for binary-code buffer-zone overflow bugs, and detection method thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
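Design and optimization techniques for a reusable instruction set simulator; 韩小琨 et al.; Computer Engineering (《计算机工程》); 20080430; vol. 34, no. 7; pp. 61-63 *
Software reverse analysis method based on IDA-Pro; 秦青文 et al.; Computer Engineering (《计算机工程》); 20081130; vol. 34, no. 22; pp. 86-88 *
Research on key techniques of code reverse analysis for the ARM architecture; 殷文建; China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》); 20120315; main text, chapters 4-5 *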

Also Published As

Publication number Publication date
CN103049265A (en) 2013-04-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20161228

Termination date: 20181214