CN115809442A - Method, device and equipment for obfuscating reverse code and readable storage medium - Google Patents

Method, device and equipment for obfuscating reverse code and readable storage medium

Info

Publication number
CN115809442A
Authority
CN
China
Prior art keywords
code
processed
file
obfuscation
information corresponding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211466953.8A
Other languages
Chinese (zh)
Inventor
王一龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Quwan Network Technology Co Ltd
Original Assignee
Guangzhou Quwan Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Quwan Network Technology Co Ltd
Priority to CN202211466953.8A
Publication of CN115809442A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The application provides an anti-reverse code obfuscation method, an anti-reverse code obfuscation device and a readable storage medium. When code written in different programming languages is processed, compatibility processing is first performed on the code to be processed to obtain target code, and code obfuscation is then performed on the target code, which solves the problem that code obfuscation means have low compatibility.

Description

Method, device and equipment for obfuscating reverse code and readable storage medium
Technical Field
The present application relates to the field of information security technologies, and in particular, to an anti-reverse code obfuscation method, apparatus, device, and readable storage medium.
Background
With the rapid development of science and technology, the functions of intelligent terminals have become increasingly diverse, and various functions can be realized by installing application programs on different types of intelligent terminals. However, the code of many application programs can be obtained and modified with reverse-engineering tools, which easily causes security problems such as damage to the application program and leakage of user data. Developers therefore usually obfuscate the code of an application program, turning it into meaningless words or sentences, to improve the security of the application program.
The existing code obfuscation means is generally to use a scripting language to write an automation tool, and use a specific replacement rule to perform batch text obfuscation replacement on a code file in an application program. However, the existing code obfuscation means can only obfuscate the content written in the corresponding programming language in the application program, and if there are a plurality of codes written in different programming languages in the application program, different code obfuscation means are usually required to process the application program codes written in different programming languages, which causes a problem of low compatibility of the code obfuscation means.
Disclosure of Invention
In view of the above, the present application provides an anti-reverse code obfuscating method, device, apparatus, and readable storage medium, which are used to solve the technical defects that in the prior art, code obfuscating means has low compatibility and cannot efficiently obfuscate codes written in different programming languages in the same application program.
In order to achieve the above object, the following solutions are proposed:
An anti-reverse code obfuscation method, comprising:
acquiring a file to be processed;
converting the codes to be processed in the files to be processed into abstract syntax trees corresponding to the codes to be processed;
analyzing an abstract syntax tree corresponding to the code to be processed, and determining code information corresponding to the code to be processed;
performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule to obtain compatible code information corresponding to the code to be processed;
and according to the compatible code information corresponding to the code to be processed, performing code obfuscation processing on the code to be processed according to a preset code obfuscation rule to obtain an obfuscation result corresponding to the code to be processed.
Preferably, the acquiring the file to be processed includes:
acquiring a file path of the file to be processed according to a preset path file;
and acquiring the file to be processed according to the file path of the file to be processed.
Preferably, the converting the to-be-processed code in the to-be-processed file into the abstract syntax tree corresponding to the to-be-processed code includes:
determining the code to be processed in the file to be processed;
and converting the code to be processed to obtain an abstract syntax tree corresponding to the code to be processed.
Preferably, the converting the code to be processed to obtain an abstract syntax tree corresponding to the code to be processed includes:
performing word segmentation operation on the code to be processed to obtain a word segmentation set;
according to the word segmentation set, performing lexical analysis and syntactic analysis on the code to be processed to generate each node corresponding to the code to be processed;
and combining all the nodes according to the code hierarchical relation corresponding to the code to be processed to obtain the abstract syntax tree corresponding to the code to be processed.
Preferably, the performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule to obtain compatible code information corresponding to the code to be processed includes:
determining a programming language used by the code to be processed;
and selecting a code compatibility rule corresponding to the code to be processed to process the code information corresponding to the code to be processed according to a programming language used by the code to be processed to obtain compatible code information corresponding to the code to be processed.
Preferably, after the converting the to-be-processed code in the to-be-processed file into the abstract syntax tree corresponding to the to-be-processed code, before the parsing the abstract syntax tree corresponding to the to-be-processed code and determining the code information corresponding to the to-be-processed code, the method further includes:
determining an abstract syntax tree corresponding to the code to be processed;
and caching the abstract syntax tree corresponding to the code to be processed into a local file.
Preferably, the code obfuscating the code to be processed according to a preset code obfuscating rule according to the compatible code information corresponding to the code to be processed to obtain an obfuscated result corresponding to the code to be processed includes:
determining a target code obfuscation rule from the preset code obfuscation rules;
and performing code obfuscation processing on the code to be processed by using the target code obfuscation rule according to the compatible code information corresponding to the code to be processed to obtain an obfuscation result corresponding to the code to be processed.
An anti-reverse code obfuscation device comprising:
the file acquisition module is used for acquiring a file to be processed;
the code conversion module is used for converting the codes to be processed in the files to be processed into abstract syntax trees corresponding to the codes to be processed;
the code information extraction module is used for analyzing the abstract syntax tree corresponding to the code to be processed and determining the code information corresponding to the code to be processed;
the compatible processing module is used for carrying out compatible processing on the code information corresponding to the code to be processed according to a preset code compatible processing rule to obtain compatible code information corresponding to the code to be processed;
and the code obfuscating module is used for performing code obfuscation processing on the code to be processed according to compatible code information corresponding to the code to be processed and a preset code obfuscating rule to obtain an obfuscating result corresponding to the code to be processed.
An anti-reverse code obfuscation device comprising: one or more processors, and a memory;
the memory has stored therein computer readable instructions which, when executed by the one or more processors, carry out the steps of any of the anti-reverse code obfuscation methods described above.
A readable storage medium having stored therein computer readable instructions, which, when executed by one or more processors, cause the one or more processors to carry out the steps of the anti-reverse code obfuscation method as in any one of the above-introduced.
According to the above technical solution, the method provided by the embodiments of the present application first obtains the file to be processed, from which the code to be processed can be obtained. The code to be processed is then converted into an abstract syntax tree, and the code information corresponding to the code to be processed is extracted by parsing the abstract syntax tree. Compatibility processing is then performed on the code information according to a preset code compatibility processing rule to obtain compatible code information, and the code to be processed is obfuscated according to the compatible code information and a preset code obfuscation rule to obtain the obfuscation result corresponding to the code to be processed. Because the code information corresponding to code written in different programming languages is made compatible before obfuscation, the method solves the problem that existing code obfuscation means can only obfuscate code written in a single programming language and therefore have low compatibility.
Drawings
In order to more clearly illustrate the method provided by the embodiments of the present application or the technical solutions in the prior art, the drawings that may be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings may be obtained according to these drawings without inventive labor.
Fig. 1 is a flowchart of an anti-reverse code obfuscation method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a compiler design architecture according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an anti-reverse code obfuscation apparatus according to an embodiment of the present application;
Fig. 4 is a block diagram of the hardware structure of an anti-reverse code obfuscation device disclosed in an embodiment of the present application.
Detailed Description
The technical solutions provided in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In view of the fact that most existing code obfuscation schemes have difficulty obfuscating code written in different programming languages at the same time, the present application provides an anti-reverse code obfuscation scheme. The scheme performs compatibility processing on code written in different programming languages before obfuscating it, so that a single obfuscation scheme can obfuscate code written in multiple programming languages.
The methods provided by the embodiments of the present application are operational with numerous general purpose or special purpose computing device environments or configurations. For example: personal cell phone terminals, tablet-type devices, multi-processor devices, computing environments that include any of the above devices or equipment, and so forth.
The embodiment of the application provides an anti-reverse code obfuscation method, which can be applied to various computer terminals or intelligent terminals, and the execution main body of the method can be a processor or a server of the computer terminal or the intelligent terminal.
The flow of the anti-reverse code obfuscation method according to an embodiment of the present application is described below with reference to Fig. 1. As shown in Fig. 1, the flow may include the following steps:
Step S100, acquiring a file to be processed.
Specifically, in an actual application process, the application program may include various different types of files, the core logic function of the application program may be implemented by code written by a programming language, and the different types of files of the application program may be written by using different programming languages.
Reverse-engineering techniques currently exist by which an application program can be cracked and its code obtained. Cracking an application program in this way may damage the application program and leak user data.
To ensure the security and stability of application programs, developers usually improve application security through various anti-reverse techniques.
Among these, the most common anti-reverse technique is code obfuscation.
Once the code in an application program has been obfuscated, even if the code is obtained through reverse engineering, the obtained content is meaningless, and it is difficult to extract user information or the execution logic of the application program from it. Code obfuscation therefore helps ensure data security and application stability.
To implement the function of obfuscating the code in the application program, a file to be processed including the code may be obtained first.
Acquiring the file to be processed makes it possible to extract the application code from it and process that code in a series of steps so that the code is obfuscated, which improves the security and stability of the application program.
Step S110, converting the codes to be processed in the files to be processed into abstract syntax trees corresponding to the codes to be processed.
Specifically, as can be seen from the above description, the file to be processed including the application program code can be acquired through step S100.
Since the code of the application program is contained in the file to be processed at this time, the code to be processed may first be extracted from the file to be processed.
After the code to be processed is obtained, it may be converted into its corresponding abstract syntax tree so that it can be analyzed more easily. The conversion facilitates determining the code information of the code to be processed, so that code obfuscation can be performed subsequently.
Step S120, parsing the abstract syntax tree corresponding to the code to be processed, and determining the code information corresponding to the code to be processed.
Specifically, an abstract syntax tree is an abstract representation of the syntax structure of code, expressed in tree form. The abstract syntax tree may include code information such as the path, position, type, name, and associations of the code.
Accordingly, after the abstract syntax tree corresponding to the code to be processed is determined, it may be parsed to determine the code information corresponding to the code to be processed.
Determining this code information facilitates the subsequent compatibility processing of the code information.
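The extraction of code information from an abstract syntax tree can be illustrated with a minimal sketch. It assumes Python's standard `ast` module as a stand-in for the syntax trees discussed in this application; the record fields (`path`, `kind`, `name`, `line`, `col`) and the function name `collect_code_info` are hypothetical and chosen only for illustration.

```python
import ast

SOURCE = '''
def greet(user_name):
    message = "hello " + user_name
    print(message)
'''


def collect_code_info(source, path="<memory>"):
    """Walk the syntax tree and record the kind, name, and position of each node of interest."""
    records = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            kind, name = "function", node.name
        elif isinstance(node, ast.arg):
            kind, name = "parameter", node.arg
        elif isinstance(node, ast.Name):
            kind, name = "variable", node.id
        else:
            continue  # other node types are ignored in this sketch
        records.append({"path": path, "kind": kind, "name": name,
                        "line": node.lineno, "col": node.col_offset})
    return records


code_info = collect_code_info(SOURCE)
```

Each record carries the kind of information listed above (path, position, type, name), which later steps could consume without re-reading the source text.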
Step S130, performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule, to obtain compatible code information corresponding to the code to be processed.
Specifically, as can be seen from the above description, in step S120, the code information corresponding to the code to be processed can be obtained by parsing the abstract syntax tree corresponding to the code to be processed.
Further, each programming language has its own characteristics. An application program may contain code written in several different programming languages to meet different business requirements, and code that implements the same function may use different names in different programming languages.
For example, to output information to the console, the Objective-C, C, and Python languages use the following functions, respectively:
Objective-C language: NSLog();
C language: printf();
Python language: print().
As this shows, different programming languages differ greatly in language characteristics, language structure, and even vocabulary.
Therefore, when code written in different programming languages needs to be obfuscated, compatibility processing can be performed on the code information corresponding to the code to be processed to obtain compatible code information, so that code written in different programming languages can be processed compatibly and its format is kept uniform.
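One possible reading of compatibility processing is sketched below: raw code information whose field names differ per language is mapped onto one shared record format. The rule table `COMPAT_RULES`, the raw field names (`spelling`, `identifier`, `id`), and the `renamable` flag are all assumptions made for illustration; the application does not specify the content of the compatibility rules.

```python
# Hypothetical per-language rules: which raw fields hold the name and line,
# and which names are system functions that must not be renamed.
COMPAT_RULES = {
    "objective-c": {"name_field": "spelling", "line_field": "line",
                    "system_names": {"NSLog"}},
    "c": {"name_field": "identifier", "line_field": "lineno",
          "system_names": {"printf"}},
    "python": {"name_field": "id", "line_field": "lineno",
               "system_names": {"print"}},
}


def to_compatible(language, record):
    """Select the rule for the language and emit a record in the shared format."""
    rule = COMPAT_RULES[language]
    name = record[rule["name_field"]]
    return {
        "language": language,
        "name": name,
        "line": record[rule["line_field"]],
        "renamable": name not in rule["system_names"],
    }


raw_records = [
    ("objective-c", {"spelling": "NSLog", "line": 12}),
    ("c", {"identifier": "printf", "lineno": 7}),
    ("python", {"id": "print", "lineno": 3}),
    ("python", {"id": "greet", "lineno": 1}),
]
compatible = [to_compatible(lang, rec) for lang, rec in raw_records]
```

After this step every record has the same keys regardless of source language, so a single obfuscation routine can consume all of them.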
Step S140, performing code obfuscation processing on the code to be processed according to the compatible code information corresponding to the code to be processed and a preset code obfuscation rule to obtain an obfuscation result corresponding to the code to be processed.
Specifically, as can be seen from the above description, after compatibility processing is performed on code information corresponding to codes written in different programming languages in the code to be processed, compatible code information corresponding to the code to be processed can be obtained.
The code information may include a path, a location, a type, a name, an association, and a word of the code.
Further, code obfuscation processing may be performed directly on the target code corresponding to the code to be processed to obtain the obfuscation result corresponding to the target code.
Before code obfuscation, corresponding code obfuscation rules can be set up. Performing code obfuscation according to preset rules prevents the situation in which the application program cannot restore the obfuscated code after obfuscation.
So that the code obfuscation rules can be modified as actual needs change, different code obfuscation rules can be stored in a single code obfuscation rule file.
If a code obfuscation rule needs to be modified, it can be modified directly in the code obfuscation rule file and saved, so that code obfuscation can be performed promptly according to different rules when obfuscation requirements differ.
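The idea of keeping obfuscation rules in an editable rule file can be sketched as follows. The JSON layout (`prefix`, `salt`, `hash_length`, `keep`) and the hash-based renaming are assumptions introduced for illustration, not the rule format of this application. The sketch also renames by plain text substitution for brevity; a tool based on the abstract syntax tree would instead rewrite only the positions recorded in the code information, so that string literals and comments are left untouched.

```python
import hashlib
import json
import re
import tempfile
from pathlib import Path


def load_rules(rule_file):
    """Read the obfuscation rules from a JSON rule file (hypothetical format)."""
    return json.loads(Path(rule_file).read_text(encoding="utf-8"))


def obfuscated_name(name, rules):
    """Derive a deterministic, meaningless replacement name from the rules."""
    digest = hashlib.sha256((rules["salt"] + name).encode("utf-8")).hexdigest()
    return rules["prefix"] + digest[: rules["hash_length"]]


def obfuscate(source, names, rules):
    """Replace every listed name that the rules do not protect."""
    mapping = {n: obfuscated_name(n, rules) for n in names if n not in rules["keep"]}
    if not mapping:
        return source, mapping
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping


# Demonstration with a throwaway rule file.
rules_path = Path(tempfile.mkdtemp()) / "obfuscation_rules.json"
rules_path.write_text(
    json.dumps({"prefix": "_o", "salt": "demo", "hash_length": 8, "keep": ["print"]}),
    encoding="utf-8",
)
rules = load_rules(rules_path)
source = "def greet(user_name):\n    print(user_name)\n"
result, mapping = obfuscate(source, ["greet", "user_name", "print"], rules)
```

Changing the obfuscation behavior then only requires editing the rule file; the code that applies the rules stays the same.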
According to the technical scheme, when code files to be processed need to be subjected to obfuscation processing, the method provided by the embodiment of the application can perform compatibility processing on code information corresponding to codes to be processed and written in different programming languages before code obfuscation is performed to obtain compatible code information, and performs code obfuscation processing on the codes to be processed according to the compatible code information, so that code obfuscation is performed on the codes written in different programming languages, and the problem of low compatibility in the existing code obfuscation technology is solved.
In another embodiment of the present application, a process of acquiring a to-be-processed file in step S100 is described, where the process may include the following steps:
Step S101, acquiring a file path of a file to be processed according to a preset path file.
Specifically, a preset path file may exist in the application program, and a file path corresponding to each file in the application program may be obtained through the path file, where the file path refers to a storage location corresponding to each file.
For example, a file path corresponding to a certain file is as follows:
C:\Users\Desktop
This file path indicates that the file is stored in the Desktop subdirectory under the Users directory on drive C.
According to the path file containing the file path of the file to be processed, each file to be processed can be directly and accurately obtained, each file to be processed does not need to be found by traversing each level of directory of each file in the application program, and the processing time for obtaining the file to be processed can be effectively saved.
Taking the reading of an iOS application as an example, the process of reading a file to be processed provided by the embodiment of the present application is as follows:
in an IOS application program, a development tool Xcode can obtain a path file named as' manifest.
Step S102, acquiring the file to be processed according to the file path of the file to be processed.
Specifically, as can be seen from the above description, the file path corresponding to the file to be processed may be obtained by obtaining a path file including the file path of the file to be processed. Further, to acquire the file to be processed, the file to be processed may be acquired through the file path corresponding to the file to be processed acquired in step S101.
For example, after the file path corresponding to the file to be processed is determined, the file to be processed can be obtained directly by opening that file path.
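Steps S101 and S102 can be sketched as follows. The format of the path file is not given in this application, so the sketch assumes a plain text file with one file path per line; the file names, the suffix list `SOURCE_SUFFIXES`, and the helper names are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: file types that contain code to be obfuscated.
SOURCE_SUFFIXES = {".m", ".h", ".swift"}


def read_path_file(path_file):
    """Return the file paths listed in the path file, one per non-empty line."""
    lines = Path(path_file).read_text(encoding="utf-8").splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]


def load_pending_files(path_file):
    """Open each listed source file directly, without walking any directory."""
    pending = {}
    for path in read_path_file(path_file):
        if path.suffix in SOURCE_SUFFIXES and path.is_file():
            pending[path.name] = path.read_text(encoding="utf-8")
    return pending


# Demonstration with a throwaway project directory.
project = Path(tempfile.mkdtemp())
(project / "Login.m").write_text("NSLog(@\"hi\");\n", encoding="utf-8")
(project / "Feed.swift").write_text("print(\"hi\")\n", encoding="utf-8")
path_file = project / "paths.txt"
path_file.write_text(
    f"{project / 'Login.m'}\n{project / 'Feed.swift'}\n{project / 'Missing.m'}\n",
    encoding="utf-8",
)
pending = load_pending_files(path_file)
```

Because the paths come from the path file, no directory traversal is needed; a listed file that does not exist is simply skipped in this sketch.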
According to the above technical solution, the method provided by the embodiments of the present application obtains the file path of the file to be processed from a preset path file and then obtains the file to be processed according to that file path, which facilitates extracting the code to be processed from the file to be processed.
In another embodiment of the present application, the process in step S110 of converting the code to be processed in the file to be processed into the abstract syntax tree corresponding to the code to be processed is described. The process may include the following steps:
Step S201, determining the code to be processed in the file to be processed.
Specifically, after step S100, a to-be-processed file may be obtained, and to obfuscate a to-be-processed code in the to-be-processed file, first, the to-be-processed code may be extracted from the to-be-processed file.
The code to be processed can be determined by opening the file to be processed and reading the code directly from it, which facilitates converting the code to be processed into its corresponding abstract syntax tree.
Step S202, converting the code to be processed to obtain the abstract syntax tree corresponding to the code to be processed.
Specifically, as described above, the abstract syntax tree corresponding to the code to be processed is obtained by converting the code to be processed. Since the present application aims to solve the problem of obfuscating code written in different programming languages, this embodiment takes the Objective-C and Swift languages commonly used on the iOS system as examples to describe the conversion process, which may be as follows:
Before introducing the abstract syntax tree, the compiler used to obtain it is introduced first. In the computer field, a compiler is a program that translates one language into another. The traditional compiler design architecture is shown in Fig. 2.
A conventional compiler generally consists of three parts:
(1) The compiler front end (FrontEnd) performs lexical analysis, syntax analysis, and semantic analysis; it converts the source code into an abstract syntax tree (AST) and generates intermediate code.
(2) The optimizer (Optimizer) optimizes the intermediate code to obtain optimized code.
(3) The compiler back end (BackEnd) converts the optimized code into machine code for various platforms, including but not limited to the x86 and ARM platforms.
Taking the iOS system as an example, code for iOS is compiled with the LLVM framework. LLVM is written in the C++ language and is a framework for optimizing the compile time, link time, run time, and idle time of programs written in arbitrary programming languages.
A distinguishing feature of the LLVM framework is that it uses a common representation for intermediate code. To support a new language, only an independent front end that generates the intermediate code needs to be written; to support a new hardware architecture, only an independent back end that accepts the intermediate code needs to be written.
The embodiment of the present application obfuscates code on the basis of the abstract syntax tree. Taking the iOS system as an example, the abstract syntax tree corresponding to code written in the Objective-C language and the abstract syntax tree corresponding to code written in the Swift language can be obtained separately.
In the LLVM framework, the Clang compiler and the SourceKit toolset can be used to obtain the abstract syntax tree corresponding to code written in the Objective-C language and the abstract syntax tree corresponding to code written in the Swift language, respectively.
The Clang compiler and the SourceKit toolset are introduced below.
The Xcode development tool used for iOS development includes two compilation toolchain components: the Clang compiler and the SourceKit toolset.
The Clang compiler is an LLVM-based C/C++/Objective-C compiler that supports the C, C++, and Objective-C languages.
Business code written in the Objective-C language can be converted with the Clang compiler to obtain the corresponding abstract syntax tree.
SourceKit is a toolset that supports most source-level operations on Swift code, such as source code parsing, syntax highlighting, formatting, autocompletion, and cross-language header file generation. The abstract syntax tree corresponding to code written in the Swift language can be obtained with the SourceKit toolset.
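As a rough sketch of how a syntax tree dump might be requested per language, the helper below selects a command line by file suffix. It assumes `clang` and `swiftc` are installed and on the PATH. The Clang options `-fsyntax-only -Xclang -ast-dump` print the AST of a translation unit; for Swift, the sketch substitutes the compiler's `-dump-ast` mode for the SourceKit toolset named above, which is an assumption of this example. Real Objective-C sources usually also need SDK and header search options that are omitted here. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def ast_dump_command(source_file):
    """Build the command that asks the matching compiler to dump the AST."""
    suffix = Path(source_file).suffix.lower()
    if suffix in {".c", ".m", ".mm", ".cpp"}:
        return ["clang", "-fsyntax-only", "-Xclang", "-ast-dump", str(source_file)]
    if suffix == ".swift":
        # Assumption: swiftc's dump mode is used in place of SourceKit.
        return ["swiftc", "-dump-ast", str(source_file)]
    raise ValueError(f"no AST tool configured for {source_file}")


def dump_ast(source_file):
    """Run the tool if it is installed; return its standard output, or None."""
    command = ast_dump_command(source_file)
    if shutil.which(command[0]) is None:
        return None  # tool not available on this machine
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout


command = ast_dump_command("Login.m")
```

Selecting the tool by language keeps the rest of the pipeline unaware of which front end produced the tree.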
According to the technical scheme, the method provided by the embodiment of the application obtains the corresponding abstract syntax tree according to the code to be processed by determining the code to be processed in the file to be processed. Therefore, the abstract syntax trees corresponding to different coding languages can be obtained according to the different coding languages, and compatibility is stronger.
In another embodiment of the present application, after converting the to-be-processed code in the to-be-processed file into the abstract syntax tree corresponding to the to-be-processed code in step S110, the following steps may be further included:
and step S301, determining an abstract syntax tree corresponding to the code to be processed.
And step S302, caching the acquired abstract syntax tree into a local file.
Specifically, as can be seen from the above description, after the abstract syntax tree corresponding to the code to be processed is determined, it may be cached in a local file so that the code to be processed can be handled quickly and the result can be reused. The next time code obfuscation is performed, the cached abstract syntax tree can be read directly, and steps S100 and S110 do not need to be repeated, which saves time.
According to the technical scheme, the abstract syntax tree corresponding to the code to be processed is obtained and stored in a local file. The next time code obfuscation is performed, the cached abstract syntax tree can be used directly without being regenerated, which improves code processing efficiency.
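The caching in steps S301 and S302 can be sketched as follows. This is a minimal sketch under stated assumptions: the cache key (a SHA-256 hash of the source text), the JSON file layout, and the names load_or_build_ast and build_ast are all hypothetical, and the tree is assumed to be JSON-serializable. The original text only states that the tree is cached in a local file.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def load_or_build_ast(source: str, cache_dir: Path,
                      build_ast: Callable[[str], dict]) -> dict:
    """Return the cached tree for `source`, building and caching it on a miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the cache is keyed by a hash of the source text, so an
    # edited file misses the cache and is parsed again.
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    tree = build_ast(source)  # stands in for the conversion step
    cache_file.write_text(json.dumps(tree), encoding="utf-8")
    return tree
```

Keying on the content rather than on the file path is one possible design choice; it avoids serving a stale tree after the source file changes.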
In another embodiment of the present application, a process of converting the to-be-processed code to obtain the abstract syntax tree corresponding to the to-be-processed code in step S202 is introduced, where the process may include the following steps:
step S401, performing word segmentation operation on the code to be processed to obtain a word segmentation set.
Specifically, since the code is mainly composed of various characters and various words, each word has its corresponding classification in the code.
For example, some words are keywords of the corresponding programming language, that is, words that the language predefines and that have a special function. The word "if", which introduces a condition in many languages, is one such predefined word.
In addition, code contains other categories of words, such as identifiers, literals, and special symbols.
Therefore, in the process of converting the code to be processed into the abstract syntax tree corresponding to the code to be processed, the word segmentation operation can be performed on the code to be processed first, and the code to be processed is classified according to the characteristics of different words in the code to obtain a word segmentation set.
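The word segmentation operation of step S401 can be sketched with a small regular-expression tokenizer. This is an illustrative sketch only: the keyword set, the token pattern, and the function name tokenize are assumptions, and a real front end such as Clang recognizes a far larger grammar.

```python
import re
from typing import List, Tuple

# Assumed keyword set, for illustration only.
KEYWORDS = {"if", "else", "while", "for", "return", "int"}

TOKEN_RE = re.compile(r"""
    (?P<literal>\d+(?:\.\d+)?|"[^"\n]*")      # numbers and double-quoted strings
  | (?P<word>[A-Za-z_]\w*)                    # keywords or identifiers
  | (?P<symbol>==|!=|<=|>=|[-+*/=<>(){};,])   # operators and punctuation
  | (?P<space>\s+)                            # whitespace, discarded
""", re.VERBOSE)


def tokenize(code: str) -> List[Tuple[str, str]]:
    """Split code into (category, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(code):
        match = TOKEN_RE.match(code, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {code[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "space":
            continue
        text = match.group()
        if kind == "word":
            # A word is a keyword only if the language predefines it.
            kind = "keyword" if text in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens
```

Calling tokenize("if (a >= 10) { b = a + 1; }") classifies "if" as a keyword, "a" and "b" as identifiers, "10" and "1" as literals, and the remaining tokens as special symbols.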
Step S402, performing lexical analysis and syntactic analysis on the code to be processed according to the word segmentation set, and generating each node corresponding to the code to be processed.
Specifically, as can be seen from the above description, in step S401, a word segmentation set of the code to be processed can be obtained.
Furthermore, lexical analysis and syntactic analysis can be performed on the code to be processed, so that the logical structure of the code to be processed can be accurately acquired.
Lexical analysis can be performed on each word in the word segmentation set to obtain the type and position of each word.
Syntactic analysis is then performed on the words of the word segmentation set in word-stream form to obtain each node corresponding to the code to be processed, where each node corresponds to one structure in the code to be processed.
Step S403, combining all nodes according to the code hierarchical relation corresponding to the code to be processed to obtain an abstract syntax tree corresponding to the code to be processed.
Specifically, as can be seen from the above description, after step S402, a node corresponding to the code to be processed can be obtained.
Because the nodes corresponding to the codes to be processed reflect various structures in the codes to be processed, but the codes to be processed have a certain code hierarchical relationship, the nodes can be further recombined according to the code hierarchical relationship corresponding to the codes to be processed, so that the abstract syntax tree corresponding to the codes to be processed is obtained, and the code information of the codes to be processed can be obtained according to the abstract syntax tree.
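Combining nodes according to the code hierarchy, as in step S403, can be sketched in a greatly simplified form. The assumption here is that the hierarchy is given only by curly braces. The Node class, the node kinds, and the function build_tree are hypothetical names, and a real parser derives the hierarchy from the full grammar rather than from braces alone.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def build_tree(tokens: List[str]) -> Node:
    """Nest tokens into a tree, opening a child block at '{' and closing it at '}'."""
    root = Node("TranslationUnit")  # hypothetical root kind representing the file
    stack = [root]                  # innermost open block is stack[-1]
    for tok in tokens:
        if tok == "{":
            block = Node("Block")
            stack[-1].children.append(block)
            stack.append(block)
        elif tok == "}":
            if len(stack) == 1:
                raise SyntaxError("unbalanced '}'")
            stack.pop()
        else:
            stack[-1].children.append(Node("Token", tok))
    if len(stack) != 1:
        raise SyntaxError("unbalanced '{'")
    return root
```

Each nested block becomes a child of the block that encloses it, so the depth of a node in the tree mirrors its nesting depth in the code.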
The process of parsing code written in the Objective-C language using a Clang compiler to obtain an abstract syntax tree will be described by way of example below.
For example, a section of code 1 written in the Objective-C language is as follows:
Figure BDA0003957823820000131
Through word segmentation, a word segmentation set Token can be obtained by using the Clang compiler to parse the code written in the Objective-C language. The word segmentation set Token can include the following categories:
Keywords: reserved words of the grammar, such as if, else, while, and for;
Identifiers: variable names;
Literals: values such as numbers and strings;
Special symbols: operators such as addition, subtraction, multiplication, and division.
Further, each type of word in the word segmentation set Token is analyzed in word-stream form, and the code 2 generated after processing is as follows:
Figure BDA0003957823820000141
Further, each word in the word segmentation set Token is passed in word-stream form to the syntactic analysis module for parsing, after which the following code 3 can be obtained:
Figure BDA0003957823820000142
Figure BDA0003957823820000151
Here, TranslationUnitDecl is the root node and may represent a source file; the abstract syntax tree corresponding to code 1 can be constructed according to code 3.
According to the technical scheme, word segmentation operation can be performed on the codes to be processed to obtain a word segmentation set, lexical analysis and syntactic analysis can be performed on the codes to be processed according to the word segmentation set to obtain nodes corresponding to the codes to be processed, after the nodes are obtained, the nodes can be combined according to the code hierarchical relation corresponding to the codes to be processed to obtain an abstract syntax tree corresponding to the codes to be processed, and code information corresponding to the codes to be processed can be obtained according to the syntax tree.
In another embodiment of the present application, regarding step S130, a process of performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule to obtain compatible code information corresponding to the code to be processed is introduced, where the process may include the following steps:
Step S501, determining a programming language used by the code to be processed.
Specifically, the code to be processed may contain programs written in several different programming languages, each with its own characteristics. The programming languages present in the code to be processed can therefore be identified, and a corresponding compatibility approach can be selected for each language.
Therefore, after the code information corresponding to the code to be processed is obtained by analyzing the abstract syntax tree corresponding to the code to be processed, further, compatibility processing may be performed on the code to be processed, and a programming language used by the code to be processed may be determined first, so that the code information corresponding to the code to be processed may be processed by selecting a code compatibility rule corresponding to the code to be processed according to the programming language used by the code to be processed.
Step S502, according to the programming language used by the code to be processed, selecting a code compatibility rule corresponding to the code to be processed to process the code information corresponding to the code to be processed, and obtaining compatible code information corresponding to the code to be processed.
Specifically, as can be seen from the above description, the programming language used by the code to be processed can be acquired through step S501.
Furthermore, according to the characteristics of each programming language, the compatibility processing can be performed on the code information corresponding to the code to be processed by selecting the corresponding mode for processing the code information corresponding to the code to be processed, so that the compatible code information corresponding to the code to be processed, which meets the compatibility requirement, can be obtained.
The following describes the compatibility processing procedure by taking the Objective-C language and the Swift language in the iOS system as examples. The specific rules are as follows:
(1) Verbs and prepositions in names are uniformly left unreplaced, so that replacement does not change the overall structure of the converted name.
(2) If a preposition exists in the name, the preposition and the words after it are left unchanged, and only the words before the preposition are replaced.
For example, in
setImageWithUrl:(String*)url,
only the setImage portion is replaced.
(3) If the word at the tail coincides with the first parameter name, the word is added to a white list and is not replaced, so that the original structure is not damaged.
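Rules (2) and (3) above can be sketched as a function that reports which part of a method name is eligible for replacement. This is a sketch under stated assumptions: the preposition list is borrowed from the later description of the conversion rules, the preposition rule is checked before the white-list rule so that the setImageWithUrl example yields setImage, the comparison with the parameter name is case-insensitive, and the function names are hypothetical. Rule (1) is not modelled.

```python
import re
from typing import List, Optional

PREPOSITIONS = {"with", "of", "by", "for"}


def split_words(name: str) -> List[str]:
    """Split a camelCase name into words, keeping acronyms such as URL together."""
    return re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", name)


def replaceable_part(name: str, first_param: Optional[str] = None) -> Optional[str]:
    """Return the part of `name` that may be replaced, or None if white-listed."""
    words = split_words(name)
    if not words:
        return None
    # Rule (2): only the words before the preposition are replaced.
    for i, word in enumerate(words):
        if i > 0 and word.lower() in PREPOSITIONS:
            return "".join(words[:i])
    # Rule (3): a tail word equal to the first parameter name is white-listed.
    if first_param is not None and words[-1].lower() == first_param.lower():
        return None
    return name
```

With these assumptions, replaceable_part("setImageWithUrl", "url") returns "setImage", matching the example in rule (2).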
As can be seen from the above-described technical solutions, the method provided in the embodiment of the present application can help perform compatibility processing on the code information corresponding to the to-be-processed code by determining the programming language used by the to-be-processed code, and selecting a method for processing the code information corresponding to the to-be-processed code according to the programming language used by the to-be-processed code, so as to obtain compatible code information corresponding to the to-be-processed code that meets the compatibility requirement, and help perform code obfuscation processing according to the compatible code information.
In another embodiment of the present application, the process in step S140 of performing code obfuscation processing on the code to be processed according to the compatible code information corresponding to the code to be processed and a preset code obfuscation rule, to obtain an obfuscation result corresponding to the code to be processed, is introduced. The process may include the following steps:
Step S601, determining a target code obfuscation rule from preset code obfuscation rules.
Specifically, as can be seen from the above description, the target code corresponding to the code to be processed can be obtained by performing compatibility processing on the code to be processed.
In an actual application process, each application program can have a code obfuscation rule file, and the code obfuscation rule file corresponding to the application program comprises at least one code obfuscation rule.
With the update of the application program, the code obfuscating rule file corresponding to the application program may continuously update the code obfuscating rule, and the latest code obfuscating rule in the code obfuscating rule file may be searched as the target code obfuscating rule by referring to the corresponding code obfuscating rule file.
Taking the iOS system as an example, the code obfuscation rules can be saved in a plist file in XML format. When code obfuscation is performed, the latest code obfuscation rule in the plist file is read and determined as the target code obfuscation rule, and code obfuscation is performed according to it.
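Reading the latest rule from an XML plist can be sketched with the Python standard-library plistlib module. The file schema used here (a top-level "rules" array whose entries carry an integer "version") and the function name latest_rule are assumptions for illustration; the original text does not specify how rules are laid out or how "latest" is determined.

```python
import plistlib
from typing import Any, Dict


def latest_rule(plist_bytes: bytes) -> Dict[str, Any]:
    """Return the rule with the highest version from an XML plist document."""
    data = plistlib.loads(plist_bytes)
    rules = data["rules"]  # hypothetical key holding the list of rules
    if not rules:
        raise ValueError("rule file contains no obfuscation rules")
    # Assumption: "latest" means the entry with the largest version number.
    return max(rules, key=lambda rule: rule["version"])
```

In practice the bytes would come from the application's rule file, for example via Path(...).read_bytes(). A timestamp field could serve the same purpose as the version number.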
Step S602, according to the compatible code information corresponding to the code to be processed, performing code obfuscation processing on the code to be processed by using the target code obfuscation rule, to obtain an obfuscation result corresponding to the code to be processed.
Specifically, as can be seen from the above description, after the target code obfuscation rule is selected, the target code may be obfuscated according to the target code obfuscation rule and the compatible code information corresponding to the code to be processed, so as to obtain an obfuscation result corresponding to the target code.
The target code obfuscation rule may include the following rules, described by taking the Objective-C language and the Swift language in the iOS system as examples.
Based on the characteristics of the Objective-C language and the Swift language, conversion is usually decided according to verbs and prepositions. Common verbs include send, get, set, and load, and common prepositions include with, of, by, and for. The specific conversion rules are as follows:
(1) If the verb is located at the head and the word at the tail coincides with the first argument name, the tail word is deleted.
For example, an Objective-C method may be declared as follows:
sendMessage:(Message*)msg,
and the name used in Swift becomes send(msg).
(2) If a preposition exists in the name, the preposition and the words after it are converted into a parameter name.
For example, an Objective-C method may be declared as follows:
setImageWithUrl:(String*)url,
and the name used in Swift becomes setImage(withUrl: url).
(3) If the name of the method or function equals a verb plus a class-name suffix word, no conversion is applied.
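Conversion rules (1) and (2) can be sketched as a function that maps an Objective-C method name to the Swift-style name written in the examples. This is a sketch under stated assumptions: the verb and preposition sets are the ones listed above, the tail word is compared with both the parameter type and the parameter name (the sendMessage example matches on the type Message), rule (2) is checked first, and rule (3) is not modelled. The function names are hypothetical.

```python
import re
from typing import List

VERBS = {"send", "get", "set", "load"}
PREPOSITIONS = {"with", "of", "by", "for"}


def split_words(name: str) -> List[str]:
    """Split a camelCase name into words, keeping acronyms such as URL together."""
    return re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", name)


def swift_style_name(method: str, param_type: str, param_name: str) -> str:
    """Render the Swift-style name for a one-parameter Objective-C method."""
    words = split_words(method)
    # Rule (2): the preposition and the words after it become the parameter name.
    for i, word in enumerate(words):
        if i > 0 and word.lower() in PREPOSITIONS:
            base = "".join(words[:i])
            label = word.lower() + "".join(words[i + 1:])
            return f"{base}({label}: {param_name})"
    # Rule (1): a leading verb plus a tail word naming the argument drops the tail.
    tail = words[-1].lower() if words else ""
    if (len(words) > 1 and words[0].lower() in VERBS
            and tail in (param_type.lower(), param_name.lower())):
        base = "".join(words[:-1])
        return f"{base}({param_name})"
    # Otherwise the name is left as it is.
    return f"{method}({param_name})"
```

Under these assumptions, swift_style_name("sendMessage", "Message", "msg") gives send(msg), and swift_style_name("setImageWithUrl", "String", "url") gives setImage(withUrl: url), in the notation used by the examples above.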
It can be seen from the above-described technical solutions that, in the method provided in the embodiment of the present application, the latest code obfuscation rule currently in the code obfuscation rule file is selected as the target code obfuscation rule, and the target code is subjected to code obfuscation processing according to the compatible code information corresponding to the code to be processed and the target code obfuscation rule, so as to obtain an obfuscation result corresponding to the target code. The method can thus obfuscate the target code after compatibility processing according to the target code obfuscation rule, which makes it possible to obfuscate code written in different programming languages.
The anti-reverse code obfuscation apparatus provided by the embodiment of the present application is described below. The apparatus described below and the anti-reverse code obfuscation method described above may be referred to in correspondence with each other.
Referring to fig. 3, fig. 3 is a schematic structural diagram of an anti-reverse code obfuscation apparatus disclosed in the embodiment of the present application.
As shown in fig. 3, the anti-reverse code obfuscation apparatus may include:
the file acquisition module 11 is used for acquiring a file to be processed;
the code conversion module 12 is configured to convert a to-be-processed code in the to-be-processed file into an abstract syntax tree corresponding to the to-be-processed code;
the code information extraction module 13 is configured to parse the abstract syntax tree corresponding to the code to be processed, and determine code information corresponding to the code to be processed;
the compatible processing module 14 is configured to perform compatible processing on the code information corresponding to the code to be processed according to a preset code compatible processing rule, so as to obtain compatible code information corresponding to the code to be processed;
and the code obfuscating module 15 is configured to perform code obfuscation processing on the to-be-processed code according to the compatible code information corresponding to the to-be-processed code and according to a preset code obfuscating rule, so as to obtain an obfuscating result corresponding to the to-be-processed code.
As can be seen from the anti-reverse code obfuscation apparatus introduced above, the file acquisition module 11 obtains a file to be processed, which allows the code conversion module 12 to obtain the code to be processed from that file and convert it into the corresponding abstract syntax tree. The code information extraction module 13 then parses the abstract syntax tree to determine the code information corresponding to the code to be processed, and the compatible processing module 14 performs compatibility processing on that code information according to the preset code compatible processing rule to obtain compatible code information. Finally, the code obfuscating module 15 performs code obfuscation on the code to be processed according to the compatible code information and a preset code obfuscation rule, so as to obtain the obfuscation result corresponding to the code to be processed. When obfuscating the code of an application program, the apparatus can therefore perform compatibility processing on code information corresponding to code written in different programming languages and obfuscate the code to be processed according to the resulting compatible code information. This addresses the problem that existing code obfuscation techniques can only obfuscate code written in one programming language and therefore have low compatibility.
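The cooperation of modules 11 to 15 can be sketched as a pipeline of pluggable callables. The class name ObfuscationPipeline, the field names, and the type signatures are assumptions made for illustration; the original text defines the modules only by their function.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ObfuscationPipeline:
    acquire_file: Callable[[str], str]     # module 11: path -> file contents
    convert_code: Callable[[str], Any]     # module 12: code -> abstract syntax tree
    extract_info: Callable[[Any], Any]     # module 13: tree -> code information
    make_compatible: Callable[[Any], Any]  # module 14: info -> compatible info
    obfuscate: Callable[[str, Any], str]   # module 15: code + info -> result

    def run(self, path: str) -> str:
        """Pass the file through the five modules in order."""
        source = self.acquire_file(path)
        tree = self.convert_code(source)
        info = self.extract_info(tree)
        compatible = self.make_compatible(info)
        return self.obfuscate(source, compatible)
```

Keeping each module behind a plain callable is one way to let language-specific parsers and rule sets be swapped without changing the order of the steps.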
The specific processing flow of each module included in the anti-reverse code obfuscation apparatus may refer to the related description of the anti-reverse code obfuscation method, and is not repeated here.
The anti-reverse code obfuscation apparatus provided by the embodiment of the application can be applied to anti-reverse code obfuscation equipment, such as a terminal: a mobile phone, a computer, and the like. Optionally, fig. 4 shows a block diagram of a hardware structure of the anti-reverse code obfuscation equipment. Referring to fig. 4, the hardware structure may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4.
In the embodiment of the application, the number of each of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 communicate with one another through the communication bus 4.
The processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the methods provided by the embodiments of the present application, etc.;
the memory 3 may include a high-speed RAM memory, and may further include a non-volatile memory, such as at least one disk memory;
wherein the memory stores a program and the processor can call the program stored in the memory, the program being used for implementing each processing flow of the terminal in the anti-reverse code obfuscation scheme.
Embodiments of the present application further provide a readable storage medium, where the storage medium may store a program adapted to be executed by a processor, the program being used for implementing each processing flow of the terminal in the anti-reverse code obfuscation scheme.
Finally, it may also be noted that, in this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. The various embodiments may be combined with each other. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An anti-reverse code obfuscation method, comprising:
acquiring a file to be processed;
converting the codes to be processed in the files to be processed into abstract syntax trees corresponding to the codes to be processed;
analyzing an abstract syntax tree corresponding to the code to be processed, and determining code information corresponding to the code to be processed;
performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule to obtain compatible code information corresponding to the code to be processed;
and performing code obfuscation processing on the code to be processed according to compatible code information corresponding to the code to be processed and a preset code obfuscation rule to obtain an obfuscation result corresponding to the code to be processed.
2. The method according to claim 1, wherein the obtaining the file to be processed comprises:
acquiring a file path of the file to be processed according to a preset path file;
and acquiring the file to be processed according to the file path of the file to be processed.
3. The method according to claim 1, wherein said converting the to-be-processed code in the to-be-processed file into an abstract syntax tree corresponding to the to-be-processed code comprises:
determining the code to be processed in the file to be processed;
and converting the code to be processed to obtain an abstract syntax tree corresponding to the code to be processed.
4. The method according to claim 3, wherein the converting the to-be-processed code to obtain an abstract syntax tree corresponding to the to-be-processed code comprises:
performing word segmentation operation on the code to be processed to obtain a word segmentation set;
according to the word segmentation set, performing lexical analysis and syntactic analysis on the code to be processed to generate each node corresponding to the code to be processed;
and combining all the nodes according to the code hierarchical relation corresponding to the code to be processed to obtain an abstract syntax tree corresponding to the code to be processed.
5. The method according to claim 1, wherein the performing compatibility processing on the code information corresponding to the code to be processed according to a preset code compatibility processing rule to obtain compatible code information corresponding to the code to be processed comprises:
determining a programming language used by the code to be processed;
and selecting a code compatibility rule corresponding to the code to be processed to process the code information corresponding to the code to be processed according to a programming language used by the code to be processed to obtain compatible code information corresponding to the code to be processed.
6. The method according to claim 1, wherein after said converting the to-be-processed code in the to-be-processed file into the abstract syntax tree corresponding to the to-be-processed code, before said parsing the abstract syntax tree corresponding to the to-be-processed code and determining the code information corresponding to the to-be-processed code, the method further comprises:
determining an abstract syntax tree corresponding to the code to be processed;
caching the abstract syntax tree corresponding to the code to be processed into a local file.
7. The method according to any one of claims 1 to 6, wherein performing code obfuscation processing on the to-be-processed code according to a preset code obfuscation rule according to compatible code information corresponding to the to-be-processed code to obtain an obfuscated result corresponding to the to-be-processed code includes:
determining a target code obfuscation rule from the preset code obfuscation rules;
and performing code obfuscation processing on the code to be processed by using the target code obfuscation rule according to the compatible code information corresponding to the code to be processed to obtain an obfuscation result corresponding to the code to be processed.
8. An anti-reverse code obfuscation apparatus, comprising:
the file acquisition module is used for acquiring a file to be processed;
the code conversion module is used for converting the codes to be processed in the files to be processed into abstract syntax trees corresponding to the codes to be processed;
the code information extraction module is used for analyzing the abstract syntax tree corresponding to the code to be processed and determining the code information corresponding to the code to be processed;
the compatible processing module is used for carrying out compatible processing on the code information corresponding to the code to be processed according to a preset code compatible processing rule to obtain compatible code information corresponding to the code to be processed;
and the code obfuscating module is used for performing code obfuscation processing on the code to be processed according to compatible code information corresponding to the code to be processed and a preset code obfuscating rule to obtain an obfuscating result corresponding to the code to be processed.
9. An anti-reverse code obfuscation device, comprising: one or more processors, and a memory;
the memory has stored therein computer-readable instructions which, when executed by the one or more processors, carry out the steps of the anti-reverse code obfuscation method as claimed in any one of claims 1 to 7.
10. A readable storage medium, characterized by: the readable storage medium having stored therein computer-readable instructions which, when executed by one or more processors, cause the one or more processors to carry out the steps of the anti-reverse code obfuscation method as claimed in any one of claims 1 to 7.
CN202211466953.8A 2022-11-22 2022-11-22 Method, device and equipment for obfuscating reverse code and readable storage medium Pending CN115809442A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211466953.8A CN115809442A (en) 2022-11-22 2022-11-22 Method, device and equipment for obfuscating reverse code and readable storage medium

Publications (1)

Publication Number Publication Date
CN115809442A true CN115809442A (en) 2023-03-17

Family

ID=85483724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211466953.8A Pending CN115809442A (en) 2022-11-22 2022-11-22 Method, device and equipment for obfuscating reverse code and readable storage medium

Country Status (1)

Country Link
CN (1) CN115809442A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113761486A (en) * 2021-09-10 2021-12-07 上海熙菱信息技术有限公司 One-click type code confusion method based on grammar sugar analysis
CN113761486B (en) * 2021-09-10 2023-09-05 上海熙菱信息技术有限公司 One-key code confusion method based on grammar sugar analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination