CN113849781B - Go language source code confusion method, system, terminal and storage medium - Google Patents

Go language source code confusion method, system, terminal and storage medium

Info

Publication number
CN113849781B
Authority
CN
China
Prior art keywords
abstract syntax
syntax tree
source code
function
encrypting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110961967.6A
Other languages
Chinese (zh)
Other versions
CN113849781A (en)
Inventor
齐增田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110961967.6A priority Critical patent/CN113849781B/en
Publication of CN113849781A publication Critical patent/CN113849781A/en
Application granted granted Critical
Publication of CN113849781B publication Critical patent/CN113849781B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/14 Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Technology Law (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Devices For Executing Special Programs (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides a Go language source code obfuscation method, system, terminal and storage medium. The method comprises the following steps: reading source code and constructing an abstract syntax tree of the source code; encrypting the package names of the abstract syntax tree by using a first encryption function; obfuscating character strings in the abstract syntax tree into byte codes by using binary operations; encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using a second encryption function; and compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program. The invention obfuscates not only character strings but also types and package names, which greatly increases the complexity of the program, fills a gap in obfuscation technology for the Go language, effectively hinders static reverse analysis, improves product security, helps prevent program products from being reverse engineered and cracked, and avoids unnecessary property loss.

Description

Go language source code confusion method, system, terminal and storage medium
Technical Field
The invention relates to the technical field of software development, and in particular to a Go language source code obfuscation method, system, terminal and storage medium.
Background
The Go language is a statically and strongly typed, compiled, concurrent programming language with garbage collection, developed by Google. Go is known as a cloud computing development language because it can be compiled once and run everywhere; and because a compiled binary program consists of a single executable file with no dependency libraries, the Go language achieves cross-platform support at the language level.
However, precisely because of this cross-platform characteristic, the compiled binary file contains a large amount of source code information. A reverse engineer can easily find in the binary file the packages that the source code depended on at compile time, and the character strings and symbol information it uses make it easier for the reverse engineer to crack the program.
Other cross-platform languages, such as Python, Lua and Java, are interpreted languages: a program written in them relies on intermediate bytecode, processed differently on each platform, to achieve cross-platform operation. The obfuscation techniques used for such interpreted languages obfuscate the bytecode, not a binary. The Go language is a statically typed language that generates machine code directly at compile time, with no intermediate bytecode conversion. The obfuscation techniques of interpreted languages are therefore unsuitable for the Go language: they can only obfuscate bytecode and cannot obfuscate at the binary level, whereas the file handled during reverse engineering is a binary rather than bytecode, so such obfuscation is ineffective.
Compiled languages such as C and C++ have a syntax structure distinct from that of the Go language and have no package structure, so the obfuscation techniques for C and C++ cannot be applied to the Go language.
To solve the problem that a program developed from Go language source code is easy to reverse engineer and crack, the invention provides a Go language source code obfuscation method, system, terminal and storage medium.
Disclosure of Invention
In view of the problem that programs based on Go language source code are easy to reverse engineer and crack because the obfuscation methods of the prior art are not applicable to Go language source code, the invention provides a Go language source code obfuscation method, system, terminal and storage medium to solve the above technical problem.
In a first aspect, the present invention provides a Go language source code obfuscation method, including:
reading source code and constructing an abstract syntax tree of the source code;
encrypting the package names of the abstract syntax tree by using a first encryption function;
obfuscating character strings in the abstract syntax tree into byte codes by using binary operations;
encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using a second encryption function;
and compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program.
Further, reading the source code and constructing the abstract syntax tree of the source code includes:
parsing the source code by using the tool for constructing an abstract syntax tree in the Go language standard library, and returning error information if the source code contains errors; if parsing succeeds, returning an abstract syntax tree structure that contains the detailed information of the package, the detailed path in the current compilation environment, and the structure information of each syntax node.
Further, encrypting the package names of the abstract syntax tree by using the first encryption function includes:
determining the root directory of the source code, traversing all package directories under the root directory, and collecting the relative paths of the package directories;
encrypting the relative path of each package directory by using the first encryption function, and taking the resulting encrypted path as the package name;
creating directories according to the package names and the original directory structure to obtain an encrypted directory structure;
and updating the initial package names of the abstract syntax tree to the corresponding package names of the encrypted directory structure.
Further, obfuscating the character strings in the abstract syntax tree into byte codes by using binary operations includes:
searching for character strings in the abstract syntax tree by node type;
if a character string is a constant (const), converting the character string to a variable (var) declaration.
Further, converting the character string to a variable (var) declaration includes:
generating, by using a random number generation function, as many byte codes as there are characters in the character string;
XORing each character of the character string with the random byte code at the corresponding position in sequence to generate an intermediate character string;
and XORing each byte code with the intermediate character string value at the corresponding position in sequence to generate a string of obfuscated byte codes.
Further, encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using the second encryption function includes:
acquiring a type definition structure of the source code, acquiring the absolute path of the type definition structure, and reading the function definitions in the file under the absolute path;
traversing the source code, acquiring the receiver type of each function definition, and locating the relative path according to the receiver type;
and encrypting the corresponding type definition structure, function definition and relative path in the abstract syntax tree by using the second encryption function.
Further, the first encryption function and the second encryption function are user-defined functions, and the user-defined functions include any one of a hash function, an encoding conversion function and a combination function.
In a second aspect, the present invention provides a Go language source code obfuscation system, including:
a target construction unit, used for reading source code and constructing an abstract syntax tree of the source code;
a package name encryption unit, used for encrypting the package names of the abstract syntax tree by using a first encryption function;
a character obfuscation unit, used for obfuscating character strings in the abstract syntax tree into byte codes by using binary operations;
a symbol encryption unit, used for encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using a second encryption function;
and a target compiling unit, used for compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program.
Further, the target construction unit is configured to:
parse the source code by using the tool for constructing an abstract syntax tree in the Go language standard library, and return error information if the source code contains errors; if parsing succeeds, return an abstract syntax tree structure that contains the detailed information of the package, the detailed path in the current compilation environment, and the structure information of each syntax node.
Further, the package name encryption unit is configured to:
determine the root directory of the source code, traverse all package directories under the root directory, and collect the relative paths of the package directories;
encrypt the relative path of each package directory by using the first encryption function, and take the resulting encrypted path as the package name;
create directories according to the package names and the original directory structure to obtain an encrypted directory structure;
and update the initial package names of the abstract syntax tree to the corresponding package names of the encrypted directory structure.
Further, the character obfuscation unit is configured to:
search for character strings in the abstract syntax tree by node type;
if a character string is a constant (const), convert the character string to a variable (var) declaration.
Further, the character obfuscation unit is configured to:
generate, by using a random number generation function, as many byte codes as there are characters in the character string;
XOR each character of the character string with the random byte code at the corresponding position in sequence to generate an intermediate character string;
and XOR each byte code with the intermediate character string value at the corresponding position in sequence to generate a string of obfuscated byte codes.
Further, the symbol encryption unit is configured to:
acquire a type definition structure of the source code, acquire the absolute path of the type definition structure, and read the function definitions in the file under the absolute path;
traverse the source code, acquire the receiver type of each function definition, and locate the relative path according to the receiver type;
and encrypt the corresponding type definition structure, function definition and relative path in the abstract syntax tree by using the second encryption function.
Further, the first encryption function and the second encryption function are user-defined functions, and the user-defined functions include any one of a hash function, an encoding conversion function and a combination function.
In a third aspect, a terminal is provided, including:
a processor, a memory, wherein,
the memory is used for storing a computer program,
the processor is configured to call and run the computer program from the memory, so that the terminal performs the method described above.
In a fourth aspect, there is provided a computer storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of the above aspects.
The invention has the following advantages:
With the Go language source code obfuscation method, system, terminal and storage medium provided by the invention, an encrypted and obfuscated Go language program is obtained by constructing the abstract syntax tree of the Go language source code, encrypting and obfuscating the package names, character strings and symbols of the abstract syntax tree, and compiling the encrypted and obfuscated abstract syntax tree with the Go compiling tool. The invention obfuscates not only character strings but also types and package names, which greatly increases the complexity of the program, fills a gap in obfuscation technology for the Go language, effectively hinders static reverse analysis, improves product security, helps prevent program products from being reverse engineered and cracked, and avoids unnecessary property loss.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a method of one embodiment of the invention.
FIG. 2 is a schematic block diagram of a system of one embodiment of the present invention.
FIG. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
The following explains key terms appearing in the present invention.
An abstract syntax tree (AST) is a tree representation of the abstract syntactic structure of source code. Each node of the tree represents a construct in the source code. The syntax is called abstract because the tree does not represent every detail of the concrete syntax; for example, nested brackets are implicit in the tree structure and are not presented as nodes. The abstract syntax tree does not depend on the grammar of the source language, that is, on the context-free grammar used in the parsing stage. When a grammar is written, equivalent transformations (eliminating left recursion, backtracking, ambiguity, and so on) are often applied to it, which introduces redundant components into parsing, adversely affects the subsequent stages, and can even make those stages confusing. For this reason, many compilers construct a syntax tree independently to establish a clear interface between the front end and the back end. Abstract syntax trees are widely used in many fields, such as browsers, intelligent editors and compilers. Parsing of a source program is carried out under the guidance of the grammar rules of the corresponding programming language. Grammar rules describe the composition of the language's syntactic components, and the grammar rules of a programming language can usually be described precisely by a context-free grammar or the equivalent Backus-Naur form (BNF). Context-free grammars are divided into categories such as LL(1), LR(0), LR(1), LR(k) and LALR(1). Each category has its own requirements; for example, LL(1) requires that the grammar be unambiguous and free of left recursion. Converting a grammar to LL(1) form requires introducing some extra grammar symbols and productions.
Var in computer languages: in Pascal, var is a reserved word used to define variables. For example: var a: integer; (defines a variable a of type integer) and var u: array[1..100] of integer; (defines an array u with subscripts from 1 to 100 whose elements are integers).
Exclusive OR (abbreviated XOR) is a mathematical operator applied in logical operations. Its mathematical symbol is "⊕" and its computer symbol is "xor". The rule is: a ⊕ b is 1 if the two values a and b differ, and 0 if they are the same. XOR is also called half-addition because it corresponds to binary addition without carry. With 1 representing true and 0 representing false, the XOR rules are: 0⊕0=0, 1⊕0=1, 0⊕1=1, 1⊕1=0 (same gives 0, different gives 1). These rules are the same as addition except that no carry is produced, so XOR is often regarded as carry-less addition.
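As an illustration only, the exclusive OR rules above can be checked with a few lines of Go. The helper name xorBytes and the sample key bytes are assumptions made for this sketch and do not come from the patent.

```go
package main

import "fmt"

// xorBytes returns a[i] ^ b[i] for every position.
// It assumes len(a) == len(b); the name is hypothetical.
func xorBytes(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

func main() {
	// Truth table: the result is 1 only when the two bits differ.
	for a := 0; a <= 1; a++ {
		for b := 0; b <= 1; b++ {
			// (a+b)%2 is one-bit binary addition with the carry discarded.
			fmt.Printf("%d xor %d = %d (carry-less sum %d)\n", a, b, a^b, (a+b)%2)
		}
	}

	// XOR is its own inverse: applying the same key twice restores the input.
	msg := []byte("Go")
	key := []byte{0x5a, 0xc3} // arbitrary sample key
	fmt.Printf("%q\n", xorBytes(xorBytes(msg, key), key))
}
```

The self-inverse property shown in the last lines is what allows a string that has been XORed with a key to be recovered by XORing with the same key again.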
Code compiled from the Go language is binary, but the package names, character strings and symbols in it remain clearly visible and match the source code. This leaks information: a reverse engineer can extract the original data, restore the logic of the code after analysis, and thereby crack the program.
To remedy these defects, the invention provides an algorithm for obfuscating the package names, character strings and symbols in Go language code. After this processing, the related information in the generated binary file is unreadable, which greatly increases the difficulty of reverse engineering.
FIG. 1 is a schematic flow chart of a method of one embodiment of the invention. The execution subject of FIG. 1 may be a Go language source code obfuscation system.
As shown in FIG. 1, the method includes:
step 110, reading source code and constructing an abstract syntax tree of the source code;
step 120, encrypting the package names of the abstract syntax tree by using a first encryption function;
step 130, obfuscating the character strings in the abstract syntax tree into byte codes by using binary operations;
step 140, encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using a second encryption function;
and step 150, compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program.
To facilitate understanding of the present invention, the Go language source code obfuscation method is further described below, based on its principle, with reference to the process of obfuscating Go language source code in the embodiment.
Specifically, the Go language source code obfuscation method comprises the following steps:
S1. Reading source code and constructing an abstract syntax tree of the source code.
The Go language standard library provides the ast package (go/ast, used together with go/parser) for constructing an abstract syntax tree. It parses the source code and reports an error if there is a syntax error; after successful parsing it returns the abstract syntax tree structure. The items of interest in this structure are the detailed information of the package, such as the package name and the detailed path in the current compilation environment, and the structure information (type, value, metadata and the like) of each syntax node.
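A minimal sketch of step S1, assuming the standard packages go/parser, go/token and go/ast are the tools meant here. The helper names parseSource and countNodes and the sample source are hypothetical and serve only as illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// parseSource parses one Go source file held in memory and returns its AST.
// A syntax error is returned as err, which corresponds to the error branch of S1.
func parseSource(filename, src string) (*token.FileSet, *ast.File, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.ParseComments)
	if err != nil {
		return nil, nil, err
	}
	return fset, file, nil
}

// countNodes walks the tree and tallies nodes by their dynamic Go type.
func countNodes(file *ast.File) map[string]int {
	counts := make(map[string]int)
	ast.Inspect(file, func(n ast.Node) bool {
		if n != nil {
			counts[fmt.Sprintf("%T", n)]++
		}
		return true
	})
	return counts
}

func main() {
	src := `package animal

const Greeting = "woof"

type Dog struct{}

func (d *Dog) Eat() string { return Greeting }
`
	fset, file, err := parseSource("dog.go", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("package:", file.Name.Name)
	fmt.Println("package clause at:", fset.Position(file.Package))
	fmt.Println("basic literals:", countNodes(file)["*ast.BasicLit"])
}
```

The token.FileSet carries the position information of every node, and file.Name holds the package name that the later steps rewrite.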
S2. Encrypting the package names of the abstract syntax tree by using a first encryption function.
Determine the root directory of the source code, traverse all package directories under the root directory, and collect the relative paths of the package directories; encrypt the relative path of each package directory by using the first encryption function, and take the resulting encrypted path as the package name; create directories according to the package names and the original directory structure to obtain an encrypted directory structure; and update the initial package names in the abstract syntax tree to the corresponding package names of the encrypted directory structure.
The Go language distinguishes packages by directory; for example, for the directory structure a/b/c, the package path is a/b/c. Package names can therefore be obfuscated by changing the directory names. The process is as follows:
(1) Determine the root directory of the source code, for example "D:\mygo".
(2) Iterate over all files and directories under the root directory to obtain the relative path of each directory. For example, if the mygo directory contains two directories, animal and tree, the two values "animal" and "tree" are obtained; these are the package names.
(3) Hash the package name with a hash function such as SHA256; other schemes such as BASE64 encoding may also be used. This function is named ENC. Compute two values, a = ENC("animal") and b = ENC("tree"), and create two directories named a and b at the same directory level as "animal" and "tree".
(4) Copy the files (not the directories) under "animal" and "tree" into a and b, respectively.
(5) If "animal" and "tree" contain further subdirectories, encrypt each subdirectory name, create a directory with the encrypted name inside the corresponding directory created in step (3), and so on. Iterating over the package names in the source directory structure in this way yields a complete encrypted directory structure.
(6) Iterate over the source files, find each package node, and change the corresponding plaintext package name "p" to ENC(p).
After these steps, the package names in the source code have been replaced by encrypted strings, and the package names in the referencing parts of the source code have been modified accordingly.
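The name mapping in steps (3) and (5) might be sketched as follows. The function enc is a hypothetical stand-in for ENC: it uses SHA-256 as in the example above, but the "p" prefix and the truncation to 16 hex digits are assumptions added here so that the result is always a valid Go identifier (a raw hex digest can begin with a digit). Directory creation and file copying are omitted.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"strings"
)

// enc is a hypothetical stand-in for the first encryption function ENC.
// It hashes a directory name with SHA-256 and keeps 16 hex digits.
// The "p" prefix is an assumption that keeps the result a valid identifier.
func enc(name string) string {
	sum := sha256.Sum256([]byte(name))
	return "p" + hex.EncodeToString(sum[:8])
}

// encPath obfuscates every element of a slash-separated relative package
// path, so nested directories keep their nesting: "animal/dog" becomes
// enc("animal") + "/" + enc("dog").
func encPath(rel string) string {
	parts := strings.Split(rel, "/")
	for i, p := range parts {
		parts[i] = enc(p)
	}
	return path.Join(parts...)
}

func main() {
	for _, rel := range []string{"animal", "tree", "animal/dog"} {
		fmt.Printf("%s -> %s\n", rel, encPath(rel))
	}
}
```

Because the mapping is deterministic, the same function can be applied both to the directory names and to the package clauses and import paths in the source files, which keeps references consistent.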
S3. Obfuscating the character strings in the abstract syntax tree into byte codes by using binary operations.
Search for character strings in the abstract syntax tree by node type; if a character string is a constant (const), convert it to a variable (var) declaration. Converting the character string to a variable (var) declaration comprises: generating, by using a random number generation function, as many byte codes as there are characters in the character string; XORing each character of the character string with the random byte code at the corresponding position in sequence to generate an intermediate character string; and XORing each byte code with the intermediate character string value at the corresponding position in sequence to generate a string of obfuscated byte codes.
The character strings in the source files can be obfuscated by replacing them with generated code sequences. The specific process is as follows:
(1) Find a character string s in the abstract syntax tree by node type.
(2) If s is declared as const, change the declaration to var, so that the string can be generated dynamically at run time instead of being hard-coded.
Specifically, after s is found, s is replaced with the following code sequence:
The code sequence is produced as follows: as many byte codes (0 to 255) as there are characters in the string are generated randomly; each character of the string is XORed with the random byte code at the corresponding position to produce an intermediate string; each random byte code is then XORed, from the beginning, with the intermediate string value at the corresponding position to produce a string of byte codes; and the character string represented by those byte codes is obtained. This character string is identical to the original, so the program logic is the same before and after obfuscation.
(3) Traverse the abstract syntax tree and replace each character string found with the obfuscated byte codes described above.
After these steps, each character string in the source code has been replaced by a code sequence and is no longer a plain string. The generated binary file no longer contains obvious strings, only a piece of code.
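One possible reading of step S3 is sketched below. The function names obfuscate, restore and replacementExpr are hypothetical, and the exact shape of the generated replacement expression is an assumption; the patent does not give the emitted code.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// obfuscate returns a random key with one byte per byte of s, and the
// intermediate bytes data[i] = s[i] ^ key[i].
func obfuscate(s string) (key, data []byte, err error) {
	key = make([]byte, len(s))
	if _, err = rand.Read(key); err != nil {
		return nil, nil, err
	}
	data = make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		data[i] = s[i] ^ key[i]
	}
	return key, data, nil
}

// restore XORs the key with the intermediate bytes again. Because
// (a ^ k) ^ k == a, this yields the original string.
func restore(key, data []byte) string {
	out := make([]byte, len(data))
	for i := range data {
		out[i] = key[i] ^ data[i]
	}
	return string(out)
}

// replacementExpr renders Go source text for an expression that rebuilds
// the string at run time. Its layout is an assumption for this sketch.
func replacementExpr(key, data []byte) string {
	return fmt.Sprintf(`func() string {
	key := %#v
	data := %#v
	for i := range data {
		data[i] ^= key[i]
	}
	return string(data)
}()`, key, data)
}

func main() {
	key, data, err := obfuscate("hello, gopher")
	if err != nil {
		fmt.Println("random source failed:", err)
		return
	}
	fmt.Println(replacementExpr(key, data))
	fmt.Println("restored:", restore(key, data))
}
```

A function call is not a constant expression in Go, which is why a const declaration has to become a var declaration before its string literal can be replaced by an expression of this kind.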
S4. Encrypting the type definitions, the function definitions and the relative paths of the symbols in the abstract syntax tree by using a second encryption function.
Acquire a type definition structure of the source code, acquire the absolute path of the type definition structure, and read the function definitions in the file under the absolute path; traverse the source code, acquire the receiver type of each function definition, and locate the relative path according to the receiver type; and encrypt the corresponding type definition structure, function definition and relative path in the abstract syntax tree by using the second encryption function.
The symbols here are types. Go is a strictly typed language and requires explicit types when defining parameters, functions, variables, and so on. Type information can expose memory layout information; for example, a parameter of type int occupies 4 bytes (on a 32-bit system). A reverse engineer can use the type information to locate a variable in memory and thereby access, at run time, memory outside the program's control.
The invention provides the following type obfuscation algorithm, which can obfuscate the type definitions, variable declarations and function receivers in the source code:
Obtain the type definition structure t in the source code and take its absolute path. For example, if a dog.go file under the "animal" folder defines the eat function and the shaepdog type, the absolute paths are animal/eat and animal/shaepdog.
Traverse the source code, obtain the receiver type in each function definition, and take its relative path. For example, if shaepdog is defined as "func (*Animal) shaepdog()", the receiver is "Animal" and the relative path is "Animal/Animal".
Encrypt and obfuscate the type definition, the function definition and the relative path of the receiver, for example t_obfuscation = ENC(t_relative), where ENC() denotes an encryption function that can be chosen by the user; a hash algorithm is used in this embodiment.
Replace t_relative in the abstract syntax tree with t_obfuscation.
Through these steps, the type information in the abstract syntax tree is obfuscated.
S5. Compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program.
The obfuscated abstract syntax tree is compiled with the Go compiling tool to obtain the obfuscated binary program.
In this embodiment, not only character strings but also types and package names are obfuscated, which greatly increases the complexity of the program, effectively hinders static reverse analysis, and improves product security. Products that companies develop in the Go language can be obfuscated with the algorithm provided by this patent, which effectively hinders reverse engineering and avoids unnecessary property loss.
As shown in fig. 2, the system 200 includes:
a target construction unit 210, configured to read a source code and construct an abstract syntax tree of the source code;
a packet name encryption unit 220 for encrypting the packet name of the abstract syntax tree using a first encryption function;
a character confusion unit 230 for using binary operation technique to confuse character strings in the abstract syntax tree into byte codes;
a symbol encrypting unit 240 for encrypting the type definition, the function definition, and the relative path of the symbol in the abstract syntax tree using a second encryption function;
the target compiling unit 250 is configured to compile the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program.
Alternatively, as an embodiment of the present invention, the target building unit is configured to:
analyzing the source code by utilizing a tool for constructing an abstract syntax tree in the Go voice standard library, and returning error information if the source code has errors; if parsing passes, an abstract syntax tree structure is returned, the abstract syntax tree structure containing detailed information of the package, detailed paths in the current compilation environment, and structure information of each syntax node.
Optionally, as an embodiment of the present invention, the package name encryption unit is configured to:
determine the root directory of the source code, traverse all package directories under the root directory, and collect the relative paths of the package directories;
encrypt the relative path of each package directory using the first encryption function, and take the resulting encrypted path as the package name;
create directories according to the package names and the original directory structure to obtain an encrypted directory structure;
and update the initial package name in the abstract syntax tree to the corresponding package name of the encrypted directory structure.
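The first two steps above (collecting relative package paths and encrypting them) might look like the following sketch. The names `encryptPath` and `collectPackageDirs` are hypothetical, and the truncated SHA-256 digest is only one assumed choice for the "first encryption function". Creating the encrypted directory tree and updating the syntax tree are left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// encryptPath is a hypothetical stand-in for the first encryption function.
// It hashes the relative path and prefixes a letter so that the result is a
// valid Go package name.
func encryptPath(rel string) string {
	sum := sha256.Sum256([]byte(filepath.ToSlash(rel)))
	return "p" + hex.EncodeToString(sum[:8])
}

// collectPackageDirs walks root and returns the relative path of every
// directory that directly contains at least one .go file.
func collectPackageDirs(root string) ([]string, error) {
	seen := map[string]bool{}
	var dirs []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".go" {
			return nil
		}
		rel, err := filepath.Rel(root, filepath.Dir(path))
		if err != nil {
			return err
		}
		if !seen[rel] {
			seen[rel] = true
			dirs = append(dirs, rel)
		}
		return nil
	})
	return dirs, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	dirs, err := collectPackageDirs(root)
	if err != nil {
		fmt.Println("walk error:", err)
		return
	}
	for _, rel := range dirs {
		fmt.Printf("%s -> %s\n", rel, encryptPath(rel))
	}
}
```

A deterministic function is used so that every file importing a given package maps to the same encrypted name.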
Optionally, as an embodiment of the present invention, the character obfuscation unit is configured to:
search for character strings in the abstract syntax tree by node type;
and, if a character string is a constant, convert the character string to a reserved word format.
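Finding strings "by node type" can be illustrated with `ast.Inspect` from the standard library, matching `*ast.BasicLit` nodes of kind `token.STRING`. The sketch below also records whether each literal sits inside a `const` declaration. What the text means by "reserved word format" is not defined, so no conversion is attempted here. The names `stringLit`, `collect`, and `findStrings` are assumptions, import paths are skipped because they must stay literal, and struct tags are not treated specially.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// stringLit records one string literal and whether it appears in a const declaration.
type stringLit struct {
	Value   string
	InConst bool
}

// collect appends every string literal under n to out, skipping import paths.
func collect(n ast.Node, inConst bool, out *[]stringLit) {
	ast.Inspect(n, func(c ast.Node) bool {
		switch v := c.(type) {
		case *ast.ImportSpec:
			return false // import paths must remain plain literals
		case *ast.GenDecl:
			if v.Tok == token.CONST && !inConst {
				collect(v, true, out) // revisit this subtree, marked as constant
				return false
			}
		case *ast.BasicLit:
			if v.Kind == token.STRING {
				if s, err := strconv.Unquote(v.Value); err == nil {
					*out = append(*out, stringLit{Value: s, InConst: inConst})
				}
			}
		}
		return true
	})
}

// findStrings parses src and returns its string literals in source order.
func findStrings(src string) ([]stringLit, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []stringLit
	collect(file, false, &out)
	return out, nil
}

func main() {
	src := "package demo\nimport \"fmt\"\nconst greeting = \"hello\"\nfunc f() { fmt.Println(\"world\") }\n"
	lits, err := findStrings(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, l := range lits {
		fmt.Printf("%q const=%v\n", l.Value, l.InConst)
	}
}
```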
Optionally, as an embodiment of the present invention, the character obfuscation unit is configured to:
generate, using a random number generation function, random byte codes equal in number to the characters of the character string;
perform an exclusive OR operation between each character of the character string and the random byte code at the corresponding position, in sequence, to generate an intermediate character string;
and perform an exclusive OR operation between each random byte code and the byte of the intermediate character string at the corresponding position, in sequence, to generate a string of obfuscated byte codes.
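The XOR steps can be sketched as follows. Because XOR-ing with the same key twice restores the input, this sketch treats the second XOR as the operation that recovers the original text from the random bytes and the intermediate bytes. That reading is an assumption, as the text does not spell out where each step runs. The names `mask` and `unmask` are hypothetical, and `crypto/rand` is an assumed choice of random number source.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// mask generates one random byte per byte of s and XORs them position by
// position, returning both the random key and the intermediate bytes.
func mask(s string) (key, masked []byte, err error) {
	key = make([]byte, len(s))
	if _, err = rand.Read(key); err != nil {
		return nil, nil, err
	}
	masked = make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		masked[i] = s[i] ^ key[i]
	}
	return key, masked, nil
}

// unmask XORs the key with the intermediate bytes at matching positions,
// which yields the original string again.
func unmask(key, masked []byte) string {
	out := make([]byte, len(masked))
	for i := range masked {
		out[i] = masked[i] ^ key[i]
	}
	return string(out)
}

func main() {
	key, masked, err := mask("secret")
	if err != nil {
		fmt.Println("random source error:", err)
		return
	}
	fmt.Printf("key=%x masked=%x restored=%q\n", key, masked, unmask(key, masked))
}
```

With this scheme the plain string never appears in the emitted source: only the two byte slices and the XOR loop do.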
Optionally, as an embodiment of the present invention, the symbol encryption unit is configured to:
acquire the type definition structures of the source code, acquire the absolute path of each type definition structure, and read the function definitions in the files under that absolute path;
traverse the source code, acquire the receiver type of each function definition, and locate the relative path according to the receiver type;
and encrypt the corresponding type definition structures, function definitions, and relative paths in the abstract syntax tree using the second encryption function.
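A heavily simplified sketch of the symbol step is shown below. It collects type definitions and the receiver type of each method from a single file, then renames the types. The names `encryptName`, `receiverType`, and `renameTypes` are hypothetical, the truncated SHA-256 digest is an assumed "second encryption function", and the rename is naive: it matches identifiers by spelling without scope resolution, and it leaves method names and paths untouched.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// encryptName is a hypothetical stand-in for the second encryption function.
// It preserves whether the original identifier was exported.
func encryptName(name string) string {
	sum := sha256.Sum256([]byte(name))
	prefix := "t"
	if ast.IsExported(name) {
		prefix = "T"
	}
	return prefix + hex.EncodeToString(sum[:6])
}

// receiverType returns a method's receiver type name, or "" for a plain function.
func receiverType(fn *ast.FuncDecl) string {
	if fn.Recv == nil || len(fn.Recv.List) == 0 {
		return ""
	}
	expr := fn.Recv.List[0].Type
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	if id, ok := expr.(*ast.Ident); ok {
		return id.Name
	}
	return ""
}

// renameTypes renames every type declared in src. It returns the new source,
// the old-to-new mapping, and the methods found, written "Receiver.Method".
func renameTypes(src string) (string, map[string]string, []string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", nil, nil, err
	}
	renamed := map[string]string{}
	var methods []string
	ast.Inspect(file, func(n ast.Node) bool {
		switch v := n.(type) {
		case *ast.TypeSpec:
			renamed[v.Name.Name] = encryptName(v.Name.Name)
		case *ast.FuncDecl:
			if recv := receiverType(v); recv != "" {
				methods = append(methods, recv+"."+v.Name.Name)
			}
		}
		return true
	})
	// Naive rewrite: any identifier spelled like a declared type is renamed.
	ast.Inspect(file, func(n ast.Node) bool {
		if id, ok := n.(*ast.Ident); ok {
			if to, found := renamed[id.Name]; found {
				id.Name = to
			}
		}
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", nil, nil, err
	}
	return buf.String(), renamed, methods, nil
}

func main() {
	src := "package demo\n\ntype Counter struct{ n int }\n\nfunc (c *Counter) Inc() { c.n++ }\n"
	out, renamed, methods, err := renameTypes(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(renamed, methods)
	fmt.Print(out)
}
```

A production tool would resolve identifiers with type information (for example via `go/types`) so that unrelated names with the same spelling are not touched.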
Optionally, as an embodiment of the present invention, the first encryption function and the second encryption function are custom functions, which include, but are not limited to, any one of a hash function, a transcoding function, and a combining function.
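To make the three kinds of function concrete, the sketch below gives them a shared signature so they can be swapped or chained. The names `EncryptFunc`, `hashFunc`, `transcodeFunc`, `combine`, and `asIdentifier`, as well as the use of SHA-256 and hex encoding, are all assumptions chosen for illustration. The patent does not prescribe particular algorithms.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// EncryptFunc is a hypothetical common signature for the first and second
// encryption functions.
type EncryptFunc func(string) string

// hashFunc is a one-way example: a truncated SHA-256 digest in hex.
func hashFunc(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:8])
}

// transcodeFunc is a reversible example: the input bytes re-encoded as hex.
func transcodeFunc(s string) string {
	return hex.EncodeToString([]byte(s))
}

// combine chains several functions, applying them from left to right.
func combine(fs ...EncryptFunc) EncryptFunc {
	return func(s string) string {
		for _, f := range fs {
			s = f(s)
		}
		return s
	}
}

// asIdentifier prefixes a letter so the result is usable as a Go identifier.
func asIdentifier(f EncryptFunc) EncryptFunc {
	return func(s string) string { return "x" + f(s) }
}

func main() {
	first := asIdentifier(hashFunc)
	second := asIdentifier(combine(transcodeFunc, hashFunc))
	fmt.Println(first("internal/db"), second("Counter"))
}
```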
Fig. 3 is a schematic structural diagram of a terminal 300 according to an embodiment of the present invention, where the terminal 300 may be used to execute the Go language source code confusion method according to the embodiment of the present invention.
The terminal 300 may include a processor 310, a memory 320, and a communication unit 330. These components may communicate via one or more buses. Those skilled in the art will appreciate that the terminal configuration shown in the drawings does not limit the invention: it may be a bus-like or star-like structure, may include more or fewer components than shown, may combine certain components, or may arrange the components differently.
The memory 320 may be used to store instructions for execution by the processor 310. The memory 320 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The instructions in the memory 320, when executed by the processor 310, enable the terminal 300 to perform some or all of the steps in the method embodiments described above.
The processor 310 is the control center of the storage terminal. It connects the various parts of the entire electronic terminal using various interfaces and lines, and performs the functions of the electronic terminal and/or processes data by running or executing software programs and/or modules stored in the memory 320 and invoking data stored in the memory. The processor may consist of an integrated circuit (IC), for example a single packaged IC, or of multiple packaged ICs with the same or different functions connected together. For example, the processor 310 may include only a central processing unit (CPU). In the embodiment of the invention, the CPU may have a single computing core or multiple computing cores.
The communication unit 330 is used to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to other terminals.
The present invention also provides a computer storage medium in which a program may be stored, which program may include some or all of the steps in the embodiments provided by the present invention when executed. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (random access memory, RAM), or the like.
Therefore, the method constructs the abstract syntax tree of the Go language source code, then encrypts and obfuscates the package names, character strings, and symbols of the abstract syntax tree, and compiles the encrypted and obfuscated abstract syntax tree with a Go compiling tool to obtain an encrypted and obfuscated Go language program. The invention obfuscates not only character strings but also types and package names, which greatly increases the complexity of programs, fills a gap in obfuscation technology for the Go language, can effectively prevent static reverse analysis, improves product security, can effectively prevent program products from being reverse engineered, and avoids unnecessary property loss. The technical effects achievable by this embodiment are described above and are not repeated here.
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented in software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution in the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code, and includes several instructions for causing a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, etc.) to execute all or part of the steps of the method described in the embodiments of the present invention.
The same or similar parts between the various embodiments in this specification are referred to each other. In particular, for the terminal embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference should be made to the description in the method embodiment for relevant points.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interface, system or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
Although the present invention has been described in detail by way of preferred embodiments with reference to the accompanying drawings, the present invention is not limited thereto. Those skilled in the art may make various equivalent modifications and substitutions to the embodiments of the present invention without departing from its spirit and scope, and all such modifications and substitutions are intended to be within the scope of the present invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A Go language source code obfuscation method, comprising:
reading a source code and constructing an abstract syntax tree of the source code;
encrypting the package name of the abstract syntax tree by using a first encryption function;
obfuscating character strings in the abstract syntax tree into byte codes by using a binary operation technique;
encrypting the type definition, the function definition and the relative path of the symbol in the abstract syntax tree by using a second encryption function;
compiling the encrypted and obfuscated abstract syntax tree by using a Go compiling tool to obtain an obfuscated binary program;
wherein obfuscating character strings in the abstract syntax tree into byte codes by using a binary operation technique comprises:
searching for a character string in the abstract syntax tree by node type;
if the character string is a constant, converting the character string to a reserved word format.
2. The method of claim 1, wherein reading a source code and constructing an abstract syntax tree of the source code comprises:
parsing the source code by using the abstract-syntax-tree construction tool in the Go language standard library, and returning error information if the source code contains errors; if parsing succeeds, returning an abstract syntax tree structure, the abstract syntax tree structure containing detailed information of the package, detailed paths in the current compilation environment, and structure information of each syntax node.
3. The method of claim 1, wherein encrypting the package name of the abstract syntax tree by using the first encryption function comprises:
determining a root directory of the source code, traversing all package directories under the root directory, and collecting the relative paths of the package directories;
encrypting the relative path of each package directory by using the first encryption function, and taking the obtained encrypted path as a package name;
creating directories according to the package names and the original directory structure to obtain an encrypted directory structure;
and updating the initial package name of the abstract syntax tree to the corresponding package name of the encrypted directory structure.
4. The method of claim 1, wherein converting the character string to a reserved word format comprises:
generating, by using a random number generation function, random byte codes equal in number to the characters of the character string;
performing an exclusive OR operation between each character of the character string and the random byte code at the corresponding position, in sequence, to generate an intermediate character string;
and performing an exclusive OR operation between each random byte code and the byte of the intermediate character string at the corresponding position, in sequence, to generate a string of obfuscated byte codes.
5. The method of claim 1, wherein encrypting the type definition, the function definition, and the relative path of the symbol in the abstract syntax tree by using the second encryption function comprises:
acquiring a type definition structure of the source code, acquiring an absolute path of the type definition structure, and reading the function definitions in the files under the absolute path;
traversing the source code, acquiring the receiver type of each function definition, and locating the relative path according to the receiver type;
and encrypting the corresponding type definition structure, function definition and relative path in the abstract syntax tree by using the second encryption function.
6. The method of claim 1, wherein the first encryption function and the second encryption function are each a customized function, the customized function including, but not limited to, any one of a hash function, a transcoding function, a combining function.
7. A Go language source code obfuscation system, comprising:
a target construction unit for reading a source code and constructing an abstract syntax tree of the source code;
a package name encryption unit for encrypting the package name of the abstract syntax tree by using a first encryption function;
a character obfuscation unit for obfuscating character strings in the abstract syntax tree into byte codes by using a binary operation technique;
a symbol encryption unit for encrypting the type definition, the function definition and the relative path of the symbol in the abstract syntax tree by using a second encryption function;
a target compiling unit for compiling the encrypted and obfuscated abstract syntax tree by using the Go compiling tool to obtain an obfuscated binary program;
wherein obfuscating character strings in the abstract syntax tree into byte codes by using a binary operation technique comprises:
searching for a character string in the abstract syntax tree by node type;
if the character string is a constant, converting the character string to a reserved word format.
8. A terminal, comprising:
a processor;
a memory for storing execution instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-6.
9. A computer readable storage medium storing a computer program, which when executed by a processor implements the method of any one of claims 1-6.
CN202110961967.6A 2021-08-20 2021-08-20 Go language source code confusion method, system, terminal and storage medium Active CN113849781B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110961967.6A CN113849781B (en) 2021-08-20 2021-08-20 Go language source code confusion method, system, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN113849781A CN113849781A (en) 2021-12-28
CN113849781B true CN113849781B (en) 2024-01-12

Family

ID=78975680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110961967.6A Active CN113849781B (en) 2021-08-20 2021-08-20 Go language source code confusion method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113849781B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116483376B (en) * 2023-05-05 2023-10-03 广州正是网络科技有限公司 Automatic generation method, system and storage medium for C# code

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182267A (en) * 2013-05-21 2014-12-03 中兴通讯股份有限公司 Compiling method, interpreting method, interpreting device and user equipment
CN111522558A (en) * 2020-07-06 2020-08-11 嘉兴太美医疗科技有限公司 Method, device, system and readable medium for dynamically configuring rules based on Java
CN112597454A (en) * 2020-12-28 2021-04-02 深圳市欢太科技有限公司 Code obfuscation method, code operation method, device, medium, and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10846083B2 (en) * 2018-12-12 2020-11-24 Sap Se Semantic-aware and self-corrective re-architecting system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant